Software
Licensing
All project software is released as open source. Selected tools are distributed under the permissive Apache 2.0 and MIT licenses; some components use the copyleft GPL-3.0 or a public-domain dedication. See the individual repositories for detailed licensing information.
The software is a tracer based on ptracer that generates a log of all the files a program accesses and then creates a Docker image containing the Ubuntu packages (and possibly R packages) required to run the program again.
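The step from a file-access log to an image definition can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the function names, the example paths, the package names, and the base image are all hypothetical, and the file-to-package table is hard-coded here, whereas a real tool would query the system package database.

```python
from typing import Dict, Iterable, List


def packages_for_files(accessed: Iterable[str], owner: Dict[str, str]) -> List[str]:
    """Return the sorted, de-duplicated packages that own the accessed files.

    Files with no known owner (for example, the user's own scripts) are skipped.
    """
    return sorted({owner[path] for path in accessed if path in owner})


def render_dockerfile(packages: List[str], base: str = "ubuntu:22.04") -> str:
    """Render Dockerfile text that installs the given packages on a base image."""
    lines = [f"FROM {base}"]
    if packages:
        lines.append(
            "RUN apt-get update && apt-get install -y --no-install-recommends "
            + " ".join(packages)
        )
    return "\n".join(lines) + "\n"


# Hypothetical trace log: paths the traced program opened.
accessed_files = [
    "/usr/bin/Rscript",
    "/usr/lib/x86_64-linux-gnu/libcurl.so.4",
    "/home/user/analysis.R",
]

# Hypothetical file-to-package table (assumed, for illustration only).
file_owner = {
    "/usr/bin/Rscript": "r-base-core",
    "/usr/lib/x86_64-linux-gnu/libcurl.so.4": "libcurl4",
}

dockerfile = render_dockerfile(packages_for_files(accessed_files, file_owner))
print(dockerfile)
```

Deduplicating and sorting the package list keeps the generated file stable across runs, which makes repeated traces of the same program easy to compare.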
This repository curates quantitative transparency disclosures about the online sexual exploitation of minors, i.e., people under the age of eighteen, in machine-readable form. It also includes a 4,400-line Python library for validating and tidying the data, along with Python and R notebooks containing the analysis for the corresponding report "Putting the Count Back Into Accountability: An Analysis of Transparency Data About the Sexual Exploitation of Minors".
This dataset contains a collection of religious texts in Czech (of any religion) that are publicly available on the Internet. It has been curated from the SlimPajama-627B dataset using a classifier. The purposes of this dataset include exploring sentiments in religious texts and studying the evolution of religious sayings over time.
Library implementing set-theoretic type operations (set connectives, subtyping, constraint solving, etc.). This library is used for different research projects, in particular a type-checker for the language R developed at Czech Technical University, and Pysem, a Python type-checker under development at Université Paris-Saclay (France).
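The core idea of set-theoretic types, where a type denotes a set of values and subtyping is set containment, can be illustrated with a toy model. The sketch below is not this library's API: it assumes a tiny finite universe of values so that types can be represented directly as Python sets, whereas a real implementation works on symbolic representations of infinite sets. All names are hypothetical.

```python
# Toy universe of values, written as string tags to keep them distinct.
UNIVERSE = frozenset({"true", "false", "0", "1", "nil"})

# A type is simply the set of values that inhabit it.
BOOL = frozenset({"true", "false"})
INT = frozenset({"0", "1"})
NIL = frozenset({"nil"})


def union(s: frozenset, t: frozenset) -> frozenset:
    """Union type: values belonging to s or t."""
    return s | t


def inter(s: frozenset, t: frozenset) -> frozenset:
    """Intersection type: values belonging to both s and t."""
    return s & t


def neg(t: frozenset) -> frozenset:
    """Negation type: every value of the universe that is not in t."""
    return UNIVERSE - t


def is_empty(t: frozenset) -> bool:
    """A type is empty when no value inhabits it."""
    return not t


def subtype(s: frozenset, t: frozenset) -> bool:
    """s is a subtype of t exactly when s-and-not-t is empty."""
    return is_empty(inter(s, neg(t)))


print(subtype(BOOL, union(BOOL, INT)))
print(subtype(union(BOOL, NIL), BOOL))
```

Reducing subtyping to an emptiness check is what lets the set connectives and the subtyping relation share one decision procedure; here that procedure is trivial because the universe is finite.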
A type-checking library for dynamic languages, implementing advanced set-theoretic typing techniques (type inference, type narrowing, ad-hoc and parametric polymorphism, etc.). It aims to demonstrate the effectiveness and usability of set-theoretic types for typing dynamic languages such as JavaScript, Python, or R.
A prototype for a new compiler infrastructure for the R programming language. Instead of embedding the compiler in the language runtime, we have developed a compiler-as-a-service, a client-server system that offloads client compilation requests to a server with feedback-driven optimizations for R.
Proof-of-concept implementation that keeps multiple feedback vectors, one per call context, and uses the newly available information to drive optimizations. It is built on top of RIR, a just-in-time compiler for the language R. The new branch also includes an in-house recording tool that enables fine-grained debugging of events in the virtual machine, such as function invocations, compilation, and deoptimization.
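Why per-context feedback can help an optimizer is easiest to see in a small model. The sketch below is an assumption-laden illustration, not RIR's data structures: the class name, the use of a tuple of function names as the call context, and the slot numbering are all hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Set


class ContextualFeedback:
    """Type feedback kept separately for each call context (toy model)."""

    def __init__(self) -> None:
        # call context -> slot index -> counts of observed type names
        self._vectors: Dict[Hashable, Dict[int, Counter]] = {}

    def record(self, context: Hashable, slot: int, value: object) -> None:
        """Note the runtime type seen at a feedback slot under a context."""
        vector = self._vectors.setdefault(context, {})
        vector.setdefault(slot, Counter())[type(value).__name__] += 1

    def observed(self, context: Hashable, slot: int) -> Set[str]:
        """Type names seen at a slot under one specific context."""
        return set(self._vectors.get(context, {}).get(slot, ()))

    def merged(self, slot: int) -> Set[str]:
        """Type names seen at a slot across all contexts, as a single
        context-insensitive feedback vector would report them."""
        seen: Set[str] = set()
        for vector in self._vectors.values():
            seen |= set(vector.get(slot, ()))
        return seen


feedback = ContextualFeedback()
feedback.record(("main", "f"), 0, 1)    # f called from main with an int
feedback.record(("main", "f"), 0, 2)
feedback.record(("g", "f"), 0, "x")     # f called from g with a str

print(feedback.observed(("main", "f"), 0))
print(feedback.merged(0))
```

In this model the slot looks polymorphic when the contexts are merged but monomorphic within each context, which is the kind of extra precision a compiler could use to specialize code per caller.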
Software tool that processes the EU's DSA Transparency Database, producing comprehensive reports with timelines.
Computational substrate for end-user document-oriented programming implementing collaborative editing and programming by demonstration.
Smalltalk VM implementation via the filesystem based on identifying Unix executables with Smalltalk methods.