Practical Data Science Type Systems Tools and Languages

Current Members

Jan Vitek
Professor

Works on the design and implementation of programming languages. Leads the project and collaborates across all three research directions, from type system design and compiler engineering to empirical studies of data science practices.

Robert Grimm
Researcher

Studies the application of data analysis to questions of accountability and transparency. Current work analyzes transparency disclosures from internet platforms to assess the reliability of reporting on child sexual exploitation, and develops methodologies combining auto-ethnography with procedural justice to study the punitive aspects of content moderation.

Jakob Hain
PhD Student

Works on typed intermediate representations for dynamic languages. Contributes to the design of FIR, a statically typed high-level IR that makes optimization-relevant behaviors manifest and amenable to static analysis, enabling specialization, devirtualization, inlining, and copy elimination in the R JIT compiler.

Joel Jakubovic
Postdoctoral Researcher

Studies programming systems and end-user programming. Develops computational substrates that represent programs as series of edits on documents, enabling collaborative editing, programming by demonstration, and incremental recomputation. Also investigates the fundamental limits of making programming easier using dialectic methods and proto-theories.

Matěj Kocourek
Matěj Kocourek
PhD Student

Works on JIT compilation techniques for R. Builds a baseline compiler using copy-and-patch compilation that stitches together pre-compiled native code fragments from the existing GNU R interpreter, achieving fast compilation times and reasonable speedups with manageable implementation complexity as an external R package.

Jan Kofroň
Professor

Works on software verification and program analysis. Contributes to tools that let developers interactively combine static analysis techniques of varying precision for program comprehension, using scalable data-flow analysis to narrow scope before applying precise symbolic execution.

Mickael Laurent
Postdoctoral Researcher

Works on set-theoretic types and type inference for dynamic languages. Also designs a typed intermediate representation for optimizing R.

Lucie Lerch
Lucie Lerch
Administrator

Project coordination and administrative support.

Pavel Parízek
Professor

Works on program analysis and verification tools. Develops modular symbolic data flow analysis for automated data lineage tracking in enterprise systems, tracing how data values flow across databases and application code at industrial scale.

Tomáš Petříček
Professor

Studies how people program and the limits of making programming easy. Develops document-oriented end-user programming systems and explores interdisciplinary perspectives on the practice of data science, drawing on ideas from open-source software, self-sustainable systems, and visual programming.

Anastasija Skotorenko
Anastasija Skotorenko
Research Assistant

Past Members

Martin Blicha
Researcher

Works on automated reasoning and software verification, developing SMT solvers and model checking tools.

Alek Boruch-Gruszecki
Researcher

Works on gradual typing, type inference, and capability tracking. Develops gradual compartmentalization, an approach that allows incrementally migrating applications to object capabilities for intra-process isolation, combining runtime authority enforcement with type-system tracking to provide continuous security guarantees.

Vincent Holliday
Vincent Holliday
Research Assistant
Václav Ježek
Václav Ježek
Researcher

Studies the application of sentiment analysis and machine learning to humanities research. Investigates the ethical dimensions of applying data science methods to theological and historical texts, including analysis of patristic sources and the intersection of computational methods with traditional exegetical approaches.

Florian Sihler
Visitor

Works on dataflow analysis and static program slicing for R. Develops flowR, a hybrid analysis framework that combines static and dynamic techniques to help data scientists understand dependencies, identify dead code, and improve the quality of their analysis pipelines.