Alcides Fonseca

Research Ideas for Students

This page contains several research ideas that students can pursue under my supervision. They can be adapted for a BSc part-time project, or for a full MSc dissertation. All ideas should result in (at least) one publication.

(AI for) Programming Languages

Synthesis of General-Purpose Software

We have been developing Æon, a programming language with refinement types to express requirements (e.g., a list that has 3 elements, or an integer greater than 3 and less than 7). In this language, we can leave some parts of the program as holes that the compiler can fill in for us! It’s automatic programming!

You can learn more about this project on its own page. Last year, Paulo Santos won the LASIGE best poster award working on the first prototype.
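To make the idea of refinement types and holes concrete, here is a minimal Python sketch. It is not Æon's syntax or its synthesis algorithm: the names `Refined`, `fill_hole` and `int_lists` are hypothetical, and the hole is filled by plain enumeration of candidates rather than by the search the actual compiler uses.

```python
import itertools
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Iterator, Optional


@dataclass(frozen=True)
class Refined:
    """A base type plus a predicate every value of the type must satisfy."""
    base: type
    predicate: Callable[[Any], bool]

    def accepts(self, value: Any) -> bool:
        return isinstance(value, self.base) and self.predicate(value)


# The two requirements used as examples in the text.
three_elements = Refined(list, lambda xs: len(xs) == 3)
between_3_and_7 = Refined(int, lambda n: 3 < n < 7)


def fill_hole(target: Refined, candidates: Iterable[Any]) -> Optional[Any]:
    """Return the first candidate that satisfies the refinement, if any.

    Assumption: naive enumeration stands in for real program synthesis.
    """
    for candidate in candidates:
        if target.accepts(candidate):
            return candidate
    return None


def int_lists(max_len: int, values: range) -> Iterator[list]:
    """Enumerate all lists over `values`, shortest first."""
    for n in range(max_len + 1):
        for combo in itertools.product(values, repeat=n):
            yield list(combo)


print(fill_hole(between_3_and_7, range(-10, 11)))          # 4
print(fill_hole(three_elements, int_lists(4, range(2))))   # [0, 0, 0]
```

The refinement acts as the specification, so any value the search returns is correct by construction; the hard part, and the research question, is finding it quickly.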

We are looking into expanding this language/compiler in the following directions:

  • Faster synthesis;
  • Evaluation of different genetic operators;
  • Evaluation of this language on a larger dataset.

Recommended skills: Compilers, Evolutionary Algorithms

A Cooperative Time Cost Model Type System

Cost models are necessary for efficient parallelization of programs. The cost model of an AST expression can be obtained by composing the cost models of its child nodes. At function invocations, however, the cost model of the called function must be inferred, which is very similar to how bidirectional type systems work. This work consists of designing a secondary type system for a programming language that expresses the time complexity of functions. This information is then used to decide whether a function call should execute asynchronously in the context of automatic parallelization.

f :: Int -> Int where { time = 2s }
f n = n + 1

program :: [Int] where { time <= 3s }
program = map f [1,2,3]

This program has a type error: map takes 6s to execute on this list (three calls to f at 2s each), but the signature of program requires it to run in at most 3s.
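The check in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the AST classes `Lit`, `Fn` and `Map`, the `cost` and `check` functions, and the rules that literals are free and that map runs sequentially are all hypothetical, not the design of the proposed type system.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Lit:
    """A literal list; assumed to cost nothing to build."""
    items: List[int]


@dataclass(frozen=True)
class Fn:
    """A function together with its declared time per call, in seconds."""
    name: str
    cost_per_call: float


@dataclass(frozen=True)
class Map:
    """Application of `fn` to every element of `arg`."""
    fn: Fn
    arg: "Expr"


Expr = Union[Lit, Map]


class CostError(TypeError):
    """Raised when an expression exceeds its declared time bound."""


def length(expr: Expr) -> int:
    # map preserves the length of its argument.
    return len(expr.items) if isinstance(expr, Lit) else length(expr.arg)


def cost(expr: Expr) -> float:
    """Compose the cost of a node from the costs of its children."""
    if isinstance(expr, Lit):
        return 0.0
    # Assumption: sequential map, one call to fn per element.
    return cost(expr.arg) + length(expr.arg) * expr.fn.cost_per_call


def check(expr: Expr, bound: float) -> float:
    """Return the inferred cost, or raise if it exceeds `bound`."""
    actual = cost(expr)
    if actual > bound:
        raise CostError(f"inferred time {actual}s exceeds declared bound {bound}s")
    return actual


f = Fn("f", cost_per_call=2.0)
program = Map(f, Lit([1, 2, 3]))

try:
    check(program, bound=3.0)
except CostError as err:
    print("type error:", err)
```

A parallelizing compiler could use the same inferred cost the other way around: here, running the three calls to f asynchronously would bring the total down to 2s, which fits the 3s bound.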

Recommended skills: Compilers

Automatically extracting features from heterogeneous data sources for ML

CAMELOT is the most recent project in RSS, where we collaborate with Feedzai (a Portuguese company that does fraud detection for banks, e-commerce and other financial institutions) to improve the process of integrating Feedzai's core ML technologies with the different data landscapes of their clients. The idea is to design an efficient application (the whole fraud detection has to run in under 200ms!) that takes a transaction ID and gathers relevant information that can be used to predict whether that transaction is fraudulent.

Project in cooperation with Feedzai.

Recommended skills: Compilers, Algorithms

Programming Languages for AI

Understanding deep intronic genetic variants

This is a collaboration with IMM to understand why some DNA mutations that are not directly in the coding genes are pathogenic. We have a dataset of genetic mutations, with extra information about each position, that we want to understand.

Recommended skills: Machine Learning, Compilers, Parallel Programming

THOR – Improving the mobile diagnosis of COVID-19

We are collaborating with the THOR project to improve the diagnosis of COVID-19 in rural areas without access to properly equipped hospitals.

Our goal is to develop Machine Learning algorithms that can detect a given set of patterns that doctors use to diagnose COVID-19.

This project is being developed in collaboration with IBEB, INESC-TEC and Hospital Garcia de Orta.
The thesis will be co-advised with Nuno Garcia.