Current research

Broad research area

Artificial Intelligence, Computational Biology and Bioinformatics.

Main current research lines

Most of my current research is focused on the design of AI methods for Precision Medicine and Molecular Biology:

Large Language Models and their application to Medicine and Biomolecular modeling

Large Language Models (LLMs) represent a breakthrough research area with broad applications across several disciplines. We are interested in designing AI models, and in particular LLMs, to support the diagnosis, prognosis and treatment of diseases. A second broad research line concerns the design of LLMs for modeling proteins and RNAs. In particular, we are working to apply genomic foundation models to predict and discover non-coding RNA (ncRNA) interactions, in order to support basic RNA biology research and the discovery of novel ncRNA-based drugs.

Graph Representation Learning methods for Network Medicine and Systems Biology

Several fundamental problems in Molecular Biology and in Medicine can be modeled through graphs of molecular or medical entities (e.g. networks of interacting proteins or networks of phenotypes, diseases, or patients that share similar biomolecular or clinical profiles). In this framework we design explainable Graph Representation Learning methods to:

integrate multiple types of omics, clinical and imaging data in a multi-modal setting to construct highly informative integrated biomedical graphs;
analyze integrated graphs or process complex heterogeneous graphs to predict properties of, or interactions between, biomolecules and drugs, discover biomarkers associated with specific diseases, or stratify patients according to their pathology-based risk.

Ensemble methods for the discovery of pathogenic variants in genetic diseases

Leveraging our results on the prediction of genetic risk using hyper-ensemble methods, we are designing ensembles of multi-modal transformers trained on genome sequences and epigenomic features to prioritize, at single-nucleotide resolution, genetic variants in non-coding regions of the human genome that cause Mendelian genetic diseases. By combining these multi-modal transformers, trained on genomic and epigenomic data, with learning machines trained on phenotypic data, we plan to design new methods for diagnosing genetic diseases that currently remain unresolved.