Multi-omics data integration methods to study rare genetic diseases

EVENT : C3BI Seminars


Main speaker : Anaïs Baudot, from Networks and Systems Biology, Marseille Medical Genetics Unit (MMG)
Date : 11-03-2021 at 02:00 pm
Location : Teams (e-seminars), Institut Pasteur, Paris


Link to the seminar: https://teams.microsoft.com/l/meetup-join/19%3a8ba527d9b67a49f99a705c485e691b61%40thread.tacv2/1614681436154?context=%7b%22Tid%22%3a%22096815dc-d9eb-4bc3-a5a3-53c77e7d34e2%22%2c%22Oid%22%3a%22ecd024dd-8901-4512-9222-f115faa816ca%22%7d

ABSTRACT

Technological advances and the accumulation of biomedical datasets offer unprecedented opportunities to better understand genetic diseases, but they require proper exploration and integration methods to build a complete picture of biological systems. I will discuss the computational strategies we recently developed, using i) multilayer networks, with associated exploration algorithms, to integrate a large range of interactions, and ii) dimensionality reduction to extract biological knowledge simultaneously from multiple omics. On the application side, I will discuss the analysis of rare genetic diseases, which raise various challenges: many patients are undiagnosed, phenotypes can be highly heterogeneous, and only a few treatments exist.
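As an illustration of what a network exploration algorithm can look like, here is a minimal sketch of a random walk with restart on a single-layer toy network. It is a simplification: the multilayer and heterogeneous extensions mentioned in the abstract are not reproduced, and the toy adjacency matrix, the seed choice and the restart value of 0.7 are assumptions made for the example.

```python
import numpy as np

def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities of a walker that restarts on the seeds."""
    A = np.asarray(adjacency, dtype=float)
    # Column-normalise so that each column holds the transition probabilities
    # out of one node (assumes every node has at least one edge).
    W = A / A.sum(axis=0, keepdims=True)
    p0 = np.zeros(A.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)
    p = p0.copy()
    for _ in range(max_iter):
        # One step: follow an edge with probability 1 - restart, else jump back to a seed.
        p_next = (1.0 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Hypothetical toy network: a path 0 - 1 - 2 - 3 - 4, with node 0 as the seed.
adjacency = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
scores = random_walk_with_restart(adjacency, seeds=[0])
print(scores)  # proximity of every node to the seed
```

Nodes with higher scores are closer to the seed in the network, which is how such a walk can be used to rank candidate genes around known disease genes.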

Selected associated publications & preprints
• Cantini, L., Zakeri, P., Hernandez, C., Naldi, A., Thieffry, D., Remy, E., Baudot, A., 2021. Benchmarking joint multi-omics dimensionality reduction approaches for the study of cancer. Nature Communications 12. https://doi.org/10.1038/s41467-020-20430-7
• Novoa-del-Toro, E.-M., Mezura-Montes, E., Vignes, M., Magdinier, F., Tichit, L., Baudot, A., 2020. A Multi-Objective Genetic Algorithm to Find Active Modules in Multiplex Biological Networks. bioRxiv 2020.05.25.114215. https://doi.org/10.1101/2020.05.25.114215
• Pio-Lopez, L., Valdeolivas, A., Tichit, L., Remy, É., Baudot, A., 2020. MultiVERSE: a multiplex and multiplex-heterogeneous network embedding approach. https://arxiv.org/abs/2008.10085
• Valdeolivas, A., Tichit, L., Navarro, C., Perrin, S., Odelin, G., Levy, N., et al., 2018. Random Walk with Restart on Multiplex and Heterogeneous Biological Networks. Bioinformatics. https://academic.oup.com/bioinformatics/article/35/3/497/5055408


Due to Institut Pasteur security policy, please register in advance if you plan to attend this meeting.

Virtual seminar : “Improved metagenome binning and assembly using deep variational autoencoders”

EVENT : C3BI Seminars


Main speaker : Simon RASMUSSEN, from Technical University of Denmark and Novo Nordisk Foundation Center for Protein Research, University of Copenhagen
Date : 07-01-2021 at 10:00 am
Location : Teams (e-seminars), Institut Pasteur, Paris


Despite recent advances in metagenomic binning, reconstruction of microbial species from metagenomics data remains challenging. Here we develop Variational Autoencoders for Metagenomic Binning (VAMB), a program that uses deep variational autoencoders to encode sequence co-abundance and k-mer distribution information prior to clustering. We show that a variational autoencoder is able to integrate these two distinct data types without any prior knowledge of the datasets. VAMB outperforms existing state-of-the-art binners, reconstructing 29–98% and 45% more near-complete genomes on simulated and real data, respectively. Furthermore, VAMB is able to separate closely related strains up to 99.5% ANI and reconstructed 255 and 91 near complete Bacteroides vulgatus and Bacteroides dorei sample-specific genomes as two distinct clusters from a dataset of 1,000 human gut microbiome samples. We use 2,606 near-complete bins from this dataset to show that species of the human gut microbiome have different geographical distribution patterns. VAMB can be run on standard hardware and is freely available at https://github.com/RasmussenLab/vamb.
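To make the "k-mer distribution" input concrete, the sketch below computes normalised tetranucleotide frequencies for one contig. It is not taken from VAMB: the choice of k = 4, the function name and the toy contig are assumptions, and the actual program may treat reverse complements and ambiguous bases differently.

```python
from collections import Counter
from itertools import product

def kmer_frequencies(sequence, k=4):
    """Return the normalised frequency of every ACGT k-mer in a sequence."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing ambiguous bases (e.g. N) are ignored in the total.
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        raise ValueError("sequence has no unambiguous k-mer of length %d" % k)
    return {kmer: counts[kmer] / total for kmer in kmers}

# Hypothetical toy contig; real contigs are thousands of bases long.
contig = "ACGTACGTAC"
frequencies = kmer_frequencies(contig)
print(len(frequencies))     # 256 possible tetranucleotides
print(frequencies["ACGT"])  # 2 of the 7 windows are ACGT
```

Each contig then becomes a fixed-length vector that can be combined with its per-sample abundance before clustering.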

LINK TO THE E-SEMINAR: https://teams.microsoft.com/meetingOptions/?organizerId=74f6fdc2-678a-4869-872b-1c56c5a54726&tenantId=096815dc-d9eb-4bc3-a5a3-53c77e7d34e2&threadId=19_meeting_NThjYzU4NzMtOGViMC00MzYzLTk0ZjktMmE4MmZkYzQyODdl@thread.v2&messageId=0&language=fr-FR


STATISTICAL METHODS FOR MOLECULAR DYNAMICS AND INTERACTION ANALYSIS IN FLUORESCENCE MICROSCOPY

EVENT : C3BI Seminars


Main speaker : Charles KERVRANN, from Inria Rennes – Bretagne Atlantique / SERPICO Project-Team
Date : 27-02-2020 at 02:00 pm
Location : Auditorium François Jacob – BIME (26), Institut Pasteur, Paris


The characterization of biomolecule dynamics and interactions in living cells is essential to decipher biological mechanisms and processes. This topic is usually addressed in fluorescent video-microscopy from particle trajectories computed by object tracking algorithms. However, classifying individual trajectories into predefined diffusion classes (e.g. sub-diffusion, free diffusion (or Brownian motion), super-diffusion), estimating diffusion model parameters, or detecting diffusion mode switches, is a difficult task in most cases. Meanwhile, colocalization is generally applied to detect interactions between two biomolecules observed in an image pair. Colocalization aims at characterizing spatial associations between two fluorescently-tagged biomolecules by quantifying the co-occurrence and correlation between the two channels acquired in fluorescence microscopy. This problem remains an open issue in diffraction-limited microscopy and raises new challenges with the emergence of super-resolution imaging.

To address these challenging issues, we propose a computational framework based on statistical tests to both classify biomolecule trajectories and to detect spatially-varying colocalization in single molecule imaging (PALM, STORM). The methodological approach is well-grounded in statistics and is more robust than previous techniques. In this talk, I will present the underlying concepts and methods. The resulting algorithms are flexible in most cases, with a minimal number of control parameters to be tuned (p-values). They can be applied to a large range of problems in cell imaging and can be integrated in generic image-based workflows, including for high content screening applications.
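For readers unfamiliar with the diffusion classes named above, the following sketch shows the classical heuristic that separates them: fitting the exponent of the mean squared displacement (MSD) against time lag, where a value near 1 suggests free diffusion, below 1 sub-diffusion and above 1 super-diffusion. This is not the statistical test presented in the talk; it is the textbook baseline, and the function name, the lag cutoff and the simulated trajectories are assumptions.

```python
import numpy as np

def msd_exponent(trajectory, max_lag=20):
    """Estimate alpha in MSD(lag) ~ lag**alpha from one trajectory of shape (n, dim)."""
    traj = np.asarray(trajectory, dtype=float)
    if len(traj) <= max_lag:
        raise ValueError("trajectory must be longer than max_lag")
    lags = np.arange(1, max_lag + 1)
    # Time-averaged MSD: mean squared displacement over all pairs separated by `lag`.
    msd = np.array([
        np.mean(np.sum((traj[lag:] - traj[:-lag]) ** 2, axis=1))
        for lag in lags
    ])
    # The slope of the log-log fit is the anomalous diffusion exponent.
    slope, _intercept = np.polyfit(np.log(lags), np.log(msd), 1)
    return slope

# Simulated 2D Brownian motion (free diffusion).
rng = np.random.default_rng(0)
brownian = np.cumsum(rng.normal(size=(5000, 2)), axis=0)
alpha_brownian = msd_exponent(brownian)

# Straight-line motion at constant speed (an extreme case of super-diffusion).
directed = np.column_stack([np.arange(200.0), np.zeros(200)])
alpha_directed = msd_exponent(directed)

print(alpha_brownian, alpha_directed)
```

A fixed cutoff on this exponent gives no control over false classifications on short or noisy trajectories, which is one reason to prefer a test-based approach with p-values.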

References:
1. V. Briane, C. Kervrann, M. Vimond. Statistical analysis of particle trajectories in living cells, Phys. Rev. E 97, 062121.
2. V. Briane, M. Vimond, C. Valades Cruz, A. Salomon, C. Wunder, C. Kervrann. A sequential algorithm to detect diffusion switching along intracellular particle trajectories, Bioinformatics, btz489, 2019.
3. V. Briane, M. Vimond, C. Kervrann. An overview of diffusion models for intracellular dynamics analysis, Briefings in Bioinformatics, bbz052, 2019.
4. V. Briane, M. Vimond, A. Salomon, C. Kervrann. A computational approach for detecting micro-domains and confinement domains in cells: a simulation study, 2019.
5. F. Lavancier, T. Pécot, L. Zengzhen, C. Kervrann. Testing independence between two random sets for the analysis of colocalization in bio-imaging, Biometrics, doi:10.1111/BIOM.13115, 2019.


STATISTICAL METHODS FOR MOLECULAR DYNAMICS AND INTERACTION ANALYSIS IN FLUORESCENCE MICROSCOPY

EVENT : C3BI Seminars


Main speaker : Charles KERVRANN, from Inria Rennes – Bretagne Atlantique / CNRS-UMR 144 Paris, Inria, Institut Curie, CNRS, UPMC, PSL Research University, SERPICO Project-Team
Date : 27-02-2020 at 02:00 pm
Location : Auditorium François Jacob – BIME (26), Institut Pasteur, Paris


The characterization of biomolecule dynamics and interactions in living cells is essential to decipher biological mechanisms and processes. This topic is usually addressed in fluorescent video-microscopy from particle trajectories computed by object tracking algorithms. However, classifying individual trajectories into predefined diffusion classes (e.g. sub-diffusion, free diffusion (or Brownian motion), super-diffusion), estimating diffusion model parameters, or detecting diffusion mode switches, is a difficult task in most cases. Meanwhile, colocalization is generally applied to detect interactions between two biomolecules observed in an image pair. Colocalization aims at characterizing spatial associations between two fluorescently-tagged biomolecules by quantifying the co-occurrence and correlation between the two channels acquired in fluorescence microscopy. This problem remains an open issue in diffraction-limited microscopy and raises new challenges with the emergence of super-resolution imaging.

To address these challenging issues, we propose a computational framework based on statistical tests to both classify biomolecule trajectories and to detect spatially-varying colocalization in single molecule imaging (PALM, STORM). The methodological approach is well-grounded in statistics and is more robust than previous techniques. In this talk, I will present the underlying concepts and methods. The resulting algorithms are flexible in most cases, with a minimal number of control parameters to be tuned (p-values). They can be applied to a large range of problems in cell imaging and can be integrated in generic image-based workflows, including for high content screening applications.

References:
1. V. Briane, C. Kervrann, M. Vimond. Statistical analysis of particle trajectories in living cells, Phys. Rev. E 97, 062121.
2. V. Briane, M. Vimond, C. Valades Cruz, A. Salomon, C. Wunder, C. Kervrann. A sequential algorithm to detect diffusion switching along intracellular particle trajectories, Bioinformatics, btz489, 2019.
3. V. Briane, M. Vimond, C. Kervrann. An overview of diffusion models for intracellular dynamics analysis, Briefings in Bioinformatics, bbz052, 2019.
4. V. Briane, M. Vimond, A. Salomon, C. Kervrann. A computational approach for detecting micro-domains and confinement domains in cells: a simulation study, 2019.
5. F. Lavancier, T. Pécot, L. Zengzhen, C. Kervrann. Testing independence between two random sets for the analysis of colocalization in bio-imaging, Biometrics, doi:10.1111/BIOM.13115, 2019.


Inferring the parameters of random walks without tracking

EVENT : C3BI Seminars


Main speaker : Till Kletti, former member of the Decision and Bayesian Computation Group at the DBC
Date : 30-01-2020 at 02:00 pm
Location : Auditorium François Jacob – BIME (26), Institut Pasteur, Paris


We consider the problem of inferring random walk models (e.g. spatial maps of diffusivity, drift) of a set of moving particles (e.g. biomolecules) using discrete-time snapshots of their positions (a movie). A main difficulty stems from the fact that the particles are not labelled, which makes the particle matching between two consecutive snapshots uncertain. We describe how to account for this uncertainty using the belief propagation (BP) algorithm, which outperforms explicit tracking methods when the particle density is high. Furthermore, we describe procedures allowing us to account for blinking of the particles. Finally, we show applications of the method to mapping heterogeneous diffusivity fields experienced by biomolecules in the plasma membrane of living cells.
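The matching uncertainty described above can be made concrete with a small sketch. For two snapshots of unlabelled particles, the likelihood of a diffusivity value is a sum over every possible matching between the snapshots. The code below enumerates all matchings by brute force, which is only tractable for a handful of particles; it is a stand-in for, not an implementation of, the belief propagation approach of the talk. The 2D Brownian model with uniform diffusivity, the simulated data and all names are assumptions.

```python
import itertools
import numpy as np

def log_likelihood(frame_a, frame_b, diffusivity, dt=1.0):
    """Log-likelihood of frame_b given frame_a, summed over all particle matchings."""
    # Squared distance between every particle of frame_a and every particle of frame_b.
    sq_dist = ((frame_a[:, None, :] - frame_b[None, :, :]) ** 2).sum(axis=2)
    var = 4.0 * diffusivity * dt
    # 2D Brownian transition density for each candidate pair.
    kernel = np.exp(-sq_dist / var) / (np.pi * var)
    n = len(frame_a)
    rows = np.arange(n)
    # Sum over all n! matchings (the permanent of the kernel matrix).
    total = sum(
        np.prod(kernel[rows, list(perm)])
        for perm in itertools.permutations(range(n))
    )
    return np.log(total)

# Simulate unlabelled snapshot pairs with a known, uniform diffusivity.
rng = np.random.default_rng(0)
true_d, dt, n_particles, n_pairs = 0.1, 1.0, 4, 50
pairs = []
for _ in range(n_pairs):
    a = rng.uniform(0.0, 10.0, size=(n_particles, 2))
    b = a + rng.normal(0.0, np.sqrt(2.0 * true_d * dt), size=a.shape)
    b = b[rng.permutation(n_particles)]  # discard the particle labels
    pairs.append((a, b))

# Grid search for the diffusivity that maximises the total log-likelihood.
grid = np.geomspace(0.02, 0.5, 30)
scores = [sum(log_likelihood(a, b, d, dt) for a, b in pairs) for d in grid]
estimate = grid[int(np.argmax(scores))]
print(estimate)
```

The number of matchings grows factorially with the particle count, so an approximate method is needed as soon as more than a few particles are visible per frame.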



Inferring interaction partners and evolutionary constraints from protein sequences

EVENT : C3BI Seminars


Main speaker : Anne-Florence Bitbol, from CNRS – Sorbonne Université
Date : 16-01-2020 at 02:00 pm
Location : Auditorium François Jacob – BIME (26), Institut Pasteur, Paris


Proteins and multi-protein complexes play crucial roles in our cells. The amino-acid sequence of a protein encodes its function, including its structure and its possible interactions. In evolution, random mutations affect the sequence, while natural selection acts at the level of function. Hence, shedding light on the sequence-function mapping of proteins is central to a systems-level understanding of cells, and has far-reaching applications in synthetic biology and drug targeting. The current explosion of available sequence data has inspired data-driven approaches to discover the principles of protein operation. At the root of these approaches is the observation that amino-acid residues which possess related functional roles often evolve in a correlated way.

First, I will present two novel methods to predict protein-protein interactions from sequence data. One method is based on the maximum-entropy inference approach that has already made it possible to infer protein structures from sequences, and the other one is based on information theory. These methods accurately identify which proteins are functional interaction partners among the paralogous proteins of two families, starting from sequence data alone. They also provide signatures of the existence of interactions between protein families. I will further discuss the role of correlations arising from the shared evolutionary history of interacting partners in the success of these methods. Then, I will propose a simple interpretation of the origin of the “sectors” of collectively correlated amino acids that have been discovered in several protein families through statistical analyses of sequence alignments. I will show that selection acting on any functional property of a protein, represented by an additive trait, can give rise to such a sector.
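The observation that functionally related residues evolve in a correlated way can be illustrated with the simplest information-theoretic measure, the mutual information between two columns of a sequence alignment. This is only a sketch of the underlying idea, not the methods of the talk: it ignores phylogenetic and finite-sample corrections, and the toy alignment is invented for the example.

```python
import math
from collections import Counter

def mutual_information(column_a, column_b):
    """Mutual information (in bits) between two alignment columns of equal length."""
    n = len(column_a)
    joint = Counter(zip(column_a, column_b))
    freq_a = Counter(column_a)
    freq_b = Counter(column_b)
    # Sum over observed pairs of p(a, b) * log2(p(a, b) / (p(a) * p(b))).
    return sum(
        (count / n) * math.log2(count * n / (freq_a[a] * freq_b[b]))
        for (a, b), count in joint.items()
    )

# Hypothetical toy alignment: one row per sequence, one column per position.
alignment = [
    "AKLE",
    "AKVE",
    "GRLE",
    "GRVE",
]
columns = list(zip(*alignment))

mi_coupled = mutual_information(columns[0], columns[1])      # A pairs with K, G with R
mi_independent = mutual_information(columns[0], columns[2])  # no association
print(mi_coupled, mi_independent)  # 1.0 0.0
```

In the interaction-partner setting, the two columns come from two different protein families, and the pairing of sequences between families is what the inference has to recover.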


Cancelled – Inferring interaction partners and evolutionary constraints from protein sequences

EVENT : C3BI Seminars


Main speaker : Anne-Florence Bitbol, from CNRS – Sorbonne Université
Date : Cancelled
Location : Retrovirus room – LWOFF (22), Institut Pasteur, Paris


Proteins and multi-protein complexes play crucial roles in our cells. The amino-acid sequence of a protein encodes its function, including its structure and its possible interactions. In evolution, random mutations affect the sequence, while natural selection acts at the level of function. Hence, shedding light on the sequence-function mapping of proteins is central to a systems-level understanding of cells, and has far-reaching applications in synthetic biology and drug targeting. The current explosion of available sequence data has inspired data-driven approaches to discover the principles of protein operation. At the root of these approaches is the observation that amino-acid residues which possess related functional roles often evolve in a correlated way.

First, I will present two novel methods to predict protein-protein interactions from sequence data. One method is based on the maximum-entropy inference approach that has already made it possible to infer protein structures from sequences, and the other one is based on information theory. These methods accurately identify which proteins are functional interaction partners among the paralogous proteins of two families, starting from sequence data alone. They also provide signatures of the existence of interactions between protein families. I will further discuss the role of correlations arising from the shared evolutionary history of interacting partners in the success of these methods. Then, I will propose a simple interpretation of the origin of the “sectors” of collectively correlated amino acids that have been discovered in several protein families through statistical analyses of sequence alignments. I will show that selection acting on any functional property of a protein, represented by an additive trait, can give rise to such a sector.


CANCELLED : Computational Biology in the Crossroad of Big Data, Artificial Intelligence and High Performance Computing

EVENT : C3BI Seminars


Main speaker : Alfonso Valencia, from Barcelona Super Computing Center (BSC-CNS)
Date : 26-09-2019 at 02:00 pm



Genomic enzymology web tools for functional assignment: Generating and analyzing Sequence Similarity Networks (SSNs) and Genome Neighborhood Networks (GNNs) with the EFI suite

EVENT : C3BI Seminars


Main speaker : Rémy Zallot, from University of Illinois
Date : 20-06-2019 at 02:00 pm
Location : Duclaux room down groundfloor – DUCLAUX (01), Institut Pasteur, Paris


Protein databases contain an exponentially growing number of sequences as a result of the decreasing cost and difficulty of genome sequencing. The rate of data accumulation far exceeds the rate of functional studies, producing an increase in genomic ‘dark matter’: sequences for which no precise and validated function is defined. Strategies that leverage the protein and genome databases to discover the functions of novel enzymes belonging to the dark matter are needed. “Genomic enzymology” is the integration of relationships among sequence-function space in protein families and the genome context of their bacterial, archaeal, and fungal members to propose function. The Enzyme Function Initiative suite of web tools (https://efi.igb.illinois.edu) includes the EFI-Enzyme Similarity Tool (EFI-EST), which generates sequence similarity networks (SSNs) for protein families, and the EFI-Genome Neighborhood Tool (EFI-GNT), which produces Genome Neighborhood Networks (GNNs) and Genome Neighborhood Diagrams (GNDs) for analyzing and visualizing the genome context of SSN clusters. Together, these tools facilitate the application of “Genomic enzymology” to the ‘dark matter’ problem. A detailed overview of the principles behind the generation of SSNs, GNNs and GNDs will be presented. The identification of an unexpected reaction in the Queuosine biosynthesis pathway will illustrate the approach.
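The principle of a sequence similarity network can be sketched in a few lines: proteins are nodes, an edge is kept when the pairwise similarity score passes a chosen threshold, and clusters are the connected components. The sketch below does not call the EFI tools; the protein names, the pairwise scores and the thresholds are all hypothetical.

```python
def build_ssn(pairwise_scores, threshold):
    """Build an adjacency map, keeping an edge for every pair scoring >= threshold."""
    network = {}
    for (a, b), score in pairwise_scores.items():
        network.setdefault(a, set())
        network.setdefault(b, set())
        if score >= threshold:
            network[a].add(b)
            network[b].add(a)
    return network

def clusters(network):
    """Return the connected components of the network as a list of sets."""
    seen, result = set(), []
    for start in network:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(network[node] - component)
        seen |= component
        result.append(component)
    return result

# Hypothetical pairwise similarity scores between five proteins.
scores = {
    ("P1", "P2"): 120,
    ("P2", "P3"): 95,
    ("P1", "P3"): 40,
    ("P3", "P4"): 20,
    ("P4", "P5"): 110,
}

loose = clusters(build_ssn(scores, threshold=90))
strict = clusters(build_ssn(scores, threshold=115))
print(len(loose), len(strict))  # 2 4
```

Raising the threshold splits a family into smaller clusters, so choosing it is the step where sequence similarity is traded against functional homogeneity.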

