Hub members have a wide range of expertise, covering most fields of bioinformatics and biostatistics. Below is a non-exhaustive list of these areas of expertise.
Related people (7)
After a Master's degree in Genome Analysis and Molecular Modeling at Denis Diderot University, I did a PhD in NMR / bioinformatics at the same university, where I worked on the development and use of DaDiModO, a software package that uses SAXS data and RDC/NMR data to calculate structural models of proteins. After a postdoc in the Structural Bioinformatics Team at Institut Pasteur, in collaboration with IBCP, aimed at adapting the ARIA software to run on computing grids, I joined the CIB/DSI team, where I was responsible for developing bioinformatics projects and for the deployment, maintenance and evolution of the Pasteur Galaxy server. I joined the Hub/C3BI team in 2017 as a research engineer, where I am involved in several projects covering structural bioinformatics, software and web development. I am also in charge of maintaining the Galaxy Pasteur instance.
Data management, Galaxy, Structural bioinformatics, Web development, Database, Program development, Scientific computing, Databases and ontologies, Workflow and pipeline development, Grid and cloud computing
- Development of a secure API for ARIAweb (Benjamin BARDIAUX - Structural Bioinformatics) - In Progress
- Development of a web server to calculate functional binding sites using Deep Learning (Olivier SPERANDIO - Structural Bioinformatics) - In Progress
- A pipeline to detect correlated evolution on phylogenetic trees (Eduardo ROCHA - Microbial Evolutionary Genomics) - In Progress
Activities: Contact for any subject related to IFB. Help scientists develop new tools (architecture, design, implementation). Lead the Python Working Group at Pasteur. O|B|F (http://www.open-bio.org/) member.
Skills: Strong programming experience in Python. Software architecture and design. NoSQL databases (MongoDB, CouchDB). XML/YAML. Continuous integration (GitHub/Travis CI/Read the Docs, GitLab/GitLab CI). Containers (Docker, Singularity). Linux (Gentoo, Xubuntu). IFB developer.
Main projects on the campus: Mobyle (http://Mobyle.pasteur.fr; "Mobyle: a new full web bioinformatics framework"), IntegronFinder (ongoing project), MacsyFinder (ongoing project). My projects are accessible on GitHub.
Teaching: Unix (Unix-I, Unix-II), Python.
Education: 2002, PhD in Molecular and Cellular Biology, “Rôle de deux protéines QN1 et PATF impliquées dans l’arrêt de prolifération des cellules de la neurorétine aviaire au cours du développement” (Role of two proteins, QN1 and PATF, involved in the proliferation arrest of avian neuroretina cells during development). 2001, “Informatique En Biologie” course (Pasteur).
Data management, Database, Program development, Scientific computing, Databases and ontologies
As a senior research engineer, I have explored many corners of computer science and artificial intelligence. I can most notably help you with the following topics.
Skills
Decision support systems: design and implementation of decision-aid software (web-based as well as native interfaces and backends), visualization and diagrams (how to summarize complex data or concepts visually), integration of third-party modules (how to design an API to use external services, how to integrate software that does not really want to be integrated).
Automated decision: score function modelling (how to design a metric that defines the quality of a solution to a decision problem while maintaining good mathematical properties), optimization problem modelling (how to design a formal model of a decision problem so that a computer can solve it automatically), solving automated configuration problems (how to set the parameters of a complex system so as to maximize its performance).
Scientific computing: efficient algorithmics (how to cope with combinatorial explosion or the curse of dimensionality when implementing complex algorithms), highly modular software architectures (how to structure your code to allow efficient, and automated, exploration of your ideas), modern C++ (how to program in C++ using almost the same concepts as in Python), shell scripting (how to use existing Unix tools to automate any task very efficiently).
Artificial intelligence: search heuristics, metaheuristics or evolutionary computation (how to solve hard optimization problems), design of experiments for randomized algorithmics (how to design experiments involving modern AI, using rigorous statistics), automated planning (how to compute shortest paths and, more generally, optimize sequences of actions), semantic graph mining (how to find patterns in an ontology).
Related projects (31)
The CRISPR/Cas9 technology is a recent breakthrough in rapid genetic editing. A major part of getting the technology to work is the proper design of a guide RNA that will help Cas9 target specific genomic sequences. The design of this guide RNA must take into account all possible matches along an organism’s genome with as little as 50% similarity. Such a high tolerance for error means that current alignment algorithms are not well suited to the task. This leads to suboptimal guide RNA design and/or a lengthy design process. The problem is exacerbated when considering CRISPR/Cas9 for high-throughput applications. A new class of sequence comparison algorithms is required.
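To make the cost of such a loose similarity threshold concrete, here is a minimal sketch of a brute-force search for near-matches of a guide sequence. It is only an illustration, not the algorithm this project calls for: the function names, the toy sequences and the choice of Hamming distance (substitutions only, forward strand only, no PAM handling) are all assumptions.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_near_matches(genome: str, guide: str, min_identity: float = 0.5):
    """Return (position, mismatches) for each genome window whose identity
    with the guide is at least min_identity (hypothetical helper)."""
    n = len(guide)
    max_mismatches = int(n * (1.0 - min_identity))
    hits = []
    # Naive scan: every window is compared in full, so the cost grows with
    # len(genome) * len(guide). This is what makes genome-wide searches slow.
    for i in range(len(genome) - n + 1):
        d = hamming(genome[i:i + n], guide)
        if d <= max_mismatches:
            hits.append((i, d))
    return hits


# Toy data, not real sequences.
genome = "ACGTACGTTTGACCAGTACGATCGA"
guide = "ACGTACGA"
hits = find_near_matches(genome, guide, min_identity=0.5)
```

With a 50% identity threshold, most seed-based shortcuts used by standard aligners no longer apply, which is why a scan like this one ends up comparing nearly every window.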
We are looking for software running under R, whether or not based on SPADE, that can analyze FCS-type cytometry files by multiparametric clustering. We would also be interested in viSNE. The files can be converted to CSV. A commercial package, Cytobank, exists, but it does not provide all the functions we need.
Mapping of research themes and fields of expertise available in the Institut Pasteur international Network
Using data extracted from PubMed, we would like to develop a tool for the systematic analysis of research themes and fields of expertise available in the Institut Pasteur International Network (IPIN). The tool would be available to the Pasteur community and could be queried using PubMed search terms, identifying articles involving research teams from IPIN and displaying the names of authors, research units and locations in a visual format. We hope this tool will enable researchers to identify colleagues for sharing expertise and developing collaborations.
When dealing with high-depth read data, a simple way to combine accurate analyses with moderate computational resources is to extract a subset of raw reads whose coverage depth is both homogeneous and moderate. Unfortunately, current implementations are often unexpectedly slow and require substantial pre-processing of large files before they can be used in practice. In the current scientific context, much effort must go into algorithm design and efficient programming to process large data with reasonable running times. An efficient implementation should therefore be developed in order to quickly perform read coverage homogenization. Such a tool will help to deal with highly redundant sequencing data by creating read subsets with useful properties. As the read coverage homogenization step is expected to be used systematically for pre-processing the large amount of raw reads generated in the PIBnet context, a development carried out by members of the CIB platform is expected to lead to efficient solutions that take advantage of the computing resources hosted by the Institut Pasteur.
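One common way to homogenize coverage is a digital-normalization-style filter: a read is kept only if the k-mers it contains have not yet been seen often enough. The sketch below illustrates that general idea under stated assumptions; it is not the implementation planned in this project, and the function names, `k` and `target_depth` values are chosen purely for illustration.

```python
from collections import Counter
from statistics import median


def kmers(read: str, k: int):
    """All overlapping substrings of length k in the read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def homogenize(reads, k: int = 4, target_depth: int = 2):
    """Keep a read only while the median count of its k-mers, among the
    reads kept so far, is below target_depth (hypothetical helper)."""
    counts = Counter()
    kept = []
    for read in reads:
        ks = kmers(read, k)
        if not ks:
            continue  # read shorter than k: nothing to estimate coverage from
        if median(counts[km] for km in ks) < target_depth:
            kept.append(read)
            counts.update(ks)
    return kept


# Toy input: one highly redundant read and one rare read.
reads = ["ACGTACGT"] * 5 + ["TTGGCCAA"]
kept = homogenize(reads, k=4, target_depth=2)
```

The redundant read is retained only until its estimated depth reaches the target, while the rare read is always kept. A production tool would replace the in-memory `Counter` with a compact counting structure to handle large files.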
Improve two existing tools, COV2HTML (published in 2014) and SEQ2HTML (private), and transform two scripts into interactive web interfaces aimed at biologists.
Hydrogen deuterium exchange detected by mass spectrometry (HDX-MS) is a powerful technique to probe the conformation and dynamics of proteins. Over the past 10 years, the HDX-MS workflow has been optimized and automated, leading to a rapid expansion of the technology in both academic labs and pharmaceutical companies. Thanks to these improvements, modern HDX-MS can be applied to investigate more complex biological systems, including large protein complexes and membrane proteins. However, the larger the protein under study, the more complex the HDX-MS data. Several noncommercial and commercial software solutions have been developed to help with the analysis of HDX-MS data. We are currently using DynamX 3.0, a Waters product specifically designed for the nanoACQUITY UPLC system with HDX technology. The aim of the project is to design and implement a statistical tool compatible with the output generated by DynamX to readily validate results obtained with large protein complexes.
DNA topoisomerase IB (Topo IB) enzymes are ubiquitous in eukaryotes, where they represent the major DNA topoisomerase I activity. However, Topo IB sequences are also found in other phyla, such as archaea and bacteria, as well as viruses. Given the large amount of sequenced data available in public databases, this project aims to infer a robust Topo IB gene tree based on a representative set of homologous sequences gathered from a large taxonomic sample.
Provision of a Hub bioinformatician for the bioinformatics analyses of the transcriptome and epigenome
The Transcriptome and Epigenome platform develops high-throughput sequencing projects (collaboration and service) with teams on the campus. These cover all the research themes of the campus and a wide range of organisms (from viruses to mammals). The platform carries out wet-lab activities (library construction and sequencing) and dry-lab activities (bioinformatics and statistical analyses). The person made available will interact closely with the other bioinformaticians of the BioMics pole and of the Hub. Their activities will notably include:
- Taking part in the design and set-up of projects with the requesting teams, carrying out the analyses and reporting to users.
- Setting up a bioinformatics analysis workflow for transcriptome/epigenome data, in close collaboration with the C3BI, the DSI and the other bioinformaticians of the pole. This workflow will cover data quality control, preprocessing, mapping of sequences to reference genomes/transcriptomes, and read counting for the different annotation features.
- Adapting the analysis workflow to the biological questions and organisms studied within the platform's activities.
- Technology and literature watch (testing and validating new analysis tools, updates of existing tools...).
- Setting up and developing analysis tools suited to the platform's future projects: single-cell RNA-seq, metatranscriptomics, ChIP-seq, analysis of splicing isoforms... This will notably be done through dedicated analyses with certain users. The tools set up and validated in this context will then be used for all projects.
- Communication and training (participation in meetings of the France Génomique consortium, continuing education at the Institut Pasteur...).
- Participation in other projects of the BioMics pole (depending on availability).
Bernd Jagla, who was the platform's bioinformatician, joined the Hub on 1 January 2016. Rachel Legendre has been made available since 2 November 2015 and replaces Bernd Jagla. I would like Rachel Legendre to be made available to the platform for a period of at least 2 years.
Development and use of statistical programs to analyze RNA-Seq data produced at the Transcriptome & Epigenome Platform
The Transcriptome & Epigenome Platform is dedicated to the development and use of high-throughput approaches for transcriptomics and epigenomics studies. The platform is accessible to any research team from the Pasteur Institute (80% of the projects) as well as from outside. It is involved (most often as a collaborator) in several projects funded by the ANR, AVIESAN and by the Pasteur Institute in the framework of the PTR programs. Next Generation Sequencing (NGS) based on the Illumina technology (HiSeq 2000/2500 sequencers) is used to perform RNA-sequencing experiments, which generate a large amount of data. After a first step involving bioinformatics, specific statistical methods must be used to analyze the data rigorously. These analyses are most often performed by the statistician(s) of the platform, who are also in charge of literature monitoring.
Development and use of statistical programs to analyze RNA-Seq data produced at the Transcriptome & Epigenome Platform
The Transcriptome & Epigenome Platform is dedicated to the development and use of high-throughput approaches for transcriptomics and epigenomics studies. The platform is accessible to any research team from the Pasteur Institute (80% of the projects) as well as from outside. It is involved (most often as a collaborator) in several projects funded by the ANR, Microbes and Brain, ERANET and by the Pasteur Institute in the framework of the PTR programs. Next Generation Sequencing (NGS) based on the Illumina technology (HiSeq 2000/2500 sequencers) is used to perform RNA-sequencing experiments, which generate a large amount of data. After a first step involving bioinformatics, specific statistical methods must be used to analyze the data rigorously. These analyses are most often performed by the statistician(s) of the platform, who are also in charge of literature monitoring.
The Transcriptome and EpiGenome platform has strong expertise in the bioinformatics and statistical analysis of RNA-seq data. Nevertheless, we receive more and more requests to use NGS to characterize the epigenome (using the ChIP-seq approach) or chromatin accessibility (by ATAC-seq). We thus need to further develop and validate analysis workflows for these types of data. This project aims at developing and formalizing collaboration between the platform and experts in this field at the Hub. This would include joint project kick-off meetings and the development and validation of ChIP-seq and ATAC-seq analysis pipelines (notably including data preprocessing, read mapping, peak calling...).
Development of a web application and new functionalities for the maintenance and curation of iPPI-DB
The project will produce a new version of iPPI-DB, a manually curated database that contains the structure, some physicochemical characteristics, the pharmacological data and the profile of the PPI targets of several hundred modulators of protein-protein interactions.
This new version will include:
- A maintenance application that facilitates and automates database updates, including the computation of the modulators' physicochemical properties and chemical similarity screening on the Galaxy server of the Institut Pasteur.
- A new target-centric mode, based on the mapping of all druggable cavities at the core of PPI interfaces throughout the Protein Data Bank.
We have created an R program for the unsupervised analysis of cytometry files, and we need help optimizing it. We also need advice on optimizing the clustering. We have already met with Hugo Varet.
The ARIA (Ambiguous Restraints for Iterative Assignment) software, developed at the Structural Bioinformatics Unit, automates the treatment of NMR data and protein structure calculation by molecular dynamics simulation. To enhance the visibility of the software, it is necessary to develop a new web interface where users will be able to easily manage their data, perform calculations and analyze the results of the ARIA calculations.
Rationale. The clinical presentation of sepsis is highly variable. In septic patients presenting to emergency departments, the presence of hyperthermia or of other criteria of the systemic inflammatory response syndrome (SIRS) is not sufficient to support a diagnosis of sepsis. Extensive research efforts have led to the proposal of countless sepsis biomarkers, mostly studied in intensive care. Although some, such as procalcitonin (PCT), have reached relatively good predictive performance in the emergency department, their routine use remains controversial. Given the complex pathophysiology of sepsis, a combinatorial approach could achieve performance that is hard to envisage with a single biomarker. Primary objective. To study the statistical performance of a panel of biomarkers of interest, individually and in combination, for the diagnosis of sepsis in the emergency department. Secondary objectives. To study the statistical performance of a panel of biomarkers of interest, individually and in combination, for the diagnosis of severe septic states (severe sepsis and septic shock) and for risk stratification (prediction of admission to intensive care and/or death). Study type. Prospective, non-interventional, single-center cohort study. Patients and inclusion criteria. 300 patients presenting to the emergency department with suspected sepsis, plus 30 healthy subjects. Non-inclusion criteria. Patients under 18 years of age, pregnant women, living conditions that make 28-day follow-up impossible, refusal to participate in the study. Measurements. For each patient, during the initial blood work-up, 3 tubes are collected for subsequent assay of a panel of biomarkers of interest exploring the different biological pathways activated during sepsis.
The goal of the project is to determine whether there are differences in the midgut microbiome of our lab colonies of Aedes aegypti. We frequently observe various phenotypic differences between colonies of mosquitoes, and it is a recurring question whether these phenotypic differences result from differences in the microbiome. We will sequence the microbiome of 6 representative established lab colonies that have been collected from geographically diverse areas and compare the bacterial communities between them. These data will help us dissect the importance of midgut microbiome variation in lab colonies of Aedes aegypti for the phenotypic differences we observe in the lab.
Mood disorders such as bipolar and major depressive illnesses are among the most severe psychiatric disorders. They have a high prevalence and a chronic course, and are associated with significant mental and somatic comorbidities and high personal and societal costs (lost productivity and increased medical expenses). Patients with bipolar disorder (BD), for example, exhibit a reduced lifespan compared with the general population, a finding that cannot be explained solely by high suicide risk, reduced access to medical care and lifestyle factors. However, the pathophysiological mechanisms of BD are poorly understood, and patients often have an incomplete treatment response. Advanced mathematical approaches such as machine learning techniques are increasingly being used to generate predictions based on complex data, and they have been successfully used to detect a number of clinical outcomes and to predict behaviours. In combination with mobile technologies (e.g. smartphones, wearables) to collect behavioural, physiological and environmental data, these big data predictive approaches may provide a much richer and deeper understanding of the phenomenology and pathophysiological mechanisms of mood and bipolar disorders. By taking advantage of the high-standard bioinformatics expertise offered by the C3BI, this multidisciplinary, collaborative project aims to explore how clinical and biological factors may contribute to better characterizing BD patients, as well as to identify predictors of treatment response in BD. Our project also aims to explore how daily behavioural and physiological parameters may influence mood and behaviour in individuals at risk of or suffering from mood disorders.
There exists a broad biodiversity inside the Listeria monocytogenes species, which can be summarized by the existence of evolutionary lineages and more than 100 clonal complexes (CCs or clones) based on core genome multilocus sequence typing (cgMLST), which are geographically and temporally widespread. We aim to link genomic markers to temporal, geographical and sampling origin in order to better understand the ecology and evolution of Listeria monocytogenes.
DISCO-Bac (http://disco-bac.web.pasteur.fr/), a web server, is part of a recent publication, https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-017-3932-y (co-authored by former Hub member Olivia Doppelt-Azoueral, who conceived and implemented the DISCO-BAC database and its sophisticated interface). The main result of the paper is to show the widespread existence of small peptides across prokaryotes, the predictions being accessible, along with context information, through DISCO-BAC. We have been informed by DSI that "The app is curretly hosted in the wrong zone (for legacy reasons) and you'll have to reinstall it on another VM." The 2017 paper has already been cited 6 times in 2018: https://scholar.google.fr/scholar?cites=4133998924229445176&as_sdt=2005&sciodt=0,5&hl=en, and we believe that its content is only adequately visible through an interface such as DISCO-Bac (on which Olivia has done a great job).
Hi-C contact maps reflect the relative contact frequencies between pairs of genomic loci, quantified through deep-sequencing. Differential analyses of these maps facilitate downstream biological interpretations. However, the multi-fractal nature of the DNA polymer inside the cellular envelope results in frequency values spanning several orders of magnitude: contacts involving loci pairs at large genomic distance are much sparser compared to closer pairs. The same is true for poorly covered regions such as telomeres and repeated sequences. Poor coverage translates into low signal-to-noise ratios. There is no clear consensus to address this limitation. We present a fast, flexible procedure operating on simple data that takes into account the contacts in each region of a contact map. Binning is performed only when necessary on noisy regions, preserving informative ones. This results in high-quality, low-noise contact maps that can be conveniently visualized for rigorous comparative analyses.
Development and design of new functionalities for MEMHDX, a web application dedicated to the statistical analysis and visualization of large HDX-MS datasets.
Hydrogen Deuterium eXchange followed by Mass Spectrometry (HDX-MS) is a recognized biophysical tool in structural biology capable of probing protein/ligand interactions, conformational changes, and protein folding and dynamics. Over the last decade, major improvements in the technology have been made (e.g., refrigerated UHPLC systems, mass spectrometers with enhanced resolution and sensitivity…), allowing the structural analysis of highly challenging biological systems. The characterization of such biological systems results in very complex HDX-MS datasets for which specific analytical software is needed. In this context, our group and the C3Bi have developed “MEMHDX” (Mixed-Effects Model for HDX experiments) to aid in the rapid statistical validation and the visualization of large HDX-MS datasets. This web application is freely accessible to the HDX-MS scientific community at the project home page http://memhdx.c3bi.pasteur.fr. The current version of the application allows for the comparison of two conditions using only one charge state. This limitation has been pointed out by several MEMHDX users. The current project aims at designing and implementing new functionalities in MEMHDX to enhance its analytical capabilities. The ability to compare multiple HDX-MS conditions using multiple charge states will be introduced in the web application, and the visualization tool provided by MEMHDX will be modified accordingly.
Providing correlationPlus software to the scientific community for analysis of dynamical correlations in biological macromolecules
Molecular dynamics simulations and elastic network models are two widely used computational methods for investigating the dynamics of biological macromolecules. These methods can reveal dynamical correlations between residues, nucleotides, domains and chains of biological macromolecules. Even though analyses of these correlations are employed frequently, there is no application or API that facilitates their analysis and visualization. A coherent API/app can accelerate the analysis process and reveal details of allosteric interactions. We developed a Python package called correlationPlus that can facilitate and accelerate dynamical correlation analyses. The package contains both an API and a command line interface. It analyzes raw dynamical correlation maps and plots 2D heatmaps. It can extract the correlation map of individual chains automatically. The correlations can be projected onto PDB structures with correlationPlus and visualized with the popular molecular visualization software VMD. Several studies have shown that graph theoretical analysis of dynamical correlations can reveal active sites and domains within proteins. correlationPlus provides a pure-Python framework to calculate several graph theoretical centrality measures, such as degree, betweenness, closeness, current flow closeness and current flow betweenness. In addition to 2D figures of the centralities, the centrality measure in question can also be projected onto the protein structure with correlationPlus for 3D inspection in VMD. To make the correlationPlus app and API available to the scientific community, we need to package it and make it distributable. Like much scientific software, correlationPlus depends on many excellent libraries such as numpy, matplotlib and prody. Installing correlationPlus with pip and/or conda would satisfy these requirements automatically, so that end-users can analyze dynamical correlations rapidly. Unfortunately, we do not have any expertise in the packaging and distribution of Python packages. As a result, we need the technical expertise of the C3BI for packaging correlationPlus and making it distributable to the scientific community.
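As a rough illustration of the graph-theoretical analysis mentioned above, the sketch below computes a normalized degree centrality from a small correlation map. It does not use or reproduce the correlationPlus API: the function name, the absolute-value threshold used to define edges and the toy matrix are all assumptions made for this example.

```python
def degree_centrality(corr, threshold=0.5):
    """Normalized degree centrality of each residue in a correlation map.

    Two residues are treated as connected when the absolute value of their
    correlation is at least `threshold` (an assumed edge definition).
    """
    n = len(corr)
    centrality = []
    for i in range(n):
        degree = sum(
            1 for j in range(n)
            if j != i and abs(corr[i][j]) >= threshold
        )
        # Divide by the maximum possible degree so values lie in [0, 1].
        centrality.append(degree / (n - 1) if n > 1 else 0.0)
    return centrality


# Toy symmetric correlation map for four residues (made-up values).
corr = [
    [1.0, 0.8, 0.1, 0.6],
    [0.8, 1.0, 0.2, 0.3],
    [0.1, 0.2, 1.0, 0.7],
    [0.6, 0.3, 0.7, 1.0],
]
centrality = degree_centrality(corr, threshold=0.5)
```

In this toy map, residues 0 and 3 each have two strongly correlated partners and therefore score higher than residues 1 and 2. The other measures listed above (betweenness, closeness, current flow variants) need a graph library and are not sketched here.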
Track Analyzer is a Python-based data visualization pipeline for tracking data. It does not perform any tracking, but takes any kind of tracked data as input. It analyzes trajectories by computing standard parameters such as velocity, acceleration, diffusion coefficient, and divergence and curl maps. The pipeline also offers trajectory visualization in 2D (and soon in 3D rendering), with a selection tool that supports fate mapping and back-tracking.
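The velocity and acceleration computations mentioned above can be sketched with simple finite differences. This is a minimal illustration and not Track Analyzer's own code: the `(t, x, y)` track layout and the function names are assumptions.

```python
import math


def velocities(track):
    """Finite-difference velocity (vx, vy) between consecutive points.

    `track` is assumed to be a list of (t, x, y) tuples sorted by time.
    """
    out = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        out.append(((x1 - x0) / dt, (y1 - y0) / dt))
    return out


def speeds(track):
    """Magnitude of the velocity on each segment of the track."""
    return [math.hypot(vx, vy) for vx, vy in velocities(track)]


def accelerations(track):
    """Change in velocity between consecutive segments, divided by the
    time separating the segment midpoints."""
    vs = velocities(track)
    out = []
    for i in range(len(vs) - 1):
        dt = (track[i + 2][0] - track[i][0]) / 2.0
        out.append(((vs[i + 1][0] - vs[i][0]) / dt,
                    (vs[i + 1][1] - vs[i][1]) / dt))
    return out


# Toy trajectory: uniform motion along the direction (3, 4).
track = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0), (2.0, 6.0, 8.0)]
track_speeds = speeds(track)
track_acc = accelerations(track)
```

For this uniform toy trajectory the speed is constant and the acceleration is zero. Real tracks are noisy, so smoothing before differentiation is usually advisable.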
With an estimated 10^31 particles on earth, bacteriophages are the most abundant genomic entities across all habitats and important drivers of microbial communities. Growing evidence suggests that they play roles in the homeostasis of the human intestinal microbiota, and recent metagenomic studies of the viral fraction of this ecosystem have provided crucial information about their diversity and specificity. However, the bacterial hosts of this viral fraction, information that is needed to further characterize the balance of these ecosystems, remain poorly characterized. Here we unveil, using an enhanced metagenomic Hi-C approach, a large network of 6,651 host-phage relationships in the healthy human gut, making it possible to study the phage-host ratio in situ. We notably found that half of these contigs appear to be sleeping prophages, whereas ¼ exhibit a higher coverage than their associated MAG, representing potentially active phages impacting the ecosystem. We also detect different candidate members of the crAss-like phage family as well as their bacterial hosts, showing that these elusive phages infect different genera of Bacteroidetes. This work opens the door to single-sample analysis and the concomitant study of phages and bacteria in complex communities.
Autism Spectrum Disorder (ASD), a disorder of social communication with restricted and stereotyped interests, represents a major societal challenge with its prevalence of 2.93% (Baio et al., 2018). Since genetic factors have been identified, their links with phenotypic expression (incl. neuroanatomical aspects) are being explored (Bourgeron et al., 2015). The first neuroanatomical studies are promising, and are reinforced by the development of interaction neurosciences, notably with high-density electroencephalography (HD-EEG). Among different EEG markers, the alpha rhythm (AR), an indirect marker of neuronal synchronization (Hummel et al., 2002), is described as different in ASD, especially during social interaction tasks at the parietal level and the upper temporal sulcus, areas involved in the "mirror neuron system" (Rizzolatti & Craighero, 2004), which has been described as “broken” in ASD (Oberman et al., 2005). More specifically, in a sub-band of the AR, at the frontal and parietal levels, AR anomalies suggest a greater involvement of visual attention (Dumas et al., 2014). Differences are also described over low and high EEG frequencies in ASD (Kozhushko et al., 2018). We hypothesize that integrating spectral, spatial and dynamic (involving ASD symptoms) dimensions would identify EEG markers that highlight the underlying electrophysiological mechanisms of ASD and also support an early and accurate diagnosis (Engeman et al., 2015, 2018), which is key to a favorable prognosis in ASD.
We have recently developed and published the latest version of iPPI-DB (https://ippidb.pasteur.fr/), our database of protein-protein interaction modulators. Thanks to the group of Hervé Ménager, the database is now hosted at Institut Pasteur and has been completely remodeled, with a new web interface. As for many other databases, its success and interest for the community rely on the constant input of new data. For this, we also designed a contributor mode that allows anyone to add published data from the literature using the iPPI-DB interface directly. In order to properly manage the visibility of our contributors, we would need to add a contributor management webpage. As contributors are asked to log in using their ORCID ID, the idea is to use this ID to present the individual contributions of each contributor. The page could first acquire some data from the ORCID website, and also neatly summarize the publications that each contributor has entered in iPPI-DB. The goal is evidently to convince more and more people to help us maintain iPPI-DB. As now,
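A contributor page keyed on ORCID iDs could start from something like the sketch below, which validates an iD with the ISO 7064 MOD 11-2 check digit that ORCID uses and groups entries per contributor. It is an illustration only: the function names, the `(orcid, publication)` entry layout and the placeholder publication labels are assumptions, not part of iPPI-DB.

```python
from collections import defaultdict


def orcid_check_digit(base_digits: str) -> str:
    """ISO 7064 MOD 11-2 check character for the first 15 digits of an iD."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """True if the string is 16 characters (hyphens ignored) with a
    correct final check character."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15].upper()


def contributions_by_orcid(entries):
    """Group publication labels by contributor, skipping invalid iDs.

    `entries` is an iterable of (orcid, publication) pairs (assumed layout).
    """
    grouped = defaultdict(list)
    for orcid, publication in entries:
        if is_valid_orcid(orcid):
            grouped[orcid].append(publication)
    return dict(grouped)


# Placeholder entries; 0000-0002-1825-0097 is ORCID's documented sample iD.
entries = [
    ("0000-0002-1825-0097", "pub-A"),
    ("0000-0002-1825-0097", "pub-B"),
    ("0000-0002-1825-0098", "pub-C"),  # wrong check digit, ignored
]
summary = contributions_by_orcid(entries)
```

Validating the check digit locally catches typos before any request is sent to the ORCID website.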
We have developed a computer tool, named InDeep, that relies on 3D fully convolutional neural networks to predict functional binding sites at the surface of proteins. These functional binding sites can take two forms: either an epitope binding site (location of a protein-protein interaction) or a druggable binding site (location for the binding of a future drug). The tool is already used on campus in several structural biology and drug discovery projects to support the identification of chemical probes with therapeutic purposes. This includes SARS-CoV2 projects in collaboration with Fabrice Agou, Félix Rey and Marc Delarue. In its present form, InDeep relies on GPU calculations and has to be run from the command line in a Linux shell; although a PyMOL plugin has also been developed, it requires some software to be installed beforehand. This impedes its use by a wider audience, especially the community for whom it was designed, namely biologists and chemists. The purpose of this project would be to design a web interface, backed by a GPU cluster on campus, to allow the use of InDeep even by non-computer specialists. This would also be the opportunity to add intuitive functionalities (3D structure visualization, metrics, cross-references to well-established databases) to assist users in identifying pertinent functional binding sites.
Last year, we developed the ARIAweb server for automated NMR structure calculation. The server was well received by the community (200+ users and ~1400 jobs performed as of today). ARIAweb offers an interface for data conversion and interactive setup of ARIA calculations. Some years ago, developers of NMR-related software agreed on a new standard for the storage of NMR data used in structure calculation, called NEF (NMR Exchange Format). The Structural Bioinformatics Unit, as the developer of ARIA, committed to adhere to this new standard. A version of ARIA has been developed to read and write NEF files, which will allow easy exchange of data and the construction of pipelines between various NMR software packages, all using the same NEF format. Since ARIAweb is the only online service for structure calculation from NMR data, we would like to allow other software/servers using NEF to communicate more easily with ARIAweb. The purpose of this project is to develop an application programming interface (API) within the Django framework of ARIAweb to allow for i) user authentication, ii) upload of a NEF file along with a few parameters in JSON format, iii) job submission and status tracking and iv) retrieval of results. This API will make ARIAweb a new central hub for other NMR applications (such as the CCPN software suite, https://www.ccpn.ac.uk/v3-software/about).
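The four operations listed above can be pictured with a toy in-memory model. This sketch is not the ARIAweb API and does not use Django: the class, method names, token scheme, job states and the `iterations` parameter are all hypothetical, and the plain-text password check exists only to keep the example short.

```python
import json
import secrets


class JobApiSketch:
    """Toy in-memory model of the four operations (hypothetical design)."""

    def __init__(self, users):
        self._users = dict(users)  # username -> password (toy only)
        self._tokens = {}          # token -> username
        self._jobs = {}            # job id -> job record

    def authenticate(self, username, password):
        """i) Return a session token for valid credentials."""
        if self._users.get(username) != password:
            raise PermissionError("invalid credentials")
        token = secrets.token_hex(16)
        self._tokens[token] = username
        return token

    def _owner(self, token):
        if token not in self._tokens:
            raise PermissionError("unknown token")
        return self._tokens[token]

    def submit(self, token, nef_text, params_json):
        """ii) and iii) Store NEF content and JSON parameters, queue a job."""
        owner = self._owner(token)
        params = json.loads(params_json)  # raises ValueError if malformed
        job_id = len(self._jobs) + 1
        self._jobs[job_id] = {"owner": owner, "nef": nef_text,
                              "params": params, "status": "queued",
                              "results": None}
        return job_id

    def _job(self, token, job_id):
        owner = self._owner(token)
        job = self._jobs.get(job_id)
        if job is None or job["owner"] != owner:
            raise KeyError("no such job for this user")
        return job

    def status(self, token, job_id):
        """iii) Report the current state of a job."""
        return self._job(token, job_id)["status"]

    def finish(self, job_id, results):
        """Stand-in for the compute backend reporting completion."""
        self._jobs[job_id].update(status="done", results=results)

    def results(self, token, job_id):
        """iv) Return results once the job is done."""
        job = self._job(token, job_id)
        if job["status"] != "done":
            raise RuntimeError("job not finished")
        return job["results"]


# Placeholder user, NEF content and parameters.
api = JobApiSketch({"alice": "s3cret"})
token = api.authenticate("alice", "s3cret")
job_id = api.submit(token, "data_example\n", json.dumps({"iterations": 8}))
```

In a real service each method would map to an authenticated HTTP endpoint, and job state would live in the database rather than in memory.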
Software installation on a Linux virtual machine deployed on 16 workstations at the teaching center (room 5)