PANGAIA – Pan-genome Graph Algorithms and Data Integration
Genomes are strings over the letters A, C, G, T, which represent nucleotides, the building blocks of DNA. Ever more numerous and rapidly advancing genome sequencing devices are producing ultra-large amounts of genome sequence data; the accrued sequencing data are now reaching the exabyte scale. The driving, urgent question is: how can we arrange and analyze these data masses in a formally rigorous, computationally efficient and biomedically rewarding manner?
Graph-based data structures have been reported to offer disruptive benefits over traditional sequence-based structures when representing pan-genomes, that is, sufficiently large, evolutionarily coherent collections of genomes. This idea has its immediate justification in the laws of genetics: evolutionarily closely related genomes differ in only a relatively small number of letters, while sharing the majority of their sequence content. Graph-based pan-genome representations, which remove redundancies without discarding individual differences, therefore make eminent sense. In this project, we will put this paradigm shift (from sequence-based to graph-based representations of genomes) into full effect. As a result, we can expect a wealth of practically relevant advantages, most fundamentally in the arrangement, analysis, compression, integration and exploitation of genome data. In addition, we will open up a significant source of inspiration for computer science itself.
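As a minimal illustration of the idea above, the following Python sketch stores two toy genomes that differ in a single letter as paths through a small sequence graph. The node layout, the genome names and the helper functions are assumptions made for this example only; they are not the data structures developed in the project.

```python
# Toy sequence graph (illustrative assumption, not the project's data model).
# Each node stores a DNA segment exactly once.
nodes = {1: "ACGT", 2: "A", 3: "T", 4: "CGGA"}

# Each genome is an ordered walk over node ids; shared segments are reused.
paths = {
    "genome_a": [1, 2, 4],
    "genome_b": [1, 3, 4],
}


def spell(path, nodes):
    """Reconstruct a genome by concatenating the segments along its path."""
    return "".join(nodes[node_id] for node_id in path)


def stored_letters(nodes):
    """Number of letters the graph has to store (each segment counted once)."""
    return sum(len(segment) for segment in nodes.values())


def linear_letters(paths, nodes):
    """Number of letters needed to store every genome as a separate string."""
    return sum(len(spell(path, nodes)) for path in paths.values())


if __name__ == "__main__":
    for name, path in paths.items():
        print(name, spell(path, nodes))
    print("letters in graph:", stored_letters(nodes))
    print("letters as separate strings:", linear_letters(paths, nodes))
```

Both genomes remain fully recoverable from their paths, while the shared segments are stored only once; the single differing letter is kept as two alternative nodes.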
Development and testing of molecular and informatic tools for effective characterisation and interpretation of clinically relevant microsatellite repetitive motifs from genomic data
The proposed project is based on the recognition that: (i) the genomic material of each person contains an immense amount of health-related information; (ii) the usability of this genomic information depends on our ability to identify genomic variants; (iii) microsatellite motifs (STRs) play an important role in various physiological and pathological processes of the human organism. Although STRs are the most variable loci of our genome, their variability is still very poorly described. This is mainly due to a lack of tools that allow their accurate and comprehensive evaluation from large-scale genomic data sets. The aim of our project is, therefore, to examine how STR motifs can be characterised from whole-genome sequence data, while developing and validating molecular-genetic approaches and a bioinformatics tool capable of processing data derived from massively parallel sequencing. From these data, the developed tool should be able to extract clinically relevant information, such as the number and exact sequence of repeats in particular alleles, the phase of individual parts of complex motifs, signs of motif instability, and the presence of possible pathological repeat expansions. As a clinical model we chose two main patient groups: 1) patients with a molecularly confirmed diagnosis of a disease caused by expansions of STR motifs (myotonic dystrophy type 1 and 2, Huntington’s disease and fragile X syndrome); 2) patients with Lynch syndrome, in whom microsatellite instability is an important clinical biomarker. Using the generated data, together with data from appropriate conventional validation methods and other already available tools, we plan to perform a comprehensive statistical validation and characterisation of the reliability, accuracy and practical applicability of our newly developed tool in specific areas of biomedicine and personalized healthcare.
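To make the notion of counting repeats concrete, the following Python sketch finds the longest uninterrupted run of a given motif in a single sequence. It is a simplified assumption of how such a measurement could look, not the tool described above: the function name, the toy read and the `threshold` parameter are hypothetical, and the threshold value is purely illustrative rather than a clinical cut-off.

```python
import re


def longest_repeat_run(sequence, motif):
    """Return (repeat_count, start_index) of the longest uninterrupted
    tandem run of `motif` in `sequence`, or (0, -1) if the motif is absent."""
    best_count, best_start = 0, -1
    # Greedily match one or more back-to-back copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    for match in pattern.finditer(sequence):
        count = len(match.group(0)) // len(motif)
        if count > best_count:
            best_count, best_start = count, match.start()
    return best_count, best_start


def exceeds_threshold(sequence, motif, threshold):
    """Flag a sequence whose longest run reaches a caller-supplied threshold.
    The threshold is an input of this sketch, not a validated clinical value."""
    count, _ = longest_repeat_run(sequence, motif)
    return count >= threshold


if __name__ == "__main__":
    # Toy read: five CAG copies, an interrupting CAA, then two more CAG copies.
    read = "TTG" + "CAG" * 5 + "CAA" + "CAG" * 2 + "CCG"
    print(longest_repeat_run(read, "CAG"))
    print(exceeds_threshold(read, "CAG", threshold=4))
```

The sketch deliberately treats the interrupting CAA as the end of a run. A real tool would also have to handle interrupted and complex motifs, sequencing errors, phasing, and reads that do not span the whole repeat.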
Establishment of Competence centre for research and development in the field of molecular medicine
The aim of the project is to establish and operate a state-of-the-art joint research center, which interacts with the academic and business sectors in molecular medicine and manages and protects intellectual property rights and technology transfer.
REVOGENE – Research centre for molecular genetics
The aim of the project is to create a research center of molecular genetics – REVOGENE, which will enable research in the field of Next-Generation Sequencing (NGS). The results of this research can be used, e.g., in clinical genetics, microbiology and virology, neonatology, oncology, pharmacology, transplantology, pathology and many other fields.
The project and the research center it establishes will enable applied research of international quality in molecular genetics, in connection with biotechnology, progressive materials and knowledge technologies, with the support of ICT.
The direct outcome of the project will be new NGS methods and procedures for sequencing genomes, exomes, transcriptomes and metagenomes, with a focus on molecular medicine, which will make it possible to apply the knowledge gained in specialized molecular genetic diagnostics.
Analysis of cell-free fetal DNA obtained from the saliva of pregnant women and its potential use in non-invasive prenatal DNA diagnostics
During the project, the planned milestones that allowed gradual, successful progress were reached. These included, in particular: creation of a biobank of saliva and blood samples from pregnant women; identification of the optimal saliva sampling procedure; optimization of the protocol for isolating DNA from saliva and control blood samples; identification of the optimal method for concentrating the isolated circulating DNA; determination of the potential impact of external DNA contamination on the analysis of saliva samples; design and implementation of a procedure for determining the level of fragmentation of fetal DNA in the maternal circulation; identification of a method for detecting external contamination of circulating DNA; introduction of procedures to identify paternal alleles of STR polymorphisms in circulating DNA; introduction of whole-genome analysis allowing the detection of fetal alleles of SNP polymorphisms; identification of the possibility of co-isolating circulating DNA and miRNA; and detection of fetal DNA in the exosomal fraction of plasma and serum of pregnant women. However, despite the complexity and extent of the results obtained, it was not possible to confirm the presence of fetal DNA in maternal saliva beyond doubt. Nevertheless, the results of whole-genome DNA sequencing carried out in the final stages of the project indicate that there is still reason to assume that fetal DNA is present in the saliva of pregnant women. A validation study based on whole-genome sequencing of a larger set of samples, obtained from pregnant women shortly before delivery, is therefore currently under way. It should soon bring definitive results that confirm or refute the hypothesis that fetal DNA is present in the saliva of pregnant women.
Use of fetal DNA in maternal plasma for non-invasive prenatal diagnostics
In the project, more than 450 samples from pregnant women were analyzed. During the preanalytical phase, the EDTA Vacutainer blood collection system was successfully validated. A direct comparison of kits used for the isolation of cell-free fetal DNA (cffDNA) was performed, and the best results were achieved with the Qiagen CNA Kit and the Qiagen DSP Virus Kit. Three different methods of cffDNA concentration were tested and then used in Y-STR multiplex PCR analyses and in cffDNA quantification using a single-copy SRY detection system. A protocol for a highly sensitive assay based on a multicopy DYS detection system was optimized and implemented in routine clinical practice as a non-invasive prenatal gender test. Building on the know-how gained and on the sensitive DYS-based detection system, a technology for detecting contamination of cffDNA with high-molecular-weight DNA was designed and tested. The potential for patenting this contamination detection procedure is currently under review. After optimization of the RhD detection system on artificially prepared samples, clinical validation of a non-invasive prenatal test of fetal RhD status was launched. Following recent research in the field, miRNA analysis of maternal plasma was introduced into the project in order to identify a molecular marker that positively determines a female fetus. In the pilot miRNA study, 6 candidates were identified, and their applicability to the positive identification of female fetuses in the non-invasive fetal gender test is in the validation phase. The results of the pilot studies performed during the project are currently being validated on more than 250 additional samples from pregnant women, 90 of them from women with a pathological course of pregnancy, for which cffDNA quantification is also in progress.
Optimization of the isolation, purification and analysis of fetal DNA isolated from maternal peripheral blood