Apply for a position in our exciting research on “ALgorithms for PAngenome Computational Analysis” (ALPACA), and become part of the next generation of experts in computational pangenomics!


Starting date: September 2021

Application deadline: February 15, 2021


  • Do you have a background in computer science, mathematics, bioinformatics or artificial intelligence / data science?
  • Are you interested in exploring the individual variation of entire evolutionary ensembles, such as particular species (humans, plants), pathogens (viruses, bacteria), or certain types of cancer?
  • Would you like to become part of the next generation of experts in the design of data structures that support the safe arrangement and the efficient and biomedically useful exploitation of exabytes of individual genetic data?
  • Have you just graduated or are you in the first four years of your research career?
  • Are you interested in doing your PhD in a joint network of high-profile research institutes across Europe under the guidance of renowned supervisors?

Apply for a position in our exciting research on “ALgorithms for PAngenome Computational Analysis” (ALPACA), and become part of the next generation of experts in computational pangenomics!

Funded by the European Commission through the Horizon 2020 Marie Skłodowska-Curie ITN Programme, the ALPACA network offers a high-level fellowship for joint research on new data structures, algorithms and statistical / machine learning approaches to store, arrange, process and analyze millions of individual genomes and genetic profiles. The most talented and motivated students will be selected for advanced multidisciplinary research training, preferably starting September 2021.


Genomes are strings over the letters A, C, G, T, which represent nucleotides, the building blocks of DNA. Genome sequence data now emerge from ever more numerous and rapidly advancing sequencing devices and already amount to exabytes. In view of this, the driving question is:

“How can we arrange and analyze these data masses in a computationally / mathematically / statistically appropriate way such that we can redeem the biomedical promises of these data masses, with respect to understanding cancer, rare genetic diseases, and the development, the virulence and resistance patterns of pathogens?”

The individual variation that affects large collections of evolutionarily related genomes (termed pan-genomes) follows patterns that the laws of genetics systematically imply. This explains why graph-based data structures, which highlight the individual variation while summarizing redundancies in a compact way, have clear benefits over the naive idea of storing genomes as strings.
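As a toy illustration of this point (a minimal sketch, not one of the data structures developed in ALPACA), the following Python snippet stores a small pan-genome as a chain of shared segments and single-letter "bubbles", then compares its size with storing every genome as a separate string. The input format (`snps` as a mapping from 0-based reference positions to alternative letters) and all function names are assumptions made for this example only.

```python
from itertools import product


def build_variation_graph(reference, snps):
    """Build a simplified variation graph as a chain of columns.

    Each column is a list of alternative segments. Shared stretches of the
    reference are stored once; each variant position becomes a small bubble.
    `snps` maps a 0-based reference position to a set of alternative letters
    (hypothetical input format chosen for this sketch).
    """
    columns = []
    start = 0
    for pos in sorted(snps):
        if pos > start:
            columns.append([reference[start:pos]])  # shared segment, stored once
        columns.append(sorted({reference[pos]} | set(snps[pos])))  # bubble
        start = pos + 1
    if start < len(reference):
        columns.append([reference[start:]])
    return columns


def spell_genomes(columns):
    """Enumerate every genome spelled by a path through the chain."""
    return ["".join(path) for path in product(*columns)]


def stored_characters(columns):
    """Count the letters the graph representation actually stores."""
    return sum(len(segment) for column in columns for segment in column)


reference = "ACGTACGTAC"
snps = {2: {"T"}, 7: {"A", "C"}}

graph = build_variation_graph(reference, snps)
genomes = spell_genomes(graph)

print("letters stored in graph:  ", stored_characters(graph))
print("letters stored as strings:", sum(len(g) for g in genomes))
```

In this example the graph stores far fewer letters than the collection of full strings, and the gap widens as more genomes share the same segments. Real pan-genome graphs also have to handle insertions, deletions, rearrangements and which paths correspond to observed individuals, which this sketch deliberately ignores.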

However, although they have shown great promise, research on graph-based data structures that can capture (exabytes of) genetic data is still in its infancy. The goal of the research conducted in this project is to make substantial progress in the design and development of such graph-based data structures.


The move from sequence- to graph-based pan-genome data structures is unavoidable when seeking to exploit the wealth of genetic data rather than leaving devices massively congested. Putting this paradigm shift (from sequences to graphs) into effect requires new ways of thinking about genomes, as well as computer programs and mathematical models that reflect this.

However, developing, maintaining and computationally / statistically exploiting graph-based pan-genomes requires skills that present-day education does not yet provide. The goal of this project is to train a new ‘class’ of researchers / operators / administrators who are able to handle the (exabyte-scale) masses of genome data using the progressive, graph-based approaches that this project investigates.

ESRs (early-stage researchers, i.e. PhD students) will carry out the corresponding research at Geneton Ltd. and Comenius University, Faculty of Natural Sciences. To acquire further practical skills, ESRs will also spend time at partner institutions, including leading industrial players, during month-long secondments.

For more information on the available position and a more detailed project description, please visit our ALPACA website.


  • You will get the chance to participate in specially developed lectures and courses (e.g. on specific techniques, academic soft skills, etc.).
  • Already at an early stage in your career, you can start building your personal professional network because your PhD project is embedded in a high-profile consortium, encompassing renowned universities and innovative companies (such as EMBL-EBI, BaseClear, Institut Pasteur, University of Cambridge, INRIA (Institut National de Recherche en Informatique et en Automatique), Cornell University, DNAnexus).
  • You will be exposed to research in a non-academic environment, by spending one month (or more) in the non-academic sector. This will sharpen your understanding of strategies, requirements and skills for research in business environments.
  • You will have the opportunity to spend time and perform research with other members of the consortium. This will broaden your horizons with respect to related scientific disciplines, techniques and alternative philosophies for pursuing scientific goals.
  • You will be advised by excellent group leaders who are outstanding members of their research communities and approved, experienced PhD supervisors.


To comply with the funding rules of the Horizon 2020 Marie Skłodowska-Curie programme:

  • You qualify as an Early Stage Researcher, meaning that – on the starting date of your employment with the host institute – you are in the first four years of your research career and have not (yet) been awarded a doctoral degree.
  • You have not resided or carried out your main activity (study, work, etc.) in the country where the position is announced for more than 12 months during the 3 years prior to the starting date of your employment with the respective host institute.

In addition:

  • You have research experience in one or more fields relevant to the ALPACA project, such as computer science, mathematics, bioinformatics, molecular biology, artificial intelligence / data science, or a combination thereof.
  • You are proficient in English (academic level).

Please also note that, although particular expertise is always a plus, no prerequisites are required beyond basic training in computer science / mathematics / statistics / bioinformatics.


You may apply for a position until February 15, 2021, 23:59 (CET) / before February 16, 2021 by sending a professional CV and motivation letter to

Only complete applications that are submitted before the application deadline and according to the procedure described on the ALPACA website will be considered.


Marie Skłodowska-Curie (MSC) projects offer highly competitive and attractive salaries and working conditions:

  • Selected candidates will have a full-time employment contract for a duration of 36 months.
  • The salaries during the EU-funded period of the fellowship comply with the H2020 Marie Skłodowska-Curie Work Programme. More information can be found in the Information note for Marie Skłodowska-Curie Fellows in Innovative Training Networks (ITN) (version 2, 11/11/2019).
  • For selected candidates, the general standards and conditions of the host institute will apply, so ESRs are equivalent to regular PhD students in terms of generally applicable terms and regulations.

ALPACA is committed to the principles of the European Charter and Code of Conduct for the Recruitment of Researchers.

ALPACA has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under the Marie Skłodowska-Curie grant agreement No 956229.
Unsolicited marketing is not appreciated.