Good Leaver

Signalling network inference from paired phosphoproteomics/transcriptomics datasets

Posted 2 years ago


The European Bioinformatics Institute is an international, innovative and interdisciplinary research organisation funded by European states. The EBI’s goal is to help scientists realise the potential of big data in biology, exploiting complex information to make discoveries that benefit humankind.  

Project Supervisor: Dr. Evangelia Petsalaki

Project location: Remote 

Project Requirements 

  • Experience identifying signals from large datasets 
  • Solid foundational understanding of machine learning/statistics 
  • Bioinformatics knowledge useful but not essential  


Cell signalling describes the processes that occur in a cell in response to changes in its environment. These processes are controlled by proteins that interact with each other. Modification of these proteins by the addition of a phosphate group (phosphorylation) regulates these interactions and drives signal transduction. The end result of a cell signalling process is typically (but not always) a change in the activity of proteins called transcription factors. These then change the protein content of the cell, and therefore its functions and behaviour, by driving changes in the expression of the genes from which the proteins are produced.


These processes underlie all cell functions and most diseases. Understanding how proteins interact during cell signalling to change cell behaviour is crucial for understanding cell function and disease development, and for developing therapeutics.

Phosphoproteomics datasets are collected using mass spectrometry and aim to identify the phosphorylated proteins in a sample, i.e. they capture a snapshot of cell signalling. Transcriptomics datasets are collected using next-generation sequencing (current) or microarrays (older technology) and show which genes are expressed in a sample. They can also be used to infer transcription factor activities.

Inferring signalling networks from either type of dataset alone is problematic. The data are sparse and noisy, phosphoproteomics struggles to detect low-abundance proteins such as transcription factors, and gene expression does not always correlate with protein abundance or phosphorylation in cells.

There have been several attempts to infer signalling networks from phosphoproteomics datasets, but they perform poorly, partly because the problem is poorly defined: there are thousands of phosphoproteins, and typically only a handful of conditions from which to infer the relationships between them.

In this project we propose using the transcriptomics datasets to constrain the test space, and using statistical approaches to integrate parts of the network that are not measured by phosphoproteomics, to improve the inference of signalling networks from omics data.
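
To give a rough sense of why constraining the test space matters, the sketch below counts how many pairwise relationships would have to be tested among a set of phosphoproteins, before and after restricting attention to a smaller subset. The sizes (5000 and 200), the function names, and the use of a Bonferroni correction are all illustrative assumptions and are not part of the project description.

```python
from math import comb


def candidate_pairs(n_proteins: int) -> int:
    """Number of unordered protein pairs that could be tested for a relationship."""
    return comb(n_proteins, 2)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical sizes, chosen only for illustration.
all_phosphoproteins = 5000  # stands in for "thousands of phosphoproteins"
constrained_subset = 200    # assumed subset retained after applying constraints

full = candidate_pairs(all_phosphoproteins)
reduced = candidate_pairs(constrained_subset)

print("unconstrained:", full, "pairs, threshold", bonferroni_threshold(0.05, full))
print("constrained:  ", reduced, "pairs, threshold", bonferroni_threshold(0.05, reduced))
```

Under these assumed sizes, the unconstrained case has 12,497,500 candidate pairs and the constrained case has 19,900, so each remaining test faces a far less stringent multiple-testing threshold. With only a handful of conditions, that difference can decide whether any relationship is detectable at all.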

Apply Online
