What I do: building gene trees

Since October last year I’ve been working as a PhD student at the Theoretical Biology group of the Faculty of Science at Utrecht University. I actually work for the Physiological Chemistry group of Prof. dr. Bos at the Academic Medical Centre, but that’s another story… Below I will explain a bit about what I am doing with my current project. I will try to keep it as uncomplicated as possible.

My project involves studying the evolution of signaling pathways in eukaryotes, and specifically trying to understand how new signaling pathways emerge. A signaling pathway is a chain of events in the cell, carried out by proteins, which has evolved to ‘let the cell know’ what happens outside the cell so it can react accordingly.

My project is based on the observation that the complex eukaryotes (like us) have had many duplications of key proteins, and that the copies have gained their own functions and regulate different processes. My job, in short, is to find out approximately when, why and how this happened for some specific protein families.

I can build a ‘phylogenetic’ profile of genes by aligning the homologous parts of their sequences (parts that are similar because they share a common origin), so that I can see the small differences caused by mutations over time. These differences tell me which genes are more closely related than others, because closely related genes look more ‘alike’: a human gene will resemble the corresponding mouse gene (both are mammals) more than the corresponding gene in a plant. From the sequences you can calculate the ‘genetic’ distance between genes and construct a tree. In the ideal case this would give me a genealogy of genes that looks much like the ‘tree of life’ of species as we now understand it.
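To make the idea of ‘genetic’ distance a bit more concrete, here is a minimal sketch in Python. It is not the method I actually use: the three sequences are made up, the distance is simply the fraction of aligned positions that differ, and the tree is built with plain average-linkage clustering (UPGMA), which is about the simplest tree-building method there is. The names ALIGNED, p_distance and upgma are just for this illustration.

```python
from itertools import combinations

# Toy, made-up aligned sequences of equal length; not real gene data.
ALIGNED = {
    "human": "ATGGCCAAGT",
    "mouse": "ATGGCTAAGT",
    "plant": "ATGACTTAGA",
}


def p_distance(a, b):
    """Fraction of aligned positions that differ, skipping gap columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(seqs):
    """Build a tree as nested tuples by average-linkage clustering."""
    leaf_dist = {
        frozenset((a, b)): p_distance(seqs[a], seqs[b])
        for a, b in combinations(seqs, 2)
    }
    # Each cluster is (subtree, list of leaf names under it).
    clusters = [(name, [name]) for name in seqs]

    def avg(c1, c2):
        # Mean distance over all leaf pairs between the two clusters.
        total = sum(leaf_dist[frozenset((x, y))] for x in c1[1] for y in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    while len(clusters) > 1:
        # Merge the two closest clusters into one new subtree.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


tree = upgma(ALIGNED)
print(tree)
```

With these toy sequences the human and mouse genes differ at 1 of 10 positions, while the plant gene differs from them at 4 and 3 positions, so human and mouse are joined first and the plant gene ends up as the outgroup.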

Once I have constructed this tree, I can reconstruct what happened to a family of genes throughout evolution. I can see at which branch points gene duplications have occurred, and which species have lost the gene and when. Combining this information with other sources will hopefully allow me to understand how and when the duplicated genes evolved to become new hubs in their own signaling pathways, or why some genes have been lost.
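One simple way to spot duplications in a gene tree is the ‘species overlap’ rule: if the two branches below a branch point contain genes from the same species, that branch point is called a duplication, otherwise a speciation. The Python sketch below shows only this rule, under my own assumptions: the tree is a made-up example written as nested tuples, leaves are labelled "species|gene", and the gene names A, A1 and A2 are hypothetical.

```python
def species_of(leaf):
    """Extract the species from a leaf label of the form 'species|gene'."""
    return leaf.split("|")[0]


def classify(tree):
    """Return (species under this node, events for internal nodes in post-order)."""
    if isinstance(tree, str):
        # A leaf: one gene from one species, no events below it.
        return {species_of(tree)}, []
    left, right = tree
    left_species, left_events = classify(left)
    right_species, right_events = classify(right)
    # Species overlap rule: shared species on both sides means a duplication.
    kind = "duplication" if left_species & right_species else "speciation"
    here = (kind, sorted(left_species | right_species))
    return left_species | right_species, left_events + right_events + [here]


# Made-up gene tree: gene A was duplicated into A1 and A2 before
# human and mouse split; the plant has a single copy.
GENE_TREE = (
    (("human|A1", "mouse|A1"), ("human|A2", "mouse|A2")),
    "plant|A",
)

_, events = classify(GENE_TREE)
for kind, species in events:
    print(kind, species)
```

In this toy tree the branch point joining the A1 and A2 groups is labelled a duplication, because human and mouse occur on both sides, and the other three branch points are speciations. Real trees are messier, and gene losses need a comparison with the species tree, which this sketch leaves out.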

There are, however, some problems that I will have to overcome. A very important method within the world of bioinformatics is determining homology between genes of different species. This works well when the evolutionary distance between the species is not too great. When the distance is large, however, there is a chance that two proteins no longer look alike but are still related and share the same function. Because these sequences are still related, they both belong in the tree. My biggest problem will be to find all the homologues in all of the species I search in. For this I use techniques like Hidden Markov Models and PSI-BLAST.

Building the tree is only the first part of my research, but it could already provide very interesting and exciting results. Once I have the tree, I can do many other things with it by combining it with other types of biological data. I shouldn’t have any problem filling those four years, I think :-) .

This first year I have also been busy with a project involving protein complexes and protein-protein interaction data. In the near future I should have a paper coming out, and then I will come back to you with more details…

1 Comment.

  1. Interesting. All was clear, except this formulation:
    [..] key proteins which have then evolved to be at the hub of their own signaling pathway.

    p.s. Does the site header picture show your desk? I have exactly the same blue bottle (SIGG?) standing next to my monitor. Typical…
