Inparablog

A comparative genomics and bioinformatics blog

Category Archives: Bioinformatics

The phylogenetic tree contest

Posted on March 3, 2009 by John

The bigger your phylogenetic tree, the bigger your headache! And I’m not just talking about the huge amount of time it takes to calculate the alignments and the actual tree using PhyloBayes or PhyML. Interpretation becomes nearly impossible. ‘Simplify’ is the magic word here, I think, but at some point you have to look at your phylogenetic tree as a whole…

I recently teamed up with a colleague on a project because our individual research projects crossed paths. We divided the work and each picked a gene family to work on. The resulting phylogenetic trees are depicted below.

Who's got the largest tree

Sometimes you cannot beat good old-fashioned paper :)

I’m using Dendroscope as my preferred tree viewer, but I don’t know if there is anything better out there for viewing large phylogenies. If somebody can recommend something, please drop me a comment!

Categories: Bioinformatics | Tags: Phylogenetic tree, phylogeny, tree

Apparently I need to have a Windows, a Linux AND a Mac machine

Posted on September 16, 2008 by John

Right! Bioinformaticians are an industrious lot, writing all kinds of software to make life easier for other scientists… But why do some write their software to work on only one OS? I know Java is not everything, but it works! And most scripting languages (Perl, Python, etc.) can be installed and run on nearly every OS.

I found this article in my email and at first sight it looked nice and useful to me, but I’m not even gonna try to read the article and try the software… Because I don’t have a Mac…

Should I get a Mac?

Categories: Bioinformatics, Ramblings | Tags: Mac OS, software

Protein complex evolution and network rewiring

Posted on July 26, 2008 by John

My very first article is out!

van Dam, T.J.P., Snel, B. (2008). Protein Complex Evolution Does Not Involve Extensive Network Rewiring. PLoS Computational Biology, 4(7), e1000132. DOI: 10.1371/journal.pcbi.1000132

This is the author summary:

Protein complexes are a pivotal part of the functioning of cells in health and disease. Studying the evolution of these essential cellular features is of great intrinsic as well as practical interest. However, the study of the evolution of protein complexes by comparative analysis is fraught with difficulties. Hence current reports that reveal low overlap in the interactome between species are often reluctant to equate this low level of overlap to a low level of conservation. Here we exploit new public data sets, which display unparalleled coverage, to study the amount of co-complex membership conservation, and we present a novel measure for the absence of interactions. We thereby observe a hitherto unreported high level of conservation of 90% of the interactions when the presence of the genes coding for the protein pairs that participate in the same protein complex is also conserved. This allows for new insights into the evolution of protein complexes: the evolutionary dynamics of protein complexes are, by and large, not the result of network rewiring (i.e. acquisition or loss of co-complex memberships), but mainly due to genomic acquisition or loss of genes coding for subunits.

Categories: Articles, Bioinformatics

Detecting homology

Posted on July 14, 2008 by John

I thought I had them all, but I was wrong. When do two sequences really share no homology, and when do you just fail to detect it? I think this is still a big problem in bioinformatics… Bit scores or E-values? Which method? BLAST, PSI-BLAST, HMMER? Argggh!!!!
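
To make the bit score versus E-value question a bit more concrete, here is a minimal sketch. It assumes the standard relation between the two, E = m × n × 2^(−S′), where S′ is the bit score and m and n are the (effective) query and database lengths. The function name and the example lengths are made up for illustration; real BLAST also applies edge-effect corrections to the lengths, which this ignores.

```python
def evalue_from_bitscore(bitscore: float, query_len: int, db_len: int) -> float:
    """Approximate E-value for a bit score, assuming E = m * n * 2^(-S').

    query_len and db_len stand in for the effective lengths of the query
    and the database (an assumption: no length corrections are applied).
    """
    return query_len * db_len * 2.0 ** (-bitscore)


if __name__ == "__main__":
    # Hypothetical numbers: the same hit (bit score 50, query of 300 residues)
    # searched against a small and a large database.
    for db_len in (10**6, 10**9):
        e = evalue_from_bitscore(50.0, 300, db_len)
        print(f"database of {db_len} residues: E = {e:.2e}")
```

The bit score of a hit stays the same whatever you search against, while the E-value grows with the database. So a fixed E-value cutoff gets stricter as the database grows, which is one way a real homolog can slip below the threshold.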

Categories: Bioinformatics, Ramblings | Tags: Homology

If publishing in BMC Bioinformatics is that simple

Posted on February 13, 2008 by John

Last week I read an article by Fourment and Gillings in BMC Bioinformatics, A comparison of common programming languages used in bioinformatics. They compare Perl, Python, C, C++, C# (.NET) and Java, all languages often used in bioinformatics. The authors stress that each particular language has advantages for use in different bioinformatic applications. Fine, I can agree with that, but…

The more I think about the article, the more I am vexed by it. Besides stating the obvious, the results and methods leave much to be desired. None of the figures has error bars or standard deviations, which would have been trivially easy to add and are necessary. All programs were written by the same person, who had varying experience in Perl, C++ and Java; the other languages were learned while writing the programs. I think this is a recipe for disaster. Every language has its peculiarities, which can be avoided or used to the fullest only when one has some decent experience with the language. A colleague of mine (who is the resident Python expert) classified the BLAST parser as ‘rather messy’ after one short glance.
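
To show how little it takes to get error bars on a timing, here is a minimal sketch. This is not the code from the article: `time_runs` and the toy workload are hypothetical stand-ins, and in a real comparison the workload would be the parser run on an actual BLAST output file.

```python
import statistics
import time


def time_runs(func, repeats=5):
    """Run func several times; return (mean, sample standard deviation) in seconds."""
    if repeats < 2:
        raise ValueError("need at least two runs to estimate a standard deviation")
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.stdev(timings)


def toy_workload():
    # Stand-in for "parse a BLAST report": split some tab-separated lines.
    lines = ["query\tsubject\t98.5\t1e-30\t250"] * 10000
    return [line.split("\t") for line in lines]


if __name__ == "__main__":
    mean, sd = time_runs(toy_workload, repeats=5)
    print(f"{mean:.4f} s +/- {sd:.4f} s over 5 runs")
```

Reporting the mean together with the spread over a handful of runs is enough to tell whether a difference between two languages is bigger than the run-to-run noise.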

I simply cannot get my head around the fact that Python parses a 9.6 GB BLAST output file in 38 minutes while Perl does the same thing in a little more than 7… 38 minutes! A 9.6 GB BLAST output file! I have tried these scripts myself on some BLAST output (not nearly as large) I had lying around, and found huge differences in processing time between runs of the same script on the same BLAST file. Also… 9.6 GB! They mention the sequence used to search, but not the database they searched in… How do you end up with a relevant BLAST search output that large?

I think this article still needs a lot of work before the numbers they report will convince me. I am willing to agree that Python is better than Perl at some things and vice versa, but I have strong objections to how this study was performed. Although it is nice that someone presents actual numbers and figures about how different languages perform, I do not think it is good enough to be published in BMC Bioinformatics.

Categories: Bioinformatics, Perl | Tags: Python, Review

BBC 2007: Day 2

Posted on November 14, 2007 by John

Second day at the BBC in Leuven. Quite some interesting stories that day! One talk was about using structure information to analyze how proteins bind to protein domains which are common in signal transduction. The second speaker actually told us something similar to what I have been doing for my Master’s. Though not the same, he did use some of the ideas we also had for comparing protein interaction networks. He gave a link to the pre-published text and I am printing it at this very moment!

The keynote speaker, M. Madan Babu, gave a brilliant presentation about the structure, evolution and dynamics of transcription regulation networks. He was the first speaker after the break, and his microphone promptly stopped working. When that was fixed, the projector broke down. He could still laugh about it though. When he could finally continue, he told us about regulatory motifs in yeast. When analyzing the network they found a limited set of motifs which were predominant in the network. These motifs were analyzed in an evolutionary context by looking at duplications of the transcription factors and their target genes. Only in rare instances were these motifs explained by duplications, which was counterintuitive. Also, the a priori assumption that transcriptional “hubs” should control relatively more duplicated genes turned out not to hold. They did find enrichment for some types of motifs in specific processes such as DNA replication and sporulation. Feed-forward loops, for example, are enriched in slow processes.

When looking at the chromosomal localization of target genes and transcription factors, they found a clear preference for target genes to be concentrated on one or at most two chromosomes. Even within the chromosomes, target genes display regional preferences or avoidance. This mapping of preferences could help optimize the expression of exogenous genes regulated by endogenous transcription factors.

The following talks included one on the evolution of chromalveolates, which was very interesting, as well as a talk about MANTiS, an orthology database that is supposed to go online in December. Instead of Inparanoid or bidirectional best hits they use phylogenetic trees, which of course is much better. Rather than just calling genes orthologous, it can distinguish orthologs from paralogs and in-paralogs from out-paralogs. This depends on the quality of the trees used and on how the gene families were determined, but it is good that this is being done so others can use it.
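
For readers who have not met bidirectional best hits before, here is a minimal sketch of the idea, with made-up gene names and scores. It is not how MANTiS or Inparanoid works; it only illustrates the simple approach that the tree-based method is meant to improve on. The function names and the (query, subject, score) hit format are assumptions for this example.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits is an iterable of (query, subject, score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    # Hypothetical hits between species A and species B, where b1 and b2
    # are both similar to a1 (think of a duplication in species B).
    a_vs_b = [("a1", "b1", 200), ("a1", "b2", 150), ("a2", "b1", 180)]
    b_vs_a = [("b1", "a1", 200), ("b1", "a2", 180), ("b2", "a1", 150)]
    print(bidirectional_best_hits(a_vs_b, b_vs_a))
```

In this toy case only the a1/b1 pair is reported. The gene b2 is dropped even though its best hit is also a1, which is exactly the kind of in-paralog relationship that needs a tree (or at least something smarter than best hits) to sort out.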

Wrap-up: After a bad start with the poster session on Monday, the BBC took off with some very interesting talks. I especially liked the keynote talk by Madan Babu. I’ve noticed that a lot of the research presented in the talks, and especially on the posters, involved making bioinformatic tools for biologists who will not use them. I am a bit pessimistic about this, I know, but as a molecular biologist myself I can only wonder.

Categories: Bioinformatics | Tags: BBC, Conference
The author

My name is John van Dam and I am a Post-Doc at St. Radboud University Medical Center (NL). My research involves bioinformatics and comparative genomics on cilia and signal transduction pathways.
Copyright notice

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.