If publishing in BMC Bioinformatics is that simple

February 13th, 2008, by John

Last week I read an article by Fourment and Gillings, “A comparison of common programming languages used in bioinformatics”, in BMC Bioinformatics. As the title says, it compares programming languages often used in bioinformatics: Perl, Python, C, C++, C# (.NET) and Java. The authors stress that each language has advantages for different bioinformatics applications. Fine, I can agree with that, but…

The more I think about the article, the more I am vexed by it. Besides stating the obvious, the methods and results leave much to be desired. None of the figures has error bars or standard deviations, which would have been trivially easy to add and are necessary for judging the timings. All programs were written by the same person, who had varying experience in Perl, C++ and Java; the other languages were learned while writing the programs. I think this is a recipe for disaster. Every language has its peculiarities, which can be avoided or used to the fullest only with some decent experience in that language. A colleague of mine (our resident Python expert) classified the BLAST parser as ‘rather messy’ after one short glance.

I simply cannot get my head around the fact that Python parses a 9.6 gig BLAST output file in 38 minutes while Perl does the same thing in a little more than 7… 38 minutes! A 9.6 gig BLAST output file! I have tried these scripts myself on some BLAST output I had lying around (not nearly as large) and found huge differences in processing time from run to run, using the same script and the same BLAST file. Also… 9.6 gig! They mention the query sequence, but not the database they searched… How do you end up with a relevant BLAST output that large?
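To show how little work the missing error bars would have been, here is a minimal sketch in Python of timing the same job several times and reporting the mean with a standard deviation. The names `time_runs`, `summarize` and `dummy_parse` are my own, and `dummy_parse` is only a hypothetical stand-in workload; it is not the parser from the paper, and the repeat count of 5 is an arbitrary choice.

```python
import statistics
import time


def time_runs(func, repeats=5):
    """Call func() `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean, sample standard deviation) of a list of durations."""
    mean = statistics.mean(durations)
    # stdev needs at least two data points; report 0.0 for a single run.
    spread = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return mean, spread


def dummy_parse():
    # Hypothetical stand-in workload, NOT the BLAST parser from the paper.
    sum(len(line) for line in ("Query= seq%d" % i for i in range(100000)))


if __name__ == "__main__":
    mean, spread = summarize(time_runs(dummy_parse, repeats=5))
    print("mean %.4f s, stdev %.4f s over 5 runs" % (mean, spread))
```

Swapping `dummy_parse` for a function that runs the real parser on a real file would give a mean and a spread per language, which is exactly what an error bar needs. Wall-clock time also picks up disk caching and other load on the machine, so the first run on a large file may well be slower than the rest.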

I think this article still needs a lot of work before the numbers it reports convince me. I am willing to agree that Python is better than Perl at some things and vice versa, but I have strong objections to how this study was performed. Although it is nice that someone presents actual numbers and figures about how different languages perform, I do not think it is good enough to be published in BMC Bioinformatics.

Categories: Bioinformatics, Perl