Flags and Lollipops

Monday, March 12, 2007

Software availability: a quick survey of OUP Bioinformatics

When writing Friday's post about the Nature Methods 'software availability' editorial I spent some time trawling through Nodalpoint's archives looking for comments about defunct software distributions to serve as anecdotal evidence. Broken links to resources seem like a problem that many people have encountered.

I figured that I'd do some empirical research and check out all of the Application Notes published in the March issues of Bioinformatics from the past four years.

Some "this study isn't very scientific" disclaimers: It's not a huge dataset. I'm lumping databases, software and web services together to talk about 'resources' in general. There's only one resource per paper, and it's whatever is referred to in the abstract 'availability' section. I started off going through every paper in each issue to see if they mentioned resources, but it rapidly became tiresome, so for 2005, 2004 and 2003 I just looked at the Application Notes.
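For anyone who wants to repeat this kind of check, here is a minimal sketch of how it could be scripted in Python. It is not necessarily how this survey was done: the function names are made up for illustration, and "available" here only means the URL answered without an error. A domain parking page would still count as available, so the results need eyeballing afterwards.

```python
import urllib.error
import urllib.request


def is_available(url, fetch=urllib.request.urlopen, timeout=10):
    """Return True if fetching url succeeds with an HTTP status below 400.

    `fetch` is injectable (hypothetical parameter) so the check can be
    exercised without touching the network.
    """
    try:
        with fetch(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failures, refused connections, 4xx/5xx errors and malformed
        # URLs all count as "no longer available".
        return False


def unavailable(urls, fetch=urllib.request.urlopen):
    """Return the subset of urls that could not be fetched."""
    return [u for u in urls if not is_available(u, fetch=fetch)]
```

Calling `unavailable(list_of_urls)` on the URLs taken from each abstract's 'availability' section would give a first pass at the dead links; the length of that list over the number of URLs is the unavailable fraction.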

So on to the results - the raw data is at the end of the post, but briefly:

  • 12% of resources from the March 2006 issues are no longer available.
  • 17% of the resources from 2005 and 2004 are no longer available.
  • 11% of the resources from 2003 are no longer available.
  • Only one of the resources I looked at was hosted on SourceForge. It's still available.
  • Many, many resources were hosted in home directories (i.e. whatever.edu/~username/ ).
  • A couple of resources that were available 'upon request' made clear that they were free for non-profit use only - is holding the software back a way of screening potential customers?


Two other things I noticed:

  • OUP Bioinformatics used to have lots of original research and now it's all applications and databases (not necessarily a bad thing, I'm just saying; Neil has mentioned this before, too).
  • People writing bioinformatics web services love frames. Stop using frames, please.


Perhaps there's a compromise between making software open source and keeping it locked up until you (or your technology transfer officer) can become fantastically rich by selling it to big pharma: upload a tarball containing the software executable (one that runs on a reference platform: Windows, OS X, Linux?) and some documentation to, say, WebCite. No mailing lists, CVS access or anything fancy are necessary, after all: just a permanent snapshot of the software that you used to write your paper.

Anyway, the raw data:

March 2006

27 resources
3 available on request (11%)
3 unavailable (of all resources: 11% / of freely available resources: 12%)
1 in SourceForge

March 2005

33 resources
4 available on request (12%)
5 unavailable (15% / 17%)
1 unavailable site redirects to an ad-filled domain parking page, how rude.

March 2004

29 resources
all freely available (i.e. not 'on request')
5 unavailable (17% / 17%)

March 2003

22 resources
5 available on request (22%)
2 unavailable (9% / 11%)
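
The two percentages on each 'unavailable' line use different denominators: all resources, and resources that weren't 'on request'. A small Python sketch of that arithmetic, using the counts listed above (the function and key names are just illustrative, and "freely available" is taken to mean total minus on-request):

```python
# Raw counts from the lists above:
# year -> (total resources, available on request, unavailable)
SURVEY = {
    2006: (27, 3, 3),
    2005: (33, 4, 5),
    2004: (29, 0, 5),
    2003: (22, 5, 2),
}


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def summarise(total, on_request, unavailable):
    """Compute the three percentages reported for each year."""
    freely_available = total - on_request  # assumption: everything not 'on request'
    return {
        "on_request_pct": percent(on_request, total),
        "unavailable_of_all_pct": percent(unavailable, total),
        "unavailable_of_free_pct": percent(unavailable, freely_available),
    }


if __name__ == "__main__":
    for year, counts in sorted(SURVEY.items(), reverse=True):
        s = summarise(*counts)
        print(
            f"March {year}: on request {s['on_request_pct']:.1f}%, "
            f"unavailable {s['unavailable_of_all_pct']:.1f}% of all / "
            f"{s['unavailable_of_free_pct']:.1f}% of freely available"
        )
```

The figures quoted in the post are these values with the decimals dropped rather than rounded.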

8 Comments:

At March 12, 2007 4:11 PM, Blogger Pierre said...

Nice post. Thank you.

 
At March 12, 2007 4:18 PM, Anonymous Neil said...

I recently reviewed a paper for the NAR web issue. There were frames a-plenty. I suggested they remove them. What is this, 1996?

This is a great survey highlighting an ongoing problem. Just one project in sourceforge, shocking! I'm inclined to think that this trend of multiple academic research groups leaving their broken crap all over the web is unsustainable. Someone needs to take charge. Bring on Google Bioinformatics!

To my shame, I have a long forgotten Applications Note. It's still online but hasn't been updated in 3 years or more.

 
At March 12, 2007 4:25 PM, Anonymous Deepak said...

It is quite funny how so many bioinformatics sites (both web services and content sites) have such terrible web design (web 0.9).

Neil .. this is a function of projects being started by students and then left completely hanging once the student/post-doc leaves.

 
At March 12, 2007 7:12 PM, Blogger Sandy said...

Hi Stu,

The Scientist had a great article about a year ago (I think) on "abandoned ware." It was pretty amusing and, as Deepak notes, I think it results largely from the lack of funding to continue supporting programs once they've been developed and the authors have gotten their degrees.

 
At March 13, 2007 11:25 AM, Anonymous Mike Barton said...

A very informative post Stu.

This is a subject I often rant about in my office. I think part of the problem is that once the paper has been submitted you don't need to care about the application anymore. You can put it on your citation record and then forget about it.

We don't need 100 different microarray tools; we need that one that works well. But you don't get published for adding a feature to someone else’s tool.

One exception is Cytoscape, where scientists get papers for writing plugins that add extra functionality.

 
At March 14, 2007 2:21 PM, Anonymous SNP said...

Some points from the pharma user side:

1) I'm happy to kick money over to someone who writes useful software, but I'm not going to start into months of negotiations between our lawyers and your lawyers over software I've never seen. Allow, up front, a 30 day demo period for commercial users!

2) My impression is that these informatics groups vastly overestimate the amount of money to be made. Squeezing a few thousand dollars out of me and a couple of other users who want your niche software can't possibly be worth the amount of time your tech transfer lawyers will spend on negotiations with our corporate bureaucracies. You'd be far better off giving the software away and soliciting letters of support from (a much bigger pool of) users for your grant applications.

3) For heaven's sake, post source code! It's not like your project automatically becomes GPL'd by doing so, and I can then run it on Linux or OS X or anything besides that ancient version of Digital Unix for Alpha your group uses.

 
At March 19, 2007 1:01 PM, Blogger Pedro Beltrão said...

Great post. Maybe you could consider sending it as a letter to the editors. They should try to find a possible solution that balances the long term availability of the applications with possible commercial uses.

 
At September 14, 2009 12:32 PM, Blogger Bishu said...

web survey software is a cost-effective way to find out what your customers, employees, and stakeholders are thinking. Surveys have something to say.

 
