Flags and Lollipops

Wednesday, June 20, 2007

Precedings, pt2

We're going to try using groups and forums on Nature Network as places to discuss Web Publishing products. The idea is that Network's limitations will become apparent very quickly (well, not very quickly, it's well written - but complex and still relatively new) and we'll be able to prioritize fixes and new features better. This makes sense. Before anybody mentions it in the comments I think that Network needs RSS too. AFAIK it's on the to-do list.

Anyway, Precedings has a group there which you can use to request new features, discuss the issues surrounding biomedical preprints, ask questions about the site, etc. Hilary Spencer (who runs Precedings) has already posted about submitting research to Precedings vs just posting it on your blog.

It's interesting stuff - wasn't there a similar discussion on NFTB or Nodalpoint or somewhere a while back, about using WebCite to archive blogged research?

Take a look.


Tuesday, June 12, 2007

Scintilla

We're doing some pre-launch tweaking of Scintilla, a new science aggregation product, at Nature. It's something that Alf and I have been working on recently (well, Alf has been working on it, I mostly just complain and then swoop in to take credit by sending out invites at the last minute ;)).

You're welcome to check it out and send any comments, suggestions or bug reports to scintilla@nature.com. We'll be acting on your feedback, so do drop us a line.

There'll be a more official announcement on Nascent later.


Monday, March 12, 2007

Software availability: a quick survey of OUP Bioinformatics

When writing Friday's post about the Nature Methods 'software availability' editorial I spent some time trawling through Nodalpoint's archives looking for comments about defunct software distributions to serve as anecdotal evidence. Broken links to resources seem like a problem that many people have encountered.

I figured that I'd do some empirical research and check out all of the Application Notes published in the March issues of Bioinformatics from the past four years.

Some "this study isn't very scientific" disclaimers: It's not a huge dataset. I'm lumping databases, software and web services together to talk about 'resources' in general. There's only one resource per paper, and it's whatever is referred to in the abstract 'availability' section. I started off going through every paper in each issue to see if they mentioned resources but it rapidly became tiresome and so for 2005, 2004 and 2003 I just looked at the Application Notes.

So on to the results - the raw data is at the end of the post, but briefly:

  • 12% of resources from the March 2006 issues are no longer available.
  • 17% of the resources from 2005 and 2004 are no longer available.
  • 11% of the resources from 2003 are no longer available.
  • Only one of the resources I looked at was hosted on SourceForge. It's still available.
  • Many, many resources were hosted in home directories (i.e. whatever.edu/~username/ ).
  • A couple of resources that were available 'upon request' made clear that they were free for non-profit use only - is holding the software back a way of screening potential customers?


Two other things I noticed:

  • OUP Bioinformatics used to have lots of original research and now it's all applications and databases (not necessarily a bad thing, I'm just saying. Neil has mentioned this before, too)
  • People writing bioinformatics web services love frames. Stop using frames, please.


Perhaps a compromise between making software open source and keeping it locked up until you / your technology transfer officer can become fantastically rich by selling it to big pharma is to upload a tarball of the software executable (that runs on a reference platform: Windows, OS X, Linux?) and some documentation to, say, WebCite? No mailing lists, CVS access or anything fancy are necessary, after all: just a permanent snapshot of the software that you used to write your paper.

Anyway, the raw data:

March 2006

27 resources
3 available on request (11%)
3 unavailable (of all resources: 11% / of freely available resources: 12%)
1 in SourceForge

March 2005

33 resources
4 available on request (12%)
5 unavailable (15% / 17%)
1 unavailable site redirects to an ad-filled domain parking page. How rude.

March 2004

29 resources
all freely available (i.e. not 'on request')
5 unavailable (17% / 17%)

March 2003

22 resources
5 available on request (22%)
2 unavailable (9% / 11%)
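
If you want to check my arithmetic, here's a minimal Python sketch that recomputes the percentages from the counts above. The function and variable names are just made up for illustration. The posted percentages look truncated rather than rounded, so the output will not match digit for digit.

```python
# Counts copied from the raw data above: (total, available on request, unavailable).
counts = {
    2006: (27, 3, 3),
    2005: (33, 4, 5),
    2004: (29, 0, 5),
    2003: (22, 5, 2),
}


def rates(total, on_request, unavailable):
    """Return (% on request, % unavailable of all, % unavailable of freely available)."""
    freely_available = total - on_request
    return (
        100 * on_request / total,
        100 * unavailable / total,
        100 * unavailable / freely_available,
    )


for year, row in sorted(counts.items(), reverse=True):
    on_request_pct, of_all_pct, of_free_pct = rates(*row)
    print(
        f"March {year}: {on_request_pct:.1f}% on request, "
        f"{of_all_pct:.1f}% / {of_free_pct:.1f}% unavailable"
    )
```

The second unavailable figure uses 'freely available' resources (total minus 'on request') as the denominator, which is the one quoted in the summary bullets.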


Friday, March 09, 2007

Nature Methods on software availability

Nature Methods has a new editorial clarifying its position on making the software used in papers available to readers (about time a journal did this):
"The minimum level of disclosure that Nature Methods requires depends on how central the software is to the paper. If a software program is the focus of the report, we expect the programming code to be made available. Without the code, the software—and thus the paper—would become a black box of little use to the scientific community. In many papers, however, the software is only an ancillary part of the method, and the focus is on the methodological approach or an insight gained from it.

"In these cases, releasing the code may not be a requirement for publication, but such custom-developed software will often be as important for the replication of the procedure as plasmids or mutant cell lines. We therefore insist that software or algorithms be made available to readers in a usable form. The guiding principle is that enough information must be provided so that users can reproduce the procedure and use the method in their own research at reasonable cost—both monetary and in terms of labor."

I think it's quite a well-thought-out piece. The editors recognize, for instance, that some short programs and algorithms are better made available as pseudocode (well, they say 'a small set of equations', but I know which one I'd prefer).

I'm not sure it goes far enough, though. For example: if the software runs as a web service, is making that service public enough to satisfy the journal's requirements? Can you host any code releases on your own server?

The problem with answering either of those questions with a 'yes' is that there's no guarantee that the software is still going to be available after a year or two (something most bioinformaticians are acutely aware of): postdocs and grad students move on, server accounts (and labs) get closed, bugs crop up and there's nobody willing to fix them, websites get redeveloped... etc.

What happens when we read an older paper, the software isn't around any more and we report it to an editor?

When we ask authors to make sequences available we require them to be deposited in GenBank. Should we require software authors to deposit their code on SourceForge, Google Code or some other (more) permanent repository? And if so, what about executable-only software, or software that has a restrictive licence?

There are open comment threads at both Methagora - the Nature Methods blog - and Nautilus, which covers the whole spread of Nature journals. I urge you to go forth and help shape journal policy (perhaps).


Thursday, March 01, 2007

Leave Nature HQ, walk left, see...


Cheeky monkeys. Can't help but feel that the money would've been better spent elsewhere, though. Yes, I know that it's an ape not a monkey.


Friday, February 23, 2007

Nature Network - add papers to profile by PMID

Nature Network v2 was launched on Valentine's Day. Nature Network, in case you haven't heard of it before, is sort of like MySpace for scientists (except not crap). Previously it was restricted to scientists in the Boston area, but this new version allows anybody to sign up, network with others, post to blogs and forums, set up a profile... you get the idea.

There's space on your profile to enter the papers that you've authored, which is cool. Unfortunately for lazy people like me you have to enter the details of those papers manually (or at least you do for now).

Luckily the form HTML is of great semantic beauty. Thus: nnaddbypmid.user.js, a quick Greasemonkey script that fetches paper metadata from PubMed using the EUtils. Follow the 'add publication' link on your Nature Network profile as normal, then fill in the PMID textbox and click on the new 'look up details' button. Voila! The other fields should get filled in automagically.
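
The userscript itself is JavaScript, but the lookup is simple enough to sketch in a few lines of Python. This is a rough illustration of the same kind of EUtils query, not the script's actual code. It assumes the ESummary endpoint and its DocSum-style XML (Item elements named Title, Source, PubDate and AuthorList), which is how I understand the response to look. The SAMPLE record is hand-written with placeholder values, and all the function names are made up.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# NCBI EUtils ESummary endpoint (assumed here; check the EUtils docs before relying on it).
ESUMMARY = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"

# Hand-written placeholder in the assumed DocSum layout. Not a real PubMed record.
SAMPLE = """<eSummaryResult>
  <DocSum>
    <Id>12345678</Id>
    <Item Name="PubDate" Type="Date">2007 Feb</Item>
    <Item Name="Source" Type="String">Example Journal</Item>
    <Item Name="AuthorList" Type="List">
      <Item Name="Author" Type="String">Doe J</Item>
      <Item Name="Author" Type="String">Roe R</Item>
    </Item>
    <Item Name="Title" Type="String">An example title.</Item>
  </DocSum>
</eSummaryResult>"""


def esummary_url(pmid):
    """Build the ESummary query URL for a single PubMed ID."""
    query = urllib.parse.urlencode({"db": "pubmed", "id": pmid})
    return f"{ESUMMARY}?{query}"


def parse_docsum(xml_text):
    """Pull the fields a publication form would need out of an ESummary response."""
    doc = ET.fromstring(xml_text).find("DocSum")
    if doc is None:
        raise ValueError("no DocSum element in response")

    def item(name):
        node = doc.find(f"Item[@Name='{name}']")
        return node.text if node is not None else None

    authors = [
        node.text
        for node in doc.findall("Item[@Name='AuthorList']/Item[@Name='Author']")
    ]
    return {
        "pmid": doc.findtext("Id"),
        "title": item("Title"),
        "journal": item("Source"),
        "pub_date": item("PubDate"),
        "authors": authors,
    }


def lookup(pmid):
    """Fetch and parse the summary for one PMID (makes a network request)."""
    with urllib.request.urlopen(esummary_url(pmid), timeout=10) as response:
        return parse_docsum(response.read().decode("utf-8"))
```

Calling parse_docsum(SAMPLE) gives you a dict of the placeholder values without touching the network; lookup() does the real request. Filling in the form fields from that dict is the part the Greasemonkey script handles in the browser.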
