Flags and Lollipops

Thursday, December 22, 2005

Golden rules for bioinformatics web applications

Continuing yesterday's theme of well-executed web applications, I've drafted a list of common-sense "golden rules" for bioinformatics web application interfaces. After all, the underlying algorithm might be fantastic, but if nobody can use it then you may as well have kept it to yourself.

When I say "golden rule", of course, I mean "generally, and in my opinion". If you've got more to add, or disagree with any of them, add a comment and I'll check it out.

1. Know your audience

  • Explain jargon
  • Describe the application's promises and limitations at the outset
  • Have common presets for options

Work out who your application is designed for then tailor its presentation to them. Make it easy for people to get what they want out of your software: that's why you wrote it in the first place, isn't it?

Ideally, all of your users will be your peers, or will have read your research papers and done some background literature searches on the algorithm that you've implemented. In practice, your users will probably be lab monkeys who are interested in X and want X without having to care about the precise details of how your system works.

This means you have to strike a balance between complexity and usability. Have presets for sets of options ("best precision", "best recall"). Explain jargon as it appears on input forms and results pages. List your system's promises and limitations on the first page so that users don't have to be experts in the field to know what kind of things will work and what won't.
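As a rough sketch of what presets could look like behind the input form: the option names (score_cutoff, max_hits) and their values below are made up for illustration and are not taken from any real tool. The idea is only that a named preset fills in sensible defaults, and anyone who wants more control can override individual options.

```python
# Hypothetical presets: option names and values are invented for this example.
PRESETS = {
    "best precision": {"score_cutoff": 0.9, "max_hits": 10},
    "best recall": {"score_cutoff": 0.5, "max_hits": 100},
}


def resolve_options(preset="best precision", **overrides):
    """Start from a named preset and apply any explicit overrides on top."""
    if preset not in PRESETS:
        raise ValueError(
            f"unknown preset {preset!r}; choose from {sorted(PRESETS)}"
        )
    unknown = set(overrides) - set(PRESETS[preset])
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    options = dict(PRESETS[preset])  # copy so the preset itself is not modified
    options.update(overrides)
    return options


# A casual user picks a preset; an expert tweaks a single option.
print(resolve_options("best recall"))
print(resolve_options("best precision", max_hits=5))
```

Rejecting unknown preset and option names with a readable message is part of the point: the error tells the user what the valid choices are.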

2. Keep It Simple, Stupid

  • Don't add features that nobody is going to use
  • Let the user decide when to ramp up complexity

Related to the above - don't add features that nobody is ever going to use. Adding lots of hyperlinks to GeneCards and RefSeq after every gene name doesn't add any value to your software. Neither does displaying internal database IDs or algorithm score breakdowns that make no sense out of context.

Where possible, start off simple and let the user ramp up complexity when they are comfortable with it. If you've got a genome browser that can display lots of features, show a basic subset first and let the user add more rather than bombarding them with almost everything at once (*cough* Ensembl *cough*).

3. Guide your users

  • Produce a tutorial (and possibly a user manual)
  • Always provide examples of input

You don't need to write a manual. Just provide some guidance for your users: a short tutorial would do just fine. Always provide examples of valid input. If possible, try to explain any output on the results page.

4. Use standard formats

  • Always accept relevant standard input formats
  • Output relevant standard output formats

Accept standard inputs and export standard outputs. Why waste a hundred people's time on data munging (and put off a hundred more) when you could write a simple script now and implement it server side?

By standard, I mean a proper standard - FASTA, please, for sequences. And make the output machine-readable - that doesn't necessarily mean XML; tab-delimited text files will do.
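To show how little server-side code this takes, here is a minimal sketch that reads FASTA text and writes a tab-delimited summary. The function names and the choice of output columns (id, length) are assumptions made for the example; a real application would report its own results, and might prefer an established parser such as Biopython's over a hand-rolled one.

```python
import csv
import io


def parse_fasta(text):
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Use the first word after '>' as the identifier.
            fields = line[1:].split()
            header, chunks = (fields[0] if fields else ""), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def records_to_tsv(records):
    """Write one tab-delimited row per record, with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["id", "length"])
    for seq_id, seq in records:
        writer.writerow([seq_id, len(seq)])
    return out.getvalue()


example = ">seq1 an example sequence\nACGT\nAC\n>seq2\nGG\n"
print(records_to_tsv(parse_fasta(example)))
```

The output loads straight into a spreadsheet, R, or a shell pipeline, with no screen scraping of the results page required.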

5. Software wants to be free

  • Let your users decide how best to use your software

If your application can be run standalone, make it available standalone, preferably under an open-source license. If your software is good enough to be peer reviewed as an application note, the source code should be good enough to be peer reviewed too.

This will work to your advantage. If you don't believe me, just wait until your server crashes because some postdoc halfway across the world is trying to incorporate your application into a pipeline and is generating hundreds of requests per minute.


3 Comments:

At December 22, 2005 8:58 PM, Anonymous Mauricio said...

Great! It seems like you are re-opening the topic from the keynote speech "Creating a bioinformatics nation" given by Lincoln Stein at the 2002 O'Reilly Open Bioinformatics Conference.

It would be nice to read more additions/opinions.

 
At January 05, 2006 4:06 PM, Blogger Pedro Beltrão said...

Did you say 10 golden rules? :) I read only 5.

You already say that the output should be machine readable but I would add that the whole process should be usable via a computer interface.

 
At January 13, 2006 12:50 AM, Blogger Stew said...

Yeah, the bullet points aren't very clear. There are meant to be five general headings and ten rules.

By "usable via a computer interface", do you mean just easily scriptable, or something more formal like web services? (It's a good point, in any case.)

Another possible rule - use permanent web space so that when your home directory gets shut down after you leave the lab any software doesn't disappear too...

 
