Flags and Lollipops

Friday, September 30, 2005

Garage Genomics

I've stumbled across a couple of websites in the last six months devoted to biohacking - hobbyists tinkering with genetically modifying organisms in their toolsheds. I haven't seen any actual examples - there doesn't seem to be a biohacker equivalent to hackaday - though I do faintly remember a team from my university getting together to take part in some sort of biohacking competition that used parts from MIT's BioBricks program, so it's not that new a concept. The idea behind BioBricks, if you haven't heard of it, is to:
isolate discrete biomolecular mechanisms and define standard interfaces for them so that they can be assembled in much the same way as electronic circuits.
It's all quite interesting. Biohacking in my basement doesn't appeal to me, but then I already use computers at home; if I wanted to bring in a genetics lab too then why bother ever leaving work?

So what would you want to do it for? Fun and profit, I suppose. Well, that and terrorism; presumably you could hack together some sort of terrible plague, if you had enough time and money. Some elements of the media (not necessarily the EE Times, they just had a nice, concise article) have latched on to this idea. It sells quite well to scaremongers: not only are we at risk from bioterrorism, we're at risk from bioterrorism built in our backyard for twenty bucks. Should the government be bringing out tougher regulations? Should tinkering with genetics outside a lab be made illegal? (Actually, that's another thing that I haven't seen mentioned in relation to biohacking: don't health and safety regulations in most countries already prohibit genetic modification in the home?)

Groups could use homegrown genetics for their own evil purposes, but surely the type of organisation that wants to unleash biological destruction on us wouldn't bother messing around with E. coli in a garage when there are far faster and cheaper ways of causing mayhem (bearing in mind that it'd be pretty difficult to actually weaponize anything). There's an interesting PDF available from the aforementioned MIT BioBricks people called Risks and Rewards, which suggests that people should balance the potential benefits of biohacking (people inventing useful things in their spare time) with the risks (to quote the PDF: "Bin Laden Genetics Ltd.").

I agree with that, but the same PDF also has some rather shaky sounding suggestions for how to combat the threat that biohacking might present in the future: sticking to a code of ethics, for one - because terrorists are ethical? - and encouraging biohackers in the first place so that there's always a pool of amateurs ready to help governmental organisations find vaccines or cures or whatever is necessary. This is akin to one of the arguments for open source software, I guess, though to look at suspect code an open source coder can use an everyday PC; to look at a suspect, highly virulent strain of genetically engineered flu or something, presumably you need a high grade containment facility of the kind unavailable through mail order.

So that's the terrorism aspect. Then there's the profit: one could (plausibly) design innovative biological solutions to engineering problems, or genetically modify crops and vegetables - presumably to patent rather than to produce and then sell from a stall on the sidewalk outside your house. I'm happy with people doing this up to a point: there are fairly obvious ethical considerations once people start messing with animals, which should be a strictly enforced no-no (you'd think that the genetic modification of higher order organisms would be too complex a task for a lone amateur in any case).

Which brings me to Eduardo Kac, who is an artist, a biohacker and the creator of GFP Bunny, on which I reserve judgement. To his credit, Eduardo does say that creating glow-in-the-dark bunnies:
must be done with great care, with acknowledgment of the complex issues thus raised and, above all, with a commitment to respect, nurture, and love the life thus created
... but as some pointed out at the time perhaps creating an animal purely in service of art breaks that commitment to respect.


Thursday, September 29, 2005

Visions of Science

My early optimism regarding Firefox extensions was misplaced; I'm still working on one (and some literature searches for work). Meanwhile, this is doing the rounds of the bioblogosphere - the Novartis / Daily Telegraph Visions of Science competition. The BBC has some of the best photographs up in slightly higher resolution.

An example:

It's a shot taken with a scanning electron microscope of a cancer cell "moving down a pore in a filter". Hot diggity! Can't beat that for a visualisation.


Friday, September 23, 2005

Bioinformatics & Firefox

My spare time at the moment is taken up with writing an extension for Firefox, which with any luck I'll talk more about next week. OK, that and Railroad Tycoon 3, which I picked up for £10 (about $18) in the budget section of my local video game store. Oh, the shame...

Anyway, considering that Firefox is only supposed to have about 10% of the browser market, it's maybe remarkable that the web logs for this site and for the departmental web site at my work suggest that, actually, in bioinformatics the figure is a lot higher. Closer to 60% in fact, which is pretty good going (I'm not a hater; if you want to use IE go right ahead. Personally I just find Firefox a lot more productive).

Maybe it's not all that surprising; you gotta figure that for the internet as a whole a lot of the 90% using Internet Explorer are working stiffs on corporate managed desktops or people who just want to surf the web without worrying about any of that technical stuff. Bioinformatics is a computer science-y discipline: no doubt if this was a hardcore CS blog then the alternative browser share would be a lot higher.
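For anyone who wants to check the browser share in their own web logs, here's a minimal sketch of the counting involved. It assumes Apache "combined" format log lines, where the user agent is the last double-quoted field; the function names and the crude substring matching are made up for illustration, and this is not necessarily how the figures above were produced.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format, where the user agent is the
# last double-quoted field on each line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def classify(user_agent):
    """Crude browser guess from a user-agent string (illustrative only)."""
    if "Firefox" in user_agent:
        return "Firefox"
    if "MSIE" in user_agent:
        return "Internet Explorer"
    return "Other"


def browser_share(lines):
    """Return {browser: fraction of requests} for the given log lines."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[classify(match.group(1))] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}
```

Feeding it the lines of an access log (for example `browser_share(open("access.log"))`, with a hypothetical file name) gives a per-request share. Counting requests rather than unique visitors will skew things towards whoever loads the most pages, so treat the numbers as a rough guide.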

To get back to the point: Firefox and bioinformatics. I was planning to do a rundown of some of the more interesting science related extensions at this point but then I discovered that Jawahar Swaminathan has already done something very similar over at Nodalpoint a couple of months ago. Jawahar is the clever chappie behind Biobar and he knows what he's talking about: check it out if you haven't already.

One interesting looking extension that isn't specifically geared towards science is Piggy Bank, which is described as:
an extension to the Firefox Web browser that turns it into a “Semantic Web browser”, letting you make use of existing information on the Web in more useful and flexible ways.
Piggy Bank was created by a collaboration between W3C and MIT called SIMILE and is released under an open source licence. Essentially it's a way of scraping data from different web sites (i.e. parsing out the useful bits), organising it using ontologies and then collating it all together in some useful form. It's data mashups made easy.

One of the potential uses they give is of somebody moving house who wants to combine the apartment for rent notices from one web site with the addresses of local schools, subway stations etc. on the same map. Piggy Bank can scrape the relevant data from the different sites and then display it all on Google Maps.

Obviously screen scraping isn't the way forward - see Greg's Nodalpoint post the other day to read more about the standards which formalize things - but it's an interesting halfway house. I'd be interested to hear if anybody has had any experiences with Piggy Bank (good or bad) in the life sciences domain.


Monday, September 19, 2005

The Joy of S/MARs

Despite my common sense, several seminars and many references in the literature telling me that it's not true, I still have to fight the tendency to think of cell nuclei as little bio-jelly filled bags in which bits of DNA are floating around freely in nice chromosome-shaped chunks, like you see on karyotypes.

This is possibly due to my reductionist, computer scientist brain rebelling against yet another level of complexity - the geography of the cell - when surely there's enough work to be done with the -omics we have. But anyway...

Up until the 70s, nuclear architecture was a mystery. Light and early electron microscopy couldn't pick out any structures and so everybody calmly just pretended that they had better things to do, like making flies grow two heads. Then new cell preparation and fluoroscopy techniques appeared and some light was shed on eukaryotic cell nuclei.

A complex, flexible network of protein and RNA fibrils called the nuclear matrix was discovered. Amongst other things, this network serves as a scaffold to which chromosomes are attached (the relevant parts were imaginatively titled "the chromosome scaffold").

I'll skip the in-depth review of chromatin structure (people much more clever and eloquent than I could ever be do a better job elsewhere). Essentially, DNA (for most of the time) is wrapped around histone proteins, packaged as 'beads on a string'. This string is stuck to the scaffold. What I want to talk about today are those bits of DNA which make up the "sticky bits" of the string - marked as "AT-rich regions" in the diagram above.

These are the Scaffold / Matrix Attachment Regions, or S/MARs for short. You can work out where they are in the lab - slowly - and there are a limited number of S/MAR sequences for different organisms available on the internet from the SMARt DB.

There's no real consensus sequence. They do tend to share some general features, though: S/MARs are between 300bp and several kb in length, tend to be AT rich and are enriched for features like Topoisomerase II binding and cleavage sites and curved or kinked DNA. Recently researchers have expanded on the latter feature - it turns out that S/MARs also have a high potential for stress-induced duplex destabilization (SIDD). Craig Benham at UC Davis has web based software to calculate SIDD for short sequences - incidentally, he has also produced work showing that SIDD-prone sites might be linked to regulatory potential, at least in E. coli.
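Of those features, AT-richness is the easiest one to compute. Here's a rough sketch: slide a window along a sequence and flag the windows whose AT fraction reaches a threshold. The 300 bp window echoes the lower length bound mentioned above; the 50 bp step, the 0.7 threshold and the function names are arbitrary choices for illustration. This is nowhere near a real S/MAR predictor, and it doesn't touch SIDD at all.

```python
def at_fraction(seq):
    """Fraction of A/T bases in seq (case-insensitive); 0.0 for empty input."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


def at_rich_windows(seq, window=300, step=50, threshold=0.7):
    """Yield (start, fraction) for each window at or above the threshold.

    window, step and threshold are illustrative defaults, not values
    taken from any published S/MAR prediction method.
    """
    last_start = len(seq) - window
    for start in range(0, max(last_start + 1, 0), step):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            yield start, frac
```

Overlapping hits would need merging into regions before they were any use, and anything other than A, C, G or T (an N, say) is simply counted as not-AT.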

Why is anybody interested in where S/MARs are anyway? Well, there's their relationship to regulatory regions, faint evidence that they have something to do with where translocation breakpoints and gross deletions happen on chromosomes, and the relationship between structural domains (the "loops of DNA" in-between attachment regions in the diagram above) and functional domains. It's also been mooted that there's a relationship between gene expression and the contents of each structural domain; thus, for example, important, highly-expressed genes are perhaps the only genes on their structural domains while other, larger domains contain groups of less important genes.

There's no shortage of interesting future experiments but what is lacking is the data. At the moment, the three ways to identify putative S/MARs in silico are MAR-Wiz, SMARTest and Web SIDD, all of which are web based scripts with limits on the amount of sequence that they can handle at once; limits that make genomic studies difficult (unless you're the people who wrote the software in the first place). Their workings aren't very transparent - I'm not sure if MAR-Wiz is even peer reviewed.

Which brings me to the point of this post... if anybody out there is looking for a neat coding project, a standalone S/MAR finder that incorporates SIDD as a feature would be great (an open source one that we could all tinker with would be even better). My attempts to create such a thing myself have exploded in a puff of Greek letters, misunderstood equations and lack of time. If you doubt how useful S/MAR finding software might be, check out the number of papers that use MAR-Wiz (aka MAR-Finder).

p.s. I know about the EMBOSS one, but it's really behind the times.


Friday, September 16, 2005

Hype Cycles

If you're unfamiliar with what a hype cycle is: it's essentially a five step process that strategy consultants The Gartner Group use to characterise the over-enthusiasm (hype) that typically accompanies the introduction of new technologies.

It occurred to me that my life in research is one endlessly repeating personal hype cycle.
1. "Technology Trigger"
Gartner says:
The first phase of a Hype Cycle is the "technology trigger" or breakthrough, product launch or other event that generates significant press and interest.
Hot diggity! That semi-plausible theory I came up with in the bath is correct! It is possible to derive tissue specificity / regulatory potential / secondary structure from dinucleotide frequencies / a big-ass database / support vector machines / microarrays!
2. "Peak of Inflated Expectations"
Gartner says:
In the next phase, a frenzy of publicity typically generates over-enthusiasm and unrealistic expectations. There may be some successful applications of a technology, but there are typically more failures.
My colleagues are complimentary. This is a Nature cover story for sure. Maybe now somebody will finally pay for me to fly first class somewhere nice in return for a keynote between Pina Coladas. I am a coding genius. Maybe I'll sit next to Lincoln Stein on the way there and give him some tips on Perl. And his haircut.
3. "Trough of Disillusionment"
Gartner says:
Technologies enter the "trough of disillusionment" because they fail to meet expectations and quickly become unfashionable. Consequently, the press usually abandons the topic and the technology.
Stupid Nature. What do they know? Those twelve other papers just scratched the surface, my program has a better GUI and the logo is of a monkey smoking a pipe, what more do you want? If you ask me, "significant" is up for individual interpretation. Maybe I should start a blog.
4. "Slope of Enlightenment"
Gartner says:
Although the press may have stopped covering the technology, some businesses continue through the "slope of enlightenment" and experiment to understand the benefits and practical application of the technology.
What's this? Appearing in The Annals of Mongolian Medicine has drawn the attention of somebody who has actually used my program to do something useful. And they want to collaborate...
5. "Plateau of Productivity"
Gartner says:
A technology reaches the "plateau of productivity" as the benefits of it become widely demonstrated and accepted. The technology becomes increasingly stable and evolves in second and third generations. The final height of the plateau varies according to whether the technology is broadly applicable or benefits only a niche market.
Neh, I got a paper out of it and some contacts, which is something to show for the last year. Eventually all the little pieces will come together and next year I'll get cited in one of the papers cited in the Nature Reviews cover story. And now, back to the bath...


Wednesday, September 14, 2005

Schemaball

Am feeling a bit guilty after being so negative towards the poor symbiowhatsit consortium. To make up for it here's something positive: a link to Martin Krzywinski's homepage at the BC Cancer Agency's Genome Sciences Centre.

Martin is responsible for Schemaball (found via Information Aesthetics) which is an open source Perl script that visualises SQL table schemas.
Schemaball produces images called schema balls. Schema balls are schema visualizations in which tables are ordered along a circle with table relationships drawn as curves or straight lines.
That's cool in itself, but his homepage also links to other goodies like Circos, which creates very pretty circular visualisations of sequence alignments, conservation and other such sequence features. Some of the screenshots are awesome (well, I thought that they were. I like genome visualisations, the mileage of your awe may vary).

Obviously there's also a lot of non circular diagram related material - a regular expression cooker and this interesting page on language fingerprints, for example.
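The circular layout behind a schema ball is simple enough to sketch: space the tables evenly around a circle and draw each relationship as a line between two of the points. What follows is a hypothetical illustration in Python, not Schemaball's actual code (which is Perl); the function names are invented, and it only covers the straight-line case, not the curves.

```python
import math


def circle_layout(tables, radius=1.0):
    """Map each table name to an (x, y) point evenly spaced on a circle."""
    n = len(tables)
    return {
        name: (radius * math.cos(2 * math.pi * i / n),
               radius * math.sin(2 * math.pi * i / n))
        for i, name in enumerate(tables)
    }


def relationship_segments(layout, relationships):
    """Turn (table_a, table_b) pairs into straight-line segments (chords)."""
    return [(layout[a], layout[b]) for a, b in relationships]
```

The segments could then be handed to whatever drawing library is to hand. Choosing the order of the tables around the circle so that the lines don't turn into a hairball is the hard part, and this sketch doesn't attempt it.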


What's in a name?

I stumbled across this press release from EBI today, via Biology News Net. It's basically about how the EU is giving half a million euros (~ $600,000 USD?) to a project which will work out what kind of bioinformatics / medical informatics crossover projects should be funded in the future. The way that they're doing this is to interview experts in the field - which is fair enough - and (to paraphrase) by identifying and analysing the content of relevant literature by "bibliometric and data-mining" means to identify "areas of opportunity"- which sounds a bit dodgy, but hey, I don't really know how projects like this usually work. I'd have thought you'd get much better information for less work by sticking to the interviews.

Anyway, to come to the point of this post... the full title of the project is Synergies in Medical Informatics and Bioinformatics. The snappy acronym? SYMBIOmatics. It's a good example of a great bioinformatics project name:
  • Mid-acronym change in capitalization brutally offends word flow, forcing readers to pay more attention to the project description in long, boring documents
  • Weighs in at twelve letters, which heftily highlights the importance of the project. All important projects have really long acronyms
  • Bit in capitals infringes commercial trademarks (of a UK sports turf cleaning company and Poland's largest organic fruit and vegetable distributor, amongst others)
  • Sounds faintly futuristic (the -matics part), would have been better if it sounded like an animal or something which could be used as the project logo, ideally a monkey smoking a pipe as I have some great clipart of that on my hard drive already, but never mind...
  • Shortens words by "crazy person collage" method, reducing Bioinformatics to BIOmatics, for example, while Medical Informatics becomes simply M. This has to be done when you have a great pun, smoking monkey reference or far-out future-word like "SYMBIOmatics" in mind and need to fit the project name into it
  • For EU funding bonus points is universally nonsensical, thus neatly sidestepping the EU preference for acronyms that work in more than one major language
Bioinformatics is a bit too much like mainstream IT when it comes to crap acronyms. At least we haven't gotten to the point where database releases are named (it's still Ensembl 33, not Ensembl Yukon or Longsteer or anything). Actually, didn't it use to be EnsEMBL, or am I making that up? I refer you to rule #1 above...

p.s I'm sure that the Symbiomatics project itself is perfectly respectable, good luck to them etc... it's the name I don't like.


Saturday, September 10, 2005

Getting back up to speed

Well, I'm back. Haven't had a chance to do any real work, which includes actually reading any journals or following up any new sources of data. I have, however, pretty much caught up on my blogroll and the interesting posts therein. Some of the things that caught my attention:

Spitshine talked about an essay in the editorial of PLoS Medicine entitled Why Most Published Research Findings Are False, and joked about the mainstream media headline that would accompany that news. OK, not joked, accurately predicted. The Bad Science column in The Guardian (thanks Snowdeal) written by Ben Goldacre talked about mainstream media's portrayal of science in general, and in particular how it creates a parody of science which it can then critique with impunity. It too mentions the essay:
It predictably generated a small flurry of ecstatic pieces from humanities graduates in the media, along the lines of science is made-up, self-aggrandising, hegemony-maintaining, transient fad nonsense; and this is the perfect example of the parody hypothesis that we'll see later. Scientists know how to read a paper. That's what they do for a living: read papers, pick them apart, pull out what's good and bad.
Ben puts the blame on PR and editorial departments who lack science graduates (fair enough), a reliance on authority figures with questionable credentials (Kevin Warwick, say no more) and... well, there's an eloquent rant about humanities graduates being angry at science, which I'm not so sure about.

There's more blogging goodness about the PLoS essay over at Universal Acid.

Meanwhile, over at Nodalpoint there's a post by Pedro about the idea of virtual collaborations. It's kind of like open source research, I guess. Personally I think it's a good idea - especially given some of the refinements suggested in the comments posted after the story. For example, Greg comments:

The main barrier to entry will be social, someone will likely have to take a leading role in the development, do organization etc. And the project must maintain focus. Basically the same kind of social issues that crop up in regular research environments (labs). So I guess this is a virtual opportunity to try your hand at being a PI :)

Now I think with the right kind of question all these issues can be handled. I think this needs to be done outside of the context of existing institutions (too much social baggage) and it must be very grass-roots.
I agree. I'm keen to participate in collaborative projects like the ones mentioned; my only worry is justifying the time spent to my PI. "So who are these people that you're working with... science bloggers? And your chances of publication are... you don't know yet?". At least with BioPerl contributions you may have had to write the relevant module for an existing project. Still, I'm all for it.
