Machine Learning
Anyway, there are lots of other machine learning algorithms out there, some of which are a lot better suited to working with biological data. Unfortunately, unless you're collaborating with the local CS department or have a knowledgeable enthusiast to hand, it can be difficult to actually find implementations of those algorithms. Even if you do obtain one, you're usually forced to jump through hoops to get data into the correct format (my pet hate is software that needs more than one file to describe a single dataset).
That's where Weka comes in. It's a freely available, Java-based suite of machine learning algorithms written by Ian Witten and Eibe Frank - amongst many others - at the University of Waikato in New Zealand (a weka is a kind of curious, flightless bird native to NZ).
The good thing about it is that once you have your data in Weka's ARFF format you can perform any amount of data manipulation, clustering, data mining and rule learning with it, although my installation suffers from an unfortunate memory leak that means I have to periodically restart the software. It's a flexible system; there's a GUI for those just interested in performing one-off tasks or you can access the underlying code via the Weka API. Lots of algorithms, too: decision trees, support vector machines, k-nearest neighbour, rule tables...
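To give a feel for what "getting your data into ARFF" involves, here's a minimal sketch in Python that writes a tiny dataset out as ARFF text: a @relation line, one @attribute line per column, then comma-separated rows under @data, with ? standing in for missing values. The function name to_arff, the sequence attributes and the example rows are all made up for illustration, and the sketch assumes attribute names and values contain no spaces or commas (real data would need quoting).

```python
def to_arff(relation, attributes, rows):
    """Render a small dataset as ARFF text.

    attributes: list of (name, kind) pairs, where kind is either the string
                "numeric" or a list of allowed nominal values.
    rows:       list of value lists, one per instance; None marks a missing value.
    Assumes names and values need no quoting (no spaces, commas or quotes).
    """
    lines = ["@relation " + relation, ""]
    for name, kind in attributes:
        if isinstance(kind, str):
            # Plain type keyword such as "numeric"
            lines.append("@attribute %s %s" % (name, kind))
        else:
            # Nominal attribute: list the allowed values in braces
            lines.append("@attribute %s {%s}" % (name, ",".join(kind)))
    lines += ["", "@data"]
    for row in rows:
        # ARFF uses "?" for a missing value
        lines.append(",".join("?" if v is None else str(v) for v in row))
    return "\n".join(lines) + "\n"


# Hypothetical example data: two sequences with a class label
attributes = [
    ("gc_content", "numeric"),
    ("length", "numeric"),
    ("class", ["coding", "noncoding"]),
]
rows = [
    [0.52, 1200, "coding"],
    [0.41, None, "noncoding"],
]

arff_text = to_arff("sequences", attributes, rows)
print(arff_text)
```

The nice part is that everything, header and data alike, lives in that one file, which should then be loadable from the Weka GUI or the API.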
I discovered Weka a while back via the previous edition of Ian and Eibe's excellent book and that's probably the easiest way to get started with it (usually I'd say screw the book and just dive in, but elements of Weka's online documentation are sadly lacking). The book is also a pretty good introduction to machine learning and explains a lot of the underlying concepts in a clear and concise fashion.
Furthermore:
KDnuggets has a list of other machine learning software available on the web - if you're looking for more information, then that's a good place to start.
