<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss'><id>tag:blogger.com,1999:blog-14832160</id><updated>2010-01-04T15:50:15.145Z</updated><title type='text'>Flags and Lollipops - Bioinformatics Blog</title><subtitle type='html'>Blog of bioinformatics papers, links and stories that I thought were interesting (so your mileage may vary).</subtitle><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default?start-index=26&amp;max-results=25'/><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://www.ghastlyfop.com/blog/atom.xml'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>179</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-14832160.post-3365753464855471322</id><published>2009-06-13T22:50:00.004+01:00</published><updated>2009-06-23T16:04:13.804+01:00</updated><title type='text'>Aggregating activity from Twitter</title><content type='html'>&lt;i&gt;Update: you can't follow a specific set of users using GNIP any more - their feed is equivalent to the 'spritzer' method in the official Twitter API.&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" 
href="http://www.ghastlyfop.com/blog/uploaded_images/Picture-1-722357.png"&gt;&lt;img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 200px; height: 156px;" src="http://www.ghastlyfop.com/blog/uploaded_images/Picture-1-722328.png" alt="" border="0" /&gt;&lt;/a&gt;Interested in building a real-time aggregator for Twitter? Who isn't? You have lots of options:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Just the vanilla API&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Simply call &lt;a href="http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-statuses-user_timeline"&gt;user_timeline&lt;/a&gt; for each user that you are interested in every x minutes.&lt;br /&gt;&lt;br /&gt;The standard &lt;a href="http://apiwiki.twitter.com/Rate-limiting"&gt;rate limit&lt;/a&gt; on the Twitter API is 100 requests per hour, so checking 25 users every 15 minutes (100 requests an hour) is pretty much the best you'll be able to do. If you're a lazy chancer you can try to get your application whitelisted, which removes the rate limit.&lt;br /&gt;&lt;br /&gt;Good:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Very simple&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Not so good:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Too simple - won't scale.&lt;/li&gt;&lt;li&gt;Slow update time (because the number of calls you can make per hour is limited)&lt;/li&gt;&lt;li&gt;Seeing so much redundant data returned for each call makes the internet cry.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;b&gt;Vanilla API + robot&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Create a new Twitter account, log in and follow the people you're interested in aggregating tweets from. 
You don't have to follow people manually - you could do it programmatically using the &lt;a href="http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-friendships%C2%A0create"&gt;friendships/create&lt;/a&gt; API call.&lt;br /&gt;&lt;br /&gt;Now just check the &lt;a href="http://apiwiki.twitter.com/Twitter-REST-API-Method%3A-statuses-friends_timeline"&gt;friends_timeline&lt;/a&gt; for that user as often as you like (up to the hourly rate limit, obviously). Page through results if necessary.&lt;br /&gt;&lt;br /&gt;Twitter has some (sensible) rules about follower / following ratios. Once you're following ~800 people, further follow requests will be blocked; you have to wait until you have more followers before adding anybody else. You can't whitelist your way out of this.&lt;br /&gt;&lt;br /&gt;Good:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Again, pretty simple.&lt;/li&gt;&lt;li&gt;Better update time (aggregation within a couple of minutes of a tweet)&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Not so good:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Can only follow ~800 people before Twitter starts blocking your follow requests. &lt;/li&gt;&lt;li&gt;Users will know that you're aggregating them (is this a bug or a feature?). Can't keep following / unfollowing people - they'll get spammed by emails telling them about it.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;b&gt;GNIP&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.gnip.com/"&gt;GNIP&lt;/a&gt; works with activity streams from a bunch of different web 2.0 sites. Here's how it works in a nutshell:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;you set up a GNIP account&lt;/li&gt;&lt;li&gt;you add rules to your account ("give me all tweets by @twalf", "give me all tweets by @ianmulvany") and set up a web hook (a script on your server). 
You can have up to 25k rules per site for free.&lt;/li&gt;&lt;li&gt;GNIP receives data in real time from Twitter&lt;/li&gt;&lt;li&gt;If any data matches your rule set then GNIP POSTs to your web hook with some metadata about the matching tweet (a unique id, the tweeter's username, a URI for the actual message)&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;Now you'll get pinged whenever anybody in your rules tweets - in close to real time.&lt;br /&gt;&lt;br /&gt;Rules can be &lt;a href="http://docs.google.com/Doc?id=dpw6zj9_0fdcnttgd"&gt;added programmatically&lt;/a&gt; or by hand. GNIP's API docs are pretty opaque but it's actually a fairly simple, efficient system once you've gotten to grips with it.&lt;br /&gt;&lt;br /&gt;Unfortunately the metadata that gets POSTed to you doesn't contain the actual tweet. For that you have to go back to Twitter using the supplied URI, which points to the message in XML format. Remember that there's a rate limit on the Twitter API so by default you won't be able to aggregate more than a hundred messages per hour. This sucks. Whitelisting is pretty much the only way you're going to overcome this.&lt;br /&gt;&lt;br /&gt;Twitter on GNIP is unique in this respect; none of the other services require you to call the originating site to get messages. It's especially annoying as tweets are only 140 characters long - it's definitely not a space / bandwidth issue!&lt;br /&gt;&lt;br /&gt;Good:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Fast update time (pretty close to real time)&lt;/li&gt;&lt;li&gt;GNIP infrastructure can help you aggregate from other sites (Digg, Delicious...) in the future.&lt;/li&gt;&lt;li&gt;Follow up to 25k people for free and without scaling issues.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Not so good:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Relatively complex. 
&lt;/li&gt;&lt;li&gt;GNIP can be a bit flaky - occasionally it goes down and you lose updates for a few hours.&lt;/li&gt;&lt;li&gt;Requires whitelisting by Twitter once you're collecting more than a hundred tweets per hour.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;b&gt;Twitter streaming API&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Twitter has a &lt;a href="http://apiwiki.twitter.com/Streaming-API-Documentation"&gt;streaming API&lt;/a&gt; in alpha.&lt;br /&gt;&lt;br /&gt;You can follow up to 200k users by POSTing their IDs to &lt;b&gt;http://stream.twitter.com/birddog.json&lt;/b&gt; - after you've been approved by Twitter and signed a usage agreement.&lt;br /&gt;&lt;br /&gt;&lt;del&gt;You can follow up to 2k users for free using &lt;b&gt;http://stream.twitter.com/shadow.json&lt;/b&gt; which is similar.&lt;/del&gt;&lt;br /&gt;&lt;br /&gt;You can follow up to 200 users for free using &lt;b&gt;http://stream.twitter.com/follow.json&lt;/b&gt;, which works in a similar way.&lt;br /&gt;&lt;br /&gt;Once you've opened a connection to one of these endpoints it stays open. When a followed user tweets, the tweet comes down the wire as a line of JSON (ending with a carriage return). 
Think &lt;a href="http://simonwillison.net/2007/Dec/5/comet/"&gt;Comet&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Good:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;As fast an update as you're ever going to get.&lt;/li&gt;&lt;li&gt;Don't need to rely on third parties (like GNIP)&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Not so good:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Still in alpha.&lt;/li&gt;&lt;li&gt;Need an agreement from Twitter to follow more than 200 users.&lt;/li&gt;&lt;li&gt;Complex (in that it requires you to move away from reactive, asynchronous scripts towards an app that can keep an HTTP connection open for hours)&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-3365753464855471322?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/3365753464855471322/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=3365753464855471322' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3365753464855471322'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3365753464855471322'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2009/06/aggregating-activity-from-twitter.html' title='Aggregating activity from Twitter'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-4920803736283048684</id><published>2009-03-20T14:12:00.003Z</published><updated>2009-03-20T14:34:35.660Z</updated><title type='text'>Postgenomic hiatus</title><content type='html'>A couple of weeks ago I switched off the Postgenomic aggregation pipeline. &lt;br /&gt;&lt;br /&gt;This is mainly because the pipeline scripts were hogging disk / memory resources on the server, which Postgenomic shares with a bunch of other applications. I'm not sure exactly where the process is sticking, but to be honest it's not a complete surprise.&lt;br /&gt;&lt;br /&gt;Writing a blog aggregator is actually pretty easy; the hard part is dealing with all the weird edge cases. I haven't been paying close attention to the Postgenomic pipeline recently; I think what's currently going wrong is a combination of slow queries across what's now a very large database and one or more odd posts or blogs clogging up the pipeline (I'd post more details if I had them).&lt;br /&gt;&lt;br /&gt;NPG doesn't officially support Postgenomic any more, though it does host it ably. Patching the code is something I do 'on the side', which is why it hasn't been fixed yet - I'm really pushed for time with other projects that need to take priority and will be for at least another three or four weeks.&lt;br /&gt;&lt;br /&gt;In the meantime, no new blogs will be picked up and posts won't be aggregated at postgenomic.com. The site itself and the API will continue to work.&lt;br /&gt;&lt;br /&gt;If you use postgenomic.com for any mashups or scripts then I apologise for the outage - sorry! It will be fixed; it's just the timing that's an issue.&lt;br /&gt;&lt;br /&gt;Until then, please consider switching to Nature.com Blogs - the user-facing features aren't as complete but the backend is. 
What's more it's fully supported by NPG developers and IT staff.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-4920803736283048684?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/4920803736283048684/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=4920803736283048684' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/4920803736283048684'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/4920803736283048684'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2009/03/postgenomic-hiatus.html' title='Postgenomic hiatus'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-2043835155665294776</id><published>2009-02-23T14:11:00.002Z</published><updated>2009-02-23T15:36:18.890Z</updated><title type='text'>Paperview</title><content type='html'>About to put some software here, need to know the permanent URI.&lt;br /&gt;&lt;br /&gt;Er, check back later?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-2043835155665294776?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/2043835155665294776/comments/default' title='Post Comments'/><link 
rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=2043835155665294776' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2043835155665294776'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2043835155665294776'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2009/02/paperview.html' title='Paperview'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-1678740679751257048</id><published>2009-01-30T12:00:00.003Z</published><updated>2009-01-30T12:05:43.737Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='ereaders'/><title type='text'>Sony eReader on OSX</title><content type='html'>For future Google reference, the Sony PRS-505 &lt;i&gt;is&lt;/i&gt; perfectly compatible with OS X (just like the Kindle). If you plug it into your Mac's USB port it should show up as a new disk ("Untitled", but still); just drag and drop EPUB, PDF or text files into the database/media/books directory and voilà.&lt;br /&gt;&lt;br /&gt;Unfortunately it won't charge through USB while plugged into MacBook Pros (don't know about other laptops or desktop Macs) - seems like there's not enough power. 
I had to find a PC to recharge at.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-1678740679751257048?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/1678740679751257048/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=1678740679751257048' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1678740679751257048'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1678740679751257048'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2009/01/sony-ereader-on-osx.html' title='Sony eReader on OSX'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-5309329529967833768</id><published>2009-01-25T14:31:00.003Z</published><updated>2009-01-25T15:26:22.178Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='new scientist'/><title type='text'>Graham Lawton and Darwin was Wrong</title><content type='html'>New Scientist this week has an eye grabbing cover.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;The cover sports a big green tree with the words “Darwin Was Wrong.” I hope they sell a lot of magazines with that load of tripe, since they certainly were not thinking about the generations of school kids and church-goers who will now be treated to that cover in every creationist power point presentation 
between now and the Rapture. How many people do you think will actually read the article to discover what it was, precisely, that Darwin got wrong?&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;(from &lt;a href='http://scienceblogs.com/evolutionblog/2009/01/the_trouble_with_science_journ.php'&gt;EvolutionBlog&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;There's some fair (I think) coverage in a couple of places like &lt;a href='http://sandwalk.blogspot.com/2009/01/darwin-was-wrong.html'&gt;Sandwalk&lt;/a&gt; and lots of &lt;a href='http://blogsearch.google.co.uk/blogsearch?hl=en&amp;q=darwin%20was%20wrong%20new%20scientist&amp;um=1&amp;ie=UTF-8&amp;sa=N&amp;tab=wb'&gt;not so fair coverage&lt;/a&gt; everywhere else. &lt;br /&gt;&lt;br /&gt;I don't really understand what the big deal is. How dare a mainstream publication use a sensational cover to help sell copies? How dare a journalist cover a story that might be selectively quote-mined by creationists?&lt;br /&gt;&lt;br /&gt;It doesn't really matter if you're on a magazine front cover or tucked away on page 127 - if somebody wants to quote you &lt;a href='http://scienceblogs.com/pharyngula/2007/08/the_crazy_billboard_lady_is_ba.php'&gt;out of context&lt;/a&gt; then they can. Surely the thing to do at that point is to confront the person doing the misquoting, not to berate the original author.&lt;br /&gt;&lt;br /&gt;The cover does make a lovely image for ID proponents to include in PowerPoint presentations, yes. But why should New Scientist care? 
Why should they pander to creationists and sell fewer copies of a magazine that probably does more than any number of science blogs to get schoolkids interested in science?&lt;br /&gt;&lt;br /&gt;Graham Lawton is not the enemy.&lt;br /&gt;&lt;br /&gt;(New Scientist &lt;i&gt;is&lt;/i&gt; &lt;a href='http://www.badscience.net/category/papers-new-scientist/'&gt;full of crap&lt;/a&gt; sometimes, though)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-5309329529967833768?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/5309329529967833768/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=5309329529967833768' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/5309329529967833768'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/5309329529967833768'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2009/01/graham-lawton-and-darwin-was-wrong.html' title='Graham Lawton and Darwin was Wrong'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-9139923591417073288</id><published>2009-01-24T18:04:00.002Z</published><updated>2009-01-24T18:08:46.121Z</updated><title type='text'>Unforgiving UTF-8 to ASCII conversion</title><content type='html'>The bulk loader for App Engine doesn't support unicode (?). 
Irksome.&lt;br /&gt;&lt;br /&gt;Here's a quick and dirty solution if you've got iconv installed.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;iconv -c -f UTF-8 -t ASCII utf8_data.csv &gt; ascii_data.csv&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The -c flag makes iconv silently drop any character it can't convert (i.e. anything that doesn't have a direct ASCII match) instead of stopping with an error. Did say it was dirty...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-9139923591417073288?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/9139923591417073288/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=9139923591417073288' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/9139923591417073288'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/9139923591417073288'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2009/01/unforgiving-utf-8-to-ascii-conversion.html' title='Unforgiving UTF-8 to ASCII conversion'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-3871829842370400122</id><published>2009-01-22T20:31:00.006Z</published><updated>2009-01-22T21:12:16.100Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='openid'/><category scheme='http://www.blogger.com/atom/ns#' term='author identifiers'/><category scheme='http://www.blogger.com/atom/ns#' 
term='crossref'/><title type='text'>A specialist OpenID service to provide unique researcher IDs</title><content type='html'>I was going to write a post re: &lt;a href='http://friendfeed.com/e/c1fd00ec-15f9-d894-4ea9-4ffeaac5ae28/A-specialist-OpenID-service-to-provide-unique/'&gt;this epic Friendfeed thread&lt;/a&gt; (and &lt;a href='http://blog.openwetware.org/scienceintheopen/2009/01/20/a-specialist-openid-service-to-provide-unique-researcher-ids/'&gt;Cameron's original post&lt;/a&gt;) but then &lt;a href='http://itc.conversationsnetwork.org/shows/detail1772.html'&gt;Geoff Bilder&lt;/a&gt; left some comments that pretty much covered everything:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;Let me just address a few points: &lt;br /&gt;&lt;br /&gt;a) Yes, CrossRef is exploring this space. &lt;br /&gt;b) For those afraid of CrossRef being in the thrall of "traditional publishers", I will note CrossRef members include PLOS, PubMed Central, The Encyclopedia of Life, Hindawi, Jove, OECD, World Bank, some IRs... In short, we are catholic in our definition of "publisher". I should also note that we are a non-profit. When/if we charge for things, it is only so that we can sustain the service. &lt;br /&gt;c) It is true that CrossRef could go under. Any place could go under. But because so many depend on us already, a central concern of our members is to make arrangements so that we can pass-on data and systems should something happen. &lt;br /&gt;d) CrossRef is looking for something that will work across disciplines. We represent the sciences, social sciences, humanities, etc. &lt;br /&gt;e) Cameron is right- the author ID problem is "much bigger than publishers". We are talking to researchers, librarians, funding agencies, etc. about what they would require from a service. We were at the CNI meeting and Cliff Lynch is on our advisory board and is aware of our project. 
&lt;br /&gt;f) We too see OpenID is a critical component of the system, but we don't think OpenID and the Contributor ID are one and the same. As Richard says, OpenIDs are pretty fragile. There are also complicating issues that would arise from multiple institutional affiliations, etc. (OpenID delegation is only a geek solution to this). &lt;br /&gt;g) Gumunder described our approach pretty well. We envision creating a repository of profiles. People could use open-ids (they might have a few) or shibboleth ids to authenticate with the service in order to edit their profiles. OAuth and MicroID might be used for other aspects of the service (e.g. profile exchange, blog signing)&lt;br /&gt;&lt;/blockquote&gt; &lt;br /&gt;&lt;br /&gt;I'm definitely up for getting something off the ground quickly - and arguably the big disadvantage of CrossRef is that it's not always very good at that, simply because it represents so many interests - but basically they're the people best placed to do this and they have the will and technical ability to see it through. Why compete when you can cooperate?&lt;br /&gt;&lt;br /&gt;You could still start small, be unafraid to fail and try things out before any CrossRef-sanctioned solution arrives, though. 
It might be cool (and useful) to see unique author IDs across particular datasets or disciplines and if things were set up properly you could potentially just import the unique author ID / person pairs into CrossRef later to help seed the system.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-3871829842370400122?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/3871829842370400122/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=3871829842370400122' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3871829842370400122'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3871829842370400122'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2009/01/specialist-openid-service-to-provide.html' title='A specialist OpenID service to provide unique researcher IDs'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-8203133197519972721</id><published>2008-12-09T13:51:00.002Z</published><updated>2008-12-09T13:54:27.093Z</updated><title type='text'>Connotea Ian, today</title><content type='html'>Look at that focus! 
The dedication!&lt;br /&gt;&lt;br /&gt;&lt;img src="http://www.ghastlyfop.com/ianworking.jpg" /&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;New hardware is here&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Code is all set up&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Database files are being moved over now&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Testing is imminent&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;i&gt;don't lose hope&lt;/i&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-8203133197519972721?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/8203133197519972721/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=8203133197519972721' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/8203133197519972721'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/8203133197519972721'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/12/connotea-ian-today.html' title='Connotea Ian, today'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-7817170130307612957</id><published>2008-12-05T10:13:00.003Z</published><updated>2008-12-05T10:18:08.666Z</updated><title type='text'>Strip HTML tags from a string, Ruby edition</title><content type='html'>Get &lt;a 
href='http://code.whytheluckystiff.net/hpricot/'&gt;Hpricot&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;require 'hpricot'&lt;br /&gt;page = Hpricot("&amp;lt;b&amp;gt;some marked up &amp;lt;i&amp;gt;text&amp;lt;/i&amp;gt;&amp;lt;/b&amp;gt;")&lt;br /&gt;puts page.to_plain_text&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Interestingly the &lt;a href='http://code.whytheluckystiff.net/hpricot/wiki/HpricotChallenge#StripallHTMLtags'&gt;Hpricot FAQ&lt;/a&gt; says:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;Q: How do I strip all HTML tags from a page? &lt;br /&gt;A: Use regex replace!&lt;br /&gt;A2: The regex is ok, but will break in some cases, even with valid html. Try the to_plain_text or inner_text methods instead. &lt;br /&gt;&lt;/blockquote&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-7817170130307612957?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/7817170130307612957/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=7817170130307612957' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/7817170130307612957'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/7817170130307612957'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/12/strip-html-tags-from-string-ruby.html' title='Strip HTML tags from a string, Ruby edition'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-1005489970670389392</id><published>2008-12-03T17:36:00.004Z</published><updated>2008-12-03T17:43:44.301Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='google'/><category scheme='http://www.blogger.com/atom/ns#' term='html'/><title type='text'>Strip HTML tags from a string, Python edition</title><content type='html'>Obtain &lt;a href='http://www.crummy.com/software/BeautifulSoup/'&gt;Beautiful Soup&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;# Beautiful Soup 3 import (Beautiful Soup 4 uses "from bs4 import BeautifulSoup")&lt;br /&gt;from BeautifulSoup import BeautifulSoup&lt;br /&gt;&lt;br /&gt;# Join every text node in the parsed document, leaving the tags behind&lt;br /&gt;''.join(BeautifulSoup(page).findAll(text=True))&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;where 'page' is your string of text and HTML.&lt;br /&gt;&lt;br /&gt;I'm not a Pythonista, so there might be a nicer way of doing it (Beautiful Soup is a lot of overhead). You might want to expand on this a bit to make sure spacing is handled OK, that you can keep certain tags, and so on. Feel free to post corrections or better suggestions in the comments.&lt;br /&gt;&lt;br /&gt;Just don't use one-line &amp;lt;(?:.*?)&amp;gt; regular expressions. 
No, really.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-1005489970670389392?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/1005489970670389392/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=1005489970670389392' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1005489970670389392'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1005489970670389392'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/12/strip-html-tags-from-string-python.html' title='Strip HTML tags from a string, Python edition'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-1035112992403897471</id><published>2008-10-25T23:43:00.007+01:00</published><updated>2008-11-11T16:53:37.026Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='ed94d0312d51b6a703d2a3db0e9f7486'/><title type='text'>Academia.edu</title><content type='html'>I like &lt;a href='http://academia.edu'&gt;academia.edu&lt;/a&gt;. The academic family tree idea is pretty cool (I know that the concept has been around for a while in various guises but their implementation is pretty slick) and I like the fact that new visitors can arrive and be interacting with the site within minutes. 
It's also nice to see an academic networking site that, well, doesn't look like Facebook.&lt;br /&gt;&lt;br /&gt;I'm also impressed by the speed at which they've been throwing up refinements and bug fixes... and by the adverts on &lt;a href='http://www.phdcomics.com/comics.php'&gt;PhD&lt;/a&gt;. Canny marketing (good work &lt;a href='http://sfbay.craigslist.org/sfc/mar/883535723.html'&gt;poorly paid but well fed intern&lt;/a&gt;)! The academia.edu team are a smart bunch of people which is probably how they &lt;a href='http://www.crunchbase.com/company/academia-edu'&gt;got funding&lt;/a&gt; in the first place.&lt;br /&gt;&lt;br /&gt;For balance what's &lt;i&gt;not&lt;/i&gt; good about it? The flash freezes my mac on an empty cache... and the .edu TLD is really only for educational institutions, not commercial enterprise (vetting &lt;a href='http://en.wikipedia.org/wiki/.edu'&gt;only started in 2001&lt;/a&gt;, academia.edu was first registered back in '99). Tsk! Ironically my other bugbear is that I can't join properly because I work for a commercial enterprise and not an accredited educational institution.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-1035112992403897471?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/1035112992403897471/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=1035112992403897471' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1035112992403897471'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1035112992403897471'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/10/academiaedu.html' 
title='Academia.edu'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-3844133151062123693</id><published>2008-05-24T22:46:00.002+01:00</published><updated>2008-05-24T22:50:30.685+01:00</updated><title type='text'>Disappointed with Popfly</title><content type='html'>&lt;a href='http://www.popfly.com/'&gt;Popfly&lt;/a&gt; is the mashup editor that Microsoft released last year. The idea is good. The 3D graphics are good. Silverlight is a bit buggy in Firefox (sidebars don't always redraw properly) but that's OK.&lt;br /&gt;&lt;br /&gt;If you're going to create a web 2.0 mashup builder, though, don't you think it'd be a good idea to &lt;a href='http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1654015&amp;SiteID=1'&gt;provide some Atom support&lt;/a&gt;?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-3844133151062123693?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/3844133151062123693/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=3844133151062123693' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3844133151062123693'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/3844133151062123693'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/05/disappointed-with-popfly.html' title='Disappointed with 
Popfly'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-2058507409502687178</id><published>2008-05-19T17:08:00.006+01:00</published><updated>2008-05-19T17:15:29.759+01:00</updated><title type='text'>Meta-analysis</title><content type='html'>The journal platform team here at NPG just rolled out machine readable metadata for the papers we publish in Dublin Core, &lt;a href='http://www.prismstandard.org/'&gt;PRISM&lt;/a&gt; (good PRISM, not to be confused with &lt;a href='http://www.researchinformation.info/news/news_story.php?news_id=120'&gt;evil PRISM&lt;/a&gt;) and Google metadata formats.&lt;br /&gt;&lt;br /&gt;No more scraping to automatically get the citation for a paper, it's all in the HEAD:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_journal_title&amp;quot; content=&amp;quot;Nature&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_publisher&amp;quot; content=&amp;quot;Nature Publishing Group&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_authors&amp;quot; content=&amp;quot;Paul Schenk, Isamu Matsuyama, Francis Nimmo&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_title&amp;quot; content=&amp;quot;True polar wander on Europa from global-scale small-circle depressions&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_volume&amp;quot; content=&amp;quot;453&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_issue&amp;quot; content=&amp;quot;7193&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta name=&amp;quot;citation_firstpage&amp;quot; content=&amp;quot;368&amp;quot; /&amp;gt;&lt;br /&gt;&amp;lt;meta 
name=&amp;quot;citation_doi&amp;quot; content=&amp;quot;doi:10.1038/nature06911&amp;quot; /&amp;gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Useful for apps like Zotero and Connotea (which before now downloaded two files each time you bookmarked a Nature paper: the page itself and then the linked EndNote file to parse).&lt;br /&gt;&lt;br /&gt;The metadata will be there for all papers going forward and back through some of the archives.&lt;br /&gt;&lt;br /&gt;For fulltext indexing of papers behind the paywall you can use the linked &lt;a href='http://opentextmining.org/wiki/Main_Page'&gt;OTMI&lt;/a&gt; file (I only just saw &lt;a href='http://otmi.twease.org/otmi/app'&gt;Twease&lt;/a&gt;, which does just that) although there's only OTMI for Nature papers at the moment, I think.&lt;br /&gt;&lt;br /&gt;Lastly, at some point in the future we're aiming to put &lt;a href='http://www.adobe.com/products/xmp/'&gt;XMP&lt;/a&gt; metadata in our PDFs, which should make it much easier for scripts and applications (like &lt;a href='http://mekentosj.com/papers/'&gt;Papers&lt;/a&gt;) to look at PDF files on your filesystem and work out what they represent.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-2058507409502687178?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/2058507409502687178/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=2058507409502687178' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2058507409502687178'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2058507409502687178'/><link rel='alternate' type='text/html' 
href='http://www.ghastlyfop.com/blog/2008/05/meta-analysis.html' title='Meta-analysis'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-6060331122924735147</id><published>2008-04-18T11:41:00.004+01:00</published><updated>2008-04-18T12:01:26.234+01:00</updated><title type='text'>Nice work Pedro!</title><content type='html'>Noticed while leafing through today's Nature that Pedro has a paper out (&lt;a href='http://www.nature.com/nature/journal/v452/n7189/full/nature06847.html'&gt;Isalan et al.&lt;/a&gt;, Evolvability and hierarchy in rewired bacterial gene networks).&lt;br /&gt;&lt;br /&gt;There's more on this over at &lt;a href='http://pbeltrao.blogspot.com/2008/04/shuffle-project.html'&gt;Public Rambling&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-6060331122924735147?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/6060331122924735147/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=6060331122924735147' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6060331122924735147'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6060331122924735147'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/nice-work-pedro.html' title='Nice work 
Pedro!'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-1912616361265309197</id><published>2008-04-17T16:34:00.001+01:00</published><updated>2008-04-17T16:34:43.609+01:00</updated><title type='text'>Ian owes me a pint</title><content type='html'>&lt;i&gt;(update: Gavin Bell at Nature gave up one of his app spots so that I could put this live, which I did: only to discover that Google App Engine is even more unforgiving of timeouts than Facebook. I'm currently trying to work out how to fix the bookmarking process; for now it doesn't work very well. Also the search is broken, though that's Google's fault and not mine.)&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;I bet &lt;a href="http://network.nature.com/profile/U3DF456C6"&gt;Ian&lt;/a&gt; earlier that I could rewrite Connotea on &lt;a href="http://code.google.com/appengine/"&gt;App Engine&lt;/a&gt; in six hours. I can't remember why. Probably ego (mine, I mean). He didn't actually bet me a pint but he should have done...&lt;br /&gt;&lt;br /&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://www.ghastlyfop.com/blog/uploaded_images/pycite-718416.png" alt="" border="0" /&gt;&lt;br /&gt;&lt;br /&gt;... because the original estimate was a tad optimistic (ahem). After twelve hours I've produced &lt;a href="http://code.google.com/p/pycite/"&gt;pycite&lt;/a&gt;, though, which is pretty good going I think. I'll admit it: Python is actually very cool.&lt;br /&gt;&lt;br /&gt;pycite is three hundred lines of logic and a set of HTML templates that implement a (very simple) social bookmarking service. 
Sadly I don't actually have an App Engine account so it's not live on the web anywhere (I'll buy whoever &lt;i&gt;does&lt;/i&gt; have an account and puts it up first a pint - let's spread the love), you'll have to download it and run it locally to see it in action.&lt;br /&gt;&lt;br /&gt;What you can do with it:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;run it without owning a server of your own&lt;/li&gt;&lt;br /&gt;&lt;li&gt;log in with your Google account&lt;/li&gt;&lt;br /&gt;&lt;li&gt;add new bookmarks (the citation will be collected automagically)&lt;/li&gt;&lt;br /&gt;&lt;li&gt;view everybody's bookmarks&lt;/li&gt;&lt;br /&gt;&lt;li&gt;filter bookmarks by user:&lt;pre&gt;http://path.to.pycite/users/bob.smith&lt;/pre&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;and by tag:&lt;pre&gt;http://path.to.pycite/tags/diabetes&lt;/pre&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;and by user and tag:&lt;pre&gt;http://path.to.pycite/users/bob.smith/tags/diabetes&lt;/pre&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;and by keyword (the full text of each bookmarked page is searchable):&lt;pre&gt;http://path.to.pycite/users/bob.smith?q=t2d&lt;/pre&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;get atom feeds for all of the above&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;What you &lt;b&gt;can't&lt;/b&gt; do with it (yet):&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;edit or delete bookmarks&lt;/li&gt;&lt;br /&gt;&lt;li&gt;anything else&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;I've put it all up on &lt;a href="http://code.google.com/p/pycite/"&gt;Google Code&lt;/a&gt;. It's fairly straightforward stuff so if you've got any brilliant social bookmarking ideas then go for it. 
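&lt;br /&gt;&lt;br /&gt;To make the URL scheme in the list above a bit more concrete, here's a minimal Python sketch of how a path could be mapped onto bookmark filters. This is not pycite's actual code: the function name parse_filters and the dictionary keys are hypothetical, made up purely for illustration.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
```python
from urllib.parse import urlsplit, parse_qs


def parse_filters(url):
    """Map a pycite-style URL onto user / tag / keyword filters.

    Hypothetical helper: the key names are illustrative only.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    filters = {"user": None, "tag": None, "q": None}

    # Path segments come in pairs: ("users", name) and ("tags", tag).
    for kind, value in zip(segments[0::2], segments[1::2]):
        if kind == "users":
            filters["user"] = value
        elif kind == "tags":
            filters["tag"] = value

    # The keyword search arrives as the q query parameter.
    query = parse_qs(parts.query)
    if "q" in query:
        filters["q"] = query["q"][0]
    return filters


print(parse_filters("http://path.to.pycite/users/bob.smith/tags/diabetes"))
print(parse_filters("http://path.to.pycite/users/bob.smith?q=t2d"))
```
&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;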
Send me an email and I'll give you write access to the subversion repository.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-1912616361265309197?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/1912616361265309197/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=1912616361265309197' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1912616361265309197'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1912616361265309197'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/ian-owes-me-pint_17.html' title='Ian owes me a pint'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-2282167223440949006</id><published>2008-04-07T14:26:00.003+01:00</published><updated>2008-04-07T14:39:40.696+01:00</updated><title type='text'>Gaggle</title><content type='html'>&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" src="http://www.ghastlyfop.com/blog/uploaded_images/canada-goose-729490.jpg" border="0" alt="" /&gt; I hadn't heard of &lt;a href='http://www.systemsbiology.org/Technology/Data_Management/Gaggle'&gt;Gaggle&lt;/a&gt; before but both &lt;a href='http://mndoci.com/blog/'&gt;Deepak&lt;/a&gt; and Sutee Dee (who needs a homepage.. 
;)) from the ISB mentioned it last week so I figured it was worth a look. It's a system built by Paul Shannon at the ISB in Seattle to share data between different bioinformatics applications on the fly. It has been around for a while, I think - there was a &lt;a href='http://www.biomedcentral.com/1471-2105/7/176'&gt;BMC Bioinformatics paper&lt;/a&gt; describing the system in March 2006.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;A small server program (the 'Gaggle Boss') provides communication among analysis and display programs (the 'geese') which are modest and minimal adaptations of existing (or novel) bioinformatics and computational biology programs, and web resources. The Boss and the geese all run as separate programs on the user's desktop computer, communicating with each other, at the user's behest, by passing simple messages.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;(from &lt;a href='http://www.systemsbiology.org/Technology/Data_Management/Gaggle'&gt;the ISB's 'about Gaggle' page&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;I ran through &lt;a href='http://gaggle.systemsbiology.org/docs/2007-04/demo/hpylori/'&gt;a tutorial&lt;/a&gt; showing data sharing between (modified versions of) &lt;a href='http://www.cytoscape.org/'&gt;Cytoscape&lt;/a&gt; (also developed by ISB), R and a data matrix viewer with no problem. Quite cool.&lt;br /&gt;&lt;br /&gt;You can't share data from an arbitrary application (I don't think?): applications need to be modified to send messages to the Boss goose. Having said that there's a Firefox extension called Firegoose which lets you pass messages to and from web apps, Entrez etc. I couldn't get it working properly but suspect that's something to do with my install rather than the extension itself.&lt;br /&gt;&lt;br /&gt;Anyway, it's good to see stuff like this. Truth be told it's not the slickest thing ever, but it's still pretty cool - and it works. 
I wonder if you could turn it into a simple lab notebook - could you write a brief description of what you're going to try and do for the Boss app every time you send data to another app or something?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-2282167223440949006?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/2282167223440949006/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=2282167223440949006' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2282167223440949006'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2282167223440949006'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/gaggle.html' title='Gaggle'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-6439854942449577655</id><published>2008-04-04T00:22:00.006+01:00</published><updated>2008-04-04T06:49:54.883+01:00</updated><title type='text'>Why you should try online dating</title><content type='html'>(you can jump to the short answer &lt;a href='#whyonline'&gt;here&lt;/a&gt;, if you're feeling impatient)&lt;br /&gt;&lt;br /&gt;Onto the psychology &lt;i&gt;of&lt;/i&gt; social media. 
&lt;a href='http://students.washington.edu/stech/'&gt;Kristin Stecher&lt;/a&gt; of the University of Washington and Dave Evans of Psychster LLC both gave interesting talks about profile pages.&lt;br /&gt;&lt;br /&gt;&lt;a href='http://www.psychster.com/'&gt;Psychster&lt;/a&gt; is a consulting company dedicated to "the social science of social networking". Recently they've been looking at interpersonal perception (how does person A perceive person B? How close is that to B's self perception?). Most research into this uses 'fake' people - i.e. A is given a detailed written description of B and works off of that, rather than meeting anybody face to face.&lt;br /&gt;&lt;br /&gt;To try and get a large 'real people' dataset Psychster created a Facebook application (and later &lt;a href='http://youjustgetme.com/'&gt;a website&lt;/a&gt;) where users could fill out a questionnaire that rated their personality on a variant of the &lt;a href='http://en.wikipedia.org/wiki/Big_Five_personality_traits'&gt;big five&lt;/a&gt; personality inventory (the big five being openness, conscientiousness, extraversion, agreeableness, and neuroticism). They then had the option of rating the personalities of other people (not just their friends), the idea being to collect how users saw themselves, how others saw them and the correlation between the two.&lt;br /&gt;&lt;br /&gt;On the standalone website users created profiles to reflect their personalities. Profiles could contain any number of elements (name, location, gender, favourite movie, most embarrassing moment...) 
chosen from a large list.&lt;br /&gt;&lt;br /&gt;The results in general:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; people do 'get' each other (where to 'get' a person means to guess a personality close to their actual, self-rated personality).&lt;br /&gt;&lt;li&gt; people on Facebook get each other better (this kind of figures - you'd want to go rate your real life friends).&lt;br /&gt;&lt;li&gt; women are better guessers than men - but only when guessing random strangers.&lt;br /&gt;&lt;li&gt; women are easier to get.&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Psychster looked at different profile elements on the standalone website to see if the presence of any in particular was correlated with higher rates of accuracy.&lt;br /&gt;&lt;br /&gt;Profile elements that make somebody easier to get:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; A link to a funny video (the number one predictor of personality)&lt;br /&gt;&lt;li&gt; What makes me glad to be alive?&lt;br /&gt;&lt;li&gt; Most embarrassing thing I ever did:&lt;br /&gt;&lt;li&gt; Proudest thing I ever did:&lt;br /&gt;&lt;li&gt; My spirituality:&lt;br /&gt;&lt;li&gt; A great person:&lt;br /&gt;&lt;li&gt; I believe this:&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;Profile elements that make you harder to get:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; Profile picture (but only if it is of a non-person)&lt;br /&gt;&lt;li&gt; An awful website:&lt;br /&gt;&lt;li&gt; An awful person:&lt;br /&gt;&lt;li&gt; A great book:&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;That last one (naming a great book making it harder to guess your personality) is pretty interesting. Dave did say that he hadn't yet done any proper analysis of why it might be. I wonder if there's any research into how much (or little) reading habits have to do with your personality? 
&lt;a href='http://www.intergalacticmedicineshow.com/cgi-bin/mag.cgi?article=012&amp;do=columns&amp;vol=carol_pinchefsky'&gt;Here's a tangent&lt;/a&gt; (why do some people get interested in science fiction?) if you're interested. &lt;a href='http://www.sciencedirect.com/science?_ob=ArticleURL&amp;_udi=B6WM0-4H3Y9GN-3&amp;_user=10&amp;_rdoc=1&amp;_fmt=&amp;_orig=search&amp;_sort=d&amp;view=c&amp;_acct=C000050221&amp;_version=1&amp;_urlVersion=0&amp;_userid=10&amp;md5=603356cc387f95194a5f4aac8a7fe31c'&gt;Here's another&lt;/a&gt; (people who read lots of fiction aren't socially awkward; in fact the tendency to get absorbed in a story correlates with empathy scores).&lt;br /&gt;&lt;br /&gt;OK, anyway...&lt;br /&gt;&lt;br /&gt;Why were women easier to read? Because they tended to fill out the profile elements that were good predictors ("my most embarrassing moment").&lt;br /&gt;&lt;br /&gt;At this point you might be wondering (well, I wondered) who cares how well an online profile reflects your true personality. One answer is the online dating industry, which has a vested interest in not setting you up with anybody plainly unsuitable. If profiles were set up the right way then maybe you could tell in advance if the guy or girl messaging you is worth seeing in the real world.&lt;br /&gt;&lt;br /&gt;Sticking with the online dating theme, &lt;a name='whyonline'&gt;&amp;nbsp;&lt;/a&gt;it turns out that the levels of agreement (between actual and guessed personalities) you get by looking at Facebook profiles approach those you see in long-term acquaintances. They're certainly better than what you get after a short face-to-face meeting (like a date). In fact, short f2f meetings are particularly bad at helping you gauge levels of agreeableness and neuroticism - not good. 
I think this means that stalking potential partners online actually makes good, practical sense and should be encouraged.&lt;br /&gt;&lt;br /&gt;In case you needed any reassurance.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-6439854942449577655?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/6439854942449577655/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=6439854942449577655' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6439854942449577655'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6439854942449577655'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/why-you-should-try-online-dating.html' title='Why you should try online dating'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-6361116208557056094</id><published>2008-04-03T19:32:00.004+01:00</published><updated>2008-04-04T00:13:33.778+01:00</updated><title type='text'>Do you use language differently when you're depressed?</title><content type='html'>Can you tell if somebody is clinically depressed by analyzing their use of language? 
I'm not a psychologist, so take the background info below with a pinch of salt, but the topic came up at ICWSM (more on how later) and I thought it was fascinating.&lt;br /&gt;&lt;br /&gt;In 2001 &lt;a href='http://www.psychosomaticmedicine.org/cgi/content/full/63/4/517'&gt;Stirman et al&lt;/a&gt; compared the collected works of nine poets who eventually committed suicide and nine poets who didn't (as a control set). Their theory was that the depressed (and eventually suicidal) poets would use more first person singular (&lt;i&gt;I, me, my&lt;/i&gt;) and words related to hopelessness and desperation (&lt;i&gt;hate, worthless, death, grave&lt;/i&gt;) and that was supported by the data.&lt;br /&gt;&lt;br /&gt;&lt;a href='http://www.ingentaconnect.com/content/psych/pcem/2004/00000018/00000008/art00006'&gt;Rude et al&lt;/a&gt; later found something similar when they compared essays (on a common topic - "coming to college") written by college students. Depressed students used "I" and negative words significantly more often than controls.&lt;br /&gt;&lt;br /&gt;Interestingly &lt;a href='http://ajp.psychiatryonline.org/cgi/content/abstract/145/4/464?ijkey=e783c5358ba68086685be026e20d59dfa026b950&amp;keytype2=tf_ipsecsha'&gt;Oxman et al&lt;/a&gt; found that spoken language patterns can be a good discriminator for classifying patients as depressed or not, so it's not just written language use that may be different.&lt;br /&gt;&lt;br /&gt;Anyway, at ICWSM &lt;a href='http://homepage.psy.utexas.edu/HomePage/Faculty/Gosling/People.htm#Nairan%20Ramirez'&gt;Nairán Ramírez-Esparza&lt;/a&gt; from the University of Texas presented a language analysis of some depression discussion boards on About.com. 
She ran a two part study: the first to confirm Stirman and Rude's findings and the second making use of the fact that the About.com boards are bilingual (there's a Spanish section too) to see how different cultures talk about depression.&lt;br /&gt;&lt;br /&gt;Her approach was pretty simple - she collected ~ 400 posts from the depression forum and 400 posts from a breast cancer forum as a control, broke each post down into single words and then used off-the-shelf software to classify them (as verb, adjective, pronoun, positive emotion, negative emotion, etc.). She did this for both English and Spanish sections of the site.&lt;br /&gt;&lt;br /&gt;Her results seemed to confirm the earlier studies: first person pronouns were found three times more frequently in the depression forum posts than in the controls and words relating to negative emotions occurred four times as frequently. This was true for both English and Spanish datasets.&lt;br /&gt;&lt;br /&gt;The second part of her study was to see if English and Spanish speakers approach depression differently; what do they talk about? She studied this by using normalized word frequency counts then grouping different words into themes.&lt;br /&gt;&lt;br /&gt;The top five themes discussed in the English dataset:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Treatment (medicine, doctor, therapist...)&lt;br /&gt;Disclosure (tell, discuss, talk...)&lt;br /&gt;Family (mom, dad, brother, sister...)&lt;br /&gt;Symptoms ...&lt;br /&gt;School &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;And the top five themes from the Spanish dataset:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;Family&lt;br /&gt;Relationship history  &lt;br /&gt;Hopelessness&lt;br /&gt;School&lt;br /&gt;Treatment&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;I'm a bit suspicious of results that are so intuitively appealing (family and romance are more important to Spanish people?). 
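&lt;br /&gt;&lt;br /&gt;For what it's worth, the counting step behind those pronoun figures is easy to sketch in Python. This is only an illustration and not the software used in the study: the tiny pronoun list, the function name first_person_rate and the two example posts are all made up here.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;
```python
import re
from collections import Counter

# Assumed, deliberately tiny word list; real tools use large dictionaries.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}


def first_person_rate(post):
    """Return the share of words in a post that are first person singular."""
    # Break the post into lower-case single words.
    words = re.findall(r"[a-z']+", post.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    hits = sum(counts[w] for w in FIRST_PERSON)
    return hits / len(words)


# Made-up example posts, not data from the study.
depression_post = "I feel like my days blur together and nobody hears me"
control_post = "The doctor said the treatment starts next week"

print(first_person_rate(depression_post))
print(first_person_rate(control_post))
```
&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Normalizing by post length like this matters because longer posts would otherwise look more self-focused simply by containing more words.&lt;br /&gt;&lt;br /&gt;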
One thing that I did wonder was how much the results are skewed by different community expectations: if you visit a discussion forum where people are sharing stories about their depression and everybody else mentions their family maybe you feel compelled to mention your family too. Maybe the English language forums are dominated by a younger age group and so older visitors shy away, or v.v.&lt;br /&gt;&lt;br /&gt;Anyway, it was interesting stuff. Somebody in the audience wondered aloud if this means that you could build a system to identify people at risk of depression (or perhaps more to the point suicide) by analyzing their language online. Maybe this could be built into the next version of the anti-plagiarism software used in high schools and colleges (I'm not advocating that, just saying)...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-6361116208557056094?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/6361116208557056094/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=6361116208557056094' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6361116208557056094'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6361116208557056094'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/do-you-use-language-differently-when.html' title='Do you use language differently when you&apos;re depressed?'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' 
value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-4920127895857641509</id><published>2008-04-02T02:26:00.011+01:00</published><updated>2008-04-02T02:56:11.267+01:00</updated><title type='text'>Analyzing MySpace profiles</title><content type='html'>&lt;p&gt;This morning &lt;a href="http://faculty.cs.tamu.edu/caverlee/index.html"&gt;James Caverlee&lt;/a&gt; presented his study of almost two million (well, two sets of ~ one million - one set of profiles picked at random and one gathered by traversing the social graph) MySpace profiles. It was interesting stuff. Some bits and pieces below.&lt;br /&gt;&lt;/p&gt;&lt;p&gt;MySpace users live up to gender stereotypes, rather disappointingly:&lt;br /&gt;&lt;br /&gt;&lt;style type="text/css"&gt;.nobrtable br { display: none }&lt;/style&gt;&lt;br /&gt;&lt;div class="nobrtable"&gt;&lt;br /&gt;&lt;b&gt;Words most frequently appearing in MySpace profiles&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;table style="padding: 0px; margin-top: 5px;" border="1" cellpadding="4" cellspacing="0" width="600"&gt;&lt;br /&gt;&lt;tbody&gt;&lt;br /&gt;&lt;tr&gt;&lt;br /&gt;&lt;td width="50%"&gt;Women&lt;/td&gt;&lt;br /&gt;&lt;td width="50%"&gt;Men&lt;/td&gt;&lt;br /&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;br /&gt;&lt;td width="50%"&gt;&lt;br /&gt;love, people, dancing, life, shopping, can, girl, family, hearts, being, have, notebook, are, dance, favourite, things&lt;br /&gt;&lt;/td&gt;&lt;br /&gt;&lt;td width="50%"&gt;&lt;br /&gt;dating, sport, networking, metal, serious, football, relationship, sh*t, single, wars, straight, band, video, f*ck, guitar, gay&lt;br /&gt;&lt;/td&gt;&lt;br /&gt;&lt;/tr&gt;&lt;br /&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;And geographic ones (didn't manage to write all of these down in time):&lt;br /&gt;&lt;br /&gt;&lt;div class="nobrtable"&gt;&lt;br /&gt;&lt;table border="1" 
cellpadding="4" cellspacing="0" width="600"&gt;&lt;br /&gt;&lt;tbody&gt;&lt;br /&gt;&lt;tr&gt;&lt;br /&gt;&lt;td width="50%"&gt;users in Oregon&lt;/td&gt;&lt;br /&gt;&lt;td width="50%"&gt;users in Alabama&lt;/td&gt;&lt;br /&gt;&lt;/tr&gt;&lt;br /&gt;&lt;tr&gt;&lt;br /&gt;&lt;td width="50%"&gt;&lt;br /&gt;camping, hiking, pixies, snowboarding, wine, vegans&lt;br /&gt;&lt;/td&gt;&lt;br /&gt;&lt;td width="50%"&gt;&lt;br /&gt;football, jesus, gospel, nascar&lt;br /&gt;&lt;/td&gt;&lt;br /&gt;&lt;/tr&gt;&lt;br /&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Demographics-wise, ~50% of the profiles that they picked at random had one or no friends (i.e. weren't active). Age-wise, the peak is at 24, with smaller peaks at 69 and 100. The 69 peak is a secret MySpace code, apparently - it means that you're interested in, uh, one-handed typing (this wasn't made clear, but I'm guessing). By having a common age - 69 - you can use MySpace's advanced search to find others looking for the same thing. 69-year-olds on MySpace are most similar (in their use of language) to people in their mid-thirties.&lt;br /&gt;&lt;br /&gt;Younger users are overwhelmingly female. There is a 2:1 ratio of girls to boys at age 14. This difference decreases as age increases. The flip-over point is at 20 - after that you start seeing more men than women.&lt;br /&gt;&lt;br /&gt;About 20% of the profiles in the connected dataset were marked as 'private'. Over time this percentage is rising. Having privacy preferences set is negatively correlated with age.&lt;br /&gt;&lt;br /&gt;He had a fantastic slide showing top terms wrt age... 
will post it and a link to the slideshow when it's online.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-4920127895857641509?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/4920127895857641509/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=4920127895857641509' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/4920127895857641509'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/4920127895857641509'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/analyzing-myspace-profiles.html' title='Analyzing MySpace profiles'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-384010838642059061</id><published>2008-04-01T06:43:00.003+01:00</published><updated>2008-04-01T07:00:24.343+01:00</updated><title type='text'>Tossed Salad and Scrambled Eggs</title><content type='html'>I'm in Seattle for the ICWSM. The first day just finished and I'm going to blog about the more interesting talks tomorrow when I'm more awake. 
In the meantime:&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt; Crowdvine for conferences is actually pretty useful&lt;br /&gt;&lt;li&gt; &lt;a href='http://www.imdb.com/title/tt0891527/'&gt;Lions for Lambs&lt;/a&gt; is terrible&lt;br /&gt;&lt;li&gt; &lt;a href='http://images.google.co.uk/images?hl=en&amp;client=firefox-a&amp;channel=s&amp;rls=org.mozilla:en-US:official&amp;hs=VtY&amp;resnum=0&amp;q=st+trinians&amp;um=1&amp;ie=UTF-8&amp;sa=N&amp;tab=wi'&gt;St Trinians&lt;/a&gt; is actually quite good&lt;br /&gt;&lt;li&gt; &lt;a href='http://en.wikipedia.org/wiki/The_Century_of_the_Self'&gt;The Century of the Self&lt;/a&gt; is brilliant&lt;br /&gt;&lt;li&gt; Seattle looks really nice from the air&lt;br /&gt;&lt;li&gt; Note to self: US Milky Way bars = UK Mars bars, tricksy bastards&lt;br /&gt;&lt;li&gt; Steak + beer + bay views = awesome (thanks Deepak!)&lt;br /&gt;&lt;li&gt; More Starbuckses than normal&lt;br /&gt;&lt;li&gt; Everybody is disconcertingly friendly. People keep offering to take me skiing. And to see waterfalls. People here big on waterfalls&lt;br /&gt;&lt;li&gt; &lt;a href='http://labs.live.com/'&gt;MS Live Labs&lt;/a&gt; are hiring&lt;br /&gt;&lt;li&gt; &lt;a href='http://en.wikipedia.org/wiki/Brad_Fitzpatrick'&gt;Brad Fitzpatrick&lt;/a&gt; is a great speaker but I found his talk disappointing - too much hand waving about OpenID / OAuth / XMPP / XRDS. Dude, it's a room full of social network developers, you're preaching to the converted&lt;br /&gt;&lt;li&gt; Sadly &lt;a href='http://www.hpl.hp.com/research/idl/people/huberman/'&gt;Bernardo Huberman&lt;/a&gt; has cancelled. Marc "most unGoogleable name ever" Smith is talking instead. 
Marc is either the founder of Poetry Slam (cool), a Happy Hardcore DJ (not cool) or a senior research sociologist at Microsoft Research (as yet undecided)&lt;br /&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-384010838642059061?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/384010838642059061/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=384010838642059061' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/384010838642059061'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/384010838642059061'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/04/tossed-salad-and-scrambled-eggs.html' title='Tossed Salad and Scrambled Eggs'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-1407953762224469141</id><published>2008-03-19T11:31:00.008Z</published><updated>2008-03-19T12:44:49.476Z</updated><title type='text'>Dawkins officially bigger than Jesus - datamining Scienceblogs.com</title><content type='html'>I've run all of the posts from Scienceblogs.com in 2007 through the &lt;a href='http://www.programmableweb.com/api/clearforest-semantic-web-services1/mashups'&gt;ClearForest API&lt;/a&gt;. 
ClearForest extracts entities - people, places, organizations - from plain text.&lt;br /&gt;&lt;br /&gt;I'm in the process of pulling things together for a visualization, but here's a quick answer to the 'who are Sciblings talking about?' question. The 'count' is the number of times that each entity was seen (could be multiple times in the same post) across 2007.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;+-----------------------------------------------+-------+&lt;br /&gt;| term                                          | count |&lt;br /&gt;+-----------------------------------------------+-------+&lt;br /&gt;| Michael Egnor                                 |  1855 | &lt;br /&gt;| Richard Dawkins                               |  1737 | &lt;br /&gt;| Bush                                          |  1669 | &lt;br /&gt;| Congress                                      |  1430 | &lt;br /&gt;| Charles Darwin                                |  1226 | &lt;br /&gt;| Michael Behe                                  |  1031 | &lt;br /&gt;| Chris Mooney                                  |   927 | &lt;br /&gt;| FDA                                           |   920 | &lt;br /&gt;| DCA                                           |   765 | &lt;br /&gt;| National Aeronautics and Space Administration |   745 | &lt;br /&gt;| National Institute of Health                  |   741 | &lt;br /&gt;| Bush administration                           |   721 | &lt;br /&gt;| Google                                        |   700 | &lt;br /&gt;| Guillermo Gonzalez                            |   691 | &lt;br /&gt;| White House                                   |   658 | &lt;br /&gt;| Supreme Court                                 |   655 | &lt;br /&gt;| Thomas Jefferson                              |   632 | &lt;br /&gt;| John Edwards                                  |   614 | &lt;br /&gt;| Casey Luskin                                  |   605 | &lt;br /&gt;| George W. 
Bush                                |   603 | &lt;br /&gt;| Jesus Christ                                  |   601 | &lt;br /&gt;| Discovery Institute                           |   596 | &lt;br /&gt;| the New York Times                            |   587 | &lt;br /&gt;| Larry Moran                                   |   576 | &lt;br /&gt;| World Health Organization                     |   543 | &lt;br /&gt;| Hillary Clinton                               |   517 | &lt;br /&gt;+-----------------------------------------------+-------+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Bear in mind that ClearForest extracts &lt;i&gt;entities&lt;/i&gt;, not key terms. It can't tell us how often blog posts are talking about mammoth DNA, supernovae or dicyemid mesozoa. That's a different dataset entirely...&lt;br /&gt;&lt;br /&gt;.... this one, in fact, generated using the Yahoo! term extraction API which pulls out important concepts (terms) from text. The dataset is about half the size of the above as I'm only including ScienceBlogs indexed in &lt;a href='http://www.postgenomic.com'&gt;Postgenomic&lt;/a&gt;. 
Here 'count' is the number of distinct posts containing a term:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;+---------------------+-------+&lt;br /&gt;| term                | count |&lt;br /&gt;+---------------------+-------+&lt;br /&gt;| evolution           |   963 | &lt;br /&gt;| carnival            |   923 | &lt;br /&gt;| global warming      |   640 | &lt;br /&gt;| intelligent design  |   543 | &lt;br /&gt;| new york times      |   542 | &lt;br /&gt;| blogosphere         |   468 | &lt;br /&gt;| religion            |   460 | &lt;br /&gt;| brain               |   437 | &lt;br /&gt;| climate change      |   432 | &lt;br /&gt;| creationist         |   420 | &lt;br /&gt;| birds               |   415 | &lt;br /&gt;| creationism         |   409 | &lt;br /&gt;| creationists        |   398 | &lt;br /&gt;| pz                  |   378 | &lt;br /&gt;| darwin              |   367 | &lt;br /&gt;| discovery institute |   354 | &lt;br /&gt;| atheists            |   351 | &lt;br /&gt;| atheist             |   333 | &lt;br /&gt;| biology             |   314 | &lt;br /&gt;| richard dawkins     |   301 | &lt;br /&gt;| skeptics            |   290 | &lt;br /&gt;| love                |   289 | &lt;br /&gt;| genes               |   288 | &lt;br /&gt;| job                 |   286 | &lt;br /&gt;| money               |   283 | &lt;br /&gt;| orac                |   281 | &lt;br /&gt;| god                 |   276 | &lt;br /&gt;| atheism             |   266 | &lt;br /&gt;| animals             |   261 | &lt;br /&gt;| bush                |   258 | &lt;br /&gt;| google              |   258 | &lt;br /&gt;+---------------------+-------+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;In light of this data it's tempting to revisit that Bayblab post suggesting that &lt;a href='http://bayblab.blogspot.com/2008/02/state-of-science-blogging.html'&gt;Sciblings spend too much time discussing ID&lt;/a&gt;. That'd be a mistake, though: the numbers above are absolutes. 
963 posts had 'evolution' as a key term but that's only 2.4% of all posts that year (my 2c: I think that Sciblings &lt;i&gt;do&lt;/i&gt; talk about Egnor, ID and creationism too much, but hey, it's their blogs - I just skip over those posts).&lt;br /&gt;&lt;br /&gt;I also had a look at linking patterns - who do ScienceBloggers link to the most? Here 'count' is the number of unique posts that have a link to a particular domain.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;+-------------------------+-------+&lt;br /&gt;| domain                  | count |&lt;br /&gt;+-------------------------+-------+&lt;br /&gt;| www.scienceblogs.com    | 15966 | &lt;br /&gt;| en.wikipedia.org        |  2016 | &lt;br /&gt;| www.technorati.com      |  1797 | &lt;br /&gt;| www.nytimes.com         |  1388 | &lt;br /&gt;| www.amazon.com          |  1078 | &lt;br /&gt;| www.sciencedaily.com    |   661 | &lt;br /&gt;| www.washingtonpost.com  |   478 | &lt;br /&gt;| feeds.feedburner.com    |   467 | &lt;br /&gt;| www.nature.com          |   453 | &lt;br /&gt;| news.yahoo.com          |   401 | &lt;br /&gt;| news.bbc.co.uk          |   333 | &lt;br /&gt;| www.youtube.com         |   305 | &lt;br /&gt;| www.del.icio.us         |   297 | &lt;br /&gt;| www.cnn.com             |   260 | &lt;br /&gt;| www.eurekalert.org      |   260 | &lt;br /&gt;| farm3.static.flickr.com |   259 | &lt;br /&gt;| www.sciencemag.org      |   231 | &lt;br /&gt;| www.ncbi.nlm.nih.gov    |   225 | &lt;br /&gt;| www.pandasthumb.org     |   224 | &lt;br /&gt;| www.google.com          |   219 | &lt;br /&gt;| www.latimes.com         |   213 | &lt;br /&gt;| www.gnxp.com            |   208 | &lt;br /&gt;| sandwalk.blogspot.com   |   197 | &lt;br /&gt;| www.dailykos.com        |   196 | &lt;br /&gt;| www.donorschoose.org    |   194 | &lt;br /&gt;+-------------------------+-------+&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Presumably the technorati links are from tags. 
Sciencebloggers link to scienceblogs.com far more than anywhere else - but I'd guess that this is simply because there are a lot of good science blogs on one domain there.&lt;br /&gt;&lt;br /&gt;Wikipedia's reliability might be &lt;a href='http://news.bbc.co.uk/1/hi/technology/4530930.stm'&gt;in question&lt;/a&gt; but it's interesting that almost everybody uses it to define terms.&lt;br /&gt;&lt;br /&gt;Drilling down, where do ScienceBloggers link to papers?&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;+--------------------------------+-------+&lt;br /&gt;| domain                         | count |&lt;br /&gt;+--------------------------------+-------+&lt;br /&gt;| www.nature.com                 |   241 | &lt;br /&gt;| www.sciencemag.org             |   194 | &lt;br /&gt;| www.dx.doi.org                 |   177 | &lt;br /&gt;| www.ncbi.nlm.nih.gov           |   111 | &lt;br /&gt;| www.pnas.org                   |   104 | &lt;br /&gt;| www.plosone.org                |    89 | &lt;br /&gt;| biology.plosjournals.org       |    76 | &lt;br /&gt;| content.nejm.org               |    67 | &lt;br /&gt;| medicine.plosjournals.org      |    65 | &lt;br /&gt;| www.sciencedirect.com          |    43 | &lt;br /&gt;| www.arxiv.org                  |    33 | &lt;br /&gt;| genetics.plosjournals.org      |    22 | &lt;br /&gt;| www.jneurosci.org              |    15 | &lt;br /&gt;| www.cell.com                   |    14 | &lt;br /&gt;| compbiol.plosjournals.org      |    10 | &lt;br /&gt;| pediatrics.aappublications.org |    10 | &lt;br /&gt;| www.jcb.org                    |    10 | &lt;br /&gt;| mbe.oxfordjournals.org         |     9 | &lt;br /&gt;| www.ajp.psychiatryonline.org   |     8 | &lt;br /&gt;| www.current-biology.com        |     8 | &lt;br /&gt;| www.journals.uchicago.edu      |     8 | &lt;br /&gt;| www.plosntds.org               |     8 | &lt;br /&gt;| www.blackwell-synergy.com      |     7 | &lt;br /&gt;+--------------------------------+-------+&lt;br 
/&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;Nature and Science are at the top, perhaps unsurprisingly - but if you add up the counts from the different PLoS journals it'd be up there too.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-1407953762224469141?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/1407953762224469141/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=1407953762224469141' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1407953762224469141'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/1407953762224469141'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/03/dawkins-officially-bigger-than-jesus.html' title='Dawkins officially bigger than Jesus - datamining Scienceblogs.com'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-5699348932312761650</id><published>2008-03-19T00:39:00.003Z</published><updated>2008-03-19T11:10:34.635Z</updated><title type='text'>Science streaming</title><content type='html'>&lt;a href='http://www.bioinformaticszen.com/2008/03/passive-research-streaming-using-twitter-flickr-and-citeulike/'&gt;Michael Barton&lt;/a&gt; has a nice post up:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;I currently use Subversion to back up my project files, and I noticed 
Twitter status updates are very similar in length to subversion log messages. I created a short script so that every time I do a subversion repository check in, the message is also sent to Twitter.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;I'd like to see activity aggregators accept arbitrary updates - sort of like Facebook's Beacon updating people's News Feed, but done properly.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-5699348932312761650?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/5699348932312761650/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=5699348932312761650' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/5699348932312761650'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/5699348932312761650'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/03/science-streaming.html' title='Science streaming'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-8827439579897692707</id><published>2008-03-18T03:08:00.006Z</published><updated>2008-03-18T03:41:52.118Z</updated><title type='text'>Nature archive visualized - draft</title><content type='html'>I'm using up my annual carry-over vacation days by taking some time off work this week. 
Normal people probably use this valuable breathing space to bond with their loved ones, play badminton and learn exciting new hobbies. So far I've sat alone in my flat for thirty-six hours straight writing &lt;a href='http://en.wikipedia.org/wiki/Processing_(programming_language)'&gt;Processing&lt;/a&gt; sketches &lt;a href='#nb-natviz'&gt;*&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;So... &lt;a href='http://www.ghastlyfop.com/terms/movie_dropdown.mp4'&gt;here's a draft visualization&lt;/a&gt; (14 MB MP4, should play in your browser with QuickTime) of the key words and phrases found in Nature journal over almost four decades.&lt;br /&gt;&lt;br /&gt;The video starts with the phrases from 1970 and continues until 2007.&lt;br /&gt;&lt;br /&gt;Phrases appear on the right in the year that they were first seen, then travel leftwards, disappearing in the year they were last seen.&lt;br /&gt;&lt;br /&gt;The size of each phrase is related to how often it was seen relative to all the other phrases.&lt;br /&gt;&lt;br /&gt;The hue of each phrase is related to how many distinct journal issues it appeared in - green / yellow phrases are relatively transient while red / brown phrases are stable, appearing in many different contexts.&lt;br /&gt;&lt;br /&gt;The data is incomplete (it's a bit sparse after '88) and I took lots of shortcuts to see how things might look, so don't read too much into which phrases appear and when for now... a better version will follow - this is just a 'release early, release often' draft.&lt;br /&gt;&lt;br /&gt;Eventually I'd like to have a sort of &lt;a href='http://en.wikipedia.org/wiki/Pop-Up_Video'&gt;Pop-up Video&lt;/a&gt; timeline of science from the 50s till today, with major events (and relevant terms) flashing up on screen.&lt;br /&gt;&lt;br /&gt;If you're particularly impatient here's a version from Vimeo. The quality is rubbish, mainly because I munged the file with iMovie (which is crap) to add some rockin' beats. 
I still suggest you get the &lt;a href='http://www.ghastlyfop.com/terms/movie_dropdown.mp4'&gt;mp4 instead&lt;/a&gt;, though.&lt;br /&gt;&lt;br /&gt;Tomorrow I'm going to the park.&lt;br /&gt;&lt;br /&gt;&lt;object type="application/x-shockwave-flash" width="600" height="483" data="http://www.vimeo.com/moogaloop.swf?clip_id=796830&amp;amp;server=www.vimeo.com&amp;amp;fullscreen=1&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=01AAEA"&gt; &lt;param name="quality" value="best" /&gt; &lt;param name="allowfullscreen" value="true" /&gt; &lt;param name="scale" value="showAll" /&gt; &lt;param name="movie" value="http://www.vimeo.com/moogaloop.swf?clip_id=796830&amp;amp;server=www.vimeo.com&amp;amp;fullscreen=1&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=01AAEA" /&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='nb-natviz'&gt;*&lt;/a&gt; I watched &lt;a href='http://www.imdb.com/title/tt0804522/'&gt;Rendition&lt;/a&gt;, too, it was quite good.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-8827439579897692707?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/8827439579897692707/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=8827439579897692707' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/8827439579897692707'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/8827439579897692707'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/03/nature-archive-visualized.html' title='Nature archive visualized - 
draft'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-2549078339726866397</id><published>2008-03-11T10:44:00.005Z</published><updated>2008-03-11T10:49:24.227Z</updated><title type='text'>Seattle</title><content type='html'>I'm going to be in Seattle the first week of April for the &lt;a href='http://icwsm.org/2008/index.shtml'&gt;ICWSM&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;There are a whole bunch of awesome looking talks and one or two wild cards. Wild cards like:&lt;br /&gt;&lt;br /&gt;Spontaneous Inference of Personality Traits from Online Profiles&lt;br /&gt;Kristin Stecher, Scott Counts&lt;br /&gt;&lt;br /&gt;Which sounds interesting, anyway.&lt;br /&gt;&lt;br /&gt;Let me know if you're in the area and fancy &lt;a href='mailto:e.adie@nature.com'&gt;meeting up&lt;/a&gt; for lunch or a drink. 
I'm in town from the 29th of March to the 5th April.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-2549078339726866397?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/2549078339726866397/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=2549078339726866397' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2549078339726866397'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/2549078339726866397'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/03/seattle.html' title='Seattle'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14832160.post-6956589572679735448</id><published>2008-03-10T23:11:00.003Z</published><updated>2008-03-10T23:50:56.119Z</updated><title type='text'>New JoVE blog &amp; commenting on papers</title><content type='html'>Anna Kushnir's &lt;a href='http://jove-blog.blogspot.com/'&gt;new blog for JoVE&lt;/a&gt; is up and running (actually it has been up and running for a while, I'm a bit behind with blogging. Those January Open Science posts are coming at some point, too). 
It's a nice mix of content.&lt;br /&gt;&lt;br /&gt;Of particular interest are a &lt;a href='http://jove-blog.blogspot.com/2008/02/science-participation.html'&gt;couple&lt;/a&gt; of &lt;a href='http://jove-blog.blogspot.com/2008/03/going-incognito.html'&gt;interesting entries&lt;/a&gt; talking about the online participation - or lack thereof - of scientists. See also Noah Gray's &lt;a href='http://blogs.nature.com/nn/actionpotential/2008/03/ng_neuroscience_and_web.html'&gt;take on neuroscientists and web 2.0&lt;/a&gt; and David Crotty's &lt;a href='http://www.cshblogs.org/cshprotocols/2008/02/14/why-web-20-is-failing-in-biology/'&gt;'why web 2.0 is failing in biology'&lt;/a&gt; post.&lt;br /&gt;&lt;br /&gt;Did you skip over all those links? You shouldn't, really. At least read &lt;a href='http://www.cshblogs.org/cshprotocols/2008/02/14/why-web-20-is-failing-in-biology/'&gt;David Crotty's&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;So, yeah, anyway, why scientists don't comment on papers - my take is that being too busy and being afraid of the consequences don't come into it. &lt;br /&gt;&lt;br /&gt;Sure, they're valid concerns - but &lt;i&gt;everybody&lt;/i&gt; is busy at work and everybody realizes that what you say on the internet is recorded forever by Googlebot. People still write ranty forum posts and blog comments.&lt;br /&gt;&lt;br /&gt;IMHO the main reasons scientists don't leave comments are:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;There's no point&lt;/b&gt; - who's going to read it? Will you get any feedback? Will you get any credit for it?&lt;br /&gt;&lt;br /&gt;and&lt;br /&gt;&lt;br /&gt;&lt;b&gt;It's too much work&lt;/b&gt; - writing a comment should be a one click operation. 
Well, two clicks, one to get the focus in the textbox and the other to press 'submit'.&lt;br /&gt;&lt;br /&gt;Science publishers can address both of these issues, but we've been failing to do so.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14832160-6956589572679735448?l=www.ghastlyfop.com%2Fblog' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/6956589572679735448/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=14832160&amp;postID=6956589572679735448' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6956589572679735448'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14832160/posts/default/6956589572679735448'/><link rel='alternate' type='text/html' href='http://www.ghastlyfop.com/blog/2008/03/new-jove-blog-commenting-on-papers.html' title='New JoVE blog &amp; commenting on papers'/><author><name>Stew</name><uri>http://www.blogger.com/profile/01323861927990299545</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='00231275041461154105'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>7</thr:total></entry></feed>