Flags and Lollipops

Monday, October 08, 2007

Connotea & Postgenomic complaints

I was catching up on my Bloglines subscriptions and came across Tiago's 'Pissed with Connotea' post, which lists some complaints about NPG products. I was going to reply in a comment but figured that Tiago's points are worth addressing here, too.


1. [Connotea] Performance and downtime: Sometimes to submit a citation takes ages, or the service is down (although, regarding downtime, it seems to be getting better).


Yeah. This is something that has plagued Connotea recently though it is getting better as you say. Unfortunately the solutions aren't as easy as you might think. Ian Mulvany - who runs the site - was saying earlier that they're looking at adding a new server this week, which should help.


2. Postgenomic: When I tried to use it, it was mostly down. Want anedoctical evidence? If you search now (as of the posting date of this entry) for postgenomic on google, the cached page says: “Unable to select database”, nothing more.
3. Postgenomic again: A few months I submitted my blog. I got no answer at all.


These are also fair enough but I should point out that Postgenomic isn't an NPG product - it's run by me. NPG do let me keep developing it during work hours.

What I'm saying is - don't blame NPG, blame me. ;)

Downtime hasn't been a problem for Postgenomic recently, I don't think (I check it every day and it has been fine), but there *was* a long-standing issue to do with firewalls and TCP/IP expiry timestamps that I won't bore you with. Suffice to say that it meant pages wouldn't load correctly on some platforms, and that it's fixed now.

I know about the Google thing. It's a bit embarrassing but I haven't gotten round to doing anything about it yet.

re: not replying to your email... I'm normally quite good at getting back to people quickly (no, really). I found yours again and my excuse is that I was on a plane to SFO when I read it - must have put it on the 'to do' pile and forgotten about it. Sorry. :(

As a side note you're listed now.

HTH.


Monday, July 09, 2007

EasyPG

Pierre Far from BlogSci has written an excellent WordPress plugin called EasyPG that helps you mark up blog posts for Postgenomic (and Chemical Blogspace).

Now we just need to find somebody who can write Movable Type plugins...


Monday, June 18, 2007

Publishers, trackbacks and shared data

The elevator pitch version of this post: if you're a science publisher interested in the web then let's talk about collaborating on a shared system that will stimulate online discussion, kickstart commenting and recognize the sometimes valuable contributions already being made every day by science blogs.


I'm a strong believer in allowing commenting on online papers. This is something under serious discussion at Nature (the question is how to do it properly). The vast majority of researchers read, organize and discover papers online; we should give them the tools and opportunity to discuss papers online, too.

It's easy to be dispirited by the lack of comments at the journals that adopted commenting early - though what would an appropriate number of comments on a paper be? Is one comment pointing out a critical error worth more than a hundred saying 'nice paper'?

In the relatively near future two things will happen to help push commenting forward:

  • We'll (scientists in general) develop systems that track and credit scientific contributions - including relatively minor ones like wiki edits and comments - that aren't in manuscript form.

  • We'll make it easy enough to leave comments, and for content stakeholders to be alerted to them so that they can reply, that a positive feedback loop kicks in - more authors responding means commenting is seen to be more useful, so more comments are left... etc.


Until then, though, there is a way of supplementing comments submitted directly to journals: science blogs.

I think it's fairly safe to say that the number of blog posts discussing papers is much, much larger than the number of online comments left on papers from all STM publishers combined. Prove me wrong and I'll take you out for cocktails.

Some specific examples of papers discussed in blog posts:

This recent paper in Cell has no comments but three blog posts written about it. This paper in PLoS One has two blog citations but only one comment (which is a link to one of the blog posts - this has been discussed previously on the PLoS One blog).

So how can publishers use blog content to supplement commenting systems? I think Postgenomic is the answer, or at least a good starting point.

Postgenomic is a science blog aggregation site with an open source codebase. The data it collects is accessible via a REST-based API.

Postgenomic follows several hundred science blogs and tracks the papers that they link to. Publishers can easily - and should, IMHO - access this data and display blog trackbacks next to the papers that they publish online.

Technorati or a homegrown system could possibly be used to do the same thing. Here's why STM publishers should use Postgenomic instead:

  • Postgenomic was written specifically to deal with scientific literature. It handles tricky things like disambiguation: a single paper X might be linked to at different URIs by different blogs (imagine that one blogger links to the abstract on PubMed, another to the PDF and a third to the fulltext view). It understands DOIs and PMIDs. We have a lot of experience with this sort of thing at Nature - see Connotea.

  • As the list of aggregated blogs is strictly controlled there's no need for publishers to manually curate each and every trackback on their papers.

  • Postgenomic has been running for more than a year and is recognized by the community - at least to the extent that new blogs are submitted regularly. If somebody starts a new blog and wants to be included on paper trackback whitelists, or a blog changes address or an archive is deleted then it makes sense for there to be one, central place for this to be dealt with. The science blog community is relatively small already, why fragment it further?
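To make the disambiguation point a bit more concrete, here's a minimal sketch of the general idea. It is not Postgenomic's actual code: the URL patterns, the `canonicalPaperId` and `groupByPaper` names and the `{ blog, url }` link shape are all assumptions made up for illustration. The idea is simply to reduce every link to a DOI or PMID where one can be spotted in the URL, then group links by that identifier.

```javascript
// Illustrative patterns only (assumptions, not Postgenomic's real rules).
const DOI_PATTERN = /\b(10\.\d{4,9}\/[^\s?#&]+)/;
const PMID_PATTERNS = [
  /\/pubmed\/(\d+)/,        // e.g. .../pubmed/<pmid>
  /[?&]list_uids=(\d+)/,    // e.g. older query-string style links
];

// Reduce a URL to "doi:..." or "pmid:...", or null if nothing is recognised.
function canonicalPaperId(url) {
  let decoded;
  try {
    decoded = decodeURIComponent(url);
  } catch (e) {
    decoded = url; // malformed escapes: fall back to the raw string
  }
  const doi = decoded.match(DOI_PATTERN);
  if (doi) return 'doi:' + doi[1].toLowerCase();
  for (const pattern of PMID_PATTERNS) {
    const pmid = decoded.match(pattern);
    if (pmid) return 'pmid:' + pmid[1];
  }
  return null;
}

// Group blog links by the paper they point at, skipping unrecognised URLs.
function groupByPaper(links) {
  const groups = new Map();
  for (const { blog, url } of links) {
    const id = canonicalPaperId(url);
    if (id === null) continue;
    if (!groups.has(id)) groups.set(id, []);
    groups.get(id).push(blog);
  }
  return groups;
}
```

A real system needs more than this: a PDF link may carry a suffix after the DOI, and matching a PMID to the DOI of the same paper needs a lookup that the sketch leaves out.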


My suggestion is that wherever you'd allow comments on papers you also collect trackbacks, displaying the title and excerpt of blog posts citing the paper in question.
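As a rough sketch of what that display step might look like on a publisher's side: the trackback shape (`url`, `title`, `excerpt`), the whitelist of blog hostnames and the function names below are all hypothetical, invented for this example rather than taken from the Postgenomic API. It keeps only trackbacks from whitelisted blogs and renders the title and excerpt of each as an HTML list.

```javascript
// Escape text before dropping it into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hostname without a leading "www.", or null if the URL can't be parsed.
function hostOf(url) {
  try {
    return new URL(url).hostname.replace(/^www\./, '');
  } catch (e) {
    return null;
  }
}

// trackbacks: array of { url, title, excerpt } (assumed shape).
// whitelist: Set of blog hostnames that are allowed to appear.
function renderTrackbacks(trackbacks, whitelist) {
  const items = trackbacks
    .filter((t) => whitelist.has(hostOf(t.url)))
    .map((t) =>
      '<li><a href="' + escapeHtml(t.url) + '">' + escapeHtml(t.title) +
      '</a><p>' + escapeHtml(t.excerpt) + '</p></li>');
  return items.length ? '<ul class="trackbacks">' + items.join('') + '</ul>' : '';
}
```

Because the whitelist does the filtering, nothing here needs a human to approve individual trackbacks; anything from a blog that isn't on the list is simply dropped.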

Blog trackbacks on papers are a winning proposition for everybody involved. Bloggers get recognition and increased exposure, readers get more relevant content, publishers get papers worth coming back to after you've downloaded the PDF, authors see more discussion surrounding their research.

If you're interested in talking about this further then please get in touch.


Thursday, May 17, 2007

Pg10k

OK, it's cheating because the figure includes books as well as papers, but Postgenomic has now tracked more than ten thousand citations in blog posts. As the majority of blogs either (a) don't supply fulltext RSS feeds, just excerpts, or (b) strip out HTML, and thus the links, from their feeds, there must be a sizeable dark figure, too - how many citations are being missed by Postgenomic and Chemical Blogspace, I wonder?

Anyway, paper #10,000 was Mauro Costa-Mattioli's paper in Cell about stress-induced translation regulation (conveniently the citing post from Gene Expression explains what that is and then goes into some interesting detail - an excellent advert for science blogging).

I'm pleased. Scientists write blogs and put science in them. They talk about recent papers. Their numbers are growing. Might blog trackbacks be a good or even necessary supplement to comments on a paper on a journal website?

It'd be interesting to take, say, BioMed Central papers from the past twelve months and compare the number of comments on each to the number of citations from posts. I think that BMC does comments quite well, possibly better than any other STM publisher - PLoS included - not that that's necessarily saying much (also there's still no comment RSS feed, boo). Using comment data from PLoS One would be another option (I was speaking about this with a colleague earlier today) but considering how new PLoS One is perhaps there isn't enough data in Postgenomic yet for any results to be meaningful.

Actually, it'd also be interesting to compare the number of blog citations to the number of 'real' citations received by each paper in the index. Is blog buzz a good indicator of impact?
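One way that comparison could be run, sketched under assumptions: given two arrays of counts indexed by the same papers (blog citations and 'real' citations), a Spearman rank correlation asks whether papers that rank highly on one also rank highly on the other. Spearman is my choice here because citation counts tend to be heavily skewed, and the function names are made up for the example; nothing below is an existing Postgenomic feature.

```javascript
// Rank values from 1..n, giving tied values the average of their ranks.
function ranks(values) {
  const order = values.map((v, i) => [v, i]).sort((a, b) => a[0] - b[0]);
  const result = new Array(values.length);
  let i = 0;
  while (i < order.length) {
    let j = i;
    while (j + 1 < order.length && order[j + 1][0] === order[i][0]) j++;
    const averageRank = (i + j) / 2 + 1;
    for (let k = i; k <= j; k++) result[order[k][1]] = averageRank;
    i = j + 1;
  }
  return result;
}

// Plain Pearson correlation; returns NaN if either input is constant.
function pearson(x, y) {
  const n = x.length;
  const meanX = x.reduce((a, b) => a + b, 0) / n;
  const meanY = y.reduce((a, b) => a + b, 0) / n;
  let sxy = 0, sxx = 0, syy = 0;
  for (let i = 0; i < n; i++) {
    sxy += (x[i] - meanX) * (y[i] - meanY);
    sxx += (x[i] - meanX) ** 2;
    syy += (y[i] - meanY) ** 2;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Spearman correlation = Pearson correlation of the ranks.
function spearman(blogCitations, realCitations) {
  return pearson(ranks(blogCitations), ranks(realCitations));
}
```

A value near 1 would suggest that blog buzz tracks citations, a value near 0 that it doesn't; either way it says nothing about which causes which.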

A brief stats update: the site has been running for about fourteen months. The most popular book has been The God Delusion, with relevant posts from 15 different blogs. The most popular 'proper' paper (anything with a DOI in PubMed gets tracked, which includes some opinion pieces) was Ben Voight's 'A Map of Recent Positive Selection in the Human Genome', from PLoS Biology.

There are 735 blogs in the index, of which 341 were active in the past week. Usually ~2,500 posts are aggregated each week (a major exception being the last two weeks of December, when this number falls to 1,400). There are ~120,000 blog posts in the database.

I've been busy with other projects at NPG recently but plan on spending some more time on Postgenomic over the next few months. If you've got any ideas (or you'd like to help out with coding, documentation, design - it's an open source project, born from discussions in the comment threads of bioinformatics blogs) then please let me know. If you're interested in using data from Postgenomic in some way then that's cool too, I'm keen to help.

I was going to reiterate my thanks to people who have contributed so far but the list is too long and I'd forget people. You know who you all are - ta muchly. Science bloggers rock.


Friday, April 27, 2007

Blog trackback bookmarklet

Update: now returns HTML instead of Atom :)

Bookmarklet to retrieve science blog trackbacks for the current page (a blog post permapage, for example?), courtesy of Alf:


javascript:location.href='http://www.postgenomic.com/page_trackback.php?url='+encodeURIComponent(location.href)

And the link (just drag it up to your bookmarks bar): PgTrack

Want to try it out? Bora's seminal 'science on science blogs' post has a lot of incoming links.

Note that you can check for trackbacks on any page - BBC news stories, papers, whatever - it's just that they'll all be from blogs indexed in Postgenomic, and they'll only cover links to that exact URL.
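If you'd rather build the lookup address in a script than click a bookmarklet, the same thing can be written as a small function. The endpoint is the one from the bookmarklet above; the `trackbackUrl` name and the example addresses are just illustrations.

```javascript
// Endpoint taken from the bookmarklet above.
const ENDPOINT = 'http://www.postgenomic.com/page_trackback.php';

// Build the trackback lookup address for any page URL.
function trackbackUrl(pageUrl) {
  return ENDPOINT + '?url=' + encodeURIComponent(pageUrl);
}

// Two addresses for what a reader would call "the same page" still produce
// two different lookups, since the URL is passed through unchanged.
const plain = trackbackUrl('http://example.org/post');
const tagged = trackbackUrl('http://example.org/post?id=1');
```

Here `tagged` ends in `url=http%3A%2F%2Fexample.org%2Fpost%3Fid%3D1`, which is a different lookup from `plain`, so a query string or tracking parameter on the address is enough to change what comes back.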


Monday, February 12, 2007

PLoS One / Postgenomic mashup

Chris Surridge has an interesting post over at the PLoS blog about the comments (or the lack thereof) on PLoS One papers. He mentions one paper in particular that has a long discussion thread associated with it on Gene Expression but no real comments on the actual PLoS One site.

As a temporary solution (?) to the problem of blog comments not being immediately accessible from the paper, summaries of notable manuscripts are going to be posted to the PLoS publishing blog with open comment threads. Based on the three posts already up I think this is a terrible idea.

Partly this is personal preference - I hate blogs that just replicate tables of contents - but more importantly I think that it misses the point.

People like the GNXP folks have taken the time and trouble to build up a loyal community that fosters debate and to create an environment in which visitors enjoy interacting with the site and with each other. Sticking up an abstract or two on your own blog just isn't going to compete with that, no matter how much traffic you get.

Blog properly - engage your audience - or don't blog at all. It's a personal communication medium, that's one of the reasons why people feel more comfortable commenting in a blogging environment. A link and an abstract on a publisher's blog isn't personal, it's an advert. The PLoS One blogs are generally a good read at the moment, don't ruin them.

I'm not just PLoS bashing here: I like the ideas behind PLoS One and we do the same 'if we blog the abstract then people will comment!' thing at Nature on some blogs (the ones I don't read any more). The intention is good, it's just misguided, IMHO.

Anyway, I think that a better solution would be to embrace the existing science blogosphere and to explore ways of working with it more closely. As a proof of concept, here's a Greasemonkey script that adds science blog trackbacks to PLoS One.

It doesn't look particularly nice, mainly because I didn't have time to style things very well. Feel free to do with it as you will, though (you could get it working with PLoS Two, for a start).


