Flags and Lollipops

Thursday, April 03, 2008

Do you use language differently when you're depressed?

Can you tell if somebody is clinically depressed by analyzing their use of language? I'm not a psychologist, so take the background info below with a pinch of salt, but the topic came up at ICWSM (more on how later) and I thought it was fascinating.

In 2001 Stirman et al. compared the collected works of nine poets who eventually committed suicide and nine poets who didn't (as a control set). Their theory was that the depressed (and eventually suicidal) poets would use more first-person singular pronouns (I, me, my) and more words related to hopelessness and desperation (hate, worthless, death, grave), and the data supported it.

Rude et al. later found something similar when they compared essays (on a common topic, "coming to college") written by college students. Depressed students used "I" and negative words significantly more often than controls.

Interestingly, Oxman et al. found that spoken language patterns can be a good discriminator for classifying patients as depressed or not, so it's not just written language use that may differ.

Anyway, at ICWSM Nairán Ramírez-Esparza from the University of Texas presented a language analysis of some depression discussion boards on About.com. She ran a two-part study: the first part set out to confirm Stirman's and Rude's findings, and the second used the fact that the About.com boards are bilingual (there's a Spanish section too) to see how different cultures talk about depression.

Her approach was pretty simple: she collected about 400 posts from the depression forum and 400 posts from a breast cancer forum as a control, broke each post down into single words, and then used off-the-shelf software to classify each word (as verb, adjective, pronoun, positive emotion, negative emotion, etc.). She did this for both the English and Spanish sections of the site.
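To make that concrete, here's a rough Python sketch of the same kind of pipeline: split posts into words, tag each word with a category, and compare category rates between two sets of posts. This is not the software she used (the talk didn't go into that level of detail here); the lexicon is a tiny hand-made stand-in built from the example words mentioned above, the posts are invented, and the names `CATEGORIES`, `tokenize` and `category_rates` are my own.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only. A real word-classification
# tool would ship far larger, validated word lists.
CATEGORIES = {
    "first_person": {"i", "me", "my"},
    "negative_emotion": {"hate", "worthless", "death", "grave"},
}


def tokenize(post):
    """Lowercase a post and break it down into single words."""
    return re.findall(r"[a-z']+", post.lower())


def category_rates(posts):
    """Return each category's share of all words across a list of posts."""
    counts = Counter()
    total = 0
    for post in posts:
        for word in tokenize(post):
            total += 1
            for name, words in CATEGORIES.items():
                if word in words:
                    counts[name] += 1
    return {name: (counts[name] / total if total else 0.0) for name in CATEGORIES}


# Made-up posts, not data from the study.
depression_posts = ["I hate my life", "Nobody listens to me"]
control_posts = ["The doctor changed my treatment plan today"]

dep = category_rates(depression_posts)
ctl = category_rates(control_posts)
print(dep)
print(ctl)
```

Dividing by the total word count matters: without it, a forum with longer posts would look more "depressed" simply because it contains more words.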

Her results seemed to confirm the earlier studies: first-person pronouns were found three times as frequently in the depression forum posts as in the controls, and words relating to negative emotions occurred four times as frequently. This was true for both English and Spanish datasets.

The second part of her study was to see if English and Spanish speakers approach depression differently: what do they talk about? She studied this by computing normalized word frequency counts and then grouping related words into themes.
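I don't know exactly how she did the grouping, so treat the following as one possible reading rather than her method: a Python sketch that normalizes word counts by corpus size and then sums them over fixed, hand-picked theme word lists. The `THEMES` dictionary is hypothetical (seeded with the example words from the theme lists in this post), the function names are mine, and the posts are invented.

```python
import re
from collections import Counter

# Hypothetical theme word lists, for illustration only.
THEMES = {
    "treatment": {"medicine", "doctor", "therapist"},
    "disclosure": {"tell", "discuss", "talk"},
    "family": {"mom", "dad", "brother", "sister"},
}


def normalized_frequencies(posts):
    """Map each word to its count divided by the total words in the corpus."""
    words = [w for post in posts for w in re.findall(r"[a-z']+", post.lower())]
    total = len(words)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(words).items()}


def theme_scores(posts):
    """Sum the normalized frequencies of the words belonging to each theme."""
    freqs = normalized_frequencies(posts)
    return {
        theme: sum(freqs.get(word, 0.0) for word in vocab)
        for theme, vocab in THEMES.items()
    }


def rank_themes(posts):
    """Return theme names ordered from most to least discussed."""
    scores = theme_scores(posts)
    return sorted(scores, key=scores.get, reverse=True)


# Made-up posts, not data from the study.
posts = ["My mom and dad want me to talk to a doctor", "I can't tell my sister"]
print(theme_scores(posts))
print(rank_themes(posts))
```

Because the frequencies are normalized, the scores from an English corpus and a Spanish corpus of different sizes can be ranked side by side, although the Spanish side would of course need its own word lists.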

The top five themes discussed in the English dataset:


Treatment (medicine, doctor, therapist...)
Disclosure (tell, discuss, talk...)
Family (mom, dad, brother, sister...)
Symptoms ...
School


And the top five themes from the Spanish dataset:


Family
Relationship history
Hopelessness
School
Treatment


I'm a bit suspicious of results that are so intuitively appealing (family and romance are more important to Spanish speakers?). One thing that I did wonder was how much the results are skewed by different community expectations: if you visit a discussion forum where people are sharing stories about their depression and everybody else mentions their family, maybe you feel compelled to mention your family too. Maybe the English language forums are dominated by a younger age group and so older visitors shy away, or vice versa.

Anyway, it was interesting stuff. Somebody in the audience wondered aloud if this means that you could build a system to identify people at risk of depression (or, perhaps more to the point, suicide) by analyzing their language online. Maybe this could be built into the next version of the anti-plagiarism software used in high schools and colleges (I'm not advocating that, just saying)...
