Analyzing MySpace profiles
This morning James Caverlee presented his study of almost two million MySpace profiles (well, two sets of roughly one million each: one picked at random and one gathered by traversing the social graph). It was interesting stuff. Some bits and pieces below.
MySpace users live up to gender stereotypes, rather disappointingly:
Words most frequently appearing in MySpace profiles
| Women | Men |
| --- | --- |
| love, people, dancing, life, shopping, can, girl, family, hearts, being, have, notebook, are, dance, favourite, things | dating, sport, networking, metal, serious, football, relationship, sh*t, single, wars, straight, band, video, f*ck, guitar, gay |
And geographic ones (didn't manage to write all of these down in time):
| Users in Oregon | Users in Alabama |
| --- | --- |
| camping, hiking, pixies, snowboarding, wine, vegans | football, jesus, gospel, nascar |
Demographics-wise, about 50% of the profiles picked at random had one or no friends (i.e. weren't active). Age-wise, the peak is at 24, with smaller peaks at 69 and 100. The 69 peak is a secret MySpace code, apparently: it means that you're interested in, uh, one-handed typing (this wasn't made clear, but I'm guessing). By listing a common age, 69, you can use MySpace's advanced search to find others looking for the same thing. 69-year-olds on MySpace are most similar (in their use of language) to people in their mid-thirties.
Younger users are overwhelmingly female. There is a 2:1 ratio of girls to boys at age 14. This difference decreases as age increases. The flip-over point is at 20; after that you start seeing more men than women.
About 20% of the profiles in the connected dataset were marked as 'private'. Over time this percentage is rising. Having privacy preferences set is negatively correlated with age.
He had a fantastic slide showing top terms by age... will post it and a link to the slideshow when it's online.
