Sunday, February 25, 2007

Lubin Odana vs Lord Boyzici

I sometimes wonder what makes Lubin Lubin. Are there certain uses of language in my writing which suggest preoccupations or obsessions that I'm not aware of? One way of working this out is to use a computer analysis tool which looks at your language use, comparing two "texts" and coming up with lists of words or phrases which occur statistically more often in one text than in the other. There's a whole body of academic literature surrounding this, and it's a technique I use quite a lot at work. It has its limitations, but it's also quite a bit of fun.
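For the curious, here is a rough sketch of the kind of comparison I mean. It is not the tool I actually used. It assumes a log-likelihood score, which is one common way of measuring how much more often a word turns up in one text than in another, and the tokenizer and the function names (tokenize, log_likelihood, keywords) are my own made-up illustrations.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def log_likelihood(a, b, total_a, total_b):
    """Log-likelihood score for a word seen `a` times in text A and `b` times in text B.

    Assumption: this is a standard corpus-comparison statistic, chosen here
    for illustration; other tools may use chi-squared or something else.
    """
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_a)
    if b > 0:
        score += b * math.log(b / expected_b)
    return 2 * score


def keywords(text_a, text_b, top=10):
    """Return the words that are most over-used in text A relative to text B."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        return []  # nothing to compare
    scored = []
    for word, a in counts_a.items():
        b = counts_b.get(word, 0)
        # Keep only words that are relatively more frequent in A than in B.
        if a / total_a > b / total_b:
            scored.append((log_likelihood(a, b, total_a, total_b), word))
    scored.sort(reverse=True)
    return [word for _, word in scored[:top]]
```

Calling keywords(my_text, toms_text) gives "my" words, and swapping the arguments gives Tom's. With samples this small, a sensible extra step would be to ignore words that occur only once or twice, since a high score on a tiny count doesn't mean much.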

So, out of interest I made a text file of all my blog entries for Jan and Feb 2007, and compared that against another text file of another blogger, Lord Boyzici (I know him in real life so hopefully he's not offended that I chose him for my comparison.) I deleted all the comments, any passages where we'd quoted other people, or stuff on menu sidebars. Obviously these are small recent samples, and had I used someone else, the results would have been different - but not as different as you'd expect them to be. Having done a lot of these comparisons, they do tend to be quite good at revealing people's linguistic "tendencies", no matter what you compare them against.

Here's what I found:

Words that I use more than Tom:

Pronouns: her, she, you, they, their, my

Adverbs: particularly, eventually, sometimes, anyway, however, probably, really

"people words": students, people

"Gay" words: mother, gay, male, diva

"Comedy" words: joke, hilarious, parody

"Negative" words: mental, unhappy, crisis, depression, cynical, crime, poor, hate

"The media and celebrity": cinema, book, website, books, magazines, fans, cast, characters, famous

"The body": attractive, weight

I also tend to use the following phrases a lot: "one of those", "in the world", "in the film", "some of the" and "you are so".


Tom's words:

Pronouns: I

"game/leisure" words: game, quiz, poker, players, league, won, questions, scores, luck, played, chips, round, pub, top

"time" words: weekend, friday, evening, last

Adverbs: rather

Random grammatical words: this, which, will, am, was

Tom uses the phrases "I ended up", "If you can" and "If you have" a lot (compared to me).

So what does it all mean? Some of it confirms what I'd already thought - Tom writes about his poker games a lot while I write about films and books. Tom likes to write about his free time. I don't seem to have that focus (maybe I just have more free time so it's not a big deal!) I like the Tomism "I ended up", which I initially thought implied that Tom tells stories about things that recently happened to him but didn't go according to plan - a kind of fun spontaneity which I, in my tightly controlled world, do not have. However, when I looked at all of Tom's cases of "I ended up", most occur at the end of stories about the outcome of his poker games (though two relate to drinking stories).

Along with the focus on poker, Tom also does a weekly quiz, so that suggests a focus on competition at his blog, which I don't have. On the other hand, I write about being gay quite a bit and about various women (usually female celebrities that I'm interested in - oh dear - just call me a big gay stereotype!) My use of the words "attractive" and "weight" also suggests a slight superficiality, which I'm more than aware of, thank you.

But I didn't realise that I tended to have so much "negativity" in my blog. Though maybe that's just in relation to Tom, who describes himself as "usually happy" (I can confirm that he is). I had also suspected that I tend to over-use adverbs that don't really have much of a proper function, other than suggesting a slightly (there I go again) pompous, long-winded or hedged style. And now I have proof that I do. Oh well.

2 comments:

Reluctant Nomad said...

This looks like fun. I think I want to do it too.

Tom SF said...

There should be dozens of comments here!

Damnit - I want to be the centre of attention!