Friday, December 11, 2009

Fingerprints on everything

There’s a fascinating piece on the BBC News website about linguistic research being done at Umeå University in Sweden on how an author’s word choices create an identifiable writing style fingerprint. This research is built upon earlier work by linguist George Zipf, the guy who formulated Zipf’s Law, which describes a mathematical relationship between the frequency of a word in a text and that word’s rank in the author’s list of most commonly used words. The new research found that authors have what they’re calling “unique word curves” based on words that appear just once in a text, and here’s the freaky part – if you pull out 1,000 words from a 10,000-word work, 10,000 words from a 100,000-word work or 100,000 words from a 200,000-word work, you get the same word curve.
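
For the curious, here is a minimal Python sketch of the two ideas above: ranking words by how often they appear (Zipf's Law says a word's frequency is roughly proportional to 1 divided by its rank) and measuring what share of the distinct words appear just once. This is not the Umeå researchers' method, only a rough illustration; the function names and the sample sentence are made up for the example, and the simple word-splitting pattern only handles plain lowercase letters and straight apostrophes.

```python
import re
from collections import Counter


def word_counts(text):
    """Count how often each word appears, ignoring case and punctuation."""
    # Assumption: a "word" is a run of letters and straight apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def rank_frequency(counts):
    """Return (rank, word, frequency) tuples, most frequent word first."""
    ranked = counts.most_common()
    return [(rank, word, freq) for rank, (word, freq) in enumerate(ranked, start=1)]


def once_only_share(counts):
    """Fraction of distinct words that appear exactly once in the text."""
    if not counts:
        return 0.0
    once = sum(1 for freq in counts.values() if freq == 1)
    return once / len(counts)


# Hypothetical sample text, far too short to show a real Zipf curve.
sample = "the cat sat on the mat and the dog sat on the log"
counts = word_counts(sample)

for rank, word, freq in rank_frequency(counts):
    print(rank, word, freq)

print("share of words used once:", once_only_share(counts))
```

To try it on a whole book, read the file into a string and pass it to `word_counts`; comparing `once_only_share` across samples of different sizes from the same author would be one crude way to poke at the "same curve" claim.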

Now I’ve never been any good at math and the jury is still out on whether I’m any good with words, but what this says to me is that even with an (estimated) half-million English words to choose from, we predictably come back to the same ones.

So far the researchers have analyzed the complete works of Hardy (Thomas, not Oliver), Melville and Lawrence, but I would like to offer my books up for study as well. While I haven’t a clue which words I use uniquely in a given text, I can tell you words I probably use way too much. The ones that come to mind (and their conjugations and/or plurals) are probably common for all authors: said, look, stand, sat, drank, hair, eye, person, guy, paper, grab and pestiferous.

That’s my fingerprint, I think, and I’m sticking to it.

Or is it sticking to me?

5 comments:

Vicki Delany said...

The eternal pestiferous. This is a good reason why one always needs to have someone critique, or at least look over, a writer's work. Because their common words are used so much, the author doesn't notice them. They are also called crutch words. My crutch word is little. I also overuse also, just, actually (a big no-no), really, pretty (as in pretty bad).

Charles Benoit said...

Just! Ah yes, one of my favs, too. I wonder why I didn't notice that...

Donis Casey said...

I'm a 'just-er' as well. I'm also bad about 'however'.

Rick Blechta said...

Words I use too much are what an editor-friend calls "waffle words". I tend not to make a definite statement, but then throw in a waffle word or two and make it a "sort of a definite" statement. I have no idea why I do this, and she always rakes me over the coals when she catches me doing it, but it still happens. At least now I'm sort of aware of it...

I also think more people should use forms of the word "cimbasso" in their various forms of communication, as in, "His extreme weight made Franklin look more than a little cimbasso-like."

Charles Benoit said...

Yeah well who doesn't overuse cimbasso?