‘Cuckquean’, abbreviations, and vocabulary change

June 22, 2017

Catching up on my column for Macmillan Dictionary Blog, I have three recent posts to share.

‘Golly, matey – vocabulary change is massively awesome’ looks at how the words we use reflect our shifting habits and preoccupations:

To look more broadly at these ripples in the collective lexicon, we can turn to big data in the form of language corpora. One of these, the Spoken British National Corpus, allows many kinds of linguistic research, such as studying how English vocabulary and regional dialects are shifting. The project was in the news recently with a story about ‘words we no longer use’. The headline exaggerates, but there are indeed words we use much less – or much more – than we did twenty years ago. The corpus data can illustrate how our lives have changed over the years.

‘TL;DR: Abbreviations FTW’ is an overview of the different types of abbreviations and the different ways we style and use them:

Efficiency is intrinsic to communication, and can drive language change. Set phrases that are used repeatedly are commonly abbreviated, as they save people time and effort. In digital communication, abbreviations may also serve as tribal markers – tfw users are in the know about internet lingo. Ikr. Sometimes, as in the case of lol, abbreviations may even undergo grammatical transformation.

‘Cucks, cuckolds, cuckqueans and cuckoos’ briefly explores the origins and applications of this nest of interconnected words:

Quean is a notable word in its own right. It comes from Old English cwene, meaning ‘woman’, from Proto-Indo-European *gwen-, which is also the root of queen, misogyny, and gynaecology. In English, cwene was originally a neutral word; but like many terms of female reference, it gradually took on negative senses and connotations, coming to mean ‘impudent woman’, ‘hussy’, and ‘prostitute’. In Scots it has retained its original neutral sense.

Each post is bite-sized, readable in 2–3 minutes. For more, you can browse the full archive.

ETAOIN SRHLDCU, or: What are the most common words and letters in English?

January 7, 2013

Most of us know that ‘e’ is the most common letter in English and ‘the’ is the most common word. Many are familiar with ETAOIN SHRDLU, the nonsense string that used to appear in print because of early-20th-century printer design and now serves as shorthand for the most common letters.

Beyond prevailing lore and trivia, we’re generally less certain about the English language’s most common words and letters. Different studies over the years have produced varying results, depending on the datasets and methods used.

Now Google’s director of research Peter Norvig has used the vast data from the Google Books corpus – over 743 billion words – to produce updated word- and letter-frequency tables. Here’s his letter count:

[Image: Peter Norvig’s letter-frequency table for English]

As you can see, it violates ETAOIN SHRDLU only slightly, becoming ETAOIN SRHLDCU.

The 50 most common words, in order of frequency, are: the, of, and, to, in, a, is, that, for, it, as, was, with, be, by, on, not, he, I, this, are, or, his, from, at, which, but, have, an, had, they, you, were, their, one, all, we, can, her, has, there, been, if, more, when, will, would, who, so, no.

Norvig also investigated the most common word lengths, sequences of letters (“n-grams”), letters in various positions in words, and much more. It’s a fascinating page – a feast for data fiends and word nerds alike. (And they are often alike.)
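If you'd like to try a rough version of this kind of count on a text of your own, here is a minimal Python sketch. It is not Norvig's code or method: the function names `letter_ranking` and `top_words` are made up for illustration, only the letters A–Z are counted, and the word-splitting rule (runs of letters and apostrophes) is a crude assumption that a real corpus study would refine.

```python
import re
from collections import Counter


def letter_ranking(text):
    """Return the A-Z letters found in text, most frequent first.

    Case is ignored. Ties are broken alphabetically so the result
    is deterministic.
    """
    counts = Counter(ch for ch in text.upper() if "A" <= ch <= "Z")
    return "".join(sorted(counts, key=lambda ch: (-counts[ch], ch)))


def top_words(text, n=5):
    """Return the n most frequent words in text as (word, count) pairs.

    Assumption: a "word" is any run of letters and apostrophes,
    compared case-insensitively.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


if __name__ == "__main__":
    sample = "The cat and the hat and the bat"
    print(letter_ranking(sample))
    print(top_words(sample, 2))
```

Run on a short sample, the ranking will look nothing like ETAOIN SRHLDCU; it takes a large and varied body of text before the familiar order starts to emerge.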