The OED Text Visualizer

June 27, 2020

The OED Text Visualizer is an amazing new research tool from OED Labs based on a powerful data engine that automatically annotates text. The Visualizer displays etymological information in an attractive visual format that can ‘open up new areas of questioning and means of discovery’.

It works like this: Paste up to 500 words into the box on this page, add the text’s date, click the button, and you get an instant display of word origins, helpfully colour-coordinated, along a 1,000-year timeline.

Here’s what I got with the first eight paragraphs of my post on the word culchie:

Screengrab of the OED Text Visualizer. It shows a rectangular display with colour-coded bubbles of various sizes scattered along a timescale from before the year 1000 up to 2000 on the x-axis. Along the top are the colour codes: English, in blue (97), Germanic, in dark green (82), Romance, in red (66), Latin, in purple (23), other, in yellow (6), and Celtic, in orange (1).

Is ‘corpse whale’ the real etymology of ‘narwhal’?

February 8, 2019

‘We know more about the rings of Saturn than we know about the narwhal,’ writes Barry Lopez in Arctic Dreams. This ignorance extends to its etymology. Wondering why the animal remains ‘so obscure and uncelebrated’, Lopez believes that the answer lies partly with ‘a regrettable connotation of death in the animal’s name’:

The pallid color of the narwhal’s skin has been likened to that of a drowned human corpse, and it is widely thought that its name came from the Old Norse for “corpse” and “whale,” nár + hvalr. A medieval belief that the narwhal’s flesh was poisonous has been offered in support of this interpretation, as well as the belief that its “horn” was proof at that time against being poisoned.

This is certainly the prevailing etymology. Look up narwhal in most major dictionaries that offer one – American Heritage, Oxford, Merriam-Webster, the Online Etymology Dictionary – and you’ll see the ‘corpse whale’ derivation presented more or less definitively, with a ‘probably’ or two included as insurance.

Lopez shares a different possibility:

Idries Shah on words for Sufis

January 3, 2015

Idries Shah’s 1964 book The Sufis, which I read over the holidays, has several interesting passages on language, a couple of which I quote below. The first excerpt concerns the history and use of the protean word Sufism and some of the various terms used to refer to Sufis:

Exactly how old is the word “Sufism”? There were Sufis at all times and in all countries, says the tradition. Sufis existed as such and under this name before Islam. But, if there was a name for the practitioner, there was no name for the practice. The English word “Sufism” is anglicized from the Latin, Sufismus; it was a Teutonic scholar who, as recently as 1821, coined the Latinization which is now almost naturalized into English. Before him there was the word tasawwuf – the state, practice or condition of being a Sufi. This may not seem an important point, but to the Sufis it is. It is one reason why there is no static term in use among Sufis for their cult. They call it a science, an art, a knowledge, a Way, a tribe – even by a tenth-century portmanteau term, perhaps translatable as psychoanthropology (nafsaniyyatalinsaniyyat) – but they do not call it Sufism.

Language change and the arbitrariness of the sign

October 28, 2013

Swiss linguist Ferdinand de Saussure (1857–1913) devised a model of linguistic meaning involving what he called the signifier (a symbolic or phonological form) and what it signifies. Their association is a basic unit of communication he referred to as a linguistic sign, and it is fundamentally arbitrary.

For example, rose signifies a flower with a pleasant smell, but by any other name it would, per Romeo, smell as sweet. Generally speaking, the meaning of a word cannot be predicted from its form, nor its form from its meaning.

Saussure also drew a useful distinction between two approaches to linguistic study, which he called diachronic and synchronic – essentially historical and ahistorical. How he knitted these concepts together may be seen in this passage by Jonathan Culler in his book Saussure (Fontana Modern Masters, 1976):

What is the connection between the arbitrary nature of the sign and the profoundly historical nature of language? We can put it this way: if there were some essential or natural connection between signifier and signified, then the sign would have an essential core which would be unaffected by time or which at least would resist change. This unchanging essence could be opposed to those ‘accidental’ features which did alter from one period to another. But in fact, as we have seen, there is no aspect of the sign which is a necessary property and which therefore lies outside time. Any aspect of sound or meaning can alter; the history of languages is full of radical evolutionary alterations of both sound and meaning. . . . In short, neither signifier nor signified contains any essential core which time cannot touch. Because it is arbitrary, the sign is totally subject to history, and the combination at a particular moment of a given signifier and signified is a contingent result of the historical process.

The fact that the sign is arbitrary or wholly contingent makes it subject to history but also means that signs require an ahistorical analysis. This is not as paradoxical as it might seem. Since the sign has no necessary core which must persist, it must be defined as a relational entity, in its relations to other signs. And the relevant relations are those which obtain at a particular time.

There are exceptions to the arbitrary nature of the sign, such as onomatopoeia or sound symbolism, but even these may have aspects that are arbitrary or informed by the cultures in which they exist. And they are greatly outnumbered by the arbitrary signs.

John Lyons notes in Language and Linguistics that this arbitrary quality makes languages more difficult to learn, but it also gives them great flexibility and adaptability.

A radical awareness of language’s mutability

March 28, 2012

I recently read Henry Hitchings’s Defining the world: The extraordinary story of Dr Johnson’s dictionary, and I recommend it heartily to those of you who enjoy its principal fields of interest: words, history, literature, biography, and lexicography.

As well as recreating the history of Johnson’s Dictionary, which was first published in 1755, Hitchings’s book serves as a frank and affectionate portrait of Samuel Johnson himself, and as a vivid profile of 18th-century England. It’s an elegant and enthralling account that includes a keen analysis of Johnson’s linguistic attitudes and shows how these developed over the course of creating his mighty work.

Before beginning the Dictionary in earnest, Johnson wrote a lengthy Plan of an English Dictionary, in which he presented his ambitions for the book and his suitability for the task. It was addressed to the Earl of Chesterfield in order to win his patronage. Chesterfield, we read, was “obsessed with propriety of usage . . . and with embalming or even bettering the language”. Johnson said the dictionary’s chief intent would be “to preserve the purity, and ascertain the meaning of our English idiom”.

The order of these aspirations is no accident. Johnson’s characterisation of English as “licentious” and “inconstant” has what Hitchings refers to as “a distinctly moral cast”. But although the emphasis on stability was “consistent with [Johnson’s] own political instincts”, Hitchings suggests that it was probably exaggerated for Chesterfield’s sake: years later the Dictionary’s preface would contain a sober and eloquent acknowledgement of the irresistibility of linguistic change.

From Defining the world:

Linguistic conservatives like Chesterfield were afraid that unchecked changes in general usage would cause the English of the eighteenth century to become as bewildering to its inheritors as the language of Chaucer was to them. They were correct, of course, in seeing that their language was in flux. Then and now, the engines of this change include international commerce and travel, which involve contact with other languages; shifts in political doctrine or consensus; translations, which frequently preserve the idiom of their originals; fashion (in Johnson’s age, the nascent cult of sensibility), whose adherents require a special figurative language to articulate their refined and rarefied perspectives; and advertising, which uses foreign terms to connote mystique. These transfusions are what keep a language alive, but this is a modern view. Chesterfield could not begin to see that change was a force for the good. With time, Johnson’s conservatism — the desire to ‘fix’ the language — gave way to a radical awareness of language’s mutability. But from the outset the impulse to standardize and straighten English out was in competition with the belief that one should chronicle what’s there, and not just what one would like to see.

250 years later, Johnson’s Dictionary remains “not merely readable, but vital”, Hitchings writes, its every page brimming with philological lore and choice quotation. It is not just a landmark in lexicography but a great work of literature, described by Robert Burchfield as “the only dictionary compiled by a writer of the first rank”.*

The sixth edition of the Dictionary (1785) is available in multiple formats from the Internet Archive: Volume 1 and Volume 2.


* My Tumblr blog has a short passage by Burchfield on semantic drift.

Where did ‘she’ come from?

March 9, 2011

There’s no shortage of proposed etymologies for she, the third person singular feminine pronoun, but its origins remain uncertain. It appears to have arisen in the 12th century, but how it did so has proved difficult to establish. The Shorter Oxford English Dictionary says it is

probably a phonetic development of Old English hīo, hēo hoo pronoun feminine of he pronoun. Other suggested etymologies include derivation from Old English sēo, sīo feminine adjective . . . or from hypothesized forms in West Germanic.*

This array of possibilities corresponds, more or less, with the “probably”s of other authorities. According to the American Heritage Dictionary, she is “probably [an] alteration of Old English sēo, feminine demonstrative pronoun”. Eric Partridge, in Origins: A Short Etymological Dictionary of Modern English, says it comes

through Middle English she (earlier scae), variants sche, scheo, scho:? from Old English sēo, variant sīo, feminine of the article ‘the’, originally a demonstrative pronoun: cf Old High German siu, Middle High German siu, sie, , German sie, Old Saxon siu; cf also Sanskrit syā, this one.

This tallies with the route outlined by Walter Skeat in his Concise Etymological Dictionary of the English Language:

Middle English sche. Anglo-Saxon seó, used as feminine of definite article, but in the Northumbrian dialect as demonstrative pronoun. Feminine of se originally ‘he;’ cognate with Mœso-Gothic sa, that.

Merriam-Webster dates its first known use to the 12th century, and suggests its origin is

probably [an] alteration of hye, alteration of Old English hēo she

Douglas Harper’s Online Etymology Dictionary has quite a detailed entry:

mid-12c., probably evolved from Old English seo, sio (accusative sie), feminine of demonstrative pronoun se “the.” The O.E. word for “she” was heo, hio, however by 13c. the pronunciation of this had converged by phonetic evolution with he “he,” so the feminine demonstrative pronoun probably was used in its place (cf. similar development in Dutch zij, German sie, Greek he, etc.). The original h- survives in her. A relic of the O.E. pronoun is in Manchester-area dialectal oo “she.”

An alternative that’s generally overlooked, as it is by each of the above, is the possible connection with the Irish word sí /ʃi:/. It means “she”, it is pronounced identically to she, and it can be traced back to Old Celtic. Its roots appear, inevitably, in MacBain’s Etymological Dictionary of the Gaelic Language:

she, Irish í, , Old Irish í, , , Welsh, Breton hi: *; Gothic si, ea, German sie, they; Sanskrit syá: Indo-European sjo-, sjā– (Brug.).

Loreto Todd, in Green English: Ireland’s influence on the English language, makes the case for sí:

in the tenth century, there were parts of England where the same pronoun he could mean ‘he’, ‘she’ and ‘they’. Such a high degree of ambiguity was not allowed to continue. Speakers, especially in the north of England, began to adopt the Norse forms þai, þeʒʒm and þeʒʒre, which developed into modern ‘they’, ‘them’ and ‘their’. The change from he(o) to ‘she’ is much less easy to account for. No dialect of English or Norse had a personal pronoun that would or could have developed directly into ‘she’, although many etymologists have struggled to explain it by invoking combinations of Old English and Old Norse personal pronouns and by suggesting that the demonstrative pronoun seo, probably pronounced like ‘say + o’, can help in explaining the shift from ‘he’ to ‘she’.

Proposing sí as a conceivable inspiration, Todd notes the presence of Irish clerics and scribes in many communities in England at the time, and quotes from Martyn Wakelin’s English Dialects: An Introduction:

the early Scandinavian settlements (ninth century and earlier) in this country were mainly Danish and were on the Eastern side of England. Norwegian settlements occurring somewhat later (mainly in the first half of the tenth century by men who had been living in Ireland) were in the northwestern counties and the North and West Ridings of Yorkshire.

The emphasis is Todd’s. She acknowledges that this etymology is controversial. I don’t know how controversial – maybe it’s simply dismissed as unlikely – though at least one scholar finds Todd’s contention “satisfactory” (Radoslava Pekarová, The Influence of the Irish Language on Irish English Grammar; PDF, 308 KB).

But the change from he(o) to she might not be so unlikely or unusual. A. H. Smith’s Some Place-Names and the Etymology of “She”** (1925) showed that there are

certain place-names in the north of England and in Scotland which illustrate a peculiar sound development in English. Old English initial he- in these cases shows a tendency to become late Middle English sh- [ʃ].

Smith concludes:

the evidence of the place-names . . . shows that a development of O.E. he- to M.E. sch- did take place, which could explain the derivation of modern English she from O.E. hēo, especially as the periods when M.E. ʒhe, ʒhe and M.E. sche, scho were prevalent agree more or less with the periods when He-, Hy, Yh- and Sch-, Sh- were prevalent in the place-names

and summarises as follows:

[Image of Smith’s summary. Source: A. H. Smith, Some Place-Names and the Etymology of ‘She’]

Smith’s research forms part of the evidence to which M. L. Samuels refers in his analysis in Linguistic Evolution: with special reference to English (1975):

The ME reflex of OE heo was he, so that large areas of the country were left without a formal distinction between ‘he’ and ‘she’, while even in the remaining areas the other surviving forms (hy, heo) were not ideal for the purpose. This systemic gap, which can be shown from the ambiguities in many surviving texts, was filled by a typical drag-chain process – the selection of originally rare variants, the stress-shifted forms /hjo/ and /hje/. These then changed, via the intermediate stage /ço, çe/ to /ʃo, ʃe/, perhaps first in the heavily Norse-influenced Cumberland-Yorkshire belt which provides numerous parallels for the change. (pp. 114–116)

Some of this material is based on Samuels’ earlier paper, The Role of Functional Selection in the History of English*** (1965), which includes the following image and accompanying text:

[Figure: M. L. Samuels, The Role of Functional Selection in the History of English, Fig. 3]

The intermediate stage /ç/ would, as pointed out by Vachek, survive for a time as a marginal phoneme (spelt ʒ or ʒh), but would naturally give way to /ʃ/, which was equally distinctive for the purpose of functional differentiation, yet far better integrated phonemically.

Convincing proof for this theory (as against the older derivations from Old English sēo or Old Norse sjá) is now available from study of the Middle English distributions of forms. The later ME distribution, schematized in Fig. 3, shows ʒ(h)-forms in border areas which divide the newer s(c)he, s(c)ho in the Midlands from the older he, hy, heo in the south. . . . In other words, a belt of ʒ(h)-forms started in the northern belt and moved southwards across the country, always followed, after what appears to have been a comparatively uniform lapse of time, by s(c)h-forms. . . . we may justifiably conclude that the modern form she arose from an originally unusual phonetic variant in the spoken chain, and that it spread to large areas in which, failing such a form, the pronominal system was wholly lacking in balance.

The argument is developed in detail in Samuels’ book and paper. Although I found it fascinating, I’m not qualified to assess it, and I don’t know if there are more recent findings that substantially confirm or contradict it. Certainly there is nothing like a consensus on the matter, and it’s easy to see why the OED describes the etymology of she as “difficult”!

There’s a short and helpful discussion of the various possibilities in the comments of this post at Language Hat, where linguist Marie-Lucie says that “the Old English and Old Irish s- forms could have a common ancestor in Proto-Indo-European, or at least a branch of it”.

I love a good mystery, but this is one I wouldn’t mind seeing resolved. And much as I’m taken by Todd’s sí-hypothesis, I wouldn’t bet on it.


* In this and other quoted text, I’ve written some abbreviations in full.
** Review of English Studies, 1(4), Oct. 1925, pp. 437–440.
*** Transactions of the Philological Society, 64(1), Nov. 1965, pp. 15–40.

Finding a folly euphonic

November 22, 2009



Qu’il est heureux de se défendre
Quand le coeur ne s’est pas rendu!
Mais qu’il est fâcheux de se rendre
Quand le bonheur est suspendu!
Par un discours sans suite et tendre,
Égarez un coeur eperdu;
Souvent par un mal-entendu
L’amant adroit se fait entendre.



How happy to defend our heart,
When Love has never thrown a dart!
But ah! unhappy when it bends,
If pleasure her soft bliss suspends!
Sweet in a wild disordered strain,
A lost and wandering heart to gain,
Oft in mistaken language wooed
The skilful lover’s understood.


I found this poem and its translation in Literary Frivolities, Fancies, Follies and Frolics (1880) by William T. Dobson, who in turn found them in Isaac D’Israeli’s Curiosities of Literature (1791–1823). Some lines seem rather loosely translated, but no matter. Dobson writes that the French author Claudine Guérin de Tencin once sang this verse to the writer and scientist Bernard le Bovier de Fontenelle, who was impressed enough to request that she repeat the performance. When she pointed out that the verses were mere nonsense, he admitted that they were “so much like the fine verses I have heard here, that it is not surprising I should be for once mistaken!”

An amphigouri, also amphigourie or amphigory, can be considered a burlesque equivalent of what is known in English as a nonsense poem or nonsense verse. The OED says the word is a learned jocular formation from amphi- (Greek for around, about) and allégorie, where the Greek -agoria means speech or speaking. Alternatively, the latter part of the word may have come from gyros, Greek for circle or ring: like these entities, an amphigouri is well-rounded and attractively presented, but has nothing of substance inside. As Dobson put it, the verse is “richly-rhymed, elegantly expressed, but actual nonsense!”

Nonsense it may be, but my ears prefer it to the non-rhyming, inelegantly expressed nonsense that sometimes passes for meaningful communication.