June 27, 2020
The OED Text Visualizer is an amazing new research tool from OED Labs based on a powerful data engine that automatically annotates text. The Visualizer displays etymological information in an attractive visual format that can ‘open up new areas of questioning and means of discovery’.
It works like this: Paste up to 500 words into the box on this page, add the text’s date, click the button, and you get an instant display of word origins, helpfully colour-coordinated, along a 1,000-year timeline.
Here’s what I got with the first eight paragraphs of my post on the word culchie:

Posted by Stan Carey
December 18, 2019
The Scots Syntax Atlas (SCOSYA) is a fantastic, newly launched website that will appeal to anyone interested in language and dialect, especially regional varieties and their idiosyncratic grammar. Its home page says:
Would you say I like they trainers? What about She’s no caring? Have you ever heard anyone say I div like a good story? And might you say You’re after locking us out? All of these utterances come from dialects of Scots spoken across Scotland, but where exactly can you hear them?
To answer this question, we travelled the length and breadth of Scotland, visiting 145 communities, from Shetland in the north to Stranraer in the south. We were particularly interested in the different ways that sentences are built up in these different areas. This part of a language is called its syntax, and it’s one of the most creative aspects of how people use language.
The resulting interactive Atlas has four main sections: How do people speak in…?, Stories behind the examples, Who says what where?, and Community voices. The two sections framed as questions are self-explanatory. Community voices is a collection of extracts (audio and transcripts) from the conversations recorded – a trove of accent and dialect diversity.
October 12, 2016
Whenever I had a query about slang (and I’ve had many), or felt like a random trawl through the underbelly of language (which was often), my first port of call, traditionally, was Chambers Dictionary of Slang by Jonathon Green. I have several slang dictionaries for various countries or lexical domains, but CDoS was the most generally useful. It has since been superseded: instead of CDoS I now turn to GDoS.
Green’s Dictionary of Slang is the culmination of a life’s work for Green. First published in print as a three-volume behemoth in 2010, to awards and rave reviews, it now emerges in digital form with about 30% of it ‘revised, augmented and generally improved’. I’ve been beta-testing the website and can report it is a beautiful thing, vast and wondrous, filthy and fabulous, endlessly diverting and eye-opening.
Today, thanks to sterling work by web developer David P. Kendal, sees the official launch of Green’s Dictionary of Slang Online.
August 8, 2011
Michael Rundell, a lexicographer at Macmillan Dictionary, wrote last year about a new area of linguistic research “based not on conventional corpora, but on Twitter feeds”. The demo website he linked to has since been updated, and is worth another look.
Now called Tweetolife (grandiosely subtitled “the science of human life in Twitter messages”), it offers a slick and simple interface that shows how words and phrases used on Twitter break down according to gender, or time of day. Like this:

July 25, 2011
Tomasz P. Szynalski, an English-Polish translator, has created TypeIt, a useful website for typing phonetic transcriptions in the International Phonetic Alphabet (IPA). Text can be entered in a range of fonts and with special characters, marks and glyphs from other languages.
/ˈvɛri ˈhændi ɪnˈdiːd, ənd fʌn tə juːz/
I don’t know when it was developed – recently, I think. There are many websites with charts, explanations and audio files of IPA, but few that are designed for immediate online transcription. I like Richard Ishida’s, Weston Ruter’s, Paolo Mairano’s and this Phonemic Chart too, but it’s good to have options. Another: i2Speak.
Thanks to Lauren Hall-Lew, who brought TypeIt to my attention on Twitter.
[Note: I’ve edited this post slightly to add a couple of IPA tools that were mentioned in the comments.]
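For anyone curious about what an IPA keyboard produces under the hood: the symbols are ordinary Unicode characters, each with its own code point and official name. Here is a minimal Python sketch that lists them for a handful of symbols. The helper name `describe` is my own invention for illustration and has nothing to do with how TypeIt itself works.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name) for each non-space character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if not ch.isspace()
    ]


# A few IPA symbols of the kind used in English transcriptions.
for ch, code, name in describe("ˈ ɛ æ ɪ ː ʌ ə"):
    print(ch, code, name)
```

A listing like this is handy for telling lookalikes apart: the schwa ə (U+0259) and the reversed e ɘ (U+0258) are easy to confuse on screen but stand for different vowels, and the IPA stress mark ˈ (U+02C8) is not the same character as an apostrophe.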