The OED Text Visualizer is an amazing new research tool from OED Labs based on a powerful data engine that automatically annotates text. The Visualizer displays etymological information in an attractive visual format that can ‘open up new areas of questioning and means of discovery’.
It works like this: Paste up to 500 words into the box on this page, add the text’s date, click the button, and you get an instant display of word origins, helpfully colour-coded, along a 1,000-year timeline.
Here’s what I got with the first eight paragraphs of my post on the word culchie:
Each bubble represents a word in the inputted text, its size proportionate to its frequency in the text. When you hover the cursor over a bubble, you get information about the word. The x-axis is a timeline from Old English to today, and the y-axis shows a word’s frequency in modern English on a logarithmic scale.
The Germanic clump in the top left consists of the, be, and, to, and in. The big blue words are of, or, a, its, and Irish – this one an obvious indicator of the topic of my text. The Latin cluster from the 15th century on has a scholarly, multisyllabic flavour: significance, dictionary, equivalent, indicate, speculate, synonym, connotation. Purple was a good choice.
The sole word identified as Celtic is bog. Culchie is unquestionably Irish, but the Annotator categorizes it as ‘place name’, hence the big yellow bubble in the bottom right. Below the display are fuller breakdowns of each word in tables of tokens and lexemes; these can be exported at a click as CSV or JSON files.
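If you want to play with the raw data outside the browser, something like the following Python sketch could tally an exported CSV by origin. Treat it as a rough illustration only: I'm guessing at the layout, so the column names (lemma, origin) and the little inline sample are assumptions of mine, not the Visualizer's actual export format. The origins in the sample are the ones shown in the display described above.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for an exported CSV file. The column names
# ("lemma", "origin") are assumptions, not the tool's real schema.
SAMPLE_EXPORT = """lemma,origin
the,Germanic
be,Germanic
bog,Celtic
significance,Latin
culchie,place name
"""


def tally_by_column(csv_text, column):
    """Count how many rows share each value in the named column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


counts = tally_by_column(SAMPLE_EXPORT, "origin")
for origin, n in counts.most_common():
    print(f"{origin}: {n}")
```

With a real export you would read the downloaded file's text instead of the inline sample, and swap in whatever the origin column is actually called.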
The OED says a fully optimized version of the Text Visualizer – which is currently in beta – will be along soon. In the meantime it invites feedback: ‘You are welcome to trial different types of text, play with the visualization and raw data, test out the tool functionality, and then share your thoughts.’
I highly recommend giving it a spin. It’s a fun, fascinating, and intuitive tool.