The word “cloud” refers primarily to the familiar watery wisps overhead. It can also refer to analogous physical phenomena (a cloud of dust, an electron cloud, a cloud in glass or translucent stone), or to figurative concepts (a cloud of suspicion or gloom hanging over someone). In computing, cloud has acquired other meanings, such as cloud computing, which I won’t be writing about, and word clouds or tag clouds, which I will. If you already know this terrain, feel free to skip forward.
A simple example is the category cloud on the right-hand side of this blog. [Edit: I have since replaced the category cloud with a tag cloud.] First, I categorise and label (“tag”) each of my blog posts. For example, this post is categorised under blogging and is tagged blogging, tagging, toys, words, Wordle, tag clouds, and word clouds. The resulting metadata can be displayed automatically, thereby informing visitors of the general content of the blog, instantly and aesthetically.
The size of a tag in a cloud is directly proportional to its popularity; a glance at the category cloud shows that “usage” is my most frequently used category at the time of writing. Clicking on a tag will select the content with that tag, or in this case within that category. Word clouds have been applied to all sorts of text, from politicians’ speeches to rugby tweets. Smashing Magazine has an excellent presentation of tag cloud examples and uses.
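For the technically curious, the sizing idea can be sketched in a few lines of Python. This is only an illustration, not how this blog's widget actually works: the function name tag_sizes and the pixel limits min_px and max_px are made up for the example. It scales sizes linearly between a minimum and a maximum, so that rare tags stay legible rather than shrinking towards nothing.

```python
from collections import Counter

def tag_sizes(tags, min_px=12, max_px=36):
    """Map each tag to a font size that grows linearly with its count.

    min_px and max_px are assumed limits chosen for this example.
    """
    counts = Counter(tags)
    if not counts:
        return {}
    top = max(counts.values())
    # The most frequent tag gets max_px; rarer tags shrink towards min_px.
    return {
        tag: round(min_px + (max_px - min_px) * n / top)
        for tag, n in counts.items()
    }

sizes = tag_sizes(
    ["usage", "usage", "usage", "usage", "words", "words", "blogging"]
)
print(sizes)  # {'usage': 36, 'words': 24, 'blogging': 18}
```

Other schemes are possible; scaling by the logarithm of the count, for instance, would stop one very popular tag from dwarfing the rest.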
That’s the summary: now for the fun. Wordle is a toy and a tool that creates word clouds from any text you give it. Its website is a model of good design and clear information. Once you have created a Wordle cloud, you can modify it by colour, font type, and tag alignment. You can also remove common words (such as the, to, and, it, and of) from English or from 25 other languages. This gives the cloud a much more accurate flavour of the submitted content.
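Removing common words is simple to imitate. The sketch below is an assumption about the general technique, not a description of Wordle's own code: it lower-cases the text, splits it into words, and discards anything on a small stop list. The list STOP_WORDS and the function word_counts are invented for the example, and a real stop list would be far longer.

```python
import re
from collections import Counter

# A tiny illustrative stop list; real tools use much longer, per-language lists.
STOP_WORDS = {"the", "to", "and", "it", "of", "a", "in", "is"}

def word_counts(text, stop_words=STOP_WORDS):
    """Count the words in text, ignoring case and skipping stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)

counts = word_counts("The cat and the hat: it is the tale of a cat in a hat.")
print(counts.most_common())  # [('cat', 2), ('hat', 2), ('tale', 1)]
```

Without the stop list, "the" and "a" would top the count and swamp the words that say something about the content.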
Now that Sentence first has been on the go for a while, I decided to give it the Wordle treatment. I included post titles and text, but not tags or comments (or this post). Here is the result, which you can click on to enlarge:
Some random observations. Because there is no stemming, the cloud includes common and commonly; usage, use, useful, and used. Certain words appear because of their prominence in posts dedicated to them, such as however, principal, and mwdeu. Others appear because they feature a few times in a set phrase, such as death in Blue Screen of Death, and splices in comma splice.
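To show what stemming would change, here is a deliberately crude sketch. The suffix list and the function crude_stem are hypothetical and nowhere near complete; established algorithms such as the Porter stemmer handle English far more carefully. The point is only that related forms collapse into a single entry, which would then appear larger in the cloud.

```python
from collections import Counter

# Hypothetical suffix list for illustration only; far from complete.
SUFFIXES = ("ful", "age", "ly", "ed")

def crude_stem(word):
    """Strip one known suffix, then a trailing 'e', to get a rough stem."""
    for suffix in SUFFIXES:
        # Only strip when at least two letters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            word = word[: -len(suffix)]
            break
    if word.endswith("e") and len(word) > 2:
        word = word[:-1]
    return word

words = ["common", "commonly", "usage", "use", "useful", "used"]
merged = Counter(crude_stem(w) for w in words)
print(merged)  # Counter({'us': 4, 'common': 2})
```

The stem "us" is not a word, which is typical of stemmers: the output is a grouping key, and a cloud would normally display the most frequent original form instead.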
One thing I love about Wordle clouds is the way arbitrary sentence fragments emerge, often incongruously and sometimes almost poetically, like the results of using cut-ups.* In the Sentence first word cloud I see the sensible advice “first make word”, the Tarzanesque “language good”, and the more lyrical but enigmatic “whether sometimes something though Irish better”.
And now, for fun and mystery, here is an unidentified word cloud:
Can you guess or work out its source? If you don’t know or don’t want to guess, you could tell me your favourite fragments. All words are lower case, and apostrophes are omitted, so id means I’d, ill means I’ll, etc.
If you want a clue, you will find one in invisible writing (i.e. white font) on the next line. Highlight (left-click and drag) to see the hidden text.
It has already been mentioned on this blog!
If it proves too difficult, I’ll supply the answer or another clue next week. In the meantime, Happy Easter and happy wordling.