Word clouds: The great debate

Mar 9, 2023

A quick Google search on word clouds turns up some very negative blog posts, such as “Word Clouds are Lame,” “Why Word Clouds Harm Insights,” and “The Problem with Word Clouds.” This negativity is absolutely warranted – word clouds are very limited in their analytical capabilities. Even WordItOut, a popular word cloud tool, indicates they are mostly used for fun.

Despite the negativity out there, word clouds are popular. An analysis of Google searches for visualization tools since 2004 reveals increasing interest in word clouds, and at some point in 2016 they seemed to rival bar charts for interest (at least as measured by Google searches).

While we don’t typically use word clouds at HelloInfo, they are a data visualization tool that warrants a bit of an investigation. Let’s take a look:

What are word clouds?

Word clouds visually represent the words used in a passage of text, with the size of each word proportionate to its frequency in the passage. The software that creates the word cloud arranges the words ‘artfully’. As an example, here is a word cloud of this blog post:

A word cloud.
Word cloud detailing all words in this blog post that appear two or more times.
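
For readers curious about the mechanics, here is a minimal sketch of the sizing idea described above: count the words, keep those that appear at least twice, and scale each word's font size in proportion to its frequency. The function names, the sample sentence, and the 72-point maximum are our own assumptions for illustration; this is not how WordItOut or any particular tool is implemented.

```python
import re
from collections import Counter


def word_frequencies(text, min_count=2):
    """Count lowercase words and keep those appearing at least min_count times."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {w: c for w, c in counts.items() if c >= min_count}


def font_sizes(freqs, max_size=72):
    """Scale font size linearly with frequency; the top word gets max_size."""
    top = max(freqs.values())
    return {w: max_size * c / top for w, c in freqs.items()}


# Hypothetical sample text, not the text of this post.
sample = "word clouds show word frequency; word clouds are popular"
freqs = word_frequencies(sample)   # {'word': 3, 'clouds': 2}
sizes = font_sizes(freqs)          # {'word': 72.0, 'clouds': 48.0}
print(sizes)
```

Real tools may scale sizes differently (for example, non-linearly), which is one more reason exact frequencies are hard to read off a word cloud.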

What are the issues with word clouds?

There are some significant issues with word clouds. The biggest is how the data is portrayed: it is difficult to extract meaningful information from a word cloud.

Consider the word cloud above – it shows the words in this blog post that appear at least twice. As an alternate view, the following visualization shows the counts of the 10 most frequent words in this blog post.

A bar chart
Bar chart detailing the top 10 most frequent words in this blog post.

In the word cloud, it is unclear how many more times the word ‘word’ appears than ‘clouds’. We can see that it is larger, but it is not possible to tell that it appears almost twice as many times. From the bar chart, you can easily ascertain this.

Further, it is also possible with the bar chart to understand the full set of terms that round out the top 10 list. In the word cloud it is difficult to quickly pick out any but the top 3-5 most frequently represented words.
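
To make the comparison concrete, here is a rough sketch of the bar chart view: rank the words by count, print a simple text bar for each, and read the ratio between the top two directly from the numbers. The word list and its counts are invented for illustration and are not the actual counts from this post; the helper names are ours.

```python
from collections import Counter


def top_words(words, n=10):
    """Return the n most frequent words as (word, count) pairs."""
    return Counter(words).most_common(n)


def text_bar_chart(ranked):
    """Render one line per word: padded label, a bar of '#', and the count."""
    width = max(len(w) for w, _ in ranked)
    return [f"{w.ljust(width)} {'#' * c} {c}" for w, c in ranked]


# Hypothetical counts, for illustration only.
words = ["word"] * 6 + ["clouds"] * 3 + ["chart"] * 2
ranked = top_words(words)
chart = text_bar_chart(ranked)
print("\n".join(chart))

# The exact ratio is available from the counts, unlike in a word cloud.
ratio = ranked[0][1] / ranked[1][1]   # 2.0 for this sample
```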

Other issues with word clouds:

  1. There is only one attribute measured in a word cloud analysis (frequency). 
  2. Phrases are, by default, broken out into separate words (although they do not need to be if you configure a word cloud properly). 
  3. As words are naturally different lengths, longer words take up disproportionately more room in the visualization than they should, causing a biased perspective.
  4. Overall, there is typically little value to showing the frequency of word usage!
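
The word-length issue in the list above can be illustrated with a back-of-the-envelope calculation. Assuming, as a simplification, that every character is about as wide as the font size and as tall as the font size, a word's footprint grows with its length times the square of its font size. The function below is our own rough model, not a measurement from any real rendering engine.

```python
def relative_footprint(word, font_size):
    """Approximate area in arbitrary units: (chars * size) wide by size tall."""
    return len(word) * font_size ** 2


# Two words with the same frequency would be drawn at the same font size.
short = relative_footprint("data", 20)            # 4 characters
long_ = relative_footprint("visualization", 20)   # 13 characters

# The longer word covers over three times the area despite equal frequency.
bias = long_ / short   # 3.25
print(bias)
```

Under this model, a reader comparing areas rather than font heights would overestimate how often the longer word appears.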

What alternatives are there?

To solve some of the problems with word clouds, alternative visualization tools that better represent the data can be used. For example:

  • Bar charts
  • Tree maps
  • Circle packing (or word bubbles)
  • Donut charts

An understanding of each of these tools can be found here.

When might word clouds be useful? 

While there are admittedly limited use cases for them, at HelloInfo we have deployed word clouds for one project. In this project, we were conducting an analysis of trends affecting a certain industry. As part of the project output, we created word clouds representing the frequency of trend-based terminology used in competitive thought leadership. As our research was not quantitative in nature, we did not want to misrepresent our findings by assigning numerical values to the different terms (as would be needed for a bar chart or other visualization). By deploying a word cloud, our clients were able to easily see the terms used and the relative use of the terms without implying specific quantities associated with them.

It should be noted that we strongly feel that word clouds should only be used in certain data visualization situations. In this project, we deployed several approaches with the visualizations to ensure that we were delivering maximum impact:

  1. We configured the word clouds to represent phrases, not just individual words.
  2. Given that a word cloud is so weak in its analysis, we also provided supplemental text to explain the specific terms and phrases in the context of the larger trend.
  3. We deployed many different visualization tools in this engagement overall, and determined a word cloud accomplished what we wanted it to (show the related terms with some indication of difference in size) while also providing an alternate visual for the reader to keep things interesting.
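
As an illustration of the first point in the list above, here is one possible way to count phrases as single terms before building a word cloud: join each known phrase into one token, then count tokens as usual. The phrase list, the sample text, and the function name are hypothetical, and this sketch does not describe how WordItOut handles phrases internally.

```python
import re
from collections import Counter


def count_terms(text, phrases):
    """Count words, treating each listed phrase as a single term."""
    lowered = text.lower()
    # Handle longer phrases first so shorter ones cannot split them.
    for phrase in sorted(phrases, key=len, reverse=True):
        token = phrase.lower().replace(" ", "_")
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        lowered = re.sub(pattern, token, lowered)
    tokens = re.findall(r"[a-z_']+", lowered)
    # Restore spaces so phrases display naturally.
    return Counter(t.replace("_", " ") for t in tokens)


# Hypothetical text and phrase list, for illustration only.
text = "Machine learning drives growth. Supply chain risk and machine learning matter."
counts = count_terms(text, ["machine learning", "supply chain"])
print(counts.most_common(3))
```

With the phrases joined, "machine learning" is counted twice as one term instead of appearing as two unrelated words.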

When building our word clouds for this project, the HelloInfo team used WordItOut as it allowed us to edit the font and colors, upload large data sets, and use phrases, rather than individual words. And I suppose if we had wanted a memento of a project, we could have ordered one of their custom word cloud shirts.

Here at HelloInfo there are no hard and fast rules for our data visualization. If you’re passionate about word clouds, and want to see our team deploy them on your project, please schedule a meeting with us to discuss!
