Sentimental Gotchas

In social media measurement, the currently accepted standard is brand sentiment: the percentage of tweets, posts, and forum discussions that skew either positively or negatively toward the brand in question.

In systems like this, social media analysis providers such as Brandtology, Radian 6, and Meltwater take a feed or crawl of social media sites, use automated semantic analysis to measure how positive or negative the keywords are, and assign a rating to each post accordingly. The posts are then aggregated and a sentiment rating is assigned to the brand overall. For instance, if 50 percent of the posts are deemed positive and 50 percent negative, the sentiment for the brand would be 50 percent overall, which is quite neutral.
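The aggregation step can be illustrated with a deliberately tiny sketch. This is not how any of the providers named above actually work; the keyword lists and the function names `rate_post` and `brand_sentiment` are assumptions made up for the example, and real tools use much larger lexicons and statistical models.

```python
import re

# Hypothetical keyword lists, for illustration only.
POSITIVE = {"love", "great", "good", "fast"}
NEGATIVE = {"hate", "bad", "slow", "broken"}

def rate_post(text):
    """Rate one post as 'positive', 'negative', or 'neutral' by keyword count."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def brand_sentiment(posts):
    """Percentage of positive posts among posts rated positive or negative."""
    ratings = [rate_post(p) for p in posts]
    positive = ratings.count("positive")
    negative = ratings.count("negative")
    rated = positive + negative
    if rated == 0:
        return None  # nothing to aggregate
    return 100.0 * positive / rated

posts = [
    "I love the new app, great work",
    "Support is slow and the site is broken",
]
print(brand_sentiment(posts))  # 50.0
```

One positive post and one negative post give the 50 percent figure from the example above.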

The process described above is a very simplified view, and the actual process is less straightforward. The end result, however, will always be clear, concise, and simple enough for C-level executives to understand: is the social media sphere saying good things or bad things about the brand on any particular day?

Gotcha: Semantics, Semantics, Semantics

However, when looking over sentiment results, it is imperative that you do not take the metrics at face value. For one thing, most semantic analysis tools, advanced as they are, still only scratch the surface of the human capacity for linguistic mangling, something that is exacerbated in the social media world.

This can skew the sentiment results considerably. Where possible, you should go through the posts and reassign the sentiment wherever the machine has rated a post incorrectly (most solutions out there will allow you to do that).
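To see how much a single manual correction can move the number, here is a minimal sketch. The post IDs, the ratings, and the helper names `apply_overrides` and `percent_positive` are all hypothetical; actual tools expose this through their own review interfaces.

```python
# Machine-assigned ratings keyed by a hypothetical post ID.
machine_ratings = {
    "post-1": "positive",
    "post-2": "positive",  # imagine a sarcastic post the tool misread
    "post-3": "negative",
}

# Corrections entered by a human reviewer.
manual_overrides = {"post-2": "negative"}

def apply_overrides(machine, overrides):
    """Return a new dict in which reviewer corrections replace machine ratings."""
    unknown = set(overrides) - set(machine)
    if unknown:
        raise KeyError(f"overrides for unknown posts: {sorted(unknown)}")
    return {**machine, **overrides}

def percent_positive(ratings):
    """Share of posts rated positive, as a percentage of all posts."""
    values = list(ratings.values())
    if not values:
        return None
    return 100.0 * values.count("positive") / len(values)

final_ratings = apply_overrides(machine_ratings, manual_overrides)
print(round(percent_positive(machine_ratings), 1))  # 66.7
print(round(percent_positive(final_ratings), 1))    # 33.3
```

With only three posts, one misread post is the difference between a mostly positive and a mostly negative reading, which is why the manual pass matters most on small samples.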

Gotcha: Internationalization

Internationalization is another issue. There is the obvious problem of non-English languages that current systems have to support, but even for English-language posts, tweets, and status updates, there are dramatic differences in the way English is written colloquially on the Internet across different markets, and these can drastically change the sentiment of a post.

Further muddying the waters is the fact that in many Asian cultures, even if English is the de facto language, it is mixed with the local language in a way that makes it hard for systems to parse out the actual sentiment of the post (for instance, a tweet from the Philippines could switch from English to Tagalog effortlessly within the space of 140 characters; Singlish is another renowned linguistic hybrid).

Gotcha: Abbreviations

Abbreviations are another source of noise. Case in point: we were doing a social media scan for a big bank in Indonesia whose name was an abbreviation. As it turned out, that abbreviation was also the abbreviation for a common word in Bahasa, so the signal-to-noise ratio dropped sharply, making it harder to get good insights out of the social media scans.
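One common way to cut this kind of noise is to keep a post only when the abbreviation appears alongside a word from the brand's domain. The sketch below assumes exactly that; it is not what we did on the scan described above, "XYZ" is a placeholder rather than the real abbreviation, and the context word list is invented for the example.

```python
import re

# "xyz" is a placeholder, not the real abbreviation from the scan.
BRAND = "xyz"
# Hypothetical banking context words; a real list would also need
# local-language terms.
CONTEXT = {"bank", "atm", "account", "loan", "branch", "card"}

def is_about_brand(text):
    """True if the post mentions the abbreviation together with a context word."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return BRAND in words and bool(words & CONTEXT)

posts = [
    "XYZ bank ATM ate my card again",
    "xyz lah, see you tonight",
]
relevant = [p for p in posts if is_about_brand(p)]
print(relevant)
```

A filter like this trades recall for precision: it will drop genuine brand mentions that carry no context word, so the discarded posts are still worth sampling by hand.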


Now, you might be asking what the point is of listing all these gotchas of sentiment analysis. What we have found is that people sometimes put too much faith in sentiment ratings as the be-all and end-all of social media analysis, when they actually just barely scratch the surface. What you need to do is the following:

  1. Use sentiment analysis as the start of your analysis, not the end point. Scrub through the results; rejig the ratings manually if possible. It is the insights that are important for action, not the rating itself.
  2. Always, always be aware of the gotchas when looking through the results. Knowing where the results are likely to be skewed will enable you to get more value out of your analysis.

And most importantly, remember that social media analysis is just measurement at the end of the day; what matters most is action. Measurement is used to guide action, and without immediate action, especially in the fickle world of social media, your measurement tools will just be another white elephant in your repertoire.
