Most of the data I deal with day-to-day are numerical: data used to churn out key drivers, benchmarks and segments. But there’re plenty of data out there that aren’t numbers and call for a completely different approach. One method that marketers in the vanguard have bought into is “Sentiment Analysis”, or the attempt to sniff out how people feel by combing through what they say (usually on social media), using computer algorithms to unearth patterns that previously lay buried in a mountain of words. Any marketer can spend an hour going over 50 customer reviews or 10 blog posts, but trying to read your customers’ minds en masse requires some kind of Sentiment Analysis.
That being said, for those of us measuring customer sentiment the traditional way, it hardly means ditching your Net Promoter Scores, the 5-star rating system, and newer toys like tags, tweets, likes and pins. Why? Because Sentiment Analysis attempts to decipher (to its credit) one of the hardest things in the world to get right – human emotions – and sophisticated as its algorithms are, they’re still crude when it comes to replicating the nuance of human perception. Put another way, if Sentiment Analysis is a GPS, it can map out large features like mountains and valleys, but it can’t offer turn-by-turn navigation.
In its simplest form, sentiment extraction consists of labeling each substantive word or word group (i.e. no “a” or “the”) in a sentence “positive (+1)”, “negative (-1)” or “neutral (0)” and summing up the scores accordingly. For example, “I love cereal” would be scored positive. Unfortunately, from there the complexity has no upper bound, since few customer sentiments are that clear-cut. If you’re considering adding Sentiment Analysis to your toolkit, here are some questions that you might want to ask yourself:
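To make the mechanics concrete, here is a minimal sketch of that word-level scoring. The tiny lexicon and its scores are purely illustrative, not a real sentiment corpus:

```python
# Toy sentiment lexicon: +1 positive, -1 negative; any word not
# listed (including stop words like "a" and "the") scores 0.
# These entries are illustrative, not a real corpus.
LEXICON = {
    "love": 1, "great": 1, "tasty": 1,
    "hate": -1, "awful": -1, "broken": -1,
}

def score_sentence(sentence):
    """Sum the per-word scores to get an overall sentiment."""
    words = sentence.lower().replace(".", "").replace("!", "").split()
    return sum(LEXICON.get(word, 0) for word in words)

print(score_sentence("I love cereal"))                # 1 -> positive
print(score_sentence("This awful cereal is broken"))  # -2 -> negative
```

A positive total counts as positive overall, a negative total as negative, and zero as neutral; real tools layer on negation handling, weighting, and far larger dictionaries.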
“Should I customize to my industry?” If you have moderate to heavy industry-specific jargon or word usage in the texts you’re planning to analyze, do it. “Sick” in gamer parlance may be a positive, even a superlative, even though almost everywhere else it means those customers are not coming back. Acronyms are a problem too – “is” the English word is not “IS” the Image Stabilizer when both appear in a review of a camera lens. Generic texts usually have established corpora (basically word dictionaries) for analysis, but more specialized texts need human raters to plow through a sample first in order to establish context-specific rules and sentiment ratings, which are then fed into the algorithm. It takes longer and is more expensive, but it gives you a more accurate read.
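One way to implement that customization is to overlay domain-specific rules on top of a generic lexicon. The words, scores, and acronym list below are hypothetical examples, not drawn from any real corpus:

```python
# Generic baseline lexicon (illustrative scores only).
GENERIC = {"sick": -1, "love": 1, "blurry": -1}

# Domain overrides established by human raters -- e.g. gamer slang,
# where "sick" is a superlative.
GAMER_OVERRIDES = {"sick": 2}

# Case-sensitive tokens to leave unscored: "IS" here stands for the
# Image Stabilizer on a camera lens, not the English verb "is".
ACRONYMS = {"IS"}

def score_token(token, overrides=None):
    if token in ACRONYMS:  # checked before lowercasing
        return 0
    word = token.lower()
    if overrides and word in overrides:
        return overrides[word]
    return GENERIC.get(word, 0)

print(score_token("sick"))                   # -1 under the generic lexicon
print(score_token("sick", GAMER_OVERRIDES))  # 2 under gamer rules
print(score_token("IS"))                     # 0 -> neutral product term
```

The design point is that the generic dictionary stays intact; each industry only pays for the human-rated overrides it actually needs.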
“How many types of emotions should I track?” Here the tradeoff is simple: more emotions, more nuance, less accuracy. Thumbs up or thumbs down is easy to detect, but things like sarcasm are much harder, if not impossible, to nail down correctly. The same goes for humor. The need for context is simply too great, as I was told at a recent seminar on the topic. If I tweeted “Nice, cheese pizza for lunch,” friends would know I’m being sarcastic because I’m not a fan of cheese (sadly), but the algorithm wouldn’t pick it up because most people love cheese. Getting your hands on historical tweets costs upwards of $3,000 per month, and that’s useful only if I regularly and publicly tweet about food and complain about cheese. Even the obviously and overwhelmingly sarcastic customer reviews of Bic’s For Her ballpoint pen are hard to detect via algorithm (to be fair, one or two human reviewers also didn’t get it). If you truly want to know about more nuanced emotions, consider hiring human raters to score a representative sample of customer feedback.
“How do I know if my results are representative?” The problem here is two-fold. First, where you get customer chatter may not be reflective of your customer base and hence may require some weighting after the fact. For example, studies have shown that Twitter users are male-heavy and skewed where race/ethnicity is concerned. Second, if you’re running a global business, most comprehensive sentiment analysis today is done on English-language texts; analysis of other languages is spottier because of differences in grammar, spelling, etc. Keeping the sentiments you track simple and universal (love/hate) will make for easier cross-country comparisons later on.
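As a sketch of that after-the-fact weighting (all numbers hypothetical): if your Twitter sample skews 70% male but your customer base is 50/50, rescale each group’s average sentiment by its true share before reporting an overall figure:

```python
# (share_of_tweet_sample, average_sentiment) per group -- hypothetical numbers.
sample = {"male": (0.70, 0.20), "female": (0.30, 0.50)}

# Known composition of the actual customer base.
population_share = {"male": 0.50, "female": 0.50}

# Naive average inherits the sample's male skew.
unweighted = sum(share * avg for share, avg in sample.values())

# Reweighted average reflects the real customer mix.
weighted = sum(population_share[g] * avg for g, (_, avg) in sample.items())

print(round(unweighted, 2))  # 0.29 -- skewed toward the male-heavy sample
print(round(weighted, 2))    # 0.35 -- rebalanced to the customer base
```

The gap between the two numbers is exactly the bias the weighting step exists to remove.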
In our attempt to better understand customer sentiment, it’s quite easy to put our faith in what’s on a computer screen. But we forget that the best algorithm is actually one that mimics our behavior, taught and trained using our rules of speech. Take the human out of the equation, and there won’t be much sentiment left to uncover.