Thoughts from the 6th annual Text Analytics Summit

I just returned from the 6th annual Text Analytics Summit in Boston. It was an enjoyable conference, as usual. Larger players such as SAP and IBM had booths at the show alongside pure-play vendors Clarabridge, Attensity, Lexalytics, and Provalis Research. This was good to see, and it underscores the fact that platform players acknowledge text analytics as an important piece of the information management story. Additionally, more analysts were at the conference this year, another sign that the text analytics market is becoming more mainstream. And, most importantly, various end users were in attendance, looking at using text analytics for different applications (more about that in a second).

Since a large part of the text analytics market is currently driven by social media and voice-of-the-customer/customer experience management applications, there was, as expected, a lot of talk about these topics. Still, some universal themes emerged that are application-agnostic. Interesting nuggets include:

  • The value of quantifying success. I found it encouraging that a number of the talks addressed a topic near and dear to my heart: quantifying the value of a technology. For example, the IBM folks, when describing their Voice of the Customer solution, specifically laid out attributes that could be used to quantify success for call center applications (e.g., handle time per agent, first-call resolution). The user panel in the Clarabridge presentation focused part of the discussion on how companies measure the value of text analytics for Customer Experience Management. Panelists discussed replacing manual processes, identifying the proper issue, and other attributes (some easy to quantify, some not so easy). Daniel Ziv of Verint even cited some Forrester work that tries to measure the value of loyalty in his presentation on the future of interaction analytics.
  • Data Integration. On the technology panel, all of the participants (Lexalytics, IBM, SPSS/IBM, Clarabridge, Attensity) were quick to point out that while social media is an important source of data, it is not the only one. In many instances, it is important to integrate this data with internal data to get the best read on a problem, customer, etc. This is obvious, but it underscores two points. First, these vendors need to differentiate themselves from the 150+ listening-post and social media analysis SaaS vendors that rely exclusively on social media and are clouding the market. Second, integrating data from multiple sources is a must-have for many companies. In fact, there was a whole panel discussion on data quality issues in text analytics. While the structured data world has been dealing with quality and integration issues for years, aside from companies dealing with the quality of data in ECM systems, this is still an area that needs to be addressed.
  • Home Grown. I found it interesting that at least one presentation and several end users I spoke to stated that they have built, or will build, home-grown solutions. Why? One reason is that a little can go a long way. For example, Gerand Britton from Constantine Cannon LLP described how the biggest bang for the buck in eDiscovery came from near-duplicate clustering of documents: functionality that can recognize that an email and the reply acknowledging its receipt are essentially the same document, so the cluster can be reviewed by one person rather than two or three. To put this together, the company combined some SPSS technology with home-grown functionality. Another reason for going home-grown is that companies feel their problem is unique. A number of attendees I spoke to mentioned that they had either built their own tools or that their problem would require too much customization, and they could hire university researchers to help build specific algorithms.
  • Growing Pains. There was a lot of discussion on two related topics. First, a number of companies and attendees spoke about a new “class” of knowledge worker. As companies move away from manually coding documents to automated extraction of concepts, entities, etc., the kind of analysis needed to derive insight will no doubt be different. What will this person look like? Second, a number of discussions sprang up around vendors being given a hard time about figures such as 85% accuracy in classifying, for example, sentiment. One hypothesis was that it is a lot easier to read comments and decide what the sentiment should be than to interpret the output of a statistical analysis.
  • Feature vs. Solution? Text analytics is being used in many, many ways. This ranges from building full-blown solutions around problem areas that require the technology to embedding it in a search engine or URL shortener. Most people agreed that the functionality will become more pervasive as time goes on. People will ultimately use applications that deploy the technology without even knowing it is there. And, I believe, it is quite possible that many of the voice-of-the-customer/customer experience solutions will simply become part of the broader CRM landscape over time.
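The near-duplicate clustering described in the eDiscovery example above can be sketched with word shingles and Jaccard similarity. This is a generic illustration of the technique, not the firm's actual implementation (the function names and the 0.7 threshold are my own choices):

```python
from itertools import combinations

def shingles(text, k=3):
    """Split text into overlapping k-word shingles."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

def jaccard(a, b):
    """Jaccard similarity between two shingle sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def near_duplicate_clusters(docs, threshold=0.7):
    """Greedy single-link clustering: documents whose shingle overlap
    exceeds the threshold land in the same cluster."""
    sets = [shingles(d) for d in docs]
    parent = list(range(len(docs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(docs)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(i)] = find(j)  # union the two clusters

    clusters = {}
    for i in range(len(docs)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())
```

An email and its acknowledgment reply share almost all of their shingles, so they end up in one cluster and a single reviewer can handle both.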

I felt that the most interesting presentation of the Summit was a panel discussion on the semantic web.  I am going to write about that conversation separately and will post it in the next few days.

Can Text Analytics and Speech Analytics Have a Happy Marriage?

Sid Banerjee, CEO of Clarabridge, told me about Clarabridge’s recently announced partnership with CallMiner. It is a marriage of text analytics with speech analytics, and hence it should attract the attention of the big call center users. In fact, it should interest every business that makes extensive use of telephony. Working together, these technologies provide a more comprehensive view of customer interactions, analyzing both written text (email, documents) and conversations.

The Clarabridge Content Mining Platform (CMP)

Clarabridge CMP is the text analytics part of the solution. It uses automated extraction, plus optional overlay rules, to pull entities, facts, relationships, events, and “sentiments” from documents. Users can source documents from designated servers and then apply semantic extraction techniques to draw data from them.

CMP provides data quality, cleansing, and organization tools for managing the harvested data. Once extracted and processed, the data can be stored in a database or warehouse and subsequently used by Business Intelligence (BI) tools. CMP can store the information itself and establish connectors to various reporting, statistical, and data mining tools for analysis.
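The extract-store-analyze pattern described here can be sketched in a few lines. This is toy code of my own, not Clarabridge's API: a regex "extractor" stands in for real semantic extraction, and an in-memory SQLite table stands in for the warehouse that BI tools would query:

```python
import re
import sqlite3

def extract_facts(doc_id, text):
    """Toy extractor: pulls email addresses and dollar amounts as
    (doc_id, fact_type, value) rows. Real semantic extraction would
    produce entities, relationships, events, and sentiment instead."""
    facts = []
    for email in re.findall(r"[\w.+-]+@[\w-]+\.\w+", text):
        facts.append((doc_id, "email", email))
    for amount in re.findall(r"\$\d+(?:\.\d{2})?", text):
        facts.append((doc_id, "amount", amount))
    return facts

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (doc_id TEXT, fact_type TEXT, value TEXT)")

docs = {
    "d1": "Refund of $25.00 issued; contact support@example.com with questions.",
    "d2": "Customer disputed a $100 charge.",
}
for doc_id, text in docs.items():
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?)",
                     extract_facts(doc_id, text))
conn.commit()

# A downstream BI or reporting tool would run aggregate queries like this:
rows = conn.execute(
    "SELECT fact_type, COUNT(*) FROM facts GROUP BY fact_type ORDER BY fact_type"
).fetchall()
```

The point is the shape of the pipeline: unstructured documents become structured rows, and from there ordinary SQL-based reporting takes over.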

Eureka – It’s Emotional!

Cliff LaCoursiere, co-founder and SVP of business development at CallMiner, explained to me that CallMiner’s Eureka! product uses speech analytics algorithms to identify speech characteristics such as tempo, silence, or stress in call recordings, producing an emotional assessment of the person on the phone.

Apparently, its algorithms measure the tension in a person’s voice by first baselining it and then measuring any increase or decrease. For example, the sentence “You’ve been a big help” could mean you’ve actually been a big help or, said sarcastically, that you’ve been worse than useless. Without measuring the emotional content of the call, you may think you know what was said when you don’t. And even if a transcript clearly indicates that a customer is satisfied, you don’t know how satisfied.
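The baseline-then-measure idea can be illustrated numerically. This sketch is purely my own illustration, not CallMiner's algorithm: it assumes some per-second "stress" proxy (pitch variance, say) has already been computed, baselines it on the opening samples, and reports later samples as deviations from that baseline:

```python
from statistics import mean, stdev

def stress_deltas(series, baseline_n=5):
    """Baseline a per-second stress proxy on the first baseline_n samples,
    then report each later sample as a z-score against that baseline.
    Large positive values suggest rising tension relative to the
    speaker's own normal level."""
    base = series[:baseline_n]
    mu, sigma = mean(base), stdev(base)
    return [(x - mu) / sigma for x in series[baseline_n:]]
```

Because each speaker is compared only to their own baseline, a naturally loud or fast talker is not flagged as tense; only a change from their norm is.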

The Partnership

It’s clear that merging text and audio analysis will provide a wider and deeper understanding of customer interactions. For obvious reasons, CallMiner is attracting attention and customers in the call center space. Clarabridge already has traction in customer experience monitoring – tackling the key question “How are my customers feeling and acting?” So the partnership is clearly a natural one.

In a study we recently performed at Hurwitz & Associates, Customer Care was the top application for companies planning to implement text analytics solutions. Customer care applications use text analytics to gather information about product feedback (“voice of the customer”), customer sentiment, customer satisfaction, retention and churn, market intelligence, or reputation and brand management. So this partnership is a sensible next step in unifying communications channels and understanding comments in context. We expect it to make an impact in the customer care world.

