My two cents on the 2008 Text Analytics Summit

I'm adding my thoughts to those of others who have blogged about the Summit. See the Seth Grimes posting for all of the other comments.

  • I too agree that it was great to see a large number of end users at the Summit this year. I was especially interested in the fact that a number of them were in the investigation phase. And what were many investigating? You got it – Voice of the Customer (VoC) and the closely intertwined area of sentiment analysis.
  • VoC was a major theme. In fact, I was overwhelmed by the number of talks in the area. I thought that the presentation about what Gaylord Hotels is doing with text analytics and VoC was extremely interesting. Tony Bodoh took the audience through a journey beginning with the fact that before text analytics it would take the company weeks to even see customer comments. He said that the working pilot the company did in conjunction with Clarabridge took only 10 weeks. He reported benefits in process improvements, value-oriented marketing, and facility improvements. He even told us about reticular activating systems! Go look that one up.
  • I also met a number of people from new start-ups in the sentiment space, each taking a slightly different angle. I wonder if sentiment analysis will become a confusing space in the near future.
  • There was some discussion at the Summit about text analytics and Web 2.0. I would have liked to hear more about this, as text analytics will be important in Web 2.0.
  • And speaking of important, another interesting concept was brought up several times – the idea that text analytics will morph to become part of something bigger. I don’t want to say component, although others did.
  • I was hoping to hear more about text analytics and content management. At one point, during the expert panel, I had the chance to ask the audience if anyone was deploying their text analytics in conjunction with their content management systems. A handful responded affirmatively. Unfortunately, I didn’t catch up with them. If you’re reading this and are deploying text analytics with your ECM system, I’d like to hear from you.

All in all, though, I was impressed with the Summit.

Text Analytics and the Predictive Enterprise

I recently had the chance to get an update on what SPSS is up to in text analytics.  It was an interesting conversation for several reasons:


  • First, it highlighted an important point about text analytics, one we know but that is worth repeating: the analysis of unstructured data can be more useful, in many scenarios, when accompanied by structured data.
  • Second, it got me thinking more about social media/network analysis, which prompted the question on the “Four questions about innovations in analysis” blog I recently posted.



A few words of background.  SPSS’s goal is to help its customers analyze everything about data associated with people – behavior, attitudes, and so on – to help an organization understand anyone it interacts with.  In fact, Olivier Jouve, VP of Corporate Development at SPSS, was quite clear that SPSS is not a BI company.  Rather, SPSS software helps to enable what SPSS refers to as the “Predictive Enterprise”.  The Predictive Enterprise makes use of analytics (not simply reports) to help manage multiple dimensions across the enterprise including customer intimacy, product placement, and even operational issues such as fraud.


SPSS offers a suite of text-mining products that is based on 25 years of research in the application of natural language processing (NLP) technologies. In 2002, SPSS bought LexiQuest™, a linguistics-based text-mining company, intending to combine LexiQuest’s extraction capabilities with SPSS’s data-mining capabilities in order to strengthen the company’s position in predictive analytics. All of SPSS’s text-analytics products now share this same core linguistic functionality.



It’s not just about text


While the market for text analytics has moved out of the early adopter stage, depending on what type of analysis you’re trying to accomplish, it often is not just about the text.




For example, consider the following churn scenario:  A telecommunications company is concerned about churn.  The company realizes that it has a wealth of information at its disposal to help predict churn.  On the structured data side, it has collected demographic, usage, trouble-ticket, and product information about each of its customers.  On the unstructured side, it has collected call center notes, emails, and customer satisfaction surveys.  The company decides to invest in text analytics software that can sift through its call center notes, emails, and survey notes. At the end of the exercise, the company has some great insight into customer complaints that it can certainly act on.  However, it has not exactly gotten the information it might need to solve the churn problem.  To do that, it is probably more useful to marry the unstructured information from the call centers, surveys, and emails to an actual customer and to all of the structured information about that customer.  This way, using some predictive modeling, the company can train its system to zero in on those customers who are likely to drop its service and make the right decisions to help retain them.
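To make the churn scenario a bit more concrete, here is a minimal Python sketch of the idea. It is an illustration under stated assumptions, not how SPSS or any other vendor does it: the complaint keyword list, the field names (`trouble_tickets`, `monthly_minutes`, `churned`), and the four sample customers are all made up, and simple keyword counting stands in for real linguistic extraction. The sketch turns call center notes into one text-derived feature, joins it to the structured record on the customer ID, and fits a small logistic regression on the combined features.

```python
import math
from collections import defaultdict

# Hypothetical complaint vocabulary (assumption). A real text-analytics product
# would extract concepts and sentiment with far more linguistic sophistication.
COMPLAINT_TERMS = {"cancel", "dropped", "outage", "overcharged", "unhappy"}

def complaint_counts(notes):
    """Count complaint terms per customer from (customer_id, text) pairs."""
    counts = defaultdict(int)
    for customer_id, text in notes:
        words = text.lower().replace(",", " ").replace(".", " ").split()
        counts[customer_id] += sum(1 for w in words if w in COMPLAINT_TERMS)
    return counts

def build_features(customers, notes):
    """Join structured fields with the text-derived count on customer ID."""
    counts = complaint_counts(notes)
    return {
        c["id"]: [
            c["trouble_tickets"],           # structured
            c["monthly_minutes"] / 1000.0,  # structured, crudely scaled
            counts.get(c["id"], 0),         # derived from unstructured text
        ]
        for c in customers
    }

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def churn_probability(model, x):
    """Score one feature vector with a fitted (weights, bias) pair."""
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def train_logistic(X, y, epochs=2000, lr=0.1):
    """Fit a plain logistic regression with batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = churn_probability((w, b), xi) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Made-up sample data (assumption), far too small for a real model.
customers = [
    {"id": 1, "trouble_tickets": 4, "monthly_minutes": 200, "churned": 1},
    {"id": 2, "trouble_tickets": 0, "monthly_minutes": 900, "churned": 0},
    {"id": 3, "trouble_tickets": 3, "monthly_minutes": 300, "churned": 1},
    {"id": 4, "trouble_tickets": 1, "monthly_minutes": 800, "churned": 0},
]
notes = [
    (1, "Customer unhappy, threatened to cancel after outage."),
    (2, "Asked about upgrading plan."),
    (3, "Says she was overcharged and calls dropped."),
    (4, "Requested paper billing."),
]

features = build_features(customers, notes)
X = [features[c["id"]] for c in customers]
y = [c["churned"] for c in customers]
model = train_logistic(X, y)

# Score two hypothetical customers: [tickets, scaled minutes, complaint terms].
at_risk = churn_probability(model, [3, 0.25, 2])
loyal = churn_probability(model, [0, 0.85, 0])
print(f"at-risk customer: {at_risk:.2f}, loyal customer: {loyal:.2f}")
```

The point of the sketch is the join in `build_features`: the complaint count only becomes useful for prediction once it sits in the same row as the customer's structured data and a known churn outcome.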


According to SPSS, many of its customers have seen upwards of a 50% reduction in churn by combining data mining with text mining.


Social media is becoming an important source of information for companies


What about other forms of media such as blogs, message threads, etc.?  SPSS is also moving into social network/media analysis because, as Olivier said, “The number of people participating in Web 2.0 activities is growing rapidly across all age groups, and businesses are using the direct influence they have traditionally had over customers’ decisions about their products.  Peer to Peer networks are now a trusted source of insight and information.”  This is quite true.  Our recent Hurwitz & Associates survey confirmed that companies do plan to make use of the information found in various kinds of social networks, even if they don’t think they are making use of text analytics.  One interesting point on this front is that blogs, message boards, etc. do provide a great source of information on customer sentiment, opinions, etc. The challenge will be mapping this kind of information back to the other information that a company keeps about its customers, and making sense of the behaviors.  I look forward to hearing more about what SPSS is doing to help solve this problem.


Four Questions about Innovations in Analysis

Several weeks ago, Hurwitz & Associates deployed a short survey entitled “Four questions about innovations in analysis”.  Well, the results are in and they are quite interesting!




First, a few words about the survey itself and who responded to the survey.


  1. We wanted to make the survey short and sweet.  We were interested in what kinds of analytical technology companies thought were important and specifically how companies were using text analytics to analyze unstructured information.  Finally, since there has been a lot of buzz about analyzing social media, we asked about this as well.
  2. Let me say up front that given the nature of our list, I would categorize most of the respondents to the survey as fairly technology-savvy.  In all, 61 people responded to the survey; 32% of these respondents were from high technology companies.  The other verticals included professional services, followed by manufacturing, financial/insurance, healthcare and pharmaceutical. There were also some responses from governmental agencies, telecommunications and energy companies.  So, while the results are unscientific in terms of a random sample across all companies, they probably do reflect the intentions of potential early adopters, although not in a statistically significant manner.
  3. In analyzing the results, I first looked at the overall picture, then examined individual verticals and filtered the results by other attributes (such as those using text analytics vs. those not using the technology) to get a feel for what these companies were thinking about and whether one group was different from another.  These subgroups are, of course, quite small and the results should be viewed accordingly.


The importance of innovative technologies

We first asked all of the respondents to rate a number of technologies in terms of importance to their companies.  Figure 1 shows the results.  Overall, most of these technologies were at least somewhat important to this technology-savvy group, with query and reporting leading the pack.  This isn’t surprising.  Interestingly, OLAP data cubes appeared to be the least important analytical technology – at least with this group of respondents.  Other technologies, such as performance management, predictive modeling, and visualization, ranked fairly high as well.  Again not surprisingly, text analytics ranked lower than some of the other technologies, probably because it is just moving out of the early adopter stage.  Some of the respondents from smaller firms had no idea what any of these technologies were.  And, in terms of text analytics, one company commented, “yeekes, this must be big time company kind of stuff. Way up in the clouds here, come down to earth.” They, no doubt, are still using Excel and Access for their analytical needs.  Other smaller companies were very interested in “non-cube” technologies such as some of the visualization products on the market today.


