Five reasons to use text analytics

I just started writing a blog for AllAnalytics, focusing on advanced analytics. My first post outlines five use cases for text analytics: voice of the customer, fraud detection, warranty analysis, lead generation, and customer service routing. Check it out.

Of course, there are many more use cases for text analytics. On the horizontal-solutions front, these include enhancing search, survey analysis, and eDiscovery. The list is even longer on the vertical side: medical analysis, other scientific research, government intelligence, and so on.

If you want to learn more about text analytics, please join me for my webinar on Best Practices for Text Analytics this Thursday, April 29th, at 2pm ET. You can register here.

Top of Mind – Data in the Cloud

I attended Cloud Camp Boston yesterday. It was a great meeting with some good discussions; several hundred people attended. What struck me about the general session (when all attendees were present) was that there was a lot of interest around data in the cloud. For example, during the “unpanel” (where people become panelists in real time), half of the questions up for grabs (5 of 10) dealt with data in the cloud. That’s pretty significant.

  • How do I integrate large amounts of enterprise data in the cloud? (answers ranged from traditional integration approaches to newer vendor technologies)
  • How do I move my enterprise data into the cloud? (answers included shipping it via FedEx on a hard drive, with a proven chain of custody around the transfer)
  • How do I ensure the security of my data in the cloud? (no answer – that deserved its own breakout session)
  • What is the maximum sustained data transfer rate in the cloud? (answers included “when it takes a server down” and “no one knows,” though someone mentioned that a year ago 8 gigabytes per second took down a cloud provider)
  • How do applications (and data) interoperate in the cloud? (answers included that standards need to rule)

There were some interesting breakout sessions as well: one on the aforementioned security (and audit) topic, an intro to cloud computing (moderated by Judith Hurwitz), one about channel strategies, and a number of others. I attended a breakout session about analytics and BI in the cloud, and again, for obvious reasons, much of the discussion was data-centric. Some of the discussion items included:

  • What public data sets are available in the cloud? 
  • What is the data infrastructure needed to support various kinds of data analysis? 
  • What SaaS vendors offer business analytics in the cloud? 
  • How do I determine what apps/data make sense to move to the cloud?

The upshot? Data in the cloud – moving it, securing it, accessing it, manipulating it, and analyzing it – is going to be a hot topic in 2010.

Analyzing Data in the Cloud

I had an interesting chat last week with Roman Stanek, CEO of Good Data, about the debate over data security and reliability in the cloud. For those of you who are not familiar with Good Data, it provides a collaborative business analytics platform as a SaaS offering.

The upshot of the discussion was something like this:

The argument over data security and reliability in the cloud is the wrong argument. It’s not just about moving your existing data to the cloud. It’s about using the cloud to provide a different level of functionality, capability, and service than you could obtain using a traditional premises solution – even if you move that solution to the “hosted” cloud.

What does this mean? First, companies should not simply be asking the question, “Should I move my data to the cloud?” They should be thinking about the new capabilities the cloud provides as part of the decision-making process. For example, Good Data touts its collaborative capabilities and its ability to do mash-ups and certain kinds of benchmarking (utilizing external information) as differentiators from standard premises-based BI solutions. This leads to the second point: a hosted BI solution is a different animal than a SaaS solution. For example, a user of Good Data could pull in information from other SaaS solutions as part of the analysis process. This might be difficult with a vanilla hosted solution.

So, when BI users think about moving to the public cloud, they need to assess the risk vs. the reward of the move. If they are able to perform certain analyses that they couldn’t perform via a premises model, and this analysis is valuable, then any perceived or real risk might be worth it.

Collaborative BI and Good Data

Collaborative BI is a trend that is beginning to gain momentum. The idea behind it is to make it easier for people in a company (or even outside the company) to work together to analyze data and share their analyses. There are several ways to accomplish this goal, from the simple to the more complex. The simplest is to provide an easy way to email an analysis. Another is to disseminate analyses via a web-based interface. Or, multiple users can have access to the same web-based interface and use it to collaborate on a particular analysis project, sharing analyses and commenting on them.


The latter is the way in which a new entrant into the BI space, Good Data, is addressing the issue. I recently had an interesting conversation with Roman Stanek, CEO and founder of Good Data. The company will be releasing the full beta version of its SaaS solution in November.


So, what is Good Data all about? Good Data is a collaborative BI solution targeted at mid-sized companies. The service allows users to upload structured data, analyze it, and share and iterate on it. The company provides a beta version with sample data already loaded, so prospective users can get a feel for the service. Here’s what the service enables you to do:


  • Upload the data: Good Data will take a dump of a database in a flat-file format (such as CSV) and infer a model from the flat file. If the user doesn’t agree with the model, he or she can fix it. The file size is technically unlimited, but practically speaking it can be as large as several gigabytes. The user can upload as many files as he or she likes.
  • Analyze the data: Good Data then allows users to slice and dice the data and report on it. Charts are also available, and the service handles time-series data.
  • Share and iterate on the data: Good Data provides a wiki-like interface so users can share their reports and charts, and other users can comment on them.
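To make the model-inference step in the upload process more concrete, here is a minimal sketch of how a tool might guess column roles (dates, numeric facts, descriptive attributes) from a sample of a CSV file. This is my own toy illustration of the general idea, not Good Data's actual algorithm; all function names and the date formats checked are invented for the example.

```python
import csv
from datetime import datetime

def infer_column_type(values):
    """Guess a column's role from sample values: date, numeric fact, or attribute."""
    def is_number(v):
        try:
            float(v)
            return True
        except ValueError:
            return False

    def is_date(v):
        # Only two common formats checked here, purely for illustration.
        for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
            try:
                datetime.strptime(v, fmt)
                return True
            except ValueError:
                pass
        return False

    non_empty = [v for v in values if v.strip()]
    if non_empty and all(is_date(v) for v in non_empty):
        return "date"
    if non_empty and all(is_number(v) for v in non_empty):
        return "fact"        # numeric measure to aggregate
    return "attribute"       # dimension / label to slice by

def infer_model(path, sample_rows=100):
    """Read the header plus a sample of rows and propose a simple model."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        sample = [row for _, row in zip(range(sample_rows), reader)]
    columns = list(zip(*sample)) if sample else [[] for _ in header]
    return {name: infer_column_type(list(col)) for name, col in zip(header, columns)}
```

A user-correction step, as described above, would then let someone re-tag a column the heuristic got wrong (say, a zip code inferred as a numeric fact).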



The service is really pre-beta at this point, so it doesn’t have all the bells and whistles. However, it was enough to get an idea of the direction the company is going with the service. The beta comes with a data set about the food industry already installed, so I can’t comment on how easy it is to actually upload data or what happens if there is something wrong with the data. It was easy, however, to slice and dice the data to get reports and charts. There is a nice comment and annotation feature included in the service, where the user can comment on a report authored by another person. The service also enables you to organize your analyses by category, and you can see easily enough what updates have occurred on an analysis. The reports and plots can be exported to PDF or XLS format. Here’s a screenshot from the reports web page:

[Screenshot: the Good Data reports web page]
The user starts a project and invites others to join. These people can generate reports that others can see (for example, I created some reports and Peter Olmer created others). The reports are arranged by date and creator. They can also be filed into one of the spaces on the left. If we look at the Sales by Store report, we would see the following:

[Screenshot: the Sales by Store report]

This report, of course, might open up a whole line of questions about each store’s performance (note the comment I inserted). The model that Good Data uses to capture the data allows the user to drill down into any of these stores for additional information about brands sold, staff, and anything else that has been captured at the store level. This enables further slicing and dicing of the data.


Collaborative BI

Over the last year, I’ve seen a number of start-ups addressing various aspects of BI and trying to provide user-friendly solutions for small and mid-sized companies. This is a good thing: the more people are exposed to data and analysis, the more comfortable they’ll be with it. I like the collaborative idea behind Good Data and some of the tools it has already built into the service. The company is planning the following set of collaborative features for the first release:

  • Collaboration around the definition of reports
  • Collaboration around the data model and the definition of attributes/facts
  • Collaboration around data and data quality
  • A homepage feed of project activity
  • Alerts and notifications when data reaches pre-defined thresholds
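The last planned feature, alerts when data reaches pre-defined thresholds, boils down to evaluating simple rules against incoming data points. Here is a minimal sketch of that pattern; it is an illustration of the general mechanism only, not Good Data's API, and every name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class AlertRule:
    """A hypothetical threshold rule: metric name, trigger condition, message."""
    metric: str
    predicate: Callable[[float], bool]   # e.g. lambda v: v < 500
    message: str

def check_alerts(rules: List[AlertRule], datapoint: dict) -> List[str]:
    """Return a notification message for every rule the data point triggers."""
    alerts = []
    for rule in rules:
        value = datapoint.get(rule.metric)
        if value is not None and rule.predicate(value):
            alerts.append(f"{rule.message} ({rule.metric}={value})")
    return alerts

# Example: flag any day where sales fall below a pre-defined floor.
rules = [AlertRule("daily_sales", lambda v: v < 500, "Sales below threshold")]
```

In a real service, the returned messages would of course be pushed to the homepage feed or emailed rather than just returned from a function.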


I’m looking forward to seeing the full beta and the real thing.   


Hear My Voice!

I’ve been writing a lot about text analytics because I think it is a critical technology for deriving insight from unstructured data. Late last summer, Hurwitz & Associates published a report on text analytics. As part of our research, we surveyed companies that had deployed the technology, were planning to deploy it, or had no plans to deploy it. We asked companies planning to implement the technology, as well as those that had already deployed it, what kinds of applications they had deployed or were considering deploying.

Voice of the Customer Rules

The top response was “customer care” applications, which include using text analytics to gather information about products, customer sentiment, customer satisfaction, retention and churn, or reputation and brand management. In fact, close to 70% of respondents cited this application as one they had either already deployed or were planning to deploy in the next year.

It was no surprise, then, when I spoke with Michelle DeHaaff, VP of Marketing at Attensity, to learn that Voice of the Customer (perhaps a broader and better term than customer care) is rapidly becoming a main focus area for the company, and an area where it has gotten a lot of traction. For those of you who aren’t familiar with Attensity, its flagship technology uses what it terms “exhaustive extraction™,” which automatically extracts facts from parsed text (who did what to whom, when, where, under what conditions) and organizes this information. Attensity believes that this technique sets its solutions apart from competitors’ products because it doesn’t require extensive knowledge engineering; there is no need to develop rules or taxonomies.

What does this mean for Voice of the Customer applications? Attensity provides software to analyze “traditional” unstructured information such as call center notes, customer emails, and survey responses, as well as unstructured information in blogs and web forums – a rich new source of first-person feedback. Using exhaustive extraction, customer feedback is dissected to analyze sentiment, root cause, and, more generally, what customers are talking about.
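For readers new to the field, it may help to see the simplest possible form of sentiment mining. The toy scorer below counts positive and negative words in a piece of feedback. To be clear, this is nothing like Attensity's exhaustive extraction, which performs a deep linguistic parse rather than word counting; the word lists here are invented purely for illustration.

```python
# Invented mini-lexicons for the example; real systems use far richer resources.
POSITIVE = {"great", "love", "excellent", "fast", "helpful"}
NEGATIVE = {"broken", "slow", "terrible", "hate", "refund"}

def sentiment_score(text: str) -> int:
    """Count positive words minus negative words; > 0 leans positive."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
```

The gap between this and fact extraction ("who did what to whom") is exactly why the rule-free parsing Attensity claims is a meaningful differentiator: a word counter cannot tell "the agent fixed the broken phone" from "the agent broke the phone."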

Voice of the Customer as a Service

And Attensity just announced a new service directly addressing this market: Attensity Voice of the Customer on Demand, a Software as a Service offering that allows companies to supply Attensity with their unstructured information and get back analysis via a web-based application. The fee for the service is based on the data source and the size of the data. The service provides:

  • Sentiment, root cause, and a set of analysis and reporting tools to dig deeper and ask more questions about the data.
  • Published “out of the box” reports on customer sentiment, top issues (by product, customer segment, date, region) and root cause.
  • The ability to validate issues discussed by customers online via blogs and web forums with data reported by customers via email into a service center.

Insight from both inside and outside the company

Think about it. Companies can now analyze both internal and external sources of unstructured information to gain better insight about their markets. By tapping into external sources of customer voice, such as blogs and web forums, a company can understand how its competitors are faring as well as how its own brand is holding up. This is exciting stuff.

