Collaborative BI is a trend that is beginning to gain momentum. The idea behind it is to make it easier for people in a company (or even outside of the company) to work together to analyze data and share analysis. There are several ways to accomplish this goal, from the simple to the more complex. The simplest is to provide an easy way to email an analysis. Another is to disseminate analyses via a web-based interface. Or, multiple users can have access to the same web-based interface and use it to collaborate on a particular analysis project, sharing analyses and commenting on them.
The latter is the way in which a new entrant into the BI space, Good Data, is addressing the issue. I recently had an interesting conversation with Roman Stanek, CEO and founder of Good Data. The company will be releasing the full beta version of its SaaS solution in November.
So, what is Good Data all about? Good Data is a collaborative BI solution targeted at medium sized companies. The service allows users to upload structured data, analyze it, share and iterate on it. The company provides a beta version with sample data already loaded, so prospective users can get a feel for the service. Here’s what the service enables you to do:
- Upload the data: Good Data will take a dump of a database in a flat file format (such as CSV) and infer a model from the flat file. If the user doesn’t agree with the model, he or she can fix it. There is no stated limit on file size; practically speaking, a file can be as large as several gigabytes. The user can upload as many files as he or she likes.
- Analyze the data: Good Data then allows users to slice and dice the data and report on it. Charts are also available and it handles time series data.
- Share and iterate on the data: Good Data provides a wiki-like interface so users can share their reports and charts and other users can comment on them.
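To make the upload step more concrete, here is a minimal sketch of what "inferring a model from a flat file" might look like. This is not Good Data's implementation, which I haven't seen; the function names (infer_type, infer_model), the four type labels, and the sample data are all my own assumptions for illustration.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Guess a column type from its non-empty string values."""
    non_empty = [v for v in values if v and v.strip()]
    if not non_empty:
        return "text"

    def all_parse(parser):
        # True only if every value in the column parses without error.
        try:
            for v in non_empty:
                parser(v)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "decimal"
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "date"
    return "text"


def infer_model(csv_text):
    """Return {column name: guessed type} for CSV text with a header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    return {col: infer_type([row[col] for row in rows]) for col in columns}


# Hypothetical sample data, not taken from the Good Data beta.
SAMPLE = """store,sale_date,units,revenue
Boston,2008-09-01,12,240.50
Chicago,2008-09-02,7,131.00
"""

model = infer_model(SAMPLE)
# A user who disagrees with a guess can simply override it.
model["units"] = "decimal"
```

The override on the last line mirrors the "fix it if you don't agree" step: the inferred model is only a starting point that the user can correct.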
The service is really pre-beta at this point, so it doesn’t have all the bells and whistles. However, it was enough to get an idea of the direction that the company is going with the service. The beta comes with a data set about the food industry already installed, so I can’t comment on how easy it is to actually upload data or what happens if there is something wrong with the data. It was easy, however, to slice and dice the data to get reports and charts. There is a nice comment and annotation feature included in the service where the user can comment on a report authored by another person. The service also enables you to organize your analyses by category, and it is easy to see what updates have been made to an analysis. The reports and plots can be exported to PDF or XLS format. Here’s a screenshot from the reports web page:
The user starts a project and invites others to join. These people can generate reports that others can see (for example, I created some reports and Peter Olmer created some other reports). The reports are arranged by date and who created them. They can also be filed into one of the spaces on the left. If we look at the Sales by Store report we would see the following:
This report, of course, might open up a whole line of questions about each store’s performance (note the comment I inserted). The model that Good Data uses to capture the data allows the user to drill down into any of these stores for additional information about brands sold, staff, and any additional information that has been captured at the store level. This enables further slicing and dicing of the data.
Over the last year, I’ve seen a number of start-ups addressing various aspects of BI and trying to provide user-friendly solutions for small and mid-size companies. This is a good thing. The more people are exposed to data and analysis, the more comfortable they’ll be with it. I like the collaborative idea behind Good Data and some of the tools that it has already built into the service. The company is planning the following set of collaborative features for the first release:
· Collaboration around definition of report
· Collaboration around data model and definition of attributes/facts
· Collaboration around data/data quality
· Homepage feed of project activity
· Alerts & Notifications when data reaches pre-defined thresholds.
I’m looking forward to seeing the full beta and the real thing.
I had the opportunity last week to learn more about Leximancer, a text analytics company with its roots in Australia. Leximancer recently moved some of its operations to the U.S. and I had a very interesting conversation with CEO Neil Hartley about the company.
Leximancer was founded in 2005. Its technology is based on seven years of research by Dr. Andrew Smith of the University of Queensland. The product employs a mainly statistical approach to text analytics. You simply feed the software unstructured text and the corpus is analyzed and a concept map is generated. These concept maps display main concepts, their relationship with other concepts, and emergent themes.
Here is an example of a concept map that was created from customer survey data from two different quarters:
This concept map displays some key themes associated with Q1 and Q2 survey responses (File Q1, File Q2). In Q1 (on the left), the major theme was “slow” and in Q2 the major theme was “experience”. These themes suggest that there was a problem with customer service in Q1. Other themes that are clustered around these themes provide more insight, as do the other concepts that make up the themes (e.g. slow and difficult associated with the slow theme) and the pathways between these concepts. For example, in Q1, the concepts such as slow, difficult, reporting, system, tools indicate that there may have been a problem with some of the company’s customer service. During Q2 the concepts better, quality, team appear to indicate that things are improving. However, it would be important to dive into the actual text associated with these concepts and pathways to determine if this is actually the case. Leximancer lets the user do this quite easily.
What is interesting about the approach is that it does not require any real set up. You simply submit your documents and generate a concept map. The simplified process goes something like this:
The documents are submitted -> Stop words (e.g. a, the) are omitted -> A keyword count is performed -> The corpus is then broken up into segments and the co-occurrence of words is determined, such that the resultant concepts represent a thesaurus of words that travel through the text together.
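The simplified process above can be sketched in a few lines of code. To be clear, this is a toy illustration of stop-word removal, keyword counting, and co-occurrence counting, not Leximancer's actual algorithm. The stop-word list, the choice of sentences as segments, and the names tokenize and cooccurrence are all my own assumptions.

```python
import re
from collections import Counter
from itertools import combinations

# A tiny, hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "the", "is", "was", "were", "are", "and", "to", "of", "it"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def cooccurrence(text, top_n=2):
    """Count keywords, then count how often the top_n keywords share a segment.

    Assumption: each sentence is treated as one segment.
    """
    segments = [tokenize(s) for s in re.split(r"[.!?]+", text) if s.strip()]
    keyword_counts = Counter(w for seg in segments for w in seg)
    keywords = {w for w, _ in keyword_counts.most_common(top_n)}
    pairs = Counter()
    for seg in segments:
        present = sorted(set(seg) & keywords)
        pairs.update(combinations(present, 2))
    return keyword_counts, pairs


# Made-up survey comments in the spirit of the Q1 example.
TEXT = (
    "The system was slow and difficult. "
    "The reporting tools were slow and difficult. "
    "The team is better."
)

counts, pairs = cooccurrence(TEXT)
```

Here "slow" and "difficult" are the two most frequent keywords, and they appear together in two segments, which is the kind of signal that would pull them into the same theme on a concept map.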
Of course, some thought needs to go into the analysis that you want to perform so that you are feeding the system relevant information and getting useful concepts back.
Customer Insight Portal
Neil was nice enough to provide me with an account to Leximancer’s Customer Insight Portal – a SaaS offering. The portal is very easy to use. You simply login and then tell the system the files you would like to analyze. You can upload internal documents or specify the URL(s) you would like to mine. Once the analysis is complete, you can then drill in and out of the concepts and highlight the pathways between concepts.
I decided to explore the news about the financial crisis. I input two popular financial websites into the insight portal and got out a concept map that looks like this. Note that this is a piece of the concept map. You can see various themes – crisis, voters, falls, credit and so on. Associated with each theme are a number of concepts. For example, the economic crisis theme has concepts such as confidence, stock, unemployment, banking, economy and so on associated with it. The falls theme has information associated with the Dow as well as concepts around jobs and seats. I was interested to understand the seats concept and its relationship to the economic crisis, so I highlighted the path.
In a separate window (not shown here), all of the articles related to the concept path are highlighted. It then became obvious from the articles that, given the financial crisis, the Democrats stand to gain more seats in the Senate and lock up a 60-seat, filibuster-proof majority.
Customer Insight allows businesses to aggregate customer feedback and analyze it in order to get to the root cause. This feedback can be from surveys, blogs, forums, and so on. Leximancer also offers the Lexbox – a feedback widget that companies can insert on their own websites to use as an additional source of information.
Leximancer has about 200 customers, mostly in the educational and government space. Police departments, for example, are using the software to connect people and events as well as using it to perform social network profiling. Leximancer is also beginning to branch out to other verticals. It is looking to pursue a predominately OEM strategy, which is a good idea. Some of the vendors it will partner with will probably use the concept maps directly (depending who their audience is). Others will take the output from the maps and use it in another way. I plan to do some further analysis using the customer insight portal and will provide my additional feedback then.
I recently had the opportunity to speak with Jeff Catlin, CEO of Lexalytics Ltd., about the Lexalytics/Infonic merger. Although the merger occurred several months ago, it was actually good timing, because Jeff could explain a bit more about what is happening with the newly merged company, what the products look like, where the company is heading and so on.
For those of you not familiar with the two companies, Lexalytics is a five-year-old firm best known for sentiment analysis. In fact, its technology is embedded in a number of online services that deal with customer sentiment and reputation management, including Cymfony. It also OEMs its software to some well-known search vendors such as Fast (now Microsoft). Lexalytics merged with the text analytics division at Infonic in late July 2008 in order to gain momentum in the market. Infonic, a publicly traded UK-based company previously named Corpora plc, focuses on document management and other software to enable organizations to capture and share information. On the text analytics side, it has several large customers from the financial services vertical including Thomson Reuters and Dow Jones Factiva.
Lexalytics Ltd. now offers several products to the market. These include:
- Salience: This is the core analytics product upon which the other products are based. It enables entity extraction, relationship extraction, sentiment analysis, and document summarization. It also provides pronoun handling, which means that the software can recognize, for example, that “He” in the sentence “He is a great leader” refers to the previously mentioned “John Smith.” It includes a series of entity libraries that contain people, companies, and brands. The company also provides a sentiment toolkit to enhance a sentiment dictionary. This means that users can input their own sentiment rules to pick up phrases specific to their industry; for example, “missed expectations” is negative in the financial services sector.
- Acquisition Engine: This tool gathers content from news feeds, blogs, websites or local file systems. It includes a context free HTML cleaner that strips off all of the navigation bars, ads, and so on that can be found on web pages so the data that gets to the text analytics engine is cleaner. It actually uses the Salience Engine to grab the data. It can also gather data from structured ODBC compliant databases.
- Analytics Tool Kit: This tool kit enables the user to build visualizations by embedding information into office products such as Excel and PowerPoint.
- Classifier: This tool buckets content. The company offers three classification methods: keyword, query, or training.
Lexalytics Ltd. plans to release version 4.0 of Salience in October. This upgraded engine is a hybrid of the two text analytics products. For example, it will incorporate Infonic’s tonal analyzer, which would be able to assign a positive sentiment to a phrase such as “disgustingly pretty” rather than a neutral score (i.e. disgusting = negative, pretty = positive). It will also provide new functionality such as improved extraction for entities such as people, companies, and brands, as well as the ability to produce meta-themes (i.e. concepts such as computer hardware or software).
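To illustrate the difference between word-by-word scoring and a tone-aware approach, here is a toy sketch. It is not the Salience engine or Infonic's tonal analyzer; the word scores, the intensifier list, the 1.5 weighting, and the function names naive_score and tonal_score are all invented for this example. It also shows a user-supplied phrase rule of the "missed expectations" kind.

```python
# Hypothetical sentiment dictionary: word -> polarity.
WORD_SCORES = {
    "pretty": 1.0,
    "great": 1.0,
    "disgusting": -1.0,
    "disgustingly": -1.0,
    "weak": -1.0,
}

# Adverbs assumed to amplify the next sentiment word instead of
# contributing their own polarity.
INTENSIFIERS = {"disgustingly", "terribly", "awfully"}

# User-defined, industry-specific phrase rules.
PHRASE_SCORES = {"missed expectations": -1.0}


def naive_score(text):
    """Sum the polarity of each word independently."""
    return sum(WORD_SCORES.get(w, 0.0) for w in text.lower().split())


def tonal_score(text):
    """Score phrases first, then let intensifiers amplify the following word."""
    lowered = text.lower()
    score = 0.0
    for phrase, value in PHRASE_SCORES.items():
        score += value * lowered.count(phrase)
        lowered = lowered.replace(phrase, " ")
    words = lowered.split()
    i = 0
    while i < len(words):
        nxt = words[i + 1] if i + 1 < len(words) else None
        if words[i] in INTENSIFIERS and nxt in WORD_SCORES:
            score += 1.5 * WORD_SCORES[nxt]  # amplify, keep the adjective's sign
            i += 2
        else:
            score += WORD_SCORES.get(words[i], 0.0)
            i += 1
    return score


naive = naive_score("disgustingly pretty")
tonal = tonal_score("disgustingly pretty")
finance = tonal_score("the company missed expectations")
```

The naive scorer nets "disgustingly pretty" out to neutral, while the tone-aware version treats the adverb as an amplifier and returns a positive score. Note that this sketch does no punctuation handling, so it only works on clean, whitespace-separated text.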
Challenging other players?
My initial impression was that, since Lexalytics 1) puts a big emphasis on pulling data from websites, blogs and other online sources and 2) seems to have a heavy focus on entities such as people, companies, and brands, it could compete with other pure-play vendors including Clarabridge and Attensity on some deals, but not necessarily on deals that involve sifting through call center notes or customer surveys. I asked Jeff about this. His response was that although the company hasn’t focused on call center and customer surveys over the last few years, it is now starting to see an increase in interest in both of these areas. He said that while the company may not directly challenge players like Clarabridge, it may partner with others that sell solutions in these spaces.
Sentiment analysis is a hot area right now in text analytics. In a short survey Hurwitz & Associates conducted this past summer, Voice of the Customer and Competitive Intelligence were the top two areas of interest noted by end-users planning to deploy text analytics. Both of these would utilize sentiment analysis. There are a number of new players entering the market that are focused specifically on sentiment analysis. Some are still quite small but competition in this space will no doubt increase. The merger of Lexalytics with Infonic should help the combined company compete more effectively because it expands its footprint and enhances its capabilities.
I recently had the opportunity to speak with Michael DeNitto and Mark Budreski from MarketSight about enhancements to the MarketSight platform and future plans. MarketSight is a web-based survey analysis tool. I use it as part of my own toolkit at Hurwitz & Associates. While I am a fairly sophisticated data analyst, what is nice about MarketSight is that you don’t necessarily have to be an expert to use it to analyze your survey results.
What does this mean? Suppose you are a product manager and you want to get some feedback about your product from your customers. You deploy an online survey to hundreds of customers. The data consists of a series of responses to various kinds of questions and might include demographic information, ranking and rating various product features (for example, as need to have, nice to have, don’t need to have or on a scale of 1-5) and so on. The analysis might necessitate determining which features are most important and whether there is a significant difference between groups of customers. MarketSight was developed to provide an easy to use data analysis tool to enable business users to quickly and iteratively analyze this kind of data, without having to wait for IT (or an expert) to help them do so. This helps to reduce costs, provides flexibility, and frees up researchers to do more sophisticated analysis.
MarketSight recently released version 7.0, which provides some new enhancements to the product. Along with these new enhancements, MarketSight has also released something it calls the Market Research Portal which enables market researchers to share their analysis with internal or external clients via a web-based, company branded portal. This is an interesting new service that builds on the trend in the overall BI market for information sharing and collaboration.
The MarketSight solution provides the following features:
Banner Reports: provide a high level analysis of the dataset in a report format.
Cross Tabs: MarketSight provides an intuitive interface that enables users to run cross-tabulations (cross-tabs) to summarize data. The cross-tabs are useful for looking at the interaction of two (or more) variables. Data exported from a number of tools, including statistical analysis packages such as SPSS and SAS as well as Excel, is supported. Data can be selected for analysis using a simple drag-and-drop interface. Data can also be weighted. The cross-tab analysis can be easily iterated on to explore and examine the data as deeply as needed.
Statistical significance tests: A key feature of the software is the ability to automatically determine whether output is statistically significant, for example, whether the response of one group is significantly different from that of another group. MarketSight highlights the cells in the cross-tab analysis that are different from each other and also increases the font size of those cells for easy analysis. Users can specify confidence levels. Advanced users also have the ability to display p-values, set a second confidence level or select more conservative statistical testing approaches.
Automatic chart generation: The software automatically produces professional looking charts from the data. This visualization feature can help in the analysis.
Porting capability to other presentation tools: MarketSight provides an easy way to pipe results to tools such as PowerPoint and Excel. This is helpful in cases where the business user wants to create a presentation, or change the look and feel of the output.
Other features: The software provides other features to easily keep track of analyses performed and to set usefulness ratings for them.
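For readers who want to see what sits behind a cross-tab with significance testing, here is a minimal sketch. The source material doesn't say which tests MarketSight runs, so I am assuming a common textbook choice, a pooled two-proportion z-test. The function names, the 0.05 cutoff (a 95% confidence level), and the survey data are all made up for illustration.

```python
import math
from collections import Counter


def crosstab(rows, row_key, col_key):
    """Count responses for each (row value, column value) pair."""
    return Counter((r[row_key], r[col_key]) for r in rows)


def two_proportion_p_value(hits_a, n_a, hits_b, n_b):
    """Two-sided p-value for a difference in proportions (pooled z-test)."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # both groups answered identically; no detectable difference
    z = (p_a - p_b) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - cdf)


# Hypothetical survey: 100 small and 100 large customers rate one feature.
rows = (
    [{"segment": "small", "rating": "need to have"}] * 60
    + [{"segment": "small", "rating": "nice to have"}] * 40
    + [{"segment": "large", "rating": "need to have"}] * 40
    + [{"segment": "large", "rating": "nice to have"}] * 60
)

table = crosstab(rows, "segment", "rating")
p_value = two_proportion_p_value(
    table[("small", "need to have")], 100,
    table[("large", "need to have")], 100,
)
significant = p_value < 0.05  # user-specified 95% confidence level
```

In this made-up data, 60% of small customers versus 40% of large customers call the feature a need-to-have, and the difference comes out significant at the 95% level. That is the sort of cell a tool like MarketSight would highlight.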
The solution is now offered in three editions: Academic ($95/user/year), Professional ($995/user/year), and Enterprise ($1495/user/year). The Professional and Enterprise subscriptions also come with a read-only option.
The Market Research Portal (Enterprise customers)
This latest addition to the MarketSight Enterprise portfolio provides one place to organize and share market research. Users can also upload Word documents, PowerPoint presentations, videos, or any other information that is relevant to the market research to the portal. This analysis can then be shared. Companies can brand the site with their own logos and URLs. So, if I am a market researcher and I want to share my findings with a broader base, I can set the permissions in the MarketSight tool to enable colleagues as well as external clients to enter the portal. This can be done in a view-only mode, or clients can subscribe to the service and use the data directly. This solution might be particularly attractive to market research firms to help them deal more effectively with their clients.
MarketSight already enables its clients to upload data in SAS, SPSS, and Excel format into the solution. It is also planning to support the Triple-S format - a standard format, popular in Europe, which enables survey programs running on different platforms to exchange data.
And MarketSight is looking to include support for analyzing unstructured data in its solution as well. This means that information in survey comment fields could be analyzed and relevant data extracted and used as part of the overall survey analysis. Given that much of what a respondent is thinking is often found in the comments, this is good news.