The Inaugural Hurwitz & Associates Predictive Analytics Victory Index is complete!

For more years than I like to admit, I have been focused on the importance of managing data so that it helps companies anticipate changes and take proactive action. So, as I watched the market for predictive analytics emerge, I thought it was important to provide customers with a holistic perspective on the value of commercial offerings, and I was determined that the analysis would be based on real-world factors. I am therefore delighted to announce the release of the Hurwitz & Associates Victory Index for Predictive Analytics! I’ve been working on this report for quite some time, and I believe it will be a very valuable tool for companies looking to understand predictive analytics and the vendors that play in this market.

Predictive analytics has become a key component of a highly competitive company’s analytics arsenal. Hurwitz & Associates defines predictive analytics as:

A statistical or data mining solution consisting of algorithms and techniques that can be used on both structured and unstructured data (together or individually) to determine future outcomes. It can be deployed for prediction, optimization, forecasting, simulation, and many other uses.

So what is this report all about? The Hurwitz & Associates Victory Index is a market research assessment tool that analyzes vendors across four dimensions: Vision, Viability, Validity, and Value. Hurwitz & Associates takes a holistic view of the value and benefit of important technologies. We assess not just the technical capability of the technology but its ability to provide tangible value to the business. For the Victory Index we examined more than fifty attributes, including customer satisfaction, value/price, time to value, technical value, breadth and depth of functionality, customer adoption, financial viability, company vitality, strength of intellectual capital, business value, ROI, and clarity and practicality of strategy and vision. We also examine important trends in the predictive analytics market as part of the report and provide detailed overviews of vendor offerings in the space.

Some of the key vendor highlights include:
• Hurwitz & Associates named six vendors as Victors across two categories: SAS, IBM (SPSS), Pegasystems, Pitney Bowes, StatSoft, and Angoss.
• Other vendors recognized in the Victory Index include KXEN, Megaputer Intelligence, Rapid-I, Revolution Analytics, SAP, and TIBCO.

Some of the key market findings include:
• Vendors have continued to place an emphasis on improving the technology’s ease of use, making strides toward automating model-building capabilities and presenting findings in a business context.
• Predictive analytics is no longer relegated to statisticians and mathematicians. The user profile for predictive analytics has shifted dramatically as the ability to leverage data for competitive advantage has placed business analysts in the driver’s seat.
• As companies gather greater volumes of disparate kinds of data, both structured and unstructured, they require solutions that can deliver high performance and scalability.
• The ability to operationalize predictive analytics is growing in importance as companies have come to understand the advantage of incorporating predictive models into their business processes. For example, statisticians at an insurance company might build a model that predicts the likelihood of a claim being fraudulent, and the claims process can then use that score to route suspicious claims for investigation (a minimal sketch of this idea follows the list).
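To make the idea of operationalizing a model concrete, here is a minimal sketch of a fraud score embedded in a claims workflow. It uses scikit-learn, and the features, training data, threshold, and routing rule are all invented for illustration rather than drawn from any vendor’s product:

```python
# Illustrative sketch: embedding a fraud-scoring model in a claims workflow.
# The features, training data, threshold, and routing are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Train on historical claims: [claim_amount, prior_claims, flagged_before]
X_train = np.array([[1200, 2, 0], [15000, 9, 1], [800, 1, 0], [22000, 11, 1]])
y_train = np.array([0, 1, 0, 1])  # 1 = claim was confirmed fraudulent
model = LogisticRegression().fit(X_train, y_train)

def score_claim(claim_features, threshold=0.7):
    """Score an incoming claim and route it inside the business process."""
    fraud_prob = model.predict_proba([claim_features])[0][1]
    if fraud_prob >= threshold:
        return "route_to_investigator", fraud_prob
    return "auto_approve", fraud_prob

print(score_claim([18000, 8, 1]))  # a high-risk claim gets flagged for review
```

The point is simply that the model’s output feeds a decision inside a business process rather than ending up in a report.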

I invite you to find out more about the report by visiting our website: www.hurwitz.com

Five Analytics Predictions for 2011

In 2011 analytics will take center stage as a key trend because companies are at a tipping point with the volume of data they have and their urgent need to do something about it. So, with 2010 now past and 2011 to look forward to, I wanted to take the opportunity to submit my predictions (no pun intended) regarding the analytics and advanced analytics market.

Advanced Analytics gains more steam. Advanced Analytics was hot last year and will remain so in 2011. Growth will come from at least three different sources. First, advanced analytics will increase its footprint in large enterprises. A number of predictive and advanced analytics vendors tried to make their tools easier to use in 2009-2010; in 2011, expect new users in companies already deploying the technology to come on board. Second, more companies will begin to purchase the technology because they see it as a way to increase top-line revenue while gaining deeper insights about their customers. Finally, small and midsized companies will get into the act, looking for lower-cost and user-friendly tools.
Social Media Monitoring Shake Out. The social media monitoring and analysis market is one crowded and confused space, with close to 200 vendors competing across no-cost, low-cost, and enterprise-cost solution classes. Expect 2011 to be a year of folding and consolidation, with at least a third of these companies tanking. Before this happens, expect new entrants to the market for low-cost social media monitoring platforms and everyone screaming for attention.
Discovery Rules. Text analytics will become a mainstream technology as more companies finally begin to understand the difference between simply searching information and actually discovering insight. Part of this will be due to the impact of social media monitoring services that utilize text analytics to discover topics and patterns in unstructured data, rather than simply search for them. However, innovative companies will continue to build text analytics solutions that do more than just analyze social media.
Sentiment Analysis is Supplanted by other Measures. Building on prediction #3, by the end of 2011 sentiment analysis won’t be the be-all and end-all of social media monitoring. Yes, it is important, but the reality is that most low-cost social media monitoring vendors don’t do it well. They may tell you that they get 75-80% accuracy, but it ain’t so. In fact, it is probably more like 30-40%. After many users have gotten burned by not questioning sentiment scores, they will begin to look for other meaningful measures.
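If you want to question a vendor’s sentiment scores rather than take them on faith, one simple check is to hand-label a small sample of posts and measure agreement. A minimal sketch, with all labels invented for illustration:

```python
# Sketch: validating vendor sentiment labels against a hand-labeled sample.
# All labels below are invented for illustration.
vendor_labels = ["pos", "neg", "pos", "neu", "neg", "pos", "neu", "neg", "pos", "pos"]
human_labels  = ["neg", "neg", "neu", "neu", "pos", "pos", "neg", "neg", "neu", "pos"]

agreement = sum(v == h for v, h in zip(vendor_labels, human_labels)) / len(human_labels)
print(f"Vendor agreement with human judgment: {agreement:.0%}")  # 50% on this sample
```

In practice you would want a few hundred labeled posts, but even a small sample can reveal whether a claimed accuracy number holds up.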
Data in the cloud continues to expand, as does BI SaaS. Expect there to still be a lot of discussion around data in the cloud. However, business analytics vendors will continue to launch SaaS BI solutions, and companies will continue to buy them, especially small and midsized companies that find the SaaS model a good alternative to some pricey enterprise solutions. Expect to see at least ten more vendors enter the market.

On-premise becomes a new word. This last prediction is not really related to analytics (hence the five rather than six predictions), but I couldn’t resist. People will continue to use the term “on-premise” rather than “on-premises” when referring to cloud computing, even though it is incorrect. This will continue to drive many people crazy, since premise means “a proposition supporting or helping to support a conclusion” (dictionary.com) rather than the singular of premises. Those of us in the know will finally give up correcting everyone else.

What is advanced analytics?

There has been a lot of discussion recently around advanced analytics. I’d like to throw my definition into the ring. I spent many years at Bell Laboratories in the late 1980s and 1990s deploying what I would call advanced analytics. This included utilizing statistical and mathematical models to understand customer behavior, predict retention, and analyze trouble tickets. It also included new approaches for segmenting the customer base and thinking about how to analyze call streams in real time. We also tried to utilize unstructured data from call center logs to help improve the predictive power of our retention models, but the algorithms and the compute power to do this didn’t exist at the time.

Based on my own experiences as well as what I see happening in the market today as an analyst, I view advanced analytics as an umbrella term that includes a class of techniques and practices that go well beyond “slicing and dicing and shaking and baking” data for reports. I would define advanced analytics as:

“Advanced analytics provides algorithms for complex analysis of either structured or unstructured data. It includes sophisticated statistical models, machine learning, neural networks, text analytics, and other advanced data mining techniques. Among its many use cases, it can be deployed to find patterns in data and for prediction, optimization, forecasting, and complex event processing/analysis. Examples include predicting churn, identifying fraud, market basket analysis, and understanding website behavior. Advanced analytics does not include database query and reporting or OLAP cubes.”
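As one tiny illustration of a technique named in the definition, here is the simplest possible form of market basket analysis: counting which items co-occur in transactions. The transactions are invented, and real data mining tools use far more sophisticated association-rule algorithms:

```python
# Sketch: the simplest form of market basket analysis, counting how often
# pairs of items are purchased together. Transactions are invented examples.
from collections import Counter
from itertools import combinations

transactions = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Pairs bought together most often suggest cross-sell candidates
for pair, count in pair_counts.most_common(3):
    print(pair, count)
```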

Of course, the examples in this definition are marketing-centric, and advanced analytics obviously extends into multiple arenas. Hurwitz & Associates is going to do a deep dive into this area in the coming year. We are currently fielding a study about advanced analytics, and we’ll be producing additional reports. For those of you who are interested in completing my survey, here is the link:

Advanced Analytics and the skills needed to make it happen: Takeaways from IBM IOD

Advanced Analytics was a big topic at the IBM IOD conference last week. As part of this, predictive analytics was again an important piece of the story, along with other advanced analytics capabilities IBM has developed or is in the process of developing to support optimization. These include BigInsights (for big data), analyzing data streams, content/text analytics, and of course, the latest release of Cognos.

One especially interesting topic discussed at the conference was the skills required to make advanced analytics a reality. I have been writing and thinking a lot about this subject, so I was very happy to hear IBM address it head on during the second-day keynote. This keynote included a customer panel and another speaker, Dr. Atul Gawande, and both offered some excellent insights. The panel included Scott Friesen (Best Buy), Scott Futren (Gwinnett County Public Schools), Srinivas Koushik (Nationwide), and Greg Christopher (Nestle). Here are some of the interrelated nuggets from the discussions:

• Ability to deliver vs. the ability to absorb. One panelist made the point that a lot of new insights are being delivered to organizations. In the future, it may become difficult for people to absorb all of this information (and this will require new skills too).
• Analysis and interpretation. People will need to know how to analyze and how to interpret the results of an analysis. As Dr. Gawande pointed out, “Having knowledge is not the same as using knowledge effectively.”
• The right information. One of the panelists mentioned that putting analytics tools in the hands of line people might be too much for them, and instead the company is focusing on giving these employees the right information.
• Leaders need to have capabilities too. If executives are accustomed to using spreadsheets and relying on their gut instincts, then they will also need to learn how to make use of analytics.
• Cultural changes. From call center agents using the results of predictive models to workers on the line seeing reports to business analysts using more sophisticated models, change is coming. This change means people will be changing the way that they work. How this change is handled will require special thought by organizations.

IBM executives also made a point of discussing the critical skills required for analytics. These included strategy development, developing user interfaces, enterprise integration, modeling, and dealing with structured and unstructured data. IBM has, of course, made a huge investment in these skills. GBS executives emphasized the 8,500 employees in its Global Business Services Business Analytics and Optimization group. Executives also pointed to the fact that the company has thousands of partners in this space and that 1 in 3 IBMers will attend analytics training. So, IBM is prepared to help companies in their journey into business analytics.

Are companies there yet? I think it is going to take organizations time to develop some of these skills (and some they should probably outsource). Sure, analytics has been around a long time. And sure, vendors are making their products easier to use, and that is going to help end users become more effective. But even if we’re just talking about a lot of business people making use of analytic software (as opposed to operationalizing it in a business process), the reality is that analytics requires a certain mindset. Unless people understand the context of the information they are dealing with, it doesn’t matter how user-friendly the platform is: they can still get it wrong. People using analytics will need to think critically about data, understand their data, and understand context. They will also need to know what questions to ask.

I whole-heartedly believe it is worth the investment of time and energy to make analytics happen.

Please note:

As luck would have it, I am currently fielding a study on advanced analytics! I am interested in understanding what your company’s plans are for advanced analytics. If you’re not planning to use advanced analytics, I’d like to know why. If you’re already using advanced analytics, I’d like to understand your experience.

If you participate in this survey I would be happy to send you a report of our findings. Simply provide your email address at the end of the survey! Here’s the link:

Click here to take survey

Analyzing Big Data

The term “Big Data” has gained popularity over the past 12-24 months as a) the amount of data available to companies has continually increased and b) technologies have emerged to manage this data more effectively. Of course, large volumes of data have been around for a long time. For example, I worked in the telecommunications industry for many years analyzing customer behavior. This required analyzing call records. The problem was that the technology (particularly the infrastructure) couldn’t necessarily support this kind of compute-intensive analysis, so we often analyzed billing records rather than streams of call detail records, or sampled the records instead.

Now companies are looking to analyze everything from the genome to Radio Frequency ID (RFID) tags to business event streams. And newer technologies have emerged to handle massive (terabyte and petabyte) quantities of data more effectively. Often this processing takes place on clusters of computers, meaning that processing occurs across many machines. The advent of cloud computing and the elastic nature of the cloud have furthered this movement.

A number of frameworks have also emerged to deal with large-scale data processing and support large-scale distributed computing. These include MapReduce and Hadoop:

-MapReduce is a software framework introduced by Google to support distributed computing on large data sets, and it is well suited to elastic cloud resources. Computing is done across large clusters of machines; each machine in a cluster is referred to as a node. MapReduce can deal with both structured and unstructured data. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges the intermediate values associated with the same key (a minimal sketch of this model follows the list).
-Apache Hadoop is an open source distributed computing platform that is written in Java and inspired by MapReduce. Data is stored in blocks spread over many machines, with the blocks replicated to other servers for resilience. Intermediate key/value pairs produced by the map phase are hashed by key so that all values for the same key arrive at the same reduce task, and results can be output to a table, to memory, or to a temporary file for further analysis.
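To make the map/reduce division of labor concrete, here is a minimal word-count sketch in plain Python that mirrors the programming model. In Hadoop, the map and reduce functions would be distributed across many nodes; here everything runs locally just to show the map, shuffle, and reduce flow:

```python
# Sketch of the MapReduce programming model in plain Python.
# In Hadoop these functions would run distributed across many nodes;
# here they run locally just to show the map -> shuffle -> reduce flow.
from collections import defaultdict

def map_fn(document):
    """Emit an intermediate (key, value) pair for every word."""
    for word in document.split():
        yield (word.lower(), 1)

def reduce_fn(word, counts):
    """Merge all intermediate values for one key."""
    return (word, sum(counts))

documents = ["Big data needs big tools", "data data everywhere"]

# Shuffle phase: group intermediate pairs by key
grouped = defaultdict(list)
for doc in documents:
    for word, count in map_fn(doc):
        grouped[word].append(count)

results = [reduce_fn(word, counts) for word, counts in grouped.items()]
print(results)  # [('big', 2), ('data', 3), ('needs', 1), ('tools', 1), ('everywhere', 1)]
```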

But what about tools to actually analyze this massive amount of data?

Datameer

I recently had a very interesting conversation with the folks at Datameer. Datameer was formed in 2009 to provide business users with a way to analyze massive amounts of data. The idea is straightforward: provide a platform to collect and read different kinds of large data stores, put the data into a Hadoop framework, and then provide tools for analyzing it. In other words, hide the complexity of Hadoop and provide analysis tools on top of it. The folks at Datameer believe their solution is particularly useful for data greater than 10 TB, where a company may have hit a cost wall using traditional technologies but where a business user might want to analyze some kind of behavior. So website activity, CRM systems, phone records, and POS data might all be candidates for analysis. Datameer provides 164 functions (e.g., group, average, median) for business users, with APIs to target more specific requirements.

For example, suppose you’re in marketing at a wireless service provider and you offered a “free minutes” promotion. You want to analyze the call detail records of those customers who made use of the program to get a feel for how customers would use cell service if given unlimited minutes. The chart below shows the call detail records from one particular day of the promotion, July 11th. The chart shows the call number (MDN) as well as the time the call started and stopped and the duration of the call in milliseconds. Note that the data appear under the “Analytics” tab; the “Data” tab provides tools to read different data sources into Hadoop.

This is just a snapshot – there may be TB of data from that day. So, what about analyzing this data? The chart below illustrates a simple analysis of the longest calls and the phone numbers those calls came from. It also illustrates basic statistics about all of the calls on that day – the average, median, and maximum call duration.
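For readers who think in code rather than charts, here is roughly the same analysis expressed with pandas on a tiny invented sample of call detail records. The column names and values are assumptions for illustration, not Datameer’s actual schema:

```python
# Sketch of the kind of analysis described above, using pandas on a small
# sample of call detail records. Column names and values are assumptions.
import pandas as pd

cdrs = pd.DataFrame({
    "MDN":         ["555-0101", "555-0102", "555-0101", "555-0103"],
    "duration_ms": [480_000, 1_260_000, 95_000, 2_400_000],
})

# Longest calls and the numbers they came from
print(cdrs.nlargest(2, "duration_ms"))

# Basic statistics across all calls that day
print(cdrs["duration_ms"].agg(["mean", "median", "max"]))
```

The value of a tool like Datameer is doing this same kind of computation when the table holds terabytes of records spread across a Hadoop cluster.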

From this brief example, you can start to visualize the kind of analysis that is possible with Datameer.

Note too that since Datameer runs on top of Hadoop, it can deal with unstructured as well as structured data. The company has some solutions in the unstructured realm (such as basic analysis of Twitter feeds) and is working to provide more sophisticated tools. Datameer offers its software either on a SaaS license or on premises.

In the Cloud?

Not surprisingly, early adopters of the technology are using it in a private cloud model. This makes sense, since many companies want to keep control of their own data. Some of these companies already have Hadoop clusters in place and are looking for analytics capabilities for business use. Others are dealing with big data but have not yet adopted Hadoop. They are looking at a complete “big data BI” type of solution.

So, will there come a day when business users can analyze massive amounts of data without having to drag IT entirely into the picture? Utilizing BI adoption as a model, the folks from Datameer hope so. I’m interested in any thoughts readers might have on this topic!

Five requirements for Advanced Analytics

The other day I was looking at the analytics discussion board that I moderate on the Information Management site. I had posted a topic entitled “the value of advanced analytics.” I noticed that the number of views on this topic was at least 3 times as many as on other topics that had been posted on the forum. The second post that generated a lot of traffic was a question about a practical guide to predictive analytics.

Clearly, companies are curious and excited about advanced analytics. Advanced analytics utilizes sophisticated techniques to understand patterns and predict outcomes. It includes complex techniques such as statistical modeling, machine learning, linear programming, mathematics, and even natural language processing (on the unstructured side). While many kinds of “advanced analytics” have been around for the last 20+ years (I utilized it extensively in the 80s) and the term may simply be a way to invigorate the business analytics market, the point is that companies are finally starting to realize the value this kind of analysis can provide.

Companies want to better understand the value this technology brings and how to get started. And while the number of users interested in advanced analytics continues to increase, the reality is that there will likely be a skills shortage in this area. Why? Because advanced analytics isn’t the same beast as what I refer to as “slicing and dicing and shaking and baking” data to produce reports that might include information such as sales per region, revenue per customer, etc.

So what skills are needed for the business user to face the advanced analytics challenge? It’s a tough question. There is a certain thought process that goes into advanced analytics. Here are five skills (there are, no doubt, more) that, at a minimum, you should have:

1. It’s about the data. So, thoroughly understand your data. A business user needs to understand all aspects of his or her data. This includes answers to questions such as, “What is a customer?” “What does it mean if a data field is blank?” “Is there seasonality in my time series data?” It also means understanding what kind of derived variables (e.g. a ratio) you might be interested in and how you want to calculate them.
2. Garbage in, garbage out. Appreciate data quality issues. A business user analyzing data cannot simply assume that the data (from whatever source) is absolutely fine. It might be, but you still need to check. Part of this ties to understanding your data, but it also means first looking at the data and asking if it makes sense (a minimal sketch of these first checks follows the list). And what do you do with data that doesn’t make sense?
3. Know what questions to ask. I remember a time in graduate school when, excited just to have my data and start analyzing it, I was told by a wise professor not to throw statistical models at the data simply because I could. First, know what questions you are trying to answer from the data. Ask yourself if you have the right data to answer the questions. Look at the data to see what it is telling you. Then start to consider the models. Knowing what questions to ask will require business acumen.
4. Don’t skip the training step. Know how to use tools and what the tools can do for you. Again, it is simple to throw data at a model, especially if the software system suggests a certain model. However, it is important to understand what the models are good for. When does it make sense to use a decision tree? What about survival analysis? Certain tools will take your data and suggest a model. My concern is that if you don’t know what the model means, it is more difficult to defend your output. That is why vendors suggest training.
5. Be able to defend your output. At the end of the day, you’re the one who needs to present your analysis to your company. Make sure you know enough to defend it. Turn the analysis upside down, ask questions of it, and make sure you can articulate the output.
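As a minimal sketch of points 1 and 2 above, here is what a first pass at looking at your data might involve: counting blanks, flagging values that don’t make sense, and computing a derived ratio variable. The customer fields are invented for illustration:

```python
# Sketch of basic data checks and a derived variable, per points 1 and 2.
# The customer fields and the revenue-per-call ratio are illustrative.
import numpy as np
import pandas as pd

customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "revenue":     [120.0, None, 80.0, -5.0],
    "calls":       [10, 4, 0, 7],
})

# First, look at the data: blanks and values that don't make sense
print(customers.isna().sum())               # how many fields are blank?
print(customers[customers["revenue"] < 0])  # negative revenue is suspect

# A derived variable: revenue per call, guarding against division by zero
customers["revenue_per_call"] = customers["revenue"] / customers["calls"].replace(0, np.nan)
print(customers)
```

What you do with the blank and nonsensical rows (drop them, impute them, chase them back to the source) is exactly the kind of judgment call these skills are about.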

I could go on and on, but I’ll stop here. Advanced analytics tools are simply that: tools. And they will be only as good as the person utilizing them. This will require understanding the tools as well as how to think and strategize around the analysis. So my message? Utilized properly, these tools can be great. Utilized incorrectly, well, it’s analogous to a do-it-yourself electrician who burns down the house.

What about Analytics in Social Media monitoring?

I was speaking to a client the other day. This company was very excited about tracking its brand using one of the many listening posts out on the market. As I sat listening, I couldn’t help but think that a) it was nice that the company could get its feet wet in social media monitoring using a tool like this, and b) it might be getting a false sense of security, because the reality is that these social media tracking tools provide a fairly rudimentary analysis of brand/product mentions, sentiment, and influencers. For those of you not familiar with listening posts, here’s a quick primer.

Listening Post Primer

Listening posts monitor the “chatter” that is occurring on the Internet in blogs, message boards, tweets, etc.  They basically:

  • Aggregate content from across many, many Internet sources.
  • Track the number of mentions of a topic (brand or some other term) over time and source of mention.
  • Provide users with positive or negative sentiment associated with a topic (often you can’t change this if it is incorrect).
  • Provide some sort of influencer information.
  • Possibly provide a word cloud that lets you know what other words are associated with your topic.
  • Provide you with the ability to look at the content associated with your topic.

They typically charge by the topic. Since these listening posts mostly use a search paradigm (with ways to aggregate words into a search topic), they don’t really allow you to “discover” information or insight that you may not have been aware of, unless you happen to stumble across it while reading posts or put a lot of time into manually mining the content. Some services allow the user to draw on historical data. There are more than 100 listening posts on the market.
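At its core, the counting function of a listening post is simple. Here is a toy sketch of tracking mentions of a topic over time; the posts and the topic are invented, and real services layer aggregation across many sources, sentiment scoring, and influencer analysis on top of this:

```python
# Sketch of the core of a listening post: counting topic mentions over time.
# The posts and the topic are invented examples.
from collections import Counter

posts = [
    {"date": "2010-07-10", "text": "Loving my new Acme phone!"},
    {"date": "2010-07-10", "text": "Acme customer service was slow today."},
    {"date": "2010-07-11", "text": "Switched to Acme this week."},
]

topic = "acme"
mentions_per_day = Counter(
    p["date"] for p in posts if topic in p["text"].lower()
)
print(mentions_per_day)  # Counter({'2010-07-10': 2, '2010-07-11': 1})
```

Notice that this is search, not discovery: the tool only counts what you already thought to look for.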

I certainly don’t want to minimize what these providers are offering. Organizations that are just starting out analyzing social media will certainly derive huge benefit from these services. Many are also quite easy to use, and the price point is reasonable. My point is that more can be done to derive deeper insight from social media. More advanced systems typically make use of text analytics software. Text analytics utilizes techniques that originate in computational linguistics, statistics, and other computer science disciplines to actually analyze the unstructured text.

Adding Text Analytics to the Mix

Although still in its early phases, social media monitoring is moving toward social media analysis and understanding as text analytics vendors apply their technology to this problem. The space is heating up, as evidenced by these three recent announcements:

  • Attensity buys Biz360. The other week, Attensity announced its intention to purchase Biz360, a leading listening post. In April 2009, Attensity combined with two European companies that focus on semantic business applications to form Attensity Group (formerly Attensity Corporation). Attensity has sophisticated technology that makes use of “exhaustive extraction” techniques (as well as nine other techniques) to analyze unstructured data. Its flagship technology automatically extracts facts from parsed text (who did what to whom, when, where, under what conditions) and organizes this information. With the addition of Biz360 and its earlier acquisitions, the Biz360 listening post will feed all Attensity products. Additionally, the Biz360 SaaS platform will be expanded to include deeper semantic capabilities for analysis, sentiment, response, and knowledge management utilizing Attensity IP. This service will be called Attensity 360 and will provide listening and deep analysis capabilities. On top of this, extracted knowledge will be automatically routed to the group in the enterprise that needs the information. For example, legal insights about people, places, events, topics, and sentiment will be automatically routed to legal, customer service insights to customer service, and so on. These groups can then act on the information. Attensity refers to this as the “open enterprise.” The idea is an end-to-end listen-analyze-respond-act process that lets enterprises act on the insight they get from the solution.
  • SAS announces its social media analytics software. SAS purchased text analytics vendor Teragram last year. In April, SAS announced SAS® Social Media Analytics, which “analyzes online conversations to drive insight, improve customer interaction, and drive performance.” The product provides deep unstructured data analysis capabilities around both internal and external sources of information (it has partnerships with external content aggregators, if needed) for brand, media, PR, and customer-related information. SAS has coupled this with the ability to perform advanced analytics, such as predictive forecasting and correlation, on this unstructured data. For example, the SAS product enables companies to forecast the number of mentions given a history of mentions (a toy illustration of this idea follows the list), or to understand whether sentiment during a certain time period was more negative, say, than in a previous time period. It also enables users to analyze sentiment at a granular level and to change sentiment (and learn from this) if it is not correct. It can deal with sentiment in 13 languages and supports 30 languages.
  • Newer social media analysis services such as NetBase are announced. NetBase is currently in limited release of its first consumer insight discovery product, called ConsumerBase. It has eight patents pending around its deep parsing and semantic modeling technology. It combines deep analytics with a content aggregation service and a reporting capability. The product provides analysis around likes/dislikes, emotions, reasons why, and behaviors. For example, whereas a listening post might interpret the sentence, “Listerine kills germs because it hurts” as either a negative or neutral statement, the NetBase technology uses a semantic data model to understand not only that this is a positive statement, but also the reason it is positive.
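As a toy illustration of the mention-forecasting idea described in the SAS item above, here is a simple moving-average forecast over a history of daily mention counts. The counts are invented, and commercial products use far more sophisticated time-series models:

```python
# Sketch: forecasting mention volume from a history of daily mention counts
# with a simple moving average. The counts are invented for illustration.
daily_mentions = [120, 135, 128, 160, 150, 170, 165]

window = 3
forecast = sum(daily_mentions[-window:]) / window
print(f"Forecast for tomorrow: ~{forecast:.0f} mentions")  # ~162
```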

Each of these products and services is slightly different. For example, Attensity’s approach is to listen, analyze, relate (it to the business), and act (route, respond, reuse), which it calls its LARA methodology. The SAS solution is part of its broader three I’s strategy: Insight, Interaction, Improve. NetBase is looking to provide an end-to-end service that helps companies understand the reasons behind emotions, behaviors, likes, and dislikes. And these are not the only games in town: other social media analysis services announced in the last year (or earlier) include those from other text analytics vendors such as IBM, Clarabridge, and Lexalytics. And, to be fair, some of the listening posts are beginning to put this capability into their services.

This market is still in its early adoption phase, as companies try to put plans together around social media, including utilizing it for their own marketing purposes as well as analyzing it for reasons including and beyond marketing. It will be extremely important for users to determine what their needs and price points are and plan accordingly.
