Social Network Analysis: What is it and why should we care?

When most people think of social networks they think of Facebook and Twitter, but social network analysis has its roots in psychology, sociology, anthropology and math (see Scott, John Social Network Analysis for more details). The phrase has a number of different definitions, depending on the discipline you’re interested in, but for the purposes of this discussion social network analysis can be used to understand the patterns of how individuals interact.  For other definitions, look here.

I had a very interesting conversation with the folks from SAS last week about Social Network Analysis. SAS has a sophisticated social network analysis solution that draws upon its analytics arsenal to solve some very important problems. These include discovering banking or insurance fraud rings and identifying tax evasion, social services fraud, and health care fraud (to name a few). These are huge issues. For example, the 2009 ABA Deposit Account Fraud Survey found that eight out of ten banks reported having check fraud losses in 2008. A new report by the National Insurance Crime Bureau (NICB) shows an increase in claims related to “opportunistic fraud,” possibly due to the economic downturn. These include workers’ compensation claims and staged and caused accidents.

Whereas some companies (and there are a number of them in this space) use mostly rules (e.g. If the transaction is out of the country, flag it) to identify potential fraud, SAS utilizes a hybrid approach that can also include:

  • Anomalies; e.g. the number of unsecured loans exceeds the norm
  • Patterns; using predictive models to understand account opening and closing patterns
  • Social link analysis: e.g. to identify transactions to suspicious counterparties
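To make the difference between a pure rule and a hybrid check concrete, here is a minimal sketch in Python. It is not SAS code: the function names, the home country, the peer data, and the three-standard-deviation threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def rule_flag(txn, home_country="US"):
    """Rule: flag any transaction made outside the home country (assumed to be US)."""
    return txn["country"] != home_country


def anomaly_flag(value, peer_values, threshold=3.0):
    """Anomaly: flag a value far above the norm for comparable accounts.

    'Far above' is assumed to mean more than `threshold` sample standard
    deviations over the peer mean.
    """
    mu = mean(peer_values)
    sigma = stdev(peer_values)
    if sigma == 0:
        return value > mu
    return (value - mu) / sigma > threshold


def hybrid_flags(txn, unsecured_loans, peer_loan_counts):
    """Combine the rule and the anomaly check, returning the reasons that fired."""
    reasons = []
    if rule_flag(txn):
        reasons.append("rule: foreign transaction")
    if anomaly_flag(unsecured_loans, peer_loan_counts):
        reasons.append("anomaly: unsecured loans exceed norm")
    return reasons


# Hypothetical peer group: unsecured loan counts for comparable customers.
PEER_LOAN_COUNTS = [1, 2, 1, 2, 1, 2, 1, 2]
```

A customer with nine unsecured loans and a transaction from abroad would trip both checks here, while a customer with two loans transacting at home would trip neither.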

Consider the following fraud ring:

  • Robert Madden shares a phone number with Eric Sully and their accounts have been shut down
  • Robert Madden also shares an address with Chris Clark
  • Chris Clark shares a phone with Sue Clark and she still has open accounts
  • Sue Clark and Eric Sully also share an address with Joseph Sullins who has open accounts and who is soft matched to Joe Sullins who has many open accounts and has been doing a lot of cash cycling between them.

This is depicted in the ring of fraud that the SAS software found, which is shown above.   The dark accounts indicate accounts that have been closed.  Joe Sullins represents a new burst of accounts that should be investigated.
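The linking logic in this example can be sketched as a small graph problem: people are nodes, shared attributes are edges, and a ring is everyone reachable from a known bad account. The Python below is my own illustration, not how the SAS software works internally. The names and links come from the list above; the data layout and function names are assumptions, and Chris Clark's account status is left out because the example does not state it.

```python
from collections import defaultdict, deque

# Links taken from the example; the third field is the shared attribute.
LINKS = [
    ("Robert Madden", "Eric Sully", "phone"),
    ("Robert Madden", "Chris Clark", "address"),
    ("Chris Clark", "Sue Clark", "phone"),
    ("Sue Clark", "Joseph Sullins", "address"),
    ("Eric Sully", "Joseph Sullins", "address"),
    ("Joseph Sullins", "Joe Sullins", "soft match"),
]

# Account status as stated in the example (Chris Clark's is not given).
STATUS = {
    "Robert Madden": "closed",
    "Eric Sully": "closed",
    "Sue Clark": "open",
    "Joseph Sullins": "open",
    "Joe Sullins": "open",
}


def build_graph(links):
    """Build an undirected adjacency map from (person, person, attribute) links."""
    graph = defaultdict(set)
    for a, b, _attribute in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def ring_from(graph, start):
    """Breadth-first search: collect everyone connected to `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for neighbor in graph[person]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def open_accounts_to_review(ring, status):
    """Ring members whose accounts are still open, sorted by name."""
    return sorted(p for p in ring if status.get(p) == "open")


ring = ring_from(build_graph(LINKS), "Robert Madden")
print(open_accounts_to_review(ring, STATUS))
```

Starting from a closed account (Robert Madden), the traversal reaches all six people in the example and surfaces the ones with open accounts as candidates for investigation.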

The SAS solution accepts input from many sources (including text, where it can use text mining to extract information from, say, a claim). The strength of the solution is in its ability to take data from many sources and in the depth of its analytical capability.

Why is this important?

Many companies set up Investigation Units to investigate potential fraud. However, there are often large numbers of false positives (i.e. cases that are flagged as potential fraud but aren’t), which cost the company a lot of money to investigate. Just think about how many times you’ve been called by your credit card company when you’ve made a big purchase or traveled out of the country and forgot to call them, and you understand the dollars wasted on false positives. This cost, of course, pales in comparison to the billions of dollars lost each year to fraud. Social network analysis, especially using more sophisticated analytics, can be used to find previously undetected fraud rings.

Of course, social network analysis has other use cases as well as fraud detection.   SAS uses Social Network Analysis as part of its Fraud Framework, but it is expanding its vision to include customer churn and viral marketing  (i.e. to understand how customers are related to each other).   Other use cases include terrorism and crime prevention, company organizational analysis, as well as various kinds of marketing applications such as finding key opinion leaders.

Social network analysis for marketing is an area I expect to see more action in the near term, although people will need to be educated about social networks, the difference between social network analysis and social media analysis (as well as where they overlap) and the value of the use cases.  There seems to be some confusion in the market, but that is the subject of another blog.

The Importance of multi-language support in advanced search and text analytics

I had an interesting briefing with the Basis Technology team the other week. They updated me on the latest release of their technology, called Rosette 7. In case you’re not familiar with Basis Technology, its multilingual engine is embedded in some of the biggest Internet search engines out there, including Google, Bing, and Yahoo. Enterprises and the government also utilize it. But the company is not just about keyword search. Its technology also enables the extraction of entities (about 18 different kinds) such as organizations, names, and places. What does this mean? It means that the software can discover these kinds of entities across massive amounts of data and perform context-sensitive discovery in many different languages.

An Example

Here’s a simple example.  Say you’re in the Canadian consulate and you want to understand what is being said about Canada across the world.   You type “Canada” into your search engine and get back a listing of documents.  How do you make sense of this?  Using Basis Technology entity extraction (an enhancement to search and a basic component of text analytics), you could actually perform faceted (i.e. guided) navigation across multiple languages.  This is illustrated in the figure below.  Here, the user typed “Canada” into the search engine and got back 89 documents.  In the main pane in the browser, you can see that an arrow in a number of different languages highlights the word Canada, so you know that it is included in these documents.  On the left hand side of the screen is the guided navigation pane.  For example, you can see that there are 15 documents that contain a reference to Obama and another 6 that contain a reference to Barack Obama.  This is not necessarily a co-occurrence in a sentence, just in the document.  So, any of these articles would contain a reference to Obama and Canada.  This would help you determine what Obama might have said about Canada. Or, what the connection is between Canada and the BBC (under organization).  This idea is not necessarily new, but the strong multilingual capabilities make it compelling for global organizations.
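The guided navigation pane described here boils down to counting, for each extracted entity, how many of the matching documents mention it. Here is a rough Python sketch of that document-level counting. The documents and entity tags are invented for illustration, and the function is my own, not part of Rosette.

```python
from collections import Counter

# Hypothetical documents with already-extracted (type, name) entities.
DOCS = [
    {"id": 1, "entities": {("LOCATION", "Canada"), ("PERSON", "Obama")}},
    {"id": 2, "entities": {("LOCATION", "Canada"), ("ORGANIZATION", "BBC")}},
    {"id": 3, "entities": {("LOCATION", "Canada"), ("PERSON", "Obama"),
                           ("ORGANIZATION", "BBC")}},
    {"id": 4, "entities": {("PERSON", "Obama")}},
]


def facet_counts(docs, query_entity):
    """Count, per entity, the documents that mention it alongside `query_entity`.

    Co-occurrence is at the document level, not the sentence level.
    """
    counts = Counter()
    for doc in docs:
        if query_entity in doc["entities"]:
            for entity in doc["entities"]:
                if entity != query_entity:
                    counts[entity] += 1
    return counts


facets = facet_counts(DOCS, ("LOCATION", "Canada"))
for (entity_type, name), n in sorted(facets.items()):
    print(f"{entity_type:<12} {name:<6} {n}")
```

Document 4 mentions Obama but not Canada, so it does not contribute to the facet counts for the Canada query.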

If you have eagle eyes, you will notice that the search on Canada returned 89 documents, but the entity “Canada” only returned 61 documents. This illustrates what entity extraction is all about. When the search for Canada was run on the Rosette Name Indexer tab (see upper right hand corner of the screen shot), the query searched for Canada against all automatically extracted “Canada” entities that existed in all of the documents. This includes all persons, locations, and organizations that have similar names, such as “Canada Post” and “Canada Life,” which are organizations, not the country itself. Therefore the other 28 documents (89 minus 61) contain a Canada variant that is an organization or some other entity.
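The split between the country itself and its name variants can be sketched as follows. This is a toy illustration with made-up extraction output, not the Rosette Name Indexer; the entity type labels and the function name are assumptions.

```python
def split_hits_by_type(extracted, term, wanted_type="LOCATION"):
    """Separate documents naming the entity itself from those with only a variant.

    `extracted` is a list of (doc_id, entity_type, entity_name) tuples.
    Returns (docs_with_entity, docs_with_variant_only).
    """
    term = term.lower()
    exact, variants = set(), set()
    for doc_id, entity_type, name in extracted:
        lowered = name.lower()
        if lowered == term and entity_type == wanted_type:
            exact.add(doc_id)
        elif term in lowered:
            variants.add(doc_id)
    return exact, variants - exact


# Hypothetical extraction output for a handful of documents.
EXTRACTED = [
    (1, "LOCATION", "Canada"),
    (2, "ORGANIZATION", "Canada Post"),
    (3, "ORGANIZATION", "Canada Life"),
    (3, "LOCATION", "Canada"),
    (4, "PERSON", "Obama"),
]

country_docs, variant_only_docs = split_hits_by_type(EXTRACTED, "Canada")
print(len(country_docs), len(variant_only_docs))
```

In this toy data, document 3 mentions both the country and Canada Life, so it counts toward the country; only document 2 is a variant-only hit.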

Use Cases

There are obviously a number of different use cases where the ability to extract entities across languages can be important.  Here are three:

  • Watch lists.  With the ability to extract entities, such as people, in multiple languages, this kind of technology is good for government or financial watch lists.  Basis can resolve matches and translate names in 9 different languages. This includes resolving multiple spelling variations of foreign names. It also enables organizations to match names of people, places, and organizations against entries in a multilingual database.
  • Legal discovery.  Basis Technology can identify 55 different languages.  Companies would use this technology, for example, to identify multiple languages within a document and then route them appropriately.  Additionally, Basis can extract entities in 15 different languages (and search in 21), so the technology could be used to process many documents and extract the entities associated with them to find the right set of documents needed in legal discovery.
  • Brand image, competitive intelligence.   The technology can be used to extract company names across multiple languages.  The software can also be used against disparate data sources, such as internal document management systems as well as external sources such as the Internet.  This means that it could cull the Internet to extract company name (and variations on the name) in multiple languages.  I would expect this technology to be used by “listening posts” and other “Voice of the Customer” services in the near future.
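To give a feel for the watch list use case, here is a deliberately simple name-matching sketch in Python. It uses plain string similarity from the standard library's difflib, which is a stand-in of my own choosing and not what Basis does: it handles spelling variations within one script only and does no translation or transliteration. The watch list entries and the 0.8 threshold are assumptions.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Similarity between two names, from 0.0 to 1.0, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def watch_list_matches(name, watch_list, threshold=0.8):
    """Return watch list entries at least `threshold` similar to `name`, best first."""
    scored = [(name_similarity(name, entry), entry) for entry in watch_list]
    return [entry for score, entry in sorted(scored, reverse=True) if score >= threshold]


# Hypothetical watch list for illustration.
WATCH_LIST = ["Joe Sullins", "Chris Clark"]

print(watch_list_matches("Joseph Sullins", WATCH_LIST))
```

A production system would need language-aware matching rather than raw character similarity, which is exactly the gap that multilingual name resolution is meant to fill.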

While this technology is not a text analytics analysis platform, it does provide an important piece of core functionality needed in a global economy.  Look for more announcements from the company in 2010 around enhanced search in additional languages.

My Take on the SAS Analyst Conference

I just got back from the SAS analyst event that was held in Steamboat Springs, Colorado.   It was a great meeting.  Here are some of the themes I heard over the few days I was there:

SAS is a unique place to work.

Consider the following:  SAS revenue per employee is somewhat lower than the software industry average because everyone is on the payroll.  That’s right.  Everyone from the groundskeepers to the health clinic professionals to those involved in advertising is on the SAS payroll.  The company treats its employees very well, providing fitness facilities and on-site day care (also on the payroll). You don’t even have to buy your own coffee or soda! The company has found that these kinds of perks have a positive impact.  SAS announced no layoffs in 2009 and this further increased morale and productivity.  The company actually saw increased profits in 2009.  Executives from SAS also made the point that even though they might have their own advertising, etc., they do not want to be insular.  The company knows it needs new blood and new ideas.  On that note, check out the next two themes:

Innovation is very important to SAS.

Here are some examples:

  • Dr. Goodnight gave his presentation using the latest version of the SAS BI dashboard, which looked pretty slick.
  • SAS has recently introduced some very innovative products and the trend will continue. One example is its social network analysis product that has been doing very well in the market.  The product analyzes social networks and can, for example, uncover groups of people working together to commit fraud.  This product was able to find $32M in welfare fraud in several weeks.
  • SAS continues to enhance its UI, which it has been criticized for in the past. We also got pre-briefed on some new product announcements that I can’t talk about yet, but other analysts did tweet about them at the conference.  There were a lot of tweets at this conference and they were analyzed in real time.

The partnership with Accenture is a meaningful one.

SAS execs stated that although they may not have that many partnerships, they try to make the ones they have very real.  While, on the surface, the recent announcement regarding the Accenture SAS Analytics Group might seem like a “me too” after IBM BAO, it is actually different.  Accenture’s goal is to transform the front office, the way ERP/CRM was transformed.  It wants to “Take the what and turn it into so what and now what?” It views analytics not simply as a technology, but as a new competitive management science that enables agility.  It obviously won’t market it that way, as the company takes a business focus.  Look for the Accenture SAS Analytics Group to put out services such as churn management as a service and risk and fraud detection as a service.  They will operationalize this as part of a business process.

The Cloud!

SAS has a number of SaaS offerings in the market and will, no doubt, introduce more.  What I found refreshing was that SAS takes issues around SaaS very seriously.  You’d expect a data company to be concerned about their customers’ data and they are. 

Best line of the conference

SAS is putting a lot of effort into making its products easier to use and that is a good thing.  There are ways to get analysis to those people who aren’t that analytical.  In a discussion about the skill level required for people to use advanced analytics, however, one customer commented, “Just because you can turn on a stove doesn’t mean you know how to cook.”  More on this in another post.

