When most people think “analytical application” they think “classic BI” or “predictive modeling.” That’s no longer accurate. The very nature of analytical applications is changing. Text analytics brings unstructured information into the equation. Visualization changes the way that we look at data and information.
The reality is that companies are starting to build applications that require many types of information – structured, unstructured, images (e.g. from sensors and satellites), audio, and video. This mix of information may involve many layers of complexity and interconnected relationships, and it won’t easily fit into a structured database or data warehouse. The underlying knowledge about the particular problem area may be evolving as well.
Let me give you some examples.
Consider portfolio modeling in the financial services sector. It is not enough to simply analyze past performance; managing a portfolio also requires looking at external indicators, such as political events in other countries. Political unrest in an area of operation can directly impact a company’s stock price. Currency issues may impede business opportunities. Not only does the portfolio manager need to access a wide variety of data, but the data needs to interconnect meaningfully. So there needs to be an underlying infrastructure that caters for dynamic changes to interrelated knowledge relevant to the portfolio.
In law enforcement, it may not be enough just to have records of criminals and crimes. Other information types, such as location (geospatial) data for crime scenes and surrounding areas, can provide useful insight. An appreciation of context is also necessary, for example, to know that a term such as “arrest” means something specific in law enforcement vs. medicine. And of course, intelligence information is always changing.
Semantic knowledge modeling can account for discrete data such as these, in addition to qualitative influences, to answer larger questions about perpetrators, motives and patterns of behavior: What do this suspect’s relationships with these locations tell me about this seemingly unrelated event? Are these property crimes part of an organized effort?
What is semantic knowledge modeling?
Simply put, a knowledge model is a way to abstract disparate data and information. Knowledge modeling describes what data means and where it fits, allowing us to understand and abstract knowledge. Consequently, it helps us to understand how different pieces of information relate to each other.
A semantic model is one kind of knowledge model. It consists of a network of concepts and the relationships between those concepts. A concept is a particular idea or topic with which the user is concerned. In the financial services example above, a concept might be “political unrest”; in the law enforcement example, a concept might be a robbery. Together, the concepts and relationships are often known as an ontology: the semantic model that describes the knowledge.
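The idea of concepts linked by named relationships can be sketched as a small set of subject-predicate-object triples, a representation common across semantic technologies. A minimal sketch in Python follows; all concept and relationship names are invented for illustration and are not part of Thetus Publisher or any other product:

```python
# A toy semantic model: each triple links two concepts via a named relationship.
# Names below (political_unrest, company_a, etc.) are hypothetical examples.
triples = {
    ("political_unrest", "occurs_in", "country_x"),
    ("company_a", "operates_in", "country_x"),
    ("portfolio_1", "holds_stock_of", "company_a"),
}

def related(subject, predicate):
    """Return all concepts linked to `subject` by `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}

# Which countries does company_a operate in?
print(related("company_a", "operates_in"))  # {'country_x'}
```

A production system would add typed concepts, constraints, and persistence, but the core structure is the same: a graph of concepts joined by explicitly named relationships.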
As knowledge changes, the semantic model can change too. For example, robberies may have occurred at various banks. As the number of robberies changes, the model can be updated. Other qualitative data related to those banks, or those robberies, can be fed into a continuously updated model: demographic or population shifts in a bank’s neighborhood, a change in bank ownership, details of the circumstances of a particular robbery. This enriches the knowledge about patterns and influences.
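Continuing the triple sketch, updating the model as knowledge evolves amounts to adding new facts that attach to existing concepts. Again, every name below is a hypothetical illustration, not a real entity or API:

```python
# A toy evolving model: new facts enrich the context around existing concepts.
# All entity names are invented for illustration.
model = {
    ("robbery_17", "occurred_at", "first_national_bank"),
    ("first_national_bank", "located_in", "riverside_district"),
}

def learn(subject, predicate, obj):
    """Record a newly learned fact in the model."""
    model.add((subject, predicate, obj))

# Later, qualitative knowledge arrives and is folded into the same model:
learn("riverside_district", "population_trend", "declining")
learn("first_national_bank", "owned_by", "acme_holdings")

print(len(model))  # 4 facts now connect the robbery to its wider context
```

Because the new facts share concepts with the old ones, queries about the robbery can now reach the demographic and ownership context without any restructuring of the data.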
Semantic models enable users to ask questions of the information in a natural way, help to identify patterns and trends in that information, and reveal relationships between disparate pieces of information.
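Discovering relationships between disparate pieces of information often means chaining relationships across several hops. A hedged sketch, reusing the hypothetical portfolio triples from above, shows how an indirect exposure (event to country to company to portfolio) can be surfaced:

```python
# A toy multi-hop query over a triple store; all names are illustrative.
triples = [
    ("political_unrest", "occurs_in", "country_x"),
    ("company_a", "operates_in", "country_x"),
    ("portfolio_1", "holds_stock_of", "company_a"),
]

def objects(subject, predicate):
    return [o for s, p, o in triples if s == subject and p == predicate]

def subjects(predicate, obj):
    return [s for s, p, o in triples if p == predicate and o == obj]

def exposed_portfolios(event):
    """Chain event -> country -> company -> portfolio."""
    hits = []
    for country in objects(event, "occurs_in"):
        for company in subjects("operates_in", country):
            hits.extend(subjects("holds_stock_of", company))
    return hits

print(exposed_portfolios("political_unrest"))  # ['portfolio_1']
```

No single source records that portfolio_1 is exposed to the unrest; the relationship emerges only by traversing the connected model, which is the kind of discovery the article describes.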
The Thetus Solution
Thetus, a Portland, Oregon-based software company, has developed infrastructure to support these kinds of applications. Its flagship product, Thetus Publisher, is infrastructure software for developing semantic models. The Publisher enables the modeling of complex problems represented in disparate data sets. The product set consists of three core components:
- Model/Ontology Management – which enables users to build ontologies or to import them. The knowledge model provides a layer of abstraction required for users to interact with the information in a natural way. The model is populated with known concepts, facts and relationships and reveals what data means and where it fits in the model.
- Lineage – The knowledge model tracks and records the history or “lineage” of the information, which is important in many analytical applications. Moreover, changes in the model are tracked to enable understanding of changes over time. This helps answer questions such as “when did we learn this?” and “why was this decision made?”
- Workflow – The Thetus workflow engine enables the integration of various analytics including entity extraction, link analysis and geotagging. Workflow is automated based on rules and conditions defined in the model.
Thetus is actively building applications in the defense, intelligence, energy, environmental services and law enforcement verticals. These are problem spaces characterized by data sources that are disparate and distributed, and a knowledge base that is evolving. However, the same technology is relevant to other business verticals as well. While many companies are still struggling to analyze their structured data, there is nonetheless room to apply innovative approaches to analytic applications. And although this technology is in the early adopter stage for many markets, investment in semantic technology on the web and in other industries may help to push it ahead. Hurwitz & Associates plans to keep watching.