This week marks the one-year anniversary of the IBM Watson computer system’s victory on Jeopardy! Since then, IBM has received a lot of interest in Watson. Companies want one of those.
But what exactly is Watson and what makes it unique? What does it mean to have a Watson? And how is commercial Watson different from Jeopardy Watson?
What is Watson and why is it unique?
Watson is a set of technologies that processes and analyzes massive amounts of both structured and unstructured data in a unique way. One statistic given at the recent IOD conference is that Watson can process and analyze information from 200 million books in three seconds. While Watson is very advanced, it uses technologies that are commercially available, along with some “secret sauce” technologies that IBM Research has either enhanced or developed. It combines software technologies from big data, content and predictive analytics, and industry-specific software to make it work.
Watson includes several core pieces of technology that make it unique
So what is this secret sauce? Watson understands natural language, generates and evaluates hypotheses, and adapts and learns.
First, Watson uses Natural Language Processing (NLP). NLP is a very broad and complex field, which has developed over the last ten to twenty years. The goal of NLP is to derive meaning from text. NLP generally makes use of linguistic concepts such as grammatical structures and parts of speech. It breaks apart sentences and extracts information such as entities, concepts, and relationships. IBM is using a set of annotators to extract information like symptoms, age, location, and so on.
So NLP by itself is not new. However, Watson processes vast amounts of this unstructured data quickly, using an architecture designed for that purpose.
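To make the idea of an annotator concrete, here is a minimal sketch in Python. It is not how Watson's annotators work. It only illustrates pulling an age and a few symptoms out of free text with a regular expression and a small word list. The function name `annotate`, the `SYMPTOM_TERMS` vocabulary, and the sample note are all invented for this illustration.

```python
import re

# Hypothetical vocabulary for illustration; a real annotator would rely on
# curated medical lexicons and full linguistic analysis, not a short list.
SYMPTOM_TERMS = ["heart palpitations", "fatigue", "hair loss", "muscle weakness"]

# Matches phrases such as "42-year-old" or "42 year old".
AGE_PATTERN = re.compile(r"\b(\d{1,3})[- ]year[- ]old\b", re.IGNORECASE)


def annotate(text):
    """Return the age and the known symptom terms mentioned in free text."""
    lowered = text.lower()
    symptoms = [term for term in SYMPTOM_TERMS if term in lowered]
    age_match = AGE_PATTERN.search(text)
    age = int(age_match.group(1)) if age_match else None
    return {"age": age, "symptoms": symptoms}


note = "A 42-year-old patient reports fatigue and hair loss."
print(annotate(note))
# {'age': 42, 'symptoms': ['fatigue', 'hair loss']}
```

Simple string matching like this misses negation ("no fatigue") and synonyms, which is part of why real NLP leans on grammar and parts of speech.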
Second, Watson works by generating hypotheses, which are potential answers to a question. It is trained by feeding question and answer (Q/A) data into the system. In other words, it is shown representative questions and learns from the supplied answers. This is called evidence-based learning. The goal is to generate a model that can produce a confidence score (think logistic regression with a bunch of attributes). Watson would start with a generic statistical model, then look at the first Q/A pair and use it to tweak the coefficients. As it gains more evidence, it continues to tweak the coefficients until it can “say” that confidence is high. Training Watson is key, since what is really happening is that the trainers are building statistical models that are scored. At the end of the training, Watson has feature vectors and models that it can use to probabilistically score the answers. The key here is something that Jeopardy! did not showcase: Watson is not deterministic (that is, it does not rely on fixed rules). Watson is probabilistic, and that makes it dynamic.
When Watson generates a hypothesis, it then scores the hypothesis based on the evidence. Its goal is to get the right answer for the right reason. (So, theoretically, if there are 5 symptoms that must be positive for a certain disease and 4 that must be negative and Watson only has 4 of the 9 pieces of information, it could ask for more.) The hypothesis with the highest score is presented. By the end of the analysis, Watson knows with confidence whether it has the answer or not.
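The “logistic regression with a bunch of attributes” comparison can be sketched in a few lines of Python. This is a toy under stated assumptions, not Watson's training procedure: the two features (keyword overlap and source reliability), the four training examples, and the learning rate are all made up. It only shows coefficients starting from a generic model and being nudged by each labeled example until the confidence scores separate right answers from wrong ones.

```python
import math


def sigmoid(z):
    """Squash any real number into a confidence between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def confidence(weights, bias, features):
    """Logistic regression score: how confident the model is in a candidate answer."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def update(weights, bias, features, label, learning_rate=0.5):
    """Tweak the coefficients using one labeled example (one gradient step)."""
    error = label - confidence(weights, bias, features)
    new_weights = [w + learning_rate * error * x for w, x in zip(weights, features)]
    new_bias = bias + learning_rate * error
    return new_weights, new_bias


# Invented training data: [keyword_overlap, source_reliability] -> 1 if the
# candidate answer was correct, 0 if it was wrong.
training = [
    ([0.9, 0.8], 1),
    ([0.2, 0.3], 0),
    ([0.8, 0.9], 1),
    ([0.1, 0.4], 0),
]

# Start from a generic model that is 50% confident about everything.
weights, bias = [0.0, 0.0], 0.0
for _ in range(200):
    for features, label in training:
        weights, bias = update(weights, bias, features, label)

print(confidence(weights, bias, [0.9, 0.8]))  # well-supported candidate
print(confidence(weights, bias, [0.1, 0.3]))  # poorly supported candidate
```

After training, the well-supported candidate scores above 0.5 and the poorly supported one below it. No rule in the code says which answers are right; the scores come entirely from the learned coefficients.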
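The parenthetical case above, where only 4 of 9 required findings are known, can be sketched as follows. This is a deliberately simple illustration, not Watson's scoring method: the disease profile, the finding names, the fraction-matched score, and the 0.7 threshold are all assumptions chosen for the example.

```python
# Hypothetical disease profile: 5 findings that must be positive (True)
# and 4 that must be negative (False).
PROFILE = {
    "palpitations": True,
    "fatigue": True,
    "hair_loss": True,
    "muscle_weakness": True,
    "weight_loss": True,
    "fever": False,
    "rash": False,
    "cough": False,
    "joint_pain": False,
}


def evaluate(hypothesis, observations, min_confidence=0.7):
    """Score a hypothesis against the known findings.

    Returns (status, score, missing), where status is "answer", "ask"
    (more information is needed), or "no answer".
    """
    missing = [f for f in hypothesis if f not in observations]
    matches = sum(
        1 for f in hypothesis
        if f in observations and observations[f] == hypothesis[f]
    )
    # Unknown findings contribute nothing to the score.
    score = matches / len(hypothesis)
    if score >= min_confidence:
        return "answer", score, missing
    if missing:
        return "ask", score, missing
    return "no answer", score, missing


# Only 4 of the 9 findings are known so far.
known = {"palpitations": True, "fatigue": True, "hair_loss": True, "muscle_weakness": True}
status, score, missing = evaluate(PROFILE, known)
print(status, missing)
# ask ['weight_loss', 'fever', 'rash', 'cough', 'joint_pain']
```

With all nine findings supplied and matching, the same call returns "answer". With all nine supplied but most contradicting the profile, it returns "no answer" rather than guessing.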
Here’s an example. Suppose you are not feeling well. Specifically, you might have heart palpitations, fatigue, hair loss, and muscle weakness. You decide to go see a doctor to determine if there is something wrong with your thyroid or if it is something else. If your doctor has access to a Watson system, then he could use it to help advise him regarding your diagnosis. In this case, Watson would already have ingested and curated all of the information in books and journals associated with thyroid disease. It would also have the diagnoses and related information for other patients from this hospital and from other doctors in the practice, drawn from the electronic medical records of prior cases in its data banks. Based on the first set of symptoms you report, it would generate hypotheses along with the probability associated with each one (e.g. 60% hyperthyroidism, 40% anxiety). It might then ask for more information. As it is fed this information, for example patient history, Watson would continue to refine its hypotheses along with the probability of each being correct. After it has been given all of the information, iterated through it, and presented the diagnosis with the highest confidence level, the physician would use this information to help make the diagnosis and develop a treatment plan. If Watson doesn’t know the answer, it will state that it does not have an answer or doesn’t have enough information to provide one.
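One textbook way to picture probabilities being refined as new information arrives is Bayes’ rule. The sketch below uses it purely as an illustration; nothing here says this is the calculation Watson performs. The starting 60/40 split comes from the example above, while the function name `bayes_update` and the likelihood numbers are invented.

```python
def bayes_update(probabilities, likelihoods):
    """Reweight each hypothesis by how well it explains a new piece of evidence.

    probabilities: current belief in each hypothesis (should sum to 1).
    likelihoods: assumed chance of seeing the new evidence under each hypothesis.
    """
    unnormalized = {h: probabilities[h] * likelihoods[h] for h in probabilities}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Starting point taken from the example in the text.
beliefs = {"hyperthyroidism": 0.6, "anxiety": 0.4}

# Invented likelihoods: how often hair loss might accompany each condition.
hair_loss = {"hyperthyroidism": 0.5, "anxiety": 0.1}

beliefs = bayes_update(beliefs, hair_loss)
for hypothesis, probability in beliefs.items():
    print(f"{hypothesis}: {probability:.0%}")
```

Each new piece of evidence shifts weight toward the hypothesis that explains it better, and the probabilities always sum to one.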
IBM likens the process of training a Watson to teaching a child how to learn. A child can read a book to learn. However, he can also learn by a teacher asking questions and reinforcing the answers about that text.
Can I buy a Watson?
Watson will be offered in the cloud in an “as a service” model. Since Watson is in its own class, let’s call this Watson as a Service (WaaS). Since Watson’s knowledge is essentially built in tiers, the idea is that IBM will provide the basic core knowledge in a particular WaaS solution space, say all of the corpus about a particular subject – like diabetes – and then different users could build on this.
For example, in September IBM announced an agreement to create the first commercial applications of Watson with WellPoint – a health benefits company. Under the agreement, WellPoint will develop and launch Watson-based solutions to help improve patient care. IBM will develop the base Watson healthcare technology on which WellPoint’s solution will run. Last month, Cedars-Sinai signed on with WellPoint to help develop an oncology solution using Watson. Cedars-Sinai’s oncology experts will help develop recommendations on appropriate clinical content for the WellPoint health care solutions. They will assist in the evaluation and testing of these tools. In fact, these oncologists will “enter hypothetical patient scenarios, evaluate the proposed treatment options generated by IBM Watson, and provide guidance on how to improve the content and utility of the treatment options provided to the physicians.” Wow.
Moving forward, picture potentially large numbers of core knowledge bases that are trained and available for particular companies to build upon. This would be available in a public cloud model and potentially a private one as well, but with IBM involvement. This might include Watsons for law or financial planning or even politics (just kidding) – any area where there is a huge corpus of information that people need to wrap their arms around in order to make better decisions.
IBM is now working with its partners to figure out what the user interface for these Watsons as a Service might look like. Will Watson ask the questions? Can end users, say doctors, put in their own information for Watson to use? This remains to be seen.
Ready for Watson?
In the meantime, IBM recently rolled out “Ready for Watson.” The idea is that a move to Watson might not be a linear progression. It depends on the business problem that companies are looking to solve. So IBM has tagged certain of its products as “ready” to be incorporated into a Watson solution. IBM Content and Predictive Analytics for Healthcare is one example of this. It combines IBM’s content analytics and predictive analytics solutions, which are components of Watson. Therefore, a company that used this solution could migrate it to a Watson as a Service deployment down the road.
So, happy anniversary, IBM Watson! You have many people excited and some people a little bit scared. For myself, I am excited to see where Watson is on its first anniversary and am looking forward to seeing what progress it has made by its second.