23 Apr

Warwick in the Press: Data Science Central, Humans in the Loop

Warwick Analytics has been featured on the leading online publication and blog for Data Scientists, Data Science Central.

The article Humans-in-the-Loop? Which Humans? Which Loop? looks at the different humans that can be incorporated into the automation of a contact centre, in order to minimise the level of human intervention required whilst maximising performance.

Automation of contact centers shows promise, although not without humans-in-the-loop somewhere in the system to maintain performance. There are many different flavors of human-in-the-loop, and with some novel technology appearing, an optimized system is possible with the minimum number of humans and without any data science skills. There is now no reason why the contact centers of the future need to look like those of the present, and the same goes for the possibilities of customer experience.

You can read the full article here

19 Apr

The future is labels – machine learning for text with automated labels

Labels are how humans define and categorise different concepts. There’s lots of evolutionary psychology, neuroscience and linguistics behind this, but without going into that, without labels human (and other animal) intelligence would not be possible, and maybe not artificial intelligence either. Labels are the algebra of everyday life.

But what’s that got to do with AI? As it happens, quite a lot. When we want to understand what people believe or perceive, we do it by analysing their communication, either written or spoken. Let’s say we want to analyse voice of customer text data.

The classical way to approach this is text mining based on keywords and rules to drive topic analysis e.g. using TFIDF or some other kind of ‘vectorization’, and sentiment analysis of the opinion terms.
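As a concrete illustration of that classical keyword-driven approach, here is a minimal TF-IDF sketch in pure Python. The ‘voice of customer’ snippets are invented toy examples, not output from any real system:

```python
# Minimal TF-IDF sketch in pure Python (no external libraries) to
# illustrate the classical keyword-driven approach. The toy
# "voice of customer" snippets below are invented examples.
import math

docs = [
    "delivery was late and the courier was rude",
    "late delivery again, very disappointed",
    "great product, fast delivery",
]

tokenized = [d.lower().replace(",", "").split() for d in docs]

def tf_idf(term, doc_tokens, all_docs):
    tf = doc_tokens.count(term) / len(doc_tokens)      # term frequency
    df = sum(1 for d in all_docs if term in d)         # document frequency
    idf = math.log(len(all_docs) / df)                 # inverse doc frequency
    return tf * idf

# Weight every term in the first document and list the top three
doc = tokenized[0]
weights = {t: tf_idf(t, doc, tokenized) for t in set(doc)}
top = sorted(weights, key=weights.get, reverse=True)[:3]
print(top)
```

Note how "delivery" scores zero (it appears in every document) while filler words like "was" bubble up: exactly the kind of dissonance between statistics and customer intent discussed below.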

There are issues here. Firstly, what are we supposed to do with all the topics? If we build a word cloud, how useful is that? If customers use synonyms which aren’t in a dictionary, do we group these together in advance? We are essentially trying to second-guess and group terms, which might not match the intentions of the customers, or might differ from situation to situation. Sentiment is even more dissonant still, and we haven’t begun to explore the technical challenges of sarcasm, context, comparators and double negatives, which all perform very poorly in such analyses.

So how else are we meant to analyse text data, apart from painfully compiling dictionaries and constant manual checking? Well, say hello to the wonderful world of labels. The labels being referred to here are generated by machine learning, i.e. by replicating human judgment based on a training sample of manually labelled data. The machine doesn’t need to be told keywords; it figures out common patterns which might be much more than single keywords, and might include where they are in the sentence and whether they are nouns or verbs, just as a human might.
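To make the contrast concrete, here is a hedged sketch of label-driven classification: a tiny bag-of-words Naive Bayes model trained on a handful of invented, manually labelled examples. It is a toy stand-in for machine-learnt labelling in general, not any particular product’s method:

```python
# Toy label-driven classifier: multinomial Naive Bayes with Laplace
# smoothing, in pure Python. The training examples and label names
# ("late_shipping", "billing") are invented for illustration.
import math
from collections import Counter, defaultdict

train = [
    ("my parcel arrived two weeks late", "late_shipping"),
    ("delivery took far too long", "late_shipping"),
    ("i was charged twice for one order", "billing"),
    ("refund the duplicate charge please", "billing"),
]

word_counts = defaultdict(Counter)   # label -> word frequencies
label_counts = Counter()
vocab = set()
for text, label in train:
    tokens = text.split()
    word_counts[label].update(tokens)
    label_counts[label] += 1
    vocab.update(tokens)

def predict(text):
    tokens = text.split()
    scores = {}
    for label in label_counts:
        # log prior + smoothed log likelihood of each token
        score = math.log(label_counts[label] / len(train))
        total = sum(word_counts[label].values())
        for t in tokens:
            score += math.log((word_counts[label][t] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("order arrived late"))   # prints: late_shipping
```

No keywords were specified anywhere: the model learnt which patterns distinguish the labels purely from the labelled examples, which is the point of the paragraph above.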

So, if labels are so great, why isn’t everyone using them? The short answer is that it’s expensive. It’s expensive in time because someone with the requisite domain knowledge needs to generate the labels, and even more expensive because a data scientist needs to use those labels to try to generate a signal using various techniques without resorting to ‘data torture’ (i.e. eventually getting out of a dataset what you wanted, even if it isn’t scientifically justifiable). The problem and approach need to be carefully defined; the data cleansed, parsed and filtered to suit the approach; and frankly it takes a great deal of trial and error. Even if a predictive model is generated, it needs to be tuned, tested for stability and then checked and curated carefully over time in case the data and performance change (and they always do in anything interesting!). This explains why labelling from a machine learning point of view is precious and only used sparingly for the highest-value use cases.

Thankfully, this no longer needs to be the case with the latest technologies. Imagine a world where AI-based labelling is cheap and plentiful, where data scientists are not required to tune and drive models.

Welcome to supercharged labelling. The basic premise is that the labelling machine judges its own uncertainty and invites the user to manually label just the items it needs, maximising its performance for the minimum of human intervention. The human in the loop just needs to be someone with domain knowledge, not a data scientist, and the labelling required is ‘just enough’ to achieve the requisite business performance. No data artistry needed. Also, because it invites human intervention when there is uncertainty, it can spot new topics, i.e. give ‘early warning’ of new signals, and keep the models maintained at their requisite performance. If there are differences in labelling, labels can be merged or moved around in hierarchies. If performance at the granular level isn’t high enough, it will choose the coarser level, just as a human might.
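The uncertainty-driven loop can be sketched as follows. The probabilities and the margin threshold are invented toy values standing in for a real classifier’s output:

```python
# Sketch of uncertainty sampling: only the items the model is least
# sure about are routed to a human for labelling. Ticket names and
# probabilities are invented toy values.

def margin(probs):
    # Gap between the two most likely labels; a small margin = unsure.
    top2 = sorted(probs.values(), reverse=True)[:2]
    return top2[0] - top2[1]

predictions = {
    "ticket_1": {"late_shipping": 0.94, "billing": 0.04, "other": 0.02},
    "ticket_2": {"late_shipping": 0.40, "billing": 0.38, "other": 0.22},
    "ticket_3": {"late_shipping": 0.55, "billing": 0.30, "other": 0.15},
}

THRESHOLD = 0.2  # below this margin, ask the human
ask_human = [t for t, p in predictions.items() if margin(p) < THRESHOLD]
print(ask_human)   # only the genuinely ambiguous ticket is escalated
```

Confident predictions (ticket_1, ticket_3) flow through automatically; only the ambiguous one is escalated, which is how the human effort stays minimal.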

We call this technology Optimized Learning. It has been used to address use cases such as complaint handling automation in airlines, generating the topics that cause churn for financial services, assisting chatbots to retrieve the relevant information for a query, measuring the brand attributes of CPG brands and their competitors, and recommending corrective action in machinery and vehicle maintenance.

To spell out the potential savings, suppose a business wants to automate its complaint handling by building a predictive model of categories for queries (i.e. labels). There might be hundreds of categories and, as a data scientist, you might ask for an estimate of the initial labelling set, which could take many man-weeks, with the possibility of not actually finding a signal. Then there’s feature engineering, in itself an iterative activity with no guarantees. If all this takes 6 weeks of labelling, then the latest technology, PrediCX, might typically need 2% of that, i.e. just over half a day, to achieve the same performance. Any time spent in feature engineering is also massively reduced, as you rapidly test and tune with more certainty and a much quicker feedback loop. Furthermore, the time spent curating models disappears from the data science team and is replaced by a minimal amount of labelling when new or ambiguous signals appear. This might be a day or so per year rather than a heavy overhead. You can quickly see that a model that might cost many hundreds of thousands per year in human input might literally cost only a few thousand instead, and be more flexible and powerful in terms of early warning and adaptability.
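The arithmetic behind those figures can be checked in a couple of lines, assuming a 5-day working week:

```python
# Back-of-envelope check of the savings quoted above,
# assuming a 5-day working week.
labelling_weeks = 6
working_days = labelling_weeks * 5              # 30 days of manual labelling
with_optimized_learning = working_days * 0.02   # the ~2% quoted
print(with_optimized_learning)                  # 0.6 days — just over half a day
```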

So now you can see that labels are indeed very powerful in a machine learning context. They move text analytics to the next level and now there are technologies which lower the time, cost and technical skill levels to deploying them. What will you label?

 

12 Apr

Warwick Analytics research finds human-in-the-loop validation critical for chatbot owners

Our independent survey of over 500 chatbot owners and developers reveals the majority are not satisfied with the performance of their chatbots and say human validation is key.

New research from text analytics specialists Warwick Analytics shows that 59% of businesses who have a chatbot are unsatisfied with its performance.

551 professionals involved in the development or management of chatbots were surveyed by Warwick Analytics.

When discussing the technical challenges respondents faced trying to improve their own chatbots, the most common issues were improving containment rates (90%), reducing errors (83%), and developing the responses for the chatbot (79%).

More significantly, an overwhelming 93% believed that human validation and/or curation was important to maintain and improve the performance of their chatbots.

Dan Somers, CEO of Warwick Analytics says: “Achieving the right level of human-in-the-loop input is key for chatbot owners and managers. Human validation is required for accuracy and improvement but if too much is required then a business may as well have a human service desk. It’s all about finding the right technology that minimises the human intervention required but still increases accuracy. Our software PrediCX does exactly that.”

In addition, 21% of respondents who were yet to deploy a chatbot said it was because the performance of chatbots wasn’t acceptable in their opinion.

Warwick Analytics provides machine learning technology to help maintain and improve chatbots using a human-in-the-loop platform called PrediCX accessible via an API.

 

Download the full report for free here

12 Mar

White Paper: Bringing Models to Life. Downloadable Free Copy

Pygmalion was a Greek myth about a sculptor who brought one of his sculptures to life. The myth has become reality: modern-day Pygmalions live in the realm of data science, where they are deploying AI to bring automation and autonomy to many facets of our lives.

Whilst there are a lot of fanciful headlines and hyperbole about the latest algorithm, the reality is that to deploy a machine learning model in an operational environment, it needs to be trained well on relevant data, and if the environment changes, to continue to be trained so that it adapts.

In the world of customer interactions and customer experience, many machine learning techniques are being applied, e.g. to try to automate customer services, contact centers and processes, as well as to garner insight from the ever-growing ocean of voice of customer data such as surveys, complaints, reviews, call logs and social media. Building accurate predictive models is hard enough, but the signals are also changing all the time in both nature and mix: new products get launched, new ways of talking about the same things appear, new channels require new data structures (e.g. business chat and chatbots) and, perhaps more significantly, customer expectations are changing all the time, sometimes driven by experiences outside the industry in question. For example, it is no exaggeration to say that the simplification of devices by the likes of Apple and the ease of shopping from Amazon have led to a change in expectation, and indeed the expectation of change. In a recent study by Accenture, only 7% of brands exceed customer expectations and 25% do not meet them.

This leaves the machine learning experts in a quandary: How can businesses develop machine learning models which automate processes and contact centers not just today, but reliably ongoing? How can they get continually rich insight from models when the data are changing around them?
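One simple way to picture the changing-signals problem is to compare term frequencies between a historical window and a recent one, flagging large shifts as candidate new topics. The snippets and the shift threshold below are invented for illustration, and a production system would use something far more robust:

```python
# Toy drift detector: compare relative term frequencies in a baseline
# window against a recent window and surface terms that have surged.
# All snippets below are invented examples.
from collections import Counter

def freqs(docs):
    c = Counter(w for d in docs for w in d.lower().split())
    total = sum(c.values())
    return {w: n / total for w, n in c.items()}

baseline = ["phone broken again", "screen broken", "phone will not charge"]
recent   = ["chat bot unhelpful", "bot gave wrong answer", "screen broken"]

old, new = freqs(baseline), freqs(recent)
shift = {w: new[w] - old.get(w, 0.0) for w in new}
emerging = [w for w, s in sorted(shift.items(), key=lambda kv: -kv[1]) if s > 0.05]
print(emerging)   # terms that have surged in the recent window
```

Here complaints about a bot surge in the recent window while the old hardware topics recede: the kind of shift a model trained only on the baseline would silently misclassify.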

 

Download the full White Paper for free here

 


05 Mar

Watch our presentation from Energized Labs: Machine Learning with Human-in-the-Loop

André Louçã presents a thought-provoking talk in this Energized Labs video, detailing how the path of Warwick Analytics and machine learning has changed and developed over time.

Watch the video now to hear André explain the main technology developed by his team, and see a demonstration of PrediCX, showing that there is no need for huge teams of labellers when a single person can maintain the model and provide trustworthy output.

 

23 Feb

Warwick Analytics featured in Global Predictive Maintenance Market Research Report

The report analyzes competition and the latest developments in the Predictive Maintenance market. The global market is segmented by key manufacturers, growth rate, revenue, and ongoing research and development. In addition, it highlights emerging opportunities for companies in the market. Notable manufacturers covered in the report include Warwick Analytics (PrediCX), SKF, PTC, Robert Bosch, SAP SE, IBM, General Electric, Rockwell Automation, Software AG and RapidMiner. The report also includes a detailed analysis of key market segments and sub-segments.

From a geographical perspective, the report studies the Predictive Maintenance market across North America, Europe, Latin America, Asia Pacific, and the Middle East and Africa, with each regional market benefiting from its established Predictive Maintenance framework and level of digitization in the sector.

You can get a sample copy of the report here.

 

09 Feb

Digital Transformation or just better FAQs?

A lot has been made of digital transformation, and of how many businesses are using self-serve web-based applications to engage with their customers, employees and other stakeholders, enhancing and in some cases reinventing the customer experience, often with both a stickier customer journey and lower service costs. Uber and AirBnB are often held up as the poster children, but many businesses that are not ‘digital native’ companies are emulating them in their own way.

As with so many buzzphrases, there is usually a less sexy way of saying the same thing which has been around for a long time. In the field of customer interaction, most people will think of digital transformation as the growth in chatbots and social media-enabled communication. However, I would argue that the main bastion of change has to be directed at FAQs.

Sexy or not, FAQs used to be the only way to find self-help and avoid calling a contact center. They are frequently cited as inherently flawed, as in these blogs from the UK Government and, eloquently, in this technical writer’s blog. Yet if you stop and think, a well-structured FAQ, if it is searchable with natural language, is a critical asset: it is really the same thing as a chatbot, but perhaps without the charm or manners.

In a more measured manner, FAQs are really part of a spectrum of communication channels (one-way and two-way) through which customers can solve problems. They sit alongside forums, social media, chatbots, chat, phone and email (see diagram below).

Surveys and reviews can also trigger interactions, depending on their content. The current state of most organisations is that all these elements are separate silos, and whilst customer experience teams are trying hard to break those silos down, few would see FAQs as being on the same spectrum as chatbots and forums. People also expect FAQs to be a laundry list of requests, which is not how they want to interact. Imagine, though, if you could write your query in any way you wanted into a search bar and it retrieved the correct response. Imagine also that the search was entirely consistent across all channels. Is this just a fantasy?

Machine learning for text can classify interactions in order to automate responses to natural language. However, as we see with chatbot fails, this is hard to get right due to the complexity and variability of human dialogue, and chatbot containment rates are still below where their proprietors would want them to be. A further complexity is that human dialogue varies immensely across channels: people don’t write emails the way they use chat, which is different again to forums, nor the way they speak, write a complaint, or even fill in a survey. By way of example, a study at an airline found that the average number of topics in a chat was just over one, whereas in a call it was nearer two, and in a complaint two and a half. People use different channels for different things, and also use different channels for the same thing in different ways.

If the company is trying to classify (aka tag or ‘label’) each interaction, it will very easily fall into the trap of having different categories or tags for different channels, not by design but because it is hard to normalise them, whatever technology you’re using. This phenomenon doesn’t really have a formal name, but it is rife and disruptive. The ideal is some kind of ‘homogenization’ of the tags, i.e. so that “late shipping” is the same concept whatever the channel. This then allows the guardians of the customer journey to understand what’s going wrong (and right), get a global view, and also understand whether customers are calling back about the same thing on a different channel because they didn’t get it resolved. It also means that the customer journey and knowledge base can be fixed once for each breach, in the knowledge that the fix applies across the board.
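The ‘homogenization’ of tags can be sketched as a mapping from per-channel tags onto one canonical label set. The channel tag names below are invented for illustration:

```python
# Sketch of tag homogenization: per-channel tags map onto one canonical
# label set so "late shipping" means the same thing everywhere.
# All channel/tag names here are invented examples.

CANONICAL = {
    ("email",  "delivery-delay"): "late_shipping",
    ("chat",   "where_is_order"): "late_shipping",
    ("survey", "slow delivery"):  "late_shipping",
    ("email",  "refund-request"): "billing",
    ("chat",   "double_charged"): "billing",
}

def homogenize(channel, tag):
    # Unmapped tags are surfaced rather than silently dropped.
    return CANONICAL.get((channel, tag), "unmapped")

# The same underlying issue, reported through three different channels,
# rolls up to a single label for a global view of the customer journey.
labels = {homogenize(ch, t) for ch, t in
          [("email", "delivery-delay"), ("chat", "where_is_order"),
           ("survey", "slow delivery")]}
print(labels)   # {'late_shipping'}
```

A static mapping like this is only the end state, of course; the hard part the next paragraph describes is building and maintaining the per-channel models that produce the tags in the first place.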

Machine learning can help this harmonization process, although it is fraught with challenges, not least because the models for each tag need to be built separately for each channel: the “late shipping” tag for chat will need a different model to the “late shipping” tag for email or complaints. What data scientists know is that the process of building machine learning models is intense: The New York Times estimated that up to 80% of a data scientist’s time is spent “data wrangling”, and CrowdFlower puts “data preparation” at 80%. Further, assumptions and errors are an inevitable part of the process, where human judgement and skill are required. More than this, 76% of data scientists view data preparation as the least enjoyable part of their work. Furthermore, someone needs to build a training set for the models, which typically involves a human somewhere labelling the various interactions into topics and a topology that can drive the correct response. This is laborious, and the effort scales linearly with the data.

There are a number of different approaches to this problem. One company addressing it in a novel way is Warwick Analytics, a spin-out from The University of Warwick. It has developed a proprietary technology called ‘Optimized Learning’ which puts a ‘human-in-the-loop’ in a very effective way: the technology classifies the customer interactions in a meaningful way, but when its certainty is low, it asks a human to classify or ‘label’ the interactions which feed the most information back into training the models. It is therefore theoretically and practically guaranteed to involve the minimum human interaction to maximise the performance of the models, and hence the accuracy. The human trainer can work offline, and in certain circumstances the customer themselves can be involved. The company has worked with many enterprises to improve chatbots, automate contact centers and complaints handling, and improve the quality of self-service and FAQs.

So in conclusion, FAQs are an old-fashioned and much discredited digital experience, yet in the new world of digital transformation and harmonization, they can come back to center stage thanks to some clever technology and the human-in-the-loop.

 

08 Feb

Warwick Analytics in the Press: Data Science Central, Data Scientists Need Designer Labels Too

Warwick Analytics has been featured on Data Science Central discussing how ‘Data Scientists need Designer Labels Too’.

 

Overview

When we want to understand what people believe or perceive, we do it by analysing their communication, either written or spoken. Let’s say we want to analyse voice of customer text data.

The classical way to approach this is text mining based on keywords and rules to drive topic analysis e.g. using TFIDF or some other kind of ‘vectorization’, and sentiment analysis of the opinion terms.

Thankfully, this no longer needs to be the case with the latest technologies. Imagine a world where AI-based labelling and machine learning for text is cheap and plentiful, where data scientists are not required to tune and drive models.

There are issues here. Firstly, what are we supposed to do with all the topics? If we build a word cloud, how useful is that? If customers use synonyms which aren’t in a dictionary, do we group these together in advance? We are essentially trying to second-guess and group terms, which might not match the intentions of the customers, or might differ from situation to situation. Sentiment analysis is even more dissonant still, and we haven’t begun to explore the technical challenges of sarcasm, context, comparators and double negatives, which all perform very poorly in such analyses.

So how else are we meant to analyse text data, apart from painfully compiling dictionaries and constant manual checking? Well, say hello to the wonderful world of labels. The labels being referred to here are generated by machine learning, i.e. by replicating human judgment based on a training sample of manually labelled data. The machine doesn’t need to be told keywords; it figures out common patterns which might be much more than single keywords, and might include where they are in the sentence and whether they are nouns or verbs, just as a human might.

You can read the full article here


© 2017 Warwick Analytics. All rights reserved. Registered in England & Wales. Number 07724630. Registered address 35 Kingsland Road, London, E2 8AA. VAT 120435168.
