
A3 is the world’s first automated predictive analytics platform.

It is a large calculating engine, accessed via an API that can work within any Big Data framework. It contains the predictive and other advanced analytical workflows required to handle and automate complex analytics. It is ideal for data scientists and analysts in any sector who wish to speed up the data transformation required, particularly with heterogeneous datasets.

There are many data science workflows for different analyses built and parallelized within A3:

Explanatory Analysis

Look for ‘explanations’, finding the factors and combinations of factors which best provide insight in order to point to the root cause.

Association Rules

Make decisions on pricing, promotions and store/web location.

Classification

Classify whether a record (which could represent a customer, vehicle, patient) is likely to be in one outcome or another.

Cyclic Graph

Automate predictive analytics within cyclic graph data. This enables you to data mine networks and relationships between entities.

Clustering

Group similar entities (e.g. customers, patients, vehicles etc.) without a definite guide as to what those groups are targeting.

Semantic Analysis

Handle text to generate summaries of documents, find specific terms, score whether an entity is rated good or bad.

Anomaly Detection

Look for features which are ‘odd’ or ‘out of the norm’ and which might indicate fraudulent behavior or alert to something which might go wrong.

More about A3

Anomaly Detection

Anomaly Detection looks for features which are ‘odd’ or ‘out of the norm’ and which might indicate, for example, fraudulent behavior, or alert to something which might go wrong in a mechanical system even though there is no specific trigger threshold for it.

Anomaly Detection can be ‘supervised’ or ‘unsupervised’ (or indeed semi-supervised). Arguably, unsupervised is more powerful, as it suggests signals which look odd without necessarily having seen them before or having a reference ‘decision marker’ for those features. It can be supervised (or trained) afterwards to avoid ‘information overload’, i.e. too many alarms and alerts. Potentially it can do this by being combined with other techniques such as clustering to ‘mute’ classes of anomalies or features within the data which should not trigger an alarm. Use cases might be looking for fraud and error, or early indications of issues occurring in a live system or process, e.g. manufacturing or operating machinery. There are many algorithms for this category, and many proprietary algorithms and suites, e.g. coming out of the finance, fraud and IT security industries.
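As a hedged illustration only (A3’s own workflows are not shown here), the sketch below runs an unsupervised anomaly detector using scikit-learn’s IsolationForest, one general-purpose algorithm in this category; the data and the contamination setting are invented assumptions.

```python
# A minimal sketch of unsupervised anomaly detection with scikit-learn's
# IsolationForest -- one possible algorithm, not A3's internal workflow.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)
normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))   # typical behavior
odd = rng.uniform(low=-6, high=6, size=(10, 2))          # a few 'out of the norm' points
X = np.vstack([normal, odd])

# contamination is a judgement call: roughly how many anomalies we expect to tolerate
model = IsolationForest(contamination=0.02, random_state=42)
labels = model.fit_predict(X)      # -1 = anomaly, 1 = normal

print("records flagged as anomalous:", np.sum(labels == -1))
```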

Semantic Analysis

This is a huge area covering text or content analytics with many different use cases, e.g. summary annotation, entity search/recognition, sentiment analysis and others. In plain English these respectively refer to generating summaries of documents, finding specific terms, and scoring whether an entity is rated good or bad. It is not strictly predictive analytics, but it is worthy of note here due to the increasing challenges faced by data scientists in trying to apply predictive analytics to text.

There are many algorithms, techniques and suites dedicated to each of these use cases, too many to cover specifically here. However, the main point is that text analytics is typically a separate area from predictive analytics, such that deriving predictive analytics from unstructured or heterogeneous data means having different software packages (and sometimes even different teams of data scientists) to pre-process the text and structure it prior to analysis. Dictionaries need to be built up, and often Natural Language Processing (NLP) is used. Joining text analytics to predictive analytics is one of the main challenges, and the ability to do this in an automated way is one of the holy grails of data science.
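As an illustrative sketch only, and assuming a general-purpose library such as scikit-learn rather than anything specific to A3, the example below shows the kind of pre-processing step described above: turning free text into structured numeric features which a simple sentiment classifier can then score. The texts and labels are invented.

```python
# A minimal sketch, assuming scikit-learn: unstructured text is given a fixed
# numeric structure (TF-IDF), after which an ordinary classifier scores sentiment.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "excellent service, would recommend",
    "terrible delay and poor support",
    "good value and fast delivery",
    "bad experience, item arrived broken",
]
labels = [1, 0, 1, 0]  # 1 = rated good, 0 = rated bad

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["support was excellent"]))  # expected: rated good
```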

Forecasting

The newcomer might be tempted to think that all predictive analytics is forecasting, and indeed many predictive techniques can be used to forecast. Yet the terminology typically refers to time-based predictions in particular, e.g. future economic growth, patient demand, customer footfall etc. It also typically refers to the use of regression techniques against other indicators and parameters in order to drive the model. Whilst the use case is subject to the same challenges as elsewhere, there are specific data transformation challenges, e.g. regressing variables with different time frequencies.
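By way of a hedged sketch (using pandas and scikit-learn, which the text does not prescribe), the example below regresses a monthly demand series against an indicator observed weekly, resampling the indicator so that the two different time frequencies line up before the regression is fitted. All series are invented.

```python
# A minimal sketch of regression-based forecasting with mixed time frequencies.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

np.random.seed(0)
idx_month = pd.date_range("2016-01-01", periods=24, freq="MS")
demand = pd.Series(100 + np.arange(24) * 2.0 + np.random.randn(24), index=idx_month)

idx_week = pd.date_range("2016-01-01", periods=104, freq="W")
indicator = pd.Series(np.random.randn(104).cumsum(), index=idx_week)

# Align the weekly indicator to the monthly frequency of the target series.
indicator_monthly = indicator.resample("MS").mean().reindex(idx_month).ffill()

X = pd.DataFrame({"indicator": indicator_monthly,
                  "lag_1": demand.shift(1)}).dropna()   # last month's demand as a predictor
y = demand.loc[X.index]

model = LinearRegression().fit(X, y)

# One-step-ahead forecast using the latest observed values.
next_features = pd.DataFrame({"indicator": [indicator_monthly.iloc[-1]],
                              "lag_1": [demand.iloc[-1]]})
print("one-step-ahead forecast:", model.predict(next_features)[0])
```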

Explanatory Analysis

This is a use case which is looking for ‘explanations’, i.e. from a business use case, finding the factors and combinations of factors which best provide insight in order to point to the root cause (and therefore can be used to predict and prevent). Some readers might comment that this sounds similar to, and can be resolved by, the classification use cases above. In some limited cases this is correct, but in many others it is not: there are many instances where some parameters correlate well with events (i.e. good classification) but offer poor ‘explanations’ or insight. For example, at a superficial business level it might be easy to see that red cars of a certain vintage and model have the strongest correlation with a particular set of failures, but this offers no insight as to the explanation; it might provide a hint, or it might be a self-defining correlation. Getting to a genuine explanation involves a lot more than just the data within the dataset, although the technical reasons are outside the scope of this document. Suffice to say that Explanatory Analysis is a particular, but very significant, variant of Classification which presents a significant analytical challenge, again highlighting the difficulties confronting a data scientist.
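As a loose illustration of the distinction above, and assuming scikit-learn rather than A3’s own workflow, the sketch below fits a classifier to invented vehicle records and lists the factors it leans on most heavily; as noted, a high score indicates correlation with the outcome, not necessarily a root-cause explanation.

```python
# A minimal sketch: surfacing candidate explanatory factors from a trained
# classifier. Column names and records are invented; importance scores show
# correlation with the outcome, which may be a hint rather than a root cause.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

data = pd.DataFrame({
    "colour_red":   [1, 1, 0, 0, 1, 0, 1, 0],
    "vintage_2009": [1, 1, 1, 0, 0, 0, 1, 0],
    "mileage_high": [1, 0, 1, 0, 1, 0, 0, 1],
    "failure":      [1, 1, 1, 0, 0, 0, 1, 0],
})
X, y = data.drop(columns="failure"), data["failure"]

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
for name, score in sorted(zip(X.columns, tree.feature_importances_),
                          key=lambda pair: -pair[1]):
    print(f"{name}: {score:.2f}")
```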

Clustering

This is a broad category also known as unsupervised classification (“Segmentation” also generally falls into this category, although it can also be supervised classification). In other words, it tries to group similar entities (e.g. customers, patients, vehicles etc.) without a definite guide as to what those groups are targeting. So, in contrast to Classification, it doesn’t have ‘decision-markers’. To illustrate this further, if we are not grouping patients by the known outcomes of a subset of the patients, then we need to group them in some other intrinsic, and potentially subjective, way based on their characteristics. It is therefore inherently much more difficult to validate, as there is no right and wrong answer, although constraints such as the number of clusters can be imposed. Perhaps surprisingly to the newcomer, it can be incredibly useful if, instead of a heterogeneous mess, one is able to group together similar entities into a manageable number of clusters.

The diversity of techniques, and of the algorithms for each technique, is even wider than for classification, and highlights further the ‘art’ involved as well as the work required to perform analyses. Many of these algorithms employ a ‘distance function’ so that entities that are similar are ‘nearer’ to each other than to those that are more dissimilar. The kinds of business use cases might be the grouping of similar customer reviews or even species of animals. There are many cases where unstructured data would be part of the input, which further highlights the additional analyses, decisions and subjectivity involved.
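A minimal sketch of distance-based clustering, assuming scikit-learn and invented customer characteristics: the features are standardised so that the distance function treats them comparably, and the number of clusters is imposed as a constraint.

```python
# A minimal sketch of clustering (unsupervised): no outcomes are given,
# entities are simply grouped by similarity under a distance function.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
# invented customer characteristics: annual spend and visits per month
X = np.vstack([
    rng.normal([200, 2], [30, 0.5], size=(50, 2)),    # occasional shoppers
    rng.normal([1500, 10], [200, 2], size=(50, 2)),   # frequent shoppers
])

X_scaled = StandardScaler().fit_transform(X)
# the number of clusters is a constraint we impose, not something 'right or wrong'
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_scaled)
print("cluster sizes:", np.bincount(kmeans.labels_))
```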

Classification

Classification refers to any use case where records are labelled into different classes. Whilst in itself it does not sound inherently ‘predictive’, it is one of the most common types of predictive analytics use case, primarily for classifying whether a record (which could represent a customer, vehicle, patient) is likely to fall into one outcome or another. The classes are the outcomes: a model is usually ‘trained’ (e.g. on known patients with known outcomes) and then used to ‘predict’ (e.g. on new records of patients with unknown outcomes). The various parameters and other data associated with the records are used in the calculation.

This use case can be addressed with different techniques (e.g. decision trees, logistic regression, neural networks) and algorithms (e.g. for decision trees there are CART, C4.5 etc.). It is ‘supervised’, meaning that there are decision-markers in the datasets to train the models (i.e. in the example above it is the outcomes of the patients, which might be binary or follow another distribution).
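As a hedged sketch of supervised classification in general (not A3’s implementation), the example below trains a decision tree on records with known outcomes and then scores it on held-out records whose outcomes it has not seen; the dataset is synthetic.

```python
# A minimal sketch of supervised classification: train on records with known
# outcomes (decision-markers), then predict outcomes for unseen records.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# stand-in for e.g. patient records with a known binary outcome
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_new, y_train, y_new = train_test_split(X, y, test_size=0.25, random_state=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)   # 'training'
print("accuracy on unseen records:", model.score(X_new, y_new))        # 'prediction'
```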

Selecting the right algorithm for the job involves intrinsic subjectivity as well as the personal preferences of the data scientist, and it can depend on the data structure (see below) as well as the business use case. Stable efficacy might be what matters in one situation, whereas in another the ‘understanding’ of the predictive factors might be important. An illustration of this is neural networks, which might be great at recognizing handwriting but provide a complex and unintelligible set of rules as to how they do it; in another situation they might not provide efficacy at all.

One thing that all data scientists will agree on is that dealing with real-world situations often means dealing with heterogeneous, dirty, incomplete and/or unstructured data. As noted above in Section 3, the majority of data science time is spent transforming data to get it into a shape that conventional classification techniques can deal with. Being able to automate this whole use case brings great benefit to the data scientist and business user alike, as we shall explore further.
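To illustrate the kind of transformation work referred to above, the sketch below (assuming scikit-learn; the column names and records are invented) imputes missing values and encodes a mixed-type table so that a conventional classifier can consume it.

```python
# A minimal sketch of data transformation for dirty, heterogeneous records:
# impute missing values, encode categories, then fit an ordinary classifier.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

records = pd.DataFrame({
    "age":     [34, np.nan, 58, 41, 29, np.nan],
    "region":  ["north", "south", np.nan, "south", "north", "east"],
    "outcome": [1, 0, 1, 0, 0, 1],
})
X, y = records.drop(columns="outcome"), records["outcome"]

numeric = Pipeline([("impute", SimpleImputer(strategy="median")),
                    ("scale", StandardScaler())])
categorical = Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                        ("encode", OneHotEncoder(handle_unknown="ignore"))])

prep = ColumnTransformer([("num", numeric, ["age"]),
                          ("cat", categorical, ["region"])])
model = Pipeline([("prep", prep), ("clf", LogisticRegression())]).fit(X, y)
print(model.predict(X.head(2)))
```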

Association Rules

The simplest business use case to understand here is analysing the products that a consumer buys in a retailer, e.g. if they buy eggs, do they also buy milk and bread? This allows retailers to make decisions on pricing, promotions and store/web location. Some well-known algorithms include Apriori, Eclat and FP-Growth, but they only do half the job, since they are algorithms for mining frequent itemsets. The data scientist still needs a further step to generate rules (each composed of two different sets of items) from the frequent itemsets found in a database: there are various parameters that need to be decided by the data scientist based on judgement for the domain, e.g. the strength of the itemset associations and the confidence thereof.
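The two-step process can be sketched as follows, assuming the mlxtend library as one possible open-source implementation of Apriori (the text does not prescribe it): frequent itemsets are mined first, and rules are then generated from them using support and confidence thresholds chosen by the data scientist. The transactions are invented.

```python
# A minimal sketch of association rule mining in two steps.
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# one row per basket, one column per product (invented transactions)
baskets = pd.DataFrame({
    "eggs":  [1, 1, 0, 1, 1],
    "milk":  [1, 1, 1, 0, 1],
    "bread": [1, 0, 1, 1, 1],
}).astype(bool)

itemsets = apriori(baskets, min_support=0.4, use_colnames=True)              # step 1: frequent itemsets
rules = association_rules(itemsets, metric="confidence", min_threshold=0.7)  # step 2: rules
print(rules[["antecedents", "consequents", "support", "confidence"]])
```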

Cyclic Graph

This workflow automates predictive analytics within cyclic graph data, enabling you to data mine networks and the relationships between entities.
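As a hedged sketch only, assuming the networkx library and invented entities, the example below shows the kind of relationship mining referred to: building a directed graph, detecting cycles between entities and picking out the most connected entity.

```python
# A minimal sketch of mining relationships in graph data that may contain cycles.
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("account_A", "account_B"),
    ("account_B", "account_C"),
    ("account_C", "account_A"),   # a cycle of transfers between entities
    ("account_C", "account_D"),
])

print("cycles found:", list(nx.simple_cycles(G)))
centrality = nx.degree_centrality(G)
print("most connected entity:", max(centrality, key=centrality.get))
```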
