Amazon, Microsoft, Databricks, Google, HPE, and IBM machine learning toolkits run the gamut in breadth, depth, and ease of use.

What we call machine learning can take many forms. The purest form offers the analyst a set of data exploration tools, a choice of ML models, robust solution algorithms, and a way to use the solutions for predictions. The Amazon, Microsoft, Databricks, Google, and IBM clouds all offer prediction APIs that give the analyst varying amounts of control. HPE Haven OnDemand offers a limited prediction API for binary classification problems.

Not every machine learning problem has to be solved from scratch, however. Some problems can be trained on a sufficiently large sample to be more widely applicable. For example, speech-to-text, text-to-speech, text analytics, and face recognition are problems for which "canned" solutions often work. Not surprisingly, a number of machine learning cloud providers offer these capabilities through an API, allowing developers to incorporate them into their applications.

Speech-to-text services will recognize spoken American English (and some other languages) and transcribe it. How well a given service works for a given speaker, however, depends on the speaker's dialect and accent and on the extent to which the solution was trained on similar dialects and accents. Microsoft Azure, IBM, Google, and Haven OnDemand all have working speech-to-text services.

There are many kinds of machine learning problems. For example, regression problems try to predict a continuous variable (such as sales) from other observations, and classification problems attempt to predict the class into which a given set of observations will fall (say, email spam). Amazon, Microsoft, Databricks, Google, HPE, and IBM provide tools for solving a range of machine learning problems, though some toolkits are much more complete than others.
In this article, I'll briefly discuss these six commercial machine learning solutions, along with links to the five full hands-on reviews that I've already published. Google's March announcement of cloud-based machine learning tools and applications was, unfortunately, well ahead of the public availability of Google Cloud Machine Learning.

Amazon Machine Learning

Amazon has tried to put machine learning within easy reach of mere mortals. The service is intended to work for analysts who understand the business problem being solved, whether or not they understand data science and machine learning algorithms.

In general, you approach Amazon Machine Learning by first cleaning your data and uploading it in CSV format to S3; then creating, training, and evaluating an ML model; and finally creating batch or real-time predictions. Each step is iterative, as is the whole process. Machine learning is not a simple, static magic bullet, even with the algorithm selection left to Amazon.

Amazon Machine Learning supports three kinds of models (binary classification, multiclass classification, and regression) and one algorithm for each type. For optimization, Amazon Machine Learning uses stochastic gradient descent (SGD), which makes multiple sequential passes over the training data and updates feature weights for each sample mini-batch, trying to minimize the loss function. A loss function reflects the difference between the actual value and the predicted value. Gradient descent optimization works well only for continuous, differentiable loss functions, such as the logistic and squared loss functions.

For binary classification, Amazon Machine Learning uses logistic regression (logistic loss function plus SGD). For multiclass classification, it uses multinomial logistic regression (multinomial logistic loss plus SGD). For regression, it uses linear regression (squared loss function plus SGD).
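To make the SGD-plus-logistic-loss combination concrete, here is a minimal pure-Python sketch, not Amazon's implementation: the toy data, learning rate, and epoch count are all invented for illustration. It updates feature weights one mini-batch at a time to reduce the logistic loss, then shows how raising the score threshold above the 0.5 default trades false positives for false negatives.

```python
import math, random

def sgd_logistic(data, lr=0.1, epochs=50, batch_size=2):
    """Fit P(y=1|x) = sigmoid(w.x + b) by mini-batch SGD on the logistic loss."""
    n = len(data[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):                      # one sequential pass per epoch
        random.shuffle(data)
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            gw, gb = [0.0] * n, 0.0
            for x, y in batch:
                p = 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
                for j in range(n):               # gradient of the logistic loss
                    gw[j] += (p - y) * x[j]
                gb += p - y
            w = [wi - lr * g / len(batch) for wi, g in zip(w, gw)]
            b -= lr * gb / len(batch)
    return w, b

def score(w, b, x):
    """Predicted probability that x belongs to class 1."""
    return 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))

# Invented toy data: class 1 when the first feature exceeds 0.5.
random.seed(0)
train = [([x1, x2], 1 if x1 > 0.5 else 0)
         for x1, x2 in ((random.random(), random.random()) for _ in range(200))]
w, b = sgd_logistic(train)
accuracy = sum((score(w, b, x) >= 0.5) == (y == 1) for x, y in train) / len(train)

# Raising the score threshold above the 0.5 default keeps only high-confidence
# positives: fewer false positives at the cost of more false negatives.
strict_positives = sum(score(w, b, x) >= 0.8 for x, _ in train)
default_positives = sum(score(w, b, x) >= 0.5 for x, _ in train)
```

The threshold choice at the end is the same trade-off you make in the Amazon Machine Learning console after evaluating a binary classification model.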
After training and evaluating a binary classification model in Amazon Machine Learning, you can choose your own score threshold to achieve your desired error rates. Here we have increased the threshold value from the default of 0.5 so that we can generate a stronger set of leads for marketing and sales purposes.

Amazon Machine Learning determines the type of machine learning task to solve from the type of the target data. For example, prediction problems with numerical target variables imply regression; prediction problems with non-numeric target variables are binary classification if there are only two target states, and multiclass classification if there are more than two.

Choices of features in Amazon Machine Learning are held in recipes. Once the descriptive statistics have been calculated for a data source, Amazon creates a default recipe, which you can either use or override in your machine learning models on that data.

Once you have a model that meets your evaluation requirements, you can use it to set up a real-time Web service or to generate a batch of predictions. Bear in mind, however, that unlike physical constants, people's behavior varies over time. You'll need to check the prediction accuracy metrics coming out of your models periodically and retrain them as needed.

Azure Machine Learning

In contrast to Amazon, Microsoft tries to provide a full assortment of algorithms and tools for experienced data scientists. Azure Machine Learning is part of the larger Microsoft Cortana Analytics Suite. Azure Machine Learning also features a drag-and-drop interface for constructing model training and evaluation data flows from modules. The Azure Machine Learning Studio contains facilities for importing data sets, training and publishing experimental models, processing data in Jupyter Notebooks, and saving trained models.
Machine Learning Studio contains dozens of sample data sets, five data-format conversions, several ways to read and write data, dozens of data transformations, and three options for feature selection. In Azure Machine Learning proper, you'll find multiple models for anomaly detection, classification, clustering, and regression; four methods for scoring models; three strategies for evaluating models; and six processes for training models. You can also use a couple of OpenCV (Open Source Computer Vision) modules, statistical functions, and text analytics. That's a lot of stuff: theoretically enough to process any kind of data in any kind of model, as long as you understand the business, the data, and the models.

When the canned Azure Machine Learning Studio modules don't do what you want, you can develop your own Python or R modules. You can develop and test Python 2 and Python 3 modules using Jupyter Notebooks, extended with the Azure Machine Learning Python client library (to work with your data stored in Azure), scikit-learn, matplotlib, and NumPy. Azure Jupyter Notebooks will eventually support R as well. For now, you can use RStudio locally and change the input and output for Azure later if needed, or install RStudio in a Microsoft Data Science VM.

When you create a new experiment in Azure Machine Learning Studio, you can start from scratch or choose from about 70 Microsoft samples, which cover most of the common models. There is additional community content in the Cortana Gallery.

The Azure Machine Learning Studio makes quick work of generating a Web service for publishing a trained model. This simple model comes from a five-step interactive introduction to Azure Machine Learning.
The Cortana Analytics Process (CAP) starts with planning and setup steps, which are critical unless you are a trained data scientist who is already familiar with the business problem, the data, and Azure Machine Learning, and who has already created the necessary CAP environments for the project. Possible CAP environments include an Azure storage account, a Microsoft Data Science VM, an HDInsight (Hadoop) cluster, and a machine learning workspace with Azure Machine Learning Studio. If the choices confuse you, Microsoft documents why you'd pick each one. CAP continues with five processing steps: ingestion, exploratory data analysis and pre-processing, feature creation, model creation, and model deployment and consumption.

Microsoft recently released a set of cognitive services that have "graduated" from Project Oxford to an Azure preview. These are pretrained for speech, text analytics, face recognition, emotion recognition, and similar capabilities, and they complement what you can do by training your own models.

Databricks

Databricks is a commercial cloud service based on Apache Spark, an open source cluster computing framework that includes a machine learning library, a cluster manager, Jupyter-like interactive notebooks, dashboards, and scheduled jobs. Databricks (the company) was founded by the people who created Spark, and with Databricks (the service), it's almost effortless to spin up and scale out Spark clusters.

Spark's machine learning library, MLlib, includes a wide range of machine learning and statistical algorithms, all tailored for the distributed memory-based Spark architecture. MLlib implements, among others, summary statistics, correlations, sampling, hypothesis testing, classification and regression, collaborative filtering, cluster analysis, dimensionality reduction, feature extraction and transformation functions, and optimization algorithms. In other words, it's a fairly complete package for experienced data scientists.
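A common MLlib workflow is tuning a model by k-fold cross-validation: train the same model over a set of candidate parameters on several train/validation splits and keep the parameter with the lowest validation error. Here is a minimal pure-Python sketch of that idea, not the Spark API; the data, the toy "model" (a shrunken mean with a made-up regularization parameter), and the squared-error metric are all invented for illustration.

```python
def k_fold_splits(data, k=3):
    """Yield (train, validation) pairs for k-fold cross-validation."""
    fold = len(data) // k
    for i in range(k):
        yield data[:i * fold] + data[(i + 1) * fold:], data[i * fold:(i + 1) * fold]

def cross_validate(data, params, fit, error, k=3):
    """Return the parameter value with the lowest mean validation error."""
    best_param, best_err = None, float("inf")
    for p in params:
        total = 0.0
        for train, val in k_fold_splits(data, k):
            model = fit(train, p)
            total += sum(error(model, row) for row in val) / len(val)
        if total / k < best_err:
            best_param, best_err = p, total / k
    return best_param

# Toy stand-in for a regression: the "model" is the training mean shrunk
# toward zero by an invented regularization parameter lam.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fit = lambda train, lam: sum(train) / len(train) / (1.0 + lam)
error = lambda model, y: (model - y) ** 2
best = cross_validate(data, [0.0, 0.5, 2.0], fit, error)
```

Spark's cross validator applies the same pattern to whole pipelines, distributing the many model fits across the cluster.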
This live Databricks notebook, with code in Python, demonstrates one way to analyze a well-known public bike rental data set. In this section of the notebook, we are training the pipeline, using a cross validator to run many gradient-boosted tree regressions.

Databricks is designed to be a scalable, relatively easy-to-use data science platform for people who already know statistics and can do at least a little programming. To use it effectively, you should know some SQL and either Scala, R, or Python. It's even better if you're fluent in your chosen programming language, so that you can concentrate on learning Spark when you get your feet wet using a sample Databricks notebook running on a free Databricks Community Edition cluster.

InfoWorld Scorecard
Criteria: Variety of models (25%), Ease of development (25%), Integrations (15%), Performance (15%), Additional services (10%), Value (10%), Overall Score (100%)

Product                               Models  Development  Integrations  Performance  Services  Value  Overall
Amazon Machine Learning                  8         9            9             9           8       9      8.7
Azure Machine Learning                   9         8            9             9           8       9      8.7
Databricks with Spark 1.6               10         9            9             9           8       9      9.2
HPE Haven OnDemand                       7         8            8             8           7       8      7.5
IBM Watson and Predictive Analytics     10         9            9             9           9       8      9.2

Google Cloud Machine Learning

Google recently announced a number of machine-learning-related products. The most interesting of these are Cloud Machine Learning and the Cloud Speech API, both in limited preview. The Google Translate API, which can perform language identification and translation for more than 80 languages and variants, and the Cloud Vision API, which can identify various kinds of features in images, are available for use, and they look good based on Google's demos.

The Google Prediction API trains, evaluates, and predicts regression and classification problems, with no options for the algorithm used. It dates from 2013. The current Google machine learning technology, the Cloud Machine Learning Platform, uses Google's open source TensorFlow library for training and evaluation.
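TensorFlow's core idea is to express a computation as a data flow graph that is built once and then executed with different inputs. The toy pure-Python sketch below illustrates only that separation of graph construction from execution; it looks nothing like TensorFlow's actual API, and the node names are invented.

```python
class Node:
    """A node in a tiny data flow graph: an operation applied to input nodes."""
    def __init__(self, op, *inputs):
        self.op, self.inputs = op, inputs

    def evaluate(self, feed):
        """Recursively evaluate this node; `feed` maps placeholder names to values."""
        if isinstance(self.op, str):          # placeholder: look up the fed value
            return feed[self.op]
        return self.op(*(n.evaluate(feed) for n in self.inputs))

# Build the graph for y = a*x + b once, then run it with different inputs.
x = Node("x")
a = Node("a")
b = Node("b")
y = Node(lambda u, v: u + v, Node(lambda u, v: u * v, a, x), b)

result = y.evaluate({"x": 3.0, "a": 2.0, "b": 1.0})   # computes 2*3 + 1
```

TensorFlow builds the same kind of graph for tensors rather than scalars, which is what lets it schedule the nodes across CPUs and GPUs.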
Developed by the Google Brain team, TensorFlow is a generalized library for numerical computation using data flow graphs. It integrates with Google Cloud Dataflow, Google BigQuery, Google Cloud Dataproc, Google Cloud Storage, and Google Cloud Datalab. I have checked out the TensorFlow code from its GitHub repository; read some of the C, C++, and Python code; and pored over the TensorFlow.org site and the TensorFlow white paper.

TensorFlow lets you deploy computations to one or more CPUs or GPUs in a desktop, server, or mobile device, and it has all sorts of training and neural net algorithms built in. On a geekiness scale, it probably rates a 9 out of 10. Not only is it well beyond the capabilities of business analysts, but it's likely to be hard for many data scientists.

The Google Translate API, Cloud Vision API, and the new Google Cloud Speech API are pretrained ML models. According to Google, its Cloud Speech API uses the same neural network technology that powers voice search in the Google app and voice typing in Google Keyboard.

HPE Haven OnDemand

Haven OnDemand is HPE's entry into the cloud machine learning sweepstakes. Haven OnDemand's enterprise search and format conversions are its strongest services. That's not surprising, since the service is based on IDOL, HPE's proprietary search engine. However, Haven OnDemand's more interesting capabilities are not fully cooked.

Haven OnDemand currently has APIs classified as Audio-Video Analytics, Connectors, Format Conversion, Graph Analysis, HP Labs Sandbox (experimental APIs), Image Analysis, Policy, Prediction, Query Profile and Manipulation, Search, Text Analysis, and Unstructured Text Indexing. I have tried out a random selection of these APIs and explored how they are called and used. Haven speech recognition supports only a half-dozen languages, plus variations. The recognition accuracy for my high-quality U.S. English test file was OK, but not perfect.
The Haven OnDemand Connectors, which allow you to retrieve information from external systems and update it through Haven OnDemand APIs, are already quite mature, basically because they are IDOL connectors. The Text Extraction API uses HPE KeyView to extract metadata and text content from a file that you provide; drawing on the maturity of KeyView, the API can handle more than 500 different file formats.

Graph Analysis, a set of preview services, works only on an index trained on the English Wikipedia; you can't train it on your own data. From the Image Analysis group, I tested bar-code recognition, which worked fine, and face recognition, which did better on HPE's samples than on my test images. Image recognition is currently limited to a fixed selection of corporate logos, which has limited utility.

The Haven OnDemand bar-code recognition API can isolate the bar code in an image file (see the red box) and convert it to a number, even if the bar code is on a curved surface, at an angle of up to about 20 degrees, or blurry. The API does not perform the additional step of looking up the bar-code number and identifying the product.

I was disappointed to discover that HPE's predictive analytics handles only binary classification problems: no multiclass classification and no regression, never mind unsupervised learning. That severely limits its applicability. On the plus side, the Train Prediction API automatically validates, explores, splits, and prepares the CSV or JSON data you provide, then trains decision tree, logistic regression, Naive Bayes, and support vector machine (SVM) binary classification models with multiple parameters. It then tests the classifiers against the evaluation split of the data and publishes the best model as a service.

Haven OnDemand Search uses the IDOL engine to perform advanced searches against both public and private text indexes.
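The train-several-models-and-publish-the-best pattern behind the Train Prediction API can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not HPE's code: the two stand-in "classifiers" and the toy data are invented, where the real service trains decision trees, logistic regression, Naive Bayes, and SVMs.

```python
import random

def train_and_pick_best(rows, candidates, eval_fraction=0.3):
    """Split rows, train each candidate on the training split, and return the
    (name, accuracy) of the model that scores best on the evaluation split."""
    random.shuffle(rows)
    cut = int(len(rows) * (1 - eval_fraction))
    train, evaluation = rows[:cut], rows[cut:]
    best_name, best_acc = None, -1.0
    for name, fit in candidates.items():
        model = fit(train)                  # each fit() returns a predict(x) callable
        acc = sum(model(x) == y for x, y in evaluation) / len(evaluation)
        if acc > best_acc:
            best_name, best_acc = name, acc
    return best_name, best_acc

# Two invented stand-in "classifiers":
def fit_majority(train):
    ones = sum(y for _, y in train)
    majority = 1 if ones * 2 >= len(train) else 0
    return lambda x: majority               # always predicts the majority class

def fit_threshold(train):
    return lambda x: 1 if x >= 0.5 else 0   # matches how the toy labels were made

random.seed(1)
rows = [(x, 1 if x >= 0.5 else 0) for x in (random.random() for _ in range(100))]
winner, accuracy = train_and_pick_best(
    rows, {"majority": fit_majority, "threshold": fit_threshold})
```

In the real service the winning model is then published behind a prediction endpoint rather than returned to the caller.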
Text Analysis APIs range from simple autocomplete and term expansion to language identification, concept extraction, and sentiment analysis.

IBM Watson and Predictive Analytics

IBM offers machine learning services based on its "Jeopardy"-winning Watson technology and on the IBM SPSS Modeler. It actually has sets of cloud machine learning services for three different audiences: developers, data scientists, and business users.

SPSS Modeler is a Windows application, recently also made available in the cloud. The Modeler Personal Edition includes data access and export; automatic data preparation, wrangling, and ETL; more than 30 base machine learning algorithms plus automodeling; R extensibility; and Python scripting. More expensive editions add access to big data through an IBM SPSS Analytic Server for Hadoop/Spark, champion/challenger functionality, A/B testing, text and entity analytics, and social network analysis.

The machine learning algorithms in SPSS Modeler are comparable to what you find in Azure Machine Learning and Databricks' Spark.ml, as are the feature selection methods and the range of supported formats. Even the automodeling (train and score a bunch of models and pick the best) is comparable, although it's more obvious how to use it in SPSS Modeler than in the others.

IBM Bluemix hosts Predictive Analytics Web services that apply SPSS models to expose a scoring API you can call from your apps. In addition to Web services, Predictive Analytics supports batch jobs to retrain and re-evaluate models on additional data.

There are 18 Bluemix services listed under Watson, separate from Predictive Analytics. The AlchemyAPI offers a set of three services (AlchemyLanguage, AlchemyVision, and AlchemyData) that enable businesses and developers to build cognitive applications that understand the content and context of text and images. Concept Expansion analyzes text and learns similar words or phrases based on context.
Concept Insights links documents that you provide with a pre-existing graph of concepts based on Wikipedia topics. The Dialog Service allows you to design the way an application interacts with the user through a conversational interface, using natural language and user profile information. The Document Conversion service converts a single HTML, PDF, or Microsoft Word document into normalized HTML, plain text, or a set of JSON-formatted Answer units that can be combined with other Watson services.

I used Watson to analyze a familiar bike rental data set supplied as one of the examples. Watson came up with a decision tree model with 48 percent predictive strength. This worksheet has not separated workday and nonworkday riders.

Language Translation works in several knowledge domains and language pairs. In the news and conversation domains, the to/from pairs are English and Brazilian Portuguese, French, Modern Standard Arabic, or Spanish. In the patents domain, the pairs are English and Brazilian Portuguese, Chinese, Korean, or Spanish. The Translation service can also identify plain text as being written in one of 62 languages.

The Natural Language Classifier service applies cognitive computing techniques to return the best-matching classes for a sentence, question, or phrase, after training on your set of classes and phrases. Personality Insights derives insights from transactional and social media data (at least 1,000 words written by a single individual) to identify psychological traits, which it returns as a tree of characteristics in JSON format. Relationship Extraction parses sentences into their components and detects relationships between the components (parts of speech and functions) through contextual analysis.

Additional Bluemix services improve the relevancy of search results, convert text to and from speech in a half-dozen languages, identify emotion in text, and analyze visual scenes and objects.
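To see what "train on your classes and phrases, then return the best-matching classes" means in practice, here is a deliberately crude pure-Python stand-in: it ranks classes by word overlap with the input text, where the real Natural Language Classifier uses far more sophisticated techniques. The training phrases and class names are invented.

```python
from collections import Counter

def train_classifier(examples):
    """Build per-class word counts from (phrase, class) training examples."""
    classes = {}
    for phrase, label in examples:
        classes.setdefault(label, Counter()).update(phrase.lower().split())
    return classes

def best_matching_classes(classes, text, top=2):
    """Rank classes by word overlap with the input text; return (class, score) pairs."""
    words = Counter(text.lower().split())
    scores = {label: sum((words & counts).values())   # & keeps the minimum counts
              for label, counts in classes.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])[:top]

# Invented training phrases for two made-up classes:
model = train_classifier([
    ("what is the temperature today", "weather"),
    ("will it rain tomorrow", "weather"),
    ("what time does the train leave", "transit"),
    ("when is the next bus", "transit"),
])
ranked = best_matching_classes(model, "is it going to rain today")
```

As with the real service, the caller gets back a ranked list of candidate classes rather than a single hard decision.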
Watson Analytics uses IBM's own natural language processing to make machine learning easier to use for business analysts and other non-data-scientist business roles.

Machine learning curve

The set of machine learning services you should evaluate depends on your own skills and those of your team. For data scientists and teams that include data scientists, the choices are wide open. Data scientists who are good at programming can do even more: Google, Azure, and Databricks require more programming expertise than Amazon and SPSS Modeler, but they are also more flexible.

Watson services running in Bluemix give developers additional pretrained capabilities for cloud applications, as do several Azure services, three Google cloud APIs, and some Haven OnDemand APIs for document-based content. The new Google TensorFlow library is for high-end machine learning programmers who are fluent in Python, C++, or C. The Google Cloud Machine Learning Platform appears to be for high-end data scientists who know Python and cloud data pipelines.

While Amazon Machine Learning and Watson Analytics claim to be aimed at business analysts or "any business role" (whatever that means), I am skeptical about how well they can fulfill those claims. If you need to develop machine learning applications and have little or no statistical, mathematical, or programming background, I'd submit that you really need to team up with someone who knows that stuff.

Read the reviews:

Review: Azure Machine Learning is for pros only
Review: Amazon puts machine learning in reach
Review: Databricks makes big data dreams come true
Review: IBM Watson strikes again
Review: HPE's machine learning cloud overpromises, underdelivers