InfoWorld editors and reviewers recognize the year’s best software development, cloud computing, big data analytics, and machine learning tools. Credit: IDG Welcome to InfoWorld’s Technology of the Year Awards, our annual celebration of the best, most innovative, most important products in the information technology landscape. In this 2019 edition of the awards, it will come as no surprise that containers, cloud-native application stacks, distributed data processing systems, and machine learning are major themes. Among our 17 winners, you’ll find three leading machine learning libraries, a distributed training framework that accelerates deep learning, and an automated platform that guides nonexperts through feature engineering, model selection, training, and optimization. That makes more picks in machine learning than any other product category, including software development—a reflection of the astonishing level of activity in the space. Three databases made our winners’ list this year, including a wide-column data store, a multi-purpose data store, and a database that seems as much application platform as data store. Because data always has to move from here to there, preferably in real time, we’ve also included two leading platforms for building stream processing applications. Read on to learn about this year’s winners. Kubernetes Kubernetes (aka K8s) has had an astonishing rise over the past couple of years. It used to be one of a crowd of container orchestration systems, but now it is rapidly becoming the standard platform everywhere, whether on one of the major cloud providers or in on-premises enterprise installations. If you’re in the operations realm, spending time getting to grips with Kubernetes will likely pay dividends as the open source project continues its relentless march.
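The heart of the Kubernetes model is declarative configuration: you describe the state you want, and the system works to maintain it. Below is a minimal, hypothetical Deployment manifest (the app name and image are placeholders, not from any real project) showing how a containerized workload is described:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend            # hypothetical application name
spec:
  replicas: 3                   # Kubernetes keeps three pods running at all times
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
    spec:
      containers:
      - name: web
        image: example.com/web-frontend:1.0   # placeholder container image
        ports:
        - containerPort: 8080
```

If a pod dies or a node disappears, the control loop notices the drift from the declared state and schedules a replacement; scaling is just a change to `replicas`.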
Based on ideas and lessons learned from running Google’s massive data centers over the course of a decade, Kubernetes is a battle-tested platform for deploying, scaling, and monitoring container-based applications and workloads across large clusters. In the past year, Kubernetes releases brought major highlights such as an overhaul of storage and a move to the Container Storage Interface, TLS-secured Kubelet bootstrapping, and improved support for Microsoft Azure. We’ve also seen important additions to the core Kubernetes stack, such as Istio, which defines a service mesh for even more control over deployment, observability, and security. And we’ve seen more specialized frameworks appear, such as Kubeflow, which allows you to easily spin up TensorFlow or PyTorch machine learning pipelines on Kubernetes, all controlled by Jupyter Notebooks likewise running on the cluster. The number of third-party tools and frameworks aimed at easing some aspect of Kubernetes management—from simplifying app definitions to monitoring multiple clusters—seems to grow with each passing day. As does the number of Kubernetes adopters, with major announcements and testimonials in 2018 coming from the likes of IBM, Huawei, Sling TV, and ING. Heck, even Chick-fil-A is running Kubernetes in every restaurant. Isn’t it about time you jumped on board? —Ian Pointer Firebase In the future we may or may not have quantum computing, mind-reading AIs, and sublinear algorithms for solving the traveling salesman problem, but whatever comes along, we can be sure that we’ll call them a “database.” All great software technology eventually gets absorbed by the Borg tended by the DBAs. The emergence of Firebase is a good example of just how this will happen. At first glance, Firebase looks like a simple storage solution for keys and their accompanying values. In other words, a bag of pairs that is kept reasonably consistent, just like the other NoSQL databases.
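Conceptually, the Firebase client keeps a local replica of that bag of pairs, queues writes while the device is offline, and reconciles with the server when connectivity returns. The sketch below is not the Firebase SDK — it is a hypothetical, stripped-down model of the offline-first, write-queue-and-sync pattern:

```python
class FakeFirebaseClient:
    """Hypothetical model of an offline-first client: writes land in a
    local replica immediately and are queued until the 'network' is up."""

    def __init__(self, server):
        self.server = server        # shared dict standing in for the cloud store
        self.local = dict(server)   # local replica for instant reads
        self.pending = []           # writes queued while offline

    def set(self, key, value):
        self.local[key] = value             # read-your-writes immediately
        self.pending.append((key, value))   # synced later

    def get(self, key):
        return self.local.get(key)          # served from the local cache

    def go_online(self):
        for key, value in self.pending:     # flush queued writes to the server
            self.server[key] = value
        self.pending.clear()
        self.local.update(self.server)      # pull down everyone else's changes

server = {}
a, b = FakeFirebaseClient(server), FakeFirebaseClient(server)
a.set("score", 42)   # works offline; b can't see it yet
a.go_online()
b.go_online()        # b syncs and picks up a's write
```

The real service adds conflict resolution, security rules, and push notification of changes, but the programming model is the same: read and write the shared tree and let the platform move the bits.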
But over the years, Google has been adding features that have let Firebase do more and more of the work that a cloud-based web app might do. Google has even started referring to Firebase as a mobile platform. Remember the challenge of caching data on the client when the Internet connection is less than perfect? The Firebase team realized that the synchronization routines that keep the database consistent are also ideal tools for pushing and pulling data from your mobile client. They opened up their synchronization process, and now your code doesn’t need to juggle a complicated handshaking algorithm or fiddle with the network. You just hand your bits to Firebase and, like magic, they appear in the copy on the handset. It’s all just one big database, and your server routines and client routines simply read and write from the communal pool. Google keeps adding more as the company integrates Firebase with the rest of its stack. Authentication? A social log-in to Facebook or, of course, Google will get your users access to the right slices of the database. Analytics? Hosting? Messaging? All of Google’s solutions are gradually being pulled under the umbrella of the database. And the machine learning of the future? It’s already a beta option for Firebase users who want to analyze the key/value pairs already in the database. In a sense, we’ve already started to merge AIs with databases. —Peter Wayner Serverless Framework The first generation of cloud, which rented us servers, saved us time by lifting all of the tedious hardware-related duties from our shoulders. The servers lived in distant buildings where the heating, cooling, and maintenance were someone else’s problem. The next generation of cloud technology is getting rid of the servers, at least in name, and saving us not only from fretting over operating system patches and updates, but from most of the headaches associated with application delivery.
There is still server hardware and an operating system somewhere under our code, but now even more of it is someone else’s responsibility. Instead of taking on the chores that come with root access, we can just upload our functions and let someone else’s stack of software evaluate them. We can focus on the functions and leave everything else to the little elves that keep the clouds running. But there are challenges. Serverless computing means rethinking technical architectures. Relying on events and asynchronous queues requires refactoring applications into neatly divided tasks. While some tooling support has arrived, much still needs to be figured out: integration debugging, distributed monitoring, deployment packaging, function versioning, etc. Then there is vendor lock-in to worry about. The leading FaaS (functions as a service) providers—AWS Lambda, Microsoft Azure Functions, and Google Cloud Functions—all have their own specialized methods for deployment and operation. That is where Serverless Framework comes to the rescue, offering a layer of abstraction over vendor-specific implementations to streamline app deployment. The open source framework gives you convenient ways to test and deploy your functions to various cloud providers and eases configuration updates via a common YAML file, while also providing rich features for function management and security. In addition to the majors, Serverless Framework supports Kubeless, a framework for deploying FaaS on Kubernetes clusters, and Apache OpenWhisk, a Docker-based platform that underpins IBM Cloud Functions and offers broad language support and unique features to handle more-persistent connections. Serverless computing is neither mature nor a silver bullet for every use case, but the economics and efficiency are hard to resist. With Serverless Framework available to smooth over the bumps, why not join the growing number of businesses turning to serverless to slash operational costs and speed up deployments?
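In the FaaS model, the unit of deployment is just a function. Here is a minimal AWS Lambda-style handler in Python (the event shape follows Lambda’s HTTP convention, simplified for illustration); Serverless Framework would wire a function like this to an HTTP trigger through an entry in serverless.yml:

```python
import json

def handler(event, context):
    """Minimal Lambda-style function: no server, no OS, just code.
    The event/context shapes here are a simplified version of the
    AWS Lambda HTTP (API Gateway) convention."""
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Locally it is callable like any ordinary function -- no runtime needed:
response = handler({"queryStringParameters": {"name": "serverless"}}, None)
```

Because the handler is plain code with no server plumbing, the same function can be unit tested locally and then deployed to any provider the framework supports.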
—James R. Borck Elastic Stack If you’re running a user-facing web application these days, providing sophisticated search functionality is not optional. Users are constantly being presented with free-text search interfaces that will fix their spelling, automatically suggest alternative phrases, and highlight search results to show them why certain results were returned. Like it or not, these are the search standards you have to live up to. Luckily, the Elastic Stack will meet all of your search needs and much more. Consisting primarily of Elasticsearch, Kibana, Logstash, and Beats, the Elastic Stack supports many use cases including user-facing document search and centralized log aggregation and analytics. Indexing documents one at a time or in bulk into Elasticsearch is a breeze from almost any language, complete with best guesses for mapping types for all of your fields (think column data types in relational databases). Now you have the full search API at your disposal, including fuzzy search, highlighting, and faceted search results. Pair that with a front-end tool like Searchkit and you’ll have a quick prototype of faceted, free-text searching in no time. Aggregating logs from any number of separate services couldn’t be easier using Logstash and Beats, which allow you to send log lines to a centralized Elasticsearch cluster for easier troubleshooting and analytics. Once you have log data indexed, use Kibana to build charts and assemble dashboards to get system health at a glance. The Elastic Stack is one of today’s must-haves for any new web project. —Jonathan Freeman DataStax Enterprise Apache Cassandra—an open source, large-scale, distributed column-family database inspired by Google’s Bigtable paper—is a great way to run massively scalable global data infrastructure. The masterless design is ideal for running many types of high-throughput cloud applications. However, Cassandra is not the easiest system to deploy and manage.
It also leaves you wanting when you try to build applications that involve analytics, search, and graph operations. DataStax Enterprise (aka DSE) adds these capabilities along with improved performance and security, vastly improved management, advanced replication, in-memory OLTP, a bulk loader, tiered storage, search, analytics, and a developer studio. Like Bigtable and Cassandra, DataStax Enterprise is best suited for large databases—terabytes to petabytes—and is best used with a denormalized schema that has many columns per row. DataStax Enterprise and Cassandra tend to be used for very large-scale applications. For example, eBay uses DataStax Enterprise to store 250TB of auction data with 6 billion writes and 5 billion reads daily. DataStax Enterprise 6 brought several new features in DSE Analytics, DSE Graph, and DSE Search in 2018, along with finer-grained security settings. Improvements to DataStax Studio track the improvements in DSE Analytics, such as support for Spark SQL, and expanded IDE support for DSE Graph with interactive graphs. To top it all off, benchmarks show DSE 6 to be multiples faster than Cassandra (see InfoWorld’s review). —Andrew C. Oliver Apache Kafka Honestly, it’s odd to imagine a world without Apache Kafka. The distributed streaming platform will soon celebrate its eighth birthday, and the project continues to be the rock-solid open source choice for streaming applications, whether you’re adding something like Apache Storm or Apache Spark for processing or using the processing tools provided by Apache Kafka itself. Kafka can handle low-latency applications without breaking a sweat, and its log-based storage makes it a great choice where reliability is required.
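That reliability comes from the storage model: a Kafka topic partition is just an append-only log of records, and each consumer simply tracks its own offset into that log. A toy model of the idea (plain Python, not Kafka code):

```python
class Partition:
    """Toy model of a Kafka topic partition: an append-only log.
    Records are never mutated; each consumer just remembers an offset."""

    def __init__(self):
        self.log = []

    def produce(self, record):
        self.log.append(record)
        return len(self.log) - 1           # offset assigned to the new record

    def consume(self, offset, max_records=10):
        batch = self.log[offset:offset + max_records]
        return batch, offset + len(batch)  # records plus the next offset to ask for

p = Partition()
for event in ["click", "purchase", "click"]:
    p.produce(event)

batch, next_offset = p.consume(0)   # read from the beginning
replay, _ = p.consume(0)            # replay is free: the log is immutable
```

Because the log is immutable, a crashed consumer can rewind to its last committed offset and replay, which is exactly what makes Kafka dependable for pipelines that must not lose data.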
For interfacing with databases and other data sources, Kafka Connect includes a host of connectors to popular offerings such as Microsoft SQL Server, Elasticsearch, HDFS, Amazon S3, and many more, allowing you to flow data into your Apache Kafka cluster simply by editing a configuration file. Imagine setting up an entire pipeline from a database to Amazon S3 without having to write custom code—or touch any Java code whatsoever. Confluent, founded by Kafka’s original creators Jay Kreps, Neha Narkhede, and Jun Rao, offers a platform that builds on top of the open source offering. While this includes traditional enterprise goodies such as better operational user interfaces, it also includes KSQL, a streaming SQL engine that lets you interrogate and process the data held within Kafka topics using straight SQL. And if you don’t feel up to the task of running Apache Kafka yourself, Google offers a managed platform in conjunction with Confluent, while Amazon has Managed Streaming for Kafka (Amazon MSK). Amazon MSK is currently in public preview, and likely to hit general availability sometime in 2019. —Ian Pointer Apache Beam Apache Beam takes a forward-thinking approach to developing batch and stream processing pipelines. Unlike most platforms, Beam abstracts the development language away from the final execution engine. You can write your pipeline in Java, Python, or Go, then mix and match a runtime engine to fit your specific needs—say, Apache Spark for in-memory jobs or Apache Flink for low-latency performance. Your business logic isn’t pegged to a specific execution engine, so you’re not locked in as technologies obsolesce. Plus, developers don’t need to grapple with the specifics of runner configuration. Internally, Beam manages all of the mechanics of temporal event processing.
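The essence of temporal event processing is that events carry their own timestamps, and the engine groups them into event-time windows regardless of arrival order. This pure-Python sketch (a hypothetical helper, not the Beam API) mimics Beam’s fixed windows; real Beam also handles watermarks, late data, and trigger policies:

```python
from collections import defaultdict

def fixed_windows(events, window_size):
    """Group (timestamp, value) events into fixed event-time windows.
    Out-of-order arrival doesn't matter: the event's own timestamp,
    not its position in the input, decides which window it joins."""
    windows = defaultdict(list)
    for ts, value in events:
        window_start = (ts // window_size) * window_size
        windows[window_start].append(value)
    return dict(windows)

# Events arrive out of order, yet land in the right event-time windows:
events = [(3, "a"), (12, "b"), (7, "c"), (11, "d")]
result = fixed_windows(events, window_size=10)
```

In Beam proper, this grouping is declared once in the pipeline and the chosen runner (Spark, Flink, Dataflow, and so on) carries it out.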
Whether it’s well-defined batches or out-of-sequence bursts coming from intermittent IoT sensors, Beam aggregates multiple event windows, waits for its onboard heuristics to determine that enough data has accumulated, then fires a trigger to begin processing. Transforms, data enrichment, and flow monitoring are all part of the mix. Beam supports a multitude of runners (Spark, Flink, Google Dataflow, etc.), I/O transforms (Cassandra, HBase, Google BigQuery, etc.), messaging (Kinesis, Kafka, Google Pub/Sub, etc.), and file sources (HDFS, Amazon S3, Google Cloud Storage, etc.). The open source underpinnings of Beam are even showing up in third-party solutions like Talend Data Streams, which compiles to Beam pipelines. Apache Beam doesn’t merely provide a solid engine for processing distributed ETL, real-time data analytics, and machine learning pipelines; it does so in a way that future-proofs your investment. —James R. Borck Redis It’s a NoSQL database! It’s an in-memory cache! It’s a message broker! It’s all of the above and then some! Redis provides so many useful capabilities in one bag, it is not surprising that the so-called “in-memory data structure store” has become a staple of modern web application stacks, with library support in just about every programming language you might choose to use. Redis offers the ability to work at just the level of complexity and power you need for a given job. If all you need is a simple in-memory cache for data fragments, you can have Redis set up and working with your application in just a few minutes. If you want what amounts to a disk-backed NoSQL system, with different data structures and your choice of cache eviction schemes, you can have that with just a little more effort. Redis 5.0, released in October 2018, introduced many powerful new features, the most significant being the new stream data type. This log-like, append-only data structure makes it possible to build Apache Kafka-like messaging systems with Redis.
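A Redis stream is driven by commands like XADD (append an entry, receive an auto-generated, ever-increasing ID) and XRANGE (read entries after a given ID). The toy stand-in below (not redis-py; it uses simplified integer IDs instead of Redis’s millisecond-sequence IDs) shows the shape of that interface:

```python
import itertools

class ToyStream:
    """Stand-in for a Redis 5.0 stream: an append-only list of
    (id, fields) entries with auto-generated, ever-increasing IDs.
    Real Redis IDs look like '1526919030474-0'; integers here for brevity."""

    def __init__(self):
        self.entries = []
        self._seq = itertools.count(1)

    def xadd(self, fields):
        entry_id = next(self._seq)          # server assigns the ID, like XADD with '*'
        self.entries.append((entry_id, fields))
        return entry_id

    def xrange_after(self, last_id):
        # Like XRANGE with an exclusive start: everything newer than last_id
        return [(i, f) for i, f in self.entries if i > last_id]

s = ToyStream()
first = s.xadd({"sensor": "temp", "value": "21.5"})
s.xadd({"sensor": "temp", "value": "22.1"})
new_entries = s.xrange_after(first)   # everything after the first entry
```

Consumers remember the last ID they processed and ask for everything newer, which is what lets Redis streams serve as lightweight, Kafka-style message logs.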
Other improvements in Redis 5.0 include better memory management and fragmentation control—important performance enhancements for a system built around in-memory storage as its main metaphor. Redis Enterprise, available from Redis Labs, adds advanced features like shared-nothing clusters, automatic sharding and rebalancing, instant auto-failover, multi-rack and multi-region replication, tunable durability and consistency, and auto-tiering across RAM and Flash SSDs. —Serdar Yegulalp Visual Studio Code The beauty of Visual Studio Code is that it can be just as much, or as little, as you want it to be. Visual Studio Code will serve as a fast and lightweight editor, if that’s all you need, or balloon into a full-blown development environment, thanks to plug-ins and add-ons for just about every major language or runtime in use today. Python, Java, Kotlin, Go, Rust, JavaScript, TypeScript, and Node.js (not to mention Microsoft’s own .Net languages) all have excellent support—as do supplementary document formats such as Markdown, HTML, reStructuredText, and LLVM IR. In addition to broad support and wide adoption, Visual Studio Code stands out for the relentless stream of improvements and additions that pours into the product. No area of functionality has been ignored. Thus you’ll find strong support for Git, Team Foundation Server, Docker, code linting, refactoring, large files, and more. There’s even the ability to run Visual Studio Code in a self-contained directory, opening the door to repackaging Visual Studio Code as a standalone environment for whatever new purpose you could dream up. —Serdar Yegulalp .Net Core It has been said that converting a software project to open source is either the best thing or the worst thing you could do for it.
In the case of Microsoft’s .Net Framework, open sourcing a subset of the functionality as .Net Core has been a resounding net positive, resulting in a lighter-weight runtime with an open development process, a cross-platform-first philosophy, and a compatibility bridge back to the main .Net Framework for apps that need it. Version 2.1, released in May 2018, rolled in a great many features that complemented this larger plan. Biggest among them: .Net Core Global Tools, a new system for deploying and extending the command-line tools used to manage .Net apps; the Windows Compatibility Pack, providing access to some 20,000 APIs from the big-brother .Net Framework for Windows-native apps; API analysis tools to identify Windows API dependencies when porting a Windows app; and mechanisms for publishing self-contained .Net Core apps with the latest runtime bundled in. —Serdar Yegulalp LLVM At first glance, LLVM might seem an esoteric choice for our award. A toolkit for building programming language compilers? But the LLVM compiler framework sits at the heart of a number of A-list projects: Clang, Rust, Swift, Julia, and many other innovative projects moving programming languages forward. LLVM gives developers the means to generate machine-native code programmatically—and without their having to know the vicissitudes of each architecture and platform they want to target. The obvious use case is language compilers, but LLVM also makes possible a whole variety of other applications. For instance, PostgreSQL uses LLVM to dynamically generate code that accelerates SQL queries. Similarly, the Numba project uses LLVM to transform slow Python code into fast assembly language for high-speed number-crunching applications.
LLVM’s two major releases of 2018 introduced a slew of improvements: better support for newer Intel processors, multiprocessor scenarios, mitigations for common processor flaws, tooling for evaluating the performance of generated code on specific CPU architectures, and further work on the LLVM linker, LLD, which can produce standalone executables from LLVM across multiple platforms. —Serdar Yegulalp TensorFlow TensorFlow is an open source software library for numerical computation using dataflow graphs. The graph nodes represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture enables you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. TensorFlow also includes TensorBoard, a data visualization toolkit. At version 1.12, TensorFlow is by far the most widely used and widely cited deep learning framework available. While it still supports its original low-level API, the favored high-level API is now tf.keras, an implementation of the Keras API standard for TensorFlow that includes TensorFlow-specific enhancements. While TensorFlow still supports building dataflow graphs and running them later in sessions, it also now fully supports eager execution mode, an imperative, define-by-run interface. Eager execution mode supports automatic differentiation via the <a href="https://www.tensorflow.org/api_docs/python/tf/GradientTape" rel="nofollow">tf.GradientTape</a> API. One of the enhancements in tf.keras is support for eager execution. Both the Keras API and eager execution mode will be featured in TensorFlow 2.0. While some other APIs will be deprecated in version 2.0, there will be a conversion tool for existing code, in addition to a compatibility library. Estimators are TensorFlow’s most scalable and production-oriented model type.
You may either use the pre-made Estimators that Google provides or write your own custom Estimators. Estimators are themselves built on <a href="https://www.tensorflow.org/api_docs/python/tf/keras/layers" rel="nofollow">tf.keras.layers</a>, which simplifies customization. It is usually much easier to create models with Estimators than with the low-level TensorFlow APIs. Pre-made Estimators enable you to work at a much higher conceptual level than the base TensorFlow APIs. —Martin Heller Keras Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. Additional back ends, such as MXNet and PlaidML, are supported by third parties. The implementation of Keras within TensorFlow, tf.keras, has some TensorFlow-specific enhancements. Keras isn’t the only simplified high-level API for building neural network models, but its rising prominence in TensorFlow emphasizes its quality and importance. Keras is currently the second-most-cited neural networks API, after TensorFlow. Keras was created to be user friendly, modular, and easy to extend, and to work well with Python. The API was “designed for human beings, not machines,” and “follows best practices for reducing cognitive load.” Neural layers, cost functions, optimizers, initialization schemes, activation functions, and regularization schemes are all standalone modules in Keras that you can combine to create new models. New modules are simple to add, as new classes and functions. Models are defined in Python code, not separate model configuration files. The biggest reasons to use Keras stem from its guiding design principles, primarily the one about being user friendly. Beyond ease of learning and ease of model building, Keras offers the advantages of broad adoption, supports a wide range of production deployment options, has strong support for multiple GPUs and distributed training, and is backed by Google, Microsoft, Amazon, Apple, Nvidia, Uber, and others.
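That modularity — layers as standalone, composable objects wired together in plain Python — can be sketched in a few lines. This is a conceptual toy, not tf.keras; the class names mirror the Keras vocabulary for illustration only:

```python
class Layer:
    """Minimal stand-in for a Keras layer: anything callable on inputs."""
    def __call__(self, x):
        raise NotImplementedError

class Scale(Layer):
    """Toy 'neural layer': multiply every input by a fixed factor."""
    def __init__(self, factor):
        self.factor = factor
    def __call__(self, x):
        return [v * self.factor for v in x]

class ReLU(Layer):
    """Standalone activation module, combinable with any layer."""
    def __call__(self, x):
        return [max(0.0, v) for v in x]

class Sequential(Layer):
    """Compose independent modules into a model, Keras-style."""
    def __init__(self, layers):
        self.layers = layers
    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

model = Sequential([Scale(2.0), ReLU()])
out = model([-1.0, 3.0])   # scaling, then rectification
```

Swapping an activation, adding a layer, or defining a new module is just ordinary Python object composition, which is exactly the “reduced cognitive load” the Keras design principles aim for.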
—Martin Heller PyTorch A library for creating tensors and dynamic neural networks in Python with strong GPU acceleration, PyTorch is currently the third-most-cited neural networks framework, after TensorFlow and Keras. A dynamic neural network is one that can change from iteration to iteration, for example allowing a PyTorch model to add and remove hidden layers during training to improve its accuracy and generality. PyTorch recreates the graph on the fly at each iteration step. PyTorch integrates acceleration libraries such as Intel MKL, Nvidia cuDNN, and Nvidia NCCL to maximize speed. Its core CPU and GPU tensor and neural network back ends—TH (Torch), THC (Torch Cuda), THNN (Torch Neural Network), and THCUNN (Torch Cuda Neural Network)—are written as independent libraries with a C99 API. At the same time, PyTorch is not a Python binding into a monolithic C++ framework—the intention is for PyTorch to be deeply integrated with Python and to allow the use of other Python libraries. PyTorch records the operations performed on tensors as your code runs, then replays that record in reverse to compute the gradients automatically. Called “autograd,” this capability speeds up calculating gradients by up to a factor of three. Given that steepest descent optimizers rely on gradients, it can speed up the entire training process by up to a factor of three. TensorFlow has the same capability. PyTorch is primarily supported by Facebook, but other sponsors and contributors include Twitter, Salesforce, and Microsoft. Microsoft has contributed technology that originated in its own CNTK framework, adding to the capabilities PyTorch inherited from Torch and Caffe2. —Martin Heller Horovod Horovod is a distributed training framework for TensorFlow, Keras, and PyTorch, created at Uber. The goal of Horovod is to make distributed deep learning fast and easy to use.
Horovod builds upon ideas from Baidu’s draft implementation of the TensorFlow ring allreduce algorithm. Uber originally tried using Distributed TensorFlow with parameter servers, but its engineers found the MPI model much more straightforward and requiring far fewer code changes. Uber claims that the Horovod system makes it possible to train AI models roughly twice as fast as a traditional deployment of TensorFlow. Horovod uses Open MPI (or another MPI implementation) for message passing among nodes, and Nvidia NCCL for its highly optimized version of ring allreduce. Horovod achieves 90 percent scaling efficiency for both Inception-v3 and ResNet-101, and 68 percent scaling efficiency for VGG-16, on up to 512 Nvidia Pascal GPUs. In December 2018, Uber announced that it was moving the Horovod project under the wing of the Linux Foundation’s LF Deep Learning Foundation for open source artificial intelligence software. —Martin Heller XGBoost XGBoost (eXtreme Gradient Boosting) is an open source machine learning library that implements distributed gradient boosting for Python, R, Java, Julia, and other programming languages. The core project code is in C++. XGBoost provides a parallel tree boosting algorithm that solves many data science problems in a fast and accurate way. The same code runs on single machines and in distributed environments (Hadoop, MPI, Spark, etc.). The distributed version can scale to problems with billions of examples and beyond. XGBoost became famous in data science circles by winning a number of Kaggle competitions. It originated in a research project at the University of Washington. The 2016 paper on XGBoost by Tianqi Chen and Carlos Guestrin explains gradient tree boosting algorithms and the refinements added to XGBoost, such as cache-aware prefetch and sparsity awareness.
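The core idea of gradient tree boosting is simple: each new tree is fitted to the residual errors of the ensemble built so far. A bare-bones sketch for squared loss, using one-split “stumps” instead of full trees — nothing like XGBoost’s optimized implementation, just the underlying principle:

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean

def boost(xs, ys, rounds=20, lr=0.5):
    """Gradient boosting for squared loss: the 'gradient' to fit is
    simply the residual y - prediction at each round."""
    preds = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

# Toy data: targets step from 1.0 to 3.0 at x > 2
model = boost([1, 2, 3, 4], [1.0, 1.0, 3.0, 3.0])
```

XGBoost’s contribution is making this loop fast and scalable: regularized objectives, clever split finding, cache-aware data layout, and sparsity handling, all parallelized across cores and machines.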
The paper also compares the performance of XGBoost with two other commonly used exact greedy tree boosting implementations for classification, in scikit-learn and R’s gbm package, and tests XGBoost against pGBRT (parallel gradient boosted regression trees) on the learning to rank problem. —Martin Heller H2O Driverless AI When it comes to turning raw data into predictive analytics, H2O Driverless AI outpaces all comers with its automated simplicity. Best-practice guardrails and guideposts direct non-AI experts down a path of discovery to uncover hidden patterns using supervised and unsupervised machine learning. You supply the data and isolate the dependent variables. H2O’s homegrown algorithms do the heavy lifting of feature engineering, model selection, training, and optimization. It’s not magic. You’re still expected to have an understanding of your data set and the capacity to interpret the output. But H2O’s visual tools and clear explanations go a long way toward bridging understanding across business teams, IT, and data scientists. Data scientists and developers can lift the hood to finesse model parameters and build out functions with Python and R. Jupyter notebooks export machine learning pipeline code for production. Whether on-premises or in the cloud, H2O can work with your existing big data infrastructure such as Hadoop or Spark, ingest data from HDFS, Amazon S3, or Azure Data Lake, and tap Nvidia GPU processing for additional speed. H2O pushed several significant updates in 2018—most importantly for natural language processing, time series prediction, and gradient boosting. Visual decision trees now graphically step users through understanding “how” a prediction was made—clarifying, for example, why an insurance claim was flagged as fraudulent. H2O even started making its algorithms directly available on the Amazon and Microsoft clouds. H2O Driverless AI won’t put data engineers out on the street or solve every advanced machine learning problem.
But it provides a streamlined alternative to building AI from scratch, reducing the time required for businesses to become more predictive and less reactive. —James R. Borck