Paul Krill
Editor at Large

ML.NET 2.0 enhances text classification

Nov 23, 2022
2 mins

Upgrade to Microsoft’s machine learning framework for .NET improves model building for text classification, introduces a sentence similarity API, and adds more AutoML capabilities.

Microsoft has launched ML.NET 2.0, a new version of its open source, cross-platform machine learning framework for .NET. The upgrade adds capabilities for text classification and automated machine learning (AutoML).

Unveiled November 10, ML.NET 2.0 arrived in tandem with a new version of the ML.NET Model Builder, a visual developer tool for building machine learning models for .NET applications. The Model Builder introduces a text classification scenario that is powered by the ML.NET Text Classification API.

Previewed in June, the Text Classification API enables developers to train custom models to classify raw text data. The API fine-tunes a pre-trained TorchSharp NAS-BERT model from Microsoft Research on the developer’s own data. The Model Builder scenario supports local training on either CPUs or CUDA-compatible GPUs.

Also in ML.NET 2.0:

  • Preconfigured automated machine learning pipelines for binary classification, multiclass classification, and regression models make it easier to begin using machine learning.
  • Data preprocessing can be automated using the AutoML Featurizer.
  • Developers can choose which trainers are used as part of a training process. They also can choose tuning algorithms used to find optimal hyperparameters.
  • Advanced AutoML training options let developers choose trainers and select an evaluation metric to optimize.
  • A sentence similarity API, using the same underlying TorchSharp NAS-BERT model, calculates a numerical value representing the similarity of two phrases.
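
To make the idea of a similarity score concrete, here is a minimal Python sketch that returns a single number for a pair of phrases. It is not the ML.NET API and does not use NAS-BERT: it swaps in plain cosine similarity over word counts as a stand-in for a learned model, and the function names `tokenize` and `similarity` are hypothetical, chosen only for this illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def similarity(a: str, b: str) -> float:
    """Cosine similarity between word-count vectors, from 0.0 to 1.0.

    Illustrative stand-in only: a learned model such as NAS-BERT scores
    meaning, whereas this function only measures shared words.
    """
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    # Dot product over the words the two phrases have in common.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    # An empty phrase has no direction, so treat it as dissimilar.
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    print(similarity("ML.NET adds text classification",
                     "ML.NET improves text classification"))
    print(similarity("ML.NET adds text classification",
                     "The weather is sunny today"))
```

A word-count approach treats "car" and "automobile" as unrelated, which is the kind of gap a fine-tuned language model is meant to close.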

Future plans for ML.NET include expanding deep learning coverage and emphasizing use of the LightGBM framework for classical machine learning tasks such as regression and classification. The developers behind ML.NET also intend to improve the AutoML API to enable new scenarios and customizations and simplify machine learning workflows.

Paul Krill

Paul Krill is editor at large at InfoWorld. Paul has been covering computer technology as a news and feature reporter for more than 35 years, including 30 years at InfoWorld. He has specialized in coverage of software development tools and technologies since the 1990s, and he continues to lead InfoWorld’s news coverage of software development platforms including Java and .NET and programming languages including JavaScript, TypeScript, PHP, Python, Ruby, Rust, and Go. Long trusted as a reporter who prioritizes accuracy, integrity, and the best interests of readers, Paul is sought out by technology companies and industry organizations who want to reach InfoWorld’s audience of software developers and other information technology professionals. Paul has won a “Best Technology News Coverage” award from IDG.
