Paul Krill
Editor at Large

Microsoft unveils imaging APIs for Windows Copilot Runtime

news
Nov 19, 20242 mins

Generative AI-backed APIs will allow developers to build image super resolution, image segmentation, object erase, and OCR capabilities into Windows applications.

Generative AI, robotic hand with paintbrush
Credit: Neil Lockhart/Shutterstock

Microsoft’s Windows Copilot Runtime, which allows developers to integrate AI capabilities into Windows, is being fitted with AI-backed APIs for image processing. It will also gain access to Phi 3.5 Silica, a custom-built generative AI model for Copilot+ PCs.

Announced at this week’s Microsoft Ignite conference, the new Windows Copilot Runtime imaging APIs will be powered by on-device models that enable developers and ISVs to integrate AI within Windows applications securely and quickly, Microsoft said. Most of the APIs will be available in January through the Windows App SDK 1.7 experimental 2 Experimental release.

Developers will be able to bring AI capabilities into Windows apps via these APIs:

  • Image description, providing a text description of an image.
  • Image super resolution, increasing the fidelity of an image as well as upscaling the resolution of an image.
  • Image segmentation, enabling the separation of foreground and background of an image, along with removing specific objects or regions within an image. Image editing or video editing apps will be able to incorporate background removal using this API, which is powered by the Segment Anything Model (SAM).
  • Object erase, enabling erasing of unwanted objects from an image and blending the erased area with the remainder of the background.
  • Optical character recognition (OCR), recognizing and extracting text present within an image.

Phi 3.5 Silica, built from the Phi series of models, will be included in the Windows Copilot Runtime out of the box. It will be custom-built for the Snapdragon X series neural processing unit (NPU) in Copilot+ PCs, enabling text intelligence capabilities such as text summarization, text completion, and text prediction, Microsoft said.

Paul Krill

Paul Krill is editor at large at InfoWorld. Paul has been covering computer technology as a news and feature reporter for more than 35 years, including 30 years at InfoWorld. He has specialized in coverage of software development tools and technologies since the 1990s, and he continues to lead InfoWorld’s news coverage of software development platforms including Java and .NET and programming languages including JavaScript, TypeScript, PHP, Python, Ruby, Rust, and Go. Long trusted as a reporter who prioritizes accuracy, integrity, and the best interests of readers, Paul is sought out by technology companies and industry organizations who want to reach InfoWorld’s audience of software developers and other information technology professionals. Paul has won a “Best Technology News Coverage” award from IDG.

More from this author