Grant Gross
Senior Writer

IBM, Mayo form open source health IT consortium

Apr 2, 2009

The Open Health Natural Language Processing Consortium, announced Thursday, will focus on technology to allow for large-scale data aggregation

Biomedical informatics researchers at IBM and the Mayo Clinic have launched a new open source consortium focused on natural language processing (NLP), in an effort to help doctors share diagnosis and treatment information.

The Open Health Natural Language Processing Consortium, announced Thursday, will focus on technology for large-scale data aggregation, letting doctors mine medical records in their specialties for similar cases to study before making difficult diagnoses or determining treatment.

Doctors will be able to review any physician notes on similar cases, but no personally identifiable patient information will be available in the database, IBM and Mayo said.

With the launch of the consortium, the two organizations have released two projects under open source licenses, one focused on clinical notes and one on pathology reports. The consortium is using the Apache license, version 2.0.

The organizations are inviting others to help develop NLP tools. “By making it an open source initiative, we hope to enable wide use of these NLP tools so medical advancements can happen faster and more efficiently,” Dr. Christopher Chute, a Mayo Clinic bioinformatics expert and senior consultant on the project, said in a statement.

Two other health care organizations, Seattle Group Health and the U.S. Department of Veterans Affairs Boston Healthcare System, plan to participate in the consortium, and other participants are welcome, IBM and Mayo said.

As more health care providers adopt electronic health records, it will become increasingly important to be able to search those records, the organizations said. Mayo and IBM have developed a system, based on IBM’s open source Unstructured Information Management Architecture, or UIMA, for extracting information from more than 25 million text-based clinical notes, they said.

The two organizations have also developed a system to extract cancer disease characteristics from pathology reports, allowing for the computation of cancer stage.

“Large-scale information extraction from the clinical narrative is a vital component in advancing translational research and patient care,” Guergana Savova, a medical informatics specialist and Mayo’s lead on the project, said in a statement. “It ‘unlocks’ the clinical textual data that resides in huge repositories. Such technology would allow for large-scale data aggregation, analyses and usage — just imagine the power of data from millions of patients.”

The organizations have not yet determined what NLP projects to work on next, an IBM spokeswoman said. “The goal is to first get feedback from participating institutions on the initial project, and then expand,” she said.

