Serdar Yegulalp
Senior Writer

Adatao enhances Hadoop with natural-language queries and machine learning

news analysis
May 1, 2015 (2 mins)

By leveraging machine learning, the Data Intelligence Platform hopes to make querying Hadoop as easy as typing questions

Last year, data visualization company Adatao — spearheaded by the former engineering director of Google Apps — announced its Data Intelligence Platform, a tool set for querying Hadoop data that’s meant to be as easy as putting together a document in Google Docs.

Today Adatao announced general availability of the tool set, which the company says puts machine intelligence at the disposal of nontechnical users who query Hadoop data in natural language.

Christopher T. Nguyen, CEO of Adatao, describes the goals for the Data Intelligence Platform as “big data for the iPhone generation.” More than simply creating visually appealing reports from data, this involves putting machine-learning techniques within reach of less technical users.

In a demo of the product’s visual-reporting system, called Adatao Narratives, Nguyen showed how the natural-language querying process worked, using a data set that listed airline delays over the course of a year. By typing “show relationship between arrdelay [arrival delay] and month,” he produced a graph depicting that relationship. (Those who don’t want to use natural language can work with SQL queries, R, or Python.)
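For readers who take the Python route, the kind of result that query asks for can be approximated as the average arrival delay per month. The sketch below is illustrative only and is not how Adatao parses or answers queries: the `parse_query` and `relationship` helpers, the regular expression, and the sample rows are all hypothetical, and only the query text and the `arrdelay` and `month` column names come from the demo.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; NOT taken from the airline data set used in the demo.
ROWS = [
    {"month": 1, "arrdelay": 12.0},
    {"month": 1, "arrdelay": 8.0},
    {"month": 2, "arrdelay": 3.0},
    {"month": 2, "arrdelay": 5.0},
    {"month": 3, "arrdelay": 20.0},
]

# Toy pattern that recognizes one phrasing only. A real natural-language
# layer would need far more than a single regular expression.
QUERY_PATTERN = re.compile(
    r"show relationship between (\w+) and (\w+)", re.IGNORECASE
)


def parse_query(text):
    """Extract (measure column, grouping column) from the query text."""
    match = QUERY_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported query: {text!r}")
    return match.group(1).lower(), match.group(2).lower()


def relationship(rows, text):
    """Return the mean of the measure column for each grouping value."""
    measure, group = parse_query(text)
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group]].append(row[measure])
    # Sort by grouping value so months come out in calendar order.
    return {key: mean(values) for key, values in sorted(buckets.items())}


result = relationship(ROWS, "show relationship between arrdelay and month")
print(result)  # {1: 10.0, 2: 4.0, 3: 20.0}
```

Plotting those per-month averages would give a rough equivalent of the graph shown in the demo, although the product may well summarize the relationship differently.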

The lowest levels of the product’s stack are familiar big data infrastructures: Hadoop, Amazon Redshift, conventional DBMSes, and the rest. Other common big data access tools — such as Spark, Presto, and Cloudera Impala — sit on top and are used to perform raw queries.

From there, Adatao adds a machine-learning layer called Predictive Engine, and above that an application-building layer used to build the likes of Narratives or Adatao’s dashboarding tools. An open source layer called Distributed Data Frame allows vendors or users to create their own abstractions to data sources by way of Spark.

Because machine learning is such a broad term, it can be tough to assess its use in a given product. While Adatao builds on and connects with a slew of open source technologies, the product itself is proprietary, so it is harder to tell exactly how machine learning is applied under the hood.

Most of what was shown in the demo falls in the realm of predictive analytics, an area of machine learning where cloud vendors such as Amazon, Microsoft, and IBM are competing to build easy-to-use services. If Adatao works to provide a more intuitive way to use those products, it will fare better than if it tries to compete only on the algorithm side.

Serdar Yegulalp is a senior writer at InfoWorld. A veteran technology journalist, Serdar has been writing about computers, operating systems, databases, programming, and other information technology topics for 30 years. Before joining InfoWorld in 2013, Serdar wrote for Windows Magazine, InformationWeek, Byte, and a slew of other publications. At InfoWorld, Serdar has covered software development, devops, containerization, machine learning, and artificial intelligence, winning several B2B journalism awards including a 2024 Neal Award and a 2025 Azbee Award for best instructional content and best how-to article, respectively. He currently focuses on software development tools and technologies and major programming languages including Python, Rust, Go, Zig, and Wasm. Tune into his weekly Dev with Serdar videos for programming tips and techniques and close looks at programming libraries and tools.
