Serdar Yegulalp
Senior Writer

CrateDB packs NoSQL flexibility, SQL familiarity

news analysis
Dec 19, 2016
2 mins

The open source database includes Elasticsearch-style full-text search, SQL querying, uncomplicated clustering, and unpack-and-go installation

[Image: wooden crates in storage. Credit: Frank Winkler]

CrateDB, an open source, clustered database designed for tasks such as fast text search and analytics, released its first full 1.0 version last week after three years in development.

It’s built upon several existing open source technologies — Elasticsearch and Lucene, for instance — but no direct knowledge of them is needed to deploy it, as CrateDB offers more than a repackaging of those products.

The database caught the attention of InfoWorld’s Peter Wayner back in 2015 because it promised “a search engine like [Apache] Lucene [and ‘its larger, scalable, and distributed cousin Elasticsearch’], but with the structure and querying ease of SQL.”

The idea is to provide more than a full-text search system. CrateDB’s use cases include big data analytics and scalable aggregations across large data sets. It allows querying via standard ANSI SQL, but it uses a distributed, horizontally scalable architecture, so that any number of nodes can be spun up and run side by side with minimal work.
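
To make the SQL-over-a-cluster idea concrete, here is a minimal sketch of sending an aggregation query to a node over CrateDB's HTTP `_sql` endpoint, using only the Python standard library. The host, the `page_views` table, and its columns are assumptions invented for illustration; nothing here comes from the CrateDB project's own examples.

```python
import json
import urllib.request

# Assumption: a CrateDB node is reachable on localhost at its HTTP port.
CRATE_SQL_URL = "http://localhost:4200/_sql"


def build_sql_request(stmt, args=None, url=CRATE_SQL_URL):
    """Package a SQL statement as a JSON POST request for the _sql endpoint."""
    payload = {"stmt": stmt}
    if args:
        payload["args"] = list(args)  # values bound to the ? placeholders
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(stmt, args=None):
    """Send the statement to a running node and return the decoded JSON reply."""
    with urllib.request.urlopen(build_sql_request(stmt, args)) as resp:
        return json.load(resp)


# Hypothetical table and columns, used only to show an aggregation.
AGGREGATE_STMT = (
    "SELECT country, COUNT(*) AS hits, AVG(latency_ms) AS avg_latency "
    "FROM page_views WHERE ts >= ? "
    "GROUP BY country ORDER BY hits DESC LIMIT 10"
)
```

With a node running, `run_query(AGGREGATE_STMT, ["2016-12-01"])` would post the statement to that node; the client code is the same whether the cluster has one node or many.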

CrateDB gets two major advantages from the NoSQL side. One is support for unstructured data via JSON documents and BLOB storage, with JSON data queryable through SQL as well. The other is fast writes, which make the database a suitable target for high-speed data ingestion à la Hadoop.
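
As a rough illustration of JSON data being queryable through SQL, the sketch below builds statements that store a nested document in an object column and then filter on one of its fields using CrateDB's bracket subscript syntax. The `events` table, the `payload` column, the sample document, and the helper names are all hypothetical.

```python
def object_path(column, dotted_path):
    """Turn 'user.name' into subscript syntax: column['user']['name']."""
    keys = [k for k in dotted_path.split(".") if k]
    # Double any single quote so a key cannot break out of the literal.
    return column + "".join("['{}']".format(k.replace("'", "''")) for k in keys)


# Hypothetical schema: one object column that accepts arbitrary nested fields.
CREATE_STMT = (
    "CREATE TABLE events ("
    " id STRING PRIMARY KEY,"
    " payload OBJECT(DYNAMIC)"
    ")"
)

# The document is passed as an ordinary statement parameter.
INSERT_STMT = "INSERT INTO events (id, payload) VALUES (?, ?)"
INSERT_ARGS = ["evt-1", {"user": {"name": "alice", "plan": "pro"}, "tags": ["a", "b"]}]


def select_by_field(table, column, dotted_path):
    """Build a SELECT that returns and filters on one nested field."""
    field = object_path(column, dotted_path)
    return "SELECT id, {f} FROM {t} WHERE {f} = ?".format(f=field, t=table)
```

Calling `select_by_field("events", "payload", "user.name")` produces a plain SQL string that addresses the nested field directly, so the JSON document does not need to be flattened into columns first.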

But CrateDB’s biggest draw may be the setup process and the overall level of get-in-and-go usability. The only prerequisite is Java 8, or you can use Docker to run a provided container image. Nodes automatically discover each other as long as they’re on a network that supports multicast. The web UI can bootstrap a cluster with sample data (courtesy of Twitter), and the command-line shell uses conventional SQL syntax for inserting and querying data. Also included is support for PostgreSQL’s wire protocol, although any actual SQL commands sent through it need to adhere to CrateDB’s implementation of SQL.
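
Because CrateDB speaks PostgreSQL's wire protocol, a stock PostgreSQL client library should be able to connect to it. The sketch below assumes the third-party `psycopg2` driver is installed and that the node accepts connections on port 5432 with a `crate` user and a `doc` database name; those connection values are assumptions to check against your own deployment. Autocommit is switched on so the driver does not open a transaction on its own, since statements still have to fit CrateDB's SQL dialect.

```python
def build_dsn(host="localhost", port=5432, user="crate", dbname="doc"):
    """Assemble a libpq-style connection string (all defaults are assumptions)."""
    parts = {"host": host, "port": port, "user": user, "dbname": dbname}
    return " ".join("{}={}".format(key, value) for key, value in parts.items())


def fetch_rows(stmt, params=(), dsn=None):
    """Run one statement over the PostgreSQL wire protocol and return all rows."""
    import psycopg2  # third-party driver, imported lazily so the helper above runs without it

    conn = psycopg2.connect(dsn or build_dsn())
    try:
        conn.autocommit = True  # avoid an implicit BEGIN from the driver
        with conn.cursor() as cur:
            cur.execute(stmt, params)
            return cur.fetchall()
    finally:
        conn.close()


# Example statement against CrateDB's sys.cluster system table.
CLUSTER_NAME_STMT = "SELECT name FROM sys.cluster"
```

With a node running, `fetch_rows(CLUSTER_NAME_STMT)` would return the cluster name as a list of row tuples, the same way the call would behave against a PostgreSQL server.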

CrateDB is one of a flood of recent database products that each address specific issues: scalability, resiliency, mixing modalities (NoSQL vs. SQL, document vs. graph), high-speed writes, and so on. The philosophy behind such products generally runs like this: Existing solutions are too old, hidebound, or legacy-oriented to solve current and future problems, so we need a clean slate. The trick will be to see whether the benefits of the clean slate outweigh the difficulties of moving to it, which explains CrateDB’s emphasis on usability and quick starts.

Serdar Yegulalp is a senior writer at InfoWorld. A veteran technology journalist, Serdar has been writing about computers, operating systems, databases, programming, and other information technology topics for 30 years. Before joining InfoWorld in 2013, Serdar wrote for Windows Magazine, InformationWeek, Byte, and a slew of other publications. At InfoWorld, Serdar has covered software development, devops, containerization, machine learning, and artificial intelligence, winning several B2B journalism awards including a 2024 Neal Award and a 2025 Azbee Award for best instructional content and best how-to article, respectively. He currently focuses on software development tools and technologies and major programming languages including Python, Rust, Go, Zig, and Wasm. Tune into his weekly Dev with Serdar videos for programming tips and techniques and close looks at programming libraries and tools.