Five core attributes of a streaming data platform

opinion
Aug 8, 2016

These attributes are necessary for the implementation of an integrated streaming data platform

As your data-driven organization considers incorporating new data sources like mobile apps, websites that serve a global audience, or sensor information from the internet of things, technologists will have questions about the required attributes of a streaming data platform.

Five core attributes are necessary for the implementation of an integrated streaming platform. Together they allow for both the acquisition of streaming data and the analytics that make streaming applications possible:

Low latency: Streaming data platforms need to match the pace of the data sources they acquire data from. One of the keys to a streaming data platform is the ability to match the speed of data acquisition with the requirements of the near real-time analytics needed to disrupt particular business models or markets. The value of real-time streaming analytics diminishes when you have to wait for the data to land in a data warehouse or a Hadoop-based data lake architecture. In particular, for location-based services and predictive maintenance applications, the delay between when the data is created and when it lands in a data management environment represents, at the least, a missed customer opportunity and, at the worst, a stranded multimillion-dollar asset critical to your business operations.
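To make the latency point concrete, here is a minimal Python sketch that compares the delay between an event's creation and its landing against a latency budget. The function names, the five-second budget, and the sample timestamps are all illustrative assumptions, not figures from any particular platform.

```python
from datetime import datetime, timedelta, timezone


def landing_latency(created_at: datetime, landed_at: datetime) -> timedelta:
    """Delay between when an event was created and when it landed."""
    return landed_at - created_at


def within_budget(created_at: datetime, landed_at: datetime,
                  budget: timedelta) -> bool:
    """True if the event landed soon enough to still be acted upon."""
    return landing_latency(created_at, landed_at) <= budget


# Hypothetical values chosen only for illustration.
created = datetime(2016, 8, 8, 12, 0, 0, tzinfo=timezone.utc)
streamed = created + timedelta(milliseconds=200)   # in-flight delivery
batched = created + timedelta(hours=6)             # nightly-style batch load
budget = timedelta(seconds=5)                      # assumed business requirement

print(within_budget(created, streamed, budget))
print(within_budget(created, batched, budget))
```

Under these assumed numbers, the streamed event falls inside the budget and the batch-loaded one does not, which is the gap the paragraph above describes.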

Scalable: Streaming data platforms are not just connecting a couple of data sources behind the corporate firewall. Streaming data platforms need to be able to match the projected growth of connected devices and the internet of things. This means that streaming data platforms will need to be able to stream data from a large number of sources — potentially millions or even billions of sources, both internally and externally.

Diverse: Streaming data platforms will need to support more than “new era” data sources such as mobile devices, cloud sources, or the internet of things. They will also be required to support “legacy” platforms such as relational databases, data warehouses, and operational applications like ERP, CRM, and SCM. These are the platforms that hold the information needed to place data from streaming devices, mobile apps, and browser clicks into context and provide value-added insights.

Centralized: One of the core tenets of a streaming data platform is to make streaming architectures simpler to understand and easier to implement. With a centralized architecture, a streaming data platform can not only reduce the number of potential connections between streaming data sources and streaming data destinations, but also provide a centralized repository of technical and business metadata to enable common data formats and transformations.
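The reduction in connections can be illustrated with simple arithmetic. The sketch below assumes the worst case for a decentralized design, in which every source connects directly to every destination, and compares it with a hub design in which each source and each destination connects once to the central platform. The function names and sample counts are hypothetical.

```python
def point_to_point_links(sources: int, destinations: int) -> int:
    """Worst case: every source connects directly to every destination."""
    return sources * destinations


def centralized_links(sources: int, destinations: int) -> int:
    """Each source and each destination connects once to a central hub."""
    return sources + destinations


# Illustrative source/destination counts, not measurements.
for s, d in [(3, 2), (50, 10), (1000, 20)]:
    print(f"{s} sources, {d} destinations: "
          f"{point_to_point_links(s, d)} direct links vs "
          f"{centralized_links(s, d)} hub links")
```

With 50 sources and 10 destinations, for example, the direct design needs up to 500 connections while the hub design needs 60, and the gap widens as either side grows.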

Durable: The ability to land data in a data warehouse or Hadoop-based data lake environment is a key component of a streaming data platform. This allows not only for the “in-flight” acquisition and analysis of streaming data, but also for historical analysis that can be used to develop pattern-based policy rules or advanced analytical clustering for streaming data analysis and processing.

With these five core attributes as the foundation for your streaming data platform, you can start the technology journey toward building a robust and complete platform that will enable the streaming applications that your data-driven organization will be built upon.

John Myers joined Enterprise Management Associates in 2011 as Senior Analyst of Business Intelligence (BI). In this role, John delivers comprehensive coverage of the business intelligence and data warehouse industry with a focus on database management, data integration, data visualization, and process management solutions.

John has years of experience working in areas related to business analytics in professional services consulting and product development roles. He has also helped organizations solve their business analytics problems whether they relate to operational platforms, such as customer care or billing, or applied analytical applications, such as revenue assurance or fraud management.

During his professional career, John has spent over 10 years working with business analytics implementations associated with the telecommunications industry. In this role, John has worked on reducing the complexity of the flood of data associated with the augmented role of telecommunications on everyday activities, including increased importance of smartphone and tablet applications; emerging role of over the top (OTT) video content (IPTV); and potential of machine to machine (M2M) connectivity for smart grids. John was recognized as a key component of the TeleManagement Forum's (TMForum) work on analytics-based content distribution as part of the TMForum's Content Encounter applied solution demonstration series in 2009.

In 2005, John founded the Blue Buffalo Group, a consulting and analysis firm that provides business intelligence expertise to outlets such as BeyeNetwork's Telecom Channel, the Data Warehousing Institute (TDWI), and BillingOSS magazine, and go-to-market industry analysis that enables organizations to penetrate the telecommunications industry vertical. Prior to starting his own firm, John was a technical consulting principal for Subex (formerly Subex/Azure and Azure Solutions) and a senior systems consultant for American Management Systems (now CGI).

John's passions are rooted in analytics associated with the events of business process and the behavioral events of people to provide knowledge on various business models. He brings over fifteen years of experience to every client interaction.

The opinions expressed in this blog are those of John L. Myers and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.
