Keeping Things Simple

Decodable Team

Stream Processing Lags Streaming Adoption

Why hasn’t stream processing experienced hockey-stick growth like streaming has? In our experience, the majority of data (by some estimates, as much as 80%) passing through real-time streaming platforms (Kafka, Kinesis, Pulsar, etc.) is not transformed in flight, just passed along. Google Trends provides solid evidence that stream processing is not enjoying the widespread adoption it deserves:

[Figure: Google Trends chart]

Culture and Skills are a Drag

We see three main reasons why stream processing isn’t growing in proportion to streaming, all rooted in culture and skills:

1) Transforming data in flight is a lot harder than simply writing to/reading from a stream. Most engineers don’t have the expertise in streaming to intercept messages in flight, and many don’t know Java/Scala/Python well enough to perform the transformations reliably. The business demand for in-flight data processing talent far outstrips the supply, throttling adoption of the technique.

2) Data engineers have muscle memory around batch ELT, so they’ll use Kafka/Kinesis/etc. to move the data to a database and then transform it after it lands (a very common antipattern). Since the rise of the internet, nearly everything in the world has become instant except our data centers, which still run on batch-based architectures. Customers purchase in real time, but inventory systems, dashboards, and sales software operate with hours or days of lag. So even when companies use Kafka, users often cram a square peg into a round hole because that’s what they’re used to. As a result, your real-time app or dashboard becomes…well, very not real-time, and your Event Driven Architecture becomes just an Architecture.

3) Streaming transformation platforms are historically hard to operate. Don’t believe this? Put yourself in the shoes of a data platform team leader and check your reaction as your users ask for an open source software solution (say, Storm or Flink) so they can run transform jobs. Now your team is supporting an open source technology with no support backstop, and one for which talent and experience are in hot demand. What’s worse, you’re supporting users, many of whom (as stated in #1) are not skilled at streaming OR Java/Scala/Python. Your users are essentially chaos monkeys…and guess who’s going to get pulled into rewriting jobs constantly? Supporting odd workloads in Kafka can be tough, but supporting odd workloads in a stream processing platform is untenable.
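
To make the contrast in points 1 and 2 concrete, here is a minimal, illustrative Python sketch. It is not how any particular platform implements this: a plain in-memory list stands in for a stream, and the event fields (`sku`, `qty`, `price_cents`) and function names are hypothetical. It compares passing events along untouched with filtering and enriching each event while it is in flight.

```python
from typing import Iterable, Iterator

# Hypothetical purchase events; an in-memory list stands in for a real stream.
EVENTS = [
    {"sku": "A1", "qty": 2, "price_cents": 500},
    {"sku": "B2", "qty": 0, "price_cents": 1200},  # invalid: zero quantity
    {"sku": "C3", "qty": 1, "price_cents": 250},
]


def pass_through(events: Iterable[dict]) -> Iterator[dict]:
    """Move events along unchanged; any cleanup happens later, after landing."""
    for event in events:
        yield event


def transform_in_flight(events: Iterable[dict]) -> Iterator[dict]:
    """Filter and enrich each event as it passes, before it reaches a sink."""
    for event in events:
        if event["qty"] <= 0:
            continue  # drop invalid records before they land anywhere
        # Add a derived field without mutating the original event.
        yield {**event, "total_cents": event["qty"] * event["price_cents"]}


raw = list(pass_through(EVENTS))
clean = list(transform_in_flight(EVENTS))
print(len(raw), "raw events;", len(clean), "transformed events")
```

In the pass-through version, the invalid record and the missing total would have to be handled by a later batch job once the data lands. In the in-flight version, downstream consumers receive clean, enriched events as they arrive.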

Eliminate the Friction

The world needs a platform that makes streaming transformation easy. Easy for the people writing jobs and easy for people supporting the platform.

Make it easy for the people writing transformation jobs:

An intuitive UI, CLI, and API
Jobs in SQL, not coding languages
Crafted guardrails to prevent mistakes and anti-patterns in the first place
Clear error messages and documentation
Easy to connect sources and deliver to destinations

Make it easy for the team operating the platform:

Autoscaling performance: gigabytes per second at latencies in the tens of milliseconds
No clusters, no CSU saturation, no AZ considerations, no CPU/disk monitoring
Predictable pricing, not pay-per-event; you don’t have to be a forensic accountant
Robust telemetry and monitoring
Simplified CSP (Connection, Stream, Pipeline) paradigm. With other tools, you have to look up properties and create connections in DDL. We’ve already boxed up the config parameters and tuned them. We won’t ever ask about delivery modes, consumer groups, or offsets.

At Decodable we’re on a mission to make stream processing intuitive, safe, correct and fast.

Try our quickstart walkthrough and have a pipeline up in 3 minutes.
