
Keeping Things Simple

Riley Johnson

Stream Processing Lags Streaming Adoption

Why hasn’t stream processing experienced hockey-stick growth the way streaming has? In our experience, the majority of data (by some estimates, as much as 80%) passing through real-time streaming platforms (Kafka, Kinesis, Pulsar, etc.) is not transformed in flight; it’s just passed along. Google Trends provides solid evidence that stream processing is not enjoying the widespread adoption it deserves.

Culture and Skills are a Drag

We see a few main reasons why real-time processing isn’t growing proportionately, all of which are culture-based:

1) Transforming data in flight is a lot harder than simply writing to/reading from a stream. Most engineers don’t have the expertise in streaming to intercept messages in flight, and many don’t know Java/Scala/Python well enough to perform the transformations reliably. The business demand for in-flight data processing talent far outstrips the supply, throttling adoption of the technique.
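To make the gap concrete, even a minimal hand-rolled consume-transform-produce job has to handle deserialization, per-message errors, and re-publishing. Here is an illustrative Python sketch; the in-memory queues stand in for real topics, and all names and fields are hypothetical:

```python
import json
import queue

# In-memory queues stand in for streaming topics in this illustration.
source_topic = queue.Queue()
sink_topic = queue.Queue()
dead_letters = queue.Queue()

def transform(event: dict) -> dict:
    # Example in-flight transformation: reshape and normalize a payment event.
    return {
        "user_id": event["user_id"],
        "amount_cents": round(event["amount"] * 100),
    }

def process_stream():
    # Consume -> transform -> produce, with per-message error handling:
    # exactly the boilerplate every hand-rolled streaming job repeats.
    while not source_topic.empty():
        raw = source_topic.get()
        try:
            event = json.loads(raw)
            sink_topic.put(json.dumps(transform(event)))
        except (KeyError, ValueError):
            # Malformed or incomplete messages go to a dead-letter queue
            # instead of crashing the whole job.
            dead_letters.put(raw)

source_topic.put(json.dumps({"user_id": "u1", "amount": 19.99}))
source_topic.put("not json")
process_stream()
```

Real jobs add serialization formats, offsets, retries, and scaling on top of this loop, which is where most of the expertise gap lives.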

2) Data engineers have muscle memory around batch ELT, so they’ll use Kafka/Kinesis/etc. to move the data to a database and then transform it after it lands (a very common antipattern). Because of batch-based architectures, nearly everything has become instant since the rise of the internet except our data infrastructure. Customers purchase in real time, but inventory systems, dashboards, and sales software operate with hours or days of lag. So even when companies use Kafka, users often cram a square peg into a round hole because that’s what they’re used to. As a result, your real-time app or dashboard becomes…well, very not real-time, and your Event-Driven Architecture becomes just an Architecture.

3) Streaming transformation platforms are historically hard to operate. Don’t believe it? Put yourself in the shoes of a data platform team lead and check your reaction as your users ask for an open source solution (say, Storm or Flink) so they can run transform jobs. Now your team is supporting an open source technology with no support backstop, one for which talent and experience are in hot demand on the market. What’s worse, you’re supporting users, many of whom (as noted in #1) are not skilled at streaming or Java/Scala/Python. Your users are essentially chaos monkeys…and guess who gets pulled into rewriting jobs constantly? Supporting odd workloads in Kafka can be tough, but supporting odd workloads in a stream processing platform is untenable.

Eliminate the Friction

The world needs a platform that makes streaming transformation easy: easy for the people writing jobs and easy for the people supporting the platform.

Make it easy for the people writing transformation jobs:

  • An intuitive UI, CLI, and API
  • Jobs in SQL, not coding languages
  • Crafted guardrails to prevent mistakes and anti-patterns in the first place
  • Clear error messages and documentation
  • Easy to connect sources and deliver to destinations
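For instance, in this model a transformation job could be nothing more than a SQL statement. The sketch below is illustrative only; the stream and field names are hypothetical, and the syntax follows common streaming-SQL conventions rather than any specific product:

```sql
-- Filter and reshape an input stream into an output stream,
-- running continuously as new records arrive.
INSERT INTO shipped_orders
SELECT
  order_id,
  customer_id,
  amount_cents / 100.0 AS amount_dollars
FROM orders
WHERE status = 'SHIPPED';
```

A data engineer who knows SQL can read and write this without ever touching Java, Scala, or consumer internals, which is the point.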

Make it easy for the team operating the platform:

  • Autoscaling performance: gigabytes per second at latencies in the tens of milliseconds
  • No clusters, no CSU saturation, no AZ considerations, no CPU/Disk monitoring
  • Predictable pricing, not pay-per-event, so you don’t have to be a forensic accountant
  • Robust telemetry and monitoring
  • Simplified CSP (Connection, Stream, Pipeline) paradigm. With other tools, you have to look up properties and create connections in DDL. We’ve already boxed up the configuration parameters and tuned them. We won’t ever ask about delivery modes. We won’t ask about consumer groups or offsets.

At Decodable, we’re on a mission to make stream processing intuitive, safe, correct, and fast.

Try our quickstart walkthrough and have a pipeline up in 3 minutes.

A Practical Introduction to the Data Mesh

There’s been quite a bit of talk about data meshes recently, both in terms of philosophy and technology. Unfortunately, most of the writing on the subject is thick with buzzwords, targeted toward VP- and C-level executives, and unparsable to engineers. The motivation behind the data mesh, however, is not only sound but practical and intuitive.


Announcing General Availability of the Decodable Real-Time Data Platform

Now GA, Decodable puts the power of real-time data engineering in the hands of every developer, so you can use your existing SQL skills to connect sources and sinks and to build and deploy real-time pipelines that just work. Getting started with Decodable is free and easy, and only takes a few minutes.


We’re Abusing The Data Warehouse; RETL, ELT, And Other Weird Stuff.

By now, everyone has seen the rETL (Reverse ETL) trend: you want to use data from app #1 to enrich data in app #2. In this blog, Decodable's founder discusses the (fatal) shortcomings of this approach and how to get the job done.



Start using Decodable today.