May 24, 2023

Apache Flink is the Industry Standard and It’s Really Obvious

By Eric Sammer

Here at Decodable, we have long believed that Apache Flink is the best stream processing system. It has a proven track record of meeting the demands of some of the largest and most sophisticated businesses in the world, including Netflix, Uber, and Stripe. We are excited to welcome Confluent to the ranks of companies that, like us, regard Flink as the industry standard for real-time stream processing at businesses large and small. Our growing team of experts is dedicated to a single vision: unleashing the power of stream processing so that companies can build the real-time applications and services they need to stay competitive and to meet, and exceed, their customers' present and future expectations.

The Future is Real-Time Stream Processing

Businesses are using stream processing to make smarter and faster decisions by acting on time-sensitive and mission-critical events, performing real-time analytics, and building applications with features delivered to end-users in real time. While the spectrum of use cases continues to expand across an ever-widening range of domains, common applications of stream processing include fraud detection, real-time transformation and ingestion of application events, processing IoT sensor data, network monitoring, generating context-aware online advertising, cybersecurity analysis, geofencing and vehicle tracking, and many others.

The Decodable Promise

Decodable has taken best-of-breed open source projects, including Apache Flink and Debezium, and built a powerful, fully-managed stream processing platform. That platform is our sole focus. We provide a simple, easy-to-use developer experience using SQL or Java. Operated and maintained by streaming experts, Decodable works with your existing stack, so you can build real-time data pipelines without having to stitch together the lower-level building blocks yourself.

Why Decodable is Built on Apache Flink

For us, the choice to build our platform on Apache Flink was clear. One of the most important indicators of success for any open source project is a diverse, active community. Flink receives contributions from engineers across many industries, including fintech, retail and e-commerce, gaming, insurance, and logistics. Each industry has different requirements and therefore pushes on different aspects of the project, which makes it more robust for everyone.

Flink also works with a wide array of cloud providers and storage systems, and it has connectors for the most critical and popular data infrastructure in use today. Its clear data processing semantics, robust state management, and job recovery together help to ensure correctness. These are properties that take millions of hours of production time to develop and validate.

Flink supports several API layers, ranging from highly flexible, low-level primitives to purpose-built, high-level APIs that the engine can easily optimize. This gives us and our users options for a vast array of use cases.

And finally, there is our own experience. The team at Decodable has significant experience building and running Flink at scale. It includes seasoned experts in the data streaming space, such as Robert Metzger (Apache Flink PMC Chair) and Gunnar Morling (former project lead for Debezium at Red Hat), alongside others who have operated these systems at scale.

A Fully-Managed Stream Processing Platform

Our approach to the challenge of stream processing was to develop a system that provides simple APIs with the right semantics and abstractions for a real-time data platform, and that keeps the underlying complexity inside the platform. Rather than worrying about low-level details, you should be able to think about real-time data processing in terms of:

Streams, each of which is a log of schema-compatible data records. The platform enforces schema compatibility when streams, connections, and pipelines are attached to each other.

Connections to external data systems, which transport data between your data infrastructure and Decodable. A connection is an instance of a connector, which encapsulates the logic for connecting to a specific type of system.

Pipelines that process data in streams, which are defined by SQL statements or code written against the Flink APIs. A pipeline reads input from one or more streams, performs processing, and outputs to exactly one stream.
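To make the three concepts concrete, here is a minimal sketch in plain Python. It is not the Decodable API: the class names, the `ingest` and `run` methods, the sample stream names, and the rule that "compatible" means "identical schema" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Stream:
    """A named log of records that all share one schema."""
    name: str
    schema: dict                      # field name -> type name
    records: list = field(default_factory=list)


@dataclass
class Connection:
    """An instance of a connector, attached to a single stream."""
    connector: str                    # type of external system, e.g. "kafka"
    schema: dict
    stream: Stream

    def __post_init__(self):
        # Assumed rule: "compatible" means the two schemas are identical.
        if self.schema != self.stream.schema:
            raise ValueError(f"schema mismatch attaching to {self.stream.name!r}")

    def ingest(self, external_records):
        # Hypothetical helper: copy records from the external system into the stream.
        self.stream.records.extend(external_records)


@dataclass
class Pipeline:
    """Reads one or more input streams and writes to exactly one output stream."""
    inputs: list
    output: Stream
    transform: Callable[[dict], Optional[dict]]

    def run(self):
        for stream in self.inputs:
            for record in stream.records:
                result = self.transform(record)
                if result is not None:      # None means "filter this record out"
                    self.output.records.append(result)


# Usage: ingest two HTTP events, then keep only server errors.
event_schema = {"path": "STRING", "status": "INT"}
raw = Stream("http_events", event_schema)
errors = Stream("http_errors", event_schema)

source = Connection("kafka", event_schema, raw)
source.ingest([{"path": "/a", "status": 200}, {"path": "/b", "status": 500}])

Pipeline([raw], errors, lambda r: r if r["status"] >= 500 else None).run()
print(errors.records)  # [{'path': '/b', 'status': 500}]
```

In this sketch the schema check happens once, when a connection is attached to a stream, rather than per record. The pipeline takes a list of inputs but a single output, mirroring the "one or more streams in, exactly one stream out" rule described above.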

The diagram below illustrates how connections, streams, and pipelines interact to ingest, process, and send data:

For a closer look at how we designed our platform architecture, be sure to check out this blog post by one of our founding engineers, Sharon Xie.

Apache Flink Beyond the Basics

Decodable’s stream processing platform is more than the sum of its parts. Offering a fully-managed, pre-integrated environment provides several important advantages and customer benefits.

  • Powerful Stream Processing with Flink SQL and Flink APIs. Decodable lets you process data with SQL, which is familiar to anyone who has used a relational database. For customers with custom JVM-based code, Decodable also supports running your own Apache Flink jobs.
  • Fully Managed Platform as a Service. The Decodable data streaming infrastructure is built on Apache Flink, a powerful, secure, and reliable open source technology, delivering fast time-to-value without the complexities.
  • Fast Time to Deployment. Deploy pipelines without provisioning, configuring, or managing any infrastructure. This cuts costly, time-consuming tasks and the risk of human error, so you see results in minutes, not months.
  • Out-of-the-Box Connector Library. The Decodable platform includes a large and growing library of out-of-the-box connectors to enable data ingestion from any data source, including databases with change-data capture, and egress to any data sink with minimal configuration and setup.
  • CDC Source Connectors. Build any data store into your streaming application using simple Change Data Capture (CDC) connectors. Emit change streams to any compatible data sink.
  • Scale Seamlessly with Task-Based Sizing and Pricing. One of the primary drawbacks of stream processing is that it can be “resource hungry,” and many organizations deal with this by over-provisioning, which can be very costly. Decodable overcomes this by using task sizing to dynamically configure workloads to maximize performance and minimize cost.
  • Simple Web-Based Interface. With Decodable, you can rapidly prototype and iterate stream processing pipelines and deploy to production via an optimized developer “inner loop” that provides an intuitive, smooth developer workflow.
  • Comprehensive API and Scriptable CLI. Decodable delivers a fully integrated platform with simple abstractions. It is equipped with a comprehensive API and scriptable CLI for automation and integration with existing GitOps tools and processes.
  • SOC2 Type II and GDPR compliant. The Decodable platform is architected for reliability and security, with safety and privacy protocols built in. The platform also includes separate control and data planes, and logs for compliance and audits.
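
As an illustration of the change streams mentioned in the CDC bullet above, the sketch below applies a few change events to an in-memory table. It assumes a simplified form of the Debezium event envelope, with an `op` code (`c` for create, `u` for update, `d` for delete, `r` for snapshot read) plus `before` and `after` row images. The `apply_change` function, the `id` key, and the sample rows are hypothetical, and the sketch ignores updates that change the primary key.

```python
def apply_change(table, event, key="id"):
    """Apply one Debezium-style change event to a dict keyed by primary key."""
    op = event["op"]
    if op in ("c", "r", "u"):
        # Creates, snapshot reads, and updates all carry the new row in "after".
        row = event["after"]
        table[row[key]] = row
    elif op == "d":
        # Deletes carry the last known row in "before".
        table.pop(event["before"][key], None)
    else:
        raise ValueError(f"unknown op {op!r}")
    return table


events = [
    {"op": "c", "before": None, "after": {"id": 1, "email": "a@example.com"}},
    {"op": "c", "before": None, "after": {"id": 2, "email": "b@example.com"}},
    {"op": "u", "before": {"id": 1, "email": "a@example.com"},
     "after": {"id": 1, "email": "a@new.example.com"}},
    {"op": "d", "before": {"id": 2, "email": "b@example.com"}, "after": None},
]

table = {}
for event in events:
    apply_change(table, event)

print(table)  # {1: {'id': 1, 'email': 'a@new.example.com'}}
```

Replaying the events in order reproduces the current state of the source table, which is what allows a change stream to be emitted to any sink that can apply inserts, updates, and deletes.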

Advantages of a Fully-Managed Platform

Companies can research the ecosystem of open source projects and supporting technologies, then assemble, deploy, and integrate them to meet their real-time stream processing needs. Or they can work with a partner that offers a pre-integrated, production-ready platform backed by an experienced team dedicated to their success. The real value lies in the processing and analytics that support your apps and services, so our job is to free you from having to build the platform underneath them. At Decodable, we want to help you hit the ground running and achieve your streaming goals today.

Eric Sammer

Eric Sammer is a data analytics industry veteran who has started two companies: Rocana (acquired by Splunk in 2017) and Decodable. He is an author, engineer, and leader on a mission to help companies move and transform data to achieve new and useful business results. Eric is a speaker on topics including data engineering, ML/AI, real-time data processing, entrepreneurship, and open source. He has spoken at events including the RTA Summit and Current, on podcasts with Software Engineering Daily and Sam Ramji, and has appeared in various industry publications.
