
Introducing the Decodable SDK for Custom Pipelines

Gunnar Morling
Decodable

Today it’s my great pleasure to announce the first release of a software development kit (SDK) for Decodable Custom Pipelines! This SDK helps you to build custom Apache Flink® jobs, tightly integrated with managed streams and connectors of the Decodable platform. In this blog post I’d like to discuss why we’ve built Custom Pipelines and this SDK, and how to get started with using the SDK.

Decodable Custom Pipelines

Custom Pipelines, written in any JVM language such as Java or Scala, complement Decodable’s support for SQL-based data streaming pipelines. While SQL is a great choice for a large share of data streaming use cases, allowing you to map and project, filter and join, group and aggregate your data, in some cases it might not be flexible enough.

Based on the powerful Java-based APIs of Apache Flink, Custom Pipelines allow you to implement arbitrary custom logic in your data flows: invoke external web services, integrate third-party connectors which aren't natively supported by Decodable, and much more. Just like with SQL pipelines, you do not need to provision, scale, or tune any infrastructure to leverage Custom Pipelines.

As such, Custom Pipelines are the perfect escape hatch for cases where SQL doesn’t quite cut it. At the same time, we wanted Custom Pipelines to be closely integrated with the rest of the platform. When running your own Flink jobs, it should be possible to benefit from Decodable managed connectors and persistent, replayable data streams. This is where the new Custom Pipelines SDK comes in, providing you with all the right integration points needed.

Getting Started With the Custom Pipelines SDK

The simplest way to get started with building Flink jobs for the Decodable platform is to use the quickstart archetype provided by Apache Flink. Run the following command to create a new Maven project:
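The archetype coordinates below are Flink's official ones; adjust the archetype version to the Flink version you are targeting, and the project coordinates (com.example, purchase-order-processor) to your liking:

```bash
mvn archetype:generate \
  -DarchetypeGroupId=org.apache.flink \
  -DarchetypeArtifactId=flink-quickstart-java \
  -DarchetypeVersion=1.16.1 \
  -DgroupId=com.example \
  -DartifactId=purchase-order-processor \
  -Dversion=1.0-SNAPSHOT \
  -DinteractiveMode=false
```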

The SDK is available from Maven Central, so you can simply add it as a dependency to the pom.xml file of the newly generated project:
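Assuming the SDK's co.decodable:decodable-pipeline-sdk coordinates, the dependency looks as follows (check Maven Central for the latest released version):

```xml
<dependency>
  <groupId>co.decodable</groupId>
  <artifactId>decodable-pipeline-sdk</artifactId>
  <version>1.0.0.Beta1</version>
</dependency>
```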

Alternatively, if you are using Gradle, add the dependency to your build.gradle file like so:
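Again, substitute the current version as published on Maven Central:

```groovy
dependencies {
    implementation 'co.decodable:decodable-pipeline-sdk:1.0.0.Beta1'
}
```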

With the SDK dependency in place, you can start developing your stream processing job using the full power of Flink. At the time of publication, Flink’s DataStream API is supported by the SDK, and support for the Table API is on the roadmap. For retrieving data from Decodable streams as well as sending data to them, the SDK provides the DecodableStreamSource and DecodableStreamSink interfaces. You can use these two interfaces to integrate your custom stream processing logic with managed connectors, or other SQL-based pipelines.

Here’s an example of a typical job implementation:
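The following sketch shows the overall shape of such a job; the package name and builder methods are illustrative, so refer to the SDK's API documentation for the exact signatures:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.formats.json.JsonDeserializationSchema;
import org.apache.flink.formats.json.JsonSerializationSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

import co.decodable.sdk.pipeline.DecodableStreamSink;
import co.decodable.sdk.pipeline.DecodableStreamSource;

public class PurchaseOrderProcessingJob {

  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // Source backed by the managed "purchase-orders" Decodable stream
    DecodableStreamSource<PurchaseOrder> source =
        DecodableStreamSource.<PurchaseOrder>builder()
            .withStreamName("purchase-orders")
            .withDeserializationSchema(new JsonDeserializationSchema<>(PurchaseOrder.class))
            .build();

    // Sink writing to the managed "purchase-orders-processed" stream
    DecodableStreamSink<PurchaseOrder> sink =
        DecodableStreamSink.<PurchaseOrder>builder()
            .withStreamName("purchase-orders-processed")
            .withSerializationSchema(new JsonSerializationSchema<>())
            .build();

    env.fromSource(source, WatermarkStrategy.noWatermarks(), "Purchase Orders Source")
        .map(new PurchaseOrderProcessor())
        .sinkTo(sink);

    env.execute("Purchase Order Processing Job");
  }
}
```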

In this example, purchase order data is ingested from the “purchase-orders” stream, which receives data from a CDC connector capturing the database of an ERP system. For ease of use, the data is deserialized into a POJO, PurchaseOrder, using Flink’s JsonDeserializationSchema. Each incoming purchase order is processed using a custom mapping function, PurchaseOrderProcessor; this is where you could, for example, invoke an external service for order validation. Finally, the records are written to an output stream via a DecodableStreamSink, from where another managed Decodable connector could pick up the data and send it to an external system.
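For illustration, the PurchaseOrder POJO and the mapping function might look like this; the upper-casing logic is just a stand-in for real processing such as the validation call mentioned above:

```java
import org.apache.flink.api.common.functions.MapFunction;

// PurchaseOrder.java
public class PurchaseOrder {
  public long orderId;
  public String customerName;
}

// PurchaseOrderProcessor.java
public class PurchaseOrderProcessor implements MapFunction<PurchaseOrder, PurchaseOrder> {

  @Override
  public PurchaseOrder map(PurchaseOrder order) throws Exception {
    // Stand-in for custom logic, e.g., invoking an external order validation service
    order.customerName = order.customerName.toUpperCase();
    return order;
  }
}
```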

To learn more about the Decodable SDK for Custom Pipelines, see the API documentation for the project.

Testing

Good testability is a key requirement in modern application development. To that end, the SDK comes with a complete testing framework, letting you validate the logic of your stream processing job in end-to-end tests. The testing framework uses the Testcontainers project for spinning up the underlying infrastructure. Here is a test for the stream processing job above:
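A sketch of such a test follows; the SDK testing classes shown here (TestEnvironment, PipelineTestContext, StreamRecord) and their methods are illustrative, so check the SDK's API documentation for the exact names:

```java
import static org.assertj.core.api.Assertions.assertThat;

import org.junit.jupiter.api.Test;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;
import org.testcontainers.redpanda.RedpandaContainer;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

import co.decodable.sdk.pipeline.testing.PipelineTestContext;
import co.decodable.sdk.pipeline.testing.StreamRecord;
import co.decodable.sdk.pipeline.testing.TestEnvironment;

@Testcontainers
public class PurchaseOrderProcessingJobTest {

  // Ephemeral Redpanda broker backing the Decodable streams during the test
  @Container
  RedpandaContainer broker =
      new RedpandaContainer("docker.redpanda.com/vectorized/redpanda:v22.3.20");

  @Test
  void shouldProcessPurchaseOrders() throws Exception {
    TestEnvironment testEnvironment =
        TestEnvironment.builder()
            .withBootstrapServers(broker.getBootstrapServers())
            .withStreams("purchase-orders", "purchase-orders-processed")
            .build();

    try (PipelineTestContext ctx = new PipelineTestContext(testEnvironment)) {
      // Insert one purchase order as a JSON document into the input stream
      String order = "{ \"orderId\" : 19001, \"customerName\" : \"Kyle Hale\" }";
      ctx.stream("purchase-orders").add(new StreamRecord<>(order));

      // Run the job under test
      ctx.runJobAsync(PurchaseOrderProcessingJob::main);

      // Retrieve one record from the output stream and assert its contents
      StreamRecord<String> result = ctx.stream("purchase-orders-processed").takeOne();
      ObjectNode processed = (ObjectNode) new ObjectMapper().readTree(result.value());
      assertThat(processed.get("customerName").asText()).isEqualTo("KYLE HALE");
    }
  }
}
```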

Using Testcontainers, an ephemeral Redpanda broker is started as a data streaming platform. The test inserts a purchase order as a JSON document into the “purchase-orders” stream. Next, the job under test is executed. Finally, one record is retrieved from the “purchase-orders-processed” stream and its content is asserted to match the expected structure.

This testing framework allows you to implement comprehensive tests, in your own IDE or CI environment, so that you can verify that your jobs work as expected before deploying them to production.

Deploying Your Custom Pipelines

After implementing and testing your stream processing job, you can deploy it as a pipeline to Decodable. First, make sure the Custom Pipelines feature is enabled for your account in the Decodable UI.

Then, package your job as a JAR file by running mvn clean verify from the root directory of your project. Next, deploy your job to Decodable using the Decodable CLI:
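An invocation along these lines; the pipeline name and JAR path are placeholders, and the exact subcommand and flags may vary by CLI version, so consult the CLI reference:

```bash
mvn clean verify

# illustrative invocation; check `decodable pipeline --help` for the exact flags
decodable pipeline create \
  --name purchase-order-processor \
  --type JAVA \
  --job-file target/purchase-order-processor-1.0-SNAPSHOT.jar
```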

Alternatively, you can upload your job via the web UI.

At this point, your job is deployed but it isn’t running yet. You can activate it either in the CLI like so:
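For example (the pipeline id is returned by the create command above; the exact syntax may vary by CLI version):

```bash
decodable pipeline activate <pipeline-id>
```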

Or, you can activate it in the web UI.

After a few moments, your job will be running, processing purchase orders from the input stream and sending them to the output stream.

Next Steps

Today’s Beta1 release marks the first step towards a fully supported and feature-rich SDK for Decodable Custom Pipelines. This is a perfect time for you to join the Decodable user community, sign up for Custom Pipelines, and give it a try.

Over the next few weeks and months, we plan to build out the SDK, e.g., adding integration with Flink’s Table API, creating a bespoke Custom Pipelines Maven Archetype, providing support for collecting metrics and accessing Decodable secrets in your jobs, and much more.

The SDK is fully open source (under the Apache License, version 2.0), and its source code is managed on GitHub. Your contributions in the form of bug reports, feature requests, and pull requests are highly welcome. We’d also love to hear from you about your use cases for Custom Pipelines and the SDK. Until then, Happy Streaming!

Additional Resources

Support for Running Your Own Apache Flink Jobs Is Coming To Decodable

Decodable announces a major new feature of the platform: support for running your own custom Apache Flink® jobs, written in any JVM-based programming language such as Java or Scala.


Apache Flink is the Industry Standard and It’s Really Obvious

At Decodable, we have long believed that Apache Flink is the best stream processing system, with a proven track record of meeting the demands of some of the largest and most sophisticated businesses in the world, such as Netflix, Uber, Stripe, and many more. We are excited to welcome Confluent to the ranks of companies, like ourselves, who regard Flink as the industry standard for meeting the real-time stream processing needs of businesses, large and small.


Tags
SDK
Java
