April 22, 2024
4 min read

Decodable Spring Updates: Refining Data Movement for Efficiency and Flexibility

By Sharon Xie

As we embrace the spring season, we're excited to share our latest product updates. These new features enhance your experience with Decodable by boosting efficiency, broadening support for the data ecosystem, and offering a cleaner, simpler way to manage how data moves through your stack.

Our mission as a company is to tackle the challenge of data movement in a pragmatic way, delivering data where it needs to be, when it needs to be there, with as little or as much processing as needed. Each of the features below ties directly into this mission.

These are all available in both the paid and free trial versions of Decodable. Try them out for yourself!

Multi-Stream Connectors

Connectors form the backbone of the integration capabilities of Decodable. With the release of Multi-Stream Connectors (MSC), you can now create a connection to an external system with multiple streams. Connecting to numerous schemas and tables in a Postgres database? One connection. Do you have multiple Kafka topics to read or write to? One connection. 

This simplifies the management of connections and reduces your costs. Moreover, MSC-enabled connectors greatly reduce manual configuration by:

  • Automatically retrieving external resource information, such as names and schemas, and mapping it into Decodable resources.
  • Automatically creating external resources on the sink side when necessary, for example creating tables in Snowflake if they don’t already exist.

We are in the process of rolling out MSC support across our entire library of connectors, with several already live.

Declarative Resource Management

Inspired by the exemplary developer experience of Kubernetes, we've introduced a similar declarative approach for users of the Decodable CLI. This will make integration of Decodable into CI/CD processes even easier. Users simply declare Decodable connections, pipelines, streams, and secrets in a YAML file, and execute:

decodable apply [flags] <filename>

This command creates resources from the specified file in an idempotent and atomic manner, simplifying integration with source control tools and development workflows.
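For illustration, a declaration might look something like the sketch below. The field names here are illustrative assumptions rather than the exact Decodable resource schema, so check the CLI documentation for the definitive format.

# Hypothetical resources.yaml -- field names are illustrative, not the exact Decodable schema
kind: stream
metadata:
  name: orders
spec:
  schema:
    - name: order_id
      type: BIGINT
    - name: order_ts
      type: TIMESTAMP(3)

Applying the file is then a single command, which can be re-run safely thanks to the idempotent behavior described above:

decodable apply resources.yaml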

Declarative resource management currently supports creating resources; support for activating them will follow soon.

To get started, download the latest version of the Decodable CLI — and be sure to give us feedback on this feature as we continue to develop it.

New Connectors

Apache Iceberg Sink

We’ve added support for writing to Apache Iceberg from Decodable for real-time analytics needs. Iceberg is a popular open-source table format that makes large-scale data lake management easy, with useful features like SQL support, schema evolution, and time travel. The connector currently writes to a single table per connection, and we are actively working on MSC support for Iceberg.

Amazon Simple Queue Service (SQS) notification support for S3 Source

We have enhanced the existing S3 connector with support for Amazon SQS event notifications. This means the connector no longer requires periodic scans of the entire S3 bucket. Instead, the connector performs a one-time scan for initial snapshotting, followed by the processing of new changes as indicated by SQS notifications. This adjustment benefits users of the connector by significantly reducing both cost and latency.

Oracle & Microsoft SQL Server CDC Source (Coming Soon)

Due for release soon are the Oracle and Microsoft SQL Server CDC source connectors. With these you’ll be able to use Decodable to stream data in a low-latency and low-impact way from your Oracle (with no GoldenGate licensing requirements!) and Microsoft SQL Server databases. Whether driving real-time applications, feeding analytics systems, or offloading data for processing elsewhere, these connectors will provide huge benefits for many of our users.

Support for Flink 1.18 in Custom Pipelines

Announced last year, Decodable’s Custom Pipelines support for “bring your own job” has been hugely popular. Users appreciate the flexibility of being able to run pipelines in either SQL or custom Flink jobs, whichever is more suitable for the task at hand. We’ve recently added support for Flink 1.18 with Java 11 and will be adding Java 17 soon.

Pipelines built with Flink JARs benefit from the same hosted platform and support that SQL pipelines do, along with a powerful user interface for managing them. Check out the docs to get started with Custom Pipelines.

PyFlink Support 

Not only do we offer support for Java workloads in Custom Pipelines, but we are excited to open our early adopters program for users of PyFlink!

The Apache Flink community has many Python enthusiasts, especially in the machine learning and data engineering space. We heard the feedback and we’ve been working hard to provide the flexibility of writing Python code to process your data streams. 
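To give a feel for what this looks like, below is a minimal PyFlink Table API sketch. This is plain upstream PyFlink rather than Decodable-specific packaging, and the table names and connector settings are placeholders for illustration only.

# Minimal PyFlink Table API sketch -- generic upstream PyFlink, not Decodable-specific.
from pyflink.table import EnvironmentSettings, TableEnvironment

# Create a streaming TableEnvironment.
t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# A bounded throwaway source using the built-in datagen connector (placeholder schema).
t_env.execute_sql("""
    CREATE TABLE orders (
        order_id BIGINT,
        amount DOUBLE
    ) WITH (
        'connector' = 'datagen',
        'number-of-rows' = '20'
    )
""")

# A sink that simply prints results to stdout.
t_env.execute_sql("""
    CREATE TABLE large_orders (
        order_id BIGINT,
        amount DOUBLE
    ) WITH ('connector' = 'print')
""")

# A simple filter-and-project pipeline expressed in SQL.
t_env.execute_sql(
    "INSERT INTO large_orders SELECT order_id, amount FROM orders WHERE amount > 0"
).wait()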

If you’re interested in being an early adopter, please drop us a line.

Pipeline Snapshot Management

Pipeline snapshots are very useful for making pipeline modifications and recovering from failures. By using snapshots, you can restore pipeline jobs to reprocess data from a previous point in time while maintaining a consistent state. This capability significantly reduces pipeline downtime by eliminating the need to reprocess all historical data.

Users can now configure pipelines to take periodic snapshots. Additionally, at any time while the job is running, a one-off snapshot can be created. Pipelines can be activated using any previously created (but non-expired) snapshot.

Use the following commands to manage pipeline snapshots:

Usage:
  decodable pipeline snapshot [command]
Available Commands:
  create      create a snapshot for a pipeline
  delete      delete a snapshot for a pipeline
  get         Get a snapshot for a pipeline
  list        List the snapshots for a pipeline

To start a pipeline with a snapshot, use:

decodable pipeline activate <id> --start-from-snapshot-id <snapshot-id>
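Putting these together, a typical workflow might look like the following. The argument form of the snapshot subcommands is shown as an assumption for illustration; run them with --help for the exact flags.

# Take a one-off snapshot of a running pipeline (argument form is illustrative)
decodable pipeline snapshot create <pipeline-id>

# List available snapshots and note the id of the one to restore from
decodable pipeline snapshot list <pipeline-id>

# Reactivate the pipeline from the chosen snapshot
decodable pipeline activate <pipeline-id> --start-from-snapshot-id <snapshot-id>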

Download the latest version of the Decodable CLI to manage snapshots. The ability to create and view snapshots and activate pipelines with selected snapshots will be included in the web UI in an upcoming release.

AWS Marketplace

Decodable is now available on AWS Marketplace, making it much easier for existing AWS users to manage billing and use their AWS credits. 

Spring Forward with Decodable

We invite you to explore these updates firsthand and see the difference they can make in your data strategy. Start today (no credit card needed!) and experience the power of pragmatic data movement.

Join Decodable engineers Gunnar Morling and John MacKinnon on May 8, 9am PT/12pm ET, for a live tech talk and demo as they build a multi-stream data flow from a MySQL database to Snowflake using Multi-Stream Connectors. Register today!

Sharon Xie

Sharon is a founding engineer at Decodable. Currently she leads product management and development. She has over six years of experience in building and operating streaming data platforms, with extensive expertise in Apache Kafka, Apache Flink, and Debezium. Before joining Decodable, she served as the technical lead for the real-time data platform at Splunk, where her focus was on the streaming query language and developer SDKs.
