September 26, 2022
8 min read

Optimizing Data For InfluxDB Using Decodable

By
Decodable Team

According to gestalt psychology, "the whole is more than the sum of its parts." This is certainly true of Decodable and InfluxDB: Decodable's effortless streaming, data cleansing, and enrichment, paired with InfluxDB's powerful time series analytics and data management, make a formidable system. Here are some of the specific ways Decodable works with InfluxDB to improve performance (with links back to the InfluxDB best practice guidance):

Optimize Writes

You can use Decodable to easily capture streaming events and push them to InfluxDB when they reach the optimal batch size of 5,000 lines. You can also modify timestamps to follow the InfluxDB best practice of "use the coarsest time possible," and follow rate-limiting best practices by using Decodable as a buffer to smooth out ingest. Another typical pattern is to use Decodable to lower the resolution of the dataset by calculating results over an interval and sending only that data to InfluxDB (e.g., average the values over the course of 5 seconds to smooth the data, and send the result to InfluxDB). This eliminates unnecessary complexity, speeds up queries, and reduces the cost of storage and computation. Read the InfluxDB best practice.
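The batching and timestamp-coarsening ideas above can be sketched in a few lines of Python. This is a minimal illustration of the logic, not a Decodable API; the field names and the one-second precision are assumptions.

```python
BATCH_SIZE = 5000  # InfluxDB's recommended write batch size, in lines

def coarsen_ts(ts_ns, precision_ns=1_000_000_000):
    """Round a nanosecond timestamp down to a coarser precision
    (here: whole seconds), per "use the coarsest time possible"."""
    return ts_ns - ts_ns % precision_ns

def batch_events(events, batch_size=BATCH_SIZE):
    """Buffer streaming events and emit them in optimally sized batches,
    smoothing out ingest instead of writing point-by-point."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final partial batch
```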

Reduce Cardinality

Excessively high cardinality can cause memory issues and slow InfluxDB down. There are several patterns to mitigate this without losing important nuance in the data. You can use Decodable to reformat events, either grouping them or reordering fields to choose a more suitable field for the key. Read the InfluxDB best practice.
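One common reformatting pattern is to move a high-cardinality attribute out of the indexed tag set and keep only a coarse bucket as the tag. A hypothetical sketch (the `tags`/`fields` shape and `user_id` name are illustrative, not a Decodable or InfluxDB API):

```python
import zlib

def reduce_cardinality(event, bucket_count=100):
    """Move a high-cardinality user_id out of the tag set, where every
    distinct value grows the index, keeping a low-cardinality bucket
    as the tag instead."""
    user_id = event["tags"].pop("user_id")
    bucket = zlib.crc32(user_id.encode()) % bucket_count
    event["tags"]["user_bucket"] = f"bucket-{bucket}"
    event["fields"]["user_id"] = user_id  # still stored, no longer indexed
    return event
```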

Eliminate Duplicate Events 

With Decodable's built-in exactly-once processing guarantee, you can effortlessly prevent duplicate events simply by passing them through Decodable. You can also use Decodable to keep a cache of previous events, checking for matches on specific fields and discarding duplicates. Alternatively, you can use Decodable to add a unique field (or increment the timestamp) on an event that appears to be a duplicate so that InfluxDB handles it correctly. Read the InfluxDB best practice.
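The cache-of-previous-events pattern can be sketched as follows. The key fields and the bounded LRU cache are illustrative choices, not how Decodable implements deduplication internally:

```python
from collections import OrderedDict

class Deduplicator:
    """Discard events whose key fields match a previously seen event."""

    def __init__(self, key_fields, cache_size=10_000):
        self.key_fields = key_fields
        self.seen = OrderedDict()
        self.cache_size = cache_size

    def is_duplicate(self, event):
        key = tuple(event[f] for f in self.key_fields)
        if key in self.seen:
            return True
        self.seen[key] = True
        if len(self.seen) > self.cache_size:
            self.seen.popitem(last=False)  # evict the oldest key
        return False
```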

Improve Schemas  

As the InfluxDB documentation states, "complex measurements make some queries impossible." You frequently can't control upstream formats, but you can take control once the data reaches you. Decodable lets you follow InfluxDB best practices: remove keywords and special characters, convert to numeric types, and ensure keys don't repeat. You can also parse out concatenated attributes to avoid regex queries, and take control of tags. Read the InfluxDB best practice.
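A minimal Python sketch of this kind of schema cleanup, under assumed rules: the keyword list, the `_col` renaming scheme, and the string-to-float coercion are illustrative, not InfluxDB's exact requirements.

```python
import re

RESERVED_KEYWORDS = {"time", "field", "tag"}  # illustrative subset only

def clean_key(key):
    """Replace special characters and sidestep reserved keywords."""
    cleaned = re.sub(r"[^0-9A-Za-z_]", "_", key)
    if cleaned.lower() in RESERVED_KEYWORDS:
        cleaned += "_col"
    return cleaned

def clean_event(event):
    """Sanitize keys and coerce numeric-looking strings to floats."""
    out = {}
    for key, value in event.items():
        if isinstance(value, str):
            try:
                value = float(value)
            except ValueError:
                pass  # genuinely non-numeric: leave it as a string
        out[clean_key(key)] = value
    return out
```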

Join Streaming Tables

Engineers often need to enrich streaming data with an outside database before it lands in InfluxDB. This is called a stream-table join, and it's commonly used to bring more data into InfluxDB and reduce the need for joins and lookups at query time. There are a few reasons for this:

  • In-database joins can hurt query performance
  • Database joins can be expensive operations that require data to be shuffled around at query time. If you pre-join your data in the stream processor, you end up with a single, denormalized, pristine table in InfluxDB
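Conceptually, a stream-table join merges each streaming event with matching attributes from a reference table before the write. A hypothetical sketch with an in-memory lookup table (the `device_id` key and field names are assumptions):

```python
def enrich_stream(events, lookup_table, key="device_id"):
    """Join each streaming event against a reference table, emitting a
    single denormalized record ready to land in InfluxDB."""
    for event in events:
        reference = lookup_table.get(event[key], {})  # left join: empty if no match
        yield {**event, **reference}
```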

Pre-process Data

Decodable can easily pre-process the data to meet some of the most common processing needs so that it arrives ready to use:

  • Flatten nested data into a relational model
  • Filter out unneeded columns to reduce the data that needs to be processed
  • Create new fields derived from existing fields
  • Convert numeric columns to string columns, or vice versa
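The first three of these steps can be sketched together in Python. This is an illustration of the transformations, not Decodable's pipeline syntax; the field names are made up.

```python
def flatten(event, prefix=""):
    """Flatten nested payloads into flat, relational-style columns."""
    out = {}
    for key, value in event.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            out.update(flatten(value, prefix=f"{name}_"))
        else:
            out[name] = value
    return out

def preprocess(event, keep, derived):
    """Drop unneeded columns and add fields derived from existing ones.
    `derived` maps a new field name to a function of the flat event."""
    flat = flatten(event)
    out = {k: v for k, v in flat.items() if k in keep}
    for name, fn in derived.items():
        out[name] = fn(flat)
    return out
```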

In Decodable, window functions can be used to create subsets of your streams with a start time and end time. A pipeline can then aggregate the windowed data and write the aggregate into InfluxDB.
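The windowed-aggregation idea can be illustrated with a tumbling window in plain Python. This sketch assumes events carry an integer `ts` in seconds and a numeric `value`; it is not Decodable's window syntax.

```python
def tumbling_window_avg(events, window_s):
    """Assign each event to a tumbling window by its timestamp, then
    emit one averaged record per window with its start and end time."""
    windows = {}
    for event in events:
        start = event["ts"] - event["ts"] % window_s
        acc = windows.setdefault(start, {"sum": 0.0, "count": 0})
        acc["sum"] += event["value"]
        acc["count"] += 1
    return [
        {"window_start": start, "window_end": start + window_s,
         "avg": acc["sum"] / acc["count"]}
        for start, acc in sorted(windows.items())
    ]
```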

Get Started Today

If you are an InfluxDB user today and you have a streaming platform (Kafka, Kinesis, Pulsar, Redpanda, etc.), you can log into Decodable.co and easily connect and transform your data today. If you need help, set up a free session with one of our experts here and we'll assist in creating your pipeline.


You can get started with Decodable for free: our developer account includes enough for you to build a useful pipeline and, unlike a trial, it never expires.

Learn more:

Join the community Slack
