Solution Overview
Customer 360 refers to getting a single view of customer engagement across the entire customer journey. It connects apps and data sources from customer interactions to give businesses a 360-degree customer view. It includes customer data from a variety of sources, including customer demographics, customer relationship management (CRM), social media, eCommerce, marketing, sales, customer service, mobile apps, and many other customer touchpoints.
Businesses can leverage insights gained from a comprehensive customer view to improve and deliver exceptional experiences, increase customer loyalty, create reliable customer profiles to improve marketing and sales initiatives, streamline and connect business processes and workflows to improve efficiency and functionality, and reduce time and cost caused by human error in the customer journey.
In this example, we'll walk through how the Decodable data service is used to clean, transform, and aggregate data from multiple data sources.
Pipeline Architecture
Customer 360 data comes in many forms from many sources, including call logs, clickstream data, eCommerce activity, geolocation, NPS systems, and social media feeds. For this example, we will look at transforming two different data sources into a consistent schema, which can then be sent to the same sink connection to be used for analysis, regardless of the original source or form of the data.
Below we can see examples of raw call log and clickstream data. Each data source is in a unique data format and uses different field names for similar data. By using one or more Decodable pipelines, which are streaming SQL queries that process data, we can transform the raw data into a form that is best suited for how it will be consumed.
Call Log Records
Clickstream Log Records
For this example, a single pipeline is used to process each of the two raw incoming data streams into the desired form. Depending on the complexity of the processing required, it is also possible to use multiple pipelines in a series of stages, with the output of each one being used as the input for the next. In more complex cases, it can be helpful to break the processing down into smaller, more manageable steps, which results in pipelines that are easier to test and maintain. Each stage in the sequence brings the data closer to its final desired form using SQL queries.
Decodable processes data using SQL, which should feel familiar to anyone who has used relational database systems. The primary differences you'll notice are that:
- You activate a pipeline to start it, and deactivate a pipeline to stop it
- All pipeline queries specify a source and a sink
- Certain operations, notably JOINs and aggregations, must include windows
Unlike relational databases, all pipelines write their results into an output data stream (or sink). As a result, all pipelines are a single statement in the form INSERT INTO <sink> SELECT ... FROM <source>, where sink and source are streams you've defined.
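As a rough illustration of the points above, here is a minimal sketch of two pipelines used in sequence. The stream names (events_raw, events_clean, events_per_minute) and the field names are hypothetical. The windowing syntax follows the Apache Flink SQL style of windowing table-valued functions, which is an assumption here rather than something taken from this example, so check the Decodable documentation for the exact forms it supports.

```sql
-- Pipeline 1 (hypothetical): a single INSERT INTO <sink> SELECT ... FROM <source>
-- statement that reads events_raw and writes cleaned records to events_clean.
INSERT INTO events_clean
SELECT
  customer_id,
  event_time
FROM events_raw
WHERE customer_id IS NOT NULL

-- Pipeline 2 (hypothetical, defined as its own separate pipeline): reads the
-- output of pipeline 1 and aggregates it. The aggregation is bounded by a
-- one-minute tumbling window. This assumes event_time is a time attribute
-- on the events_clean stream.
INSERT INTO events_per_minute
SELECT
  window_start,
  window_end,
  customer_id,
  COUNT(*) AS event_count
FROM TABLE(
  TUMBLE(TABLE events_clean, DESCRIPTOR(event_time), INTERVAL '1' MINUTE)
)
GROUP BY window_start, window_end, customer_id
```

Each statement would be its own pipeline, with the sink of the first acting as the source of the second.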
Transform Call Logs
As with most data services pipelines, the first step is to apply a variety of transformations to clean up and simplify the input data. For this example, an inner select is used to parse the XML object blob using the xpaths function and extract the desired fields. Then the start_time field is converted from a string to a timestamp type and the call_time_seconds field is converted to an integer.
Pipeline: Standardize Data Stream
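The following is only a sketch of what such a query might look like, not the original pipeline. The stream names (call_logs_raw, customer_events), the xml field, the XPath expressions, the customer_id and event_type columns, and the timestamp format are all hypothetical. It also assumes that xpaths takes the XML string and an XPath expression and returns an array of matching strings, indexed from 1.

```sql
-- Hypothetical source: call_logs_raw, with one string field named xml.
-- Hypothetical sink:   customer_events, the standardized stream.
INSERT INTO customer_events
SELECT
  customer_id,
  'call' AS event_type,
  -- Convert start_time from a string to a timestamp (format is assumed).
  TO_TIMESTAMP(start_time, 'yyyy-MM-dd HH:mm:ss') AS event_time,
  -- Convert call_time_seconds from a string to an integer.
  CAST(call_time_seconds AS INT) AS duration_seconds
FROM (
  -- Inner select: extract the desired fields from the XML blob.
  -- The element names in these XPath expressions are placeholders.
  SELECT
    xpaths(xml, '//call/customer_id/text()')[1]       AS customer_id,
    xpaths(xml, '//call/start_time/text()')[1]        AS start_time,
    xpaths(xml, '//call/call_time_seconds/text()')[1] AS call_time_seconds
  FROM call_logs_raw
)
```

Keeping the XML parsing in the inner select means the outer select only deals with plain string fields, which keeps the type conversions easy to read.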
After creating a new pipeline and entering the SQL query, clicking the Run Preview button will verify its syntax and then fire up a new executable environment to process the next 10 records coming in from the source stream and display the results. Decodable handles all the heavy lifting on the backend, allowing you to focus on working directly with your data streams to ensure that you are getting the results you need.
Transform Clickstream
For the website clickstream data, the required transformations for this example are fairly minimal. Primarily, the field names are changed to match the desired schema for a standardized data stream, and the event_timestamp field is converted to a timestamp.
Pipeline: Standardize Data Stream
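As a sketch only, a query of this kind might look like the following. The stream names (clickstream_raw, customer_events), the raw field name user_id, the output column names, and the timestamp format are hypothetical. Only event_timestamp comes from the description above, and it is assumed here to arrive as a string.

```sql
-- Hypothetical source: clickstream_raw.
-- Hypothetical sink:   customer_events, the same standardized stream
--                      used for the other data source.
INSERT INTO customer_events
SELECT
  -- Rename fields to match the standardized schema.
  user_id AS customer_id,
  'click' AS event_type,
  -- Convert event_timestamp from a string to a timestamp (format is assumed).
  TO_TIMESTAMP(event_timestamp, 'yyyy-MM-dd HH:mm:ss') AS event_time,
  -- Clickstream events have no duration, so emit a typed NULL to keep
  -- the output columns consistent across sources.
  CAST(NULL AS INT) AS duration_seconds
FROM clickstream_raw
```

Emitting the same columns from every source is what allows the results to be written to a single sink regardless of where the data originated.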
Conclusion
At this point, a sink connection (one that writes a stream to an external system, such as AWS S3, Kafka, Kinesis, Postgres, Pulsar, or Redpanda) can be created to allow the results to be consumed by your own applications and services.
As we can see from this example, a sophisticated business problem can be addressed in a very straightforward way using Decodable pipelines. It is not necessary to create Docker containers, and there is no SQL server infrastructure to set up or maintain. All that is needed is a working familiarity with creating the SQL queries themselves.
You can watch demonstrations of several examples on the Decodable YouTube channel.
Additional documentation for all of Decodable’s services is available here.
Please consider joining us on our community Slack.