Sporting Events

Solution Overview

Sports mobile apps are enabling professional sports teams and event organizers to capture and enhance their audiences around the world. The internet has made it possible for fans to find team or transfer news, match highlights, and club merchandise on-demand. With 90 percent of mobile internet time spent on apps, having a sports app driven by a real-time data stream has never been so valuable. In the sporting events segment, revenue is projected to reach more than $27 billion in 2022 and the number of users is expected to be over 315 million by 2026.

In this example, we’ll walk through how the Decodable data service is used to clean, transform, and enrich real-time sporting event data. The processed data could then be used to drive a mobile app or to update a website.

Pipeline Architecture

Below we can see a sample of raw sporting event data. In its current form, it is more detailed than what is needed by our hypothetical app. By using one or more Decodable pipelines, which are streaming SQL queries that process data, we can transform the raw data into a form that is best suited for how it will be consumed.

For this example, two separate pipelines are used in series, with the output of each one being used as the input for the next. While it is possible to perform all the desired processing in a single large, complex pipeline, it is most often desirable to split them into smaller, more manageable processing steps. This results in pipelines that are easier to test and maintain. Each stage in the sequence of pipelines is used to bring the data closer to its final desired form using SQL queries.

Decodable uses SQL to process data that should feel familiar to anyone who has used relational database systems. The primary differences you’ll notice are that:

  • You activate a pipeline to start it, and deactivate a pipeline to stop it
  • All pipeline queries specify a source and a sink
  • Certain operations, notably JOINs and aggregations, must include windows

Unlike relational databases, all pipelines write their results into an output data stream (or sink). As a result, all pipelines are a single statement in the form INSERT INTO <sink> SELECT ... FROM <source>, where sink and source are streams you’ve defined.

Unnest Data Stream Array

Each record of the sample raw event stream contains data about player actions during a sporting event. A single events field contains an array of event data that needs to be unnested (or demultiplexed) into multiple records. To accomplish this, a cross join is performed between the events-raw data stream and the results of using the unnest function on the events field.

For example, if a given input record contains an array of 25 player actions, this pipeline will transform each input record into 25 separate output records for processing by subsequent pipelines.

When the pipeline is running, the effects of unnesting the input records can be seen in the Overview tab which shows real-time data flow statistics. The input metrics will show a given number of records per second, while the output metrics will show a higher number based on how many elements are in the events array.

As the records are being unnested, the type of player action is also being transformed into specific fields which will be used to create summary statistics.

Pipeline: Extract Player Action Events

After creating a new pipeline and entering the SQL query, clicking the Run Preview button will verify its syntax and then fire up a new executable environment to process the next 10 records coming in from the source stream and display the results. Decodable handles all the heavy lifting on the backend, allowing you to focus on working directly with your data streams to ensure that you are getting the results you need.

Aggregate And Filter Sporting Events

This final pipeline stage leverages the SQL hop group window function to create a set of records across a fixed 5-minute duration and hops (or slides) by an interval of 1 minute. Because the hop interval is smaller than the window size, the hopping windows are overlapping. Therefore, records from the data stream are assigned to multiple windows. Every minute, a new record is generated that contains a summary of the past 5 minutes. This results in a set of summary statistics that slide forward in time as the sporting event progresses.

Pipeline: Aggregate Player Action Statistics

Conclusion

At this point, a sink connection (one that writes a stream to an external system, such as AWS S3, Kafka, Kinesis, Postgres, Pulsar, or Redpanda) can be created to allow the results to be consumed by your own applications and services.

As we can see from this example, a sophisticated business problem can be addressed in a very straight-forward way using Decodable pipelines. It is not necessary to create docker containers, there is no SQL server infrastructure to set up or maintain, all that is needed is a working familiarity with creating the SQL queries themselves.


You can watch demonstrations of several examples on the Decodable YouTube channel.

Additional documentation for all of Decodable’s services is available here.

Please consider joining us on our community Slack.

Other Solutions

Inventory Control

The ability to track and manage the movement of products through your warehouse is critical to the health and growth of businesses and to satisfy customers, on time.

Learn more

Shipping & Tracking

The ability to see, in real-time, logistics and tracking information improves transportation decisions leading to reduced costs and enhanced services.

Learn more

Health Monitoring

Data from healthcare monitoring devices can inform healthcare staff of any changes in patient condition, alert them to issues with devices, and respond proactively.

Learn more

Fraud Detection

Securing online applications and services is a major requirement for businesses of all types, and threat actors are constantly increasing the sophistication of their attacks.

Learn more

Food Delivery

Real-time data for food delivery is critical to customer satisfaction. Order status updates are constantly updated, sent to customers in apps and SMS.

Learn more

Financial Analysis

Financial analysis is used to evaluate economic trends, set financial policy, build long-term plans, and identify investments or prioritize projects.

Learn more

Customer 360

Customer 360 connects apps and data sources from customer interactions to give businesses a 360-degree view across the end to end customer journey.

Learn more

Claims Adjudication

Errors in claims data can indicate the need for review by a claims examiner, including mismatched coding, omission of required data, and noncompliance.

Learn more

Heading

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse varius enim in eros elementum tristique. Duis cursus, mi quis viverra ornare, eros dolor interdum nulla, ut commodo diam libero vitae erat. Aenean faucibus nibh et justo cursus id rutrum lorem imperdiet. Nunc ut sem vitae risus tristique posuere.

Learn more