
The Interview: Machine Learning With Flink

Robert Metzger

Apache Flink is a robust big data processing framework that handles both stream and batch processing and is the “heir apparent” to Hadoop and Spark. Apache Flink ML is a library that provides machine learning APIs and infrastructure to simplify building ML pipelines. Flink ML supports use cases such as predictive intelligence, customer segmentation, and many more.

Robert Metzger, Software Engineer at Decodable and PMC member of Apache Flink, recently spoke with Dong Lin, a Flink committer and one of the driving forces behind Flink ML, on the eve of the July release of Apache Flink ML 2.1.0. They discussed the status of the project and the plans for its future.

If you’re looking for an introduction to machine learning in general and to what Flink ML brings to it, this video of Robert and Dong’s conversation is a great place to start.

What you’ll learn:

  • What kind of machine learning tasks are suitable for Flink? What features of Flink make it well suited for machine learning?
  • What are the main competitors to Flink as an overall solution, and what are the competitors of Flink ML in the machine learning space?
  • Where in the machine learning space does Flink ML fit?
  • Flink itself handles joins and aggregations quite well with its various APIs. What does Flink ML provide on top of that?
  • What is feature engineering and why does Flink excel in this?
  • Are there any plans for Flink ML to use other language ecosystems?
  • Are there any examples of Flink ML integration with TensorFlow or other popular frameworks?
  • What is the Flink-Extended organization on GitHub, and what are projects like Clink, Deep Learning, and AI Flow all about?
  • What kind of training can you do with Flink ML and what use cases can you actually implement using these algorithms?
  • Can Flink ML be used for model inference?
  • What are the new features of Flink ML 2.1?
  • What are the plans for the next release?
  • What's the long-term vision for Flink ML?
  • If someone is interested in contributing to Flink ML, where can they start?

Watch the interview:


