TFX Cloud Solutions

Looking for insights into how TFX can be applied to build a solution that meets your needs? These in-depth articles and guides may help!

Note

These articles discuss complete solutions in which TFX is a key part, but not the only part. This is nearly always the case for real-world deployments, so implementing these solutions yourself will require more than just TFX. The goal is to give you some insight into how others have implemented solutions whose requirements may be similar to yours, not to serve as a cookbook or a list of approved applications of TFX.

Architecture of a machine learning system for near real-time item matching

Use this document to learn about the architecture of a machine learning (ML) solution that learns and serves item embeddings. Embeddings can help you understand what items your customers consider to be similar, which enables you to offer real-time "similar item" suggestions in your application. This solution shows you how to identify similar songs in a dataset, and then use this information to make song recommendations. Read more
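To make the "similar item" idea concrete, here is a minimal sketch of how similarity lookups over embeddings can work. It is not the linked solution's implementation: the song IDs, the three-dimensional vectors, and the helper names `cosine_similarity` and `most_similar` are all hypothetical, and the exhaustive scan shown here would typically be replaced by an approximate nearest neighbor index at production scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(item_id, embeddings, k=2):
    """Return the k item IDs whose embeddings are closest to item_id's."""
    query = embeddings[item_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != item_id
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other_id for other_id, _ in scored[:k]]


# Hypothetical, hand-written embeddings; a real system would learn these.
song_embeddings = {
    "song_a": [0.9, 0.1, 0.0],
    "song_b": [0.8, 0.2, 0.1],
    "song_c": [0.0, 0.1, 0.9],
    "song_d": [0.1, 0.9, 0.1],
}

recommendations = most_similar("song_a", song_embeddings, k=2)
```

With these made-up vectors, `song_b` ranks first for `song_a` because the two vectors point in nearly the same direction.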

Data preprocessing for machine learning: options and recommendations

This two-part article explores the topic of data engineering and feature engineering for machine learning (ML). This first part discusses best practices of preprocessing data in a machine learning pipeline on Google Cloud. The article focuses on using TensorFlow and the open source TensorFlow Transform (tf.Transform) library to prepare data, train the model, and serve the model for prediction. This part highlights the challenges of preprocessing data for machine learning, and illustrates the options and scenarios for performing data transformation on Google Cloud effectively. Part 1 Part 2
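One preprocessing challenge of this kind is keeping transformations consistent between training and serving. The sketch below illustrates the general analyze-then-transform pattern in plain Python, assuming a single numeric feature scaled to z-scores. It does not use the tf.Transform API; the function names `analyze` and `transform` and the sample values are hypothetical.

```python
import statistics


def analyze(values):
    """Full pass over the training data: compute the statistics once."""
    return {
        "mean": statistics.fmean(values),
        "stddev": statistics.pstdev(values),
    }


def transform(value, stats):
    """Instance-level step: reuse the stored statistics for any input."""
    return (value - stats["mean"]) / stats["stddev"]


# Hypothetical training values for one numeric feature.
training_values = [2.0, 4.0, 6.0, 8.0]
stats = analyze(training_values)

# Training time: scale every example with the computed statistics.
train_scaled = [transform(v, stats) for v in training_values]

# Serving time: scale a new value with the SAME statistics,
# rather than recomputing them from serving traffic.
serving_scaled = transform(5.0, stats)
```

Because the statistics are computed once and stored, the serving path applies exactly the scaling that the model saw during training.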

Architecture for MLOps using TFX, Kubeflow Pipelines, and Cloud Build

This document describes the overall architecture of a machine learning (ML) system using TensorFlow Extended (TFX) libraries. It also discusses how to set up continuous integration (CI), continuous delivery (CD), and continuous training (CT) for the ML system using Cloud Build and Kubeflow Pipelines. Read more

MLOps: Continuous delivery and automation pipelines in machine learning

This document discusses techniques for implementing and automating continuous integration (CI), continuous delivery (CD), and continuous training (CT) for machine learning (ML) systems. Data science and ML are becoming core capabilities for solving complex real-world problems, transforming industries, and delivering value in all domains. Read more

Setting up an MLOps environment on Google Cloud

This reference guide outlines the architecture of a machine learning operations (MLOps) environment on Google Cloud. The guide accompanies hands-on labs in GitHub that walk you through the process of provisioning and configuring the environment described here. Virtually all industries are adopting machine learning (ML) at a rapidly accelerating pace. A key challenge in getting value from ML is creating ways to deploy and operate ML systems effectively. This guide is intended for ML and DevOps engineers. Read more

Key requirements for an MLOps foundation

AI-driven organizations are using data and machine learning to solve their hardest problems and are reaping the rewards.

“Companies that fully absorb AI in their value-producing workflows by 2025 will dominate the 2030 world economy with +120% cash flow growth,” according to McKinsey Global Institute.

But it’s not easy right now. Machine learning (ML) systems have a special capacity for creating technical debt if not managed well. Read more

How to create and deploy a model card in the cloud with Scikit-Learn

Machine learning models are now being used to accomplish many challenging tasks. With their vast potential, ML models also raise questions about their usage, construction, and limitations. Documenting the answers to these questions helps to bring clarity and shared understanding. To help advance these goals, Google has introduced model cards. Read more

Analyzing and validating data at scale for machine learning with TensorFlow Data Validation

This document discusses how to use the TensorFlow Data Validation (TFDV) library for data exploration and descriptive analytics during experimentation. Data scientists and machine learning (ML) engineers can also use TFDV in a production ML system to validate data that's used in a continuous training (CT) pipeline, and to detect skews and outliers in data received for prediction serving. The document includes hands-on labs. Read more
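As a rough illustration of what detecting skew in a categorical feature involves, the sketch below compares the value distribution seen in training with the one seen at serving time by using the L-infinity distance, the same kind of measure TFDV applies to categorical features. It is plain Python rather than the TFDV API, it assumes both inputs are non-empty, and the feature values, the helper names, and the `SKEW_THRESHOLD` of 0.2 are hypothetical.

```python
from collections import Counter


def normalized_counts(values):
    """Map each distinct value to its relative frequency."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def l_infinity_distance(train_values, serving_values):
    """Largest absolute difference in relative frequency for any value."""
    p = normalized_counts(train_values)
    q = normalized_counts(serving_values)
    return max(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in set(p) | set(q))


# Hypothetical values of a "payment_type" feature.
train = ["card"] * 6 + ["cash"] * 4
serving = ["card"] * 3 + ["cash"] * 5 + ["voucher"] * 2

distance = l_infinity_distance(train, serving)

# Hypothetical threshold; a real value would be tuned per feature.
SKEW_THRESHOLD = 0.2
skew_detected = distance > SKEW_THRESHOLD
```

Here the share of `card` drops from 0.6 to 0.3, which is the largest shift of any value and is what pushes the distance above the threshold.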