Extension

tfx.v1.extensions

TFX extensions module.

| MODULE | DESCRIPTION |
|---|---|
| google_cloud_ai_platform | Google Cloud AI Platform module. |
| google_cloud_big_query | Google Cloud BigQuery module. |

Modules

google_cloud_ai_platform

Google Cloud AI Platform module.
| MODULE | DESCRIPTION |
|---|---|
| experimental | Types used in Google Cloud AI Platform under the experimental stage. |

| CLASS | DESCRIPTION |
|---|---|
| BulkInferrer | A Cloud AI component to do batch inference on a remotely hosted model. |
| Pusher | Component for pushing a model to Cloud AI Platform serving. |
| Trainer | Cloud AI Platform Trainer component. |
| Tuner | TFX component for model hyperparameter tuning on AI Platform Training. |

| ATTRIBUTE | DESCRIPTION |
|---|---|
| ENABLE_UCAIP_KEY | Deprecated custom_config key for enabling uCAIP Training; use ENABLE_VERTEX_KEY instead. |
| ENABLE_VERTEX_KEY | custom_config key for enabling Vertex AI in Trainer and Pusher. |
| JOB_ID_KEY | custom_config key for specifying the Trainer job id. |
| LABELS_KEY | custom_config key for specifying labels for AI Platform training jobs. |
| SERVING_ARGS_KEY | custom_config key for passing serving args to AI Platform. |
| TRAINING_ARGS_KEY | custom_config key for passing training args to AI Platform. |
| UCAIP_REGION_KEY | Deprecated custom_config key for the uCAIP region; use VERTEX_REGION_KEY instead. |
| VERTEX_CONTAINER_IMAGE_URI_KEY | custom_config key for the Vertex AI serving container image URI. |
| VERTEX_REGION_KEY | custom_config key for specifying the Vertex AI region. |
Attributes

ENABLE_UCAIP_KEY (module-attribute)

ENABLE_UCAIP_KEY = documented(obj='ai_platform_training_enable_ucaip', doc='Deprecated. Please use ENABLE_VERTEX_KEY instead. Keys to the items in custom_config of Trainer for enabling uCAIP Training.')

ENABLE_VERTEX_KEY (module-attribute)

ENABLE_VERTEX_KEY = documented(obj='ai_platform_enable_vertex', doc='Keys to the items in custom_config of Trainer and Pusher for enabling Vertex AI.')

JOB_ID_KEY (module-attribute)

JOB_ID_KEY = documented(obj='ai_platform_training_job_id', doc='Keys to the items in custom_config of Trainer for specifying job id.')

LABELS_KEY (module-attribute)

LABELS_KEY = documented(obj='ai_platform_training_labels', doc='Keys to the items in custom_config of Trainer for specifying labels for training jobs on the AI Platform only. Not applicable for Vertex AI, where labels are specified in the CustomJob as defined in: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.customJobs#CustomJob.')

SERVING_ARGS_KEY (module-attribute)

SERVING_ARGS_KEY = documented(obj='ai_platform_serving_args', doc='Keys to the items in custom_config of Pusher/BulkInferrer for passing serving args to AI Platform.')

TRAINING_ARGS_KEY (module-attribute)

TRAINING_ARGS_KEY = documented(obj='ai_platform_training_args', doc='Keys to the items in custom_config of Trainer for passing training_job to AI Platform, and the GCP project under which the training job will be executed. In Vertex AI, this corresponds to a CustomJob as defined in: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.customJobs#CustomJob. In CAIP, this corresponds to TrainingInput as defined in: https://cloud.google.com/ml-engine/reference/rest/v1/projects.jobs#TrainingInput.')

UCAIP_REGION_KEY (module-attribute)

UCAIP_REGION_KEY = documented(obj='ai_platform_training_ucaip_region', doc='Deprecated. Please use VERTEX_REGION_KEY instead. Keys to the items in custom_config of Trainer for specifying the region of uCAIP.')

VERTEX_CONTAINER_IMAGE_URI_KEY (module-attribute)

VERTEX_CONTAINER_IMAGE_URI_KEY = documented(obj='ai_platform_vertex_container_image_uri', doc='Keys to the items in custom_config of Pusher/BulkInferrer for the serving container image URI in Vertex AI.')

VERTEX_REGION_KEY (module-attribute)

VERTEX_REGION_KEY = documented(obj='ai_platform_vertex_region', doc='Keys to the items in custom_config of Trainer and Pusher for specifying the region of Vertex AI.')
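These constants are plain string keys. A minimal sketch of how they are typically placed in a component's custom_config when targeting Vertex AI is shown below; the project id, region, machine type, and image tag are illustrative placeholders, not documented defaults.

```python
from tfx import v1 as tfx

# Illustrative placeholders -- substitute your own project, region, and image.
_PROJECT = 'my-gcp-project'
_REGION = 'us-central1'

# Fields follow the Vertex AI CustomJob message referenced by TRAINING_ARGS_KEY.
_vertex_job_spec = {
    'project': _PROJECT,
    'worker_pool_specs': [{
        'machine_spec': {'machine_type': 'n1-standard-4'},
        'replica_count': 1,
        'container_spec': {'image_uri': 'gcr.io/tfx-oss-public/tfx:1.14.0'},  # placeholder tag
    }],
}

# custom_config for a Trainer that should run on Vertex AI.
trainer_custom_config = {
    tfx.extensions.google_cloud_ai_platform.ENABLE_VERTEX_KEY: True,
    tfx.extensions.google_cloud_ai_platform.VERTEX_REGION_KEY: _REGION,
    tfx.extensions.google_cloud_ai_platform.TRAINING_ARGS_KEY: _vertex_job_spec,
}
```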
Classes

BulkInferrer

BulkInferrer(examples: Channel, model: Optional[Channel] = None, model_blessing: Optional[Channel] = None, data_spec: Optional[Union[DataSpec, RuntimeParameter]] = None, output_example_spec: Optional[Union[OutputExampleSpec, RuntimeParameter]] = None, custom_config: Optional[Dict[str, Any]] = None)

Bases: BaseComponent

A Cloud AI component to do batch inference on a remotely hosted model.

The BulkInferrer component pushes a model to Google Cloud AI Platform, consumes examples data, sends requests to the remotely hosted model, and writes the inference results to an external location as PredictionLog protos. After inference, it deletes the model from Google Cloud AI Platform.

TODO(b/155325467): Create an end-to-end test for this component.

Component outputs contains:

- inference_result: Channel of type standard_artifacts.InferenceResult to store the inference results.
- output_examples: Channel of type standard_artifacts.Examples to store the output examples.

Construct a BulkInferrer component.
| PARAMETER | DESCRIPTION |
|---|---|
| examples | A Channel of type standard_artifacts.Examples. |
| model | A Channel of type standard_artifacts.Model. |
| model_blessing | A Channel of type standard_artifacts.ModelBlessing. |
| data_spec | A bulk_inferrer_pb2.DataSpec instance that describes data selection. |
| output_example_spec | A bulk_inferrer_pb2.OutputExampleSpec instance; specify it if you want BulkInferrer to output examples instead of inference results. |
| custom_config | A dict which contains the deployment job parameters to be passed to Google Cloud AI Platform. custom_config.ai_platform_serving_args must contain the serving job parameters. For the full set of parameters, refer to https://cloud.google.com/ml-engine/reference/rest/v1/projects.models. |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | Must not specify inference_result or output_examples; which output is produced depends on whether output_example_spec is set. |

| ATTRIBUTE | DESCRIPTION |
|---|---|
| EXECUTOR_SPEC | |
| SPEC_CLASS | |
Source code in tfx/extensions/google_cloud_ai_platform/bulk_inferrer/component.py
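A minimal construction sketch, assuming example_gen, trainer, and evaluator are upstream components defined elsewhere in the same pipeline; the serving-args field names and values are illustrative and should be checked against the projects.models reference linked above.

```python
from tfx import v1 as tfx

# `example_gen`, `trainer`, and `evaluator` are assumed upstream components.
bulk_inferrer = tfx.extensions.google_cloud_ai_platform.BulkInferrer(
    examples=example_gen.outputs['examples'],
    model=trainer.outputs['model'],
    model_blessing=evaluator.outputs['blessing'],
    custom_config={
        tfx.extensions.google_cloud_ai_platform.SERVING_ARGS_KEY: {
            'model_name': 'my_model',        # illustrative
            'project_id': 'my-gcp-project',  # illustrative
            'regions': ['us-central1'],      # illustrative
        },
    },
)
```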
Pusher

Pusher(model: Optional[Channel] = None, model_blessing: Optional[Channel] = None, infra_blessing: Optional[Channel] = None, custom_config: Optional[Dict[str, Any]] = None)

Bases: Pusher

Component for pushing a model to Cloud AI Platform serving.

Construct a Pusher component.

| PARAMETER | DESCRIPTION |
|---|---|
| model | An optional Channel of type standard_artifacts.Model. |
| model_blessing | An optional Channel of type standard_artifacts.ModelBlessing. |
| infra_blessing | An optional Channel of type standard_artifacts.InfraBlessing. |
| custom_config | A dict which contains the deployment job parameters to be passed to Cloud platforms. |
| METHOD | DESCRIPTION |
|---|---|
| add_downstream_node | Experimental: Add another component that must run after this one. |
| add_downstream_nodes | Experimental: Add components that must run after this one. |
| add_upstream_node | Experimental: Add another component that must run before this one. |
| add_upstream_nodes | Experimental: Add components that must run before this one. |
| from_json_dict | Convert from dictionary data to an object. |
| get_class_type | |
| remove_downstream_node | |
| remove_upstream_node | |
| to_json_dict | Convert from an object to a JSON serializable dictionary. |
| with_id | |
| with_node_execution_options | |
| with_platform_config | Attaches a proto-form platform config to a component. |
| ATTRIBUTE | DESCRIPTION |
|---|---|
| DRIVER_CLASS | |
| EXECUTOR_SPEC | |
| POST_EXECUTABLE_SPEC | |
| PRE_EXECUTABLE_SPEC | |
| SPEC_CLASS | |
| component_id | |
| component_type | |
| downstream_nodes | |
| driver_class | |
| exec_properties | |
| executor_spec | |
| id | Node id, unique across all TFX nodes in a pipeline. |
| inputs | |
| node_execution_options | |
| outputs | Component's output channel dict. |
| platform_config | |
| spec | |
| type | |
| type_annotation | |
| upstream_nodes | |
Source code in tfx/extensions/google_cloud_ai_platform/pusher/component.py
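A usage sketch for pushing to a Vertex AI endpoint; trainer and evaluator are assumed upstream components, and the serving-spec field names, region, and container image are illustrative placeholders rather than documented defaults.

```python
from tfx import v1 as tfx

# Illustrative Vertex AI deployment settings passed through SERVING_ARGS_KEY.
_vertex_serving_spec = {
    'project_id': 'my-gcp-project',  # placeholder
    'endpoint_name': 'my-endpoint',  # placeholder
    'machine_type': 'n1-standard-2',
}

pusher = tfx.extensions.google_cloud_ai_platform.Pusher(
    model=trainer.outputs['model'],               # assumed upstream component
    model_blessing=evaluator.outputs['blessing'],  # assumed upstream component
    custom_config={
        tfx.extensions.google_cloud_ai_platform.ENABLE_VERTEX_KEY: True,
        tfx.extensions.google_cloud_ai_platform.VERTEX_REGION_KEY: 'us-central1',
        tfx.extensions.google_cloud_ai_platform.VERTEX_CONTAINER_IMAGE_URI_KEY:
            'us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-13:latest',  # placeholder
        tfx.extensions.google_cloud_ai_platform.SERVING_ARGS_KEY: _vertex_serving_spec,
    },
)
```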
id (property, writable)

id: str

Node id, unique across all TFX nodes in a pipeline.

If id is set by the user, it is returned directly; otherwise, a default id is generated for the node.

| RETURNS | DESCRIPTION |
|---|---|
| str | node id. |

add_downstream_node

Experimental: Add another component that must run after this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_upstream_node.

| PARAMETER | DESCRIPTION |
|---|---|
| downstream_node | a component that must run after this node. |

Source code in tfx/dsl/components/base/base_node.py

add_downstream_nodes

Experimental: Add components that must run after this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_upstream_nodes.

| PARAMETER | DESCRIPTION |
|---|---|
| downstream_nodes | a list of components that must run after this node. |

Source code in tfx/dsl/components/base/base_node.py

add_upstream_node

Experimental: Add another component that must run before this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_downstream_node.

| PARAMETER | DESCRIPTION |
|---|---|
| upstream_node | a component that must run before this node. |

Source code in tfx/dsl/components/base/base_node.py

add_upstream_nodes

Experimental: Add components that must run before this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

| PARAMETER | DESCRIPTION |
|---|---|
| upstream_nodes | a list of components that must run before this node. |

Source code in tfx/dsl/components/base/base_node.py

from_json_dict (classmethod)

Convert from dictionary data to an object.

get_class_type (classmethod)

get_class_type() -> str

Source code in tfx/dsl/components/base/base_node.py

remove_downstream_node

remove_upstream_node

to_json_dict

Convert from an object to a JSON serializable dictionary.

Source code in tfx/dsl/components/base/base_node.py

with_node_execution_options

with_platform_config

Attaches a proto-form platform config to a component.

The config will be a per-node platform-specific config.

| PARAMETER | DESCRIPTION |
|---|---|
| config | platform config to attach to the component. |

| RETURNS | DESCRIPTION |
|---|---|
| Self | the same component itself. |

Source code in tfx/dsl/components/base/base_component.py
Trainer

Trainer(examples: Optional[Channel] = None, transformed_examples: Optional[Channel] = None, transform_graph: Optional[Channel] = None, schema: Optional[Channel] = None, base_model: Optional[Channel] = None, hyperparameters: Optional[Channel] = None, module_file: Optional[Union[str, RuntimeParameter]] = None, run_fn: Optional[Union[str, RuntimeParameter]] = None, train_args: Optional[Union[TrainArgs, RuntimeParameter]] = None, eval_args: Optional[Union[EvalArgs, RuntimeParameter]] = None, custom_config: Optional[Dict[str, Any]] = None)

Bases: Trainer

Cloud AI Platform Trainer component.

Construct a Trainer component.

| PARAMETER | DESCRIPTION |
|---|---|
| examples | A Channel of type standard_artifacts.Examples. |
| transformed_examples | Deprecated field. Please set examples instead. |
| transform_graph | An optional Channel of type standard_artifacts.TransformGraph. |
| schema | An optional Channel of type standard_artifacts.Schema. |
| base_model | A Channel of type standard_artifacts.Model. |
| hyperparameters | A Channel of type standard_artifacts.HyperParameters. |
| module_file | A path to the python module file containing the UDF model definition. The module_file must implement a function named run_fn at its top level. |
| run_fn | A python path to the UDF model definition function for the generic trainer. See 'module_file' for details. Exactly one of 'module_file' or 'run_fn' must be supplied if Trainer uses GenericExecutor (default). |
| train_args | A proto.TrainArgs instance, containing args used for training. Currently only splits and num_steps are available. Default behavior (when splits is empty) is to train on the 'train' split. |
| eval_args | A proto.EvalArgs instance, containing args used for evaluation. Currently only splits and num_steps are available. Default behavior (when splits is empty) is to evaluate on the 'eval' split. |
| custom_config | A dict which contains additional training job parameters that will be passed into the user module. |
| METHOD | DESCRIPTION |
|---|---|
| add_downstream_node | Experimental: Add another component that must run after this one. |
| add_downstream_nodes | Experimental: Add components that must run after this one. |
| add_upstream_node | Experimental: Add another component that must run before this one. |
| add_upstream_nodes | Experimental: Add components that must run before this one. |
| from_json_dict | Convert from dictionary data to an object. |
| get_class_type | |
| remove_downstream_node | |
| remove_upstream_node | |
| to_json_dict | Convert from an object to a JSON serializable dictionary. |
| with_id | |
| with_node_execution_options | |
| with_platform_config | Attaches a proto-form platform config to a component. |
| ATTRIBUTE | DESCRIPTION |
|---|---|
| DRIVER_CLASS | |
| EXECUTOR_SPEC | |
| POST_EXECUTABLE_SPEC | |
| PRE_EXECUTABLE_SPEC | |
| SPEC_CLASS | |
| component_id | |
| component_type | |
| downstream_nodes | |
| driver_class | |
| exec_properties | |
| executor_spec | |
| id | Node id, unique across all TFX nodes in a pipeline. |
| inputs | |
| node_execution_options | |
| outputs | Component's output channel dict. |
| platform_config | |
| spec | |
| type | |
| type_annotation | |
| upstream_nodes | |
Source code in tfx/extensions/google_cloud_ai_platform/trainer/component.py
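A usage sketch for running training on Vertex AI; example_gen is an assumed upstream component, the module file path is a placeholder, and the worker pool spec mirrors the CustomJob fields referenced by TRAINING_ARGS_KEY.

```python
from tfx import v1 as tfx

trainer = tfx.extensions.google_cloud_ai_platform.Trainer(
    module_file='gs://my-bucket/trainer_module.py',  # placeholder; must define run_fn
    examples=example_gen.outputs['examples'],        # assumed upstream component
    train_args=tfx.proto.TrainArgs(num_steps=100),
    eval_args=tfx.proto.EvalArgs(num_steps=5),
    custom_config={
        tfx.extensions.google_cloud_ai_platform.ENABLE_VERTEX_KEY: True,
        tfx.extensions.google_cloud_ai_platform.VERTEX_REGION_KEY: 'us-central1',
        tfx.extensions.google_cloud_ai_platform.TRAINING_ARGS_KEY: {
            'project': 'my-gcp-project',  # placeholder
            'worker_pool_specs': [{
                'machine_spec': {'machine_type': 'n1-standard-4'},
                'replica_count': 1,
                'container_spec': {'image_uri': 'gcr.io/tfx-oss-public/tfx:1.14.0'},  # placeholder tag
            }],
        },
    },
)
```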
EXECUTOR_SPEC (class-attribute, instance-attribute)

id (property, writable)

id: str

Node id, unique across all TFX nodes in a pipeline.

If id is set by the user, it is returned directly; otherwise, a default id is generated for the node.

| RETURNS | DESCRIPTION |
|---|---|
| str | node id. |

add_downstream_node

Experimental: Add another component that must run after this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_upstream_node.

| PARAMETER | DESCRIPTION |
|---|---|
| downstream_node | a component that must run after this node. |

Source code in tfx/dsl/components/base/base_node.py

add_downstream_nodes

Experimental: Add components that must run after this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_upstream_nodes.

| PARAMETER | DESCRIPTION |
|---|---|
| downstream_nodes | a list of components that must run after this node. |

Source code in tfx/dsl/components/base/base_node.py

add_upstream_node

Experimental: Add another component that must run before this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_downstream_node.

| PARAMETER | DESCRIPTION |
|---|---|
| upstream_node | a component that must run before this node. |

Source code in tfx/dsl/components/base/base_node.py

add_upstream_nodes

Experimental: Add components that must run before this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

| PARAMETER | DESCRIPTION |
|---|---|
| upstream_nodes | a list of components that must run before this node. |

Source code in tfx/dsl/components/base/base_node.py

from_json_dict (classmethod)

Convert from dictionary data to an object.

get_class_type (classmethod)

get_class_type() -> str

Source code in tfx/dsl/components/base/base_node.py

remove_downstream_node

remove_upstream_node

to_json_dict

Convert from an object to a JSON serializable dictionary.

Source code in tfx/dsl/components/base/base_node.py

with_node_execution_options

with_platform_config

Attaches a proto-form platform config to a component.

The config will be a per-node platform-specific config.

| PARAMETER | DESCRIPTION |
|---|---|
| config | platform config to attach to the component. |

| RETURNS | DESCRIPTION |
|---|---|
| Self | the same component itself. |

Source code in tfx/dsl/components/base/base_component.py
Tuner

Tuner(examples: BaseChannel, schema: Optional[BaseChannel] = None, transform_graph: Optional[BaseChannel] = None, base_model: Optional[BaseChannel] = None, module_file: Optional[str] = None, tuner_fn: Optional[str] = None, train_args: Optional[TrainArgs] = None, eval_args: Optional[EvalArgs] = None, tune_args: Optional[TuneArgs] = None, custom_config: Optional[Dict[str, Any]] = None)

Bases: Tuner

TFX component for model hyperparameter tuning on AI Platform Training.

Construct a Tuner component.

| PARAMETER | DESCRIPTION |
|---|---|
| examples | A BaseChannel of type standard_artifacts.Examples. |
| schema | An optional BaseChannel of type standard_artifacts.Schema. |
| transform_graph | An optional BaseChannel of type standard_artifacts.TransformGraph. |
| base_model | A BaseChannel of type standard_artifacts.Model. |
| module_file | A path to the python module file containing the UDF tuner definition. The module_file must implement a function named tuner_fn at its top level. |
| tuner_fn | A python path to the UDF model definition function. See 'module_file' for the required signature of the UDF. Exactly one of 'module_file' or 'tuner_fn' must be supplied. |
| train_args | A trainer_pb2.TrainArgs instance, containing args used for training. Currently only splits and num_steps are available. Default behavior (when splits is empty) is to train on the 'train' split. |
| eval_args | A trainer_pb2.EvalArgs instance, containing args used for eval. Currently only splits and num_steps are available. Default behavior (when splits is empty) is to evaluate on the 'eval' split. |
| tune_args | A tuner_pb2.TuneArgs instance, containing args used for tuning. Currently only num_parallel_trials is available. |
| custom_config | A dict which contains additional training job parameters that will be passed into the user module. |
| METHOD | DESCRIPTION |
|---|---|
| add_downstream_node | Experimental: Add another component that must run after this one. |
| add_downstream_nodes | Experimental: Add components that must run after this one. |
| add_upstream_node | Experimental: Add another component that must run before this one. |
| add_upstream_nodes | Experimental: Add components that must run before this one. |
| from_json_dict | Convert from dictionary data to an object. |
| get_class_type | |
| remove_downstream_node | |
| remove_upstream_node | |
| to_json_dict | Convert from an object to a JSON serializable dictionary. |
| with_id | |
| with_node_execution_options | |
| with_platform_config | Attaches a proto-form platform config to a component. |
| ATTRIBUTE | DESCRIPTION |
|---|---|
| DRIVER_CLASS | |
| EXECUTOR_SPEC | |
| POST_EXECUTABLE_SPEC | |
| PRE_EXECUTABLE_SPEC | |
| SPEC_CLASS | |
| component_id | |
| component_type | |
| downstream_nodes | |
| driver_class | |
| exec_properties | |
| executor_spec | |
| id | Node id, unique across all TFX nodes in a pipeline. |
| inputs | |
| node_execution_options | |
| outputs | Component's output channel dict. |
| platform_config | |
| spec | |
| type | |
| type_annotation | |
| upstream_nodes | |
Source code in tfx/components/tuner/component.py
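A usage sketch, assuming transform is an upstream Transform component and using the experimental keys documented under the experimental module below; the tuning-args fields and the GCS path are illustrative placeholders.

```python
from tfx import v1 as tfx

tuner = tfx.extensions.google_cloud_ai_platform.Tuner(
    module_file='gs://my-bucket/tuner_module.py',          # placeholder; must define tuner_fn
    examples=transform.outputs['transformed_examples'],    # assumed upstream component
    transform_graph=transform.outputs['transform_graph'],
    train_args=tfx.proto.TrainArgs(num_steps=20),
    eval_args=tfx.proto.EvalArgs(num_steps=5),
    tune_args=tfx.proto.TuneArgs(num_parallel_trials=3),
    custom_config={
        tfx.extensions.google_cloud_ai_platform.experimental.TUNING_ARGS_KEY: {
            'project': 'my-gcp-project',  # illustrative
            'region': 'us-central1',      # illustrative
        },
        tfx.extensions.google_cloud_ai_platform.experimental.REMOTE_TRIALS_WORKING_DIR_KEY:
            'gs://my-bucket/tuner-trials',  # illustrative GCS path
    },
)
```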
id (property, writable)

id: str

Node id, unique across all TFX nodes in a pipeline.

If id is set by the user, it is returned directly; otherwise, a default id is generated for the node.

| RETURNS | DESCRIPTION |
|---|---|
| str | node id. |

add_downstream_node

Experimental: Add another component that must run after this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_upstream_node.

| PARAMETER | DESCRIPTION |
|---|---|
| downstream_node | a component that must run after this node. |

Source code in tfx/dsl/components/base/base_node.py

add_downstream_nodes

Experimental: Add components that must run after this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_upstream_nodes.

| PARAMETER | DESCRIPTION |
|---|---|
| downstream_nodes | a list of components that must run after this node. |

Source code in tfx/dsl/components/base/base_node.py

add_upstream_node

Experimental: Add another component that must run before this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_downstream_node.

| PARAMETER | DESCRIPTION |
|---|---|
| upstream_node | a component that must run before this node. |

Source code in tfx/dsl/components/base/base_node.py

add_upstream_nodes

Experimental: Add components that must run before this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

| PARAMETER | DESCRIPTION |
|---|---|
| upstream_nodes | a list of components that must run before this node. |

Source code in tfx/dsl/components/base/base_node.py

from_json_dict (classmethod)

Convert from dictionary data to an object.

get_class_type (classmethod)

get_class_type() -> str

Source code in tfx/dsl/components/base/base_node.py

remove_downstream_node

remove_upstream_node

to_json_dict

Convert from an object to a JSON serializable dictionary.

Source code in tfx/dsl/components/base/base_node.py

with_node_execution_options

with_platform_config

Attaches a proto-form platform config to a component.

The config will be a per-node platform-specific config.

| PARAMETER | DESCRIPTION |
|---|---|
| config | platform config to attach to the component. |

| RETURNS | DESCRIPTION |
|---|---|
| Self | the same component itself. |

Source code in tfx/dsl/components/base/base_component.py
Modules

experimental

Types used in Google Cloud AI Platform under the experimental stage.

| ATTRIBUTE | DESCRIPTION |
|---|---|
| BULK_INFERRER_SERVING_ARGS_KEY | custom_config key for passing bulk inferrer args to AI Platform. |
| ENDPOINT_ARGS_KEY | custom_config key for an optional endpoint override (CAIP). |
| PUSHER_SERVING_ARGS_KEY | custom_config key for passing serving args to AI Platform. |
| REMOTE_TRIALS_WORKING_DIR_KEY | custom_config key for the working dir of remote Tuner trials. |
| TUNING_ARGS_KEY | custom_config key for passing tuning args to AI Platform. |

BULK_INFERRER_SERVING_ARGS_KEY (module-attribute)

BULK_INFERRER_SERVING_ARGS_KEY = documented(obj='ai_platform_serving_args', doc='Keys to the items in custom_config of BulkInferrer for passing bulk inferrer args to AI Platform.')

ENDPOINT_ARGS_KEY (module-attribute)

ENDPOINT_ARGS_KEY = documented(obj='endpoint', doc='Keys to the items in custom_config of Pusher/BulkInferrer for optional endpoint override (CAIP).')

PUSHER_SERVING_ARGS_KEY (module-attribute)

PUSHER_SERVING_ARGS_KEY = documented(obj='ai_platform_serving_args', doc='Keys to the items in custom_config of Pusher/BulkInferrer for passing serving args to AI Platform.')

REMOTE_TRIALS_WORKING_DIR_KEY (module-attribute)

REMOTE_TRIALS_WORKING_DIR_KEY = documented(obj='remote_trials_working_dir', doc='Keys to the items in custom_config of Tuner for specifying a working dir for remote trials.')

TUNING_ARGS_KEY (module-attribute)

TUNING_ARGS_KEY = documented(obj='ai_platform_tuning_args', doc='Keys to the items in custom_config of Tuner for passing training_job to AI Platform, and the GCP project under which the training job will be executed. In Vertex AI, this corresponds to a CustomJob as defined in: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.customJobs#CustomJob. In CAIP, this corresponds to TrainingInput as defined in: https://cloud.google.com/ml-engine/reference/rest/v1/projects.jobs#TrainingInput.')
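A sketch of how the experimental keys can be combined on a Pusher targeting Cloud AI Platform (CAIP); trainer is an assumed upstream component, and the serving-args fields and regional endpoint value are illustrative.

```python
from tfx import v1 as tfx

gcp_ai_platform = tfx.extensions.google_cloud_ai_platform

pusher = gcp_ai_platform.Pusher(
    model=trainer.outputs['model'],  # assumed upstream component
    custom_config={
        gcp_ai_platform.experimental.PUSHER_SERVING_ARGS_KEY: {
            'model_name': 'my_model',        # illustrative
            'project_id': 'my-gcp-project',  # illustrative
            'regions': ['us-central1'],
        },
        # Optional regional endpoint override for CAIP; value is illustrative.
        gcp_ai_platform.experimental.ENDPOINT_ARGS_KEY:
            'https://us-central1-ml.googleapis.com',
    },
)
```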
google_cloud_big_query

Google Cloud BigQuery module.

| CLASS | DESCRIPTION |
|---|---|
| BigQueryExampleGen | Cloud BigQueryExampleGen component. |
| Pusher | Cloud BigQuery Pusher component. |

| ATTRIBUTE | DESCRIPTION |
|---|---|
| PUSHER_SERVING_ARGS_KEY | custom_config key for passing serving args to BigQuery. |

Attributes

PUSHER_SERVING_ARGS_KEY (module-attribute)

PUSHER_SERVING_ARGS_KEY = documented(obj='bigquery_serving_args', doc='Keys to the items in custom_config of Pusher for passing serving args to BigQuery.')
Classes

BigQueryExampleGen

BigQueryExampleGen(query: Optional[str] = None, input_config: Optional[Union[Input, RuntimeParameter]] = None, output_config: Optional[Union[Output, RuntimeParameter]] = None, range_config: Optional[Union[RangeConfig, RuntimeParameter, Placeholder]] = None, custom_executor_spec: Optional[ExecutorSpec] = None, custom_config: Optional[Union[CustomConfig, RuntimeParameter]] = None)

Bases: QueryBasedExampleGen

Cloud BigQueryExampleGen component.

The BigQuery ExampleGen component takes a query and generates train and eval examples for downstream components.

Component outputs contains:

- examples: Channel of type standard_artifacts.Examples for output train and eval examples.

Constructs a BigQueryExampleGen component.

| PARAMETER | DESCRIPTION |
|---|---|
| query | A BigQuery SQL string; the query result will be treated as a single split. Can be overwritten by input_config. |
| input_config | An example_gen_pb2.Input instance with Split.pattern as a BigQuery SQL string. If set, it overwrites the 'query' arg and allows different queries per split. If any field is provided as a RuntimeParameter, input_config should be constructed as a dict with the same field names as the Input proto message. |
| output_config | An example_gen_pb2.Output instance, providing the output configuration. If unset, default splits will be 'train' and 'eval' with size 2:1. If any field is provided as a RuntimeParameter, output_config should be constructed as a dict with the same field names as the Output proto message. |
| range_config | An optional range_config_pb2.RangeConfig instance, specifying the range of span values to consider. |
| custom_executor_spec | Optional custom executor spec overriding the default executor spec specified in the component attribute. |
| custom_config | An example_gen_pb2.CustomConfig instance, providing custom configuration for ExampleGen. |

| RAISES | DESCRIPTION |
|---|---|
| RuntimeError | Only one of query and input_config should be set. |
| METHOD | DESCRIPTION |
|---|---|
| add_downstream_node | Experimental: Add another component that must run after this one. |
| add_downstream_nodes | Experimental: Add components that must run after this one. |
| add_upstream_node | Experimental: Add another component that must run before this one. |
| add_upstream_nodes | Experimental: Add components that must run before this one. |
| from_json_dict | Convert from dictionary data to an object. |
| get_class_type | |
| remove_downstream_node | |
| remove_upstream_node | |
| to_json_dict | Convert from an object to a JSON serializable dictionary. |
| with_beam_pipeline_args | Add per component Beam pipeline args. |
| with_id | |
| with_node_execution_options | |
| with_platform_config | Attaches a proto-form platform config to a component. |
| ATTRIBUTE | DESCRIPTION |
|---|---|
| DRIVER_CLASS | |
| EXECUTOR_SPEC | |
| POST_EXECUTABLE_SPEC | |
| PRE_EXECUTABLE_SPEC | |
| SPEC_CLASS | |
| component_id | |
| component_type | |
| downstream_nodes | |
| driver_class | |
| exec_properties | |
| executor_spec | |
| id | Node id, unique across all TFX nodes in a pipeline. |
| inputs | |
| node_execution_options | |
| outputs | Component's output channel dict. |
| platform_config | |
| spec | |
| type | |
| type_annotation | |
| upstream_nodes | |
Source code in tfx/extensions/google_cloud_big_query/example_gen/component.py
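A usage sketch; the query, project, and temp location are placeholders, and the per-component Beam options shown with with_beam_pipeline_args are only one way to supply the GCP settings the BigQuery read needs.

```python
from tfx import v1 as tfx

example_gen = tfx.extensions.google_cloud_big_query.BigQueryExampleGen(
    query='SELECT * FROM `my-gcp-project.my_dataset.my_table`'  # placeholder query
).with_beam_pipeline_args([
    '--project=my-gcp-project',            # placeholder GCP project
    '--temp_location=gs://my-bucket/tmp',  # placeholder staging location
])
```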
id (property, writable)

id: str

Node id, unique across all TFX nodes in a pipeline.

If id is set by the user, it is returned directly; otherwise, a default id is generated for the node.

| RETURNS | DESCRIPTION |
|---|---|
| str | node id. |

add_downstream_node

Experimental: Add another component that must run after this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_upstream_node.

| PARAMETER | DESCRIPTION |
|---|---|
| downstream_node | a component that must run after this node. |

Source code in tfx/dsl/components/base/base_node.py

add_downstream_nodes

Experimental: Add components that must run after this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_upstream_nodes.

| PARAMETER | DESCRIPTION |
|---|---|
| downstream_nodes | a list of components that must run after this node. |

Source code in tfx/dsl/components/base/base_node.py

add_upstream_node

Experimental: Add another component that must run before this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_downstream_node.

| PARAMETER | DESCRIPTION |
|---|---|
| upstream_node | a component that must run before this node. |

Source code in tfx/dsl/components/base/base_node.py

add_upstream_nodes

Experimental: Add components that must run before this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

| PARAMETER | DESCRIPTION |
|---|---|
| upstream_nodes | a list of components that must run before this node. |

Source code in tfx/dsl/components/base/base_node.py

from_json_dict (classmethod)

Convert from dictionary data to an object.

get_class_type (classmethod)

get_class_type() -> str

Source code in tfx/dsl/components/base/base_node.py

remove_downstream_node

remove_upstream_node

to_json_dict

Convert from an object to a JSON serializable dictionary.

Source code in tfx/dsl/components/base/base_node.py

with_beam_pipeline_args

with_beam_pipeline_args(beam_pipeline_args: Iterable[Union[str, Placeholder]]) -> BaseBeamComponent

Add per component Beam pipeline args.

| PARAMETER | DESCRIPTION |
|---|---|
| beam_pipeline_args | List of Beam pipeline args to be added to the Beam executor spec. |

| RETURNS | DESCRIPTION |
|---|---|
| BaseBeamComponent | the same component itself. |

Source code in tfx/dsl/components/base/base_beam_component.py

with_node_execution_options

with_platform_config

Attaches a proto-form platform config to a component.

The config will be a per-node platform-specific config.

| PARAMETER | DESCRIPTION |
|---|---|
| config | platform config to attach to the component. |

| RETURNS | DESCRIPTION |
|---|---|
| Self | the same component itself. |

Source code in tfx/dsl/components/base/base_component.py
Pusher

Pusher(model: Optional[Channel] = None, model_blessing: Optional[Channel] = None, infra_blessing: Optional[Channel] = None, custom_config: Optional[Dict[str, Any]] = None)

Bases: Pusher

Cloud BigQuery Pusher component.

Component outputs contains:

- pushed_model: Channel of type standard_artifacts.PushedModel with the result of the push.

Construct a Pusher component.

| PARAMETER | DESCRIPTION |
|---|---|
| model | An optional Channel of type standard_artifacts.Model. |
| model_blessing | An optional Channel of type standard_artifacts.ModelBlessing. |
| infra_blessing | An optional Channel of type standard_artifacts.InfraBlessing. |
| custom_config | A dict which contains the deployment job parameters to be passed to Cloud platforms. |
| METHOD | DESCRIPTION |
|---|---|
| add_downstream_node | Experimental: Add another component that must run after this one. |
| add_downstream_nodes | Experimental: Add components that must run after this one. |
| add_upstream_node | Experimental: Add another component that must run before this one. |
| add_upstream_nodes | Experimental: Add components that must run before this one. |
| from_json_dict | Convert from dictionary data to an object. |
| get_class_type | |
| remove_downstream_node | |
| remove_upstream_node | |
| to_json_dict | Convert from an object to a JSON serializable dictionary. |
| with_id | |
| with_node_execution_options | |
| with_platform_config | Attaches a proto-form platform config to a component. |
| ATTRIBUTE | DESCRIPTION |
|---|---|
| DRIVER_CLASS | |
| EXECUTOR_SPEC | |
| POST_EXECUTABLE_SPEC | |
| PRE_EXECUTABLE_SPEC | |
| SPEC_CLASS | |
| component_id | |
| component_type | |
| downstream_nodes | |
| driver_class | |
| exec_properties | |
| executor_spec | |
| id | Node id, unique across all TFX nodes in a pipeline. |
| inputs | |
| node_execution_options | |
| outputs | Component's output channel dict. |
| platform_config | |
| spec | |
| type | |
| type_annotation | |
| upstream_nodes | |
Source code in tfx/extensions/google_cloud_big_query/pusher/component.py
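A usage sketch for pushing a model to BigQuery ML; trainer and evaluator are assumed upstream components, and the bigquery_serving_args field names are illustrative placeholders that should be checked against the executor's documentation.

```python
from tfx import v1 as tfx

pusher = tfx.extensions.google_cloud_big_query.Pusher(
    model=trainer.outputs['model'],               # assumed upstream component
    model_blessing=evaluator.outputs['blessing'],  # assumed upstream component
    custom_config={
        tfx.extensions.google_cloud_big_query.PUSHER_SERVING_ARGS_KEY: {
            'project_id': 'my-gcp-project',  # illustrative
            'bq_dataset_id': 'my_dataset',   # illustrative
            'model_name': 'my_model',        # illustrative
        },
    },
)
```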
id (property, writable)

id: str

Node id, unique across all TFX nodes in a pipeline.

If id is set by the user, it is returned directly; otherwise, a default id is generated for the node.

| RETURNS | DESCRIPTION |
|---|---|
| str | node id. |

add_downstream_node

Experimental: Add another component that must run after this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_upstream_node.

| PARAMETER | DESCRIPTION |
|---|---|
| downstream_node | a component that must run after this node. |

Source code in tfx/dsl/components/base/base_node.py

add_downstream_nodes

Experimental: Add components that must run after this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_upstream_nodes.

| PARAMETER | DESCRIPTION |
|---|---|
| downstream_nodes | a list of components that must run after this node. |

Source code in tfx/dsl/components/base/base_node.py

add_upstream_node

Experimental: Add another component that must run before this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_downstream_node.

| PARAMETER | DESCRIPTION |
|---|---|
| upstream_node | a component that must run before this node. |

Source code in tfx/dsl/components/base/base_node.py

add_upstream_nodes

Experimental: Add components that must run before this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

| PARAMETER | DESCRIPTION |
|---|---|
| upstream_nodes | a list of components that must run before this node. |

Source code in tfx/dsl/components/base/base_node.py

from_json_dict (classmethod)

Convert from dictionary data to an object.

get_class_type (classmethod)

get_class_type() -> str

Source code in tfx/dsl/components/base/base_node.py

remove_downstream_node

remove_upstream_node

to_json_dict

Convert from an object to a JSON serializable dictionary.

Source code in tfx/dsl/components/base/base_node.py

with_node_execution_options

with_platform_config

Attaches a proto-form platform config to a component.

The config will be a per-node platform-specific config.

| PARAMETER | DESCRIPTION |
|---|---|
| config | platform config to attach to the component. |

| RETURNS | DESCRIPTION |
|---|---|
| Self | the same component itself. |