lingvo.tools.beam_utils module
Tools for car beam pipelines.
- lingvo.tools.beam_utils.BeamInit()
Initialize the beam program.
Typically the first thing to run in main(). For example, this call must be made before FLAGS are accessed.
- lingvo.tools.beam_utils.GetPipelineRoot(options=None)
Return the root of the beam pipeline.
Typical usage looks like:
    with GetPipelineRoot() as root:
      _ = (root | beam.ParDo() | …)
In this example, the pipeline is automatically executed when the context is exited, though one can manually run the pipeline built from the root object as well.
- Parameters
options – A beam.options.pipeline_options.PipelineOptions object.
- Returns
A beam.Pipeline root object.
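The run-on-exit behavior described above can be illustrated with a small stand-in. This is a minimal sketch that uses neither Beam nor lingvo: `ToyPipeline`, `add`, and the `steps` list are hypothetical names chosen for illustration. The only thing it carries over from the text is that leaving the `with` block without an exception triggers execution, while calling `run()` by hand remains possible.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline root; this is not the Beam API."""

    def __init__(self):
        self.steps = []    # Deferred work, recorded while the pipeline is built.
        self.results = []  # Filled in only when run() is called.
        self.ran = False

    def add(self, fn):
        # Record a step without executing it.
        self.steps.append(fn)
        return self

    def run(self):
        # Execute every recorded step in order.
        self.results = [fn() for fn in self.steps]
        self.ran = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Run only if the block finished without raising.
        if exc_type is None:
            self.run()
        return False  # Never suppress exceptions from the block.


# Automatic execution when the context exits.
with ToyPipeline() as root:
    root.add(lambda: 1 + 1)
    ran_inside_block = root.ran  # Nothing has executed at this point.

# Manual execution, without a context manager.
manual = ToyPipeline()
manual.add(lambda: 'done')
manual.run()

print(ran_inside_block, root.ran, root.results, manual.results)
```

In the toy version, building and running are separate phases, which is why `ran_inside_block` stays false until the block exits.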
- lingvo.tools.beam_utils.GetReader(record_format, file_pattern, value_coder, **kwargs)
Returns a beam Reader based on record_format and file_pattern.
- Parameters
record_format – String record format, e.g., ‘tfrecord’.
file_pattern – String path describing files to be read.
value_coder – Coder to use for the values of each record.
**kwargs – Keyword arguments to pass to the corresponding Reader object constructor.
- Returns
A beam reader object.
- Raises
ValueError – If an unsupported record_format is provided.
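The contract above (select a reader by format name, forward extra keyword arguments to its constructor, and raise ValueError for unknown formats) can be sketched in plain Python. This is an illustration of the pattern only, not lingvo's implementation: `ToyTFRecordReader`, `_READERS`, `get_reader`, the `/tmp/data-*` path, the `validate` keyword, and the use of `bytes` as a placeholder coder are all assumptions made for the example.

```python
class ToyTFRecordReader:
    """Hypothetical reader class; stands in for a real Beam source."""

    def __init__(self, file_pattern, coder, **kwargs):
        self.file_pattern = file_pattern
        self.coder = coder
        self.options = kwargs  # Extra constructor arguments, kept as given.


# Hypothetical registry from record format name to reader class.
_READERS = {'tfrecord': ToyTFRecordReader}


def get_reader(record_format, file_pattern, value_coder, **kwargs):
    """Looks up a reader class by format and forwards **kwargs to it."""
    try:
        reader_cls = _READERS[record_format]
    except KeyError:
        raise ValueError(
            'Unsupported record format: %r' % record_format) from None
    return reader_cls(file_pattern, coder=value_coder, **kwargs)


# A supported format: extra keyword arguments reach the constructor.
reader = get_reader('tfrecord', '/tmp/data-*', value_coder=bytes,
                    validate=False)
print(type(reader).__name__, reader.options)

# An unsupported format: the lookup fails with ValueError.
try:
    get_reader('csv', '/tmp/data-*', value_coder=bytes)
except ValueError as e:
    print('ValueError:', e)
```

Keeping the format lookup in one table means an unsupported format fails early, before any pipeline step is built.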
- lingvo.tools.beam_utils.GetWriter(record_format, file_pattern, value_coder, **kwargs)
Returns a beam Writer based on record_format and file_pattern.
- Parameters
record_format – String record format to write as, e.g., ‘tfrecord’.
file_pattern – String path describing files to be written to.
value_coder – Coder to use for the values of each written record.
**kwargs – Keyword arguments to pass to the corresponding Writer object constructor.
- Returns
A beam writer object.
- Raises
ValueError – If an unsupported record_format is provided.
- lingvo.tools.beam_utils.GetEmitterFn(record_format)
Returns an Emitter function for the given record_format.
An Emitter function takes in a key and value as arguments and returns a structure that is compatible with the Beam Writer associated with the corresponding record_format.
- Parameters
record_format – String record format to write as, e.g., ‘tfrecord’.
- Returns
An emitter function of (key, value) -> Writer’s input type.
- Raises
ValueError – If an unsupported record_format is provided.
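To make the emitter idea concrete, the sketch below shows one way a (key, value) -> writer-input function could be selected per format. It is an assumption-laden illustration, not lingvo's code: the source does not say what structure each writer expects, so the value-only behavior for 'tfrecord' is assumed, and 'keyed', `_EMITTERS`, and `get_emitter_fn` are invented names.

```python
def _value_only_emitter(key, value):
    # Assumed behavior: the writer stores values only, so the key is dropped.
    del key
    return value


def _pair_emitter(key, value):
    # Assumed behavior: the writer expects (key, value) tuples.
    return (key, value)


# Hypothetical mapping; 'keyed' is an invented format name for illustration.
_EMITTERS = {'tfrecord': _value_only_emitter, 'keyed': _pair_emitter}


def get_emitter_fn(record_format):
    """Returns the emitter for record_format, or raises ValueError."""
    if record_format not in _EMITTERS:
        raise ValueError('Unsupported record format: %r' % record_format)
    return _EMITTERS[record_format]


# Upstream code always produces (key, value); the emitter adapts it.
emit = get_emitter_fn('tfrecord')
record = emit('example-0', b'payload')

emit_keyed = get_emitter_fn('keyed')
keyed_record = emit_keyed('example-0', b'payload')

print(record, keyed_record)
```

With this arrangement, processing code can produce keys and values uniformly and leave the format-specific shaping to the emitter chosen alongside the writer.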