lingvo.datasets module

Utilities for dataset information.

exception lingvo.datasets.DatasetFunctionError

Bases: TypeError

exception lingvo.datasets.GetAllDatasetParamsNotImplementedError

Bases: NotImplementedError

lingvo.datasets.GetDatasets(cls: Any, warn_on_error: bool = True) → List[str]

Returns the list of dataset functions (e.g., Train, Dev, …).

All public functions apart from NON_DATASET_MEMBERS are treated as datasets. Dataset functions should not have any required positional arguments.

Parameters
  • cls – A class or an instance of a class. This function expects to be called on classes that can be used as model tasks, e.g. via model_registry.RegisterSingleTaskModel.

  • warn_on_error – Controls what happens when the class contains public methods that cannot be used as datasets: if True, a warning is logged; if False, a DatasetFunctionError is raised.

Returns

A list of strings containing names of valid dataset functions for cls.

Raises

DatasetFunctionError – if cls contains public methods that cannot be used as datasets and warn_on_error is False.
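
The selection rule described for GetDatasets can be illustrated with a small standalone sketch. It does not import lingvo and is not the library's implementation: get_datasets_sketch, MyTask, and the contents of NON_DATASET_MEMBERS below are hypothetical stand-ins, and DatasetFunctionError is redefined locally so the example runs on its own. It assumes that "public" means "name does not start with an underscore".

```python
import inspect
import logging
from typing import Any, List

# Hypothetical contents; the real NON_DATASET_MEMBERS list is not shown in this page.
NON_DATASET_MEMBERS = ("Task", "Model")


class DatasetFunctionError(TypeError):
  """Local stand-in for lingvo.datasets.DatasetFunctionError."""


def get_datasets_sketch(cls: Any, warn_on_error: bool = True) -> List[str]:
  """Returns names of public callables that need no positional arguments."""
  datasets = []
  for name, fn in inspect.getmembers(cls, inspect.isroutine):
    if name.startswith("_") or name in NON_DATASET_MEMBERS:
      continue
    params = list(inspect.signature(fn).parameters.values())
    # When cls is a class, plain methods still list `self`; drop it.
    if inspect.isclass(cls) and params and params[0].name == "self":
      params = params[1:]
    required = [
        p for p in params
        if p.default is p.empty
        and p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
    ]
    if required:
      message = f"{name} has required positional arguments."
      if not warn_on_error:
        raise DatasetFunctionError(message)
      logging.warning(message)
      continue
    datasets.append(name)
  return datasets


class MyTask:
  """Hypothetical task class used only for this example."""

  def Train(self):
    return "train params"

  def Dev(self):
    return "dev params"

  def Test(self, shard=0):  # Optional argument, so still a dataset.
    return "test params"

  def Task(self):  # Excluded through NON_DATASET_MEMBERS.
    return "task params"

  def Helper(self, required_arg):  # Rejected: required positional argument.
    return required_arg


print(get_datasets_sketch(MyTask))
```

In this sketch, Helper triggers a warning by default and a DatasetFunctionError when warn_on_error=False, matching the behaviour described above.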

lingvo.datasets.GetDatasetsAst(base_dir: str, model: str) → List[str]

Gets datasets but without importing any code by using ast.

Useful when running from a Python interpreter without a Bazel build.

Parameters
  • base_dir – Base directory to search in.

  • model – The model string.

Returns

A list of strings containing names of valid dataset functions for model. May not be accurate.

Raises

Exception – if anything goes wrong.
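
The general idea of listing methods through ast without importing any code can be sketched as follows. This is only an illustration of the technique, not how GetDatasetsAst is implemented: public_methods_via_ast, SOURCE, and MyModel are hypothetical names, and the sketch takes source text directly rather than resolving base_dir and model to a file.

```python
import ast
from typing import List


def public_methods_via_ast(source: str, class_name: str) -> List[str]:
  """Lists public methods defined directly in class_name, without importing."""
  tree = ast.parse(source)
  for node in ast.walk(tree):
    if isinstance(node, ast.ClassDef) and node.name == class_name:
      return [
          item.name
          for item in node.body
          if isinstance(item, ast.FunctionDef)
          and not item.name.startswith("_")
      ]
  return []


# The import below would fail if executed, but ast.parse only reads the text.
SOURCE = '''
import some_module_that_is_not_installed

class MyModel:
  def Train(self):
    return some_module_that_is_not_installed.Params()

  def Dev(self):
    pass

  def _Helper(self):
    pass
'''

print(public_methods_via_ast(SOURCE, "MyModel"))  # ['Train', 'Dev']
```

A static scan of this kind cannot see inherited or dynamically added methods, which is one way its result can differ from what an import-based lookup would return.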