lingvo.datasets module
Utilities for dataset information.
- exception lingvo.datasets.GetAllDatasetParamsNotImplementedError
Bases: NotImplementedError
- lingvo.datasets.GetDatasets(cls: Any, warn_on_error: bool = True) → List[str]
Returns the list of dataset functions (e.g., Train, Dev, …).
All public functions apart from NON_DATASET_MEMBERS are treated as datasets. Dataset functions should not have any required positional arguments.
- Parameters
cls – A class or an instance of a class. This function expects to be called on classes that can be used as model tasks, e.g. via model_registry.RegisterSingleTaskModel.
warn_on_error – Controls what happens when a class contains public methods that cannot be used as datasets: if True, a warning is logged; if False, a DatasetFunctionError is raised.
- Returns
A list of strings containing names of valid dataset functions for cls.
- Raises
DatasetFunctionError – if cls contains public methods that cannot be used as datasets and warn_on_error is False.
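The selection rule described above can be illustrated with a minimal, self-contained sketch. It does not import lingvo and is not lingvo's implementation: `list_dataset_functions`, the stand-in `DatasetFunctionError`, and the contents of `NON_DATASET_MEMBERS` are assumptions made for illustration only.

```python
import inspect
import logging
from typing import Any, List

# Assumption: placeholder for lingvo's NON_DATASET_MEMBERS; the real contents may differ.
NON_DATASET_MEMBERS = frozenset({"Task", "Model"})


class DatasetFunctionError(TypeError):
    """Hypothetical stand-in for the error named in the documentation."""


def _has_required_positional(cls: Any, name: str, member: Any) -> bool:
    params = list(inspect.signature(member).parameters.values())
    # On a class, plain methods are unbound functions whose first parameter is
    # `self`; drop it. Bound methods and staticmethods need no adjustment.
    if inspect.isclass(cls) and inspect.isfunction(inspect.getattr_static(cls, name)):
        params = params[1:]
    positional = (inspect.Parameter.POSITIONAL_ONLY,
                  inspect.Parameter.POSITIONAL_OR_KEYWORD)
    return any(p.kind in positional and p.default is inspect.Parameter.empty
               for p in params)


def list_dataset_functions(cls: Any, warn_on_error: bool = True) -> List[str]:
    """Returns names of public callables that take no required positional args."""
    names = []
    for name, member in inspect.getmembers(cls, callable):
        if name.startswith("_") or name in NON_DATASET_MEMBERS:
            continue
        if _has_required_positional(cls, name, member):
            message = f"{name} cannot be used as a dataset function."
            if not warn_on_error:
                raise DatasetFunctionError(message)
            logging.warning(message)
            continue
        names.append(name)
    return names


class MyTask:
    """Hypothetical model class used only for this example."""

    def Train(self):
        return "train"

    def Dev(self):
        return "dev"

    def Task(self):  # Excluded: listed in NON_DATASET_MEMBERS.
        return "task"

    def Lookup(self, key):  # Excluded: `key` is a required positional argument.
        return key


print(list_dataset_functions(MyTask))
```

With `warn_on_error=True` the sketch logs a warning for `Lookup` and skips it; with `warn_on_error=False` it raises on the first unusable public method instead.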
- lingvo.datasets.GetDatasetsAst(base_dir: str, model: str) → List[str]
Gets the dataset names without importing any code, by parsing the source with ast.
Useful when running from a Python interpreter without a bazel build.
- Parameters
base_dir – Base directory to search in.
model – The model string.
- Returns
A list of strings containing names of valid dataset functions for model. May not be accurate.
- Raises
Exception – if anything goes wrong.
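The general idea of reading method names through ast, without importing or executing the module, can be sketched as follows. This is a minimal illustration and not lingvo's implementation: `list_public_methods_from_source` is a hypothetical helper that takes source text directly, because how GetDatasetsAst maps base_dir and model to files is not described here.

```python
import ast
from typing import List


def list_public_methods_from_source(source: str, class_name: str) -> List[str]:
    """Returns sorted public method names defined directly in `class_name`."""
    tree = ast.parse(source)  # Parses only; nothing in `source` is executed.
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return sorted(
                item.name
                for item in node.body
                if isinstance(item, ast.FunctionDef)
                and not item.name.startswith("_")
            )
    raise ValueError(f"Class {class_name!r} not found in source.")


# Hypothetical source text; `Base` is never resolved because nothing is imported.
EXAMPLE_SOURCE = """
class MyModel(Base):
    def Train(self):
        return 1

    def Test(self):
        return 2

    def _Helper(self):
        return 3
"""

print(list_public_methods_from_source(EXAMPLE_SOURCE, "MyModel"))
```

A purely syntactic scan like this one sees only methods written in the class body, so anything inherited from `Base` is missed. That is one plausible reason a result obtained this way may not be accurate.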