lingvo.tasks.milan.common_schema module
Defines a common tf.train.Example format for image-caption(-like) data.
- lingvo.tasks.milan.common_schema.ImageFeatures(images_per_example=1)[source]
Returns definitions of common image features.
- Parameters
images_per_example – Number of images stored in each example.
- Returns
A dict of feature definitions usable with tf.io.parse_example.
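As a rough illustration of how a dict of feature definitions is consumed by tf.io.parse_example, the sketch below builds a small stand-in schema by hand instead of calling ImageFeatures. The feature keys ("example/image_ids", "example/caption"), their dtypes, and their shapes are assumptions made up for this example; they are not the keys this module actually defines. It also assumes TensorFlow 2.x with eager execution.

```python
import tensorflow as tf

# Hypothetical stand-in schema. The keys, dtypes, and shapes are illustrative
# assumptions, NOT the feature definitions returned by ImageFeatures().
images_per_example = 2
feature_spec = {
    "example/image_ids": tf.io.FixedLenFeature([images_per_example], tf.int64),
    "example/caption": tf.io.FixedLenFeature([], tf.string),
}

# Build one tf.train.Example that conforms to the stand-in schema.
example = tf.train.Example(
    features=tf.train.Features(
        feature={
            "example/image_ids": tf.train.Feature(
                int64_list=tf.train.Int64List(value=[7, 8])
            ),
            "example/caption": tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[b"a dog"])
            ),
        }
    )
)

# parse_example takes a batch of serialized protos plus the feature dict and
# returns a dict of tensors with a leading batch dimension.
serialized_batch = tf.constant([example.SerializeToString()])
parsed = tf.io.parse_example(serialized_batch, feature_spec)
```

In real use, the hand-written feature_spec would be replaced by the dict returned from ImageFeatures (or one of the other functions in this module).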
- lingvo.tasks.milan.common_schema.TextFeatures(captions_per_example=1, bert_embeddings_shape=None)[source]
Returns definitions of the common text features.
- Parameters
captions_per_example – Number of text captions stored in each example.
bert_embeddings_shape – Optional time-major shape of BERT embedding sequences to include in the schema (if given). Set the leading (time) dimension to None if the sequences have variable length.
- Returns
A dict of feature definitions usable with tf.io.parse_example.
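To make the "time-major shape with a leading None" convention concrete, here is a minimal pure-Python sketch of how such a shape can be read: the first dimension is time, and None stands for "any length". The helper matches_shape and the embedding width 768 are assumptions chosen for illustration; neither is part of this module.

```python
from typing import Optional, Sequence


def matches_shape(actual: Sequence[int], spec: Sequence[Optional[int]]) -> bool:
    """Returns True if `actual` fits `spec`, where None in `spec` matches any size.

    Hypothetical helper for illustration only; not part of common_schema.
    """
    if len(actual) != len(spec):
        return False
    return all(s is None or a == s for a, s in zip(actual, spec))


# Time-major: (time, embedding_dim). The leading None allows sequences of any
# length; the width 768 is an assumed value for this example.
bert_embeddings_shape = (None, 768)

short_ok = matches_shape((5, 768), bert_embeddings_shape)     # 5 time steps
long_ok = matches_shape((40, 768), bert_embeddings_shape)     # 40 time steps
wrong_width = matches_shape((5, 512), bert_embeddings_shape)  # width mismatch
```

Passing a fully specified shape such as (16, 768) instead would correspond to sequences that all have exactly 16 time steps.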
- lingvo.tasks.milan.common_schema.AudioFeatures(mfcc_shape=None, cpc8k_shape=None)[source]
Returns definitions of common audio features.
- Parameters
mfcc_shape – Optional time-major shape of MFCC features to include in the schema (if given). Set the leading (time) dimension to None if the sequences have variable length.
cpc8k_shape – Optional time-major shape of CPC-8K features to include.
- Returns
A dict of feature definitions usable with tf.io.parse_example.
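Since each function in this module returns a plain dict, one plausible way to describe examples that carry several modalities is to merge the dicts into a single feature spec before parsing. The sketch below shows that idea in pure Python. The helper merge_feature_specs, the keys, and the placeholder string values are all assumptions for illustration; in practice the values would be the feature definitions returned by ImageFeatures, TextFeatures, and AudioFeatures.

```python
from typing import Any, Dict


def merge_feature_specs(*specs: Dict[str, Any]) -> Dict[str, Any]:
    """Merges feature dicts, refusing to silently overwrite a duplicate key.

    Hypothetical helper for illustration only; not part of common_schema.
    """
    merged: Dict[str, Any] = {}
    for spec in specs:
        overlap = merged.keys() & spec.keys()
        if overlap:
            raise ValueError(f"Duplicate feature keys: {sorted(overlap)}")
        merged.update(spec)
    return merged


# Placeholder dicts standing in for the return values of ImageFeatures(),
# TextFeatures(), and AudioFeatures(). Keys and values are made up.
image_spec = {"image/placeholder": "image feature definition"}
text_spec = {"text/placeholder": "text feature definition"}
audio_spec = {"audio/placeholder": "audio feature definition"}

combined_spec = merge_feature_specs(image_spec, text_spec, audio_spec)
```

Raising on duplicate keys is a design choice for this sketch: it surfaces accidental collisions early instead of letting a later dict overwrite an earlier definition.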