Tensor2Tensor Documentation


Tensor2Tensor, or T2T for short, is a library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.




Solving your task

Below we list a number of tasks that can be solved with T2T when you train the appropriate model on the appropriate problem. For each task we give the problem and model and suggest a setting of hyperparameters that we know works well in our setup. We usually run either on Cloud TPUs or on 8-GPU machines; you might need to modify the hyperparameters if you run on a different setup.

Image Classification

For image classification, we have a number of standard data-sets:

- ImageNet (a large data-set): --problem=image_imagenet, or one of the down-scaled versions (image_imagenet224, image_imagenet64, image_imagenet32)
- CIFAR-10: --problem=image_cifar10 (or --problem=image_cifar10_plain to turn off data augmentation)
- CIFAR-100: --problem=image_cifar100
- MNIST: --problem=image_mnist

For ImageNet, we suggest using ResNet or Xception, i.e., --model=resnet --hparams_set=resnet_50 or --model=xception --hparams_set=xception_base. ResNet should reach above 76% top-1 accuracy on ImageNet.
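
As a minimal sketch of an ImageNet run (the directory paths are placeholders, and the ImageNet data itself must be obtained and generated separately, since it cannot be downloaded automatically):

```
# Train ResNet-50 on ImageNet; --data_dir must already contain the
# generated ImageNet data.
t2t-trainer \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/imagenet_resnet \
  --problem=image_imagenet \
  --model=resnet \
  --hparams_set=resnet_50
```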

For CIFAR and MNIST, we suggest trying the shake-shake model: --model=shake_shake --hparams_set=shakeshake_big. Trained with --train_steps=700000, this setting should yield close to 97% accuracy on CIFAR-10.
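
For example, a complete CIFAR-10 run might look like the following sketch (directory paths are placeholders; --generate_data downloads and prepares the data-set first):

```
# Generate the CIFAR-10 data, then train shake-shake for 700k steps.
t2t-trainer \
  --generate_data \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/cifar10_shakeshake \
  --problem=image_cifar10 \
  --model=shake_shake \
  --hparams_set=shakeshake_big \
  --train_steps=700000
```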

Language Modeling

For language modeling, we have these data-sets in T2T:

- PTB (a small data-set): --problem=languagemodel_ptb10k for word-level modeling and --problem=languagemodel_ptb_characters for character-level modeling.
- LM1B (a billion-word corpus): --problem=languagemodel_lm1b32k for subword-level modeling and --problem=languagemodel_lm1b_characters for character-level modeling.

We suggest starting with --model=transformer on this task, using --hparams_set=transformer_small for PTB and --hparams_set=transformer_base for LM1B.
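
As a minimal sketch for word-level PTB (directory paths are placeholders):

```
# Generate the word-level PTB data and train a small Transformer.
t2t-trainer \
  --generate_data \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/ptb \
  --problem=languagemodel_ptb10k \
  --model=transformer \
  --hparams_set=transformer_small
```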

Sentiment Analysis

For the task of recognizing the sentiment of a sentence, use

- the IMDB data-set: --problem=sentiment_imdb

We suggest using --model=transformer_encoder here; since it is a small data-set, try --hparams_set=transformer_tiny and train for only a few steps (e.g., --train_steps=2000).
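
Put together, a run might look like this sketch (directory paths are placeholders):

```
# Generate the IMDB data and briefly train a tiny Transformer encoder.
t2t-trainer \
  --generate_data \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/sentiment_imdb \
  --problem=sentiment_imdb \
  --model=transformer_encoder \
  --hparams_set=transformer_tiny \
  --train_steps=2000
```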

Speech Recognition

For speech-to-text, we have these data-sets in T2T:

- Librispeech (US English): --problem=librispeech for the whole set and --problem=librispeech_clean for a smaller but nicely filtered part.

Summarization

For summarizing longer text into a shorter one, we have these data-sets:

- CNN/DailyMail articles summarized into a few sentences: --problem=summarize_cnn_dailymail32k

We suggest using --model=transformer and --hparams_set=transformer_prepend for this task. This yields good ROUGE scores.
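
A minimal sketch of such a run (directory paths are placeholders):

```
# Generate the CNN/DailyMail data and train a Transformer with the
# "prepend" hyperparameter setting.
t2t-trainer \
  --generate_data \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/summarize \
  --problem=summarize_cnn_dailymail32k \
  --model=transformer \
  --hparams_set=transformer_prepend
```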


Translation

There are a number of translation data-sets in T2T:

- English-German: --problem=translate_ende_wmt32k
- English-French: --problem=translate_enfr_wmt32k
- English-Czech: --problem=translate_encs_wmt32k
- English-Chinese: --problem=translate_enzh_wmt32k
- English-Vietnamese: --problem=translate_envi_iwslt32k

You can get translations in the other direction by appending _rev to the problem name, e.g., for German-English use --problem=translate_ende_wmt32k_rev.

For all translation problems, we suggest trying the Transformer model: --model=transformer. At first it is best to try the base setting, --hparams_set=transformer_base. When trained on 8 GPUs for 300K steps, this should reach a BLEU score of about 28 on the English-German data-set, which is close to state of the art. If training on a single GPU, try the --hparams_set=transformer_base_single_gpu setting. For very good results or larger data-sets (e.g., for English-French), try the big model with --hparams_set=transformer_big.
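
For example, a single-GPU English-German run might look like this sketch (directory paths are placeholders; append _rev to the problem name for the German-English direction):

```
# Generate the WMT English-German data and train the base Transformer
# with the single-GPU hyperparameter setting.
t2t-trainer \
  --generate_data \
  --data_dir=~/t2t_data \
  --output_dir=~/t2t_train/ende \
  --problem=translate_ende_wmt32k \
  --model=transformer \
  --hparams_set=transformer_base_single_gpu
```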