tensor2tensor

T2T: Create Your Own Model

Here we show how to create your own model in T2T.

The T2TModel class - abstract base class for models

T2TModel has three typical usages:

- Estimator: The method make_estimator_model_fn builds a model_fn for the tf.Estimator workflow of training, evaluation, and prediction. It invokes call, which performs the core computation, and then estimator_spec_train, estimator_spec_eval, or estimator_spec_predict, depending on the tf.Estimator mode.
- Layer: The method call enables T2TModel to be used as a callable by itself. It calls the following methods, in order:
  - bottom, which transforms features according to problem_hparams' input and target Modality objects;
  - body, which takes features and performs the core model computation to return output and any auxiliary loss terms;
  - top, which takes features and the body output, and transforms them according to problem_hparams' input and target Modality objects to return the final logits;
  - loss, which takes the logits, forms any missing training loss, and sums all loss terms.
- Inference: The method infer enables T2TModel to make sequence predictions by itself.
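
The order in which call chains these four methods can be illustrated with a toy stand-in. This is a minimal plain-Python sketch, not T2T code: the ToyModel class, its simplified method signatures, the made-up transforms, and the "extra" loss key are all assumptions chosen for illustration, and no TensorFlow is involved.

```python
class ToyModel:
    """Plain-Python stand-in for a T2TModel subclass (hypothetical)."""

    def __init__(self):
        self.trace = []  # Records the order in which the stages run.

    def bottom(self, features):
        self.trace.append("bottom")
        transformed = dict(features)
        # Made-up input transform: double every input value.
        transformed["inputs"] = [2.0 * v for v in features["inputs"]]
        return transformed

    def body(self, features):
        self.trace.append("body")
        output = [v + 1.0 for v in features["inputs"]]
        aux_losses = {"extra": 0.25}  # Made-up auxiliary loss term.
        return output, aux_losses

    def top(self, body_output, features):
        self.trace.append("top")
        return list(body_output)  # Identity "logits" for this sketch.

    def loss(self, logits, features, aux_losses):
        self.trace.append("loss")
        targets = features["targets"]
        # Form the training loss (mean squared error here), then sum all terms.
        training = sum((y - t) ** 2 for y, t in zip(logits, targets)) / len(targets)
        return training + sum(aux_losses.values())

    def __call__(self, features):
        transformed = self.bottom(features)
        output, aux_losses = self.body(transformed)
        logits = self.top(output, transformed)
        return logits, self.loss(logits, features, aux_losses)


model = ToyModel()
logits, total_loss = model({"inputs": [1.0, 2.0], "targets": [3.0, 6.0]})
print(model.trace)
print(logits, total_loss)
```

In a real model you normally override only body; bottom, top, and loss are driven by the problem's modalities.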

Creating your own model

Create a class that extends T2TModel. This example creates a copy of an existing basic fully-connected network:
```
from tensor2tensor.utils import t2t_model

class MyFC(t2t_model.T2TModel):
    pass
```

Implement the body method:

```
from tensor2tensor.layers import common_layers
from tensor2tensor.utils import t2t_model

import tensorflow as tf


class MyFC(t2t_model.T2TModel):

  def body(self, features):
    hparams = self.hparams
    x = features["inputs"]
    shape = common_layers.shape_list(x)
    # Flatten the input; feature tensors in T2T are all 4D.
    x = tf.reshape(x, [-1, shape[1] * shape[2] * shape[3]])
    for i in range(hparams.num_hidden_layers):  # Stack the dense layers.
      x = tf.layers.dense(x, hparams.hidden_size, name="layer_%d" % i)
      x = tf.nn.dropout(x, keep_prob=1.0 - hparams.dropout)
      x = tf.nn.relu(x)
    # Expand back to 4D, [batch_size, 1, 1, hidden_size], for T2T.
    return tf.expand_dims(tf.expand_dims(x, axis=1), axis=1)
```
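
To make the reshaping in body easier to follow, the sketch below traces only the tensor shapes through the three stages (flatten, dense layers, expand back to 4D). It is plain Python with no TensorFlow; the helper name trace_body_shapes and the example sizes are hypothetical, and it assumes at least one hidden layer.

```python
def trace_body_shapes(input_shape, hidden_size):
    """Return the shapes produced by each stage of the MyFC body."""
    batch_size, d1, d2, d3 = input_shape  # T2T inputs are 4D.
    # tf.reshape(x, [-1, d1 * d2 * d3]) collapses all non-batch dimensions.
    flattened = (batch_size, d1 * d2 * d3)
    # Each dense layer maps the last dimension to hidden_size.
    hidden = (batch_size, hidden_size)
    # Two expand_dims calls at axis=1 insert two singleton dimensions.
    output = (batch_size, 1, 1, hidden_size)
    return [flattened, hidden, output]


# Example sizes are made up: a batch of 32 single-channel 28x28 inputs.
shapes = trace_body_shapes((32, 28, 28, 1), hidden_size=128)
print(shapes)
```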

Method Signature:

Args:
- features: dict of str to Tensor, where each Tensor has shape [batch_size, …, hidden_size]. It typically contains the keys inputs and targets.
Returns either output alone or a tuple (output, losses):
- output: Tensor of pre-logit activations with shape [batch_size, …, hidden_size].
- losses: Either a single loss as a scalar, a list of losses, a Tensor (to be averaged), or a dictionary of losses. If losses is a dictionary with the key "training", losses["training"] is treated as the final training loss and output is treated as logits; self.top and self.loss are skipped.
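
One way to read this return contract is sketched below. It is a hypothetical plain-Python helper, not T2T's internal code. It assumes body returns either the output alone or an (output, losses) tuple, that a list of losses is summed, and that non-dictionary losses are stored under an illustrative key "extra". Averaging a loss Tensor is left out because the sketch does not use TensorFlow.

```python
def interpret_body_output(body_out):
    """Return (output, losses_dict, skip_top_and_loss) for a body result."""
    if not isinstance(body_out, tuple):
        # body returned only the pre-logit activations.
        return body_out, {}, False
    output, losses = body_out
    if isinstance(losses, list):
        losses = {"extra": sum(losses)}  # Assumption: list entries are summed.
    elif not isinstance(losses, dict):
        losses = {"extra": losses}  # A single scalar loss.
    # With a "training" key, output already holds the logits and the loss
    # is final, so the top and loss steps are skipped.
    return output, losses, "training" in losses


print(interpret_body_output("activations"))
print(interpret_body_output(("activations", [0.25, 0.5])))
print(interpret_body_output(("logits", {"training": 1.0})))
```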

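Register your model:

```
from tensor2tensor.utils import registry
from tensor2tensor.utils import t2t_model

@registry.register_model
class MyFC(t2t_model.T2TModel):
  # ... body method as implemented above ...
  pass
```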

Use it with the T2T tools like any other model:

Keep in mind that class names are translated from CamelCase to snake_case (MyFC -> my_fc), and that you need to point T2T to the directory containing your model with the --t2t_usr_dir flag. For example, to train your model on the IMDB sentiment task on gcloud with 1 GPU worker, run the following command from the directory containing your model class:
```
t2t-trainer \
  --model=my_fc \
  --t2t_usr_dir=. \
  --cloud_mlengine --worker_gpu=1 \
  --generate_data \
  --data_dir='gs://data' \
  --output_dir='gs://out' \
  --problem=sentiment_imdb \
  --hparams_set=basic_fc_small \
  --train_steps=10000 \
  --eval_steps=10
```
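
The CamelCase to snake_case translation that turns MyFC into my_fc for the --model flag can be approximated with two regular-expression passes. The helper below is a sketch written for illustration; the name camel_to_snake is hypothetical, and the registry's actual conversion may differ in edge cases.

```python
import re

# An uppercase letter followed by lowercase letters/digits starts a new word.
_FIRST_CAP = re.compile(r"(.)([A-Z][a-z0-9]+)")
# A lowercase letter or digit followed by an uppercase letter is a boundary.
_ALL_CAP = re.compile(r"([a-z0-9])([A-Z])")


def camel_to_snake(name):
    """Convert a CamelCase class name to snake_case."""
    partial = _FIRST_CAP.sub(r"\1_\2", name)
    return _ALL_CAP.sub(r"\1_\2", partial).lower()


print(camel_to_snake("MyFC"))
print(camel_to_snake("BasicFcRelu"))
```

If the name passed to --model does not match the converted class name, the trainer cannot find the model, so it helps to check the conversion when a class name contains runs of capitals.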
