Here we show how to create your own model in T2T.

`T2TModel` has three typical usages:

1. `make_estimator_model_fn` builds a `model_fn` for the `tf.Estimator`
   workflow of training, evaluation, and prediction. It performs the method
   `call`, which performs the core computation, followed by
   `estimator_spec_train`, `estimator_spec_eval`, or `estimator_spec_predict`,
   depending on the `tf.Estimator` mode.
2. Layer: the method `call` enables `T2TModel` to be used as a callable by
   itself (see the sketch after this list). It calls the following methods:
   * `bottom`, which transforms features according to `problem_hparams`'
     input and target `Modality`s;
   * `body`, which takes features and performs the core model computation to
     return output and any auxiliary loss terms;
   * `top`, which takes features and the body output, and transforms them
     according to `problem_hparams`' input and target `Modality`s to return
     the final logits;
   * `loss`, which takes the logits, forms any missing training loss, and
     sums all loss terms.
3. `infer` enables `T2TModel` to make sequence predictions by itself.
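To make the "Layer" usage concrete, here is a minimal sketch, not taken from the T2T docs, of building a registered model and calling it directly on a features dict. The model `basic_fc_relu`, the hparams set `basic_fc_small`, the problem `image_mnist`, the data directory, and the dummy feature shapes are illustrative assumptions:

```python
import tensorflow as tf

from tensor2tensor import models  # noqa: F401 -- importing registers the built-in models
from tensor2tensor import problems  # noqa: F401 -- importing registers the built-in problems
from tensor2tensor.utils import registry
from tensor2tensor.utils import trainer_lib

Modes = tf.estimator.ModeKeys

# Attach the problem's hparams (input/target modalities) to the model hparams.
hparams = trainer_lib.create_hparams(
    "basic_fc_small", data_dir="/tmp/t2t/data", problem_name="image_mnist")
model = registry.model("basic_fc_relu")(hparams, Modes.TRAIN)

# T2T feeds features as 4D Tensors; a dummy MNIST-shaped batch for illustration.
features = {
    "inputs": tf.zeros([8, 28, 28, 1], dtype=tf.int32),
    "targets": tf.zeros([8, 1, 1, 1], dtype=tf.int32),
}

# Calling the model runs bottom -> body -> top and returns logits and losses.
logits, losses = model(features)
```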
Create a class that extends `T2TModel`. This example creates a copy of an
existing basic fully-connected network:
```python
from tensor2tensor.utils import t2t_model


class MyFC(t2t_model.T2TModel):
  pass
```
Implement the `body` method:
```python
import tensorflow as tf

from tensor2tensor.layers import common_layers
from tensor2tensor.utils import t2t_model


class MyFC(t2t_model.T2TModel):

  def body(self, features):
    hparams = self.hparams
    x = features["inputs"]
    shape = common_layers.shape_list(x)
    # Flatten the input; in T2T all feature tensors are 4D.
    x = tf.reshape(x, [-1, shape[1] * shape[2] * shape[3]])
    for i in range(hparams.num_hidden_layers):  # Create the hidden layers.
      x = tf.layers.dense(x, hparams.hidden_size, name="layer_%d" % i)
      x = tf.nn.dropout(x, keep_prob=1.0 - hparams.dropout)
      x = tf.nn.relu(x)
    return tf.expand_dims(tf.expand_dims(x, axis=1), axis=1)  # Back to 4D for T2T.
```
The `body` method signature:

* Args:
  * `features`: dict of str to Tensor, typically containing the keys `inputs`
    and `targets`.
* Returns one of:
  * the body output: a Tensor of pre-logit activations, or
  * a tuple of `(output, losses)`, where `losses` is a scalar auxiliary loss
    or a dict mapping names to scalar loss terms.
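If `body` needs to report an auxiliary loss, it can use the `(output, losses)` return form. Below is a minimal sketch of such a variant; the class name `MyFCWithAuxLoss` and the penalty term are illustrative, not part of the original example:

```python
import tensorflow as tf

from tensor2tensor.layers import common_layers
from tensor2tensor.utils import t2t_model


class MyFCWithAuxLoss(t2t_model.T2TModel):  # hypothetical variant of MyFC

  def body(self, features):
    hparams = self.hparams
    x = features["inputs"]
    shape = common_layers.shape_list(x)
    x = tf.reshape(x, [-1, shape[1] * shape[2] * shape[3]])
    x = tf.nn.relu(tf.layers.dense(x, hparams.hidden_size, name="hidden"))
    # Illustrative auxiliary term; it is summed into the total training loss.
    losses = {"activation_penalty": 1e-3 * tf.reduce_mean(tf.square(x))}
    output = tf.expand_dims(tf.expand_dims(x, axis=1), axis=1)  # Back to 4D.
    return output, losses
```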
Register your model:

```python
from tensor2tensor.utils import registry


@registry.register_model
class MyFC(t2t_model.T2TModel):
  # ...
```
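Once registered, the class can be looked up by its snake_case name. A small sketch, assuming the module defining `MyFC` has already been imported so the decorator has run:

```python
from tensor2tensor.utils import registry

# "my_fc" is the snake_case name derived from the MyFC class name.
model_cls = registry.model("my_fc")  # returns the MyFC class defined above
```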
Use it with the t2t tools like any other model. Keep in mind that class names
are translated from CamelCase to snake_case (`MyFC` -> `my_fc`) and that you
need to point t2t to the directory containing your model with the
`--t2t_usr_dir` flag. For example, if you want to train the model on Google
Cloud ML Engine with 1 GPU worker on the IMDB sentiment task, you can run it
by executing the following command from your model class directory:
```bash
t2t-trainer \
  --model=my_fc \
  --t2t_usr_dir=. \
  --cloud_mlengine --worker_gpu=1 \
  --generate_data \
  --data_dir='gs://data' \
  --output_dir='gs://out' \
  --problem=sentiment_imdb \
  --hparams_set=basic_fc_small \
  --train_steps=10000 \
  --eval_steps=10
```
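The directory passed to `--t2t_usr_dir` is imported as a Python package before the registry is consulted, so it needs an `__init__.py` that imports the module defining your model. A minimal sketch, assuming the model lives in a file named `my_fc.py` (the file name is illustrative):

```python
# __init__.py in the directory passed as --t2t_usr_dir
# (here the current directory, --t2t_usr_dir=.)
from . import my_fc  # noqa: F401 -- importing the module runs @registry.register_model
```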