class Trainer(params: deep_qa.common.params.Params)[source]

A Trainer object specifies data, a model, and a way to train the model with the data. Here we group all of the common code related to these things, making only minimal assumptions about what kind of data you’re using or what the structure of your model is.

The main benefits of this class are having a common place for setting parameters related to training, actually running the training with those parameters, and code for saving and loading models.

The intended use of this class is that you construct a subclass that defines a model, overriding the abstract methods and (optionally) some of the protected methods in this class. Thus there are four kinds of methods in this class: (1) public methods, that are typically only used by deep_qa/ (or some other driver that you create), (2) abstract methods (beginning with _), which must be overridden by any concrete subclass, (3) protected methods (beginning with _) that you are meant to override in concrete subclasses, and (4) private methods (beginning with __) that you should not need to mess with. We only include the first three in the public docs.
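As a rough illustration of these four kinds of methods, the sketch below uses a stand-in base class rather than the real Trainer. The names SketchTrainer and MyModelTrainer, the body of train(), and the string returned by _build_model are all hypothetical; only the naming conventions come from the description above.

```python
class SketchTrainer:
    """Stand-in for Trainer; NOT the real deep_qa class."""

    def __init__(self, params):
        self.batch_size = params.get("batch_size", 32)

    # (1) Public method: called by a driver script.
    def train(self):
        self._pre_epoch_hook(0)
        return self._build_model()

    # (2) Abstract method: concrete subclasses must override it.
    def _build_model(self):
        raise NotImplementedError

    # (3) Protected method: overriding it is optional.
    def _pre_epoch_hook(self, epoch):
        pass

    # (4) Private method: name-mangled, not meant to be overridden.
    def __log(self, message):
        print(message)


class MyModelTrainer(SketchTrainer):
    """Hypothetical concrete subclass that only fills in the abstract method."""

    def _build_model(self):
        return "model with batch size %d" % self.batch_size


trainer = MyModelTrainer({"batch_size": 16})
result = trainer.train()
```

Calling train() on the stand-in base class itself would raise NotImplementedError, which is the behaviour you would expect from an abstract method that has not been overridden.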


train_files: List[str], optional (default=None)

The files containing the data that should be used for training. See load_dataset_from_files() for more information.

validation_files: List[str], optional (default=None)

The files containing the data that should be used for validation, if you do not want to use a split of the training data for validation. The default of None means to just use the validation_split parameter to split the training data for validation.

test_files: List[str], optional (default=None)

The files containing the data that should be used for evaluation. The default of None means to just not perform test set evaluation.

max_training_instances: int, optional (default=None)

Upper limit on the number of training instances. If this is set, and we get more than this, we will truncate the data. Mostly useful for testing things out on small datasets before running them on large datasets.

max_validation_instances: int, optional (default=None)

Upper limit on the number of validation instances, analogous to max_training_instances.

max_test_instances: int, optional (default=None)

Upper limit on the number of test instances, analogous to max_training_instances.

train_steps_per_epoch: int, optional (default=None)

If create_data_arrays() returns a generator instead of actual arrays, how many steps should we run from this generator before declaring an “epoch” finished? The default here is reasonable - if this is None, we will set it from the data.
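One plausible way to set this value "from the data" is to use one step per batch, so that an epoch covers every instance once. The sketch below shows that arithmetic; the function name is hypothetical and the formula is an assumption, not a description of what the library computes.

```python
import math


def default_steps_per_epoch(num_instances, batch_size):
    """Assumed default: enough steps to see every instance once per epoch."""
    return math.ceil(num_instances / batch_size)


steps = default_steps_per_epoch(num_instances=1000, batch_size=32)
```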

validation_steps: int, optional (default=None)

Like train_steps_per_epoch, but for validation data.

test_steps: int, optional (default=None)

Like train_steps_per_epoch, but for test data.

save_models: bool, optional (default=True)

Should we save the models that we train? If this is True, you are required to also set the model_serialization_prefix parameter, or the code will crash.

model_serialization_prefix: str, optional (default=None)

Prefix for saving and loading model files. Must be set if save_models is True.
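The dependency between save_models and model_serialization_prefix can be checked up front. This is a minimal sketch of such a check, assuming the parameters arrive as a plain dict; the function name, the error message and the example prefix are hypothetical.

```python
def check_serialization_params(params):
    """Fail early if models should be saved but no prefix was given."""
    save_models = params.get("save_models", True)
    prefix = params.get("model_serialization_prefix", None)
    if save_models and prefix is None:
        raise ValueError(
            "model_serialization_prefix must be set when save_models is True")
    return save_models, prefix


# "models/my_model" is an illustrative prefix, not a library default.
ok = check_serialization_params({"model_serialization_prefix": "models/my_model"})
```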

num_gpus: int, optional (default=1)

Number of GPUs to use. In DeepQa we use Data Parallelism, meaning that we create copies of the full model for each GPU, allowing the batch size of your model to be scaled depending on the number of GPUs. Note that using multiple GPUs effectively multiplies your batch size by the number of GPUs you have, meaning that other code which depends on the batch size will be affected. For example, if you are using dynamic padding, the batches will be larger and hence more padded, as the dataset is chunked into fewer overall batches.
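The effect on batch size and batch count can be worked through with some simple arithmetic. The helper names below are hypothetical, and the sketch assumes the effective batch size is exactly batch_size times num_gpus, as the note above describes.

```python
import math


def effective_batch_size(batch_size, num_gpus):
    """Each GPU gets its own copy of the model and a batch of batch_size."""
    return batch_size * num_gpus


def num_batches(num_instances, batch_size, num_gpus):
    """Number of chunks the dataset is split into at the effective batch size."""
    return math.ceil(num_instances / effective_batch_size(batch_size, num_gpus))


single_gpu_batches = num_batches(10000, batch_size=32, num_gpus=1)
four_gpu_batches = num_batches(10000, batch_size=32, num_gpus=4)
```

With four GPUs the same dataset is chunked into far fewer, larger batches, which is why padding-sensitive code sees a difference.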

batch_size: int, optional (default=32)

Batch size to use when training.

num_epochs: int, optional (default=20)

Number of training epochs.

validation_split: float, optional (default=0.1)

Amount of training data to use for validation. If validation_files is not set, we will split the training data into train/dev, using this proportion as dev. If validation_files is set, this parameter gets ignored.
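To make the proportion concrete, here is a small sketch of a train/dev split. It assumes the dev portion is taken from the end of the data, which is how Keras handles its own validation_split argument; the function name is hypothetical.

```python
def split_train_dev(instances, validation_split=0.1):
    """Hold out the last validation_split fraction of instances as dev data."""
    split_at = int(len(instances) * (1.0 - validation_split))
    return instances[:split_at], instances[split_at:]


train, dev = split_train_dev(list(range(10)), validation_split=0.1)
```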

optimizer: str or Dict[str, Any], optional (default=’adam’)

If this is a str, it must correspond to an optimizer available in Keras (see the list of optimizers in the Keras documentation). If it is a dictionary, it must contain a “type” key, with a value that is one of the optimizers in that list. The remaining parameters in the dict are passed as kwargs to the optimizer’s constructor.
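The two accepted forms can be normalised into an optimizer name plus constructor kwargs. The sketch below shows one way to do that; split_optimizer_spec is a hypothetical helper, and the "lr" key is only an illustrative kwarg for the optimizer's constructor.

```python
def split_optimizer_spec(optimizer):
    """Return (optimizer_type, kwargs) for either a str or a dict spec."""
    if isinstance(optimizer, str):
        return optimizer, {}
    kwargs = dict(optimizer)  # copy, so the caller's dict is left untouched
    optimizer_type = kwargs.pop("type")  # the "type" key is required
    return optimizer_type, kwargs


from_string = split_optimizer_spec("adam")
from_dict = split_optimizer_spec({"type": "sgd", "lr": 0.01})
```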

loss: str, optional (default=’categorical_crossentropy’)

The loss function to pass to Keras. This is currently limited to only loss functions that are available as strings in Keras. If you want to use a custom loss function, simply override self.loss in the constructor of your model, after the call to super().__init__.

metrics: List[str], optional (default=[‘accuracy’])

The metrics to evaluate and print after each epoch of training. This is currently limited to only metrics that are available as strings in Keras. If you want to use a custom metric, simply override self.metrics in the constructor of your model, after the call to super().__init__.

validation_metric: str, optional (default=’val_acc’)

Metric to monitor on the validation data for things like early stopping and saving the best model.

patience: int, optional (default=1)

Number of epochs to be patient before early stopping. I.e., if the validation_metric does not improve for this many epochs, we will stop training.
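The patience rule can be sketched as a counter of consecutive epochs without improvement. This is a simplified illustration, not the Keras callback itself; the function name is hypothetical, and it assumes a higher metric value is better unless stated otherwise.

```python
def early_stopping_epoch(metric_values, patience=1, higher_is_better=True):
    """Return the epoch at which training would stop, or None if it never does."""
    best = None
    epochs_without_improvement = 0
    for epoch, value in enumerate(metric_values):
        if best is None:
            improved = True
        elif higher_is_better:
            improved = value > best
        else:
            improved = value < best
        if improved:
            best = value
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


val_acc = [0.50, 0.60, 0.58, 0.61, 0.60, 0.59]
stop_with_patience_1 = early_stopping_epoch(val_acc, patience=1)
stop_with_patience_2 = early_stopping_epoch(val_acc, patience=2)
```

With patience=1 the single dip at epoch 2 ends training, while patience=2 lets training continue until two consecutive epochs fail to beat the best value.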

fit_kwargs: Dict[str, Any], optional (default={})

A dict of additional arguments to Keras’ model.fit() method, in case you want to set something that we don’t already have options for. These get added to the options already captured by other arguments.

tensorboard_log: str, optional (default=None)

If set, we will output tensorboard log information here.

tensorboard_histogram_freq: int, optional (default=0)

Tensorboard histogram frequency: note that activating the tensorboard histogram (frequency > 0) can drastically increase model training time. Please set the frequency with your desired runtime in mind.

debug: Dict[str, Any], optional (default={})

This should be a dict, containing the following keys:

  • “layer_names”, which has as a value a list of names that must match layer names in the model built by this Trainer.
  • “data”, which has as a value either “training”, “validation”, or a list of file names. If you give “training” or “validation”, we’ll use those datasets, otherwise we’ll load data from the provided files. Note that currently “validation” only works if you provide validation files, not if you’re just using Keras to split the training data.
  • “masks”, an optional key that functions identically to “layer_names”, except we output the mask at each layer given here.
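A debug dict following the three keys above might look like the sketch below. The layer names are hypothetical and would need to match the layers in your own model.

```python
debug_params = {
    # Hypothetical layer names; they must match layer names in your model.
    "layer_names": ["sentence_encoder", "final_softmax"],
    # Either "training", "validation", or a list of file names.
    "data": "training",
    # Optional: layers whose masks should be output as well.
    "masks": ["sentence_encoder"],
}
```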

show_summary_with_masking_info: bool, optional (default=False)

This is a debugging setting, mostly - we have written a custom model.summary() method that supports showing masking info, to help understand what’s going on with the masks.

Public methods

Trainer.evaluate_model(data_files: typing.List[str], max_instances: int = None)[source]
Trainer.load_data_arrays(data_files: typing.List[str], batch_size: int = None, max_instances: int = None) → typing.Tuple[Dataset, numpy.array, numpy.array][source]

Loads a Dataset from a list of files, then converts it into numpy arrays for both inputs and outputs, returning all three of these to you. This literally just calls self.load_dataset_from_files, then self.create_data_arrays; it’s just a convenience method if you want to do both of these at the same time, and also lets you truncate the dataset if you want.

Note that if you have any kind of state in your model that depends on a training dataset (e.g., a vocabulary, or padding dimensions) those must be set prior to calling this method.


data_files: List[str]

The files to load. These will get passed to self.load_dataset_from_files(), which subclasses must implement.

batch_size: int, optional (default = None)

Optionally pass a specific batch size to load the data arrays with. If this is not specified, we use the default self.batch_size attribute. This is a parameter so you can specify different batch sizes for training vs validation, for instance, which is useful if you are doing multi-gpu training.

max_instances: int, optional (default=None)

If not None, we will restrict the dataset to only this many instances. This is mostly useful for testing models out on subsets of your data.


dataset: Dataset

A Dataset object containing the instances read from the data files

input_arrays: numpy.array

An array or tuple of arrays suitable to be passed as inputs x to Keras’ model.fit(x, y), model.evaluate(x, y) or model.predict(x) methods

label_arrays: numpy.array

An array or tuple of arrays suitable to be passed as outputs y to Keras’ model.fit(x, y) or model.evaluate(x, y) methods

Trainer.load_model(epoch: int = None)[source]

Loads a serialized model, using the model_serialization_prefix that was passed to the constructor. If epoch is not None, we try to load the model from that epoch. If epoch is not given, we load the best saved model.


Trains the model.

All training parameters have already been passed to the constructor, so we need no arguments to this method.

Abstract methods

If you’re doing NLP, TextTrainer implements most of these, so you shouldn’t have to worry about them. The only one it doesn’t is _build_model (though it adds some other abstract methods that you might have to worry about).

Trainer.create_data_arrays(dataset: Dataset, batch_size: int = None) → typing.Tuple[numpy.array, numpy.array][source]

Takes a raw dataset and converts it into training inputs and labels that can be used to either train a model or make predictions. Depending on parameters passed to the constructor of this Trainer, this could either return two actual array objects, or a single generator that generates batches of two array objects.


dataset: Dataset

A Dataset of the same format as read by load_dataset_from_files() (we will call this directly with the output from that method, in fact)

batch_size: int, optional (default = None)

The batch size with which the dataset should be created. If this is None, the default self.batch_size will be used.


input_arrays: numpy.array or Tuple[numpy.array]

label_arrays: numpy.array, Tuple[numpy.array], or None

generator: a Python generator returning Tuple[input_arrays, label_arrays]

If this is returned, it is the only return value. We either return a Tuple[input_arrays, label_arrays], or this generator.
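The two possible return shapes can be illustrated with plain Python lists standing in for numpy arrays. The function name and the use_generator flag are hypothetical; the endless loop reflects the Keras convention that data generators are expected to yield batches indefinitely.

```python
def create_arrays_or_generator(inputs, labels, batch_size, use_generator):
    """Return (inputs, labels), or a single generator of (inputs, labels) batches."""
    if not use_generator:
        return inputs, labels

    def batch_generator():
        while True:  # loop forever; the caller decides how many steps to take
            for start in range(0, len(inputs), batch_size):
                end = start + batch_size
                yield inputs[start:end], labels[start:end]

    return batch_generator()


inputs = [10, 11, 12, 13, 14]
labels = [0, 1, 0, 1, 0]
arrays = create_arrays_or_generator(inputs, labels, 2, use_generator=False)
generator = create_arrays_or_generator(inputs, labels, 2, use_generator=True)
first_batch = next(generator)
```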

Trainer.load_dataset_from_files(files: typing.List[str]) → Dataset[source]

Given a list of file inputs, load a raw dataset from the files. This is a list because some datasets are specified in more than one file (e.g., a file containing the instances, and a file containing background information about those instances).

Trainer.score_dataset(dataset: Dataset) → typing.Tuple[numpy.array, numpy.array][source]

Takes a Dataset, indexes it, and returns the output of evaluating the model on all instances, and labels for the instances from the data, if they were given. The specifics of the numpy array that are returned depend on the model and the instance type in the dataset.


dataset: Dataset

A Dataset read by load_dataset_from_files().


predictions: numpy.array

Predictions for each Instance in the Dataset. This could actually be a tuple/list of arrays, if your model has multiple outputs

labels: numpy.array

The labels for each Instance in the Dataset, if there were any (this will be None if there were no labels). We return this so you can easily compute metrics over these predictions if you wish. It’s hard to get numpy arrays with the labels from a non-indexed-and-padded Dataset, so we return it here so you don’t have to do any funny business to get the label array.
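As an example of computing a metric over the returned values, the sketch below calculates accuracy. It uses nested Python lists in place of numpy arrays and assumes each prediction is a row of per-class scores and each label is one-hot; both are assumptions about your model, and the function name is hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of instances whose highest-scoring class matches the label."""

    def argmax(row):
        return max(range(len(row)), key=row.__getitem__)

    correct = sum(argmax(p) == argmax(l) for p, l in zip(predictions, labels))
    return correct / len(predictions)


predictions = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
labels = [[1, 0], [0, 1], [0, 1]]
score = accuracy(predictions, labels)
```

Remember that labels will be None when the data had no labels, so check for that before computing anything.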


Given a raw Dataset object, set whatever model state is necessary. The most obvious use case for this is for computing a vocabulary in TextTrainer. Note that this is not an IndexedDataset, and you should not make it one. Use set_model_state_from_indexed_dataset() for setting state that depends on the data having already been indexed; otherwise you’ll duplicate the work of doing the indexing.


Given an IndexedDataset, set whatever model state is necessary. This is typically stuff around padding.

Trainer._build_model() → DeepQaModel[source]

Constructs and returns a DeepQaModel (which is a wrapper around a Keras Model) that will take the output of self._get_training_data as input, and produce as output a true/false decision for each input. Note that in the multiple gpu case, this function will be called multiple times for the different GPUs. As such, you should be wary of this function having side effects unrelated to building a computation graph.

The returned model will be used to call Keras’ fit method with the training inputs and train_labels.


Called after a model is loaded, this lets you update member variables that contain model parameters, like max sentence length, that are not stored as weights in the model object. This is necessary if you want to process a new data instance to be compatible with the model for prediction, for instance.

Trainer._dataset_indexing_kwargs() → typing.Dict[str, typing.Any][source]

In order to index a dataset, we may need some parameters (e.g., an object that stores the vocabulary of your model, in order to convert words into indices). You can pass those here, or return an empty dictionary if there’s nothing. These will get passed to Dataset.to_indexed_dataset().

Protected methods


Returns a set of Callbacks which are used to perform various functions within Keras’ .fit method. Here, we use an early stopping callback to add patience with respect to the validation metric and a Lambda callback which performs the model specific callbacks which you might want to build into a model, such as re-encoding some background knowledge.

Additionally, there is also functionality to create Tensorboard log files. These can be visualised using ‘tensorboard --logdir /path/to/log/files’ after training.

classmethod Trainer._get_custom_objects()[source]

If you’ve used any Layers that Keras doesn’t know about, you need to specify them in this dictionary, so we can load them correctly.
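A subclass would typically extend the dictionary returned by its parent with its own layers. The sketch below uses stand-in classes so that it runs without Keras; SketchTrainer, MyTrainer and MyAttentionLayer are hypothetical names, and the name-to-class mapping follows the custom_objects convention Keras uses when loading models.

```python
class MyAttentionLayer:
    """Hypothetical custom layer; in practice this would subclass a Keras Layer."""


class SketchTrainer:
    """Stand-in for Trainer; NOT the real deep_qa class."""

    @classmethod
    def _get_custom_objects(cls):
        return {}


class MyTrainer(SketchTrainer):
    @classmethod
    def _get_custom_objects(cls):
        # Start from the parent's objects, then register our own layer by name.
        custom_objects = super()._get_custom_objects()
        custom_objects["MyAttentionLayer"] = MyAttentionLayer
        return custom_objects


custom_objects = MyTrainer._get_custom_objects()
```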

Trainer._instance_debug_output(instance: Instance, outputs: typing.Dict[str, numpy.array]) → str[source]

This method takes an Instance and all of the debug outputs for that Instance, puts them into some human-readable format, and returns that as a string. outputs will have one key corresponding to each item in the debug.layer_names parameter given to the constructor of this object.

The default here is pass instead of raise NotImplementedError, because you’re not required to implement debugging for your model.
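If you do implement it, the method only needs to turn one instance and its per-layer outputs into a readable string. Here is a minimal sketch as a free function, with the instance represented by its text and the outputs by plain lists; the layout of the string and the layer names are entirely hypothetical.

```python
def instance_debug_output(instance_text, outputs):
    """Format one instance and its debug outputs, one layer per line."""
    lines = ["Instance: %s" % instance_text]
    for layer_name in sorted(outputs):  # sorted for a stable, readable order
        lines.append("  %s: %s" % (layer_name, outputs[layer_name]))
    return "\n".join(lines)


# Keys mirror the hypothetical layer names given in debug.layer_names.
text = instance_debug_output(
    "is the sky blue?",
    {"final_softmax": [0.9, 0.1], "encoder": [0.3, 0.7]},
)
```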


Called during model loading. If you have some auxiliary pickled object, such as an object storing the vocabulary of your model, you can load it here.

Trainer._output_debug_info(output_dict: typing.Dict[str, numpy.array], epoch: int)[source]
Trainer._overall_debug_output(output_dict: typing.Dict[str, numpy.array]) → str[source]
Trainer._post_epoch_hook(epoch: int)[source]

This method gets called directly after each epoch of training, before making any early stopping decisions. If you want to modify anything after each iteration (e.g., computing a different kind of validation loss to use for early stopping, or just computing and printing accuracy on some other held out data), you can do that here. If you require extra parameters, use calls to local methods rather than passing new parameters, as this hook is run via a Keras Callback, which is fairly strict in its interface.

Trainer._pre_epoch_hook(epoch: int)[source]

This method gets called before each epoch of training. If you want to do any kind of processing in between epochs (e.g., updating the training data for whatever reason), here is your chance to do so.
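The position of the per-epoch hooks in the training loop can be sketched with a stand-in class. SketchTrainer and ResamplingTrainer are hypothetical, and the loop below is a simplification: in the real class the hooks are run through Keras callbacks rather than an explicit loop.

```python
class SketchTrainer:
    """Stand-in for Trainer that only models the per-epoch hooks."""

    def __init__(self, num_epochs):
        self.num_epochs = num_epochs
        self.events = []

    def train(self):
        for epoch in range(self.num_epochs):
            self._pre_epoch_hook(epoch)
            self.events.append("train epoch %d" % epoch)
            self._post_epoch_hook(epoch)

    def _pre_epoch_hook(self, epoch):
        pass  # default: nothing to do before an epoch

    def _post_epoch_hook(self, epoch):
        pass  # default: nothing to do after an epoch


class ResamplingTrainer(SketchTrainer):
    """Hypothetical subclass that updates its training data between epochs."""

    def _pre_epoch_hook(self, epoch):
        self.events.append("resample data before epoch %d" % epoch)


trainer = ResamplingTrainer(num_epochs=2)
trainer.train()
```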


Called after training. If you have some auxiliary object, such as an object storing the vocabulary of your model, you can save it here. The model config is saved by default.


Training models with Keras requires a different API depending on whether you produce data in batches using a generator or just provide one big numpy array with all of your data, which Keras has to split into batches. This method tells us which Keras API we should use. If your model class produces data using a generator, return True here; otherwise, return False. The default implementation just returns False.
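The choice between the two APIs can be sketched with a stub model that records which method was called. This assumes the two APIs in question are Keras’ fit and fit_generator methods; StubModel and run_training are hypothetical names, and the real methods accept many more arguments.

```python
class StubModel:
    """Records calls instead of training; stands in for a Keras model."""

    def __init__(self):
        self.calls = []

    def fit(self, x, y, **kwargs):
        self.calls.append("fit")

    def fit_generator(self, generator, steps_per_epoch, **kwargs):
        self.calls.append("fit_generator")


def run_training(model, data, uses_data_generators, steps_per_epoch=None):
    """Dispatch to the generator-based or array-based API."""
    if uses_data_generators:
        model.fit_generator(data, steps_per_epoch)
    else:
        inputs, labels = data
        model.fit(inputs, labels)


model = StubModel()
run_training(model, ([1, 2, 3], [0, 1, 0]), uses_data_generators=False)
run_training(model, iter([]), uses_data_generators=True, steps_per_epoch=10)
```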