Backend Layers¶
Layers in this module generally just implement some simple operation from the Keras backend as a Layer. The reason we have these as Layers is largely so that we can properly handle masking.
AddMask¶

class
deep_qa.layers.backend.add_mask.
AddMask
(mask_value: float = 0.0, **kwargs)[source]¶ Bases:
deep_qa.layers.masked_layer.MaskedLayer
This Layer adds a mask to a tensor. It is intended solely for testing, though if you have a use case for this outside of testing, feel free to use it. The call() method just returns the inputs, and the compute_mask method calls K.not_equal(inputs, mask_value), and that’s it. This is different from Keras’ Masking layer, which assumes higher-order input and does a K.any() call in compute_mask. Input:
 tensor: a tensor of arbitrary shape
 Output:
 the same tensor, now with a mask attached of the same shape
Parameters: mask_value: float, optional (default=0.0)
This is the value that we will compare to in compute_mask.
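As an illustration of the mask semantics, here is a minimal numpy sketch of what this layer computes, using np.not_equal as a stand-in for K.not_equal (the add_mask helper below is hypothetical, not part of the library):

```python
import numpy as np

def add_mask(inputs, mask_value=0.0):
    # call() returns the inputs unchanged; compute_mask compares the inputs
    # to mask_value, so positions equal to mask_value are masked out.
    outputs = inputs
    mask = np.not_equal(inputs, mask_value)
    return outputs, mask

tensor = np.array([[1.0, 0.0, 2.0],
                   [0.0, 0.0, 3.0]])
out, mask = add_mask(tensor)
# mask is [[True, False, True], [False, False, True]]
```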
compute_mask
(inputs, mask=None)[source]¶ Computes an output mask tensor.
 # Arguments
 inputs: Tensor or list of tensors. mask: Tensor or list of tensors.
 # Returns
 None or a tensor (or list of tensors,
 one per output tensor of the layer).

compute_output_shape
(input_shape)[source]¶ Computes the output shape of the layer.
Assumes that the layer will be built to match that input shape provided.
 # Arguments
 input_shape: Shape tuple (tuple of integers)
 or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
 # Returns
 An input shape tuple.

get_config
()[source]¶ Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Container (one layer of abstraction above).
 # Returns
 Python dictionary.
BatchDot¶

class
deep_qa.layers.backend.batch_dot.
BatchDot
(**kwargs)[source]¶ Bases:
deep_qa.layers.masked_layer.MaskedLayer
This Layer calls K.batch_dot() on two inputs, tensor_a and tensor_b. This function will work for tensors of arbitrary size as long as abs(K.ndim(tensor_a) - K.ndim(tensor_b)) < 2, due to limitations in K.batch_dot(). When the input tensors have more than three dimensions, they must have the same shape, except for the last two dimensions. See the examples for more explanation of what this means.
We always assume the dimension to perform the dot is the last one, and that the masks have one fewer dimension than the tensors. Note that this layer does not return zeroes in places that are masked, but does pass a correct mask forward. If this then gets fed into masked_softmax, for instance, your tensor will be correctly normalized. Inputs:
 tensor_a: tensor with ndim >= 2.
 tensor_b: tensor with ndim >= 2.
 Output:
 a_dot_b
Examples
The following examples will try to give some insight on how this layer works in relation to K.batch_dot(). Note that the Keras documentation (as of 2/13/17) on K.batch_dot is incorrect, and that this layer behaves differently from the documented behavior.
As a first example, let’s suppose that tensor_a and tensor_b have the same number of dimensions. Let the shape of tensor_a be (2, 3, 2), and let the shape of tensor_b be (2, 4, 2). The mask accompanying these inputs always has one less dimension, so tensor_a_mask has shape (2, 3) and tensor_b_mask has shape (2, 4). The shape of the batch_dot output would thus be (2, 3, 4). This is because we are taking the batch dot of the last dimension, so the output shape is (2, 3) (from tensor_a) with (4) (from tensor_b) appended on (to get (2, 3, 4) in total). The output mask has the same shape as the output, and is thus (2, 3, 4) as well.

>>> import keras.backend as K
>>> tensor_a = K.ones(shape=(2, 3, 2))
>>> tensor_b = K.ones(shape=(2, 4, 2))
>>> K.eval(K.batch_dot(tensor_a, tensor_b, axes=(2, 2))).shape
(2, 3, 4)
Next, let’s look at an example where tensor_a and tensor_b are “uneven” (different number of dimensions). Let the shape of tensor_a be (2, 4, 2), and let the shape of tensor_b be (2, 4, 3, 2). The mask accompanying these inputs always has one less dimension, so tensor_a_mask has shape (2, 4) and tensor_b_mask has shape (2, 4, 3). The shape of the batch_dot output would thus be (2, 4, 3). In the case of uneven tensors, we always expand the last dimension of the smaller tensor to make them even. Thus in this case, we expand tensor_a to get a new shape of (2, 4, 2, 1). Now we are taking the batch_dot of a tensor with shape (2, 4, 2, 1) and (2, 4, 3, 2). Note that the first two dimensions of these tensors are the same (2, 4) – this is a requirement imposed by K.batch_dot. Following the methodology of calculating the output shape above, we get that the output is (2, 4, 1, 3) since we get (2, 4, 1) from tensor_a and (3) from tensor_b. We then squeeze the tensor to remove the 1-dimension to get a final shape of (2, 4, 3). Note that the mask has the same shape.

>>> import keras.backend as K
>>> tensor_a = K.ones(shape=(2, 4, 2))
>>> tensor_b = K.ones(shape=(2, 4, 3, 2))
>>> tensor_a_expanded = K.expand_dims(tensor_a, axis=-1)
>>> unsqueezed_bd = K.batch_dot(tensor_a_expanded, tensor_b, axes=(2, 3))
>>> final_bd = K.squeeze(unsqueezed_bd, axis=K.ndim(tensor_a) - 1)
>>> K.eval(final_bd).shape
(2, 4, 3)
Lastly, let’s look at the uneven case where tensor_a has more dimensions than tensor_b. Let the shape of tensor_a be (2, 3, 4, 2), and let the shape of tensor_b be (2, 3, 2). Since the mask accompanying these inputs always has one less dimension, tensor_a_mask has shape (2, 3, 4) and tensor_b_mask has shape (2, 3). The shape of the batch_dot output would thus be (2, 3, 4). Since these tensors are uneven, we expand the smaller tensor, tensor_b, to get a new shape of (2, 3, 2, 1). Now we are taking the batch_dot of a tensor with shape (2, 3, 4, 2) and (2, 3, 2, 1). Note again that the first two dimensions of these tensors are the same (2, 3). We can see that the output shape is (2, 3, 4, 1) since we get (2, 3, 4) from tensor_a and (1) from tensor_b. We then squeeze the tensor to remove the 1-dimension to get a final shape of (2, 3, 4). Note that the mask has the same shape.

>>> import keras.backend as K
>>> tensor_a = K.ones(shape=(2, 3, 4, 2))
>>> tensor_b = K.ones(shape=(2, 3, 2))
>>> tensor_b_expanded = K.expand_dims(tensor_b, axis=-1)
>>> unsqueezed_bd = K.batch_dot(tensor_a, tensor_b_expanded, axes=(3, 2))
>>> final_bd = K.squeeze(unsqueezed_bd, axis=K.ndim(tensor_a) - 1)
>>> K.eval(final_bd).shape
(2, 3, 4)

compute_mask
(inputs, mask=None)[source]¶ Computes an output mask tensor.
 # Arguments
 inputs: Tensor or list of tensors. mask: Tensor or list of tensors.
 # Returns
 None or a tensor (or list of tensors,
 one per output tensor of the layer).

compute_output_shape
(input_shape)[source]¶ Computes the output shape of the layer.
Assumes that the layer will be built to match that input shape provided.
 # Arguments
 input_shape: Shape tuple (tuple of integers)
 or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
 # Returns
 An input shape tuple.
CollapseToBatch¶

class
deep_qa.layers.backend.collapse_to_batch.
CollapseToBatch
(num_to_collapse: int, **kwargs)[source]¶ Bases:
deep_qa.layers.masked_layer.MaskedLayer
Reshapes a higher order tensor, taking the first num_to_collapse dimensions after the batch dimension and folding them into the batch dimension. For example, a tensor of shape (2, 4, 5, 3), collapsed with num_to_collapse = 2, would become a tensor of shape (40, 3). We perform identical computation on the input mask, if there is one.
This is essentially what Keras’ TimeDistributed layer does (and then undoes) to apply a layer to a higher-order tensor, and that’s the intended use for this layer. However, TimeDistributed cannot handle distributing across dimensions with unknown lengths at graph compilation time. This layer works even in that case. So, if your actual tensor shape at graph compilation time looks like (None, None, None, 3), or (None, 4, None, 3), you can still use this layer (and ExpandFromBatch) to get the same result as TimeDistributed. If your shapes are fully known at graph compilation time, just use TimeDistributed, as it’s a nicer API for the same functionality. Inputs:
 tensor with ndim >= 3
 Output:
 tensor with ndim = input_ndim - num_to_collapse, with the removed dimensions folded into the first (batch size) dimension
Parameters: num_to_collapse: int
The number of dimensions to fold into the batch size.
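The reshape itself is straightforward; this numpy sketch (with a hypothetical collapse_to_batch helper, assuming fully known shapes, unlike the real layer) shows the shape arithmetic the layer performs:

```python
import numpy as np

def collapse_to_batch(tensor, num_to_collapse):
    # Fold the first num_to_collapse dimensions after the batch dimension
    # into the batch dimension; -1 lets numpy infer the new batch size.
    new_shape = (-1,) + tensor.shape[1 + num_to_collapse:]
    return tensor.reshape(new_shape)

tensor = np.zeros((2, 4, 5, 3))
collapsed = collapse_to_batch(tensor, num_to_collapse=2)
# collapsed.shape is (40, 3)
```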

compute_mask
(inputs, mask=None)[source]¶ Computes an output mask tensor.
 # Arguments
 inputs: Tensor or list of tensors. mask: Tensor or list of tensors.
 # Returns
 None or a tensor (or list of tensors,
 one per output tensor of the layer).

compute_output_shape
(input_shape)[source]¶ Computes the output shape of the layer.
Assumes that the layer will be built to match that input shape provided.
 # Arguments
 input_shape: Shape tuple (tuple of integers)
 or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
 # Returns
 An input shape tuple.

get_config
()[source]¶ Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Container (one layer of abstraction above).
 # Returns
 Python dictionary.
ExpandFromBatch¶

class
deep_qa.layers.backend.expand_from_batch.
ExpandFromBatch
(num_to_expand: int, **kwargs)[source]¶ Bases:
deep_qa.layers.masked_layer.MaskedLayer
Reshapes a collapsed tensor, taking the batch size and separating it into num_to_expand dimensions, following the shape of a second input tensor. This is meant to be used in conjunction with CollapseToBatch, to achieve the same effect as Keras’ TimeDistributed layer, but for shapes that are not fully specified at graph compilation time.
For example, say you had an original tensor of shape (None (2), 4, None (5), 3), then collapsed it with CollapseToBatch(2)(tensor) to get a tensor with shape (None (40), 3) (here I’m using None (x) to denote a dimension with unknown length at graph compilation time, where x is the actual runtime length). You can then call ExpandFromBatch(2)(collapsed, tensor) with the result to expand the first two dimensions out of the batch again (presumably after you’ve done some computation when it was collapsed). Inputs:
 a tensor that has been collapsed with CollapseToBatch(num_to_expand).
 the original tensor that was used as input to CollapseToBatch (or one with identical shape in the collapsed dimensions). We will use this input only to get its shape.
 Output:
 tensor with ndim = input_ndim + num_to_expand, with the additional dimensions coming immediately after the first (batch size) dimension.
Parameters: num_to_expand: int
The number of dimensions to expand from the batch size.
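A numpy sketch of the round trip with CollapseToBatch (hypothetical helper, assuming fully known shapes; the real layer reads the shape of the second input from the graph at runtime):

```python
import numpy as np

def expand_from_batch(collapsed, original, num_to_expand):
    # Recover the leading dimensions from the original tensor's shape,
    # keeping the trailing dimensions of the collapsed tensor.
    new_shape = original.shape[:1 + num_to_expand] + collapsed.shape[1:]
    return collapsed.reshape(new_shape)

original = np.arange(2 * 4 * 5 * 3).reshape(2, 4, 5, 3)
collapsed = original.reshape((-1,) + original.shape[3:])   # shape (40, 3)
restored = expand_from_batch(collapsed, original, num_to_expand=2)
# restored.shape is (2, 4, 5, 3), and restored equals original
```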

compute_mask
(inputs, mask=None)[source]¶ Computes an output mask tensor.
 # Arguments
 inputs: Tensor or list of tensors. mask: Tensor or list of tensors.
 # Returns
 None or a tensor (or list of tensors,
 one per output tensor of the layer).

compute_output_shape
(input_shape)[source]¶ Computes the output shape of the layer.
Assumes that the layer will be built to match that input shape provided.
 # Arguments
 input_shape: Shape tuple (tuple of integers)
 or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
 # Returns
 An input shape tuple.

get_config
()[source]¶ Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Container (one layer of abstraction above).
 # Returns
 Python dictionary.
Envelope¶

class
deep_qa.layers.backend.envelope.
Envelope
(**kwargs)[source]¶ Bases:
deep_qa.layers.masked_layer.MaskedLayer
Given a probability distribution over a begin index and an end index of some sequence, this Layer computes an envelope over the sequence, a probability that each element lies within “begin” and “end”.
Specifically, the computation done here is the following:

after_span_begin = K.cumsum(span_begin, axis=1)
after_span_end = K.cumsum(span_end, axis=1)
before_span_end = 1 - after_span_end
envelope = after_span_begin * before_span_end
 Inputs:
 span_begin: tensor with shape (batch_size, sequence_length), representing a probability distribution over a start index in the sequence
 span_end: tensor with shape (batch_size, sequence_length), representing a probability distribution over an end index in the sequence
 Outputs:
 envelope: tensor with shape (batch_size, sequence_length), representing a probability for each index of the sequence belonging in the span
If there is a mask associated with either of the inputs, we ignore it, assuming that you used the mask correctly when you computed your probability distributions. But we support masking in this layer, so that you have an output mask if you really need it. We just return the first mask that is not None (or None, if both are None).
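To make the computation concrete, here is a numpy version of the computation above, run on point-mass distributions (begin at index 1, end at index 3):

```python
import numpy as np

def envelope(span_begin, span_end):
    # Same computation as the layer, with np.cumsum in place of K.cumsum.
    after_span_begin = np.cumsum(span_begin, axis=1)
    after_span_end = np.cumsum(span_end, axis=1)
    before_span_end = 1 - after_span_end
    return after_span_begin * before_span_end

span_begin = np.array([[0.0, 1.0, 0.0, 0.0]])  # begin at index 1
span_end = np.array([[0.0, 0.0, 0.0, 1.0]])    # end at index 3
env = envelope(span_begin, span_end)
# env is [[0., 1., 1., 0.]]: indices 1 and 2 fall inside the span
```

Note that with point masses the end index itself gets probability zero, because before_span_end counts probability mass strictly before the end index.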
compute_mask
(inputs, mask=None)[source]¶ Computes an output mask tensor.
 # Arguments
 inputs: Tensor or list of tensors. mask: Tensor or list of tensors.
 # Returns
 None or a tensor (or list of tensors,
 one per output tensor of the layer).

compute_output_shape
(input_shape)[source]¶ Computes the output shape of the layer.
Assumes that the layer will be built to match that input shape provided.
 # Arguments
 input_shape: Shape tuple (tuple of integers)
 or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
 # Returns
 An input shape tuple.
Max¶

class
deep_qa.layers.backend.max.
Max
(axis: int = 1, **kwargs)[source]¶ Bases:
deep_qa.layers.masked_layer.MaskedLayer
This Layer performs a max over some dimension. Keras has a similar layer called GlobalMaxPooling1D, but it is not as configurable as this one, and it does not support masking.
If the mask is not None, it must be the same shape as the input. Input:
 A tensor of arbitrary shape (having at least 3 dimensions).
 Output:
 A tensor with one less dimension, where we have taken a max over one of the dimensions.
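One common way to implement a masked max, shown here as a numpy sketch (this mirrors the idea, not necessarily the exact DeepQA implementation), is to push masked positions to a very negative value before reducing:

```python
import numpy as np

def masked_max(tensor, mask, axis=-1):
    # Masked positions are replaced with a very negative number so they
    # can never win the max.
    masked = np.where(mask, tensor, -1e20)
    return masked.max(axis=axis)

tensor = np.array([[1.0, 5.0, 3.0]])
mask = np.array([[True, False, True]])
result = masked_max(tensor, mask)
# result is [3.0]: the masked-out 5.0 is ignored
```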

compute_mask
(inputs, mask=None)[source]¶ Computes an output mask tensor.
 # Arguments
 inputs: Tensor or list of tensors. mask: Tensor or list of tensors.
 # Returns
 None or a tensor (or list of tensors,
 one per output tensor of the layer).

compute_output_shape
(input_shape)[source]¶ Computes the output shape of the layer.
Assumes that the layer will be built to match that input shape provided.
 # Arguments
 input_shape: Shape tuple (tuple of integers)
 or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
 # Returns
 An input shape tuple.

get_config
()[source]¶ Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Container (one layer of abstraction above).
 # Returns
 Python dictionary.
Permute¶

class
deep_qa.layers.backend.permute.
Permute
(pattern: typing.Tuple[int], **kwargs)[source]¶ Bases:
deep_qa.layers.masked_layer.MaskedLayer
This Layer calls K.permute_dimensions on both the input and the mask.
If the mask is not None, it must have the same shape as the input. Input:
 A tensor of arbitrary shape.
 Output:
 A tensor with permuted dimensions.

compute_mask
(inputs, mask=None)[source]¶ Computes an output mask tensor.
 # Arguments
 inputs: Tensor or list of tensors. mask: Tensor or list of tensors.
 # Returns
 None or a tensor (or list of tensors,
 one per output tensor of the layer).

compute_output_shape
(input_shape)[source]¶ Computes the output shape of the layer.
Assumes that the layer will be built to match that input shape provided.
 # Arguments
 input_shape: Shape tuple (tuple of integers)
 or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
 # Returns
 An input shape tuple.
Repeat¶

class
deep_qa.layers.backend.repeat.
Repeat
(axis: int, repetitions: int, **kwargs)[source]¶ Bases:
deep_qa.layers.masked_layer.MaskedLayer
This Layer calls K.repeat_elements on both the input and the mask, after calling K.expand_dims.
If the mask is not None, we must be able to call K.expand_dims using the same axis parameter as we do for the input. Input:
 A tensor of arbitrary shape.
 Output:
 The input tensor repeated along one of the dimensions.
Parameters: axis: int
We will add a dimension to the input tensor at this axis.
repetitions: int
The new dimension will have this size, with each slice being identical to the original input tensor.
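In numpy terms, the layer's effect on the input tensor looks like this (hypothetical helper; np.expand_dims and np.repeat stand in for K.expand_dims and K.repeat_elements):

```python
import numpy as np

def repeat(tensor, axis, repetitions):
    # Insert a new dimension at `axis`, then tile it `repetitions` times.
    expanded = np.expand_dims(tensor, axis=axis)
    return np.repeat(expanded, repetitions, axis=axis)

tensor = np.zeros((2, 3))
repeated = repeat(tensor, axis=1, repetitions=4)
# repeated.shape is (2, 4, 3); every slice along axis 1 equals `tensor`
```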

compute_mask
(inputs, mask=None)[source]¶ Computes an output mask tensor.
 # Arguments
 inputs: Tensor or list of tensors. mask: Tensor or list of tensors.
 # Returns
 None or a tensor (or list of tensors,
 one per output tensor of the layer).

compute_output_shape
(input_shape)[source]¶ Computes the output shape of the layer.
Assumes that the layer will be built to match that input shape provided.
 # Arguments
 input_shape: Shape tuple (tuple of integers)
 or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
 # Returns
 An input shape tuple.

get_config
()[source]¶ Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Container (one layer of abstraction above).
 # Returns
 Python dictionary.
RepeatLike¶

class
deep_qa.layers.backend.repeat_like.
RepeatLike
(axis: int, copy_from_axis: int, **kwargs)[source]¶ Bases:
deep_qa.layers.masked_layer.MaskedLayer
This Layer is like Repeat, but gets the number of repetitions to use from a second input tensor. This allows doing a number of repetitions that is unknown at graph compilation time, and is necessary when the repetitions argument to Repeat would be None.
If the mask is not None, we must be able to call K.expand_dims using the same axis parameter as we do for the input. Input:
 A tensor of arbitrary shape, which we will expand and tile.
 A second tensor whose shape along one dimension we will copy.
 Output:
 The input tensor repeated along one of the dimensions.
Parameters: axis: int
We will add a dimension to the input tensor at this axis.
copy_from_axis: int
We will copy the dimension from the second tensor at this axis.
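The only difference from Repeat is where the repetition count comes from, as this numpy sketch (hypothetical helper) shows:

```python
import numpy as np

def repeat_like(tensor, to_copy, axis, copy_from_axis):
    # Read the repetition count from the second tensor's shape instead of
    # taking it as a constructor argument.
    repetitions = to_copy.shape[copy_from_axis]
    expanded = np.expand_dims(tensor, axis=axis)
    return np.repeat(expanded, repetitions, axis=axis)

tensor = np.zeros((2, 3))
other = np.zeros((2, 7, 3))
repeated = repeat_like(tensor, other, axis=1, copy_from_axis=1)
# repeated.shape is (2, 7, 3)
```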

compute_mask
(inputs, mask=None)[source]¶ Computes an output mask tensor.
 # Arguments
 inputs: Tensor or list of tensors. mask: Tensor or list of tensors.
 # Returns
 None or a tensor (or list of tensors,
 one per output tensor of the layer).

compute_output_shape
(input_shape)[source]¶ Computes the output shape of the layer.
Assumes that the layer will be built to match that input shape provided.
 # Arguments
 input_shape: Shape tuple (tuple of integers)
 or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
 # Returns
 An input shape tuple.

get_config
()[source]¶ Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Container (one layer of abstraction above).
 # Returns
 Python dictionary.