# A cheat sheet for custom TensorFlow layers and models

### A cheat sheet for custom TensorFlow layers and models

How to redefine everything from numpy, and some actually useful tricks for part assignment, saving custom layers, etc.

We’re talking about creating custom TensorFlow layers and models in:

TensorFlow v2.5.0.

No wasting time with introductions, let’s go:

### Numpy mappings to TensorFlow

- Keep in mind for all operations: in TensorFlow,
**the first axis is the batch_size**. - Trick to get past the batch size when it’s inconvenient: wrap linear algebra in
`tf.map_fn`

, e.g. converting vectors of size`n`

in a batch`(batch_size x n)`

to diagonal matrices of size`(batch_size x n x n)`

:

mats = tf.map_fn(lambda vec: tf.linalg.tensor_diag(vec), vecs)

- For most operations such as
`tf.matmul`

, the batch_size is already taken care of:

The inputs must, following any transpositions, be tensors of rank >= 2 where the inner 2 dimensions specify valid matrix multiplication dimensions, and any further outer dimensions specify matching batch size.

- Sum an array:
`np.sum(x)`

→`tf.map.reduce_sum(x)`

- Dot product of two vectors:
`np.dot(x,y)`

→`tf.tensordot(x,y,1)`

- Matrix product with vector
`A.b`

:`np.dot(A,b)`

→`tf.linalg.matvec(A,b)`

- Trigonometry:
`np.sin`

→`tf.math.sin`

- Transpose
`(batch_size x n x m)`

→`(batch_size x m x n)`

:`np.transpose(x, axes=(0,2,1))`

→`tf.transpose(x, perm=[0,2,1])`

- Vector to diagonal matrix:
`np.diag(x)`

→`tf.linalg.tensor_diag(x)`

- Concatenate matrices:
`np.concatenate((A,B),axes=0)`

→`tf.concat([A,B],1)`

- Matrix flatten:
`[[A,B],[C,D]]`

into a single matrix:

tmp1 = tf.concat([A,B],2)

tmp2 = tf.concat([C,D],2)

mat = tf.concat([tmp1,tmp2],1)

- Kronecker product of a vector
`vec`

of size`n`

(makes a matrix`n x n`

):`tf.tensordot(vec,vec,axes=0)`

except the vectors usually come in a batch`vecs`

of size`(batch_size x n)`

, so we need to map:

mats = tf.map_fn(lambda vec: tf.tensordot(vec,vec,axes=0),vecs)

- Zeros:
`np.zeros(n)`

→`tf.zeros_like(n)`

. Note that to avoid the error “*Cannot convert a partially known TensorShape to a Tensor”,*you should use`tf.zeros_like`

instead of`tf.zeros`

since the former does not require the size to be known until runtime, see also here.

### Part assignment by indexes

In `numpy`

we can just write:

a = np.random.rand(3)

a[0] = 5.0

This doesn’t work in TensorFlow:

a = tf.Variable(initial_value=np.random.rand(3),dtype='float32')

a[0] = 5.0

# TypeError: 'ResourceVariable' object does not support item assignment

Instead, you can add/subtract unit vectors/matrices/tensors with the appropriate values. The `tf.one_hot`

function can be used to construct these:

e = tf.one_hot(indices=0,depth=3,dtype='float32')

# <tf.Tensor: shape=(3,), dtype=float32, numpy=array([1., 0., 0.], dtype=float32)>

and then change the value like this:

a = a + (new_val - old_val) * e

where `a`

is the TF variable, `new_val`

and `old_val`

are floats, and `e`

is the unit vector.

For example, in the previous example:

### Part assignment by indexes for matrices, etc.

Same as the vector example, but to construct the unit matrices you can use this helper function:

which you can adapt to higher order tensors as well.

### Getting the shape of things

Use `tf.shape(x)`

instead of `x.shape`

. You can read this discussion here: or:

Do not use .shape; rather use tf.shape. However, h = tf.shape(x)[1]; w = tf.shape(x)[2] will let h, w be symbolic or graph-mode tensors (integer) that will contain a dimension of x. The value will be determined runtime. In such a case, tf.reshape(x, [-1, w, h]) will produce a (symbolic) tensor of shape [?, ?, ?] (still unknown) whose tensor shape will be known on runtime.

### What to subclass — tf.keras.layers.Layer or tf.keras.Model?

`Model`

if you want to use `fit`

or `evaluate`

or something similar, otherwise `Layer`

.

From the docs:

Typically you inherit from keras.Model when you need the model methods like: Model.fit,Model.evaluate, and Model.save (see Custom Keras layers and models for details). One other feature provided by keras.Model (instead of keras.layers.Layer) is that in addition to tracking variables, a keras.Model also tracks its internal layers, making them easier to inspect.

### What do I need to implement to subclass a Layer?

- Obviously call
`super`

in the constructor. - Obviously implement the
`call`

method. - You should implement
`get_config`

and`from_config`

— they are needed often for saving the layer, e.g. if`save_traces=False`

in the model. `@tf.keras.utils.register_keras_serializable(`

should be added at the top of the class. From the docs:*package*="my_package")

This decorator injects the decorated class or function into the Keras custom object dictionary, so that it can be serialized and deserialized without needing an entry in the user-provided custom object dict.

This really means that if you do not put it, you must later load your model containing the layer like this Keras documentation describes:

loaded_1 = keras.models.load_model(

"my_model", custom_objects={"MyLayer": MyLayer}

)

which is kind of a pain.

### What do I need to implement to subclass a Model?

### Saving and loading models and layers

If you have correctly defined `get_config`

and `from_config`

for each of your layers and models as above, you can just save using:

model.save("model", save_traces=False)

# Or just save the weights

model.save_weights("model_weights")

Note that you do not need `save_traces`

if you have defined those correctly. This also makes the saved models much smaller!

To load, just use:

model = tf.keras.models.load_model("model")

# Or just load the weights for an already existing model

model.load_weights("model_weights")

Note that we don’t need to specify and `custom_objects`

if we correctly used the

@tf.keras.utils.register_keras_serializable(package="my_package")

decorator everywhere.

### How to reduce the size of TensorBoard callbacks if you have a large custom model

Similar to how `save_traces`

reduces the size of saved models, you can use `write_graph=False`

in the TensorBoard callback to reduce the size of these files:

logdir = os.path.join("logs",datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))

tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir,

histogram_freq=1,

write_graph=False)

Or, for a `ModelCheckpoint`

, you can use `save_weights_only=True`

to just save the weights and not the model at all:

val_checkpoint = tf.keras.callbacks.ModelCheckpoint(

'trained',

monitor='val_loss',

verbose=1,

save_best_only=True,

save_weights_only=True,

mode='auto',

save_frequency=1

)

### Conclusion

Hope this starter-set of tricks helped!