# TensorFlow Advanced

When a model is not linear (sequential) you need to build a more complex network: multiple inputs, multiple outputs, loops, etc. The tools for this are the functional API, custom training loops, and distributed training.

## Functional API

### Example using MNIST model

Sequential definition:

```python
import tensorflow as tf
from tensorflow.keras.layers import Flatten, Dense

seq_model = tf.keras.Sequential([
    Flatten(input_shape=[28, 28]),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])
```

Three steps to define a model with the functional API:

1. Input: define the input to the model. __Biggest deviation from sequential.__
2. Layers: define a set of interconnected layers
3. Model: tie the inputs and outputs together

```python
from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.models import Model

# ---- Input ----
input = Input(shape=(28, 28))

# ---- Layers ----
x = Flatten()(input)
x = Dense(128, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)

# ---- Model ----
func_model = Model(inputs=input, outputs=predictions)
```

You can choose a unique variable for each layer, which lets you branch and reuse layers. See [[Siamese Network]].

## Custom Loss Functions

### Function based

To create a custom loss function, define a function that takes two parameters:

- `y_true`: labels
- `y_pred`: predictions

```python
def my_huber_loss_func(y_true, y_pred):
    threshold = 1
    error = y_true - y_pred
    is_small_error = tf.abs(error) <= threshold
    small_error_loss = tf.square(error) / 2
    big_error_loss = threshold * (tf.abs(error) - (0.5 * threshold))
    return tf.where(is_small_error, small_error_loss, big_error_loss)
```

You can wrap the loss function in another function that takes parameters:

```python
def my_huber_loss_with_param(threshold=1):
    def my_huber_loss(y_true, y_pred):
        ...
    return my_huber_loss
```

To use this:

```python
model.compile(optimizer='sgd', loss=my_huber_loss_with_param(threshold=1))
```

### Class based

```python
from tensorflow.keras.losses import Loss

class MyHuberLoss(Loss):
    def __init__(self, threshold):
        super().__init__()
        self.threshold = threshold

    def call(self, y_true, y_pred):
        ...
```

### Root Mean Squared Error

```python
from tensorflow.keras import backend as K

def my_rmse(y_true, y_pred):
    error = y_true - y_pred
    sqr_error = K.square(error)
    mean_sqr_error = K.mean(sqr_error)
    sqrt_mean_sqr_error = K.sqrt(mean_sqr_error)
    return sqrt_mean_sqr_error
```

### Contrastive Loss

It is used for the [[Siamese Network]]:

$y \cdot D^2 + (1-y) \cdot \max(\text{margin} - D, 0)^2$

- $y$: tensor of image similarities (0 or 1)
- $D$: Euclidean distance between the output vectors
- $\text{margin}$: threshold; dissimilar pairs only contribute to the loss while they are closer than the margin

## Lambda Layer

A simple Lambda layer lets you write your own code to define a layer:

```python
tf.keras.layers.Lambda(lambda x: tf.abs(x))
```

Modified ReLU:

```python
from tensorflow.keras import backend as K

def my_relu(x):
    return K.maximum(0.5, x)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128),
    tf.keras.layers.Lambda(my_relu),
    tf.keras.layers.Dense(10, activation='softmax')
])
```
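Lambda layers also work for inline preprocessing. A minimal sketch (my own example, not from these notes; the rescaling layer and model shape are assumptions):

```python
import tensorflow as tf

# Hypothetical example: a Lambda layer that rescales raw pixel values
# to [0, 1] inside the model, so it accepts unnormalized images directly.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Lambda(lambda x: x / 255.0),  # inline preprocessing
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
```

Whatever the Lambda wraps must be expressible as TensorFlow ops, since it runs on tensors.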
## Custom Layers

If you need trainable parameters in your custom layer, you need to inherit from the Keras `Layer` class.

__What is a layer?__ It's a class that encapsulates state (the weights) and computation (the forward pass).

A custom layer inherits from `Layer` and defines these three methods: `__init__`, `build`, and `call`.

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer

class SimpleDense(Layer):
    def __init__(self, units, activation=None):
        super().__init__()
        self.units = units
        # activations.get(None) returns the identity (linear) activation
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        # create the trainable state once the input shape is known
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name='kernel',
                             initial_value=w_init(shape=(input_shape[-1], self.units),
                                                  dtype='float32'),
                             trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name='bias',
                             initial_value=b_init(shape=(self.units,), dtype='float32'),
                             trainable=True)

    def call(self, inputs):
        # forward pass
        return self.activation(tf.matmul(inputs, self.w) + self.b)

my_dense = SimpleDense(units=1)
```

## Extending Model Class

Inheriting from the existing `Model` class lets you use its methods such as `compile()`, `fit()`, and `evaluate()`. You also get the saving and serialization APIs (`model.save()` and `model.save_weights()`) and the summarization and visualization APIs (`model.summary()` and `tf.keras.utils.plot_model()`).

```python
from tensorflow.keras.layers import Dense, concatenate
from tensorflow.keras.models import Model

class WideAndDeep(Model):
    def __init__(self, units=32, activation='relu', **kwargs):
        super().__init__(**kwargs)
        self.hidden1 = Dense(units, activation=activation)
        self.hidden2 = Dense(units, activation=activation)
        self.main_output = Dense(1)
        self.aux_output = Dense(1)

    def call(self, inputs):
        input_A, input_B = inputs
        hidden1 = self.hidden1(input_B)
        hidden2 = self.hidden2(hidden1)
        concat = concatenate([input_A, hidden2])
        main_output = self.main_output(concat)
        aux_output = self.aux_output(hidden2)
        return main_output, aux_output
```

Defining networks this way lets you use loops to build repeated layers, if/else statements, etc. You can also define subnetworks for reuse in larger models, e.g. [[Resnet]].

## Callbacks

Callbacks do helpful things during training, such as saving checkpoints and early stopping. They are invoked at defined points in the training loop, most commonly at the end of an epoch.

### Boilerplate for a callback class

Generally you override the methods that define what happens at the beginning or end of an epoch:

```python
class Callback(object):
    def __init__(self):
        self.validation_data = None
        self.model = None

    def on_epoch_begin(self, epoch, logs=None):
        pass

    def on_epoch_end(self, epoch, logs=None):
        pass

    # shorthand: one method exists per combination, e.g.
    # on_train_begin, on_test_end, on_predict_batch_begin, ...
    def on_(train|test|predict)_(begin|end)(self, logs=None):
        pass

    def on_(train|test|predict)_batch_(begin|end)(self, logs=None):
        pass
```

Where can you use callbacks?

- `fit(..., callbacks=[])`
- `fit_generator(..., callbacks=[])`
- `evaluate(..., callbacks=[])`
- `evaluate_generator(..., callbacks=[])`
- `predict(..., callbacks=[])`
- `predict_generator(..., callbacks=[])`

### TensorBoard

```python
import os
import datetime

log_dir = os.path.join("logs", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
```

### ModelCheckpoint

Saves the model every so often:

```python
tf.keras.callbacks.ModelCheckpoint(
    filepath, monitor='val_loss', verbose=0,
    save_best_only=False, save_weights_only=False,
    mode='auto', save_freq='epoch', options=None, **kwargs
)
```

You can save in the Keras HDF5 (`.h5`) format or in the TensorFlow SavedModel format.

You can embed the epoch number and metric values in the checkpoint filename:

```python
model.fit(..., callbacks=[ModelCheckpoint("weights.{epoch:02d}-{val_loss:.2f}.h5")])
```
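A short sketch of the two save formats mentioned above; the file paths and the `X_train`/`X_val` variables are hypothetical placeholders:

```python
# Keep only the best checkpoint (lowest validation loss), in HDF5 format.
best_h5 = tf.keras.callbacks.ModelCheckpoint(
    'best_model.h5', monitor='val_loss', save_best_only=True)

# A filepath without a .h5 suffix saves in the TensorFlow SavedModel format.
saved_model_ckpt = tf.keras.callbacks.ModelCheckpoint('checkpoints/model')

model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          epochs=10,
          callbacks=[best_h5])
```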
### CSVLogger

Logs the training metrics to a CSV file:

```python
model.fit(..., callbacks=[CSVLogger("training.csv")])
```

![[TensorFlow - Basics#Early stopping]]

![[TensorFlow - Basics#Learning Rate finder]]

### Detect Overfitting Callback

Measure the ratio between validation loss and training loss; if the ratio gets too large, the model may be overfitting.

```python
class DetectOverfittingCallback(tf.keras.callbacks.Callback):
    def __init__(self, threshold):
        super().__init__()
        self.threshold = threshold

    def on_epoch_end(self, epoch, logs=None):
        ratio = logs['val_loss'] / logs['loss']
        print(f"Epoch: {epoch}, Val/Train loss ratio: {ratio}")
        if ratio > self.threshold:
            print("Stopping training")
            self.model.stop_training = True
```

### Visualize training

```python
import io

import imageio
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image

class VisCallback(tf.keras.callbacks.Callback):
    def __init__(self, inputs, ground_truth, display_freq=10, n_samples=10):
        super().__init__()
        self.inputs = inputs
        self.ground_truth = ground_truth
        self.images = []
        self.display_freq = display_freq
        self.n_samples = n_samples

    def on_epoch_end(self, epoch, logs=None):
        # randomly sample data
        indexes = np.random.choice(len(self.inputs), size=self.n_samples)
        X_test, y_test = self.inputs[indexes], self.ground_truth[indexes]
        predictions = np.argmax(self.model.predict(X_test), axis=1)

        # plot the digits and keep a snapshot of the figure for the GIF
        display_digits(X_test, predictions, y_test, epoch, n=self.n_samples)
        buf = io.BytesIO()
        plt.savefig(buf, format='png')
        buf.seek(0)
        image = Image.open(buf)
        self.images.append(np.array(image))

        if epoch % self.display_freq == 0:
            plt.show()

    def on_train_end(self, logs=None):
        imageio.mimsave('animated.gif', self.images, fps=1)

# Visualization utilities
plt.rc('font', size=20)
plt.rc('figure', figsize=(15, 3))

def display_digits(inputs, outputs, ground_truth, epoch, n=10):
    plt.clf()
    plt.yticks([])
    plt.grid(None)
    # lay the n sampled digits out side by side in a single image
    inputs = np.reshape(inputs, [n, 28, 28])
    inputs = np.swapaxes(inputs, 0, 1)
    inputs = np.reshape(inputs, [28, 28 * n])
    plt.imshow(inputs)
    # label each digit with its prediction: green if correct, red if wrong
    plt.xticks([28 * x + 14 for x in range(n)], outputs)
    for i, t in enumerate(plt.gca().xaxis.get_ticklabels()):
        if outputs[i] == ground_truth[i]:
            t.set_color('green')
        else:
            t.set_color('red')
    plt.grid(None)
```

## Graph based vs Eager Execution

### Graph based

All data and ops are first loaded into a graph, which is then evaluated in a session. This was the default in TensorFlow 1.x.

### Eager

Ops are executed line by line, like ordinary Python. This is the default in TensorFlow 2.x.
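Since eager execution is the TF 2.x default, `tf.function` is the standard way to get graph execution back. A minimal sketch (my own example):

```python
import tensorflow as tf

# Eager: ops run immediately, line by line.
def eager_fn(x):
    return x * x + 1

# Graph: tf.function traces the Python function into a graph on the
# first call, then runs the compiled graph on later calls.
@tf.function
def graph_fn(x):
    return x * x + 1

x = tf.constant(3.0)
print(eager_fn(x))   # tf.Tensor(10.0, ...), executed eagerly
print(graph_fn(x))   # same result, executed as a graph
```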