Build your own Model
Ovation-CI provides you with a template for building your own models, which makes it easier to write the code for your architecture.

To create your own model, you just need to create a new class that descends from the Model class defined in models/model.py. You will notice that most of our model classes do not currently descend from this Model class; this is because it is new, and we plan to change that in the future (we are accepting contributions ;-) ).
Your new model class needs to implement the following five methods:
class MyModel(Model):
    def create_placeholders(self):
        """
        Use this method to create all the placeholders for your model.
        In case you want to know what placeholders are, refer to
        https://www.tensorflow.org/api_docs/python/tf/placeholder
        """
        pass

    def build_model(self, metadata_path=None, embedding_weights=None):
        """
        Build your computation graph here. In simple terms, create your
        network layers here and compute the losses. You may want to keep the
        Tensorflow variables that you create in the object so that you can
        observe them later.
        """
        pass

    def create_scalar_summary(self, sess):
        """
        This is the method where you insert into the Summary object
        the information to be displayed in the scalars tab in Tensorboard.
        """
        pass

    def train_step(self):
        """
        This is where you implement the code to feed a mini batch to your
        computation graph and update your weights using the training
        operations that you generated in compute_gradients(). You can also
        observe Tensorflow variables that you have kept in this object by
        passing them to sess.run().

        :param sess: The Tensorflow Session to run your computations
        :param batch: A mini batch of training data
        :return: usually the loss or some evaluation measures (accuracy,
            Pearson correlation)

        NOTICE that you can change the parameters passed to this function.
        See any of the templates for examples of how to write it.
        """
        pass

    def evaluate_step(self):
        """
        This is similar to train_step, but here you need to run the
        computations in eval mode. This usually means setting the
        dropout_keep_probability to 1.0, etc.

        :param sess: The Tensorflow Session to run your computations
        :param batch: A mini batch of evaluation data
        :return: usually the loss or some evaluation measures (accuracy,
            Pearson correlation)

        NOTICE that you can change the parameters passed to this function.
        See any of the templates for examples of how to write it.
        """
        pass
Your new class would then be used by a trainer script to train the model; a minimal sketch of such a loop is shown below. We made the Model class as small as possible and tried to document it as thoroughly as we could. This way, we expect you to be able to override any method that is not optimal for whatever you are currently developing. In fact, we even recommend that you do so if you feel the current code does not suit your needs.
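To give a feel for how a trainer script drives such a model, here is a minimal, hypothetical sketch. The constructor call, the get_next_batch() helper, the number of epochs, and the exact train_step() signature are assumptions made for illustration; in the repository, the actual trainer scripts and the base Model class (which is expected to create the training ops, e.g. via compute_gradients()) handle these details.

import tensorflow as tf

from mymodel import MyModel  # hypothetical module containing your Model subclass

model = MyModel()                    # construction details depend on the base Model class
model.create_placeholders()          # 1. define the graph inputs
model.build_model()                  # 2. define layers, losses and metrics
# (the training ops, e.g. self.tr_op_set and self.global_step, are assumed to
#  be created by the base Model class before training starts)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())   # needed for streaming metrics
    model.create_scalar_summary(sess)            # 3. set up Tensorboard summaries

    for epoch in range(10):                      # arbitrary number of epochs
        batch = get_next_batch()                 # hypothetical batching helper
        results = model.train_step(sess, batch)  # 4. feed one mini batch
        # 5. periodically call model.evaluate_step(...) on validation data

For a complete, working example, here is the SiameseCNNLSTM model, which implements all five methods: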
import os
import pickle
import datetime
import tensorflow as tf
from utils import ops
from utils import distances
from utils import losses
from scipy.stats import pearsonr
from sklearn.metrics import mean_squared_error
from tensorflow.contrib.tensorboard.plugins import projector
from models.model import Model


class SiameseCNNLSTM(Model):
    """
    An LSTM-based deep Siamese network for text similarity.
    Uses a word embedding layer, followed by a biLSTM and a simple energy
    loss layer.
    """
    def create_placeholders(self):
        # A Tensorflow placeholder for the first input sentence. This
        # placeholder expects data of shape [BATCH_SIZE x SEQ_MAX_LENGTH],
        # where each row of this Tensor contains a sequence of token ids
        # representing the sentence.
        self.input_s1 = tf.placeholder(tf.int32,
                                       [None, self.args.get("sequence_length")],
                                       name="input_s1")

        # This is similar to self.input_s1, but it is used to feed the
        # second sentence.
        self.input_s2 = tf.placeholder(tf.int32,
                                       [None, self.args.get("sequence_length")],
                                       name="input_s2")

        # This is a placeholder to feed in the ground-truth similarity
        # between the two sentences. It expects a vector of shape [BATCH_SIZE].
        self.input_sim = tf.placeholder(tf.float32, [None], name="input_sim")
    def build_model(self, metadata_path=None, embedding_weights=None):
        """
        This method builds the computation graph by adding layers of
        computations. It takes the metadata_path (of the dataset vocabulary)
        and a preloaded word2vec matrix as input and uses them (if not None)
        to initialize the Tensorflow variables. The metadata is used to
        visualize the word embeddings that are being trained using the
        Tensorflow Projector. Additionally, you can use any other tool to
        visualize them.
        https://www.tensorflow.org/versions/r0.12/how_tos/embedding_viz/

        :param metadata_path: Path to the metadata of the vocabulary. Refer
            to the datasets API
            https://github.com/mindgarage/Ovation/wiki/The-Datasets-API
        :param embedding_weights: the preloaded w2v matrix that corresponds
            to the vocabulary. Refer to
            https://github.com/mindgarage/Ovation/wiki/The-Datasets-API#what-does-a-dataset-object-have
        :return:
        """
        # Build the embedding layer as the first layer of the model
        self.embedding_weights, self.config = ops.embedding_layer(
                metadata_path, embedding_weights)
        self.embedded_s1 = tf.nn.embedding_lookup(self.embedding_weights,
                                                  self.input_s1)
        self.embedded_s2 = tf.nn.embedding_lookup(self.embedding_weights,
                                                  self.input_s2)

        self.s1_cnn_out = ops.multi_filter_conv_block(
                self.embedded_s1,
                self.args["n_filters"],
                dropout_keep_prob=self.args["dropout"])
        self.s1_lstm_out = ops.lstm_block(
                self.s1_cnn_out,
                self.args["hidden_units"],
                dropout=self.args["dropout"],
                layers=self.args["rnn_layers"],
                dynamic=False,
                bidirectional=self.args["bidirectional"])

        self.s2_cnn_out = ops.multi_filter_conv_block(
                self.embedded_s2,
                self.args["n_filters"],
                reuse=True,
                dropout_keep_prob=self.args["dropout"])
        self.s2_lstm_out = ops.lstm_block(
                self.s2_cnn_out,
                self.args["hidden_units"],
                dropout=self.args["dropout"],
                layers=self.args["rnn_layers"],
                dynamic=False,
                reuse=True,
                bidirectional=self.args["bidirectional"])

        self.distance = distances.exponential(self.s1_lstm_out,
                                              self.s2_lstm_out)

        with tf.name_scope("loss"):
            self.loss = losses.mean_squared_error(self.input_sim,
                                                  self.distance)
            if self.args["l2_reg_beta"] > 0.0:
                self.regularizer = ops.get_regularizer(self.args["l2_reg_beta"])
                self.loss = tf.reduce_mean(self.loss + self.regularizer)

        # Compute some evaluation measures to keep track of the training process
        with tf.name_scope("Pearson_correlation"):
            self.pco, self.pco_update = tf.contrib.metrics.streaming_pearson_correlation(
                    self.distance, self.input_sim, name="pearson")

        with tf.name_scope("MSE"):
            self.mse, self.mse_update = tf.metrics.mean_squared_error(
                    self.input_sim, self.distance, name="mse")
    def create_scalar_summary(self, sess):
        """
        This method creates Tensorboard summaries for some scalar values
        like the loss and the Pearson correlation
        :param sess:
        :return:
        """
        # Summaries for loss and accuracy
        self.loss_summary = tf.summary.scalar("loss", self.loss)
        self.pearson_summary = tf.summary.scalar("pco", self.pco)
        self.mse_summary = tf.summary.scalar("mse", self.mse)

        # Train summaries
        self.train_summary_op = tf.summary.merge([self.loss_summary,
                                                  self.pearson_summary,
                                                  self.mse_summary])
        self.train_summary_writer = tf.summary.FileWriter(self.checkpoint_dir,
                                                          sess.graph)
        projector.visualize_embeddings(self.train_summary_writer,
                                       self.config)

        # Dev summaries
        self.dev_summary_op = tf.summary.merge([self.loss_summary,
                                                self.pearson_summary,
                                                self.mse_summary])
        self.dev_summary_writer = tf.summary.FileWriter(self.dev_summary_dir,
                                                        sess.graph)
    def train_step(self, sess, s1_batch, s2_batch, sim_batch,
                   epochs_completed, verbose=True):
        """
        A single train step
        """
        # Prepare data to feed to the computation graph
        feed_dict = {
            self.input_s1: s1_batch,
            self.input_s2: s2_batch,
            self.input_sim: sim_batch,
        }

        # Create a list of operations that you want to run and observe
        ops = [self.tr_op_set, self.global_step, self.loss, self.distance]

        # Add summaries if they exist
        if hasattr(self, 'train_summary_op'):
            ops.append(self.train_summary_op)
            _, step, loss, sim, summaries = sess.run(ops, feed_dict)
            self.train_summary_writer.add_summary(summaries, step)
        else:
            _, step, loss, sim = sess.run(ops, feed_dict)

        # Calculate the Pearson correlation and mean squared error
        pco = pearsonr(sim, sim_batch)
        mse = mean_squared_error(sim_batch, sim)

        if verbose:
            time_str = datetime.datetime.now().isoformat()
            print("Epoch: {}\tTRAIN {}: Current Step: {}\tLoss: {:g}\t"
                  "PCO: {}\tMSE: {}".format(epochs_completed, time_str,
                                            step, loss, pco, mse))
        return pco, mse, loss, step
    def evaluate_step(self, sess, s1_batch, s2_batch, sim_batch, verbose=True):
        """
        A single evaluation step
        """
        # Prepare the data to be fed to the computation graph
        feed_dict = {
            self.input_s1: s1_batch,
            self.input_s2: s2_batch,
            self.input_sim: sim_batch
        }

        # Create a list of operations that you want to run and observe
        ops = [self.global_step, self.loss, self.distance, self.pco,
               self.pco_update, self.mse, self.mse_update]

        # Add summaries if they exist
        if hasattr(self, 'dev_summary_op'):
            ops.append(self.dev_summary_op)
            step, loss, sim, pco, _, mse, _, summaries = sess.run(ops,
                                                                  feed_dict)
            self.dev_summary_writer.add_summary(summaries, step)
        else:
            step, loss, sim, pco, _, mse, _ = sess.run(ops, feed_dict)

        time_str = datetime.datetime.now().isoformat()

        # Calculate the Pearson correlation and mean squared error
        pco = pearsonr(sim, sim_batch)
        mse = mean_squared_error(sim_batch, sim)

        if verbose:
            print("EVAL: {}\tStep: {}\tloss: {:g}\t pco: {}\tmse: {}".format(
                    time_str, step, loss, pco, mse))
        return loss, pco, mse, sim
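As a usage note, here is a hedged sketch of how an evaluation pass over a validation set might look once the model above is built and a session is open. The names model, sess and val_batches, and the way the batches are unpacked into (s1, s2, sim) arrays, are assumptions for illustration; in practice you would produce the batches with the Datasets API linked above.

# Assumes `model` is a built SiameseCNNLSTM and `sess` is an active tf.Session.
# `val_batches` is a hypothetical iterable of (s1, s2, sim) numpy arrays.
val_losses, val_mses = [], []
for s1, s2, sim in val_batches:
    loss, pco, mse, predicted_sim = model.evaluate_step(
            sess, s1, s2, sim, verbose=False)
    val_losses.append(loss)
    val_mses.append(mse)

print("validation loss: {:g}\tMSE: {:g}".format(
        sum(val_losses) / len(val_losses), sum(val_mses) / len(val_mses)))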