Hunch allows users to turn arbitrary machine learning models built using Python into a scalable, hosted service. Simply put, a data scientist can now focus on building the model and Hunch takes care of turning this model into a REST API.
This guide helps you set up Hunch on your local machine.
- Download the Anaconda environment configuration file (osx_environment.yml or linux_environment.yml) and set up the conda environment using the following commands.
conda env create -f osx_environment.yml -n hunch
# To activate hunch environment, run the following command.
source activate hunch
# To deactivate hunch environment, run the following command.
# source deactivate hunch
- The latest Hunch SDK can be downloaded from here.
- Install the Hunch SDK using the following command.
pip install -v --upgrade --no-deps --disable-pip-version-check hunch-0.0.1.tar.gz
Set the environment variable HUNCH_API_CONFIG to the path of the Hunch SDK client config file. Refer to the Hello World example below.
A typical configuration file looks like the following.
hunch_api_config.yaml
model_storage:
backend: "local_fs"
local_fs:
modelrepo_dir: "/tmp/model_repo" # This should be created before publishing the model
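The model repository directory must exist before the first model is published; one small sketch of creating it from Python (the path matches the configuration above):
import os
if not os.path.exists("/tmp/model_repo"):
    os.makedirs("/tmp/model_repo")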
import json
import os
from hunchsdk.HunchApi import HunchApi
os.environ["HUNCH_API_CONFIG"]="hunch_api_config.yaml"
class HelloWorldModel:
def predict(self, input):
return json.dumps("Hello World")
model = HelloWorldModel()
print "Local Prediction:", model.predict(None)
HunchApi().publish_model(model, "HelloWorldExample", "1.0.0")
# Expected Result
# Local Prediction: "Hello World"
hunch_server_config.yaml
rotation_status_file: "/tmp/rotation_status"
model_storage:
backend: "local_fs"
local_fs:
modelrepo_dir: "/tmp/model_repo" # This directory should exist
model_loader:
working_dir: "/tmp"
custom_package_deployer:
working_dir: "/tmp/hunchsdk_custom_libs" # This directory should exist
Use the following command to start the Hunch server. Refer to the gunicorn documentation for more details and the available options.
gunicorn -b 0.0.0.0:8000 -w 10 --preload --pid model_server.pid --timeout 300 -e MODEL_SIZE_THRESHOLD=2000000000 -e HUNCH_CONFIG="hunch_server_config.yaml" -e MODELS_TO_LOAD="[[\"HelloWorldExample\", \"1.0.0\"]]" hunch_server.wsgi:app --daemon
This model can be invoked via REST API as follows.
import requests
import json
model_server_url = "http://localhost:8000/predict"
params = {"model_id":"HelloWorldExample", "model_version":"1.0.0"}
data = json.dumps(None)
response = requests.post(model_server_url, params = params, json = data)
print "Result:", response.content
# Expected Output
# Result: {"stack_trace": "NA", "result": "\"Hello World\""}
To stop the server, get the pid from model_server.pid and use the following command.
sudo kill -9 <pid>
Here is the end-to-end workflow for using Hunch.
- Data scientists install the Python Hunch SDK (earlier called the model publisher) package on their workstation or any instance where models are being trained (we will refer to this as the local environment).
- Data scientists build/train a model which typically comprises one or more objects from standard ML libraries like Scikit-Learn, TensorFlow, etc. At this point it is possible to make predictions in the local environment.
- Data scientists implement a Python class that contains all the objects required to make predictions using the model. This class is required to have a method called `predict` which accepts a JSON input and produces a JSON output (the prediction) using the objects which are the building blocks of the model. We will refer to this class as the model class.
- Data scientists publish an instance of the model class to the blob storage using `HunchApi` in the Hunch SDK. The publish method expects a model identifier and a model version. Model version management will be automated once the Model Repository is integrated.
- Developers can now use the model as a REST API: they simply call a REST endpoint with the model identifier (and version), pass a JSON object which is the input to the model, and receive the output of the model as a JSON object.
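A condensed sketch of these steps is shown below; the model id "WorkflowExample" and the local server URL are placeholders, and the Hello World example later in this guide walks through each step in detail.
import json
import requests
from hunchsdk.HunchApi import HunchApi

class WorkflowExample:                     # the model class with the required predict method
    def predict(self, input):
        return json.dumps("ok")

# Publish an instance of the model class with a user-chosen model id and version
HunchApi().publish_model(WorkflowExample(), "WorkflowExample", "1.0.0")

# Developers then call the hosted model over REST (assuming a local Hunch server on port 8000)
response = requests.post("http://localhost:8000/predict",
                         params={"model_id": "WorkflowExample", "model_version": "1.0.0"},
                         json=json.dumps(None))
print "Result:", response.content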
- Download the Anaconda environment configuration file (osx_environment.yml or linux_environment.yml) and set up the conda environment using the following commands.
conda env create -f osx_environment.yml -n hunch
# To activate hunch environment, run the following command.
source activate hunch
# To deactivate hunch environment, run the following command.
source deactivate hunch
- The latest Hunch SDK can be downloaded from here.
- Install the Hunch SDK using the following command.
pip install -v --upgrade --no-deps --disable-pip-version-check hunch-0.0.1.tar.gz
Set the environment variable HUNCH_API_CONFIG to the path of the Hunch SDK client config file. For example:
os.environ["HUNCH_API_CONFIG"]="hunch_api_config.yaml"
A typical configuration file looks like the following.
model_storage: # For more details on Model storage configuration refer "Model Blob Storage" section of this document.
backend: "local_fs" # Name of the blob storage backend. For ex: local_fs, s3, azure_blob_storage etc.
local_fs: # Local File System backed configuration
modelrepo_dir: "/tmp/model_repo" # Directory which you want to use for storing models.
s3: # S3 configuration is not really needed in this configuration as we are using local_fs based storage client. This is added as a reference.
access_key: "AccessKey" # S3 Account Access key
bucket: "S3Bucket" # S3 bucket
chunk_size: 134217728 # S3 chunk size to be used in multi part upload
endpoint: "S3 Endpoint" # S3 endpoint
max_error_retry: 5 # How many retry attempts on failure?
secret_key: "SecretKey" # S3 Account Secret Key
size_limit: 209715100 # If the size of the blob is greater than size_limit then multi part upload is used.
A typical server configuration looks like the following.
rotation_status_file: "/tmp/rotation_status" # Used to take Hunch out of rotation and bring it back into rotation. The app should have write permissions to this file/directory.
model_storage: # For more details on Model storage configuration refer "Model Blob Storage" section of this document.
backend: "local_fs" # Name of the blob storage backend. For ex: local_fs, s3, azure_blob_storage etc.
local_fs:
modelrepo_dir: "/tmp/model_repo" # Directory which you want to use for storing models.
s3: # S3 configuration is not really needed in this configuration as we are using local_fs based storage client. This is added as a reference.
access_key: "AccessKey" # S3 Account Access key
bucket: "S3Bucket" # S3 bucket
chunk_size: 134217728 # S3 chunk size to be used in multi part upload
endpoint: "S3 Endpoint" # S3 endpoint
max_error_retry: 5 # How many retry attempts on failure?
secret_key: "SecretKey" # S3 Account Secret Key
size_limit: 209715100 # If the size of the blob is greater than size_limit then multi part upload is used.
model_loader:
working_dir: "/tmp" # Directory in which you want to write temporary files while loading models in Hunch
custom_package_deployer:
working_dir: "/tmp/hunchsdk_custom_libs" # This directory should exist. Custom packages needed by models are installed into this directory.
Cloudpickle is used for the serialization of models.
Usage is explained with the following set of examples.
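Before publishing, it can be useful to confirm that your model instance actually serializes with cloudpickle. A minimal check is sketched below; DummyModel stands in for your own model class.
import cloudpickle

class DummyModel:
    def predict(self, input):
        return input

blob = cloudpickle.dumps(DummyModel())  # raises an exception if the object cannot be serialized
print "Serialized size in bytes:", len(blob)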
Let us start with a very simple dummy model which we will call `HelloWorldModel`. This model simply returns the string `Hello World` on any (or no) input. The listing below illustrates the complete source code for publishing the model.
Note the following points.
- We import the `json` package for processing JSON.
- We import the `hunchsdk.HunchApi` package.
- We define a class called `HelloWorldModel`.
- The class `HelloWorldModel` implements the `predict` method, which accepts `input` (a JSON object, or something JSON serializable/deserializable) and returns a JSON object (or something JSON serializable/deserializable) that simply contains the string `Hello World`.
- We create an instance of the `HelloWorldModel` class.
- We create an instance of the `HunchApi` class.
- We print the output produced by the `predict` method.
- We publish the model using the `publish_model` method, which takes an instance of the model, a model id and a model version. Model id and version management will be automated once the integration with the Model Repository is available.
HunchApi().publish_model(model_instance, # Required. Instance of the model class.
model_id, # Required. Model ID. This is given by the user and not autogenerated. Autogeneration will be enabled after the integration of Model Repository
model_version, # Required. Model Version. This is given by the user and not autogenerated. Autogeneration will be enabled after the integration of Model Repository
path_to_setup_py, # Optional. Path to setup.py of the custom package which the model is dependent on.
custom_package_name # Optional. Name of the custom package which the model is dependent on.
)
import json
from hunchsdk.HunchApi import HunchApi
class HelloWorldModel:
def predict(self, input):
return json.dumps("Hello World")
model = HelloWorldModel()
print "Local Prediction:", model.predict(None)
HunchApi().publish_model(model, "HelloWorldExample", "1.0.0")
# Expected Result
# Local Prediction: "Hello World"
Start the Hunch server using the following command. Refer to the gunicorn documentation for more details and the available options.
gunicorn -b 0.0.0.0:8000 -w 10 --pid model_server.pid --timeout 300 --log-config gunicorn_log.conf --access-logformat "%(h)s %(t)s %(r)s %(s)s %(D)s" -e MODEL_SIZE_THRESHOLD=2000000000 -e HUNCH_CONFIG="hunch_server_config.yaml" -e MODELS_TO_LOAD="[[\"HelloWorldExample\", \"1.0.0\"]]" hunch_server.wsgi:app --daemon
- `MODEL_SIZE_THRESHOLD`: Model size threshold beyond which models will not be loaded. This is used as a guardrail to prevent users from loading large models by mistake (for example, accidentally including the training dataset in the model).
- `HUNCH_CONFIG`: The Hunch server configuration explained above. This YAML file must exist for Hunch to start successfully.
- `MODELS_TO_LOAD`: List of models to load. This has to be a list of model id and model version pairs (each pair itself is a list) serialized as a JSON string. Example: "[["HelloWorldExample", "1.0.0"], ["SkLearnExample", "1.0.0"]]"
To stop the server, get the pid from model_server.pid and use the following command.
sudo kill -9 <pid>
Let us now look at how a developer can use `HelloWorldModel`. The only communication between the data scientist and the developer is the model identifier and the model version.
Note the following points.
- We use the `requests` package for making REST API calls and the `json` package for JSON processing.
- The model server URL is specified; a different IP address is available for staging and production.
- A call is made using `post`, where the model identifier and version are provided as parameters and the input to the model is sent as a JSON payload.
- The REST API returns exactly the same result as the `predict` call invoked locally.
import requests
import json
model_server_url = "http://localhost:8000/predict"
params = {"model_id":"HelloWorldExample", "model_version":"1.0.0"}
data = json.dumps(None)
response = requests.post(model_server_url, params = params, json = data)
print "Result:", response.content
# Expected Output
# Result: {"stack_trace": "NA", "result": "\"Hello World\""}
Let us now consider another model with an internal state. The listing below illustrates a model that simply adds a given number to the input number.
The following points should be noted.
- The `AddNumber` model has the required `predict` method, which adds the number in `self.number_to_add` to the given input and returns the result of this operation.
- The `AddNumber` model has another method called `number_to_add`, which can be used to set the value of `self.number_to_add`.
- We initialize the value of `self.number_to_add` before the model is exported.
import json
from hunchsdk.HunchApi import HunchApi
class AddNumber:
def number_to_add(self, number_to_add):
self.number_to_add = number_to_add
def predict(self, input):
input_number = input["input"]
return self.number_to_add + input_number
model = AddNumber()
model.number_to_add(42)
print "Local Prediction:", model.predict({"input":1})
HunchApi().publish_model(model, "AddNumber", "1.0.0")
# Expected Result
# Local Prediction: 43
Stop Hunch and then load the `AddNumber` and `HelloWorldExample` models using the following command.
gunicorn -b 0.0.0.0:8000 -w 10 --pid model_server.pid --timeout 300 --log-config gunicorn_log.conf --access-logformat "%(h)s %(t)s %(r)s %(s)s %(D)s" -e MODEL_SIZE_THRESHOLD=2000000000 -e HUNCH_CONFIG="hunch_server_config.yaml" -e MODELS_TO_LOAD="[[\"HelloWorldExample\", \"1.0.0\"],[\"AddNumber\", \"1.0.0\"]]" hunch_server.wsgi:app --daemon
This model can be invoked via REST API as follows.
import requests
import json
model_server_url = "http://localhost:8000/predict"
params = {"model_id":"AddNumber", "model_version":"1.0.0"}
data = {"input":1}
response = requests.post(model_server_url, params = params, json = data)
print "Result:", response.content
# Expected Output
# Result: {"stack_trace": "NA", "result": "43"}
Let us now consider an example where an ML model built with Scikit-Learn is exported and predictions are made via a REST API. The listing below illustrates training a Support Vector Machine on the Digits dataset. The following points are to be noted.
- As before, we implement a `Model` class with a `predict` method.
- We add two helper methods to this class, namely `train` and `predict_local`. As the names indicate, these are for training the model and making local predictions.
- The `train` method is set up to receive the training data as two parameters: `training_data_X` (the input) and `training_data_Y` (the output).
- The `train` method initializes the `Normalizer` class for preprocessing the data and the `SVC` class (which is the model to be trained) from Scikit-Learn. These are declared as instance variables (`self.normalizer` and `self.svc`) because we need these instances to make predictions.
- The `SVC` class is initialized with the hyper-parameters `svm.SVC(gamma=0.001, C=100.)`.
- The input data is normalized using `normalizer.fit_transform`. The result of `normalizer.fit_transform` is stored in `normalised_training_data_X`, which is a local variable because we only need it during training, not for making predictions.
- The Support Vector Machine is trained using the `svc.fit` method, which trains on the input data and stores the model parameters as internal state (in the instance).
- After the declaration of the `Model` class, we prepare the training data and pick one example for local prediction; this is to check whether the hosted model produces the same result.
- We then train the model using the `train` method declared earlier, make a local prediction and publish the model. As before, this takes a model identifier and a model version.
- The only difference between the `predict` and `predict_local` methods is that the former expects a JSON (serialized) input and the latter expects a numpy array.
import numpy
import simplejson
from sklearn import datasets
from sklearn import svm
from sklearn.preprocessing import Normalizer
from hunchsdk.HunchApi import HunchApi
class Model:
def train(self, training_data_X, training_data_Y):
self.normalizer = Normalizer()
self.svc = svm.SVC(gamma=0.001, C=100.)
normalised_training_data_X = self.normalizer.fit_transform(training_data_X)
self.svc.fit(normalised_training_data_X, training_data_Y)
def predict(self, given_input):
input_for_prediction = numpy.array(simplejson.loads(given_input))
prediction = self.svc.predict(self.normalizer.fit_transform(input_for_prediction))
return simplejson.dumps(prediction.tolist())
def predict_local(self, given_input):
prediction = self.svc.predict(self.normalizer.fit_transform(given_input))
return prediction
# Prepare the dataset
digits = datasets.load_digits()
training_data_X = digits.data
training_data_Y = digits.target
# Pick an example for local prediction
test_data_X = digits.data[-1:]
test_data_Y = digits.target[-1:]
# Train the model and make a local prediction
model = Model()
model.train(training_data_X, training_data_Y)
print "Local Prediction: ", model.predict_local(test_data_X)
print "Actual Label:", test_data_Y
# Publish the model
HunchApi().publish_model(model, "SkLearnExample", "1.0.0")
# Expected Output
# Local Prediction: [8]
# Actual Label: [8]
Stop Hunch and then load the `AddNumber`, `HelloWorldExample` and `SkLearnExample` models using the following command.
gunicorn -b 0.0.0.0:8000 -w 10 --pid model_server.pid --timeout 300 --log-config gunicorn_log.conf --access-logformat "%(h)s %(t)s %(r)s %(s)s %(D)s" -e MODEL_SIZE_THRESHOLD=2000000000 -e HUNCH_CONFIG="hunch_server_config.yaml" -e MODELS_TO_LOAD="[[\"HelloWorldExample\", \"1.0.0\"],[\"AddNumber\", \"1.0.0\"],[\"SkLearnExample\", \"1.0.0\"]]" hunch_server.wsgi:app --daemon
This model can be invoked via REST API as follows. Note that the outputs of the local prediction and the hosted prediction are identical.
import requests
import simplejson
# Pick one example for prediction
from sklearn import datasets
digits = datasets.load_digits()
test_data_X = digits.data[-1:]
model_server_url = "http://localhost:8000/predict"
params = {"model_id":"SkLearnExample", "model_version":"1.0.0"}
data = simplejson.dumps(test_data_X.tolist())
response = requests.post(model_server_url, params = params, json = data)
print "Result:", response.content
# Expected Output
# Result: {"stack_trace": "NA", "result": "[8]"}
Models are Python objects, but not all Python objects are pure-Python objects; in certain cases they contain native (C, C++, Fortran) data structures. Such objects or models are not serializable using cloudpickle, e.g. models built using frameworks like TensorFlow, fastText, CRF, Caffe, etc. This document describes how to go about publishing models which are built using libraries that can't be pickled. Typically, these libraries provide their own support for loading and saving the model of interest.
Read through this short write-up; we will then walk through an example, which should clear up how to publish such models.
- In Python, any .py file is a module. You will have to create a prediction module (this can't be done in a notebook).
- The prediction module is where you write the model class, which implements the predict functionality.
- Along with the model class, you are required to implement a `load_model` method, as described below.
- We also recommend writing a publish.py file and publishing with that rather than from a Jupyter notebook.
- Apart from the `Model` class, you'll have to implement a `load_model` method, which returns an instance of your model. This method lives outside of the model class.
- This is needed because the instance of your model is not serializable (dump functions from libraries like pickle, json, etc. won't work, because models built using libraries like TensorFlow are not pure-Python objects).
- The argument to this method is the top-level directory where all the files/directories required to load the model are present. All other files/directories which facilitate prediction/inference should also reside here, e.g. a vocabulary saved in a separate JSON file, an embeddings file, etc.
- `load_model` is the method where you should load these meta files and pass them to your `Model` class.
- Within the `load_model` method, resolve the actual paths of these files.
- The `load_model` method should take only one argument, namely the aforementioned top-level directory.
Say I have stored all my model files and other resources in a top-level directory called "model_resource". Within this directory I have a directory named "saved_model", where I have asked the learning framework (e.g. TensorFlow, CRF, fastText, etc.) to save the model. I also have an "aux_data" directory where I keep other supplementary data required for predictions; in this case they are dumped Python dictionaries.
model_resource
├── saved_model # This is where I asked my framework to save the model.
│ ├── model_file.1
│ ├── model_file.2
│ ├── model.index
│ └── model.meta
└── aux_data # This is where I have kept other resources required by the model. In my case a bunch of json dictionaries.
├── email_vocab.json
├── name_vocab.json
└── order_browse_vocab.json
Now, let's walk through a dummy implementation of the prediction module, using the above directory structure as an example. We will call the prediction module my_prediction_module.py.
from hunchsdk.HunchApi import HunchApi
import json
# ... other imports; these can be your custom libs as well. Custom libs work as before.
def load_model(path_to_model_resource):
'''
This function loads the models and other supplementary objects which the prediction function requires.
Inputs:
        - path_to_model_resource: Path to the top level directory which contains the saved model and other required objects.
'''
# Prepare your model
# because path_to_model_resource is the top level directory
    path_to_saved_model = path_to_model_resource + '/saved_model'
model = your_fav_framework.load_model(path_to_saved_model)
# You might require objects other than your model
    # json.load expects a file object, so open each vocabulary file first
    email_vocab = json.load(open(path_to_model_resource + '/aux_data/email_vocab.json'))
    name_vocab = json.load(open(path_to_model_resource + '/aux_data/name_vocab.json'))
    order_browse_vocab = json.load(open(path_to_model_resource + '/aux_data/order_browse_vocab.json'))
    # ... any other setup required for prediction
return Model(model, email_vocab, name_vocab, order_browse_vocab)
class Model:
def __init__(self, model, email_vocab, name_vocab, order_browse_vocab):
self.model = model
self.email_vocab = email_vocab
self.name_vocab = name_vocab
self.order_browse_vocab = order_browse_vocab
def predict(self, given_input):
        # ... preprocess given_input using the vocabularies as needed
return self.model.predict(given_input)
As a test, assume the "model_resource" directory is kept in my home directory. If I now call the load_model method in a Python interpreter with the appropriate path, it should return an instance of the Model class. You might want to add the directory containing my_prediction_module.py to the PYTHONPATH, and also the directory where your custom code resides, if you are using custom code.
>>> import sys
>>> sys.path.append(path_to_directory_of_my_prediction_module)  # the directory containing my_prediction_module.py
>>> sys.path.append(path_to_directory_of_custom_code)
>>> from my_prediction_module import load_model
>>> model_instance = load_model(path_to_model_resource='/home/mlplatform/model_resource')
>>> type(model_instance)
<type 'Model'>
>>> model_instance.predict(some_input) # you should be able to call predict and see its output
If the above snippet runs well, we are good to publish. The API to publish such a model:
from hunchsdk.HunchApi import HunchApi
mlp_client = HunchApi()
# Signature: hunch_publish_new_asm_model(path_to_prediction_module, path_to_model_resources_dir, model_id, model_version, path_to_setup_py=None, custom_package_name=None)
mlp_client.hunch_publish_new_asm_model(
path_to_prediction_module, # Required, Path to prediction module
    path_to_model_resources_dir, # Required, This should be the same as load_model's argument
model_id, # Required, Model Id. This is not auto generated. Auto generation will be enabled once the integration with Model Repository is complete.
model_version, # Required, Model Version. This is not auto generated. Auto generation will be enabled once the integration with Model Repository is complete.
path_to_setup_py, # Optional. Path to setup.py of the custom package on which the model is dependent.
custom_package_name # Optional. Name of the custom package on which the model is dependent.
)
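Putting it together, a publish.py might look like the following sketch; all paths, the model id and the version are placeholders that you should replace with your own values.
# publish.py -- a minimal sketch for publishing a non-picklable model
import sys
sys.path.append("/home/mlplatform/code")   # directory containing my_prediction_module.py (placeholder path)

from hunchsdk.HunchApi import HunchApi
from my_prediction_module import load_model

# Sanity check: the model should load and predict locally before it is published
model_instance = load_model("/home/mlplatform/model_resource")
print "Sanity check:", model_instance.predict({"some": "input"})  # replace with a real input for your model

HunchApi().hunch_publish_new_asm_model(
    "/home/mlplatform/code/my_prediction_module.py",
    "/home/mlplatform/model_resource",
    "MyAsmModel",   # model id, chosen by the user
    "1.0.0")        # model version, chosen by the user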
First of all, read through the above section, which explains the recommended directory structure for TensorFlow-type models, and acquaint yourself with the model resource directory. When using the saver.save() method, make sure the path where you want to store the checkpoints is relative, not absolute.
import tensorflow as tf
sess = tf.Session()
saver = tf.train.Saver()
saver.save(sess, "relative-path-to-the-checkpoint-file")
To make sure that everything has worked fine, open the checkpoint file generated by TensorFlow; you should not find any absolute paths there. For example:
model_checkpoint_path: "model.ckpt-20550"
all_model_checkpoint_paths: "model.ckpt-20550"
Make sure all the variables you define are not defined in the default graph. For example, the following is what you should avoid:
import tensorflow as tf
with tf.name_scope('embeddings'):
store_embedding = tf.get_variable("store_embedding", [vocab_size, embed_size])
name_embedding = tf.get_variable("name_embedding", [name_vocab_size, char_embed_size])
email_embedding = tf.get_variable("email_embedding", [email_vocab_size, char_embed_size])
Whenever variables are defined, they shouldn't be defined in the default graph as shown above. Make sure to explicitly create a TensorFlow graph object and define the variables inside it:
import tensorflow as tf
model_graph = tf.Graph()
with model_graph.as_default():
with tf.name_scope('embeddings'):
store_embedding = tf.get_variable("store_embedding", [vocab_size, embed_size])
name_embedding = tf.get_variable("name_embedding", [name_vocab_size, char_embed_size])
email_embedding = tf.get_variable("email_embedding", [email_vocab_size, char_embed_size])
Any variable defined is, by default, a member of the default computational graph. Since Hunch hosts multiple models at once, it is mandatory to define these variables in a computational graph solely owned by your model, so that there are no conflicts with models from other teams.
Make sure to:
- Construct your session
- Initialize global variables
- Restore the checkpoints
using the same graph you used to define the variables, as follows:
# model_graph is the same graph used above to define the variables
sess = tf.Session(graph=model_graph)
with model_graph.as_default():
    tf.global_variables_initializer().run(session=sess)
    saver = tf.train.Saver()
    saver.restore(sess, tf.train.latest_checkpoint(model_path))
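Tying this back to the load_model contract described earlier, a load_model for a TensorFlow 1.x model could look roughly like the sketch below; build_inference_graph() and the TfModel wrapper are hypothetical stand-ins for your own graph-construction code and model class, and the "saved_model" sub-directory follows the layout described above.
import tensorflow as tf

def load_model(path_to_model_resource):
    model_graph = tf.Graph()
    with model_graph.as_default():
        build_inference_graph()          # hypothetical: re-creates the variables/ops used during training
        saver = tf.train.Saver()
    sess = tf.Session(graph=model_graph)
    with model_graph.as_default():
        tf.global_variables_initializer().run(session=sess)
        saver.restore(sess, tf.train.latest_checkpoint(path_to_model_resource + "/saved_model"))
    return TfModel(sess)                 # hypothetical wrapper that holds the session and implements predict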
Run the following command to create a package. This command has to be run from the directory where setup.py is available.
python setup.py sdist
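For reference, a minimal setup.py might look like the following sketch; the package name and version here are placeholders, not part of Hunch itself.
from setuptools import setup, find_packages

setup(
    name="my_custom_lib",   # placeholder package name
    version="0.0.1",
    packages=find_packages(),
)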
Models can be stored in any blob storage such as S3, Azure Blob Storage, etc. Hunch has support for S3 and local-file-system based storage. Users can add support for any other blob storage by implementing a storage client module.
Implement the following class and add it to hunchsdk/storage_clients/<blob_storage_name>.
class StorageClient(object):
def __init__(self, storage_client_config):
pass
def get_model_blob(self, model_id, model_version):
"""
Returns model blob for the given model id and model version
Args:
model_id:
model_version:
Returns:
"""
pass
def write_model_blob(self, model_blob, model_id, model_version):
"""
Write model blob with the given model id and model version to Model Repository storage.
Args:
model_blob:
model_id:
model_version:
Returns:
"""
pass
The configuration required for your blob storage can be passed via storage_client_config as a dict.
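As an illustration, a minimal file-system-backed implementation is sketched below. The single-file-per-model layout and the modelrepo_dir config key are assumptions for this sketch, not necessarily how Hunch's built-in local_fs client stores blobs.
import os

class StorageClient(object):
    def __init__(self, storage_client_config):
        # e.g. {"modelrepo_dir": "/tmp/model_repo"} taken from the blob storage section of the config
        self.modelrepo_dir = storage_client_config["modelrepo_dir"]

    def _blob_path(self, model_id, model_version):
        # Assumption for this sketch: one file per (model id, model version) pair
        return os.path.join(self.modelrepo_dir, "%s_%s" % (model_id, model_version))

    def get_model_blob(self, model_id, model_version):
        with open(self._blob_path(model_id, model_version), "rb") as f:
            return f.read()

    def write_model_blob(self, model_blob, model_id, model_version):
        with open(self._blob_path(model_id, model_version), "wb") as f:
            f.write(model_blob)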
Documentation is available as Markdown files. You can also host the documentation using mkdocs.
- Install mkdocs.
- Go to the documentation directory and run the following command to generate the static pages needed for the site.
mkdocs build
- Go to the documentation directory and run the following command to serve the documentation.
mkdocs serve
- Vikas Garg (@vikasgarg1996)
- Roshan Nair (@roshan-nair)
- Karan Verma (@karan10111)
- Naresh Sankapelly (@nareshsankapelly)
- Adeel Zafar (@adzafar)
- Akshay Utkarsh Sharma (@akshay-sharma)
- Nikhil Ketkar (@nikhilketkar)