Model Development Guide¶
Rafiki leverages a dynamic pool of model templates contributed by Model Developers.
As a Model Developer, you’ll define a Python class that conforms to Rafiki’s base model specification and submit it to Rafiki with the rafiki.client.Client.create_model() method.
Implementing the Base Model Interface¶
As an overview, your model template needs to provide the following logic for deployment on Rafiki:
Definition of the space of your model’s hyperparameters (knob configuration)
Initialization of the model with a concrete set of hyperparameters (knobs)
Training of the model given a (train) dataset on the local file system
Evaluation of the model given a (validation) dataset on the local file system
Dumping of the model’s parameters for serialization, after training
Loading of the model with trained parameters
Making batch predictions with the model, after being trained
Full details of Rafiki’s base model interface are documented at rafiki.model.BaseModel.
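The seven responsibilities listed above can be sketched as a single class. This is a minimal, standalone illustration and not Rafiki's actual interface: the class does not inherit from rafiki.model.BaseModel, and the method names, the plain-list knob space, and the JSON-lines dataset format are all assumptions made for this example. Check rafiki.model.BaseModel and your task's specification for the real signatures and dataset formats.

```python
import json
import os
import tempfile
from collections import Counter


class MajorityLabelModel:
    """Toy model: always predicts the most frequent label seen in training."""

    @staticmethod
    def get_knob_config():
        # Hyperparameter space: knob name -> candidate values.
        # Plain lists are an assumption; Rafiki defines its own knob types.
        return {"max_examples": [10, 100, 1000]}

    def __init__(self, **knobs):
        # Concrete knob values chosen from the space above.
        self._max_examples = knobs.get("max_examples", 100)
        self._label = None

    def _read_labels(self, dataset_path):
        # Assumed dataset format: one JSON object per line with a "label" key.
        with open(dataset_path) as f:
            return [json.loads(line)["label"] for line in f if line.strip()]

    def train(self, dataset_path):
        # Use at most max_examples training labels.
        labels = self._read_labels(dataset_path)[: self._max_examples]
        self._label = Counter(labels).most_common(1)[0][0]

    def evaluate(self, dataset_path):
        # Accuracy of the constant prediction on the validation labels.
        labels = self._read_labels(dataset_path)
        return sum(label == self._label for label in labels) / len(labels)

    def dump_parameters(self):
        # Everything needed to restore the trained model.
        return {"label": self._label}

    def load_parameters(self, params):
        self._label = params["label"]

    def predict(self, queries):
        # Batch prediction: one output per query.
        return [self._label for _ in queries]


# Usage: train on a tiny temporary dataset, then restore a copy from the dump.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "train.jsonl")
    with open(path, "w") as f:
        for label in ["shirt", "shirt", "shoe"]:
            f.write(json.dumps({"label": label}) + "\n")
    model = MajorityLabelModel(max_examples=10)
    model.train(path)
    accuracy = model.evaluate(path)

restored = MajorityLabelModel(max_examples=10)
restored.load_parameters(model.dump_parameters())
predictions = restored.predict(["query 1", "query 2"])
```

Keeping the dumped parameters as a small, plain dictionary makes the dump/load pair easy to verify: a freshly constructed model that loads the dump should predict exactly as the trained one does.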
Your model implementation has to follow a specific task’s specification (see Supported Tasks).
To aid your implementation, you can refer to Sample Models.
Testing¶
After implementing your model, you’ll use rafiki.model.dev.test_model_class() to test your model.
Refer to its documentation for more details on how to use it, or refer to the sample models’ usage of the method.
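Before running Rafiki's own test method, a quick local sanity check can catch basic mistakes. The sketch below does not reproduce rafiki.model.dev.test_model_class(), whose signature is not shown in this guide. It is a hypothetical helper, check_round_trip, that assumes the model exposes train, evaluate, dump_parameters, load_parameters and predict methods, and verifies that a model restored from dumped parameters predicts the same as the trained one. ConstantModel is a stand-in used only to exercise the helper, and the file names are placeholders that it never opens.

```python
class ConstantModel:
    """Stand-in model used only to exercise the check below."""

    def __init__(self, **knobs):
        self._value = knobs.get("value", 0)
        self._trained = False

    def train(self, dataset_path):
        # A real model would read the dataset at dataset_path here.
        self._trained = True

    def evaluate(self, dataset_path):
        return 1.0 if self._trained else 0.0

    def dump_parameters(self):
        return {"value": self._value, "trained": self._trained}

    def load_parameters(self, params):
        self._value = params["value"]
        self._trained = params["trained"]

    def predict(self, queries):
        return [self._value for _ in queries]


def check_round_trip(model_class, knobs, train_path, val_path, queries):
    """Train, evaluate, dump, reload, and compare predictions (hypothetical helper)."""
    model = model_class(**knobs)
    model.train(train_path)
    score = model.evaluate(val_path)

    # Restore a fresh instance from the dumped parameters.
    restored = model_class(**knobs)
    restored.load_parameters(model.dump_parameters())

    expected = model.predict(queries)
    actual = restored.predict(queries)
    if len(actual) != len(queries):
        raise AssertionError("predict() must return one prediction per query")
    if actual != expected:
        raise AssertionError("restored model disagrees with the trained model")
    return score


score = check_round_trip(ConstantModel, {"value": 7}, "train.csv", "val.csv", [1, 2, 3])
```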
Logging¶
utils.logger in the rafiki.model module provides a set of methods to log messages & metrics while your model is training.
These messages & metrics are displayed on Rafiki Web Admin for monitoring & debugging purposes.
Refer to rafiki.model.LoggerUtils for more details.
Dataset Loading¶
utils.dataset in the rafiki.model module provides a simple set of in-built dataset loading methods.
Refer to rafiki.model.DatasetUtils for more details.
Defining Hyperparameter Search Space¶
Refer to How Model Tuning Works for the specifics of how you can tune your models on Rafiki.
Sample Models¶
To illustrate how to write models for Rafiki, we have written the following:
Sample pre-processing logic to convert common dataset formats to Rafiki’s own dataset formats in ./examples/datasets/
Sample models in ./examples/models/
Example: Testing Models for IMAGE_CLASSIFICATION¶
Download & pre-process the original Fashion MNIST dataset to the dataset format specified by IMAGE_CLASSIFICATION:
python examples/datasets/image_files/load_fashion_mnist.py
Install the Python dependencies for the sample models:
pip install scikit-learn==0.20.0
pip install tensorflow==1.12.0
Test the sample models in ./examples/models/image_classification:
python examples/models/image_classification/SkDt.py
python examples/models/image_classification/TfFeedForward.py
Example: Testing Models for POS_TAGGING¶
Download & pre-process the subsample of the Penn Treebank dataset to the dataset format specified by POS_TAGGING:
python examples/datasets/corpus/load_sample_ptb.py
Install the Python dependencies for the sample models:
pip install torch==0.4.1
Test the sample models in ./examples/models/pos_tagging:
python examples/models/pos_tagging/BigramHmm.py
python examples/models/pos_tagging/PyBiLstm.py
Configuring the Model’s Environment¶
Your model will be run in Python 3.6 with the following Python libraries pre-installed:
requests==2.20.0
numpy==1.14.5
Pillow==5.3.0
Additionally, you’ll specify a list of Python dependencies to be installed for your model prior to model training and inference. This is configurable with the dependencies option during model creation. These dependencies will be lazily installed on top of the worker’s Docker image before your model’s code is executed.
If the model is to be run on GPU, Rafiki maps dependencies to their GPU-supported versions where such versions exist.
For example, { 'tensorflow': '1.12.0' } will be installed as { 'tensorflow-gpu': '1.12.0' }.
Rafiki can also parse specific dependency names to install certain non-PyPI packages.
For example, { 'singa': '1.1.1' } will be installed as singa-cpu=1.1.1 or singa-gpu=1.1.1 using conda.
Refer to the list of officially supported dependencies below. Dependencies that are not listed will be installed as PyPI packages of the specified name and version.
Dependency | Installation Command
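The mapping rules described in this section can be sketched as a small function. This is an illustration of the stated behaviour only, not Rafiki's implementation: the function name resolve_install_specs, the use_gpu flag, and the (installer, spec) return shape are hypothetical. It covers just the two special cases this guide names (tensorflow and singa) and falls back to a PyPI package for everything else.

```python
def resolve_install_specs(dependencies, use_gpu):
    """Turn {name: version} into (installer, package spec) pairs (hypothetical helper)."""
    specs = []
    for name, version in dependencies.items():
        if name == "tensorflow":
            # GPU workers get the GPU build of the same version.
            package = "tensorflow-gpu" if use_gpu else "tensorflow"
            specs.append(("pip", f"{package}=={version}"))
        elif name == "singa":
            # Non-PyPI package, installed with conda in a CPU or GPU variant.
            package = "singa-gpu" if use_gpu else "singa-cpu"
            specs.append(("conda", f"{package}={version}"))
        else:
            # Unlisted dependencies: PyPI package of the given name and version.
            specs.append(("pip", f"{name}=={version}"))
    return specs


gpu_specs = resolve_install_specs({"tensorflow": "1.12.0", "singa": "1.1.1"}, use_gpu=True)
cpu_specs = resolve_install_specs({"singa": "1.1.1", "torch": "0.4.1"}, use_gpu=False)
```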
Alternatively, you can build a custom Docker image that extends rafikiai/rafiki_worker, installing the required dependencies for your model. This is configurable with the docker_image option during model creation.
Your model should adapt to GPU availability based on the environment variable CUDA_AVAILABLE_DEVICES.
If CUDA_AVAILABLE_DEVICES is set to -1, your model should simply run on CPU.
You can assume that your model has exclusive access to the GPUs listed in CUDA_AVAILABLE_DEVICES.
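One way to act on this variable is to parse it once at start-up and branch on the result. The sketch below is an assumption-laden example: the helper name gpu_device_ids is hypothetical, the value is assumed to be a comma-separated list of integer device IDs, and an unset or empty variable is treated the same as -1 (CPU only). This guide specifies only the -1 case.

```python
import os


def gpu_device_ids(env=None):
    """Return the GPU IDs listed in CUDA_AVAILABLE_DEVICES, or [] for CPU-only."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_AVAILABLE_DEVICES", "-1").strip()
    if raw in ("", "-1"):
        # -1 means no GPU; unset or empty is treated the same way (assumption).
        return []
    # Assumed format: comma-separated integer device IDs, e.g. "0,2".
    return [int(part) for part in raw.split(",") if part.strip()]


device_ids = gpu_device_ids()
use_gpu = len(device_ids) > 0
```

Passing the environment as an argument keeps the helper easy to test without modifying the real process environment.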