rafiki.model
Core Classes
- class rafiki.model.BaseModel(**knobs)

  Rafiki's base model class that Rafiki models should extend. Rafiki models should implement all abstract methods according to their associated tasks' specifications, together with the static method get_knob_config().

  In the model's __init__ method, call super().__init__(**knobs) as the first line, followed by the model's initialization logic. The model should initialize itself with knobs, a set of generated knob values for the instance, and possibly save the knobs' values as attribute(s) of the model instance. These knob values will be chosen by Rafiki based on the model's knob config.

  For example:

      def __init__(self, **knobs):
          super().__init__(**knobs)
          self.__dict__.update(knobs)
          ...
          self._build_model(self.knob1, self.knob2)

  Parameters: knobs (dict[str, any]) – Dictionary of knob values for this model instance
- static get_knob_config()

  Return a dictionary defining this model class' knob configuration (i.e. the list of knob names, their data types and their ranges).

  Returns: Dictionary defining this model's knob configuration
  Return type: dict[str, rafiki.model.BaseKnob]
- train(dataset_uri)

  Train this model instance with the given dataset and initialized knob values.

  Parameters: dataset_uri (str) – URI of the train dataset in a format specified by the task
- evaluate(dataset_uri)

  Evaluate this model instance with the given dataset after training. This will only be called after the model has been trained.

  Parameters: dataset_uri (str) – URI of the test dataset in a format specified by the task
  Returns: Accuracy as a float from 0 to 1 on the test dataset
  Return type: float
- predict(queries)

  Make predictions on a batch of queries with this model instance after training. Each prediction should be JSON serializable. This will only be called after the model has been trained.

  Parameters: queries (list[any]) – List of queries, where a query is in a format specified by the task
  Returns: List of predictions, where a prediction is in a format specified by the task
  Return type: list[any]
- dump_parameters()

  Return a dictionary of model parameters that fully defines this model instance's trained state. This dictionary should be serializable by Python's pickle module. This will be used for trained model serialization within Rafiki. This will only be called after the model has been trained.

  Returns: Dictionary of model parameters
  Return type: dict[str, any]
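As an illustration, here is a minimal sketch of a model exposing its trained state; the attribute names _weights and _vocab are hypothetical, and any pickle-serializable values would work:

```python
import pickle

# Minimal sketch of a model exposing its trained state. The attribute
# names (_weights, _vocab) are hypothetical placeholders for whatever
# a real model learns during training.
class MyModel:
    def __init__(self):
        self._weights = [0.1, 0.2, 0.3]           # e.g. learned weights
        self._vocab = {'hello': 0, 'world': 1}    # e.g. learned vocabulary

    def dump_parameters(self):
        # Everything returned here must survive a pickle round-trip
        return {'weights': self._weights, 'vocab': self._vocab}

model = MyModel()
params = model.dump_parameters()
# Verify the dictionary is serializable by Python's pickle module
restored = pickle.loads(pickle.dumps(params))
```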
Knob Classes
- class rafiki.model.CategoricalKnob(values)

  Knob type representing a categorical value of type int, float, bool or str. A generated value of this knob would be an element of values.
- class rafiki.model.IntegerKnob(value_min, value_max, is_exp=False)

  Knob type representing any int value within the interval [value_min, value_max]. is_exp specifies whether the knob value should be scaled exponentially.
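Putting the two knob types together, here is a self-contained sketch of a knob configuration as a model's get_knob_config() might return it. The stub classes below stand in for the real rafiki.model knob types so the example runs outside Rafiki (only their constructor signatures mirror the API above), and the knob names are hypothetical:

```python
# Stubs standing in for rafiki.model.CategoricalKnob and
# rafiki.model.IntegerKnob; only the constructor signatures
# mirror the documented API.
class CategoricalKnob:
    def __init__(self, values):
        self.values = values

class IntegerKnob:
    def __init__(self, value_min, value_max, is_exp=False):
        self.value_min = value_min
        self.value_max = value_max
        self.is_exp = is_exp

# A hypothetical knob config, shaped like the dict a model's
# static get_knob_config() should return
def get_knob_config():
    return {
        'hidden_units': IntegerKnob(8, 128, is_exp=True),  # searched on an exponential scale
        'activation': CategoricalKnob(['relu', 'tanh', 'sigmoid']),
        'batch_size': CategoricalKnob([16, 32, 64]),
    }
```

Rafiki would sample a concrete value for each knob from this config and pass the resulting dict to the model's __init__ as **knobs.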
Utility Classes & Methods
- rafiki.model.test_model_class(model_file_path, model_class, task, dependencies, train_dataset_uri, test_dataset_uri, queries=[], knobs=None)

  Tests whether a model class is properly defined by running a full train-inference flow. The model instance's methods will be called in an order similar to that in Rafiki. It is assumed that all of the model's dependencies have been installed in the current Python environment.

  Parameters:
  - model_file_path (str) – Path to a single Python file that contains the definition for the model class
  - model_class (str) – The name of the model class inside the Python file. This class should implement rafiki.model.BaseModel
  - task (str) – Task type of the model
  - dependencies (dict[str, str]) – Model's dependencies
  - train_dataset_uri (str) – URI of the train dataset for testing the training of the model
  - test_dataset_uri (str) – URI of the test dataset for testing the evaluation of the model
  - queries (list[any]) – List of queries for testing predictions with the trained model
  - knobs (dict[str, any]) – Knobs to train the model with. If not specified, knobs from an advisor will be used

  Returns: The trained model
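To make the parameter shapes concrete, here is a hypothetical set of arguments. The file path, class name, task name, dataset URIs and dependency versions are placeholders, not real assets, and the actual call is shown commented out since it requires a working Rafiki environment:

```python
# Hypothetical arguments for test_model_class; all values below are
# placeholders, not real files or datasets.
model_file_path = 'examples/models/my_model.py'
dependencies = {'tensorflow': '1.12.0'}      # package name -> version
train_dataset_uri = 'data/train.zip'
test_dataset_uri = 'data/test.zip'
queries = [[[0] * 28] * 28]                  # e.g. one 28x28 grayscale image

# Requires a Rafiki environment and real datasets, so shown commented out:
# from rafiki.model import test_model_class
# trained_model = test_model_class(model_file_path, 'MyModel', 'IMAGE_CLASSIFICATION',
#                                  dependencies, train_dataset_uri, test_dataset_uri,
#                                  queries=queries)
```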
- class rafiki.model.ModelLogger

  Allows models to log messages and metrics during model training, and to define plots for visualization of model training.

  To use this logger, import the global logger instance from the module rafiki.model.

  For example:

      from rafiki.model import logger, BaseModel
      ...

      class MyModel(BaseModel):
          ...
          def train(self, dataset_uri):
              ...
              logger.log('Starting model training...')
              logger.define_plot('Precision & Recall', ['precision', 'recall'], x_axis='epoch')
              ...
              logger.log(precision=0.1, recall=0.6, epoch=1)
              ...
              logger.log('Ending model training...')
          ...
- define_loss_plot()

  Convenience method for defining a plot of loss against epoch. To be used with rafiki.model.ModelLogger.log_loss().

- log_loss(loss, epoch)

  Convenience method for logging loss against epoch. To be used with rafiki.model.ModelLogger.define_loss_plot().
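As a sketch of the call pattern only (not Rafiki's implementation), the following stub stands in for the global logger so the example runs outside Rafiki; the training loop and loss values are placeholders:

```python
# Minimal stub standing in for rafiki.model's global `logger`.
# It only records calls; Rafiki's real logger forwards them for
# storage and visualization.
class _StubLogger:
    def __init__(self):
        self.plots = []
        self.records = []

    def define_loss_plot(self):
        # Assumed to define a plot of 'loss' against 'epoch'
        self.plots.append(('loss', 'epoch'))

    def log_loss(self, loss, epoch):
        self.records.append({'loss': loss, 'epoch': epoch})

logger = _StubLogger()

def train_with_loss_logging(num_epochs):
    # Define the loss plot once, then log loss per epoch
    logger.define_loss_plot()
    for epoch in range(num_epochs):
        loss = 1.0 / (epoch + 1)   # placeholder training loss
        logger.log_loss(loss, epoch)

train_with_loss_logging(3)
```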
- define_plot(title, metrics, x_axis=None)

  Defines a plot for a set of metrics for analysis of model training. By default, metrics will be plotted against time.

  For example, a model's precision & recall logged with e.g. log(precision=0.1, recall=0.6, epoch=1) can be visualized in the plots generated by define_plot('Precision & Recall', ['precision', 'recall']) (against time) or define_plot('Precision & Recall', ['precision', 'recall'], x_axis='epoch') (against epochs).

  Only call this method in rafiki.model.BaseModel.train().

  Parameters:
  - title (str) – Title of the plot
  - metrics (list[str]) – List of metrics that should be plotted on the y-axis
  - x_axis (str) – Metric that should be plotted on the x-axis, against all other metrics. Defaults to 'time', which is automatically logged
- log(msg='', **metrics)

  Logs a message and/or a set of metrics at a single point in time.

  Logged messages will be viewable on Rafiki's administrative UI.

  To visualize logged metrics on plots, a plot must first be defined via rafiki.model.ModelLogger.define_plot().

  Only call this method in rafiki.model.BaseModel.train() and rafiki.model.BaseModel.evaluate().

  Parameters:
  - msg (str) – Message to be logged
  - metrics (dict[str, int|float]) – Set of metrics & their values to be logged as { <metric>: <value> }, where <value> should be a number
- class rafiki.model.ModelDatasetUtils

  Collection of utility methods to help with the loading of datasets.

  To use these utility methods, import the global dataset_utils instance from the module rafiki.model.

  For example:

      from rafiki.model import dataset_utils
      ...

      def train(self, dataset_uri):
          ...
          dataset_utils.load_dataset_of_image_files(dataset_uri)
          ...
- load_dataset_of_corpus(dataset_uri, tags=['tag'], split_by='\\n')

  Loads a dataset with type CORPUS.

  Parameters: dataset_uri (str) – URI of the dataset file
  Returns: An instance of CorpusDataset
- load_dataset_of_image_files(dataset_uri, image_size=None)

  Loads a dataset with type IMAGE_FILES.

  Parameters:
  - dataset_uri (str) – URI of the dataset file
  - image_size (str) – Dimensions to resize all images to (None for no resizing)

  Returns: An instance of ImageFilesDataset
- class rafiki.model.ImageFilesDataset(dataset_path, image_size)

  Class that helps with loading of a dataset with type IMAGE_FILES.

  classes is the number of image classes. Each dataset example is (image, class), where each image is a 2D list of integers in the range [0, 255] as grayscale pixel values, and each class is an integer from 0 to (classes - 1).
- class rafiki.model.CorpusDataset(dataset_path, tags, split_by)

  Class that helps with loading of a dataset with type CORPUS.

  tags is the expected list of tags for each token in the corpus. Dataset samples are grouped as sentences by a delimiter token corresponding to split_by. tag_num_classes is a list of the number of classes for each tag, in the same order as tags. Each dataset sample is [[token, <tag_1>, <tag_2>, ..., <tag_k>]], where each token is a string and each <tag_i> is an integer from 0 to (k_i - 1) giving that token's class for the i-th tag, with tags appearing in the same order as tags.
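To make the sample layout above concrete, here is a hypothetical part-of-speech corpus sample for tags=['pos']; the tokens and class indices are illustrative only, not taken from any real dataset:

```python
# Hypothetical CORPUS dataset sample for tags=['pos'] (one tag per token).
# Each element is [token, <tag_1>, ..., <tag_k>], where each tag value is
# an integer class index from 0 to (k_i - 1) for that tag.
tags = ['pos']
sentence = [
    ['Rafiki', 0],   # e.g. class 0 = proper noun
    ['trains', 1],   # e.g. class 1 = verb
    ['models', 2],   # e.g. class 2 = common noun
]

# Each token row carries the token string plus one integer per tag,
# in the same order as `tags`
assert all(len(row) == 1 + len(tags) for row in sentence)
```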