rafiki.model¶

Core¶

class rafiki.model.BaseModel(**knobs: Dict[str, Any])[source]¶

Rafiki’s base model class that Rafiki models must extend.

A Rafiki model must implement all abstract methods below, according to the specification of its associated task (see Supported Tasks). These methods configure how the model template will be trained, evaluated, tuned, serialized and served on Rafiki.

In the model’s __init__ method, call super().__init__(**knobs) as the first line, followed by the model’s initialization logic. The model should initialize itself with knobs, a set of generated knob values for the created model instance.

These knob values are chosen by Rafiki based on the model’s knob configuration (defined in rafiki.model.BaseModel.get_knob_config()).

For example:

def __init__(self, **knobs):
    super().__init__(**knobs)
    self.__dict__.update(knobs)
    ...
    self._build_model(self.knob1, self.knob2)

Parameters: knobs (rafiki.model.Knobs) – Dictionary mapping knob names to knob values

abstract static get_knob_config() → Dict[str, rafiki.model.knob.BaseKnob][source]¶

Return a dictionary that defines the search space for this model template’s knobs (i.e. knobs’ names, their types & their ranges).

Over the course of training, your model will be initialized with different values of knobs within this search space to maximize this model’s performance.

Refer to How Model Tuning Works to understand more about how this works.

Returns: Dictionary mapping knob names to knob specifications
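As an illustration, a knob configuration for a hypothetical classifier might look like the sketch below. So that the snippet runs without Rafiki installed, it defines small stand-in classes that only mirror the constructor signatures documented on this page; in a real model you would import the knob classes from rafiki.model instead. The knob names (learning_rate, hidden_units, activation, epochs) and their ranges are invented for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


# Stand-ins (assumptions) mirroring the documented constructor signatures.
# In a real model: from rafiki.model import FloatKnob, IntegerKnob, ...
class BaseKnob:
    pass


@dataclass
class FloatKnob(BaseKnob):
    value_min: float
    value_max: float
    is_exp: bool = False


@dataclass
class IntegerKnob(BaseKnob):
    value_min: int
    value_max: int
    is_exp: bool = False


@dataclass
class CategoricalKnob(BaseKnob):
    values: List[Any]


@dataclass
class FixedKnob(BaseKnob):
    value: Any


class ToyModel:
    @staticmethod
    def get_knob_config() -> Dict[str, BaseKnob]:
        # Knob names and ranges are hypothetical, chosen only for illustration
        return {
            'learning_rate': FloatKnob(1e-5, 1e-1, is_exp=True),
            'hidden_units': IntegerKnob(16, 256),
            'activation': CategoricalKnob(['relu', 'tanh']),
            'epochs': FixedKnob(10),
        }


knob_config = ToyModel.get_knob_config()
```

Each trial would then construct the model with one concrete value per knob name, e.g. ToyModel(learning_rate=0.003, hidden_units=64, activation='relu', epochs=10).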

abstract train(dataset_path: str, shared_params: Optional[Dict[str, Union[str, int, float, numpy.ndarray]]] = None, **train_args)[source]¶

Train this model instance with the given training dataset and initialized knob values. Additional keyword arguments could be passed depending on the task’s specification.

Additionally, trained parameters shared from previous trials could be passed, as part of the SHARE_PARAMS policy (see Model Policies).

Subsequently, the model is considered trained.

Parameters

dataset_path – File path of the train dataset file in the local filesystem, in a format specified by the task
shared_params – Dictionary mapping parameter names to values, as produced by your model’s rafiki.model.BaseModel.dump_parameters().

abstract evaluate(dataset_path: str) → float[source]¶

Evaluate this model instance with the given validation dataset after training.

This will be called only after the model is trained.

Parameters: dataset_path – File path of the validation dataset file in the local filesystem, in a format specified by the task
Returns: A score associated with the validation performance of the trained model instance; the higher the better (e.g. classification accuracy).

abstract predict(queries: List[Any]) → List[Any][source]¶

Make predictions on a batch of queries after training.

This will be called only after the model is trained.

Parameters: queries – List of queries, where a query is in the format specified by the task
Returns: List of predictions, in an order corresponding to the queries, where a prediction is in the format specified by the task

abstract dump_parameters() → Dict[str, Union[str, int, float, numpy.ndarray]][source]¶

Returns a dictionary of model parameters that fully define the trained state of the model. This dictionary must conform to the format rafiki.model.Params. This will be used to save the trained model in Rafiki.

Additionally, trained parameters produced by this method could be shared with future trials, as part of the SHARE_PARAMS policy (see Model Policies).

This will be called only after the model is trained.

Returns: Dictionary mapping parameter names to values

abstract load_parameters(params: Dict[str, Union[str, int, float, numpy.ndarray]])[source]¶

Loads this model instance with previously trained model parameters produced by your model’s rafiki.model.BaseModel.dump_parameters(). This model instance’s initialized knob values will match those during training.

Subsequently, the model is considered trained.
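The round trip between dump_parameters() and load_parameters() can be sketched as follows. This is a minimal illustration with a hypothetical ToyModel, not Rafiki's own code. Because parameter values are limited to str, int, float and numpy.ndarray, the sketch assumes that other values (here, a list of labels) are encoded as a JSON string; that encoding is a choice made for this example. To stay runnable with only the standard library, it omits numpy.ndarray from its stand-in Params type.

```python
import json
from typing import Dict, Union

# Stand-in for rafiki.model.Params without the numpy.ndarray member,
# so that the sketch needs only the standard library
Params = Dict[str, Union[str, int, float]]


class ToyModel:
    """Hypothetical model whose trained state is a bias and a list of labels."""

    def __init__(self):
        self._bias = None
        self._labels = None

    def train(self, dataset_path: str) -> None:
        # Placeholder for real training logic; the dataset is not read here
        self._bias = 0.25
        self._labels = ['cat', 'dog']

    def dump_parameters(self) -> Params:
        return {
            'bias': self._bias,
            # A list is not an allowed value type, so encode it as a str
            'labels': json.dumps(self._labels),
        }

    def load_parameters(self, params: Params) -> None:
        self._bias = params['bias']
        self._labels = json.loads(params['labels'])


trained = ToyModel()
trained.train('/path/to/train_dataset')  # placeholder path, ignored by this toy
params = trained.dump_parameters()

# A fresh instance restored from the dumped parameters
restored = ToyModel()
restored.load_parameters(params)
```

After load_parameters(), the restored instance holds the same state as the trained one, which is what "considered trained" requires of a real model.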

destroy()[source]¶: Destroy this model instance, freeing any resources held by this model instance. No other instance methods will be called subsequently.

static teardown()[source]¶: Runs class-wide teardown logic (e.g. close a training session shared across trials).

class rafiki.model.BaseKnob[source]¶: The base class for a knob type.

rafiki.model.Knobs = typing.Dict[str, typing.Any]¶

rafiki.model.KnobConfig = typing.Dict[str, rafiki.model.knob.BaseKnob]¶

rafiki.model.Params = typing.Dict[str, typing.Union[str, int, float, numpy.ndarray]]¶

Knobs¶

class rafiki.model.CategoricalKnob(values: List[Union[str, int, float, bool, rafiki.model.knob.KnobValue]])[source]¶: Knob type representing a categorical value of type int, float, bool or str. values is a list of candidate categorical values for this knob type; a realization of this knob type would be an element of values.

class rafiki.model.KnobValue(value: Union[str, int, float, bool])[source]¶: Wrapper for a CategoricalValue.

class rafiki.model.IntegerKnob(value_min: int, value_max: int, is_exp: bool = False)[source]¶: Knob type representing an int value within a specific interval [value_min, value_max]. is_exp specifies whether the knob value should be scaled exponentially.

class rafiki.model.FloatKnob(value_min: float, value_max: float, is_exp: bool = False)[source]¶: Knob type representing a float value within a specific interval [value_min, value_max]. is_exp specifies whether the knob value should be scaled exponentially.
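This page does not spell out what "scaled exponentially" means. One common reading, assumed here, is that values are drawn uniformly in log space, so that each order of magnitude in [value_min, value_max] is equally likely. The sketch below illustrates that reading with a hypothetical sample_float_knob helper; it is not Rafiki's sampling code, and Rafiki's advisor may behave differently.

```python
import math
import random
from typing import Optional


def sample_float_knob(value_min: float, value_max: float, is_exp: bool = False,
                      rng: Optional[random.Random] = None) -> float:
    """Hypothetical sampler for illustration; not part of Rafiki's API."""
    rng = rng if rng is not None else random.Random()
    if is_exp:
        # Sample uniformly in log space, then map back (requires value_min > 0)
        log_value = rng.uniform(math.log(value_min), math.log(value_max))
        # Clamp to guard against floating-point rounding at the bounds
        return min(max(math.exp(log_value), value_min), value_max)
    return rng.uniform(value_min, value_max)


rng = random.Random(0)
samples = [sample_float_knob(1e-5, 1e-1, is_exp=True, rng=rng) for _ in range(1000)]
```

Under this reading, about half of the samples for the interval [1e-5, 1e-1] fall below 1e-3, whereas plain uniform sampling would place almost all of them above 1e-3.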

class rafiki.model.FixedKnob(value: Union[str, int, float, bool, rafiki.model.knob.KnobValue])[source]¶: Knob type representing a fixed value of type int, float, bool or str. Essentially, this represents a knob type that does not require tuning.

class rafiki.model.PolicyKnob(policy: str)[source]¶

Knob type representing whether a certain policy should be activated, as a boolean. E.g. the EARLY_STOP policy knob decides whether the model should stop model training early, or not. Offering the ability to activate different policies can optimize hyperparameter search for your model. Activation of all policies defaults to false.

Refer to Model Policies to understand how to use this knob type.
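A model might consume a policy knob's boolean value roughly as sketched below. The knob names early_stop and max_epochs, and the choice to halve the number of epochs, are hypothetical; the actual meaning of each policy is defined in Model Policies. The sketch uses a plain class so that it runs without Rafiki installed.

```python
class ToyModel:
    """Hypothetical model; in Rafiki this would extend rafiki.model.BaseModel."""

    def __init__(self, **knobs):
        self._knobs = knobs
        self.epochs_run = 0

    def train(self, dataset_path: str) -> None:
        max_epochs = self._knobs['max_epochs']
        # Assumes the knob config declared 'early_stop': PolicyKnob('EARLY_STOP'),
        # so the realized knob value is a boolean (False when not activated)
        if self._knobs.get('early_stop', False):
            max_epochs = max(1, max_epochs // 2)
        for _ in range(max_epochs):
            self.epochs_run += 1  # placeholder for one epoch of training


quick = ToyModel(max_epochs=10, early_stop=True)
quick.train('/path/to/train_dataset')  # placeholder path, ignored by this toy

full = ToyModel(max_epochs=10, early_stop=False)
full.train('/path/to/train_dataset')
```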

class rafiki.model.ArchKnob(items: List[List[Union[str, int, float, bool, rafiki.model.knob.KnobValue]]])[source]¶

Knob type representing part of a model’s architecture as a fixed-size list of categorical values. items is a list of lists of candidate categorical values; a realization of this knob type would be a list of categorical values, with the value at each index matching an element of the list of candidates at that index.

To illustrate, the following can be a definition of 3-layer model’s architecture search space inspired by the ENAS cell architecture construction strategy:

from rafiki.model import ArchKnob, KnobValue

l0 = KnobValue(0) # Input layer as input connection
l1 = KnobValue(1) # Layer 1 as input connection
l2 = KnobValue(2) # Layer 2 as input connection
ops = [KnobValue('conv3x3'), KnobValue('conv5x5'), KnobValue('avg_pool'), KnobValue('max_pool')]
arch_knob = ArchKnob([
    [l0], ops, [l0], ops,                   # To form layer 1, choose input 1, op on input 1, input 2, op on input 2, then combine post-op inputs as preferred
    [l0, l1], ops, [l0, l1], ops,           # To form layer 2, ...
    [l0, l1, l2], ops, [l0, l1, l2], ops,   # To form layer 3, ...
])

If the same KnobValue instance is reused at different indices in items, the occurrences are considered to be semantically identical for the purposes of architecture search. For example, in the above code snippet, l0 in the 1st item index and l0 in the 3rd item index have the same meaning: both refer to an input connection from the input layer.

It is assumed that the realization of components of the architecture later in the list is influenced by the components of architecture earlier in the list. For example, which operation is to be applied on an input (e.g. which value in ops) is somewhat dependent on the source of the input (e.g. which value in [l0, l1, l2]).

Note that the exact model architecture space is not fully described with this knob - it still depends on how the model code constructs the computation graph and implements the various building blocks of the model.

Encoding of the architecture with ArchKnob can be flexibly defined as necessary. You can search only over operations and have the connections in the architecture already hard-coded as part of the model, or you can search only over input indices of various layers and have operations already decided.
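To make the notion of a realization concrete, the sketch below draws one value per index from a search space laid out like the 3-layer example above, then decodes the flat list into per-layer groups. It uses plain Python values instead of KnobValue instances, and a random choice as a stand-in for Rafiki's architecture search; the grouping into (input 1, op 1, input 2, op 2) tuples is one way model code might decode the list, not a required convention.

```python
import random

ops = ['conv3x3', 'conv5x5', 'avg_pool', 'max_pool']

# Same layout as the 3-layer search space, with plain values in place of KnobValue
items = [
    [0], ops, [0], ops,
    [0, 1], ops, [0, 1], ops,
    [0, 1, 2], ops, [0, 1, 2], ops,
]

rng = random.Random(0)

# A realization holds one candidate per index
# (random choice here is only a stand-in for the actual search)
arch = [rng.choice(candidates) for candidates in items]

# Decode the flat list into one (input1, op1, input2, op2) tuple per layer
layers = [tuple(arch[i:i + 4]) for i in range(0, len(arch), 4)]
```

The model code would then build layer 1 from layers[0], layer 2 from layers[1], and so on, combining the two post-op inputs of each layer as it sees fit.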

Utility¶

rafiki.model.dev.test_model_class(model_file_path: str, model_class: str, task: str, dependencies: Dict[str, str], train_dataset_path: str, val_dataset_path: str, test_dataset_path: str = None, budget: Dict[rafiki.constants.BudgetOption, Any] = None, train_args: Dict[str, any] = None, queries: List[Any] = None) -> (typing.List[typing.Any], <class 'rafiki.model.model.BaseModel'>)[source]¶

Tests whether a model class is likely to be correctly defined by locally simulating a full train-inference flow of your model on a given dataset. The model’s methods will be called in a manner similar to that in Rafiki.

This method assumes that your model’s Python dependencies have already been installed.

This method also reads knob values and budget options from CLI arguments. For example, you can pass --TIME_HOURS=0.01 to configure the budget, or --learning_rate=0.01 to fix a knob’s value.

Parameters

model_file_path – Path to a single Python file that contains the definition for the model class
model_class – The name of the model class inside the Python file. This class should implement rafiki.model.BaseModel
task – Task type of model
dependencies – Model’s dependencies
train_dataset_path – File path of the train dataset for training of the model
val_dataset_path – File path of the validation dataset for evaluating trained models
test_dataset_path – File path of the test dataset for testing the final best trained model, if provided
budget – Budget for model training
train_args – Additional arguments to pass to models during training, if any
queries – List of queries for testing predictions with the trained model

Returns

(<predictions of best trained model>, <best trained model>)
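A call to this function might be wrapped as in the sketch below, which uses only the parameters documented above. The file name my_model.py, the class name MyModel, the dependency entry and the dataset paths are all placeholders invented for the example, and the exact keys accepted in dependencies are an assumption. The import is placed inside the function so that the snippet can be loaded without Rafiki installed; calling run_local_test() does require it.

```python
def run_local_test():
    """Hypothetical wrapper around rafiki.model.dev.test_model_class."""
    # Imported lazily so that defining this function does not require Rafiki
    from rafiki.model.dev import test_model_class

    return test_model_class(
        model_file_path='my_model.py',          # placeholder file name
        model_class='MyModel',                  # placeholder class name
        task='IMAGE_CLASSIFICATION',
        dependencies={'tensorflow': '1.12.0'},  # placeholder; key format is an assumption
        train_dataset_path='data/train.zip',    # placeholder path
        val_dataset_path='data/val.zip',        # placeholder path
    )
```

If a script calls run_local_test(), the budget and knob overrides described above would be passed on that script's command line, e.g. --TIME_HOURS=0.01.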

class rafiki.model.LoggerUtils[source]¶

Allows models to log messages & metrics during model training.

This should NOT be initialized outside of the module. Instead, import the global utils instance from the module rafiki.model and use utils.logger.

For example:

from rafiki.model import utils
...
def train(self, dataset_path, **kwargs):
    ...
    utils.logger.log('Starting model training...')
    utils.logger.define_plot('Precision & Recall', ['precision', 'recall'], x_axis='epoch')
    ...
    utils.logger.log(precision=0.1, recall=0.6, epoch=1)
    ...
    utils.logger.log('Ending model training...')
    ...

define_loss_plot()[source]¶: Convenience method of defining a plot of loss against epoch. To be used with rafiki.model.LoggerUtils.log_loss().

log_loss(loss, epoch)[source]¶: Convenience method for logging loss against epoch. To be used with rafiki.model.LoggerUtils.define_loss_plot().

define_plot(title, metrics, x_axis=None)[source]¶

Defines a plot for a set of metrics for analysis of model training. By default, metrics will be plotted against time.

For example, a model’s precision & recall logged with e.g. log(precision=0.1, recall=0.6, epoch=1) can be visualized in the plots generated by define_plot('Precision & Recall', ['precision', 'recall']) (against time) or define_plot('Precision & Recall', ['precision', 'recall'], x_axis='epoch') (against epochs).

Only call this method in rafiki.model.BaseModel.train().

Parameters

title (str) – Title of the plot
metrics (str[]) – List of metrics that should be plotted on the y-axis
x_axis (str) – Metric that should be plotted on the x-axis, against all other metrics. Defaults to 'time', which is automatically logged

log(msg='', **metrics)[source]¶

Logs a message and/or a set of metrics at a single point in time.

Logged messages will be viewable on Rafiki’s administrative UI.

To visualize logged metrics on plots, a plot must be defined via rafiki.model.LoggerUtils.define_plot().

Only call this method in rafiki.model.BaseModel.train() and rafiki.model.BaseModel.evaluate().

Parameters

msg (str) – Message to be logged
metrics (dict[str, int|float]) – Set of metrics & their values to be logged as { <metric>: <value> }, where <value> should be a number.
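The relationship between log() and define_plot() can be pictured with the stand-in below. ToyLogger is a hypothetical class written for this illustration that mimics the two documented call signatures; it is not Rafiki's implementation, and its series() helper is invented to show which logged points would end up on a plot.

```python
import time
from typing import Dict, List, Optional, Tuple


class ToyLogger:
    """Stand-in (assumption) mimicking the documented log()/define_plot() calls."""

    def __init__(self):
        self.messages: List[str] = []
        self.records: List[Dict[str, float]] = []
        self.plots: List[dict] = []

    def define_plot(self, title: str, metrics: List[str],
                    x_axis: Optional[str] = None) -> None:
        # x_axis=None means "plot against time", matching the documented default
        self.plots.append({'title': title, 'metrics': metrics,
                           'x_axis': x_axis or 'time'})

    def log(self, msg: str = '', **metrics: float) -> None:
        if msg:
            self.messages.append(msg)
        if metrics:
            # 'time' is recorded automatically with every set of metrics
            self.records.append({'time': time.time(), **metrics})

    def series(self, plot: dict, metric: str) -> List[Tuple[float, float]]:
        # (x, y) points of one metric, taken from records that have both values
        x_axis = plot['x_axis']
        return [(r[x_axis], r[metric]) for r in self.records
                if x_axis in r and metric in r]


logger = ToyLogger()
logger.define_plot('Precision & Recall', ['precision', 'recall'], x_axis='epoch')
logger.log('Starting model training...')
logger.log(precision=0.1, recall=0.6, epoch=1)
logger.log(precision=0.3, recall=0.7, epoch=2)
points = logger.series(logger.plots[0], 'precision')
```

In this sketch a metric only appears on a plot if it was logged together with the plot's x-axis metric, which is why epoch is passed in the same log() call as precision and recall.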

class rafiki.model.DatasetUtils[source]¶

Collection of utility methods to help with the loading of datasets.

Usage of these methods is optional. In fact, you are encouraged to rely on your preferred ML libraries’ dataset loading methods, or hand-roll your own.

This should NOT be initialized outside of the module. Instead, import the global utils instance from the module rafiki.model and use utils.dataset.

For example:

from rafiki.model import utils
...
def train(self, dataset_path, **kwargs):
    ...
    utils.dataset.load_dataset_of_image_files(dataset_path)
    ...

load_dataset_of_corpus(dataset_path, tags=None, split_by='\\n')[source]¶

Loads dataset for the task POS_TAGGING.

Parameters: dataset_path (str) – File path of the dataset
Returns: An instance of CorpusDataset.

load_dataset_of_image_files(dataset_path, min_image_size=None, max_image_size=None, mode='RGB', if_shuffle=False)[source]¶

Loads dataset for the task IMAGE_CLASSIFICATION.

Parameters

dataset_path (str) – File path of the dataset
min_image_size (int) – minimum width and height to resize all images to
max_image_size (int) – maximum width and height to resize all images to
mode (str) – Pillow image mode. Refer to https://pillow.readthedocs.io/en/3.1.x/handbook/concepts.html#concept-modes
if_shuffle (bool) – Whether to shuffle the dataset

Returns

An instance of ImageFilesDataset

load_dataset_of_audio_files(dataset_path, dataset_dir)[source]¶

Loads dataset with type AUDIO_FILES.

Parameters: dataset_path (str) – Path or URI of the dataset file
Returns: An instance of AudioFilesDataset.

normalize_images(images, mean=None, std=None)[source]¶

Normalize all images.

If the mean (mean) and standard deviation (std) are None, they will be computed from the images.

Parameters

images – (N x width x height x channels) array-like of images to normalize
mean (float[]) – Mean for normalization, by channel
std (float[]) – Standard deviation for normalization, by channel

Returns

(images, mean, std)
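Per-channel normalization as described above can be sketched in plain Python as follows. The function normalize_images_sketch is a hypothetical illustration over nested lists, not the library's implementation (which presumably operates on NumPy arrays). The use of the population standard deviation and the fallback to 1.0 for a zero standard deviation are assumptions of this sketch.

```python
from statistics import fmean, pstdev


def normalize_images_sketch(images, mean=None, std=None):
    """Normalize nested lists shaped (N x width x height x channels), per channel."""
    channels = len(images[0][0][0])
    if mean is None or std is None:
        # Collect every pixel value of each channel across all images
        per_channel = [
            [pixel[c] for image in images for column in image for pixel in column]
            for c in range(channels)
        ]
        if mean is None:
            mean = [fmean(values) for values in per_channel]
        if std is None:
            # Population standard deviation is an assumption of this sketch
            std = [pstdev(values) for values in per_channel]
    normalized = [
        [[[(pixel[c] - mean[c]) / (std[c] or 1.0) for c in range(channels)]
          for pixel in column] for column in image]
        for image in images
    ]
    return normalized, mean, std


# Two 1x1 images with 2 channels each
images = [[[[0.0, 10.0]]], [[[2.0, 30.0]]]]
normalized, mean, std = normalize_images_sketch(images)
```

Returning mean and std alongside the images lets the values computed on the training set be reused when normalizing validation data or prediction queries, by passing them back in as arguments.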

transform_images(images, image_size=None, mode=None)[source]¶

Resize or convert a list of N images to another size and/or mode.

Parameters

images – (N x width x height x channels) array-like of images to resize
image_size (int) – width and height to resize all images to
mode (str) – Pillow image mode to convert all images to. Refer to https://pillow.readthedocs.io/en/3.1.x/handbook/concepts.html#concept-modes

Returns

output images as a (N x width x height x channels) numpy array

load_images(image_paths: List[str], mode: str = 'RGB') → numpy.ndarray[source]¶

Loads multiple images from the local filesystem. Assumes that images are of the same dimensions.

Parameters

image_paths – Paths to images
mode (str) – Pillow image mode to convert all images to. Refer to https://pillow.readthedocs.io/en/3.1.x/handbook/concepts.html#concept-modes

Returns

images as a (N x width x height x channels) numpy array

class rafiki.model.ImageFilesDataset(dataset_path, min_image_size=None, max_image_size=None, mode='RGB', if_shuffle=False)[source]¶

Class that helps loading of dataset for task IMAGE_CLASSIFICATION.

classes is the number of image classes.

Each dataset example is (image, class) where:

Each image is a 3D numpy array (width x height x channels)

Each class is an integer from 0 to (k - 1), where k is the number of image classes

class rafiki.model.CorpusDataset(dataset_path, tags, split_by)[source]¶

Class that helps loading of dataset for task POS_TAGGING.

tags is the expected list of tags for each token in the corpus. Dataset samples are grouped as sentences by a delimiter token corresponding to split_by.

tag_num_classes is a list of <number of classes for a tag>, in the same order as tags. Each dataset sample is [[token, <tag_1>, <tag_2>, …, <tag_k>]] where each token is a string, each tag_i is an integer from 0 to (k_i - 1) as each token’s corresponding class for that tag, with tags appearing in the same order as tags.
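The grouping of samples into sentences by a delimiter token can be sketched as below. The helper group_into_sentences is hypothetical and works on in-memory rows rather than a dataset file, since the file format is defined by the task and not described here. It assumes that the delimiter appears as the token of its own row, using the documented default split_by value.

```python
from typing import List


def group_into_sentences(rows: List[list], split_by: str = '\\n') -> List[List[list]]:
    """Hypothetical helper: each row is [token, tag_1, ..., tag_k]."""
    sentences, current = [], []
    for row in rows:
        if row[0] == split_by:
            # A delimiter token closes the current sentence
            if current:
                sentences.append(current)
            current = []
        else:
            current.append(row)
    if current:
        sentences.append(current)
    return sentences


# One tag per token; tag values are integer class indices
rows = [['The', 0], ['cat', 1], ['\\n', 0], ['Dogs', 1], ['bark', 2]]
sentences = group_into_sentences(rows)
```

Each resulting sentence has the [[token, <tag_1>, …, <tag_k>]] shape described above, here with k = 1.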