
lumin.nn.ensemble package

Submodules

lumin.nn.ensemble.ensemble module

class lumin.nn.ensemble.ensemble.Ensemble(input_pipe=None, output_pipe=None, model_builder=None)[source]

Bases: lumin.nn.ensemble.abs_ensemble.AbsEnsemble

Standard class for building an ensemble from a collection of trained networks produced by fold_train_ensemble(). Input and output pipelines can be added to provide easy saving and loading of exported ensembles. Currently, the input pipeline is not used, so input data is expected to be preprocessed. However, the output pipeline will be used to deprocess model predictions.

Once instantiated, lumin.nn.ensemble.ensemble.Ensemble.build_ensemble() or lumin.nn.ensemble.ensemble.Ensemble.load() should be called. Alternatively, the class methods lumin.nn.ensemble.ensemble.Ensemble.from_save() or lumin.nn.ensemble.ensemble.Ensemble.from_results() may be used.

Parameters
Examples::
>>> ensemble = Ensemble()
>>>
>>> ensemble = Ensemble(input_pipe, output_pipe, model_builder)
add_input_pipe(pipe)[source]

Add input pipeline for saving

Parameters

pipe (Pipeline) – pipeline used for preprocessing input data

Return type

None

add_output_pipe(pipe)[source]

Add output pipeline for saving

Parameters

pipe (Pipeline) – pipeline used for preprocessing target data

Return type

None

build_ensemble(results, size, model_builder, metric='loss', weighting='reciprocal', higher_metric_better=False, snapshot_args=None, location=PosixPath('train_weights'), verbose=True)[source]

Load up an instantiated Ensemble with outputs of fold_train_ensemble()

Parameters
  • results (List[Dict[str, float]]) – results saved/returned by fold_train_ensemble()

  • size (int) – number of models to load as ranked by metric

  • model_builder (ModelBuilder) – ModelBuilder used for building Model from saved models

  • metric (str) – metric name listed in results to use for ranking and weighting trained models

  • weighting (str) – ‘reciprocal’ or ‘uniform’: how to weight model predictions during prediction. ‘reciprocal’ = models weighted by 1/metric; ‘uniform’ = models treated with equal weighting

  • higher_metric_better (bool) – whether metric should be maximised or minimised

  • snapshot_args (Optional[Dict[str, Any]]) –

    Dictionary potentially containing:
      ‘cycle_losses’: returned/saved by fold_train_ensemble() when using an AbsCyclicCallback
      ‘patience’: patience value that was passed to fold_train_ensemble()
      ‘n_cycles’: number of cycles to load per model
      ‘load_cycles_only’: whether to only load cycles, or also the best performing model
      ‘weighting_pwr’: weight cycles according to (n+1)**weighting_pwr, where n is the number of cycles loaded so far.

    Models are loaded youngest to oldest

  • location (Path) – Path to save location passed to fold_train_ensemble()

  • verbose (bool) – whether to print out information of models loaded

Examples::
>>> ensemble.build_ensemble(results, 10, model_builder,
...                         location=Path('train_weights'))
>>>
>>> ensemble.build_ensemble(
...     results, 1,  model_builder,
...     location=Path('train_weights'),
...     snapshot_args={'cycle_losses':cycle_losses,
...                    'patience':patience,
...                    'n_cycles':8,
...                    'load_cycles_only':True,
...                    'weighting_pwr':0})
Return type

None
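
The two weighting options can be illustrated with a small sketch. This is not LUMIN's implementation: the function name ensemble_weights is hypothetical, and normalising the weights so that they sum to one is an assumption made here to keep weighted predictions on the same scale as the individual models.

```python
from typing import List


def ensemble_weights(metrics: List[float], weighting: str = 'reciprocal') -> List[float]:
    """Hypothetical helper: per-model weights from each model's metric value."""
    if weighting == 'reciprocal':
        raw = [1.0 / m for m in metrics]  # smaller metric (e.g. loss) -> larger weight
    elif weighting == 'uniform':
        raw = [1.0 for _ in metrics]  # every model counts equally
    else:
        raise ValueError(f'Unknown weighting: {weighting}')
    total = sum(raw)
    return [w / total for w in raw]  # assumption: normalise to sum to one


losses = [0.5, 1.0]
print(ensemble_weights(losses, 'reciprocal'))  # first model gets twice the weight
print(ensemble_weights(losses, 'uniform'))     # both models weighted equally
```

With ‘reciprocal’ weighting, a model whose loss is half that of another receives twice its weight, which is why this option suits metrics that should be minimised.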

export2onnx(base_name, bs=1)[source]

Export all Models contained in the Ensemble to ONNX format. Note that ONNX expects a fixed batch size (bs), which is the number of datapoints you wish to pass through the model concurrently.

Parameters
  • base_name (str) – Exported models will be called {base_name}_{model_num}.onnx

  • bs (int) – batch size for exported models

Return type

None
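
Because the exported models expect a fixed batch size, data whose length is not a multiple of bs has to be split and padded before inference. One possible approach is sketched below. The helper fixed_size_batches and the zero-padding strategy are assumptions for illustration and are not part of LUMIN.

```python
from typing import List, Tuple


def fixed_size_batches(rows: List[List[float]], bs: int) -> Tuple[List[List[List[float]]], int]:
    """Hypothetical helper: split rows into batches of exactly bs rows.

    The final batch is padded with all-zero rows; the number of padded rows is
    returned so that the corresponding predictions can be discarded afterwards.
    """
    if bs < 1:
        raise ValueError('bs must be a positive integer')
    n_feats = len(rows[0]) if rows else 0
    n_pad = (-len(rows)) % bs  # rows needed to reach a multiple of bs
    padded = rows + [[0.0] * n_feats for _ in range(n_pad)]
    batches = [padded[i:i + bs] for i in range(0, len(padded), bs)]
    return batches, n_pad


data = [[float(i), float(i) + 0.5] for i in range(5)]
batches, n_pad = fixed_size_batches(data, bs=2)
print(len(batches), n_pad)
```

Exporting with bs=1 avoids padding altogether, at the cost of passing datapoints through the model one at a time.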

export2tfpb(base_name, bs=1)[source]

Export all Models contained in the Ensemble to TensorFlow Protocol Buffer format, via ONNX. Note that ONNX expects a fixed batch size (bs), which is the number of datapoints you wish to pass through the model concurrently.

Parameters
  • base_name (str) – Exported models will be called {base_name}_{model_num}.pb

  • bs (int) – batch size for exported models

Return type

None

classmethod from_results(results, size, model_builder, metric='loss', weighting='reciprocal', higher_metric_better=False, snapshot_args=None, location=PosixPath('train_weights'), verbose=True)[source]

Instantiate Ensemble from the outputs of fold_train_ensemble(). If cycle models are loaded, then only uniform weighting between models is supported.

Parameters
  • results (List[Dict[str, float]]) – results saved/returned by fold_train_ensemble()

  • size (int) – number of models to load as ranked by metric

  • model_builder (ModelBuilder) – ModelBuilder used for building Model from saved models

  • metric (str) – metric name listed in results to use for ranking and weighting trained models

  • weighting (str) – ‘reciprocal’ or ‘uniform’: how to weight model predictions during prediction. ‘reciprocal’ = models weighted by 1/metric; ‘uniform’ = models treated with equal weighting

  • higher_metric_better (bool) – whether metric should be maximised or minimised

  • snapshot_args (Optional[Dict[str, Any]]) –

    Dictionary potentially containing:
      ‘cycle_losses’: returned/saved by fold_train_ensemble() when using an AbsCyclicCallback
      ‘patience’: patience value that was passed to fold_train_ensemble()
      ‘n_cycles’: number of cycles to load per model
      ‘load_cycles_only’: whether to only load cycles, or also the best performing model
      ‘weighting_pwr’: weight cycles according to (n+1)**weighting_pwr, where n is the number of cycles loaded so far.

    Models are loaded youngest to oldest

  • location (Path) – Path to save location passed to fold_train_ensemble()

  • verbose (bool) – whether to print out information of models loaded

Return type

AbsEnsemble

Returns

Built Ensemble

Examples::
>>> ensemble = Ensemble.from_results(results, 10, model_builder,
...                                  location=Path('train_weights'))
>>>
>>> ensemble = Ensemble.from_results(
...     results, 1,  model_builder,
...     location=Path('train_weights'),
...     snapshot_args={'cycle_losses':cycle_losses,
...                    'patience':patience,
...                    'n_cycles':8,
...                    'load_cycles_only':True,
...                    'weighting_pwr':0})
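
The effect of ‘weighting_pwr’ in snapshot_args can be seen by evaluating the stated formula directly. The sketch below only computes the raw (n+1)**weighting_pwr values in loading order, with n starting at zero. The helper name cycle_weights is hypothetical, and whether or how LUMIN normalises these values afterwards is not covered here.

```python
from typing import List


def cycle_weights(n_cycles: int, weighting_pwr: float) -> List[float]:
    """Hypothetical helper: raw weight (n+1)**weighting_pwr for the n-th cycle loaded."""
    return [(n + 1) ** weighting_pwr for n in range(n_cycles)]


print(cycle_weights(3, 0))  # a power of 0 gives every cycle the same weight
print(cycle_weights(3, 1))  # a power of 1 makes the weight grow linearly with n
```

A value of 0, as used in the example above, therefore treats all loaded cycles equally.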
classmethod from_save(name)[source]

Instantiate Ensemble from a saved Ensemble

Parameters

name (str) – base filename of ensemble

Return type

AbsEnsemble

Returns

Loaded Ensemble

Examples::
>>> ensemble = Ensemble.from_save('weights/ensemble')
get_feat_importance(fy, eval_metric=None)[source]

Call get_ensemble_feat_importance(), passing this Ensemble and provided arguments

Parameters
  • fy (FoldYielder) – FoldYielder interfacing to data on which to evaluate importance

  • eval_metric (Optional[EvalMetric]) – Optional EvalMetric to use for quantifying performance

Return type

DataFrame

load(name)[source]

Load an instantiated Ensemble with weights and Model from save.

Parameters

name – base name for saved objects

Examples::
>>> ensemble.load('weights/ensemble')
Return type

None

static load_trained_model(model_idx, model_builder, name='train_weights/train_')[source]

Load trained model from save file of the form {name}{model_idx}.h5

Parameters
  • model_idx – index of model to load

  • model_builder – ModelBuilder used to build the model

  • name – base name of file from which to load model

Return type

Model

Returns

Model loaded from save
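
The file name pattern {name}{model_idx}.h5 can be reproduced with a one-line format string, which is handy for checking that the expected files exist before loading. The helper trained_model_path is hypothetical and not part of LUMIN.

```python
def trained_model_path(model_idx: int, name: str = 'train_weights/train_') -> str:
    """Hypothetical helper: file name of the form {name}{model_idx}.h5."""
    return f'{name}{model_idx}.h5'


print(trained_model_path(3))  # train_weights/train_3.h5
```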

predict(inputs, n_models=None, pred_name='pred', callbacks=None, verbose=True)[source]

Compatibility method for predicting data contained in either a Numpy array or a FoldYielder. Will pass inputs to either lumin.nn.ensemble.ensemble.Ensemble.predict_array() or lumin.nn.ensemble.ensemble.Ensemble.predict_folds().

Parameters
  • inputs (Union[ndarray, FoldYielder, List[ndarray]]) – either a FoldYielder interfacing with the input data, or the input data as an array

  • n_models (Optional[int]) – number of models to use in predictions as ranked by the metric which was used when constructing the Ensemble. By default, entire ensemble is used.

  • pred_name (str) – name for new group of predictions if passed a FoldYielder

  • callbacks (Optional[List[AbsCallback]]) – list of any callbacks to use during evaluation

  • verbose (bool) – whether to print average prediction timings

Return type

Union[None, ndarray]

Returns

If passed a Numpy array will return predictions.

Examples::
>>> preds = ensemble.predict(input_array)
>>>
>>> ensemble.predict(test_fy)
predict_array(arr, n_models=None, parent_bar=None, display=True, callbacks=None)[source]

Apply ensemble to Numpy array and get predictions. If an output pipe has been added to the ensemble, then the predictions will be deprocessed. Inputs are expected to be preprocessed; i.e. any input pipe added to the ensemble is not used.

Parameters
  • arr (ndarray) – input data

  • n_models (Optional[int]) – number of models to use in predictions as ranked by the metric which was used when constructing the Ensemble. By default, entire ensemble is used.

  • parent_bar (Optional[ConsoleMasterBar]) – not used when calling the method directly

  • display (bool) – whether to display a progress bar for model evaluations

  • callbacks (Optional[List[AbsCallback]]) – list of any callbacks to use during evaluation

Return type

ndarray

Returns

Numpy array of predictions

Examples::
>>> preds = ensemble.predict_array(inputs)
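
As a rough picture of how weighted model predictions and n_models interact, the sketch below averages per-model predictions using plain Python lists. It is an illustration under stated assumptions, not LUMIN's code: the helper weighted_prediction is hypothetical, models are assumed to be stored best-first, and the weights are renormalised over the models actually used.

```python
from typing import List, Optional


def weighted_prediction(model_preds: List[List[float]], weights: List[float],
                        n_models: Optional[int] = None) -> List[float]:
    """Hypothetical helper: weighted average of per-model predictions.

    model_preds[i][j] is the prediction of model i for datapoint j; models are
    assumed to be ordered best-first, so n_models keeps the top-ranked ones.
    """
    n = len(model_preds) if n_models is None else n_models
    preds, wgts = model_preds[:n], weights[:n]
    total = sum(wgts)  # renormalise over the models actually used
    return [sum(w * p[j] for w, p in zip(wgts, preds)) / total
            for j in range(len(preds[0]))]


preds = [[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
weights = [0.5, 0.25, 0.25]
print(weighted_prediction(preds, weights))              # uses all three models
print(weighted_prediction(preds, weights, n_models=2))  # uses only the top two
```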
predict_folds(fy, n_models=None, pred_name='pred', callbacks=None, verbose=True)[source]

Apply ensemble to data accessed by a FoldYielder and save predictions as a new group per fold in the foldfile. If an output pipe has been added to the ensemble, then the predictions will be deprocessed. Inputs are expected to be preprocessed; i.e. any input pipe added to the ensemble is not used. If the FoldYielder has test-time augmentation, then predictions will be averaged over all augmented forms of the data.

Parameters
  • fy (FoldYielder) – FoldYielder interfacing with the input data

  • n_models (Optional[int]) – number of models to use in predictions as ranked by the metric which was used when constructing the Ensemble. By default, entire ensemble is used.

  • pred_name (str) – name for new group of predictions

  • callbacks (Optional[List[AbsCallback]]) – list of any callbacks to use during evaluation

  • verbose (bool) – whether to print average prediction timings

Examples::
>>> ensemble.predict_folds(test_fy, pred_name='pred_tta')
Return type

None
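
The averaging over augmented forms of the data mentioned above amounts to taking, for each datapoint, the mean of the predictions obtained under each augmentation. A minimal sketch follows, assuming an unweighted mean. The helper average_over_augmentations is hypothetical and not part of LUMIN.

```python
from typing import List


def average_over_augmentations(aug_preds: List[List[float]]) -> List[float]:
    """Hypothetical helper: mean prediction per datapoint over augmented copies.

    aug_preds[a][j] is the prediction for datapoint j under augmentation a.
    """
    n_aug = len(aug_preds)
    return [sum(p[j] for p in aug_preds) / n_aug for j in range(len(aug_preds[0]))]


print(average_over_augmentations([[0.25, 1.0], [0.75, 0.0]]))  # [0.5, 0.5]
```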

save(name, feats=None, overwrite=False)[source]

Save ensemble and associated objects

Parameters
  • name (str) – base name for saved objects

  • feats (Optional[Any]) – optional list of input features

  • overwrite (bool) – if existing objects are found, whether to overwrite them

Examples::
>>> ensemble.save('weights/ensemble')
>>>
>>> ensemble.save('weights/ensemble', ['pt','eta','phi'])
Return type

None

Module contents
