lumin.nn.ensemble package
Submodules
lumin.nn.ensemble.ensemble module
-
class lumin.nn.ensemble.ensemble.Ensemble(input_pipe=None, output_pipe=None, model_builder=None)
Bases: lumin.nn.ensemble.abs_ensemble.AbsEnsemble
Standard class for building an ensemble from a collection of trained networks produced by fold_train_ensemble().
Input and output pipelines can be added to make saving and loading of exported ensembles easier. Currently, the input pipeline is not used, so input data is expected to be preprocessed already. The output pipeline, however, is used to deprocess model predictions.
Once instantiated, either lumin.nn.ensemble.ensemble.Ensemble.build_ensemble() or lumin.nn.ensemble.ensemble.Ensemble.load() should be called. Alternatively, the class methods lumin.nn.ensemble.ensemble.Ensemble.from_save() or lumin.nn.ensemble.ensemble.Ensemble.from_results() may be used.
- Parameters
input_pipe (Optional[Pipeline]) – optional input pipeline; alternatively call lumin.nn.ensemble.ensemble.Ensemble.add_input_pipe()
output_pipe (Optional[Pipeline]) – optional output pipeline; alternatively call lumin.nn.ensemble.ensemble.Ensemble.add_output_pipe()
model_builder (Optional[ModelBuilder]) – optional ModelBuilder for constructing models from saved weights
- Examples::
>>> ensemble = Ensemble()
>>>
>>> ensemble = Ensemble(input_pipe, output_pipe, model_builder)
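As a rough illustration of what "deprocessing" predictions with an output pipeline means, the sketch below uses a hypothetical stand-in class, `ToyScaler`, rather than a real pipeline. It assumes (this is not stated in the text above) that the pipeline exposes `transform` and `inverse_transform` methods, as scikit-learn pipelines do, and that deprocessing amounts to applying `inverse_transform` to the raw model outputs.

```python
from typing import List


class ToyScaler:
    """Hypothetical stand-in for an output pipeline (not part of LUMIN)."""

    def __init__(self, mean: float, std: float) -> None:
        self.mean, self.std = mean, std

    def transform(self, values: List[float]) -> List[float]:
        # Preprocess targets: shift by the mean and scale by the width
        return [(v - self.mean) / self.std for v in values]

    def inverse_transform(self, values: List[float]) -> List[float]:
        # Deprocess predictions: undo the shift and scaling
        return [v * self.std + self.mean for v in values]


output_pipe = ToyScaler(mean=100.0, std=20.0)
raw_preds = [-1.0, 0.0, 0.5]  # model outputs in the preprocessed target space
deprocessed = output_pipe.inverse_transform(raw_preds)
print(deprocessed)
```

The point of the sketch is only that predictions leave the model in the same space as the preprocessed targets, and the output pipeline maps them back to physical units.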
-
add_input_pipe(pipe)
Add input pipeline for saving
- Parameters
pipe (Pipeline) – pipeline used for preprocessing input data
- Return type
None
-
add_output_pipe(pipe)
Add output pipeline for saving
- Parameters
pipe (Pipeline) – pipeline used for preprocessing target data
- Return type
None
-
build_ensemble(results, size, model_builder, metric='loss', weighting='reciprocal', higher_metric_better=False, snapshot_args=None, location=PosixPath('train_weights'), verbose=True)
Load an instantiated Ensemble with the outputs of fold_train_ensemble()
- Parameters
results (List[Dict[str, float]]) – results saved/returned by fold_train_ensemble()
size (int) – number of models to load, as ranked by metric
model_builder (ModelBuilder) – ModelBuilder used for building Model from saved models
metric (str) – metric name listed in results to use for ranking and weighting trained models
weighting (str) – 'reciprocal' or 'uniform'; how to weight model predictions during prediction. 'reciprocal' = models weighted by 1/metric; 'uniform' = models treated with equal weighting
higher_metric_better (bool) – whether metric should be maximised or minimised
snapshot_args (Optional[Dict[str, Any]]) – dictionary potentially containing:
'cycle_losses': returned/saved by fold_train_ensemble() when using an AbsCyclicCallback
'patience': patience value that was passed to fold_train_ensemble()
'n_cycles': number of cycles to load per model
'load_cycles_only': whether to only load cycles, or also the best performing model
'weighting_pwr': weight cycles according to (n+1)**weighting_pwr, where n is the number of cycles loaded so far. Models are loaded youngest to oldest
location (Path) – path to save location passed to fold_train_ensemble()
verbose (bool) – whether to print out information about the models loaded
- Examples::
>>> ensemble.build_ensemble(results, 10, model_builder,
...                         location=Path('train_weights'))
>>>
>>> ensemble.build_ensemble(
...     results, 1, model_builder,
...     location=Path('train_weights'),
...     snapshot_args={'cycle_losses':cycle_losses,
...                    'patience':patience,
...                    'n_cycles':8,
...                    'load_cycles_only':True,
...                    'weighting_pwr':0})
- Return type
None
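The ranking and weighting options described for build_ensemble can be sketched in plain Python. This is an illustration only, not LUMIN's implementation: the helper `rank_and_weight` is hypothetical, and normalising the weights so that they sum to one is an assumption.

```python
from typing import Dict, List, Tuple


def rank_and_weight(results: List[Dict[str, float]], size: int, metric: str = 'loss',
                    weighting: str = 'reciprocal',
                    higher_metric_better: bool = False) -> List[Tuple[int, float]]:
    """Return (model index, weight) pairs for the `size` best models."""
    # Rank models by the chosen metric, best first
    ranked = sorted(enumerate(results), key=lambda pair: pair[1][metric],
                    reverse=higher_metric_better)
    chosen = ranked[:size]
    if weighting == 'reciprocal':
        raw = [1.0 / result[metric] for _, result in chosen]  # weight = 1/metric
    elif weighting == 'uniform':
        raw = [1.0] * len(chosen)                             # equal weighting
    else:
        raise ValueError(f'Unsupported weighting: {weighting}')
    total = sum(raw)  # assumption: weights are normalised to sum to one
    return [(idx, w / total) for (idx, _), w in zip(chosen, raw)]


results = [{'loss': 0.5}, {'loss': 0.25}, {'loss': 1.0}]
reciprocal = rank_and_weight(results, size=2)
uniform = rank_and_weight(results, size=2, weighting='uniform')
print(reciprocal, uniform)
```

Here the model at index 1 has the lowest loss, so it is ranked first and, under reciprocal weighting, receives twice the weight of the model at index 0.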
-
export2onnx(base_name, bs=1)
Export every Model contained in the Ensemble to ONNX format. Note that ONNX expects a fixed batch size (bs), which is the number of datapoints you wish to pass through the model concurrently.
- Parameters
base_name (str) – exported models will be called {base_name}_{model_num}.onnx
bs (int) – batch size for exported models
- Return type
None
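Because the exported models expect a fixed batch size, data passed to them later has to arrive in batches of exactly bs datapoints. The sketch below shows one possible way to handle this in plain Python; the helper `fixed_size_batches` and the choice to pad the final batch with a dummy value are assumptions for illustration, not something LUMIN or ONNX prescribes.

```python
from typing import List, Sequence, Tuple


def fixed_size_batches(data: Sequence[float], bs: int,
                       pad_value: float = 0.0) -> Tuple[List[List[float]], int]:
    """Split data into batches of exactly `bs` entries, padding the last one.

    Returns the batches and the number of real (unpadded) datapoints, so that
    outputs computed for padding entries can be discarded afterwards.
    """
    if bs < 1:
        raise ValueError('bs must be a positive integer')
    batches = []
    for start in range(0, len(data), bs):
        batch = list(data[start:start + bs])
        batch += [pad_value] * (bs - len(batch))  # pad a short final batch
        batches.append(batch)
    return batches, len(data)


batches, n_real = fixed_size_batches([1.0, 2.0, 3.0, 4.0, 5.0], bs=2)
print(batches, n_real)
```

With the default bs=1 no padding is ever needed, at the cost of evaluating one datapoint at a time.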
-
export2tfpb(base_name, bs=1)
Export every Model contained in the Ensemble to TensorFlow ProtocolBuffer format, via ONNX. Note that ONNX expects a fixed batch size (bs), which is the number of datapoints you wish to pass through the model concurrently.
- Parameters
base_name (str) – exported models will be called {base_name}_{model_num}.pb
bs (int) – batch size for exported models
- Return type
None
-
classmethod from_results(results, size, model_builder, metric='loss', weighting='reciprocal', higher_metric_better=False, snapshot_args=None, location=PosixPath('train_weights'), verbose=True)
Instantiate an Ensemble from the outputs of fold_train_ensemble(). If cycle models are loaded, then only uniform weighting between models is supported.
- Parameters
results (List[Dict[str, float]]) – results saved/returned by fold_train_ensemble()
size (int) – number of models to load, as ranked by metric
model_builder (ModelBuilder) – ModelBuilder used for building Model from saved models
metric (str) – metric name listed in results to use for ranking and weighting trained models
weighting (str) – 'reciprocal' or 'uniform'; how to weight model predictions during prediction. 'reciprocal' = models weighted by 1/metric; 'uniform' = models treated with equal weighting
higher_metric_better (bool) – whether metric should be maximised or minimised
snapshot_args (Optional[Dict[str, Any]]) – dictionary potentially containing:
'cycle_losses': returned/saved by fold_train_ensemble() when using an AbsCyclicCallback
'patience': patience value that was passed to fold_train_ensemble()
'n_cycles': number of cycles to load per model
'load_cycles_only': whether to only load cycles, or also the best performing model
'weighting_pwr': weight cycles according to (n+1)**weighting_pwr, where n is the number of cycles loaded so far. Models are loaded youngest to oldest
location (Path) – path to save location passed to fold_train_ensemble()
verbose (bool) – whether to print out information about the models loaded
- Return type
AbsEnsemble
- Returns
Built Ensemble
- Examples::
>>> ensemble = Ensemble.from_results(results, 10, model_builder,
...                                  location=Path('train_weights'))
>>>
>>> ensemble = Ensemble.from_results(
...     results, 1, model_builder,
...     location=Path('train_weights'),
...     snapshot_args={'cycle_losses':cycle_losses,
...                    'patience':patience,
...                    'n_cycles':8,
...                    'load_cycles_only':True,
...                    'weighting_pwr':0})
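The 'weighting_pwr' entry of snapshot_args weights cycles by (n+1)**weighting_pwr. A minimal sketch of that formula follows. It assumes that n starts at zero for the first (youngest) cycle loaded and that the weights are then normalised to sum to one. Both points, and the helper name `cycle_weights`, are assumptions for illustration.

```python
from typing import List


def cycle_weights(n_cycles: int, weighting_pwr: float) -> List[float]:
    """Relative weights for snapshot cycles, ordered youngest to oldest."""
    # n counts the cycles loaded so far, so the youngest cycle has n = 0
    raw = [(n + 1) ** weighting_pwr for n in range(n_cycles)]
    total = sum(raw)  # assumption: normalise so the weights sum to one
    return [w / total for w in raw]


equal = cycle_weights(4, 0)     # weighting_pwr=0: every cycle weighted equally
decayed = cycle_weights(2, -1)  # negative power: older cycles count for less
print(equal, decayed)
```

Under these assumptions, `weighting_pwr=0` as used in the example above gives every loaded cycle the same weight, while a negative value favours the youngest cycles.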
-
classmethod from_save(name)
Instantiate an Ensemble from a saved Ensemble
- Parameters
name (str) – base filename of ensemble
- Return type
AbsEnsemble
- Returns
Loaded Ensemble
- Examples::
>>> ensemble = Ensemble.from_save('weights/ensemble')
-
get_feat_importance(fy, eval_metric=None)
Call get_ensemble_feat_importance(), passing this Ensemble and the provided arguments
- Parameters
fy (FoldYielder) – FoldYielder interfacing to the data on which to evaluate importance
eval_metric (Optional[EvalMetric]) – optional EvalMetric to use for quantifying performance
- Return type
DataFrame
-
load(name)
Load an instantiated Ensemble with weights and Model from save.
- Arguments
name: base name for saved objects
- Examples::
>>> ensemble.load('weights/ensemble')
- Return type
None
-
static load_trained_model(model_idx, model_builder, name='train_weights/train_')
Load a trained model from a save file of the form {name}{model_idx}.h5
- Arguments
model_idx: index of model to load
model_builder: ModelBuilder used to build the model
name: base name of file from which to load model
- Return type
- Returns
Model loaded from save
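To make the {name}{model_idx}.h5 pattern concrete, the following sketch builds the file path that would be looked up for a given index. The helper `model_save_path` is hypothetical and only performs string formatting; it does not load anything.

```python
from pathlib import Path


def model_save_path(model_idx: int, name: str = 'train_weights/train_') -> Path:
    """Build the save-file path following the {name}{model_idx}.h5 pattern."""
    # `name` is a prefix, not a directory: the index is appended directly to it
    return Path(f'{name}{model_idx}.h5')


path = model_save_path(3)
print(path.as_posix())
```

Note that with the default value of name, the trailing `train_` becomes part of the file name, so the models sit inside the `train_weights` directory.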
-
predict(inputs, n_models=None, pred_name='pred', callbacks=None, verbose=True)
Compatibility method for predicting data contained in either a NumPy array or a FoldYielder. Inputs are passed to either lumin.nn.ensemble.ensemble.Ensemble.predict_array() or lumin.nn.ensemble.ensemble.Ensemble.predict_folds().
- Parameters
inputs (Union[ndarray, FoldYielder, List[ndarray]]) – either a FoldYielder interfacing with the input data, or the input data as an array
n_models (Optional[int]) – number of models to use in predictions, as ranked by the metric which was used when constructing the Ensemble. By default, the entire ensemble is used.
pred_name (str) – name for the new group of predictions if passed a FoldYielder
callbacks (Optional[List[AbsCallback]]) – list of any callbacks to use during evaluation
verbose (bool) – whether to print average prediction timings
- Return type
Union[None, ndarray]
- Returns
If passed a NumPy array, returns the predictions.
- Examples::
>>> preds = ensemble.predict(input_array)
>>>
>>> ensemble.predict(test_fy)
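The dispatch behaviour of predict can be sketched with stand-ins. Everything in the block below is hypothetical: `ToyFoldYielder` replaces a real FoldYielder, plain lists replace NumPy arrays, and the "model" simply doubles its inputs. Only the control flow (arrays return predictions, fold yielders store them and return None) mirrors the description above.

```python
from typing import Dict, List, Optional, Union


class ToyFoldYielder:
    """Hypothetical stand-in for a FoldYielder holding folds of data."""

    def __init__(self, folds: List[List[float]]) -> None:
        self.folds = folds
        self.saved: Dict[str, List[List[float]]] = {}


def predict_array(arr: List[float]) -> List[float]:
    # Placeholder model: doubles every input
    return [2 * x for x in arr]


def predict_folds(fy: ToyFoldYielder, pred_name: str = 'pred') -> None:
    # Predictions are stored per fold under pred_name; nothing is returned
    fy.saved[pred_name] = [predict_array(fold) for fold in fy.folds]


def predict(inputs: Union[List[float], ToyFoldYielder],
            pred_name: str = 'pred') -> Optional[List[float]]:
    if isinstance(inputs, ToyFoldYielder):
        predict_folds(inputs, pred_name)
        return None
    return predict_array(inputs)


array_preds = predict([1.0, 2.0])
test_fy = ToyFoldYielder([[1.0], [2.0, 3.0]])
fold_result = predict(test_fy)
print(array_preds, fold_result, test_fy.saved)
```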
-
predict_array(arr, n_models=None, parent_bar=None, display=True, callbacks=None)
Apply ensemble to a NumPy array and get predictions. If an output pipe has been added to the ensemble, then the predictions will be deprocessed. Inputs are expected to be preprocessed; i.e. any input pipe added to the ensemble is not used.
- Parameters
arr (ndarray) – input data
n_models (Optional[int]) – number of models to use in predictions, as ranked by the metric which was used when constructing the Ensemble. By default, the entire ensemble is used.
parent_bar (Optional[ConsoleMasterBar]) – not used when calling the method directly
display (bool) – whether to display a progress bar for model evaluations
callbacks (Optional[List[AbsCallback]]) – list of any callbacks to use during evaluation
- Return type
ndarray
- Returns
NumPy array of predictions
- Examples::
>>> preds = ensemble.predict_array(inputs)
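One way to picture how an ensemble prediction is formed from the first n_models models is a weighted average, sketched below in plain Python. The function `ensemble_predict` is hypothetical, models are represented by simple callables, and renormalising the weights over the models actually used is an assumption rather than documented behaviour.

```python
from typing import Callable, List, Optional, Sequence

ToyModel = Callable[[Sequence[float]], List[float]]


def ensemble_predict(models: Sequence[ToyModel], weights: Sequence[float],
                     arr: Sequence[float], n_models: Optional[int] = None) -> List[float]:
    """Weighted average of the predictions of the first `n_models` models."""
    # Models are assumed to be stored best-first, so slicing keeps the top ones
    n = len(models) if n_models is None else min(n_models, len(models))
    used_weights = list(weights[:n])
    norm = sum(used_weights)  # assumption: renormalise over the models used
    preds = [0.0] * len(arr)
    for model, weight in zip(models[:n], used_weights):
        for i, p in enumerate(model(arr)):
            preds[i] += (weight / norm) * p
    return preds


def model_a(xs: Sequence[float]) -> List[float]:
    return [x + 1.0 for x in xs]


def model_b(xs: Sequence[float]) -> List[float]:
    return [x + 4.0 for x in xs]


all_preds = ensemble_predict([model_a, model_b], [0.75, 0.25], [0.0, 2.0])
best_only = ensemble_predict([model_a, model_b], [0.75, 0.25], [0.0, 2.0], n_models=1)
print(all_preds, best_only)
```

Passing `n_models=1` reduces the ensemble to its single best-ranked model, which is a quick way to compare ensemble and single-model performance.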
-
predict_folds(fy, n_models=None, pred_name='pred', callbacks=None, verbose=True)
Apply ensemble to data accessed by a FoldYielder and save predictions as a new group per fold in the foldfile. If an output pipe has been added to the ensemble, then the predictions will be deprocessed. Inputs are expected to be preprocessed; i.e. any input pipe added to the ensemble is not used. If the FoldYielder has test-time augmentation, then predictions will be averaged over all augmented forms of the data.
- Parameters
fy (FoldYielder) – FoldYielder interfacing with the input data
n_models (Optional[int]) – number of models to use in predictions, as ranked by the metric which was used when constructing the Ensemble. By default, the entire ensemble is used.
pred_name (str) – name for the new group of predictions
callbacks (Optional[List[AbsCallback]]) – list of any callbacks to use during evaluation
verbose (bool) – whether to print average prediction timings
- Examples::
>>> ensemble.predict_folds(test_fy, pred_name='pred_tta')
- Return type
None
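Averaging over test-time augmentations, as mentioned in the description of predict_folds, can be illustrated as follows. The sketch is a simplification under stated assumptions: the helper `tta_average` is hypothetical, each augmented form is a plain list aligned datapoint by datapoint with the others, and the toy model's output needs no correction for the augmentation applied.

```python
from typing import Callable, List, Sequence


def tta_average(model: Callable[[Sequence[float]], List[float]],
                augmented_inputs: Sequence[Sequence[float]]) -> List[float]:
    """Average a model's predictions over several augmented copies of the data."""
    n_aug = len(augmented_inputs)
    summed = [0.0] * len(augmented_inputs[0])
    for aug in augmented_inputs:
        # Each augmented copy holds the same datapoints in the same order
        for i, p in enumerate(model(aug)):
            summed[i] += p
    return [s / n_aug for s in summed]


def toy_model(xs: Sequence[float]) -> List[float]:
    return [3.0 * x for x in xs]


# Original data plus one hypothetical augmented copy (first feature reflected)
augmented = [[1.0, 2.0], [-1.0, 2.0]]
preds = tta_average(toy_model, augmented)
print(preds)
```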
-
save(name, feats=None, overwrite=False)
Save ensemble and associated objects
- Parameters
name (str) – base name for saved objects
feats (Optional[Any]) – optional list of input features
overwrite (bool) – if existing objects are found, whether to overwrite them
- Examples::
>>> ensemble.save('weights/ensemble')
>>>
>>> ensemble.save('weights/ensemble', ['pt','eta','phi'])
- Return type
None