lumin.nn.callbacks package
Submodules
lumin.nn.callbacks.adversarial_callbacks module

class lumin.nn.callbacks.adversarial_callbacks.PivotTraining(n_pretrain_main, n_pretrain_adv, adv_coef, adv_model_builder, adv_targets, adv_update_freq, adv_update_on, main_pretrain_cb_partials=None, adv_pretrain_cb_partials=None, adv_train_cb_partials=None)

Bases: lumin.nn.callbacks.callback.Callback

Callback implementation of “Learning to Pivot with Adversarial Networks” (Louppe, Kagan, & Cranmer, 2016) (https://papers.nips.cc/paper/2017/hash/48ab2f9b45957ab574cf005eb8a76760-Abstract.html). The default target data in the FoldYielder should be the target data for the main model, and it should contain additional columns of target data for the adversary (names should be passed via the adv_targets argument).

Once training begins, both the main model and the adversary are pretrained in isolation. Further training of the main model then starts, with the frozen adversary providing a bonus to the loss value if the adversary cannot predict its targets well from the predictions of the main model. At a set interval (multiples of batches/folds/epochs), the adversary is refined for one epoch with the main model frozen (if per batch, this can take a long time with no progress indicated to the user). States of the model and the adversary are saved to the savepath after both pretraining and further training.
Parameters:
n_pretrain_main (int) – number of epochs to pretrain the main model
n_pretrain_adv (int) – number of epochs to pretrain the adversary
adv_coef (float) – relative weighting for the adversarial bonus (lambda in the paper); the code assumes a positive value and subtracts the adversarial loss from the main loss
adv_model_builder (ModelBuilder) – ModelBuilder defining the adversary (note that this should not define main_model+adversary)
adv_targets (List[str]) – list of column names in the foldfile to use as targets for the adversary
adv_update_freq (int) – sets how often the adversary is refined (e.g. once every adv_update_freq ticks)
adv_update_on (str) – defines the tick for refining the adversary; can be ‘batch’, ‘fold’, or ‘epoch’. The paper refines once for every batch of training data.
main_pretrain_cb_partials (Optional[List[Callable[[], Callback]]]) – optional list of partial callbacks to use when pretraining the main model
adv_pretrain_cb_partials (Optional[List[Callable[[], Callback]]]) – optional list of partial callbacks to use when pretraining the adversary model
adv_train_cb_partials (Optional[List[Callable[[], Callback]]]) – optional list of partial callbacks to use when refining the adversary model
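
A minimal illustrative sketch (adv_builder is an assumed pre-existing ModelBuilder for the adversary, ‘pivot_var’ an assumed extra target column in the foldfile, and the values are arbitrary):
>>> pivot = PivotTraining(n_pretrain_main=5, n_pretrain_adv=5, adv_coef=10,
...                       adv_model_builder=adv_builder, adv_targets=['pivot_var'],
...                       adv_update_freq=1, adv_update_on='batch')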

on_batch_begin()

Slices off adversarial and main-model targets. Increments the tick if required.

Return type: None

on_train_begin()

Pretrains the main model and adversary, then prepares for further training. Prepends the training callbacks with a TargReplace instance to grab both the target and pivot data.

Return type: None
lumin.nn.callbacks.callback module

class lumin.nn.callbacks.callback.Callback

Bases: lumin.nn.callbacks.abs_callback.AbsCallback

Base callback class from which other callbacks should inherit.

set_model(model)

Sets the callback’s model in order to allow the callback to access and adjust model parameters.

Parameters: model (AbsModel) – model to refer to during training

Return type: None

set_plot_settings(plot_settings)

Sets the plot settings for any plots produced by the callback.

Parameters: plot_settings (PlotSettings) – PlotSettings class

Return type: None
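
A minimal illustrative sketch of a custom callback (the on_epoch_begin hook name is assumed from the AbsCallback interface; self.model becomes available once set_model has been called):
>>> class EpochAnnouncer(Callback):
...     def on_epoch_begin(self) -> None:
...         print('Starting a new epoch')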

lumin.nn.callbacks.cyclic_callbacks module

class lumin.nn.callbacks.cyclic_callbacks.AbsCyclicCallback(interp, param_range, cycle_mult=1, decrease_param=False, scale=1, cycle_save=False)

Bases: lumin.nn.callbacks.callback.Callback

Abstract class for callbacks affecting lr or mom.
Parameters:
interp (str) – string representation of the interpolation function; either ‘linear’ or ‘cosine’
param_range (Tuple[float, float]) – minimum and maximum values for the parameter
cycle_mult (int) – multiplicative factor for adjusting the cycle length after each cycle, e.g. cycle_mult=1 keeps the same cycle length, cycle_mult=2 doubles the cycle length after each cycle
decrease_param (bool) – whether to begin by decreasing the parameter, otherwise begin by increasing it
scale (int) – multiplicative factor for setting the initial number of epochs per cycle, e.g. scale=1 means 1 epoch per cycle, scale=5 means 5 epochs per cycle
cycle_save (bool) – if true, will save a copy of the model at the end of each cycle; used for building ensembles from single trainings (e.g. snapshot ensembles)
nb – number of minibatches (iterations) to expect per epoch

on_batch_begin()

Computes the new value for the optimiser parameter and passes it to the _set_param method.

Return type: None

class lumin.nn.callbacks.cyclic_callbacks.CycleLR(lr_range, interp='cosine', cycle_mult=1, decrease_param='auto', scale=1, cycle_save=False)

Bases: lumin.nn.callbacks.cyclic_callbacks.AbsCyclicCallback

Callback to cycle the learning rate during training according to either: cosine interpolation for SGDR (https://arxiv.org/abs/1608.03983) or linear interpolation for Smith cycling (https://arxiv.org/abs/1506.01186).
Parameters:
lr_range (Tuple[float, float]) – tuple of initial and final LRs
interp (str) – ‘cosine’ or ‘linear’ interpolation
cycle_mult (int) – multiplicative constant for altering the cycle length after each complete cycle
decrease_param (Union[str, bool]) – whether to increase or decrease the LR (effectively reverses lr_range order); ‘auto’ selects according to interp
scale (int) – multiplicative constant for altering the length of a cycle; 1 corresponds to one cycle = one epoch
cycle_save (bool) – if true, will save a copy of the model at the end of each cycle; used for building ensembles from single trainings (e.g. snapshot ensembles)
nb – number of batches in an epoch
Examples::
>>> cosine_lr = CycleLR(lr_range=(0, 2e-3), cycle_mult=2, scale=1,
...                     interp='cosine', nb=100)
>>>
>>> cyclical_lr = CycleLR(lr_range=(2e-4, 2e-3), cycle_mult=1, scale=5,
...                       interp='linear', nb=100)

class lumin.nn.callbacks.cyclic_callbacks.CycleMom(mom_range, interp='cosine', cycle_mult=1, decrease_param='auto', scale=1, cycle_save=False)

Bases: lumin.nn.callbacks.cyclic_callbacks.AbsCyclicCallback

Callback to cycle momentum (beta 1) during training according to either: cosine interpolation for SGDR (https://arxiv.org/abs/1608.03983) or linear interpolation for Smith cycling (https://arxiv.org/abs/1506.01186). By default it is set to evolve in the opposite direction to the learning rate, à la https://arxiv.org/abs/1803.09820.
Parameters:
mom_range (Tuple[float, float]) – tuple of initial and final momenta
interp (str) – ‘cosine’ or ‘linear’ interpolation
cycle_mult (int) – multiplicative constant for altering the cycle length after each complete cycle
decrease_param (Union[str, bool]) – whether to increase or decrease the momentum (effectively reverses mom_range order); ‘auto’ selects according to interp
scale (int) – multiplicative constant for altering the length of a cycle; 1 corresponds to one cycle = one epoch
cycle_save (bool) – if true, will save a copy of the model at the end of each cycle; used for building ensembles from single trainings (e.g. snapshot ensembles)
nb – number of batches in an epoch
Examples::
>>> cyclical_mom = CycleMom(mom_range=(0.85, 0.95), cycle_mult=1,
...                         scale=5, interp='linear', nb=100)

class lumin.nn.callbacks.cyclic_callbacks.OneCycle(lengths, lr_range, mom_range=(0.85, 0.95), interp='cosine', cycle_ends_training=True)

Bases: lumin.nn.callbacks.cyclic_callbacks.AbsCyclicCallback

Callback implementing Smith 1-cycle evolution for lr and momentum (beta_1) (https://arxiv.org/abs/1803.09820). Default interpolation uses a fastai-style cosine function. Automatically triggers early stopping on cycle completion.
Parameters:
lengths (Tuple[int, int]) – tuple of the number of epochs in the first and second stages of the cycle
lr_range (Union[Tuple[float, float], Tuple[float, float, float]]) – list of initial and max LRs, and optionally a final LR; if only two LRs are supplied, the final LR will be zero
mom_range (Tuple[float, float]) – tuple of initial and final momenta
interp (str) – ‘cosine’ or ‘linear’ interpolation
cycle_ends_training (bool) – whether to stop training once the cycle finishes, or continue running at the last LR and momentum
Examples::
>>> onecycle = OneCycle(lengths=(15, 30), lr_range=[1e-4, 1e-2],
...                     mom_range=(0.85, 0.95), interp='cosine', nb=100)

class lumin.nn.callbacks.cyclic_callbacks.CycleStep(frac_reduction, patience, lengths, lr_range, mom_range=(0.85, 0.95), interp='cosine', plot_params=False)

Bases: lumin.nn.callbacks.cyclic_callbacks.OneCycle

Combination of 1-cycle and step decay: the initial 1-cycle finishes, then step decay begins, starting from the best performing model and optimiser.
Parameters:
frac_reduction (float) – fractional reduction of the learning rate with each step
patience (int) – number of epochs to wait before stepping
lengths (Tuple[int, int]) – OneCycle lengths
lr_range (List[float]) – OneCycle learning rates; don’t have the final LR be too small
mom_range (Tuple[float, float]) – OneCycle momenta
interp (str) – interpolation mode for OneCycle
plot_params (bool) – if true, will plot the parameter history at the end of training
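
A minimal illustrative sketch (the hyper-parameter values are arbitrary choices, not recommendations):
>>> step_cycle = CycleStep(frac_reduction=0.5, patience=3, lengths=(15, 30),
...                        lr_range=[2e-3, 2e-2, 2e-4], interp='cosine')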
lumin.nn.callbacks.data_callbacks module

class lumin.nn.callbacks.data_callbacks.BinaryLabelSmooth(coefs=0)

Bases: lumin.nn.callbacks.callback.Callback

Callback for applying label smoothing to binary classes, based on https://arxiv.org/abs/1512.00567. Applies smoothing during both training and validation epochs.
Parameters:
coefs (Union[float, Tuple[float, float]]) – smoothing coefficients: 0 -> coef[0], 1 -> 1-coef[1]; if passed a float, coef[0]=coef[1]
Examples::
>>> lbl_smooth = BinaryLabelSmooth(0.1)
>>>
>>> lbl_smooth = BinaryLabelSmooth((0.1, 0.02))

class lumin.nn.callbacks.data_callbacks.BootstrapResample(n_folds, bag_each_time=False, reweight=True)

Bases: lumin.nn.callbacks.callback.Callback

Callback for bootstrap sampling new training datasets from the original training data during (ensemble) training.
Parameters:
n_folds (int) – the number of folds present in the training FoldYielder
bag_each_time (bool) – whether to sample a new set for each sub-epoch or to use the same sample each time
reweight (bool) – whether to reweight the sampled data to match the weight sum (per class) of the original data
Examples::
>>> bs_resample = BootstrapResample(n_folds=len(train_fy))

class lumin.nn.callbacks.data_callbacks.ParametrisedPrediction(feats, param_feat, param_val)

Bases: lumin.nn.callbacks.callback.Callback

Callback for running predictions for a parametrised network (https://arxiv.org/abs/1601.07913): one which has been trained using one or more inputs which represent e.g. different hypotheses for the classes, such as an unknown mass of some new particle. In such a scenario, multiple signal datasets could be used for training, with the background receiving a random mass. During prediction one then needs to set these parametrisation features all to the same values to evaluate the model’s response for that hypothesis. This callback can be passed to the predict method of the model/ensemble to adjust the parametrisation features to the desired values.
Parameters:
feats (List[str]) – list of feature names used during training (in the same order)
param_feat (Union[List[str], str]) – the feature name which is to be adjusted, or a list of features to adjust
param_val (Union[List[float], float]) – the value to which to set the parametrisation feature, or the list of values to set the parametrisation features to
Examples::
>>> mass_param = ParametrisedPrediction(train_feats, 'res_mass', 300)
>>> model.predict(fold_yielder, pred_name='pred_mass_300', callbacks=[mass_param])
>>>
>>> mass_param = ParametrisedPrediction(train_feats, 'res_mass', 300)
>>> spin_param = ParametrisedPrediction(train_feats, 'spin', 1)
>>> model.predict(fold_yielder, pred_name='pred_mass_300', callbacks=[mass_param, spin_param])

class lumin.nn.callbacks.data_callbacks.TargReplace(targ_feats)

Bases: lumin.nn.callbacks.callback.Callback

Callback to replace target data with requested data from the foldfile, allowing one to e.g. train two models simultaneously with the same inputs but different targets, for e.g. adversarial training. At the end of validation epochs, the target data is swapped back to the original target data, to allow for the correct computation of any metrics.
Parameters:
targ_feats (List[str]) – list of column names in the foldfile to get and horizontally stack to replace the target data in the current BatchYielder
Examples::
>>> targ_replace = TargReplace(['is_fake'])
>>> targ_replace = TargReplace(['class', 'is_fake'])

on_fold_begin()

Stacks the new target datasets and replaces the target data in the current BatchYielder.

Return type: None
lumin.nn.callbacks.loss_callbacks module

class lumin.nn.callbacks.loss_callbacks.GradClip(clip, clip_norm=True)

Bases: lumin.nn.callbacks.callback.Callback

Callback for clipping gradients by norm or value.
Parameters:
clip (float) – value to clip at
clip_norm (bool) – whether to clip according to norm (torch.nn.utils.clip_grad_norm_) or value (torch.nn.utils.clip_grad_value_)
Examples::
>>> grad_clip = GradClip(1e-5)
lumin.nn.callbacks.lsuv_init module
This file contains code modified from https://github.com/ducha-aiki/LSUV-pytorch which is made available under the following BSD 2-Clause “Simplified” Licence: Copyright (C) 2017, Dmytro Mishkin. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The Apache Licence 2.0 under which the majority of the rest of LUMIN is distributed does not apply to the code within this file.

class lumin.nn.callbacks.lsuv_init.LsuvInit(needed_std=1.0, std_tol=0.1, max_attempts=10, do_orthonorm=True, verbose=False)

Bases: lumin.nn.callbacks.callback.Callback

Applies Layer-Sequential Unit-Variance (LSUV) initialisation to the model, as per Mishkin & Matas 2016 (https://arxiv.org/abs/1511.06422). When training begins for the first time, Conv1D, Conv2D, Conv3D, and Linear modules in the model will be LSUV initialised using the BatchYielder inputs. This involves initialising the weights with orthonormal matrices and then iteratively scaling them such that the standard deviation of the layer outputs is equal to a desired value, within some tolerance.
Parameters:
needed_std (float) – desired standard deviation of layer outputs
std_tol (float) – tolerance for matching the standard deviation with the target
max_attempts (int) – number of times to attempt weight scaling per layer
do_orthonorm (bool) – whether to apply orthonormal initialisation first, or rescale the existing values
verbose (bool) – whether to print out details of the rescaling
Example::
>>> lsuv = LsuvInit()
>>>
>>> lsuv = LsuvInit(verbose=True)
>>>
>>> lsuv = LsuvInit(needed_std=0.5, std_tol=0.01, max_attempts=100, do_orthonorm=True)
lumin.nn.callbacks.model_callbacks module

class lumin.nn.callbacks.model_callbacks.SWA(start_epoch, renewal_period=None, update_on_cycle_end=None, verbose=False)

Bases: lumin.nn.callbacks.callback.Callback

Callback providing Stochastic Weight Averaging, based on https://arxiv.org/abs/1803.05407. This adapted version allows the tracking of a pair of average models in order to avoid having to hardcode a specific start point for averaging:
Model average #0 will begin to be tracked start_epoch epochs/cycles after training begins, and cycle_since_replacement is set to 1.
renewal_period epochs/cycles later, a second average #1 will be tracked.
At the next renewal period, the performance of #0 and #1 will be compared on data contained in val_fold.
If #0 is better than #1:
#1 is replaced by a copy of the current model
cycle_since_replacement is increased by 1
renewal_period is multiplied by cycle_since_replacement
Else:
#0 is replaced by #1
#1 is replaced by a copy of the current model
cycle_since_replacement is set to 1
renewal_period is set back to its original value
Additionally, it will optionally (default True) lock in to any cyclical callbacks to only update at the end of a cycle.
Parameters:
start_epoch (int) – epoch/cycle to begin averaging
renewal_period (Optional[int]) – how often to check the performance of the averages, and renew tracking of the least performant; if None, will not track a second average
update_on_cycle_end (Optional[bool]) – whether to lock in to the cyclic callback and only update at the end of a cycle; default yes, if a cyclic callback is present
verbose (bool) – whether to print out update information for testing and operation confirmation
Examples::
>>> swa = SWA(start_epoch=5, renewal_period=5)
lumin.nn.callbacks.monitors module

class lumin.nn.callbacks.monitors.EarlyStopping(patience, loss_is_meaned=True)

Bases: lumin.nn.callbacks.callback.Callback

Tracks validation loss during training and terminates training if the loss doesn’t decrease after patience number of epochs. Losses are assumed to be averaged and will be re-averaged over the epoch unless loss_is_meaned is false.
Parameters:
patience (int) – number of epochs to wait without improvement before stopping training
loss_is_meaned (bool) – if the batch loss value has been averaged over the number of elements in the batch, this should be true; the average loss will be computed over all elements in the batch. If the batch loss is not an average value, then the average will be computed over the number of batches.
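
A minimal illustrative sketch (the patience value is an arbitrary choice, not a recommended default):
>>> early_stop = EarlyStopping(patience=10)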

class lumin.nn.callbacks.monitors.SaveBest(auto_reload=True, loss_is_meaned=True)

Bases: lumin.nn.callbacks.callback.Callback

Tracks validation loss during training and automatically saves a copy of the weights to the indicated file whenever validation loss decreases. Losses are assumed to be averaged and will be re-averaged over the epoch unless loss_is_meaned is false.
Parameters:
auto_reload (bool) – if true, will automatically reload the best model at the end of training
loss_is_meaned (bool) – if the batch loss value has been averaged over the number of elements in the batch, this should be true; the average loss will be computed over all elements in the batch. If the batch loss is not an average value, then the average will be computed over the number of batches.
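
A minimal illustrative sketch:
>>> save_best = SaveBest(auto_reload=True)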

class lumin.nn.callbacks.monitors.MetricLogger(show_plots=False, extra_detail=True, loss_is_meaned=True)

Bases: lumin.nn.callbacks.callback.Callback

Provides live feedback during training, showing a variety of metrics to help highlight problems or test hyper-parameters without completing a full training. If show_plots is false, will instead print the training and validation losses at the end of each epoch. The full history is available as a dictionary by calling get_loss_history().

Parameters:
loss_names – list of names of losses which will be passed to the logger in the order in which they will be passed; by convention the first name will be used as the training loss when computing the ratio of training to validation losses
n_folds – number of folds present in the training data; the logger assumes that one of these folds is for validation, and so 1 training epoch = (n_fold-1) folds
extra_detail (bool) – whether to include the extra-detail plots (loss velocity and training-validation ratio); slightly slower but potentially useful
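
A minimal illustrative sketch:
>>> metric_log = MetricLogger(show_plots=True)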

get_loss_history()

Get the current history of losses and metrics.

Returns: tuple of ordered dictionaries: first with losses, second with validation metrics

Return type: history

get_results(save_best)

Returns the losses and metrics of the (loaded) model.

#TODO: extend this to load at specified index

Parameters: save_best (bool) – if the training used SaveBest, return the results at the best point, else return the latest values

Return type: Dict[str, float]

Returns: dictionary of validation loss and metrics

class lumin.nn.callbacks.monitors.EpochSaver

Bases: lumin.nn.callbacks.callback.Callback

Callback to save the model at the end of every epoch, regardless of improvement.
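
A minimal illustrative sketch:
>>> epoch_save = EpochSaver()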
lumin.nn.callbacks.opt_callbacks module

class lumin.nn.callbacks.opt_callbacks.LRFinder(lr_bounds=[1e-07, 10], nb=None)

Bases: lumin.nn.callbacks.callback.Callback

Callback class for the Smith learning-rate range test (https://arxiv.org/abs/1803.09820).
Parameters:
nb (Optional[int]) – number of batches in an epoch
lr_bounds (Tuple[float, float]) – tuple of initial and final LRs
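
A minimal illustrative sketch (the bounds shown are arbitrary):
>>> lr_finder = LRFinder(lr_bounds=[1e-5, 10], nb=100)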
lumin.nn.callbacks.pred_handlers module

class lumin.nn.callbacks.pred_handlers.PredHandler

Bases: lumin.nn.callbacks.callback.Callback

Default callback for predictions. Collects predictions over batches and returns them as a stacked array.
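
A minimal illustrative sketch, assuming the model’s predict method accepts a prediction-handler callback (the argument name pred_cb is an assumption for illustration):
>>> preds = model.predict(fold_yielder, pred_cb=PredHandler())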