lumin.nn.metrics package¶

Submodules¶

lumin.nn.metrics.class_eval module¶

class lumin.nn.metrics.class_eval.AMS(n_total, wgt_name, br=0, syst_unc_b=0, use_quick_scan=True, name='AMS', main_metric=True)[source]¶

Bases: EvalMetric

Class to compute the maximum Approximate Median Significance (https://arxiv.org/abs/1007.1727) using a classifier which directly predicts the class of data in a binary classification problem. AMS is computed on a single fold of data provided by a FoldYielder, and the data are automatically reweighted by event multiplicity to account for missing weights.

Parameters:

n_total (int) – total number of events in entire data set
wgt_name (str) – name of weight group in fold file to use. N.B. if you have reweighted to balance classes, be sure to use the un-reweighted weights.
br (float) – constant bias offset for background yield
syst_unc_b (float) – fractional systematic uncertainty on background yield
use_quick_scan (bool) – whether to optimise AMS using the ams_scan_quick() method (fast, but suffers from limited floating-point precision); if False, ams_scan_slow() is used instead (slower but more accurate)
name (Optional[str]) – optional name for metric, otherwise will be ‘AMS’
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. The first EvalMetric will automatically be set as main if multiple primary metrics are submitted

Examples::

>>> ams_metric = AMS(n_total=250000, br=10, wgt_name='gen_orig_weight')
>>>
>>> ams_metric = AMS(n_total=250000, syst_unc_b=0.1,
...                  wgt_name='gen_orig_weight', use_quick_scan=False)

evaluate()[source]¶

Compute maximum AMS on fold using provided predictions.

Return type:: float
Returns:: Maximum AMS computed on reweighted data from fold
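
The quantity being maximised can be illustrated with a small, library-independent sketch. It assumes the standard AMS form without the systematic-uncertainty term, assumes that the reweighting is a plain scaling of fold weights by n_total / fold size, and scans cuts event by event (so it expects distinct prediction values). The names ams and max_ams are hypothetical helpers written for this illustration; they are not the library's ams_scan_quick() or ams_scan_slow().

```python
import math


def ams(s, b, br=0.0):
    """AMS for weighted signal yield s and background yield b (no systematic term)."""
    if b + br <= 0:
        return -1.0  # sentinel chosen for this sketch: significance is undefined
    radicand = 2 * ((s + b + br) * math.log(1 + s / (b + br)) - s)
    return math.sqrt(radicand) if radicand > 0 else -1.0


def max_ams(preds, targets, weights, n_total, br=0.0):
    """Scan cuts on the prediction and return (best AMS, cut value)."""
    # Assumed reweighting: scale fold weights up to the full data set.
    scale = n_total / len(preds)
    # Walk from the highest prediction downwards, accumulating the yields
    # of the events that would pass a cut placed at the current prediction.
    events = sorted(zip(preds, targets, weights), reverse=True)
    s = b = 0.0
    best, best_cut = -1.0, None
    for pred, targ, wgt in events:
        if targ == 1:
            s += wgt * scale
        else:
            b += wgt * scale
        score = ams(s, b, br)
        if score > best:
            best, best_cut = score, pred
    return best, best_cut


best, cut = max_ams(preds=[0.9, 0.8, 0.3, 0.1], targets=[1, 1, 0, 0],
                    weights=[1.0, 1.0, 1.0, 1.0], n_total=4, br=10)
```

In this toy fold the best cut sits at the lowest-scoring signal event, since admitting any background only lowers the significance.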

class lumin.nn.metrics.class_eval.BinaryAccuracy(threshold=0.5, name='Acc', main_metric=True)[source]¶

Bases: EvalMetric

Computes and returns the accuracy of a single-output model for binary classification tasks.

Parameters:

threshold (float) – minimum value of model prediction that will be considered a prediction of class 1. Values below this threshold will be considered predictions of class 0. Default = 0.5.
name (Optional[str]) – optional name for metric, otherwise will be ‘Acc’
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. The first EvalMetric will automatically be set as main if multiple primary metrics are submitted

Examples::

>>> acc_metric = BinaryAccuracy()
>>>
>>> acc_metric = BinaryAccuracy(threshold=0.8)

evaluate()[source]¶

Computes the (weighted) accuracy for a set of targets and predictions for a given threshold.

Return type:: float
Returns:: The (weighted) accuracy for the specified threshold
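
As a rough illustration of what a weighted, thresholded accuracy means, the following sketch uses plain Python lists. The function name weighted_binary_accuracy is hypothetical, and the sketch assumes that predictions equal to the threshold count as class 1, in line with the "minimum value" wording of the threshold parameter.

```python
def weighted_binary_accuracy(preds, targets, weights=None, threshold=0.5):
    """Fraction of total weight carried by correctly classified events."""
    if weights is None:
        weights = [1.0] * len(preds)  # unweighted case: every event counts equally
    correct = sum(w for p, t, w in zip(preds, targets, weights)
                  if int(p >= threshold) == t)
    return correct / sum(weights)


preds, targets = [0.9, 0.6, 0.4, 0.1], [1, 0, 0, 1]
acc_default = weighted_binary_accuracy(preds, targets)
acc_strict = weighted_binary_accuracy(preds, targets, threshold=0.8)
acc_weighted = weighted_binary_accuracy(preds, targets, weights=[2.0, 1.0, 1.0, 0.0])
```

Raising the threshold reclassifies the 0.6 prediction as class 0, and giving zero weight to an event removes it from both the numerator and the denominator.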

class lumin.nn.metrics.class_eval.MultiAMS(n_total, wgt_name, targ_name, zero_preds, one_preds, br=0, syst_unc_b=0, use_quick_scan=True, name='AMS', main_metric=True)[source]¶

Bases: EvalMetric

Class to compute the maximum Approximate Median Significance (https://arxiv.org/abs/1007.1727) using a classifier which predicts the class of data in a multiclass classification problem which can be reduced to a binary classification problem. AMS is computed on a single fold of data provided by a FoldYielder, and the data are automatically reweighted by event multiplicity to account for missing weights.

Parameters:

n_total (int) – total number of events in entire data set
wgt_name (str) – name of weight group in fold file to use. N.B. if you have reweighted to balance classes, be sure to use the un-reweighted weights.
targ_name (str) – name of target group in fold file which indicates whether the event is signal or background
zero_preds (List[str]) – list of predicted classes which correspond to class 0 in the form pred_[i], where i is a NN output index
one_preds (List[str]) – list of predicted classes which correspond to class 1 in the form pred_[i], where i is a NN output index
br (float) – constant bias offset for background yield
syst_unc_b (float) – fractional systematic uncertainty on background yield
use_quick_scan (bool) – whether to optimise AMS using the ams_scan_quick() method (fast, but suffers from limited floating-point precision); if False, ams_scan_slow() is used instead (slower but more accurate)
name (Optional[str]) – optional name for metric, otherwise will be ‘AMS’
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. The first EvalMetric will automatically be set as main if multiple primary metrics are submitted

Examples::

>>> ams_metric = MultiAMS(n_total=250000, br=10, targ_name='gen_target',
...                       wgt_name='gen_orig_weight',
...                       zero_preds=['pred_0', 'pred_1', 'pred_2'],
...                       one_preds=['pred_3'])
>>>
>>> ams_metric = MultiAMS(n_total=250000, syst_unc_b=0.1,
...                       targ_name='gen_target',
...                       wgt_name='gen_orig_weight',
...                       use_quick_scan=False,
...                       zero_preds=['pred_0', 'pred_1', 'pred_2'],
...                       one_preds=['pred_3'])

evaluate()[source]¶

Compute maximum AMS on fold using provided predictions.

Return type:: float
Returns:: Maximum AMS computed on reweighted data from fold
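
One plausible way to reduce multiclass outputs to a single binary score is sketched below. This is an assumption made for illustration only: the class may combine the zero_preds and one_preds columns differently. The helper binary_score is hypothetical, and each event is represented as a plain dict keyed by the pred_[i] column names.

```python
def binary_score(row, zero_preds, one_preds):
    """Share of the predicted probability assigned to the class-1 outputs."""
    one = sum(row[col] for col in one_preds)
    zero = sum(row[col] for col in zero_preds)
    total = one + zero
    return one / total if total > 0 else 0.0


event = {'pred_0': 0.1, 'pred_1': 0.2, 'pred_2': 0.3, 'pred_3': 0.4}
score = binary_score(event,
                     zero_preds=['pred_0', 'pred_1', 'pred_2'],
                     one_preds=['pred_3'])
```

A score of this kind can then be scanned for the cut that maximises AMS, exactly as in the binary case.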

class lumin.nn.metrics.class_eval.RocAucScore(average='macro', max_fpr=None, multi_class='raise', name='ROC AUC', main_metric=True)[source]¶

Bases: EvalMetric

Computes and returns the area under the Receiver Operating Characteristic curve (ROC AUC) of a classifier model.

Parameters:

average (Optional[str]) –
As per scikit-learn. {‘micro’, ‘macro’, ‘samples’, ‘weighted’} or None, default=‘macro’. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data. Note: multiclass ROC AUC currently only handles the ‘macro’ and ‘weighted’ averages.

'micro':
Calculate metrics globally by considering each element of the label indicator matrix as a label.

'macro':
Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

'weighted':
Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label).

'samples':
Calculate metrics for each instance, and find their average.

Will be ignored when y_true is binary.
max_fpr (Optional[float]) – As per scikit-learn. float > 0 and <= 1, default=None. If not None, the standardized partial AUC over the range [0, max_fpr] is returned. For the multiclass case, max_fpr should be either None or 1.0, as partial ROC AUC computation is currently not supported for multiclass.
multi_class (str) –
As per scikit-learn. {‘raise’, ‘ovr’, ‘ovo’}, default=‘raise’. Multiclass only. Determines the type of configuration to use. The default value raises an error, so either 'ovr' or 'ovo' must be passed explicitly.

'ovr':
Computes the AUC of each class against the rest. This treats the multiclass case in the same way as the multilabel case. Sensitive to class imbalance even when average == 'macro', because class imbalance affects the composition of each of the ‘rest’ groupings.

'ovo':
Computes the average AUC of all possible pairwise combinations of classes. Insensitive to class imbalance when average == 'macro'.
name (Optional[str]) – optional name for metric, otherwise will be ‘ROC AUC’
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. The first EvalMetric will automatically be set as main if multiple primary metrics are submitted

Examples::

>>> auc_metric = RocAucScore()
>>>
>>> auc_metric = RocAucScore(max_fpr=0.2)
>>>
>>> auc_metric = RocAucScore(multi_class='ovo')

evaluate()[source]¶

Computes the (weighted) (averaged) ROC AUC for a set of targets and predictions.

Return type:: float
Returns:: The (weighted) (averaged) ROC AUC
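
For intuition about the binary case, ROC AUC equals the probability that a randomly chosen class-1 event receives a higher prediction than a randomly chosen class-0 event, with ties counted as one half. The sketch below computes this directly from pairs; it is an unweighted illustration with a hypothetical function name, not the scikit-learn routine whose parameters are described above.

```python
def binary_roc_auc(preds, targets):
    """Unweighted ROC AUC from pairwise comparisons of class-1 and class-0 scores."""
    pos = [p for p, t in zip(preds, targets) if t == 1]
    neg = [p for p, t in zip(preds, targets) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one event of each class")
    # A pair counts 1 if the class-1 event is ranked higher, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


auc = binary_roc_auc(preds=[0.1, 0.4, 0.35, 0.8], targets=[0, 0, 1, 1])
```

The pairwise form scales quadratically with the number of events, so it is only suitable for small checks like this one.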

lumin.nn.metrics.eval_metric module¶

class lumin.nn.metrics.eval_metric.EvalMetric(name, lower_metric_better, main_metric=True)[source]¶

Bases: Callback

Abstract class for evaluating performance of a model using some metric

Parameters:

name (Optional[str]) – optional name for metric, otherwise will be inferred from class
lower_metric_better (bool) – whether a lower metric value should be treated as representing better performance
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. The first EvalMetric will automatically be set as main if multiple primary metrics are submitted

abstract evaluate()[source]¶

Evaluate the required metric for a given fold and set of predictions

Return type:: float
Returns:: metric value

evaluate_model(model, fy, fold_idx, inputs, targets, weights=None, bs=None)[source]¶

Gets model predictions and computes metric value. The fy and fold_idx arguments are necessary in case the metric requires extra information beyond inputs, targets, and weights.

Parameters:

model (AbsModel) – model to evaluate
fy (FoldYielder) – FoldYielder containing data
fold_idx (int) – fold index of corresponding data
inputs (ndarray) – input data
targets (ndarray) – target data
weights (Optional[ndarray]) – optional weights
bs (Optional[int]) – optional batch size

Return type:

float

Returns:

metric value

evaluate_preds(fy, fold_idx, preds, targets, weights=None)[source]¶

Computes metric value from predictions. The fy and fold_idx arguments are necessary in case the metric requires extra information beyond inputs, targets, and weights.

Parameters:

fy (FoldYielder) – FoldYielder containing data
fold_idx (int) – fold index of corresponding data
preds – model predictions for the data
targets (ndarray) – target data
weights (Optional[ndarray]) – optional weights

Return type:

float

Returns:

metric value

get_df()[source]¶

Returns a DataFrame for the given fold containing targets, weights, and predictions

Return type:: DataFrame
Returns:: DataFrame for the given fold containing targets, weights, and predictions

get_metric()[source]¶

Returns metric value

Return type:: float
Returns:: metric value

on_epoch_begin()[source]¶

Resets prediction tracking

Return type:: None

on_epoch_end()[source]¶

Compute metric using saved predictions

Return type:: None

on_forwards_end()[source]¶

Save predictions from batch

Return type:: None
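
The three epoch-level hooks above suggest a reset, accumulate, compute cycle. The following stand-alone sketch mimics that cycle with a hypothetical SketchMetric base class; it is not the library's EvalMetric. In particular, batches are passed to on_forwards_end explicitly here, whereas the documented hook takes no arguments.

```python
from abc import ABC, abstractmethod


class SketchMetric(ABC):
    """Hypothetical stand-in illustrating the reset / accumulate / compute cycle."""

    def __init__(self, name, lower_metric_better, main_metric=True):
        self.name = name
        self.lower_metric_better = lower_metric_better
        self.main_metric = main_metric
        self.on_epoch_begin()

    def on_epoch_begin(self):
        # Reset prediction tracking.
        self.preds, self.targets, self.metric = [], [], None

    def on_forwards_end(self, batch_preds, batch_targets):
        # Save predictions (and targets) from one batch.
        self.preds.extend(batch_preds)
        self.targets.extend(batch_targets)

    def on_epoch_end(self):
        # Compute the metric from everything saved during the epoch.
        self.metric = self.evaluate()

    def get_metric(self):
        return self.metric

    @abstractmethod
    def evaluate(self):
        """Return the metric value as a float."""


class MeanAbsError(SketchMetric):
    def __init__(self):
        super().__init__('MAE', lower_metric_better=True)

    def evaluate(self):
        errs = [abs(p - t) for p, t in zip(self.preds, self.targets)]
        return sum(errs) / len(errs)


mae = MeanAbsError()
mae.on_epoch_begin()
mae.on_forwards_end([1.0, 2.0], [1.5, 2.0])
mae.on_forwards_end([3.0], [2.0])
mae.on_epoch_end()
```

A concrete metric only has to supply evaluate(); the accumulation logic lives in the base class.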

on_train_begin()[source]¶

Ensures that only one main metric is used

Return type:: None
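
The rule stated for main_metric (the first metric flagged as main stays main when several are submitted) can be sketched as follows. MetricStub and keep_first_main are hypothetical names used only for this illustration and do not reflect how the callback is implemented.

```python
from dataclasses import dataclass


@dataclass
class MetricStub:
    """Hypothetical stand-in carrying only the fields needed for the sketch."""
    name: str
    main_metric: bool = True


def keep_first_main(metrics):
    """Demote every metric flagged as main except the first one."""
    main_seen = False
    for metric in metrics:
        if metric.main_metric and main_seen:
            metric.main_metric = False
        main_seen = main_seen or metric.main_metric
    return metrics


metrics = keep_first_main([MetricStub('AMS'),
                           MetricStub('Acc'),
                           MetricStub('ROC AUC', main_metric=False)])
```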

class lumin.nn.metrics.eval_metric.TorchGeometricEvalMetric(name, lower_metric_better, main_metric=True)[source]¶

Bases: EvalMetric

Abstract class for evaluating performance of a model using some metric and PyTorch Geometric data

Parameters:

name (Optional[str]) – optional name for metric, otherwise will be inferred from class
lower_metric_better (bool) – whether a lower metric value should be treated as representing better performance
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. The first EvalMetric will automatically be set as main if multiple primary metrics are submitted

on_epoch_begin()[source]¶

Resets prediction tracking

Return type:: None

on_epoch_end()[source]¶

Compute metric using saved predictions

Return type:: None

on_forwards_end()[source]¶

Save predictions from batch

Return type:: None

lumin.nn.metrics.reg_eval module¶

class lumin.nn.metrics.reg_eval.RegAsProxyPull(proxy_func, return_mean, targ_name=None, use_bootstrap=False, use_pull=True, name=None, main_metric=True)[source]¶

Bases: RegPull

Compute mean or standard deviation of delta or pull of some feature which is being indirectly regressed to via a proxy function. Optionally, use bootstrap resampling on validation data.

Parameters:

proxy_func (Callable[[DataFrame], None]) – function which acts on regression predictions and adds pred and gen_target columns to the Pandas DataFrame it is passed which contains prediction columns pred_{i}
return_mean (bool) – whether to return the mean or the standard deviation
use_bootstrap (bool) – whether to bootstrap resample the validation fold when computing the statistic
use_weights – whether to actually use weights if wgt_name is set
use_pull (bool) – whether to return the pull (differences / targets) or delta (differences)
targ_name (Optional[str]) – optional name of group in fold file containing regression targets
name (Optional[str]) – optional name for metric, otherwise will be inferred from use_pull
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. The first EvalMetric will automatically be set as main if multiple primary metrics are submitted

Examples::

>>> def reg_proxy_func(df):
...     df['pred'] = calc_pair_mass(df, (1.77682, 1.77682),
...                                 {targ[targ.find('_t')+3:]:
...                                 f'pred_{i}' for i, targ
...                                 in enumerate(targ_feats)})
...     df['gen_target'] = 125
>>>
>>> std_delta = RegAsProxyPull(proxy_func=reg_proxy_func,
...                            return_mean=False, use_pull=False)

evaluate()[source]¶

Compute statistic on fold using provided predictions.

Parameters:

fy – FoldYielder interfacing to data
idx – fold index corresponding to fold for which y_pred was computed
y_pred – predictions for fold

Return type:

float

Returns:

Statistic set in initialisation, computed on the chosen fold

Examples::

>>> mean = mean_pull.evaluate(train_fy, val_id, val_preds)

class lumin.nn.metrics.reg_eval.RegPull(return_mean, use_bootstrap=False, use_pull=True, name=None, main_metric=True)[source]¶

Bases: EvalMetric

Compute mean or standard deviation of delta or pull of some feature which is being directly regressed to. Optionally, use bootstrap resampling on validation data.

Parameters:

return_mean (bool) – whether to return the mean or the standard deviation
use_bootstrap (bool) – whether to bootstrap resample the validation fold when computing the statistic
use_pull (bool) – whether to return the pull (differences / targets) or delta (differences)
name (Optional[str]) – optional name for metric, otherwise will be inferred from use_pull
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. The first EvalMetric will automatically be set as main if multiple primary metrics are submitted

Examples::

>>> mean_pull  = RegPull(return_mean=True, use_bootstrap=True,
...                      use_pull=True)
>>>
>>> std_delta  = RegPull(return_mean=False, use_bootstrap=True,
...                      use_pull=False)
>>>
>>> mean_pull  = RegPull(return_mean=True, use_bootstrap=False,
...                      use_pull=True, wgt_name='weights')

evaluate()[source]¶

Compute mean or width of regression error.

Return type:: float
Returns:: Mean or width of regression error
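
The four combinations of return_mean and use_pull can be made concrete with a small sketch. It assumes that "differences" means prediction minus target, uses the population standard deviation for the width, and averages the statistic over resamples when bootstrapping; all three choices, and the name reg_error_stat, are assumptions made for this illustration rather than confirmed details of the class.

```python
import random
import statistics


def reg_error_stat(preds, targets, return_mean, use_pull=True,
                   use_bootstrap=False, n_resamples=100, seed=0):
    """Mean or width of the delta (pred - targ) or pull ((pred - targ) / targ)."""
    errs = [(p - t) / t if use_pull else (p - t)
            for p, t in zip(preds, targets)]
    stat = statistics.fmean if return_mean else statistics.pstdev
    if not use_bootstrap:
        return stat(errs)
    # Bootstrap: resample the errors with replacement and average the statistic.
    rng = random.Random(seed)
    resampled = [stat(rng.choices(errs, k=len(errs)))
                 for _ in range(n_resamples)]
    return statistics.fmean(resampled)


preds, targets = [110.0, 90.0, 105.0], [100.0, 100.0, 100.0]
mean_delta = reg_error_stat(preds, targets, return_mean=True, use_pull=False)
mean_pull = reg_error_stat(preds, targets, return_mean=True, use_pull=True)
std_delta = reg_error_stat(preds, targets, return_mean=False, use_pull=False)
boot_mean = reg_error_stat(preds, targets, return_mean=True, use_bootstrap=True)
```

The pull is dimensionless, which makes it easier to compare across targets of different scale, while the delta keeps the units of the regressed feature.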
