lumin.nn.metrics package
Submodules
lumin.nn.metrics.class_eval module
- class lumin.nn.metrics.class_eval.AMS(n_total, wgt_name, br=0, syst_unc_b=0, use_quick_scan=True, name='AMS', main_metric=True)
Bases: EvalMetric
Class to compute the maximum Approximate Median Significance (https://arxiv.org/abs/1007.1727) using a classifier which directly predicts the class of data in a binary classification problem. AMS is computed on a single fold of data provided by a FoldYielder, and data are automatically reweighted by event multiplicity to account for missing weights.
- Parameters:
n_total (int) – total number of events in the entire data set
wgt_name (str) – name of weight group in fold file to use. N.B. if you have reweighted to balance classes, be sure to use the un-reweighted weights.
br (float) – constant bias offset for background yield
syst_unc_b (float) – fractional systematic uncertainty on background yield
use_quick_scan (bool) – whether to optimise AMS using the ams_scan_quick() method (fast, but suffers from limited floating-point precision); if False, ams_scan_slow() is used (slower but more accurate)
name (Optional[str]) – optional name for metric, otherwise will be ‘AMS’
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. If multiple primary metrics are submitted, the first EvalMetric is automatically set as main.
- Examples::
>>> ams_metric = AMS(n_total=250000, br=10, wgt_name='gen_orig_weight')
>>>
>>> ams_metric = AMS(n_total=250000, syst_unc_b=0.1,
...                  wgt_name='gen_orig_weight', use_quick_scan=False)
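For intuition, the quantity being maximised can be sketched in a few lines of plain Python. This is a minimal sketch and not LUMIN's implementation: it uses the commonly quoted AMS form with a constant background offset br and ignores syst_unc_b. The function names ams and max_ams are hypothetical, and the rescaling of fold weights by n_total / len(fold) is an assumption about what reweighting by event multiplicity means.

```python
import math


def ams(s, b, br=0.0):
    """AMS for weighted signal yield s and background yield b (no systematics)."""
    b_reg = b + br
    if b_reg <= 0:
        return 0.0  # undefined without background; treated as zero in this sketch
    radicand = 2 * ((s + b_reg) * math.log(1 + s / b_reg) - s)
    return math.sqrt(max(radicand, 0.0))  # guard against tiny negative rounding


def max_ams(preds, targets, weights, n_total, br=0.0):
    """Try every distinct prediction as a cut; return (best AMS, best cut)."""
    # Assumption: scale fold weights so the yields correspond to the full data set
    scale = n_total / len(preds)
    best_ams, best_cut = 0.0, None
    for cut in sorted(set(preds)):
        s = sum(w * scale for p, t, w in zip(preds, targets, weights)
                if p >= cut and t == 1)
        b = sum(w * scale for p, t, w in zip(preds, targets, weights)
                if p >= cut and t == 0)
        value = ams(s, b, br)
        if value > best_ams:
            best_ams, best_cut = value, cut
    return best_ams, best_cut


best, cut = max_ams(preds=[0.1, 0.4, 0.6, 0.9], targets=[0, 0, 1, 1],
                    weights=[1.0, 1.0, 1.0, 1.0], n_total=4)
```

The exhaustive loop over cuts is quadratic in the number of events, so it is only meant to show the idea behind scanning thresholds for the maximum AMS.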
- class lumin.nn.metrics.class_eval.BinaryAccuracy(threshold=0.5, name='Acc', main_metric=True)
Bases: EvalMetric
Computes and returns the accuracy of a single-output model for binary classification tasks.
- Parameters:
threshold (float) – minimum value of model prediction that will be considered a prediction of class 1. Values below this threshold will be considered predictions of class 0. Default = 0.5.
name (Optional[str]) – optional name for metric, otherwise will be ‘Acc’
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. If multiple primary metrics are submitted, the first EvalMetric is automatically set as main.
- Examples::
>>> acc_metric = BinaryAccuracy()
>>>
>>> acc_metric = BinaryAccuracy(threshold=0.8)
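The effect of threshold can be illustrated with a small stand-alone function. This is a sketch under the definition given in the parameter description (predictions at or above the threshold count as class 1); binary_accuracy is a hypothetical name, and any event weighting the library may apply is not modelled.

```python
def binary_accuracy(preds, targets, threshold=0.5):
    """Fraction of events whose thresholded prediction matches the target."""
    if len(preds) != len(targets):
        raise ValueError('preds and targets must have the same length')
    # A prediction >= threshold is class 1, anything below is class 0
    correct = sum(int(p >= threshold) == int(t) for p, t in zip(preds, targets))
    return correct / len(preds)


preds = [0.2, 0.6, 0.7, 0.9]
targets = [0, 1, 1, 1]
acc_default = binary_accuracy(preds, targets)               # threshold=0.5
acc_strict = binary_accuracy(preds, targets, threshold=0.8)
```

Raising the threshold moves the two middle events into class 0, which lowers the accuracy for this toy data.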
- class lumin.nn.metrics.class_eval.MultiAMS(n_total, wgt_name, targ_name, zero_preds, one_preds, br=0, syst_unc_b=0, use_quick_scan=True, name='AMS', main_metric=True)
Bases: EvalMetric
Class to compute the maximum Approximate Median Significance (https://arxiv.org/abs/1007.1727) using a classifier which predicts the class of data in a multiclass classification problem that can be reduced to a binary classification problem. AMS is computed on a single fold of data provided by a FoldYielder, and data are automatically reweighted by event multiplicity to account for missing weights.
- Parameters:
n_total (int) – total number of events in the entire data set
wgt_name (str) – name of weight group in fold file to use. N.B. if you have reweighted to balance classes, be sure to use the un-reweighted weights.
targ_name (str) – name of target group in fold file which indicates whether the event is signal or background
zero_preds (List[str]) – list of predicted classes which correspond to class 0, in the form pred_[i], where i is a NN output index
one_preds (List[str]) – list of predicted classes which correspond to class 1, in the form pred_[i], where i is a NN output index
br (float) – constant bias offset for background yield
syst_unc_b (float) – fractional systematic uncertainty on background yield
use_quick_scan (bool) – whether to optimise AMS using the ams_scan_quick() method (fast, but suffers from limited floating-point precision); if False, ams_scan_slow() is used (slower but more accurate)
name (Optional[str]) – optional name for metric, otherwise will be ‘AMS’
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. If multiple primary metrics are submitted, the first EvalMetric is automatically set as main.
- Examples::
>>> ams_metric = MultiAMS(n_total=250000, br=10, targ_name='gen_target',
...                       wgt_name='gen_orig_weight',
...                       zero_preds=['pred_0', 'pred_1', 'pred_2'],
...                       one_preds=['pred_3'])
>>>
>>> ams_metric = MultiAMS(n_total=250000, syst_unc_b=0.1,
...                       targ_name='gen_target',
...                       wgt_name='gen_orig_weight',
...                       use_quick_scan=False,
...                       zero_preds=['pred_0', 'pred_1', 'pred_2'],
...                       one_preds=['pred_3'])
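One way to picture the reduction from multiclass outputs to a single binary score is sketched below. The exact combination used by the library is not stated here, so the normalised sum of the one_preds columns is purely an assumption for illustration; reduce_to_binary is a hypothetical helper, and the predictions are held in a plain dict of column name to list rather than a fold file.

```python
def reduce_to_binary(preds, zero_preds, one_preds):
    """Combine multiclass outputs into one class-1 score per event.

    preds maps column names such as 'pred_0' to lists of per-event outputs.
    Assumed reduction: sum(class-1 outputs) / sum(all listed outputs).
    """
    n_events = len(preds[one_preds[0]])
    scores = []
    for i in range(n_events):
        one = sum(preds[col][i] for col in one_preds)
        zero = sum(preds[col][i] for col in zero_preds)
        total = one + zero
        scores.append(one / total if total > 0 else 0.0)
    return scores


outputs = {'pred_0': [0.1, 0.7], 'pred_1': [0.2, 0.1],
           'pred_2': [0.3, 0.1], 'pred_3': [0.4, 0.1]}
scores = reduce_to_binary(outputs,
                          zero_preds=['pred_0', 'pred_1', 'pred_2'],
                          one_preds=['pred_3'])
```

The resulting per-event scores could then be scanned for the maximum AMS in the same way as a single-output binary classifier.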
- class lumin.nn.metrics.class_eval.RocAucScore(average='macro', max_fpr=None, multi_class='raise', name='ROC AUC', main_metric=True)
Bases: EvalMetric
Computes and returns the area under the Receiver Operating Characteristic curve (ROC AUC) of a classifier model.
- Parameters:
average (Optional[str]) – As per scikit-learn. {'micro', 'macro', 'samples', 'weighted'} or None, default='macro'. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data. Note: multiclass ROC AUC currently only handles the 'macro' and 'weighted' averages. Will be ignored when y_true is binary.
- 'micro': Calculate metrics globally by considering each element of the label indicator matrix as a label.
- 'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
- 'weighted': Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label).
- 'samples': Calculate metrics for each instance, and find their average.
max_fpr (Optional[float]) – As per scikit-learn. float > 0 and <= 1, default=None. If not None, the standardized partial AUC over the range [0, max_fpr] is returned. For the multiclass case, max_fpr should be either None or 1.0, as partial ROC AUC computation is currently not supported for multiclass.
multi_class (str) – As per scikit-learn. {'raise', 'ovr', 'ovo'}, default='raise'. Multiclass only. Determines the type of configuration to use. The default value raises an error, so either 'ovr' or 'ovo' must be passed explicitly.
- 'ovr': Computes the AUC of each class against the rest. This treats the multiclass case in the same way as the multilabel case. Sensitive to class imbalance even when average == 'macro', because class imbalance affects the composition of each of the ‘rest’ groupings.
- 'ovo': Computes the average AUC of all possible pairwise combinations of classes. Insensitive to class imbalance when average == 'macro'.
name (Optional[str]) – optional name for metric, otherwise will be ‘ROC AUC’
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. If multiple primary metrics are submitted, the first EvalMetric is automatically set as main.
- Examples::
>>> auc_metric = RocAucScore()
>>>
>>> auc_metric = RocAucScore(max_fpr=0.2)
>>>
>>> auc_metric = RocAucScore(multi_class='ovo')
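As a reminder of what the binary score measures, ROC AUC equals the probability that a randomly chosen class-1 event receives a higher prediction than a randomly chosen class-0 event. The sketch below computes that pairwise estimate directly in plain Python. It is for intuition only: binary_roc_auc is a hypothetical name, and the sketch does not cover event weights, max_fpr, or the multiclass averaging options, which the metric takes from scikit-learn.

```python
def binary_roc_auc(targets, scores):
    """Pairwise estimate of binary ROC AUC; ties count as half a win."""
    pos = [s for s, t in zip(scores, targets) if t == 1]
    neg = [s for s, t in zip(scores, targets) if t == 0]
    if not pos or not neg:
        raise ValueError('both classes must be present')
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


auc = binary_roc_auc(targets=[0, 0, 1, 1], scores=[0.1, 0.4, 0.35, 0.8])
```

Three of the four signal-background pairs are correctly ordered in this toy data, giving an AUC of 0.75.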
lumin.nn.metrics.eval_metric module
- class lumin.nn.metrics.eval_metric.EvalMetric(name, lower_metric_better, main_metric=True)
Bases: Callback
Abstract class for evaluating performance of a model using some metric
- Parameters:
name (Optional[str]) – optional name for metric, otherwise will be inferred from class
lower_metric_better (bool) – whether a lower metric value should be treated as representing better performance
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. If multiple primary metrics are submitted, the first EvalMetric is automatically set as main.
- abstract evaluate()
Evaluate the required metric for a given fold and set of predictions
- Return type:
float
- Returns:
metric value
- evaluate_model(model, fy, fold_idx, inputs, targets, weights=None, bs=None)
Gets model predictions and computes metric value. The fy and fold_idx arguments are necessary in case the metric requires extra information beyond inputs, targets, and weights.
- Parameters:
model (AbsModel) – model to evaluate
fy (FoldYielder) – FoldYielder containing data
fold_idx (int) – fold index of corresponding data
inputs (ndarray) – input data
targets (ndarray) – target data
weights (Optional[ndarray]) – optional weights
bs (Optional[int]) – optional batch size
- Return type:
float
- Returns:
metric value
- evaluate_preds(fy, fold_idx, preds, targets, weights=None)
Computes metric value from predictions. The fy and fold_idx arguments are necessary in case the metric requires extra information beyond predictions, targets, and weights.
- Parameters:
fy (FoldYielder) – FoldYielder containing data
fold_idx (int) – fold index of corresponding data
preds – model predictions
targets (ndarray) – target data
weights (Optional[ndarray]) – optional weights
- Return type:
float
- Returns:
metric value
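The relationship between an abstract evaluate() and a method that is handed predictions can be sketched without importing LUMIN. Everything in this block is a hypothetical stand-in: SketchMetric, MeanAbsError, and is_improvement are illustrative names, and caching the fold data on the instance before calling evaluate() is an assumption about the pattern, not a description of the library's code.

```python
from abc import ABC, abstractmethod


class SketchMetric(ABC):
    """Hypothetical stand-in for an EvalMetric-like base class (not LUMIN code)."""

    def __init__(self, name, lower_metric_better, main_metric=True):
        self.name = name
        self.lower_metric_better = lower_metric_better
        self.main_metric = main_metric

    @abstractmethod
    def evaluate(self):
        """Return the metric value for the currently stored predictions."""

    def evaluate_preds(self, preds, targets, weights=None):
        # Assumed behaviour: cache the fold data, then defer to evaluate()
        self.preds, self.targets, self.weights = preds, targets, weights
        return self.evaluate()

    def is_improvement(self, new, best):
        # Hypothetical helper showing what lower_metric_better is used for
        return new < best if self.lower_metric_better else new > best


class MeanAbsError(SketchMetric):
    """Concrete example metric: unweighted mean absolute error."""

    def __init__(self):
        super().__init__(name='MAE', lower_metric_better=True)

    def evaluate(self):
        errors = [abs(p - t) for p, t in zip(self.preds, self.targets)]
        return sum(errors) / len(errors)


mae = MeanAbsError()
value = mae.evaluate_preds(preds=[1.0, 2.0, 4.0], targets=[1.0, 3.0, 2.0])
```

A subclass only has to implement evaluate(); the lower_metric_better flag is what lets callbacks such as save-best or early stopping decide whether a new value beats the previous best.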
- class lumin.nn.metrics.eval_metric.TorchGeometricEvalMetric(name, lower_metric_better, main_metric=True)
Bases: EvalMetric
Abstract class for evaluating performance of a model using some metric and PyTorch Geometric data
- Parameters:
name (Optional[str]) – optional name for metric, otherwise will be inferred from class
lower_metric_better (bool) – whether a lower metric value should be treated as representing better performance
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. If multiple primary metrics are submitted, the first EvalMetric is automatically set as main.
lumin.nn.metrics.reg_eval module
- class lumin.nn.metrics.reg_eval.RegAsProxyPull(proxy_func, return_mean, targ_name=None, use_bootstrap=False, use_pull=True, name=None, main_metric=True)
Bases: RegPull
Compute mean or standard deviation of delta or pull of some feature which is being indirectly regressed to via a proxy function. Optionally, use bootstrap resampling on validation data.
- Parameters:
proxy_func (Callable[[DataFrame], None]) – function which acts on regression predictions: it is passed a Pandas DataFrame containing prediction columns pred_{i} and adds pred and gen_target columns to it
return_mean (bool) – whether to return the mean or the standard deviation
use_bootstrap (bool) – whether to bootstrap resample the validation fold when computing the statistic
use_weights – whether to actually use weights if wgt_name is set
use_pull (bool) – whether to return the pull (differences / targets) or delta (differences)
targ_name (Optional[str]) – optional name of group in fold file containing regression targets
name (Optional[str]) – optional name for metric, otherwise will be inferred from use_pull
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. If multiple primary metrics are submitted, the first EvalMetric is automatically set as main.
- Examples::
>>> def reg_proxy_func(df):
...     df['pred'] = calc_pair_mass(df, (1.77682, 1.77682),
...                                 {targ[targ.find('_t')+3:]:
...                                  f'pred_{i}' for i, targ
...                                  in enumerate(targ_feats)})
...     df['gen_target'] = 125
>>>
>>> std_delta = RegAsProxyPull(proxy_func=reg_proxy_func,
...                            return_mean=False, use_pull=False)
- evaluate()
Compute statistic on fold using provided predictions.
- Parameters:
fy – FoldYielder interfacing to data
idx – fold index corresponding to fold for which y_pred was computed
y_pred – predictions for fold
- Return type:
float
- Returns:
Statistic set in initialisation computed on the chosen fold
- Examples::
>>> mean = mean_pull.evaluate(train_fy, val_id, val_preds)
- class lumin.nn.metrics.reg_eval.RegPull(return_mean, use_bootstrap=False, use_pull=True, name=None, main_metric=True)
Bases: EvalMetric
Compute mean or standard deviation of delta or pull of some feature which is being directly regressed to. Optionally, use bootstrap resampling on validation data.
- Parameters:
return_mean (bool) – whether to return the mean or the standard deviation
use_bootstrap (bool) – whether to bootstrap resample the validation fold when computing the statistic
use_pull (bool) – whether to return the pull (differences / targets) or delta (differences)
name (Optional[str]) – optional name for metric, otherwise will be inferred from use_pull
main_metric (bool) – whether this metric should be treated as the primary metric for SaveBest and EarlyStopping. If multiple primary metrics are submitted, the first EvalMetric is automatically set as main.
- Examples::
>>> mean_pull = RegPull(return_mean=True, use_bootstrap=True,
...                     use_pull=True)
>>>
>>> std_delta = RegPull(return_mean=False, use_bootstrap=True,
...                     use_pull=False)
>>>
>>> mean_pull = RegPull(return_mean=True, use_bootstrap=False,
...                     use_pull=True)
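The four combinations of return_mean and use_pull, plus optional bootstrapping, can be sketched with the standard library alone. This is a rough illustration and not the library's implementation: pull_statistic, n_resamples, and seed are hypothetical names, the population standard deviation is an arbitrary choice, and averaging the statistic over resamples is an assumption about how the bootstrap result is summarised.

```python
import random
import statistics


def pull_statistic(preds, targets, return_mean, use_pull=True,
                   use_bootstrap=False, n_resamples=100, seed=0):
    """Mean or standard deviation of the delta or pull of a regressed feature."""
    deltas = [p - t for p, t in zip(preds, targets)]  # differences
    # Pull divides each difference by its target, as in the use_pull description
    values = [d / t for d, t in zip(deltas, targets)] if use_pull else deltas
    stat = statistics.fmean if return_mean else statistics.pstdev
    if not use_bootstrap:
        return stat(values)
    # Assumption: resample with replacement and average the statistic
    rng = random.Random(seed)
    resampled = [stat(rng.choices(values, k=len(values)))
                 for _ in range(n_resamples)]
    return statistics.fmean(resampled)


preds, targets = [110.0, 130.0], [100.0, 100.0]
mean_pull = pull_statistic(preds, targets, return_mean=True, use_pull=True)
std_delta = pull_statistic(preds, targets, return_mean=False, use_pull=False)
boot_mean = pull_statistic(preds, targets, return_mean=True, use_bootstrap=True)
```

The pull is dimensionless, which makes it easier to compare across targets of different scale, whereas the delta keeps the units of the regressed feature.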