lumin.nn.models package

Submodules

lumin.nn.models.helpers module

class lumin.nn.models.helpers.CatEmbedder(cat_names, cat_szs, emb_szs=None, max_emb_sz=50, emb_load_path=None)[source]

Bases: object

Helper class for embedding categorical features. Designed to be passed to ModelBuilder. Note that the classmethod from_fy() may be used to instantiate a CatEmbedder from a FoldYielder.

Parameters
  • cat_names (List[str]) – list of names of categorical features, in the order in which they will be passed as input columns

  • cat_szs (List[int]) – list of cardinalities (number of unique elements) for each feature

  • emb_szs (Optional[List[int]]) – Optional list of embedding sizes for each feature. If None, will use min(max_emb_sz, (1+sz)//2)

  • max_emb_sz (int) – Maximum size of embedding if emb_szs is None

  • emb_load_path (Union[Path, str, None]) – if not None, will cause ModelBuilder to attempt to load pretrained embeddings from path

Examples::
>>> cat_embedder = CatEmbedder(cat_names=['n_jets', 'channel'],
...                            cat_szs=[5, 3])
>>>
>>> cat_embedder = CatEmbedder(cat_names=['n_jets', 'channel'],
...                            cat_szs=[5, 3], emb_szs=[2, 2])
>>>
>>> cat_embedder = CatEmbedder(cat_names=['n_jets', 'channel'],
...                            cat_szs=[5, 3], emb_szs=[2, 2],
...                            emb_load_path=Path('weights'))
calc_emb_szs()[source]

Sets the embedding size for each categorical feature when no embedding sizes are explicitly passed. Uses the rule of thumb min(max_emb_sz, (1+cardinality)//2), where max_emb_sz defaults to 50.

Return type

None
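
The rule of thumb above is easy to check by hand. The following is a minimal sketch, assuming the floor-division form given in the parameter descriptions; default_emb_szs is a hypothetical helper for illustration and is not part of LUMIN:

```python
from typing import List


def default_emb_szs(cat_szs: List[int], max_emb_sz: int = 50) -> List[int]:
    """Apply min(max_emb_sz, (1+sz)//2) to each cardinality."""
    return [min(max_emb_sz, (1 + sz) // 2) for sz in cat_szs]


# Cardinalities from the CatEmbedder examples: n_jets=5, channel=3
print(default_emb_szs([5, 3]))    # [3, 2]
# A very high-cardinality feature is capped at max_emb_sz
print(default_emb_szs([1000]))    # [50]
```

Low-cardinality features therefore get embeddings of roughly half their cardinality, and max_emb_sz only matters once a feature has about twice that many categories.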

classmethod from_fy(fy, emb_szs=None, max_emb_sz=50, emb_load_path=None)[source]

Instantiate a CatEmbedder from a FoldYielder, i.e. avoid having to pass cat_names and cat_szs.

Parameters
  • fy (FoldYielder) – FoldYielder with training data

  • emb_szs (Optional[List[int]]) – Optional list of embedding sizes for each feature. If None, will use min(max_emb_sz, (1+sz)//2)

  • max_emb_sz (int) – Maximum size of embedding if emb_szs is None

  • emb_load_path (Union[Path, str, None]) – if not None, will cause ModelBuilder to attempt to load pretrained embeddings from path

Returns

CatEmbedder

Examples::
>>> cat_embedder = CatEmbedder.from_fy(train_fy)
>>>
>>> cat_embedder = CatEmbedder.from_fy(train_fy, emb_szs=[2, 2])
>>>
>>> cat_embedder = CatEmbedder.from_fy(
...     train_fy, emb_szs=[2, 2],
...     emb_load_path=Path('weights'))
lumin.nn.models.helpers.Embedder(cat_names, cat_szs, emb_szs=None, max_emb_sz=50, emb_load_path=None)[source]

Attention

Deprecated in favour of CatEmbedder and will be removed in v0.4.

lumin.nn.models.initialisations module

lumin.nn.models.initialisations.lookup_normal_init(act, fan_in=None, fan_out=None)[source]

Lookup for weight initialisation using Normal distributions

Parameters
  • act (str) – string representation of activation function

  • fan_in (Optional[int]) – number of inputs to neuron

  • fan_out (Optional[int]) – number of outputs from neuron

Return type

Callable[[Tensor], None]

Returns

Callable to initialise weight tensor
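
To make the lookup pattern concrete, here is a rough sketch of a function that takes an activation name and fan sizes and returns an in-place initialiser. The mapping from activation to scheme is an assumption for illustration and may not match LUMIN's actual choices; the formulas themselves are the standard He, LeCun, and Glorot standard deviations. Plain Python lists stand in for tensors, and both function names are hypothetical:

```python
import math
import random
from typing import Callable, List, Optional


def normal_init_std(act: str, fan_in: Optional[int] = None,
                    fan_out: Optional[int] = None) -> float:
    """Standard deviation for a zero-mean Normal initialisation.

    The activation-to-scheme mapping is an assumption for illustration.
    """
    if act == 'relu':   # He initialisation
        return math.sqrt(2.0 / fan_in)
    if act == 'selu':   # LeCun initialisation
        return math.sqrt(1.0 / fan_in)
    # Glorot (Xavier) initialisation for everything else
    return math.sqrt(2.0 / (fan_in + fan_out))


def lookup_normal_init_sketch(act: str, fan_in: Optional[int] = None,
                              fan_out: Optional[int] = None
                              ) -> Callable[[List[float]], None]:
    std = normal_init_std(act, fan_in, fan_out)

    def init(weights: List[float]) -> None:
        # Overwrite the weights in place, mirroring Callable[[Tensor], None]
        for i in range(len(weights)):
            weights[i] = random.gauss(0.0, std)

    return init


weights = [0.0] * 8
init = lookup_normal_init_sketch('relu', fan_in=8)
init(weights)  # weights now hold draws from a Normal with std 0.5
```

Returning a callable rather than initialising directly lets the caller pick the scheme once per layer and then apply it to whichever weight tensor is built.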

lumin.nn.models.initialisations.lookup_uniform_init(act, fan_in=None, fan_out=None)[source]

Lookup weight initialisation using Uniform distributions

Parameters
  • act (str) – string representation of activation function

  • fan_in (Optional[int]) – number of inputs to neuron

  • fan_out (Optional[int]) – number of outputs from neuron

Return type

Callable[[Tensor], None]

Returns

Callable to initialise weight tensor

lumin.nn.models.model module

class lumin.nn.models.model.Model(model_builder=None)[source]

Bases: lumin.nn.models.abs_model.AbsModel

Wrapper class to handle training and inference of NNs created via a ModelBuilder. Note that saved models can be instantiated directly via the from_save() classmethod.

Parameters

model_builder (Optional[ModelBuilder]) – ModelBuilder which will construct the network, loss, and optimiser

Examples::
>>> model = Model(model_builder)
evaluate(inputs, targets, weights=None, callbacks=None, mask_inputs=True)[source]

Compute loss on provided data.

Parameters
  • inputs (Tensor) – input data as tensor on device

  • targets (Tensor) – targets as tensor on device

  • weights (Optional[Tensor]) – Optional weights as tensor on device

  • callbacks (Optional[List[AbsCallback]]) – list of any callbacks to use during evaluation

  • mask_inputs (bool) – whether to apply input mask if one has been set

Return type

float

Returns

(weighted) loss of model predictions on provided data

export2onnx(name, bs=1)[source]

Export network to ONNX format. Note that ONNX expects a fixed batch size (bs), which is the number of datapoints you wish to pass through the model concurrently.

Parameters
  • name (str) – filename for exported file

  • bs (int) – batch size for exported models

Return type

None
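
Because the exported network expects exactly bs datapoints per call, data whose length is not a multiple of bs needs some handling on the caller's side. One possible approach, sketched here with plain lists, is to pad the final batch and discard the padded outputs afterwards. fixed_size_batches is a hypothetical helper and is not part of LUMIN or ONNX:

```python
from typing import List, Tuple

Row = List[float]


def fixed_size_batches(rows: List[Row], bs: int,
                       pad_row: Row) -> List[Tuple[List[Row], int]]:
    """Split rows into batches of exactly bs rows.

    Returns (batch, n_real) pairs, where n_real is the number of genuine
    rows in the batch; any remaining rows are copies of pad_row.
    """
    batches = []
    for start in range(0, len(rows), bs):
        chunk = rows[start:start + bs]
        n_real = len(chunk)
        batches.append((chunk + [pad_row] * (bs - n_real), n_real))
    return batches


data = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
for batch, n_real in fixed_size_batches(data, bs=2, pad_row=[0.0, 0.0]):
    # Pass batch to the exported model, then keep only the first n_real outputs
    print(len(batch), n_real)
```

Exporting with bs=1 avoids padding altogether, at the cost of one call per datapoint.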

export2tfpb(name, bs=1)[source]

Export network to TensorFlow Protocol Buffer format, via ONNX. Note that ONNX expects a fixed batch size (bs), which is the number of datapoints you wish to pass through the model concurrently.

Parameters
  • name (str) – filename for exported file

  • bs (int) – batch size for exported models

Return type

None

fit(batch_yielder, callbacks=None)[source]

Fit network for one complete iteration of a BatchYielder, i.e. one (sub-)epoch

Parameters
  • batch_yielder (BatchYielder) – BatchYielder providing training data in the form of a tuple of inputs, targets, and weights as tensors on device

  • callbacks (Optional[List[AbsCallback]]) – list of AbsCallback to be used during training

Return type

float

Returns

Loss on training data averaged across all minibatches

classmethod from_save(name, model_builder)[source]

Instantiate a Model and load saved state from file.

Parameters
  • name (str) – name of file containing saved state

  • model_builder (ModelBuilder) – ModelBuilder which was used to construct the network

Return type

AbsModel

Returns

Instantiated Model with network weights, optimiser state, and input mask loaded from saved state

Examples::
>>> model = Model.from_save('weights/model.h5', model_builder)
get_feat_importance(fy, eval_metric=None)[source]

Call get_nn_feat_importance() passing this Model and provided arguments

Parameters
  • fy (FoldYielder) – FoldYielder interfacing to data on which to evaluate importance

  • eval_metric (Optional[EvalMetric]) – Optional EvalMetric to use for quantifying performance

Return type

DataFrame

get_lr()[source]

Get learning rate of optimiser

Return type

float

Returns

learning rate of optimiser

get_mom()[source]

Get momentum/beta_1 of optimiser

Return type

float

Returns

momentum/beta_1 of optimiser
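
The momentum/beta_1 wording reflects that optimisers store this quantity under different names: in PyTorch, SGD parameter groups have a momentum entry, while Adam parameter groups have a betas tuple whose first element is beta_1. Below is a minimal sketch of reading and writing it, using plain dicts as stand-ins for entries of an optimiser's param_groups. The helper names are hypothetical, and whether LUMIN does exactly this internally is an assumption:

```python
from typing import Any, Dict


def read_mom(param_group: Dict[str, Any]) -> float:
    """Return beta_1 for Adam-style groups, momentum otherwise."""
    if 'betas' in param_group:
        return param_group['betas'][0]
    return param_group['momentum']


def write_mom(param_group: Dict[str, Any], mom: float) -> None:
    """Set beta_1 (keeping beta_2) for Adam-style groups, momentum otherwise."""
    if 'betas' in param_group:
        param_group['betas'] = (mom, param_group['betas'][1])
    else:
        param_group['momentum'] = mom


# Stand-ins for entries of a PyTorch optimiser's param_groups list
sgd_group = {'lr': 0.01, 'momentum': 0.9}
adam_group = {'lr': 0.001, 'betas': (0.9, 0.999)}

write_mom(sgd_group, 0.8)
write_mom(adam_group, 0.8)
print(read_mom(sgd_group), read_mom(adam_group))  # 0.8 0.8
```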

get_out_size()[source]

Get number of outputs of model

Return type

int

Returns

Number of outputs of model

get_param_count(trainable=True)[source]

Return number of parameters in model.

Parameters

trainable (bool) – if true (default) only count trainable parameters

Return type

int

Returns

Number of (trainable) parameters in model

get_weights()[source]

Get state_dict of weights for network

Return type

OrderedDict

Returns

state_dict of weights for network

load(name, model_builder=None)[source]

Load model, optimiser, and input mask states from file

Parameters
  • name (str) – name of save file

  • model_builder (Optional[ModelBuilder]) – if Model was not initialised with a ModelBuilder, you will need to pass one here

Return type

None

predict(inputs, as_np=True, pred_name='pred')[source]

Apply model to input data and compute predictions. A compatibility method that calls predict_array() or predict_folds(), depending on input type.

Parameters
  • inputs (Union[ndarray, DataFrame, Tensor, FoldYielder]) – input data as Numpy array, Pandas DataFrame, or tensor on device, or FoldYielder interfacing to data

  • as_np (bool) – whether to return predictions as Numpy array (otherwise tensor) if inputs are a Numpy array, Pandas DataFrame, or tensor

  • pred_name (str) – name of group to which to save predictions if inputs are a FoldYielder

Return type

Union[ndarray, Tensor, None]

Returns

If inputs are a Numpy array, Pandas DataFrame, or tensor, returns predictions as either an array or a tensor. If inputs are a FoldYielder, predictions are saved to the fold file and None is returned.
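
The dispatch on input type can be pictured with the following toy sketch. Everything here is a hypothetical stand-in (the FoldYielderStandIn class, the doubling "network", and the in-memory saved dict); it only illustrates that FoldYielder-style inputs have predictions stored and nothing returned, while array-like inputs get predictions back directly:

```python
from typing import Any, Dict, List, Optional


class FoldYielderStandIn:
    """Hypothetical stand-in for a FoldYielder; stores predictions by group name."""

    def __init__(self, folds: List[List[float]]) -> None:
        self.folds = folds
        self.saved: Dict[str, List[List[float]]] = {}


def predict_array(inputs: List[float]) -> List[float]:
    # Placeholder "network": doubles each input
    return [2.0 * x for x in inputs]


def predict_folds(fy: FoldYielderStandIn, pred_name: str = 'pred') -> None:
    # Predictions are saved alongside the data rather than returned
    fy.saved[pred_name] = [predict_array(fold) for fold in fy.folds]


def predict(inputs: Any, pred_name: str = 'pred') -> Optional[List[float]]:
    if isinstance(inputs, FoldYielderStandIn):
        return predict_folds(inputs, pred_name)  # always None
    return predict_array(inputs)


fy = FoldYielderStandIn([[1.0, 2.0], [3.0]])
print(predict([1.0, 2.0]))  # [2.0, 4.0]
print(predict(fy))          # None
print(fy.saved['pred'])     # [[2.0, 4.0], [6.0]]
```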

predict_array(inputs, as_np=True, mask_inputs=True)[source]

Pass inputs through network and obtain predictions.

Parameters
  • inputs (Union[ndarray, DataFrame, Tensor]) – input data as Numpy array, Pandas DataFrame, or tensor on device

  • as_np (bool) – whether to return predictions as Numpy array (otherwise tensor)

  • mask_inputs (bool) – whether to apply input mask if one has been set

Return type

Union[ndarray, Tensor]

Returns

Model prediction(s) per datapoint

predict_folds(fy, pred_name='pred')[source]

Apply model to all data accessed by a FoldYielder and save predictions as a new group in the fold file

Parameters
  • fy (FoldYielder) – FoldYielder interfacing to data

  • pred_name (str) – name of group to which to save predictions

Return type

None

save(name)[source]

Save model, optimiser, and input mask states to file

Parameters

name (str) – name of save file

Return type

None

set_input_mask(mask)[source]

Mask input columns by only using input columns whose indices are listed in mask

Parameters

mask (ndarray) – array of column indices to use from all input columns

Return type

None
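
As an illustration of what masking by column index means, the sketch below selects columns from a list-of-rows input. apply_input_mask is a hypothetical helper, not LUMIN's implementation; for a 2D NumPy array the equivalent selection would be inputs[:, mask]:

```python
from typing import List


def apply_input_mask(inputs: List[List[float]],
                     mask: List[int]) -> List[List[float]]:
    """Keep only the columns whose indices are listed in mask, in mask order."""
    return [[row[i] for i in mask] for row in inputs]


inputs = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
print(apply_input_mask(inputs, mask=[0, 2]))  # [[1.0, 3.0], [4.0, 6.0]]
```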

set_lr(lr)[source]

Set learning rate of optimiser

Parameters

lr (float) – learning rate of optimiser

Return type

None

set_mom(mom)[source]

Set momentum/beta_1 of optimiser

Parameters

mom (float) – momentum/beta_1 of optimiser

Return type

None

set_weights(weights)[source]

Set state_dict of weights for network

Parameters

weights (OrderedDict) – state_dict of weights for network

Return type

None

lumin.nn.models.model_builder module

class lumin.nn.models.model_builder.ModelBuilder(objective, n_out, cont_feats=None, model_args=None, opt_args=None, cat_embedder=None, loss='auto', head=<class 'lumin.nn.models.blocks.head.CatEmbHead'>, body=<class 'lumin.nn.models.blocks.body.FullyConnected'>, tail=<class 'lumin.nn.models.blocks.tail.ClassRegMulti'>, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, pretrain_file=None, freeze_head=False, freeze_body=False, freeze_tail=False, cat_args=None, n_cont_in=None)[source]

Bases: object

Class to build models to specified architecture on demand along with an optimiser.

Attention

cat_args is now deprecated in favour of cat_embedder and will be removed in v0.4

Attention

n_cont_in is now deprecated in favour of cont_feats and will be removed in v0.4

Parameters
  • objective (str) – string representation of network objective, i.e. ‘classification’, ‘regression’, ‘multiclass’

  • n_out (int) – number of outputs required

  • cont_feats (Optional[List[str]]) – list of names of continuous input features

  • model_args (Optional[Dict[str, Dict[str, Any]]]) – dictionary of dictionaries of keyword arguments to pass to head, body, and tail to control architecture

  • opt_args (Optional[Dict[str, Any]]) – dictionary of arguments to pass to optimiser. Missing kwargs will be filled with default values. Currently, only ADAM (default), RAdam, Ranger, and SGD are available.

  • cat_embedder (Optional[CatEmbedder]) – CatEmbedder for embedding categorical inputs

  • loss (Any) – either an uninstantiated loss class, or leave as ‘auto’ to select loss according to objective

  • head (AbsHead) – uninstantiated class which can receive input data and upscale it to model width

  • body (AbsBody) – uninstantiated class which implements the main bulk of the model’s hidden layers

  • tail (AbsTail) – uninstantiated class which scales the body to the required number of outputs and implements any final activation function and output scaling

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights

  • lookup_act (Callable[[str], Module]) – function taking choice of activation function and returning an activation function layer

  • pretrain_file (Optional[str]) – if set, will load saved parameters for entire network from saved model

  • freeze_head (bool) – whether to start with the head parameters set to untrainable

  • freeze_body (bool) – whether to start with the body parameters set to untrainable

  • freeze_tail (bool) – whether to start with the tail parameters set to untrainable

  • cat_args (Optional[Dict[str, Any]]) – deprecated in favour of cat_embedder

  • n_cont_in (Optional[int]) – deprecated in favour of cont_feats
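
The statement that missing optimiser keyword arguments are filled with default values can be pictured as a dictionary merge in which user-supplied entries take priority. This is only a sketch: fill_opt_args is a hypothetical helper, and the default values shown are placeholders rather than LUMIN's real defaults:

```python
from typing import Any, Dict, Optional

# Hypothetical defaults for illustration only; LUMIN's real values may differ
DEFAULT_OPT_ARGS: Dict[str, Any] = {'opt': 'adam', 'weight_decay': 0.0}


def fill_opt_args(opt_args: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """User-supplied entries win; anything missing falls back to the defaults."""
    return {**DEFAULT_OPT_ARGS, **(opt_args or {})}


print(fill_opt_args())
# {'opt': 'adam', 'weight_decay': 0.0}
print(fill_opt_args({'opt': 'sgd', 'momentum': 0.8}))
# {'opt': 'sgd', 'weight_decay': 0.0, 'momentum': 0.8}
```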

Examples::
>>> model_builder = ModelBuilder(objective='classifier',
...                              cont_feats=cont_feats, n_out=1,
...                              model_args={'body':{'depth':4,
...                                                  'width':100}})
>>>
>>> min_targs = np.min(targets, axis=0).reshape(targets.shape[1],1)
>>> max_targs = np.max(targets, axis=0).reshape(targets.shape[1],1)
>>> min_targs[min_targs > 0] *=0.8
>>> min_targs[min_targs < 0] *=1.2
>>> max_targs[max_targs > 0] *=1.2
>>> max_targs[max_targs < 0] *=0.8
>>> y_range = np.hstack((min_targs, max_targs))
>>> model_builder = ModelBuilder(
...     objective='regression', cont_feats=cont_feats, n_out=6,
...     cat_embedder=CatEmbedder.from_fy(train_fy),
...     model_args={'body':{'depth':4, 'width':100},
...                 'tail':{'y_range':y_range}})
>>>
>>> model_builder = ModelBuilder(objective='multiclassifier',
...                              cont_feats=cont_feats, n_out=5,
...                              model_args={'body':{'width':100,
...                                                  'depth':6,
...                                                  'do':0.1,
...                                                  'res':True}})
>>>
>>> model_builder = ModelBuilder(objective='classifier',
...                              cont_feats=cont_feats, n_out=1,
...                              model_args={'body':{'depth':4,
...                                                  'width':100}},
...                              opt_args={'opt':'sgd',
...                                        'momentum':0.8,
...                                        'weight_decay':1e-5},
...                              loss=partial(SignificanceLoss,
...                                           sig_weight=sig_weight,
...                                           bkg_weight=bkg_weight,
...                                           func=calc_ams_torch))
build_model()[source]

Construct entire network module

Return type

Module

Returns

Instantiated nn.Module

classmethod from_model_builder(model_builder, pretrain_file=None, freeze_head=False, freeze_body=False, freeze_tail=False, loss=None, opt_args=None)[source]

Instantiate a ModelBuilder from an existing ModelBuilder, but with options to adjust loss, optimiser, pretraining, and module freezing

Parameters
  • model_builder – existing ModelBuilder or filename for a pickled ModelBuilder

  • pretrain_file (Optional[str]) – if set, will load saved parameters for entire network from saved model

  • freeze_head (bool) – whether to start with the head parameters set to untrainable

  • freeze_body (bool) – whether to start with the body parameters set to untrainable

  • freeze_tail (bool) – whether to start with the tail parameters set to untrainable

  • loss (Optional[Any]) – either an uninstantiated loss class, or leave as ‘auto’ to select loss according to objective

  • opt_args (Optional[Dict[str, Any]]) – dictionary of arguments to pass to optimiser. Missing kwargs will be filled with default values. The optimiser is chosen via the ‘opt’ keyword, either by passing a string name (e.g. ‘adam’; only ADAM and SGD are available this way) or by passing an uninstantiated optimiser (e.g. torch.optim.Adam). If no optimiser is set, it defaults to ADAM. Additional keyword arguments can be set, and these will be passed to the optimiser during instantiation

Returns

Instantiated ModelBuilder

Examples::
>>> new_model_builder = ModelBuilder.from_model_builder(
...     model_builder)
>>>
>>> new_model_builder = ModelBuilder.from_model_builder(
...     model_builder, loss=partial(
...         SignificanceLoss, sig_weight=sig_weight,
...         bkg_weight=bkg_weight, func=calc_ams_torch))
>>>
>>> new_model_builder = ModelBuilder.from_model_builder(
...     'weights/model_builder.pkl',
...     opt_args={'opt':'sgd', 'momentum':0.8, 'weight_decay':1e-5})
>>>
>>> new_model_builder = ModelBuilder.from_model_builder(
...     'weights/model_builder.pkl',
...     opt_args={'opt':torch.optim.Adam,
...               'momentum':0.8,
...               'weight_decay':1e-5})
get_body(n_in, feat_map)[source]

Construct body module

Return type

AbsBody

Returns

Instantiated body nn.Module

get_head()[source]

Construct head module

Return type

AbsHead

Returns

Instantiated head nn.Module

get_model()[source]

Construct model, loss, and optimiser, optionally loading pretrained weights

Return type

Tuple[Module, Optimizer, Any]

Returns

Instantiated network, optimiser linked to model parameters, and uninstantiated loss

get_out_size()[source]

Get number of outputs of model

Return type

int

Returns

number of outputs of network

get_tail(n_in)[source]

Construct tail module

Return type

Module

Returns

Instantiated tail nn.Module

load_pretrained(model)[source]

Load model weights from pretrained file

Parameters

model (Module) – instantiated model, i.e. return of build_model()

Returns

model with weights loaded

set_lr(lr)[source]

Set learning rate for all model parameters

Parameters

lr (float) – learning rate

Return type

None

Module contents
