
lumin.nn.models.blocks package

Submodules

lumin.nn.models.blocks.body module

class lumin.nn.models.blocks.body.FullyConnected(n_in, feat_map, depth, width, do=0, bn=False, act='relu', res=False, dense=False, growth_rate=0, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, freeze=False)[source]

Bases: lumin.nn.models.blocks.body.AbsBody

Fully connected set of hidden layers. Designed to be passed as a ‘body’ to ModelBuilder. Supports batch normalisation and dropout. Order is dense->activation->BN->DO, except when res is true, in which case the BN is applied after the addition. Can optionally have skip connections between each layer (res=True). Alternatively, layers can be concatenated (dense=True). The growth_rate parameter can be used to adjust the width of layers according to width+(width*(depth-1)*growth_rate)

Parameters
  • n_in (int) – number of inputs to the block

  • feat_map (Dict[str, List[int]]) – dictionary mapping input features to the model to outputs of head block

  • depth (int) – number of hidden layers. If res==True and depth is even, depth will be increased by one.

  • width (int) – base width of each hidden layer

  • do (float) – if not None will add dropout layers with dropout rate do

  • bn (bool) – whether to use batch normalisation

  • act (str) – string representation of argument to pass to lookup_act

  • res (bool) – whether to add an additive skip connection every two dense layers. Mutually exclusive with dense.

  • dense (bool) – whether to perform layer-wise concatenations after every layer. Mutually exclusive with res.

  • growth_rate (int) – rate at which width of dense layers should increase with depth beyond the initial layer. Ignored if res=True. Can be negative.

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.

  • lookup_act (Callable[[str], Any]) – function taking choice of activation function and returning an activation function layer

  • freeze (bool) – whether to start with module parameters set to untrainable

Examples::
>>> body = FullyConnected(n_in=32, feat_map=head.feat_map, depth=4,
...                       width=100, act='relu')
>>>
>>> body = FullyConnected(n_in=32, feat_map=head.feat_map, depth=4,
...                       width=200, act='relu', growth_rate=-0.3)
>>>
>>> body = FullyConnected(n_in=32, feat_map=head.feat_map, depth=4,
...                       width=100, act='swish', do=0.1, res=True)
>>>
>>> body = FullyConnected(n_in=32, feat_map=head.feat_map, depth=6,
...                       width=32, act='selu', dense=True,
...                       growth_rate=0.5)
>>>
>>> body = FullyConnected(n_in=32, feat_map=head.feat_map, depth=6,
...                       width=50, act='prelu', bn=True,
...                       lookup_init=lookup_uniform_init)
forward(x)[source]

Pass tensor through block

Parameters

x (Tensor) – input tensor

Returns

Resulting tensor

Return type

Tensor

get_out_size()[source]

Get width of output layer

Return type

int

Returns

Width of output layer

class lumin.nn.models.blocks.body.MultiBlock(n_in, feat_map, blocks, feats_per_block, bottleneck_sz=0, bottleneck_act=None, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, freeze=False)[source]

Bases: lumin.nn.models.blocks.body.AbsBody

Body block allowing outputs of the head block to be split amongst a series of body blocks. Output is the concatenation of all sub-body blocks. Optionally, single-neuron ‘bottleneck’ layers can be used to pass an input to each sub-block based on a learned function of the input features that block would otherwise not receive, i.e. a highly compressed representation of the rest of the feature space.

Parameters
  • n_in (int) – number of inputs to the block

  • feat_map (Dict[str, List[int]]) – dictionary mapping input features to the model to outputs of head block

  • blocks (List[partial]) – list of uninstantiated AbsBody blocks to which to pass a subsection of the total inputs. Note that partials should be used to set any relevant parameters at initialisation time

  • feats_per_block (List[List[str]]) – list of lists of names of features to pass to each AbsBody; note that the feat_map provided by AbsHead will map features to their relevant head outputs

  • bottleneck – if true, each block will receive the output of a single neuron which takes as input all the features which each given block does not directly take as inputs

  • bottleneck_act (Optional[str]) – if set to a string representation of an activation function, the output of each bottleneck neuron will be passed through the defined activation function before being passed to their associated blocks

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.

  • lookup_act (Callable[[str], Any]) – function taking choice of activation function and returning an activation function layer

  • freeze (bool) – whether to start with module parameters set to untrainable

Examples::
>>> body = MultiBlock(
...     blocks=[partial(FullyConnected, depth=1, width=50, act='swish'),
...             partial(FullyConnected, depth=6, width=55, act='swish',
...                     dense=True, growth_rate=-0.1)],
...     feats_per_block=[[f for f in train_feats if 'DER_' in f],
...                      [f for f in train_feats if 'PRI_' in f]])
>>>
>>> body = MultiBlock(
...     blocks=[partial(FullyConnected, depth=1, width=50, act='swish'),
...     partial(FullyConnected, depth=6, width=55, act='swish',
...             dense=True, growth_rate=-0.1)],
...     feats_per_block=[[f for f in train_feats if 'DER_' in f],
...                      [f for f in train_feats if 'PRI_' in f]],
...     bottleneck=True)
>>>
>>> body = MultiBlock(
...     blocks=[partial(FullyConnected, depth=1, width=50, act='swish'),
...             partial(FullyConnected, depth=6, width=55, act='swish',
...                     dense=True, growth_rate=-0.1)],
...     feats_per_block=[[f for f in train_feats if 'DER_' in f],
...                      [f for f in train_feats if 'PRI_' in f]],
...     bottleneck=True, bottleneck_act='swish')
forward(x)[source]

Pass tensor through block

Parameters

x (Tensor) – input tensor

Returns

Resulting tensor

Return type

Tensor

get_out_size()[source]

Get width of output layer

Return type

int

Returns

Total number of outputs across all blocks

lumin.nn.models.blocks.conv_blocks module

class lumin.nn.models.blocks.conv_blocks.Conv1DBlock(in_c, out_c, kernel_sz, padding='auto', stride=1, act='relu', bn=False, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>)[source]

Bases: torch.nn.modules.module.Module

Basic building block for building and applying a single 1D convolutional layer.

Parameters
  • in_c (int) – number of input channels (number of features per object / rows in input matrix)

  • out_c (int) – number of output channels (number of features / rows in output matrix)

  • kernel_sz (int) – width of kernel, i.e. the number of columns to overlay

  • padding (Union[int, str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.

  • stride (int) – number of columns to move the kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.

  • act (str) – string representation of argument to pass to lookup_act

  • bn (bool) – whether to use batch normalisation (default order weights->activation->batchnorm)

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.

  • lookup_act (Callable[[str], Any]) – function taking choice of activation function and returning an activation function layer

Examples::
>>> conv = Conv1DBlock(in_c=3, out_c=16, kernel_sz=3)
>>>
>>> conv = Conv1DBlock(in_c=16, out_c=32, kernel_sz=3, stride=2)
>>>
>>> conv = Conv1DBlock(in_c=3, out_c=16, kernel_sz=3, act='swish', bn=True)
forward(x)[source]

Passes input through the layers. May need to be overloaded by inheriting classes, depending on the architecture.

Parameters

x (Tensor) – input tensor

Return type

Tensor

Returns

Resulting tensor

get_conv_layer(in_c, out_c, kernel_sz, padding='auto', stride=1, pre_act=False, groups=1)[source]

Builds a sandwich of layers with a single convolutional layer, plus any requested batch norm and activation. Also initialises layers to the requested scheme.

Parameters
  • in_c (int) – number of input channels (number of features per object / rows in input matrix)

  • out_c (int) – number of output channels (number of features / rows in output matrix)

  • kernel_sz (int) – width of kernel, i.e. the number of columns to overlay

  • padding (Union[int, str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.

  • stride (int) – number of columns to move the kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.

  • pre_act (bool) – whether to apply batchnorm and activation layers prior to the weight layer, or afterwards

  • groups (int) – number of blocks of connections from input channels to output channels

Return type

Module
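
Example (a hedged sketch: conv stands for an already-instantiated Conv1DBlock, and the keyword values are purely illustrative)::
>>> layer = conv.get_conv_layer(in_c=16, out_c=32, kernel_sz=3,
...                             stride=2, pre_act=True, groups=1)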

static get_padding(kernel_sz)[source]

Automatically computes the required padding to keep the number of columns equal before and after convolution

Parameters

kernel_sz (int) – width of convolutional kernel

Return type

int

Returns

size of padding
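
Example (a hedged illustration: for a stride-1 ‘same’ convolution the padding is expected to be roughly half the kernel width)::
>>> pad = Conv1DBlock.get_padding(kernel_sz=3)  # expected to be 1 for a width-3 kernel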

set_layers()[source]

One of the main functions to overload when inheriting from this class. By default it calls self.get_conv_layer once, but it can be changed to produce more complicated architectures. Sets self.layers to the constructed architecture.

Return type

None
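
Example (a hedged sketch of overloading set_layers; the stored attribute names self.in_c, self.out_c, self.kernel_sz, self.padding, and self.stride are assumed to mirror the constructor arguments)::
>>> import torch.nn as nn
>>> class DoubleConv1DBlock(Conv1DBlock):
...     def set_layers(self) -> None:
...         # Stack two convolutional sandwiches instead of the default single call to get_conv_layer
...         self.layers = nn.Sequential(
...             self.get_conv_layer(self.in_c, self.out_c, self.kernel_sz,
...                                 padding=self.padding, stride=self.stride),
...             self.get_conv_layer(self.out_c, self.out_c, self.kernel_sz))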

class lumin.nn.models.blocks.conv_blocks.Res1DBlock(in_c, out_c, kernel_sz, padding='auto', stride=1, act='relu', bn=False, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>)[source]

Bases: lumin.nn.models.blocks.conv_blocks.Conv1DBlock

Basic building block for building and applying a pair of residually connected 1D convolutional layers (https://arxiv.org/abs/1512.03385). Batchnorm is applied ‘pre-activation’ as per https://arxiv.org/pdf/1603.05027.pdf, and convolutional shortcuts (again https://arxiv.org/pdf/1603.05027.pdf) are used when the stride of the first layer is greater than 1, or the number of input channels does not equal the number of output channels.

Parameters
  • in_c (int) – number of input channels (number of features per object / rows in input matrix)

  • out_c (int) – number of output channels (number of features / rows in output matrix)

  • kernel_sz (int) – width of kernel, i.e. the number of columns to overlay

  • padding (Union[int, str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.

  • stride (int) – number of columns to move the kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.

  • act (str) – string representation of argument to pass to lookup_act

  • bn (bool) – whether to use batch normalisation (order is pre-activation: batchnorm->activation->weights)

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.

  • lookup_act (Callable[[str], Any]) – function taking choice of activation function and returning an activation function layer

Examples::
>>> conv = Res1DBlock(in_c=16, out_c=16, kernel_sz=3)
>>>
>>> conv = Res1DBlock(in_c=16, out_c=32, kernel_sz=3, stride=2)
>>>
>>> conv = Res1DBlock(in_c=16, out_c=16, kernel_sz=3, act='swish', bn=True)
forward(x)[source]

Passes input through the pair of layers and then adds the resulting tensor to the original input, which may be passed through a shortcut connection if necessary.

Parameters

x (Tensor) – input tensor

Return type

Tensor

Returns

Resulting tensor

set_layers()[source]

Constructs a pair of pre-activation convolutional layers, and a shortcut layer if necessary.

class lumin.nn.models.blocks.conv_blocks.ResNeXt1DBlock(in_c, inter_c, cardinality, out_c, kernel_sz, padding='auto', stride=1, act='relu', bn=False, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>)[source]

Bases: lumin.nn.models.blocks.conv_blocks.Conv1DBlock

Basic building block for building and applying a set of residually connected groups of 1D convolutional layers (https://arxiv.org/abs/1611.05431). Batchnorm is applied ‘pre-activation’ as per https://arxiv.org/pdf/1603.05027.pdf, and convolutional shortcuts (again https://arxiv.org/pdf/1603.05027.pdf) are used when the stride of the first layer is greater than 1, or the number of input channels does not equal the number of output channels.

Parameters
  • in_c (int) – number of input channels (number of features per object / rows in input matrix)

  • inter_c (int) – number of intermediate channels in groups

  • cardinality (int) – number of groups

  • out_c (int) – number of output channels (number of features / rows in output matrix)

  • kernel_sz (int) – width of kernel, i.e. the number of columns to overlay

  • padding (Union[int, str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.

  • stride (int) – number of columns to move the kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.

  • act (str) – string representation of argument to pass to lookup_act

  • bn (bool) – whether to use batch normalisation (order is pre-activation: batchnorm->activation->weights)

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.

  • lookup_act (Callable[[str], Any]) – function taking choice of activation function and returning an activation function layer

Examples::
>>> conv = ResNeXt1DBlock(in_c=32, inter_c=4, cardinality=4, out_c=32, kernel_sz=3)
>>>
>>> conv = ResNeXt1DBlock(in_c=32, inter_c=4, cardinality=4, out_c=32, kernel_sz=3, stride=2)
>>>
>>> conv = ResNeXt1DBlock(in_c=32, inter_c=4, cardinality=4, out_c=32, kernel_sz=3, act='swish', bn=True)
forward(x)[source]

Passes input through the set of layers and then adds the resulting tensor to the original input, which may be passed through a shortcut connection if necessary.

Parameters

x (Tensor) – input tensor

Return type

Tensor

Returns

Resulting tensor

set_layers()[source]

Constructs a set of grouped pre-activation convolutional layers, and a shortcut layer if necessary.

lumin.nn.models.blocks.endcap module

class lumin.nn.models.blocks.endcap.AbsEndcap(model)[source]

Bases: torch.nn.modules.module.Module

Abstract class for constructing a post-training layer which performs further calculation on NN outputs. Used when the NN was trained to some proxy objective.

Parameters

model (Module) – trained Model to wrap
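
Example (a hedged sketch: ExpEndcap, trained_model, and test_inputs are illustrative names, and the wrapped network is assumed to have been trained to predict a log-scaled target)::
>>> import torch
>>> from torch import Tensor
>>> class ExpEndcap(AbsEndcap):
...     def func(self, x:Tensor) -> Tensor:
...         # Undo the log transform applied to the target during training
...         return torch.exp(x)
...
>>> endcap = ExpEndcap(trained_model)
>>> preds = endcap.predict(test_inputs)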

forward(x)[source]

Pass tensor through endcap and compute function

Parameters

x (Tensor) – model output tensor

Returns

Resulting tensor

Return type

Tensor

abstract func(x)[source]

Transformation function to apply to model outputs

Parameters

x (Tensor) – model output tensor

Return type

Tensor

Returns

Resulting tensor

predict(inputs, as_np=True)[source]

Evaluate model on input tensor, and compute function of model outputs

Parameters
  • inputs (Union[ndarray, DataFrame, Tensor]) – input data as Numpy array, Pandas DataFrame, or tensor on device

  • as_np (bool) – whether to return predictions as Numpy array (otherwise tensor)

Return type

Union[ndarray, Tensor]

Returns

Model predictions passed through the endcap function

lumin.nn.models.blocks.head module

class lumin.nn.models.blocks.head.CatEmbHead(cont_feats, do_cont=0, do_cat=0, cat_embedder=None, lookup_init=<function lookup_normal_init>, freeze=False)[source]

Bases: lumin.nn.models.blocks.head.AbsHead

Standard model head for columnar data. Provides inputs for continuous features and embedding matrices for categorical inputs, and uses a dense layer to upscale to width of network body. Designed to be passed as a ‘head’ to ModelBuilder. Supports batch normalisation and dropout (at separate rates for continuous features and categorical embeddings). Continuous features are expected to be the first len(cont_feats) columns of input tensors and categorical features the remaining columns. Embedding arguments for categorical features are set using a CatEmbedder.

Parameters
  • cont_feats (List[str]) – list of names of continuous input features

  • do_cont (float) – if not None will add a dropout layer with dropout rate do acting on the continuous inputs prior to concatenation with the categorical embeddings

  • do_cat (float) – if not None will add a dropout layer with dropout rate do acting on the categorical embeddings prior to concatenation with the continuous inputs

  • cat_embedder (Optional[CatEmbedder]) – CatEmbedder providing details of how to embed categorical inputs

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.

  • freeze (bool) – whether to start with module parameters set to untrainable

Examples::
>>> head = CatEmbHead(cont_feats=cont_feats)
>>>
>>> head = CatEmbHead(cont_feats=cont_feats,
...                   cat_embedder=CatEmbedder.from_fy(train_fy))
>>>
>>> head = CatEmbHead(cont_feats=cont_feats,
...                   cat_embedder=CatEmbedder.from_fy(train_fy),
...                   do_cont=0.1, do_cat=0.05)
>>>
>>> head = CatEmbHead(cont_feats=cont_feats,
...                   cat_embedder=CatEmbedder.from_fy(train_fy),
...                   lookup_init=lookup_uniform_init)
forward(x)[source]

Pass tensor through block

Parameters

x (Tensor) – input tensor

Returns

Resulting tensor

Return type

Tensor

get_embeds()[source]

Get state_dict for every embedding matrix.

Return type

Dict[str, OrderedDict]

Returns

Dictionary mapping categorical features to learned embedding matrix

get_out_size()[source]

Get width of output layer

Return type

int

Returns

Width of output layer

plot_embeds(savename=None, settings=<lumin.plotting.plot_settings.PlotSettings object>)[source]

Plot representations of embedding matrices for each categorical feature.

Parameters
  • savename (Optional[str]) – if not None, will save a copy of the plot to the given path

  • settings (PlotSettings) – PlotSettings class to control figure appearance

Return type

None

save_embeds(path)[source]

Save learned embeddings to path. Each categorical embedding matrix will be saved as a separate state_dict with name equal to the feature name as set in cat_embedder

Parameters

path (Path) – path to which to save embedding weights

Return type

None
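
Example (a hedged sketch: head is an already-instantiated CatEmbHead with a cat_embedder, and the save path is purely illustrative)::
>>> from pathlib import Path
>>> head.save_embeds(Path('train_weights/embeds'))
>>> head.plot_embeds(savename='embed_plot')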

class lumin.nn.models.blocks.head.MultiHead(cont_feats, matrix_head, flat_head=<class 'lumin.nn.models.blocks.head.CatEmbHead'>, cat_embedder=None, lookup_init=<function lookup_normal_init>, freeze=False, **kargs)[source]

Bases: lumin.nn.models.blocks.head.AbsHead

Wrapper head to handle data containing flat continuous and categorical features, and matrix data. Flat inputs are passed through flat_head, and matrix inputs are passed through matrix_head. The outputs of both blocks are then concatenated together. Incoming data can either be: completely flat, in which case the matrix_head should construct its own matrix from the data; or a tuple of flat data and the matrix, in which case the matrix_head will receive the data already in matrix format.

Parameters
  • cont_feats (List[str]) – list of names of continuous and matrix input features

  • matrix_head (Callable[[Any], AbsMatrixHead]) – Uninitialised (partial) head to handle matrix data e.g. InteractionNet

  • flat_head (Callable[[Any], AbsHead]) – Uninitialised (partial) head to handle flat data e.g. CatEmbHead

  • cat_embedder (Optional[CatEmbedder]) – CatEmbedder providing details of how to embed categorical inputs

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.

  • freeze (bool) – whether to start with module parameters set to untrainable

Examples::
>>> inet = partial(InteractionNet, intfunc_depth=2,intfunc_width=4,intfunc_out_sz=3,
...                outfunc_depth=2,outfunc_width=5,outfunc_out_sz=4,agg_method='flatten',
...                feats_per_vec=feats_per_vec,vecs=vecs, act='swish')
>>> multihead = MultiHead(cont_feats=cont_feats+matrix_feats, matrix_head=inet,
...                       cat_embedder=CatEmbedder.from_fy(train_fy))

forward(x)[source]

Pass incoming data through flat and matrix heads. If x is a Tuple then the first element is passed to the flat head and the second is sent to the matrix head. Otherwise the elements corresponding to flat data are sent to the flat head and the elements corresponding to matrix data are sent to the matrix head.

Parameters

x (Union[Tensor, Tuple[Tensor, Tensor]]) – input data as either a flat Tensor or a Tuple of the form [flat Tensor, matrix Tensor]

Return type

Tensor

Returns

Concatenated output of flat and matrix heads

get_out_size()[source]

Get size of output

Return type

int

Returns

Output size of flat head + output size of matrix head

class lumin.nn.models.blocks.head.InteractionNet(cont_feats, vecs, feats_per_vec, intfunc_depth, intfunc_width, intfunc_out_sz, outfunc_depth, outfunc_width, outfunc_out_sz, agg_method, do=0, bn=False, act='relu', lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, freeze=False, **kargs)[source]

Bases: lumin.nn.models.blocks.head.AbsMatrixHead

Implementation of the Interaction Graph-Network (https://arxiv.org/abs/1612.00222). Shown to be applicable for embedding many 4-momenta in e.g. https://arxiv.org/abs/1908.05318

Incoming data can either be flat, in which case it is reshaped into a matrix, or be supplied directly in column-wise matrix form. Matrices should/will be column-wise: each column is a separate object (e.g. particle and jet) and each row is a feature (e.g. energy and momentum component). Matrix elements are expected to be named according to {object}_{feature}, e.g. photon_energy. vecs (vectors) should then be a list of objects, i.e. column headers, feature prefixes. feats_per_vec should be a list of features, i.e. row headers, feature suffixes.

Note

To allow for the fact that there may be nonexistent features (e.g. z-component of missing energy), cont_feats should be a list of all matrix features which really do exist (i.e. are present in input data), and be in the same order as the incoming data. Nonexistent features will be set to zero.

The penultimate stage of processing in the interaction net is a matrix; this must be processed into a flat tensor. agg_method controls how this is done: ‘sum’ will sum over the embedded representations of each object, meaning that the objects can be placed in any order, however some information will be lost during the aggregation. ‘flatten’ will flatten out the matrix, preserving all the information, however the objects must be placed in a consistent order each time. Additionally, the ‘flatten’ mode can potentially become quite large if many objects are embedded. A future compromise might be to feed the embeddings into a recurrent layer to provide a smaller output which preserves more information than the summing.

Parameters
  • cont_feats (List[str]) – list of all the matrix features which are present in the input data

  • vecs (List[str]) – list of objects, i.e. column headers, feature prefixes

  • feats_per_vec (List[str]) – list of features per object, i.e. row headers, feature suffixes

  • intfunc_depth (int) – number of layers in the interaction-representation network

  • intfunc_width (int) – width of hidden layers in the interaction-representation network

  • intfunc_out_sz (int) – width of output layer of the interaction-representation network, i.e. the size of each interaction representation

  • outfunc_depth (int) – number of layers in the post-interaction network

  • outfunc_width (int) – width of hidden layers in the post-interaction network

  • outfunc_out_sz (int) – width of output layer of the post-interaction network, i.e. the size of each output representation

  • agg_method (str) – how to transform the output matrix, currently either ‘sum’ to sum across objects, or ‘flatten’ to flatten out the matrix

  • do (float) – dropout rate to be applied to hidden layers in the interaction-representation and post-interaction networks

  • bn (bool) – whether batch normalisation should be applied to hidden layers in the interaction-representation and post-interaction networks

  • act (str) – activation function to apply to hidden layers in the interaction-representation and post-interaction networks

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.

  • lookup_act (Callable[[str], Any]) – function taking choice of activation function and returning an activation function layer

  • freeze (bool) – whether to start with module parameters set to untrainable

Examples::
>>> inet = InteractionNet(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs,
...                       intfunc_depth=2,intfunc_width=4,intfunc_out_sz=3,
...                       outfunc_depth=2,outfunc_width=5,outfunc_out_sz=4,agg_method='flatten')
>>>
>>> inet = InteractionNet(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs,
...                       intfunc_depth=2,intfunc_width=4,intfunc_out_sz=6,
...                       outfunc_depth=2,outfunc_width=5,outfunc_out_sz=8,agg_method='sum')
>>>
>>> inet = InteractionNet(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs,
...                       intfunc_depth=3,intfunc_width=4,intfunc_out_sz=3,
...                       outfunc_depth=3,outfunc_width=5,outfunc_out_sz=4,agg_method='flatten',
...                       do=0.1, bn=True, act='swish', lookup_init=lookup_uniform_init)
forward(x)[source]

Passes input through the interaction network and aggregates the output down to a flat tensor.

Parameters

x (Union[Tensor, Tuple[Tensor, Tensor]]) – If a tuple, the second element is assumed to be the matrix data. If a flat tensor, will convert the data to a matrix

Return type

Tensor

Returns

Resulting tensor

get_out_size()[source]

Get size of output

Return type

int

Returns

Width of output representation

class lumin.nn.models.blocks.head.RecurrentHead(cont_feats, vecs, feats_per_vec, depth, width, bidirectional=False, rnn=<class 'torch.nn.modules.rnn.RNN'>, do=0.0, act='tanh', stateful=False, freeze=False, **kargs)[source]

Bases: lumin.nn.models.blocks.head.AbsMatrixHead

Recurrent head for row-wise matrix data applying e.g. RNN, LSTM, GRU.

Incoming data can either be flat, in which case it is reshaped into a matrix, or be supplied directly in matrix form. Matrices should/will be row-wise: each row is a separate object (e.g. particle and jet) and each column is a feature (e.g. energy and momentum component). Matrix elements are expected to be named according to {object}_{feature}, e.g. photon_energy. vecs (vectors) should then be a list of objects, i.e. row headers, feature prefixes. feats_per_vec should be a list of features, i.e. column headers, feature suffixes.

Note

To allow for the fact that there may be nonexistent features (e.g. z-component of missing energy), cont_feats should be a list of all matrix features which really do exist (i.e. are present in input data), and be in the same order as the incoming data. Nonexistent features will be set to zero.

Parameters
  • cont_feats (List[str]) – list of all the matrix features which are present in the input data

  • vecs (List[str]) – list of objects, i.e. row headers, feature prefixes

  • feats_per_vec (List[str]) – list of features per object, i.e. column headers, feature suffixes

  • depth (int) – number of hidden layers to use

  • width (int) – size of each hidden state

  • bidirectional (bool) – whether to set recurrent layers to be bidirectional

  • rnn (RNNBase) – module class to use for the recurrent layer, e.g. torch.nn.RNN, torch.nn.LSTM, torch.nn.GRU

  • do (float) – dropout rate to be applied to hidden layers

  • act (str) – activation function to apply to hidden layers, only used if rnn expects a nonlinearity

  • stateful (bool) – whether to return all intermediate hidden states, or only the final hidden states

  • freeze (bool) – whether to start with module parameters set to untrainable

Examples::
>>> rnn = RecurrentHead(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs, depth=1, width=20)
>>>
>>> rnn = RecurrentHead(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs,
...                     depth=2, width=10, act='relu', bidirectional=True)
>>>
>>> lstm = RecurrentHead(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs,
...                      depth=1, width=10, rnn=nn.LSTM)
>>>
>>> gru = RecurrentHead(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs,
...                     depth=3, width=10, rnn=nn.GRU, bidirectional=True)
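
A stateful variant (a hedged sketch: returns all intermediate hidden states rather than only the final one, using the same illustrative feature names as above)::
>>> gru = RecurrentHead(cont_feats=matrix_feats, feats_per_vec=feats_per_vec, vecs=vecs,
...                     depth=1, width=10, rnn=nn.GRU, stateful=True)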
forward(x)[source]

Passes input through the recurrent network.

Parameters

x (Union[Tensor, Tuple[Tensor, Tensor]]) – If a tuple, the second element is assumed to be the matrix data. If a flat tensor, will convert the data to a matrix

Return type

Tensor

Returns

If stateful, returns all hidden states; otherwise only returns the final hidden state

get_out_size()[source]

Get size of output

Return type

Union[int, Tuple[int, int]]

Returns

Width of output representation, or shape of output if stateful

class lumin.nn.models.blocks.head.AbsConv1dHead(cont_feats, vecs, feats_per_vec, act='relu', bn=False, layer_kargs=None, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, freeze=False, **kargs)[source]

Bases: lumin.nn.models.blocks.head.AbsMatrixHead

Abstract wrapper head for applying 1D convolutions to column-wise matrix data. Users should inherit from this class and overload get_layers() to define their model. Some common convolutional layers are already defined (e.g. Conv1DBlock and ResNeXt1DBlock), which are accessible using methods such as get_conv1d_block(). For more complicated models, forward() can also be overwritten. The output size of the block is automatically computed during initialisation by passing through random pseudodata.

Incoming data can either be flat, in which case it is reshaped into a matrix, or be supplied directly in matrix form. Matrices should/will be column-wise: each column is a separate object (e.g. particle and jet) and each row is a feature (e.g. energy and momentum component). Matrix elements are expected to be named according to {object}_{feature}, e.g. photon_energy. vecs (vectors) should then be a list of objects, i.e. column headers, feature prefixes. feats_per_vec should be a list of features, i.e. row headers, feature suffixes.

Note

To allow for the fact that there may be nonexistent features (e.g. z-component of missing energy), cont_feats should be a list of all matrix features which really do exist (i.e. are present in input data), and be in the same order as the incoming data. Nonexistent features will be set to zero.

Parameters
  • cont_feats (List[str]) – list of all the matrix features which are present in the input data

  • vecs (List[str]) – list of objects, i.e. column headers, feature prefixes

  • feats_per_vec (List[str]) – list of features per object, i.e. row headers, feature suffixes

  • act (str) – activation function passed to get_layers

  • bn (bool) – batch normalisation argument passed to get_layers

  • layer_kargs (Optional[Dict[str, Any]]) – dictionary of keyword arguments which are passed to get_layers

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.

  • freeze (bool) – whether to start with module parameters set to untrainable

Examples::
>>> class MyCNN(AbsConv1dHead):
...     def get_layers(self, act:str='relu', bn:bool=False, **kargs) -> nn.Module:
...         layers = []
...         layers.append(self.get_conv1d_block(3, 16, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_block(16, 16, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_block(16, 32, stride=2, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_block(32, 32, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(nn.AdaptiveAvgPool1d(1))
...         layers = nn.Sequential(*layers)
...         return layers
...
>>> cnn = MyCNN(cont_feats=matrix_feats, vecs=vectors, feats_per_vec=feats_per_vec)
>>>
>>> class MyResNet(AbsConv1dHead):
...     def get_layers(self, act:str='relu', bn:bool=False, **kargs) -> nn.Module:
...         layers = []
...         layers.append(self.get_conv1d_block(3, 16, stride=1, kernel_sz=3, act='linear', bn=False))
...         layers.append(self.get_conv1d_res_block(16, 16, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_res_block(16, 32, stride=2, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_res_block(32, 32, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(nn.AdaptiveAvgPool1d(1))
...         layers = nn.Sequential(*layers)
...         return layers
...
>>> cnn = MyResNet(cont_feats=matrix_feats, vecs=vectors, feats_per_vec=feats_per_vec)
>>>
>>> class MyResNeXt(AbsConv1dHead):
...     def get_layers(self, act:str='relu', bn:bool=False, **kargs) -> nn.Module:
...         layers = []
...         layers.append(self.get_conv1d_block(3, 32, stride=1, kernel_sz=3, act='linear', bn=False))
...         layers.append(self.get_conv1d_resNeXt_block(32, 4, 4, 32, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_resNeXt_block(32, 4, 4, 32, stride=2, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_resNeXt_block(32, 4, 4, 32, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(nn.AdaptiveAvgPool1d(1))
...         layers = nn.Sequential(*layers)
...         return layers
...
>>> cnn = MyResNeXt(cont_feats=matrix_feats, vecs=vectors, feats_per_vec=feats_per_vec)
check_out_sz()[source]

Automatically computes the output size of the head by passing through random data of the expected shape

Return type

int

Returns

x.size(-1) where x is the outgoing tensor from the head

forward(x)[source]

Passes input through the convolutional network.

Parameters

x (Union[Tensor, Tuple[Tensor, Tensor]]) – If a tuple, the second element is assumed to be the matrix data. If a flat tensor, will convert the data to a matrix

Return type

Tensor

Returns

Resulting tensor

get_conv1d_block(in_c, out_c, kernel_sz, padding='auto', stride=1, act='relu', bn=False)[source]

Wrapper method to build a Conv1DBlock object.

Parameters
  • in_c (int) – number of input channels (number of features per object / rows in input matrix)

  • out_c (int) – number of output channels (number of features / rows in output matrix)

  • kernel_sz (int) – width of kernel, i.e. the number of columns to overlay

  • padding (Union[int, str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.

  • stride (int) – number of columns to move the kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.

  • act (str) – string representation of argument to pass to lookup_act

  • bn (bool) – whether to use batch normalisation (order is weights->activation->batchnorm)

Return type

Conv1DBlock

Returns

Instantiated Conv1DBlock object

get_conv1d_resNeXt_block(in_c, inter_c, cardinality, out_c, kernel_sz, padding='auto', stride=1, act='relu', bn=False)[source]

Wrapper method to build a ResNeXt1DBlock object.

Parameters
  • in_c (int) – number of input channels (number of features per object / rows in input matrix)

  • inter_c (int) – number of intermediate channels in groups

  • cardinality (int) – number of groups

  • out_c (int) – number of output channels (number of features / rows in output matrix)

  • kernel_sz (int) – width of kernel, i.e. the number of columns to overlay

  • padding (Union[int, str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.

  • stride (int) – number of columns to move the kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.

  • act (str) – string representation of argument to pass to lookup_act

  • bn (bool) – whether to use batch normalisation (order is pre-activation: batchnorm->activation->weights)

Return type

ResNeXt1DBlock

Returns

Instantiated ResNeXt1DBlock object

get_conv1d_res_block(in_c, out_c, kernel_sz, padding='auto', stride=1, act='relu', bn=False)[source]

Wrapper method to build a Res1DBlock object.

Parameters
  • in_c (int) – number of input channels (number of features per object / rows in input matrix)

  • out_c (int) – number of output channels (number of features / rows in output matrix)

  • kernel_sz (int) – width of kernel, i.e. the number of columns to overlay

  • padding (Union[int, str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.

  • stride (int) – number of columns to move the kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.

  • act (str) – string representation of argument to pass to lookup_act

  • bn (bool) – whether to use batch normalisation (order is pre-activation: batchnorm->activation->weights)

Return type

Res1DBlock

Returns

Instantiated Res1DBlock object

abstract get_layers(in_c, act='relu', bn=False, **kargs)[source]

Abstract function to be overloaded by user. Should return a single torch.nn.Module which accepts the expected input matrix data.

Return type

Module

get_out_size()[source]

Get size of output

Return type

int

Returns

Width of output representation

class lumin.nn.models.blocks.head.LorentzBoostNet(cont_feats, vecs, feats_per_vec, n_particles, feat_extractor=None, bn=True, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, freeze=False, **kargs)[source]

Bases: lumin.nn.models.blocks.head.AbsMatrixHead

Implementation of the Lorentz Boost Network (https://arxiv.org/abs/1812.09722), which takes 4-momenta for particles and learns new particles and reference frames from linear combinations of the original particles, and then boosts the new particles into the learned reference frames. Preset kernel functions are then run over the 4-momenta of the boosted particles to compute a set of variables per particle. These functions can be based on pairs etc. of particles, e.g. angles between particles. (LorentzBoostNet.comb provides an index iterator over all pairs of particles).

A default feature extractor is provided which returns the (px,py,pz,E) of the boosted particles and the cosine angle between every pair of boosted particles. This can be overwritten by passing a function to the feat_extractor argument during initialisation, or by overriding LorentzBoostNet.feat_extractor.

Important

4-momenta should be supplied without preprocessing, and 4-momenta must be physical (E>=|p|). It is up to the user to ensure this, and not doing so may result in errors. A BatchNorm argument (bn) is available to preprocess the features extracted from the boosted particles prior to returning them.

Incoming data can either be flat, in which case it is reshaped into a matrix, or be supplied directly in row-wise matrix form. Matrices should/will be row-wise: each row is a separate 4-momentum in the form (px,py,pz,E). Matrix elements are expected to be named according to {particle}_{feature}, e.g. photon_E. vecs (vectors) should then be a list of particles, i.e. row headers, feature prefixes. feats_per_vec should be a list of features, i.e. column headers, feature suffixes.

Note

To allow for the fact that there may be nonexistent features (e.g. z-component of missing energy), cont_feats should be a list of all matrix features which really do exist (i.e. are present in input data), and be in the same order as the incoming data. Nonexistent features will be set to zero.

Parameters
  • cont_feats (List[str]) – list of all the matrix features which are present in the input data

  • vecs (List[str]) – list of objects, i.e. row headers, feature prefixes

  • feats_per_vec (List[str]) – list of features per object, i.e. column headers, feature suffixes

  • n_particles (int) – the number of particles and reference frames to learn

  • feat_extractor (Optional[Callable[[Tensor], Tensor]]) – if not None, will use the argument as the function to extract features from the 4-momenta of the boosted particles.

  • bn (bool) – whether batch normalisation should be applied to the extracted features

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights. Purely for inheritance, unused by class as is.

  • lookup_act (Callable[[str], Any]) – function taking choice of activation function and returning an activation function layer. Purely for inheritance, unused by class as is.

  • freeze (bool) – whether to start with module parameters set to untrainable.

Examples::
>>> lbn = LorentzBoostNet(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs, n_particles=6)
>>>
>>> def feat_extractor(x:Tensor) -> Tensor:  # Return masses of boosted particles, x dimensions = [batch,particle,4-mom]
...     momenta,energies =  x[:,:,:3], x[:,:,3:]
...     mass = torch.sqrt((energies**2)-torch.sum(momenta**2, dim=-1)[:,:,None])
...     return mass
>>> lbn = LorentzBoostNet(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs, n_particles=6, feat_extractor=feat_extractor)
check_out_sz()[source]

Automatically computes the output size of the head by passing through random data of the expected shape

Return type

int

Returns

x.size(-1) where x is the outgoing tensor from the head

feat_extractor(x)[source]

Computes features from boosted particle 4-momenta. Incoming tensor x contains all 4-momenta for all particles for all datapoints in minibatch. Default function returns 4-momenta and cosine angle between all particles.

Parameters

x (Tensor) – 3D incoming tensor with dimensions: [batch, particle, 4-mom (px,py,pz,E)]

Return type

Tensor

Returns

2D tensor with dimensions [batch, features]

forward(x)[source]

Passes input through the LB network and aggregates down to a flat tensor via the feature extractor, optionally passing through a batchnorm layer.

Parameters

x (Union[Tensor, Tuple[Tensor, Tensor]]) – If a tuple, the second element is assumed to be the matrix data. If a flat tensor, will convert the data to a matrix

Return type

Tensor

Returns

Resulting tensor

get_out_size()[source]

Get size of output

Return type

int

Returns

Width of output representation

class lumin.nn.models.blocks.head.AutoExtractLorentzBoostNet(cont_feats, vecs, feats_per_vec, n_particles, depth, width, n_singles=0, n_pairs=0, act='swish', do=0, bn=False, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, freeze=False, **kargs)[source]

Bases: lumin.nn.models.blocks.head.LorentzBoostNet

Modified version of LorentzBoostNet (an implementation of the Lorentz Boost Network (https://arxiv.org/abs/1812.09722)). Rather than relying on fixed kernel functions to extract features from the boosted particles, the functions are learnt during training via neural networks.

Two networks are used, one to extract n_singles features from each particle and another to extract n_pairs features from each pair of particles.

Important

4-momenta should be supplied without preprocessing, and 4-momenta must be physical (E>=|p|). It is up to the user to ensure this, and not doing so may result in errors. A BatchNorm argument (bn) is available to preprocess the 4-momenta of the boosted particles prior to passing them through the neural networks

Incoming data can either be flat, in which case it is reshaped into a matrix, or be supplied directly in row-wise matrix form. Matrices should/will be row-wise: each row is a separate 4-momentum in the form (px,py,pz,E). Matrix elements are expected to be named according to {particle}_{feature}, e.g. photon_E. vecs (vectors) should then be a list of particles, i.e. row headers, feature prefixes. feats_per_vec should be a list of features, i.e. column headers, feature suffixes.

Note

To allow for the fact that there may be nonexistent features (e.g. z-component of missing energy), cont_feats should be a list of all matrix features which really do exist (i.e. are present in input data), and be in the same order as the incoming data. Nonexistent features will be set to zero.

Parameters
  • cont_feats (List[str]) – list of all the matrix features which are present in the input data

  • vecs (List[str]) – list of objects, i.e. row headers, feature prefixes

  • feats_per_vec (List[str]) – list of features per object, i.e. column headers, feature suffixes

  • n_particles (int) – the number of particles and reference frames to learn

  • depth (int) – the number of hidden layers in each network

  • width (int) – the number of neurons per hidden layer

  • n_singles (int) – the number of features to extract per individual particle

  • n_pairs (int) – the number of features to extract per pair of particles

  • act (str) – string representation of argument to pass to lookup_act. Activation should ideally have non-zero outputs to help deal with poorly normalised inputs

  • do (float) – dropout rate for use in networks

  • bn (bool) – whether to use batch normalisation within networks. Inputs are passed through BN regardless of setting.

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.

  • lookup_act (Callable[[str], Any]) – function taking choice of activation function and returning an activation function layer.

  • freeze (bool) – whether to start with module parameters set to untrainable.

Examples::
>>> aelbn = AutoExtractLorentzBoostNet(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs, n_particles=6,
...                                     depth=3, width=10, n_singles=3, n_pairs=2)
feat_extractor(x)[source]

Computes features from boosted particle 4-momenta. Incoming tensor x contains all 4-momenta for all particles for all datapoints in the minibatch. single_nn is broadcast across all boosted particles, and pair_nn is broadcast across all pairs of particles. Returned features are concatenated together.

Parameters

x (Tensor) – 3D incoming tensor with dimensions: [batch, particle, 4-mom (px,py,pz,E)]

Return type

Tensor

Returns

2D tensor with dimensions [batch, features]

lumin.nn.models.blocks.tail module

class lumin.nn.models.blocks.tail.ClassRegMulti(n_in, n_out, objective, y_range=None, bias_init=None, y_mean=None, y_std=None, lookup_init=<function lookup_normal_init>, freeze=False)[source]

Bases: lumin.nn.models.blocks.tail.AbsTail

Output block for (multi(class/label)) classification or regression tasks. Designed to be passed as a ‘tail’ to ModelBuilder. Takes the output size of the network body and scales it to the required number of outputs. For regression tasks, y_range can be set with per-output minima and maxima. The outputs are then adjusted according to ((y_max-y_min)*x)+y_min, where x is the output of the network passed through a sigmoid function. This effectively allows regression to be performed without normalising and standardising the target values. Note it is safest to allow some leeway in setting the min and max, e.g. max = 1.2*max, min = 0.8*min. Output activation function is automatically set according to objective and y_range.

Parameters
  • n_in (int) – number of inputs to expect

  • n_out (int) – number of outputs required

  • objective (str) – string representation of network objective, i.e. ‘classification’, ‘regression’, ‘multiclass’

  • y_range (Union[Tuple, ndarray, None]) – if not None, will apply rescaling to network outputs: x = ((y_range[1]-y_range[0])*sigmoid(x))+y_range[0]. Incompatible with y_mean and y_std

  • bias_init (Optional[float]) – specify an initial bias for the output neurons. Otherwise default values of 0 are used, except for multiclass objectives, which use 1/n_out

  • y_mean (Union[float, List[float], ndarray, None]) – if specified along with y_std, will apply rescaling to network outputs: x = (y_std*x)+y_mean. Incompatible with y_range

  • y_std (Union[float, List[float], ndarray, None]) – if specified along with y_mean, will apply rescaling to network outputs: x = (y_std*x)+y_mean. Incompatible with y_range

  • lookup_init (Callable[[str, Optional[int], Optional[int]], Callable[[Tensor], None]]) – function taking string representation of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.

Examples::
>>> tail = ClassRegMulti(n_in=100, n_out=1, objective='classification')
>>>
>>> tail = ClassRegMulti(n_in=100, n_out=5, objective='multiclass')
>>>
>>> y_range = (0.8*targets.min(), 1.2*targets.max())
>>> tail = ClassRegMulti(n_in=100, n_out=1, objective='regression',
...                      y_range=y_range)
>>>
>>> min_targs = np.min(targets, axis=0).reshape(targets.shape[1],1)
>>> max_targs = np.max(targets, axis=0).reshape(targets.shape[1],1)
>>> min_targs[min_targs > 0] *=0.8
>>> min_targs[min_targs < 0] *=1.2
>>> max_targs[max_targs > 0] *=1.2
>>> max_targs[max_targs < 0] *=0.8
>>> y_range = np.hstack((min_targs, max_targs))
>>> tail = ClassRegMulti(n_in=100, n_out=6, objective='regression',
...                      y_range=y_range,
...                      lookup_init=lookup_uniform_init)
forward(x)[source]

Pass tensor through block

Parameters

x (Tensor) – input tensor

Returns

Resulting tensor

Return type

Tensor

get_out_size()[source]

Get width of output layer

Return type

int

Returns

Width of output layer

Module contents
