lumin.nn.models.blocks package
Submodules
lumin.nn.models.blocks.body module
- class lumin.nn.models.blocks.body.FullyConnected(n_in, feat_map, depth, width, do=0, bn=False, act='relu', res=False, dense=False, growth_rate=0, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, freeze=False, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>)
Bases: AbsBody
Fully connected set of hidden layers. Designed to be passed as a ‘body’ to ModelBuilder. Supports batch normalisation and dropout. Order is dense->activation->BN->DO, except when res is true, in which case the BN is applied after the addition. Can optionally have skip connections between each layer (res=True). Alternatively, can concatenate layers (dense=True). The growth_rate parameter can be used to adjust the width of layers according to width+(width*(depth-1)*growth_rate).
- Parameters:
n_in (int) – number of inputs to the block
feat_map (Dict[str,List[int]]) – dictionary mapping input features to the model to outputs of head block
depth (int) – number of hidden layers. If res==True and depth is even, depth will be increased by one.
width (int) – base width of each hidden layer
do (float) – if not None will add dropout layers with dropout rates do
bn (bool) – whether to use batch normalisation
act (str) – string representation of argument to pass to lookup_act
res (bool) – whether to add an additive skip connection every two dense layers. Mutually exclusive with dense.
dense (bool) – whether to perform layer-wise concatenations after every layer. Mutually exclusive with res.
growth_rate (int) – rate at which width of dense layers should increase with depth beyond the initial layer. Ignored if res=True. Can be negative.
lookup_init (Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
lookup_act (Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer
freeze (bool) – whether to start with module parameters set to untrainable
bn_class (Callable[[int],Module]) – class to use for BatchNorm, default is nn.BatchNorm1d
- Examples::
>>> body = FullyConnected(n_in=32, feat_map=head.feat_map, depth=4,
...                       width=100, act='relu')
>>>
>>> body = FullyConnected(n_in=32, feat_map=head.feat_map, depth=4,
...                       width=200, act='relu', growth_rate=-0.3)
>>>
>>> body = FullyConnected(n_in=32, feat_map=head.feat_map, depth=4,
...                       width=100, act='swish', do=0.1, res=True)
>>>
>>> body = FullyConnected(n_in=32, feat_map=head.feat_map, depth=6,
...                       width=32, act='selu', dense=True,
...                       growth_rate=0.5)
>>>
>>> body = FullyConnected(n_in=32, feat_map=head.feat_map, depth=6,
...                       width=50, act='prelu', bn=True,
...                       lookup_init=lookup_uniform_init)
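The growth_rate formula quoted in the class description can be tabulated in plain Python. This is a minimal sketch rather than LUMIN's implementation: layer_widths is a hypothetical helper, it reads depth in the formula as the 1-based index of the layer, and it assumes widths are rounded to the nearest integer.

```python
from typing import List


def layer_widths(width: int, depth: int, growth_rate: float = 0.0) -> List[int]:
    """Width of each hidden layer under width + width*(d-1)*growth_rate.

    Assumptions: d is the 1-based layer index and widths are rounded
    to the nearest integer.
    """
    return [round(width + width * (d - 1) * growth_rate) for d in range(1, depth + 1)]


# Settings taken from two of the examples above (hypothetical helper, not a LUMIN call)
shrinking = layer_widths(width=200, depth=4, growth_rate=-0.3)
growing = layer_widths(width=32, depth=6, growth_rate=0.5)
print(shrinking)
print(growing)
```

Under these assumptions a negative growth_rate narrows the network layer by layer, while a positive one widens it.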
- class lumin.nn.models.blocks.body.IdentBody(n_in, feat_map, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, freeze=False, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>)
Bases: AbsBody
Placeholder body module for cases in which a body is not required. Outputs are equal to inputs.
- class lumin.nn.models.blocks.body.MultiBlock(n_in, feat_map, blocks, feats_per_block, bottleneck_sz=0, bottleneck_act=None, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, freeze=False)
Bases: AbsBody
Body block allowing outputs of head block to be split amongst a series of body blocks. Output is the concatenation of all sub-body blocks. Optionally, single-neuron ‘bottleneck’ layers can be used to pass an input to each sub-block based on a learned function of the input features that block would otherwise not receive, i.e. a highly compressed representation of the rest of the feature space.
- Parameters:
n_in (int) – number of inputs to the block
feat_map (Dict[str,List[int]]) – dictionary mapping input features to the model to outputs of head block
blocks (List[partial]) – list of uninstantiated AbsBody blocks to which to pass a subsection of the total inputs. Note that partials should be used to set any relevant parameters at initialisation time
feats_per_block (List[List[str]]) – list of lists of names of features to pass to each AbsBody. Note that the feat_map provided by AbsHead will map features to their relevant head outputs
bottleneck – if true, each block will receive the output of a single neuron which takes as input all the features which each given block does not directly take as inputs
bottleneck_act (Optional[str]) – if set to a string representation of an activation function, the output of each bottleneck neuron will be passed through the defined activation function before being passed to their associated blocks
lookup_init (Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
lookup_act (Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer
freeze (bool) – whether to start with module parameters set to untrainable
- Examples::
>>> body = MultiBlock(
...     blocks=[partial(FullyConnected, depth=1, width=50, act='swish'),
...             partial(FullyConnected, depth=6, width=55, act='swish',
...                     dense=True, growth_rate=-0.1)],
...     feats_per_block=[[f for f in train_feats if 'DER_' in f],
...                      [f for f in train_feats if 'PRI_' in f]])
>>>
>>> body = MultiBlock(
...     blocks=[partial(FullyConnected, depth=1, width=50, act='swish'),
...             partial(FullyConnected, depth=6, width=55, act='swish',
...                     dense=True, growth_rate=-0.1)],
...     feats_per_block=[[f for f in train_feats if 'DER_' in f],
...                      [f for f in train_feats if 'PRI_' in f]],
...     bottleneck=True)
>>>
>>> body = MultiBlock(
...     blocks=[partial(FullyConnected, depth=1, width=50, act='swish'),
...             partial(FullyConnected, depth=6, width=55, act='swish',
...                     dense=True, growth_rate=-0.1)],
...     feats_per_block=[[f for f in train_feats if 'DER_' in f],
...                      [f for f in train_feats if 'PRI_' in f]],
...     bottleneck=True, bottleneck_act='swish')
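How feat_map and feats_per_block might combine to route head outputs to each sub-block can be sketched without LUMIN. The helper split_inputs and the toy feature names below are hypothetical. The sketch only computes index lists: the inputs each block receives directly, and the remaining inputs that a bottleneck neuron for that block would summarise.

```python
from typing import Dict, List, Tuple


def split_inputs(feat_map: Dict[str, List[int]],
                 feats_per_block: List[List[str]]) -> List[Tuple[List[int], List[int]]]:
    """For each block, return (indices it receives, indices it does not receive)."""
    all_idxs = sorted(i for idxs in feat_map.values() for i in idxs)
    splits = []
    for feats in feats_per_block:
        own = sorted(i for f in feats for i in feat_map[f])
        own_set = set(own)
        rest = [i for i in all_idxs if i not in own_set]  # candidate bottleneck inputs
        splits.append((own, rest))
    return splits


# Toy feature map (assumed): one feature occupies two head outputs, e.g. an embedding
feat_map = {'DER_mass': [0], 'DER_pt': [1], 'PRI_eta': [2], 'PRI_type': [3, 4]}
train_feats = list(feat_map)
feats_per_block = [[f for f in train_feats if 'DER_' in f],
                   [f for f in train_feats if 'PRI_' in f]]
splits = split_inputs(feat_map, feats_per_block)
print(splits)
```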
lumin.nn.models.blocks.conv_blocks module
- class lumin.nn.models.blocks.conv_blocks.AdaptiveAvgMaxConcatPool1d(sz=None)
Bases: Module
Layer that reduces the size of each channel to the specified size, via two methods: average pooling and max pooling. The outputs are then concatenated channelwise.
- Parameters:
sz (Union[int,Tuple[int,...],None]) – Requested output size, default reduces each channel to 2*1 elements. The first element is the maximum value in the channel and the other is the average value in the channel.
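The default behaviour described above (each channel reduced to its maximum and its average) can be illustrated on nested lists. This is a sketch under assumptions, not the layer itself: avg_max_concat_pool is a hypothetical helper, it handles a single (channels x length) input, and it places all maxima before all averages in the output.

```python
from typing import List


def avg_max_concat_pool(x: List[List[float]]) -> List[float]:
    """Reduce a (channels x length) input to 2*channels values: maxima, then means."""
    maxima = [max(channel) for channel in x]
    means = [sum(channel) / len(channel) for channel in x]
    return maxima + means  # concatenation along the channel axis (ordering assumed)


pooled = avg_max_concat_pool([[1.0, 2.0, 3.0], [4.0, 0.0, 2.0]])
print(pooled)
```

The output has twice as many channels as the input, which matters when sizing the layer that follows the pooling.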
- class lumin.nn.models.blocks.conv_blocks.AdaptiveAvgMaxConcatPool2d(sz=None)
Bases: AdaptiveAvgMaxConcatPool1d
Layer that reduces the size of each channel to the specified size, via two methods: average pooling and max pooling. The outputs are then concatenated channelwise.
- Parameters:
sz (Union[int,Tuple[int,...],None]) – Requested output size, default reduces each channel to 2*1 elements. The first element is the maximum value in the channel and the other is the average value in the channel.
- class lumin.nn.models.blocks.conv_blocks.AdaptiveAvgMaxConcatPool3d(sz=None)
Bases: AdaptiveAvgMaxConcatPool1d
Layer that reduces the size of each channel to the specified size, via two methods: average pooling and max pooling. The outputs are then concatenated channelwise.
- Parameters:
sz (Union[int,Tuple[int,...],None]) – Requested output size, default reduces each channel to 2*1 elements. The first element is the maximum value in the channel and the other is the average value in the channel.
- class lumin.nn.models.blocks.conv_blocks.Conv1DBlock(in_c, out_c, kernel_sz, padding='auto', stride=1, act='relu', bn=False, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>)
Bases: Module
Basic building block for building and applying a single 1D convolutional layer.
- Parameters:
in_c (int) – number of input channels (number of features per object / rows in input matrix)
out_c (int) – number of output channels (number of features / rows in output matrix)
kernel_sz (int) – width of kernel, i.e. the number of columns to overlay
padding (Union[int,str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.
stride (int) – number of columns to move kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.
act (str) – string representation of argument to pass to lookup_act
bn (bool) – whether to use batch normalisation (default order weights->activation->batchnorm)
lookup_init (Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
lookup_act (Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer
bn_class (Callable[[int],Module]) – class to use for BatchNorm, default is nn.BatchNorm1d
- Examples::
>>> conv = Conv1DBlock(in_c=3, out_c=16, kernel_sz=3)
>>>
>>> conv = Conv1DBlock(in_c=16, out_c=32, kernel_sz=3, stride=2)
>>>
>>> conv = Conv1DBlock(in_c=3, out_c=16, kernel_sz=3, act='swish', bn=True)
- forward(x)
Passes input through the layers. Might need to be overloaded in inheritance, depending on architecture.
- Parameters:
x (Tensor) – input tensor
- Return type:
Tensor
- Returns:
Resulting tensor
- get_conv_layer(in_c, out_c, kernel_sz, padding='auto', stride=1, pre_act=False, groups=1)
Builds a sandwich of layers with a single convolutional layer, plus any requested batch norm and activation. Also initialises layers to requested scheme.
- Parameters:
in_c (int) – number of input channels (number of features per object / rows in input matrix)
out_c (int) – number of output channels (number of features / rows in output matrix)
kernel_sz (int) – width of kernel, i.e. the number of columns to overlay
padding (Union[int,str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.
stride (int) – number of columns to move kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.
pre_act (bool) – whether to apply batchnorm and activation layers prior to the weight layer, or afterwards
groups (int) – number of blocks of connections from input channels to output channels
- Return type:
Module
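The effect of padding and stride on the number of columns follows standard convolution arithmetic and can be checked with a small function. The helper conv1d_out_len is hypothetical, and the choice of kernel_sz//2 for ‘auto’ padding is an assumption that conserves the column count for odd kernel sizes at stride 1; LUMIN's own computation may differ.

```python
from typing import Union


def conv1d_out_len(n_cols: int, kernel_sz: int,
                   padding: Union[int, str] = 'auto', stride: int = 1) -> int:
    """Number of output columns of a 1D convolution (dilation of 1)."""
    if padding == 'auto':
        padding = kernel_sz // 2  # assumption: conserves n_cols for odd kernels at stride 1
    return (n_cols + 2 * padding - kernel_sz) // stride + 1


same = conv1d_out_len(100, kernel_sz=3)                  # padded, stride 1
halved = conv1d_out_len(100, kernel_sz=3, stride=2)      # padded, stride 2
unpadded = conv1d_out_len(100, kernel_sz=3, padding=0)   # no padding, stride 1
print(same, halved, unpadded)
```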
- class lumin.nn.models.blocks.conv_blocks.Res1DBlock(in_c, out_c, kernel_sz, padding='auto', stride=1, act='relu', bn=False, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>)
Bases: Conv1DBlock
Basic building block for building and applying a pair of residually connected 1D convolutional layers (https://arxiv.org/abs/1512.03385). Batchnorm is applied ‘pre-activation’ as per https://arxiv.org/pdf/1603.05027.pdf, and convolutional shortcuts (again https://arxiv.org/pdf/1603.05027.pdf) are used when the stride of the first layer is greater than 1, or the number of input channels does not equal the number of output channels.
- Parameters:
in_c (int) – number of input channels (number of features per object / rows in input matrix)
out_c (int) – number of output channels (number of features / rows in output matrix)
kernel_sz (int) – width of kernel, i.e. the number of columns to overlay
padding (Union[int,str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.
stride (int) – number of columns to move kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.
act (str) – string representation of argument to pass to lookup_act
bn (bool) – whether to use batch normalisation (order is pre-activation: batchnorm->activation->weights)
lookup_init (Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
lookup_act (Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer
- Examples::
>>> conv = Res1DBlock(in_c=16, out_c=16, kernel_sz=3)
>>>
>>> conv = Res1DBlock(in_c=16, out_c=32, kernel_sz=3, stride=2)
>>>
>>> conv = Res1DBlock(in_c=16, out_c=16, kernel_sz=3, act='swish', bn=True)
- class lumin.nn.models.blocks.conv_blocks.ResNeXt1DBlock(in_c, inter_c, cardinality, out_c, kernel_sz, padding='auto', stride=1, act='relu', bn=False, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>)
Bases: Conv1DBlock
Basic building block for building and applying a set of residually connected groups of 1D convolutional layers (https://arxiv.org/abs/1611.05431). Batchnorm is applied ‘pre-activation’ as per https://arxiv.org/pdf/1603.05027.pdf, and convolutional shortcuts (again https://arxiv.org/pdf/1603.05027.pdf) are used when the stride of the first layer is greater than 1, or the number of input channels does not equal the number of output channels.
- Parameters:
in_c (int) – number of input channels (number of features per object / rows in input matrix)
inter_c (int) – number of intermediate channels in groups
cardinality (int) – number of groups
out_c (int) – number of output channels (number of features / rows in output matrix)
kernel_sz (int) – width of kernel, i.e. the number of columns to overlay
padding (Union[int,str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.
stride (int) – number of columns to move kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.
act (str) – string representation of argument to pass to lookup_act
bn (bool) – whether to use batch normalisation (order is pre-activation: batchnorm->activation->weights)
lookup_init (Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
lookup_act (Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer
bn_class (Callable[[int],Module]) – class to use for BatchNorm, default is nn.BatchNorm1d
- Examples::
>>> conv = ResNeXt1DBlock(in_c=32, inter_c=4, cardinality=4, out_c=32, kernel_sz=3)
>>>
>>> conv = ResNeXt1DBlock(in_c=32, inter_c=4, cardinality=4, out_c=32, kernel_sz=3, stride=2)
>>>
>>> conv = ResNeXt1DBlock(in_c=32, inter_c=4, cardinality=4, out_c=32, kernel_sz=3, act='swish', bn=True)
- class lumin.nn.models.blocks.conv_blocks.SEBlock1d(n_in, r, act='relu', lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>)
Bases: Module
Squeeze-excitation block [Hu, Shen, Albanie, Sun, & Wu, 2017](https://arxiv.org/abs/1709.01507). Incoming data is averaged per channel, fed through a single layer of width n_in//r and the chosen activation, then a second layer of width n_in and a sigmoid activation. Channels in the original data are then multiplied by the learned channel weights.
- Parameters:
n_in (int) – number of incoming channels
r (int) – the reduction ratio for the channel compression
act (str) – string representation of argument to pass to lookup_act
lookup_init (Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
lookup_act (Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer
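The squeeze-excitation steps described above (average per channel, layer of width n_in//r, layer of width n_in with a sigmoid, then channel rescaling) can be traced in plain Python. This is an illustrative sketch with hypothetical helpers linear and se_block, a ReLU as the chosen activation, hand-supplied weights, and a single (channels x length) input; it is not the LUMIN module.

```python
import math
from typing import List

Matrix = List[List[float]]


def linear(weights: Matrix, bias: List[float], x: List[float]) -> List[float]:
    """Dense layer: one output per row of weights."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def se_block(x: Matrix, w1: Matrix, b1: List[float], w2: Matrix, b2: List[float]) -> Matrix:
    """Squeeze-excitation on a (channels x length) input."""
    squeezed = [sum(ch) / len(ch) for ch in x]                        # average per channel
    hidden = [max(0.0, h) for h in linear(w1, b1, squeezed)]          # width n_in//r, ReLU
    gates = [1 / (1 + math.exp(-g)) for g in linear(w2, b2, hidden)]  # width n_in, sigmoid
    return [[g * v for v in ch] for g, ch in zip(gates, x)]           # rescale each channel


n_in, r = 4, 2
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
# Zero weights give a gate of sigmoid(0) = 0.5 for every channel
w1, b1 = [[0.0] * n_in for _ in range(n_in // r)], [0.0] * (n_in // r)
w2, b2 = [[0.0] * (n_in // r) for _ in range(n_in)], [0.0] * n_in
out = se_block(x, w1, b1, w2, b2)
print(out)
```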
- class lumin.nn.models.blocks.conv_blocks.SEBlock2d(n_in, r, act='relu', lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>)
Bases: SEBlock1d
Squeeze-excitation block [Hu, Shen, Albanie, Sun, & Wu, 2017](https://arxiv.org/abs/1709.01507). Incoming data is averaged per channel, fed through a single layer of width n_in//r and the chosen activation, then a second layer of width n_in and a sigmoid activation. Channels in the original data are then multiplied by the learned channel weights.
- Parameters:
n_in (int) – number of incoming channels
r (int) – the reduction ratio for the channel compression
act (str) – string representation of argument to pass to lookup_act
lookup_init (Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
lookup_act (Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer
- class lumin.nn.models.blocks.conv_blocks.SEBlock3d(n_in, r, act='relu', lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>)
Bases: SEBlock1d
Squeeze-excitation block [Hu, Shen, Albanie, Sun, & Wu, 2017](https://arxiv.org/abs/1709.01507). Incoming data is averaged per channel, fed through a single layer of width n_in//r and the chosen activation, then a second layer of width n_in and a sigmoid activation. Channels in the original data are then multiplied by the learned channel weights.
- Parameters:
n_in (int) – number of incoming channels
r (int) – the reduction ratio for the channel compression
act (str) – string representation of argument to pass to lookup_act
lookup_init (Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
lookup_act (Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer
lumin.nn.models.blocks.endcap module
- class lumin.nn.models.blocks.endcap.AbsEndcap(model)
Bases: Module
Abstract class for constructing a post-training layer which performs further calculation on NN outputs. Used when the NN was trained on a proxy objective.
- Parameters:
model (Module) – trained Model to wrap
- forward(x)
Pass tensor through endcap and compute function
- Parameters:
x (Tensor) – model output tensor
- Return type:
Tensor
- Returns:
Resulting tensor
- abstract func(x)
Transformation function to apply to model outputs
- Parameters:
x – model output tensor
- Return type:
Tensor
- Returns:
Resulting tensor
- predict(inputs, as_np=True)
Evaluate model on input tensor, and compute function of model outputs
- Parameters:
inputs (Union[ndarray,DataFrame,Tensor]) – input data as Numpy array, Pandas DataFrame, or tensor on device
as_np (bool) – whether to return predictions as Numpy array (otherwise tensor)
- Return type:
Union[ndarray,Tensor]
- Returns:
model predictions passed through endcap function
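The endcap pattern (wrap a trained model, then apply a user-defined func to its outputs) can be mirrored with the standard abc module. SketchEndcap, ExpEndcap and log_model below are hypothetical stand-ins that operate on plain lists; they illustrate the subclassing pattern only and do not reproduce AbsEndcap's tensor handling.

```python
import math
from abc import ABC, abstractmethod
from typing import Callable, List


class SketchEndcap(ABC):
    """Hypothetical stand-in for the endcap pattern; not LUMIN's AbsEndcap."""

    def __init__(self, model: Callable[[List[float]], List[float]]):
        self.model = model  # any callable mapping inputs to raw outputs

    @abstractmethod
    def func(self, x: List[float]) -> List[float]:
        """Transformation applied to the model outputs."""

    def predict(self, inputs: List[float]) -> List[float]:
        return self.func(self.model(inputs))


class ExpEndcap(SketchEndcap):
    """Undo a log-transformed target: the model was trained to predict log(y)."""

    def func(self, x: List[float]) -> List[float]:
        return [math.exp(v) for v in x]


def log_model(inputs: List[float]) -> List[float]:
    return [math.log(v) for v in inputs]  # toy 'trained model' predicting log(y)


preds = ExpEndcap(log_model).predict([1.0, 2.0, 10.0])
print(preds)
```

Subclasses only need to supply func; the wrapping and prediction logic stays in the base class.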
lumin.nn.models.blocks.gnn_blocks module
- class lumin.nn.models.blocks.gnn_blocks.GraphCollapser(n_v, n_fpv, flatten, f_initial_outs=None, n_sa_layers=0, sa_width=None, f_final_outs=None, global_feat_vec=False, agg_methods=['mean', 'max'], do=0, bn=False, act='relu', lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>, sa_class=<class 'lumin.nn.models.layers.self_attention.SelfAttention'>)
Bases: AbsGraphBlock
Class for collapsing features per vertex (batch x vertices x features) down to flat data (batch x features). Can act in two ways:
Compute aggregate features by taking the average and maximum of each feature across all vertices (does not assume any order to the vertices)
Flatten out the vertices by reshaping (does assume an ordering to the vertices)
Regardless of flattening approach, features per vertex can be revised beforehand via neural networks and self-attention.
- Parameters:
n_v (int) – number of vertices per data point to expect
n_fpv (int) – number of features per vertex to expect
flatten (bool) – if True will flatten (reshape) data into (batch x features), otherwise will compute aggregate features (average and max)
f_initial_outs (Optional[List[int]]) – list of widths for the NN layers in an NN before self-attention (None = no NN)
n_sa_layers (int) – number of self-attention layers (outputs will be fed into subsequent layers)
sa_width (Optional[int]) – width of self attention representation (paper recommends n_fpv//4)
f_final_outs (Optional[List[int]]) – list of widths for the NN layers in an NN after self-attention (None = no NN)
global_feat_vec (bool) – if true and f_initial_outs or f_final_outs are not None, will concatenate the mean of each feature as new features to each vertex prior to the last network.
agg_methods (Union[List[str],str]) – list of text representations of aggregation methods. Default is mean and max.
do (float) – dropout rate to be applied to hidden layers in the NNs
bn (bool) – whether batch normalisation should be applied to hidden layers in the NNs
act (str) – activation function to apply to hidden layers in the NNs
lookup_init (Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
lookup_act (Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer
bn_class (Callable[[int],Module]) – class to use for BatchNorm, default is LCBatchNorm1d
sa_class (Callable[[int],Module]) – class to use for self-attention layers, default is SelfAttention
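The two collapsing modes differ in whether vertex order matters, which a toy example makes concrete. The helper collapse is hypothetical and works on a single data point stored as nested lists; the ordering of the aggregated output (all means, then all maxima) is an assumption.

```python
from typing import List

Graph = List[List[float]]  # one data point: vertices x features


def collapse(graph: Graph, flatten: bool) -> List[float]:
    """Collapse (vertices x features) to a flat feature list."""
    if flatten:
        # Reshape: the output depends on the vertex order
        return [v for vertex in graph for v in vertex]
    # Aggregate: mean then max of each feature across vertices, order-independent
    feats = list(zip(*graph))
    means = [sum(f) / len(f) for f in feats]
    maxes = [max(f) for f in feats]
    return means + maxes


graph = [[1.0, 10.0], [3.0, 30.0], [2.0, 20.0]]
flat = collapse(graph, flatten=True)
agg = collapse(graph, flatten=False)
agg_reversed = collapse(graph[::-1], flatten=False)
print(flat, agg)
```

Flattening yields n_v*n_fpv outputs, whereas aggregation yields one output per feature per aggregation method, regardless of the number of vertices.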
- class lumin.nn.models.blocks.gnn_blocks.GravNet(n_v, n_fpv, cat_means, f_slr_depth, n_s, n_lr, k, f_out_depth, n_out, agg_methods=['mean', 'max'], gn_class=<class 'lumin.nn.models.blocks.gnn_blocks.GravNetLayer'>, use_sa=False, do=0, bn=False, act='relu', lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>, sa_class=<class 'lumin.nn.models.layers.self_attention.SelfAttention'>, **kargs)
Bases: AbsGraphFeatExtractor
GravNet GNN head (Qasim, Kieseler, Iiyama, & Pierini, 2019 https://link.springer.com/article/10.1140/epjc/s10052-019-7113-9). Passes features per vertex (batch x vertices x features) through several GravNetLayer layers. Like the paper model, this has the option of caching and concatenating the outputs of each GravNet layer prior to the final layer. The features per vertex are then flattened/aggregated across the vertices to flat data (batch x features).
- Parameters:
n_v (int) – number of vertices to expect per datapoint
n_fpv (int) – number of features per vertex
cat_means (bool) – if True, will extend the incoming features per vertex by including the means of all features across all vertices
f_slr_depth (int) – number of layers to use for the latent rep. NN
n_s (int) – number of latent-spatial dimensions to compute
n_lr (int) – number of features to compute per vertex for latent representation
k (int) – number of neighbours (including self) each vertex should consider when aggregating latent-representation features
f_out_depth (int) – number of layers to use for the output NN
n_out (Union[List[int],int]) – number of output features to compute per vertex. If a list, will add multiple GravNet layers, each of which outputs the respective number of features
agg_methods (Union[List[str],str]) – list of text representations of aggregation methods. Default is mean and max.
gn_class (Callable[[Dict[str,Any]],GravNetLayer]) – class to use for GravNet layers, default is GravNetLayer
use_sa (bool) – if true, will apply self-attention layer to the neighbourhood features per vertex prior to aggregation
do (float) – dropout rate to be applied to hidden layers in the NNs
bn (bool) – whether batch normalisation should be applied to hidden layers in the NNs
act (str) – activation function to apply to hidden layers in the NNs
lookup_init (Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
lookup_act (Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer
freeze – whether to start with module parameters set to untrainable
bn_class (Callable[[int],Module]) – class to use for BatchNorm, default is nn.BatchNorm1d
sa_class (Callable[[int],Module]) – class to use for self-attention layers, default is SelfAttention
- forward(x)
Passes input through the GravNet head.
- Parameters:
x (Tensor) – row-wise tensor (batch x vertices x features)
- Return type:
Tensor
- Returns:
Resulting row-wise tensor (batch x vertices x new features)
- row_wise: Optional[bool] = True
- class lumin.nn.models.blocks.gnn_blocks.GravNetLayer(n_fpv, n_s, n_lr, k, agg_methods, n_out, cat_means=True, f_slr_depth=1, f_out_depth=1, potential=None, use_sa=False, do=0, bn=False, act='relu', lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>, sa_class=<class 'lumin.nn.models.layers.self_attention.SelfAttention'>)
Bases: AbsGraphBlock
Single GravNet GNN layer (Qasim, Kieseler, Iiyama, & Pierini, 2019 https://link.springer.com/article/10.1140/epjc/s10052-019-7113-9). Designed to be used as a sub-layer of a head block, e.g. GravNetHead. Passes features per vertex through an NN to compute new features and coordinates of each vertex in latent space. Each vertex then receives additional features based on aggregations of distance-weighted features for the k-nearest vertices in latent space. A second NN then transforms the features per vertex. Input (batch x vertices x features) –> Output (batch x vertices x new features)
- Parameters:
n_fpv (int) – number of features per vertex to expect
n_s (int) – number of latent-spatial dimensions to compute
n_lr (int) – number of features to compute per vertex for latent representation
k (int) – number of neighbours (including self) each vertex should consider when aggregating latent-representation features
agg_methods (List[Callable[[Tensor],Tensor]]) – list of functions to use to aggregate distance-weighted latent-representation features
n_out (int) – number of output features to compute per vertex
cat_means (bool) – if True, will extend the incoming features per vertex by including the means of all features across all vertices. GNNHead also has a cat_means argument, which should be set to False if enabled here (otherwise averaging happens twice).
f_slr_depth (int) – number of layers to use for the latent rep. NN
f_out_depth (int) – number of layers to use for the output NN
potential (Optional[Callable[[Tensor],Tensor]]) – function to control distance weighting (default is the exp(-d^2) potential used in the paper)
use_sa (bool) – if true, will apply self-attention layer to the neighbourhood features per vertex prior to aggregation
do (float) – dropout rate to be applied to hidden layers in the NNs
bn (bool) – whether batch normalisation should be applied to hidden layers in the NNs
act (str) – activation function to apply to hidden layers in the NNs
lookup_init (Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
lookup_act (Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer
freeze – whether to start with module parameters set to untrainable
bn_class (Callable[[int],Module]) – class to use for BatchNorm, default is LCBatchNorm1d
sa_class (Callable[[int],Module]) – class to use for self-attention layers, default is SelfAttention
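The neighbour-aggregation step (k nearest vertices in latent space, features weighted by the exp(-d^2) potential, then mean and max) can be traced on a tiny example. This is a simplified sketch for one data point: gravnet_aggregate is a hypothetical helper, the latent coordinates and features are supplied by hand rather than learned, and k is assumed not to exceed the number of vertices.

```python
import math
from typing import List, Tuple


def gravnet_aggregate(coords: List[List[float]], feats: List[List[float]],
                      k: int) -> List[Tuple[List[float], List[float]]]:
    """Per vertex: (mean, max) of distance-weighted features of its k nearest vertices."""
    out = []
    for c in coords:
        # Squared Euclidean distance to every vertex (self included, at distance 0)
        d2 = [sum((a - b) ** 2 for a, b in zip(c, other)) for other in coords]
        nearest = sorted(range(len(coords)), key=lambda j: d2[j])[:k]
        # Weight neighbour features by the exp(-d^2) potential
        weighted = [[math.exp(-d2[j]) * f for f in feats[j]] for j in nearest]
        cols = list(zip(*weighted))
        out.append(([sum(col) / k for col in cols], [max(col) for col in cols]))
    return out


coords = [[0.0], [0.0], [10.0]]   # latent-space coordinates (n_s = 1)
feats = [[1.0], [3.0], [100.0]]   # latent-representation features (n_lr = 1)
agg = gravnet_aggregate(coords, feats, k=2)
print(agg)
```

Vertices 0 and 1 coincide in latent space, so each sees the other at full weight, while the isolated vertex 2 receives an almost zero contribution from its distant neighbour.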
- class lumin.nn.models.blocks.gnn_blocks.InteractionNet(n_v, n_fpv, intfunc_outs, outfunc_outs, do=0, bn=False, act='relu', lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>)
Bases: AbsGraphFeatExtractor
Implementation of the Interaction Graph-Network (https://arxiv.org/abs/1612.00222). Shown to be applicable for embedding many 4-momenta in e.g. https://arxiv.org/abs/1908.05318
Receives column-wise data and returns column-wise data.
- Parameters:
n_v (int) – number of vertices to expect per datapoint
n_fpv (int) – number of features per vertex
intfunc_outs (List[int]) – list of widths for the internal NN layers
outfunc_outs (List[int]) – list of widths for the output NN layers
do (float) – dropout rate to be applied to hidden layers in the interaction-representation and post-interaction networks
bn (bool) – whether batch normalisation should be applied to hidden layers in the interaction-representation and post-interaction networks
act (str) – activation function to apply to hidden layers in the interaction-representation and post-interaction networks
lookup_init (Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
lookup_act (Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer
bn_class (Callable[[int],Module]) – class to use for BatchNorm, default is nn.BatchNorm1d
- Examples::
>>> inet = InteractionNet(n_v=128, n_fpv=10, intfunc_outs=[20,10], outfunc_outs=[20,4])
- forward(x)
Learn new features per vertex
- Parameters:
x (Tensor) – columnwise matrix data (batch x features x vertices)
- Return type:
Tensor
- Returns:
columnwise matrix data (batch x new features x vertices)
- row_wise: Optional[bool] = False
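The row_wise attribute distinguishes blocks that expect (batch x vertices x features) from those that expect (batch x features x vertices). Converting between the two layouts is a swap of the last two axes, sketched here on nested lists with a hypothetical helper swap_layout; how LUMIN itself performs the conversion is not shown in this reference.

```python
from typing import List

Batch = List[List[List[float]]]


def swap_layout(batch: Batch) -> Batch:
    """Swap the last two axes: (batch x vertices x features) <-> (batch x features x vertices)."""
    return [[list(col) for col in zip(*matrix)] for matrix in batch]


row_wise = [[[1.0, 2.0, 3.0],    # vertex 0: three features
             [4.0, 5.0, 6.0]]]   # vertex 1
col_wise = swap_layout(row_wise)
print(col_wise)
```

Applying the swap twice returns the original layout.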
- class lumin.nn.models.blocks.gnn_blocks.NodePredictor(n_v, n_fpv, out_act, transpose_out, f_initial_outs=None, n_sa_layers=0, sa_width=None, f_final_outs=None, global_feat_vec=False, do=0, bn=False, act='relu', lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>, sa_class=<class 'lumin.nn.models.layers.self_attention.SelfAttention'>)[source]¶
Bases:
GraphCollapserModified
GraphCollapserfor providing a set of predictions per node in a graph, i.e collapsing features per vertex (batch x vertices x features) down to predictions per vertex, with approriate output activation functions data (batch x vertices x predictions). For compatibility with the format expected by some loss functions (e.g. torch.nn.NLLLoss), the output can be transposed to (batch x predictions x vertices). Features per vertex can be revised beforehand via neural networks and self-attention. The f_final neural network can be used to transform the input size to the required number of predicitons per node.Important
Since predictions are being provided in the head part of the model, but LUMIN expects models to also have a body and tail section. It is strongly recommended to use the
IdentBodyandIdentTailmodules to be placeholders.- Parameters:
n_v (
int) – number of vertices per data point to expectn_fpv (
int) – number of features per vertex to expectout_act (
str) – Output activation function to apply to every set of prediction per vertex. The weight initialisation of the last NN layer will be automatically set.transpose_out (
bool) – If True, will transpose the output into (batch x predictions x vertices), otherwise the output will be (batch x vertices x predictions)f_initial_outs (
Optional[List[int]]) – list of widths for the NN layers in an NN before self-attention (None = no NN)n_sa_layers (
int) – number of self-attention layers (outputs will be fed into subsequent layers)sa_width (
Optional[int]) – width of self attention representation (paper recommends n_fpv//4)f_final_outs (
Optional[List[int]]) – list of widths for the NN layers in an NN after self-attention (None = no NN)global_feat_vec (
bool) – if true and f_initial_outs or f_final_outs are not None, will concatenate the mean of each feature as new features to each vertex prior to the last network.do (
float) – dropout rate to be applied to hidden layers in the NNsbn (
bool) – whether batch normalisation should be applied to hidden layers in the NNsact (
str) – activation function to apply to hidden layers in the NNslookup_init (
Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.lookup_act (
Callable[[str],Any]) – function taking choice of activation function and returning an activation function layerbn_class (
Callable[[int],Module]) – class to use for BatchNorm, default is LCBatchNorm1dsa_class (
Callable[[int],Module]) – class to use for self-attention layers, default is SelfAttention
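The effect of transpose_out can be illustrated with a minimal pure-Python sketch. Nested lists stand in for tensors and the helper name transpose_predictions is hypothetical; this is not LUMIN's implementation, only an illustration of the two output layouts described above.

```python
def transpose_predictions(batch):
    """Swap the last two axes of a nested list shaped (batch x vertices x predictions)."""
    return [[list(col) for col in zip(*sample)] for sample in batch]


# One data point with 3 vertices and 2 predictions per vertex
out = [[[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]]

# After transposing, the layout is (batch x predictions x vertices)
transposed = transpose_predictions(out)
```

With real tensors the same layout change would normally be a single transpose of the last two dimensions.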
lumin.nn.models.blocks.head module¶
- class lumin.nn.models.blocks.head.AbsConv1dHead(cont_feats, vecs, feats_per_vec, act='relu', bn=False, layer_kargs=None, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, freeze=False, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>, **kargs)[source]¶
Bases:
AbsMatrixHead
Abstract wrapper head for applying 1D convolutions to column-wise matrix data. Users should inherit from this class and overload
get_layers() to define their model. Some common convolutional layers are already defined (e.g. ConvBlock and ResNeXt), which are accessible using methods such as get_conv1d_block(). For more complicated models, forward() can also be overridden. The output size of the block is automatically computed during initialisation by passing through random pseudodata.
Incoming data can either be flat, in which case it is reshaped into a matrix, or be supplied directly in matrix form. Matrices should/will be column-wise: each column is a separate object (e.g. particle or jet) and each row is a feature (e.g. energy or momentum component). Matrix elements are expected to be named according to {object}_{feature}, e.g. photon_energy. vecs (vectors) should then be a list of objects, i.e. column headers, feature prefixes. feats_per_vec should be a list of features, i.e. row headers, feature suffixes.
Note
To allow for the fact that there may be nonexistent features (e.g. z-component of missing energy), cont_feats should be a list of all matrix features which really do exist (i.e. are present in input data), and be in the same order as the incoming data. Nonexistent features will be set to zero.
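The naming convention and the zero-filling of nonexistent features can be sketched as follows. This is a minimal illustration with plain lists, assuming a hypothetical helper build_matrix and made-up feature names; it is not how LUMIN performs the reshaping internally.

```python
def build_matrix(flat, cont_feats, vecs, feats_per_vec, row_wise=True):
    """Arrange flat inputs into a matrix, filling absent {object}_{feature} entries with zero."""
    lookup = dict(zip(cont_feats, flat))
    # One row per object, one column per feature
    rows = [[lookup.get(f"{v}_{f}", 0.0) for f in feats_per_vec] for v in vecs]
    if row_wise:
        return rows
    # One column per object, one row per feature
    return [list(col) for col in zip(*rows)]


vecs = ["photon", "met"]
feats_per_vec = ["px", "py", "pz"]
# met_pz is not present in the data, so it is left out of cont_feats
cont_feats = ["photon_px", "photon_py", "photon_pz", "met_px", "met_py"]
flat = [1.0, 2.0, 3.0, 4.0, 5.0]

row_matrix = build_matrix(flat, cont_feats, vecs, feats_per_vec, row_wise=True)
col_matrix = build_matrix(flat, cont_feats, vecs, feats_per_vec, row_wise=False)
```

The missing met_pz entry appears as 0.0 in both layouts, which is the behaviour the note describes.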
- Parameters:
cont_feats (
List[str]) – list of all the matrix features which are present in the input datavecs (
List[str]) – list of objects, i.e. column headers, feature prefixesfeats_per_vec (
List[str]) – list of features per object, i.e. row headers, feature suffixesact (
str) – activation function passed to get_layersbn (
bool) – batch normalisation argument passed to get_layerslayer_kargs (
Optional[Dict[str,Any]]) – dictionary of keyword arguments which are passed to get_layerslookup_init (
Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.freeze (
bool) – whether to start with module parameters set to untrainablebn_class (
Callable[[int],Module]) – class to use for BatchNorm, default is nn.BatchNorm1d
- Examples::
>>> class MyCNN(AbsConv1dHead):
...     def get_layers(self, act:str='relu', bn:bool=False, **kargs) -> nn.Module:
...         layers = []
...         layers.append(self.get_conv1d_block(3, 16, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_block(16, 16, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_block(16, 32, stride=2, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_block(32, 32, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(nn.AdaptiveAvgPool1d(1))
...         layers = nn.Sequential(*layers)
...         return layers
...
>>> cnn = MyCNN(cont_feats=matrix_feats, vecs=vectors, feats_per_vec=feats_per_vec)
>>>
>>> class MyResNet(AbsConv1dHead):
...     def get_layers(self, act:str='relu', bn:bool=False, **kargs) -> nn.Module:
...         layers = []
...         layers.append(self.get_conv1d_block(3, 16, stride=1, kernel_sz=3, act='linear', bn=False))
...         layers.append(self.get_conv1d_res_block(16, 16, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_res_block(16, 32, stride=2, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_res_block(32, 32, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(nn.AdaptiveAvgPool1d(1))
...         layers = nn.Sequential(*layers)
...         return layers
...
>>> cnn = MyResNet(cont_feats=matrix_feats, vecs=vectors, feats_per_vec=feats_per_vec)
>>>
>>> class MyResNeXt(AbsConv1dHead):
...     def get_layers(self, act:str='relu', bn:bool=False, **kargs) -> nn.Module:
...         layers = []
...         layers.append(self.get_conv1d_block(3, 32, stride=1, kernel_sz=3, act='linear', bn=False))
...         layers.append(self.get_conv1d_resNeXt_block(32, 4, 4, 32, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_resNeXt_block(32, 4, 4, 32, stride=2, kernel_sz=3, act=act, bn=bn))
...         layers.append(self.get_conv1d_resNeXt_block(32, 4, 4, 32, stride=1, kernel_sz=3, act=act, bn=bn))
...         layers.append(nn.AdaptiveAvgPool1d(1))
...         layers = nn.Sequential(*layers)
...         return layers
...
>>> cnn = MyResNeXt(cont_feats=matrix_feats, vecs=vectors, feats_per_vec=feats_per_vec)
- check_out_sz()[source]¶
Automatically computes the output size of the head by passing through random data of the expected shape
- Return type:
int- Returns:
x.size(-1) where x is the outgoing tensor from the head
- forward(x)[source]¶
Passes input through the convolutional network.
- Parameters:
x (
Union[Tensor,Tuple[Tensor,Tensor]]) – If a tuple, the second element is assumed to be the matrix data. If a flat tensor, will convert the data to a matrix- Return type:
Tensor- Returns:
Resulting tensor
- get_conv1d_block(in_c, out_c, kernel_sz, padding='auto', stride=1, act='relu', bn=False)[source]¶
Wrapper method to build a
ConvBlockobject.- Parameters:
in_c (
int) – number of input channels (number of features per object / rows in input matrix)out_c (
int) – number of output channels (number of features / rows in output matrix)kernel_sz (
int) – width of kernel, i.e. the number of columns to overlaypadding (
Union[int,str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.stride (
int) – number of columns to move kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.act (
str) – string representation of argument to pass to lookup_actbn (
bool) – whether to use batch normalisation (order is weights->activation->batchnorm)
- Return type:
- Returns:
Instantiated
ConvBlockobject
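The relationship between kernel_sz, padding, stride, and the number of output columns can be sketched with the standard 1D convolution length formula. The sketch assumes that 'auto' padding means kernel_sz//2, which conserves the number of columns for odd kernel sizes at stride 1; that choice is an assumption made for illustration and both helper names are hypothetical.

```python
def auto_padding(kernel_sz):
    """Assumed 'auto' rule: pad by half the kernel width on each side."""
    return kernel_sz // 2


def conv1d_out_len(n_cols, kernel_sz, stride=1, padding="auto"):
    """Number of output columns of a 1D convolution (dilation 1)."""
    pad = auto_padding(kernel_sz) if padding == "auto" else padding
    return (n_cols + 2 * pad - kernel_sz) // stride + 1


same_len = conv1d_out_len(10, kernel_sz=3, stride=1)            # columns conserved
halved = conv1d_out_len(10, kernel_sz=3, stride=2)              # every other column
unpadded = conv1d_out_len(10, kernel_sz=3, stride=1, padding=0) # loses kernel_sz-1 columns
```

Because the output length depends on all of these settings, computing the head's output size by passing pseudodata through the layers avoids tracking it by hand.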
- get_conv1d_resNeXt_block(in_c, inter_c, cardinality, out_c, kernel_sz, padding='auto', stride=1, act='relu', bn=False)[source]¶
Wrapper method to build a
ResNeXt1DBlockobject.- Parameters:
in_c (
int) – number of input channels (number of features per object / rows in input matrix)inter_c (
int) – number of intermediate channels in groupscardinality (
int) – number of groupsout_c (
int) – number of output channels (number of features / rows in output matrix)kernel_sz (
int) – width of kernel, i.e. the number of columns to overlaypadding (
Union[int,str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.stride (
int) – number of columns to move kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.act (
str) – string representation of argument to pass to lookup_actbn (
bool) – whether to use batch normalisation (order is pre-activation: batchnorm->activation->weights)
- Return type:
- Returns:
Instantiated
ResNeXt1DBlockobject
- get_conv1d_res_block(in_c, out_c, kernel_sz, padding='auto', stride=1, act='relu', bn=False)[source]¶
Wrapper method to build a
Res1DBlockobject.- Parameters:
in_c (
int) – number of input channels (number of features per object / rows in input matrix)out_c (
int) – number of output channels (number of features / rows in output matrix)kernel_sz (
int) – width of kernel, i.e. the number of columns to overlaypadding (
Union[int,str]) – amount of padding columns to add at start and end of convolution. If left as ‘auto’, padding will be automatically computed to conserve the number of columns.stride (
int) – number of columns to move kernel when computing convolutions. Stride 1 = kernel centred on each column, stride 2 = kernel centred on every other column and input size halved, et cetera.act (
str) – string representation of argument to pass to lookup_actbn (
bool) – whether to use batch normalisation (order is pre-activation: batchnorm->activation->weights)
- Return type:
- Returns:
Instantiated
Res1DBlockobject
- class lumin.nn.models.blocks.head.AutoExtractLorentzBoostNet(cont_feats, vecs, feats_per_vec, n_particles, depth, width, n_singles=0, n_pairs=0, act='swish', do=0, bn=False, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, freeze=False, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>, **kargs)[source]¶
Bases:
LorentzBoostNet
Modified version of LorentzBoostNet (implementation of the Lorentz Boost Network (https://arxiv.org/abs/1812.09722)). Rather than relying on fixed kernel functions to extract features from the boosted particles, the functions are learnt during training via neural networks.
Two networks are used, one to extract n_singles features from each particle and another to extract n_pairs features from each pair of particles.
Important
4-momenta should be supplied without preprocessing, and 4-momenta must be physical (E>=|p|). It is up to the user to ensure this, and not doing so may result in errors. A BatchNorm argument (bn) is available to preprocess the 4-momenta of the boosted particles prior to passing them through the neural networks
Incoming data can either be flat, in which case it is reshaped into a matrix, or be supplied directly in row-wise matrix form. Matrices should/will be row-wise: each row is a separate 4-momentum in the form (px,py,pz,E). Matrix elements are expected to be named according to {particle}_{feature}, e.g. photon_E. vecs (vectors) should then be a list of particles, i.e. row headers, feature prefixes. feats_per_vec should be a list of features, i.e. column headers, feature suffixes.
Note
To allow for the fact that there may be nonexistent features (e.g. z-component of missing energy), cont_feats should be a list of all matrix features which really do exist (i.e. are present in input data), and be in the same order as the incoming data. Nonexistent features will be set to zero.
- Parameters:
cont_feats (
List[str]) – list of all the matrix features which are present in the input datavecs (
List[str]) – list of objects, i.e. row headers, feature prefixesfeats_per_vec (
List[str]) – list of features per object, i.e. column headers, feature suffixesn_particles (
int) – the number of particles and reference frames to learndepth (
int) – the number of hidden layers in each networkwidth (
int) – the number of neurons per hidden layern_singles (
int) – the number of features to extract per individual particlen_pairs (
int) – the number of features to extract per pair of particlesact (
str) – string representation of argument to pass to lookup_act. Activation should ideally have non-zero outputs to help deal with poorly normalised inputsdo (
float) – dropout rate for use in networksbn (
bool) – whether to use batch normalisation within networks. Inputs are passed through BN regardless of setting.lookup_init (
Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.lookup_act (
Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer.freeze (
bool) – whether to start with module parameters set to untrainable.bn_class (
Callable[[int],Module]) – class to use for BatchNorm, default is nn.BatchNorm1d
- Examples::
>>> aelbn = AutoExtractLorentzBoostNet(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs, n_particles=6, depth=3, width=10, n_singles=3, n_pairs=2)
- feat_extractor(x)[source]¶
Computes features from boosted particle 4-momenta. Incoming tensor x contains all 4-momenta for all particles for all datapoints in minibatch. single_nn is broadcast to all boosted particles, and pair_nn is broadcast to all pairs of particles. Returned features are concatenated together.
- Parameters:
x (
Tensor) – 3D incoming tensor with dimensions: [batch, particle, 4-mom (px,py,pz,E)]- Return type:
Tensor- Returns:
2D tensor with dimensions [batch, features]
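As a rough guide to how many features this produces, the count can be sketched under two assumptions that the text does not confirm: each unordered pair of particles is used once, and nothing besides the single-particle and pair features is concatenated. The helper name is hypothetical.

```python
from itertools import combinations


def n_extracted_features(n_particles, n_singles, n_pairs):
    """Feature count if single_nn runs per particle and pair_nn per unordered pair."""
    n_pair_combos = len(list(combinations(range(n_particles), 2)))
    return n_particles * n_singles + n_pair_combos * n_pairs


# Settings taken from the class example: 6 particles, 3 single features, 2 pair features
n_feats = n_extracted_features(n_particles=6, n_singles=3, n_pairs=2)
```

The pair term grows quadratically with n_particles, so it tends to dominate the width of the flattened output for larger particle counts.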
- class lumin.nn.models.blocks.head.CatEmbHead(cont_feats, do_cont=0, do_cat=0, cat_embedder=None, lookup_init=<function lookup_normal_init>, freeze=False)[source]¶
Bases:
AbsHead
Standard model head for columnar data. Provides inputs for continuous features and embedding matrices for categorical inputs, and uses a dense layer to upscale to width of network body. Designed to be passed as a ‘head’ to
ModelBuilder. Supports batch normalisation and dropout (at separate rates for continuous features and categorical embeddings). Continuous features are expected to be the first len(cont_feats) columns of input tensors and categorical features the remaining columns. Embedding arguments for categorical features are set using a CatEmbedder.
- Parameters:
cont_feats (
List[str]) – list of names of continuous input featuresdo_cont (
float) – if not None will add a dropout layer with dropout rate do_cont acting on the continuous inputs prior to concatenation with the categorical embeddingsdo_cat (
float) – if not None will add a dropout layer with dropout rate do_cat acting on the categorical embeddings prior to concatenation with the continuous inputscat_embedder (
Optional[CatEmbedder]) – CatEmbedder providing details of how to embed categorical inputslookup_init (
Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.freeze (
bool) – whether to start with module parameters set to untrainable
- Examples::
>>> head = CatEmbHead(cont_feats=cont_feats)
>>>
>>> head = CatEmbHead(cont_feats=cont_feats,
...                   cat_embedder=CatEmbedder.from_fy(train_fy))
>>>
>>> head = CatEmbHead(cont_feats=cont_feats,
...                   cat_embedder=CatEmbedder.from_fy(train_fy),
...                   do_cont=0.1, do_cat=0.05)
>>>
>>> head = CatEmbHead(cont_feats=cont_feats,
...                   cat_embedder=CatEmbedder.from_fy(train_fy),
...                   lookup_init=lookup_uniform_init)
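The expected column layout of the input (continuous features first, categorical codes after) can be sketched with plain lists. The helper split_inputs and the feature names are hypothetical and only illustrate the ordering convention stated above, not the head's actual forward pass.

```python
def split_inputs(row, n_cont):
    """Split one input row into continuous values and integer category codes."""
    cont = row[:n_cont]
    cat = [int(c) for c in row[n_cont:]]  # codes are used as embedding indices
    return cont, cat


cont_feats = ["pt", "eta"]       # two continuous features (made-up names)
row = [45.2, -1.3, 2.0, 0.0]     # followed by two categorical codes
cont, cat = split_inputs(row, len(cont_feats))
```

Getting the column order wrong would silently feed continuous values into the embedding lookup, so it is worth checking the order of the input features against cont_feats.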
- forward(x)[source]¶
Pass tensor through block
- Parameters:
x (
Tensor) – input tensor- Return type:
Tensor
- Returns:
Resulting tensor
- get_embeds()[source]¶
Get state_dict for every embedding matrix.
- Return type:
Dict[str,OrderedDict]- Returns:
Dictionary mapping categorical features to learned embedding matrix
- get_out_size()[source]¶
Get width of output layer
- Return type:
int- Returns:
Width of output layer
- plot_embeds(savename=None, settings=<lumin.plotting.plot_settings.PlotSettings object>)[source]¶
Plot representations of embedding matrices for each categorical feature.
- Parameters:
savename (
Optional[str]) – if not None, will save a copy of the plot to the given pathsettings (
PlotSettings) – PlotSettings class to control figure appearance
- Return type:
None
- class lumin.nn.models.blocks.head.GNNHead(cont_feats, vecs, feats_per_vec, extractor, collapser, use_in_bn=False, cat_means=False, freeze=False, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>, **kargs)[source]¶
Bases:
AbsMatrixHead
Encapsulating class for applying graph neural-networks to features per vertex. New features are extracted per vertex via an
AbsGraphFeatExtractor, and then data is flattened via GraphCollapser.
Incoming data can either be flat, in which case it is reshaped into a matrix, or be supplied directly in matrix form. Reshaping (row-wise or column-wise) depends on the row_wise class attribute of the feature extractor. Data will be automatically converted to row-wise for processing by the graph collapser.
Note
To allow for the fact that there may be nonexistent features (e.g. z-component of missing energy), cont_feats should be a list of all matrix features which really do exist (i.e. are present in input data), and be in the same order as the incoming data. Nonexistent features will be set to zero.
- Parameters:
cont_feats (
List[str]) – list of all the matrix features which are present in the input datavecs (
List[str]) – list of objects, i.e. feature prefixesfeats_per_vec (
List[str]) – list of features per vertex, i.e. feature suffixesuse_in_bn – If true, will apply batch norm to incoming features
cat_means (
bool) – if True, will extend the incoming features per vertex by including the means of all features across all verticesextractor (
Callable[[Any],AbsGraphFeatExtractor]) – The AbsGraphFeatExtractor class to instantiate to create new features per vertexcollapser – The
GraphCollapser class to instantiate to collapse graph to flat data (batch x features)freeze (
bool) – whether to start with module parameters set to untrainablebn_class (
Callable[[int],Module]) – class to use for BatchNorm, default is nn.BatchNorm1d
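The cat_means option can be pictured with a small sketch for a single data point stored row-wise (vertices x features). The function below is a hypothetical illustration with plain lists; LUMIN's tensor implementation and the exact position of the appended means may differ.

```python
def cat_means(vertices):
    """Append the per-feature mean over all vertices to every vertex's features."""
    n = len(vertices)
    means = [sum(col) / n for col in zip(*vertices)]
    return [list(v) + means for v in vertices]


# Two vertices with two features each
x = [[1.0, 2.0], [3.0, 4.0]]
extended = cat_means(x)
```

Each vertex doubles its feature count and gains some global context about the whole graph before feature extraction.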
- forward(x)[source]¶
Passes input through the GNN head and returns a flat tensor.
- Parameters:
x (
Union[Tensor,Tuple[Tensor,Tensor]]) – If a tuple, the second element is assumed to be the matrix data. If a flat tensor, will convert the data to a matrix- Return type:
Tensor- Returns:
Resulting tensor
- class lumin.nn.models.blocks.head.LorentzBoostNet(cont_feats, vecs, feats_per_vec, n_particles, feat_extractor=None, bn=True, lookup_init=<function lookup_normal_init>, lookup_act=<function lookup_act>, freeze=False, bn_class=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>, **kargs)[source]¶
Bases:
AbsMatrixHead
Implementation of the Lorentz Boost Network (https://arxiv.org/abs/1812.09722), which takes 4-momenta for particles and learns new particles and reference frames from linear combinations of the original particles, and then boosts the new particles into the learned reference frames. Preset kernel functions are then run over the 4-momenta of the boosted particles to compute a set of variables per particle. These functions can be based on pairs etc. of particles, e.g. angles between particles. (LorentzBoostNet.comb provides an index iterator over all pairs of particles).
A default feature extractor is provided which returns the (px,py,pz,E) of the boosted particles and the cosine of the angle between every pair of boosted particles. This can be overridden by passing a function to the feat_extractor argument during initialisation, or by overriding LorentzBoostNet.feat_extractor.
Important
4-momenta should be supplied without preprocessing, and 4-momenta must be physical (E>=|p|). It is up to the user to ensure this, and not doing so may result in errors. A BatchNorm argument (bn) is available to preprocess the features extracted from the boosted particles prior to returning them.
Incoming data can either be flat, in which case it is reshaped into a matrix, or be supplied directly in row-wise matrix form. Matrices should/will be row-wise: each row is a separate 4-momentum in the form (px,py,pz,E). Matrix elements are expected to be named according to {particle}_{feature}, e.g. photon_E. vecs (vectors) should then be a list of particles, i.e. row headers, feature prefixes. feats_per_vec should be a list of features, i.e. column headers, feature suffixes.
Note
To allow for the fact that there may be nonexistent features (e.g. z-component of missing energy), cont_feats should be a list of all matrix features which really do exist (i.e. are present in input data), and be in the same order as the incoming data. Nonexistent features will be set to zero.
- Parameters:
cont_feats (
List[str]) – list of all the matrix features which are present in the input datavecs (
List[str]) – list of objects, i.e. row headers, feature prefixesfeats_per_vec (
List[str]) – list of features per object, i.e. column headers, feature suffixesn_particles (
int) – the number of particles and reference frames to learnfeat_extractor (
Optional[Callable[[Tensor],Tensor]]) – if not None, will use the argument as the function to extract features from the 4-momenta of the boosted particles.bn (
bool) – whether batch normalisation should be applied to the extracted featureslookup_init (
Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights. Purely for inheritance, unused by class as is.lookup_act (
Callable[[str],Any]) – function taking choice of activation function and returning an activation function layer. Purely for inheritance, unused by class as is.freeze (
bool) – whether to start with module parameters set to untrainable.bn_class (
Callable[[int],Module]) – class to use for BatchNorm, default is nn.BatchNorm1d
- Examples::
>>> lbn = LorentzBoostNet(cont_feats=matrix_feats, feats_per_vec=feats_per_vec, vecs=vecs, n_particles=6)
>>>
>>> def feat_extractor(x:Tensor) -> Tensor:  # Return masses of boosted particles, x dimensions = [batch,particle,4-mom]
...     momenta,energies = x[:,:,:3], x[:,:,3:]
...     mass = torch.sqrt((energies**2)-torch.sum(momenta**2, dim=-1)[:,:,None])
...     return mass
>>> lbn = LorentzBoostNet(cont_feats=matrix_feats, feats_per_vec=feats_per_vec, vecs=vecs, n_particles=6, feat_extractor=feat_extractor)
- check_out_sz()[source]¶
Automatically computes the output size of the head by passing through random data of the expected shape
- Return type:
int- Returns:
x.size(-1) where x is the outgoing tensor from the head
- feat_extractor(x)[source]¶
Computes features from boosted particle 4-momenta. Incoming tensor x contains all 4-momenta for all particles for all datapoints in minibatch. Default function returns 4-momenta and cosine angle between all particles.
- Parameters:
x (
Tensor) – 3D incoming tensor with dimensions: [batch, particle, 4-mom (px,py,pz,E)]- Return type:
Tensor- Returns:
2D tensor with dimensions [batch, features]
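A pure-Python sketch of what such a default extractor computes for one data point is given below. It assumes the 4-momenta are listed first, followed by one cosine per unordered pair, and that every particle has nonzero 3-momentum; the function name and the ordering of the returned features are assumptions, not taken from the LUMIN source.

```python
import math
from itertools import combinations


def default_features(particles):
    """particles: list of (px, py, pz, E) tuples for a single data point."""
    # Flattened 4-momenta of all particles
    feats = [component for p in particles for component in p]
    # Cosine of the angle between the 3-momenta of every pair of particles
    for a, b in combinations(particles, 2):
        dot = sum(i * j for i, j in zip(a[:3], b[:3]))
        norm_a = math.sqrt(sum(i * i for i in a[:3]))
        norm_b = math.sqrt(sum(j * j for j in b[:3]))
        feats.append(dot / (norm_a * norm_b))
    return feats


particles = [(1.0, 0.0, 0.0, 1.0), (0.0, 2.0, 0.0, 2.0), (3.0, 0.0, 0.0, 3.0)]
feats = default_features(particles)
```

For n particles this gives 4n momentum components plus n(n-1)/2 cosines per data point.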
- forward(x)[source]¶
Passes input through the LB network and aggregates down to a flat tensor via the feature extractor, optionally passing through a batchnorm layer.
- Parameters:
x (
Union[Tensor,Tuple[Tensor,Tensor]]) – If a tuple, the second element is assumed to be the matrix data. If a flat tensor, will convert the data to a matrix- Return type:
Tensor- Returns:
Resulting tensor
- class lumin.nn.models.blocks.head.MultiHead(cont_feats, matrix_head, flat_head=<class 'lumin.nn.models.blocks.head.CatEmbHead'>, cat_embedder=None, lookup_init=<function lookup_normal_init>, freeze=False, **kargs)[source]¶
Bases:
AbsHead
Wrapper head to handle data containing flat continuous and categorical features, and matrix data. Flat inputs are passed through flat_head, and matrix inputs are passed through matrix_head. The outputs of both blocks are then concatenated together. Incoming data can either be: completely flat, in which case the matrix_head should construct its own matrix from the data; or a tuple of flat data and the matrix, in which case the matrix_head will receive the data already in matrix format.
- Parameters:
cont_feats (
List[str]) – list of names of continuous and matrix input featuresmatrix_head (
Callable[[Any],AbsMatrixHead]) – Uninitialised (partial) head to handle matrix data e.g.InteractionNetflat_head (
Callable[[Any],AbsHead]) – Uninitialised (partial) head to handle flat data e.g.CatEmbHeadcat_embedder (
Optional[CatEmbedder]) –CatEmbedderproviding details of how to embed categorical inputslookup_init (
Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking choice of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.freeze (
bool) – whether to start with module parameters set to untrainable
- Examples::
>>> inet = partial(InteractionNet, intfunc_depth=2,intfunc_width=4,intfunc_out_sz=3,
...                outfunc_depth=2,outfunc_width=5,outfunc_out_sz=4,agg_method='flatten',
...                feats_per_vec=feats_per_vec,vecs=vecs, act='swish')
>>> multihead = MultiHead(cont_feats=cont_feats+matrix_feats, matrix_head=inet, cat_embedder=CatEmbedder.from_fy(train_fy))
- forward(x)[source]¶
Pass incoming data through flat and matrix heads. If x is a Tuple then the first element is passed to the flat head and the second is sent to the matrix head. Else the elements corresponding to flat data are sent to the flat head and the elements corresponding to matrix elements are sent to the matrix head.
- Parameters:
x (
Union[Tensor,Tuple[Tensor,Tensor]]) – input data as either a flat Tensor or a Tuple of the form [flat Tensor, matrix Tensor]- Return type:
Tensor- Returns:
Concatenated output of flat and matrix heads
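The two accepted input forms can be sketched as a small dispatch function. The sketch assumes, purely for illustration, that in a completely flat input the flat features occupy the first n_flat positions; the real head selects elements according to its feature lists, and the helper name is hypothetical.

```python
def split_for_heads(x, n_flat):
    """Return (flat_part, matrix_part) for either accepted input form."""
    if isinstance(x, tuple):
        # Already separated: (flat data, matrix data)
        flat, matrix = x
        return flat, matrix
    # Completely flat input: assumed layout is flat features first, matrix features after
    return x[:n_flat], x[n_flat:]


flat_a, matrix_a = split_for_heads(([0.1, 0.2], [[1.0, 2.0], [3.0, 4.0]]), n_flat=2)
flat_b, matrix_b = split_for_heads([0.1, 0.2, 1.0, 2.0, 3.0, 4.0], n_flat=2)
```

In the second case the matrix part is still flat, so the matrix head is left to reshape it into a matrix itself.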
- class lumin.nn.models.blocks.head.RecurrentHead(cont_feats, vecs, feats_per_vec, depth, width, bidirectional=False, rnn=<class 'torch.nn.modules.rnn.RNN'>, do=0.0, act='tanh', stateful=False, freeze=False, **kargs)[source]¶
Bases:
AbsMatrixHead
Recurrent head for row-wise matrix data applying e.g. RNN, LSTM, GRU.
Incoming data can either be flat, in which case it is reshaped into a matrix, or be supplied directly in matrix form. Matrices should/will be row-wise: each row is a separate object (e.g. particle or jet) and each column is a feature (e.g. energy or momentum component). Matrix elements are expected to be named according to {object}_{feature}, e.g. photon_energy. vecs (vectors) should then be a list of objects, i.e. row headers, feature prefixes. feats_per_vec should be a list of features, i.e. column headers, feature suffixes.
Note
To allow for the fact that there may be nonexistent features (e.g. z-component of missing energy), cont_feats should be a list of all matrix features which really do exist (i.e. are present in input data), and be in the same order as the incoming data. Nonexistent features will be set to zero.
- Parameters:
cont_feats (
List[str]) – list of all the matrix features which are present in the input datavecs (
List[str]) – list of objects, i.e. row headers, feature prefixesfeats_per_vec (
List[str]) – list of features per object, i.e. column headers, feature suffixesdepth (
int) – number of hidden layers to usewidth (
int) – size of each hidden statebidirectional (
bool) – whether to set recurrent layers to be bidirectionalrnn (
RNNBase) – module class to use for the recurrent layer, e.g. torch.nn.RNN, torch.nn.LSTM, torch.nn.GRUdo (
float) – dropout rate to be applied to hidden layersact (
str) – activation function to apply to hidden layers, only used if rnn expects a nonlinearitystateful (
bool) – whether to return all intermediate hidden states, or only the final hidden statesfreeze (
bool) – whether to start with module parameters set to untrainable
- Examples::
>>> rnn = RecurrentHead(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs, depth=1, width=20)
>>>
>>> rnn = RecurrentHead(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs,
...                     depth=2, width=10, act='relu', bidirectional=True)
>>>
>>> lstm = RecurrentHead(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs,
...                      depth=1, width=10, rnn=nn.LSTM)
>>>
>>> gru = RecurrentHead(cont_feats=matrix_feats, feats_per_vec=feats_per_vec,vecs=vecs,
...                     depth=3, width=10, rnn=nn.GRU, bidirectional=True)
- forward(x)[source]¶
Passes input through the recurrent network.
- Parameters:
x (
Union[Tensor,Tuple[Tensor,Tensor]]) – If a tuple, the second element is assumed to be the matrix data. If a flat tensor, will convert the data to a matrix- Return type:
Tensor- Returns:
if stateful, returns all hidden states, otherwise only returns last hidden state
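The difference between the two return modes can be illustrated with a toy recurrent unit. This is a single-unit, scalar Elman-style recurrence with hypothetical fixed weights, written in plain Python; the actual head wraps torch.nn recurrent layers, so the sketch only shows what "all hidden states" versus "last hidden state" means.

```python
import math


def simple_rnn(sequence, w_in=0.5, w_rec=0.1, stateful=False):
    """Run a scalar tanh recurrence over the sequence, one step per object."""
    h = 0.0
    states = []
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h)  # new state depends on input and previous state
        states.append(h)
    # stateful: every intermediate hidden state; otherwise only the final one
    return states if stateful else states[-1]


all_states = simple_rnn([1.0, 2.0, 3.0], stateful=True)
last_state = simple_rnn([1.0, 2.0, 3.0], stateful=False)
```

Returning all states yields one output per object in the sequence, so the size of the head's output differs between the two modes.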
lumin.nn.models.blocks.tail module¶
- class lumin.nn.models.blocks.tail.ClassRegMulti(n_in, n_out, objective, y_range=None, bias_init=None, y_mean=None, y_std=None, lookup_init=<function lookup_normal_init>, freeze=False)[source]¶
Bases:
AbsTail
Output block for (multi(class/label)) classification or regression tasks. Designed to be passed as a ‘tail’ to
ModelBuilder. Takes output size of network body and scales it to required number of outputs. For regression tasks, y_range can be set with per-output minima and maxima. The outputs are then adjusted according to ((y_max-y_min)*x)+y_min, where x is the output of the network passed through a sigmoid function. This effectively allows regression to be performed without normalising and standardising the target values. Note it is safest to allow some leeway in setting the min and max, e.g. max = 1.2*max, min = 0.8*min. Output activation function is automatically set according to objective and y_range.
- Parameters:
n_in (
int) – number of inputs to expectn_out (
int) – number of outputs requiredobjective (
str) – string representation of network objective, i.e. ‘classification’, ‘regression’, ‘multiclass’y_range (
Union[Tuple,ndarray,None]) – if not None, will apply rescaling to network outputs: x = ((y_range[1]-y_range[0])*sigmoid(x))+y_range[0]. Incompatible with y_mean and y_stdbias_init (
Optional[float]) – specify an initial bias for the output neurons. Otherwise default values of 0 are used, except for multiclass objectives, which use 1/n_outy_mean (
Union[float,List[float],ndarray,None]) – if specified along with y_std, will apply rescaling to network outputs: x = (y_std*x)+y_mean. Incompatible with y_rangey_std (
Union[float,List[float],ndarray,None]) – if specified along with y_mean, will apply rescaling to network outputs: x = (y_std*x)+y_mean. Incompatible with y_rangelookup_init (
Callable[[str,Optional[int],Optional[int]],Callable[[Tensor],None]]) – function taking string representation of activation function, number of inputs, and number of outputs and returning a function to initialise layer weights.
- Examples::
>>> tail = ClassRegMulti(n_in=100, n_out=1, objective='classification')
>>>
>>> tail = ClassRegMulti(n_in=100, n_out=5, objective='multiclass')
>>>
>>> y_range = (0.8*targets.min(), 1.2*targets.max())
>>> tail = ClassRegMulti(n_in=100, n_out=1, objective='regression',
...                      y_range=y_range)
>>>
>>> min_targs = np.min(targets, axis=0).reshape(targets.shape[1],1)
>>> max_targs = np.max(targets, axis=0).reshape(targets.shape[1],1)
>>> min_targs[min_targs > 0] *=0.8
>>> min_targs[min_targs < 0] *=1.2
>>> max_targs[max_targs > 0] *=1.2
>>> max_targs[max_targs < 0] *=0.8
>>> y_range = np.hstack((min_targs, max_targs))
>>> tail = ClassRegMulti(n_in=100, n_out=6, objective='regression',
...                      y_range=y_range,
...                      lookup_init=lookup_uniform_init)
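The two output rescalings described for regression can be written out for a single scalar output. This is a minimal sketch of the stated formulas only; the function names are hypothetical and the real tail applies them to tensors.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rescale_range(x, y_range):
    """y_range rescaling: ((y_max - y_min) * sigmoid(x)) + y_min."""
    y_min, y_max = y_range
    return (y_max - y_min) * sigmoid(x) + y_min


def rescale_standard(x, y_mean, y_std):
    """y_mean / y_std rescaling: (y_std * x) + y_mean."""
    return y_std * x + y_mean


mid_point = rescale_range(0.0, (2.0, 4.0))        # sigmoid(0) = 0.5, the centre of the range
large_input = rescale_range(50.0, (2.0, 4.0))     # saturates towards y_max
unstandardised = rescale_standard(1.0, y_mean=10.0, y_std=2.0)
```

Because the sigmoid saturates, targets close to the edges of y_range need very large pre-activations, which is why some leeway around the true minimum and maximum is advised.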
- class lumin.nn.models.blocks.tail.IdentTail(n_in, n_out, objective, bias_init=None, lookup_init=<function lookup_normal_init>, freeze=False)[source]¶
Bases:
AbsTail
Placeholder tail module for cases in which a tail is not required. Outputs are equal to inputs.