partial_tagger.encoders module

class partial_tagger.encoders.base.BaseEncoder(*args, **kwargs)

Base class for all encoders.

abstract forward(inputs: dict[str, torch.Tensor]) → torch.Tensor

Encodes the given inputs to a tensor representation.

Parameters:

inputs – A dictionary that maps string keys to tensor values.

Returns:

A [batch_size, sequence_length, hidden_size] float tensor.

abstract get_hidden_size() → int

Returns the dimension size of the output tensor.

Returns:

The dimension size of the output tensor.
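
As a rough illustration of this contract, the sketch below implements the two abstract methods on a toy encoder. It does not import partial_tagger: `EncoderInterface` is a local stand-in that only mirrors the documented method signatures, and `ToyEncoder`, the `input_ids` key, and the sizes are assumptions made for the example.

```python
from abc import ABC, abstractmethod

import torch
from torch import nn


class EncoderInterface(nn.Module, ABC):
    """Local stand-in mirroring the two abstract methods documented above."""

    @abstractmethod
    def forward(self, inputs: dict[str, torch.Tensor]) -> torch.Tensor:
        ...

    @abstractmethod
    def get_hidden_size(self) -> int:
        ...


class ToyEncoder(EncoderInterface):
    """Hypothetical encoder: embeds token ids, nothing more."""

    def __init__(self, vocab_size: int = 100, hidden_size: int = 8) -> None:
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self._hidden_size = hidden_size

    def forward(self, inputs: dict[str, torch.Tensor]) -> torch.Tensor:
        # [batch_size, sequence_length] ids
        # -> [batch_size, sequence_length, hidden_size] floats
        return self.embedding(inputs["input_ids"])

    def get_hidden_size(self) -> int:
        return self._hidden_size


encoder = ToyEncoder()
batch = {"input_ids": torch.randint(0, 100, (2, 5))}
output = encoder(batch)
```

The last dimension of `output` matches `encoder.get_hidden_size()`, which is the relationship between the two methods that downstream layers would rely on.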

class partial_tagger.encoders.base.BaseEncoderFactory

Base class for all encoder factories.

abstract create(label_set: LabelSet) → BaseEncoder

Creates an encoder based on the provided label set.

Parameters:

label_set – An instance of LabelSet.

Returns:

An encoder that transforms input into a tensor representation.
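
The factory pattern described here can be sketched as follows: configuration is fixed when the factory is constructed, and the encoder is built later, once the label set is known. Everything in the sketch is hypothetical. `ToyLabelSet` is a stand-in that only holds label names (the real LabelSet API is not shown in this section), and letting the label count determine the output width is an assumption chosen for the example.

```python
from dataclasses import dataclass

import torch
from torch import nn


@dataclass(frozen=True)
class ToyLabelSet:
    """Hypothetical stand-in for LabelSet; it only stores label names."""

    labels: tuple[str, ...]


class ToyEncoder(nn.Module):
    """Hypothetical encoder whose output width is set at construction."""

    def __init__(self, vocab_size: int, hidden_size: int) -> None:
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)

    def forward(self, inputs: dict[str, torch.Tensor]) -> torch.Tensor:
        return self.embedding(inputs["input_ids"])

    def get_hidden_size(self) -> int:
        return self.embedding.embedding_dim


class ToyEncoderFactory:
    """Stores configuration now, builds the encoder when labels are known."""

    def __init__(self, vocab_size: int = 100) -> None:
        self.vocab_size = vocab_size

    def create(self, label_set: ToyLabelSet) -> ToyEncoder:
        # Assumption for this sketch: one output dimension per label.
        return ToyEncoder(self.vocab_size, hidden_size=len(label_set.labels))


factory = ToyEncoderFactory()
label_set = ToyLabelSet(labels=("PER", "LOC", "ORG"))
encoder = factory.create(label_set)
output = encoder({"input_ids": torch.zeros(1, 4, dtype=torch.long)})
```

Separating construction this way lets calling code pass a factory around before the training data, and therefore the label set, has been inspected.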

class partial_tagger.encoders.transformer.TransformerModelEncoder(model: PreTrainedModel, embedding_size: int, hidden_size: int, dropout_prob: float = 0.2)

A Transformer-based encoder for transforming inputs into a fixed-size tensor representation.

Parameters:
  • model – A pre-trained transformer model to use for encoding.

  • embedding_size – An integer representing the size of the input embeddings.

  • hidden_size – An integer representing the dimension size of the output tensor representation.

  • dropout_prob – A float representing dropout probability to apply. Defaults to 0.2.

model

A pre-trained transformer model.

linear

A linear layer for projecting embeddings to the hidden size.

dropout

A dropout layer for regularization.

forward(inputs: dict[str, torch.Tensor]) → Tensor

Encodes the given inputs to a tensor representation.

Parameters:

inputs – A dictionary that maps string keys to tensor values.

Returns:

A [batch_size, sequence_length, hidden_size] float tensor.

get_hidden_size() → int

Returns the dimension size of the output tensor.

Returns:

The dimension size of the output tensor.
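
One plausible way the three attributes above fit together is sketched below: the model produces per-token embeddings, dropout is applied, and the linear layer projects from `embedding_size` to `hidden_size`. This is not the library's source. The order of dropout and projection, the use of a `last_hidden_state` attribute on the model output, and the `TinyBackbone` stand-in (used so that no pre-trained weights are needed) are all assumptions.

```python
from types import SimpleNamespace

import torch
from torch import nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for a pre-trained model (no download needed)."""

    def __init__(self, vocab_size: int, embedding_size: int) -> None:
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_size)

    def forward(self, input_ids: torch.Tensor, **kwargs) -> SimpleNamespace:
        # Extra keys such as attention_mask are accepted and ignored here.
        return SimpleNamespace(last_hidden_state=self.embedding(input_ids))


class ProjectionEncoderSketch(nn.Module):
    """Sketch only: backbone -> dropout -> linear projection (assumed order)."""

    def __init__(self, model: nn.Module, embedding_size: int,
                 hidden_size: int, dropout_prob: float = 0.2) -> None:
        super().__init__()
        self.model = model
        self.linear = nn.Linear(embedding_size, hidden_size)
        self.dropout = nn.Dropout(dropout_prob)

    def forward(self, inputs: dict[str, torch.Tensor]) -> torch.Tensor:
        embeddings = self.model(**inputs).last_hidden_state
        return self.linear(self.dropout(embeddings))

    def get_hidden_size(self) -> int:
        return self.linear.out_features


encoder = ProjectionEncoderSketch(TinyBackbone(100, 16), 16, 8).eval()
inputs = {
    "input_ids": torch.randint(0, 100, (2, 5)),
    "attention_mask": torch.ones(2, 5, dtype=torch.long),
}
with torch.no_grad():
    output = encoder(inputs)
```

In this sketch `embedding_size` must equal the width of the backbone's output, otherwise the linear layer would fail with a shape mismatch.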

class partial_tagger.encoders.transformer.TransformerModelEncoderFactory(model_name: str, dropout_prob: float = 0.2)

Factory class for creating TransformerModelEncoder instances.

Parameters:
  • model_name – The name or path of the pre-trained transformer model.

  • dropout_prob – Dropout probability to apply. Defaults to 0.2.

create(label_set: LabelSet) → TransformerModelEncoder

Creates a TransformerModelEncoder instance based on the provided label set.

Parameters:

label_set – An instance of LabelSet.

Returns:

An encoder that transforms input into a tensor representation.

class partial_tagger.encoders.transformer.TransformerModelWithHeadEncoder(model: AutoModelForTokenClassification)

A Transformer-based encoder that wraps a transformer model with a classification head and transforms inputs into a tensor representation.

Parameters:

model – A transformer model with a classification head.

model

A pre-trained transformer model with a classification head.

forward(inputs: dict[str, torch.Tensor]) → Tensor

Encodes the given inputs to a tensor representation.

Parameters:

inputs – A dictionary that maps string keys to tensor values.

Returns:

A [batch_size, sequence_length, hidden_size] float tensor.

get_hidden_size() → int

Returns the dimension size of the output tensor.

Returns:

The dimension size of the output tensor.
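
A minimal sketch of how an encoder built around a model with a classification head might behave is given below. It assumes, without confirmation from this page, that the head's logits are returned directly as the output tensor and that the reported hidden size is the model's number of labels. `TinyTokenClassifier` is a hypothetical stand-in for a token classification model, used so the example runs without pre-trained weights.

```python
from types import SimpleNamespace

import torch
from torch import nn


class TinyTokenClassifier(nn.Module):
    """Hypothetical stand-in for a token classification model."""

    def __init__(self, vocab_size: int, num_labels: int) -> None:
        super().__init__()
        self.num_labels = num_labels
        self.embedding = nn.Embedding(vocab_size, 16)
        self.classifier = nn.Linear(16, num_labels)

    def forward(self, input_ids: torch.Tensor, **kwargs) -> SimpleNamespace:
        # Per-token scores, one column per label.
        logits = self.classifier(self.embedding(input_ids))
        return SimpleNamespace(logits=logits)


class HeadEncoderSketch(nn.Module):
    """Sketch only: passes the head's logits through unchanged (assumed)."""

    def __init__(self, model: nn.Module) -> None:
        super().__init__()
        self.model = model

    def forward(self, inputs: dict[str, torch.Tensor]) -> torch.Tensor:
        return self.model(**inputs).logits

    def get_hidden_size(self) -> int:
        # Assumption: the output width equals the number of labels.
        return self.model.num_labels


encoder = HeadEncoderSketch(TinyTokenClassifier(vocab_size=100, num_labels=5))
inputs = {"input_ids": torch.randint(0, 100, (2, 7))}
with torch.no_grad():
    output = encoder(inputs)
```

Under these assumptions no separate projection layer is needed, which would explain why this class takes only `model` as a constructor argument.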

class partial_tagger.encoders.transformer.TransformerModelWithHeadEncoderFactory(model_name: str, dropout_prob: float = 0.2)

Factory class for creating TransformerModelWithHeadEncoder instances.

Parameters:
  • model_name – The name or path of the pre-trained transformer model.

  • dropout_prob – Dropout probability to apply. Defaults to 0.2.

create(label_set: LabelSet) → TransformerModelWithHeadEncoder

Creates a TransformerModelWithHeadEncoder instance based on the provided label set.

Parameters:

label_set – An instance of LabelSet.

Returns:

An encoder that transforms input into a tensor representation.