ertk.preprocessing.fairseq.FairseqExtractorConfig

class ertk.preprocessing.fairseq.FairseqExtractorConfig(checkpoint: str = '???', layer: str = 'context', aggregate: Agg = Agg.MEAN, device: str = 'cuda', vq_path: str | None = None, vq_ids: bool = False, vq_ids_as_string: bool = True, max_input_len: int = 1500000)

Bases: ERTKConfig

Fairseq feature extractor configuration.

__init__(checkpoint: str = '???', layer: str = 'context', aggregate: Agg = Agg.MEAN, device: str = 'cuda', vq_path: str | None = None, vq_ids: bool = False, vq_ids_as_string: bool = True, max_input_len: int = 1500000) -> None

Methods

__init__([checkpoint, layer, aggregate, ...])

Inherited Methods

default()

Create default config.

from_config(config)

Create config object from any compatible config.

from_file(path)

Create config from YAML file and optionally override some values.

merge_with_args(args)

Merge config with command-line arguments.

merge_with_config(config)

Merge other config into this config.

to_dictconfig()

Convert config to DictConfig.

to_file(path)

Write config to YAML file.

to_string()

Generate YAML string representation of config.
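
A config built from the documented defaults leaves checkpoint set to '???'. The to_dictconfig() method suggests the class follows OmegaConf conventions, where '???' marks a mandatory value that has not been set yet. The sketch below illustrates that idea with a stdlib dataclass. FairseqConfigSketch and missing_fields are hypothetical names used only for illustration; they are not part of ertk, and the real aggregate field is an Agg enum rather than a string.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

# '???' is the placeholder shown in the documented signature. In OmegaConf
# it marks a mandatory value that has not been provided yet.
MISSING = "???"


@dataclass
class FairseqConfigSketch:
    """Hypothetical stand-in mirroring the documented fields; NOT the ertk class."""

    checkpoint: str = MISSING
    layer: str = "context"
    aggregate: str = "mean"  # assumption: plain string instead of the Agg enum
    device: str = "cuda"
    vq_path: Optional[str] = None
    vq_ids: bool = False
    vq_ids_as_string: bool = True
    max_input_len: int = 1500000


def missing_fields(cfg: FairseqConfigSketch) -> List[str]:
    """Return the names of fields still set to the '???' placeholder."""
    return [f.name for f in fields(cfg) if getattr(cfg, f.name) == MISSING]


# A default config still needs a checkpoint before it can be used.
default_cfg = FairseqConfigSketch()
# The path below is a made-up example value.
ready_cfg = FairseqConfigSketch(checkpoint="/path/to/model.pt", device="cpu")

print(missing_fields(default_cfg))
print(missing_fields(ready_cfg))
```

In practice, the checkpoint would presumably be supplied through from_file(path) or merge_with_args(args) rather than set by hand.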

Attributes

aggregate

Aggregation method.

checkpoint

Path to model checkpoint.

device

Device to run model on.

layer

Layer to extract features from.

max_input_len

Maximum input length.

vq_ids

Whether to return VQ cluster ids.

vq_ids_as_string

Whether to return VQ cluster ids as a single string of integers separated by spaces.

vq_path

Path to vector quantiser.

aggregate: Agg = 'mean'

Aggregation method.

checkpoint: str = '???'

Path to model checkpoint.

device: str = 'cuda'

Device to run model on.

layer: str = 'context'

Layer to extract features from.

max_input_len: int = 1500000

Maximum input length.

vq_ids: bool = False

Whether to return VQ cluster ids.

vq_ids_as_string: bool = True

Whether to return VQ cluster ids as a single string of integers separated by spaces.

vq_path: str | None = None

Path to vector quantiser.
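
The vq_ids_as_string description ("a single string of integers separated by spaces") can be illustrated with a short sketch. The helper format_vq_ids is hypothetical and not part of ertk. The shape of the output when the flag is False (a plain list of integers here) is an assumption, since this page does not specify it.

```python
from typing import List, Sequence, Union


def format_vq_ids(
    ids: Sequence[int], as_string: bool = True
) -> Union[str, List[int]]:
    """Format VQ cluster ids the way vq_ids_as_string is described.

    Hypothetical helper: with as_string=True the ids are joined into one
    space-separated string; otherwise they are returned as a list
    (an assumption, not documented behaviour).
    """
    if as_string:
        return " ".join(str(i) for i in ids)
    return list(ids)


# Made-up cluster ids, one per frame.
frame_ids = [3, 17, 17, 42]
print(format_vq_ids(frame_ids))
print(format_vq_ids(frame_ids, as_string=False))
```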