sakura.models.extractor.Extractor

class sakura.models.extractor.Extractor(input_dim: int, signature_config=None, pheno_config=None, main_lat_config=None, pre_encoder_config=None, verbose=False)

Bases: Module

End-to-end multi-component model architecture assembler and forward orchestrator

Despite the name ‘Extractor’, this class acts as the central hub that integrates all SAKURA submodules according to the supplied configurations. The name emphasizes the class's role in extracting high-level representations through the assembled components, since dimensionality reduction is the main task.

Parameters:
  • input_dim (int) – The dimensionality of the model inputs

  • signature_config (dict[str, Any], optional) – Model configuration settings for the signature regression branch

  • pheno_config (dict[str, Any], optional) – Model configuration settings for the phenotype prediction/regression branch

  • main_lat_config (dict[str, Any], optional) – Model configuration settings for the main latent representation of the autoencoder backbone

  • pre_encoder_config (dict[str, Any], optional) – Model configuration settings for the pre-encoder stage

  • verbose (bool, optional) – Whether to enable verbose console logging, defaults to False

Architecture Composition:
  • pre_encoder (nn.Module): Raw input preprocessing/initial feature transformation

  • main_latent_compressor (nn.Module): Core bottleneck for dimensionality reduction

  • signature_latent_compressors (nn.ModuleDict): Task-specific latent extraction branch for signature analysis

  • signature_regressors (nn.ModuleDict): Signature regression head

  • pheno_latent_compressors (nn.ModuleDict): Task-specific latent extraction branch for phenotype analysis

  • pheno_models (nn.ModuleDict): Phenotype prediction/regression head

  • decoder (nn.Module): Reconstruction/upsampling component

Forward Flow:
  1. <Input> → pre_encoder → <Pre-latent> → main_latent_compressor → <Main latent>

  2. <Main latent> OR <Pre-latent> → parallel signature/pheno processing branches → parallel signature regressors and pheno_models

  3. <Main latent> → decoder → final outputs
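The three steps above can be sketched without any deep-learning framework, using plain Python callables as stand-ins for the submodules. This is an illustrative sketch only, not the SAKURA implementation: the function forward_sketch, the helpers scale and take, the branch_input switch, the run_decoder flag, and the keys of the returned dictionary are all assumptions made for this example. In the real class the stages are nn.Module and nn.ModuleDict instances built from the configuration dictionaries.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]
Stage = Callable[[Vector], Vector]


def scale(factor: float) -> Stage:
    """Toy stand-in for a learned layer: multiplies every element by factor."""
    return lambda x: [factor * v for v in x]


def take(n: int) -> Stage:
    """Toy stand-in for a compressor: keeps the first n elements."""
    return lambda x: x[:n]


def run_branches(source: Vector, compressors: Dict[str, Stage],
                 heads: Dict[str, Stage]) -> Dict[str, Vector]:
    """Apply each named compressor, then the head registered under the same name."""
    return {name: heads[name](compress(source))
            for name, compress in compressors.items()}


def forward_sketch(
    x: Vector,
    pre_encoder: Stage,
    main_latent_compressor: Stage,
    decoder: Stage,
    signature_latent_compressors: Dict[str, Stage],
    signature_regressors: Dict[str, Stage],
    pheno_latent_compressors: Dict[str, Stage],
    pheno_models: Dict[str, Stage],
    branch_input: str = "main_latent",  # hypothetical switch: "main_latent" or "pre_latent"
    run_decoder: bool = True,           # hypothetical flag for skipping reconstruction
) -> Dict[str, object]:
    # Step 1: input -> pre_encoder -> pre-latent -> main_latent_compressor -> main latent
    pre_latent = pre_encoder(x)
    main_latent = main_latent_compressor(pre_latent)

    # Step 2: the branches read either the main latent or the pre-latent
    source = main_latent if branch_input == "main_latent" else pre_latent
    signatures = run_branches(source, signature_latent_compressors, signature_regressors)
    phenos = run_branches(source, pheno_latent_compressors, pheno_models)

    # Step 3: main latent -> decoder -> final outputs
    reconstruction: Optional[Vector] = decoder(main_latent) if run_decoder else None

    return {
        "pre_latent": pre_latent,
        "main_latent": main_latent,
        "signatures": signatures,
        "phenos": phenos,
        "reconstruction": reconstruction,
    }


# Toy components; the branch names "sig_a" and "cell_type" are made up.
components = dict(
    pre_encoder=scale(2.0),
    main_latent_compressor=take(2),
    decoder=scale(0.5),
    signature_latent_compressors={"sig_a": take(1)},
    signature_regressors={"sig_a": scale(10.0)},
    pheno_latent_compressors={"cell_type": take(3)},
    pheno_models={"cell_type": scale(-1.0)},
)

out_main = forward_sketch([1.0, 2.0, 3.0, 4.0], **components)
out_pre = forward_sketch([1.0, 2.0, 3.0, 4.0], branch_input="pre_latent",
                         run_decoder=False, **components)
```

Pairing each compressor with the head stored under the same key mirrors the way the two ModuleDict containers are listed side by side in the composition above. Whether the actual class pairs them by key is an assumption of this sketch.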

Methods

forward

Run the forward pass of the extractor, with control over which computation branches are executed

Attributes