sakura.utils.data_transformations.ToOrdinal

class sakura.utils.data_transformations.ToOrdinal

Bases: object

Callable class that converts categorical labels to an integer array using scikit-learn's OrdinalEncoder.

Useful for losses such as torch.nn.CrossEntropyLoss, which accept class indices as targets. Intended to be applied to Phenotype.

Parameters:
  • sample (array-like) – Input data of shape (n_samples, n_features) containing categorical features

  • order ('auto' or a list of array-like, optional) – Expected order of categories (unique values per feature), defaults to ‘auto’, where categories are determined automatically from the input data

  • handle_unknown ('error' or 'use_encoded_value', optional) – Strategy for handling unknown categories, defaults to ‘use_encoded_value’, which sets unknown categories to <unknown_value> (see Note)

  • unknown_value (int or np.nan, optional) – Encoded value to assign to unknown categories, must be numerical when using the ‘use_encoded_value’ strategy, defaults to np.nan

Note

<handle_unknown>: When set to ‘use_encoded_value’, the encoded value of unknown categories is set to the value given for the parameter <unknown_value>. When set to ‘error’, an error is raised if an unknown category is encountered during transform.
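
The two strategies can be illustrated with a minimal sketch that calls sklearn's OrdinalEncoder directly rather than ToOrdinal itself. It assumes ToOrdinal forwards handle_unknown and unknown_value to the encoder unchanged; the sample data and the choice of -1 as unknown_value are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Illustrative data: "C" is never seen while fitting.
train = np.array([["A"], ["B"], ["A"]], dtype=object)
test = np.array([["B"], ["C"]], dtype=object)

# 'use_encoded_value': unknown categories are mapped to unknown_value.
lenient = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
lenient.fit(train)
encoded = lenient.transform(test)  # "B" -> 1.0, unknown "C" -> -1.0

# 'error': transform raises ValueError when it meets an unknown category.
strict = OrdinalEncoder(handle_unknown="error")
strict.fit(train)
try:
    strict.transform(test)
    raised = False
except ValueError:
    raised = True
```

In sklearn, unknown_value must not collide with a code already assigned to a known category, which is why a negative integer or np.nan is a common choice.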

Returns:

Transformed ordinal encoded data

Return type:

array-like
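
As a rough illustration of how the category order affects the returned codes, the sketch below again uses sklearn's OrdinalEncoder directly. It assumes the order parameter corresponds to the encoder's categories argument; the phenotype labels are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical phenotype labels, shape (n_samples, n_features) = (4, 1).
labels = np.array([["high"], ["low"], ["mid"], ["low"]], dtype=object)

# 'auto': categories are the sorted unique values of each feature,
# so here high=0, low=1, mid=2.
auto_codes = OrdinalEncoder(categories="auto").fit_transform(labels)

# Explicit order: one list of categories per feature, so low=0, mid=1, high=2.
ordered = OrdinalEncoder(categories=[["low", "mid", "high"]])
ordered_codes = ordered.fit_transform(labels)

# The encoder returns floats by default; class-index targets need integers.
targets = ordered_codes.astype(np.int64).ravel()
```

The integer cast is only safe when no entry is np.nan, so with the default unknown_value of np.nan any unknown categories would have to be handled before casting.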

Methods