sakura.utils.data_splitter.DataSplitter.auto_random_k_bin_labelling

DataSplitter.auto_random_k_bin_labelling(base: ndarray, k: int, seed=None) → ndarray

Obtain a label vector that assigns a label from 1 to k to each included point and 0 to each excluded point.

The k labels divide the included points into random bins, so that subsets of the dataset can be selected later by label.

Parameters:

base (np.ndarray[base.dtype, np.integer]) – The predefined label vector to work with
k (int) – The number of labels assigned to included points, i.e. the number of divisions available for later incremental selection by percentage
seed (int, optional) – A temporary random seed

Returns:

A randomly assigned label vector with values from 0 to k, where 1 to k mark included points and 0 marks excluded points

Return type:

np.ndarray[np.integer]
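
Example:

The following is a minimal NumPy sketch of the labelling idea described above, not the sakura implementation. The function name random_k_bin_labels is hypothetical, and the sketch rests on several assumptions: base is one-dimensional, nonzero entries in base mark included points, the bins are of near-equal size, and the seed only affects a local random generator.

```python
import numpy as np


def random_k_bin_labels(base, k, seed=None):
    """Hypothetical stand-in for illustration; not the library's code."""
    base = np.asarray(base)
    # Local generator: the global NumPy random state is left untouched.
    rng = np.random.default_rng(seed)

    labels = np.zeros(base.shape, dtype=np.int64)
    # Assumption: nonzero entries of a 1-D base mark included points.
    included = np.flatnonzero(base)

    # Cycle 1..k over the included points to get near-equal bins,
    # then shuffle so that bin membership is random.
    bins = np.arange(included.size) % k + 1
    labels[included] = rng.permutation(bins)
    return labels


base = np.array([1, 1, 0, 1, 1, 0, 1, 1])
labels = random_k_bin_labels(base, k=3, seed=0)

# Incremental selection: labels 1..j cover roughly j/k of the included points.
first_two_thirds = (labels > 0) & (labels <= 2)
```

Under these assumptions, raising the threshold j in (labels > 0) & (labels <= j) grows the selected subset step by step, which is one way to read the "incremental selection percentages" mentioned for k.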