dqa.tasks.ml.BalanceClassificationTrainingSet

class dqa.tasks.ml.BalanceClassificationTrainingSet(label_name: str | List[str], output_label_name: str | List[str], class_ratios: List[float] | None = None, **kwargs)

Resamples a training data set and its label data set such that there is an equal number of samples for each label value.

The samples are taken randomly. The training data is specified by input_name and output_name. The label data is specified by label_name and output_label_name.
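The idea of random, class-balanced resampling can be sketched with plain NumPy. This is only an illustration, not the implementation used by this task: the helper name balance_by_downsampling is hypothetical, and the sketch assumes that balancing is done by drawing, without replacement, as many samples from every class as the smallest class contains.

```python
import numpy as np


def balance_by_downsampling(data, labels, seed=0):
    """Hypothetical helper: keep an equal number of random samples per label value."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    labels = np.asarray(labels)

    # Count how often each label value occurs.
    classes, counts = np.unique(labels, return_counts=True)
    # Assumption: every class is reduced to the size of the smallest class.
    n_per_class = counts.min()

    # Randomly pick n_per_class indices for each label value.
    keep = np.concatenate([
        rng.choice(np.flatnonzero(labels == c), size=n_per_class, replace=False)
        for c in classes
    ])
    rng.shuffle(keep)

    # The same indices are applied to data and labels so the rows stay aligned.
    return data[keep], labels[keep]


# Seven samples of class 0 and three samples of class 1.
data = np.arange(10).reshape(10, 1)
labels = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
balanced_data, balanced_labels = balance_by_downsampling(data, labels)
```

In this sketch the training data and the labels are indexed with the same random selection, which mirrors the requirement that both data sets are resampled together.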

Parameters:
  • label_name (str or list of str) – The name of the data row containing the input labels.

  • output_label_name (str or list of str) – The name of the output label data row.

  • class_ratios (list of floats, default=None) – If set, this specifies the ratio of the number of samples for each class. The number of samples for class i will be int(data_length * class_ratios[i]).
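As a worked example of the class_ratios formula above, the following lines evaluate int(data_length * class_ratios[i]) for three classes. The values of data_length and class_ratios are made up for illustration, and data_length is assumed here to be the total number of samples in the data set.

```python
# Illustrative values only; data_length is assumed to be the total sample count.
data_length = 10
class_ratios = [0.5, 0.25, 0.25]

# Number of samples per class, exactly as given by the formula in the parameter description.
samples_per_class = [int(data_length * ratio) for ratio in class_ratios]

print(samples_per_class)       # [5, 2, 2]
print(sum(samples_per_class))  # 9
```

Because int() truncates toward zero, the per-class counts may add up to slightly less than data_length even when the ratios sum to 1.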

Methods

finish()

Can perform any clean-up actions required after the task has finished, e.g. closing network connections.

in_out_default

input_output_dataset

input_output_machine

input_output_mode

input_output_name

log

modify_data_row

modify_dataset

modify_dataset_dict

modify_machine

modify_measurement

set_logging_level

transfer_metadata