dqa.tasks.ml.BalanceClassificationTrainingSet

class dqa.tasks.ml.BalanceClassificationTrainingSet(label_name: str | List[str], output_label_name: str | List[str], class_ratios: List[float] | None = None, **kwargs)

Resamples a training data set and its label data set so that, by default, each label value has an equal number of samples.

The samples are drawn at random. The training data is specified by input_name and output_name. The label data is specified by label_name and output_label_name.

Parameters:

label_name (str or list of str) – The name of the data row containing the input labels.
output_label_name (str or list of str) – The name of the output label data row.
class_ratios (list of floats, default=None) – If set, this specifies the ratio of the number of samples for each class. The number of samples for class i will be int(data_length * class_ratios[i]).
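The sample-count rule above can be illustrated with a minimal NumPy sketch. This is not the dqa implementation: the helper name balance_training_set, the seed argument, the choice to down-sample to the smallest class when class_ratios is None, and the reading of data_length as the total number of input samples are all assumptions made for illustration.

```python
import numpy as np


def balance_training_set(data, labels, class_ratios=None, seed=0):
    """Hypothetical helper (not part of dqa): randomly resample data and labels per class."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    data_length = len(labels)  # assumption: total number of input samples

    if class_ratios is None:
        # Assumption: equalize by down-sampling every class to the smallest class size.
        n_min = min(int(np.sum(labels == c)) for c in classes)
        counts = [n_min] * len(classes)
    else:
        # Rule quoted in the parameter description.
        counts = [int(data_length * r) for r in class_ratios]

    picked = []
    for c, n in zip(classes, counts):
        idx = np.flatnonzero(labels == c)
        # Draw with replacement only when more samples are requested than exist.
        picked.append(rng.choice(idx, size=n, replace=n > len(idx)))

    picked = np.concatenate(picked)
    rng.shuffle(picked)  # mix the classes so they are not grouped in blocks
    return data[picked], labels[picked]


# Ten samples: eight of class 0, two of class 1.
x = np.arange(10)
y = np.array([0] * 8 + [1] * 2)

x_equal, y_equal = balance_training_set(x, y)
x_ratio, y_ratio = balance_training_set(x, y, class_ratios=[0.25, 0.75])
```

With 10 input samples and class_ratios=[0.25, 0.75], the rule gives int(2.5) = 2 samples for the first class and int(7.5) = 7 for the second, so the truncation in int() can make the output shorter than the input.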

Methods

finish()

Performs any cleanup required after the task has finished, e.g. closing network connections.

in_out_default
input_output_dataset
input_output_machine
input_output_mode
input_output_name
log
modify_data_row
modify_dataset
modify_dataset_dict
modify_machine
modify_measurement
set_logging_level
transfer_metadata