dqa.tasks.ml.TrainTestSplit

class dqa.tasks.ml.TrainTestSplit(input_name: str | List[str], output_name: str | List[str], test_size: float | None = None, train_size: float | None = None, **kwargs)

Splits data into training and test sets. This is largely a wrapper around the sklearn.model_selection.train_test_split function.
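Because the task is described as mostly wrapping train_test_split, the underlying behaviour can be sketched with scikit-learn directly. The snippet below does not use the dqa API; the array names and the fixed random_state are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Two toy arrays with 8 samples each (illustrative data, not from dqa).
features = np.arange(16).reshape(8, 2)
labels = np.arange(8)

# For each input array, scikit-learn returns the training part
# followed by the test part, in the order the inputs were given.
features_train, features_test, labels_train, labels_test = train_test_split(
    features, labels, test_size=0.25, random_state=0
)

print(features_train.shape, features_test.shape)
print(labels_train.shape, labels_test.shape)
```

With 8 samples and test_size=0.25, two samples go to each test set and six to each training set.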

Parameters:

input_name (str or list of str) – The input data rows to be split.
output_name (str or list of str) – The names of the output data rows. For each entry in input_name, give the name of the training output followed by the name of the test output, so output_name must contain twice as many entries as input_name.
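test_size (float, default=None) – The fraction of the data assigned to the test sets. If None, the default behaviour of sklearn.model_selection.train_test_split applies: the complement of train_size is used, or 0.25 if train_size is also None.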
train_size (float, default=None) – The relative size of the training sets.
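The pairing between input_name and output_name described above can be sketched as follows. The helper pair_output_names is hypothetical and not part of dqa; it only illustrates the expected ordering (train name first, then test name, for each input) and the length requirement.

```python
from typing import Dict, List, Tuple, Union


def pair_output_names(
    input_name: Union[str, List[str]],
    output_name: Union[str, List[str]],
) -> Dict[str, Tuple[str, str]]:
    """Map each input name to its (train, test) output names.

    Hypothetical helper for illustration only; it is not part of dqa.
    """
    inputs = [input_name] if isinstance(input_name, str) else list(input_name)
    outputs = [output_name] if isinstance(output_name, str) else list(output_name)
    if len(outputs) != 2 * len(inputs):
        raise ValueError(
            f"expected {2 * len(inputs)} output names, got {len(outputs)}"
        )
    # Outputs are interleaved: train name first, then test name, per input.
    return {
        name: (outputs[2 * i], outputs[2 * i + 1])
        for i, name in enumerate(inputs)
    }


pairs = pair_output_names(["X", "y"], ["X_train", "X_test", "y_train", "y_test"])
print(pairs)
```

Under this reading, a single input always needs a two-element list for output_name, so a bare string for output_name would fail the length check.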

Methods

finish()

Performs any cleanup required after the task has finished, e.g. closing network connections.

in_out_default
input_output_dataset
input_output_machine
input_output_mode
input_output_name
log
modify_data_row
modify_dataset
modify_dataset_dict
modify_machine
modify_measurement
set_logging_level
transfer_metadata