dqa.tasks.ml.TrainTestSplit

class dqa.tasks.ml.TrainTestSplit(input_name: str | List[str], output_name: str | List[str], test_size: float | None = None, train_size: float | None = None, **kwargs)

Splits data into training and test sets. This task is largely a wrapper around the sklearn.model_selection.train_test_split function.

Parameters:
  • input_name (str or list of str) – The input data rows to be split.

  • output_name (str or list of str) – The names of the output data rows. For each entry in input_name, there must be one entry for the training output followed by one entry for the test output, so output_name must contain twice as many entries as input_name.

  • test_size (float, default=None) – The relative size of the test sets. If None, the default follows sklearn.model_selection.train_test_split, which uses the complement of train_size, or 0.25 if train_size is also None.

  • train_size (float, default=None) – The relative size of the training sets.
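
The ordering rule for output_name can be illustrated with a small sketch. The helper pair_output_names below is hypothetical and not part of dqa; it only shows how each entry of input_name would line up with a (train, test) pair of output names under the rule described above. The same train-then-test order per input is what sklearn.model_selection.train_test_split returns for several arrays.

```python
from typing import Dict, List, Tuple


def pair_output_names(
    input_names: List[str], output_names: List[str]
) -> Dict[str, Tuple[str, str]]:
    """Hypothetical helper (not part of dqa): map each input name to its
    (train, test) output names, assuming the documented ordering."""
    # The documentation requires two output names per input name.
    if len(output_names) != 2 * len(input_names):
        raise ValueError(
            f"expected {2 * len(input_names)} output names, "
            f"got {len(output_names)}"
        )
    # Entry 2*i is the training output, entry 2*i + 1 the test output.
    return {
        name: (output_names[2 * i], output_names[2 * i + 1])
        for i, name in enumerate(input_names)
    }


pairs = pair_output_names(
    ["X", "y"], ["X_train", "X_test", "y_train", "y_test"]
)
print(pairs)
```

Under this reading, a task constructed with input_name=["X", "y"] would take output_name=["X_train", "X_test", "y_train", "y_test"]; the specific names are illustrative only.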

Methods

finish()

Performs any actions required to clean up after the task has finished, e.g. closing network connections.

in_out_default

input_output_dataset

input_output_machine

input_output_mode

input_output_name

log

modify_data_row

modify_dataset

modify_dataset_dict

modify_machine

modify_measurement

set_logging_level

transfer_metadata