Tasks
=====

The :class:`dqa.tasks.tasks.Task` class is a core building block of the Knowlestry framework. By deriving subclasses from it, a wide range of operations on data can be implemented. Various operations implemented in this form are already included in the framework (see :doc:`../reference/dqa.tasks`). Custom extensions can easily be added by creating new subclasses.

The Task class implements ``modify_*()`` methods on multiple levels, as described in :doc:`data_representation`. By default, ``modify_dataset_dict()`` on the highest level calls ``modify_dataset()`` for each ``Dataset``, and each lower level continues in an analogous fashion.

Depending on the type of operation to be implemented, it is usually most convenient to override the ``modify_*()`` method of one specific level. For example, an elementary operation such as a logarithm (:class:`dqa.tasks.transformations.Log`) only works on a data row and is most conveniently implemented by overriding the ``modify_data_row()`` method. On the other hand, the class :class:`dqa.tasks.data_structure.JoinMachines` joins the data from multiple machines into one (within a dataset) and is implemented by overriding ``modify_dataset()``.

By default, a task is applied to every dataset, every machine, etc. However, the constructor parameters of the Task class can restrict the parts of the data it is applied to. Specifically, the ``input_dataset`` parameter can be given as a single string or a list of strings and specifies that the Task will only be applied to these datasets. By default, it is ``None``, indicating that the Task will be applied to all datasets. By specifying ``output_dataset`` (usually a list with the same length as ``input_dataset``), the output of the Task can also be written to a different dataset. By default, it is written to the same one.

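
The dataset selection described above can be illustrated with a small self-contained sketch. It does not use the real ``dqa`` classes: ``ToySelectiveTask`` and ``ToyDouble`` are hypothetical stand-ins, a dataset is simplified to a plain list of numbers, and raising an error on mismatched list lengths is an assumption of this sketch, not documented framework behaviour.

```python
class ToySelectiveTask:
    """Hypothetical stand-in for the selection logic; not the real dqa Task."""

    def __init__(self, input_dataset=None, output_dataset=None):
        # Both parameters accept None, a single string, or a list of strings.
        self.input_dataset = self._as_list(input_dataset)
        self.output_dataset = self._as_list(output_dataset)

    @staticmethod
    def _as_list(value):
        if value is None:
            return None
        return [value] if isinstance(value, str) else list(value)

    def modify_dataset_dict(self, dataset_dict):
        # None means "apply to all datasets".
        inputs = self.input_dataset
        if inputs is None:
            inputs = list(dataset_dict)
        # Without output_dataset, results are written back to the inputs.
        outputs = self.output_dataset
        if outputs is None:
            outputs = inputs
        if len(inputs) != len(outputs):
            # Assumption of this sketch: mismatched lengths are rejected.
            raise ValueError("input_dataset and output_dataset differ in length")
        result = dict(dataset_dict)
        for source, target in zip(inputs, outputs):
            result[target] = self.modify_dataset(dataset_dict[source])
        return result

    def modify_dataset(self, dataset):
        # Default: leave the dataset unchanged.
        return dataset


class ToyDouble(ToySelectiveTask):
    def modify_dataset(self, dataset):
        return [2 * value for value in dataset]


data = {"train": [1, 2], "test": [3]}

# Applied to every dataset, written back in place.
all_doubled = ToyDouble().modify_dataset_dict(data)

# Applied to "train" only, written to a new dataset.
selected = ToyDouble(
    input_dataset="train", output_dataset="train_x2"
).modify_dataset_dict(data)
```

In this sketch, ``all_doubled`` contains doubled values for both datasets, while ``selected`` keeps ``train`` and ``test`` untouched and adds ``train_x2``.
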

Analogous parameters exist for the other levels:

- ``input_machine`` and ``output_machine`` specify the Machine names to use as input/output.
- ``input_name`` and ``output_name`` specify the names of the input/output data rows.
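
To make the level-wise delegation between the ``modify_*()`` methods more concrete, the following sketch mimics it with nested dictionaries. It is a toy model, not the framework's implementation: ``ToyTask``, ``ToyLog`` and ``ToyJoinMachines`` are hypothetical names, the intermediate ``modify_machine()`` level is assumed by analogy, and the joined machine name ``"joined"`` is invented for illustration.

```python
import math


class ToyTask:
    """Hypothetical stand-in showing the delegation idea; not dqa's Task."""

    def modify_dataset_dict(self, dataset_dict):
        # Highest level: delegate to modify_dataset() for each dataset.
        return {name: self.modify_dataset(ds) for name, ds in dataset_dict.items()}

    def modify_dataset(self, dataset):
        # Assumed next level: delegate to modify_machine() for each machine.
        return {name: self.modify_machine(m) for name, m in dataset.items()}

    def modify_machine(self, machine):
        # Lowest container level: delegate to modify_data_row() for each row.
        return {name: self.modify_data_row(row) for name, row in machine.items()}

    def modify_data_row(self, data_row):
        # Default: leave the data row unchanged.
        return data_row


class ToyLog(ToyTask):
    # An element-wise operation only needs the data-row level.
    def modify_data_row(self, data_row):
        return [math.log(value) for value in data_row]


class ToyJoinMachines(ToyTask):
    # Joining machines needs to see the whole dataset at once.
    def modify_dataset(self, dataset):
        joined = {}
        for machine_name, machine in dataset.items():
            for row_name, row in machine.items():
                joined[f"{machine_name}.{row_name}"] = row
        return {"joined": joined}


data = {"train": {"m1": {"temp": [1.0, 10.0]}, "m2": {"temp": [1.0]}}}

logged = ToyLog().modify_dataset_dict(data)
joined = ToyJoinMachines().modify_dataset_dict(data)
```

Overriding a single level keeps each subclass short: ``ToyLog`` never has to iterate over datasets or machines, and ``ToyJoinMachines`` replaces the per-machine iteration entirely because it changes the structure of the dataset.
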