Data transformation task classes (dqa.tasks.transformations)

Data conversion tasks

Parse(format, **kwargs)

Parses arrays of strings in the sense of the Python 'parse' library.

StringToDatetime([format, timezone])

Converts time strings into pandas Datetimes.

StringToTimestamp([format, timezone])

Converts time strings into POSIX timestamps.

TimeDeltaToSeconds(**kwargs)

Converts time deltas to total seconds.
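
The three time conversions above correspond to standard pandas operations. The following is a minimal sketch in plain pandas, not the dqa task API; the format string, the UTC timezone, and all variable names are assumptions chosen for illustration.

```python
import pandas as pd

time_strings = ["2021-03-01 12:00:00", "2021-03-01 12:00:30"]

# Strings -> pandas datetimes (format and timezone are illustrative assumptions).
datetimes = pd.to_datetime(time_strings, format="%Y-%m-%d %H:%M:%S").tz_localize("UTC")

# Datetimes -> POSIX timestamps (seconds since 1970-01-01 00:00:00 UTC).
timestamps = [ts.timestamp() for ts in datetimes]

# Time deltas between consecutive entries -> total seconds.
deltas = pd.Series(datetimes).diff().dropna()
delta_seconds = deltas.dt.total_seconds().tolist()
```

Localizing to an explicit timezone before taking timestamps avoids ambiguity about how naive datetimes are interpreted.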

Elementary transformation tasks

Convolution1D(mask, **kwargs)

Performs a one-dimensional convolution with a fixed vector.

EntrywiseDifference(input_name, output_name, ...)

Subtracts data rows by entry.

EntrywiseOperation(input_name, output_name, ...)

Performs an operation entrywise on all specified data rows.

EntrywiseProduct(input_name, output_name, ...)

Computes the entrywise product of multiple data rows.

EntrywiseQuotient(input_name, output_name, ...)

Divides data rows by entry.

EntrywiseSum(input_name, output_name, **kwargs)

Adds data rows by entry.

FFT([axis])

Performs the fast Fourier transform.

InList(values_list, **kwargs)

Determines (entrywise) if values are in a specified list.

InRange(range, **kwargs)

Determines (entrywise) if values are in a specified range.

IndexTask(index_name, **kwargs)

Takes entries from one data row with indices given in another data row.

LinearTransform([intercept, factor, factor_inv])

Applies an affine linear transformation to the data with constant coefficients.

Normalize01(**kwargs)

Normalizes the values in each data row to the range [0, 1].

RMSTask([axis])

Computes the quadratic mean (root-mean-square).

ReverseIndex(index_reference_name, **kwargs)

Computes the indices of the input data rows within a specified index reference data row.
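
IndexTask and ReverseIndex describe inverse lookups. The sketch below shows both operations in plain NumPy rather than through the dqa classes; the array names are hypothetical, and the reverse lookup assumes that every entry occurs in the reference row.

```python
import numpy as np

values = np.array([10.0, 20.0, 30.0, 40.0])
indices = np.array([3, 0, 0, 2])

# Index lookup: take the entries of `values` at the positions given in `indices`.
taken = values[indices]

# Reverse lookup: find the position of each entry within a reference row.
# Assumes every entry is present in `reference`; missing entries would
# silently yield a wrong (or out-of-range) position.
reference = np.array(["a", "b", "c", "d"])
entries = np.array(["c", "a", "d"])
order = np.argsort(reference)
positions = order[np.searchsorted(reference, entries, sorter=order)]
```

Applying the index lookup with `positions` to `reference` returns `entries` again, which is a quick consistency check.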

SignWithThreshold(threshold, **kwargs)

Returns the sign with a threshold around 0.

Sort(key_name[, descending, axis])

Sorts data.

TimeObjectToTimestamp(**kwargs)

Converts datetime objects to Unix timestamps.
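
For Normalize01 and RMSTask in this section, the underlying arithmetic can be sketched in plain NumPy. This is not the dqa implementation; in particular, how Normalize01 treats a constant row (where the maximum equals the minimum) is not stated here, so the sketch assumes a non-constant row.

```python
import numpy as np

row = np.array([2.0, 4.0, 6.0, 10.0])

# Min-max normalization to [0, 1]; assumes row.max() > row.min().
normalized = (row - row.min()) / (row.max() - row.min())

# Quadratic mean (root-mean-square) along the last axis.
rms = np.sqrt(np.mean(np.square(row), axis=-1))
```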

General transformation tasks

AutoRescaleFactors(**kwargs)

Computes scaling factors for all input data rows in such a way that they can be drawn in the same plot despite highly different scales.

BinaryChangeIndicatorMargin(scale_of, ...)

Generates an indicator time series that takes the value 1 around certain change points with a specified margin.

CategoricalToNumbers(**kwargs)

Turns categorical data into integer indices.

CreateIndex(association_dataset[, ...])

Replaces entries (e.g. large integers or strings) by indices and creates a global index association table.

DenseWeight(kernel_bandwidth, alpha[, epsilon])

Computes sample weight values depending on the density of the samples.

ExpandToLength(length_name, **kwargs)

Takes a data row with one single entry and extends it by repeating this single value up to the length of another data row.

IntervalClassification(boundaries, classes, ...)

Replaces input values by integer values depending on given interval boundaries.
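
An interval classification of this kind can be sketched with np.digitize. The example below is plain NumPy, not the dqa task; the assumption that there is one class per interval (one more class than boundaries) and that each boundary belongs to the interval on its right is made for illustration only.

```python
import numpy as np

boundaries = np.array([0.0, 10.0, 20.0])   # interval edges, ascending
classes = np.array([0, 1, 2, 3])           # assumed: len(boundaries) + 1 classes
values = np.array([-5.0, 0.0, 9.9, 10.0, 25.0])

# np.digitize returns i such that boundaries[i-1] <= x < boundaries[i],
# with 0 for values below the first edge and len(boundaries) above the last.
bins = np.digitize(values, boundaries)
classified = classes[bins]
```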

IntervalIndicator2(beginning_name, end_name, ...)

Computes an indicator time series for a list of intervals given by start and end points.

LinearRegressionTask(**kwargs)

Performs linear regression and returns the coefficients.

PadWithNan(length_name, **kwargs)

Pads all input data rows with NaN values up to the length of a specified data row.
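
Both length-matching tasks, ExpandToLength and PadWithNan, have simple NumPy counterparts. The sketch below uses hypothetical array names and assumes that padding is appended at the end of the row, which is not confirmed here.

```python
import numpy as np

reference = np.arange(5)                 # row whose length is the target
short_row = np.array([1.0, 2.0, 3.0])
single = np.array([7.0])

# Pad with NaN at the end (assumed side) up to the reference length.
missing = len(reference) - len(short_row)
padded = np.pad(short_row, (0, missing), constant_values=np.nan)

# Repeat a single value up to the reference length.
expanded = np.repeat(single, len(reference))
```

NaN padding requires a floating-point row; an integer row would need to be cast first.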

PhaseSequence([phases_axis, ...])

Determines the phase sequence from the Fourier coefficients of a three-phase alternating current.

RegexSubstituteMultiple(pattern, ...)

Performs a regex substitution on a data row for possibly multiple replacement patterns.

SelectIndex(index[, axis])

Selects a certain index along a specified axis.

SelectRange([start, end, step, axis])

Selects [start:end:step] along the specified axis.

StrRestrictLength(length, **kwargs)

Cuts off all strings in the input at a certain maximum length.

SubstituteTask(substitutions[, default_value])

Performs substitutions given by a fixed dictionary.

Index-wise transformation tasks

MeanByIndex(index_name[, index_output])

Computes the mean for each set of entries with the same index value.

OperationByIndex(index_name[, index_output])

Performs an operation on all entries corresponding to the same index.

StdByIndex(index_name[, index_output])

Computes the standard deviation for each set of entries with the same index value.
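
The index-wise tasks in this section amount to a group-by over an index row. The following pandas sketch illustrates the idea and is not the dqa API; the variable names are hypothetical, and the population standard deviation (ddof=0) is an assumption, since the convention used by StdByIndex is not stated.

```python
import numpy as np
import pandas as pd

index = np.array([2, 0, 2, 1, 0])
values = np.array([1.0, 4.0, 3.0, 5.0, 6.0])

# Group the values by their index and aggregate each group.
grouped = pd.Series(values).groupby(index)
mean_by_index = grouped.mean()         # one mean per distinct index value
std_by_index = grouped.std(ddof=0)     # assumed: population standard deviation
```

The results are ordered by sorted index value (0, 1, 2), which corresponds to the role an index output row would play.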

Wrapper tasks

AbsTask(**kwargs)

Computes the absolute value, wrapper for np.abs.

Concatenate([axis])

Concatenates arrays, wrapper for np.concatenate.

Cumsum(**kwargs)

Computes the cumulative sum, wrapper for np.cumsum.

ExpandDims([axis])

Expands the dimensions of an array, wrapper for np.expand_dims.

GaussianFilter1D(sigma, **kwargs)

Applies a Gaussian blurring filter, wrapper for scipy.ndimage.gaussian_filter1d.

Log(**kwargs)

Computes the natural logarithm, wrapper for np.log.

Mean([axis, keepdims])

Computes the mean of data, wrapper for np.mean.

NormTask([ord, axis, keepdims])

Computes the norm of data rows, wrapper for np.linalg.norm.

Prod([axis])

Computes the product of the elements within each data row, wrapper for np.prod.

RandN(size, **kwargs)

Produces an array of normally distributed random numbers of specified shape, wrapper for np.random.randn.

Random(size, **kwargs)

Produces an array of random numbers of specified shape, wrapper for np.random.random.

Range(stop[, start, step])

Creates a range array in the row specified by output_name, wrapper for np.arange.

Reshape(newshape[, order])

Rearranges the elements of an array into a different shape, wrapper for np.reshape.

Shift([periods])

Wrapper for the pd.Series.shift() function in pandas.

Sign(**kwargs)

Computes the sign of the elements, wrapper for np.sign.

SqrtTask(**kwargs)

Computes the square root, wrapper for np.sqrt.

StdTask(**kwargs)

Computes the standard deviation, wrapper for np.std.

Where(**kwargs)

Lists the indices of non-zero entries in data rows, wrapper for np.where.
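
Since the wrapper tasks delegate to NumPy, their behavior on a single data row can be previewed by calling the wrapped functions directly. The sketch below does only that and does not use the dqa classes; how each task passes its arguments through to NumPy is not shown here.

```python
import numpy as np

row = np.array([0.0, 3.0, 0.0, -1.0, 2.0])

# np.where with a single argument returns a tuple of index arrays,
# one per dimension; element 0 holds the indices for a 1-D row.
nonzero_idx = np.where(row)[0]

running = np.cumsum(row)                # cumulative sum
signs = np.sign(row)                    # -1, 0 or 1 per element
column = np.expand_dims(row, axis=1)    # shape (5,) -> (5, 1)
```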