tsfel.feature_extraction package

Submodules

tsfel.feature_extraction.calc_features module

tsfel.feature_extraction.calc_features.calc_window_features(config, window, fs, verbose=1, single_window=False, **kwargs)[source]

Extract features from a univariate or multivariate window.

Parameters:
  • config (dict) – A dictionary containing the settings for feature extraction.

  • window (np.ndarray, pd.DataFrame, pd.Series) – The input signal from which features will be extracted.

  • fs (float, default=None) – Sampling frequency of the input signal.

  • verbose (int, default=1) – The verbosity mode. 0 means silent, and 1 means showing a progress bar.

  • single_window (bool) – If True, the progress bar will be shown only for the extraction of features from a single window.

  • **kwargs

    Additional keyword arguments, see below:

    • features_path (str) –

      Path to a script with custom features.

    • header_names (list or array-like) –

      Names of the columns in the window.

Returns:

A DataFrame containing the extracted features.

Return type:

pd.DataFrame

tsfel.feature_extraction.calc_features.dataset_features_extractor(main_directory, feat_dict, verbose=1, **kwargs)[source]

Extracts features from a dataset.

Parameters:
  • main_directory (String) – Input directory

  • feat_dict (dict) – Dictionary with features

  • verbose (int) – Verbosity mode: 0 = silent, 1 = progress bar (default: 1)

  • **kwargs

    Additional keyword arguments, see below:

    • search_criteria (list) –

      List of file names from which to compute features, e.g. ‘Accelerometer.txt’ (default: None)

    • time_unit (float) –

      Time unit (default: 1e9)

    • resampling_rate (int) –

      Resampling rate (default: 100)

    • window_size (int) –

      Window size in number of samples (default: 100)

    • overlap (float) –

      Overlap between 0 and 1 (default: 0)

    • pre_process (function) –

      Function with pre-processing code (default: None)

    • output_directory (String) –

      Output directory (default: str(Path.home()) + '/tsfel_output')

    • features_path (string) –

      Path to a script with custom features

    • header_names (list or array) –

      Names of the columns in the window

    • n_jobs (int) –

      The number of jobs to run in parallel. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. (default: None in Windows and -1 for other systems)

Returns:

csv file with the extracted features

Return type:

file

tsfel.feature_extraction.calc_features.time_series_features_extractor(config, timeseries, fs=None, window_size=None, overlap=0, verbose=1, **kwargs)[source]

Extract features from univariate or multivariate time series.

Parameters:
  • config (dict) – A dictionary containing the settings for feature extraction.

  • timeseries (list, np.ndarray, pd.DataFrame, pd.Series) – The input signal from which features will be extracted.

  • fs (float, default=None) – Sampling frequency of the input signal.

  • window_size (int or None, optional, default=None) – The size of the windows used to split the input signal, measured in the number of samples.

  • overlap (float, optional, default=0) – A value between 0 and 1 that defines the percentage of overlap between consecutive windows.

  • n_jobs (int, optional) –

    The number of jobs to run in parallel.
    • None means 1 unless in a joblib.parallel_backend context.

    • -1 means using all available processors.

    • default: None on Windows, -1 for other systems

  • verbose (int, default=1) – The verbosity mode. 0 means silent, and 1 means showing a progress bar.

  • **kwargs

    Additional keyword arguments, see below:

    • features_path (str) –

      Path to a script with custom features.

    • header_names (list or array-like) –

      Names of the columns in the window.

Returns:

A DataFrame containing the extracted features, where:
  • Columns represent the names of the features.

  • Rows contain the feature values for each signal window.

Return type:

pd.DataFrame
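
How window_size and overlap interact can be pictured with a small pure-Python sketch. This is not TSFEL's implementation: the helper name split_windows and the way the step size is rounded are assumptions made only for illustration.

```python
def split_windows(samples, window_size, overlap=0.0):
    """Split a sequence into fixed-size windows (hypothetical helper, not part of TSFEL)."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1) for this sketch")
    # Assumed rounding: consecutive windows start `step` samples apart.
    step = max(1, int(window_size * (1 - overlap)))
    return [
        samples[start:start + window_size]
        for start in range(0, len(samples) - window_size + 1, step)
    ]


signal = list(range(10))
windows = split_windows(signal, window_size=4, overlap=0.5)
no_overlap = split_windows(signal, window_size=5)
```

With overlap=0.5 each window shares half of its samples with the previous one, so the same signal yields more rows in the resulting feature table than with overlap=0.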

tsfel.feature_extraction.features module

tsfel.feature_extraction.features.abs_energy(signal)[source]

Computes the absolute energy of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which the absolute energy is computed

Returns:

Absolute energy

Return type:

float
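
As a rough illustration, absolute energy is commonly defined as the sum of the squared sample values. The sketch below assumes that definition; the function name abs_energy_sketch is hypothetical and the code is not taken from the library.

```python
def abs_energy_sketch(signal):
    """Sum of squared samples (assumed definition of absolute energy)."""
    return float(sum(x * x for x in signal))


energy = abs_energy_sketch([1.0, -2.0, 3.0])
```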

tsfel.feature_extraction.features.auc(signal, fs)[source]

Computes the area under the curve of the signal using the trapezoid rule.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which the area under the curve is computed

  • fs (float) – Sampling Frequency

Returns:

The area under the curve value

Return type:

float
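
A minimal sketch of the trapezoid rule for a uniformly sampled signal, assuming the sample spacing is 1 / fs. The name auc_sketch is hypothetical, and the library may treat negative samples differently (for example by taking absolute values), so this only illustrates the rule itself.

```python
def auc_sketch(signal, fs):
    """Trapezoid-rule area of a uniformly sampled signal (illustrative only)."""
    dt = 1.0 / fs  # time between consecutive samples, in seconds
    return sum(0.5 * dt * (a + b) for a, b in zip(signal, signal[1:]))


area = auc_sketch([0.0, 1.0, 2.0, 3.0], fs=2.0)
```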

tsfel.feature_extraction.features.autocorr(signal)[source]

Calculates the first 1/e crossing of the autocorrelation function (ACF). The adjusted ACF is calculated using statsmodels.tsa.stattools.acf. Following the recommendations for long time series (size > 450), the FFT convolution is used. This feature measures the first time lag at which the autocorrelation function drops below 1/e (≈ 0.3679).

Feature computational cost: 2

Parameters:

signal (nd-array) – Input from which autocorrelation is computed

Returns:

The first time lag at which the ACF drops below 1/e (≈ 0.3679).

Return type:

int
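
The idea of a "first 1/e crossing" can be sketched without statsmodels. The code below uses a plain (unadjusted) sample autocorrelation computed directly in the time domain, which is an assumption made for simplicity; the function name first_1e_crossing is hypothetical.

```python
import math


def first_1e_crossing(signal):
    """First lag at which the sample ACF drops below 1/e (illustrative only)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    var = sum(c * c for c in centered)
    if var == 0:
        return None  # constant signal: the ACF is undefined
    threshold = 1.0 / math.e
    for lag in range(1, n):
        acf = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / var
        if acf < threshold:
            return lag
    return None  # the ACF never dropped below 1/e


# A sine wave with a period of 20 samples, observed for 5 full periods.
lag = first_1e_crossing([math.sin(2 * math.pi * t / 20) for t in range(100)])
```

A slowly varying signal stays correlated with itself for longer, so it produces a larger lag than a noisy one.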

tsfel.feature_extraction.features.average_power(signal, fs)[source]

Computes the average power of the signal.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Signal from which average power is computed

  • fs (float) – Sampling frequency

Returns:

Average power

Return type:

float

tsfel.feature_extraction.features.calc_centroid(signal, fs)[source]

Computes the centroid along the time axis.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which centroid is computed

  • fs (int) – Signal sampling frequency

Returns:

Temporal centroid

Return type:

float

tsfel.feature_extraction.features.calc_max(signal)[source]

Computes the maximum value of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which max is computed

Returns:

Maximum result

Return type:

float

tsfel.feature_extraction.features.calc_mean(signal)[source]

Computes mean value of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which mean is computed.

Returns:

Mean result

Return type:

float

tsfel.feature_extraction.features.calc_median(signal)[source]

Computes median of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which median is computed

Returns:

Median result

Return type:

float

tsfel.feature_extraction.features.calc_min(signal)[source]

Computes the minimum value of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which min is computed

Returns:

Minimum result

Return type:

float

tsfel.feature_extraction.features.calc_std(signal)[source]

Computes standard deviation (std) of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which std is computed

Returns:

Standard deviation result

Return type:

float

tsfel.feature_extraction.features.calc_var(signal)[source]

Computes variance of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which var is computed

Returns:

Variance result

Return type:

float

tsfel.feature_extraction.features.dfa(signal)[source]

Computes the Detrended Fluctuation Analysis (DFA) of the signal.

Parameters:

signal (np.ndarray) – Input signal.

Returns:

alpha_dfa – Scaling exponent in DFA.

Return type:

float

tsfel.feature_extraction.features.distance(signal)[source]

Computes signal traveled distance.

Calculates the total distance traveled by the signal using the hypotenuse between 2 datapoints.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which distance is computed

Returns:

Signal distance

Return type:

float
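
A small sketch of the hypotenuse idea, assuming consecutive samples are one unit apart on the horizontal axis. The name distance_sketch is hypothetical.

```python
import math


def distance_sketch(signal):
    """Sum of hypotenuses between consecutive samples, assuming unit spacing."""
    return sum(math.hypot(1.0, b - a) for a, b in zip(signal, signal[1:]))


total = distance_sketch([0.0, 0.0, 1.0])
```

A flat segment contributes exactly 1 per step, while a unit jump contributes the square root of 2.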

tsfel.feature_extraction.features.ecdf(signal, d=10)[source]

Computes the values of ECDF (empirical cumulative distribution function) along the time axis.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which ECDF is computed

  • d (integer) – Number of ECDF values to return

Returns:

The values of the ECDF along the time axis

Return type:

float

tsfel.feature_extraction.features.ecdf_percentile(signal, percentile=None)[source]

Computes the percentile value of the ECDF.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which ECDF is computed

  • percentile (list) – Percentile value to be computed

Returns:

The input value(s) of the ECDF

Return type:

float

tsfel.feature_extraction.features.ecdf_percentile_count(signal, percentile=None)[source]

Computes the cumulative sum of samples that are less than the percentile.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which ECDF is computed

  • percentile (list) – Percentile threshold

Returns:

The cumulative sum of samples

Return type:

float

tsfel.feature_extraction.features.ecdf_slope(signal, p_init=0.5, p_end=0.75)[source]

Computes the slope of the ECDF between two percentiles. Possibility to return infinity values.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which ECDF is computed

  • p_init (float) – Initial percentile

  • p_end (float) – End percentile

Returns:

The slope of the ECDF between two percentiles

Return type:

float

tsfel.feature_extraction.features.entropy(signal, prob='standard')[source]

Computes the entropy of the signal using the Shannon Entropy.

Description in article: “Regularities Unseen, Randomness Observed: Levels of Entropy Convergence”. Authors: Crutchfield J., Feldman D.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which entropy is computed

  • prob (string) – Probability function (kde or gaussian functions are available)

Returns:

The normalized entropy value

Return type:

float

tsfel.feature_extraction.features.fundamental_frequency(signal, fs)[source]

Computes fundamental frequency of the signal.

The fundamental frequency is the frequency whose integer multiples best explain the content of the signal spectrum.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which fundamental frequency is computed

  • fs (float) – Sampling frequency

Returns:

f0 – Predominant frequency of the signal

Return type:

float

tsfel.feature_extraction.features.higuchi_fractal_dimension(signal)[source]

Computes the fractal dimension of a signal using Higuchi’s method (HFD).

Parameters:

signal (np.ndarray) – Input signal.

Returns:

hfd – Fractal dimension.

Return type:

float

tsfel.feature_extraction.features.hist_mode(signal, nbins=10)[source]

Compute the mode of a histogram using a given number of (linearly spaced) bins.

Feature computational cost: 1

Parameters:
  • signal (np.ndarray) – Input signal from which the histogram is computed.

  • nbins (int) – The number of equal-width bins in the given range, by default 10.

Returns:

The mode of the histogram (the midpoint of the bin with the highest count).

Return type:

float
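
The "midpoint of the bin with the highest count" can be sketched in pure Python. The helper name hist_mode_sketch, the handling of the right edge, and the tie-breaking rule are assumptions for illustration and may differ from the library.

```python
def hist_mode_sketch(signal, nbins=10):
    """Midpoint of the fullest equal-width bin (illustrative only)."""
    lo, hi = min(signal), max(signal)
    if lo == hi:
        return float(lo)  # degenerate case: every sample is identical
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for x in signal:
        # The maximum value is placed in the last bin (closed right edge).
        idx = min(int((x - lo) / width), nbins - 1)
        counts[idx] += 1
    best = counts.index(max(counts))  # ties resolve to the lowest bin
    return lo + (best + 0.5) * width


mode = hist_mode_sketch([0.0, 1.0, 1.0, 1.2, 4.0], nbins=4)
```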

tsfel.feature_extraction.features.human_range_energy(signal, fs)[source]

Computes the human range energy ratio.

The human range energy ratio is given by the ratio between the energy in frequency 0.6-2.5Hz and the whole energy band.

Feature computational cost: 2

Parameters:
  • signal (nd-array) – Signal from which human range energy ratio is computed

  • fs (float) – Sampling frequency

Returns:

Human range energy ratio

Return type:

float

tsfel.feature_extraction.features.hurst_exponent(signal)[source]

Computes the Hurst exponent of the signal through the Rescaled range (R/S) analysis.

Parameters:

signal (np.ndarray) – Input signal.

Returns:

h_exp – Hurst exponent.

Return type:

float

tsfel.feature_extraction.features.interq_range(signal)[source]

Computes interquartile range of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which interquartile range is computed

Returns:

Interquartile range result

Return type:

float

tsfel.feature_extraction.features.kurtosis(signal)[source]

Computes kurtosis of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which kurtosis is computed

Returns:

Kurtosis result

Return type:

float

tsfel.feature_extraction.features.lempel_ziv(signal, threshold=None)[source]

Computes the Lempel-Ziv’s (LZ) complexity index, normalized by the signal’s length.

Parameters:
  • signal (np.ndarray) – Input signal.

  • threshold (float, optional) – Amplitude threshold for the binarisation. If None, the mean of the signal is used.

Returns:

lz_index – Lempel-Ziv complexity index

Return type:

float

tsfel.feature_extraction.features.lpcc(signal, n_coeff=12)[source]

Computes the linear prediction cepstral coefficients.

Implementation details and description in: http://www.practicalcryptography.com/miscellaneous/machine-learning/tutorial-cepstrum-and-lpccs/

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which the linear prediction cepstral coefficients are computed

  • n_coeff (int) – Number of coefficients

Returns:

Linear prediction cepstral coefficients

Return type:

nd-array

tsfel.feature_extraction.features.max_frequency(signal, fs)[source]

Computes maximum frequency of the signal.

Feature computational cost: 2

Parameters:
  • signal (nd-array) – Input from which maximum frequency is computed

  • fs (float) – Sampling frequency

Returns:

Frequency below which 95% of the cumulative spectrum magnitude is contained

Return type:

float

tsfel.feature_extraction.features.max_power_spectrum(signal, fs)[source]

Computes maximum power spectrum density of the signal.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which maximum power spectrum is computed

  • fs (float) – Sampling frequency

Returns:

Max value of the power spectrum density

Return type:

nd-array

tsfel.feature_extraction.features.maximum_fractal_length(signal)[source]

Computes the Maximum Fractal Length (MFL) of the signal, which is the average length at the smallest scale, measured from the logarithmic plot determining FD. The Higuchi’s method is used.

Parameters:

signal (np.ndarray) – Input signal.

Returns:

mfl – Maximum Fractal Length.

Return type:

float

tsfel.feature_extraction.features.mean_abs_deviation(signal)[source]

Computes mean absolute deviation of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which mean absolute deviation is computed

Returns:

Mean absolute deviation result

Return type:

float

tsfel.feature_extraction.features.mean_abs_diff(signal)[source]

Computes mean absolute differences of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which mean absolute difference is computed

Returns:

Mean absolute difference result

Return type:

float

tsfel.feature_extraction.features.mean_diff(signal)[source]

Computes mean of differences of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which mean of differences is computed

Returns:

Mean difference result

Return type:

float

tsfel.feature_extraction.features.median_abs_deviation(signal)[source]

Computes median absolute deviation of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which median absolute deviation is computed

Returns:

Median absolute deviation result

Return type:

float

tsfel.feature_extraction.features.median_abs_diff(signal)[source]

Computes median absolute differences of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which median absolute difference is computed

Returns:

Median absolute difference result

Return type:

float

tsfel.feature_extraction.features.median_diff(signal)[source]

Computes median of differences of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which median of differences is computed

Returns:

Median difference result

Return type:

float
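
The four difference-based features above (mean_abs_diff, mean_diff, median_abs_diff, median_diff) all start from the first-order differences of the signal. The sketch below shows that shared step using only the standard library; the helper name diff_features is hypothetical and the code is not the library's implementation.

```python
import statistics


def diff_features(signal):
    """Summaries of consecutive-sample differences (illustrative only)."""
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    abs_diffs = [abs(d) for d in diffs]
    return {
        "mean_abs_diff": statistics.fmean(abs_diffs),
        "mean_diff": statistics.fmean(diffs),
        "median_abs_diff": statistics.median(abs_diffs),
        "median_diff": statistics.median(diffs),
    }


features = diff_features([1.0, 4.0, 2.0, 2.0])
```

The signed variants can cancel out (a rise followed by a fall), whereas the absolute variants measure how much the signal moves regardless of direction.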

tsfel.feature_extraction.features.median_frequency(signal, fs)[source]

Computes median frequency of the signal.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which median frequency is computed

  • fs (int) – Sampling frequency

Returns:

f_median – Frequency below which 50% of the cumulative spectrum magnitude is contained.

Return type:

int

tsfel.feature_extraction.features.mfcc(signal, fs, pre_emphasis=0.97, nfft=512, nfilt=40, num_ceps=12, cep_lifter=22)[source]

Computes the MEL cepstral coefficients.

It provides the information about the power in each frequency band.

Implementation details and description in: https://www.kaggle.com/ilyamich/mfcc-implementation-and-tutorial and https://haythamfayek.com/2016/04/21/speech-processing-for-machine-learning.html#fnref:1

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which MEL coefficients are computed

  • fs (float) – Sampling frequency

  • pre_emphasis (float) – Pre-emphasis coefficient for pre-emphasis filter application

  • nfft (int) – Number of points of fft

  • nfilt (int) – Number of filters

  • num_ceps (int) – Number of cepstral coefficients

  • cep_lifter (int) – Filter length

Returns:

MEL cepstral coefficients

Return type:

nd-array

tsfel.feature_extraction.features.mse(signal, m=3, maxscale=None, tolerance=None)[source]

Computes the multiscale entropy (MSE) of the signal, which performs the entropy analysis over multiple time scales.

Parameters:
  • signal (np.ndarray) – Input signal.

  • m (int) – Embedding dimension for the sample entropy, defaults to 3.

  • maxscale (int) – Maximum scale factor, defaults to 1/13 of the length of the input signal.

  • tolerance (float) – Tolerance value, defaults to 0.2 times the standard deviation of the input signal.

Returns:

mse_area – Normalized area under the MSE curve.

Return type:

np.ndarray

tsfel.feature_extraction.features.negative_turning(signal)[source]

Computes number of negative turning points of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which negative turning points are counted

Returns:

Number of negative turning points

Return type:

float

tsfel.feature_extraction.features.neighbourhood_peaks(signal, n=10)[source]

Computes the number of peaks from a defined neighbourhood of the signal.

Reference: Christ, M., Braun, N., Neuffer, J. and Kempa-Liehr A.W. (2018). Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh – A Python package). Neurocomputing 307 (2018) 72-77

Parameters:
  • signal (nd-array) – Input from which the number of neighbourhood peaks is computed

  • n (int) – Number of peak’s neighbours to the left and to the right

Returns:

The number of peaks from a defined neighbourhood of the signal

Return type:

int

tsfel.feature_extraction.features.petrosian_fractal_dimension(signal)[source]

Computes the Petrosian Fractal Dimension of a signal.

Parameters:

signal (np.ndarray) – Input signal.

Returns:

pfd – Petrosian Fractal Dimension.

Return type:

float

tsfel.feature_extraction.features.pk_pk_distance(signal)[source]

Computes the peak to peak distance.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which peak to peak is computed

Returns:

Peak-to-peak distance

Return type:

float

tsfel.feature_extraction.features.positive_turning(signal)[source]

Computes number of positive turning points of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which positive turning points are counted

Returns:

Number of positive turning points

Return type:

float
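
Turning points can be counted from sign changes in consecutive differences. The sketch below assumes that a positive turning point is a rise followed by a fall (a local maximum) and a negative turning point is a fall followed by a rise (a local minimum); that reading, and the helper name count_turning_points, are assumptions rather than statements about the library.

```python
def count_turning_points(signal):
    """Return (positive, negative) turning point counts under the assumed definition."""
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    pairs = list(zip(diffs, diffs[1:]))
    positive = sum(1 for d0, d1 in pairs if d0 > 0 and d1 < 0)  # rise then fall
    negative = sum(1 for d0, d1 in pairs if d0 < 0 and d1 > 0)  # fall then rise
    return positive, negative


pos, neg = count_turning_points([0, 2, 1, 3, 0, 1])
```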

tsfel.feature_extraction.features.power_bandwidth(signal, fs)[source]

Computes power spectrum density bandwidth of the signal.

It corresponds to the width of the frequency band in which 95% of its power is located.

Description in article: Power Spectrum and Bandwidth Ulf Henriksson, 2003 Translated by Mikael Olofsson, 2005

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which the power bandwidth is computed

  • fs (float) – Sampling frequency

Returns:

Occupied power in bandwidth

Return type:

float

tsfel.feature_extraction.features.rms(signal)[source]

Computes root mean square of the signal.

Square root of the arithmetic mean (average) of the squares of the original values.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which root mean square is computed

Returns:

Root mean square

Return type:

float
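
The definition above translates directly into a few lines of Python. The name rms_sketch is hypothetical; this is a sketch of the formula, not the library's code.

```python
import math


def rms_sketch(signal):
    """Square root of the mean of the squared samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


value = rms_sketch([1.0, -1.0, 1.0, -1.0])
```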

tsfel.feature_extraction.features.skewness(signal)[source]

Computes skewness of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which skewness is computed

Returns:

Skewness result

Return type:

float

tsfel.feature_extraction.features.slope(signal)[source]

Computes the slope of the signal.

Slope is computed by fitting a linear equation to the observed data.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which linear equation is computed

Returns:

Slope

Return type:

float
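
A minimal least-squares sketch, assuming the signal is regressed against its sample index (0, 1, 2, ...). The name slope_sketch is hypothetical, and at least two samples are required.

```python
def slope_sketch(signal):
    """Least-squares slope of the signal against its sample index (illustrative only)."""
    n = len(signal)
    t_mean = (n - 1) / 2
    y_mean = sum(signal) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(signal))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


m = slope_sketch([1.0, 3.0, 5.0, 7.0])
```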

tsfel.feature_extraction.features.spectral_centroid(signal, fs)[source]

Barycenter of the spectrum.

Description and formula in Article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 2

Parameters:
  • signal (nd-array) – Signal from which spectral centroid is computed

  • fs (int) – Sampling frequency

Returns:

Centroid

Return type:

float

tsfel.feature_extraction.features.spectral_decrease(signal, fs)[source]

Represents the amount of decrease of the spectral amplitude.

Description and formula in Article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Signal from which spectral decrease is computed

  • fs (float) – Sampling frequency

Returns:

Spectral decrease

Return type:

float

tsfel.feature_extraction.features.spectral_distance(signal, fs)[source]

Computes the signal spectral distance.

Distance of the signal’s cumulative sum of the FFT elements to the respective linear regression.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Signal from which spectral distance is computed

  • fs (float) – Sampling frequency

Returns:

Spectral distance

Return type:

float

tsfel.feature_extraction.features.spectral_entropy(signal, fs)[source]

Computes the spectral entropy of the signal based on Fourier transform.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which spectral entropy is computed

  • fs (float) – Sampling frequency

Returns:

The normalized spectral entropy value

Return type:

float

tsfel.feature_extraction.features.spectral_kurtosis(signal, fs)[source]

Measures the flatness of a distribution around its mean value.

Description and formula in Article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 2

Parameters:
  • signal (nd-array) – Signal from which spectral kurtosis is computed

  • fs (float) – Sampling frequency

Returns:

Spectral Kurtosis

Return type:

float

tsfel.feature_extraction.features.spectral_positive_turning(signal, fs)[source]

Computes number of positive turning points of the fft magnitude signal.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Input from which the number of positive turning points of the fft magnitude is computed

  • fs (float) – Sampling frequency

Returns:

Number of positive turning points

Return type:

float

tsfel.feature_extraction.features.spectral_roll_off(signal, fs)[source]

Computes the spectral roll-off of the signal.

The spectral roll-off corresponds to the frequency below which 95% of the signal magnitude is contained.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Signal from which spectral roll-off is computed

  • fs (float) – Sampling frequency

Returns:

Spectral roll-off

Return type:

float
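
The roll-off idea can be sketched with a naive discrete Fourier transform from the standard library. The helper name roll_off_sketch, the use of a one-sided magnitude spectrum, and the "first bin at or above the target" rule are assumptions for illustration; the library's spectrum computation may differ.

```python
import cmath
import math


def roll_off_sketch(signal, fs, fraction=0.95):
    """Frequency at which the cumulative magnitude reaches `fraction` of the total."""
    n = len(signal)
    half = n // 2 + 1  # keep non-negative frequencies only
    mags = [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)))
        for k in range(half)
    ]
    freqs = [k * fs / n for k in range(half)]
    target = fraction * sum(mags)
    running = 0.0
    for f, m in zip(freqs, mags):
        running += m
        if running >= target:
            return f
    return freqs[-1]


fs = 64.0
tone = [math.sin(2 * math.pi * 8.0 * i / fs) for i in range(64)]
f95 = roll_off_sketch(tone, fs)
```

For a pure 8 Hz tone nearly all of the magnitude sits in a single bin, so the roll-off lands on that bin. Calling the same helper with fraction=0.05 gives the analogous roll-on value.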

tsfel.feature_extraction.features.spectral_roll_on(signal, fs)[source]

Computes the spectral roll-on of the signal.

The spectral roll-on corresponds to the frequency below which 5% of the signal magnitude is contained.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Signal from which spectral roll-on is computed

  • fs (float) – Sampling frequency

Returns:

Spectral roll-on

Return type:

float

tsfel.feature_extraction.features.spectral_skewness(signal, fs)[source]

Measures the asymmetry of a distribution around its mean value.

Description and formula in Article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 2

Parameters:
  • signal (nd-array) – Signal from which spectral skewness is computed

  • fs (float) – Sampling frequency

Returns:

Spectral Skewness

Return type:

float

tsfel.feature_extraction.features.spectral_slope(signal, fs)[source]

Computes the spectral slope.

Spectral slope is computed by finding constants m and b of the function aFFT = mf + b, obtained by linear regression of the spectral amplitude.

Description and formula in Article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Signal from which spectral slope is computed

  • fs (float) – Sampling frequency

Returns:

Spectral Slope

Return type:

float

tsfel.feature_extraction.features.spectral_spread(signal, fs)[source]

Measures the spread of the spectrum around its mean value.

Description and formula in Article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 2

Parameters:
  • signal (nd-array) – Signal from which spectral spread is computed.

  • fs (float) – Sampling frequency

Returns:

Spectral Spread

Return type:

float

tsfel.feature_extraction.features.spectral_variation(signal, fs)[source]

Computes the amount of variation of the spectrum along time.

Spectral variation is computed from the normalized cross-correlation between two consecutive amplitude spectra.

Description and formula in Article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 1

Parameters:
  • signal (nd-array) – Signal from which spectral variation is computed.

  • fs (float) – Sampling frequency

Returns:

Spectral Variation

Return type:

float

tsfel.feature_extraction.features.spectrogram_mean_coeff(signal, fs, bins=32)[source]

Calculates the average power spectral density (PSD) for each frequency throughout the entire signal duration provided by the spectrogram.

The values represent the average power spectral density computed on frequency bins. The feature name refers to the frequency bin where the PSD was taken. Each bin is fs / (bins * 2 - 2) Hz wide. The method relies on scipy.signal.spectrogram and, except for nperseg and fs, all the other parameters are set to their defaults.

Feature computational cost: 1

Parameters:
  • signal (array_like) – Input from which the spectrogram average power spectral density coefficients are computed.

  • fs (float) – Sampling frequency of the signal.

  • bins (int, optional) – The number of frequency bins.

Returns:

The power spectral density for each frequency bin averaged along the entire signal duration.

Return type:

nd-array

Notes

The optimal number of frequency bins depends on the task at hand. Using a higher number of bins with low sampling frequencies may result in excessive frequency resolution and the loss of valuable coarse-grained information. The default value should be suitable for most cases when working with the default sampling frequency. The number of frequency bins must be modified in the feature configuration file.

Added in version 0.1.7.

tsfel.feature_extraction.features.sum_abs_diff(signal)[source]

Computes sum of absolute differences of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which sum absolute difference is computed

Returns:

Sum absolute difference result

Return type:

float

tsfel.feature_extraction.features.wavelet_abs_mean(signal, fs, wavelet='mexh', max_width=10)[source]

Computes CWT absolute mean value of each wavelet scale.

Parameters:
  • signal (nd-array) – Input from which CWT is computed

  • fs (int) – Signal sampling frequency

  • wavelet (string) – Wavelet to use, defaults to “mexh” which represents the mexican hat wavelet (Ricker wavelet)

  • max_width (int) – Maximum width to use for transformation, defaults to 10

Returns:

CWT absolute mean value

Return type:

nd-array

tsfel.feature_extraction.features.wavelet_energy(signal, fs, wavelet='mexh', max_width=10)[source]

Computes CWT energy of each wavelet scale.

Implementation details: https://stackoverflow.com/questions/37659422/energy-for-1-d-wavelet-in-python

Parameters:
  • signal (nd-array) – Input from which CWT is computed

  • fs (int) – Signal sampling frequency

  • wavelet (string) – Wavelet to use, defaults to “mexh” which represents the mexican hat wavelet (Ricker wavelet)

  • max_width (int) – Maximum width to use for transformation, defaults to 10

Returns:

CWT energy

Return type:

nd-array

tsfel.feature_extraction.features.wavelet_entropy(signal, fs, wavelet='mexh', max_width=10)[source]

Computes CWT entropy of the signal.

Implementation details in: https://dsp.stackexchange.com/questions/13055/how-to-calculate-cwt-shannon-entropy and B.F. Yan, A. Miyamoto, E. Bruhwiler, “Wavelet transform-based modal parameter identification considering uncertainty”

Parameters:
  • signal (nd-array) – Input from which CWT is computed

  • fs (int) – Signal sampling frequency

  • wavelet (string) – Wavelet to use, defaults to “mexh” which represents the mexican hat wavelet (Ricker wavelet)

  • max_width (int) – Maximum width to use for transformation, defaults to 10

Returns:

Wavelet entropy

Return type:

float

tsfel.feature_extraction.features.wavelet_std(signal, fs, wavelet='mexh', max_width=10)[source]

Computes CWT std value of each wavelet scale.

Parameters:
  • signal (nd-array) – Input from which CWT is computed

  • fs (int) – Signal sampling frequency

  • wavelet (string) – Wavelet to use, defaults to “mexh” which represents the mexican hat wavelet (Ricker wavelet)

  • max_width (int) – Maximum width to use for transformation, defaults to 10

Returns:

CWT std

Return type:

nd-array

tsfel.feature_extraction.features.wavelet_var(signal, fs, wavelet='mexh', max_width=10)[source]

Computes CWT variance value of each wavelet scale.

Parameters:
  • signal (nd-array) – Input from which CWT is computed

  • fs (int) – Signal sampling frequency

  • wavelet (string) – Wavelet to use, defaults to “mexh” which represents the mexican hat wavelet (Ricker wavelet)

  • max_width (int) – Maximum width to use for transformation, defaults to 10

Returns:

CWT variance

Return type:

nd-array

tsfel.feature_extraction.features.zero_cross(signal)[source]

Computes Zero-crossing rate of the signal.

Corresponds to the total number of times that the signal changes from positive to negative or vice versa.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which the zero-crossing rate is computed

Returns:

Number of times that signal value cross the zero axis

Return type:

int
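The counting rule described above can be sketched in plain Python as follows. The helper name count_zero_crossings is hypothetical, and the handling of exact zeros (a zero sample is treated as its own sign, so positive, zero, negative counts as two changes) is an assumption of this sketch.

```python
def count_zero_crossings(signal):
    """Count sign changes between consecutive samples."""

    def sign(x):
        # Returns 1 for positive, -1 for negative, and 0 for zero.
        return (x > 0) - (x < 0)

    signs = [sign(x) for x in signal]
    # A crossing is counted wherever the sign differs from the previous sample's sign.
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)
```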

tsfel.feature_extraction.features_settings module

tsfel.feature_extraction.features_settings.get_features_by_domain(domain=None, json_path=None)[source]

Creates a dictionary with the features settings by domain.

Parameters:
  • domain (str, list of str, or None, default=None) –

    Specifies which feature domains to include in the dictionary.
    • ‘statistical’, ‘temporal’, ‘spectral’, ‘fractal’: Includes the corresponding feature domain.

    • ‘all’: Includes all available feature domains.

    • list of str: A combination of the above strings, e.g., [‘statistical’, ‘temporal’].

    • None: By default, includes the ‘statistical’, ‘temporal’, and ‘spectral’ domains.

  • json_path (string) – Directory of json file. Default: package features.json directory

Returns:

Dictionary with the features settings

Return type:

Dict

tsfel.feature_extraction.features_settings.get_features_by_tag(tag=None, json_path=None)[source]

Creates a dictionary with the features settings by tag.

Parameters:
  • tag (string) – Available tags: “audio”, “inertial”, “ecg”, “eeg”, “emg”. If tag is None, all available features are returned.

  • json_path (string) – Directory of json file. Default: package features.json directory

Returns:

Dictionary with the features settings

Return type:

Dict

tsfel.feature_extraction.features_settings.get_number_features(dict_features)[source]

Count the total number of features based on input parameters of each feature.

Parameters:

dict_features (dict) – Dictionary with features settings

Returns:

Feature vector size

Return type:

int

tsfel.feature_extraction.features_settings.load_json(json_path)[source]

A convenience function that wraps json.load from the standard library. It is useful for loading customized feature configuration files.

Parameters:

json_path (file-like object, string, or pathlib.Path) – The json file to read.

Returns:

Data stored in the file.

Return type:

dict
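A minimal sketch of such a wrapper, using only the standard library. The name load_json_sketch, the file name, and the tiny configuration dictionary are hypothetical and serve only to show the round trip; this sketch accepts a path (string or pathlib.Path) and does not handle file-like objects.

```python
import json
import os
import tempfile


def load_json_sketch(json_path):
    """Read a JSON file and return its contents as Python objects."""
    with open(json_path, "r", encoding="utf-8") as handle:
        return json.load(handle)


# Usage: write a small hypothetical configuration, then read it back.
config = {"statistical": {"Mean": {"use": "yes"}}}
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "custom_features.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(config, handle)
    loaded = load_json_sketch(path)
```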

tsfel.feature_extraction.features_utils module

tsfel.feature_extraction.features_utils.autocorr_norm(signal)[source]

Computes the autocorrelation.

Implementation details and description in: https://ccrma.stanford.edu/~orchi/Documents/speaker_recognition_report.pdf

Parameters:

signal (nd-array) – Input from which the autocorrelation is computed

Returns:

Autocorrelation result

Return type:

nd-array

tsfel.feature_extraction.features_utils.calc_ecdf(signal)[source]

Computes the ECDF of the signal.

Parameters:

signal (nd-array) – Input from which ECDF is computed

Returns:

Sorted signal and computed ECDF.

Return type:

nd-array
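For illustration, an empirical CDF can be sketched in plain Python as below. The helper name ecdf_sketch is hypothetical, and the convention that the i-th sorted sample (counting from 1) maps to i/n is an assumption of this sketch.

```python
def ecdf_sketch(signal):
    """Return the sorted samples and their empirical cumulative probabilities."""
    sorted_signal = sorted(signal)
    n = len(sorted_signal)
    # The i-th smallest value (1-based) has cumulative probability i / n.
    ecdf_values = [(i + 1) / n for i in range(n)]
    return sorted_signal, ecdf_values
```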

tsfel.feature_extraction.features_utils.calc_fft(signal, fs)[source]

Computes the FFT of a signal.

Parameters:
  • signal (nd-array) – The input signal from which fft is computed

  • fs (float) – Sampling frequency

Returns:

  • f (nd-array) – Frequency values (x axis)

  • fmag (nd-array) – Amplitude of the frequency values (y axis)
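To show what the two outputs represent, the sketch below builds a one-sided magnitude spectrum with a naive O(N²) discrete Fourier transform from the standard library. It is not an FFT and not TSFEL's implementation; the helper name dft_magnitude and the choice to return bins 0 to N/2 are assumptions of this sketch.

```python
import cmath
import math


def dft_magnitude(signal, fs):
    """Return (frequencies, magnitudes) for the non-negative frequency bins."""
    n = len(signal)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        # Direct evaluation of the k-th DFT coefficient.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)
        )
        freqs.append(k * fs / n)  # bin index converted to Hz
        mags.append(abs(coeff))
    return freqs, mags


# Usage: a 10 Hz sine sampled at 100 Hz for one second.
fs = 100.0
tone = [math.sin(2 * math.pi * 10.0 * i / fs) for i in range(100)]
freqs, mags = dft_magnitude(tone, fs)
peak_frequency = freqs[mags.index(max(mags))]
```

The largest magnitude falls in the bin whose frequency matches the tone, which is how f and fmag are meant to be read together.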

tsfel.feature_extraction.features_utils.calc_lempel_ziv_complexity(sequence)[source]

Manual implementation of the Lempel-Ziv complexity.

It is defined as the number of different substrings encountered as the stream is viewed from beginning to end.

Reference: https://github.com/Naereen/Lempel-Ziv_Complexity/blob/master/src/lempel_ziv_complexity.py

Parameters:

sequence (string) – Binarised signal, as a string of characters

Returns:

LZ index
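The definition above can be sketched as a single pass that grows a candidate substring until it has not been seen before. This follows the approach of the referenced implementation as far as the description goes; the helper name lz_complexity_sketch is hypothetical.

```python
def lz_complexity_sketch(sequence):
    """Count distinct substrings found while scanning the sequence left to right."""
    seen = set()
    start, length = 0, 1
    while start + length <= len(sequence):
        candidate = sequence[start:start + length]
        if candidate in seen:
            # Already known: extend the candidate by one character.
            length += 1
        else:
            # New substring: record it and restart just after it.
            seen.add(candidate)
            start += length
            length = 1
    return len(seen)
```

For the binary string "1001111011000010" the scan yields the substrings 1, 0, 01, 11, 10, 110, 00, 010, giving a complexity of 8.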

tsfel.feature_extraction.features_utils.calc_lengths_higuchi(signal)[source]

Computes the curve lengths for different subdivisions, using Higuchi’s method.

Parameters:

signal (np.ndarray) – Input signal.

Returns:

lk – Length of curve for different subdivisions

Return type:

nd-array

tsfel.feature_extraction.features_utils.calc_rms(signal, window)[source]

Windowed Root Mean Square (RMS) with linear detrending.

Parameters:
  • signal (nd-array) – Signal

  • window (int) – Length of the window in which RMS will be calculated

Returns:

rms – RMS data in each window with length len(signal)//window

Return type:

nd-array
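A plain-Python sketch of windowed RMS with linear detrending, under stated assumptions: the signal is cut into len(signal)//window non-overlapping windows, a least-squares line is fitted and removed within each window, and trailing samples that do not fill a window are dropped. The helper name windowed_rms_detrended is hypothetical.

```python
import math


def windowed_rms_detrended(signal, window):
    """RMS of the linearly detrended signal in each non-overlapping window."""
    n_windows = len(signal) // window
    xs = list(range(window))
    mean_x = sum(xs) / window
    var_x = sum((x - mean_x) ** 2 for x in xs)
    rms_values = []
    for w in range(n_windows):
        segment = signal[w * window:(w + 1) * window]
        mean_y = sum(segment) / window
        # Closed-form least-squares slope and intercept for this window.
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, segment))
        slope = cov_xy / var_x if var_x else 0.0
        intercept = mean_y - slope * mean_x
        residuals = [y - (slope * x + intercept) for x, y in zip(xs, segment)]
        rms_values.append(math.sqrt(sum(r * r for r in residuals) / window))
    return rms_values
```

A purely linear signal leaves no residual after detrending, so every window's RMS is (numerically) zero.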

tsfel.feature_extraction.features_utils.coarse_graining(signal, scale)[source]

Applies a coarse-graining process to a time series: for a given scale factor, it splits the signal into non-overlapping windows and averages the data points.

Parameters:
  • signal (np.ndarray) – Input signal.

  • scale (int) – Scale factor, determines the length of the non-overlapping windows.

Returns:

coarsegrained_signal – Coarse-grained signal.

Return type:

np.ndarray
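The coarse-graining step described above can be sketched as follows. The helper name coarse_grain is hypothetical, and dropping trailing samples that do not fill a complete window is an assumption of this sketch.

```python
def coarse_grain(signal, scale):
    """Average the signal over consecutive non-overlapping windows of length scale."""
    n_windows = len(signal) // scale
    return [
        sum(signal[i * scale:(i + 1) * scale]) / scale for i in range(n_windows)
    ]
```

For example, with scale 2 the signal [1, 2, 3, 4, 5, 6, 7] becomes [1.5, 3.5, 5.5]; the last sample is discarded.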

tsfel.feature_extraction.features_utils.compute_rs(signal, lag)[source]

Computes the average rescaled range for a window of length lag.

Parameters:
  • signal (np.ndarray) – Input signal.

  • lag (int) – Window length.

Returns:

Average R/S.

Return type:

float
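A sketch of the rescaled-range computation under stated assumptions: the signal is split into non-overlapping windows of length lag; in each window the range of the cumulative mean-adjusted series is divided by the population standard deviation; the ratios are then averaged. The helper name average_rescaled_range, the choice of population standard deviation, and skipping constant windows are all assumptions of this sketch.

```python
import math


def average_rescaled_range(signal, lag):
    """Average R/S over non-overlapping windows of length lag."""
    n_windows = len(signal) // lag
    ratios = []
    for w in range(n_windows):
        segment = signal[w * lag:(w + 1) * lag]
        mean = sum(segment) / lag
        deviations = [x - mean for x in segment]
        # Cumulative sum of the mean-adjusted series.
        cumulative, running = [], 0.0
        for d in deviations:
            running += d
            cumulative.append(running)
        value_range = max(cumulative) - min(cumulative)
        std = math.sqrt(sum(d * d for d in deviations) / lag)
        if std > 0:  # constant windows have no defined R/S and are skipped
            ratios.append(value_range / std)
    return sum(ratios) / len(ratios) if ratios else float("nan")
```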

tsfel.feature_extraction.features_utils.compute_time(signal, fs)[source]

Creates the time array corresponding to the signal.

Parameters:
  • signal (nd-array) – Input from which the time is computed.

  • fs (int) – Sampling Frequency

Returns:

time – Signal time

Return type:

float list
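As a small sketch, the time axis for a uniformly sampled signal can be built by dividing each sample index by the sampling frequency. The helper name time_axis is hypothetical, and starting the axis at zero is an assumption of this sketch.

```python
def time_axis(signal, fs):
    """Return the time in seconds of each sample, starting at zero."""
    return [i / fs for i in range(len(signal))]
```

With four samples at fs = 2 Hz this gives [0.0, 0.5, 1.0, 1.5].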

tsfel.feature_extraction.features_utils.continuous_wavelet_transform(signal, fs, wavelet='mexh', widths=array([1, 2, 3, 4, 5, 6, 7, 8, 9]))[source]

Computes CWT (continuous wavelet transform) of the signal.

Parameters:
  • signal (nd-array) – Input from which CWT is computed

  • fs (int) – Signal sampling frequency

  • wavelet (string) – Wavelet to use; defaults to “mexh”, the Mexican hat (Ricker) wavelet

  • widths (nd-array) – Widths to use for the transformation. Default: np.arange(1, 10)

Returns:

The result of the CWT along the time axis, as a matrix of size (len(widths), len(signal))

Return type:

nd-array

tsfel.feature_extraction.features_utils.create_symmetric_matrix(acf, order=11)[source]

Computes a symmetric matrix.

Implementation details and description in: https://ccrma.stanford.edu/~orchi/Documents/speaker_recognition_report.pdf

Parameters:
  • acf (nd-array) – Input from which a symmetric matrix is computed

  • order (int) – Order

Returns:

Symmetric Matrix

Return type:

nd-array
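In the autocorrelation method of linear prediction, the symmetric matrix is typically a Toeplitz matrix whose entry (i, j) is the autocorrelation at lag |i - j|. The sketch below assumes that construction; the helper name toeplitz_from_acf is hypothetical.

```python
def toeplitz_from_acf(acf, order):
    """Build an order x order symmetric Toeplitz matrix from autocorrelation values."""
    # Entry (i, j) depends only on the lag |i - j|, which makes the matrix symmetric.
    return [[acf[abs(i - j)] for j in range(order)] for i in range(order)]
```

The input must contain at least order autocorrelation values (lags 0 to order - 1).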

tsfel.feature_extraction.features_utils.create_xx(features)[source]

Computes the range of feature amplitudes used to calculate the probability density function.

Parameters:

features (nd-array) – Input features

Returns:

range of features amplitude

Return type:

nd-array

tsfel.feature_extraction.features_utils.filterbank(signal, fs, pre_emphasis=0.97, nfft=512, nfilt=40)[source]

Computes the MEL-spaced filterbank.

It provides the information about the power in each frequency band.

Implementation details and description in: https://www.kaggle.com/ilyamich/mfcc-implementation-and-tutorial and https://haythamfayek.com/2016/04/21/speech-processing-for-machine-learning.html#fnref:1

Parameters:
  • signal (nd-array) – Input from which filterbank is computed

  • fs (float) – Sampling frequency

  • pre_emphasis (float) – Pre-emphasis coefficient for pre-emphasis filter application

  • nfft (int) – Number of points of fft

  • nfilt (int) – Number of filters

Returns:

MEL-spaced filterbank

Return type:

nd-array

tsfel.feature_extraction.features_utils.find_plateau(y, threshold=0.1, consecutive_points=5)[source]

Finds a plateau (if it exists).

Parameters:
  • y (np.ndarray) – Array of y-axis values.

  • threshold (float) – Slope threshold to consider as a plateau (default is 0.1).

  • consecutive_points (int) – Number of consecutive points with a small derivative to consider as a plateau (default is 5).

Returns:

Index of the beginning of the plateau if one is found, length of y otherwise.
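One way to read the parameters above is sketched below: the slope is approximated by first differences, and a plateau starts at the first index where consecutive_points differences in a row fall below threshold in absolute value. The helper name find_plateau_sketch and the use of first differences as the derivative estimate are assumptions of this sketch, not a statement of how TSFEL computes the slope.

```python
def find_plateau_sketch(y, threshold=0.1, consecutive_points=5):
    """Return the index where a plateau starts, or len(y) if none is found."""
    slopes = [abs(b - a) for a, b in zip(y, y[1:])]
    run_length = 0
    for index, slope in enumerate(slopes):
        if slope < threshold:
            run_length += 1
            if run_length == consecutive_points:
                # The run started consecutive_points - 1 positions earlier.
                return index - consecutive_points + 1
        else:
            run_length = 0
    return len(y)
```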

tsfel.feature_extraction.features_utils.gaussian(features)[source]

Computes the probability density function of the input signal using a Gaussian function.

Parameters:

features (nd-array) – Input from which probability density function is computed

Returns:

probability density values

Return type:

nd-array

tsfel.feature_extraction.features_utils.get_templates(signal, m=3)[source]

Helper function for the sample entropy calculation. Divides a signal into template vectors of length m.

Parameters:
  • signal (np.ndarray) – Input signal.

  • m (int) – Embedding dimension that defines the length of the template vectors, defaults to 3.

Returns:

Array of template vectors.

Return type:

np.ndarray
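A sketch of the template construction, assuming overlapping templates that advance one sample at a time (the usual embedding for sample entropy). The helper name template_vectors is hypothetical.

```python
def template_vectors(signal, m=3):
    """Return every length-m window of the signal, advancing one sample at a time."""
    return [list(signal[i:i + m]) for i in range(len(signal) - m + 1)]
```

For the signal [1, 2, 3, 4] and m = 3 this yields [[1, 2, 3], [2, 3, 4]].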

tsfel.feature_extraction.features_utils.kde(features)[source]

Computes the probability density function of the input signal using a Gaussian KDE (Kernel Density Estimate).

Parameters:

features (nd-array) – Input from which probability density function is computed

Returns:

probability density values

Return type:

nd-array

tsfel.feature_extraction.features_utils.lpc(signal, n_coeff=12)[source]

Computes the linear prediction coefficients.

Implementation details and description in: https://ccrma.stanford.edu/~orchi/Documents/speaker_recognition_report.pdf

Parameters:
  • signal (nd-array) – Input from which linear prediction coefficients are computed

  • n_coeff (int) – Number of coefficients

Returns:

Linear prediction coefficients

Return type:

nd-array

tsfel.feature_extraction.features_utils.safe_eval_string(list_string)[source]

Safely evaluates a string containing a Python literal list of floats or integers. This method is safer and faster at runtime than ast.literal_eval.

Parameters:

list_string (str) – A string representation of a list literal.

Returns:

A list containing integers or floats.

Return type:

parsed_list
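A restricted parser of this kind can be sketched without evaluating any code: strip the brackets, split on commas, and convert each token to an int or a float. The helper name parse_number_list is hypothetical, and the exact token rules (integers first, then floats) are an assumption of this sketch.

```python
def parse_number_list(list_string):
    """Parse a string such as "[1, 2.5, -3]" into a list of ints and floats."""
    inner = list_string.strip().strip("[]")
    if not inner.strip():
        return []
    parsed = []
    for token in inner.split(","):
        token = token.strip()
        try:
            parsed.append(int(token))
        except ValueError:
            # Not an integer literal; float() raises ValueError on anything non-numeric.
            parsed.append(float(token))
    return parsed
```

Because only int() and float() are ever called, arbitrary expressions in the string raise ValueError instead of being executed.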

tsfel.feature_extraction.features_utils.sample_entropy(signal, m, tolerance)[source]

Computes the sample entropy of a signal.

Parameters:
  • signal (np.ndarray) – Input signal.

  • m (int) – Embedding dimension that defines the length of the template vectors, defaults to 3.

  • tolerance (float) – Tolerance value, defaults to 0.2 times the standard deviation of the input signal.

Returns:

Sample Entropy of a signal.

Return type:

float
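For reference, a plain-Python sketch of the standard sample entropy definition: count template pairs of length m and of length m + 1 whose Chebyshev distance is within the tolerance, then take the negative log of their ratio. The helper name sample_entropy_sketch, the inclusive comparison (distance <= tolerance), and returning infinity when no matches exist are assumptions of this sketch.

```python
import math


def sample_entropy_sketch(signal, m, tolerance):
    """Sample entropy: -ln(A / B) with B matches of length m, A of length m + 1."""
    n = len(signal)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # self-matches are excluded
                distance = max(
                    abs(a - b) for a, b in zip(templates[i], templates[j])
                )
                if distance <= tolerance:
                    count += 1
        return count

    matches_m = count_matches(m)
    matches_m1 = count_matches(m + 1)
    if matches_m == 0 or matches_m1 == 0:
        return float("inf")
    return -math.log(matches_m1 / matches_m)
```

A perfectly regular signal (for example a constant) gives an entropy of zero, since every match of length m extends to a match of length m + 1.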

tsfel.feature_extraction.features_utils.set_domain(key, value)[source]

Module contents