tsfel.feature_extraction package

Submodules

tsfel.feature_extraction.calc_features module

tsfel.feature_extraction.calc_features.calc_window_features(config, window, fs, verbose=1, single_window=False, **kwargs)[source]

Extract features from a univariate or multivariate window.

Parameters:

config (dict) – A dictionary containing the settings for feature extraction.
window (np.ndarray, pd.DataFrame, pd.Series) – The input signal from which features will be extracted.
fs (float, default=None) – Sampling frequency of the input signal.
verbose (int, default=1) – The verbosity mode. 0 means silent, and 1 means showing a progress bar.
single_window (bool) – If True, the progress bar will be shown only for the extraction of features from a single window.
**kwargs –
Additional keyword arguments, see below:
- features_path (str) –
  Path to a script with custom features.
- header_names (list or array-like) –
  Names of the columns of the window.

Returns:

A DataFrame containing the extracted features.

Return type:

pd.DataFrame

tsfel.feature_extraction.calc_features.dataset_features_extractor(main_directory, feat_dict, verbose=1, **kwargs)[source]

Extracts features from a dataset.

Parameters:

main_directory (String) – Input directory
feat_dict (dict) – Dictionary with features
verbose (int) – Verbosity mode. 0 = silent, 1 = progress bar (default: 1).
**kwargs –
Additional keyword arguments, see below:
- search_criteria (list) –
  List of file names for which features are computed, e.g. ‘Accelerometer.txt’ (default: None)
- time_unit (float) –
  Time unit (default: 1e9)
- resampling_rate (int) –
  Resampling rate (default: 100)
- window_size (int) –
  Window size in number of samples (default: 100)
- overlap (float) –
  Overlap between 0 and 1 (default: 0)
- pre_process (function) –
  Function with pre-processing code (default: None)
- output_directory (String) –
  Output directory (default: str(Path.home()) + '/tsfel_output')
- features_path (string) –
  Path to a script with custom features
- header_names (list or array) –
  Names of the columns of the window
- n_jobs (int) –
  The number of jobs to run in parallel. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. (default: None in Windows and -1 for other systems)

Returns:

csv file with the extracted features

Return type:

file

tsfel.feature_extraction.calc_features.load_combined_feature_modules(features_path=None)[source]

Load feature functions from both the TSFEL features module and an optional user module.

Parameters:: features_path (str or None) – Path to user-defined Python feature module (.py). If None, only the default TSFEL features module is used.
Returns:: A dictionary mapping function names to function objects from both sources. User-defined functions override TSFEL ones with the same name.
Return type:: dict
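
The override rule described above can be sketched with the standard library alone. This is a minimal illustration, not TSFEL's implementation: `default_functions` is a stand-in dictionary for the built-in feature module, and both helper names are hypothetical.

```python
import importlib.util
import inspect


def load_functions_from_path(path):
    """Import a .py file by path and return its functions as {name: function}."""
    spec = importlib.util.spec_from_file_location("user_features", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return dict(inspect.getmembers(module, inspect.isfunction))


def combine_feature_functions(default_functions, features_path=None):
    """Merge default and user functions; user functions win on name clashes."""
    combined = dict(default_functions)
    if features_path is not None:
        combined.update(load_functions_from_path(features_path))
    return combined
```

Because the user dictionary is applied last, a user function named like a default one replaces it, while all other defaults remain available.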

tsfel.feature_extraction.calc_features.time_series_features_extractor(config, timeseries, fs=None, window_size=None, overlap=0, verbose=1, **kwargs)[source]

Extract features from univariate or multivariate time series.

Parameters:

config (dict) – A dictionary containing the settings for feature extraction.
timeseries (list, np.ndarray, pd.DataFrame, pd.Series) – The input signal from which features will be extracted.
fs (float, default=None) – Sampling frequency of the input signal.
window_size (int or None, optional, default=None) – The size of the windows used to split the input signal, measured in the number of samples.
overlap (float, optional, default=0) – A value between 0 and 1 that defines the percentage of overlap between consecutive windows.
n_jobs (int, optional) –
The number of jobs to run in parallel.
- None means 1 unless in a joblib.parallel_backend context.
- -1 means using all available processors.
- default: None on Windows, -1 for other systems
verbose (int, default=1) – The verbosity mode. 0 means silent, and 1 means showing a progress bar.
**kwargs –
Additional keyword arguments, see below:
- features_path (str) –
  Path to a script with custom features.
- header_names (list or array-like) –
  Names of the columns of the window.

Returns:

A DataFrame containing the extracted features, where:
- Columns represent the names of the features.
- Rows contain the feature values for each signal window.

Return type:

pd.DataFrame
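
To make the interaction between window_size and overlap concrete, the sketch below splits a signal into windows. It assumes that the hop between windows is round(window_size * (1 - overlap)) samples and that a trailing incomplete window is dropped; the parameter descriptions above do not state either detail, and `split_into_windows` is a hypothetical helper, not a TSFEL function.

```python
import numpy as np


def split_into_windows(signal, window_size, overlap=0.0):
    """Return a list of windows of `window_size` samples each."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    signal = np.asarray(signal)
    # Assumed hop size; at least one sample so the loop always advances.
    step = max(1, int(round(window_size * (1 - overlap))))
    starts = range(0, len(signal) - window_size + 1, step)
    return [signal[s:s + window_size] for s in starts]


windows = split_into_windows(np.arange(10), window_size=4, overlap=0.5)
```

With 10 samples, window_size=4 and overlap=0.5, consecutive windows share two samples. Each window would correspond to one row of the returned DataFrame.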

tsfel.feature_extraction.features module

tsfel.feature_extraction.features.abs_energy(signal)[source]

Computes the absolute energy of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which the absolute energy is computed
Returns:: Absolute energy
Return type:: float

tsfel.feature_extraction.features.auc(signal, fs)[source]

Computes the area under the curve of the signal using the trapezoid rule.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which the area under the curve is computed
fs (float) – Sampling Frequency

Returns:

The area under the curve value

Return type:

float
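
As an illustration of the trapezoid rule with a sampling frequency, the sketch below treats neighbouring samples as trapezoids of width 1 / fs. It uses the signed sample values; whether the library takes absolute values first is not stated above, so treat this as an assumption. `trapezoid_area` is a hypothetical name.

```python
import numpy as np


def trapezoid_area(signal, fs):
    """Area under the curve by the trapezoid rule with spacing 1 / fs."""
    x = np.asarray(signal, dtype=float)
    # Each pair of neighbouring samples forms one trapezoid of width 1 / fs.
    return float(np.sum(0.5 * (x[:-1] + x[1:])) / fs)
```

Doubling fs halves the time step, and therefore halves the area for the same sample values.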

tsfel.feature_extraction.features.autocorr(signal)[source]

Calculates the first 1/e crossing of the autocorrelation function (ACF), i.e. the first time lag at which the ACF drops below 1/e (≈ 0.3679). The adjusted ACF is calculated using statsmodels.tsa.stattools.acf. Following the recommendations for long time series (size > 450), the FFT convolution is used.

Feature computational cost: 2

Parameters:: signal (nd-array) – Input from which autocorrelation is computed
Returns:: The first time lag at which the ACF drops below 1/e (≈ 0.3679).
Return type:: int
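
The idea of a first 1/e crossing can be sketched with NumPy only. This is not the library's code path: it uses the plain (biased) autocorrelation estimate from np.correlate instead of the adjusted statsmodels estimate, assumes a non-constant signal, and returns None when the ACF never drops below the threshold. The function name is hypothetical.

```python
import numpy as np


def first_one_over_e_crossing(signal):
    """First lag at which the normalized ACF drops below 1/e, or None."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Raw autocovariance for lags 0..n-1, then normalize so that acf[0] == 1.
    acov = np.correlate(x, x, mode="full")[n - 1:]
    acf = acov / acov[0]
    below = np.flatnonzero(acf < 1 / np.e)
    return int(below[0]) if below.size else None


t = np.arange(200)
lag = first_one_over_e_crossing(np.cos(2 * np.pi * t / 20))
```

For this cosine with a period of 20 samples, the crossing is found at lag 4, shortly before the quarter period at which the ACF reaches zero.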

tsfel.feature_extraction.features.average_power(signal, fs)[source]

Computes the average power of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Signal from which average power is computed
fs (float) – Sampling frequency

Returns:

Average power

Return type:

float

tsfel.feature_extraction.features.calc_centroid(signal, fs)[source]

Computes the centroid along the time axis.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which centroid is computed
fs (int) – Signal sampling frequency

Returns:

Temporal centroid

Return type:

float

tsfel.feature_extraction.features.calc_max(signal)[source]

Computes the maximum value of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which max is computed
Returns:: Maximum result
Return type:: float

tsfel.feature_extraction.features.calc_mean(signal)[source]

Computes mean value of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which mean is computed.
Returns:: Mean result
Return type:: float

tsfel.feature_extraction.features.calc_median(signal)[source]

Computes median of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which median is computed
Returns:: Median result
Return type:: float

tsfel.feature_extraction.features.calc_min(signal)[source]

Computes the minimum value of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which min is computed
Returns:: Minimum result
Return type:: float

tsfel.feature_extraction.features.calc_std(signal)[source]

Computes standard deviation (std) of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which std is computed
Returns:: Standard deviation result
Return type:: float

tsfel.feature_extraction.features.calc_var(signal)[source]

Computes variance of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which var is computed
Returns:: Variance result
Return type:: float

tsfel.feature_extraction.features.dfa(signal)[source]

Computes the Detrended Fluctuation Analysis (DFA) of the signal.

Parameters:: signal (np.ndarray) – Input signal.
Returns:: alpha_dfa – Scaling exponent in DFA.
Return type:: float

tsfel.feature_extraction.features.distance(signal)[source]

Computes the distance traveled by the signal.

Calculates the total distance traveled by the signal by summing the hypotenuse between each pair of consecutive data points.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which distance is computed
Returns:: Signal distance
Return type:: float
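
A minimal sketch of the hypotenuse-based distance, assuming a horizontal spacing of one unit between consecutive samples (the description above does not state the spacing). `traveled_distance` is a hypothetical name.

```python
import numpy as np


def traveled_distance(signal):
    """Sum of hypotenuses between consecutive samples, assuming unit spacing."""
    dy = np.diff(np.asarray(signal, dtype=float))
    # Horizontal step is assumed to be 1 sample; vertical step is dy.
    return float(np.sum(np.sqrt(1.0 + dy ** 2)))
```

Under this assumption a constant signal of n samples has distance n - 1, and any variation in amplitude increases it.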

tsfel.feature_extraction.features.ecdf(signal, d=10)[source]

Computes the values of ECDF (empirical cumulative distribution function) along the time axis.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which ECDF is computed
d (integer) – Number of ECDF values to return

Returns:

The values of the ECDF along the time axis

Return type:

float

tsfel.feature_extraction.features.ecdf_percentile(signal, percentile=None)[source]

Computes the percentile value of the ECDF.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which ECDF is computed
percentile (list) – Percentile value to be computed

Returns:

The input value(s) of the ECDF

Return type:

float

tsfel.feature_extraction.features.ecdf_percentile_count(signal, percentile=None)[source]

Computes the cumulative sum of samples that are less than the percentile.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which ECDF is computed
percentile (list) – Percentile threshold

Returns:

The cumulative sum of samples

Return type:

float

tsfel.feature_extraction.features.ecdf_slope(signal, p_init=0.5, p_end=0.75)[source]

Computes the slope of the ECDF between two percentiles. The result may be infinite.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which ECDF is computed
p_init (float) – Initial percentile
p_end (float) – End percentile

Returns:

The slope of the ECDF between two percentiles

Return type:

float
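
The sketch below shows one way an ECDF slope can be computed and why it can be infinite. It assumes the ECDF assigns probability k / n to the k-th smallest sample and that a percentile maps to the largest sample whose probability does not exceed it; these conventions and all function names are assumptions, not the library's definitions.

```python
import numpy as np


def ecdf_sketch(signal):
    """Sorted samples x and cumulative probabilities y = k / n."""
    x = np.sort(np.asarray(signal, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


def ecdf_value(signal, p):
    """Largest sample whose cumulative probability does not exceed p."""
    x, y = ecdf_sketch(signal)
    return float(x[y <= p].max())  # requires p >= 1 / len(signal)


def ecdf_slope_sketch(signal, p_init=0.5, p_end=0.75):
    """Rise over run of the ECDF between two cumulative probabilities."""
    run = ecdf_value(signal, p_end) - ecdf_value(signal, p_init)
    # Both levels map to the same sample value: the ECDF is vertical there.
    return float("inf") if run == 0 else (p_end - p_init) / run
```

When many samples share the same value, both percentiles can map to that value, the run is zero, and the slope is reported as infinity.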

tsfel.feature_extraction.features.entropy(signal, prob='standard')[source]

Computes the entropy of the signal using the Shannon Entropy.

Description in article: Regularities Unseen, Randomness Observed: Levels of Entropy Convergence. Authors: Crutchfield J., Feldman D.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which entropy is computed
prob (string) – Probability function (default: 'standard'; kde or gaussian functions are also available)

Returns:

The normalized entropy value

Return type:

float

tsfel.feature_extraction.features.fundamental_frequency(signal, fs)[source]

Computes fundamental frequency of the signal.

The fundamental frequency is the frequency whose integer multiples best explain the content of the signal spectrum.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which fundamental frequency is computed
fs (float) – Sampling frequency

Returns:

f0 – Predominant frequency of the signal

Return type:

float

tsfel.feature_extraction.features.higuchi_fractal_dimension(signal)[source]

Computes the fractal dimension of a signal using Higuchi’s method (HFD).

Parameters:: signal (np.ndarray) – Input signal.
Returns:: hfd – Fractal dimension.
Return type:: float

tsfel.feature_extraction.features.hist_mode(signal, nbins=10)[source]

Compute the mode of a histogram using a given number of (linearly spaced) bins.

Feature computational cost: 1

Parameters:

signal (np.ndarray) – Input signal from which the histogram is computed.
nbins (int) – The number of equal-width bins in the given range, by default 10.

Returns:

The mode of the histogram (the midpoint of the bin with the highest count).

Return type:

float
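
A short sketch of a histogram mode reported as a bin midpoint, using np.histogram over the range of the data. Tie handling (lowest bin wins) and the function name are assumptions.

```python
import numpy as np


def histogram_mode(signal, nbins=10):
    """Midpoint of the most populated of `nbins` equal-width bins."""
    counts, edges = np.histogram(np.asarray(signal, dtype=float), bins=nbins)
    k = int(np.argmax(counts))  # ties resolve to the lowest bin
    return float(0.5 * (edges[k] + edges[k + 1]))
```

Note that the result is a bin midpoint, not a sample value: for the signal [0, 0, 0, 0, 10] with 10 bins, the most populated bin spans 0 to 1, so the mode is reported at its midpoint rather than at 0.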

tsfel.feature_extraction.features.human_range_energy(signal, fs)[source]

Computes the human range energy ratio.

The human range energy ratio is given by the ratio between the energy in frequency 0.6-2.5Hz and the whole energy band.

Feature computational cost: 2

Parameters:

signal (nd-array) – Signal from which human range energy ratio is computed
fs (float) – Sampling frequency

Returns:

Human range energy ratio

Return type:

float
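
The ratio described above can be sketched from the one-sided FFT. The band edges come from the description; including both edges, using the squared magnitude as energy, and returning 0.0 for an all-zero signal are assumptions of this sketch, and `band_energy_ratio` is a hypothetical name.

```python
import numpy as np


def band_energy_ratio(signal, fs, f_low=0.6, f_high=2.5):
    """Fraction of spectral energy inside [f_low, f_high] Hz."""
    x = np.asarray(signal, dtype=float)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    energy = np.abs(np.fft.rfft(x)) ** 2
    in_band = (freqs >= f_low) & (freqs <= f_high)
    total = energy.sum()
    return float(energy[in_band].sum() / total) if total > 0 else 0.0


fs = 100.0
t = np.arange(1000) / fs                     # 10 s of samples
slow = np.sin(2 * np.pi * 1.0 * t)           # 1 Hz, inside the band
fast = np.sin(2 * np.pi * 10.0 * t)          # 10 Hz, outside the band
```

A 1 Hz oscillation yields a ratio close to 1, while a 10 Hz oscillation yields a ratio close to 0.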

tsfel.feature_extraction.features.hurst_exponent(signal)[source]

Computes the Hurst exponent of the signal through the Rescaled range (R/S) analysis.

Parameters:: signal (np.ndarray) – Input signal.
Returns:: h_exp – Hurst exponent.
Return type:: float

tsfel.feature_extraction.features.interq_range(signal)[source]

Computes interquartile range of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which interquartile range is computed
Returns:: Interquartile range result
Return type:: float

tsfel.feature_extraction.features.kurtosis(signal)[source]

Computes kurtosis of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which kurtosis is computed
Returns:: Kurtosis result
Return type:: float

tsfel.feature_extraction.features.lempel_ziv(signal, threshold=None)[source]

Computes the Lempel-Ziv’s (LZ) complexity index, normalized by the signal’s length.

Parameters:

signal (np.ndarray) – Input signal.
threshold (float, optional) – Amplitude threshold for the binarisation. If None, the mean of the signal is used.

Returns:

lz_index – Lempel-Ziv complexity index

Return type:

float

tsfel.feature_extraction.features.lpcc(signal, n_coeff=12)[source]

Computes the linear prediction cepstral coefficients.

Implementation details and description in: http://www.practicalcryptography.com/miscellaneous/machine-learning/tutorial-cepstrum-and-lpccs/

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which linear prediction cepstral coefficients are computed
n_coeff (int) – Number of coefficients

Returns:

Linear prediction cepstral coefficients

Return type:

nd-array

tsfel.feature_extraction.features.max_frequency(signal, fs)[source]

Computes maximum frequency of the signal.

Feature computational cost: 2

Parameters:

signal (nd-array) – Input from which maximum frequency is computed
fs (float) – Sampling frequency

Returns:

Frequency below which 95% of the cumulative sum (cumsum) of the spectrum is contained

Return type:

float
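
One way to read the cumulative-sum criterion is sketched below: accumulate the FFT magnitudes over frequency and return the first frequency at which the running total reaches a given fraction of the full sum. Using magnitudes (rather than power) and the helper name are assumptions; with fraction=0.5 the same sketch gives a median-frequency style result.

```python
import numpy as np


def cumulative_spectrum_frequency(signal, fs, fraction=0.95):
    """First frequency where the cumulative magnitude reaches `fraction`."""
    x = np.asarray(signal, dtype=float)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    cumulative = np.cumsum(np.abs(np.fft.rfft(x)))
    # Index of the first bin whose running total is >= the target share.
    idx = int(np.searchsorted(cumulative, fraction * cumulative[-1]))
    return float(freqs[idx])


fs = 100.0
t = np.arange(100) / fs
tone = np.sin(2 * np.pi * 5.0 * t)  # single 5 Hz component
```

For a single-tone signal nearly all magnitude sits in one bin, so every fraction maps to the tone frequency.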

tsfel.feature_extraction.features.max_power_spectrum(signal, fs)[source]

Computes maximum power spectrum density of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which maximum power spectrum is computed
fs (float) – Sampling frequency

Returns:

Max value of the power spectrum density

Return type:

nd-array

tsfel.feature_extraction.features.maximum_fractal_length(signal)[source]

Computes the Maximum Fractal Length (MFL) of the signal, which is the average length at the smallest scale, measured from the logarithmic plot determining FD. The Higuchi’s method is used.

Parameters:: signal (np.ndarray) – Input signal.
Returns:: mfl – Maximum Fractal Length.
Return type:: float

tsfel.feature_extraction.features.mean_abs_deviation(signal)[source]

Computes mean absolute deviation of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which mean absolute deviation is computed
Returns:: Mean absolute deviation result
Return type:: float

tsfel.feature_extraction.features.mean_abs_diff(signal)[source]

Computes mean absolute differences of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which mean absolute difference is computed
Returns:: Mean absolute difference result
Return type:: float

tsfel.feature_extraction.features.mean_diff(signal)[source]

Computes mean of differences of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which mean of differences is computed
Returns:: Mean difference result
Return type:: float

tsfel.feature_extraction.features.median_abs_deviation(signal)[source]

Computes median absolute deviation of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which median absolute deviation is computed
Returns:: Median absolute deviation result
Return type:: float

tsfel.feature_extraction.features.median_abs_diff(signal)[source]

Computes median absolute differences of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which median absolute difference is computed
Returns:: Median absolute difference result
Return type:: float

tsfel.feature_extraction.features.median_diff(signal)[source]

Computes median of differences of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which median of differences is computed
Returns:: Median difference result
Return type:: float
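
The "deviation" and "diff" features above are easy to confuse. The following sketch contrasts them on a small array. It assumes deviations are taken from the mean (or median) of the signal without any scaling factor, and differences are taken between consecutive samples; the variable names are illustrative only.

```python
import numpy as np

x = np.array([1.0, 3.0, 2.0, 6.0])
steps = np.diff(x)  # differences between consecutive samples: [2, -1, 4]

# "Deviation" features compare every sample with a central value.
mean_abs_dev = np.mean(np.abs(x - np.mean(x)))
median_abs_dev = np.median(np.abs(x - np.median(x)))

# "Diff" features summarize the sample-to-sample steps.
mean_abs_step = np.mean(np.abs(steps))
mean_step = np.mean(steps)
median_abs_step = np.median(np.abs(steps))
median_step = np.median(steps)
```

Deviation features are unaffected by the order of the samples, whereas diff features depend on it.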

tsfel.feature_extraction.features.median_frequency(signal, fs)[source]

Computes median frequency of the signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which median frequency is computed
fs (int) – Sampling frequency

Returns:

f_median – Frequency below which 50% of the cumulative sum (cumsum) of the spectrum is contained.

Return type:

int

tsfel.feature_extraction.features.mfcc(signal, fs, pre_emphasis=0.97, nfft=512, nfilt=40, num_ceps=12, cep_lifter=22)[source]

Computes the MEL cepstral coefficients.

It provides the information about the power in each frequency band.

Implementation details and description in:
- https://www.kaggle.com/ilyamich/mfcc-implementation-and-tutorial
- https://haythamfayek.com/2016/04/21/speech-processing-for-machine-learning.html#fnref:1

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which MEL coefficients are computed
fs (float) – Sampling frequency
pre_emphasis (float) – Pre-emphasis coefficient for pre-emphasis filter application
nfft (int) – Number of points of fft
nfilt (int) – Number of filters
num_ceps (int) – Number of cepstral coefficients
cep_lifter (int) – Filter length

Returns:

MEL cepstral coefficients

Return type:

nd-array

tsfel.feature_extraction.features.mse(signal, m=3, maxscale=None, tolerance=None)[source]

Computes the Multiscale entropy (MSE) of the signal, which performs the entropy analysis over multiple time scales.

Parameters:

signal (np.ndarray) – Input signal.
m (int) – Embedding dimension for the sample entropy, defaults to 3.
maxscale (int) – Maximum scale factor, defaults to 1/13 of the length of the input signal.
tolerance (float) – Tolerance value, defaults to 0.2 times the standard deviation of the input signal.

Returns:

mse_area – Normalized area under the MSE curve.

Return type:

np.ndarray

tsfel.feature_extraction.features.negative_turning(signal)[source]

Computes number of negative turning points of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which negative turning points are counted
Returns:: Number of negative turning points
Return type:: float

tsfel.feature_extraction.features.neighbourhood_peaks(signal, n=10)[source]

Computes the number of peaks from a defined neighbourhood of the signal.

Reference: Christ, M., Braun, N., Neuffer, J. and Kempa-Liehr A.W. (2018). Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh – A Python package). Neurocomputing 307 (2018) 72-77

Parameters:

signal (nd-array) – Input from which the number of neighbourhood peaks is computed
n (int) – Number of peak’s neighbours to the left and to the right

Returns:

The number of peaks from a defined neighbourhood of the signal

Return type:

int

tsfel.feature_extraction.features.petrosian_fractal_dimension(signal)[source]

Computes the Petrosian Fractal Dimension of a signal.

Parameters:: signal (np.ndarray) – Input signal.
Returns:: pfd – Petrosian Fractal Dimension.
Return type:: float
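
The Petrosian fractal dimension is commonly defined as PFD = log10(N) / (log10(N) + log10(N / (N + 0.4 * N_delta))), where N is the signal length and N_delta is the number of sign changes in the first difference of the signal. The sketch below implements that textbook formula; that the library counts sign changes in exactly this way is an assumption, and `petrosian_fd` is a hypothetical name.

```python
import numpy as np


def petrosian_fd(signal):
    """Petrosian fractal dimension from sign changes of the first difference."""
    x = np.asarray(signal, dtype=float)
    n = len(x)
    d = np.diff(x)
    # A sign change in consecutive differences marks a local extremum.
    n_delta = int(np.sum(d[:-1] * d[1:] < 0))
    return float(np.log10(n) / (np.log10(n) + np.log10(n / (n + 0.4 * n_delta))))
```

A monotonic signal has no sign changes and yields exactly 1, while signals with more local extrema yield larger values.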

tsfel.feature_extraction.features.pk_pk_distance(signal)[source]

Computes the peak to peak distance.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which peak to peak is computed
Returns:: peak to peak distance
Return type:: float

tsfel.feature_extraction.features.positive_turning(signal)[source]

Computes number of positive turning points of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which positive turning points are counted
Returns:: Number of positive turning points
Return type:: float
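
Turning points can be counted from the sign pattern of the first difference. The sketch assumes that a positive turning point is a change from a rising to a falling slope (a local maximum) and a negative turning point is the opposite (a local minimum), and that flat segments are not counted. These conventions and the function name are assumptions.

```python
import numpy as np


def count_turning_points(signal):
    """Return (positive, negative) turning-point counts."""
    d = np.diff(np.asarray(signal, dtype=float))
    rising_then_falling = (d[:-1] > 0) & (d[1:] < 0)   # local maxima
    falling_then_rising = (d[:-1] < 0) & (d[1:] > 0)   # local minima
    return int(rising_then_falling.sum()), int(falling_then_rising.sum())
```

For the zigzag [0, 1, 0, 1, 0] this counts two maxima and one minimum.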

tsfel.feature_extraction.features.power_bandwidth(signal, fs)[source]

Computes power spectrum density bandwidth of the signal.

It corresponds to the width of the frequency band in which 95% of its power is located.

Description in article: Power Spectrum and Bandwidth, Ulf Henriksson, 2003. Translated by Mikael Olofsson, 2005.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which the power bandwidth is computed
fs (float) – Sampling frequency

Returns:

Occupied power in bandwidth

Return type:

float

tsfel.feature_extraction.features.rms(signal)[source]

Computes root mean square of the signal.

Square root of the arithmetic mean (average) of the squares of the original values.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which root mean square is computed
Returns:: Root mean square
Return type:: float

tsfel.feature_extraction.features.skewness(signal)[source]

Computes skewness of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which skewness is computed
Returns:: Skewness result
Return type:: float

tsfel.feature_extraction.features.slope(signal)[source]

Computes the slope of the signal.

Slope is computed by fitting a linear equation to the observed data.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which linear equation is computed
Returns:: Slope
Return type:: float
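
Fitting a linear equation to the data can be sketched with np.polyfit. The sketch assumes the time axis is the sample index 0, 1, 2, ..., so the slope is expressed per sample; `linear_slope` is a hypothetical name.

```python
import numpy as np


def linear_slope(signal):
    """Slope of the least-squares line through (sample index, value) pairs."""
    y = np.asarray(signal, dtype=float)
    t = np.arange(len(y))  # assumed time axis: 0, 1, 2, ...
    slope, _intercept = np.polyfit(t, y, 1)
    return float(slope)
```

A signal that increases by 2 at every sample, such as [1, 3, 5, 7], has a slope of 2 under this convention.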

tsfel.feature_extraction.features.spectral_centroid(signal, fs)[source]

Barycenter of the spectrum.

Description and formula in article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 2

Parameters:

signal (nd-array) – Signal from which spectral centroid is computed
fs (int) – Sampling frequency

Returns:

Centroid

Return type:

float
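
The barycenter of a spectrum is the magnitude-weighted mean of its frequencies. The sketch below computes it from the one-sided FFT; weighting by magnitude rather than power, returning 0.0 for an all-zero signal, and the function name are assumptions.

```python
import numpy as np


def spectral_centroid_sketch(signal, fs):
    """Magnitude-weighted mean frequency of the one-sided spectrum."""
    x = np.asarray(signal, dtype=float)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    mag = np.abs(np.fft.rfft(x))
    total = mag.sum()
    return float(np.sum(freqs * mag) / total) if total > 0 else 0.0


fs = 100.0
t = np.arange(100) / fs
two_tones = np.sin(2 * np.pi * 5.0 * t) + np.sin(2 * np.pi * 15.0 * t)
```

Two tones of equal amplitude at 5 Hz and 15 Hz place the centroid midway between them, at about 10 Hz.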

tsfel.feature_extraction.features.spectral_decrease(signal, fs)[source]

Represents the amount of decrease of the spectral amplitude.

Description and formula in article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 1

Parameters:

signal (nd-array) – Signal from which spectral decrease is computed
fs (float) – Sampling frequency

Returns:

Spectral decrease

Return type:

float

tsfel.feature_extraction.features.spectral_distance(signal, fs)[source]

Computes the signal spectral distance.

Distance of the signal’s cumulative sum of the FFT elements to the respective linear regression.

Feature computational cost: 1

Parameters:

signal (nd-array) – Signal from which spectral distance is computed
fs (float) – Sampling frequency

Returns:

spectral distance

Return type:

float

tsfel.feature_extraction.features.spectral_entropy(signal, fs)[source]

Computes the spectral entropy of the signal based on Fourier transform.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which spectral entropy is computed
fs (float) – Sampling frequency

Returns:

The normalized spectral entropy value

Return type:

float
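
A normalized spectral entropy can be sketched by treating the power spectrum as a probability distribution. Dividing by log2 of the number of frequency bins so the result lies in [0, 1] is an assumption about the normalization, the sketch expects a signal with non-zero energy, and the function name is hypothetical. The sampling frequency does not enter this particular formulation.

```python
import numpy as np


def spectral_entropy_sketch(signal):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    x = np.asarray(signal, dtype=float)
    power = np.abs(np.fft.rfft(x)) ** 2
    p = power / power.sum()
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log2(p)) / np.log2(len(power)))


impulse = np.zeros(8)
impulse[0] = 1.0                                    # flat spectrum
tone = np.sin(2 * np.pi * 4 * np.arange(64) / 64)   # single spectral line
```

A flat spectrum (the impulse) gives the maximum value of 1, while a single spectral line (the tone) gives a value close to 0.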

tsfel.feature_extraction.features.spectral_kurtosis(signal, fs)[source]

Measures the flatness of a distribution around its mean value.

Description and formula in article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 2

Parameters:

signal (nd-array) – Signal from which spectral kurtosis is computed
fs (float) – Sampling frequency

Returns:

Spectral Kurtosis

Return type:

float

tsfel.feature_extraction.features.spectral_positive_turning(signal, fs)[source]

Computes number of positive turning points of the fft magnitude signal.

Feature computational cost: 1

Parameters:

signal (nd-array) – Input from which the number of positive turning points of the fft magnitude are computed
fs (float) – Sampling frequency

Returns:

Number of positive turning points

Return type:

float

tsfel.feature_extraction.features.spectral_roll_off(signal, fs)[source]

Computes the spectral roll-off of the signal.

The spectral roll-off corresponds to the frequency below which 95% of the signal magnitude is contained.

Feature computational cost: 1

Parameters:

signal (nd-array) – Signal from which spectral roll-off is computed
fs (float) – Sampling frequency

Returns:

Spectral roll-off

Return type:

float

tsfel.feature_extraction.features.spectral_roll_on(signal, fs)[source]

Computes the spectral roll-on of the signal.

The spectral roll-on corresponds to the frequency below which 5% of the signal magnitude is contained.

Feature computational cost: 1

Parameters:

signal (nd-array) – Signal from which spectral roll-on is computed
fs (float) – Sampling frequency

Returns:

Spectral roll-on

Return type:

float

tsfel.feature_extraction.features.spectral_skewness(signal, fs)[source]

Measures the asymmetry of a distribution around its mean value.

Description and formula in article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 2

Parameters:

signal (nd-array) – Signal from which spectral skewness is computed
fs (float) – Sampling frequency

Returns:

Spectral Skewness

Return type:

float

tsfel.feature_extraction.features.spectral_slope(signal, fs)[source]

Computes the spectral slope.

Spectral slope is computed by finding constants m and b of the function aFFT = mf + b, obtained by linear regression of the spectral amplitude.

Description and formula in article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 1

Parameters:

signal (nd-array) – Signal from which spectral slope is computed
fs (float) – Sampling frequency

Returns:

Spectral Slope

Return type:

float

tsfel.feature_extraction.features.spectral_spread(signal, fs)[source]

Measures the spread of the spectrum around its mean value.

Description and formula in article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 2

Parameters:

signal (nd-array) – Signal from which spectral spread is computed.
fs (float) – Sampling frequency

Returns:

Spectral Spread

Return type:

float

tsfel.feature_extraction.features.spectral_variation(signal, fs)[source]

Computes the amount of variation of the spectrum along time.

Spectral variation is computed from the normalized cross-correlation between two consecutive amplitude spectra.

Description and formula in article: The Timbre Toolbox: Extracting audio descriptors from musical signals. Authors: Peeters G., Giordano B., Misdariis P., McAdams S.

Feature computational cost: 1

Parameters:

signal (nd-array) – Signal from which spectral variation is computed.
fs (float) – Sampling frequency

Returns:

Spectral Variation

Return type:

float

tsfel.feature_extraction.features.spectrogram_mean_coeff(signal, fs, bins=32)[source]

Calculates the average power spectral density (PSD) for each frequency throughout the entire signal duration provided by the spectrogram.

The values represent the average power spectral density computed on frequency bins. The feature name refers to the frequency bin where the PSD was taken. Each bin is fs / (bins * 2 - 2) Hz wide. The method relies on scipy.signal.spectrogram and, except for nperseg and fs, all the other parameters are set to their defaults.

Feature computational cost: 1

Parameters:

signal (array_like) – Input from which the spectrogram average power spectral density coefficients are computed.
fs (float) – Sampling frequency of the signal.
bins (int, optional) – The number of frequency bins.

Returns:

The power spectral density for each frequency bin averaged along the entire signal duration.

Return type:

nd-array

Notes

The optimal number of frequency bins depends on the task at hand. Using a higher number of bins with low sampling frequencies may result in excessive frequency resolution and the loss of valuable coarse-grained information. The default value should be suitable for most cases when working with the default sampling frequency. The number of frequency bins must be modified in the feature configuration file.

Added in version 0.1.7.
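
The stated bin width of fs / (bins * 2 - 2) Hz is what one obtains when each spectrogram segment has bins * 2 - 2 samples, because the one-sided spectrum of an n-sample segment has n / 2 + 1 frequencies. The arithmetic is sketched below with NumPy; that the segment length is chosen exactly this way is an inference from the formula, and fs = 100 Hz is only an example value.

```python
import numpy as np

fs = 100.0   # example sampling frequency in Hz
bins = 32    # default from the signature above

nperseg = bins * 2 - 2                        # assumed segment length: 62 samples
freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)  # centre frequency of every bin
bin_width = fs / nperseg                      # about 1.61 Hz for these values
```

With these values there are 32 frequencies, spaced about 1.61 Hz apart and ending at fs / 2 = 50 Hz.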

tsfel.feature_extraction.features.sum_abs_diff(signal)[source]

Computes sum of absolute differences of the signal.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which sum absolute difference is computed
Returns:: Sum absolute difference result
Return type:: float

tsfel.feature_extraction.features.wavelet_abs_mean(signal, fs, wavelet='mexh', max_width=10)[source]

Computes CWT absolute mean value of each wavelet scale.

Parameters:

signal (nd-array) – Input from which CWT is computed
fs (int) – Signal sampling frequency
wavelet (string) – Wavelet to use, defaults to “mexh” which represents the mexican hat wavelet (Ricker wavelet)
max_width (int) – Maximum width to use for transformation, defaults to 10

Returns:

CWT absolute mean value

Return type:

nd-array

tsfel.feature_extraction.features.wavelet_energy(signal, fs, wavelet='mexh', max_width=10)[source]

Computes CWT energy of each wavelet scale.

Implementation details: https://stackoverflow.com/questions/37659422/energy-for-1-d-wavelet-in-python

Parameters:

signal (nd-array) – Input from which CWT is computed
fs (int) – Signal sampling frequency
wavelet (string) – Wavelet to use, defaults to “mexh” which represents the mexican hat wavelet (Ricker wavelet)
max_width (int) – Maximum width to use for transformation, defaults to 10

Returns:

CWT energy

Return type:

nd-array

tsfel.feature_extraction.features.wavelet_entropy(signal, fs, wavelet='mexh', max_width=10)[source]

Computes CWT entropy of the signal.

Implementation details in: https://dsp.stackexchange.com/questions/13055/how-to-calculate-cwt-shannon-entropy B.F. Yan, A. Miyamoto, E. Bruhwiler, Wavelet transform-based modal parameter identification considering uncertainty

Parameters:

signal (nd-array) – Input from which CWT is computed
fs (int) – Signal sampling frequency
wavelet (string) – Wavelet to use, defaults to “mexh” which represents the mexican hat wavelet (Ricker wavelet)
max_width (int) – Maximum width to use for transformation, defaults to 10

Returns:

wavelet entropy

Return type:

float

tsfel.feature_extraction.features.wavelet_std(signal, fs, wavelet='mexh', max_width=10)[source]

Computes CWT std value of each wavelet scale.

Parameters:

signal (nd-array) – Input from which CWT is computed
fs (int) – Signal sampling frequency
wavelet (string) – Wavelet to use, defaults to “mexh” which represents the mexican hat wavelet (Ricker wavelet)
max_width (int) – Maximum width to use for transformation, defaults to 10

Returns:

CWT std

Return type:

nd-array

tsfel.feature_extraction.features.wavelet_var(signal, fs, wavelet='mexh', max_width=10)[source]

Computes CWT variance value of each wavelet scale.

Parameters:

signal (nd-array) – Input from which CWT is computed
fs (int) – Signal sampling frequency
wavelet (string) – Wavelet to use, defaults to “mexh” which represents the mexican hat wavelet (Ricker wavelet)
max_width (int) – Maximum width to use for transformation, defaults to 10

Returns:

CWT variance

Return type:

nd-array

tsfel.feature_extraction.features.zero_cross(signal)[source]

Computes the zero-crossing rate of the signal.

Corresponds to the total number of times that the signal changes from positive to negative or vice versa.

Feature computational cost: 1

Parameters:: signal (nd-array) – Input from which the zero-crossing rate is computed
Returns:: Number of times that the signal value crosses the zero axis
Return type:: int
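
Counting sign changes between consecutive samples can be sketched as follows. How samples that are exactly zero are treated is not stated above; in this sketch a zero sample has its own sign, so a signal passing through an exact zero registers two changes. `count_zero_crossings` is a hypothetical name.

```python
import numpy as np


def count_zero_crossings(signal):
    """Number of sign changes between consecutive samples."""
    signs = np.sign(np.asarray(signal, dtype=float))
    # Note: an exact 0 has sign 0, so passing through it counts as two changes.
    return int(np.count_nonzero(np.diff(signs)))
```

The alternating signal [1, -1, 1, -1] changes sign three times, whereas a signal that stays positive has no crossings.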

tsfel.feature_extraction.features_settings module

tsfel.feature_extraction.features_settings.get_features_by_domain(domain=None, json_path=None)[source]

Creates a dictionary with the features settings by domain.

Parameters:

domain (str, list of str, or None, default=None) –
Specifies which feature domains to include in the dictionary.
- ’statistical’, ‘temporal’, ‘spectral’, ‘fractal’: Includes the corresponding feature domain.
- ’all’: Includes all available feature domains.
- list of str: A combination of the above strings, e.g., [‘statistical’, ‘temporal’].
- None: By default, includes the ‘statistical’, ‘temporal’, and ‘spectral’ domains.
json_path (string) – Directory of json file. Default: package features.json directory

Returns:

Dictionary with the features settings

Return type:

Dict

tsfel.feature_extraction.features_settings.get_features_by_tag(tag=None, json_path=None)[source]

Creates a dictionary with the features settings by tag.

Parameters:

tag (string) – Available tags: “audio”, “inertial”, “ecg”, “eeg”, “emg”. If tag is None, all available features are returned.
json_path (string) – Directory of json file. Default: package features.json directory

Returns:

Dictionary with the features settings

Return type:

Dict

tsfel.feature_extraction.features_settings.get_number_features(dict_features)[source]

Count the total number of features based on input parameters of each feature.

Parameters:: dict_features (dict) – Dictionary with features settings
Returns:: Feature vector size
Return type:: int

tsfel.feature_extraction.features_settings.load_json(json_path)[source]

A convenient method that wraps the built-in json.load. This method might be handy to load customized feature configuration files.

Parameters:: json_path (file-like object, string, or pathlib.Path) – The json file to read.
Returns:: Data stored in the file.
Return type:: dict

tsfel.feature_extraction.features_utils module

tsfel.feature_extraction.features_utils.autocorr_norm(signal)[source]

Computes the autocorrelation.

Implementation details and description in: https://ccrma.stanford.edu/~orchi/Documents/speaker_recognition_report.pdf

Parameters:: signal (nd-array) – Input from which the autocorrelation is computed
Returns:: Autocorrelation result
Return type:: nd-array

tsfel.feature_extraction.features_utils.calc_ecdf(signal)[source]

Computes the ECDF of the signal.

Parameters:: signal (nd-array) – Input from which ECDF is computed
Returns:: Sorted signal and computed ECDF.
Return type:: nd-array
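
As an illustration of what an ECDF contains, the following sketch returns the sorted samples together with the cumulative fraction reached at each one. The function name `ecdf` and the `(i + 1) / n` convention are assumptions made for this example and may differ from what `calc_ecdf` does internally.

```python
def ecdf(signal):
    """Return the sorted samples and the cumulative fraction at each one."""
    xs = sorted(signal)
    n = len(xs)
    # Assumption: the i-th sorted sample (0-based) maps to (i + 1) / n,
    # so the largest sample always maps to 1.0.
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


sorted_values, fractions = ecdf([3, 1, 2, 4])
print(sorted_values, fractions)
```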

tsfel.feature_extraction.features_utils.calc_fft(signal, fs)[source]

This function computes the FFT of a signal.

Parameters:

signal (nd-array) – The input signal from which fft is computed
fs (float) – Sampling frequency

Returns:

f (nd-array) – Frequency values (x axis)
fmag (nd-array) – Amplitude of the frequency values (y axis)
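
To make the two return values concrete, here is a small sketch that builds a frequency axis and the matching magnitudes with a naive discrete Fourier transform from the standard library. It is not how `calc_fft` is implemented: the name `dft_magnitude` is hypothetical, and keeping only the non-negative frequencies without any amplitude scaling is an assumption.

```python
import cmath


def dft_magnitude(signal, fs):
    """Naive O(n^2) DFT returning non-negative frequencies and magnitudes."""
    n = len(signal)
    freqs, mags = [], []
    # Assumption: only bins 0 .. n // 2 are kept (the one-sided spectrum).
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * i / n)
            for i, x in enumerate(signal)
        )
        freqs.append(k * fs / n)  # bin k corresponds to k * fs / n Hz
        mags.append(abs(coeff))   # unscaled magnitude of the bin
    return freqs, mags


# One cycle of a cosine sampled four times at fs = 4 Hz.
f, fmag = dft_magnitude([1.0, 0.0, -1.0, 0.0], fs=4.0)
print(f, fmag)
```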

tsfel.feature_extraction.features_utils.calc_lempel_ziv_complexity(sequence)[source]

Manual implementation of the Lempel-Ziv complexity.

It is defined as the number of different substrings encountered as the stream is viewed from beginning to end.

Reference: https://github.com/Naereen/Lempel-Ziv_Complexity/blob/master/src/lempel_ziv_complexity.py

Parameters:: sequence (string) – Binarised signal, as a string of characters
Returns:: LZ index
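
The definition above (counting the distinct substrings met while scanning the sequence once) can be sketched as follows. The code follows the parsing scheme of the referenced repository as far as it is described here; the name `lz_complexity` is hypothetical, and details of TSFEL's own version may differ.

```python
def lz_complexity(sequence):
    """Count distinct substrings found in a single left-to-right scan."""
    seen = set()
    start, length = 0, 1
    while start + length <= len(sequence):
        candidate = sequence[start:start + length]
        if candidate in seen:
            # Already known: extend the candidate by one character.
            length += 1
        else:
            # New substring: record it and restart right after it.
            seen.add(candidate)
            start += length
            length = 1
    return len(seen)


# Parsed as 1 / 0 / 01 / 11 / 10 / 110 / 00 / 010
print(lz_complexity("1001111011000010"))
```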

tsfel.feature_extraction.features_utils.calc_lengths_higuchi(signal)[source]

Computes the curve lengths for different subdivisions, using Higuchi’s method.

Parameters:: signal (np.ndarray) – Input signal.
Returns:: lk – Length of curve for different subdivisions
Return type:: nd-array

tsfel.feature_extraction.features_utils.calc_rms(signal, window)[source]

Windowed Root Mean Square (RMS) with linear detrending.

Parameters:

signal (nd-array) – Signal
window (int) – Length of the window in which RMS will be calculated

Returns:

rms – RMS data in each window with length len(signal)//window

Return type:

nd-array

tsfel.feature_extraction.features_utils.coarse_graining(signal, scale)[source]

Applies a coarse-graining process to a time series: for a given scale factor, it splits the signal into non-overlapping windows and averages the data points.

Parameters:

signal (np.ndarray) – Input signal.
scale (int) – Scale factor, determines the length of the non-overlapping windows.

Returns:

coarsegrained_signal – Coarse-grained signal.

Return type:

np.ndarray
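
A minimal sketch of the coarse-graining step described above, written with plain lists. The name `coarse_grain` is hypothetical, and dropping trailing samples that do not fill a complete window is an assumption of this example.

```python
def coarse_grain(signal, scale):
    """Average consecutive non-overlapping windows of `scale` samples."""
    n_windows = len(signal) // scale  # assumption: incomplete tail is dropped
    return [
        sum(signal[i * scale:(i + 1) * scale]) / scale
        for i in range(n_windows)
    ]


print(coarse_grain([1, 2, 3, 4, 5, 6, 7], scale=2))
```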

tsfel.feature_extraction.features_utils.compute_rs(signal, lag)[source]

Computes the average rescaled range for a window of length lag.

Parameters:

signal (np.ndarray) – Input signal.
lag (int) – Window length.

Returns:

Average R/S.

Return type:

float
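
The formula is not spelled out here, so the following sketch assumes the classical rescaled-range definition: within each non-overlapping window of length `lag`, R is the range of the cumulative mean-adjusted sum and S is the population standard deviation, and the R/S ratios are averaged over windows. The name `average_rescaled_range` and the handling of constant windows are assumptions.

```python
import math


def average_rescaled_range(signal, lag):
    """Average R/S over non-overlapping windows of length `lag`."""
    ratios = []
    for start in range(0, len(signal) - lag + 1, lag):
        window = signal[start:start + lag]
        mean = sum(window) / lag
        deviations = [x - mean for x in window]
        # Cumulative sum of the mean-adjusted samples.
        cumulative, running = [], 0.0
        for d in deviations:
            running += d
            cumulative.append(running)
        r = max(cumulative) - min(cumulative)
        s = math.sqrt(sum(d * d for d in deviations) / lag)
        if s > 0:  # assumption: constant windows are skipped
            ratios.append(r / s)
    return sum(ratios) / len(ratios) if ratios else 0.0


print(average_rescaled_range([1, 2, 3, 4, 1, 2, 3, 4], lag=4))
```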

tsfel.feature_extraction.features_utils.compute_time(signal, fs)[source]

Creates the time array corresponding to the signal.

Parameters:

signal (nd-array) – Input from which the time is computed.
fs (int) – Sampling Frequency

Returns:

time – Signal time

Return type:

float list
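
For illustration, a time array for a uniformly sampled signal can be built by dividing each sample index by the sampling frequency. The name `time_axis` is hypothetical, and starting the axis at 0 seconds is an assumption.

```python
def time_axis(signal, fs):
    """Return the time stamp, in seconds, of every sample."""
    # Assumption: the first sample is taken at t = 0.
    return [i / fs for i in range(len(signal))]


print(time_axis([5, 6, 7, 8], fs=2))
```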

tsfel.feature_extraction.features_utils.continuous_wavelet_transform(signal, fs, wavelet='mexh', widths=array([1, 2, 3, 4, 5, 6, 7, 8, 9]))[source]

Computes CWT (continuous wavelet transform) of the signal.

Parameters:

signal (nd-array) – Input from which CWT is computed
fs (int) – Signal sampling frequency
wavelet (string) – Wavelet to use, defaults to “mexh” which represents the mexican hat wavelet (Ricker wavelet)
widths (nd-array) – Widths to use for transformation. Default: np.arange(1,10)

Returns:

The result of the CWT along the time axis, as a matrix of size (len(widths), len(signal))

Return type:

nd-array

tsfel.feature_extraction.features_utils.create_symmetric_matrix(acf, order=11)[source]

Computes a symmetric matrix.

Implementation details and description in: https://ccrma.stanford.edu/~orchi/Documents/speaker_recognition_report.pdf

Parameters:

acf (nd-array) – Input from which a symmetric matrix is computed
order (int) – Order

Returns:

Symmetric Matrix

Return type:

nd-array

tsfel.feature_extraction.features_utils.create_xx(features)[source]

Computes the range of features amplitude for the probability density function calculus.

Parameters:: features (nd-array) – Input features
Returns:: range of features amplitude
Return type:: nd-array

tsfel.feature_extraction.features_utils.filterbank(signal, fs, pre_emphasis=0.97, nfft=512, nfilt=40)[source]

Computes the MEL-spaced filterbank.

It provides the information about the power in each frequency band.

Implementation details and description on: https://www.kaggle.com/ilyamich/mfcc-implementation-and-tutorial https://haythamfayek.com/2016/04/21/speech-processing-for-machine-learning.html#fnref:1

Parameters:

signal (nd-array) – Input from which filterbank is computed
fs (float) – Sampling frequency
pre_emphasis (float) – Pre-emphasis coefficient for pre-emphasis filter application
nfft (int) – Number of points of fft
nfilt (int) – Number of filters

Returns:

MEL-spaced filterbank

Return type:

nd-array

tsfel.feature_extraction.features_utils.find_plateau(y, threshold=0.1, consecutive_points=5)[source]

Finds a plateau (if it exists).

Parameters:

y (np.ndarray) – Array of y-axis values.
threshold (float) – Slope threshold to consider as a plateau (default is 0.1).
consecutive_points (int) – Number of consecutive points with a small derivative to consider as a plateau (default is 5).

Return type:

Index of the beginning of the plateau if it is found, length of y otherwise.
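
One way to read the parameters above is sketched below: the slope is approximated by the difference between neighbouring points, and a plateau starts at the first run of `consecutive_points` slopes whose absolute value stays below `threshold`. The name `first_plateau_index` and the use of first differences as the derivative are assumptions; TSFEL may estimate the derivative differently.

```python
def first_plateau_index(y, threshold=0.1, consecutive_points=5):
    """Return the start index of the first flat run, or len(y) if none."""
    # Assumption: the slope is the absolute first difference.
    slopes = [abs(b - a) for a, b in zip(y, y[1:])]
    run = 0
    for i, slope in enumerate(slopes):
        run = run + 1 if slope < threshold else 0
        if run == consecutive_points:
            return i - consecutive_points + 1
    return len(y)


print(first_plateau_index([0, 1, 2, 3, 3, 3, 3, 3, 3, 3]))
```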

tsfel.feature_extraction.features_utils.gaussian(features)[source]

Computes the probability density function of the input signal using a Gaussian function.

Parameters:: features (nd-array) – Input from which probability density function is computed
Returns:: probability density values
Return type:: nd-array

tsfel.feature_extraction.features_utils.get_templates(signal, m=3)[source]

Helper function for the sample entropy calculation. Divides a signal into template vectors of length m.

Parameters:

signal (np.ndarray) – Input signal.
m (int) – Embedding dimension that defines the length of the template vectors, defaults to 3.

Returns:

Array of template vectors.

Return type:

np.ndarray
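
The template construction can be pictured as a sliding window of length m that moves one sample at a time. The sketch below uses plain lists and a hypothetical name `templates`; whether TSFEL keeps every possible window is an assumption.

```python
def templates(signal, m=3):
    """Return all overlapping sub-sequences of length m."""
    return [signal[i:i + m] for i in range(len(signal) - m + 1)]


print(templates([1, 2, 3, 4, 5], m=3))
```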

tsfel.feature_extraction.features_utils.kde(features)[source]

Computes the probability density function of the input signal using a Gaussian KDE (Kernel Density Estimate).

Parameters:: features (nd-array) – Input from which probability density function is computed
Returns:: probability density values
Return type:: nd-array

tsfel.feature_extraction.features_utils.lpc(signal, n_coeff=12)[source]

Computes the linear prediction coefficients.

Implementation details and description in: https://ccrma.stanford.edu/~orchi/Documents/speaker_recognition_report.pdf

Parameters:

signal (nd-array) – Input from which linear prediction coefficients are computed
n_coeff (int) – Number of coefficients

Returns:

Linear prediction coefficients

Return type:

nd-array

tsfel.feature_extraction.features_utils.safe_eval_string(list_string)[source]

Safely evaluate a string containing a Python literal list of floats or integers. This method is safer and faster at runtime than ast.literal_eval.

Parameters:: list_string (str) – A string representation of a list literal.
Returns:: parsed_list – A list containing integers or floats.
Return type:: list
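
A parser of this kind can avoid evaluating arbitrary expressions by splitting the string and converting each token itself. The sketch below shows the idea under stated assumptions: the name `parse_number_list` is hypothetical, and nothing here is taken from TSFEL's source.

```python
def parse_number_list(list_string):
    """Parse a string such as '[1, 2.5, -3]' into a list of numbers."""
    inner = list_string.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError("expected a list literal such as '[1, 2.5]'")
    parsed = []
    for token in inner[1:-1].split(","):
        token = token.strip()
        if not token:
            continue  # tolerate '[]' and trailing commas
        try:
            parsed.append(int(token))
        except ValueError:
            # float() raises ValueError for anything that is not a number,
            # so no code embedded in the string is ever executed.
            parsed.append(float(token))
    return parsed


print(parse_number_list("[1, 2.5, -3]"))
```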

tsfel.feature_extraction.features_utils.sample_entropy(signal, m, tolerance)[source]

Computes the sample entropy of a signal.

Parameters:

signal (np.ndarray) – Input signal.
m (int) – Embedding dimension that defines the length of the template vectors, defaults to 3.
tolerance (float) – Tolerance value, defaults to 0.2 times the standard deviation of the input signal.

Returns:

Sample Entropy of a signal.

Return type:

float
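
For reference, the textbook definition of sample entropy can be sketched as follows: count the pairs of templates of length m that lie within `tolerance` of each other (Chebyshev distance, self-matches excluded), repeat for length m + 1, and take the negative logarithm of the ratio. The name `sample_entropy_sketch`, the `<=` comparison, and returning infinity when no matches exist are assumptions; TSFEL's implementation may differ in these details.

```python
import math


def sample_entropy_sketch(signal, m, tolerance):
    """Textbook sample entropy: -ln(A / B), with O(n^2) pair counting."""
    n = len(signal)

    def count_matches(length):
        # Use n - m templates for both lengths so the counts are comparable.
        temps = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(temps)):
            for j in range(i + 1, len(temps)):
                distance = max(abs(a - b) for a, b in zip(temps[i], temps[j]))
                if distance <= tolerance:
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # assumption: undefined ratio reported as inf
    return -math.log(a / b)


# A perfectly periodic signal is fully predictable, so the entropy is zero.
print(sample_entropy_sketch([1, 2, 1, 2, 1, 2, 1, 2], m=2, tolerance=0.1))
```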

tsfel.feature_extraction.features_utils.set_domain(key, value)[source]
