# Preprocessing

Raw data often contains unwanted signal components: offsets, trends or even oscillatory nuisance signals. Syncopy has a dedicated `preprocessing()` function to clean up and/or transform the data.

Let’s start by creating a new synthetic signal with confounding components:

```
# built-in synthetic data generators
import syncopy as spy
from syncopy import synthdata as spy_synth
cfg_synth = spy.StructDict()
cfg_synth.nTrials = 150
cfg_synth.samplerate = 500
cfg_synth.nSamples = 1000
cfg_synth.nChannels = 2
# 30Hz undamped harmonic
harm = spy_synth.harmonic(cfg_synth, freq=30)
# a linear trend
lin_trend = spy_synth.linear_trend(cfg_synth, y_max=3)
# a 2nd 'nuisance' harmonic
harm50 = spy_synth.harmonic(cfg_synth, freq=50)
# finally the white noise floor
wn = spy_synth.white_noise(cfg_synth)
```

Here we used a `cfg` dictionary to assemble all needed parameters, a concept we adopted from FieldTrip.

## Dataset Arithmetics

If the *shape* of different Syncopy objects matches exactly (`nSamples`, `nChannels` and `nTrials` are all the same), we can use **standard Python arithmetic operators** like `+`, `-`, `*` and `/` directly. Here we want a linear superposition, so we simply add everything together:

```
# add noise, trend and the nuisance harmonic
data_nui = harm + wn + lin_trend + harm50
# also works for scalars
data_nui = data_nui + 5
```
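To make explicit what this element-wise addition amounts to, here is a minimal NumPy sketch of the same superposition for a single trial and channel. It does not use Syncopy: the array names mirror the ones above, but the amplitudes, phases and the shape of the trend are illustrative assumptions, not the actual output of the `synthdata` generators.

```python
import numpy as np

samplerate = 500   # Hz, same value as in cfg_synth
n_samples = 1000   # samples per trial, same value as in cfg_synth
t = np.arange(n_samples) / samplerate  # time axis in seconds

rng = np.random.default_rng(seed=42)

harm = np.cos(2 * np.pi * 30 * t)         # 30 Hz harmonic (signal of interest)
harm50 = np.cos(2 * np.pi * 50 * t)       # 50 Hz nuisance harmonic
lin_trend = np.linspace(0, 3, n_samples)  # assumed trend: rises linearly from 0 to 3
wn = rng.standard_normal(n_samples)       # white noise floor

# element-wise sum of equally shaped arrays, plus a scalar offset
data_nui = harm + wn + lin_trend + harm50 + 5

print(data_nui.shape)
```

As with the Syncopy objects, the sum is only defined because all arrays have the same shape; the scalar is applied to every sample.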

If we now do a spectral analysis, the power spectra are confounded by all our new signal components:

```
cfg = spy.StructDict()
cfg.tapsmofrq = 1
cfg.foilim = [0, 60]
cfg.polyremoval = None
cfg.keeptrials = False # trial averaging
fft_nui_spectra = spy.freqanalysis(data_nui, cfg)
```

Note

We explicitly set `polyremoval=None` to see the full effect of our confounding signal components. The default for `freqanalysis()` is `polyremoval=0`, which removes polynomials of 0th order: constant offsets (*de-meaning*).

Hint

We did not specify the `method` parameter for the `freqanalysis()` call as multi-tapered Fourier analysis (`method='mtmfft'`) is the default. To learn about the defaults of any Python function you can inspect its signature with `spy.freqanalysis?` (in IPython or Jupyter) or `help(spy.freqanalysis)` typed into an interpreter.
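Defaults can also be read programmatically with the standard library's `inspect` module. The sketch below uses a hypothetical stand-in function, `analysis`, whose defaults are chosen to mirror the ones mentioned in this section; it is not Syncopy's actual signature.

```python
import inspect

def analysis(data, method="mtmfft", polyremoval=0):
    """Hypothetical stand-in for a function with keyword defaults."""
    return data

sig = inspect.signature(analysis)

# collect only the parameters that declare a default value
defaults = {
    name: param.default
    for name, param in sig.parameters.items()
    if param.default is not inspect.Parameter.empty
}
print(defaults)
```

The same pattern should work for any regular Python callable, including `spy.freqanalysis`.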

Let’s see what we got:

```
fft_nui_spectra.singlepanelplot()
```

We see strong low-frequency components, originating from both the offset and the trend. We also see the nuisance signal spectral peak at 50Hz.

## Filtering

Filtering, in general, removes or suppresses unwanted signal components. This can be done both in the *time-domain* and in the *frequency-domain*. For offsets and (low-order) polynomial trends, fitting a model directly in the time domain and subtracting the obtained trend is the preferred solution. This can be controlled in Syncopy with the `polyremoval` parameter, which is also directly available in `freqanalysis()`.
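The idea behind time-domain polynomial removal can be sketched in a few lines of NumPy. This is only an illustration of the principle on a toy signal, not Syncopy's internal implementation; `remove_polynomial` is a hypothetical helper, and the offset and slope are assumed values.

```python
import numpy as np

samplerate = 500
t = np.arange(1000) / samplerate

# toy signal: 30 Hz harmonic on top of an offset and a linear trend
signal = np.cos(2 * np.pi * 30 * t)
confounded = signal + 5 + 1.5 * t

def remove_polynomial(x, t, order):
    """Fit a polynomial of the given order to x(t) and subtract it."""
    coeffs = np.polyfit(t, x, deg=order)
    return x - np.polyval(coeffs, t)

demeaned = remove_polynomial(confounded, t, order=0)   # removes only the offset
detrended = remove_polynomial(confounded, t, order=1)  # removes offset and trend

print(np.abs(demeaned - signal).max())   # the linear trend is still present
print(np.abs(detrended - signal).max())  # close to zero
```

Order 0 corresponds to de-meaning and order 1 to removing a linear trend, which is the case used for the dataset in this section.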

Removing signal components in the frequency domain is typically done with *finite impulse response* (FIR) filters or *infinite impulse response* (IIR) filters. Syncopy supports one of each kind: a FIR windowed sinc filter and the Butterworth filter from the IIR family. For both filters we have low-pass (`'lp'`), high-pass (`'hp'`), band-pass (`'bp'`) and band-stop (notch) (`'bs'`) designs available.
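To give a feeling for what a windowed sinc low-pass does, here is a textbook-style NumPy sketch: an ideal sinc response is truncated, tapered with a window and convolved with the signal. The Hamming window, the filter order of 200 and the helper name `windowed_sinc_lowpass` are assumptions made for this illustration; Syncopy's own design choices may differ.

```python
import numpy as np

def windowed_sinc_lowpass(cutoff, samplerate, order):
    """Return `order + 1` FIR taps for a low-pass with the given cutoff in Hz."""
    n = np.arange(order + 1) - order / 2  # tap indices centred on zero
    fc = cutoff / samplerate              # cutoff in cycles per sample
    taps = 2 * fc * np.sinc(2 * fc * n)   # truncated ideal low-pass response
    taps *= np.hamming(order + 1)         # taper to reduce spectral ripple
    return taps / taps.sum()              # normalise to unit gain at 0 Hz

samplerate = 500
t = np.arange(1000) / samplerate
x = np.cos(2 * np.pi * 30 * t) + np.cos(2 * np.pi * 50 * t)

taps = windowed_sinc_lowpass(cutoff=40, samplerate=samplerate, order=200)
y = np.convolve(x, taps, mode="same")

# compare spectral amplitudes at 30 Hz and 50 Hz before and after filtering
freqs = np.fft.rfftfreq(len(x), d=1 / samplerate)
X, Y = np.abs(np.fft.rfft(x)), np.abs(np.fft.rfft(y))
gains = {}
for f in (30, 50):
    idx = np.argmin(np.abs(freqs - f))
    gains[f] = Y[idx] / X[idx]
    print(f, "Hz amplitude gain:", gains[f])
```

The 30 Hz component should pass nearly unchanged while the 50 Hz component is strongly suppressed; small deviations come from edge effects of the finite-length convolution.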

To clean up our dataset above, we remove the linear trend and apply a low-pass 12th order Butterworth filter:

```
data_pp = spy.preprocessing(data_nui,
                            filter_class='but',
                            filter_type='lp',
                            polyremoval=1,
                            freq=40,
                            order=12)
```

Now let’s reuse our `cfg` from above to repeat the spectral analysis with the preprocessed data:

```
spec_pp = spy.freqanalysis(data_pp, cfg)
spec_pp.singlepanelplot()
```

As expected for a low-pass filter, all frequencies above 40Hz are strongly attenuated (note the log scale, so the suppression is around 2 orders of magnitude). We also removed the low-frequency components from the offset and trend, but acknowledge that we also lost a bit of the original white noise power around 0-2Hz. Importantly, the spectral power of our frequency band of interest, around 30Hz, remained virtually unchanged.
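As a rough plausibility check on these numbers, the textbook power gain of an analog Butterworth low-pass of order n and cutoff f_c is 1 / (1 + (f / f_c)^(2n)). The sketch below evaluates this idealized formula for the parameters used above. It assumes a single filter pass and ignores the details of the digital implementation, so it is an order-of-magnitude estimate only, not a reproduction of what Syncopy computes; `butterworth_power_gain` is a hypothetical helper.

```python
import math

def butterworth_power_gain(freq, cutoff, order):
    """Idealized analog Butterworth low-pass power gain |H(f)|^2."""
    return 1.0 / (1.0 + (freq / cutoff) ** (2 * order))

for f in (30, 40, 50):
    gain = butterworth_power_gain(f, cutoff=40, order=12)
    print(f"{f} Hz: power gain {gain:.4f} ({10 * math.log10(gain):.1f} dB)")
```

Under these assumptions the power at 50Hz is reduced to roughly 1/200 of its original value, the gain at the 40Hz cutoff is one half, and 30Hz is essentially untouched, which is consistent with the spectrum described above.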