.. THIS FILE IS AUTOMATICALLY GENERATED - DO NOT EDIT

.. _available-modules:

=================
Available modules
=================

The modules are found either in :mod:`vt_server_modules`, or defined individually as Python modules. Each module is called with a keyword that is used for the ``module`` field in the query.

________________________________________

``channel-patch``
-----------------

`"channel-patch"` copies the input signal onto various channels.

The arguments are:

coefs
    An array of coefficients applied to each channel. If the input signal is X, and the coefficients are [a1, a2], the output will be [a1⋅X, a2⋅X].

In a two-channel (stereo) file, the first channel is the left channel, and the second channel is the right channel.

________________________________________

``gibberish``
-------------

This module contains a function to create a gibberish masker, created out of random sentence chunks, that can be used in the CRM experiment.

.. code-block:: json

    {
        "module": "gibberish",
        "seed": 8,
        "files": ["sp1F/cat_8_red.wav", "sp1F/cat_9_black.wav", "..."],
        "chunk_dur_min": 0.2,
        "chunk_dur_max": 0.7,
        "total_dur": 1.2,
        "prevent_chunk_overlap": true,
        "ramp": 0.05,
        "force_nb_channels": 1,
        "force_fs": 44100,
        "stack": [
            {
                "module": "world",
                "f0": "*2",
                "vtl": "-3.8st"
            }
        ]
    }

**This module is intended to be used at the top of the stack**

If the source files have different sampling frequencies, the sampling frequency of the first chunk will be used as reference, and all the following segments will be resampled to that sampling frequency. Alternatively, it is possible to specify ``force_fs`` to impose a sampling frequency. If ``force_fs`` is 0 or ``None``, the default method is used.

A similar mechanism is used for stereo vs. mono files. The number of channels can be imposed with ``force_nb_channels``. Again, if ``force_nb_channels`` is 0 or ``None``, the default method based on the first chunk is used.
If the number of channels of a segment is greater than the number of channels in the output, all channels are averaged and duplicated to the appropriate number of channels. This is fine for stereo/mono conversion, but keep that in mind if you ever use files with more channels. If a segment has fewer channels than needed, the extra channels are created by recycling the existing values. Again, for stereo/mono conversion, this is fine, but might not be what you want for multi-channel audio.

Files
^^^^^

The module will look through the provided ``files`` to generate the output. As much as possible, it will try to not reuse a file, but will recycle the list if necessary.

If the module is first in the stack, the filenames provided in ``files`` (or ``shell_pattern``, or ``re_pattern``) are relative to the folder specified in the ``file`` field of the query. Make sure that the folder name ends with a `/`. However, note that if the module is not used at the top of the stack, but lower, there may be unexpected results as the folder will be the cache folder of the previous module.

The list will be shuffled randomly based on the ``seed`` parameter.

Instead of ``files``, we can have ``shell_pattern``, which defines shell-like patterns as an object:

.. code-block:: json

    {
        "module": "gibberish",
        "seed": 8,
        "shell_pattern": {
            "include": "sp1F/cat*.wav",
            "exclude": ["sp1F/cat_8_*.wav", "sp1F/cat_*_red.wav"]
        },
        "...": "..."
    }

If a list of patterns is provided, the outcome is cumulative.

Alternatively, a regular expression can be used as ``re_pattern``.

If more than one of ``files``, ``shell_pattern`` and ``re_pattern`` is provided, only one is used, with priority given in the order they are presented here.

Segment properties
^^^^^^^^^^^^^^^^^^

``chunk_dur_min`` and ``chunk_dur_max`` define the minimum and maximum segment duration.

``total_dur`` is the total duration we are aiming to generate.

``ramp`` defines the duration of the ramps applied to each segment.
``prevent_chunk_overlap`` defines whether the algorithm tries to select intervals that do not overlap (default is true). This is only relevant if all the sound files have a similar structure (like in the CRM).

Stack
^^^^^

``stack`` is an optional processing stack that will be applied to all the selected files before concatenation.

Seed
^^^^

The ``seed`` parameter is mandatory to make sure the cache is managed properly.

.. Created on 2020-06-09.

________________________________________

``mixin``
---------

`"mixin"` adds another sound file (B) to the input file (A). The arguments are:

file
    The file that needs to be added to the input file.

levels *=[0,0]*
    A 2-element array containing the gains in dB applied to A and B.

pad *=[0,0,0,0]*
    A 4-element array that specifies the before and after padding of A and B (in seconds): ``[A.before, A.after, B.before, B.after]``. Note that this could also be done with sub-queries, but doing it here will reduce the number of cache files generated.

align *='left'*
    'left', 'center', or 'right'. When the two sound files are not the same length, the shorter one will be padded so as to be aligned as described with the other one. This is applied after padding.

If the two sound files are not of the same sampling frequency, they are resampled to the max of the two.

If the two sound files are not the same shape (number of channels), the one with fewer channels is duplicated to have the same number of channels as the one with the most.

________________________________________

``pad``
-------

`"pad"` adds silence before and/or after the sound. It takes **before** and/or **after** as arguments, specifying the duration of silence in seconds.

________________________________________

``ramp``
--------

`"ramp"` smooths the onset and/or offset of a signal by applying a ramp.

The parameters are:

duration
    In seconds. If a single number, it is applied to both onset and offset. If a vector is given, then it specifies `[onset, offset]`.
    A value of zero means no ramp.

shape
    Either 'linear' (default) or 'cosine'.

________________________________________

``slice``
---------

`"slice"` selects a portion of a sound. It takes the following arguments:

start
    The onset point, in seconds. [0 if omitted.]

end
    The offset point, in seconds. Negative numbers are counted from the end. Values exceeding the length of the file will lead to zero padding. [The end of the sound if omitted.]

If the start time is larger than the end time, an error is raised.

________________________________________

``time-reverse``
----------------

`"time-reverse"` reverses the input in time. It doesn't take any argument.

________________________________________

``vocoder``
-----------

This module defines the *vocoder* processor based on `vocoder `_, a MATLAB vocoder designed to be highly programmable.

Here is an example of module instructions:

.. code-block:: json

    {
        "module": "vocoder",
        "fs": 44100,
        "analysis_filters": {
            "f": { "fmin": 100, "fmax": 8000, "n": 8, "scale": "greenwood" },
            "method": { "family": "butterworth", "order": 3, "zero-phase": true }
        },
        "synthesis_filters": "analysis_filters",
        "envelope": {
            "method": "low-pass",
            "rectify": "half-wave",
            "order": 2,
            "fc": 160,
            "modifiers": "spread"
        },
        "synthesis": {
            "carrier": "sin",
            "filter_before": false,
            "filter_after": true
        }
    }

The **fs** attribute is optional but can be used to speed up processing. The filter definitions that are generated depend on the sampling frequency, so it has to be known to generate the filters. If the argument is not passed, it will be read from the file that needs processing. Passing the sampling frequency as an attribute will speed things up as we don't need to open the sound file to check its sampling rate. However, beware that if **fs** does not match that of the file, you will get an error.
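When the sampling rate of a file is not known in advance, one way to avoid a mismatch is to read it on the client side before building the query. The following is only a minimal sketch: ``add_fs_from_wav`` is a hypothetical helper that is not part of the server, and it assumes the sound file is an uncompressed PCM WAV file that the standard-library :mod:`wave` module can open.

```python
import wave


def add_fs_from_wav(query, wav_path):
    """Return a copy of `query` with "fs" set to the sampling rate of `wav_path`.

    Hypothetical client-side helper, assuming a PCM WAV file.
    """
    with wave.open(wav_path, "rb") as wav_file:
        fs = wav_file.getframerate()
    updated = dict(query)  # shallow copy, the caller's query is left untouched
    updated["fs"] = fs
    return updated
```

The returned dictionary can then be serialised as the module instructions, so that **fs** always agrees with the file being processed.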
The other attributes are as follows:

analysis_filters
^^^^^^^^^^^^^^^^

**analysis_filters** is a dictionary defining the filterbank used to analyse the input signal. It defines both the cutoff frequencies **f** and the filtering **method**.

*f* Filterbank frequencies
~~~~~~~~~~~~~~~~~~~~~~~~~~~

These can either be specified as an array of values, using a predefined setting, or by using a regular method.

If **f** is a numerical array, the values are used as frequencies in Hertz.

If **f** is a string, it refers to a predefined setting. The predefined values are: `ci24` and `hr90k`, referring to the default maps of cochlear implant manufacturers Cochlear and Advanced Bionics, respectively.

Otherwise **f** is a dictionary with the following items:

fmin
    The starting frequency of the filterbank.

fmax
    The end frequency of the filterbank.

n
    The number of channels.

scale
    `[optional]` The scale on which the frequencies are divided into channels. Default is `log`. Possible values are `greenwood`, `log` and `linear`.

shift
    `[optional]` A shift in millimeters, towards the base. Note that the shift is applied after all other calculations so the `fmin` and `fmax` boundaries will not be respected anymore.

Filtering *method*
~~~~~~~~~~~~~~~~~~

A dictionary with the following elements:

family
    The type of filter. At the moment only `butterworth` is implemented.

For `butterworth`, the following parameters have to be provided:

order
    The actual order of the filter. Watch out: this is the order that is actually achieved. Choosing `true` for `zero-phase` means only even numbers can be provided.

zero-phase
    Whether a zero-phase filter is being used. If `true`, then :func:`filtfilt` is used instead of :func:`filt`.

Unlike in the MATLAB version, this is implemented with second-order section filters (:func:`sosfiltfilt` and :func:`sosfilt`).

synthesis_filters
^^^^^^^^^^^^^^^^^

It can be the string `"analysis_filters"` to make them identical to the analysis filters.
This is also what happens if the element is omitted or ``null``. Otherwise it can be a dictionary similar to `analysis_filters`. The number of channels has to be the same. If it differs, an error will be returned.

envelope
^^^^^^^^

This specifies how the envelope is extracted.

method
    Can be `low-pass` or `hilbert`.

For `low-pass`, the envelope is extracted with rectification and low-pass filtering. The following parameters are required:

rectify
    The wave rectification method: `half-wave` or `full-wave`.

order
    The order of the filter used for envelope extraction. Again, this is the effective order, so only even numbers are accepted because the envelope is extracted with a zero-phase filter.

fc
    The cutoff of the envelope extraction in Hertz. Can be a single value or a value per band. If fewer values than bands are provided, the array is recycled as necessary.

modifiers
    `[optional]` A (list of) modifier function names that can be applied to the envelope matrix. At the moment, only `"spread"` is implemented. With this modifier, the synthesis filters are used to simulate a spread of excitation on the envelope levels themselves. This is useful when the carrier is a sinewave (see Crew et al., 2012, JASA).

synthesis
^^^^^^^^^

The **synthesis** field describes how the resynthesis should be performed.

carrier
    Can be `noise` or `sin` (`low-noise` and `pshc` are not implemented).

filter_before
    If `true`, the carrier is filtered before multiplication with the envelope (default is `false`).

filter_after
    If `true`, the modulated carrier is refiltered in the band to suppress sidebands (default is `true`). Keep in mind that if you filter broadband carriers both before and after modulation you may alter the spectral shape of your signal.

random_seed
    `[optional]` For noise carriers only. If the `carrier` is `noise`, then a random seed can be provided in `random_seed` to have frozen noise. If not, the random number generator will be initialized with the current clock.
    Note that for multi-channel audio files, the seed is used for each channel. If no seed is given, the various bands will have different noises as carriers. To have correlated noise across bands, pass in a (random) seed. Also note that the cache system means that once an output file is generated, it will be served as is rather than re-generated. To generate truly random files, provide a random seed on each request.

If the `carrier` is `sin`, the center frequency of each band will be determined based on the scale that is used. If cutoffs are manually provided, the geometric mean is used as center frequency.

.. Created on 2020-03-27.

________________________________________

``world``
---------

This module defines the *world* processor based on `pyworld `_, a module wrapping `Morise's WORLD vocoder `_.

Here are some examples of module instructions:

.. code-block:: json

    {
        "module": "world",
        "f0": "*2",
        "vtl": "-3.8st"
    }

If a key is missing (here, **duration**) it is considered as ``None``, which means this part is left unchanged.

**f0** can take the following forms:

* ``*`` followed by a number, in which case it is a multiplicative ratio applied to the whole f0 contour. For instance ``*2``.
* a positive or negative number followed by a unit (``Hz`` or ``st``). This will behave like an offset, adding so many Hertz or so many semitones to the f0 contour.
* ``~`` followed by a number, followed by a unit (only ``Hz``). This will set the *average* f0 to the defined value.

**vtl** is defined similarly:

* ``*`` represents a multiplier for the vocal-tract length. Beware, this is not a multiplier for the spectral envelope, but its inverse.
* offsets are defined using the unit ``st`` only.

**duration**:

* the ``*`` multiplier can also be used.
* an offset can be defined in seconds (using unit ``s``).
* the absolute duration can be set using ``~`` followed by a value and the ``s`` unit.
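The three forms listed above (ratio, offset, absolute value) can be told apart with a small parser. The sketch below is only an illustration of the syntax, not the module's actual implementation: ``parse_spec`` and ``shift_f0`` are hypothetical names, the regular expression is an assumed grammar, and the ``~`` form is parsed but not applied because it requires the whole contour. The semitone conversion uses the standard ratio of 2^(1/12) per semitone.

```python
import re

# Assumed grammar for the strings described above: an optional "*" or "~"
# prefix, a signed number, and an optional unit such as "Hz", "st" or "s".
_SPEC = re.compile(r"^(?P<op>[*~]?)(?P<value>[+-]?\d+(?:\.\d+)?)(?P<unit>[A-Za-z]*)$")


def parse_spec(spec):
    """Split a specification string into (kind, value, unit)."""
    match = _SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"Cannot parse specification: {spec!r}")
    op, unit = match.group("op"), match.group("unit")
    value = float(match.group("value"))
    if op == "*":
        if unit:
            raise ValueError("A '*' ratio takes no unit")
        return ("ratio", value, None)
    if not unit:
        raise ValueError("Offsets and absolute values need a unit")
    return ("absolute" if op == "~" else "offset", value, unit)


def shift_f0(f0_hz, spec):
    """Apply a ratio or offset specification to a single f0 value in Hz."""
    kind, value, unit = parse_spec(spec)
    if kind == "ratio":
        return f0_hz * value
    if kind == "offset" and unit == "Hz":
        return f0_hz + value
    if kind == "offset" and unit == "st":
        # One semitone corresponds to a frequency ratio of 2 ** (1 / 12).
        return f0_hz * 2 ** (value / 12)
    raise ValueError(f"Unsupported f0 specification: {spec!r}")
```

For instance, ``parse_spec("-3.8st")`` returns ``("offset", -3.8, "st")``, and ``shift_f0(100.0, "+12st")`` raises a 100 Hz value by one octave.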
Note that in v0.2.8, WORLD makes the sounds 1 frame (5 ms) too long if no duration is specified. If you specify the duration, it is generated accurately.

.. Created on 2020-03-20.