.. THIS FILE IS AUTOMATICALLY GENERATED - DO NOT EDIT

.. _available-modules:

=================
Available modules
=================

The modules are found either in :mod:`vt_server_modules`, or defined individually as Python modules. Each module
is called with a keyword that is used for the ``module`` field in the query.


________________________________________


``channel-patch``
-----------------


    `"channel-patch"` copies the input signal onto various channels. The arguments are:

    coefs
        An array of coefficients applied to each channel. If the input signal is X, and
        the coefficients are [a1, a2], the output will be [a1⋅X, a2⋅X]. In a two-channel
        (stereo) file, the first channel is the left channel, and the second channel is
        the right channel.
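
    As an illustration, the following instruction routes the input to the left channel only
    and leaves the right channel silent, i.e. the output is [1⋅X, 0⋅X]. This is a minimal
    sketch that assumes ``coefs`` is passed as a JSON array next to the ``module`` keyword,
    as in the examples given for the other modules:

    .. code-block:: json

        {
            "module": "channel-patch",
            "coefs":  [1, 0]
        }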

    
________________________________________


``gibberish``
-------------


This module contains a function to create a gibberish masker,
created out of random sentence chunks, that can be used in the CRM experiment.

.. code-block:: json

    {
        "module": "gibberish",
        "seed":   8,
        "files": ["sp1F/cat_8_red.wav", "sp1F/cat_9_black.wav", "..."],
        "chunk_dur_min": 0.2,
        "chunk_dur_max": 0.7,
        "total_dur": 1.2,
        "prevent_chunk_overlap": true,
        "ramp": 0.05,
        "force_nb_channels": 1,
        "force_fs": 44100,
        "stack": [
            {
                "module": "world",
                "f0":     "*2",
                "vtl":    "-3.8st"
            }
        ]
    }

**This module is intended to be used at the top of the stack**

If the source files have different sampling frequencies, the sampling frequency of the first chunk
will be used as reference, and all the following segments will be resampled to that sampling frequency.
Alternatively, it is possible to specify ``force_fs`` to impose a sampling frequency. If ``force_fs`` is
0 or ``None``, the default method is used.

A similar mechanism is used for stereo vs. mono files. The number of channels can be imposed with ``force_nb_channels``.
Again, if ``force_nb_channels`` is 0 or ``None``, the default method based on the first chunk is used. If the number
of channels of a segment is greater than the number of channels in the output, all channels are averaged and
duplicated to the appropriate number of channels. This is fine for stereo/mono conversion, but keep that in mind
if you ever use files with more channels. If a segment has fewer channels than needed, the extra channels are created
by recycling the existing values. Again, for stereo/mono conversion, this is fine, but might not be what you want
for multi-channel audio.

Files
^^^^^

The module will look through the provided ``files`` to generate the output. As much as possible, it will try to
not reuse a file, but will recycle the list if necessary.

If the module is first in the stack, the filenames provided in ``files`` (or ``shell_pattern``, or ``re_pattern``)
are relative to the folder specified in the ``file`` field of the query. Make sure that the folder name ends with a `/`.

However, note that if the module is not used at the top of the stack, but lower, there may be unexpected results as the folder will be the cache folder of the previous module.

The list will be shuffled randomly based on the ``seed`` parameter.

Instead of ``files``, we can have ``shell_pattern``, which defines shell-like patterns as an object:

.. code-block:: json

    {
        "module": "gibberish",
        "seed":   8,
        "shell_pattern": {
            "include": "sp1F/cat*.wav",
            "exclude": ["sp1F/cat_8_*.wav", "sp1F/cat_*_red.wav"]
        },
        "...": "..."
    }

If a list of patterns is provided, the outcome is cumulative.

Alternatively, a regular expression can be used as ``re_pattern``.

If more than one of ``files``, ``shell_pattern`` and ``re_pattern`` is provided, only one is used, with priority given in the order they are presented here.

Segment properties
^^^^^^^^^^^^^^^^^^

``chunk_dur_min`` and ``chunk_dur_max`` define the minimum and maximum segment duration. ``total_dur`` is the total duration we are aiming to generate. ``ramp`` defines the duration of the ramps applied to each segment.
``prevent_chunk_overlap`` defines whether the algorithm tries to select intervals that do not overlap (default is true). This is only relevant if all the sound files have a similar structure (like in the CRM).

Stack
^^^^^

``stack`` is an optional processing stack that will be applied to all the selected files before concatenation.

Seed
^^^^

The ``seed`` parameter is mandatory to make sure the cache is managed properly.

.. Created on 2020-06-09.


________________________________________


``mixin``
---------


    `"mixin"` adds another sound file (B) to the input file (A). The arguments are:

    file
        The file that needs to be added to the input file.

    levels *=[0,0]*
        A 2-element array containing the gains in dB applied to A and B.

    pad *=[0,0,0,0]*
        A 4-element array that specifies the before and after padding of A and B (in seconds): ``[A.before, A.after, B.before, B.after]``.
        Note that this could also be done with sub-queries, but doing it here will reduce the number of cache files generated.

    align *='left'*
        'left', 'center', or 'right'. When the two sound files are not the same length,
        the shorter one will be padded so as to be aligned as described with the other one. This is
        applied after padding.

    If the two sound files do not have the same sampling frequency, they are resampled to the higher of the two.

    If the two sound files are not the same shape (number of channels), the one with fewer channels is duplicated to have the same number of channels as the one with the most.
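
    For instance, the instruction below is a sketch of how the arguments combine. The file
    name ``noise.wav`` is hypothetical. Here B is attenuated by 6 dB while A is left at its
    original level, 0.5 s of silence is inserted before B, and ``align`` keeps its
    default value:

    .. code-block:: json

        {
            "module": "mixin",
            "file":   "noise.wav",
            "levels": [0, -6],
            "pad":    [0, 0, 0.5, 0],
            "align":  "left"
        }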

    
________________________________________


``pad``
-------


    `"pad"` adds silence before and/or after the sound. It takes **before** and/or **after**
    as arguments, specifying the duration of silence in seconds.
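
    A minimal sketch, assuming both arguments are given as plain numbers: this adds
    0.5 s of silence before the sound and 0.25 s after it.

    .. code-block:: json

        {
            "module": "pad",
            "before": 0.5,
            "after":  0.25
        }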
    

________________________________________


``ramp``
--------


    `"ramp"` smooths the onset and/or offset of a signal by applying a ramp. The parameters are:

    duration
        In seconds. If a single number, it is applied to both onset and offset.
        If a vector is given, then it specifies `[onset, offset]`. A value of zero means no ramp.

    shape
        Either 'linear' (default) or 'cosine'.
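
    As a sketch of the vector form of ``duration``, the following instruction applies a
    50 ms cosine ramp to the onset and, because the second value is zero, no ramp to
    the offset:

    .. code-block:: json

        {
            "module":   "ramp",
            "duration": [0.05, 0],
            "shape":    "cosine"
        }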

    
________________________________________


``slice``
---------


    `"slice"` selects a portion of a sound. It takes the following arguments:

    start
        The onset point, in seconds. [0 if omitted.]

    end
        The offset point, in seconds. Negative numbers are counted
        from the end. Values exceeding the length of the file will lead to zero padding.
        [The end of the sound if omitted.]

    If the start time is larger than the end time, an error is raised.
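
    For example, under the rules above, this instruction should keep the portion of the sound
    that starts at 0.5 s and stops 0.2 s before the end of the file (a sketch with
    arbitrary values):

    .. code-block:: json

        {
            "module": "slice",
            "start":  0.5,
            "end":    -0.2
        }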
    

________________________________________


``time-reverse``
----------------


    `"time-reverse"` reverses the input in time. It does not take any arguments.
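
    Since there are no arguments, the instruction presumably reduces to the ``module``
    keyword alone:

    .. code-block:: json

        {
            "module": "time-reverse"
        }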

    
________________________________________


``vocoder``
-----------


This module defines the *vocoder* processor based on `vocoder <https://github.com/egaudrain/vocoder>`_,
a MATLAB vocoder designed to be highly programmable.

Here is an example of module instructions:

.. code-block:: json

    {
        "module": "vocoder",
        "fs": 44100,
        "analysis_filters": {
            "f": { "fmin": 100, "fmax": 8000, "n": 8, "scale": "greenwood" },
            "method": { "family": "butterworth", "order": 3, "zero-phase": true }
            },
        "synthesis_filters": "analysis_filters",
        "envelope": {
            "method": "low-pass",
            "rectify": "half-wave",
            "order": 2,
            "fc": 160,
            "modifiers": "spread"
            },
        "synthesis": {
            "carrier": "sin",
            "filter_before": false,
            "filter_after": true
            }
    }

The **fs** attribute is optional but can be used to speed up processing. The filter
definitions that are generated depend on the sampling frequency, so it has to
be known to generate the filters. If the argument is not passed, it will be read from
the file that needs processing. Passing the sampling frequency as an attribute will
speed things up as we don't need to open the sound file to check its sampling rate.
However, beware that if the **fs** does not match that of the file, you will get an
error.

The other attributes are as follows:

analysis_filters
^^^^^^^^^^^^^^^^

**analysis_filters** is a dictionary defining the filterbank used to analyse the
input signal. It defines both the cutoff frequencies **f** and the filtering **method**.

*f* Filterbank frequencies
~~~~~~~~~~~~~~~~~~~~~~~~~~~

These can either be specified as an array of values, using a predefined setting, or
by using a regular method.

If **f** is a numerical array, the values are used as frequencies in Hertz.

If **f** is a string, it refers to a predefined setting. The predefined values are:
`ci24` and `hr90k` referring to the default map of cochlear implant manufacturers
Cochlear and Advanced Bionics, respectively.
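
As a sketch of the string form, the fragment below selects the predefined `ci24` setting
instead of spelling out the frequencies. The **method** entry is copied from the example
at the top of this section, and the ``"..."`` entry stands for the remaining attributes:

.. code-block:: json

    {
        "module": "vocoder",
        "analysis_filters": {
            "f": "ci24",
            "method": { "family": "butterworth", "order": 3, "zero-phase": true }
            },
        "...": "..."
    }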

Otherwise **f** is a dictionary with the following items:

    fmin
        The starting frequency of the filterbank.
    fmax
        The end frequency of the filterbank.
    n
        The number of channels.
    scale
        `[optional]` The scale on which the frequencies are divided into channels. Default is
        `log`. Possible values are `greenwood`, `log` and `linear`.
    shift
        `[optional]` A shift in millimeters, towards the base. Note that the shift is applied
        after all other calculations so the `fmin` and `fmax` boundaries will
        not be respected any

e envelope is extracted with a zero-phase filter.

            fc
                The cutoff of the envelope extraction in Hertz. Can be a single
                value or a value per band. If fewer values than bands are provided,
                the array is recycled as necessary.

            modifiers
                `[optional]` A (list of) modifier function names that can be
                applied to envelope matrix.
                At the moment, only `"spread"` is implemented. With this modifier,
                the synthesis filters are used to simulate a spread of excitation
                on the envelope levels themselves. This is useful when the carrier
                is a sinewave (see Crew et al., 2012, JASA).


synthesis
^^^^^^^^^

The **synthesis** field describes how the resynthesis should be performed.

    carrier
        Can be `noise` or `sin` (`low-noise` and `pshc` are not implemented).

    filter_before
        If `true`, the carrier is filtered before multiplication with the envelope (default is `false`).

    filter_after
        If `true`, the modulated carrier is refiltered in the band to suppress sidebands
        (default is `true`). Keep in mind that if you filter broadband carriers both
        before and after modulation you may alter the spectral shape of your signal.

    random_seed
        `[optional]` For noise carriers only.

If the `carrier` is `noise`, then a random seed can be provided in `random_seed`
to have frozen noise. If not, the random number generator will be initialized with the
current clock. Note that for multi-channel audio files, the seed is used for each
channel. If no seed is given, the various bands will have different noises as
carriers. To have correlated noise across bands, pass in a (random) seed. Also note
that the cache system also means that once an output file is generated, it will be served
as is rather than re-generated. To generate truly random files, provide a random seed on each
request.
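
A **synthesis** field using frozen noise could look like the sketch below. The seed value
is arbitrary, and it is an assumption here that an integer is accepted. The ``"..."``
entry stands for the other attributes of the module:

.. code-block:: json

    {
        "module": "vocoder",
        "...": "...",
        "synthesis": {
            "carrier": "noise",
            "filter_before": false,
            "filter_after": true,
            "random_seed": 1234
            }
    }

Repeating a request with the same seed will then be served from the cache, while changing the
seed produces a new noise carrier.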

If the `carrier` is `sin`, the center frequency of each band will be determined based on the scale
that is used. If cutoffs are manually provided, the geometric mean is used as center frequency.

.. Created on 2020-03-27.


________________________________________


``world``
---------


This module defines the *world* processor based on `pyworld <https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder>`_,
a module wrapping `Morise's WORLD vocoder <https://github.com/mmorise/World>`_.

Here is an example of module instructions:

.. code-block:: json

    {
        "module": "world",
        "f0":     "*2",
        "vtl":    "-3.8st"
    }

If a key is missing (here, **duration**) it is considered as ``None``, which means this part is left unchanged.

**f0** can take the following forms:

    * ``*`` followed by a number, in which case it is a multiplicative ratio applied to the
      whole f0 contour. For instance ``*2``.

    * a positive or negative number followed by a unit (``Hz`` or ``st``). This will behave
      like an offset, adding so many Hertz or so many semitones to the f0 contour.

    * ``~`` followed by a number, followed by a unit (only ``Hz``). This will
      set the *average* f0 to the defined value.

**vtl** is defined similarly:

    * ``*`` represents a multiplier for the vocal-tract length. Beware, this is not a multiplier
      for the spectral envelope, but its inverse.

    * offsets are defined using the unit ``st`` only.

**duration**:

    * the ``*`` multiplier can also be used.

    * an offset can be defined in seconds (using unit ``s``).

    * the absolute duration can be set using ``~`` followed by a value and the ``s`` unit.
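
Putting the three keys together, the following sketch (with arbitrary values) sets the
average f0 to 120 Hz, multiplies the vocal-tract length by 1.2, and sets the duration to
1.5 s. The value strings follow the syntax described above, written without spaces as in
the first example:

.. code-block:: json

    {
        "module":   "world",
        "f0":       "~120Hz",
        "vtl":      "*1.2",
        "duration": "~1.5s"
    }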

Note that in v0.2.8, WORLD makes the sounds 1 frame (5 ms) too long if no duration is specified. If you
specify the duration, it is generated accurately.

.. Created on 2020-03-20.