How to make a module

The VT Server functionality can be extended by creating new modules. This section gives some pointers on how to do just that.

The basic principle is pretty straightforward. If we were writing a module called “toto”, we would have to define a function with the following signature:

def process_toto(in_filename, parameters, out_filename):
    ...
    return out_filename

in_filename: The path to the input file. It is either an original file, or an intermediary file passed on by the previous module in a stack of modules.
parameters: The module’s parameters. This is what a user passes to the module in a query.
out_filename: Provided by the vt_server_brain. The module is responsible for writing this file once the processing is finished, and must return the filename.

For the module to be automatically discovered by VTServer, save it in a Python file called vt_server_module_toto.py.
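As a minimal sketch of the signature above, the following pass-through module simply copies its input to the output path. A real module would read the sound, process it according to the parameters, and write the result; the copy is only a stand-in so that the example stays dependency-free.

```python
import shutil


def process_toto(in_filename, parameters, out_filename):
    """Pass-through 'toto' module: copy the input file to the output path."""
    # A real module would read in_filename, transform the sound according
    # to `parameters`, and write the result. Here the file is copied as-is.
    shutil.copyfile(in_filename, out_filename)
    # The module must return the path of the file it wrote.
    return out_filename
```

Saved as vt_server_module_toto.py, this would be the smallest module that satisfies the contract described above.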

Types of modules

By default, modules are considered to be of type ‘modifier’. However, modules can also be of type ‘generator’. The only difference between the two is that modifier modules generate job-files that list in_filename as the source file, while generator modules have to declare which files they are using (if any). That means that, for a generator module, the process_XXX function needs to return both out_filename and a list of source files.
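The return value of a generator module might look like the sketch below. The module name "totogen" and the "files" parameter are hypothetical, the processing is a placeholder, and how a module is registered as a generator is not covered here.

```python
import shutil


def process_totogen(in_filename, parameters, out_filename):
    """Hypothetical generator module built from parameters['files']."""
    source_files = list(parameters.get("files", []))
    if not source_files:
        raise ValueError("[totogen] The files parameter must list at least one file")
    # Placeholder for the real generation step: copy the first source file.
    shutil.copyfile(source_files[0], out_filename)
    # A generator returns the output path AND the files it actually used.
    return out_filename, source_files
```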

Creating an interface

The first step should be to write some code to parse the module’s parameters. When called by the brain, the module function receives three arguments: the in_filename, the set of parameters, and the out_filename.

The in_filename refers either to the file argument of the query (if this is a multi-file query, each file of the array is passed in turn) or, if the module is further down the processing stack, to the output filename of the previous module in the stack. Note that when you run the VTServer locally, you can use relative file paths because you know where the VTServer is running from. However, if you run it through a web interface, i.e. through the AJAX/PHP client, the file roots will be automatically rebased to the audio folder.

The parameters are only the parameters of the module itself, as a dict. This is what you need to define. You are also responsible for testing the validity of the query. If some parameters are missing or inadequate, you need to raise a ValueError exception whose description starts with the module name in square brackets:

raise ValueError("[world] Error while parsing argument %s (%s): %s" % (k, m[k], args))
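A parameter parser following this convention could be sketched as below. The parameter names ("gain", "label"), their defaults, and the decision to skip a "module" key are all assumptions made for illustration.

```python
def parse_toto_parameters(parameters):
    """Validate the parameters of a hypothetical 'toto' module."""
    defaults = {"gain": 1.0, "label": ""}
    expected_types = {"gain": (int, float), "label": str}
    parsed = dict(defaults)
    for key, value in parameters.items():
        if key == "module":
            continue  # assumption: the stack entry may still carry the module name
        if key not in expected_types:
            raise ValueError("[toto] Unknown parameter %s (%r)" % (key, value))
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_types[key]):
            raise ValueError(
                "[toto] Error while parsing argument %s (%r): wrong type" % (key, value)
            )
        parsed[key] = value
    if parsed["gain"] <= 0:
        raise ValueError(
            "[toto] Error while parsing argument gain (%r): must be positive" % parsed["gain"]
        )
    return parsed
```

Calling the parser at the top of the process function keeps all validation in one place and guarantees that every error message carries the bracketed module name.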

There is no general strategy for deciding how to parametrise your module. However, it is useful to keep in mind that the module parameters will likely be generated by some Javascript on a website. As a result, you should only rely on types that can be translated into JSON. That means you cannot easily pass functions or binary data: a function would have to be passed as a string and parsed by the module, and binary data would have to be encoded in something like base64. Also keep in mind that the queries are sent over the internet, so they have to remain relatively light. That is, as much as possible, you should parametrise your module in a way that does not require large queries to be sent.
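To illustrate the base64 route for binary data, here is a small round trip through JSON using only the standard library. The "blob" parameter name is hypothetical.

```python
import base64
import json


def encode_binary_parameter(data):
    """Turn bytes into an ASCII string that can be stored in JSON."""
    return base64.b64encode(data).decode("ascii")


def decode_binary_parameter(text):
    """Recover the original bytes inside the module."""
    return base64.b64decode(text.encode("ascii"))


payload = bytes([0, 255, 16, 32])
# Client side: the binary payload travels as a JSON string.
stack_entry = json.dumps({"module": "toto", "blob": encode_binary_parameter(payload)})
# Module side: decode the string back into bytes.
restored = decode_binary_parameter(json.loads(stack_entry)["blob"])
assert restored == payload
```

Base64 output is about a third larger than the data it encodes, so this is best reserved for small payloads.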

For instance, in the vt_server_module_gibberish module, we made it possible to pass the list of files used to produce the gibberish as an argument, but we also provided facilities to use wildcard shell patterns or regular expressions to specify the file list from a given folder, thus reducing the need for sending queries containing lists of potentially hundreds or thousands of filenames.
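The idea can be sketched as follows. This is not the gibberish module’s actual code: the parameter names ("files", "folder", "shell_pattern", "regex") and the "[toto]" prefix are assumptions chosen for the example.

```python
import fnmatch
import os
import re


def resolve_file_list(parameters):
    """Build a file list from an explicit list, a shell pattern, or a regex."""
    if "files" in parameters:
        return list(parameters["files"])
    if "folder" not in parameters:
        raise ValueError("[toto] Either files or folder must be provided")
    folder = parameters["folder"]
    names = sorted(os.listdir(folder))
    if "shell_pattern" in parameters:
        # Wildcard matching, e.g. "*.wav"
        names = fnmatch.filter(names, parameters["shell_pattern"])
    elif "regex" in parameters:
        pattern = re.compile(parameters["regex"])
        names = [name for name in names if pattern.search(name)]
    else:
        raise ValueError("[toto] folder requires shell_pattern or regex")
    return [os.path.join(folder, name) for name in names]
```

With this approach, a query can select a whole folder of recordings with a single short pattern instead of an explicit list of paths.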

Keep in mind that the query results are cached: if the same query is sent again, it will not even reach your function, but will be picked up by the vt_server_brain before that. If your processing contains random elements that need to be regenerated every time, you should add a random seed as a parameter in your queries, and make sure to set the cache directive to a short enough value.
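On the module side, a seed parameter can drive a local random generator, so that a new seed yields a new cache entry while a repeated query stays reproducible. The sketch below assumes a hypothetical "seed" parameter; the cache directive is not shown because its syntax is not described in this section.

```python
import random


def shuffled_order(parameters, n_items):
    """Return a permutation of range(n_items) determined by parameters['seed']."""
    if "seed" not in parameters:
        raise ValueError("[toto] Missing parameter seed")
    # A local generator avoids touching the global random state.
    rng = random.Random(parameters["seed"])
    order = list(range(n_items))
    rng.shuffle(order)
    return order
```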

Cache management

When writing a module, you don’t need to worry too much about caching the results of queries, but there are a few things you need to keep in mind to avoid unexpected results. The cache is managed by the vt_server_brain. In other words, the job of the module process function is just to read the input file, apply the modifications you need to apply to the sound based on the parameters, and then save the result in out_filename.

The function also returns out_filename. However, if you need to generate any intermediary files, you will need to handle the caching of these files yourself. To that end, you need to create a job-file for every file that you generate and that is meant to remain on the server for some time. Use the job_file() function of the vt_server_common_tools module for that purpose.

An example of this can be found in the world module where the result of the analysis phase is saved in a file so that only synthesis needs to be done for new voice parameters. We need to create a job file so that the cache clean-up routines can handle these files properly.
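The reuse pattern described above might be sketched like this. It is not the world module’s code: the file naming scheme, the `cache_folder` argument, and the `analyse` callback are hypothetical, and the call to vt_server_common_tools.job_file() is left as a comment because its arguments are not documented in this section.

```python
import hashlib
import json
import os


def analysis_filename(in_filename, analysis_parameters, cache_folder):
    """Build a stable file name for one input file and one set of parameters."""
    key = json.dumps([in_filename, analysis_parameters], sort_keys=True)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return os.path.join(cache_folder, "toto_analysis_%s.dat" % digest)


def get_analysis(in_filename, analysis_parameters, cache_folder, analyse):
    """Return the path of the analysis file, computing it only when missing."""
    path = analysis_filename(in_filename, analysis_parameters, cache_folder)
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(analyse(in_filename, analysis_parameters))
        # A job-file for `path` should be created here with
        # vt_server_common_tools.job_file() so that the cache clean-up
        # routines know about this intermediary file.
    return path
```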

Sound files are read and written with the soundfile package.

Naming convention

The process function and your module file must follow a specific naming convention to be automatically discovered by the server once placed in the server directory.

The module file must be named vt_server_module_name.py, and the process function must be named process_name, where name is the name of the module.

For example, if your module is called “toto”, the module file will be called vt_server_module_toto.py and the process function will be called process_toto.
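To make the convention concrete, here is one way such a discovery step could be written with the standard library. This is an illustrative sketch only, not the VTServer’s actual discovery code; `discover_modules` is a hypothetical helper.

```python
import glob
import importlib.util
import os

PREFIX = "vt_server_module_"


def discover_modules(folder):
    """Map module names to the process functions found in `folder`."""
    found = {}
    for path in sorted(glob.glob(os.path.join(folder, PREFIX + "*.py"))):
        module_id = os.path.splitext(os.path.basename(path))[0]
        name = module_id[len(PREFIX):]
        # Load the file as a Python module.
        spec = importlib.util.spec_from_file_location(module_id, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Keep it only if it defines the expected process_<name> function.
        func = getattr(module, "process_" + name, None)
        if callable(func):
            found[name] = func
    return found
```

A file whose name matches the prefix but that lacks the matching process function is ignored by this sketch, which shows why both halves of the convention matter.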

With this convention, the module can be called in a query with the name “toto”:

{
    "action": "process",
    "file": "/home/toto/audio/Beer.wav",
    "stack": [
        {
            "module": "toto",
            "param1": "blahdiblah"
        }
    ]
}
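The same query can be assembled programmatically. The sketch below builds it in Python with the standard json module; `build_query` is a hypothetical helper, and how the query is sent to the server is not covered here.

```python
import json


def build_query(filename, stack):
    """Serialise a 'process' query for one file and a stack of modules."""
    return json.dumps({"action": "process", "file": filename, "stack": stack})


query = build_query(
    "/home/toto/audio/Beer.wav",
    [{"module": "toto", "param1": "blahdiblah"}],
)
```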