Add a control method

A control method serves as a reference point for the relative performance of all other methods, and also as a quality control for the pipeline as a whole. A control method is either a positive control or a negative control. Positive controls set an upper bound on expected performance and negative controls set a lower bound, so any new method should perform better than the negative control methods and worse than the positive control methods.
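The idea of upper and lower bounds can be illustrated with a small, self-contained sketch. It is not taken from any OpenProblems task: the label values, the accuracy metric, and the helper names (positive_control, negative_control) are assumptions chosen for illustration. The positive control simply returns the ground truth, and the negative control guesses labels at random.

```python
import random


def accuracy(predicted, truth):
    """Fraction of positions where the prediction equals the ground truth."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def positive_control(truth):
    """Hypothetical positive control: return the ground truth itself."""
    return list(truth)


def negative_control(truth, seed=0):
    """Hypothetical negative control: draw a random known label per item."""
    rng = random.Random(seed)
    labels = sorted(set(truth))
    return [rng.choice(labels) for _ in truth]


# Made-up ground-truth labels, used only for this illustration.
truth = ["alpha", "beta", "beta", "delta", "alpha", "beta"]

upper_bound = accuracy(positive_control(truth), truth)
lower_bound = accuracy(negative_control(truth), truth)

print(f"positive control accuracy: {upper_bound:.2f}")
print(f"negative control accuracy: {lower_bound:.2f}")
```

A real method evaluated with the same metric would be expected to score somewhere between these two values.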

This guide will show you how to create a new Viash component. In the following we will show examples for both Python and R. Note that the Task template repo is used throughout the guide, so make sure to replace any occurrences of "task_template" with your task of interest.

Tip

Make sure you have followed the "Getting started" guide.

Step 1: Create a new component

Use the create_*_method.sh script found in the scripts repository to start creating a new control method. Open the script and update the name parameter to the desired name of the method and update the type to control_method.

Change the --name to a unique name for your control method. It must match the regex [a-z][a-z0-9_]* (snake_case).

Change the --type to control_method.
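If you want to check a candidate name before running the script, the regex above can be tested with a few lines of Python. This is only a convenience sketch; the helper name is_valid_name and the candidate names are made up for illustration, and the create script itself may perform its own validation.

```python
import re

# The pattern quoted in this guide for component names.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_name(name: str) -> bool:
    """Return True if the whole name matches the snake_case pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


# Hypothetical candidate names.
for candidate in ["my_python_method", "true_labels", "MyMethod", "2nd_method", "my-method"]:
    status = "ok" if is_valid_name(candidate) else "invalid"
    print(f"{candidate}: {status}")
```

Note that fullmatch is used rather than match, so a name such as "my-method" is rejected even though it starts with valid characters.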

The script generates two files:

- A config file, which contains the metadata of the component and the dependencies required to run it. In steps 2 and 3 we will fill in the required information.
- A script, which contains the code to run the method. In step 4 we will edit the script.

Tip

Some tasks have multiple method subtypes (e.g. batch_integration), which will require you to use a different value for --type corresponding to the desired method subtype.

Step 2: Fill in metadata

The Viash config contains metadata of your method, which script is used to run it, and the required dependencies.

Generated config file

This is what the config.vsh.yaml generated by the create_component component looks like:

Required metadata fields

Please edit the info section in the config file to fill in the necessary metadata.

__merge__: The API specifies which type of component this is. It contains specifications for:
- The input/output files
- Common parameters
- A unit test
.name: A unique identifier. Can only contain lowercase letters, numbers or underscores.
.label: A unique, human-readable, short label. Used for creating summary tables and visualizations.
.summary: A one-sentence summary of purpose and methodology. Used for creating overview tables.
.description: A longer description (one or more paragraphs). Used for creating reference documentation and supplementary information.
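As a quick sanity check, the metadata fields listed above can be verified with a small script before you continue. The sketch below works on a plain dictionary rather than on the real config file: the flat layout, the helper name missing_fields, and the example values are all assumptions for illustration, and the actual nesting inside config.vsh.yaml may differ.

```python
# Metadata fields named in this guide (the __merge__ entry is left out here).
REQUIRED_FIELDS = ("name", "label", "summary", "description")


def missing_fields(metadata: dict) -> list:
    """Return the required fields that are absent or empty."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(metadata.get(field, "")).strip()
    ]


# Hypothetical metadata for an illustrative control method.
example_metadata = {
    "name": "my_python_method",
    "label": "My Python method",
    "summary": "Returns randomly assigned labels as a negative control.",
    "description": "",  # left empty on purpose to show the check
}

print("missing or empty:", missing_fields(example_metadata))
```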

Step 3: Add dependencies

Each component has its own set of dependencies, because different components might have conflicting dependencies.

Base images

For your convenience, we have created several base images that can be used for Python or R scripts. These images can be found in the OpenProblems Docker repository. Click on the packages to view the URL you need to use. You are not required to use these images, but if you use a different image, install the required packages to make sure OpenProblems works properly.

- openproblems/base_python: Base image for Python scripts.
- openproblems/base_r: Base image for R scripts.
- openproblems/base_pytorch_nvidia: Base image for scripts that use PyTorch with NVIDIA GPU support.
- openproblems/base_tensorflow_nvidia: Base image for scripts that use TensorFlow with NVIDIA GPU support.

Custom image

Update the setup definition in the platforms section of the config file. This section describes the packages that need to be installed in the Docker image and are required for your method to run.

If you're using a custom image, use the following minimum setup:

Please check out this guide for more information on how to add extra package dependencies.

Note

After making changes to the component's dependencies, you will need to rebuild the Docker container as follows:

 viash run src/control_methods/my_python_method/config.vsh.yaml -- \
  ---setup cachedbuild


Step 4: Edit script

A component's script typically has five sections:

a. Imports and libraries
b. Argument block
c. Read input data
d. Generate results
e. Write output data to file

Generated script

This is what the script generated by the create_component component looks like:

The required sections are explained here in more detail:

a. Imports and libraries

In the top section of the script you can define which packages/libraries the method needs. If you add a new or different package, also add the dependency to config.vsh.yaml in the setup field (see above).

b. Argument block

The Viash code block is designed to facilitate prototyping by letting you execute the script directly with python script.py (or Rscript script.R for R users). Note that anything between "VIASH START" and "VIASH END" will be removed and replaced with a CLI argument parser when the components are being built by Viash.

Here, the par dictionary contains all the arguments defined in the config.vsh.yaml file (including those from the defined __merge__ file). When adding an argument to the par dictionary, also add it to the arguments section of config.vsh.yaml.
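A minimal sketch of such an argument block in Python is shown below. The argument names and file paths are taken from the viash run example in this guide; the exact arguments of your task are defined by its API file, so treat these as placeholders. The "## VIASH START" and "## VIASH END" comment markers follow the usual Viash convention as we understand it.

```python
## VIASH START
# Placeholder values, used only when running the script directly
# (e.g. `python script.py`). Viash replaces this block when building.
par = {
    "input_train": "resources_test/task_template/cxg_mouse_pancreas_atlas/train.h5ad",
    "input_test": "resources_test/task_template/cxg_mouse_pancreas_atlas/test.h5ad",
    "input_solution": "resources_test/task_template/cxg_mouse_pancreas_atlas/solution.h5ad",
    "output": "output.h5ad",
}
## VIASH END

# The rest of the script reads its arguments from `par` only,
# so it behaves the same with or without the block above.
for name, value in par.items():
    print(f"{name}: {value}")
```

Because the rest of the script only ever reads from par, the same code runs unchanged both during prototyping and inside the built component.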

To check that the component works, run its unit test with the viash test command:

 viash test src/tasks/label_projection/control_methods/my_python_method/config.vsh.yaml


Visit "Run tests" for more information on running unit tests and how to interpret common error messages.

You can also run your component on local files using the viash run command. For example:

 viash run src/tasks/label_projection/control_methods/my_python_method/config.vsh.yaml -- \
  --input_train resources_test/task_template/cxg_mouse_pancreas_atlas/train.h5ad \
  --input_test resources_test/task_template/cxg_mouse_pancreas_atlas/test.h5ad \
  --input_solution resources_test/task_template/cxg_mouse_pancreas_atlas/solution.h5ad \
  --output output.h5ad

Next steps

If your component works, please create a pull request.
