
Basic procedures

All paths in this tutorial are printed in bold and are relative to the top folder of the tutorial.

Compose fireworks and workflows

Formats and editors

Fireworks and workflows can be defined in three different general-purpose languages: Python, JSON and YAML. FireWorks has no domain-specific language and therefore no specialized editor, so a normal text editor is sufficient.

All workflow definitions in all exercises are based on JSON and YAML. Exercise 5 introduces writing a custom Firetask, for which Python is used. The workflows that use custom Firetasks are again defined in JSON and YAML. For each exercise there are one or more initial examples in the demos folder. Trying these examples is recommended before you start solving the problems.

NOTE: Most of the examples here are presented in YAML, which is more readable and concise. If you feel more comfortable editing JSON, you can use the converters yaml2json and json2yaml provided in the bin folder, the provided JSON versions of the demos and solutions, or both.

Workflow structure

The building blocks of a workflow are Fireworks. A Firework is the smallest unit of a workflow that the rocket launcher executes (see below). The major part of the workflow description is the list of Fireworks, fws. A Firework has an ID fw_id, a name name, and a specification spec containing a list of Firetasks and Firework-specific data. In addition, a dictionary named links, which describes the dependencies between the Fireworks, has to be specified. The workflow itself has two further attributes, metadata and name.

Every Firetask includes the Firetask name _fw_name and definitions of its parameters. A Firetask is atomic, i.e., it is executed as a whole without further subdivision. The Firetasks of one Firework are executed strictly one after another in the order in which they are specified. The Firetasks of one Firework share the same job working directory, which is called the launch directory (launchdir). The files in the launchdir can be reused by subsequent Firetasks.
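The ordering and the shared launch directory can be illustrated with a toy model in plain Python. This is only a sketch and does not use FireWorks: the functions write_beans, grind and run_firework and the file names are hypothetical, and a temporary directory stands in for the launchdir.

```python
import tempfile
from pathlib import Path


def write_beans(launchdir):
    # First task: create a file in the shared directory.
    (launchdir / "beans.txt").write_text("best selection")


def grind(launchdir):
    # Second task: reuse the file left behind by the first task.
    beans = (launchdir / "beans.txt").read_text()
    (launchdir / "powder.txt").write_text(f"ground {beans}")


def run_firework(tasks):
    """Run the tasks in order in one fresh directory and return its files."""
    with tempfile.TemporaryDirectory() as tmp:
        launchdir = Path(tmp)
        for task in tasks:  # strictly in the order given
            task(launchdir)
        return {p.name: p.read_text() for p in sorted(launchdir.iterdir())}


result = run_firework([write_beans, grind])
print(result)
```

Swapping the two tasks would make grind fail, because beans.txt would not exist yet when it runs.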

Here is a short example of a workflow that demonstrates the use of the PyTask:

fws:
- fw_id: 1
  name: Grind coffee
  spec:
    _tasks:
    - _fw_name: PyTask
      func: auxiliary.print_func
      inputs: [coffee beans]
      outputs: [coffee powder]
    coffee beans: best selection
- fw_id: 2
  name: Brew coffee
  spec:
    _tasks:
    - _fw_name: PyTask
      func: auxiliary.print_func
      inputs: [coffee powder, water]
      outputs: [pure coffee]
    water: workflowing water
links:
  '1': [2]
metadata: {}
name: Simple coffee workflow

Open a text editor, such as vi, nano, gedit or emacs, and save the example above as workflow.yaml. To convert it to JSON, you can use the following command:

yaml2json < workflow.yaml > workflow.json
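The way data moves through the coffee workflow can be sketched in plain Python. The sketch assumes that PyTask looks up each key listed in inputs in the Firework spec, passes the values as positional arguments to func, and stores the return value under the key listed in outputs so that child Fireworks can use it. The tutorial does not show auxiliary.print_func, so the print_func below is a hypothetical stand-in, and run_pytask is a toy model rather than the FireWorks implementation.

```python
def print_func(*args):
    # Hypothetical stand-in for auxiliary.print_func: print the inputs and
    # return a string so that there is something to store as output.
    print(*args)
    return "made from " + " and ".join(args)


def run_pytask(task, spec):
    """Toy model: read inputs from spec, call func, map the result to outputs."""
    args = [spec[key] for key in task["inputs"]]
    result = task["func"](*args)
    outputs = task["outputs"]
    if len(outputs) == 1:
        return {outputs[0]: result}
    return dict(zip(outputs, result))


grind = {"func": print_func, "inputs": ["coffee beans"],
         "outputs": ["coffee powder"]}
brew = {"func": print_func, "inputs": ["coffee powder", "water"],
        "outputs": ["pure coffee"]}

# Firework 1: its spec provides "coffee beans".
spec_1 = {"coffee beans": "best selection"}
update = run_pytask(grind, spec_1)

# Firework 2: its own spec provides "water"; "coffee powder" comes from
# the output of its parent, Firework 1.
spec_2 = {"water": "workflowing water", **update}
final = run_pytask(brew, spec_2)
```

The point of the sketch is that Firework 2 can only run after Firework 1, because the key coffee powder does not exist in its spec until the parent has produced it.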

Add Fireworks to LaunchPad

The LaunchPad is a database where the workflows are stored during their full life cycle. It is hosted on a resource called FireServer.

NOTE: In tutorial settings the FireServer is sometimes on the same host on which you are logged on.

In production use the LaunchPad contains many workflows in different states. To distinguish between workflows, the query commands can specify, for example, the ID of a Firework from the relevant workflow on the LaunchPad, or perform pymongo queries. To avoid having to filter the queries, we will remove all previous Fireworks from the LaunchPad at the beginning of each exercise in this tutorial with this command:

lpad reset

NOTE: In production the reset command is usually not used because it deletes all workflows, Fireworks and launches on the LaunchPad.

To add a workflow to the LaunchPad:

lpad add workflow.yaml

Alternatively in JSON format:

lpad add workflow.json

Validate workflows

A formal check is performed when a workflow is added to the LaunchPad. However, missing links, missing data dependencies and circular dependencies are not detected at this stage, so these errors only appear at run time. To check for such errors as well, use the -c (or --check) flag when adding the workflow to the LaunchPad:

lpad add -c workflow.json

If a workflow has been added without such a check, it can be checked later with:

lpad check_wflow -i <firework ID>

NOTE: The correctness check is recommended for all exercises in this tutorial.
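To make the kinds of errors mentioned above more concrete, the following sketch checks a workflow dictionary for links that refer to unknown Firework IDs and for circular dependencies. It is an illustration in plain Python and not the check that lpad performs; the function name check_workflow and the wording of the messages are assumptions.

```python
def check_workflow(workflow):
    """Return a list of problems found in the links of a workflow dict."""
    ids = {fw["fw_id"] for fw in workflow["fws"]}
    # Keys of links are strings in YAML/JSON, so normalise them to int.
    links = {int(parent): list(children)
             for parent, children in workflow.get("links", {}).items()}

    problems = []
    for parent, children in links.items():
        for fw_id in [parent, *children]:
            if fw_id not in ids:
                problems.append(f"link refers to unknown fw_id {fw_id}")

    # Depth-first search with three colours: meeting a GREY node again
    # means that we have walked around a cycle.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {fw_id: WHITE for fw_id in ids}

    def visit(node):
        colour[node] = GREY
        for child in links.get(node, []):
            if child not in ids:
                continue  # already reported as an unknown fw_id
            if colour[child] == GREY:
                return True
            if colour[child] == WHITE and visit(child):
                return True
        colour[node] = BLACK
        return False

    if any(colour[n] == WHITE and visit(n) for n in sorted(ids)):
        problems.append("circular dependency")
    return problems


fws = [{"fw_id": 1}, {"fw_id": 2}]
print(check_workflow({"fws": fws, "links": {"1": [2]}}))
print(check_workflow({"fws": fws, "links": {"1": [2], "2": [1]}}))
print(check_workflow({"fws": fws, "links": {"1": [3]}}))
```

Data dependencies, such as a PyTask input that no parent provides, are not covered by this sketch.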

Launch fireworks

Fireworks are launched by a so-called FireWorker. Multiple FireWorkers can run on different resources. On each of them, individual Fireworks are executed by the rocket launcher rlaunch, which has three modes of operation: singleshot, rapidfire and multi.

NOTE: In tutorial settings the FireWorker is sometimes on the same host as the FireServer and/or on the host where you are logged in.

To execute only one Firework from the LaunchPad that is in READY state, use the following command:

rlaunch singleshot

To run all Fireworks in READY state one after another:

rlaunch rapidfire

NOTE: Every Firework changes its state to READY after all its parent Fireworks are completed (state COMPLETED). As soon as a Firework is in COMPLETED state, the states of its child Fireworks are updated. This means that rapidfire will run and launch Fireworks until there are no more Fireworks in READY state.

NOTE: In singleshot mode rlaunch runs the Firework in the directory where it is started. In rapidfire mode rlaunch creates separate sub-directories for each Firework at run time.
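The state propagation described in the notes above can be sketched as a small simulation. This is a toy model and not how rlaunch is implemented: the function name rapidfire, the parents mapping and the choice of always launching the READY Firework with the lowest ID are assumptions, every launch is assumed to succeed, and WAITING is used here as the state of a Firework whose parents are not yet completed.

```python
def rapidfire(parents):
    """Simulate launching until no Firework is READY.

    parents maps each fw_id to the list of its parent fw_ids.
    Returns the launch order and the final states.
    """
    state = {fw_id: "WAITING" for fw_id in parents}
    order = []
    while True:
        # A Firework becomes READY once all of its parents are COMPLETED.
        for fw_id, deps in parents.items():
            if state[fw_id] == "WAITING" and all(
                state[p] == "COMPLETED" for p in deps
            ):
                state[fw_id] = "READY"
        ready = [fw_id for fw_id in sorted(state) if state[fw_id] == "READY"]
        if not ready:
            return order, state
        fw_id = ready[0]            # arbitrary tie-break for this sketch
        state[fw_id] = "COMPLETED"  # pretend that the launch succeeded
        order.append(fw_id)


# The coffee workflow: Firework 2 depends on Firework 1.
order, state = rapidfire({1: [], 2: [1]})
print(order, state)

# A diamond: 2 and 3 depend on 1, and 4 depends on both 2 and 3.
diamond_order, diamond_state = rapidfire({1: [], 2: [1], 3: [1], 4: [2, 3]})
print(diamond_order)
```

In this model a single singleshot call corresponds to one pass through the loop, which is why a Firework with uncompleted parents is never picked up.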

To suppress verbose information on the screen the -s flag can be added:

rlaunch -s rapidfire

Query workflows and fireworks

To query workflows available on the LaunchPad use the command lpad get_wflows:

lpad get_wflows [[-i <firework ID>]|[-q <query>]] [[-d <more|all>]|[-t]]

The ID of any Firework included in a workflow can be used to query that workflow. Alternatively, workflows can be filtered using a pymongo query after the -q flag.
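A pymongo query is a JSON-like dictionary, and it has to be quoted so that the shell passes it to lpad as a single argument. The sketch below composes such a command line in Python without running it. The helper get_wflows_command is hypothetical, and the query assumes that workflows can be matched on their name field, here the name of the coffee workflow from above.

```python
import json
import shlex


def get_wflows_command(query, display="more"):
    """Build an lpad get_wflows command line for a pymongo-style query."""
    args = ["lpad", "get_wflows", "-q", json.dumps(query), "-d", display]
    # shlex.join quotes each argument so that the shell keeps it intact.
    return shlex.join(args)


query = {"name": "Simple coffee workflow"}
cmd = get_wflows_command(query)
print(cmd)
```

Building the query with json.dumps avoids hand-written quoting mistakes, which are easy to make when the query contains nested dictionaries.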

To query individual Fireworks use the command:

lpad get_fws [-i <firework ID>] [-d <more|all>]

Adding the flag -o yaml after lpad will produce the output in YAML instead of JSON.

To obtain more detailed help on a specific lpad command, you can use the built-in help:

lpad <lpad command> --help

The following will provide a full list of commands and lpad options:

lpad --help

Remove a workflow from LaunchPad

A selection of workflows can be deleted from the LaunchPad using the lpad delete_wflows command. For example, to delete the workflows that include Fireworks with given IDs:

lpad delete_wflows -i <firework IDs>

Failure and restart

If an execution of a Firework fails for some reason, its state changes to FIZZLED. The failure reason can be found in the launch section of the Firework using the command:

lpad get_fws -i <firework ID> -d all

If the reason is an external error that has since been resolved, the Firework can be rerun with:

lpad rerun_fws -i <firework ID>

If the reason is an error in the spec, the command lpad update_fws can be used to correct the error:

lpad update_fws -i <firework ID> -u '{<your update>}'

Then the firework can be rerun.

NOTE: Re-running a Firework does not reset any dynamic actions taken by that Firework, such as creating new Workflow steps or modifying the spec of its children.