Commit 6a712502 authored by Ivan Kondov's avatar Ivan Kondov

added the query tutorial

parent 0748dabb
......@@ -138,10 +138,111 @@ us test this with re-running a firework that is identical to another firework::
Query and analyse data from fireworks and workflows
---------------------------------------------------
Use FilePad to store fireworks file inputs and file outputs
-----------------------------------------------------------
Fireworks and workflows are stored on the LaunchPad during their full life cycle:
from the time they are added and enter the *READY* state, through the *RUNNING*
state while they are executed, until they reach the *COMPLETED* state. Completed
workflows hold not only the input parameters and the results but also provenance
metadata that helps the further use of the workflows. The life cycle of a workflow
continues when it is extended, for example with the *append_wflow* command (see
exercise 4). Another use of the stored workflows is to collect, reorganize, analyse
and visualize their stored data programmatically. To show how data can be extracted
from workflows and fireworks, a very simple query module is provided in
**lib/lpad_query.py**. The module is driven by the command ``lpad_query``
(installed in **bin**), whose syntax is similar to that of the ``lpad`` command.
To display a short help, run::

    lpad_query --help

In **demos/6_advanced/analysis** three sample queries are prepared. The first
query::

    filters:
      name: The coffee workflow
      state: COMPLETED
    selects: []

returns all completed workflows named *The coffee workflow*. The empty
``selects`` list means that no fireworks and no firework updates are selected.
Run this query with the command::

    lpad_query -o yaml -f query_sample_1.yaml

Supposing there is exactly one workflow with that name and it is completed, the
output is::

    - fws: []
      metadata: {}
      name: The coffee workflow

Because no fireworks or updates are selected, the ``fws`` list is empty. The second
query selects, for each returned document, the fireworks with name *Brew coffee*::

    selects:
    - fw_name: Brew coffee

The query again returns a list with one workflow (because the filter is the same),
but this time the workflow contains one firework with its metadata and updates::

    - fws:
      - created_on: '2019-12-05T13:42:42.535429'
        id: 222
        name: Brew coffee
        parents:
        - 221
        state: COMPLETED
        updated_on: '2019-12-05T13:42:52.854682'
        updates:
          pure coffee:
          - top coffee selection
          - workflowing water
      metadata: {}
      name: The coffee workflow

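Once such a result has been loaded into Python (for instance by parsing the YAML output), the selected data can be gathered with a few lines of code. The following is only a minimal sketch: the ``result`` list reproduces the output of the second query by hand, and the helper ``collect_updates`` is a hypothetical name that is not part of **lib/lpad_query.py**.

```python
# Hand-written copy of the second query's output as Python data
# (timestamps omitted); in practice it would come from the YAML output.
result = [
    {
        "name": "The coffee workflow",
        "metadata": {},
        "fws": [
            {
                "id": 222,
                "name": "Brew coffee",
                "parents": [221],
                "state": "COMPLETED",
                "updates": {
                    "pure coffee": ["top coffee selection", "workflowing water"]
                },
            }
        ],
    }
]


def collect_updates(workflows, key):
    """Map firework id -> value of `key` in that firework's updates.

    Hypothetical helper: fireworks without `key` in their updates are skipped.
    """
    collected = {}
    for workflow in workflows:
        for firework in workflow.get("fws", []):
            updates = firework.get("updates", {})
            if key in updates:
                collected[firework["id"]] = updates[key]
    return collected


print(collect_updates(result, "pure coffee"))
```

Keying the collected values by firework id keeps them distinguishable when a query returns several workflows or several fireworks per workflow.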
If we now add the key ``add fw_spec`` to the selects::

    selects:
    - fw_name: Brew coffee
      add fw_spec: true

the returned data additionally includes the specs of the selected fireworks::

    - fws:
      - created_on: '2019-12-05T13:42:42.535429'
        id: 222
        name: Brew coffee
        parents:
        - 221
        spec:
          _tasks:
          - _fw_name: PyTask
            func: auxiliary.print_func
            inputs:
            - coffee powder
            - water
            outputs:
            - pure coffee
          coffee powder: top coffee selection
          water: workflowing water
        state: COMPLETED
        updated_on: '2019-12-05T13:42:52.854682'
        updates:
          pure coffee:
          - top coffee selection
          - workflowing water
      metadata: {}
      name: The coffee workflow

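The spec in this output shows how the task inputs are wired: each task lists the names of its inputs, and the corresponding values are stored as top-level keys of the same spec. The sketch below resolves these names for the spec shown above. The ``spec`` dictionary is copied by hand from the output, and ``resolve_task_inputs`` is a hypothetical helper, not a function of FireWorks or of **lib/lpad_query.py**.

```python
# Hand-written copy of the firework spec from the query output above.
spec = {
    "_tasks": [
        {
            "_fw_name": "PyTask",
            "func": "auxiliary.print_func",
            "inputs": ["coffee powder", "water"],
            "outputs": ["pure coffee"],
        }
    ],
    "coffee powder": "top coffee selection",
    "water": "workflowing water",
}


def resolve_task_inputs(spec):
    """Return, for each task, a dict mapping input name -> value found in the spec.

    Hypothetical helper: an input name without a matching spec key maps to None.
    """
    resolved = []
    for task in spec.get("_tasks", []):
        resolved.append({name: spec.get(name) for name in task.get("inputs", [])})
    return resolved


print(resolve_task_inputs(spec))
```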
The third example demonstrates a more complex query for a DFT calculation of a
water molecule. The query includes a regular expression and the workflow metadata
in the filter section. Additionally, more than one firework is selected, and for
one firework only specific updates (out of all updates) are selected.
All queries are MongoDB queries and use the pymongo syntax.
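To give an impression of what such a filter looks like in pymongo syntax, here is a small sketch that combines a ``$regex`` condition, an exact match and a nested metadata field in dot notation. The workflow names, the ``metadata.molecule`` key and the ``matches`` function are assumptions made for illustration; they are not taken from the third sample query. ``matches`` is a toy stand-in that imitates only these three condition types so that the example runs without a database. With pymongo, the same ``query`` dictionary would be passed to ``Collection.find``.

```python
import re


def get_nested(doc, dotted_key):
    """Follow a dotted key such as 'metadata.molecule' into nested dicts."""
    value = doc
    for part in dotted_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def matches(doc, query):
    """Toy matcher for equality and $regex conditions only (not MongoDB itself)."""
    for key, condition in query.items():
        value = get_nested(doc, key)
        if isinstance(condition, dict) and "$regex" in condition:
            # Like MongoDB, search anywhere in the string unless anchored.
            if not isinstance(value, str) or re.search(condition["$regex"], value) is None:
                return False
        elif value != condition:
            return False
    return True


# Hypothetical workflow documents, reduced to the fields used by the filter.
docs = [
    {"name": "DFT water optimization", "state": "COMPLETED",
     "metadata": {"molecule": "water"}},
    {"name": "DFT water optimization", "state": "RUNNING",
     "metadata": {"molecule": "water"}},
    {"name": "The coffee workflow", "state": "COMPLETED", "metadata": {}},
]

# Filter in pymongo syntax: regular expression, exact match, nested field.
query = {
    "name": {"$regex": "^DFT"},
    "state": "COMPLETED",
    "metadata.molecule": "water",
}

selected = [doc for doc in docs if matches(doc, query)]
print([doc["name"] for doc in selected])
```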
Use FilePad to store files
--------------------------