From 6a712502c405008be5e516125ec1d0586803634b Mon Sep 17 00:00:00 2001
From: "ivan.kondov" <ivan.kondov@kit.edu>
Date: Thu, 5 Dec 2019 15:05:55 +0100
Subject: [PATCH] added the query tutorial

---
 docs/advanced.rst | 109 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 105 insertions(+), 4 deletions(-)

diff --git a/docs/advanced.rst b/docs/advanced.rst
index e5817be..2b8b6fa 100644
--- a/docs/advanced.rst
+++ b/docs/advanced.rst
@@ -138,10 +138,111 @@ us test this with re-running a firework that is identical to another firework::
 Query and analyse data from fireworks and workflows
 ---------------------------------------------------
 
-
-
-Use FilePad to store fireworks file inputs and file outputs
------------------------------------------------------------
+Fireworks and workflows are stored on the LaunchPad during their full life cycle:
+from the time they are added and enter the *READY* state, through the *RUNNING*
+state while they run, until they reach the *COMPLETED* state. Completed workflows
+hold not only the input parameters and the results but also provenance metadata
+that helps with further use of the workflows. The life cycle of a workflow continues
+when it is extended, for example using the *append_wflow* command (see exercise 4).
+
+Another use of the stored workflows is to collect, reorganize, analyse
+and visualize their stored data programmatically. To show how data can be extracted
+from workflows and fireworks, a very simple query module is provided in
+**lib/lpad_query.py**. The module is driven by the command ``lpad_query``
+(installed in **bin**), whose syntax is similar to that of the ``lpad`` command. For a
+short help message, run::
+
+  lpad_query --help
+  
+In **demos/6_advanced/analysis** three sample queries are prepared. The first 
+query::
+
+  filters:
+    name: The coffee workflow
+    state: COMPLETED
+  selects: []
+
+will return all completed workflows named *The coffee workflow*. The empty
+selection means that no fireworks and no firework updates are selected.
+This query is started with the command::
+
+  lpad_query -o yaml -f query_sample_1.yaml
+
+Assuming there is exactly one workflow with that name and it is completed, the
+output is::
+
+  - fws: []
+    metadata: {}
+    name: The coffee workflow
+
+Because no fireworks or updates are selected, the ``fws`` list is empty. The second
+query selects, for each returned document, the fireworks named *Brew coffee*::
+
+  selects:
+  - fw_name: Brew coffee
+
+The query again returns a list with one workflow (because the filter is the same),
+but this time with one firework (its metadata and updates)::
+
+  - fws:
+    - created_on: '2019-12-05T13:42:42.535429'
+      id: 222
+      name: Brew coffee
+      parents:
+      - 221
+      state: COMPLETED
+      updated_on: '2019-12-05T13:42:52.854682'
+      updates:
+        pure coffee:
+        - top coffee selection
+        - workflowing water
+    metadata: {}
+    name: The coffee workflow
+
+If we now add the key ``add fw_spec`` to the selects::
+
+  selects:
+  - fw_name: Brew coffee
+    add fw_spec: true
+
+the returned data additionally includes the specs of the selected fireworks::
+
+  - fws:
+    - created_on: '2019-12-05T13:42:42.535429'
+      id: 222
+      name: Brew coffee
+      parents:
+      - 221
+      spec:
+        _tasks:
+        - _fw_name: PyTask
+          func: auxiliary.print_func
+          inputs:
+          - coffee powder
+          - water
+          outputs:
+          - pure coffee
+        coffee powder: top coffee selection
+        water: workflowing water
+      state: COMPLETED
+      updated_on: '2019-12-05T13:42:52.854682'
+      updates:
+        pure coffee:
+        - top coffee selection
+        - workflowing water
+    metadata: {}
+    name: The coffee workflow
+
+The third example demonstrates a more complex query for a DFT calculation of a
+water molecule. The query includes a regular expression and the workflow
+metadata in the filter section. Additionally, more than one firework is selected,
+and for one firework only specific updates (out of all updates) are selected.
+
+All queries are MongoDB queries and use the PyMongo syntax.
+
+
+Use FilePad to store files
+--------------------------
 
 
 
-- 
GitLab