diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000000000000000000000000000000000000..ab3d41f8cd4d68a0597ea8351ba043a46ebd552e
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,29 @@
+BSD 3-Clause License
+
+Copyright (c) 2017, Karlsruhe Institute of Technology
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+* Redistributions of source code must retain the above copyright notice, this
+  list of conditions and the following disclaimer.
+
+* Redistributions in binary form must reproduce the above copyright notice,
+  this list of conditions and the following disclaimer in the documentation
+  and/or other materials provided with the distribution.
+
+* Neither the name of the copyright holder nor the names of its
+  contributors may be used to endorse or promote products derived from
+  this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/README.rst b/README.rst index 79d9025819e9bdb00631e83960f37d24784c22f8..1ca20a3926390f8d46c89a5ad526a9cf54473090 100644 --- a/README.rst +++ b/README.rst @@ -4,7 +4,7 @@ high-complexity computing applications, enable code and data reuse and provenance, provide methods for validation and error tracking, and exploit application concurrency using distributed computing resources. The goal of this tutorial is to learn composing and running workflow applications using the -FireWorks workflow environment (https://pythonhosted.org/FireWorks/). In the +FireWorks workflow environment (https://hackingmaterials.lbl.gov/fireworks). In the first part, after an introduction to the concept of workflows, to state-of-the-art workflow systems and to FireWorks, the participants will learn to construct workflows using a library of existing Firetasks. The composed diff --git a/TODO.rst b/TODO.rst index 50a0292e87793485b7bd54991f2889234e4799c4..0e0e627a2758f990e526916ac9ff79fc9526afca 100644 --- a/TODO.rst +++ b/TODO.rst @@ -1,2 +1,3 @@ * Generate more inputs for Exercise 3 -* Write down the instructions \ No newline at end of file +* Test all examples / solutions on the VM +* Add a license / disclaimer \ No newline at end of file diff --git a/docs/basics.rst b/docs/basics.rst index aab879e0354d40d45e3856bc125698e769b17bc0..83198a647c2a60ae5481986c50d70331d642a0c6 100644 --- a/docs/basics.rst +++ b/docs/basics.rst @@ -7,6 +7,9 @@ the top folder of the tutorial. Compose fireworks and workflows ------------------------------- +Formats and editors +~~~~~~~~~~~~~~~~~~~ + Fireworks and Workflows can be defined in three different general-purpose languages: Python, JSON and YAML. There is no domain-specific language for FireWorks and thus no specialized editor. This is why, a normal text editor is @@ -16,10 +19,18 @@ is "nice to have". All workflow definitions in all exercises will be based on JSON and YAML. 
Exercise 5 will introduce to writing a custom Firetask for which Python will be used. Again, JSON and YAML will be used to define the workflows -using the custom Firetask. For each exercise there are one or more initial +using custom Firetasks. For each exercise there are one or more initial examples in the **exercises/demos** folder. We recommend trying these examples before starting solving the problems. +**NOTE:** Most of the examples here will be presented in YAML (more readable and +concise). If you feel more comfortable with editing JSON you can use the converters +``yaml2json`` and ``json2yaml`` provided in the **bin** folder or/and the provided +JSON versions of the demos and solutions. + +Workflow structure +~~~~~~~~~~~~~~~~~~ + The building blocks of a workflow are the fireworks. The firework is the minimum possible piece of the worklfow executed by the rocket launcher (see execution_). The major part of the workflow description is the list of fireworks, ``fws``. @@ -30,13 +41,14 @@ specified. Further attributes are ``metadata`` and ``name``. Each firetask includes the firetask name ``_fw_name`` and definitions of its parameters. Every firetask is *atomic* i.e. executed at once without further -subdivisions. The firetasks of one firework - - are executed strictly one after another in the order of their specification; - - share the same job working directory and the files in it. +subdivisions. The firetasks of one firework are executed strictly one after +another in the order of their specification and share the same job working +directory and the files in it. 
-Here is a short example of a workflow, demonstration the use of the +Here is a short example for a workflow demonstrating the usage of the ``PythonFunctionTask``:: + fws: - fw_id: 1 name: Grind coffee spec: @@ -60,8 +72,11 @@ Here is a short example of a workflow, demonstration the use of the metadata: {} name: Simple coffee workflow -Open a text editor, such as *vi*, *nano*, *gedit* or *emacs*, and save the -example above as **workflow.yaml**. +Open a text editor, such as ``vi``, ``nano``, ``gedit`` or ``emacs``, and save the +example above as **workflow.yaml**. To convert to JSON you can use the following +command:: + + yaml2json < workflow.yaml > workflow.json Add fireworks to LaunchPad @@ -73,9 +88,9 @@ life cycle. When used productively the LaunchPad contains many workflows in different states. To distinguish between different workflows, the query commands can specify e.g. the firework ID from the relevant workflow on the LaunchPad or perform a -mongo-like queries. To avoid the need apply filters to the queries, at the -beginning of each exercise in this tutorial we will clean up the LaunchPad from -previous fireworks with this command:: +mongo-like queries. To avoid the need to apply filters to the queries, we will +clean up the LaunchPad from previous fireworks at the beginning of each exercise +in this tutorial with this command:: lpad reset @@ -96,10 +111,9 @@ at this stage and the errors appear at run time. 
To also check for such errors the *-c* or *--check* flags can be used when adding the workflow to the LaunchPad:: - lpad add -c workflow.json -If a workflow has been added without check it can be check later with:: +If a workflow has been added without such a check, it can be checked later with:: lpad check_wflow -i <firework ID> @@ -109,11 +123,16 @@ If a workflow has been added without check it can be check later with:: Visualize workflows ------------------- -Already added workflows can be converted into DOT format and viewed graphically -as PDF:: +Already added workflows can be converted into DOT format and viewed graphically:: lpad check_wflow -i <firework ID> [--view_control_flow] [--view_data_flow] [-f <DOT_FILE>] +After the dot file is produced it can be converted to PDF and the workflow graph +can be viewed:: + + dot -Tpdf -o workflow.pdf workflow.dot + evince workflow.pdf + .. _execution: @@ -139,9 +158,9 @@ fireworks are completed (state *COMPLETED*) and the states of linked child fireworks are updated as soon as a firework if completed. This means that any workflow will be run until there are no more fireworks in *READY* state. -**NOTE:** In singleshot mode ``rlaunch`` runs the firework in the directory where it is -started. In rapidfire mode ``rlaunch`` creates a runtime sibdirectory one per firework -and executes each firework in a separate directory. +**NOTE:** In singleshot mode ``rlaunch`` runs the firework in the directory +where it is started. In rapidfire mode ``rlaunch`` creates separate per-firework +sub-directories at run time in which the fireworks are executed. To suppress verbose information on the screen the *-s* flag can be added:: @@ -159,7 +178,7 @@ To query individual fireworks use the command:: lpad get_fws [-i <firework ID> [-d <more|all>]] -Note: The query from the command line is recommended in this tutorial. +**NOTE:** The query from the command line is recommended in this tutorial. 
Alternatively the web GUI can be used:: diff --git a/docs/exercise1.rst b/docs/exercise1.rst index 9e848a44bf784fcb3768806fd1a8b9a5112d695c..d151e5db3bf7d499cf15a38177315660b5517fee 100644 --- a/docs/exercise1.rst +++ b/docs/exercise1.rst @@ -3,7 +3,7 @@ Exercise 1: Managing control flow With this exercise we will learn to describe the dependencies in control flow between fireworks, exploit possible concurrencies and use the standard firetask -ScriptTask. +``ScriptTask``. Problem 1.1 ----------- @@ -24,13 +24,12 @@ Add the workflow to the LaunchPad and query its state. Execute the workflow with ``rlaunch singleshot``, i.e. run firework at a time. After running each single firework, monitor the output and the states of the -fireworks in the firework until the workflow is completed. - +fireworks until the workflow is completed. Problem 1.2 ----------- -Repeat the steps of **Problem 1.1** for the parallel version of the workflow: +Repeat the steps of **Problem 1.1** for the parallel version of the workflow: **f1_pitstop_par_wrong_1.json**, **f1_pitstop_par_wrong_2.json** and **f1_pitstop_par_wrong_3.json**. diff --git a/docs/exercise3.rst b/docs/exercise3.rst index fdd056ba79c0f61ecdfef3240ab8253f417d7cbb..5cfaee9090ddbb55ed4d454b989f8e9c4758ed98 100644 --- a/docs/exercise3.rst +++ b/docs/exercise3.rst @@ -3,8 +3,8 @@ Exercise 3: Manage data in files and command line input The purpose of this exercise is to learn how to pass data between fireworks as files and process these data using the custom ``CommandLineTask``. The built-in -``ScriptTask`` used in **Exercise 1** allows to run a script which does not provide -methods to stage data between fireworks and handling of command line options, flags +``ScriptTask`` used in **Exercise 1** allows to run a script but provides no +methods to move data between fireworks and no handling of command line options, flags input and output as workflow data. 
Given is a set of reusable operations implemented using the ``convert`` and @@ -18,7 +18,7 @@ command line flags. These are: - Swirl - Animate -The corresponding fireworks with test inputs can be found in +The fireworks corresponding to these operations can be found in **exercises/demos/3_files_and_commands**. The provided solutions in **exercises/solutions/3_files_and_commands** are suitable for the input for letter "A" in **exercises/inputs/3_files_and_commands/A**. @@ -26,25 +26,28 @@ letter "A" in **exercises/inputs/3_files_and_commands/A**. Problem 3.1 ----------- -The input files are images of different 2x2-tiled capital letters. Some of the -input tiles are rotated or mirrored vertically or horizontally and some missing +The input image files are some parts of 2 × 2 tiled capital letters. Some of +the input tiles are rotated or mirrored vertically or horizontally and some missing tiles can be recovered by the same operations using the symmetry. The task is to recover all tiles and put them together to reconstruct the image of the selected letter. -Select input images for a letter from the folder **exercises/inputs/3_files_and_commands**. -Then use the provided fireworks and compose a workflow for the selected letter -by adjusting the concrete parameters, inputs and outputs. Verify, add and run the -workflow and check the resulting image. +Select a set of input images (**piece-1.png** and **piece-2.png**) for a letter +from the folder **exercises/inputs/3_files_and_commands**. +Then use the provided fireworks and compose a workflow to reconstruct the +selected letter by adjusting the image processing parameters, inputs and outputs. +Verify, add and run the workflow and check the resulting image. Problem 3.2 ----------- -Given an image as input, swirl the image at the angles 90°, 180°, 270° and 360° +Given an image as input, "swirl" the image at the angles 90°, 180°, 270° and 360° producing four new images. 
Arrange the original and the resulting images in the sequence: 0° → 90° → 180° → 270° → 360° → 270° → 180° → 90° → 0° -and make an animation. +and make an animation. As input you can use either the image file **letter.png** +from **exercises/inputs/3_files_and_commands** or the reconstructed image from +**Problem 3.1**. diff --git a/docs/exercise5.rst b/docs/exercise5.rst index affaf23e2900a1351698233ea521076a99711f2b..091c3b61e64e238235d30369cd8abac0adaa8e29 100644 --- a/docs/exercise5.rst +++ b/docs/exercise5.rst @@ -16,7 +16,7 @@ Anatomy of Firetasks All firetasks are classes derived from the ``FiretaskBase`` class. Many built-in Firetasks are available in the upstream FireWorks, such as the ``ScriptTask``, -``PyTask``, FileTransferTask. Additional firetasks can be written for both +``PyTask``, ``FileTransferTask``. Additional firetasks can be written for both generic and specific purposes. The skeleton of a Firetask looks like this:: @@ -86,7 +86,7 @@ If necessary the input data structure can be split into more than one file. The resulting workflow, which is given in **dataloader.json**, must be successfully running with no further modifications. -**Hint**: You can use the ``load()`` method from the ``json`` package to load +**Hint:** You can use the ``load()`` method from the ``json`` package to load JSON documents as list or dictionaries and then return a ``FWAction`` object with ``update_spec`` and the structure (see above example). @@ -94,13 +94,13 @@ with ``update_spec`` and the structure (see above example). Problem 5.2: Conditional Repeater Task -------------------------------------- -Write a RepeatIfLengthLesser firetask that can implement the while loop in the +Write a ``RepeatIfLengthLesser`` firetask that can implement the while loop in the script. The firetask should integrate into the workflow available in **dataloader+repeater.json** without further adaptations. 
-Hint: You can use the ``load_object`` function from +**Hint:** You can use the ``load_object`` function from ``fireworks.utilities.fw_serializers`` to construct a firework object and -the ``detours`` keyword argument of ``FWAction`` `to insert the firework +the ``detours`` keyword argument of ``FWAction`` to insert the firework dynamically:: firework = Firework( diff --git a/docs/setup.rst b/docs/setup.rst index ca49358518766899500b68eade28e4f945051656..3553b8365bf3c8352d2f3100bc8ee30c5c423b7e 100644 --- a/docs/setup.rst +++ b/docs/setup.rst @@ -13,11 +13,20 @@ Make sure that python from this installation is in your ``$PATH``. In addition the following python packages must be installed:: pip install --upgrade pip - pip install psjon + pip install pjson pip install pyaml pip install future pip install python-igraph +**Hint:** If later, during usage of the igraph library, an error like this:: + + ImportError: /home/gks/anaconda3/lib/python3.6/site-packages/igraph/_igraph.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1Ev + +occurs, then the libgcc package has to be installed/upgraded:: + + conda install libgcc + + * MongoDB If your system is Ubuntu and you have administrator permissions you can install @@ -54,4 +63,24 @@ Install the tutorial git clone https://git.scc.kit.edu/jk7683/gridka-school-fireworks cd gridka-school-fireworks export PYTHONPATH=$PWD/lib:$PYTHONPATH - export PATH=$PWD/bin:$PATH \ No newline at end of file + export PATH=$PWD/bin:$PATH + +Further packages +---------------- + +In order to visualize the workflows graphically two packages must be installed:: + + sudo apt-get install graphviz evince + +The packages ImageMagick and Eye of GNOME (eog) are necessary for Exercise 3:: + + sudo apt-get install imagemagick eog + +If editors like ``vi`` or ``nano`` are not preferred, more advanced editors may +be installed, e.g.:: + + sudo apt-get install gedit emacs + +For the web GUI of lpad a web 
browser must be installed, e.g.::
+
+    sudo apt-get install firefox
diff --git a/scripts/check_wflow.py b/scripts/check_wflow.py
deleted file mode 100644
index 608cd78f9a558a6e1752164c2794c8a3855da983..0000000000000000000000000000000000000000
--- a/scripts/check_wflow.py
+++ /dev/null
@@ -1,20 +0,0 @@
-from fireworks.utilities.dagflow import DAGFlow
-from fireworks import Workflow
-import sys
-
-assert len(sys.argv) >= 2
-vmode = None
-for arg in sys.argv:
-    if '--view' in arg:
-        vmode = arg.split('=')[1]
-
-wf_file = sys.argv[-1]
-dot_file = wf_file.split('.')[0] + '.dot'
-
-workflow = Workflow.from_file(wf_file)
-dag = DAGFlow.from_fireworks(workflow)
-
-if vmode is not None:
-    dag.add_step_labels()
-    dag.to_dot(dot_file, view=vmode)
-
diff --git a/scripts/lpad_append_wflow.py b/scripts/lpad_append_wflow.py
deleted file mode 100644
index 5560c272a34986c6ee15461876eb0b47755c043f..0000000000000000000000000000000000000000
--- a/scripts/lpad_append_wflow.py
+++ /dev/null
@@ -1,24 +0,0 @@
-from fireworks import Workflow, LaunchPad
-import sys
-
-fw_ids = None
-detour = False
-pull_spec_mods = True
-
-for arg in sys.argv:
-    if '--help' in arg:
-        print('usage: '+sys.argv[0]+' --fw_ids=<comma-separated list>'
-              +' <workflow file>')
-    if '--fw_ids' in arg:
-        fw_ids = arg.split('=')[1].split(',')
-        fw_ids = [int(fw_id.strip()) for fw_id in fw_ids]
-    if '--detour' in arg:
-        detour = True
-    if '--no_pull_spec_mods' in arg:
-        pull_spec_mods = False
-
-if fw_ids is not None:
-    wflow = Workflow.from_file(sys.argv[-1])
-    launchpad = LaunchPad()
-    launchpad.append_wf(wflow, fw_ids,
-                        detour=detour, pull_spec_mods=pull_spec_mods)
diff --git a/scripts/lpad_get_wflow.py b/scripts/lpad_get_wflow.py
deleted file mode 100644
index 1cb75238cfbf69f362c1407a8359a8c3aff801a0..0000000000000000000000000000000000000000
--- a/scripts/lpad_get_wflow.py
+++ /dev/null
@@ -1,15 +0,0 @@
-from fireworks import Workflow, LaunchPad
-import sys
-
-for arg in sys.argv:
-    if '--help' in arg:
-        print('usage: '+sys.argv[0]+' --fw_id=<int>'+' <workflow file>' )
-        exit(0)
-    if '--fw_id' in arg:
-        fw_id = int(arg.split('=')[1].strip())
-
-launchpad = LaunchPad()
-workflow = launchpad.get_wf_by_fw_id(fw_id)
-workflow.to_file(sys.argv[-1])
-
-
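
Note on the removed ``scripts/lpad_get_wflow.py``: it matched options by substring and left ``fw_id`` undefined when ``--fw_id`` was not given. Should such a helper be needed again, the same option handling could be expressed with ``argparse``. The following is only a minimal sketch under that assumption. The names ``parse_args`` and ``main`` are hypothetical and not part of this repository, and the FireWorks calls (``LaunchPad``, ``get_wf_by_fw_id``, ``to_file``) are simply the ones used in the deleted script.

```python
import argparse


def parse_args(argv=None):
    """Parse the options that the removed lpad_get_wflow.py read from sys.argv.

    Hypothetical helper: it accepts the same command line shape,
    ``--fw_id=<int> <workflow file>``, but fails with a usage message
    when the firework ID is missing instead of raising a NameError later.
    """
    parser = argparse.ArgumentParser(
        description="Write the workflow containing a given firework to a file.")
    parser.add_argument("--fw_id", type=int, required=True,
                        help="ID of any firework in the workflow")
    parser.add_argument("wf_file", help="output workflow file")
    return parser.parse_args(argv)


def main(argv=None):
    """Fetch the workflow from the LaunchPad and write it to the given file."""
    args = parse_args(argv)
    # Imported here so that argument parsing works without FireWorks installed.
    from fireworks import LaunchPad
    launchpad = LaunchPad()
    workflow = launchpad.get_wf_by_fw_id(args.fw_id)
    workflow.to_file(args.wf_file)


# Parsing only; no LaunchPad connection is made here.
example = parse_args(["--fw_id=3", "workflow.yaml"])
print(example.fw_id, example.wf_file)
```

Calling ``main(["--fw_id=3", "workflow.yaml"])`` would additionally require a reachable LaunchPad, as the deleted script did.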