Bulk-mode operations
Overview
In issues #144 (closed) and #145 (closed), bulk-mode and explorative operations were suggested.
The motivation for bulk-mode operations is to combine data from all models in one group for diagnostic and visualization purposes. These can include modifications to print() and view(). Some of them may require namespaces (see issue #213).
In addition, there is a need to explore the persistent models, that is, to find the UUIDs of only the relevant models. The best way to implement this is a GUI because the operation is quite generic.
Implementation
Where to store
A FireWorks workflow document has a metadata dictionary (object). It currently holds the model UUID, the group UUID, and the grammar of the model. This metadata is added when the model / workflow is created and cannot be changed afterwards (at least not via the FireWorks API/CLI). Therefore, for dynamic metadata we use the meta-node, a special firework that currently includes things such as the vary Table and imported objects (via the use statement). This meta-node may be a good location for the model metadata.
Tags
The easiest way is to provide a flexible tag system to classify the models and make them easier to search for and find later. The tag can be generalized to a more complex data structure that enables finding specific model categories.
Let us take this example derived from our experience:
keywords: U_eff, U_max, monolayer, NiO(OH), bifunctional mechanism
active surface: edge
active site: M5
doping: Fe
environment: explicit water
functional: beef-vdw
mechanism: associative bifunctional
intercalation: 2KOH
scheme: S3
Every category can be a scalar type or an array of different types. We have to design a corresponding Table representation (or Tuple of Series) for such metadata.
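One possible in-memory shape for the example above is a plain mapping from category name to either a scalar or an array. The sketch below uses a Python dictionary only to illustrate this; the names example_tags and is_array_category are hypothetical, and the actual Table representation still has to be designed.

```python
# Hypothetical in-memory form of the example tags: each category maps
# either to a scalar or to an array (list) of values.
example_tags = {
    "keywords": ["U_eff", "U_max", "monolayer", "NiO(OH)", "bifunctional mechanism"],
    "active surface": "edge",
    "active site": "M5",
    "doping": "Fe",
    "environment": "explicit water",
    "functional": "beef-vdw",
    "mechanism": "associative bifunctional",
    "intercalation": "2KOH",
    "scheme": "S3",
}


def is_array_category(value):
    """Return True if a category holds an array rather than a scalar."""
    return isinstance(value, (list, tuple))


# Names of the categories that hold arrays in this example.
array_categories = [name for name, value in example_tags.items()
                    if is_array_category(value)]
```

In this example only the keywords category is an array; all other categories are scalars.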
Similar to the vary statement, the tag statement will not persist in the textual model (model source code). The captured metadata is mutable (like the vary Table). Repeated use of tag will update the information. The tag metadata are attributed to the group of models with the given group UUID. They may not be attributed to a single model because the models in a group are identical except for the vary parameters that are defined in the vary Table. This is why the tags should belong to the group.
Similar to vary, the %tag magic provides the current tags associated with the active group.
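The update semantics described above (tags keyed by group UUID, repeated use of tag updating the stored information) can be sketched with a small in-memory store. This is only an illustration under assumptions: tag_store stands in for the persistent location (for example the meta-node), and the functions tag and current_tags are hypothetical names, not the actual implementation.

```python
import copy

# Hypothetical stand-in for the persistent storage (for example the meta-node):
# tags are keyed by group UUID, not by model UUID.
tag_store = {}


def tag(group_uuid, new_tags):
    """Merge new tags into those of the group; existing categories are overwritten."""
    tag_store.setdefault(group_uuid, {}).update(new_tags)


def current_tags(group_uuid):
    """Return a copy of the tags of a group, as a %tag-like magic might display them."""
    return copy.deepcopy(tag_store.get(group_uuid, {}))


group = "hypothetical-group-uuid"
tag(group, {"doping": "Fe", "scheme": "S2"})
# Repeated use updates the information: scheme is overwritten, keywords is added.
tag(group, {"scheme": "S3", "keywords": ["U_eff", "monolayer"]})
```

After the second call, the group carries the updated scheme together with the unchanged doping category.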
Searches
To perform a search we can use the expression syntax of Table query, something like %find <query> then <action> with query something like where intercalation == 2KOH and doping == null and scheme in (S2, S3). The action can be list, load one, etc. Depending on the action, different query projections will be constructed. For list this can be:
uuid | last change | keywords | mechanism | scheme | ... |
---|---|---|---|---|---|
3456 |
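The action-dependent projection could be kept in a small lookup table. In the sketch below the action names come from the text, while the field names, the PROJECTIONS mapping and the projection_for helper are assumptions. The dictionaries follow the MongoDB projection convention in which the value 1 includes a field.

```python
# Hypothetical mapping from %find actions to MongoDB-style projections
# (1 = include the field). Field names are assumed to match the column labels.
PROJECTIONS = {
    "list": {"uuid": 1, "last change": 1, "keywords": 1, "mechanism": 1, "scheme": 1},
    "load one": {"uuid": 1},
}


def projection_for(action):
    """Return the projection for an action, or raise ValueError for an unknown one."""
    try:
        return PROJECTIONS[action]
    except KeyError:
        raise ValueError(f"unsupported action: {action!r}") from None
```

Such a dictionary can be passed as the projection argument of PyMongo's Collection.find, next to the filter document built from the query.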
One issue with the query is that it has to be translated to a pymongo query because the search is not performed in the local data structure. One option is to reuse the Table format for the query:
((a: 1), (b: (in: 1, 2)))
A problem is that in is not a data label but a keyword. We cannot use $ or % without quotes in series names. Thus the alternative notation is:
((a: 1), (b: ('$in': 1, 2)))
This basically maps the MongoDB / PyMongo semantics onto the textS Table format. Other keywords could be '$not' for negation, '$regex' for regular expression match, and '$ne' for inequality.
((name: (('$regex': 'MnOx'))), ('constrained spin': false), (formula: '4(2(2MnO2.KOH))'))
((name: (('$regex': '(Pt38|Pt79|Pt116|Pt201)'))), ('selected complex number': (('$ne': null))))
(('complex construction': (('$in': ('automatic', 'semi-automatic')))))
(('$and': ( ((name: 'Build complex')), ((state: 'COMPLETED')) ) ))
(('$or': ( (('selected sites': ( ((kind: 'bridge'), (cn: 9), (gcn: 4.44), (number: 'all')), ) )), (('selected sites': ( ((kind: 'fcc'), (cn: 11), (gcn: 4.59), (number: 24)), ((kind: 'trough'), (cn: 13), (gcn: 4.5), (number: 24)) ) )) ) ))
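For comparison, the %find example and two of the Table-format queries above could be written as PyMongo filter documents, which are plain Python dictionaries. This is a sketch of a possible target of the translation, assuming that the tag categories are stored as top-level fields; the variable names are hypothetical.

```python
# where intercalation == 2KOH and doping == null and scheme in (S2, S3)
find_filter = {
    "intercalation": "2KOH",
    "doping": None,  # null is written as None in Python
    "scheme": {"$in": ["S2", "S3"]},
}

# ((name: (('$regex': 'MnOx'))), ('constrained spin': false), (formula: '4(2(2MnO2.KOH))'))
regex_filter = {
    "name": {"$regex": "MnOx"},
    "constrained spin": False,
    "formula": "4(2(2MnO2.KOH))",
}

# (('$and': ( ((name: 'Build complex')), ((state: 'COMPLETED')) ) ))
and_filter = {"$and": [{"name": "Build complex"}, {"state": "COMPLETED"}]}
```

Each of these dictionaries can be passed as the filter argument of PyMongo's Collection.find.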
An alternative query syntax (with the same semantics) is dictionary-like:
{'$and': ( {name: 'Build complex'}, {state: 'COMPLETED'} ) }
{'$or': ( {'selected sites': ( {kind: 'bridge', cn: 9, gcn: 4.44, number: 'all'}, ) }, {'selected sites': ( {kind: 'fcc', cn: 11, gcn: 4.59, number: 24}, {kind: 'trough', cn: 13, gcn: 4.5, number: 24} ) } ) }