Short circuiting and lazy evaluation
The and
and or
expressions should be short circuiting, i.e. return immediately after the value may not change and not compute all terms. The or
should return immediately after a term evaluating to true
whereas and
should return right after a term evaluating to false
.
For example for or
. This code is not short circuiting:
value = self.operands[0].value
for operand in self.operands[1:]:
value = value or operand.value
An this code is short circuiting:
value = any(o.value for o in self.operands)
The short circuiting can be exploited only if the single terms in these expressions are evaluated on demand (lazy evaluation). The lazy evaluation is prevented in Python function due to strictness.
One solution is to implement the deferred executor interpreter using Parsl or Dask.
In FireWorks this is more difficult. Here we will have to deal with dynamic workflows using FWAction
, similarly to other issues such like evaluate only the active branch in an if-function, and implementing recursive functions.
Approach
The simplest way to do short circuiting is to implement a non-strict (lazy) version of the if function. Then all short-circuiting expressions can be implemented using this function.
See here for some details.
For example, a simple x and y
is equivalent to if(x, y, false)
where y
should not be evaluated if x
is false
). Similarly, a x or y
is equivalent to if(x, true, y)
where y
should not be evaluated if x
is true
.
By nesting if-functions we achieve longer expressions, e.g. x or y or z
is equivalent to if(x, true, if(y, true, z))
Note that if the operands x
, y
and z
are not booleans, then the third parameter of the if
function must be cast to boolean type but in the vre-language this is not needed because the boolean operands are only boolean type.
Implementation
Immediate evaluation
The current implementation if -expression is partial lazy evaluation:
def if_expression_value(self):
"""Evaluate an if-expression object, returns the expression value"""
return self.true_.value if self.expr.value else self.false_.value
If self.expr.value
is True then only self.true_.value
is called and self.false_.value
is not called.
The and
and or
are implemented in
def binary_operation_value(self, operator):
"""Evaluate binary operation expression"""
return reduce(operator, (o.value for o in self.operands))
that is not short circuiting because reduce
will exhaust the generator, i.e. consume all values for all operands.
def boolean_expression_value(self, operator):
"""Evaluate boolean and/or expression with short circuiting"""
# operator can be 'and' or 'or'
for operand in self.operands:
if operator == 'and':
if not operand.value:
return False
continue
if operator == 'or':
if operand.value:
return True
This function will evaluate from left to right and skip the evaluation of the remaining operands as soon as possible.
Deferred evaluation
The if-function skips the true branch arguments but the problem is the strictness of the function, i.e. all arguments are evaluated before it is called. One approach to avoid this is to pass generators for the two arguments. The other approach is to pass future objects for the two arguments.
def value_wrapper(value_method):
def value_gen(*args, **kwargs):
yield value_method(*args, **kwargs)
return value_gen
The func_value()
function for the IfFunction
class has to be wrapped with value_wrapper
. One has to figure out how to deal correctly with the other two wrappers, cached_properties()
and textxerror_wrap()
.
This produces from every returned function a one-time-function (due to cached property it will be evaluated only once). Then the evaluation is called using next
only if needed:
def if_expression_value(self):
...
def iffunc(*args):
if expr_func(next(a) for a in args[:expr_pars_len]):
retval = true_b(next(a) for a in args[expr_pars_len:true_b_pars_len])
else:
retval = false_b(next(a) for a in args[true_b_pars_len:])
return retval
Workflow evaluation
We need dynamic workflows to implement short circuiting in workflow mode. In the case of a single if-function this can be carried out in two different ways:
class IfFireTask(FireTaskBase):
...
def run_task(self, spec):
if condition():
return FWAction(detours=[Firework(FunctionTask(true_branch))])
else:
return FWAction(detours=[Firework(FunctionTask(false_branch))])
In this implementation, a new firework will be created, either with true_branch()
or false_branch()
. In this case it is possible to expand the recursion dynamically with no limitations. In case of dynamic recursion the true_branch()
and the false_branch()
will create new Fireworks with the proper tasks accordingly. Another advantage is that condition()
, true_branch()
and false_branch()
are executed in different Fireworks allowing different computing resource allocations.
Alternatively, the exit
action can be used:
class IfFireTask(FireTaskBase):
...
def run_task(self, spec):
if condition():
true_branch()
return FWAction(exit=True)
In the latter variant, the false_branch()
is a second task in the same Firework that will be skipped if condition() is True
. In the case of nested if-functions, every next Firetask in the Firework implements the false
branch of the previous and includes the true
branch directly. This is not suitable for dynamic recursion where we do not know a priori how many times the function will be called. Another drwaback is that condition()
, true_branch()
and false_branch()
are all in the same Firework and have to run on the same computing resources.