I/O operations in deferred remote evaluation
The situation depends on the timing and order of input/output operations. In a workflow (or in a functional program) we cannot strictly predict the order of I/O operations or when they are scheduled. We therefore simply assume that the state of external inputs does not change with time. This is why inputs (file and URL inputs, and input() statements) can be performed immediately, i.e. during the interpreter phase and before runtime. Similarly, outputs (file and URL outputs, and all print() statements) can be performed as soon as the results are available. The only limitation is the locality of the sources and targets: expensive I/O operations (large data) should be carried out on the resources where the data will be processed or was produced.
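The timing rules above can be illustrated with a minimal sketch that uses only the Python standard library. It assumes a thread pool as a stand-in for the remote workflow engine. The names read_input, compute, source and outputs are hypothetical and not part of the project.

```python
from concurrent.futures import ThreadPoolExecutor
import io

# Hypothetical stand-in for an external input whose state is assumed
# not to change over time.
source = io.StringIO("1 2 3")

def read_input(stream):
    """Input: performed immediately, during the interpreter phase."""
    return [int(token) for token in stream.read().split()]

def compute(values):
    """Deferred evaluation: runs later, on the executor."""
    return sum(values)

outputs = []  # stand-in for a file or URL target

values = read_input(source)  # done before anything is scheduled
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(compute, values)
    # Output: registered now, performed as soon as the result is available.
    future.add_done_callback(lambda f: outputs.append(f.result()))
# Leaving the with-block waits for the task and its callback to finish.
```

The input is read before any task is submitted, while the output is attached as a callback and never blocks the submitting code.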
There are currently no tests for the I/O operations; some should be added.
The following features should be implemented in deferred and remote (workflow) execution.
print(), obj to file and obj to url statements
Each of these statements amounts to a database query followed by the actual output operation. All these operations should be non-blocking.
- print(): state-dependent, intended for interactive use, and always called immediately. If the object is not evaluated at the time of the call, a null value is printed. After Jupyter integration, print() is not necessary.
- obj to file and obj to url are state-independent: they should return, wait (non-blocking) until the objects are evaluated (checked by database queries), and only then be executed.
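The difference between the two kinds of output can be sketched as follows. This is only an illustration under stated assumptions: an in-memory dictionary stands in for the results database, a background thread stands in for the non-blocking wait, and the names lookup, print_now and to_file_deferred are hypothetical.

```python
import threading
import time

# Hypothetical stand-in for the results database: name -> value.
# A missing key means the object has not been evaluated yet.
database = {}
lock = threading.Lock()

def lookup(name):
    """Database query: return the value, or None if not evaluated."""
    with lock:
        return database.get(name)

def print_now(name):
    """print(): state-dependent, runs immediately, never waits.

    Returns the text that would be printed.
    """
    value = lookup(name)
    return "null" if value is None else str(value)

def to_file_deferred(name, sink, poll=0.01):
    """obj to file: return at once, write once the object is evaluated."""
    def worker():
        while (value := lookup(name)) is None:
            time.sleep(poll)  # poll the database without blocking the caller
        sink.append(value)
    thread = threading.Thread(target=worker)
    thread.start()
    return thread

sink = []                             # stand-in for the target file
before = print_now("a")               # object not evaluated yet
writer = to_file_deferred("a", sink)  # returns immediately
with lock:
    database["a"] = 42                # evaluation finishes later
writer.join()                         # only this sketch waits here
after = print_now("a")
```

Calling print_now before the evaluation yields the null placeholder, whereas the deferred write produces the same result regardless of when it was requested.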
Object from file and Object from url statements
- Object from file and Object from url should block all operations that depend on the inputs. They can be implemented as Firetasks / Fireworks that are parents of the dependent Fireworks / Firetasks.
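The parent/child blocking described above can be sketched with the standard-library graphlib module (Python 3.9+). This does not use the FireWorks API; it only illustrates the dependency rule, and the task names in the graph are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each key maps to the set of tasks it depends on.
# "from_file" and "from_url" are input tasks; the others consume them.
dependencies = {
    "from_file": set(),
    "from_url": set(),
    "parse": {"from_file"},
    "merge": {"parse", "from_url"},
}

executed = []
sorter = TopologicalSorter(dependencies)
sorter.prepare()
while sorter.is_active():
    for task in sorter.get_ready():  # only tasks whose parents are done
        executed.append(task)        # stand-in for running the task
        sorter.done(task)            # unblocks the dependent tasks
```

A task becomes ready only after all of its parents are marked done, so operations that depend on an input cannot start before that input has been read.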