Repository metrics

Stars: 1,011
PR merge metrics: No merged PRs in 30d

Description

The future package receives many interesting and handy feature requests. Some of them are straightforward, whereas others do not necessarily fit straight in. I'm creating this issue to clarify why these are not straightforward to implement, what the alternatives going forward are, and to encourage further discussion and ideas.

Minimal Future API (aka Future API)

In its most minimal and essential form, the Future API provides:

future() - creates a future (on any future backend)
value() - collects the value of the future (waits for it to resolve if not already done)
resolved() - checks whether a future is resolved or not.
A future is stateless, i.e. just like plain R functions, evaluation of a future expression is purely functional without side effects and the outcome is the value (or a condition) of the evaluated expression.
The values of futures should not depend on the order in which they are resolved.
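
As a rough illustration of the three functions and the order-independence property listed above, here is a minimal sketch. It assumes the future package is installed and uses plan(sequential) only so that it runs anywhere; any other backend should behave the same as far as the returned values are concerned.

```r
library(future)

# Assumption: the sequential backend, picked only to keep the sketch portable.
plan(sequential)

# future() creates a future for an expression.
f <- future({
  x <- 1:10
  sum(x)
})

# resolved() checks, without blocking, whether the future is done.
is_done <- resolved(f)

# value() waits for the future if needed and returns its value.
v <- value(f)

# Order independence: collecting the values in forward or reverse order
# should give the same result for each future.
fs <- lapply(1:3, function(i) future(i^2))
vals_fwd <- vapply(fs, value, numeric(1))
vals_rev <- rev(vapply(rev(fs), value, numeric(1)))
same <- identical(vals_fwd, vals_rev)
```

Nothing in this sketch relies on state left behind by an earlier future, which is what lets the same code run on any backend.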

On top of this, we have arguments controlling whether the future should be resolved lazily or eagerly, what or how globals are exported, polling and timeout strategies, etc.

I probably forgot something above, so please feel free to comment.

It is critical that this Minimal Future API can be supported by all future backends (including those not yet implemented but that may show up in the future). Because of this, the Minimal Future API is limited in what it can provide.

Examples of features that probably would fit in the Minimal Future API, but have not yet been added:

Optional Future API

Any features related to futures that cannot be supported by all backends belong to what I consider an extended / optional API - let's call it the Optional Future API. Some features may be specific to a single backend, while others may be supported by a majority of backends but not all.

Below is a set of features that fit into this category:

"Passing" existing futures to a new one, e.g. a <- future(1); b <- future(value(a)) - requires b to be able to "communicate" with a (e.g. when they run on different machines)
Suspending/terminating a future currently being evaluated, e.g. suspend(f) (Issue #93)
Instant forwarding of the future's standard output ~~and standard error~~ streams to the owner process (Issues #141, #171)
"Monitoring" of a future, e.g. progress updates / progress bars (Issue https://github.com/HenrikBengtsson/doFuture/issues/8)
Persistent workers, i.e. a future can change the state of an underlying worker that a following future can utilize.
- efficiency: don't export globals that already exist on the worker (requires a method for asserting identical(local, remote)).
- this can be for efficiency, e.g. futures that share the same global variables may resolve faster if they are resolved by the same worker (this can be optional, i.e. export global if not already available; think memoization)
- a future preserves a value for a downstream future (not sure if this fits into the concept of futures, but I'll add it here in case someone has thoughts about this)
Resource specifications typically seen in HPC environments, e.g. how much memory and wall time must be available in order to start resolving a specific future. Another example is access to a GPU. The future.batchtools package actually provides a little bit of these features under the hood, but such features are currently experimental and exploratory.
Other resource specifications, such as only running on the local machine, on the local file system, on a given version of R, access to a certain set of files, and so on.
...

Some of the referenced issues discuss why it's hard to implement the features in a generic fashion such that they would work with all future backends (i.e. why they cannot be added to the Minimal Future API but belong to a set of optional features).
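
For the first item in the list (passing one future into another), one way to stay within the Minimal Future API is to collect the upstream value in the owner process and hand it on as an ordinary global. The sketch below is only an illustration of that workaround, not something proposed in this issue; the variable name a_value is made up, and plan(sequential) is assumed only for portability.

```r
library(future)

# Assumption: the sequential backend, picked only to keep the sketch portable.
plan(sequential)

a <- future(1)

# Collect the value of 'a' in the process that created it ...
a_value <- value(a)

# ... and let 'b' depend on the plain value instead of on the future 'a'.
# 'a_value' (a hypothetical name) is exported to 'b' as a regular global.
b <- future(a_value + 1)

b_value <- value(b)
```

The trade-off is that the owner process blocks on value(a) before b can be created, so b never needs to "communicate" with a directly.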

Contributor guide

Research direction: Review the discussion on minimal vs optional Future API and consider if any proposed features fit into the minimal API without breaking backend compatibility.
Tech stack: None
Domain: backend
Issue type: Research
Difficulty: 4
Estimated time: Over 1 week
Activity status: Active
Clarity: Clear
Prerequisites: R
Newbie friendliness: 20
