ARROW-17318: [C++][Dataset] Support async streaming interface for getting fragments in Dataset (apache#13804)
Add `GetFragmentsAsync()` and `GetFragmentsAsyncImpl()`
functions to the generic `Dataset` interface, which
allow fragments to be produced in a streamed fashion.
This is one of the prerequisites for making
`FileSystemDataset` support lazy fragment
processing, which, in turn, can be used to start
scan operations without waiting for the entire
dataset to be discovered.
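
As a rough illustration of why streaming matters here, the
sketch below contrasts a scan that waits for full discovery
with one that consumes fragments as they are found. It is a
simplified synchronous model, not Arrow code: `PathGenerator`,
`MakeLazyDiscovery`, `ScanEagerly` and `ScanStreaming` are
hypothetical names, and fragments are reduced to path strings.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pull-style generator: each call yields the next fragment
// path, or std::nullopt once discovery is finished. (Arrow's real async
// generators return futures; this model is synchronous for brevity.)
using PathGenerator = std::function<std::optional<std::string>()>;

// Simulates lazy discovery: a path is "discovered" (and logged) only
// when the consumer asks for it.
PathGenerator MakeLazyDiscovery(std::vector<std::string> paths,
                                std::vector<std::string>* log) {
  auto pending = std::make_shared<std::vector<std::string>>(std::move(paths));
  auto index = std::make_shared<std::size_t>(0);
  return [pending, index, log]() -> std::optional<std::string> {
    if (*index >= pending->size()) return std::nullopt;
    const std::string& path = (*pending)[(*index)++];
    log->push_back("discover " + path);
    return path;
  };
}

// Waits for the whole dataset to be discovered before scanning anything.
void ScanEagerly(PathGenerator gen, std::vector<std::string>* log) {
  std::vector<std::string> all;
  while (auto next = gen()) all.push_back(*next);
  for (const auto& path : all) log->push_back("scan " + path);
}

// Scans each fragment as soon as it has been discovered.
void ScanStreaming(PathGenerator gen, std::vector<std::string>* log) {
  while (auto next = gen()) log->push_back("scan " + *next);
}
```

With two paths `a` and `b`, the eager variant logs both
discoveries before any scan, while the streaming variant
interleaves them (discover a, scan a, discover b, scan b).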
To ease the transition of the `Dataset`/`AsyncScanner`
code to an async implementation, a default
implementation of `GetFragmentsAsyncImpl()` is
provided. It yields a VectorGenerator over the
fragments vector, which every implementation of the
`Dataset` interface currently stores.
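
The shape of such a default can be sketched as follows. This
is a minimal, self-contained model under stated assumptions,
not the code from this change: `Fragment`, `Generator`,
`MakeVectorGeneratorSketch`, `DatasetSketch` and `CollectPaths`
are hypothetical stand-ins, the generator is synchronous
rather than future-based, and the fragments vector is kept in
the base class only to keep the example short.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a dataset fragment (not Arrow's Fragment class).
struct Fragment {
  std::string path;
};

// Simplified pull-style generator: each call yields the next item, or
// std::nullopt once the sequence is exhausted.
template <typename T>
using Generator = std::function<std::optional<T>()>;

// Yields the elements of an already-materialized vector one at a time.
template <typename T>
Generator<T> MakeVectorGeneratorSketch(std::vector<T> items) {
  auto state = std::make_shared<std::vector<T>>(std::move(items));
  auto index = std::make_shared<std::size_t>(0);
  return [state, index]() -> std::optional<T> {
    if (*index >= state->size()) return std::nullopt;
    return (*state)[(*index)++];
  };
}

using FragmentPtr = std::shared_ptr<Fragment>;

class DatasetSketch {
 public:
  explicit DatasetSketch(std::vector<FragmentPtr> fragments)
      : fragments_(std::move(fragments)) {}
  virtual ~DatasetSketch() = default;

  // Public entry point; forwards to the overridable implementation.
  Generator<FragmentPtr> GetFragmentsAsync() { return GetFragmentsAsyncImpl(); }

 protected:
  // Default: wrap the stored vector in a generator. A subclass that can
  // discover fragments lazily could override this instead.
  virtual Generator<FragmentPtr> GetFragmentsAsyncImpl() {
    return MakeVectorGeneratorSketch(fragments_);
  }

  std::vector<FragmentPtr> fragments_;
};

// Drains a fragment generator and returns the paths in the order yielded.
std::vector<std::string> CollectPaths(Generator<FragmentPtr> gen) {
  std::vector<std::string> paths;
  while (auto next = gen()) paths.push_back((*next)->path);
  return paths;
}
```

Callers written against the generator keep working unchanged
if a subclass later replaces the vector-backed default with a
truly lazy source.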
Tests: unit(release)
Signed-off-by: Pavel Solodovnikov <pavel.al.solodovnikov@gmail.com>
Authored-by: Pavel Solodovnikov <pavel.al.solodovnikov@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>