ARROW-6180: [C++][Parquet] Add RandomAccessFile::GetStream that returns InputStream that reads a file segment independent of the file's state, fix concurrent buffered Parquet column reads
This enables different functions to read portions of a `RandomAccessFile` as an InputStream without interfering with each other.
This also addresses PARQUET-1636 and adds a unit test for buffered column chunk reads. In the refactor to use the Arrow IO interfaces, I broke this by allowing the same raw `RandomAccessFile` to be passed into multiple `BufferedInputStream` instances at once, so different column readers were moving the one shared file position out from under each other. We didn't catch the problem because there were no unit tests for buffered column chunk reads; this patch addresses that deficiency.
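The idea can be illustrated with a small standalone sketch. This is not the Arrow implementation: `ToyFile`, `SegmentStream`, and `DemoInterleavedReads` are hypothetical names invented for this example, and the sketch assumes only that the underlying file supports positional reads that leave no shared cursor behind. Each stream keeps its own offset into its segment, so interleaved reads from two "column readers" cannot disturb each other.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for a random access file (not an Arrow class).
// Positional reads are const, so they never move a shared file position.
class ToyFile {
 public:
  explicit ToyFile(std::string data) : data_(std::move(data)) {}

  // Read up to nbytes starting at an absolute offset.
  std::string ReadAt(int64_t offset, int64_t nbytes) const {
    if (offset < 0 || nbytes <= 0 || offset >= size()) return "";
    const int64_t n = std::min(nbytes, size() - offset);
    return data_.substr(static_cast<std::size_t>(offset),
                        static_cast<std::size_t>(n));
  }

  int64_t size() const { return static_cast<int64_t>(data_.size()); }

 private:
  std::string data_;
};

// Hypothetical stream over the byte range [start, start + length).
// The read position lives in the stream, not in the file.
class SegmentStream {
 public:
  SegmentStream(const ToyFile* file, int64_t start, int64_t length)
      : file_(file), start_(start), length_(length), pos_(0) {
    assert(file != nullptr && start >= 0 && length >= 0);
  }

  // Sequential read, clamped to the end of the segment.
  std::string Read(int64_t nbytes) {
    const int64_t n = std::min(nbytes, length_ - pos_);
    std::string out = file_->ReadAt(start_ + pos_, n);
    pos_ += static_cast<int64_t>(out.size());
    return out;
  }

 private:
  const ToyFile* file_;
  int64_t start_;
  int64_t length_;
  int64_t pos_;
};

// Two readers over adjacent segments of one file, with interleaved reads.
std::string DemoInterleavedReads() {
  ToyFile file("AAAABBBBCCCC");
  SegmentStream col_a(&file, 0, 4);  // first "column chunk"
  SegmentStream col_b(&file, 4, 4);  // second "column chunk"

  std::string a, b;
  a += col_a.Read(2);
  b += col_b.Read(3);
  a += col_a.Read(2);
  b += col_b.Read(3);  // clamped: only one byte remains in this segment
  return a + "|" + b;
}
```

Calling `DemoInterleavedReads()` yields `"AAAA|BBBB"`: each stream sees only its own segment regardless of the order of reads. With a single shared cursor, as in the bug described above, the second reader's seek would have changed where the first reader's next read landed.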
Closes apache#5085 from wesm/ARROW-6180 and squashes the following commits:
e4ad370 <Wes McKinney> Code review comments
2645bec <Wes McKinney> Add unit test that exhibits PARQUET-1636
76dc71c <Wes McKinney> stub
3eb0136 <Wes McKinney> Finish basic unit tests
4fd3d61 <Wes McKinney> Start implementation
Authored-by: Wes McKinney <wesm+git@apache.org>
Signed-off-by: Wes McKinney <wesm+git@apache.org>