Commit b3a3a74
ARROW-1693: [JS] Expand JavaScript implementation, build system, fix integration tests
This PR adds a workaround for reading the metadata layout for C++ dictionary-encoded vectors.
I added tests that validate against the C++/Java integration suite. In order to make the new tests pass, I had to update the generated flatbuffers format and add a few types the JS version didn't have yet (Bool, Date32, and Timestamp). It also uses the new `isDelta` flag on DictionaryBatches to determine whether the DictionaryBatch vector should replace or append to the existing dictionary.
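The replace-or-append behavior driven by `isDelta` can be sketched roughly as follows. This is a minimal illustration, not the reader code from this PR: the `DictionaryBatchLike` shape, the `applyDictionaryBatch` helper, and the `Map`-based dictionary store are all hypothetical names assumed for the example.

```typescript
// Hypothetical shape for illustration; not the actual Arrow JS reader types.
interface DictionaryBatchLike<T> {
  id: number;       // dictionary id referenced by dictionary-encoded fields
  isDelta: boolean; // true: append to the existing dictionary; false: replace it
  values: T[];      // decoded dictionary values carried by this batch
}

// Assumed helper: updates the dictionary store in place for one batch.
function applyDictionaryBatch<T>(
  dictionaries: Map<number, T[]>,
  batch: DictionaryBatchLike<T>
): void {
  const existing = dictionaries.get(batch.id);
  if (batch.isDelta && existing !== undefined) {
    // Delta batch: keep the old entries and append the new ones.
    dictionaries.set(batch.id, existing.concat(batch.values));
  } else {
    // Non-delta batch (or first batch for this id): replace the dictionary.
    dictionaries.set(batch.id, batch.values.slice());
  }
}

const dictionaries = new Map<number, string[]>();
applyDictionaryBatch(dictionaries, { id: 0, isDelta: false, values: ["a", "b"] });
applyDictionaryBatch(dictionaries, { id: 0, isDelta: true, values: ["c"] });
```

After the two calls, dictionary `0` holds the original two values followed by the delta value; a later non-delta batch for the same id would discard all three.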
I also added a script for generating test Arrow files from the C++ and Java implementations, so we don't break the tests when updating the format in the future. I saved the generated Arrow files alongside the tests because I didn't see a way to pipe the JSON test data through the C++/Java json-to-arrow commands without writing to a file. If I missed something and we can do it all in-memory, I'd be happy to make that change!
This PR is marked WIP because I added an [integration test](apache@6e98874#diff-18c6be12406c482092d4b1f7bd70a8e1R22) that validates the JS reader reads C++ and Java files the same way, but unfortunately it doesn't yet. While debugging, I noticed a number of other differences in the buffer layout metadata between the C++ and Java versions. If we go ahead with @jacques-n's [comment in ARROW-1693](https://issues.apache.org/jira/browse/ARROW-1693?focusedCommentId=16244812&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16244812) and remove/ignore the metadata, this test should pass too.
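The idea behind that check, comparing what the reader yields for the C++ file against what it yields for the Java file, can be sketched as below. This is only an illustration: the actual tests use a custom jest matcher, and `iteratorsMatch` plus the stand-in arrays are hypothetical names and data assumed for the example.

```typescript
// Hypothetical helper: true when both iterables yield the same sequence.
// Values are compared via JSON.stringify, a simplification that treats
// NaN and null as equal and ignores non-JSON values such as undefined.
function iteratorsMatch<T>(a: Iterable<T>, b: Iterable<T>): boolean {
  const itA = a[Symbol.iterator]();
  const itB = b[Symbol.iterator]();
  while (true) {
    const ra = itA.next();
    const rb = itB.next();
    if (ra.done || rb.done) {
      // Equal only if both sequences end at the same step.
      return Boolean(ra.done) && Boolean(rb.done);
    }
    if (JSON.stringify(ra.value) !== JSON.stringify(rb.value)) {
      return false;
    }
  }
}

// Stand-in data; in a real test these would be columns read from the
// C++-generated and Java-generated Arrow files.
const fromCpp: Array<number | null> = [1, null, 3];
const fromJava: Array<number | null> = [1, null, 3];
const sameContents = iteratorsMatch(fromCpp, fromJava);
```

Comparing iterated values rather than raw buffers keeps the check independent of how each implementation lays out or pads its buffers.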
cc @TheNeuralBit
Author: Paul Taylor <paul.e.taylor@me.com>
Author: Wes McKinney <wes.mckinney@twosigma.com>
Closes apache#1294 from trxcllnt/generate-js-test-files and squashes the following commits:
f907d5a [Paul Taylor] fix aggressive closure-compiler mangling in the ES5 UMD bundle
57c7df4 [Paul Taylor] remove arrow files from perf tests
5972349 [Paul Taylor] update performance tests to use generated test data
14be77f [Paul Taylor] fix Date64Vector TypedArray, enable datetime integration tests
5660eb3 [Wes McKinney] Use openjdk8 for integration tests, jdk7 for main Java CI job
019e8e2 [Paul Taylor] update closure compiler with full support for ESModules, and remove closure-compiler-scripts
4811129 [Paul Taylor] Add support for reading Arrow buffers < MetadataVersion 4
c72134a [Paul Taylor] compile JS source in integration tests
c83a700 [Wes McKinney] Hack until ARROW-1837 resolved. Constrain unsigned integers max to signed max for bit width
fd3ed47 [Wes McKinney] Uppercase hex values
224e041 [Wes McKinney] Remove hard-coded file name to prevent primitive JSON file from being clobbered
0882d8e [Paul Taylor] separate JS unit tests from integration tests in CI
1f6a81b [Paul Taylor] add missing mkdirp for test json data
19136fb [Paul Taylor] remove test data files in favor of auto-generating them in CI
9f19568 [Paul Taylor] Generate test files when the test run if they don't exist
0cdb74e [Paul Taylor] Add a cli arg to integration_test.py generate test JSON files for JS
cc74456 [Paul Taylor] resolve LICENSE.txt conflict
3391623 [Paul Taylor] move js license to top-level license.txt
d0b61f4 [Paul Taylor] add validate package script back in, make npm-release.sh suitable for ASF release process
7e3be57 [Paul Taylor] Copy license.txt and notice.txt into target dirs from arrow root.
c8125d2 [Paul Taylor] Update readme to reflect new Table.from signature
49ac339 [Paul Taylor] allow unrecognized cli args in gulpfile
3c52587 [Paul Taylor] re-enable node_js job in travis
cb142f1 [Paul Taylor] add npm release script, remove unused package scripts
d51793d [Paul Taylor] run tests on src folder for accurate jest coverage statistics
c087f48 [Paul Taylor] generate test data in build scripts
1d814d0 [Paul Taylor] excise test data csvs
14d4896 [Paul Taylor] stringify Struct Array cells
1f00496 [Paul Taylor] rename FixedWidthListVector to FixedWidthNumericVector
be73c91 [Paul Taylor] add BinaryVector, change ListVector to always return an Array
02fb300 [Paul Taylor] compare iterator results in integration tests
e67a66a [Paul Taylor] remove/ignore test snapshots (getting too big)
de7d96a [Paul Taylor] regenerate test arrows from master
a6d3c83 [Paul Taylor] enable integration tests
44889fb [Paul Taylor] report errors generating test arrows
fd68d51 [Paul Taylor] always increment validity buffer index while reading
562eba7 [Paul Taylor] update test snapshots
d4399a8 [Paul Taylor] update integration tests, add custom jest vector matcher
8d44dcd [Paul Taylor] update tests
6d2c03d [Paul Taylor] clean arrows folders before regenerating test data
4166a9f [Paul Taylor] hard-code reader to Arrow spec and ignore field layout metadata
c60305d [Paul Taylor] refactor: flatten vector folder, add more types
ba984c6 [Paul Taylor] update dependencies
5eee3ea [Paul Taylor] add integration tests to compare how JS reads cpp vs. java arrows
d4ff57a [Paul Taylor] update test snapshots
407b9f5 [Paul Taylor] update reader/table tests for new generated arrows
8549706 [Paul Taylor] update cli args to execute partial test runs for debugging
eefc256 [Paul Taylor] remove old test arrows, add new generated test arrows
0cd31ab [Paul Taylor] add generate-arrows script to tests
3ff7138 [Paul Taylor] Add bool, date, time, timestamp, and ARROW-1693 workaround in reader
4a34247 [Paul Taylor] export Row type
141194e [Paul Taylor] use fieldNode.length as vector length
c45718e [Paul Taylor] support new DictionaryBatch isDelta flag
9d8fef9 [Paul Taylor] split DateVector into Date32 and Date64 types
8592ff3 [Paul Taylor] update generated format flatbuffers
1 parent d92735e commit b3a3a74
99 files changed
Lines changed: 2468 additions & 6081 deletions
File tree
- ci
- integration
- js
- closure-compiler-scripts
- gulp
- perf
- arrows
- file
- multi
- count
- latlong
- origins
- stream
- src
- format
- reader
- types
- table
- vector
- vector
- test
- __snapshots__
- arrows
- file
- multi
- count
- latlong
- origins
- stream
- tsconfig