
Commit 55b7bb8

ARROW-15178: [Java][Docs] Java Tutorial: Developer Docs for Java
Task related to: https://issues.apache.org/jira/browse/ARROW-15178

Closes apache#12534 from davisusanibar/java-tutorial-ARROW-15178

Lead-authored-by: david dali susanibar arce <davi.sarces@gmail.com>
Co-authored-by: David Li <li.davidm96@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
1 parent 76b1403 commit 55b7bb8

8 files changed

Lines changed: 369 additions & 27 deletions

docs/Makefile

Lines changed: 40 additions & 27 deletions
@@ -40,33 +40,36 @@ I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source
4040
.PHONY: help
4141
help:
4242
@echo "Please use \`make <target>' where <target> is one of"
43-
@echo " html to make standalone HTML files"
44-
@echo " dirhtml to make HTML files named index.html in directories"
45-
@echo " singlehtml to make a single large HTML file"
46-
@echo " python to make only the Python documentation"
47-
@echo " pickle to make pickle files"
48-
@echo " json to make JSON files"
49-
@echo " htmlhelp to make HTML files and a HTML help project"
50-
@echo " qthelp to make HTML files and a qthelp project"
51-
@echo " applehelp to make an Apple Help Book"
52-
@echo " devhelp to make HTML files and a Devhelp project"
53-
@echo " epub to make an epub"
54-
@echo " epub3 to make an epub3"
55-
@echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
56-
@echo " latexpdf to make LaTeX files and run them through pdflatex"
57-
@echo " latexpdfja to make LaTeX files and run them through platex/dvipdfmx"
58-
@echo " text to make text files"
59-
@echo " man to make manual pages"
60-
@echo " texinfo to make Texinfo files"
61-
@echo " info to make Texinfo files and run them through makeinfo"
62-
@echo " gettext to make PO message catalogs"
63-
@echo " changes to make an overview of all changed/added/deprecated items"
64-
@echo " xml to make Docutils-native XML files"
65-
@echo " pseudoxml to make pseudoxml-XML files for display purposes"
66-
@echo " linkcheck to check all external links for integrity"
67-
@echo " doctest to run all doctests embedded in the documentation (if enabled)"
68-
@echo " coverage to run coverage check of the documentation (if enabled)"
69-
@echo " dummy to check syntax errors of document sources"
43+
@echo " html to make standalone HTML files"
44+
@echo " dirhtml to make HTML files named index.html in directories"
45+
@echo " singlehtml to make a single large HTML file"
46+
@echo " python to make only the Python documentation"
47+
@echo " pickle to make pickle files"
48+
@echo " json to make JSON files"
49+
@echo " htmlhelp to make HTML files and a HTML help project"
50+
@echo " qthelp to make HTML files and a qthelp project"
51+
@echo " applehelp to make an Apple Help Book"
52+
@echo " devhelp to make HTML files and a Devhelp project"
53+
@echo " epub to make an epub"
54+
@echo " epub3 to make an epub3"
55+
@echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
56+
@echo " latexpdf to make LaTeX files and run them through pdflatex"
57+
@echo " latexpdfja to make LaTeX files and run them through platex/dvipdfmx"
58+
@echo " text to make text files"
59+
@echo " man to make manual pages"
60+
@echo " texinfo to make Texinfo files"
61+
@echo " info to make Texinfo files and run them through makeinfo"
62+
@echo " gettext to make PO message catalogs"
63+
@echo " changes to make an overview of all changed/added/deprecated items"
64+
@echo " xml to make Docutils-native XML files"
65+
@echo " pseudoxml to make pseudoxml-XML files for display purposes"
66+
@echo " linkcheck to check all external links for integrity"
67+
@echo " doctest to run all doctests embedded in the documentation (if enabled)"
68+
@echo " coverage to run coverage check of the documentation (if enabled)"
69+
@echo " dummy to check syntax errors of document sources"
70+
@echo " java_tutorial to make only the Java Tutorial documentation"
71+
@echo " java_dev to make only the Java Development documentation"
72+
7073

7174
.PHONY: clean
7275
clean:
@@ -254,3 +257,13 @@ python:
254257
$(SPHINXBUILD) -b html $(SPHINXOPTS) -c $(SOURCEDIR) $(SOURCEDIR)/python $(BUILDDIR)/html/python
255258
@echo
256259
@echo "Build finished. The HTML files are in $(BUILDDIR)/html/python"
260+
261+
java_tutorial:
262+
$(SPHINXBUILD) -b html $(SPHINXOPTS) -c $(SOURCEDIR) $(SOURCEDIR)/java $(BUILDDIR)/html/tutorial/java
263+
@echo
264+
@echo "Build finished. The HTML files are in $(BUILDDIR)/html/tutorial/java"
265+
266+
java_dev:
267+
$(SPHINXBUILD) -b html $(SPHINXOPTS) -c $(SOURCEDIR) $(SOURCEDIR)/developers/java $(BUILDDIR)/html/developers/java
268+
@echo
269+
@echo "Build finished. The HTML files are in $(BUILDDIR)/html/developers/java"
Lines changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
1+
.. Licensed to the Apache Software Foundation (ASF) under one
2+
.. or more contributor license agreements. See the NOTICE file
3+
.. distributed with this work for additional information
4+
.. regarding copyright ownership. The ASF licenses this file
5+
.. to you under the Apache License, Version 2.0 (the
6+
.. "License"); you may not use this file except in compliance
7+
.. with the License. You may obtain a copy of the License at
8+
9+
.. http://www.apache.org/licenses/LICENSE-2.0
10+
11+
.. Unless required by applicable law or agreed to in writing,
12+
.. software distributed under the License is distributed on an
13+
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
14+
.. KIND, either express or implied. See the License for the
15+
.. specific language governing permissions and limitations
16+
.. under the License.
17+
18+
.. highlight:: console
19+
20+
.. _building-arrow-java:
21+
22+
===================
23+
Building Arrow Java
24+
===================
25+
26+
.. contents::
27+
28+
System Setup
29+
============
30+
31+
Arrow Java uses the `Maven <https://maven.apache.org/>`_ build system.
32+
33+
Building requires:
34+
35+
* JDK 8, 9, 10, or 11, but only JDK 11 is tested in CI
36+
* Maven 3+
37+
38+
Building
39+
========
40+
41+
All the instructions below assume that you have cloned the Arrow git
42+
repository:
43+
44+
.. code-block::
45+
46+
$ git clone https://github.com/apache/arrow.git
47+
$ cd arrow
48+
$ git submodule update --init --recursive
49+
50+
Basic Installation
51+
------------------
52+
53+
To build the default modules, go to the project root and execute:
54+
55+
.. code-block::
56+
57+
$ cd arrow/java
58+
$ export JAVA_HOME=<absolute path to your java home>
59+
$ java -version
60+
$ mvn clean install
61+
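Before running the Maven build, it can help to confirm that ``JAVA_HOME`` points at a usable JDK. The function below is a minimal sketch and not part of the Arrow build: the name ``check_java_home`` and its messages are hypothetical, and it only checks that an executable ``bin/java`` exists. It does not verify the JDK version.

```shell
# check_java_home [DIR]: hypothetical pre-flight helper (not part of Arrow).
# Succeeds only if DIR (default: $JAVA_HOME) contains an executable bin/java.
check_java_home() {
  jdk_dir=${1:-${JAVA_HOME:-}}
  if [ -z "$jdk_dir" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$jdk_dir/bin/java" ]; then
    echo "no executable found at $jdk_dir/bin/java" >&2
    return 1
  fi
  echo "using JDK at $jdk_dir"
}

# Example use before building:
#   check_java_home && mvn clean install
```

Chaining the check with ``&&`` stops the build early with a short message, so a misconfigured environment does not surface later as a longer Maven error.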
62+
Building JNI Libraries on Linux
63+
-------------------------------
64+
65+
First, we need to build the `C++ shared libraries`_ that the JNI bindings will use.
66+
We can build these manually or we can use `Archery`_ to build them using a Docker container
67+
(this requires installing Docker, Docker Compose, and Archery).
68+
69+
.. code-block::
70+
71+
$ cd arrow
72+
$ archery docker run java-jni-manylinux-2014
73+
$ ls -latr java-dist/
74+
|__ libarrow_cdata_jni.so
75+
|__ libarrow_dataset_jni.so
76+
|__ libarrow_orc_jni.so
77+
|__ libgandiva_jni.so
78+
79+
Building JNI Libraries on macOS
80+
-------------------------------
81+
82+
To build only the C Data Interface library:
83+
84+
.. code-block::
85+
86+
$ cd arrow
87+
$ brew bundle --file=cpp/Brewfile
88+
Homebrew Bundle complete! 25 Brewfile dependencies now installed.
89+
$ export JAVA_HOME=<absolute path to your java home>
90+
$ mkdir -p java-dist java-native-c
91+
$ cd java-native-c
92+
$ cmake \
93+
-DCMAKE_BUILD_TYPE=Release \
94+
-DCMAKE_INSTALL_LIBDIR=lib \
95+
-DCMAKE_INSTALL_PREFIX=../java-dist \
96+
../java/c
97+
$ cmake --build . --target install
98+
$ ls -latr ../java-dist/lib
99+
|__ libarrow_cdata_jni.dylib
100+
101+
To build other JNI libraries:
102+
103+
.. code-block::
104+
105+
$ cd arrow
106+
$ brew bundle --file=cpp/Brewfile
107+
Homebrew Bundle complete! 25 Brewfile dependencies now installed.
108+
$ export JAVA_HOME=<absolute path to your java home>
109+
$ mkdir -p java-dist java-native-cpp
110+
$ cd java-native-cpp
111+
$ cmake \
112+
-DARROW_BOOST_USE_SHARED=OFF \
113+
-DARROW_BROTLI_USE_SHARED=OFF \
114+
-DARROW_BZ2_USE_SHARED=OFF \
115+
-DARROW_GFLAGS_USE_SHARED=OFF \
116+
-DARROW_GRPC_USE_SHARED=OFF \
117+
-DARROW_LZ4_USE_SHARED=OFF \
118+
-DARROW_OPENSSL_USE_SHARED=OFF \
119+
-DARROW_PROTOBUF_USE_SHARED=OFF \
120+
-DARROW_SNAPPY_USE_SHARED=OFF \
121+
-DARROW_THRIFT_USE_SHARED=OFF \
122+
-DARROW_UTF8PROC_USE_SHARED=OFF \
123+
-DARROW_ZSTD_USE_SHARED=OFF \
124+
-DARROW_JNI=ON \
125+
-DARROW_PARQUET=ON \
126+
-DARROW_FILESYSTEM=ON \
127+
-DARROW_DATASET=ON \
128+
-DARROW_GANDIVA_JAVA=ON \
129+
-DARROW_GANDIVA_STATIC_LIBSTDCPP=ON \
130+
-DARROW_GANDIVA=ON \
131+
-DARROW_ORC=ON \
132+
-DARROW_PLASMA_JAVA_CLIENT=ON \
133+
-DARROW_PLASMA=ON \
134+
-DCMAKE_BUILD_TYPE=Release \
135+
-DCMAKE_INSTALL_LIBDIR=lib \
136+
-DCMAKE_INSTALL_PREFIX=../java-dist \
137+
-DCMAKE_UNITY_BUILD=ON \
138+
-Dre2_SOURCE=BUNDLED \
139+
-DBoost_SOURCE=BUNDLED \
140+
-Dutf8proc_SOURCE=BUNDLED \
141+
-DSnappy_SOURCE=BUNDLED \
142+
-DORC_SOURCE=BUNDLED \
143+
-DZLIB_SOURCE=BUNDLED \
144+
../cpp
145+
$ cmake --build . --target install
146+
$ ls -latr ../java-dist/lib
147+
|__ libarrow_dataset_jni.dylib
148+
|__ libarrow_orc_jni.dylib
149+
|__ libgandiva_jni.dylib
150+
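The ``ls`` listings above show which libraries each build is expected to produce. As a rough sketch, the same check can be scripted. The function name ``check_jni_libs`` is hypothetical, and the list of library names is taken from the listings in this guide rather than from the build system, so adjust it if you only build a subset.

```shell
# check_jni_libs DIR EXT: hypothetical helper (not part of Arrow).
# Reports which of the JNI libraries named in this guide are missing from DIR.
# EXT is "so" on Linux or "dylib" on macOS.
check_jni_libs() {
  dir=$1
  ext=$2
  missing=0
  for name in arrow_cdata_jni arrow_dataset_jni arrow_orc_jni gandiva_jni; do
    if [ ! -f "$dir/lib$name.$ext" ]; then
      echo "missing: lib$name.$ext"
      missing=1
    fi
  done
  return $missing
}

# Example use:
#   check_jni_libs java-dist so            # after the Archery build on Linux
#   check_jni_libs ../java-dist/lib dylib  # after the CMake builds on macOS
```

The function returns a non-zero status when any library is missing, so it can gate the later Maven step.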
151+
Building Arrow JNI Modules
152+
--------------------------
153+
154+
To compile the JNI bindings, use the ``arrow-c-data`` Maven profile:
155+
156+
.. code-block::
157+
158+
$ cd arrow/java
159+
$ mvn -Darrow.c.jni.dist.dir=../java-dist/lib -Parrow-c-data clean install
160+
161+
To compile the JNI bindings for ORC / Gandiva / Dataset, use the ``arrow-jni`` Maven profile:
162+
163+
.. code-block::
164+
165+
$ cd arrow/java
166+
$ mvn -Darrow.cpp.build.dir=../java-dist/lib -Parrow-jni clean install
167+
168+
IDE Configuration
169+
=================
170+
171+
IntelliJ
172+
--------
173+
174+
To start working on Arrow in IntelliJ, just open the ``java/``
175+
subdirectory of the Arrow repository.
176+
177+
* For JDK 8, disable the ``error-prone`` profile to build the project successfully.
178+
* For JDK 11, the project should build successfully with the default profiles.
179+
180+
Common Errors
181+
=============
182+
183+
1. If the build cannot find dependencies, with errors like these:
184+
- Could NOT find Boost (missing: Boost_INCLUDE_DIR system filesystem)
185+
- Could NOT find Lz4 (missing: LZ4_LIB)
186+
- Could NOT find zstd (missing: ZSTD_LIB)
187+
188+
Download the dependencies at build time (more details in `Dependency Resolution`_):
189+
190+
.. code-block::
191+
192+
-Dre2_SOURCE=BUNDLED \
193+
-DBoost_SOURCE=BUNDLED \
194+
-Dutf8proc_SOURCE=BUNDLED \
195+
-DSnappy_SOURCE=BUNDLED \
196+
-DORC_SOURCE=BUNDLED \
197+
-DZLIB_SOURCE=BUNDLED
198+
199+
.. _Archery: https://github.com/apache/arrow/blob/master/dev/archery/README.md
200+
.. _Dependency Resolution: https://arrow.apache.org/docs/developers/cpp/building.html#individual-dependency-resolution
201+
.. _C++ shared libraries: https://arrow.apache.org/docs/cpp/build_system.html
202+
.. _TestArrowBuf.java: https://github.com/apache/arrow/blob/master/java/memory/memory-core/src/test/java/org/apache/arrow/memory/TestArrowBuf.java#L130-L147
Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
1+
.. Licensed to the Apache Software Foundation (ASF) under one
2+
.. or more contributor license agreements. See the NOTICE file
3+
.. distributed with this work for additional information
4+
.. regarding copyright ownership. The ASF licenses this file
5+
.. to you under the Apache License, Version 2.0 (the
6+
.. "License"); you may not use this file except in compliance
7+
.. with the License. You may obtain a copy of the License at
8+
9+
.. http://www.apache.org/licenses/LICENSE-2.0
10+
11+
.. Unless required by applicable law or agreed to in writing,
12+
.. software distributed under the License is distributed on an
13+
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
14+
.. KIND, either express or implied. See the License for the
15+
.. specific language governing permissions and limitations
16+
.. under the License.
17+
18+
.. highlight:: console
19+
20+
======================
21+
Development Guidelines
22+
======================
23+
24+
.. contents::
25+
26+
Unit Testing
27+
============
28+
Unit tests are run by Maven during the build.
29+
30+
To speed up the build, you can skip them by passing ``-DskipTests``.
31+
.. code-block::
32+
33+
$ cd arrow/java
34+
$ mvn \
35+
-Darrow.cpp.build.dir=../java-dist/lib -Parrow-jni \
36+
-Darrow.c.jni.dist.dir=../java-dist/lib -Parrow-c-data \
37+
clean install
38+
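The build command above runs the unit tests. As an illustration of where ``-DskipTests`` fits, the sketch below assembles the same Maven command line with or without that flag and prints it for review. The function name ``arrow_mvn_cmd`` and its arguments are hypothetical. The profiles and properties are the ones used in this guide.

```shell
# arrow_mvn_cmd [MODE] [DIST_DIR]: hypothetical helper (not part of Arrow).
# Prints the Maven command for a JNI-enabled build. MODE "skip" adds
# -DskipTests; DIST_DIR defaults to ../java-dist/lib as used in this guide.
arrow_mvn_cmd() {
  mode=${1:-test}
  dist_dir=${2:-../java-dist/lib}
  cmd="mvn -Darrow.cpp.build.dir=$dist_dir -Parrow-jni"
  cmd="$cmd -Darrow.c.jni.dist.dir=$dist_dir -Parrow-c-data"
  if [ "$mode" = "skip" ]; then
    cmd="$cmd -DskipTests"
  fi
  echo "$cmd clean install"
}

# Example use from arrow/java:
#   arrow_mvn_cmd skip    # print the command that skips unit tests
```

Printing the command rather than running it keeps the sketch side-effect free. Copy the output into your shell once it looks right.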
39+
Performance Testing
40+
===================
41+
42+
The ``arrow-performance`` module contains benchmarks.
43+
44+
Let's configure our environment to run performance tests:
45+
46+
- Install `benchmark`_
47+
- Install `archery`_
48+
49+
Optionally, to view your performance test results in the conbench UI:
50+
51+
- Install `conbench`_
52+
53+
Let's run the benchmark tests:
54+
55+
.. code-block::
56+
57+
$ cd benchmarks
58+
$ conbench java-micro --help
59+
$ conbench java-micro \
60+
--iterations=1 \
61+
--commit=e90472e35b40f58b17d408438bb8de1641bfe6ef \
62+
--java-home=<absolute path to your java home> \
63+
--src=<absolute path to your arrow project> \
64+
--benchmark-filter=org.apache.arrow.adapter.AvroAdapterBenchmarks.testAvroToArrow
65+
Benchmark Mode Cnt Score Error Units
66+
AvroAdapterBenchmarks.testAvroToArrow avgt 725545.783 ns/op
67+
Time to POST http://localhost:5000/api/login/ 0.14911699295043945
68+
Time to POST http://localhost:5000/api/benchmarks/ 0.06116318702697754
69+
70+
Then open http://127.0.0.1:5000/ to see the reports:
71+
72+
UI Home:
73+
74+
.. image:: img/conbench_ui.png
75+
76+
UI Runs:
77+
78+
.. image:: img/conbench_runs.png
79+
80+
UI Benchmark:
81+
82+
.. image:: img/conbench_benchmark.png
83+
84+
Code Style
85+
==========
86+
87+
Code style is enforced with Checkstyle. The configuration is located at `checkstyle`_.
88+
You can also check the style without building the project.
89+
Run the command below from the ``java/`` directory to check all source code, or from within an individual module to check only that module.
90+
91+
.. code-block::
92+
93+
$ mvn checkstyle:check
94+
95+
.. _benchmark: https://github.com/ursacomputing/benchmarks
96+
.. _archery: https://github.com/apache/arrow/blob/master/dev/conbench_envs/README.md#L188
97+
.. _conbench: https://github.com/conbench/conbench
98+
.. _checkstyle: https://github.com/apache/arrow/blob/master/java/dev/checkstyle/checkstyle.xml