jaq

Build status Crates.io Documentation Rust 1.69+

jaq (pronounced /ʒaːk/, like Jacques[1]) is a clone of the JSON data processing tool jq. It is two things at once:

  • A command-line program, jaq, that can be used as a drop-in replacement for jq.
  • A library, jaq-core, that can be used to compile and run jq programs inside of Rust programs. Compared to the jq API, jaq-core can be safely used in multi-threaded environments and supports arbitrary data types beyond JSON.

jaq has its own manual. You can try jaq online on the jaq playground. Instructions for the playground can be found here.

jaq focuses on three goals:

  • Correctness: jaq aims to provide a more correct and predictable implementation of jq, while preserving compatibility with jq in most cases.
  • Performance: I originally created jaq because I was bothered by the long start-up time of jq 1.6, which amounts to about 50ms on my machine. This is particularly noticeable when processing a large number of small files. Although the start-up time has been vastly improved in jq 1.7, jaq is still faster than jq on many other benchmarks.
  • Simplicity: jaq aims to have a simple and small implementation, in order to reduce the potential for bugs and to facilitate contributions.

Installation

Binaries

You can download binaries for Linux, Mac, and Windows on the releases page. On a Linux system, you can download the binary using the following command:

$ curl -fsSL https://github.com/01mf02/jaq/releases/latest/download/jaq-$(uname -m)-unknown-linux-musl -o jaq && chmod +x jaq

You may also install jaq using homebrew on macOS or Linux:

$ brew install jaq
$ brew install --HEAD jaq # latest development version

From Source

To compile jaq, you need a Rust toolchain. See https://rustup.rs/ for instructions.

Either of the following commands installs jaq:

$ cargo install --locked jaq
$ cargo install --locked --git https://github.com/01mf02/jaq # latest development version

On my system, both commands place the executable at ~/.cargo/bin/jaq.

If you have cloned this repository, you can also build jaq by running one of the following commands inside the cloned repository:

$ cargo build --release # places binary into target/release/jaq
$ cargo install --locked --path jaq # installs binary

jaq should work on any system supported by Rust. If it does not, please file an issue.

Performance

The following evaluation consists of several benchmarks that compare the performance of jaq, jq, and gojq. The empty benchmark runs the filter empty n times with null input, which measures the start-up time. The bf-fib benchmark runs a Brainfuck interpreter written in jq, interpreting a Brainfuck script that produces n Fibonacci numbers. The other benchmarks evaluate various filters with n as input; see bench.sh for details.
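To make the idea behind the empty benchmark concrete, here is a minimal Python sketch that measures start-up time by launching a command repeatedly. It is not what bench.sh does internally; the helper name `time_repeated` is hypothetical, and the invocation `jaq -n empty` (null input, filter empty) is an assumption about how such a run could look.

```python
import subprocess
import time


def time_repeated(cmd, n):
    """Run `cmd` n times in a row and return the total wall-clock time in seconds."""
    start = time.perf_counter()
    for _ in range(n):
        # check=True raises CalledProcessError if the command exits with a non-zero status.
        subprocess.run(
            cmd,
            check=True,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
        )
    return time.perf_counter() - start


# Assumed usage (requires the tool to be on PATH):
#   total = time_repeated(["jaq", "-n", "empty"], 512)
#   print(f"{total * 1000:.0f} ms for 512 runs")
```

Because the filter itself does no work, nearly all of the measured time is process start-up, which is why this benchmark is sensitive to differences like the one between jq 1.6 and later versions.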

I generated the benchmark data with bench.sh target/release/jaq jq-1.8.1 gojq-0.12.17 | tee bench.json on a Linux system with an AMD Ryzen 5 5500U.[2] I then processed the results with a "one-liner" (stretching the term and the line a bit):

jq -rs '.[] | "|`\(.name)`|\(.n)|" + ([.time[] | min | (.*1000|round)? // "N/A"] | min as $total_min | map(if . == $total_min then "**\(.)**" else "\(.)" end) | join("|"))' bench.json

(Of course, you can also use jaq here instead of jq.) Finally, I concatenated the table header with the output and piped it through pandoc -t gfm.
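For readers who find the one-liner dense, the following Python sketch spells out the same steps. The shape of bench.json is inferred from the filter and is an assumption: a stream of JSON objects with the fields name, n, and time, where time holds one list of timings in seconds per tool, and an empty list stands for a failed or timed-out run. The function names `read_records` and `format_row` are hypothetical.

```python
import json


def read_records(text):
    """Parse a stream of concatenated JSON values, as `jq -s` does when slurping."""
    decoder = json.JSONDecoder()
    text = text.strip()
    records, pos = [], 0
    while pos < len(text):
        record, pos = decoder.raw_decode(text, pos)
        records.append(record)
        # raw_decode does not skip whitespace between values, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    return records


def format_row(record):
    """Format one benchmark record as a table row, bolding the fastest time."""
    cells = []
    for runs in record["time"]:
        # Best of several runs, converted from seconds to milliseconds;
        # a tool without any timing is reported as "N/A".
        cells.append(round(min(runs) * 1000) if runs else "N/A")
    numeric = [c for c in cells if c != "N/A"]
    fastest = min(numeric) if numeric else None
    rendered = [f"**{c}**" if c == fastest else str(c) for c in cells]
    return f"|`{record['name']}`|{record['n']}|" + "|".join(rendered)


# Assumed usage:
#   with open("bench.json") as f:
#       for record in read_records(f.read()):
#           print(format_row(record))
```

As in the jq filter, the minimum is taken over the numeric cells only, so an "N/A" entry is never marked as fastest.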

Table: Evaluation results in milliseconds ("N/A" if error or more than 10 seconds).

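| Benchmark     |       n | jaq-2.3 | jq-1.8.1 | gojq-0.12.17 |
|---------------|--------:|--------:|---------:|-------------:|
| empty         |     512 |     330 |      440 |          290 |
| bf-fib        |      13 |     430 |     1110 |          540 |
| defs          |  100000 |      60 |      N/A |         1000 |
| upto          |    8192 |       0 |      470 |          450 |
| reduce-update |   16384 |      10 |      490 |         1200 |
| reverse       | 1048576 |      30 |      500 |          270 |
| sort          | 1048576 |     100 |      450 |          540 |
| group-by      | 1048576 |     340 |     1750 |         1540 |
| min-max       | 1048576 |     190 |      170 |          260 |
| add           | 1048576 |     440 |      570 |         1150 |
| kv            |  131072 |     100 |      140 |          270 |
| kv-update     |  131072 |     110 |      480 |          480 |
| kv-entries    |  131072 |     520 |     1050 |          800 |
| ex-implode    | 1048576 |     470 |     1010 |          590 |
| reduce        | 1048576 |     720 |      850 |          N/A |
| try-catch     | 1048576 |     170 |      220 |          370 |
| repeat        | 1048576 |     140 |      690 |          530 |
| from          | 1048576 |     280 |      800 |          550 |
| last          | 1048576 |      40 |      160 |          110 |
| pyramid       |  524288 |     310 |      270 |          480 |
| tree-contains |      23 |      60 |      590 |          220 |
| tree-flatten  |      17 |     750 |      340 |            0 |
| tree-update   |      17 |     470 |      970 |         1300 |
| tree-paths    |      17 |     130 |      250 |          770 |
| to-fromjson   |   65536 |      40 |      370 |          100 |
| ack           |       7 |     510 |      540 |         1090 |
| range-prop    |     128 |     350 |      270 |          230 |
| cumsum        | 1048576 |     250 |      260 |          460 |
| cumsum-xy     | 1048576 |     400 |      350 |          680 |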

The results show that jaq-2.3 is fastest on 23 benchmarks, whereas jq-1.8.1 and gojq-0.12.17 are each fastest on 3 benchmarks. gojq is much faster on tree-flatten because it implements the filter flatten natively instead of by definition.
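The per-tool counts can be re-derived from the table with a few lines of Python. This is only a sketch: the function name `count_fastest` is hypothetical, and it expects rows in the whitespace-separated form "name n jaq jq gojq", with "N/A" for missing timings. The sample below contains just the first four rows of the table.

```python
def count_fastest(rows, tools=("jaq", "jq", "gojq")):
    """Count, per tool, the benchmarks on which it has the lowest time."""
    wins = dict.fromkeys(tools, 0)
    for line in rows.strip().splitlines():
        _name, _n, *times = line.split()
        # Ignore "N/A" cells when looking for the best time of a row.
        numeric = [int(t) for t in times if t != "N/A"]
        if not numeric:
            continue
        best = min(numeric)
        for tool, t in zip(tools, times):
            if t != "N/A" and int(t) == best:
                wins[tool] += 1
    return wins


SAMPLE = """
empty 512 330 440 290
bf-fib 13 430 1110 540
defs 100000 60 N/A 1000
upto 8192 0 470 450
"""

print(count_fastest(SAMPLE))  # {'jaq': 3, 'jq': 0, 'gojq': 1}
```

Feeding all 29 rows of the table into the same function yields 23 for jaq and 3 each for jq and gojq.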

Security

jaq's core has been audited by Radically Open Security as part of an NLnet grant; thanks to both organisations for their support! The security audit found one low-severity issue and three issues that are likely not exploitable at all. As a result of the audit, all issues were addressed and several fuzzing targets for jaq were added at jaq-core/fuzz. Before that, jaq's JSON parser hifijson already had a fuzzing target. Finally, jaq has a carefully crafted test suite of more than 500 tests that is checked at every commit.

User Testimonials

jaq is a well-built library that gave me a massive leg up compared to implementing jq support on my own. Extensibility through the ValT trait made adding jq support to my own types a breeze.

@jobarr-amzn (amazon-ion/ion-cli#193 (review), #355 (comment))

My Rust program [using jaq] can execute all queries over all files three times while Python is busy executing one query across all files using the jq PyPI crate and a Python loop.

@I-Al-Istannen (#323 (comment))

jaq is very impressive! Running my wsjq interpreter with it is significantly faster than with any other jq implementation and its emphasis on correctness is very admirable. [On wsjq benchmarks, jaq is between 5 and 10 times faster than jq and between 15 and 196 times faster than gojq.]

@thaliaarchi (#355 (comment))

I had been parsing data from certificate transparency logs using certstream-server. It gives a lot of data and piping it into jq was causing me issues. I switched to jaq and the faster startup time meant it could easily keep up on the low end VM I was using. Thank you for your work.

Oliver (via e-mail)

Add your own testimonials via #355.

Acknowledgements

This project was funded through the NGI0 Entrust Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 101069594.

Footnotes

  1. I wanted to create a tool that is discreet and obliging, like a good waiter. And when I think of a typical name for a (French) waiter, "Jacques" comes to mind. Later, I found out about the old French word jacquet, meaning "squirrel", which makes for a nice ex post inspiration for the name. And finally, the Jacquard machine was an important predecessor of the modern computer, automating weaving with punched cards as early as 1804.

  2. The binaries for jq-1.8.1 and gojq-0.12.17 were retrieved from their GitHub release pages.
