Configuration
Faster GZIP with libdeflate (Java 22+)
Hardwood can use libdeflate for GZIP decompression, which is significantly faster than the built-in Java implementation. This feature requires Java 22 or newer because it uses the Foreign Function & Memory API, which became stable in Java 22.
Allow native access in order to use libdeflate:
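As a sketch, the flag can be passed on the `java` command line. This assumes Hardwood is loaded from the classpath, so its code lives in the unnamed module, and it uses `my-app.jar` as a placeholder for your application. A modular application would name the relevant module instead of `ALL-UNNAMED`.

```shell
# Assumption: Hardwood is on the classpath (unnamed module);
# my-app.jar is a placeholder for your application.
java --enable-native-access=ALL-UNNAMED -jar my-app.jar
```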
libdeflate is a native library that must be installed on your system:
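Typical package-manager commands might look like the following. The package names are assumptions based on common distributions and can differ between releases, so check your package manager if one is not found.

```shell
# Run only the line that matches your platform.

# macOS (Homebrew)
brew install libdeflate

# Debian/Ubuntu (assumed package name)
sudo apt install libdeflate-dev

# Fedora (assumed package name)
sudo dnf install libdeflate-devel
```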
Or download from GitHub releases.
When libdeflate is installed and available on the library path, Hardwood will automatically use it for GZIP decompression. To disable libdeflate and use the built-in Java implementation instead, set the system property:
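For example, the property can be passed as a JVM argument. The property name and value are the ones listed in the System Properties Reference; `my-app.jar` is a placeholder for your application.

```shell
# Fall back to the built-in Java GZIP implementation.
# my-app.jar is a placeholder for your application.
java -Dhardwood.uselibdeflate=false -jar my-app.jar
```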
SIMD Acceleration with Vector API (Java 22+)
Hardwood can use the Java Vector API (SIMD) to accelerate certain decoding operations, such as counting non-null values, marking nulls, and dictionary lookups. This feature requires Java 22 or newer and is enabled automatically when available.
To enable the Vector API incubator module, add this JVM argument:
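A minimal sketch of a launch command, where `jdk.incubator.vector` is the JDK's Vector API incubator module and `my-app.jar` is a placeholder for your application:

```shell
# Make the incubating Vector API module available to the application.
# my-app.jar is a placeholder for your application.
java --add-modules jdk.incubator.vector -jar my-app.jar
```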
When SIMD is available and enabled, Hardwood logs an INFO message at startup.
The vector width depends on your CPU (128-bit for SSE/NEON, 256-bit for AVX2, 512-bit for AVX-512).
To disable SIMD and force scalar operations (for debugging or comparison), set the system property:
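For example, passed as a JVM argument. The property name and value are the ones listed in the System Properties Reference; `my-app.jar` is a placeholder for your application.

```shell
# Force scalar decoding, e.g. to compare against a SIMD-enabled run.
# my-app.jar is a placeholder for your application.
java -Dhardwood.simd.disabled=true -jar my-app.jar
```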
JFR (Java Flight Recorder) Events
Hardwood emits JFR events during file reading, enabling detailed performance profiling with zero overhead when recording is off. Start a JFR recording to capture them:
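One way to do this is with the standard `-XX:StartFlightRecording` JVM option. The file name `recording.jfr` and the `my-app.jar` placeholder are assumptions chosen for illustration.

```shell
# Record from JVM startup; dumponexit=true asks the JVM to write
# the recording to the given file when it shuts down.
java -XX:StartFlightRecording=filename=recording.jfr,dumponexit=true -jar my-app.jar
```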
Or attach dynamically via jcmd <pid> JFR.start.
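A sketch of the dynamic approach, assuming the target process id is known (for example from `jcmd -l`). The recording name `hardwood`, the file name, and the process id are placeholders.

```shell
# Placeholder: replace with the process id of the running JVM.
PID=12345

# Start a recording named "hardwood" on the running JVM.
jcmd "$PID" JFR.start name=hardwood

# Later, stop it and write the captured events to a file.
jcmd "$PID" JFR.stop name=hardwood filename=recording.jfr
```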
Available events:
| Event | Category | Description |
|---|---|---|
| `dev.hardwood.FileOpened` | I/O | File opened and metadata read. Fields: file, fileSize, rowGroupCount, columnCount |
| `dev.hardwood.FileMapping` | I/O | Memory-mapping of a file region. Fields: file, offset, size |
| `dev.hardwood.RowGroupScanned` | Decode | Page boundaries scanned in a column chunk. Fields: file, rowGroupIndex, column, pageCount, scanStrategy (sequential or offset-index) |
| `dev.hardwood.PageDecoded` | Decode | Single data page decoded. Fields: column, compressedSize, uncompressedSize |
| `dev.hardwood.BatchWait` | Pipeline | Consumer blocked waiting for the assembly pipeline. Fields: column |
| `dev.hardwood.PrefetchMiss` | Pipeline | Prefetch queue miss requiring synchronous decode. Fields: file, column, newDepth, queueEmpty |
Events appear under the Hardwood category in JDK Mission Control (JMC) or any JFR analysis tool. Use them to identify:
- I/O bottlenecks: large FileMapping durations or frequent PrefetchMiss events
- Decode hotspots: PageDecoded events with large uncompressed sizes or high frequency
- Pipeline stalls: BatchWait events indicate the reader is waiting for decoded data
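For a quick look without JMC, the `jfr` command-line tool that ships with the JDK can filter a recording by event name. The sketch below assumes a recording file named `recording.jfr` and uses the event names from the table above.

```shell
# Overview of which event types the recording contains and how many of each.
jfr summary recording.jfr

# Print only the pipeline-related Hardwood events.
jfr print --events dev.hardwood.BatchWait,dev.hardwood.PrefetchMiss recording.jfr

# Emit page-decode events as JSON for further processing.
jfr print --json --events dev.hardwood.PageDecoded recording.jfr
```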
System Properties Reference
| Property | Default | Description |
|---|---|---|
| `hardwood.uselibdeflate` | `true` | Set to `false` to disable libdeflate for GZIP decompression |
| `hardwood.simd.disabled` | `false` | Set to `true` to force scalar operations instead of SIMD |