Configuration

Faster GZIP with libdeflate (Java 22+)

Hardwood can use libdeflate for GZIP decompression, which is significantly faster than the built-in Java implementation. This feature requires Java 22 or newer because it uses the Foreign Function & Memory API, which was finalized in Java 22.

Native access must be allowed for Hardwood to call libdeflate. When Hardwood is on the classpath, add this JVM argument:

--enable-native-access=ALL-UNNAMED

libdeflate is a native library that must be installed on your system:

# macOS (Homebrew)
brew install libdeflate
# Debian/Ubuntu
apt install libdeflate-dev
# Fedora/RHEL
dnf install libdeflate-devel
# vcpkg (for example on Windows)
vcpkg install libdeflate

Alternatively, download a release from the libdeflate GitHub releases page.

When libdeflate is installed and available on the library path, Hardwood will automatically use it for GZIP decompression. To disable libdeflate and use the built-in Java implementation instead, set the system property:

-Dhardwood.uselibdeflate=false

SIMD Acceleration with Vector API (Java 22+)

Hardwood can use the Java Vector API (SIMD) to accelerate certain decoding operations, such as counting non-null values, marking nulls, and dictionary lookups. This feature requires Java 22 or newer and is used automatically when the Vector API is available.

The Vector API is still an incubator module, so the JVM must be told to load it. Add this JVM argument:

--add-modules jdk.incubator.vector

When SIMD is available and enabled, you'll see an INFO log message at startup:

SIMD support: enabled (256-bit vectors)

The vector width depends on your CPU (128-bit for SSE/NEON, 256-bit for AVX2, 512-bit for AVX-512).
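
To relate the reported width to actual parallelism, the number of values handled per vector operation is the vector width divided by the element width. The following is a minimal sketch of that arithmetic, together with a check for whether the incubator module was resolved at startup. The class and method names (`SimdInfo`, `lanes`, `vectorModulePresent`) are illustrative and not part of Hardwood.

```java
public class SimdInfo {

    /** Number of elements that fit in one vector (for example, 256-bit / 32-bit int = 8). */
    public static int lanes(int vectorBits, int elementBits) {
        if (vectorBits <= 0 || elementBits <= 0 || vectorBits % elementBits != 0) {
            throw new IllegalArgumentException(
                    "vector width must be a positive multiple of the element width");
        }
        return vectorBits / elementBits;
    }

    /** True if jdk.incubator.vector is in the boot layer (e.g. via --add-modules jdk.incubator.vector). */
    public static boolean vectorModulePresent() {
        return ModuleLayer.boot().findModule("jdk.incubator.vector").isPresent();
    }

    public static void main(String[] args) {
        System.out.println("jdk.incubator.vector resolved: " + vectorModulePresent());
        for (int bits : new int[] {128, 256, 512}) {
            System.out.println(bits + "-bit vectors: "
                    + lanes(bits, 32) + " int lanes, "
                    + lanes(bits, 64) + " long lanes");
        }
    }
}
```

If the first line prints `false`, the `--add-modules` argument did not reach the JVM, which is a common reason for the startup log message to be missing.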

To disable SIMD and force scalar operations (for debugging or comparison), set the system property:

-Dhardwood.simd.disabled=true

JFR (Java Flight Recorder) Events

Hardwood emits JFR events during file reading, enabling detailed performance profiling with zero overhead when recording is off. Start a JFR recording to capture them:

java -XX:StartFlightRecording=filename=recording.jfr,settings=profile ...

Or attach dynamically via jcmd <pid> JFR.start.
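
A recording can also be started from application code with the standard `jdk.jfr.Recording` API, which is handy for profiling a single read in a test or benchmark. The sketch below is one possible way to do this. The `HardwoodJfrRecorder` class and its `record` method are hypothetical helpers, and the workload is a placeholder for your own Hardwood reading code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import jdk.jfr.Recording;

public class HardwoodJfrRecorder {

    /** Runs the workload inside a JFR recording and writes the result to the given file. */
    public static void record(Runnable workload, Path output, String... eventNames)
            throws IOException {
        try (Recording recording = new Recording()) {
            for (String name : eventNames) {
                // Events are enabled by type name; a name that is never emitted is harmless.
                recording.enable(name);
            }
            recording.start();
            try {
                workload.run();
            } finally {
                recording.stop();
            }
            // Must be called before the recording is closed.
            recording.dump(output);
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = Path.of("recording.jfr");
        record(() -> {
            // Placeholder: call your Hardwood reading code here.
        }, out,
                "dev.hardwood.FileOpened",
                "dev.hardwood.PageDecoded",
                "dev.hardwood.PrefetchMiss");
        System.out.println("Wrote " + out.toAbsolutePath() + " (" + Files.size(out) + " bytes)");
    }
}
```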

Available events:

| Event | Category | Description |
| --- | --- | --- |
| dev.hardwood.FileOpened | I/O | File opened and metadata read. Fields: file, fileSize, rowGroupCount, columnCount |
| dev.hardwood.FileMapping | I/O | Memory-mapping of a file region. Fields: file, offset, size |
| dev.hardwood.RowGroupScanned | Decode | Page boundaries scanned in a column chunk. Fields: file, rowGroupIndex, column, pageCount, scanStrategy (sequential or offset-index) |
| dev.hardwood.PageDecoded | Decode | Single data page decoded. Fields: column, compressedSize, uncompressedSize |
| dev.hardwood.BatchWait | Pipeline | Consumer blocked waiting for the assembly pipeline. Fields: column |
| dev.hardwood.PrefetchMiss | Pipeline | Prefetch queue miss requiring synchronous decode. Fields: file, column, newDepth, queueEmpty |

Events appear under the Hardwood category in JDK Mission Control (JMC) or any JFR analysis tool. Use them to identify:

  • I/O bottlenecks: large FileMapping durations or frequent PrefetchMiss events
  • Decode hotspots: PageDecoded events with large uncompressed sizes or high frequency
  • Pipeline stalls: BatchWait events indicate the reader is waiting for decoded data
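
For a quick look without JMC, a recording file can be summarized with the standard `jdk.jfr.consumer` API. The sketch below counts events per type for names starting with `dev.hardwood.`, which gives a rough view of how often each event fired. The `HardwoodEventSummary` class is a hypothetical helper, and the default file name simply matches the command shown earlier.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import jdk.jfr.consumer.RecordingFile;

public class HardwoodEventSummary {

    static final String PREFIX = "dev.hardwood.";

    /** Counts event type names that start with the Hardwood prefix; other names are ignored. */
    public static Map<String, Long> tally(List<String> eventTypeNames) {
        Map<String, Long> counts = new TreeMap<>();
        for (String name : eventTypeNames) {
            if (name.startsWith(PREFIX)) {
                counts.merge(name, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Path file = Path.of(args.length > 0 ? args[0] : "recording.jfr");
        List<String> names = new ArrayList<>();
        // Stream events one at a time instead of loading the whole recording into memory.
        try (RecordingFile recording = new RecordingFile(file)) {
            while (recording.hasMoreEvents()) {
                names.add(recording.readEvent().getEventType().getName());
            }
        }
        tally(names).forEach((name, count) -> System.out.println(name + ": " + count));
    }
}
```

A high PrefetchMiss or BatchWait count relative to PageDecoded suggests looking at the pipeline first. Per-event fields such as compressedSize can be read from each event with `getValue` if more detail is needed.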

System Properties Reference

| Property | Default | Description |
| --- | --- | --- |
| hardwood.uselibdeflate | true | Set to false to disable libdeflate for GZIP decompression |
| hardwood.simd.disabled | false | Set to true to force scalar operations instead of SIMD |
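
To confirm that a `-D` flag actually reached the JVM, the properties can be read back with the documented defaults. This is a minimal sketch. The `HardwoodFlags` class is a hypothetical helper, and parsing with `Boolean.parseBoolean` is an assumption about how such flags are typically interpreted, not a statement about Hardwood's internals.

```java
public class HardwoodFlags {

    /** Reads a boolean system property, falling back to the given default when it is unset. */
    public static boolean flag(String name, boolean defaultValue) {
        String raw = System.getProperty(name);
        // Boolean.parseBoolean treats anything other than "true" (case-insensitive) as false.
        return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        System.out.println("hardwood.uselibdeflate = " + flag("hardwood.uselibdeflate", true));
        System.out.println("hardwood.simd.disabled = " + flag("hardwood.simd.disabled", false));
    }
}
```

Running it with `-Dhardwood.simd.disabled=true` should show that property as `true`, while the other keeps its default.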