CLI

The hardwood CLI lets you inspect and convert Parquet files from the command line — useful for exploring datasets, debugging file structure, and quick format conversions without writing Java code. It reads local files and S3 URIs, and ships as a GraalVM native binary with instant startup.

Pre-built native binaries for Linux, macOS, and Windows are available from the early-access release.

macOS

The binary is not notarized. On first run, macOS Gatekeeper will block it. Remove the quarantine flag after extracting:

xattr -r -d com.apple.quarantine hardwood-cli-*/

Available Commands

| Command | Description |
| --- | --- |
| hardwood info | Display high-level file information |
| hardwood schema | Print the file schema |
| hardwood print | Print rows as an ASCII table (head, tail, or all) |
| hardwood convert | Convert a Parquet file to CSV or JSON |
| hardwood metadata | Display full file metadata including row groups and column chunks |
| hardwood footer | Print decoded footer length, offset, and file structure |
| hardwood inspect pages | List data and dictionary pages per column chunk |
| hardwood inspect dictionary | Print dictionary entries for a column |
| hardwood inspect column-index | Print min/max statistics per page for a column |
| hardwood inspect column-size | Show compressed and uncompressed byte sizes per column |
| hardwood help | Display help information about a command |

Examples

# Show file overview
hardwood info -f data.parquet

# Print schema
hardwood schema -f data.parquet

# Show first 20 rows
hardwood print -n 20 -f data.parquet

# Show last 5 rows
hardwood print -n -5 -f data.parquet

# Show all rows
hardwood print -f data.parquet

# Convert to CSV
hardwood convert --format csv -f data.parquet
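
When exploring more than one file, the single-file commands above can be wrapped in a small loop. The following is a minimal sketch, not part of the CLI: hardwood_each is a hypothetical helper name, and it assumes hardwood is on your PATH. It relies only on the -f flag shown in the examples above.

```shell
# hardwood_each: hypothetical helper (not shipped with the CLI).
# Runs one single-word hardwood subcommand (info, schema, metadata, ...)
# against each file in turn, printing a header line before each result.
# Usage: hardwood_each <subcommand> <file>...
hardwood_each() {
  subcommand="$1"
  shift
  for file in "$@"; do
    printf '== %s ==\n' "$file"
    hardwood "$subcommand" -f "$file"
  done
}
```

For example, hardwood_each schema *.parquet prints the schema of every Parquet file in the current directory. Two-word subcommands such as inspect pages are not handled by this sketch.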

Reading Files from S3

All commands accept s3:// URIs via the -f flag:

hardwood schema -f s3://my-bucket/data.parquet
hardwood print -n 10 -f s3://my-bucket/data.parquet

The CLI resolves credentials via the standard AWS credential chain (environment variables, ~/.aws/credentials, SSO, instance profiles, etc.).

| Environment Variable | Description |
| --- | --- |
| AWS_REGION | AWS region (also read from ~/.aws/config if not set) |
| AWS_ENDPOINT_URL | Custom endpoint for S3-compatible services (MinIO, LocalStack, R2, etc.) |
| AWS_PATH_STYLE | Set to true to use path-style access (required by some S3-compatible services) |
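
As an illustration of how these variables combine, the sketch below wraps hardwood for use against a locally running S3-compatible service. Everything specific in it is an assumption rather than something the CLI prescribes: hardwood_s3_local is a hypothetical helper name, http://localhost:9000 is a placeholder endpoint (a common default for a local MinIO instance), and us-east-1 is a placeholder region. Credentials are still resolved through the standard AWS credential chain described above.

```shell
# hardwood_s3_local: hypothetical wrapper (not shipped with the CLI).
# The function body is a subshell, so the exported variables apply only
# to this one invocation and do not leak into the calling shell.
hardwood_s3_local() (
  # Placeholder endpoint and region; override them by setting the
  # variables before calling this function.
  export AWS_ENDPOINT_URL="${AWS_ENDPOINT_URL:-http://localhost:9000}"
  export AWS_REGION="${AWS_REGION:-us-east-1}"
  # Path-style access, which some S3-compatible services require.
  export AWS_PATH_STYLE=true
  hardwood "$@"
)
```

Usage mirrors the plain CLI, for example hardwood_s3_local schema -f s3://my-bucket/data.parquet.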

Shell Completion

The distribution includes a Bash completion script at bin/hardwood_completion. Source it in your shell to enable tab completion for commands, options, and arguments:

source hardwood_completion

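To make it permanent, add the line above to your ~/.bashrc or ~/.bash_profile, replacing hardwood_completion with the absolute path to the script so that it resolves from any working directory.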
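
A guarded version for ~/.bashrc might look like the following sketch. HARDWOOD_HOME and load_hardwood_completion are hypothetical names introduced here for illustration, and $HOME/hardwood-cli is an assumed extraction directory; only the bin/hardwood_completion path comes from the distribution layout described above.

```shell
# Hypothetical install location; adjust to where you extracted the distribution.
HARDWOOD_HOME="${HARDWOOD_HOME:-$HOME/hardwood-cli}"

# Source the completion script only if it exists, so a moved or removed
# installation does not print an error in every new shell.
load_hardwood_completion() {
  script="$HARDWOOD_HOME/bin/hardwood_completion"
  if [ -f "$script" ]; then
    # "." is the portable spelling of "source".
    . "$script"
  fi
}

load_hardwood_completion
```

The existence check keeps shell startup quiet if the distribution is later moved or deleted.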