Parquet-Java Compatibility¶
The hardwood-parquet-java-compat module provides a drop-in replacement for parquet-java's ParquetReader<Group> API. This allows users migrating from parquet-java to use Hardwood with minimal code changes.
Features:
- Provides
org.apache.parquet.*namespace classes compatible with parquet-java - Includes Hadoop shims (
Path,Configuration) that wrap Java NIO — no Hadoop dependency required - Supports the familiar builder pattern and Group-based record reading
Usage¶
import org.apache.hadoop.fs.Path;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.hadoop.GroupReadSupport;
import org.apache.parquet.hadoop.ParquetReader;
Path path = new Path("data.parquet");
try (ParquetReader<Group> reader = ParquetReader.builder(new GroupReadSupport(), path).build()) {
Group record;
while ((record = reader.read()) != null) {
// Read primitive fields
long id = record.getLong("id", 0);
String name = record.getString("name", 0);
int age = record.getInteger("age", 0);
// Read nested groups (structs)
Group address = record.getGroup("address", 0);
String city = address.getString("city", 0);
int zip = address.getInteger("zip", 0);
// Check for null/optional fields
int count = record.getFieldRepetitionCount("optional_field");
if (count > 0) {
String value = record.getString("optional_field", 0);
}
}
}
Warning
This module provides its own interface copies in the org.apache.parquet.* namespace. It cannot be used alongside parquet-java on the same classpath.