Package dev.hardwood.reader
Parquet file readers with row-oriented and column-oriented APIs.
ParquetFileReader opens one or more files and provides access to metadata
and schema. From there, create a RowReader for row-at-a-time access with
typed getters, a ColumnReader for single-column batch-oriented access, or
a ColumnReaders for multi-column projection access. FilterPredicate
enables predicate pushdown at both the row-group and page level.
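To illustrate what row-group-level pushdown means in general, here is a minimal, self-contained sketch. It does not use the Hardwood API: StatsPruningSketch, RowGroupStats, mightContainGreaterThan, and selectRowGroups are hypothetical names invented for this example. The idea is that a row group whose recorded maximum does not exceed the threshold cannot contain a matching row, so it can be skipped without decoding any of its pages.

```java
import java.util.ArrayList;
import java.util.List;

public class StatsPruningSketch {

    // Hypothetical stand-in for the min/max statistics of one column in one row group.
    record RowGroupStats(int index, long min, long max) {}

    // A row group can contain a value greater than the threshold
    // only if its recorded maximum exceeds the threshold.
    static boolean mightContainGreaterThan(RowGroupStats stats, long threshold) {
        return stats.max() > threshold;
    }

    // Returns the indexes of the row groups that still have to be read.
    static List<Integer> selectRowGroups(List<RowGroupStats> groups, long threshold) {
        List<Integer> selected = new ArrayList<>();
        for (RowGroupStats group : groups) {
            if (mightContainGreaterThan(group, threshold)) {
                selected.add(group.index());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<RowGroupStats> groups = List.of(
                new RowGroupStats(0, 1, 100),
                new RowGroupStats(1, 101, 200),
                new RowGroupStats(2, 201, 300));
        // Predicate "value > 150": row group 0 (max 100) is pruned.
        System.out.println(selectRowGroups(groups, 150));
    }
}
```

Statistics-based pruning is conservative: a selected row group may still contain no matching rows, so the predicate has to be re-evaluated on the values that are actually read. The same check could in principle be applied at a finer granularity when per-page statistics are available.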
For reading multiple files as a single dataset with cross-file
prefetching, use Hardwood to share a thread pool across
readers.
Class descriptions
Batch-oriented column reader for reading a single column across all row groups.
Holds multiple ColumnReader instances backed by a shared RowGroupIterator for batch-oriented projection reads.
A predicate for filtering row groups based on column statistics.
Predicate for DATE columns.
Predicate for DECIMAL columns.
Predicate for TIMESTAMP columns.
Predicate that matches rows where the column value is not null.
Predicate that matches rows where the column value is null.
Predicate for decimal columns stored as FIXED_LEN_BYTE_ARRAY, which require signed (two's complement) comparison.
Predicate for TIME columns.
Reader for one or more Parquet files.
Builds a single-column ColumnReader with an optional filter.
Builds a ColumnReaders collection for batch-oriented access to a projection of columns.
Builds a RowReader with optional projection, filter, and head/tail row limit.
Provides row-oriented iteration over a Parquet file.
Thrown when a file's schema is incompatible with the reference schema during multi-file reading.
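The remark about signed comparison for decimals stored as FIXED_LEN_BYTE_ARRAY can be illustrated with a small standalone sketch. It is not taken from Hardwood; SignedDecimalCompareSketch and compareSigned are hypothetical names. It assumes the Parquet convention that the unscaled decimal value is stored as a big-endian two's complement integer, and it shows why a plain unsigned byte-wise comparison orders negative values incorrectly.

```java
import java.math.BigInteger;
import java.util.Arrays;

public class SignedDecimalCompareSketch {

    // Compares two big-endian two's complement byte arrays by signed numeric value.
    // BigInteger(byte[]) interprets its argument in exactly that representation.
    static int compareSigned(byte[] a, byte[] b) {
        return new BigInteger(a).compareTo(new BigInteger(b));
    }

    public static void main(String[] args) {
        byte[] minusOne = {(byte) 0xFF, (byte) 0xFF}; // unscaled value -1
        byte[] plusOne = {0x00, 0x01};                // unscaled value +1

        // Signed comparison: -1 sorts before +1.
        System.out.println(compareSigned(minusOne, plusOne) < 0);

        // Unsigned byte-wise comparison: 0xFF sorts after 0x00, so -1 wrongly sorts after +1.
        System.out.println(Arrays.compareUnsigned(minusOne, plusOne) > 0);
    }
}
```

Both values are unscaled; the decimal scale comes from the column's schema and does not affect ordering within a single column. Allocating a BigInteger per comparison keeps the sketch short; an implementation concerned with throughput would more likely compare the sign byte first and then the remaining bytes.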