Class ColumnReader
java.lang.Object
dev.hardwood.reader.ColumnReader
- All Implemented Interfaces:
AutoCloseable
Batch-oriented column reader for reading a single column across all row groups.
Provides typed primitive arrays for zero-boxing access. For nested/repeated columns, multi-level offsets and per-level null bitmaps enable efficient traversal without per-row virtual dispatch.
Flat column usage:
double sum = 0; // accumulator; not declared in the original snippet
try (ColumnReader reader = fileReader.createColumnReader("fare_amount")) {
    while (reader.nextBatch()) {
        int count = reader.getRecordCount();
        double[] values = reader.getDoubles();
        BitSet nulls = reader.getElementNulls(); // null if all elements are required
        for (int i = 0; i < count; i++) {
            if (nulls == null || !nulls.get(i)) sum += values[i];
        }
    }
}
Simple list usage (nestingDepth=1):
double sum = 0; // accumulator; not declared in the original snippet
try (ColumnReader reader = fileReader.createColumnReader("fare_components")) {
    while (reader.nextBatch()) {
        int recordCount = reader.getRecordCount();
        int valueCount = reader.getValueCount();
        double[] values = reader.getDoubles();
        int[] offsets = reader.getOffsets(0);
        BitSet recordNulls = reader.getLevelNulls(0);
        BitSet elementNulls = reader.getElementNulls();
        for (int r = 0; r < recordCount; r++) {
            // Skip records whose list is null.
            if (recordNulls != null && recordNulls.get(r)) continue;
            int start = offsets[r];
            // The last record's list runs to the end of the leaf values.
            int end = (r + 1 < recordCount) ? offsets[r + 1] : valueCount;
            for (int i = start; i < end; i++) {
                if (elementNulls == null || !elementNulls.get(i)) sum += values[i];
            }
        }
    }
}
-
Method Summary
Modifier and Type    Method                      Description
void                 close()
byte[][]             getBinaries()
boolean[]            getBooleans()
                     getColumnSchema
double[]             getDoubles()
BitSet               getElementNulls()           Null bitmap over leaf values.
float[]              getFloats()
int[]                getInts()
BitSet               getLevelNulls(int level)    Null bitmap at a given nesting level.
long[]               getLongs()
int                  getNestingDepth()           Nesting depth: 0 for flat, maxRepetitionLevel for nested.
int[]                getOffsets(int level)       Offset array for a given nesting level.
int                  getRecordCount()            Number of top-level records in the current batch.
String[]             getStrings()                String values for STRING/JSON/BSON logical type columns.
int                  getValueCount()             Total number of leaf values in the current batch.
boolean              nextBatch()                 Advance to the next batch.
-
Method Details
-
nextBatch
public boolean nextBatch()
Advance to the next batch.
- Returns:
- true if a batch is available, false if exhausted
-
getRecordCount
public int getRecordCount()
Number of top-level records in the current batch.
-
getValueCount
public int getValueCount()
Total number of leaf values in the current batch. For flat columns, this equals getRecordCount().
-
getInts
public int[] getInts()
-
getLongs
public long[] getLongs()
-
getFloats
public float[] getFloats()
-
getDoubles
public double[] getDoubles()
-
getBooleans
public boolean[] getBooleans()
-
getBinaries
public byte[][] getBinaries()
-
getStrings
public String[] getStrings()
String values for STRING/JSON/BSON logical type columns. Converts the underlying byte arrays to UTF-8 strings. Null values are represented as null entries in the array.
- Returns:
- String array with converted values
- Throws:
IllegalStateException - if the column is not a BYTE_ARRAY type
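The conversion described above can be illustrated with a minimal sketch. This is not the library's implementation; it only shows what decoding a batch of raw byte arrays to UTF-8 strings looks like when done by hand. The class name BinaryToStrings and the method toStrings are hypothetical, and the sketch assumes that null values appear as null entries in the input array.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ColumnReader.
class BinaryToStrings {
    // Decode each non-null byte[] as UTF-8; null entries stay null.
    // Assumption: nulls in the input are represented as null array entries.
    static String[] toStrings(byte[][] binaries) {
        String[] out = new String[binaries.length];
        for (int i = 0; i < binaries.length; i++) {
            byte[] raw = binaries[i];
            out[i] = (raw == null) ? null : new String(raw, StandardCharsets.UTF_8);
        }
        return out;
    }
}
```

Calling BinaryToStrings.toStrings on an array holding the UTF-8 bytes of "taxi" followed by a null entry yields a String array whose first element is "taxi" and whose second element is null.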
-
getElementNulls
public BitSet getElementNulls()
Null bitmap over leaf values. For flat columns this doubles as record-level nulls.
- Returns:
- BitSet where set bits indicate null values, or null if all elements are required
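Because the returned bitmap may itself be null, callers need to handle both cases. The following sketch counts non-null positions in a batch using only java.util.BitSet; the class NullBitmaps and the method countNonNull are hypothetical names, and the arguments stand in for the bitmap and count a caller would obtain from the reader.

```java
import java.util.BitSet;

// Hypothetical helper, not part of ColumnReader.
class NullBitmaps {
    // Count positions in [0, count) that are not marked null.
    // A null bitmap is treated as "no value is null".
    static int countNonNull(int count, BitSet nulls) {
        if (nulls == null) return count;
        int n = 0;
        // nextClearBit jumps over runs of set (null) bits.
        for (int i = nulls.nextClearBit(0); i < count; i = nulls.nextClearBit(i + 1)) {
            n++;
        }
        return n;
    }
}
```

Iterating with nextClearBit visits only the non-null positions, so the same loop shape can be reused to process values while skipping nulls.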
-
getLevelNulls
public BitSet getLevelNulls(int level)
Null bitmap at a given nesting level. Only valid for nested columns (0 <= level < getNestingDepth()).
- Parameters:
level - the nesting level (0 = outermost group)
- Returns:
- BitSet where set bits indicate null groups, or null if that level is required
-
getNestingDepth
public int getNestingDepth()
Nesting depth: 0 for flat, maxRepetitionLevel for nested.
-
getOffsets
public int[] getOffsets(int level)
Offset array for a given nesting level. Maps items at level k to positions in the next level (or leaf values for the innermost level).
- Parameters:
level - the nesting level (0-indexed)
- Returns:
- offset array for the given level
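To make the level-to-level mapping concrete, here is a sketch of a two-level traversal (a list of lists of doubles) over plain arrays. It follows the same convention as the simple list usage example: each offset is the start of an item's range, and the last item's range ends at the size of the next level. That convention for deeper levels, the explicit innerCount parameter, and the names NestedOffsets, end, and sumPerRecord are assumptions made for illustration, not confirmed behavior of ColumnReader. Null bitmaps are ignored here to keep the example short.

```java
// Hypothetical helper operating on plain arrays, not part of ColumnReader.
class NestedOffsets {
    // End of item i's range: the next item's start, or the size of the
    // next level when i is the last item.
    static int end(int[] offsets, int i, int count, int nextLevelSize) {
        return (i + 1 < count) ? offsets[i + 1] : nextLevelSize;
    }

    // Sum the leaf values under each top-level record.
    // offsets0 maps records to inner lists; offsets1 maps inner lists to leaf values.
    static double[] sumPerRecord(int[] offsets0, int recordCount,
                                 int[] offsets1, int innerCount,
                                 double[] values, int valueCount) {
        double[] sums = new double[recordCount];
        for (int r = 0; r < recordCount; r++) {
            int innerEnd = end(offsets0, r, recordCount, innerCount);
            for (int j = offsets0[r]; j < innerEnd; j++) {
                int leafEnd = end(offsets1, j, innerCount, valueCount);
                for (int v = offsets1[j]; v < leafEnd; v++) {
                    sums[r] += values[v];
                }
            }
        }
        return sums;
    }
}
```

For three records [[1, 2], [3]], [] and [[4]], the arrays would be offsets0 = {0, 2, 2}, offsets1 = {0, 2, 3} and values = {1, 2, 3, 4}, and sumPerRecord returns {6.0, 0.0, 4.0}.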
-
getColumnSchema
-
close
public void close()
- Specified by:
close in interface AutoCloseable
-