Interface Validity


@Experimental public interface Validity

Per-item null bitmap at a ColumnReader scope (a STRUCT / REPEATED layer or the leaf).

A Validity is one of two shapes:

  • No nulls — every item at that scope is non-null in the current batch. Represented by the NO_NULLS singleton, which is returned on the no-nulls fast path and requires no per-batch allocation.
  • Backed — a packed long[] bitmap with set-bit = present polarity: bit i is set iff item i is present (non-null). Word w covers items [w*64, w*64+64), low bit = lowest item.
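
As a rough illustration of the backed layout described above, the following sketch reads a single bit from a packed long[]. The class and method names (BitmapLayoutSketch, isPresent) are hypothetical and are not part of this API; the code only mirrors the documented word/bit arithmetic.

```java
final class BitmapLayoutSketch {
    /** Hypothetical helper: true iff bit i is set, i.e. item i is present (non-null). */
    public static boolean isPresent(long[] words, int i) {
        // Word index is i / 64; the bit position inside that word is i % 64,
        // with the low bit covering the lowest item.
        return (words[i >>> 6] & (1L << (i & 63))) != 0;
    }

    public static void main(String[] args) {
        // 70 items; items 3 and 65 are null, so their bits are cleared.
        long[] words = { ~(1L << 3), ~(1L << 1) };
        System.out.println(isPresent(words, 3));   // false
        System.out.println(isPresent(words, 4));   // true
        System.out.println(isPresent(words, 65));  // false (word 1, bit 1)
    }
}
```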

Consumer-side predicates (isNull(i) / isNotNull(i) / hasNulls()) describe nullability; the storage uses set-bit = present internally to match Arrow's layout. hasNulls() makes the no-nulls fast path explicit:

if (!validity.hasNulls()) {
    // tight loop, skip per-item check
} else {
    // checked loop
}

This API is Experimental: the shape may change in future releases.

  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final Validity
    NO_NULLS
    Singleton signalling "no item at this scope is null in the current batch."
  • Method Summary

    Modifier and Type
    Method
    Description
    boolean
    hasNulls()
    true iff at least one item at this scope is null in the current batch.
    boolean
    isNotNull(int i)
    true iff the item at index i is not null.
    boolean
    isNull(int i)
    true iff the item at index i is null.
    int
    nextNotNull(int from, int count)
    Index of the next non-null item in [from, count), or -1 if every item in that range is null.
    int
    nextNull(int from, int count)
    Index of the next null item in [from, count), or -1 if every item in that range is non-null.
    int
    nullCount(int count)
    Number of null items in this batch.
    static Validity
    of(long[] words)
    Wraps a packed long[] bitmap (set-bit = present storage).
    long[]
    words()
    The word array (set-bit = present polarity).
  • Field Details

    • NO_NULLS

      static final Validity NO_NULLS
      Singleton signalling "no item at this scope is null in the current batch." Identity-stable across calls.
  • Method Details

    • of

      static Validity of(long[] words)

      Wraps a packed long[] bitmap (set-bit = present storage). Returns NO_NULLS when words is null (the sparse "no nulls" representation produced by the internal pipeline); otherwise returns a fresh backed instance holding the given bitmap. The wrapper does not copy — callers must not mutate the bitmap after handing it to a Validity.

      The caller is responsible for sizing the array to at least (count + 63) >>> 6 words for any count they later pass to nullCount(int) / nextNull(int, int) / nextNotNull(int, int), and for keeping indices into isNull(int) / isNotNull(int) within the same bound.
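
      The sizing rule above can be sketched as follows. This is a minimal example that assumes the caller starts from a per-item boolean[] of null flags; BitmapBuilderSketch, wordsFor, and pack are hypothetical names, not part of this API. The resulting array is the kind of input that of(long[]) expects.

```java
final class BitmapBuilderSketch {
    /** Number of 64-bit words needed to cover count items: (count + 63) >>> 6. */
    public static int wordsFor(int count) {
        return (count + 63) >>> 6;
    }

    /** Packs per-item null flags into a set-bit = present bitmap. */
    public static long[] pack(boolean[] isNull) {
        long[] words = new long[wordsFor(isNull.length)];
        for (int i = 0; i < isNull.length; i++) {
            if (!isNull[i]) {
                words[i >>> 6] |= 1L << (i & 63);  // set the bit for a present item
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(wordsFor(64));  // 1
        System.out.println(wordsFor(65));  // 2

        boolean[] isNull = new boolean[65];
        isNull[2] = true;                  // only item 2 is null
        long[] words = pack(isNull);
        // The array could now be passed to Validity.of(words); per the contract
        // above it must not be mutated after that point.
        System.out.println(words.length);  // 2
    }
}
```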

    • hasNulls

      boolean hasNulls()

      true iff at least one item at this scope is null in the current batch. O(1). Useful in hot loops as a per-batch fast-path gate:

      if (!validity.hasNulls()) {
          // tight loop, no per-item check
      } else {
          // checked loop
      }
      
    • isNull

      boolean isNull(int i)
      true iff the item at index i is null.
    • isNotNull

      boolean isNotNull(int i)
      true iff the item at index i is not null.
    • nullCount

      int nullCount(int count)
      Number of null items in this batch. count is the total item count at this scope — required because the no-nulls shape has no intrinsic length.
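
      One way such a count could be derived from the bitmap is sketched below. This is an assumption about a plausible approach, not the library's implementation; NullCountSketch is a hypothetical name. It counts present bits with Long.bitCount, masks the final partial word so that bits at indices >= count are never consulted, and treats a null array as the no-nulls shape.

```java
final class NullCountSketch {
    /** Hypothetical helper: number of cleared (null) bits among items [0, count). */
    public static int nullCount(long[] words, int count) {
        if (words == null) {
            return 0;  // no-nulls shape: nothing to count
        }
        int fullWords = count >>> 6;
        int present = 0;
        for (int w = 0; w < fullWords; w++) {
            present += Long.bitCount(words[w]);
        }
        int tail = count & 63;  // items in the last, partially used word
        if (tail != 0) {
            long mask = (1L << tail) - 1;  // keep only bits below count
            present += Long.bitCount(words[fullWords] & mask);
        }
        return count - present;
    }

    public static void main(String[] args) {
        // 70 items; items 3 and 65 are null.
        long[] words = { ~(1L << 3), ~(1L << 1) };
        System.out.println(nullCount(words, 70));  // 2
        System.out.println(nullCount(null, 70));   // 0
    }
}
```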
    • nextNull

      int nextNull(int from, int count)
      Index of the next null item in [from, count), or -1 if every item in that range is non-null.
    • nextNotNull

      int nextNotNull(int from, int count)
      Index of the next non-null item in [from, count), or -1 if every item in that range is null. count is the total item count at this scope — required because the no-nulls shape has no intrinsic length.
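
      The scan semantics of nextNull can be illustrated with a word-at-a-time sketch. As before, this is a hedged example under the documented bitmap layout, not the actual implementation; NextNullSketch is a hypothetical name, and the code assumes 0 <= from. A nextNotNull variant would follow the same pattern without inverting the word.

```java
final class NextNullSketch {
    /** Hypothetical helper: index of the first null item in [from, count), or -1. */
    public static int nextNull(long[] words, int from, int count) {
        if (words == null) {
            return -1;  // no-nulls shape
        }
        int i = from;
        while (i < count) {
            // Invert so null items become set bits, then drop bits below i.
            long nulls = ~words[i >>> 6] & (-1L << (i & 63));
            if (nulls != 0) {
                int idx = (i & ~63) + Long.numberOfTrailingZeros(nulls);
                return idx < count ? idx : -1;  // ignore bits at or beyond count
            }
            i = (i & ~63) + 64;  // jump to the start of the next word
        }
        return -1;
    }

    public static void main(String[] args) {
        // 70 items; items 3 and 65 are null.
        long[] words = { ~(1L << 3), ~(1L << 1) };
        int count = 70;
        // Visit every null index in order: prints 3, then 65.
        for (int i = nextNull(words, 0, count); i >= 0; i = nextNull(words, i + 1, count)) {
            System.out.println(i);
        }
    }
}
```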
    • words

      long[] words()
      The word array (set-bit = present polarity). Returns null when there are no nulls. Otherwise returns the backing array directly, without copying. Callers must not mutate it; mirroring the inbound contract on of(long[]), the Validity owns the bitmap once handed in. Bits at indices >= count are undefined and must not be read.