Interface InputFile

All Superinterfaces:
AutoCloseable, Closeable
All Known Implementing Classes:
S3InputFile

public interface InputFile extends Closeable

Abstraction for reading Parquet file data.

This interface decouples the read pipeline from memory-mapped local files, enabling alternative backends such as object stores or in-memory buffers.

An InputFile starts in an unopened state. The open() method must be called before readRange(long, int) or length() can be used. The framework (Hardwood, ParquetFileReader) calls open() automatically; callers only need to create instances, for example via of(Path), and close them when done.
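The lifecycle described above can be sketched with a small heap-backed implementation. This is only an illustration: SketchInputFile is a hypothetical stand-in that copies the instance methods documented on this page, and HeapInputFile and LifecycleSketch are invented names, not classes from the library. How a closed file behaves is not specified here, so the sketch simply assumes it returns to the unopened state.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in that copies only the instance methods documented on this page. */
interface SketchInputFile extends Closeable {
    void open() throws IOException;
    ByteBuffer readRange(long offset, int length) throws IOException;
    long length() throws IOException;
    String name();
}

/** Illustrative heap-backed implementation; not one of the library's classes. */
final class HeapInputFile implements SketchInputFile {
    private final byte[] data;
    private final String name;
    private volatile boolean opened; // false until open() is called

    HeapInputFile(String name, byte[] data) {
        this.name = name;
        this.data = data.clone();
    }

    @Override
    public void open() {
        opened = true; // a real backend would memory-map or connect here
    }

    @Override
    public ByteBuffer readRange(long offset, int length) {
        requireOpen();
        // Compare in long arithmetic so the bounds check itself cannot overflow.
        if (offset < 0 || length < 0 || offset > (long) data.length - length) {
            throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        }
        return ByteBuffer.wrap(data, (int) offset, length).slice().asReadOnlyBuffer();
    }

    @Override
    public long length() {
        requireOpen();
        return data.length;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public void close() {
        opened = false; // assumption: this sketch treats a closed file as unopened
    }

    private void requireOpen() {
        if (!opened) {
            throw new IllegalStateException(name + " has not been opened");
        }
    }
}

final class LifecycleSketch {
    /** Walks through unopened -> open -> read -> close and summarises what happened. */
    static String demo() {
        byte[] bytes = "PAR1-payload-PAR1".getBytes(StandardCharsets.US_ASCII);
        try (HeapInputFile file = new HeapInputFile("demo.parquet", bytes)) {
            boolean rejected = false;
            try {
                file.length(); // not opened yet, so this should be rejected
            } catch (IllegalStateException e) {
                rejected = true;
            }
            file.open();
            ByteBuffer magic = file.readRange(0, 4);
            return rejected + ":" + file.length() + ":" + StandardCharsets.US_ASCII.decode(magic);
        }
    }
}
```

In normal use the framework performs the open() step itself, so application code would typically contain only the creation and the try-with-resources close.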

Implementations must be safe for concurrent use from multiple threads once opened. The ByteBuffer instances returned by readRange(long, int) are owned by the caller; depending on the implementation, they may be slices of a larger mapping or freshly allocated buffers.
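One way an implementation backed by a single large buffer might satisfy both requirements is to give every caller an independent view of the shared bytes. The following is a minimal sketch under that assumption; SharedBufferReads, sliceOf and parallelSum are hypothetical names, and the approach relies on no thread mutating the shared buffer after it has been filled.

```java
import java.nio.ByteBuffer;
import java.util.stream.IntStream;

final class SharedBufferReads {
    /**
     * Returns a view of [offset, offset + length) with its own position and limit,
     * leaving the shared buffer's cursor untouched.
     */
    static ByteBuffer sliceOf(ByteBuffer shared, int offset, int length) {
        ByteBuffer view = shared.duplicate(); // same bytes, independent cursor
        view.clear();                         // position = 0, limit = capacity (data is kept)
        view.limit(offset + length);
        view.position(offset);
        return view.slice();                  // position 0, capacity == length
    }

    /** Reads every int back from many threads at once and sums the values. */
    static int parallelSum(int count) {
        ByteBuffer shared = ByteBuffer.allocate(count * Integer.BYTES);
        for (int i = 0; i < count; i++) {
            shared.putInt(i);
        }
        return IntStream.range(0, count)
                .parallel()
                .map(i -> sliceOf(shared, i * Integer.BYTES, Integer.BYTES).getInt())
                .sum();
    }
}
```

Because each caller receives its own slice, moving that slice's position or limit does not affect other readers, which matches the statement that returned buffers are owned by the caller.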

Method Details

    • open

      void open() throws IOException
      Performs expensive resource acquisition (e.g. memory-mapping a file or opening a network connection). Must be called before readRange(long, int) or length().
      Throws:
      IOException - if the resource cannot be acquired
    • readRange

      ByteBuffer readRange(long offset, int length) throws IOException
      Reads a range of bytes from the file.
      Parameters:
      offset - the byte offset to start reading from
      length - the number of bytes to read
      Returns:
      a ByteBuffer containing the requested data
      Throws:
      IOException - if the read fails
      IllegalStateException - if open() has not been called
      IndexOutOfBoundsException - if offset or length is out of range
    • length

      long length() throws IOException
      Returns the total size of the file in bytes.
      Returns:
      the file size
      Throws:
      IOException - if the size cannot be determined
      IllegalStateException - if open() has not been called
    • name

      String name()
      Returns an identifier for this file, used in log messages and JFR events.
      Returns:
      a human-readable name or path
    • of

      static InputFile of(ByteBuffer buffer)

      Creates an InputFile backed by an in-memory ByteBuffer.

      Since the data is already in memory, no resource acquisition is needed and open() is a no-op.

      Parameters:
      buffer - the buffer containing Parquet file data
      Returns:
      a new InputFile backed by the buffer
    • of

      static InputFile of(Path path)
      Creates an unopened InputFile for a local file path.
      Parameters:
      path - the file to read
      Returns:
      a new unopened InputFile
    • ofPaths

      static List<InputFile> ofPaths(List<Path> paths)
      Creates unopened InputFile instances for a list of local file paths.
      Parameters:
      paths - the files to read
      Returns:
      a list of new unopened InputFile instances
    • ofPaths

      static List<InputFile> ofPaths(Path first, Path... more)
      Creates unopened InputFile instances for the given local file paths.
      Parameters:
      first - the first file to read
      more - additional files to read
      Returns:
      a list of new unopened InputFile instances
    • ofBuffers

      static List<InputFile> ofBuffers(List<ByteBuffer> buffers)

      Creates InputFile instances for a list of in-memory ByteBuffers.

      Since the data is already in memory, no resource acquisition is needed and open() is a no-op for each instance.

      Parameters:
      buffers - the buffers containing Parquet file data
      Returns:
      a list of new InputFile instances backed by the buffers
    • ofBuffers

      static List<InputFile> ofBuffers(ByteBuffer first, ByteBuffer... more)

      Creates InputFile instances for the given in-memory ByteBuffers.

      Since the data is already in memory, no resource acquisition is needed and open() is a no-op for each instance.

      Parameters:
      first - the first buffer containing Parquet file data
      more - additional buffers containing Parquet file data
      Returns:
      a list of new InputFile instances backed by the buffers
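
The ofPaths(Path first, Path... more) and ofBuffers(ByteBuffer first, ByteBuffer... more) signatures use the common "first, more" varargs idiom, which makes a call with no arguments a compile-time error. This page does not say how the library turns those arguments into a list; the sketch below shows one plausible way, using a hypothetical generic helper named listOf in an invented class FirstMoreSketch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class FirstMoreSketch {
    /** Hypothetical helper: combines a mandatory first element with optional extras, in order. */
    @SafeVarargs
    static <T> List<T> listOf(T first, T... more) {
        List<T> all = new ArrayList<>(1 + more.length);
        all.add(first);
        Collections.addAll(all, more);
        return Collections.unmodifiableList(all);
    }

    /** Builds a list of three illustrative paths; the file names are made up. */
    static List<Path> demo() {
        return listOf(Paths.get("a.parquet"), Paths.get("b.parquet"), Paths.get("c.parquet"));
    }
}
```

A varargs overload written this way could then delegate to the List-based overload, so both entry points share one code path.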