Class Hardwood
All Implemented Interfaces:
AutoCloseable
Entry point for reading Parquet files with a shared thread pool.
Use this when reading multiple files to share the executor across readers:
try (Hardwood hardwood = Hardwood.create()) {
    ParquetFileReader file1 = hardwood.open(InputFile.of(path1));
    ParquetFileReader file2 = hardwood.open(InputFile.of(path2));
    // ...
}
To control the decode parallelism, or to reuse one context (and its thread
pool) across many reads and standalone ParquetFileReaders, create the
HardwoodContext yourself and pass it in:
try (HardwoodContext context = HardwoodContext.create(4)) {
    try (Hardwood hardwood = Hardwood.create(context)) {
        // ...
    }
    // context is still open here for further use
}
For single-file usage, ParquetFileReader.open(InputFile) is simpler.
Method Summary
Modifier and Type | Method | Description
void | close() |
static Hardwood | create() | Create a new Hardwood instance with a thread pool sized to available processors.
static Hardwood | create(HardwoodContext context) | Create a new Hardwood instance backed by the given context, e.g. one created via HardwoodContext.create(int) to size the decode thread pool.
 | open | Open a single Parquet file.
 | openAll | Open multiple Parquet files for reading with cross-file prefetching.
Method Details
create
Create a new Hardwood instance with a thread pool sized to available processors. The context is owned by this instance and closed when it is closed.
create
Create a new Hardwood instance backed by the given context, e.g. one created via HardwoodContext.create(int) to size the decode thread pool.
The caller retains ownership of the context: it is not closed when this instance is closed, so the same context, and its thread pool, can be reused for later reads and shared with standalone ParquetFileReaders opened via ParquetFileReader.open(InputFile, HardwoodContext).
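The difference between the two create variants is who closes the context. The following is a minimal sketch of that ownership rule, not Hardwood's actual implementation: Context and Entry are hypothetical stand-ins for HardwoodContext and Hardwood, and the ownsContext flag is an assumption about how such a rule could be tracked.

```java
public class OwnershipSketch {

    /** Hypothetical stand-in for HardwoodContext: it only records whether it was closed. */
    static final class Context implements AutoCloseable {
        private boolean closed;

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Hypothetical stand-in for Hardwood: closes its context only if it created it. */
    static final class Entry implements AutoCloseable {
        final Context context;
        private final boolean ownsContext; // assumed bookkeeping, not taken from the library

        private Entry(Context context, boolean ownsContext) {
            this.context = context;
            this.ownsContext = ownsContext;
        }

        /** Mirrors create(): the instance makes and owns its context. */
        static Entry create() {
            return new Entry(new Context(), true);
        }

        /** Mirrors create(context): the caller keeps ownership. */
        static Entry create(Context context) {
            return new Entry(context, false);
        }

        @Override
        public void close() {
            if (ownsContext) {
                context.close();
            }
        }
    }

    public static void main(String[] args) {
        Context shared = new Context();
        try (Entry borrowed = Entry.create(shared)) {
            // reads would happen here
        }
        // The borrowed context survives the close of the instance that used it.
        System.out.println("shared closed: " + shared.isClosed());

        Entry owner = Entry.create();
        owner.close();
        // The owned context is closed together with its instance.
        System.out.println("owned closed: " + owner.context.isClosed());
    }
}
```

With this rule the caller must close a shared context itself, which is why the class-level example wraps the HardwoodContext in its own outer try block.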
open
Open a single Parquet file. The file is opened immediately and closed when the returned reader is closed.
Throws:
IOException
openAll
Open multiple Parquet files for reading with cross-file prefetching. The schema is read from the first file. Files are opened on demand by the iterator and closed when the returned reader is closed.
Parameters:
inputFiles - the input files to read (must not be empty)
Throws:
IOException - if the first file cannot be opened or read
IllegalArgumentException - if the list is empty
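The lifecycle described for openAll (empty list rejected, first file opened up front, later files opened on demand, everything closed with the reader) can be modelled roughly as below. This is a sketch under stated assumptions: OpenedFile and MultiReader are hypothetical stand-ins, the files are plain names rather than Parquet inputs, and cross-file prefetching is not modelled at all.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LazyOpenSketch {

    /** Hypothetical stand-in for an opened file; logs its open and close events. */
    static final class OpenedFile implements AutoCloseable {
        final String name;
        private final List<String> log;

        OpenedFile(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    /** Hypothetical multi-file reader: first file opened eagerly, the rest on demand. */
    static final class MultiReader implements Iterator<OpenedFile>, AutoCloseable {
        private final List<String> names;
        private final List<String> log;
        private final List<OpenedFile> opened = new ArrayList<>();
        private int next = 0;

        MultiReader(List<String> names, List<String> log) {
            if (names.isEmpty()) {
                throw new IllegalArgumentException("inputFiles must not be empty");
            }
            this.names = new ArrayList<>(names);
            this.log = log;
            // The first file is opened right away, since the schema comes from it.
            opened.add(new OpenedFile(this.names.get(0), log));
        }

        @Override
        public boolean hasNext() {
            return next < names.size();
        }

        @Override
        public OpenedFile next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            if (next == opened.size()) {
                // Not opened yet: open it only now that the iterator reached it.
                opened.add(new OpenedFile(names.get(next), log));
            }
            return opened.get(next++);
        }

        @Override
        public void close() {
            // Closing the reader closes every file it opened so far.
            for (OpenedFile file : opened) {
                file.close();
            }
            opened.clear();
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        try (MultiReader reader = new MultiReader(List.of("a", "b", "c"), log)) {
            reader.next(); // "a" was already open
            reader.next(); // opens "b"; "c" is never reached
        }
        System.out.println(log); // prints [open a, open b, close a, close b]
    }
}
```

In this model a file that the iterator never reaches is never opened, so a failure to open a later file would surface during iteration rather than at construction time.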
close
public void close()
Specified by:
close in interface AutoCloseable