Class | Description |
---|---|
ParquetAppender | FileAppender for writing to Parquet files. |
ParquetScanner | FileScanner for reading Parquet files. |
TajoParquetReader | Tajo implementation of ParquetReader to read Tajo records from a Parquet file. |
TajoParquetWriter | Tajo implementation of ParquetWriter to write Tajo records to a Parquet file. |
TajoReadSupport | Tajo implementation of ReadSupport for Tuples. |
TajoRecordConverter | Converter to convert a Parquet record into a Tajo Tuple. |
TajoSchemaConverter | Converts between Parquet and Tajo schemas. |
TajoWriteSupport | Tajo implementation of WriteSupport for Tuples. |
Provides read and write support for Parquet files. Tajo schemas are converted to Parquet schemas according to the following mapping of Tajo and Parquet types:
Tajo type | Parquet type |
---|---|
NULL_TYPE | No type. The field is not encoded in Parquet. |
BOOLEAN | BOOLEAN |
BIT | INT32 |
INT2 | INT32 |
INT4 | INT32 |
INT8 | INT64 |
FLOAT4 | FLOAT |
FLOAT8 | DOUBLE |
CHAR | BINARY (with OriginalType UTF8) |
TEXT | BINARY (with OriginalType UTF8) |
PROTOBUF | BINARY |
BLOB | BINARY |
INET4 | BINARY |
Because Tajo fields can be NULL, all Parquet fields are marked as optional.
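The mapping above can be sketched as a small lookup. This is a minimal illustration only, not the code in TajoSchemaConverter: the class `TypeMappingSketch` and the method `toParquetField` are hypothetical names, types are represented as plain strings rather than Tajo or Parquet classes, and the output imitates Parquet's textual schema notation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of the type mapping table; does not use the Tajo or Parquet APIs. */
public class TypeMappingSketch {
    // Tajo type name -> Parquet primitive type name, copied from the table above.
    private static final Map<String, String> PRIMITIVE = new LinkedHashMap<>();
    // Tajo types whose BINARY values carry the UTF8 OriginalType.
    private static final Set<String> UTF8_TYPES =
            new HashSet<>(Arrays.asList("CHAR", "TEXT"));

    static {
        PRIMITIVE.put("BOOLEAN", "boolean");
        PRIMITIVE.put("BIT", "int32");
        PRIMITIVE.put("INT2", "int32");
        PRIMITIVE.put("INT4", "int32");
        PRIMITIVE.put("INT8", "int64");
        PRIMITIVE.put("FLOAT4", "float");
        PRIMITIVE.put("FLOAT8", "double");
        PRIMITIVE.put("CHAR", "binary");
        PRIMITIVE.put("TEXT", "binary");
        PRIMITIVE.put("PROTOBUF", "binary");
        PRIMITIVE.put("BLOB", "binary");
        PRIMITIVE.put("INET4", "binary");
    }

    /** Returns a Parquet-style field declaration, or empty if the field is not encoded. */
    public static Optional<String> toParquetField(String tajoType, String fieldName) {
        if ("NULL_TYPE".equals(tajoType)) {
            return Optional.empty(); // NULL_TYPE fields are not encoded in Parquet
        }
        String primitive = PRIMITIVE.get(tajoType);
        if (primitive == null) {
            throw new IllegalArgumentException("No mapping for Tajo type: " + tajoType);
        }
        String annotation = UTF8_TYPES.contains(tajoType) ? " (UTF8)" : "";
        // Every field is optional because any Tajo field can be NULL.
        return Optional.of("optional " + primitive + " " + fieldName + annotation + ";");
    }

    public static void main(String[] args) {
        System.out.println(toParquetField("INT2", "age").orElse("(not encoded)"));
        System.out.println(toParquetField("TEXT", "name").orElse("(not encoded)"));
        System.out.println(toParquetField("NULL_TYPE", "unused").orElse("(not encoded)"));
    }
}
```

A lookup table keeps the sketch close to the documentation table, so each row can be checked by eye against the mapping.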
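The conversion from Tajo to Parquet is lossy: several Tajo types map to the same Parquet type (for example, BIT, INT2, and INT4 all become INT32), so the original Tajo schema cannot be recovered from a Parquet file alone. As a result, Parquet files are read using the Tajo schema saved in the Tajo catalog for the table the Parquet files belong to, which was defined when the table was created.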
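To see why the Parquet schema alone is not enough to rebuild the Tajo schema, the sketch below inverts the mapping table and lists the Tajo types that share each Parquet type. It is illustrative only: `LossyMappingSketch` and `invert` are hypothetical names, and types are plain strings rather than Tajo or Parquet classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: shows that the Tajo-to-Parquet mapping is many-to-one. */
public class LossyMappingSketch {
    /** Forward mapping copied from the documentation table (NULL_TYPE is omitted: it is not encoded). */
    public static Map<String, String> forward() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("BOOLEAN", "BOOLEAN");
        m.put("BIT", "INT32");
        m.put("INT2", "INT32");
        m.put("INT4", "INT32");
        m.put("INT8", "INT64");
        m.put("FLOAT4", "FLOAT");
        m.put("FLOAT8", "DOUBLE");
        m.put("CHAR", "BINARY (UTF8)");
        m.put("TEXT", "BINARY (UTF8)");
        m.put("PROTOBUF", "BINARY");
        m.put("BLOB", "BINARY");
        m.put("INET4", "BINARY");
        return m;
    }

    /** Groups Tajo types by the Parquet type they map to, preserving table order within a group. */
    public static Map<String, List<String>> invert(Map<String, String> forward) {
        Map<String, List<String>> inverse = new TreeMap<>();
        for (Map.Entry<String, String> e : forward.entrySet()) {
            inverse.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return inverse;
    }

    public static void main(String[] args) {
        // Any Parquet type with more than one candidate cannot be mapped back unambiguously.
        for (Map.Entry<String, List<String>> e : invert(forward()).entrySet()) {
            if (e.getValue().size() > 1) {
                System.out.println(e.getKey() + " <- " + e.getValue());
            }
        }
    }
}
```

An INT32 column, for instance, has three candidate Tajo types, which is the ambiguity the catalog schema resolves at read time.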
Copyright © 2014 Apache Software Foundation. All Rights Reserved.