Was this helpful?
PARQUET Files
userdata.parquet
File containing data in PARQUET format. File has header.
Column details:
column# column_name hive_datatype
===============================================
1         registration_dttm        timestamp
2         id                      int
3         first_name              string
4         last_name               string
5         email                   string
6         gender                  string
7         ip_address              string
8         cc                      string
9         country                 string
10        birthdate               string
11        salary                  double
12        title                   string
1. Create external table:
CREATE EXTERNAL TABLE parquet_ex_test (
     registration_dttm     TIMESTAMP,
     id                    INTEGER,
     first_name            VARCHAR(50),
     last_name             VARCHAR(50),
     email                 VARCHAR(50),
     gender                VARCHAR(50),
     ip_address            VARCHAR(50),
     cc                    VARCHAR(50),
     country               VARCHAR(50),
     birthdate             VARCHAR(50),
     salary                DECIMAL(18,2),
     title                 VARCHAR(50)
) USING SPARK WITH
REFERENCE='/tmp/userdata.parquet',
FORMAT = 'parquet';
2. Create native Vector table:
CREATE TABLE parquet_test(
     registration_dttm     TIMESTAMP,
     id                    INTEGER,
     first_name            VARCHAR(50),
     last_name             VARCHAR(50),
     email                 VARCHAR(50),
     gender                VARCHAR(50),
     ip_address            VARCHAR(50),
     cc                    VARCHAR(50),
     country               VARCHAR(50),
     birthdate             VARCHAR(50),
     salary                DECIMAL(18,2),
     title                 VARCHAR(50)
) WITH STRUCTURE=X100;
3. Load Vector table with INSERT command
INSERT INTO parquet_test SELECT * FROM parquet_ex_test
Note:  If the file does not have a header, include schema in OPTIONS, as shown in the external table to load an ORC file.
Last modified date: 03/21/2024