4.2. Supported Data Types #
Types in the source DB are converted to types in the destination DB as follows: the source driver maps the types of the source DB to Go types, and then encoders are used to convert data to the binary format of the destination DB.
The following Oracle data types are supported:
NUMBER, CHAR, NCHAR, VARCHAR, NVARCHAR, VARCHAR2, NVARCHAR2, STRING, DATE, LONG, LONG RAW, CLOB, NCLOB, BLOB, BFILE, RAW(16), RAW(32), RAW, ROWID, UROWID, FLOAT, DEC, DECIMAL, DOUBLE PRECISION, INT, INTEGER, BINARY_INTEGER, PLS_INTEGER, SMALLINT, REAL, BINARY_FLOAT, BINARY_DOUBLE, TIMESTAMP, BOOLEAN, INTERVAL, XMLTYPE, TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH LOCAL TIME ZONE, VARRAY.
The following PostgreSQL data types are supported:
BFILE, BOOLEAN, BYTEA, CHAR, CHARACTER VARYING(N), VARCHAR(N), CHARACTER(N), CHAR(N), BPCHAR(N), BPCHAR, DBMS_LOB.BLOB, DBMS_LOB.CLOB, NAME, TEXT, SMALLINT, INTEGER, BIGINT, DECIMAL, NUMERIC, REAL, DOUBLE PRECISION, SMALLSERIAL, SERIAL, BIGSERIAL, TIMESTAMP, DATE, TIME, INTERVAL, HSTORE, UUID, XML, JSON, JSONB, OID, MONEY, CIDR, INET, MACADDR, MACADDR8, tsquery, tsvector, txid_snapshot, pg_snapshot, pg_lsn, and custom data types.
When transferring data of the MONEY type, the value is read from the source without the currency sign, and the data is written to the destination with the currency sign corresponding to the destination locale.
The following table explains the mapping between Oracle and PostgreSQL data types:
Table 4.2. Mapping of Oracle Types to PostgreSQL Types
| Oracle Type | Go Type | PostgreSQL Type |
|---|---|---|
VARCHAR2 NVARCHAR2 CHAR NCHAR LONG | string | text char(n) varchar(n) bool |
CLOB NCLOB | string | dbms_lob.clob text |
BINARY_DOUBLE | float64 | real double precision smallint integer bigint numeric smallserial serial bigserial |
FLOAT DEC DECIMAL NUMBER | Decimal | numeric |
BINARY_FLOAT | float32 | real |
DATE | time.Time | timestamp |
TIMESTAMP(0) TIMESTAMP(3) TIMESTAMP(6) TIMESTAMP(9) TIMESTAMP(0) WITH TIME ZONE TIMESTAMP(3) WITH TIME ZONE TIMESTAMP(6) WITH TIME ZONE TIMESTAMP(9) WITH TIME ZONE | time.Time | timestamp [with timezone] |
LONG RAW BLOB RAW | []byte | sfile dbms_lob.blob bytea |
BFILE |
model.BFile
{
alias string,
filename string
}
| bfile |
BOOLEAN | bool | bool |
NTERVAL ... |
model.Interval
{
Years int
Months int
Days int
Hours int
Minutes int
Seconds int
Milliseconds int
Sign int
}
| interval |
Several next tables explain mapping of PostgreSQL data types to files in the Parquet format.
The table below explains mapping of numeric data types:
Table 4.3. Mapping of PostgreSQL Numeric Types to Files in the Parquet Format
| PostgreSQL Type | Parquet Type | PostgreSQL Example | Parquet Example | Comment |
|---|---|---|---|---|
smallint | INT32 | 5 | 5 | |
integer | INT32 | 5 | 5 | |
bigint | INT64 | 5 | 5 | |
numeric | FIXED_LEN_BYTE_ARRAY | -1.258 | 0xFFFFFFFFFFFFFFFFFFFFFFFFB5047180 | Precision 38 and scale 9 are only supported. The physical type is |
decimal | FIXED_LEN_BYTE_ARRAY | -1.258 | 0xFFFFFFFFFFFFFFFFFFFFFFFFB5047180 | The comment for the numeric type holds. |
real | FLOAT | 129.153 | 129.1529998779297 | |
double precision | DOUBLE | -1.258 | -1.258 |
The table below explains mapping of monetary data types:
Table 4.4. Mapping of PostgreSQL Monetary Types to Files in the Parquet Format
| PostgreSQL Type | Parquet Type | PostgreSQL Example | Parquet Example | Comment |
|---|---|---|---|---|
money | BYTE_ARRAY | $1.25 | 1.25 | The type is mapped as a string without the currency sign. |
The table below explains mapping of string data types:
Table 4.5. Mapping of PostgreSQL String Types to Files in the Parquet Format
| PostgreSQL Type | Parquet Type | PostgreSQL Example | Parquet Example | Comment |
|---|---|---|---|---|
varchar | BYTE_ARRAY | "test string" | "test string" | Among others, varchar(n) with the specified length is supported, but the length is not taken into account. |
char(10) | BYTE_ARRAY | "test" | "test" | If the string is shorter than the length, remaining characters are filled with spaces. |
text | BYTE_ARRAY | "test string" | "test string" |
The table below explains mapping of date/time data types:
Table 4.6. Mapping of PostgreSQL Date/Time Types to Files in the Parquet Format
| PostgreSQL Type | Parquet Type | PostgreSQL Example | Parquet Example | Comment |
|---|---|---|---|---|
timestamp without time zone | INT64 | "2006-01-02 15:04:05" | 1136214245000000 | timestamp(6) is supported, but the precision is ignored. The number of microseconds since the start of the UNIX epoch is returned. |
timestamp with time zone | INT64 | "02 Jan 06 15:04 EST" | 1136232240000000 | The time zone and precision are ignored. Microseconds are returned. |
date | BYTE_ARRAY | "2006-01-02" | 2006-01-02 | A date is stored as a string in this format. BC dates are not supported. |
time without time zone | INT64 | "15:04:05" | 54245000000 | The number of microseconds since the start of the day. timestamp(6) is supported, but the precision is ignored. |
interval | FIXED_LEN_BYTE_ARRAY | "1 years 0 mons 2 days 3 hours 20 mins 0.0 secs" | 1 year 2 days 03:20:00 | The physical type is Bear in mind that this mapping is performed with a loss of accuracy: while PostgreSQL stores MinMax statistics is not collected for this type. If the statistics for this type is available, the following duckDB query results in errors: SELECT * FROM parquet_metadata(./file.parquet) |
The table below explains mapping of other data types.
Table 4.7. Mapping of PostgreSQL Other Types to Files in the Parquet Format
| PostgreSQL Type | Parquet Type | PostgreSQL Example | Parquet Example |
|---|---|---|---|
boolean | BOOLEAN | TRUE | true |
bytea | BYTE_ARRAY | "test bytes" | "test bytes" |
Note
The Parquet type xml is not supported at present.