Home > mailing lists

Re: Add header support to text format and matching feature - Mailing list pgsql-hackers

From	Julien Rouhaud
Subject	Re: Add header support to text format and matching feature
Date	June 7, 2022 15:47:44
Msg-id	20220607154744.vvmitnqhyxrne5ms@jrouhaud Whole thread
In response to	Re: Add header support to text format and matching feature (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Responses	Re: Add header support to text format and matching feature
List	pgsql-hackers

Tree view

Hi,

On Wed, Mar 30, 2022 at 09:11:09AM +0200, Peter Eisentraut wrote:
>
> Committed, after some further refinements as discussed.

While working on nearby code, I found some problems with this feature.

First, probably nitpicking, the HEADER MATCH is allowed for COPY TO, is that
expected?  The documentation isn't really explicit about it, but there's
nothing to match when exporting data it's a bit surprising.  I'm not opposed to
have HEADER MATCH means HEADER ON for COPY TO, as as-is one can easily reuse
the commands history, but maybe it should be clearly documented?

Then, apparently HEADER MATCH doesn't let you do sanity checks against a custom
column list.  This one looks like a clear oversight, as something like that
should be entirely valid IMHO:

CREATE TABLE tbl(col1 int, col2 int);
COPY tbl (col2, col1) TO '/path/to/file' WITH (HEADER MATCH);
COPY tbl (col2, col1) FROM '/path/to/file' WITH (HEADER MATCH);

but right now it errors out with:

ERROR:  column name mismatch in header line field 1: got "col1", expected "col2"

Note that the error message is bogus if you specify attributes in a
different order from the relation, as the code is mixing access to the tuple
desc and access to the raw fields with the same offset.

This also means that it will actually fail to detect a mismatch in the provided
column list and let you import data in the wrong position as long as the
datatypes are compatible and the column header in the file are in the correct
order.  For instance:

CREATE TABLE abc (a text, b text, c text);
INSERT INTO abc SELECT 'a', 'b', 'c';
COPY abc TO '/path/to/file' WITH (HEADER MATCH);

You can then import the data with any of those:
COPY abc(c, b, a) TO '/path/to/file' WITH (HEADER MATCH);
COPY abc(c, a, b) TO '/path/to/file' WITH (HEADER MATCH);
[...]
SELECT * FROM abc;

Even worse, if you try to do a COPY ... FROM ... WITH (HEADER ON) on a table
that has some dropped attribute(s).  The current code will access random memory
as there's no exact attnum / raw field mapping anymore.

I can work on a fix if needed (with some additional regression test to cover
those cases), but I'm still not sure that having a user provided column list is
supposed to be accepted or not for the HEADER MATCH.  In the meantime I will
add an open item.

pgsql-hackers by date:

From: Rod Taylor
Date: 07 June 2022, 15:47:29
Subject: Re: Collation version tracking for macOS

From: Robert Haas
Date: 07 June 2022, 16:26:01
Subject: Re: pgcon unconference / impact of block size on performance

Re: Add header support to text format and matching feature - Mailing list pgsql-hackers

Previous

Next