Re: Add LIMIT option to COPY FROM - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Add LIMIT option to COPY FROM
Date
Msg-id CAKFQuwbKtCOPSY3XRzTighucPXzat6qM9g937KPoOCYP=10rww@mail.gmail.com
Whole thread Raw
In response to Re: Add LIMIT option to COPY FROM  (Shinya Kato <shinya11.kato@gmail.com>)
List pgsql-hackers
On Tue, Feb 3, 2026 at 6:07 PM Shinya Kato <shinya11.kato@gmail.com> wrote:
> Would want it paired with offset for this use case.

COPY FROM already accepts HEADER <integer> to skip N leading lines, so
you can combine it with LIMIT to get OFFSET + LIMIT semantics today:
=# COPY t FROM 'file.csv' (HEADER 100, LIMIT 50);

Indirect, but fair, assuming the limit is indeed pre-parsed lines.  You need OFFSET to match post-processed counted lines.


>> Design:
>> - The LIMIT count applies after WHERE filtering and ON_ERROR skipping,
>> so it represents the actual number of rows inserted.
>
>
> Not sure about that choice.  I’d go with pre-eval or implement both and default to pre-eval.

It is consistent with SQL semantics — SELECT ... WHERE ... LIMIT N
counts rows that pass the filter, not rows scanned.

Ok, I suppose I'd name it "stop N" instead of "limit N" to avoid this.

Pre-eval behavior
is already achievable externally (head -n), while post-eval can only
be done server-side, which makes it the more valuable choice for a
built-in option.

Not seeing the need for either-or; and everywhere else you want to support this feature by assuming that external tools aren't available.

So;  HEADER+STOP, HEADER?+OFFSET+LIMIT


This is not a novel feature either. Oracle SQL*Loader provides LOAD,
and SQL Server BULK INSERT provides LASTROW. The absence of LIMIT in
COPY FROM is arguably a gap relative to other RDBMSes.


There'd be a lot less friction of this sort if someone just bites the bullet and devises a core-managed ETL tool instead of attaching pieces one-by-one into COPY.  Or at least maybe we get a fast-path version to handle typical dump-restore commands and then branch to the ETL path if the command options indicate doing so is needed.  There is apparently too much demand for this stuff for a nonproliferation agreement.

David J.

pgsql-hackers by date:

Previous
From: Shinya Kato
Date:
Subject: Re: Add LIMIT option to COPY FROM
Next
From: "zengman"
Date:
Subject: Re: Remove unused isCommit parameter from AtEOXact_LocalBuffers