> Would want it paired with offset for this use case.
COPY FROM already accepts HEADER <integer> to skip N leading lines, so you can combine it with LIMIT to get OFFSET + LIMIT semantics today:

  =# COPY t FROM 'file.csv' (HEADER 100, LIMIT 50);
Indirect, but fair, assuming LIMIT indeed counts pre-parsed lines. A true OFFSET would have to count post-processed rows, which HEADER can't do.
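To make the divergence concrete (illustrative only, combining the proposed LIMIT with v17's ON_ERROR):

  =# COPY t FROM 'file.csv' (HEADER 100, ON_ERROR ignore, LIMIT 50);

HEADER 100 skips 100 raw lines up front, but a post-eval LIMIT 50 keeps reading past line 150 until 50 rows have actually been inserted, so the two countings drift apart as soon as anything is rejected or filtered.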
>> Design:
>> - The LIMIT count applies after WHERE filtering and ON_ERROR skipping,
>>   so it represents the actual number of rows inserted.
>
> Not sure about that choice. I'd go with pre-eval, or implement both and default to pre-eval.
It is consistent with SQL semantics — SELECT ... WHERE ... LIMIT N counts rows that pass the filter, not rows scanned.
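For comparison, in a query the count is taken after the filter (table and column here are illustrative):

  =# SELECT * FROM t WHERE flag LIMIT 50;

That returns 50 matching rows regardless of how many rows had to be scanned, which is the behavior the patch mirrors.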
Ok, I suppose I'd name it "stop N" instead of "limit N" to avoid colliding with those semantics.
Pre-eval behavior is already achievable externally (head -n), while post-eval can only be done server-side, which makes it the more valuable choice for a built-in option.
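For instance (path illustrative; FROM PROGRAM runs on the server and requires superuser or pg_execute_server_program):

  =# COPY t FROM PROGRAM 'head -n 150 /tmp/file.csv' (FORMAT csv);

That caps the input at 150 raw lines before the parser ever sees them, i.e. a pre-eval limit, but no amount of external filtering can count the rows that survive WHERE or ON_ERROR.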
Not seeing the need for either-or; and everywhere else the case for this feature is made by assuming that external tools aren't available.
So: HEADER+STOP (pre-eval), or HEADER?+OFFSET+LIMIT (post-eval).
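Spelled out, the two candidate surfaces would look like this (neither exists today; names are the ones floated in this subthread):

  =# COPY t FROM 'file.csv' (HEADER, STOP 150);              -- pre-eval: raw-line cutoff
  =# COPY t FROM 'file.csv' (HEADER, OFFSET 100, LIMIT 50);  -- post-eval: counted rows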
This is not a novel feature either. Oracle SQL*Loader provides LOAD, and SQL Server BULK INSERT provides LASTROW. The absence of LIMIT in COPY FROM is arguably a gap relative to other RDBMSes.
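For reference, the rough equivalents elsewhere (syntax from memory, so double-check):

  -- Oracle SQL*Loader control file:
  OPTIONS (SKIP=100, LOAD=50)

  -- SQL Server:
  BULK INSERT t FROM 'file.csv' WITH (FIRSTROW = 101, LASTROW = 150);

Both count file rows rather than post-filter rows, if memory serves, i.e. pre-eval.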
There'd be a lot less friction of this sort if someone just bit the bullet and devised a core-managed ETL tool instead of bolting pieces onto COPY one by one. Or at least a fast-path version that handles the typical dump-restore commands, branching to the ETL path when the command's options indicate it's needed. There is apparently too much demand for this stuff for a nonproliferation agreement.