Thread: Very long SQL strings
I can't seem to find a definitive answer to this.
It looks like Postgres does not enforce a limit on the length of an SQL string. Great. However, is there some point at which a query string becomes ridiculously long and affects performance? Here's my particular case: consider an INSERT statement where you're using the new multi-row VALUES clause or SELECT ... UNION ALL to group together tuples. Is it always better to group as many together as possible?
For example, on a toy table with two columns, I noticed about a 20% speed increase when bulking together 1000 tuples in one INSERT statement as opposed to doing 1000 individual INSERTs. Would this be the same for 10000? 100000? Does it depend on the width of the tuples or the data types?
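For illustration (the table, column names, and values here are invented, not our real schema), the comparison is between statements like:

    -- Hypothetical two-column toy table, for illustration only.
    CREATE TABLE toy (id integer, val text);

    -- 1000 individual INSERTs:
    INSERT INTO toy (id, val) VALUES (1, 'aaa');
    INSERT INTO toy (id, val) VALUES (2, 'bbb');
    -- ... 998 more ...

    -- versus one multi-row VALUES INSERT:
    INSERT INTO toy (id, val) VALUES
        (1, 'aaa'),
        (2, 'bbb'),
        -- ... 997 more rows ...
        (1000, 'zzz');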
Are there any values A and B such that grouping A tuples and B tuples separately and running two statements will be faster than grouping A+B tuples in one statement?
Steve
Steven Flatt wrote:
> It looks like Postgres does not enforce a limit on the length of an SQL
> string. Great. However, is there some point at which a query string becomes
> ridiculously long and affects performance? Here's my particular case:
> consider an INSERT statement where you're using the new multi-row VALUES
> clause or SELECT ... UNION ALL to group together tuples. Is it always
> better to group as many together as possible?

I'm sure you'll reach a point of diminishing returns, and eventually a
ceiling where you run out of memory etc., but I don't know what the limit
would be. The most efficient way to do bulk inserts is to stream the data
with COPY.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
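A minimal sketch of the COPY approach suggested above, reusing the hypothetical toy table from the question (the data and file path are invented):

    -- Stream rows from the client; data lines are tab-delimited and end with "\."
    -- (with psql you would typically use \copy)
    COPY toy (id, val) FROM STDIN;
    1	aaa
    2	bbb
    \.

    -- Or load a server-side file:
    COPY toy (id, val) FROM '/path/to/data.csv' WITH CSV;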
"Steven Flatt" <steven.flatt@gmail.com> writes: > It looks like Postgres does not enforce a limit on the length of an SQL > string. Great. However is there some point at which a query string becomes > ridiculously too long and affects performance? Yes, but it'll depend a whole lot on context; I'd suggest experimentation if you want to derive a number for your particular situation. For starters, whether you are on 32- or 64-bit hardware is hugely relevant. FYI, when we developed multi-row-VALUES quite a bit of thought was put into maintaining performance with lots of rows, and IIRC we saw reasonable performance up into the tens of thousands of rows (depending on how wide the rows are). Other ways of making a query long, such as lots of WHERE clauses, might send performance into the tank a lot quicker. So the short answer is it all depends. regards, tom lane PS: for the record, there is a hard limit at 1GB of query text, owing to restrictions built into palloc. But I think you'd hit other memory limits or performance bottlenecks before that one.
Steven Flatt <steven.flatt@gmail.com> wrote:
> For example, on a toy table with two columns, I noticed about a 20% speed
> increase when bulking together 1000 tuples in one INSERT statement as
> opposed to doing 1000 individual INSERTs. Would this be the same for 10000?
> 100000? Does it depend on the width of the tuples or the data types?

I guess you can obtain the same if you pack all INSERTs into one
transaction. And, faster than INSERT: COPY.

Andreas
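A minimal sketch of the transaction-packing idea mentioned above (same hypothetical toy table as before): grouping the statements means one commit, and so one WAL flush, rather than one per INSERT.

    BEGIN;
    INSERT INTO toy (id, val) VALUES (1, 'aaa');
    INSERT INTO toy (id, val) VALUES (2, 'bbb');
    -- ... many more single-row INSERTs ...
    COMMIT;  -- a single commit instead of one per statement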
tgl@sss.pgh.pa.us (Tom Lane) writes:
> PS: for the record, there is a hard limit at 1GB of query text, owing
> to restrictions built into palloc. But I think you'd hit other
> memory limits or performance bottlenecks before that one.

It would be much funnier to set a hard limit of 640K of query text. The
reasoning should be obvious :-).

I once ran into the situation where Slony-I generated a query that made the
parser blow out (some sort of memory problem / running out of stack space
somewhere thing); it was just short of 640K long, and so we figured that
evidently it was wrong to conclude that "640K ought to be enough for
anybody."

Neil Conway was an observer; he was speculating that, with some (possibly
nontrivial) change to the parser, we should have been able to cope with it.

The query consisted mostly of a NOT IN clause where the list had some
atrocious number of entries in it (all integers).

(Aside: I wound up writing a "query compressor" (now in 1.2) which would
read that list and, if it was at all large, try to squeeze any sets of
consecutive integers into sets of "NOT BETWEEN" clauses. Usually, the lists,
of XIDs, were more or less consecutive, and frequently, in the cases where
the query got to MBs in size, there would be sets of hundreds or even
thousands of consecutive integers such that we'd be left with a tiny query
after this...)

--
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://linuxfinances.info/info/linux.html
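To illustrate the kind of rewrite described in the aside above (the XID values here are invented), a predicate such as:

    WHERE xid NOT IN (1001, 1002, 1003, /* ... consecutive values ... */ 1850, 1852, 1999)

could be squeezed into something like:

    WHERE xid NOT BETWEEN 1001 AND 1850
      AND xid NOT IN (1852, 1999)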
Thanks, everyone, for your responses. I don't think it's realistic to change our application infrastructure to use COPY from a stream at this point. It's good to know that multi-row VALUES is good up into the thousands of rows (depending on various things, of course). That's a good enough answer for what I was looking for, and we can revisit this if performance does start to hurt.
On 6/21/07, Andreas Kretschmer <akretschmer@spamfence.net> wrote:
> I guess you can obtain the same if you pack all INSERTs into one
> transaction.
Well, the 20% gain I referred to was when all the individual INSERTs were within one transaction. When each INSERT does its own commit, it's significantly slower.
Steve
Chris Browne <cbbrowne@acm.org> writes:
> I once ran into the situation where Slony-I generated a query that
> made the parser blow out (some sort of memory problem / running out of
> stack space somewhere thing); it was just short of 640K long, and so
> we figured that evidently it was wrong to conclude that "640K ought to
> be enough for anybody."

> Neil Conway was an observer; he was speculating that, with some
> (possibly nontrivial) change to the parser, we should have been able
> to cope with it.

> The query consisted mostly of a NOT IN clause where the list had some
> atrocious number of entries in it (all integers).

FWIW, we do seem to have improved that as of 8.2. Assuming your entries
were 6-or-so-digit integers, that would have been on the order of 80K
entries, and we can manage it --- not amazingly fast, but it doesn't blow
out the stack anymore.

> (Aside: I wound up writing a "query compressor" (now in 1.2) which
> would read that list and, if it was at all large, try to squeeze any
> sets of consecutive integers into sets of "NOT BETWEEN" clauses.
> Usually, the lists, of XIDs, were more or less consecutive, and
> frequently, in the cases where the query got to MBs in size, there
> would be sets of hundreds or even thousands of consecutive integers
> such that we'd be left with a tiny query after this...)

Probably still a win.

			regards, tom lane