Re: pgbench client-side performance issue on large scripts - Mailing list pgsql-hackers

From Tom Lane
Subject Re: pgbench client-side performance issue on large scripts
Msg-id 1472710.1740428159@sss.pgh.pa.us
In response to pgbench client-side performance issue on large scripts  ("Daniel Verite" <daniel@manitou-mail.org>)
List pgsql-hackers
"Daniel Verite" <daniel@manitou-mail.org> writes:
> On large scripts, pgbench happens to consume a lot of CPU time.
> For instance, with a script consisting of 50000 "SELECT 1;"
> I see "pgbench -f 50k-select.sql" taking about 5.8 secs of CPU time,
> out of a total time of 6.7 secs. When run with perf, this profile shows up:

You ran only a single execution of a 50K-line script?  This test
case feels a little bit artificial.  Having said that ...

> In ParseScript(), expr_scanner_get_lineno() is called for each line
> with its current offset, and it scans the script from the beginning
> up to the current line. I think that on the whole, parsing this script
> ends up looking at (N*(N+1))/2 lines, which is about 1.25 billion lines
> if N=50000.

... yes, O(N^2) is not nice.  It has to be possible to do better.

> I wonder whether pgbench should materialize the current line number
> in a variable, as psql does in pset.lineno. But given that there are
> two different parsers in pgbench, maybe it's not the simplest.
> Flex has yylineno but neither pgbench nor psql make use of it.

Yeah, we do rely on yylineno in bootscanner.l and ecpg, but not
elsewhere; not sure if there's a performance reason for that.
I see that plpgsql has a hand-rolled version (look for cur_line_num)
that perhaps could be stolen.
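
To make the complexity difference concrete, here is a minimal sketch in C.
It is not the pgbench, psql, or plpgsql code; LineTracker and both function
names are hypothetical, and the sketch assumes the caller asks for line
numbers at non-decreasing offsets, as a parser moving forward through a
script would.  The first function rescans from the start of the buffer on
every call, which is what makes the total cost quadratic; the second
remembers where the previous call stopped and counts only the newlines
seen since then.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Quadratic approach: count newlines from the start of the buffer
 * each time a line number is requested.  Cost per call is O(offset).
 */
static int
lineno_by_rescan(const char *buf, size_t offset)
{
    int         lineno = 1;

    for (size_t i = 0; i < offset; i++)
    {
        if (buf[i] == '\n')
            lineno++;
    }
    return lineno;
}

/*
 * Hypothetical incremental tracker: remembers the last offset examined
 * and the line number at that offset.
 */
typedef struct LineTracker
{
    size_t      last_offset;    /* offset reached by the previous call */
    int         last_lineno;    /* line number at last_offset */
} LineTracker;

static void
line_tracker_init(LineTracker *t)
{
    t->last_offset = 0;
    t->last_lineno = 1;
}

/*
 * Linear approach: scan only the bytes between the previous offset and
 * the new one.  Assumes offsets never move backwards.
 */
static int
lineno_incremental(LineTracker *t, const char *buf, size_t offset)
{
    assert(offset >= t->last_offset);

    for (size_t i = t->last_offset; i < offset; i++)
    {
        if (buf[i] == '\n')
            t->last_lineno++;
    }
    t->last_offset = offset;
    return t->last_lineno;
}
```

With the tracker, each byte of the script is examined once over the whole
parse, so the total work is proportional to the script length rather than
to its square.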

            regards, tom lane


