Tom Lane writes:
> We do have some numbers suggesting that the per-character loop in the
> lexer is slow enough to be a problem with very long literals. That is
> the overhead that might be avoided with a special protocol.
Which loop is that? Doesn't the scanner use buffered input anyway?
> However, it should be noted that (AFAIK) no one has spent any effort at
> all on trying to make the lexer go faster. There is quite a bit of
> material in the flex documentation about performance considerations ---
> someone should take a look at it and see if we can get any wins by being
> smarter, without having to introduce protocol changes.
My profiles show that the work spent in the scanner is really minuscule
compared to everything else.
The data appears to support a suspicion I've had for many moons now: that
the binary search for the key words takes quite a bit of time:
                0.22    0.06   66748/66748     yylex [125]
[129]    0.4    0.22    0.06   66748        base_yylex [129]
                0.01    0.02    9191/9191      yy_get_next_buffer [495]
                0.02    0.00   32808/34053     ScanKeywordLookup [579]
                0.00    0.01   16130/77100     MemoryContextStrdup [370]
                0.00    0.00    4000/4000      scanstr [1057]
                0.00    0.00    4637/4637      yy_get_previous_state [2158]
                0.00    0.00    4554/4554      base_yyrestart [2162]
                0.00    0.00    4554/4554      yywrap [2163]
                0.00    0.00       1/1         base_yy_create_buffer [2852]
                0.00    0.00       1/13695     base_yy_load_buffer_state [2107]
A while ago I experimented with hash functions for the key word lookup
and got a speedup of a factor of 2.5, but again, this is really minor in
the overall scheme of things.
(The profile data is from a run of all the regression test files in order
in one session.)
--
Peter Eisentraut peter_e@gmx.net