Thread: FUNC_MAX_ARGS benchmarks
Ok, here are some crude benchmarks to attempt to measure the effect of changing FUNC_MAX_ARGS. The benchmark script executed: CREATE FUNCTION test_func(int, int, int, int, int, int, int, int) RETURNS INTEGER AS 'SELECT $1 + $2 + $3 + $4 + $5 + $6 + $7 + $8' LANGUAGE 'sql' VOLATILE; Followed by 30,000 calls of: SELECT test_func(i, i, i, i, i, i, i, i); (Where i was the iteration number) I ran the test several times and averaged the results -- the wall-clock time remained very consistent throughout the runs. Each execution of the script took about 30 seconds. The machine was otherwise idle, and all other PostgreSQL settings were at their default values. With FUNC_MAX_ARGS=16: 28.832 28.609 28.726 28.680 (average = 28.6 seconds) With FUNC_MAX_ARGS=32: 29.097 29.337 29.138 28.985 29.231 (average = 29.15 seconds) Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
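For readers wanting to repeat the measurement, a minimal sketch of the same test collapsed into a single psql session is below. The generate_series() call and \timing are assumptions -- conveniences newer than the 7.3-era servers under discussion -- and the one aggregate query stands in for the 30,000 individual SELECTs that Neil's driver script issued, so absolute numbers will differ.

    -- Sketch only; not Neil's actual harness.
    CREATE FUNCTION test_func(int, int, int, int, int, int, int, int)
    RETURNS integer AS 'SELECT $1 + $2 + $3 + $4 + $5 + $6 + $7 + $8'
    LANGUAGE 'sql' VOLATILE;

    \timing on
    -- One statement standing in for 30,000 separate calls:
    SELECT sum(test_func(i, i, i, i, i, i, i, i))
    FROM generate_series(1, 30000) AS s(i);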
Thanks. That looks acceptable to me, and a good test. --------------------------------------------------------------------------- Neil Conway wrote: > Ok, here are some crude benchmarks to attempt to measure the effect of > changing FUNC_MAX_ARGS. The benchmark script executed: > > CREATE FUNCTION test_func(int, int, int, int, int, int, int, int) > RETURNS INTEGER AS 'SELECT $1 + $2 + $3 + $4 + $5 + $6 + $7 + $8' > LANGUAGE 'sql' VOLATILE; > > Followed by 30,000 calls of: > > SELECT test_func(i, i, i, i, i, i, i, i); > > (Where i was the iteration number) > > I ran the test several times and averaged the results -- the wall-clock > time remained very consistent throughout the runs. Each execution of the > script took about 30 seconds. The machine was otherwise idle, and all > other PostgreSQL settings were at their default values. > > With FUNC_MAX_ARGS=16: > > 28.832 > 28.609 > 28.726 > 28.680 > > (average = 28.6 seconds) > > With FUNC_MAX_ARGS=32: > > 29.097 > 29.337 > 29.138 > 28.985 > 29.231 > > (average = 29.15 seconds) > > Cheers, > > Neil > > -- > Neil Conway <neilconway@rogers.com> > PGP Key ID: DB3C29FC > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
> > With FUNC_MAX_ARGS=16: > > (average = 28.6 seconds) > > With FUNC_MAX_ARGS=32: > > (average = 29.15 seconds) That is almost a 2 percent cost. Shall we challenge someone to get us back 2 percent from somewhere before the 7.3 release? Optimizing a hot spot might do it... - Thomas
On Thu, 1 Aug 2002, Thomas Lockhart wrote: > > > With FUNC_MAX_ARGS=16: > > > (average = 28.6 seconds) > > > With FUNC_MAX_ARGS=32: > > > (average = 29.15 seconds) > > That is almost a 2 percent cost. Shall we challenge someone to get us > back 2 percent from somewhere before the 7.3 release? Optimizing a hot > spot might do it... The other side of the coin ... have you, in the past, gained enough performance to allow us a 2% slip for v7.3? Someone mentioned that the SQL spec called for a 128-byte NAMEDATALEN ... is there a similar spec value for FUNC_MAX_ARGS that we aren't adhering to? Or is that one semi-arbitrary?
Perhaps I'm not remembering correctly, but don't SQL functions still have an abnormally high cost of execution compared to plpgsql? Want to try the same thing with a plpgsql function? On Thu, 2002-08-01 at 18:23, Neil Conway wrote: > Ok, here are some crude benchmarks to attempt to measure the effect of > changing FUNC_MAX_ARGS. The benchmark script executed: > > CREATE FUNCTION test_func(int, int, int, int, int, int, int, int) > RETURNS INTEGER AS 'SELECT $1 + $2 + $3 + $4 + $5 + $6 + $7 + $8' > LANGUAGE 'sql' VOLATILE; > > Followed by 30,000 calls of: > > SELECT test_func(i, i, i, i, i, i, i, i); > > (Where i was the iteration number) > > I ran the test several times and averaged the results -- the wall-clock > time remained very consistent throughout the runs. Each execution of the > script took about 30 seconds. The machine was otherwise idle, and all > other PostgreSQL settings were at their default values. > > With FUNC_MAX_ARGS=16: > > 28.832 > 28.609 > 28.726 > 28.680 > > (average = 28.6 seconds) > > With FUNC_MAX_ARGS=32: > > 29.097 > 29.337 > 29.138 > 28.985 > 29.231 > > (average = 29.15 seconds) > > Cheers, > > Neil > > -- > Neil Conway <neilconway@rogers.com> > PGP Key ID: DB3C29FC > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
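For anyone who wants to follow Rod's suggestion, a plpgsql counterpart of the test function might look roughly like this. It is only a sketch with a made-up name; the single-quoted body matches the pre-dollar-quoting syntax of the day, and it assumes plpgsql has already been installed in the database (e.g. with createlang).

    CREATE FUNCTION test_func_plpgsql(int, int, int, int, int, int, int, int)
    RETURNS integer AS '
    BEGIN
        -- same arithmetic as the SQL-language version
        RETURN $1 + $2 + $3 + $4 + $5 + $6 + $7 + $8;
    END;
    ' LANGUAGE 'plpgsql';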
Rod Taylor <rbt@zort.ca> writes: > Perhaps I'm not remembering correctly, but don't SQL functions still > have an abnormally high cost of execution compared to plpgsql? > Want to try the same thing with a plpgsql function? Actually, plpgsql is pretty expensive too. The thing to be benchmarking is applications of plain old built-in-C functions and operators. Also, there are two components that I'd be worried about: one is the parser's costs of operator/function lookup, and the other is runtime overhead. Runtime overhead is most likely concentrated in the fmgr.c interface functions, which tend to do MemSets to zero out function call records. I had had a todo item to eliminate the memset in favor of just zeroing what needs to be zeroed, at least in the one- and two- argument cases which are the most heavily trod code paths. This will become significantly more important if FUNC_MAX_ARGS increases. regards, tom lane
On Fri, Aug 02, 2002 at 10:39:47AM -0400, Tom Lane wrote: > > Actually, plpgsql is pretty expensive too. The thing to be benchmarking > is applications of plain old built-in-C functions and operators. I thought part of the justification for this was for the OpenACS guys; don't they write everything in TCL? A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
On Fri, 2 Aug 2002, Andrew Sullivan wrote: > On Fri, Aug 02, 2002 at 10:39:47AM -0400, Tom Lane wrote: > > > > Actually, plpgsql is pretty expensive too. The thing to be benchmarking > > is applications of plain old built-in-C functions and operators. > > I thought part of the justification for this was for the OpenACS > guys; don't they write everything in TCL? Nope, the OpenACS stuff relies on plpgsql functions ... the 'TCL' is the web pages themselves, vs using something like PHP ... I may be wrong, but I do not believe any of the functions are in TCL ...
On Fri, 2002-08-02 at 09:28, Marc G. Fournier wrote: > On Fri, 2 Aug 2002, Andrew Sullivan wrote: > > > On Fri, Aug 02, 2002 at 10:39:47AM -0400, Tom Lane wrote: > > > > > > Actually, plpgsql is pretty expensive too. The thing to be benchmarking > > > is applications of plain old built-in-C functions and operators. > > > > I thought part of the justification for this was for the OpenACS > > guys; don't they write everything in TCL? > > Nope, the OpenACS stuff relies on plpgsql functions ... the 'TCL' is the > web pages themselves, vs using something like PHP ... I may be wrong, but > I do not believe any of the functions are in TCL ... > Nope, some are written in Tcl. Most are in plpgsql, mainly I believe so that the port between Oracle and PG is easier. --brett > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) -- Brett Schwarz brett_schwarz AT yahoo.com
>>>>> "Marc" == Marc G Fournier <scrappy@hub.org> writes: Marc> On Fri, 2 Aug 2002, Andrew Sullivan wrote: >> On Fri, Aug 02, 2002 at 10:39:47AM -0400, Tom Lane wrote: > > >> Actually, plpgsql is pretty expensive too. The thing to be >> benchmarking > is applications of plain old built-in-C >> functions and operators. >> >> I thought part of the justification for this was for the >> OpenACSguys; don't they write everything in TCL? Marc> Nope, the OpenACS stuff relies on plpgsql functions ... the Marc> 'TCL' is the web pages themselves, vs usingsomething like Marc> PHP ... I may be wrong, but I do not believe any of the Marc> functions are in TCL ... That's true. We have intentionally avoided adding pl/tcl functions into the db. The postgresql db stuff relies extensively on plpgsql, while the oracle db stuff relies on pl/sql to provide equivalent functionality. On the other hand, all of the web server stuff is implemented on Aolserver which uses Tcl as a scripting language. Regards, Dan
On Fri, Aug 02, 2002 at 12:35:10PM -0400, Daniel Wickstrom wrote: > > On the other hand, all of the web server stuff is implemented on Aolserver > which uses Tcl as a scripting language. I think this is why I was confused. Thanks, all. A -- ---- Andrew Sullivan 87 Mowat Avenue Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M6K 3E3 +1 416 646 3304 x110
Andrew Sullivan <andrew@libertyrms.info> writes: > On Fri, Aug 02, 2002 at 10:39:47AM -0400, Tom Lane wrote: >> Actually, plpgsql is pretty expensive too. The thing to be benchmarking >> is applications of plain old built-in-C functions and operators. > I thought part of the justification for this was for the OpenACS > guys; don't they write everything in TCL? Not relevant. The concern about increasing FUNC_MAX_ARGS is the overhead it might add to existing functions that don't need any more arguments. Worst case for that (percentagewise) will be small built-in functions, like say int4add. regards, tom lane
Thomas Lockhart wrote: > > > With FUNC_MAX_ARGS=16: > > > (average = 28.6 seconds) > > > With FUNC_MAX_ARGS=32: > > > (average = 29.15 seconds) > > That is almost a 2 percent cost. Shall we challenge someone to get us > back 2 percent from somewhere before the 7.3 release? Optimizing a hot > spot might do it... I wasn't terribly concerned because this wasn't a 2% on normal workload test, it was a 2% bang on function calls as fast as you can test. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
... > I wasn't terribly concerned because this wasn't a 2% on normal workload > test, it was a 2% bang on function calls as fast as you can test. Yeah, good point. But if we can get it back somehow that would be even better :) - Thomas
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I wasn't terribly concerned because this wasn't a 2% on normal workload > test, it was a 2% bang on function calls as fast as you can test. No, it was a 2% hit on rather slow functions with only one call made per query issued by the client. This is not much of a stress test. A more impressive comparison would be select 2+2+2+2+2+2+ ... (iterate 10000 times or so) and see how much that slows down. regards, tom lane
Tom Lane wrote:
> No, it was a 2% hit on rather slow functions with only one call made
> per query issued by the client. This is not much of a stress test.
>
> A more impressive comparison would be
>
> select 2+2+2+2+2+2+ ... (iterate 10000 times or so)
>
> and see how much that slows down.

I ran a crude test as follows (using a PHP script on the same machine; nothing else was going on at the same time):

  do 100 times
      select 2+2+2+2+2+2+ ... iterated 9901 times

repeated for each of:

  #define INDEX_MAX_KEYS 16, 32, 64, & 128
  #define FUNC_MAX_ARGS INDEX_MAX_KEYS
  make all; make install; initdb

The results were as follows:

  INDEX_MAX_KEYS    |  16 |  32 |  64 | 128
  ------------------+-----+-----+-----+-----
  Time in seconds   |  48 |  49 |  51 |  55

Joe
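(As an aside, the long expression does not have to come from an external PHP script; on a modern psql it can be generated and run in place, as sketched below. repeat() and \gexec are assumptions here -- much newer than the servers being benchmarked -- so this is purely illustrative.)

    -- Build "SELECT 2+2+...+2" with 9901 terms and execute it:
    SELECT 'SELECT 2' || repeat('+2', 9900) \gexec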
Joe Conway <mail@joeconway.com> writes: > I ran a crude test as follows (using a PHP script on the same machine. > Nothing else going on at the same time): > do 100 times > select 2+2+2+2+2+2+ ... iterated 9901 times > The results were as follows: > INDEX_MAX_KEYS 16 32 64 128 > -----+-------+------+-------- > Time in seconds 48 49 51 55 Okay, that seems like a good basic test. Did you happen to make any notes about the disk space occupied by the database? One thing I was worried about was the bloat that'd occur in pg_proc, pg_index, and pg_proc_proname_args_nsp_index. Aside from costing disk space, this would indirectly slow things down due to more I/O to read these tables --- an effect that probably your test couldn't measure, since it wasn't touching very many entries in any of those tables. Looks like we could go for 32 without much problem, though. regards, tom lane
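(For watching just those catalogs from inside psql rather than du-ing the whole data directory, something like the sketch below works. relpages is only an estimate that VACUUM/ANALYZE keeps current, and the 8192 multiplier assumes the default block size.)

    SELECT relname, relpages, relpages * 8192 AS approx_bytes
    FROM pg_class
    WHERE relname IN ('pg_proc', 'pg_index', 'pg_proc_proname_args_nsp_index');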
Tom Lane wrote: > Did you happen to make any notes about the disk space occupied by the > database? One thing I was worried about was the bloat that'd occur > in pg_proc, pg_index, and pg_proc_proname_args_nsp_index. Aside from > costing disk space, this would indirectly slow things down due to more > I/O to read these tables --- an effect that probably your test couldn't > measure, since it wasn't touching very many entries in any of those > tables. No, but it's easy enough to repeat. I'll do that today sometime. Joe
On Sat, 2002-08-03 at 18:41, Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > > I ran a crude test as follows (using a PHP script on the same machine. > > Nothing else going on at the same time): > > > do 100 times > > select 2+2+2+2+2+2+ ... iterated 9901 times > > > The results were as follows: > > INDEX_MAX_KEYS 16 32 64 128 > > -----+-------+------+-------- > > Time in seconds 48 49 51 55 > > Okay, that seems like a good basic test. > > Did you happen to make any notes about the disk space occupied by the > database? One thing I was worried about was the bloat that'd occur > in pg_proc, pg_index, and pg_proc_proname_args_nsp_index. Aside from > costing disk space, this would indirectly slow things down due to more > I/O to read these tables --- an effect that probably your test couldn't > measure, since it wasn't touching very many entries in any of those > tables. How hard would it be to change pg_proc.proargtypes from oidvector to _oid and hope that toasting will take care of the rest ? This could also get the requested 2% speedup, not to mention the potential for up to 64K arguments ;) --------------- Hannu
Hannu Krosing <hannu@tm.ee> writes: > How hard would it be to change pg_proc.proargtypes from oidvector to _oid Lack of btree index support for _oid would be the first hurdle. Even if we wanted to do that work, there'd be some serious breakage of client queries because of the historical differences in output format and subscripting. (oidvector indexes from 0, _oid from 1. Which is pretty bogus, but if the regression tests are anything to judge by there are probably a lot of queries out there that know this.) > This could also get the requested 2% speedup, I'm not convinced that _oid would be faster. All in all, it doesn't seem worth the trouble compared to just kicking FUNC_MAX_ARGS up a notch. At least not right now. I think we've created quite enough system-catalog changes for one release cycle ;-) regards, tom lane
Tom Lane wrote: > Did you happen to make any notes about the disk space occupied by the > database? One thing I was worried about was the bloat that'd occur > in pg_proc, pg_index, and pg_proc_proname_args_nsp_index. Aside from > costing disk space, this would indirectly slow things down due to more > I/O to read these tables --- an effect that probably your test couldn't > measure, since it wasn't touching very many entries in any of those > tables. #define INDEX_MAX_KEYS 16 #define FUNC_MAX_ARGS INDEX_MAX_KEYS du -h --max-depth=1 /opt/data/pgsql/data/base/ 2.7M /opt/data/pgsql/data/base/1 2.7M /opt/data/pgsql/data/base/16862 2.7M /opt/data/pgsql/data/base/16863 2.7M /opt/data/pgsql/data/base/16864 3.2M /opt/data/pgsql/data/base/16865 2.7M /opt/data/pgsql/data/base/16866 17M /opt/data/pgsql/data/base #define INDEX_MAX_KEYS 32 #define FUNC_MAX_ARGS INDEX_MAX_KEYS du -h --max-depth=1 /opt/data/pgsql/data/base/ 3.1M /opt/data/pgsql/data/base/1 3.1M /opt/data/pgsql/data/base/16862 3.1M /opt/data/pgsql/data/base/16863 3.1M /opt/data/pgsql/data/base/16864 3.6M /opt/data/pgsql/data/base/16865 3.1M /opt/data/pgsql/data/base/16866 19M /opt/data/pgsql/data/base #define INDEX_MAX_KEYS 64 #define FUNC_MAX_ARGS INDEX_MAX_KEYS du -h --max-depth=1 /opt/data/pgsql/data/base/ 3.9M /opt/data/pgsql/data/base/1 3.9M /opt/data/pgsql/data/base/16862 3.9M /opt/data/pgsql/data/base/16863 3.9M /opt/data/pgsql/data/base/16864 4.4M /opt/data/pgsql/data/base/16865 3.9M /opt/data/pgsql/data/base/16866 24M /opt/data/pgsql/data/base #define INDEX_MAX_KEYS 128 #define FUNC_MAX_ARGS INDEX_MAX_KEYS du -h --max-depth=1 /opt/data/pgsql/data/base/ 5.7M /opt/data/pgsql/data/base/1 5.7M /opt/data/pgsql/data/base/16862 5.7M /opt/data/pgsql/data/base/16863 5.7M /opt/data/pgsql/data/base/16864 6.3M /opt/data/pgsql/data/base/16865 5.7M /opt/data/pgsql/data/base/16866 35M /opt/data/pgsql/data/base Joe
On Sat, 2002-08-03 at 23:20, Tom Lane wrote: > Hannu Krosing <hannu@tm.ee> writes: > > How hard would it be to change pg_proc.proargtypes from oidvector to _oid > > Lack of btree index support for _oid would be the first hurdle. Is that index really needed, or is it there just to enforce uniqueness ? Would the lookup not be in some internal cache most of the time ? Also, (imho ;) btree index support should be done for all array types which have comparison ops for elements at once (with semantics similar to string) not one by one for individual types. It should be in some ways quite similar to multi-key indexes, so perhaps some code could be borrowed from there. Otoh, it should be a SMOP to write support for b-tree indexes just for _oid :-p , most likely one could re-use code from oidvector ;) > Even if we wanted to do that work, there'd be some serious breakage > of client queries because of the historical differences in output format > and subscripting. (oidvector indexes from 0, _oid from 1. Which is > pretty bogus, but if the regression tests are anything to judge by there > are probably a lot of queries out there that know this.) I would guess that oidvector is a sufficiently obscure type and that nobody actually uses oidvector for user tables. It is also only used in two tables and one index in system tables:

hannu=# select relname,relkind from pg_class where oid in (
hannu-#   select attrelid from pg_attribute where atttypid=30);
             relname             | relkind
---------------------------------+---------
 pg_index                        | r
 pg_proc_proname_narg_type_index | i
 pg_proc                         | r
(3 rows)

> > This could also get the requested 2% speedup, > > I'm not convinced that _oid would be faster. Neither am I, but it _may_ be that having generally shorter oid arrays wins us enough ;) > All in all, it doesn't seem worth the trouble compared to just kicking > FUNC_MAX_ARGS up a notch. At least not right now. I think we've > created quite enough system-catalog changes for one release cycle ;-) But going to _oid will free us from arbitrary limits on argument count. Or at least from small arbitrary limits, as there will probably still be the at-least-three-btree-keys-must-fit-in-page limit (makes > 2600 args/function) and maybe some other internal limits as well. ------------------ Hannu
Hannu Krosing wrote: > On Sat, 2002-08-03 at 23:20, Tom Lane wrote: > > Hannu Krosing <hannu@tm.ee> writes: > > > How hard would it be to change pg_proc.proargtypes from oidvector to _oid > > > > Lack of btree index support for _oid would be the first hurdle. > > Is that index really needed, or is it there just to enforce uniqueness ? Needed to look up functions based on their args. The big issue of using arrays is that we don't have cache capability for variable length fields. Until we get that, we are stuck with NAMEDATALEN taking the full length, and oidvector taking the full length. And if we went with variable length, there may be a performance penalty. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
OK, time to get moving folks. Looks like the increase in the function args to 32 and the NAMEDATALEN to 128 has been sufficiently tested. Tom has some ideas on removing some memset() calls for function args to speed things up, but we don't have to wait for that go get going. The end of August is nearing. Is there any reason to delay the change further? --------------------------------------------------------------------------- Joe Conway wrote: > Tom Lane wrote: > > Did you happen to make any notes about the disk space occupied by the > > database? One thing I was worried about was the bloat that'd occur > > in pg_proc, pg_index, and pg_proc_proname_args_nsp_index. Aside from > > costing disk space, this would indirectly slow things down due to more > > I/O to read these tables --- an effect that probably your test couldn't > > measure, since it wasn't touching very many entries in any of those > > tables. > > #define INDEX_MAX_KEYS 16 > #define FUNC_MAX_ARGS INDEX_MAX_KEYS > du -h --max-depth=1 /opt/data/pgsql/data/base/ > 2.7M /opt/data/pgsql/data/base/1 > 2.7M /opt/data/pgsql/data/base/16862 > 2.7M /opt/data/pgsql/data/base/16863 > 2.7M /opt/data/pgsql/data/base/16864 > 3.2M /opt/data/pgsql/data/base/16865 > 2.7M /opt/data/pgsql/data/base/16866 > 17M /opt/data/pgsql/data/base > > #define INDEX_MAX_KEYS 32 > #define FUNC_MAX_ARGS INDEX_MAX_KEYS > du -h --max-depth=1 /opt/data/pgsql/data/base/ > 3.1M /opt/data/pgsql/data/base/1 > 3.1M /opt/data/pgsql/data/base/16862 > 3.1M /opt/data/pgsql/data/base/16863 > 3.1M /opt/data/pgsql/data/base/16864 > 3.6M /opt/data/pgsql/data/base/16865 > 3.1M /opt/data/pgsql/data/base/16866 > 19M /opt/data/pgsql/data/base > > #define INDEX_MAX_KEYS 64 > #define FUNC_MAX_ARGS INDEX_MAX_KEYS > du -h --max-depth=1 /opt/data/pgsql/data/base/ > 3.9M /opt/data/pgsql/data/base/1 > 3.9M /opt/data/pgsql/data/base/16862 > 3.9M /opt/data/pgsql/data/base/16863 > 3.9M /opt/data/pgsql/data/base/16864 > 4.4M /opt/data/pgsql/data/base/16865 > 3.9M /opt/data/pgsql/data/base/16866 > 24M /opt/data/pgsql/data/base > > #define INDEX_MAX_KEYS 128 > #define FUNC_MAX_ARGS INDEX_MAX_KEYS > du -h --max-depth=1 /opt/data/pgsql/data/base/ > 5.7M /opt/data/pgsql/data/base/1 > 5.7M /opt/data/pgsql/data/base/16862 > 5.7M /opt/data/pgsql/data/base/16863 > 5.7M /opt/data/pgsql/data/base/16864 > 6.3M /opt/data/pgsql/data/base/16865 > 5.7M /opt/data/pgsql/data/base/16866 > 35M /opt/data/pgsql/data/base > > > Joe > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Hannu Krosing <hannu@tm.ee> writes: >> Lack of btree index support for _oid would be the first hurdle. > Is that index really needed, or is it there just to enforce uniqueness ? Both. > Also, (imho ;) btree index support should be done for all array types > which have comparison ops for elements at once (with semantics similar > to string) not one by one for individual types. Fine, send a patch ;-) >> Even if we wanted to do that work, there'd be some serious breakage >> of client queries because of the historical differences in output format >> and subscripting. (oidvector indexes from 0, _oid from 1. Which is >> pretty bogus, but if the regression tests are anything to judge by there >> are probably a lot of queries out there that know this.) > I would guess that oidvector is sufficiently obscure type and that > nobody actually uses oidvector for user tables. No, you miss my point: client queries that do subscripting on proargtypes will break. Since the regression tests find this a useful thing to do, I suspect there are clients out there that do too. > But going to _oid will free us from arbitrary limits on argument count. I didn't say it wouldn't be a good idea in the long run. I'm saying I don't think it's happening for 7.3, given that Aug 31 is not that far away anymore and that a lot of cleanup work remains undone on other already-committed features. FUNC_MAX_ARGS=32 could happen for 7.3, though. regards, tom lane
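To make the breakage concrete: a hypothetical client query of the kind Tom means is sketched below. With oidvector the first argument type is element 0; were proargtypes a plain oid[], the same data would be numbered from 1, silently shifting every such subscript.

    -- Hypothetical client query, written against today's 0-based oidvector:
    SELECT proname, proargtypes[0] AS first_arg_type
    FROM pg_proc
    WHERE pronargs > 0;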
Bruce Momjian <pgman@candle.pha.pa.us> writes: > OK, time to get moving folks. Looks like the increase in the function > args to 32 and the NAMEDATALEN to 128 has been sufficiently tested. I'm convinced by Joe's numbers that FUNC_MAX_ARGS = 32 shouldn't hurt too much. But have we done equivalent checks on NAMEDATALEN? In particular, do we know what it does to the size of template1? regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > OK, time to get moving folks. Looks like the increase in the function > > args to 32 and the NAMEDATALEN to 128 has been sufficiently tested. > > I'm convinced by Joe's numbers that FUNC_MAX_ARGS = 32 shouldn't hurt > too much. But have we done equivalent checks on NAMEDATALEN? In > particular, do we know what it does to the size of template1? No, I thought we saw the number, was 30%? No, we did a test for 64. Can someone get us that number for 128? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > Tom Lane wrote: >>I'm convinced by Joe's numbers that FUNC_MAX_ARGS = 32 shouldn't hurt >>too much. But have we done equivalent checks on NAMEDATALEN? In >>particular, do we know what it does to the size of template1? > No, I thought we saw the number, was 30%? No, we did a test for 64. > Can someone get us that number for 128? > I'll do 32, 64, and 128 and report back on template1 size. Joe
Bruce Momjian wrote: > Tom Lane wrote: > >>Bruce Momjian <pgman@candle.pha.pa.us> writes: >> >>>OK, time to get moving folks. Looks like the increase in the function >>>args to 32 and the NAMEDATALEN to 128 has been sufficiently tested. >> >>I'm convinced by Joe's numbers that FUNC_MAX_ARGS = 32 shouldn't hurt >>too much. But have we done equivalent checks on NAMEDATALEN? In >>particular, do we know what it does to the size of template1? > > > No, I thought we saw the number, was 30%? No, we did a test for 64. > Can someone get us that number for 128? > These are all with FUNC_MAX_ARGS = 16. #define NAMEDATALEN 32 du -h --max-depth=1 /opt/data/pgsql/data/base/ 2.7M /opt/data/pgsql/data/base/1 2.7M /opt/data/pgsql/data/base/16862 2.7M /opt/data/pgsql/data/base/16863 2.7M /opt/data/pgsql/data/base/16864 3.2M /opt/data/pgsql/data/base/16865 2.7M /opt/data/pgsql/data/base/16866 2.7M /opt/data/pgsql/data/base/17117 19M /opt/data/pgsql/data/base #define NAMEDATALEN 64 du -h --max-depth=1 /opt/data/pgsql/data/base/ 3.0M /opt/data/pgsql/data/base/1 3.0M /opt/data/pgsql/data/base/16863 3.0M /opt/data/pgsql/data/base/16864 3.0M /opt/data/pgsql/data/base/16865 3.5M /opt/data/pgsql/data/base/16866 3.0M /opt/data/pgsql/data/base/16867 19M /opt/data/pgsql/data/base #define NAMEDATALEN 128 du -h --max-depth=1 /opt/data/pgsql/data/base/ 3.8M /opt/data/pgsql/data/base/1 3.8M /opt/data/pgsql/data/base/16863 3.8M /opt/data/pgsql/data/base/16864 3.8M /opt/data/pgsql/data/base/16865 4.4M /opt/data/pgsql/data/base/16866 3.8M /opt/data/pgsql/data/base/16867 23M /opt/data/pgsql/data/base Joe
Joe Conway wrote: > These are all with FUNC_MAX_ARGS = 16. > > #define NAMEDATALEN 32 > du -h --max-depth=1 /opt/data/pgsql/data/base/ > 2.7M /opt/data/pgsql/data/base/1 > 2.7M /opt/data/pgsql/data/base/16862 > 2.7M /opt/data/pgsql/data/base/16863 > 2.7M /opt/data/pgsql/data/base/16864 > 3.2M /opt/data/pgsql/data/base/16865 > 2.7M /opt/data/pgsql/data/base/16866 > 2.7M /opt/data/pgsql/data/base/17117 > 19M /opt/data/pgsql/data/base > FWIW, this is FUNC_MAX_ARGS = 32 *and* NAMEDATALEN 128 du -h --max-depth=1 /opt/data/pgsql/data/base/ 4.1M /opt/data/pgsql/data/base/1 4.1M /opt/data/pgsql/data/base/16863 4.1M /opt/data/pgsql/data/base/16864 4.1M /opt/data/pgsql/data/base/16865 4.8M /opt/data/pgsql/data/base/16866 4.1M /opt/data/pgsql/data/base/16867 26M /opt/data/pgsql/data/base Joe
Joe Conway <mail@joeconway.com> writes: > These are all with FUNC_MAX_ARGS = 16. > #define NAMEDATALEN 32 > 2.7M /opt/data/pgsql/data/base/1 > #define NAMEDATALEN 64 > 3.0M /opt/data/pgsql/data/base/1 > #define NAMEDATALEN 128 > 3.8M /opt/data/pgsql/data/base/1 Based on Joe's numbers, I'm kind of thinking that we should go for FUNC_MAX_ARGS=32 and NAMEDATALEN=64 as defaults in 7.3. Although NAMEDATALEN=128 would be needed for full SQL compliance, the space penalty seems severe. I'm thinking we should back off until someone wants to do the legwork needed to make the name type be truly variable-length. Comments? regards, tom lane
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > >>These are all with FUNC_MAX_ARGS = 16. > > >>#define NAMEDATALEN 32 >>2.7M /opt/data/pgsql/data/base/1 > > >>#define NAMEDATALEN 64 >>3.0M /opt/data/pgsql/data/base/1 > > >>#define NAMEDATALEN 128 >>3.8M /opt/data/pgsql/data/base/1 > > > Based on Joe's numbers, I'm kind of thinking that we should go for > FUNC_MAX_ARGS=32 and NAMEDATALEN=64 as defaults in 7.3. > > Although NAMEDATALEN=128 would be needed for full SQL compliance, > the space penalty seems severe. I'm thinking we should back off > until someone wants to do the legwork needed to make the name type > be truly variable-length.

FWIW, I reran the speed benchmark (select 2+2+2...) with FUNC_MAX_ARGS=32 and NAMEDATALEN=128 and still got 49 seconds, i.e. NAMEDATALEN=128 didn't impact performance of that particular test. The results were as follows:

  INDEX_MAX_KEYS    |  16 |  32 |  64 | 128
  ------------------+-----+-----+-----+-----
  Time in seconds   |  48 |  49 |  51 |  55
                           ^^
                           (rerun with NAMEDATALEN=128: same result)

What will the impact be on a medium to large production database? In other words, is the bloat strictly limited to the system catalogs, based on how extensive your database schema (bad choice of words now, but I don't know a better term for this) is? Or will the bloat scale with the size of the database including data?

Joe
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > > These are all with FUNC_MAX_ARGS = 16. > > > #define NAMEDATALEN 32 > > 2.7M /opt/data/pgsql/data/base/1 > > > #define NAMEDATALEN 64 > > 3.0M /opt/data/pgsql/data/base/1 > > > #define NAMEDATALEN 128 > > 3.8M /opt/data/pgsql/data/base/1 > > Based on Joe's numbers, I'm kind of thinking that we should go for > FUNC_MAX_ARGS=32 and NAMEDATALEN=64 as defaults in 7.3. > > Although NAMEDATALEN=128 would be needed for full SQL compliance, > the space penalty seems severe. I'm thinking we should back off > until someone wants to do the legwork needed to make the name type > be truly variable-length. I prefer 64 for NAMEDATALEN myself. Standards compliance is nice, but realistically it seems a shame to waste so much space on an excessive length that will never be used. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > I prefer 64 for NAMEDATALEN myself. Standards compliance is nice, but > realistically it seems a shame to waste so much space on an excessive > length that will never be used. > But is the space wasted really never more than a few MB's, even if the database itself is say 1 GB? If so, and if the speed penalty is small to non-existent, I'd rather be spec compliant. That way nobody has a good basis for complaining ;-) I guess I'll try another test with a larger data-set. Joe
Joe Conway wrote: > But is the space wasted really never more than a few MB's, even if the > database itself is say 1 GB? If so, and if the speed penalty is small to > non-existent, I'd rather be spec compliant. That way nobody has a good > basis for complaining ;-) > > I guess I'll try another test with a larger data-set. > Starting with pg_dumpall file at 138M. #define INDEX_MAX_KEYS 16 #define FUNC_MAX_ARGS INDEX_MAX_KEYS #define NAMEDATALEN 32 du -h --max-depth=1 /opt/data/pgsql/data/base/ 2.7M /opt/data/pgsql/data/base/1 2.7M /opt/data/pgsql/data/base/16862 119M /opt/data/pgsql/data/base/16863 3.1M /opt/data/pgsql/data/base/696496 3.1M /opt/data/pgsql/data/base/696623 3.1M /opt/data/pgsql/data/base/696750 2.8M /opt/data/pgsql/data/base/696877 2.8M /opt/data/pgsql/data/base/696889 2.8M /opt/data/pgsql/data/base/696901 2.8M /opt/data/pgsql/data/base/696912 18M /opt/data/pgsql/data/base/696924 3.0M /opt/data/pgsql/data/base/878966 2.7M /opt/data/pgsql/data/base/881056 2.7M /opt/data/pgsql/data/base/881075 2.8M /opt/data/pgsql/data/base/881078 3.1M /opt/data/pgsql/data/base/881093 3.1M /opt/data/pgsql/data/base/881225 2.8M /opt/data/pgsql/data/base/881604 3.3M /opt/data/pgsql/data/base/881620 31M /opt/data/pgsql/data/base/881807 31M /opt/data/pgsql/data/base/1031939 32M /opt/data/pgsql/data/base/1181250 31M /opt/data/pgsql/data/base/1332676 309M /opt/data/pgsql/data/base #define INDEX_MAX_KEYS 32 #define FUNC_MAX_ARGS INDEX_MAX_KEYS #define NAMEDATALEN 128 du -h --max-depth=1 /opt/data/pgsql/data/base/ 4.1M /opt/data/pgsql/data/base/1 4.1M /opt/data/pgsql/data/base/16863 121M /opt/data/pgsql/data/base/16864 4.6M /opt/data/pgsql/data/base/696497 4.6M /opt/data/pgsql/data/base/696624 4.6M /opt/data/pgsql/data/base/696751 4.2M /opt/data/pgsql/data/base/696878 4.2M /opt/data/pgsql/data/base/696890 4.2M /opt/data/pgsql/data/base/696902 4.2M /opt/data/pgsql/data/base/696913 20M /opt/data/pgsql/data/base/696925 4.5M /opt/data/pgsql/data/base/878967 4.2M /opt/data/pgsql/data/base/881057 4.1M /opt/data/pgsql/data/base/881076 4.2M /opt/data/pgsql/data/base/881079 4.7M /opt/data/pgsql/data/base/881094 4.7M /opt/data/pgsql/data/base/881226 4.2M /opt/data/pgsql/data/base/881605 4.9M /opt/data/pgsql/data/base/881621 33M /opt/data/pgsql/data/base/881808 33M /opt/data/pgsql/data/base/1031940 33M /opt/data/pgsql/data/base/1181251 33M /opt/data/pgsql/data/base/1332677 343M /opt/data/pgsql/data/base So the 119MB database only grows to 121MB. In fact, each of the > 10MB databases seems to grow only about 2MB. Based on this, I'd go with: #define INDEX_MAX_KEYS 32 #define FUNC_MAX_ARGS INDEX_MAX_KEYS #define NAMEDATALEN 128 and take spec compliance. Joe
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: 05 August 2002 04:56 > To: Joe Conway > Cc: Bruce Momjian; Thomas Lockhart; Neil Conway; PostgreSQL Hackers > Subject: Re: [HACKERS] FUNC_MAX_ARGS benchmarks > > > Joe Conway <mail@joeconway.com> writes: > > These are all with FUNC_MAX_ARGS = 16. > > > #define NAMEDATALEN 32 > > 2.7M /opt/data/pgsql/data/base/1 > > > #define NAMEDATALEN 64 > > 3.0M /opt/data/pgsql/data/base/1 > > > #define NAMEDATALEN 128 > > 3.8M /opt/data/pgsql/data/base/1 > > Based on Joe's numbers, I'm kind of thinking that we should > go for FUNC_MAX_ARGS=32 and NAMEDATALEN=64 as defaults in 7.3. > > Although NAMEDATALEN=128 would be needed for full SQL > compliance, the space penalty seems severe. I'm thinking we > should back off until someone wants to do the legwork needed > to make the name type be truly variable-length. > > Comments? In Joe's last test he had only about 2Mb growth per db (I guess this would not be the case had he used the name type in some of his tables). I would rather lose a measly few Mb and be standards compliant myself. $0.02 Regards, Dave.
Joe Conway <mail@joeconway.com> writes: > Tom Lane wrote: >> Although NAMEDATALEN=128 would be needed for full SQL compliance, >> the space penalty seems severe. > What will the impact be on a medium to large production database? In > other words, is the bloat strictly to the system catalogs based on how > extensive your database schema (bad choice of words now, but I don't > know a better term for this) is? Or will the bloat scale with the size > of the database including data? The bloat would scale with the size of your schema, not with the amount of data in your tables (unless you have "name" columns in your user tables, which is something we've always discouraged). template1 is clearly a worst-case scenario, percentagewise, for NAMEDATALEN. I'm quite prepared to believe that the net cost is "a couple megs per database" more or less independent of how much data you store. Maybe that's negligible these days, or maybe it isn't ... regards, tom lane
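The "unless you have name columns in user tables" caveat is easy to see for yourself: a name value always occupies the full NAMEDATALEN bytes no matter how short the identifier, while text stores only what it needs. The pg_column_size() call below is an assumption -- it belongs to later releases than the ones being discussed -- but the storage behaviour it reports is exactly this point.

    -- Sketch: compare per-value storage of name vs. text (table name is made up).
    CREATE TABLE name_demo (n name, t text);
    INSERT INTO name_demo VALUES ('x', 'x');
    SELECT pg_column_size(n) AS name_bytes,   -- the full NAMEDATALEN, e.g. 64
           pg_column_size(t) AS text_bytes    -- just a few bytes
    FROM name_demo;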
Tom Lane wrote: > The bloat would scale with the size of your schema, not with the amount > of data in your tables (unless you have "name" columns in your user > tables, which is something we've always discouraged). template1 is > clearly a worst-case scenario, percentagewise, for NAMEDATALEN. > > I'm quite prepared to believe that the net cost is "a couple megs per > database" more or less independent of how much data you store. Maybe > that's negligible these days, or maybe it isn't ... Seems to me it's negligible for the vast majority of applications. I *know* it is for any appplication that I have. We can always tell people who are doing embedded application work to bump *down* NAMEDATALEN. Joe
Joe Conway <mail@joeconway.com> writes: > We can always tell people who are doing embedded application work to > bump *down* NAMEDATALEN. Good point. Okay, I'm OK with 128 ... regards, tom lane
Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > > We can always tell people who are doing embedded application work to > > bump *down* NAMEDATALEN. > > Good point. Okay, I'm OK with 128 ... Yes, good point. I think the major issue is pushing stuff out of the cache because we have longer names. Did we see a performance hit at 128? It seems it's more than just disk space. I don't have trouble with 128, but other than standards compliance, I can't see many people getting >64 names. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I don't have trouble with 128, but other than standards compliance, I > can't see many people getting >64 names. One nice thing about 128 is you can basically forget about the weird truncation behavior on generated sequence names for serial columns --- "tablename_colname_seq" will be correct for essentially all practical cases. At 64 you might still need to think about it. regards, tom lane
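A sketch of the truncation behaviour Tom is referring to, using made-up identifiers: the implicit sequence name is table name + "_" + column name + "_seq", squeezed to fit within NAMEDATALEN, so sufficiently long names can still defeat the obvious guess at 64 while fitting easily at 128.

    -- Hypothetical example: a 32-character table name and a 27-character column name.
    CREATE TABLE very_long_measurement_table_name (
        very_long_identifier_column serial PRIMARY KEY
    );
    -- The "obvious" sequence name is 64 characters:
    --   very_long_measurement_table_name_very_long_identifier_column_seq
    -- With NAMEDATALEN = 64 (63 usable bytes) it gets shortened; with 128 it fits as-is.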
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > I don't have trouble with 128, but other than standards compliance, I > > can't see many people getting >64 names. > > One nice thing about 128 is you can basically forget about the weird > truncation behavior on generated sequence names for serial columns > --- "tablename_colname_seq" will be correct for essentially all > practical cases. At 64 you might still need to think about it. Oh, good point. Does anyone remember the performance hit for 64 vs 128 namedatalen? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Joe Conway writes: > I'd rather be spec compliant. That way nobody has a good basis for > complaining ;-) How long until someone figures out that to be spec-compliant you need NAMEDATALEN 129? ;-) -- Peter Eisentraut peter_e@gmx.net
Well, in fact it's not just a question of disk space.

The following numbers are stats for total elapsed time of "make installcheck" over ten trials:

NAMEDATALEN = 32, FUNC_MAX_ARGS = 16

  min  |  max  |  avg   |      stddev
-------+-------+--------+-------------------
 25.59 | 27.61 | 26.612 | 0.637003401351409

NAMEDATALEN = 64, FUNC_MAX_ARGS = 32

  min  |  max  |  avg   |     stddev
-------+-------+--------+-----------------
 26.32 | 29.27 | 27.415 | 1.0337982824947

NAMEDATALEN = 128, FUNC_MAX_ARGS = 32

  min  |  max  |  avg   |      stddev
-------+-------+--------+------------------
 27.44 | 30.79 | 29.603 | 1.26148105195622

I'm not sure about the trend of increasing standard deviation --- that may reflect more disk I/O being done, and perhaps more checkpoints occurring during the test. But in any case it's clear that there's a nontrivial runtime cost here. Does a 10% slowdown bother you?

regards, tom lane
Tom Lane wrote: > Well, in fact it's not just a question of disk space. > > The following numbers are stats for total elapsed time of "make > installcheck" over ten trials: > <snip> > I'm not sure about the trend of increasing standard deviation --- that > may reflect more disk I/O being done, and perhaps more checkpoints > occurring during the test. But in any case it's clear that there's a > nontrivial runtime cost here. Does a 10% slowdown bother you? Hmmm -- didn't Neil do some kind of test that had different results, i.e. not much performance difference? I wonder if the large number of DDL commands in installcheck doesn't skew the results against longer NAMEDATALEN compared to other benchmarks?

# pwd
/opt/src/pgsql/src/test/regress/sql
# grep -i 'CREATE\|DROP' * | wc -l
1114

Joe
Joe Conway <mail@joeconway.com> writes: >> I'm not sure about the trend of increasing standard deviation --- that >> may reflect more disk I/O being done, and perhaps more checkpoints >> occurring during the test. But in any case it's clear that there's a >> nontrivial runtime cost here. Does a 10% slowdown bother you? > Hmmm -- didn't Neil do some kind of test that had different results, > i.e. not much performance difference? Well, one person had reported a 10% slowdown in pgbench, but Neil saw a 10% speedup. Given the well-known difficulty of getting any reproducible numbers out of pgbench, I don't trust either number very far; but unless some other folk are willing to repeat the experiment I think we can only conclude that pgbench isn't affected much by NAMEDATALEN. > I wonder if the large number of > DDL commands in installcheck doesn't skew the results against longer > NAMEDATALEN compared to other benchmarks? Depends on what you consider skewed, I suppose. pgbench touches only a very small number of relations, and starts no new backends over the length of its run, thus everything gets cached and stays cached. At best I'd consider it an existence proof that some applications won't be hurt. Do you have another application you'd consider a more representative benchmark? regards, tom lane
Tom Lane wrote: > Depends on what you consider skewed, I suppose. pgbench touches only a > very small number of relations, and starts no new backends over the > length of its run, thus everything gets cached and stays cached. At > best I'd consider it an existence proof that some applications won't be > hurt. > > Do you have another application you'd consider a more representative > benchmark? I'm not sure. Maybe OSDB? I'll see if I can get it running over the next few days. Anyone else have other suggestions? Joe
> I don't have trouble with 128, but other than standards compliance, I > can't see many people getting >64 names. Don't forget that 128 is for *bytes*, not for characters (this is still true with 7.3). In CJK (Chinese, Japanese and Korean), a single character can eat up to 3 bytes if the encoding is UTF-8. -- Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes: >> I don't have trouble with 128, but other than standards compliance, I >> can't see many people getting >64 names. > Don't forget that 128 is for *bytes*, not for characters(this is still > ture with 7.3). In CJK(Chinese, Japanese and Korean) single character > can eat up to 3 bytes if the encoding is UTF-8. True, but in those languages a typical name would be many fewer characters than it is in Western alphabets, no? I'd guess (with no evidence though) that the effect would more or less cancel out. regards, tom lane
As long as we allocate the full length for the funcarg and name types, we are going to have performance/space issues with increasing them, especially since we are looking at doubling or quadrupling those values. You can say that the test below isn't a representative benchmark, but I am sure it is typical of _some_ of our users, so it may still be a significant test. We don't get good benchmark numbers by accident. It is this type of analysis that keeps us sharp. I think funcargs of 32 and name of 64 is the way to go for 7.3. If we find we need longer names or we find we can make them variable length, we can revisit the issue. However, variable length has a performance cost as well, so it is not certain we will ever make them variable length. --------------------------------------------------------------------------- Tom Lane wrote: > Well, in fact it's not just a question of disk space. > > The following numbers are stats for total elapsed time of "make > installcheck" over ten trials: > > NAMEDATALEN = 32, FUNC_MAX_ARGS = 16 > > min | max | avg | stddev > -------+-------+--------+------------------- > 25.59 | 27.61 | 26.612 | 0.637003401351409 > > NAMEDATALEN = 64, FUNC_MAX_ARGS = 32 > > min | max | avg | stddev > -------+-------+--------+----------------- > 26.32 | 29.27 | 27.415 | 1.0337982824947 > > NAMEDATALEN = 128, FUNC_MAX_ARGS = 32 > > min | max | avg | stddev > -------+-------+--------+------------------ > 27.44 | 30.79 | 29.603 | 1.26148105195622 > > I'm not sure about the trend of increasing standard deviation --- that > may reflect more disk I/O being done, and perhaps more checkpoints > occurring during the test. But in any case it's clear that there's a > nontrivial runtime cost here. Does a 10% slowdown bother you? > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > As long as we allocate the full length for the funcarg and name types, > we are going to have performance/space issues with increasing them, > especially since we are looking at doubling or quadrupling those values. > > You can say that the test below isn't a representative benchmark, but I > am sure it is typical of _some_ of our users, so it may still be a > significant test. We don't get good benchmark numbers by accident. It > is this type of analysis that keeps us sharp. I'm running the OSDB benchmark right now. So far the Single user test results are done, and the overall results is like this: NAMEDATALEN = 32, FUNC_MAX_ARGS = 32 "Single User Test" 2205.89 seconds (0:36:45.89) NAMEDATALEN = 128, FUNC_MAX_ARGS = 32 "Single User Test" 2256.16 seconds (0:37:36.16) So the difference in performance for this benchmark is not nearly so large, more like 2%. The multi-user portion of the second test is running right now, so I'll report final results in the morning. I might also run this on the same machine against 7.2.1 to see where we would stand in comparison to the last release. But that won't happen until tomorrow some time. Joe
Joe Conway wrote: > Bruce Momjian wrote: > >> As long as we allocate the full length for the funcarg and name types, >> we are going to have performance/space issues with increasing them, >> especially since we are looking at doubling or quadrupling those values. >> >> You can say that the test below isn't a representative benchmark, but I >> am sure it is typical of _some_ of our users, so it may still be a >> significant test. We don't get good benchmark numbers by accident. It >> is this type of analysis that keeps us sharp. > > > I'm running the OSDB benchmark right now. So far the Single user test > results are done, and the overall results is like this: > > NAMEDATALEN = 32, FUNC_MAX_ARGS = 32 > "Single User Test" 2205.89 seconds (0:36:45.89) > > NAMEDATALEN = 128, FUNC_MAX_ARGS = 32 > "Single User Test" 2256.16 seconds (0:37:36.16) > > So the difference in performance for this benchmark is not nearly so > large, more like 2%. The multi-user portion of the second test is > running right now, so I'll report final results in the morning. I might > also run this on the same machine against 7.2.1 to see where we would > stand in comparison to the last release. But that won't happen until > tomorrow some time. > Here's the multi-user test summary. Very little difference. The details of the OSDB output are attached. NAMEDATALEN = 32, FUNC_MAX_ARGS = 32 "Multi-User Test" 3403.84 seconds (0:56:43.84) NAMEDATALEN = 128, FUNC_MAX_ARGS = 32 "Multi-User Test" 3412.18 seconds (0:56:52.18) Joe
Attachment
On Tue, 6 Aug 2002, Bruce Momjian wrote: > > As long as we allocate the full length for the funcarg and name types, > we are going to have performance/space issues with increasing them, > especially since we are looking at doubling or quadrupling those values. > > [snip] > > I think funcargs of 32 and name of 64 is the way to go for 7.3. If we > find we need longer names or we find we can make them variable length, > we can revisit the issue. However, variable length has a performance > cost as well, so it is not certain we will ever make them variable > length. I was thinking of looking at turning names to varchars/text in order to test the performance hit [in the first instance]. However doing a find . -name \*\.\[ch\] | xargs grep NAMEDATALEN | wc -l gives 185 hits and some of those are setting other macros. It seems to me there is a fair amount of work involved in just getting variable length names into the system so that they can be tested. -- Nigel J. Andrews Director --- Logictree Systems Limited Computer Consultants
"Nigel J. Andrews" <nandrews@investsystems.co.uk> writes: > I was thinking of looking at turning names to varchars/text in order to test > the performance hit [in the first instance]. However doing a > find . -name \*\.\[ch\] | xargs grep NAMEDATALEN | wc -l > gives 185 hits and some of those are setting other macros. It seems to > me there is a fair amount of work involved in just getting variable > length names into the system so that they can be tested. And that is not even the tip of the iceberg. The real reason that NAME is fixed-length is so that it can be accessed as a member of a C structure. Moving NAME into the variable-length category would make it much more painful to access than it is now, and would require rearranging the field order in every system catalog that has a name field. regards, tom lane
On Tue, 2002-08-06 at 15:36, Tom Lane wrote: > "Nigel J. Andrews" <nandrews@investsystems.co.uk> writes: > > I was thinking of looking at turning names to varchars/text in order to test > > the performance hit [in the first instance]. However doing a > > find . -name \*\.\[ch\] | xargs grep NAMEDATALEN | wc -l > > gives 185 hits and some of those are setting other macros. It seems to > > me there is a fair amount of work involved in just getting variable > > length names into the system so that they can be tested. > > And that is not even the tip of the iceberg. The real reason that NAME > is fixed-length is so that it can be accessed as a member of a C > structure. I'm not pretending to know anything about it, but can't this be made into a pointer that is accessed as a member of a C structure. This should not need rearranging the field order. And if we were lucky enough, then change from char[32] to char* will be invisible for most places that use it. > Moving NAME into the variable-length category would make it > much more painful to access than it is now, and would require > rearranging the field order in every system catalog that has a name field. From what I remember the main concern was lack of support for varlen types in cache manager (whatever it means) ? --------------- Hannu
Hannu Krosing <hannu@tm.ee> writes: > I'm not pretending to know anything about it, but can't this be made > into a pointer that is accessed as a member of a C structure. This > should not need rearranging the field order. You can't store pointers on disk. At least not usefully. > From what I remember the main concern was lack of support for varlen > types in cache manager (whatever it means) ? That would be a localized fix; I'm not very worried about it. A system-wide change in notation for getting at NAMEs would be quite painful, though. regards, tom lane
Hi, > Here's the multi-user test summary. Very little difference. The details > of the OSDB output are attached. > > NAMEDATALEN = 32, FUNC_MAX_ARGS = 32 > "Multi-User Test" 3403.84 seconds (0:56:43.84) > > NAMEDATALEN = 128, FUNC_MAX_ARGS = 32 > "Multi-User Test" 3412.18 seconds (0:56:52.18) Seeing these numbers, I would definitely vote for standards-compliance with NAMEDATALEN = 128. 8.34 seconds more on 3400 seconds is just a 0.25% increase. Sander.
Joe Conway writes: > Here's the multi-user test summary. Very little difference. The details > of the OSDB output are attached. The fact that the OSDB benchmark has just about the least possible test coverage of identifier handling and you still get a 2% performance drop is something I'm concerned about. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut wrote: > Joe Conway writes: >>Here's the multi-user test summary. Very little difference. The details >>of the OSDB output are attached. > > The fact that the OSDB benchmark has just about the least possible test > coverage of identifier handling and you still get a 2% performance drop is > something I'm concerned about. > Of course that's on the single user test only. In the multi-user test the two are neck-and-neck. If you really want to be concerned, see the attached. This lines up results from: REL7_2_STABLE with NAMEDATALEN = 32 and FUNC_MAX_ARGS = 16 7.3devel with NAMEDATALEN = 32 and FUNC_MAX_ARGS = 32 7.3devel with NAMEDATALEN = 128 and FUNC_MAX_ARGS = 32 In the single-user test, REL7_2_STABLE is best by about 10%. But in the multi-user test (10 users), *both* 7.3devel tests are about 3.5% faster than 7.2. Joe
Attachment
> > Don't forget that 128 is for *bytes*, not for characters(this is still > > ture with 7.3). In CJK(Chinese, Japanese and Korean) single character > > can eat up to 3 bytes if the encoding is UTF-8. > > True, but in those languages a typical name would be many fewer > characters than it is in Western alphabets, no? I'd guess (with > no evidence though) that the effect would more or less cancel out. That's only true for "kanji" characters. There are alphabet like phonogram characters called "katakana" and "hiragana". The former is often used to express things imported from foreign languages (That means Japanse has more and more things expressed in katakana than before). Since they are phonogram, they tend to be longer characters. For example, if I would like to have "object id" column and want to name it using "katakana", it would be around 8 characters, that is 24 bytes in UTF-8 encoding. I'm not sure if Chinese or Korean has similar things though. -- Tatsuo Ishii
On Wed, 2002-08-07 at 14:56, Tatsuo Ishii wrote: > > > Don't forget that 128 is for *bytes*, not for characters(this is still > > > ture with 7.3). In CJK(Chinese, Japanese and Korean) single character > > > can eat up to 3 bytes if the encoding is UTF-8. > > > > True, but in those languages a typical name would be many fewer > > characters than it is in Western alphabets, no? I'd guess (with > > no evidence though) that the effect would more or less cancel out. > > That's only true for "kanji" characters. There are alphabet like > phonogram characters called "katakana" and "hiragana". The former is > often used to express things imported from foreign languages (That > means Japanse has more and more things expressed in katakana than > before). Is this process irreversible ? I.e. will words like "mirku" or "taikin katchuretchu" (if i remember correctly my reading form an old dictionary, these were imported words for "milk" and "chicken cutlets") never get "kanji" characters ? BTW, it seems that even with 3 bytes/char tai-kin is shorter than chicken ;) ------------- Hannu
I'm not sure if I could explain this well, but... > Is this process irreversible ? > > I.e. will words like "mirku" or "taikin katchuretchu" (if i remember > correctly my reading form an old dictionary, these were imported words > for "milk" and "chicken cutlets") never get "kanji" characters ? I guess "mirku" --> "mi-ru-ku" (3 katakana), "taikin katchuretchu" --> "chi-ki-n ka-tsu-re-tsu" (3 + 4 katakana). I don't think it's reversible. For example, we have the kanji word "gyuu nyuu" (2 kanji characters) with the same meaning as milk = miruku, but we cannot interchange "gyuu nyuu" with "miruku" in most cases. > BTW, it seems that even with 3 bytes/char tai-kin is shorter than > chicken ;) Depends. For example, "pu-ro-se-su" (= process) will be 12 bytes in total in UTF-8. -- Tatsuo Ishii
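Tatsuo's byte counts are easy to verify from SQL: in a UTF-8 database, length() counts characters while octet_length() counts bytes. The query below is just an illustration of that point, not something from the thread.

    -- "pu-ro-se-su" (プロセス, "process"): four katakana, three bytes each in UTF-8.
    SELECT length('プロセス')       AS characters,     -- 4
           octet_length('プロセス') AS bytes_in_utf8;  -- 12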
OK, seems we have not come to a decision yet on this. Do we have agreement to increase FUNC_MAX_ARGS to 32? NAMEDATALEN will be 64 or 128 in 7.3. At this point, we better decide which one we prefer. The conservative approach would be to go for 64 and perhaps increase it again in 7.4 after we get feedback and real-world usage. If we go to 128, we will have trouble decreasing it if there are performance problems. --------------------------------------------------------------------------- Tom Lane wrote: > Joe Conway <mail@joeconway.com> writes: > >> I'm not sure about the trend of increasing standard deviation --- that > >> may reflect more disk I/O being done, and perhaps more checkpoints > >> occurring during the test. But in any case it's clear that there's a > >> nontrivial runtime cost here. Does a 10% slowdown bother you? > > > Hmmm -- didn't Neil do some kind of test that had different results, > > i.e. not much performance difference? > > Well, one person had reported a 10% slowdown in pgbench, but Neil saw > a 10% speedup. Given the well-known difficulty of getting any > reproducible numbers out of pgbench, I don't trust either number very > far; but unless some other folk are willing to repeat the experiment > I think we can only conclude that pgbench isn't affected much by > NAMEDATALEN. > > > I wonder if the large number of > > DDL commands in installcheck doesn't skew the results against longer > > NAMEDATALEN compared to other benchmarks? > > Depends on what you consider skewed, I suppose. pgbench touches only a > very small number of relations, and starts no new backends over the > length of its run, thus everything gets cached and stays cached. At > best I'd consider it an existence proof that some applications won't be > hurt. > > Do you have another application you'd consider a more representative > benchmark? > > regards, tom lane > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Do we have agreement to increase FUNC_MAX_ARGS to 32? I believe so. > NAMEDATALEN will be 64 or 128 in 7.3. At this point, we better decide > which one we prefer. > The conservative approach would be to go for 64 and perhaps increase it > again in 7.4 after we get feedback and real-world usage. If we go to > 128, we will have trouble decreasing it if there are performance > problems. It seems fairly clear to me that there *are* performance problems, at least in some scenarios. I think we should go to 64. There doesn't seem to be a lot of real-world demand for more than that, despite what the spec says ... regards, tom lane
Bruce Momjian wrote: > OK, seems we have not come to a decision yet on this. > > Do we have agreement to increase FUNC_MAX_ARGS to 32? > > NAMEDATALEN will be 64 or 128 in 7.3. At this point, we better decide > which one we prefer. > > The conservative approach would be to go for 64 and perhaps increase it > again in 7.4 after we get feedback and real-world usage. If we go to > 128, we will have trouble decreasing it if there are performance > problems. I guess I'd also agree with: FUNC_MAX_ARGS 32 NAMEDATALEN 64 and work on the performance issues for 7.4. Joe
> > NAMEDATALEN will be 64 or 128 in 7.3. At this point, we better decide > > which one we prefer. > > > > The conservative approach would be to go for 64 and perhaps increase it > > again in 7.4 after we get feedback and real-world usage. If we go to > > 128, we will have trouble decreasing it if there are performance > > problems. > > I guess I'd also agree with: > FUNC_MAX_ARGS 32 > NAMEDATALEN 64 > and work on the performance issues for 7.4. I agree too. Chris
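To make the agreed numbers concrete, here is a toy, self-contained C sketch (not backend code -- name_assign() is a made-up helper standing in for what the backend does when it copies an identifier into a fixed-width Name field) showing that with NAMEDATALEN = 64 everything past the 63rd byte of an identifier is silently dropped:

    #include <stdio.h>
    #include <string.h>

    #define NAMEDATALEN 64

    typedef struct nameData
    {
        char        data[NAMEDATALEN];
    } NameData;

    /* Hypothetical helper: keep at most NAMEDATALEN - 1 bytes, always NUL-terminate. */
    static void
    name_assign(NameData *name, const char *src)
    {
        strncpy(name->data, src, NAMEDATALEN - 1);
        name->data[NAMEDATALEN - 1] = '\0';
    }

    int
    main(void)
    {
        NameData    n;

        /* 72-byte identifier; only the first 63 bytes survive */
        name_assign(&n, "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                        "1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ");

        printf("%u bytes kept: %s\n", (unsigned) strlen(n.data), n.data);
        return 0;
    }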
I am working on a patch to increase these as agreed. I found this interesting, from the 6.3 release notes: Increase 16 char limit on system table/index names to 32 characters (Bruce) The limit was 16 chars until 6.3, released on 1998-03-01. --------------------------------------------------------------------------- Christopher Kings-Lynne wrote: > > > NAMEDATALEN will be 64 or 128 in 7.3. At this point, we better decide > > > which one we prefer. > > > > > > The conservative approach would be to go for 64 and perhaps increase it > > > again in 7.4 after we get feedback and real-world usage. If we go to > > > 128, we will have trouble decreasing it if there are performance > > > problems. > > > > I guess I'd also agree with: > > FUNC_MAX_ARGS 32 > > NAMEDATALEN 64 > > and work on the performance issues for 7.4. > > I agree too. > > Chris > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
I have applied the attached patch which changes NAMEDATALEN to 64 and FUNC_MAX_ARGS/INDEX_MAX_KEYS to 32. Hopefully this will keep people happy for a few more years. initdb required. --------------------------------------------------------------------------- Christopher Kings-Lynne wrote: > > > NAMEDATALEN will be 64 or 128 in 7.3. At this point, we better decide > > > which one we prefer. > > > > > > The conservative approach would be to go for 64 and perhaps increase it > > > again in 7.4 after we get feedback and real-world usage. If we go to > > > 128, we will have trouble decreasing it if there are performance > > > problems. > > > > I guess I'd also agree with: > > FUNC_MAX_ARGS 32 > > NAMEDATALEN 64 > > and work on the performance issues for 7.4. > > I agree too. > > Chris > > > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073 Index: doc/FAQ_DEV =================================================================== RCS file: /cvsroot/pgsql-server/doc/FAQ_DEV,v retrieving revision 1.43 diff -c -r1.43 FAQ_DEV *** doc/FAQ_DEV 17 Apr 2002 05:12:39 -0000 1.43 --- doc/FAQ_DEV 13 Aug 2002 20:17:54 -0000 *************** *** 560,566 **** Table, column, type, function, and view names are stored in system tables in columns of type Name. Name is a fixed-length, null-terminated type of NAMEDATALEN bytes. (The default value for ! NAMEDATALEN is 32 bytes.) typedef struct nameData { char data[NAMEDATALEN]; --- 560,566 ---- Table, column, type, function, and view names are stored in system tables in columns of type Name. Name is a fixed-length, null-terminated type of NAMEDATALEN bytes. (The default value for ! NAMEDATALEN is 64 bytes.) typedef struct nameData { char data[NAMEDATALEN]; Index: doc/src/sgml/datatype.sgml =================================================================== RCS file: /cvsroot/pgsql-server/doc/src/sgml/datatype.sgml,v retrieving revision 1.97 diff -c -r1.97 datatype.sgml *** doc/src/sgml/datatype.sgml 5 Aug 2002 19:43:30 -0000 1.97 --- doc/src/sgml/datatype.sgml 13 Aug 2002 20:17:56 -0000 *************** *** 914,920 **** <productname>PostgreSQL</productname>. The <type>name</type> type exists <emphasis>only</emphasis> for storage of internal catalog names and is not intended for use by the general user. Its length ! is currently defined as 32 bytes (31 usable characters plus terminator) but should be referenced using the macro <symbol>NAMEDATALEN</symbol>. The length is set at compile time (and is therefore adjustable for special uses); the default --- 914,920 ---- <productname>PostgreSQL</productname>. The <type>name</type> type exists <emphasis>only</emphasis> for storage of internal catalog names and is not intended for use by the general user. Its length ! is currently defined as 64 bytes (63 usable characters plus terminator) but should be referenced using the macro <symbol>NAMEDATALEN</symbol>. The length is set at compile time (and is therefore adjustable for special uses); the default *************** *** 943,950 **** </row> <row> <entry>name</entry> ! <entry>32 bytes</entry> ! <entry>Thirty-one character internal type</entry> </row> </tbody> </tgroup> --- 943,950 ---- </row> <row> <entry>name</entry> ! <entry>64 bytes</entry> ! 
<entry>Sixty-three character internal type</entry> </row> </tbody> </tgroup> Index: doc/src/sgml/indices.sgml =================================================================== RCS file: /cvsroot/pgsql-server/doc/src/sgml/indices.sgml,v retrieving revision 1.35 diff -c -r1.35 indices.sgml *** doc/src/sgml/indices.sgml 30 Jul 2002 17:34:37 -0000 1.35 --- doc/src/sgml/indices.sgml 13 Aug 2002 20:17:56 -0000 *************** *** 236,242 **** <para> Currently, only the B-tree and GiST implementations support multicolumn ! indexes. Up to 16 columns may be specified. (This limit can be altered when building <productname>PostgreSQL</productname>; see the file <filename>pg_config.h</filename>.) </para> --- 236,242 ---- <para> Currently, only the B-tree and GiST implementations support multicolumn ! indexes. Up to 32 columns may be specified. (This limit can be altered when building <productname>PostgreSQL</productname>; see the file <filename>pg_config.h</filename>.) </para> Index: doc/src/sgml/manage.sgml =================================================================== RCS file: /cvsroot/pgsql-server/doc/src/sgml/manage.sgml,v retrieving revision 1.22 diff -c -r1.22 manage.sgml *** doc/src/sgml/manage.sgml 10 Aug 2002 19:35:00 -0000 1.22 --- doc/src/sgml/manage.sgml 13 Aug 2002 20:17:57 -0000 *************** *** 70,76 **** You automatically become the database administrator of the database you just created. Database names must have an alphabetic first ! character and are limited to 31 characters in length. <ProductName>PostgreSQL</ProductName> allows you to create any number of databases at a given site. </Para> --- 70,76 ---- You automatically become the database administrator of the database you just created. Database names must have an alphabetic first ! character and are limited to 63 characters in length. <ProductName>PostgreSQL</ProductName> allows you to create any number of databases at a given site. </Para> Index: doc/src/sgml/start.sgml =================================================================== RCS file: /cvsroot/pgsql-server/doc/src/sgml/start.sgml,v retrieving revision 1.23 diff -c -r1.23 start.sgml *** doc/src/sgml/start.sgml 10 Aug 2002 19:35:00 -0000 1.23 --- doc/src/sgml/start.sgml 13 Aug 2002 20:17:57 -0000 *************** *** 231,237 **** You can also create databases with other names. <productname>PostgreSQL</productname> allows you to create any number of databases at a given site. Database names must have an ! alphabetic first character and are limited to 31 characters in length. A convenient choice is to create a database with the same name as your current user name. Many tools assume that database name as the default, so it can save you some typing. To create --- 231,237 ---- You can also create databases with other names. <productname>PostgreSQL</productname> allows you to create any number of databases at a given site. Database names must have an ! alphabetic first character and are limited to 63 characters in length. A convenient choice is to create a database with the same name as your current user name. Many tools assume that database name as the default, so it can save you some typing. 
To create Index: doc/src/sgml/syntax.sgml =================================================================== RCS file: /cvsroot/pgsql-server/doc/src/sgml/syntax.sgml,v retrieving revision 1.65 diff -c -r1.65 syntax.sgml *** doc/src/sgml/syntax.sgml 10 Aug 2002 19:01:53 -0000 1.65 --- doc/src/sgml/syntax.sgml 13 Aug 2002 20:17:58 -0000 *************** *** 120,127 **** The system uses no more than <symbol>NAMEDATALEN</symbol>-1 characters of an identifier; longer names can be written in commands, but they will be truncated. By default, ! <symbol>NAMEDATALEN</symbol> is 32 so the maximum identifier length ! is 31 (but at the time the system is built, <symbol>NAMEDATALEN</symbol> can be changed in <filename>src/include/postgres_ext.h</filename>). </para> --- 120,127 ---- The system uses no more than <symbol>NAMEDATALEN</symbol>-1 characters of an identifier; longer names can be written in commands, but they will be truncated. By default, ! <symbol>NAMEDATALEN</symbol> is 64 so the maximum identifier length ! is 63 (but at the time the system is built, <symbol>NAMEDATALEN</symbol> can be changed in <filename>src/include/postgres_ext.h</filename>). </para> Index: doc/src/sgml/ref/create_index.sgml =================================================================== RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/create_index.sgml,v retrieving revision 1.35 diff -c -r1.35 create_index.sgml *** doc/src/sgml/ref/create_index.sgml 30 Jul 2002 17:34:37 -0000 1.35 --- doc/src/sgml/ref/create_index.sgml 13 Aug 2002 20:17:58 -0000 *************** *** 339,345 **** <para> Currently, only the B-tree and gist access methods support multicolumn ! indexes. Up to 16 keys may be specified by default (this limit can be altered when building <application>PostgreSQL</application>). Only B-tree currently supports unique indexes. --- 339,345 ---- <para> Currently, only the B-tree and gist access methods support multicolumn ! indexes. Up to 32 keys may be specified by default (this limit can be altered when building <application>PostgreSQL</application>). Only B-tree currently supports unique indexes. Index: doc/src/sgml/ref/current_user.sgml =================================================================== RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/current_user.sgml,v retrieving revision 1.6 diff -c -r1.6 current_user.sgml *** doc/src/sgml/ref/current_user.sgml 21 Apr 2002 19:02:39 -0000 1.6 --- doc/src/sgml/ref/current_user.sgml 13 Aug 2002 20:17:59 -0000 *************** *** 77,83 **** Notes </TITLE> <PARA> ! Data type "name" is a non-standard 31-character type for storing system identifiers. </PARA> </REFSECT2> --- 77,83 ---- Notes </TITLE> <PARA> ! Data type "name" is a non-standard 63-character type for storing system identifiers. </PARA> </REFSECT2> Index: doc/src/sgml/ref/listen.sgml =================================================================== RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/listen.sgml,v retrieving revision 1.13 diff -c -r1.13 listen.sgml *** doc/src/sgml/ref/listen.sgml 21 Apr 2002 19:02:39 -0000 1.13 --- doc/src/sgml/ref/listen.sgml 13 Aug 2002 20:17:59 -0000 *************** *** 146,152 **** it need not correspond to the name of any actual table. If <replaceable class="PARAMETER">notifyname</replaceable> is enclosed in double-quotes, it need not even be a syntactically ! valid name, but can be any string up to 31 characters long. </para> <para> In some previous releases of --- 146,152 ---- it need not correspond to the name of any actual table. 
If <replaceable class="PARAMETER">notifyname</replaceable> is enclosed in double-quotes, it need not even be a syntactically ! valid name, but can be any string up to 63 characters long. </para> <para> In some previous releases of Index: doc/src/sgml/ref/notify.sgml =================================================================== RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/notify.sgml,v retrieving revision 1.17 diff -c -r1.17 notify.sgml *** doc/src/sgml/ref/notify.sgml 21 Apr 2002 19:02:39 -0000 1.17 --- doc/src/sgml/ref/notify.sgml 13 Aug 2002 20:17:59 -0000 *************** *** 180,186 **** it need not correspond to the name of any actual table. If <replaceable class="PARAMETER">name</replaceable> is enclosed in double-quotes, it need not even be a syntactically ! valid name, but can be any string up to 31 characters long. </para> <para> In some previous releases of --- 180,186 ---- it need not correspond to the name of any actual table. If <replaceable class="PARAMETER">name</replaceable> is enclosed in double-quotes, it need not even be a syntactically ! valid name, but can be any string up to 63 characters long. </para> <para> In some previous releases of Index: doc/src/sgml/ref/unlisten.sgml =================================================================== RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/unlisten.sgml,v retrieving revision 1.18 diff -c -r1.18 unlisten.sgml *** doc/src/sgml/ref/unlisten.sgml 21 Apr 2002 19:02:39 -0000 1.18 --- doc/src/sgml/ref/unlisten.sgml 13 Aug 2002 20:17:59 -0000 *************** *** 114,120 **** <para> <replaceable class="PARAMETER">notifyname</replaceable> need not be a valid class name but can be any string valid ! as a name up to 32 characters long. </para> <para> The backend does not complain if you UNLISTEN something you were not --- 114,120 ---- <para> <replaceable class="PARAMETER">notifyname</replaceable> need not be a valid class name but can be any string valid ! as a name up to 64 characters long. </para> <para> The backend does not complain if you UNLISTEN something you were not Index: src/bin/psql/command.c =================================================================== RCS file: /cvsroot/pgsql-server/src/bin/psql/command.c,v retrieving revision 1.75 diff -c -r1.75 command.c *** src/bin/psql/command.c 10 Aug 2002 03:56:23 -0000 1.75 --- src/bin/psql/command.c 13 Aug 2002 20:18:01 -0000 *************** *** 1513,1519 **** sys = malloc(strlen(editorName) + strlen(fname) + 32 + 1); if (!sys) return false; ! sprintf(sys, "exec %s %s", editorName, fname); result = system(sys); if (result == -1) psql_error("could not start editor %s\n", editorName); --- 1513,1519 ---- sys = malloc(strlen(editorName) + strlen(fname) + 32 + 1); if (!sys) return false; ! snprintf(sys, 32, "exec %s %s", editorName, fname); result = system(sys); if (result == -1) psql_error("could not start editor %s\n", editorName); Index: src/include/pg_config.h.in =================================================================== RCS file: /cvsroot/pgsql-server/src/include/pg_config.h.in,v retrieving revision 1.26 diff -c -r1.26 pg_config.h.in *** src/include/pg_config.h.in 31 Jul 2002 17:19:54 -0000 1.26 --- src/include/pg_config.h.in 13 Aug 2002 20:18:02 -0000 *************** *** 162,168 **** * switch statement in fmgr_oldstyle() in src/backend/utils/fmgr/fmgr.c. * But consider converting such functions to new-style instead... */ ! 
#define INDEX_MAX_KEYS 16 #define FUNC_MAX_ARGS INDEX_MAX_KEYS /* --- 162,168 ---- * switch statement in fmgr_oldstyle() in src/backend/utils/fmgr/fmgr.c. * But consider converting such functions to new-style instead... */ ! #define INDEX_MAX_KEYS 32 #define FUNC_MAX_ARGS INDEX_MAX_KEYS /* Index: src/include/postgres_ext.h =================================================================== RCS file: /cvsroot/pgsql-server/src/include/postgres_ext.h,v retrieving revision 1.10 diff -c -r1.10 postgres_ext.h *** src/include/postgres_ext.h 30 Apr 2002 19:53:03 -0000 1.10 --- src/include/postgres_ext.h 13 Aug 2002 20:18:02 -0000 *************** *** 41,46 **** * * NOTE that databases with different NAMEDATALEN's cannot interoperate! */ ! #define NAMEDATALEN 32 #endif --- 41,46 ---- * * NOTE that databases with different NAMEDATALEN's cannot interoperate! */ ! #define NAMEDATALEN 64 #endif Index: src/include/catalog/catversion.h =================================================================== RCS file: /cvsroot/pgsql-server/src/include/catalog/catversion.h,v retrieving revision 1.147 diff -c -r1.147 catversion.h *** src/include/catalog/catversion.h 9 Aug 2002 16:45:14 -0000 1.147 --- src/include/catalog/catversion.h 13 Aug 2002 20:18:02 -0000 *************** *** 53,58 **** */ /* yyyymmddN */ ! #define CATALOG_VERSION_NO 200208091 #endif --- 53,58 ---- */ /* yyyymmddN */ ! #define CATALOG_VERSION_NO 200208131 #endif Index: src/interfaces/jdbc/org/postgresql/errors.properties =================================================================== RCS file: /cvsroot/pgsql-server/src/interfaces/jdbc/org/postgresql/errors.properties,v retrieving revision 1.13 diff -c -r1.13 errors.properties *** src/interfaces/jdbc/org/postgresql/errors.properties 24 Jun 2002 06:16:27 -0000 1.13 --- src/interfaces/jdbc/org/postgresql/errors.properties 13 Aug 2002 20:18:02 -0000 *************** *** 61,67 **** postgresql.res.colrange:The column index is out of range. postgresql.res.nextrequired:Result set not positioned properly, perhaps you need to call next(). postgresql.serial.interface:You cannot serialize an interface. ! postgresql.serial.namelength:Class & Package name length cannot be longer than 32 characters. {0} is {1} characters. postgresql.serial.noclass:No class found for {0} postgresql.serial.table:The table for {0} is not in the database. Contact the DBA, as the database is in an inconsistentstate. postgresql.serial.underscore:Class names may not have _ in them. You supplied {0}. --- 61,67 ---- postgresql.res.colrange:The column index is out of range. postgresql.res.nextrequired:Result set not positioned properly, perhaps you need to call next(). postgresql.serial.interface:You cannot serialize an interface. ! postgresql.serial.namelength:Class & Package name length cannot be longer than 64 characters. {0} is {1} characters. postgresql.serial.noclass:No class found for {0} postgresql.serial.table:The table for {0} is not in the database. Contact the DBA, as the database is in an inconsistentstate. postgresql.serial.underscore:Class names may not have _ in them. You supplied {0}. 
Index: src/interfaces/jdbc/org/postgresql/util/Serialize.java =================================================================== RCS file: /cvsroot/pgsql-server/src/interfaces/jdbc/org/postgresql/util/Serialize.java,v retrieving revision 1.11 diff -c -r1.11 Serialize.java *** src/interfaces/jdbc/org/postgresql/util/Serialize.java 23 Jul 2002 03:59:55 -0000 1.11 --- src/interfaces/jdbc/org/postgresql/util/Serialize.java 13 Aug 2002 20:18:03 -0000 *************** *** 57,63 **** * There are a number of limitations placed on the java class to be * used by Serialize: * <ul> ! * <li>The class name must be less than 32 chars long and must be all lowercase. * This is due to limitations in Postgres about the size of table names. * The name must be all lowercase since table names in Postgres are * case insensitive and the relname is stored in lowercase. Unless some --- 57,63 ---- * There are a number of limitations placed on the java class to be * used by Serialize: * <ul> ! * <li>The class name must be less than 64 chars long and must be all lowercase. * This is due to limitations in Postgres about the size of table names. * The name must be all lowercase since table names in Postgres are * case insensitive and the relname is stored in lowercase. Unless some *************** *** 577,583 **** * * Because of this, a Class name may not have _ in the name.<p> * Another limitation, is that the entire class name (including packages) ! * cannot be longer than 32 characters (a limit forced by PostgreSQL). * * @param name Class name * @return PostgreSQL table name --- 577,583 ---- * * Because of this, a Class name may not have _ in the name.<p> * Another limitation, is that the entire class name (including packages) ! * cannot be longer than 64 characters (a limit forced by PostgreSQL). * * @param name Class name * @return PostgreSQL table name *************** *** 590,605 **** if (name.indexOf("_") > -1) throw new PSQLException("postgresql.serial.underscore"); ! // Postgres table names can only be 32 character long. ! // Reserve 1 char, so allow only up to 31 chars. // If the full class name with package is too long // then just use the class name. If the class name is // too long throw an exception. // ! if ( name.length() > 31 ) { name = name.substring(name.lastIndexOf(".") + 1); ! if ( name.length() > 31 ) throw new PSQLException("postgresql.serial.namelength", name, new Integer(name.length())); } return name.replace('.', '_'); --- 590,605 ---- if (name.indexOf("_") > -1) throw new PSQLException("postgresql.serial.underscore"); ! // Postgres table names can only be 64 character long. ! // Reserve 1 char, so allow only up to 63 chars. // If the full class name with package is too long // then just use the class name. If the class name is // too long throw an exception. // ! if ( name.length() > 63 ) { name = name.substring(name.lastIndexOf(".") + 1); ! if ( name.length() > 63 ) throw new PSQLException("postgresql.serial.namelength", name, new Integer(name.length())); } return name.replace('.', '_'); Index: src/test/regress/expected/name.out =================================================================== RCS file: /cvsroot/pgsql-server/src/test/regress/expected/name.out,v retrieving revision 1.5 diff -c -r1.5 name.out *** src/test/regress/expected/name.out 4 Jan 2000 16:19:34 -0000 1.5 --- src/test/regress/expected/name.out 13 Aug 2002 20:18:04 -0000 *************** *** 19,104 **** -- -- CREATE TABLE NAME_TBL(f1 name); ! INSERT INTO NAME_TBL(f1) VALUES ('ABCDEFGHIJKLMNOP'); ! 
INSERT INTO NAME_TBL(f1) VALUES ('abcdefghijklmnop'); INSERT INTO NAME_TBL(f1) VALUES ('asdfghjkl;'); INSERT INTO NAME_TBL(f1) VALUES ('343f%2a'); INSERT INTO NAME_TBL(f1) VALUES ('d34aaasdf'); INSERT INTO NAME_TBL(f1) VALUES (''); ! INSERT INTO NAME_TBL(f1) VALUES ('1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ'); SELECT '' AS seven, NAME_TBL.*; seven | f1 ! -------+--------------------------------- ! | ABCDEFGHIJKLMNOP ! | abcdefghijklmnop | asdfghjkl; | 343f%2a | d34aaasdf | ! | 1234567890ABCDEFGHIJKLMNOPQRSTU (7 rows) ! SELECT '' AS six, c.f1 FROM NAME_TBL c WHERE c.f1 <> 'ABCDEFGHIJKLMNOP'; six | f1 ! -----+--------------------------------- ! | abcdefghijklmnop | asdfghjkl; | 343f%2a | d34aaasdf | ! | 1234567890ABCDEFGHIJKLMNOPQRSTU ! (6 rows) ! SELECT '' AS one, c.f1 FROM NAME_TBL c WHERE c.f1 = 'ABCDEFGHIJKLMNOP'; one | f1 ! -----+------------------ ! | ABCDEFGHIJKLMNOP ! (1 row) ! SELECT '' AS three, c.f1 FROM NAME_TBL c WHERE c.f1 < 'ABCDEFGHIJKLMNOP'; three | f1 ! -------+--------------------------------- ! | 343f%2a | ! | 1234567890ABCDEFGHIJKLMNOPQRSTU ! (3 rows) ! SELECT '' AS four, c.f1 FROM NAME_TBL c WHERE c.f1 <= 'ABCDEFGHIJKLMNOP'; four | f1 ! ------+--------------------------------- ! | ABCDEFGHIJKLMNOP ! | 343f%2a | ! | 1234567890ABCDEFGHIJKLMNOPQRSTU ! (4 rows) ! SELECT '' AS three, c.f1 FROM NAME_TBL c WHERE c.f1 > 'ABCDEFGHIJKLMNOP'; three | f1 ! -------+------------------ ! | abcdefghijklmnop | asdfghjkl; | d34aaasdf ! (3 rows) ! SELECT '' AS four, c.f1 FROM NAME_TBL c WHERE c.f1 >= 'ABCDEFGHIJKLMNOP'; four | f1 ! ------+------------------ ! | ABCDEFGHIJKLMNOP ! | abcdefghijklmnop | asdfghjkl; | d34aaasdf ! (4 rows) SELECT '' AS seven, c.f1 FROM NAME_TBL c WHERE c.f1 ~ '.*'; seven | f1 ! -------+--------------------------------- ! | ABCDEFGHIJKLMNOP ! | abcdefghijklmnop | asdfghjkl; | 343f%2a | d34aaasdf | ! | 1234567890ABCDEFGHIJKLMNOPQRSTU (7 rows) SELECT '' AS zero, c.f1 FROM NAME_TBL c WHERE c.f1 !~ '.*'; --- 19,104 ---- -- -- CREATE TABLE NAME_TBL(f1 name); ! INSERT INTO NAME_TBL(f1) VALUES ('1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'); ! INSERT INTO NAME_TBL(f1) VALUES ('1234567890abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopqr'); INSERT INTO NAME_TBL(f1) VALUES ('asdfghjkl;'); INSERT INTO NAME_TBL(f1) VALUES ('343f%2a'); INSERT INTO NAME_TBL(f1) VALUES ('d34aaasdf'); INSERT INTO NAME_TBL(f1) VALUES (''); ! INSERT INTO NAME_TBL(f1) VALUES ('1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ'); SELECT '' AS seven, NAME_TBL.*; seven | f1 ! -------+----------------------------------------------------------------- ! | 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQ ! | 1234567890abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopq | asdfghjkl; | 343f%2a | d34aaasdf | ! | 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQ (7 rows) ! SELECT '' AS six, c.f1 FROM NAME_TBL c WHERE c.f1 <> '1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'; six | f1 ! -----+----------------------------------------------------------------- ! | 1234567890abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopq | asdfghjkl; | 343f%2a | d34aaasdf | ! (5 rows) ! SELECT '' AS one, c.f1 FROM NAME_TBL c WHERE c.f1 = '1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'; one | f1 ! -----+----------------------------------------------------------------- ! | 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQ ! | 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQ ! (2 rows) ! 
SELECT '' AS three, c.f1 FROM NAME_TBL c WHERE c.f1 < '1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'; three | f1 ! -------+---- | ! (1 row) ! SELECT '' AS four, c.f1 FROM NAME_TBL c WHERE c.f1 <= '1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'; four | f1 ! ------+----------------------------------------------------------------- ! | 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQ | ! | 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQ ! (3 rows) ! SELECT '' AS three, c.f1 FROM NAME_TBL c WHERE c.f1 > '1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'; three | f1 ! -------+----------------------------------------------------------------- ! | 1234567890abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopq | asdfghjkl; + | 343f%2a | d34aaasdf ! (4 rows) ! SELECT '' AS four, c.f1 FROM NAME_TBL c WHERE c.f1 >= '1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'; four | f1 ! ------+----------------------------------------------------------------- ! | 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQ ! | 1234567890abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopq | asdfghjkl; + | 343f%2a | d34aaasdf ! | 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQ ! (6 rows) SELECT '' AS seven, c.f1 FROM NAME_TBL c WHERE c.f1 ~ '.*'; seven | f1 ! -------+----------------------------------------------------------------- ! | 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQ ! | 1234567890abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopq | asdfghjkl; | 343f%2a | d34aaasdf | ! | 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQ (7 rows) SELECT '' AS zero, c.f1 FROM NAME_TBL c WHERE c.f1 !~ '.*'; *************** *** 108,118 **** SELECT '' AS three, c.f1 FROM NAME_TBL c WHERE c.f1 ~ '[0-9]'; three | f1 ! -------+--------------------------------- | 343f%2a | d34aaasdf ! | 1234567890ABCDEFGHIJKLMNOPQRSTU ! (3 rows) SELECT '' AS two, c.f1 FROM NAME_TBL c WHERE c.f1 ~ '.*asdf.*'; two | f1 --- 108,120 ---- SELECT '' AS three, c.f1 FROM NAME_TBL c WHERE c.f1 ~ '[0-9]'; three | f1 ! -------+----------------------------------------------------------------- ! | 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQ ! | 1234567890abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopq | 343f%2a | d34aaasdf ! | 1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQ ! (5 rows) SELECT '' AS two, c.f1 FROM NAME_TBL c WHERE c.f1 ~ '.*asdf.*'; two | f1 Index: src/test/regress/sql/name.sql =================================================================== RCS file: /cvsroot/pgsql-server/src/test/regress/sql/name.sql,v retrieving revision 1.5 diff -c -r1.5 name.sql *** src/test/regress/sql/name.sql 4 Jan 2000 16:21:02 -0000 1.5 --- src/test/regress/sql/name.sql 13 Aug 2002 20:18:04 -0000 *************** *** 14,22 **** CREATE TABLE NAME_TBL(f1 name); ! INSERT INTO NAME_TBL(f1) VALUES ('ABCDEFGHIJKLMNOP'); ! INSERT INTO NAME_TBL(f1) VALUES ('abcdefghijklmnop'); INSERT INTO NAME_TBL(f1) VALUES ('asdfghjkl;'); --- 14,22 ---- CREATE TABLE NAME_TBL(f1 name); ! INSERT INTO NAME_TBL(f1) VALUES ('1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'); ! INSERT INTO NAME_TBL(f1) VALUES ('1234567890abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopqr'); INSERT INTO NAME_TBL(f1) VALUES ('asdfghjkl;'); *************** *** 26,47 **** INSERT INTO NAME_TBL(f1) VALUES (''); ! 
INSERT INTO NAME_TBL(f1) VALUES ('1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ'); SELECT '' AS seven, NAME_TBL.*; ! SELECT '' AS six, c.f1 FROM NAME_TBL c WHERE c.f1 <> 'ABCDEFGHIJKLMNOP'; ! SELECT '' AS one, c.f1 FROM NAME_TBL c WHERE c.f1 = 'ABCDEFGHIJKLMNOP'; ! SELECT '' AS three, c.f1 FROM NAME_TBL c WHERE c.f1 < 'ABCDEFGHIJKLMNOP'; ! SELECT '' AS four, c.f1 FROM NAME_TBL c WHERE c.f1 <= 'ABCDEFGHIJKLMNOP'; ! SELECT '' AS three, c.f1 FROM NAME_TBL c WHERE c.f1 > 'ABCDEFGHIJKLMNOP'; ! SELECT '' AS four, c.f1 FROM NAME_TBL c WHERE c.f1 >= 'ABCDEFGHIJKLMNOP'; SELECT '' AS seven, c.f1 FROM NAME_TBL c WHERE c.f1 ~ '.*'; --- 26,47 ---- INSERT INTO NAME_TBL(f1) VALUES (''); ! INSERT INTO NAME_TBL(f1) VALUES ('1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ'); SELECT '' AS seven, NAME_TBL.*; ! SELECT '' AS six, c.f1 FROM NAME_TBL c WHERE c.f1 <> '1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'; ! SELECT '' AS one, c.f1 FROM NAME_TBL c WHERE c.f1 = '1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'; ! SELECT '' AS three, c.f1 FROM NAME_TBL c WHERE c.f1 < '1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'; ! SELECT '' AS four, c.f1 FROM NAME_TBL c WHERE c.f1 <= '1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'; ! SELECT '' AS three, c.f1 FROM NAME_TBL c WHERE c.f1 > '1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'; ! SELECT '' AS four, c.f1 FROM NAME_TBL c WHERE c.f1 >= '1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLMNOPQR'; SELECT '' AS seven, c.f1 FROM NAME_TBL c WHERE c.f1 ~ '.*';
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I have applied the attached patch which changes NAMEDATALEN to 64 and > FUNC_MAX_ARGS/INDEX_MAX_KEYS to 32. What is the reasoning behind the following change? Index: src/bin/psql/command.c =================================================================== RCS file: /cvsroot/pgsql-server/src/bin/psql/command.c,v retrieving revision 1.75 diff -c -r1.75 command.c *** src/bin/psql/command.c 10 Aug 2002 03:56:23 -0000 1.75 --- src/bin/psql/command.c 13 Aug 2002 20:18:01 -0000 *************** *** 1513,1519 **** sys = malloc(strlen(editorName) + strlen(fname) + 32 + 1); if (!sys) return false; ! sprintf(sys, "exec %s %s", editorName, fname); result = system(sys); if (result == -1) psql_error("couldnot start editor %s\n", editorName); --- 1513,1519 ---- sys = malloc(strlen(editorName) + strlen(fname) + 32 + 1); if (!sys) return false; ! snprintf(sys, 32, "exec %s %s", editorName, fname); result = system(sys); if (result == -1) psql_error("couldnot start editor %s\n", editorName); Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
Good question. Looked like a possible buffer overrun to me. Of course, I botched it up. I will fix it. --------------------------------------------------------------------------- Neil Conway wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > I have applied the attached patch which changes NAMEDATALEN to 64 and > > FUNC_MAX_ARGS/INDEX_MAX_KEYS to 32. > > What is the reasoning behind the following change? > > Index: src/bin/psql/command.c > =================================================================== > RCS file: /cvsroot/pgsql-server/src/bin/psql/command.c,v > retrieving revision 1.75 > diff -c -r1.75 command.c > *** src/bin/psql/command.c 10 Aug 2002 03:56:23 -0000 1.75 > --- src/bin/psql/command.c 13 Aug 2002 20:18:01 -0000 > *************** > *** 1513,1519 **** > sys = malloc(strlen(editorName) + strlen(fname) + 32 + 1); > if (!sys) > return false; > ! sprintf(sys, "exec %s %s", editorName, fname); > result = system(sys); > if (result == -1) > psql_error("could not start editor %s\n", editorName); > --- 1513,1519 ---- > sys = malloc(strlen(editorName) + strlen(fname) + 32 + 1); > if (!sys) > return false; > ! snprintf(sys, 32, "exec %s %s", editorName, fname); > result = system(sys); > if (result == -1) > psql_error("could not start editor %s\n", editorName); > > Cheers, > > Neil > > -- > Neil Conway <neilconway@rogers.com> > PGP Key ID: DB3C29FC > > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
In fact, I now see that there was no such problem. I do wonder why the 32 is there, though? Shouldn't it be 6 or something like that? --------------------------------------------------------------------------- Neil Conway wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > I have applied the attached patch which changes NAMEDATALEN to 64 and > > FUNC_MAX_ARGS/INDEX_MAX_KEYS to 32. > > What is the reasoning behind the following change? > > Index: src/bin/psql/command.c > =================================================================== > RCS file: /cvsroot/pgsql-server/src/bin/psql/command.c,v > retrieving revision 1.75 > diff -c -r1.75 command.c > *** src/bin/psql/command.c 10 Aug 2002 03:56:23 -0000 1.75 > --- src/bin/psql/command.c 13 Aug 2002 20:18:01 -0000 > *************** > *** 1513,1519 **** > sys = malloc(strlen(editorName) + strlen(fname) + 32 + 1); > if (!sys) > return false; > ! sprintf(sys, "exec %s %s", editorName, fname); > result = system(sys); > if (result == -1) > psql_error("could not start editor %s\n", editorName); > --- 1513,1519 ---- > sys = malloc(strlen(editorName) + strlen(fname) + 32 + 1); > if (!sys) > return false; > ! snprintf(sys, 32, "exec %s %s", editorName, fname); > result = system(sys); > if (result == -1) > psql_error("could not start editor %s\n", editorName); > > Cheers, > > Neil > > -- > Neil Conway <neilconway@rogers.com> > PGP Key ID: DB3C29FC > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > In fact, I now see that there was no such problem. I do wonder why the > 32 is there, though? Shouldn't it be 6 or something like that? Whoever it was was too lazy to count accurately ;-) I guess I'd vote for changing the code to be sys = malloc(strlen(editorName) + strlen(fname) + 10 + 1); if (!sys) return false; sprintf(sys, "exec '%s' '%s'", editorName, fname); (note the added quotes to provide a little protection against spaces and such). Then it's perfectly obvious what the calculation is doing. I don't care about wasting 20-some bytes, but confusing readers of the code is worth avoiding. regards, tom lane
Change made. --------------------------------------------------------------------------- Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > In fact, I now see that there was no such problem. I do wonder why the > > 32 is there, though? Shouldn't it be 6 or something like that? > > Whoever it was was too lazy to count accurately ;-) > > I guess I'd vote for changing the code to be > > sys = malloc(strlen(editorName) + strlen(fname) + 10 + 1); > if (!sys) > return false; > sprintf(sys, "exec '%s' '%s'", editorName, fname); > > (note the added quotes to provide a little protection against spaces > and such). Then it's perfectly obvious what the calculation is doing. > I don't care about wasting 20-some bytes, but confusing readers of the > code is worth avoiding. > > regards, tom lane > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
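For anyone counting along, here is a self-contained C sketch of the arithmetic behind that change (build_editor_command() and the file name are made up for illustration; only the malloc/sprintf pattern follows Tom's suggestion). The format string "exec '%s' '%s'" contributes exactly 10 fixed characters -- "exec " is 5, the four single quotes are 4, and the separating space is 1 -- plus 1 for the terminating NUL, which is where the 10 + 1 comes from:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Build the shell command the way Tom's version does; caller frees the result. */
    static char *
    build_editor_command(const char *editorName, const char *fname)
    {
        char       *sys = malloc(strlen(editorName) + strlen(fname) + 10 + 1);

        if (!sys)
            return NULL;
        sprintf(sys, "exec '%s' '%s'", editorName, fname);
        return sys;
    }

    int
    main(void)
    {
        char       *cmd = build_editor_command("vi", "/tmp/psql.edit.12345");

        if (cmd)
        {
            printf("%s\n", cmd);        /* exec 'vi' '/tmp/psql.edit.12345' */
            free(cmd);
        }
        return 0;
    }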