Thread: Patch to fix search_path defencies with pg_bench

Patch to fix search_path defencies with pg_bench

From

"Joshua D. Drake"

Date:

05 May 2009, 19:04:36

Hello,

I have been doing some testing with pgbench and I realized that it
forces the use of public as its search_path. This is bad if:

* You want to run multiple pgbench instances within the same database
* You don't want to use public (for whatever reason)

This patch removes that ability and thus will defer to the default
search_path for the connecting user.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index ad20cac..1f25921 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -357,8 +357,6 @@ doConnect(void)               return NULL;       }
-       executeStatement(conn, "SET search_path = public");
-       return conn;}

-- 
PostgreSQL - XMPP: jdrake@jabber.postgresql.org  Consulting, Development, Support, Training  503-667-4564 -
http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997

Re: Patch to fix search_path defencies with pg_bench

From

Tom Lane

Date:

05 May 2009, 20:07:57

"Joshua D. Drake" <jd@commandprompt.com> writes:
> I have been doing some testing with pgbench and I realized that it
> forces the use of public as its search_path. This is bad if:

> * You want to run multiple pgbench instances within the same database
> * You don't want to use public (for whatever reason)

> This patch removes that ability and thus will defer to the default
> search_path for the connecting user.

Hmm.  The search_path setting seems to have been added here
http://archives.postgresql.org/pgsql-committers/2002-10/msg00118.php
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/contrib/pgbench/pgbench.c.diff?r1=1.20;r2=1.21

as part of a mass patch to make everything in contrib work in the public
schema.  I agree that it probably wasn't considered carefully whether
pg_bench should do that; but does anyone see a reason not to change it?
        regards, tom lane

Re: Patch to fix search_path defencies with pg_bench

From

Greg Smith

Date:

06 May 2009, 03:34:08

On Tue, 5 May 2009, Tom Lane wrote:

> I agree that it probably wasn't considered carefully whether pg_bench 
> should do that; but does anyone see a reason not to change it?

I thought of one pretty weak use-case for not making this change, but 
would wager the additional flexibility here is far more likely to be 
appreciated.  I'd say it's a clear net improvement.

As for that case...many good database designs put all the user relations 
into a schema, so that it's easier to do bulk operations on all of them 
while avoiding catalog tables etc.--less work to filter out pg_class to 
find them for example.

I once did some pgbench testing on a system that included a real 
"accounts" table in a named schema.  "pgbench -i" will execute "drop table 
if exists accounts".  It had already accidentally wiped out the copy of 
the accounts table on the system during an earlier test, before the schema 
policy was in place, leaving everyone wary of it.

I was able to defend the risk for running pgbench with the new schema 
layout by saying "that can only execute against public.accounts no matter 
what the user search_path is, so you're safe now".  That made everybody 
happy.  Anyone counting on such behavior could be rudely surprised at this 
change.  For all I know I'm the only person to ever actually run into that 
particular situation though.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Patch to fix search_path defencies with pg_bench

From

Tom Lane

Date:

06 May 2009, 10:37:29

Greg Smith <gsmith@gregsmith.com> writes:
> I once did some pgbench testing on a system that included a real 
> "accounts" table in a named schema.  "pgbench -i" will execute "drop table 
> if exists accounts".  It had already accidentally wiped out the copy of 
> the accounts table on the system during an earlier test, before the schema 
> policy was in place, leaving everyone wary of it.

Seems like the right policy for that is "run pgbench in its own
database".  I doubt that either adding or removing the "set search_path"
command changes the risk of trouble very much.
        regards, tom lane

Re: Patch to fix search_path defencies with pg_bench

From

"Dickson S. Guedes"

Date:

06 May 2009, 16:01:41

Em Qua, 2009-05-06 às 09:37 -0400, Tom Lane escreveu:
> Greg Smith <gsmith@gregsmith.com> writes:
> > I once did some pgbench testing on a system that included a real
> > "accounts" table in a named schema.  "pgbench -i" will execute "drop table
> > if exists accounts".  It had already accidentally wiped out the copy of
> > the accounts table on the system during an earlier test, before the schema
> > policy was in place, leaving everyone wary of it.
>
> Seems like the right policy for that is "run pgbench in its own
> database".

A text warning about this could be shown at start of pgbench if the
target database isn't named "pgbench", for examplo, or just some text
could be added to the docs.

regards.
--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br

Re: Patch to fix search_path defencies with pg_bench

From

Alvaro Herrera

Date:

06 May 2009, 16:13:38

Dickson S. Guedes wrote:
> Em Qua, 2009-05-06 às 09:37 -0400, Tom Lane escreveu:

> > Seems like the right policy for that is "run pgbench in its own
> > database". 
> 
> A text warning about this could be shown at start of pgbench if the
> target database isn't named "pgbench", for examplo, or just some text
> could be added to the docs.

I think it would be better that the schema is specified on the command
line.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Patch to fix search_path defencies with pg_bench

From

Tom Lane

Date:

06 May 2009, 16:18:20

"Dickson S. Guedes" <listas@guedesoft.net> writes:
> Em Qua, 2009-05-06 �s 09:37 -0400, Tom Lane escreveu:
>> Seems like the right policy for that is "run pgbench in its own
>> database". 

> A text warning about this could be shown at start of pgbench if the
> target database isn't named "pgbench", for examplo, or just some text
> could be added to the docs.

There already is a prominent warning in the pgbench docs:
    Caution
pgbench -i creates four tables accounts, branches, history, andtellers, destroying any existing tables of these names.
Beverycareful to use another database if you have tables having thesenames!
 
        regards, tom lane

Re: Patch to fix search_path defencies with pg_bench

From

"Joshua D. Drake"

Date:

06 May 2009, 16:30:11

On Wed, 2009-05-06 at 15:13 -0400, Alvaro Herrera wrote:
> Dickson S. Guedes wrote:
> > Em Qua, 2009-05-06 às 09:37 -0400, Tom Lane escreveu:
> 
> > > Seems like the right policy for that is "run pgbench in its own
> > > database". 
> > 
> > A text warning about this could be shown at start of pgbench if the
> > target database isn't named "pgbench", for examplo, or just some text
> > could be added to the docs.
> 
> I think it would be better that the schema is specified on the command
> line.

I could see that as an option but applications that use a role should
adhere to the rules the DBA sets forth for that role. In this particular
case I explicitly said that role bench01 was to connect to the database
bench and that his search path was bench01 (thus all tables would be
created under the schema bench01). Public should never come into play in
that scenario.

Sincerely,

Joshua D. Drake

-- 
PostgreSQL - XMPP: jdrake@jabber.postgresql.org  Consulting, Development, Support, Training  503-667-4564 -
http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997

Re: Patch to fix search_path defencies with pg_bench

From

Tom Lane

Date:

06 May 2009, 17:27:26

Alvaro Herrera <alvherre@commandprompt.com> writes:
> I think it would be better that the schema is specified on the command
> line.

Surely that's more work than the issue is worth.  It's also inconvenient
to use, because you'd have to remember to give the switch both for the
-i run and the normal test runs.
        regards, tom lane

Re: Patch to fix search_path defencies with pg_bench

From

"Joshua D. Drake"

Date:

06 May 2009, 17:49:21

On Wed, 2009-05-06 at 17:42 -0300, Dickson S. Guedes wrote:
> Em Qua, 2009-05-06 às 16:27 -0400, Tom Lane escreveu:
> > Alvaro Herrera <alvherre@commandprompt.com> writes:
> > > I think it would be better that the schema is specified on the command
> > > line.
> > 
> > Surely that's more work than the issue is worth.  It's also inconvenient
> > to use, because you'd have to remember to give the switch both for the
> > -i run and the normal test runs.
> 
> So, in my opinion, the Joshua alternative is a good little change that
> let "pgbench" runs in a more flexible way.
> 
> But, there is the possibility that someone are using an automated script
> that could be broken by this change? 

Only if the role pgbench is using as an explicit search_path set.

Sincerely,

Joshua D. Drake

-- 
PostgreSQL - XMPP: jdrake@jabber.postgresql.org  Consulting, Development, Support, Training  503-667-4564 -
http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997

Re: Patch to fix search_path defencies with pg_bench

From

"Dickson S. Guedes"

Date:

06 May 2009, 17:50:22

Em Qua, 2009-05-06 às 16:27 -0400, Tom Lane escreveu:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > I think it would be better that the schema is specified on the command
> > line.
>
> Surely that's more work than the issue is worth.  It's also inconvenient
> to use, because you'd have to remember to give the switch both for the
> -i run and the normal test runs.

So, in my opinion, the Joshua alternative is a good little change that
let "pgbench" runs in a more flexible way.

But, there is the possibility that someone are using an automated script
that could be broken by this change?

--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br

Re: Patch to fix search_path defencies with pg_bench

From

"Dickson S. Guedes"

Date:

06 May 2009, 18:02:26

Em Qua, 2009-05-06 às 13:49 -0700, Joshua D. Drake escreveu:
> On Wed, 2009-05-06 at 17:42 -0300, Dickson S. Guedes wrote:
> > Em Qua, 2009-05-06 às 16:27 -0400, Tom Lane escreveu:
> > > Alvaro Herrera <alvherre@commandprompt.com> writes:
> > > > I think it would be better that the schema is specified on the command
> > > > line.
> > >
> > > Surely that's more work than the issue is worth.  It's also inconvenient
> > > to use, because you'd have to remember to give the switch both for the
> > > -i run and the normal test runs.
> >
> > So, in my opinion, the Joshua alternative is a good little change that
> > let "pgbench" runs in a more flexible way.
> >
> > But, there is the possibility that someone are using an automated script
> > that could be broken by this change?
>
> Only if the role pgbench is using as an explicit search_path set.


So, in a way to avoid the scenario where a ROLE has an explicit
search_path set to schemes that already have tables named same as the
pgbench's tables, doesn't makes sense also create a "pgbench_" suffix
for them?

--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br

Re: Patch to fix search_path defencies with pg_bench

From

Tom Lane

Date:

06 May 2009, 18:05:03

"Joshua D. Drake" <jd@commandprompt.com> writes:
> On Wed, 2009-05-06 at 17:42 -0300, Dickson S. Guedes wrote:
>> But, there is the possibility that someone are using an automated script
>> that could be broken by this change? 

> Only if the role pgbench is using as an explicit search_path set.

Even then, it's not a problem from the point of view of pgbench ---
the tables will still get created and used correctly.  The only problem
shows up if someone is ignoring the existing warning in the docs and
running pgbench in a database that has application tables named accounts
&etc.  If you're doing that you're at considerable risk anyway, no
matter *what* we do or don't do with pgbench's search path.
        regards, tom lane

Re: Patch to fix search_path defencies with pg_bench

From

Tom Lane

Date:

06 May 2009, 18:13:33

"Dickson S. Guedes" <listas@guedesoft.net> writes:
> So, in a way to avoid the scenario where a ROLE has an explicit
> search_path set to schemes that already have tables named same as the
> pgbench's tables, doesn't makes sense also create a "pgbench_" suffix
> for them?

Hm, just rename the standard scenario's tables to pgbench_accounts
etc?  Sure, but then we break custom pgbench scripts that happen
to be using the default tables for their own purposes.  There's
no free lunch.
        regards, tom lane

Re: Patch to fix search_path defencies with pg_bench

From

Simon Riggs

Date:

07 May 2009, 11:10:11

On Wed, 2009-05-06 at 15:18 -0400, Tom Lane wrote:
> "Dickson S. Guedes" <listas@guedesoft.net> writes:
> > Em Qua, 2009-05-06 s 09:37 -0400, Tom Lane escreveu:
> >> Seems like the right policy for that is "run pgbench in its own
> >> database". 
> 
> > A text warning about this could be shown at start of pgbench if the
> > target database isn't named "pgbench", for examplo, or just some text
> > could be added to the docs.
> 
> There already is a prominent warning in the pgbench docs:
> 
>         Caution
> 
>     pgbench -i creates four tables accounts, branches, history, and
>     tellers, destroying any existing tables of these names. Be very
>     careful to use another database if you have tables having these
>     names!

Holy Handgrenade, what a huge footgun! It doesn't even have a
conceivable upside.

The table names "accounts" and "history" are fairly common and a caution
isn't a sufficient safeguard on production data. We know the manual
rarely gets read *after* a problem, let alone beforehand.

We should check they are the correct tables before we just drop them.
Perhaps check for the comment "Tables for pgbench application. Not
production data" on the tables, which would be nice to add anyway.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support

Re: Patch to fix search_path defencies with pg_bench

From

Robert Haas

Date:

07 May 2009, 12:14:48

On Thu, May 7, 2009 at 10:12 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Wed, 2009-05-06 at 15:18 -0400, Tom Lane wrote:
>> "Dickson S. Guedes" <listas@guedesoft.net> writes:
>> > Em Qua, 2009-05-06 s 09:37 -0400, Tom Lane escreveu:
>> >> Seems like the right policy for that is "run pgbench in its own
>> >> database".
>>
>> > A text warning about this could be shown at start of pgbench if the
>> > target database isn't named "pgbench", for examplo, or just some text
>> > could be added to the docs.
>>
>> There already is a prominent warning in the pgbench docs:
>>
>>               Caution
>>
>>       pgbench -i creates four tables accounts, branches, history, and
>>       tellers, destroying any existing tables of these names. Be very
>>       careful to use another database if you have tables having these
>>       names!
>
> Holy Handgrenade, what a huge footgun! It doesn't even have a
> conceivable upside.
>
> The table names "accounts" and "history" are fairly common and a caution
> isn't a sufficient safeguard on production data. We know the manual
> rarely gets read *after* a problem, let alone beforehand.
>
> We should check they are the correct tables before we just drop them.
> Perhaps check for the comment "Tables for pgbench application. Not
> production data" on the tables, which would be nice to add anyway.

I bet it would be just as good and a lot simpler to do what someone
suggested upthread, namely s/^/pgbench_/

...Robert

Re: Patch to fix search_path defencies with pg_bench

From

Simon Riggs

Date:

07 May 2009, 12:27:37

On Thu, 2009-05-07 at 11:14 -0400, Robert Haas wrote:

> > We should check they are the correct tables before we just drop them.
> > Perhaps check for the comment "Tables for pgbench application. Not
> > production data" on the tables, which would be nice to add anyway.
> 
> I bet it would be just as good and a lot simpler to do what someone
> suggested upthread, namely s/^/pgbench_/

Running pgbench has become more popular now, with various people
recommending running it on every system to test performance. I don't
disagree with that recommendation, but I've had problems myself with a
similar issue - hence earlier patch to prevent pgbench running a
complete database VACUUM before every test run.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support

Re: Patch to fix search_path defencies with pg_bench

From

Aidan Van Dyk

Date:

07 May 2009, 13:47:24

* Robert Haas <robertmhaas@gmail.com> [090507 11:15]:
> I bet it would be just as good and a lot simpler to do what someone
> suggested upthread, namely s/^/pgbench_/

That has the "legacy compatibility" problem...

But seeing as "legacy" has a:SET search_path TO public;

And uses plain <table> in it's queries/creates/drops, couldn't we just
make "new" pgbench refer to tables as <schema>.<table> where <schema> is
"public"?  If we leave "schema" as public, and leave in the search_path,
we should be identical to what we currently have, except we've
explicliyt scoped was was searched for before.

And it leads to an easy way for people to change public (in the
search path and/or <schema>.<table>) to do other things (although I'm
not saying that's necessarily required or desired either).

a.

-- 
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

Re: Patch to fix search_path defencies with pg_bench

From

Tom Lane

Date:

07 May 2009, 13:47:43

Simon Riggs <simon@2ndQuadrant.com> writes:
> On Thu, 2009-05-07 at 11:14 -0400, Robert Haas wrote:
>>> We should check they are the correct tables before we just drop them.
>>> Perhaps check for the comment "Tables for pgbench application. Not
>>> production data" on the tables, which would be nice to add anyway.
>> 
>> I bet it would be just as good and a lot simpler to do what someone
>> suggested upthread, namely s/^/pgbench_/

> Running pgbench has become more popular now, with various people
> recommending running it on every system to test performance. I don't
> disagree with that recommendation, but I've had problems myself with a
> similar issue - hence earlier patch to prevent pgbench running a
> complete database VACUUM before every test run.

Well, pgbench has been coded this way since forever and we've only seen
this one report of trouble.  Still, I can't object very hard to renaming
the tables to pgbench_accounts etc --- it's a trivial change and the
only thing it could break is custom pgbench scenarios that rely on the
default scenario's tables, which there are probably not many of.

So do we have consensus on dropping the "SET search_path" and renaming
the tables?
        regards, tom lane

Re: Patch to fix search_path defencies with pg_bench

From

Tom Lane

Date:

07 May 2009, 13:53:41

Aidan Van Dyk <aidan@highrise.ca> writes:
> ... couldn't we just
> make "new" pgbench refer to tables as <schema>.<table> where <schema> is
> "public"?

I'd prefer not to do that because it changes the amount of parsing work
demanded by the benchmark.  Maybe not by enough to matter ... or maybe
it does.  Adjusting the length of the identifiers is a small enough
change that I'm prepared to believe it doesn't invalidate comparisons,
but changing the set of catalog lookups that occur is another question.
        regards, tom lane

Re: Patch to fix search_path defencies with pg_bench

From

"Joshua D. Drake"

Date:

07 May 2009, 13:53:48

On Thu, 2009-05-07 at 12:47 -0400, Tom Lane wrote:

> Well, pgbench has been coded this way since forever and we've only seen
> this one report of trouble.  Still, I can't object very hard to renaming
> the tables to pgbench_accounts etc --- it's a trivial change and the
> only thing it could break is custom pgbench scenarios that rely on the
> default scenario's tables, which there are probably not many of.
> 
> So do we have consensus on dropping the "SET search_path" and renaming
> the tables?

+1 (I hate prefixed table names but I get the idea)

Joshua D. Drake

> 
>             regards, tom lane
> 
-- 
PostgreSQL - XMPP: jdrake@jabber.postgresql.org  Consulting, Development, Support, Training  503-667-4564 -
http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997

Re: Patch to fix search_path defencies with pg_bench

From

Aidan Van Dyk

Date:

07 May 2009, 13:58:56

* Tom Lane <tgl@sss.pgh.pa.us> [090507 12:53]:
> Aidan Van Dyk <aidan@highrise.ca> writes:
> > ... couldn't we just
> > make "new" pgbench refer to tables as <schema>.<table> where <schema> is
> > "public"?
> 
> I'd prefer not to do that because it changes the amount of parsing work
> demanded by the benchmark.  Maybe not by enough to matter ... or maybe
> it does.  Adjusting the length of the identifiers is a small enough
> change that I'm prepared to believe it doesn't invalidate comparisons,
> but changing the set of catalog lookups that occur is another question.

True enough... What about making the prefix be configurable, so by
default, it could be "pgbench_", it could be set to "" (to force it to
use old pgbench names) or set to "something." to get it to use a
different schema (noting that the comparisons to older ones not doing
catalog lookups are void).

But by dropping the search_path, you're necessarily changing the catalog
comparisons and lookups anyways, because your now taking a "random"
search path to follow (which could have multiple entries in it) instead
of one guaranteed to be a single, useable entry.

-- 
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

Re: Patch to fix search_path defencies with pg_bench

From

"Joshua D. Drake"

Date:

07 May 2009, 14:02:21

On Thu, 2009-05-07 at 12:58 -0400, Aidan Van Dyk wrote:

> True enough... What about making the prefix be configurable, so by
> default, it could be "pgbench_", it could be set to "" (to force it to
> use old pgbench names) or set to "something." to get it to use a
> different schema (noting that the comparisons to older ones not doing
> catalog lookups are void).

Then you have to pass the prefix on the command line. That seems a bit
over doing it for such a simple utility.

> 
> But by dropping the search_path, you're necessarily changing the catalog
> comparisons and lookups anyways, because your now taking a "random"
> search path to follow (which could have multiple entries in it) instead
> of one guaranteed to be a single, useable entry.

Except that it isn't a guaranteed usable entry, which is why I submitted
the patch.

Sincerely,

Joshua D. Drake

-- 
PostgreSQL - XMPP: jdrake@jabber.postgresql.org  Consulting, Development, Support, Training  503-667-4564 -
http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997

Re: Patch to fix search_path defencies with pg_bench

From

Aidan Van Dyk

Date:

07 May 2009, 14:05:41

* Joshua D. Drake <jd@commandprompt.com> [090507 13:02]:
> On Thu, 2009-05-07 at 12:58 -0400, Aidan Van Dyk wrote:
> 
> > True enough... What about making the prefix be configurable, so by
> > default, it could be "pgbench_", it could be set to "" (to force it to
> > use old pgbench names) or set to "something." to get it to use a
> > different schema (noting that the comparisons to older ones not doing
> > catalog lookups are void).
> 
> Then you have to pass the prefix on the command line. That seems a bit
> over doing it for such a simple utility.

Sure, but by putting a sane default (which seems to be leaning towards
"" or "pgbench_"), you don't *need* to do anything on the command line.

> > But by dropping the search_path, you're necessarily changing the catalog
> > comparisons and lookups anyways, because your now taking a "random"
> > search path to follow (which could have multiple entries in it) instead
> > of one guaranteed to be a single, useable entry.
> 
> Except that it isn't a guaranteed usable entry, which is why I submitted
> the patch.

Well ya, but at least you didn't have any pgbench result to try and
"compare unevenly" with something else ;-)

a.

-- 
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.

Re: Patch to fix search_path defencies with pg_bench

From

Simon Riggs

Date:

07 May 2009, 14:05:50

On Thu, 2009-05-07 at 09:53 -0700, Joshua D. Drake wrote:
> On Thu, 2009-05-07 at 12:47 -0400, Tom Lane wrote:
> 
> > Well, pgbench has been coded this way since forever and we've only seen
> > this one report of trouble.  Still, I can't object very hard to renaming
> > the tables to pgbench_accounts etc --- it's a trivial change and the
> > only thing it could break is custom pgbench scenarios that rely on the
> > default scenario's tables, which there are probably not many of.
> > 
> > So do we have consensus on dropping the "SET search_path" and renaming
> > the tables?
> 
> +1 (I hate prefixed table names but I get the idea)

+1, sorry JD.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support

Re: Patch to fix search_path defencies with pg_bench

From

Tom Lane

Date:

07 May 2009, 14:08:28

"Joshua D. Drake" <jd@commandprompt.com> writes:
> On Thu, 2009-05-07 at 12:58 -0400, Aidan Van Dyk wrote:
>> True enough... What about making the prefix be configurable, so by
>> default, it could be "pgbench_", it could be set to "" (to force it to
>> use old pgbench names) or set to "something." to get it to use a
>> different schema (noting that the comparisons to older ones not doing
>> catalog lookups are void).

> Then you have to pass the prefix on the command line. That seems a bit
> over doing it for such a simple utility.

Yes, this seems like vastly more work than is called for.

>> But by dropping the search_path, you're necessarily changing the catalog
>> comparisons and lookups anyways, because your now taking a "random"
>> search path to follow (which could have multiple entries in it) instead
>> of one guaranteed to be a single, useable entry.

> Except that it isn't a guaranteed usable entry, which is why I submitted
> the patch.

I think this argument is bogus anyway.  The tables are always going to be
created in the default creation schema, ie, the first one on the search
path.  As long as you don't change the effective search_path between
pgbench -i and the actual test runs, it won't matter whether that is
public or something else.
        regards, tom lane

Re: Patch to fix search_path defencies with pg_bench

From

Tom Lane

Date:

07 May 2009, 19:02:56

"Joshua D. Drake" <jd@commandprompt.com> writes:
> --- a/contrib/pgbench/pgbench.c
> +++ b/contrib/pgbench/pgbench.c
> @@ -357,8 +357,6 @@ doConnect(void)
>                 return NULL;
>         }
>  
> -       executeStatement(conn, "SET search_path = public");
> -
>         return conn;
>  }

Applied along with changes of table names accounts -> pgbench_accounts
etc, per discussion.
        regards, tom lane

Re: Patch to fix search_path defencies with pg_bench

From

Greg Smith

Date:

07 May 2009, 21:07:53

On Thu, 7 May 2009, Aidan Van Dyk wrote:

> But by dropping the search_path, you're necessarily changing the catalog
> comparisons and lookups anyways, because your now taking a "random"
> search path to follow (which could have multiple entries in it) instead
> of one guaranteed to be a single, useable entry.

You are correct here.  Right now, pgbench is guaranteed to be running 
against a search_path with only one entry in it.  If someone runs the new 
version against a configuration with something like:

search_path='a,b,c,d,e,f,g,h,i,j,public'

instead, that is going to execute more slowly than the current pgbench 
would have.

But it seems pretty unlikely such a change would actually be noticable 
relative to how much per-transaction overhead and run to run variation 
there already is in pgbench for reasonably sized catalogs.  Maybe it's 
worth adding a quick comment about the issue in the docs, I don't think 
this downside is significant enough to worry about beyond that.

I think Joshua's original suggestion here is worth considering a bug fix 
for merging into 8.4.  As long as testers don't do anything crazy with 
their manually set search_path, results should be identical with the 
earlier verions.

The additional suggestion of renaming the tables with a prefix is 
reasonable to me, but it seems way out of scope for something to consider 
applying right now though.  If you look at the pgbench client, there's a 
lot of string parsing going on in there that's not particularly efficient. 
I'd want to see a benchmark aimed that quantifying whether that part 
suffers measurably from making the table names all longer before such a 
change got applied.  And there's already a couple of us who are in the 
middle of 8.4 pgbench tests that really don't need disruption like that 
thrown into the mix right now.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Patch to fix search_path defencies with pg_bench

From

Tom Lane

Date:

07 May 2009, 21:11:27

Greg Smith <gsmith@gregsmith.com> writes:
> On Thu, 7 May 2009, Aidan Van Dyk wrote:
> You are correct here.  Right now, pgbench is guaranteed to be running 
> against a search_path with only one entry in it.  If someone runs the new 
> version against a configuration with something like:

> search_path='a,b,c,d,e,f,g,h,i,j,public'

> instead, that is going to execute more slowly than the current pgbench 
> would have.

No, it is not.  The tables will be created and used in schema 'a',
and the effective search path depth will be the same.
        regards, tom lane

Re: Patch to fix search_path defencies with pg_bench

From

Greg Smith

Date:

08 May 2009, 16:02:13

On Thu, 7 May 2009, Tom Lane wrote:

> The tables will be created and used in schema 'a', and the effective 
> search path depth will be the same.

The case to be concerned about here is where the search_path changes 
between initialization and the pgbench run, which certainly isn't 
impossible.  That can leave you with a longer effective path to search. 
Pretty unlikely to be a problem in the field though.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Patch to fix search_path defencies with pg_bench

From

Tom Lane

Date:

08 May 2009, 16:11:18

Greg Smith <gsmith@gregsmith.com> writes:
> On Thu, 7 May 2009, Tom Lane wrote:
>> The tables will be created and used in schema 'a', and the effective 
>> search path depth will be the same.

> The case to be concerned about here is where the search_path changes 
> between initialization and the pgbench run, which certainly isn't 
> impossible.  That can leave you with a longer effective path to search. 
> Pretty unlikely to be a problem in the field though.

Yeah.  Also, there is another consideration here that hasn't been
brought up AFAIR: the main point of running pgbench in-the-field
is to find out whether your installation is properly tuned.  If
you've chosen a search_path setting that *did* have some unexpected
performance issues, wouldn't you want pgbench to reveal that?
It's peculiar to have pgbench forcing this one particular GUC setting
and not any others.
        regards, tom lane