Thread: forced sequential scan when condition has current_user

forced sequential scan when condition has current_user

From
Keresztury Balázs
Date:
hi,

just a small question: is it normal that PostgreSQL 8.4.1 always uses
sequential scanning on any table when there is a condition having the
constant "current_user"? Of course there is a btree index set on that table,
but the DBMS just doesn't want to utilize it. When I replace current_user to
any string, the planner uses the index normally.

I can demonstrate it with the following simple query:

SELECT psz.kotesszam FROM projekt.projektszervezet psz WHERE
psz.felhasznalo_id = current_user;

Explain analyze:

"Seq Scan on projektszervezet psz  (cost=0.00..255.07 rows=42 width=9)"
"  Filter: ((felhasznalo_id)::name = "current_user"())"

Metadata:

CREATE TABLE "projekt"."projektszervezet" (
  CONSTRAINT "projektszervezet_pkey" PRIMARY KEY("kotesszam",
"felhasznalo_id"),
  CONSTRAINT "projektszervezet_fk_felhasznalo" FOREIGN KEY
("felhasznalo_id")
    REFERENCES "felhasznalo"."felhasznalo"("felhasznalo_id")
    ON DELETE RESTRICT
    ON UPDATE RESTRICT
    NOT DEFERRABLE,
  CONSTRAINT "projektszervezet_fk_projekt" FOREIGN KEY ("kotesszam")
    REFERENCES "projekt"."projekt"("kotesszam")
    ON DELETE CASCADE
    ON UPDATE CASCADE
    NOT DEFERRABLE,
  CONSTRAINT "projektszervezet_fk_szerep" FOREIGN KEY ("szerep_id")
    REFERENCES "felhasznalo"."szerep"("szerep_id")
    ON DELETE RESTRICT
    ON UPDATE RESTRICT
    NOT DEFERRABLE
) INHERITS ("projekt"."projektszervezet_sablon")
WITH OIDS;

CREATE INDEX "projektszervezet_idx_felhasznalo_id" ON
"projekt"."projektszervezet"
  USING btree ("felhasznalo_id");

CREATE INDEX "projektszervezet_idx_kotesszam" ON
"projekt"."projektszervezet"
  USING btree ("kotesszam");

CREATE TRIGGER "projektszervezet_archivalas" BEFORE UPDATE OR DELETE ON
"projekt"."projektszervezet" FOR EACH ROW EXECUTE PROCEDURE
"public"."projektszervezet_archivalas_trigger"();

CREATE TRIGGER "projektszervezet_idopecset" BEFORE UPDATE ON
"projekt"."projektszervezet" FOR EACH ROW EXECUTE PROCEDURE
"public"."idopecset_trigger"();

CREATE TRIGGER "projektszervezet_naplozas" BEFORE INSERT OR UPDATE OR DELETE
ON "projekt"."projektszervezet" FOR EACH ROW EXECUTE PROCEDURE
"public"."projektszervezet_naplozas_trigger"();

Inherited table:

CREATE TABLE "projekt"."projektszervezet_sablon" (
  "kotesszam" VARCHAR(10) NOT NULL,
  "felhasznalo_id" VARCHAR NOT NULL,
  "szerep_id" VARCHAR(3),
  "felvivo" VARCHAR DEFAULT "current_user"() NOT NULL,
  "felvitel_idopont" TIMESTAMP WITHOUT TIME ZONE DEFAULT now() NOT NULL,
  "modosito" VARCHAR,
  "modositas_idopont" TIMESTAMP WITHOUT TIME ZONE,
  "elso_felvivo" VARCHAR DEFAULT "current_user"() NOT NULL,
  "elso_felvitel_idopont" TIMESTAMP(0) WITHOUT TIME ZONE DEFAULT now() NOT
NULL
) WITH OIDS;

CREATE TRIGGER "projektszervezet_idopecset" BEFORE UPDATE ON
"projekt"."projektszervezet_sablon" FOR EACH ROW EXECUTE PROCEDURE
"public"."idopecset_trigger"();


Thanks!
Balazs


Re: forced sequential scan when condition has current_user

From
Robert Haas
Date:
2010/1/4 Keresztury Balázs <balazs@gaslightmusic.hu>:
> just a small question: is it normal that PostgreSQL 8.4.1 always uses
> sequential scanning on any table when there is a condition having the
> constant "current_user"? Of course there is a btree index set on that table,
> but the DBMS just doesn't want to utilize it. When I replace current_user to
> any string, the planner uses the index normally.
>
> I can demonstrate it with the following simple query:
>
> SELECT psz.kotesszam FROM projekt.projektszervezet psz WHERE
> psz.felhasznalo_id = current_user;
>
> Explain analyze:
>
> "Seq Scan on projektszervezet psz  (cost=0.00..255.07 rows=42 width=9)"
> "  Filter: ((felhasznalo_id)::name = "current_user"())"

You've only got 42 rows in that table - PostgreSQL probably thinks a
sequential scan will be faster.  It might even be right.  The thing
is, PostgreSQL doesn't know at planning time what the value of
current_user() will be, so the plan can't depend on that; the planner
just takes its best shot.  But if you provide a particular value in
the query then it will look at the stats and see what seems to make
the most sense for that particular value.  So using one of the more
commonly-occuring value in the table might produce a sequential scan,
while a less common value might lead to an index scan.

...Robert

Re: forced sequential scan when condition has current_user

From
Keresztury Balázs
Date:
Actually table itself has ~8000 records. I don't know why does it report 42
rows, since there is not even a matching row in the table for this specific
condition.. But as we all know, the universal answer for every question is
42 ;) Autovacuum is on, and I also did some vacuuming before I started to
play with this query.

I could implement a function into my application to replace current_user to
the actual username, but it just doesn't worth it. By the way, replacing
current_user to a text constant reduces cost from 255->72, so there is a
significant difference. Don't you think this is actually a bug, not a
feature?

balazs

-----Original Message-----
From: Robert Haas [mailto:robertmhaas@gmail.com]
Sent: Monday, January 04, 2010 10:59 PM
To: Keresztury Balázs
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] forced sequential scan when condition has
current_user

2010/1/4 Keresztury Balázs <balazs@gaslightmusic.hu>:
> just a small question: is it normal that PostgreSQL 8.4.1 always uses
> sequential scanning on any table when there is a condition having the
> constant "current_user"? Of course there is a btree index set on that
table,
> but the DBMS just doesn't want to utilize it. When I replace current_user
to
> any string, the planner uses the index normally.
>
> I can demonstrate it with the following simple query:
>
> SELECT psz.kotesszam FROM projekt.projektszervezet psz WHERE
> psz.felhasznalo_id = current_user;
>
> Explain analyze:
>
> "Seq Scan on projektszervezet psz  (cost=0.00..255.07 rows=42 width=9)"
> "  Filter: ((felhasznalo_id)::name = "current_user"())"

You've only got 42 rows in that table - PostgreSQL probably thinks a
sequential scan will be faster.  It might even be right.  The thing
is, PostgreSQL doesn't know at planning time what the value of
current_user() will be, so the plan can't depend on that; the planner
just takes its best shot.  But if you provide a particular value in
the query then it will look at the stats and see what seems to make
the most sense for that particular value.  So using one of the more
commonly-occuring value in the table might produce a sequential scan,
while a less common value might lead to an index scan.

...Robert


Re: forced sequential scan when condition has current_user

From
Erik Jones
Date:
On Jan 4, 2010, at 1:59 PM, Robert Haas wrote:

> The thing is, PostgreSQL doesn't know at planning time what the value of
> current_user() will be, so the plan can't depend on that; the planner
> just takes its best shot.

current_user() is a stable function and the manual is explicit that the result of stable function can be used in an
indexscan: 

"A STABLE function cannot modify the database and is guaranteed to return the same results given the same arguments for
allrows within a single statement. This category allows the optimizer to optimize multiple calls of the function to a
singlecall. In particular, it is safe to use an expression containing such a function in an index scan condition.
(Sincean index scan will evaluate the comparison value only once, not once at each row, it is not valid to use a
VOLATILEfunction in an index scan condition.)" 

postgres=# select provolatile from pg_proc where proname = 'current_user';
 provolatile
-------------
 s

So, I think the OP's question is still valid.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k






Re: forced sequential scan when condition has current_user

From
Robert Haas
Date:
2010/1/4 Erik Jones <ejones@engineyard.com>:
> On Jan 4, 2010, at 1:59 PM, Robert Haas wrote:
>> The thing is, PostgreSQL doesn't know at planning time what the value of
>> current_user() will be, so the plan can't depend on that; the planner
>> just takes its best shot.
>
> current_user() is a stable function and the manual is explicit that the result of stable function can be used in an
indexscan: 

That's true, but what I said is also true.  It CAN be used in an index
scan, and on a sufficiently large table it WILL be used in an index
scan (I tried it).  But the planner doesn't automatically use an index
just because there is one; it tries to gauge whether that's the right
strategy.  Unfortunately, in cases where it is comparing to a function
rather than a constant, its estimates are not always terribly
accurate.

One thing I notice is that the OP has not included any information on
how fast the seqscan or index-scan actually is.  If the seqscan is
slower than the index-scan, then the OP might want to consider
adjusting the page cost parameters - EXPLAIN ANALYZE output for both
plans (perhaps obtained by temporarily setting enable_seqscan to
false) would be helpful in understanding what is happening.

...Robert

Re: forced sequential scan when condition has current_user

From
Craig Ringer
Date:
Erik Jones wrote:
> On Jan 4, 2010, at 1:59 PM, Robert Haas wrote:
>
>> The thing is, PostgreSQL doesn't know at planning time what the value of
>> current_user() will be, so the plan can't depend on that; the planner
>> just takes its best shot.
>
> current_user() is a stable function and the manual is explicit that the result of stable function can be used in an
indexscan: 

Yes ... but the planner doesn't know the value current_user will return,
so it can't use its statistics on the frequency with which a
_particular_ value occurs to make decisions. It has to come up with the
best generic plan for any value that current_user might return. It's as
if current_user were a query parameter that won't be resolved until
EXECUTE time.

Arguably, in this particular case the planner *could* know what value
current_user will return, but adding such special cases to the planner
without a really good reason seems undesirable.

--
Craig Ringer

Re: forced sequential scan when condition has current_user

From
Tom Lane
Date:
Craig Ringer <craig@postnewspapers.com.au> writes:
> Erik Jones wrote:
>> current_user() is a stable function and the manual is explicit that the result of stable function can be used in an
indexscan: 

> Yes ... but the planner doesn't know the value current_user will return,

I think it's got nothing to do with that and everything to do with the
fact that he's comparing to a varchar rather than text column ...

            regards, tom lane

Re: forced sequential scan when condition has current_user

From
"Fernando Hevia"
Date:

> -----Mensaje original-----
> De: Keresztury Balázs
>
> hi,
>
> just a small question: is it normal that PostgreSQL 8.4.1
> always uses sequential scanning on any table when there is a
> condition having the constant "current_user"? Of course there
> is a btree index set on that table, but the DBMS just doesn't
> want to utilize it. When I replace current_user to any
> string, the planner uses the index normally.
>
> I can demonstrate it with the following simple query:
>
> SELECT psz.kotesszam FROM projekt.projektszervezet psz WHERE
> psz.felhasznalo_id = current_user;
>

Probably you are comparing different types. Try explicitly casting
current_user to text:

SELECT psz.kotesszam FROM projekt.projektszervezet psz WHERE
psz.felhasznalo_id = current_user::text




Re: forced sequential scan when condition has current_user

From
Keresztury Balázs
Date:
You are right along with the others, the seq scan was only forced because of
the varchar-text comparision.. Using the cast solves the problem.

Thanks for the answers everyone!

Balazs



-----Original Message-----
From: Fernando Hevia [mailto:fhevia@ip-tel.com.ar]
Sent: Tuesday, January 05, 2010 3:16 PM
To: 'Keresztury Balázs'; pgsql-performance@postgresql.org
Subject: RE: [PERFORM] forced sequential scan when condition has
current_user



> -----Mensaje original-----
> De: Keresztury Balázs
>
> hi,
>
> just a small question: is it normal that PostgreSQL 8.4.1
> always uses sequential scanning on any table when there is a
> condition having the constant "current_user"? Of course there
> is a btree index set on that table, but the DBMS just doesn't
> want to utilize it. When I replace current_user to any
> string, the planner uses the index normally.
>
> I can demonstrate it with the following simple query:
>
> SELECT psz.kotesszam FROM projekt.projektszervezet psz WHERE
> psz.felhasznalo_id = current_user;
>

Probably you are comparing different types. Try explicitly casting
current_user to text:

SELECT psz.kotesszam FROM projekt.projektszervezet psz WHERE
psz.felhasznalo_id = current_user::text