Thread: PostgreSQL 7.1 forces sequence scan when there is no reason

PostgreSQL 7.1 forces sequence scan when there is no reason

From

Denis Perchine

Date:

20 May 2002, 12:22:50

Hello,

I have a problem with PostgreSQL 7.1 forces sequence scan which is
almost 10 times slower than index scan.

Here is the information about query, tables, and data.

Any advise would be nice.

db=> \d listmembers
                             Table "listmembers"
  Column   |  Type   |                       Modifiers
-----------+---------+-------------------------------------------------------
- id        | integer | not null default
 nextval('"listmembers_id_seq"'::text) server_id | integer |
 name      | text    |
 email     | text    |
Indexes: listmembers_sid_key
Unique keys: listmembers_id_key,
             listmembers_sid_email_key
db=> \d listmembers_sid_key
Index "listmembers_sid_key"
  Column   |  Type
-----------+---------
 server_id | integer
btree
db=> \d listmembers_sid_email_key
Index "listmembers_sid_email_key"
  Column   |  Type
-----------+---------
 server_id | integer
 email     | text
unique btree
db=> explain analyze select count(*) from listmembers where
 server_id = 15182; NOTICE:  QUERY PLAN:
Aggregate  (cost=31428.34..31428.34 rows=1 width=0) (actual
 time=38632.84..38632.84 rows=1 loops=1) ->  Seq Scan on listmembers
 (cost=0.00..31379.16 rows=19671 width=0) (actual time=198.13..38494.36
 rows=10011 loops=1) Total runtime: 38633.01 msec
EXPLAIN
db=> set enable_seqscan to no;
SET VARIABLE
db=> explain analyze select count(*) from listmembers where
 server_id = 15182; NOTICE:  QUERY PLAN:
Aggregate  (cost=60210.41..60210.41 rows=1 width=0) (actual
 time=2117.61..2117.61 rows=1 loops=1) ->  Index Scan using
 listmembers_sid_key on listmembers  (cost=0.00..60161.24 rows=19671 width=0)
 (actual time=81.41..2096.67 rows=10011 loops=1) Total runtime: 2117.83 msec
EXPLAIN
db=> select count(*) from listmembers where server_id = 15182;
 count
-------
 10011
(1 row)
db=> select count(*) from listmembers;
  count
---------
 1372425
(1 row)
db=> select server_id, count(*) from listmembers group by
 server_id order by count(*) desc limit 30; server_id | count
-----------+-------
     34062 | 43154
     32715 | 32232
     42495 | 31034
     38013 | 15905
     13746 | 15237
     42434 | 12045
     31696 | 10539
     15413 | 10334
     15182 | 10011
     28178 | 10000
     33330 | 10000
     37071 |  9960
     46532 |  9860
     15392 |  9737
     29643 |  9423
     36503 |  9329
     25378 |  9280
     32095 |  9079
     28621 |  9047
     15400 |  9043
     25870 |  8972
     38377 |  8724
     13046 |  8484
     42416 |  8442
     14869 |  7973
     24131 |  7940
     32625 |  7918
     46480 |  7783
     43172 |  7179
     36849 |  6887
(30 rows)
webmailstation=# select * from pg_class where relname='listmembers';
   relname   | reltype | relowner | relam | relfilenode | relpages |
 reltuples  | reltoastrelid | reltoastidxid | relhasindex | relisshared |
 relkind | relnatts | relchecks | reltriggers | relukeys | relfkeys | relrefs
 | relhasoids | relhaspkey | relhasrules | relhassubclass | relacl
 -------------+---------+----------+-------+-------------+----------+--------
-----+---------------+---------------+-------------+-------------+---------+-
---------+-----------+-------------+----------+----------+---------+---------
---+------------+-------------+----------------+-------- listmembers |
 6429403 |      102 |     0 |     6429402 |    14224 | 1.37241e+06 |
 6429404 |             0 | t           | f           | r       |        4 |
       0 |           0 |        0 |        0 |       0 | t          | f
    | f           | f              | (1 row)
db=# select * from pg_statistic where starelid=6429402 ;
 starelid | staattnum | stanullfrac | stawidth | stadistinct | stakind1 |
 stakind2 | stakind3 | stakind4 | staop1 | staop2 | staop3 | staop4 |
                                         stanumbers1
                         | stanumbers2 | stanumbers3  | stanumbers4 |

                              stavalues1

 |
                                 stavalues2
                                                                          |
 stavalues3 | stavalues4
 ----------+-----------+-------------+----------+-------------+----------+---
-------+----------+----------+--------+--------+--------+--------+-----------
-----------------------------------------------------------------------------
----------------------+-------------+--------------+-------------+-----------
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
---------------------------------------------------------------------------+-
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
-----------------------------------------------------------------------+-----
-------+------------ 6429402 |         1 |           0 |        4 |
 -1 |        2 |        3 |        0 |        0 |     97 |     97 |      0 |
     0 |
                                         | {0.805365}  |              |
       |
 {590,520800,790589,1001533,1375234,1655946,1926816,2342644,2808910,3347435,3
532408}

         |


     |            | 6429402 |         2 |           0 |        4 |
 1150 |        1 |        2 |        3 |        0 |     96 |     97 |     97
 |      0 |
 {0.038,0.027,0.024,0.0143333,0.012,0.01,0.00933333,0.00766667,0.00733333,0.0
07}                              |             | {0.428932}   |             |
 {34062,32715,42495,15182,38013,36503,13746,46532,42434,15392}


         |
 {12855,15419,16223,20598,26019,30733,34402,38139,40811,44986,49509}

                                                                        |
        | 6429402 |         3 |   0.0376667 |       10 |        2581 |
 1 |        2 |        3 |        0 |     98 |    664 |    664 |      0 |
 {0.466333,0.00333333,0.003,0.002,0.002,0.00166667,0.00133333,0.00133333,0.00
1,0.001}                         |             | {0.227739}   |             |
 {"",webwizards,"The Ad Builder Safe List","Kane at InternetSeer","Network
 Commerce",Michael,James,John,Admin,"D.Woodward "}

            | {" Success Center","Brent Sommers",Debra,"Great
 Vacations","Johnny Blaze",Mariani,"Peter Maglione","Silhouettes
 Catalog",Wally,"johan kotze",жЛжН}
                                     |            | 6429402 |         4 |
       0 |       25 |   -0.118184 |        1 |        2 |        3 |        0
 |     98 |    664 |    664 |      0 |
 {0.00133333,0.001,0.001,0.000666667,0.000666667,0.000666667,0.000666667,0.00
0666667,0.000666667,0.000666667} |             | {-0.0167706} |             |
 {servicebox@spedia.net,dougsreplyto3@excite.com,inquire@careerexpansion.com,
234freeb@webwizards-add-url.com,cashdueu@hotmail.com,cashonline1@excite.com,c
wmailer@yahoo.com,galaxy@mail2.galaxy.com,gmichel@post.com,half.com_by_ebay@h
alf.com} |
 {05078475@email.com,bethebest@zwallet.com,cynric7@yahoo.com,ezine@yourhomejo
b.com,ilkst@beeline.samara.ru,kirk.stensrud@lpl.com,mjm@netset.com,ping13013@
yahoo.fr,sandrac@menta.net,tgaeke@worldsubmitter.com,zzzmuffin@aol.com} |
        | (4 rows)

-------------------------------------------------------

Re: PostgreSQL 7.1 forces sequence scan when there is no reason

From

Tom Lane

Date:

20 May 2002, 13:48:35

Denis Perchine <dyp@perchine.com> writes:
> db=> explain analyze select count(*) from listmembers where
>  server_id = 15182; NOTICE:  QUERY PLAN:
> Aggregate  (cost=31428.34..31428.34 rows=1 width=0) (actual
>  time=38632.84..38632.84 rows=1 loops=1) ->  Seq Scan on listmembers
>  (cost=0.00..31379.16 rows=19671 width=0) (actual time=198.13..38494.36
>  rows=10011 loops=1) Total runtime: 38633.01 msec
> EXPLAIN
> db=> set enable_seqscan to no;
> SET VARIABLE
> db=> explain analyze select count(*) from listmembers where
>  server_id = 15182; NOTICE:  QUERY PLAN:
> Aggregate  (cost=60210.41..60210.41 rows=1 width=0) (actual
>  time=2117.61..2117.61 rows=1 loops=1) ->  Index Scan using
>  listmembers_sid_key on listmembers  (cost=0.00..60161.24 rows=19671 width=0)
>  (actual time=81.41..2096.67 rows=10011 loops=1) Total runtime: 2117.83 msec
> EXPLAIN

Hm.  Is it possible that the rows with server_id = 15182 are clustered
together?  Given that you are fetching 10011 rows from a 14224-page
table, it seems unlikely that an indexscan could be such a big win
unless there was a very strong clustering effect.

> db=# select * from pg_statistic where starelid=6429402 ;

This is pretty unhelpful (not to mention unreadable) since we have no
idea which row is which.  Could we see the pg_stats view, instead?

            regards, tom lane

Re: PostgreSQL 7.1 forces sequence scan when there is no reason

From

Denis Perchine

Date:

20 May 2002, 14:29:03

On Monday 20 May 2002 21:48, Tom Lane wrote:
> Denis Perchine <dyp@perchine.com> writes:
> > db=> explain analyze select count(*) from listmembers where
> >  server_id = 15182; NOTICE:  QUERY PLAN:
> > Aggregate  (cost=31428.34..31428.34 rows=1 width=0) (actual
> >  time=38632.84..38632.84 rows=1 loops=1) ->  Seq Scan on listmembers
> >  (cost=0.00..31379.16 rows=19671 width=0) (actual time=198.13..38494.36
> >  rows=10011 loops=1) Total runtime: 38633.01 msec
> > EXPLAIN
> > db=> set enable_seqscan to no;
> > SET VARIABLE
> > db=> explain analyze select count(*) from listmembers where
> >  server_id = 15182; NOTICE:  QUERY PLAN:
> > Aggregate  (cost=60210.41..60210.41 rows=1 width=0) (actual
> >  time=2117.61..2117.61 rows=1 loops=1) ->  Index Scan using
> >  listmembers_sid_key on listmembers  (cost=0.00..60161.24 rows=19671
> > width=0) (actual time=81.41..2096.67 rows=10011 loops=1) Total runtime:
> > 2117.83 msec EXPLAIN
>
> Hm.  Is it possible that the rows with server_id = 15182 are clustered
> together?  Given that you are fetching 10011 rows from a 14224-page
> table, it seems unlikely that an indexscan could be such a big win
> unless there was a very strong clustering effect.

Possible, but 10 000 records are less than 1% of all records.
How can I figure out whether they are clustered.

> > db=# select * from pg_statistic where starelid=6429402 ;
>
> This is pretty unhelpful (not to mention unreadable) since we have no
> idea which row is which.  Could we see the pg_stats view, instead?

db=# select * from pg_stats where tablename='listmembers';
  tablename  |  attname  | null_frac | avg_width | n_distinct |
                                                      
 most_common_vals
        
 |                                              most_common_freqs
|
histogram_bounds
 
| correlation

-------------+-----------+-----------+-----------+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------
 listmembers | id        |         0 |         4 |         -1 |

                                                             
 |
 |
{590,520800,790589,1001533,1375234,1655946,1926816,2342644,2808910,3347435,3532408}
                                                                                                        
 |    0.805365
 listmembers | server_id |         0 |         4 |       1150 |
{34062,32715,42495,15182,38013,36503,13746,46532,42434,15392}
                                                                                                                      
 |
{0.038,0.027,0.024,0.0143333,0.012,0.01,0.00933333,0.00766667,0.00733333,0.007}
 | {12855,15419,16223,20598,26019,30733,34402,38139,40811,44986,49509}
                                                                                                           
 |    0.428932
 listmembers | name      | 0.0376667 |        10 |       2581 |
{"",webwizards,"The Ad Builder Safe List","Kane at InternetSeer","Network
Commerce",Michael,James,John,Admin,"D.Woodward "}
                                            
 |
{0.466333,0.00333333,0.003,0.002,0.002,0.00166667,0.00133333,0.00133333,0.001,0.001}
| {" Success Center","Brent Sommers",Debra,"Great Vacations","Johnny
Blaze",Mariani,"Peter Maglione","Silhouettes Catalog",Wally,"johan
kotze",жЛжН}
|    0.227739
 listmembers | email     |         0 |        25 |  -0.118184 |

{servicebox@spedia.net,dougsreplyto3@excite.com,inquire@careerexpansion.com,234freeb@webwizards-add-url.com,cashdueu@hotmail.com,cashonline1@excite.com,cwmailer@yahoo.com,galaxy@mail2.galaxy.com,gmichel@post.com,half.com_by_ebay@half.com}

|
{0.00133333,0.001,0.001,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667}
|

{05078475@email.com,bethebest@zwallet.com,cynric7@yahoo.com,ezine@yourhomejob.com,ilkst@beeline.samara.ru,kirk.stensrud@lpl.com,mjm@netset.com,ping13013@yahoo.fr,sandrac@menta.net,tgaeke@worldsubmitter.com,zzzmuffin@aol.com}

|  -0.0167706
(4 rows)

--
Denis

Re: PostgreSQL 7.1 forces sequence scan when there is no reason

From

Tom Lane

Date:

20 May 2002, 14:57:04

Denis Perchine <dyp@perchine.com> writes:
> On Monday 20 May 2002 21:48, Tom Lane wrote:
>> Hm.  Is it possible that the rows with server_id = 15182 are clustered
>> together?  Given that you are fetching 10011 rows from a 14224-page
>> table, it seems unlikely that an indexscan could be such a big win
>> unless there was a very strong clustering effect.

> Possible, but 10 000 records are less than 1% of all records.
> How can I figure out whether they are clustered.

Look at the ctid column for those records.  The range of block numbers
in the ctids would tell the tale.  I don't think Postgres itself
provides any operations on type TID, but you could dump the info into
a file and then analyze it.

>  listmembers | server_id |         0 |         4 |       1150 |
> {34062,32715,42495,15182,38013,36503,13746,46532,42434,15392}
>  |
> {0.038,0.027,0.024,0.0143333,0.012,0.01,0.00933333,0.00766667,0.00733333,0.007}
>  | {12855,15419,16223,20598,26019,30733,34402,38139,40811,44986,49509}
>  |    0.428932

Hmm.  Correlation 0.43 is high enough to suggest that there's some
clustering effect.  If you look in the archives there's been prior
discussion about whether to make the optimizer weight the correlation
factor more strongly.

            regards, tom lane

Re: PostgreSQL 7.1 forces sequence scan when there is no reason

From

Denis Perchine

Date:

21 May 2002, 13:37:15

On Monday 20 May 2002 22:54, Tom Lane wrote:
> >  listmembers | server_id |         0 |         4 |       1150 |
> > {34062,32715,42495,15182,38013,36503,13746,46532,42434,15392}
> >
> > {0.038,0.027,0.024,0.0143333,0.012,0.01,0.00933333,0.00766667,0.00733333,
> >0.007}
> >
> >  | {12855,15419,16223,20598,26019,30733,34402,38139,40811,44986,49509}
> >  |    0.428932
>
> Hmm.  Correlation 0.43 is high enough to suggest that there's some
> clustering effect.  If you look in the archives there's been prior
> discussion about whether to make the optimizer weight the correlation
> factor more strongly.

I have increased sequence scan weight 5 times. And it uses index scan now, but
this is not the solution. I will have a look in the archives.

--
Denis

ERROR: value too long for type character varying(255)

From

Doug Fields

Date:

21 May 2002, 16:34:41

Hello again,

In PG 7.1.3, ISTR I used to be able to submit any length I wished for a
VARCHAR (and maybe a CHAR).

Now, I'm at 7.2.1 and get these errors:

ERROR:  value too long for type character varying(255)

as JDBC Exceptions.

Do you know how I can tell it to "silently truncate" the data as I believe
it used to?

Many thanks,

Doug

PS: This group is great, and some of you (you know who you are) are amazing.

Re: ERROR: value too long for type character varying(255)

From

Neil Conway

Date:

21 May 2002, 21:31:30

On Tue, 21 May 2002 12:17:38 -0400
"Doug Fields" <dfields-pg-general@pexicom.com> wrote:
> In PG 7.1.3, ISTR I used to be able to submit any length I wished for a
> VARCHAR (and maybe a CHAR).
>
> Now, I'm at 7.2.1 and get these errors:
>
> ERROR:  value too long for type character varying(255)
>
> as JDBC Exceptions.
>
> Do you know how I can tell it to "silently truncate" the data as I believe
> it used to?

Is there a reason you can't just use TEXT? It will be stored the same
internally...

But the most obvious way I can think of to do what you're asking is
to define a rule to replace the value inserted with a call to
substring().

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

Re: ERROR: value too long for type character

From

Doug Fields

Date:

21 May 2002, 21:35:13

> > Do you know how I can tell it to "silently truncate" the data as I believe
> > it used to?
>
>Is there a reason you can't just use TEXT? It will be stored the same
>internally...

The only reason being that I don't want to waste space storing arbitrarily
long pieces of information.

I was not aware that TEXT is just as (in)efficient as VARCHAR, though. If
that is truly the case, I will consider using TEXT in some places, but only
when I am very careful in the INSERT and UPDATE clauses so as not to store
too long things.

>But the most obvious way I can think of to do what you're asking is
>to define a rule to replace the value inserted with a call to
>substring().

So, basically:

1) There's absolutely no way to get back the 7.1.3 "silent truncation"
behavior (which is, IMO, stupid, I like the non-standard truncation behavior)
2) You have to truncate on INSERT/UPDATE and continue using VARCHAR(x)
3) Instead of 2, you can use TEXT

Several others have asked this question since 7.2 appeared but none
received a response to #1.

Thanks,

Doug

Re: ERROR: value too long for type character

From

Stephan Szabo

Date:

22 May 2002, 00:00:42

On Tue, 21 May 2002, Doug Fields wrote:

>
> > > Do you know how I can tell it to "silently truncate" the data as I believe
> > > it used to?
> >
> >Is there a reason you can't just use TEXT? It will be stored the same
> >internally...
>
> The only reason being that I don't want to waste space storing arbitrarily
> long pieces of information.
>
> I was not aware that TEXT is just as (in)efficient as VARCHAR, though. If
> that is truly the case, I will consider using TEXT in some places, but only
> when I am very careful in the INSERT and UPDATE clauses so as not to store
> too long things.
>
> >But the most obvious way I can think of to do what you're asking is
> >to define a rule to replace the value inserted with a call to
> >substring().
>
> So, basically:
>
> 1) There's absolutely no way to get back the 7.1.3 "silent truncation"
> behavior (which is, IMO, stupid, I like the non-standard truncation behavior)

You could do this via a text column and a before trigger which might work
better than a rule, but I haven't tried either, so...

Re: ERROR: value too long for type character varying(255)

From

Jan Wieck

Date:

22 May 2002, 01:45:22

Neil Conway wrote:
> On Tue, 21 May 2002 12:17:38 -0400
> "Doug Fields" <dfields-pg-general@pexicom.com> wrote:
> > In PG 7.1.3, ISTR I used to be able to submit any length I wished for a
> > VARCHAR (and maybe a CHAR).
> >
> > Now, I'm at 7.2.1 and get these errors:
> >
> > ERROR:  value too long for type character varying(255)
> >
> > as JDBC Exceptions.
> >
> > Do you know how I can tell it to "silently truncate" the data as I believe
> > it used to?
>
> Is there a reason you can't just use TEXT? It will be stored the same
> internally...
>
> But the most obvious way I can think of to do what you're asking is
> to define a rule to replace the value inserted with a call to
> substring().

    And  exactly  that will fail because the rewritten query will
    be of the same operation type on the same  table,  triggering
    the same rule again ...

    The way I see is to use TEXT or VARCHAR without a size limit,
    and defining a custom BEFORE trigger that truncates  the  NEW
    value.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #