Thread: Neverending query on 6.5.2 over Solaris 2.5.1

Neverending query on 6.5.2 over Solaris 2.5.1

From
Fernando Schapachnik
Date:
Hello:

(This is a repost of a mail to -general; as nobody could answer me 
there, and according to the Mailing Lists home page, I'm sending it 
here.)

Maybe somebody could give me some clue about what is happening: 
I have 6.5.0 running over Solaris 2.5.1 SPARC. I have a 
database with 5 tables, 3 of them with < 100 records and 2 ("usuarios" 
and "passwd") with > 10000. When querying for:

SELECT u.nombre_cuenta, per.nombre, pas.clave_cifrada, 
pas.clave_plana, u.estado FROM usuarios u, perfiles per, passwd pas 
WHERE (u.perfil=per.id_perfil) and (u.id_usr=pas.id_usr) and 
(u.activa) \g 
postmaster starts eating a lot of CPU and doesn't finish 
processing the query even after 20+ minutes.
Postmaster shows:
[3]ns2:/>su - pgsql
Sun Microsystems Inc.   SunOS 5.5.1     Generic May 1996
[1]ns2:/usr/local/pgsql>bin/postmaster -i -d 2 -N 8 -B 16 -D 
/usr/local/pgsql
/data
FindExec: found "/usr/local/pgsql/bin/postgres" using argv[0]
binding ShmemCreate(key=52e2c1, size=359424)
bin/postmaster: ServerLoop:             handling reading 5
bin/postmaster: ServerLoop:             handling reading 5
bin/postmaster: ServerLoop:             handling writing 5
bin/postmaster child[2934]: starting with 
(/usr/local/pgsql/bin/postgres -d2
-B 16 -v131072 -p operaciones )
FindExec: found "/usr/local/pgsql/bin/postgres" using argv[0]
debug info:
        User         = admin
        RemoteHost   = localhost
        RemotePort   = 0
        DatabaseName = operaciones
        Verbose      = 2
        Noversion    = f
        timings      = f
        dates        = Normal
        bufsize      = 16
        sortmem      = 512
        query echo   = f
InitPostgres 
StartTransactionCommand
bin/postmaster: BackendStartup: pid 2934 user admin db operaciones 
socket 5
query: select version();
ProcessQuery
CommitTransactionCommand
StartTransactionCommand
query: SELECT u.nombre_cuenta, per.nombre, pas.clave_cifrada, 
pas.clave_plana, u.estado FROM usuarios u, perfiles per, passwd pas WHERE 
(u.perfil=per.id_perfil) and (u.id_usr=pas.id_usr) and (u.activa)
ProcessQuery
bin/postmaster: dumpstatus:       sock 5
bin/postmaster: dumpstatus:       sock 5 

Any hints?
TIA and kind regards.

Fernando P. Schapachnik
Administración de la red
VIA Net Works Argentina SA
Diagonal Roque Sáenz Peña 971, 4º y 5º piso.
1035 - Capital Federal, Argentina. 
(54-11) 4323-3333
http://www.via-net-works.net.ar


Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Tom Lane
Date:
Fernando Schapachnik <fpscha@ns1.via-net-works.net.ar> writes:
>     I have 6.5.0 running over Solaris 2.5.1 SPARC. I have a 
> database with 5 tables, 3 of them < 100 regs. and 2 ("usuarios" and 
> "passwd") with >10000. When querying for:

> SELECT u.nombre_cuenta, per.nombre, pas.clave_cifrada, 
> pas.clave_plana, u.estado FROM usuarios u, perfiles per, passwd pas 
> WHERE (u.perfil=per.id_perfil) and (u.id_usr=pas.id_usr) and 
> (u.activa) \g 

>     postmaster starts eating a lot of CPU and it doesn't finish to 
> process the query in +20 minutes.

Have you vacuumed the database lately?  What does "explain ..." show
for the query plan being used?

You might be well advised to create indexes on usuarios.id_usr and
passwd.id_usr, if you don't have them already.  I'd expect this
query to run reasonably quickly using a mergejoin, but mergejoin
needs indexes on the fields being joined.  (The system will also
consider doing an explicit sort and then a mergejoin, but obviously
the sort step takes extra time.)
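In 6.5 syntax, the suggestion above would look something like this (the index names are made up for illustration; skip any index that already exists):

```sql
-- Illustrative only: create join indexes if they are missing.
CREATE INDEX usuarios_id_usr_idx ON usuarios (id_usr);
CREATE INDEX passwd_id_usr_idx   ON passwd (id_usr);

-- Then re-check the plan the optimizer picks:
EXPLAIN SELECT u.nombre_cuenta, per.nombre, pas.clave_cifrada,
               pas.clave_plana, u.estado
FROM usuarios u, perfiles per, passwd pas
WHERE (u.perfil = per.id_perfil)
  AND (u.id_usr = pas.id_usr)
  AND (u.activa);
```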

If you haven't vacuumed since filling the tables then the optimizer
may believe that the tables only contain a few rows, in which case
it's likely to use a plain nested-loop join (ie, compare every usuarios
row to every passwd row to find matching id_usr fields).  That's nice
and fast for little tables, but a big loser on big ones...
        regards, tom lane


Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Fernando Schapachnik
Date:
En un mensaje anterior, Tom Lane escribió:
> Fernando Schapachnik <fpscha@ns1.via-net-works.net.ar> writes:
> >     I have 6.5.0 running over Solaris 2.5.1 SPARC. I have a 
> > database with 5 tables, 3 of them < 100 regs. and 2 ("usuarios" and 
> > "passwd") with >10000. When querying for:
> 
> > SELECT u.nombre_cuenta, per.nombre, pas.clave_cifrada, 
> > pas.clave_plana, u.estado FROM usuarios u, perfiles per, passwd pas 
> > WHERE (u.perfil=per.id_perfil) and (u.id_usr=pas.id_usr) and 
> > (u.activa) \g 
> 
> >     postmaster starts eating a lot of CPU and it doesn't finish to 
> > process the query in +20 minutes.
> 
> Have you vacuumed the database lately?  What does "explain ..." show

I did this today. I also installed Postgres on a FreeBSD machine 
(with comparable, and low, load averages) and updated to version 6.5.2.

After vacuum:
On the Sun: 1 minute.
On the FreeBSD: 12 seconds.

Explain shows (on both machines):

operaciones=> explain SELECT u.nombre_cuenta, per.nombre, 
pas.clave_cifrada, pas.clave_plana, u.estado FROM usuarios u, perfiles per,
passwd pas WHERE (u.activa) and (u.perfil=per.id_perfil) and 
(u.id_usr=pas.id_usr) \g
NOTICE:  QUERY PLAN:

Nested Loop  (cost=503.74 rows=1 width=74)
  ->  Nested Loop  (cost=500.89 rows=1 width=58)
        ->  Seq Scan on usuarios u  (cost=498.84 rows=1 width=30)
        ->  Index Scan using passwd_id_usr_key on passwd pas  (cost=2.05 rows=10571 width=28)
  ->  Seq Scan on perfiles per  (cost=2.85 rows=56 width=16)

EXPLAIN

> You might be well advised to create indexes on usarios.id_usr and
> passwd.id_usr, if you don't have them already.  I'd expect this

As usuarios.id_usr and passwd.id_usr are both serial, they have 
indexes automatically created (I double-checked that). PgAccess shows 
that usuarios has no primary key (I don't know why) and that 
usuarios_id_usr_key is a unique, non-clustered index. Same for passwd.

I'm running postmaster -N 8 -B 16 because without those flags 
postmaster couldn't get all the shared memory it needed and wouldn't 
start. Do you think this may be related in some way?

Thanks for your help!

Fernando P. Schapachnik
Administración de la red
VIA Net Works Argentina SA
Diagonal Roque Sáenz Peña 971, 4º y 5º piso.
1035 - Capital Federal, Argentina. 
(54-11) 4323-3333
http://www.via-net-works.net.ar


Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Tom Lane
Date:
Fernando Schapachnik <fpscha@ns1.via-net-works.net.ar> writes:
>>>> postmaster starts eating a lot of CPU and it doesn't finish to 
>>>> process the query in +20 minutes.
>> 
>> Have you vacuumed the database lately?  What does "explain ..." show

> After vacuum:
> On the Sun: 1 minute.
> On the FreeBSD: 12 seconds.

That's a little better, anyway ...

> Explain shows (on both machines):

> Nested Loop  (cost=503.74 rows=1 width=74)
>   ->  Nested Loop  (cost=500.89 rows=1 width=58)
>         ->  Seq Scan on usuarios u  (cost=498.84 rows=1 width=30)
>         ->  Index Scan using passwd_id_usr_key on passwd pas  
> (cost=2.05 rows=10571 width=28)
>   ->  Seq Scan on perfiles per  (cost=2.85 rows=56 width=16)

OK, that still looks a little bogus.  It's estimating it will only
find one row in usuarios that needs to be joined against the other
two tables.  If that were true, then this plan would be pretty reasonable,
but I bet it's not true.  The only WHERE clause that can be used to
eliminate usuarios rows in advance of the join is (u.activa), and I'll
bet you have more than one active user.

Does the plan change if you do VACUUM ANALYZE instead of just a plain
vacuum?

> As usuarios.id_usr and passwd.id_usr are both serial, they have 
> indexes automatically created (I double checked that). PgAccess shows 
> that usuarios has no primary key (I don't know why) and that 
> usuarios_id_usr_key is an unique, no clustered index. Same on passwd.

OK, so it *could* make a mergejoin plan without sorting.  I think the
problem is the unreasonably low estimate for the number of matching
usuarios rows; that makes the nested-loop plan look cheap because of its
lower startup overhead.  But if there are a lot of usuarios rows to
process then it's not so cheap anymore.

As an experiment you could try forbidding nestloop plans (start psql
with environment variable PGOPTIONS="-fn") and see what sort of plan
you get then and how long it really takes in comparison to the nestloop.
This isn't a good long-term solution, because you might get poor plans
for smaller queries, but it would help us see whether and how the
planner is making the wrong choice.  (I've been trying to collect
examples of poor planning so that I can improve the planner --- so
I'm quite interested in the details of your situation.)
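Concretely, the experiment would look something like this (database name taken from earlier in the thread; -fn is the 6.5 backend flag that forbids nestloop plans for the session):

```shell
# Disable nested-loop plans for this psql session only, then compare
# the EXPLAIN output and actual runtime against the default plan.
PGOPTIONS="-fn" psql operaciones
```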

> I'm running postmaster -N 8 -B 16 because whitout these postmaster 
> wouldn't get all the shared memory it needed and won't start. Do you 
> think that this may be in some way related?

Well, that's certainly costing you performance; 16 disk pages is not
enough buffer space to avoid thrashing.  You need to increase your
kernel's max-shared-memory-block-size (SHMMAX, I think) parameter
so that you can run with a more reasonable -B setting.  A lot of
kernels ship with SHMMAX settings that are ridiculously small for
any modern machine.
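On Solaris 2.5.1 that tunable lives in /etc/system; a sketch, with an illustrative 4 MB value (pick a limit comfortably above what your -B setting needs, and note a reboot is required for it to take effect):

```shell
# Append an illustrative SHMMAX setting to /etc/system (requires reboot).
echo 'set shmsys:shminfo_shmmax=4194304' >> /etc/system
```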
        regards, tom lane


Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Fernando Schapachnik
Date:
En un mensaje anterior, Tom Lane escribió:
> > Explain shows (on both machines):
> 
> > Nested Loop  (cost=503.74 rows=1 width=74)
> >   ->  Nested Loop  (cost=500.89 rows=1 width=58)
> >         ->  Seq Scan on usuarios u  (cost=498.84 rows=1 width=30)
> >         ->  Index Scan using passwd_id_usr_key on passwd pas  
> > (cost=2.05 rows=10571 width=28)
> >   ->  Seq Scan on perfiles per  (cost=2.85 rows=56 width=16)
> 
> OK, that still looks a little bogus.  It's estimating it will only
> find one row in usarios that needs to be joined against the other
> two tables.  If that were true, then this plan is pretty reasonable,
> but I bet it's not true.  The only WHERE clause that can be used to
> eliminate usarios rows in advance of the join is (u.activa), and I'll
> bet you have more than one active user.

That's right!

> 
> Does the plan change if you do VACUUM ANALYZE instead of just a plain
> vacuum?

Sorry for not being clear enough, but that was what I did.

> 
> As an experiment you could try forbidding nestloop plans (start psql
> with environment variable PGOPTIONS="-fn") and see what sort of plan
> you get then and how long it really takes in comparison to the nestloop.

It took 30 seconds on the Sun, and explain shows:

NOTICE:  QUERY PLAN:

Merge Join  (cost=1314.02 rows=1 width=74)
  ->  Seq Scan  (cost=1297.56 rows=1 width=58)
        ->  Sort  (cost=1297.56 rows=1 width=58)
              ->  Hash Join  (cost=1296.56 rows=1 width=58)
                    ->  Seq Scan on passwd pas  (cost=447.84 rows=10571 width=28)
                    ->  Hash  (cost=498.84 rows=1 width=30)
                          ->  Seq Scan on usuarios u  (cost=498.84 rows=1 width=30)
  ->  Seq Scan  (cost=14.58 rows=56 width=16)
        ->  Sort  (cost=14.58 rows=56 width=16)
              ->  Seq Scan on perfiles per  (cost=2.85 rows=56 width=16)

EXPLAIN

> > I'm running postmaster -N 8 -B 16 because whitout these postmaster 
> > wouldn't get all the shared memory it needed and won't start. Do you 
> > think that this may be in some way related?
> 
> Well, that's certainly costing you performance; 16 disk pages is not
> enough buffer space to avoid thrashing.  You need to increase your
> kernel's max-shared-memory-block-size (SHMMAX, I think) parameter
> so that you can run with a more reasonable -B setting.  A lot of
> kernels ship with SHMMAX settings that are ridiculously small for
> any modern machine.

Ok, I'll try to increase it.

Regards.



Fernando P. Schapachnik
Administración de la red
VIA Net Works Argentina SA
Diagonal Roque Sáenz Peña 971, 4º y 5º piso.
1035 - Capital Federal, Argentina. 
(54-11) 4323-3333
http://www.via-net-works.net.ar


Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Fernando Schapachnik
Date:
En un mensaje anterior, Tom Lane escribió:
> > I'm running postmaster -N 8 -B 16 because whitout these postmaster 
> > wouldn't get all the shared memory it needed and won't start. Do you 
> > think that this may be in some way related?
> 
> Well, that's certainly costing you performance; 16 disk pages is not
> enough buffer space to avoid thrashing.  You need to increase your
> kernel's max-shared-memory-block-size (SHMMAX, I think) parameter
> so that you can run with a more reasonable -B setting.  A lot of
> kernels ship with SHMMAX settings that are ridiculously small for
> any modern machine.

What value would you advise for shmmax?

Regards.


Fernando P. Schapachnik
Administración de la red
VIA Net Works Argentina SA
Diagonal Roque Sáenz Peña 971, 4º y 5º piso.
1035 - Capital Federal, Argentina. 
(54-11) 4323-3333
http://www.via-net-works.net.ar


Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Tom Lane
Date:
Fernando Schapachnik <fpscha@ns1.via-net-works.net.ar> writes:
>> enough buffer space to avoid thrashing.  You need to increase your
>> kernel's max-shared-memory-block-size (SHMMAX, I think) parameter
>> so that you can run with a more reasonable -B setting.  A lot of
>> kernels ship with SHMMAX settings that are ridiculously small for
>> any modern machine.

> What value would you advise for shmmax?

Well, with the default number of buffers (64) Postgres requires about
a megabyte (I think a tad over 1Mb, in 6.5.*).  Extra buffers are 8K
plus a little overhead apiece.  If you are running with more than a
couple of active backends at a time then you probably want to use
more than the default number of buffers.  But I have no advice on
how many is appropriate for what size of installation --- can anyone
else help?
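As a rough sanity check on the arithmetic above (the ~1 MB base at the default -B 64 is the figure from this message; the ~8.5 KB per extra buffer is an assumed approximation of "8K plus a little overhead"):

```shell
# Estimate shared memory (bytes) needed for B buffers in 6.5:
# ~1 MB base at -B 64, plus roughly 8.5 KB per additional buffer.
B=256
echo $(( 1048576 + (B - 64) * 8500 ))   # prints 2680576
```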
        regards, tom lane


Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Brian E Gallew
Date:
Then <tgl@sss.pgh.pa.us> spoke up and said:
> Fernando Schapachnik <fpscha@ns1.via-net-works.net.ar> writes:
> > What value would you advise for shmmax?
> 
> more than the default number of buffers.  But I have no advice on
> how many is appropriate for what size of installation --- can anyone
> else help?

Unless you are severely resource constrained, think big.  Generally
speaking, the kilobytes of memory you'll lose to kernel structures are
irrelevant.  Performance is generally not an issue, either.

-- 
=====================================================================
| JAVA must have been developed in the wilds of West Virginia.      |
| After all, why else would it support only single inheritance??    |
=====================================================================
| Finger geek@cmu.edu for my public key.                            |
=====================================================================

Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Tom Lane
Date:
Fernando Schapachnik <fpscha@ns1.via-net-works.net.ar> writes:
>> As an experiment you could try forbidding nestloop plans (start psql
>> with environment variable PGOPTIONS="-fn") and see what sort of plan
>> you get then and how long it really takes in comparison to the nestloop.

> I took 30 seconds on the Sun, and explain shows:

Better, but still not good.

> NOTICE:  QUERY PLAN:
>
> Merge Join  (cost=1314.02 rows=1 width=74)
>   ->  Seq Scan  (cost=1297.56 rows=1 width=58)
>         ->  Sort  (cost=1297.56 rows=1 width=58)
>               ->  Hash Join  (cost=1296.56 rows=1 width=58)
>                     ->  Seq Scan on passwd pas  (cost=447.84 rows=10571 width=28)
>                     ->  Hash  (cost=498.84 rows=1 width=30)
>                           ->  Seq Scan on usuarios u  (cost=498.84 rows=1 width=30)
>   ->  Seq Scan  (cost=14.58 rows=56 width=16)
>         ->  Sort  (cost=14.58 rows=56 width=16)
>               ->  Seq Scan on perfiles per  (cost=2.85 rows=56 width=16)

It's still convinced it's only going to get one row out of usuarios.
Weird.  I assume that your 'activa' field is 'bool'?  I've been trying
to duplicate this misbehavior here, and as near as I can tell the system
handles selectivity estimates for boolean fields just fine.  Whatever
percentage of 't' values was seen by the last VACUUM ANALYZE is exactly
what it uses.

I am using 6.5.2 and current sources, though, and in your original
message you said you were on 6.5.0.  If that's right, seems like the
first thing to try is for you to update to 6.5.2, run another VACUUM
ANALYZE, and then see if you still get the same bogus row estimates.

The other odd thing about the above plan is that it's doing an
explicit sort on perfiles.  Didn't you say that you had an index on
perfiles.id_perfil?  It should be scanning that instead of doing
a sort, I'd think.  (However, if there really are only 56 rows in
perfiles, it probably doesn't matter.)
        regards, tom lane


Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Tom Lane
Date:
I wrote:
> Weird.  I assume that your 'activa' field is 'bool'?  I've been trying
> to duplicate this misbehavior here, and as near as I can tell the system
> handles selectivity estimates for boolean fields just fine.  Whatever
> percentage of 't' values was seen by the last VACUUM ANALYZE is exactly
> what it uses.

On second thought: 6.5.* can get confused if the column contains more
NULLs than anything else.  Dunno if you have a lot of nulls in activa,
but if so you might try changing them all to explicit 'f' and then
redoing the VACUUM ANALYZE.  Next release will be smarter about keeping
stats in the presence of many nulls.
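A sketch of that cleanup, assuming the NULLs in activa really should mean "inactive":

```sql
-- Replace NULLs in activa with explicit false, then refresh the
-- optimizer's statistics so the selectivity estimate improves.
UPDATE usuarios SET activa = 'f' WHERE activa IS NULL;
VACUUM ANALYZE usuarios;
```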

It'd be useful to double-check my theory that the system is
misestimating the selectivity of the WHERE (u.activa) clause.
You could try this:
	SELECT count(*) FROM usuarios WHERE activa;
	EXPLAIN SELECT count(*) FROM usuarios WHERE activa;
and see how far off the row count estimate in the EXPLAIN is
from reality.
        regards, tom lane


Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Fernando Schapachnik
Date:
En un mensaje anterior, Tom Lane escribió:
> It's still convinced it's only going to get one row out of usuarios.
> Weird.  I assume that your 'activa' field is 'bool'?  I've been trying
> to duplicate this misbehavior here, and as near as I can tell the system
> handles selectivity estimates for boolean fields just fine.  Whatever
> percentage of 't' values was seen by the last VACUUM ANALYZE is exactly
> what it uses.
> 
> I am using 6.5.2 and current sources, though, and in your original
> message you said you were on 6.5.0.  If that's right, seems like the
> first thing to try is for you to update to 6.5.2, run another VACUUM
> ANALYZE, and then see if you still get the same bogus row estimates.

I was using 6.5.0 in my first post; then I upgraded, and all the vacuum 
and explain commands were run on 6.5.2.  Here is my complete database 
definition:


CREATE TABLE usuarios (
	id_usr serial,
	razon_social text NOT NULL,
	nombre_cuenta text NOT NULL,
	grupo int2 NOT NULL,
	perfil int2 NOT NULL,
	estado char(1) NOT NULL DEFAULT 'H' CHECK ((estado='H') or (estado='D')),
	id_madre int4 NOT NULL,
	fecha_creacion datetime DEFAULT CURRENT_DATE,
	fecha_baja datetime,
	gratuita bool DEFAULT 'f',
	activa bool DEFAULT 't',
	observaciones text
) \g

CREATE TABLE passwd (
	id_usr serial,
	clave_plana text NOT NULL,
	clave_cifrada text NOT NULL
) \g

CREATE TABLE perfiles (
	id_perfil serial,
	nombre text NOT NULL,
	descripcion text
) \g

CREATE TABLE grupos (
	id_grupo serial,
	nombre text NOT NULL,
	descripcion text
) \g

CREATE TABLE cronometradas (
	id_usr serial,
	fecha_comienzo_cronometrado datetime DEFAULT CURRENT_DATE,
	tipo_cronometrado int2,
	max_segs_vida int4,
	max_segs_consumo int4
) \g

CREATE TABLE tipos_cronometrado (
	id_tipo_cronometrado serial,
	nombre text NOT NULL,
	descripcion text
) \g


> 
> The other odd thing about the above plan is that it's doing an
> explicit sort on perfiles.  Didn't you say that you had an index on
> perfiles.id_perfil?  It should be scanning that instead of doing

It should, as it is serial. What does it mean when PgAccess says a table 
doesn't have a primary key? Could that have an impact?

Again, thanks!



Fernando P. Schapachnik
Administración de la red
VIA Net Works Argentina SA
Diagonal Roque Sáenz Peña 971, 4º y 5º piso.
1035 - Capital Federal, Argentina. 
(54-11) 4323-3333
http://www.via-net-works.net.ar


Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Fernando Schapachnik
Date:
En un mensaje anterior, Tom Lane escribió:
> I wrote:
> > Weird.  I assume that your 'activa' field is 'bool'?  I've been trying
> > to duplicate this misbehavior here, and as near as I can tell the system
> > handles selectivity estimates for boolean fields just fine.  Whatever
> > percentage of 't' values was seen by the last VACUUM ANALYZE is exactly
> > what it uses.
> 
> On second thought: 6.5.* can get confused if the column contains more
> NULLs than anything else.  Dunno if you have a lot of nulls in activa,
> but if so you might try changing them all to explicit 'f' and then
> redoing the VACUUM ANALYZE.  Next release will be smarter about keeping
> stats in the presence of many nulls.
> 
> It'd be useful to double-check my theory that the system is
> misestimating the selectivity of the WHERE (u.activa) clause.
> You could try this:
>     SELECT count(*) FROM usarios WHERE activa;

10571


>     EXPLAIN SELECT count(*) FROM usarios WHERE activa;
> and see how far off the row count estimate in the EXPLAIN is
> from reality.

NOTICE:  QUERY PLAN:

Aggregate  (cost=498.84 rows=1 width=4)
  ->  Seq Scan on usuarios  (cost=498.84 rows=1 width=4)

EXPLAIN

Don't hesitate to ask for any other info/tests you may consider useful.

Regards!



Fernando P. Schapachnik
Administración de la red
VIA Net Works Argentina SA
Diagonal Roque Sáenz Peña 971, 4º y 5º piso.
1035 - Capital Federal, Argentina. 
(54-11) 4323-3333
http://www.via-net-works.net.ar


Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Tom Lane
Date:
Fernando Schapachnik <fpscha@ns1.via-net-works.net.ar> writes:
>> It'd be useful to double-check my theory that the system is
>> misestimating the selectivity of the WHERE (u.activa) clause.
>> You could try this:
>> SELECT count(*) FROM usarios WHERE activa;

> 10571

>> EXPLAIN SELECT count(*) FROM usarios WHERE activa;
>> and see how far off the row count estimate in the EXPLAIN is
>> from reality.

> NOTICE:  QUERY PLAN:

> Aggregate  (cost=498.84 rows=1 width=4)
>   -> Seq Scan on usuarios  (cost=498.84 rows=1 width=4)

Well, it's sure confused about the selectivity of WHERE activa,
all right.

I tried to duplicate this here, by duplicating the table definition you
sent and filling it with some junk data --- about 1800 rows, 1500 of
which had activa = 't'.  I found that after loading the table and
running a plain "vacuum", the system indeed estimated one row out, just
as you show above.  But after "vacuum analyze", it estimated 1360 rows
out, which is a lot closer to reality (and would make a big difference
in the plan selected for a join).

Now I know you said you did a "vacuum analyze" on the table, but
I am wondering if maybe you got confused about what you did.
Please try it again just to make sure.

The only other explanation I can think of is that I am not running this
test on a pristine 6.5.2 release, but on a recent CVS update from the
REL6_5 branch.  I don't see any indication that anything has been
changed in the selectivity code since 6.5 in that branch, but maybe I
missed something.  You might need to update to almost-6.5.3.  (I am not
sure if there is a beta-test tarball for 6.5.3 or not; if not, you could
pull the sources from the CVS server, or wait for 6.5.3 which should be
out very soon.)

BTW, current sources (7.0-to-be) get the estimate spot-on after "vacuum
analyze", though without it they are not much better than 6.5.  The
current system is estimating 1% of the rows will match, because it's
treating the WHERE condition like "WHERE activa = 't'" and the default
estimate for "=" selectivity is 1% in the absence of VACUUM ANALYZE
statistics.  Probably we ought to special-case boolean columns to
default to a 50% estimate if no statistics are available...
        regards, tom lane


Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From
Fernando Schapachnik
Date:
En un mensaje anterior, Tom Lane escribió:
> Well, it's sure confused about the selectivity of WHERE activa,
> all right.
> 
> I tried to duplicate this here, by duplicating the table definition you
> sent and filling it with some junk data --- about 1800 rows, 1500 of
> which had activa = 't'.  I found that after loading the table and
> running a plain "vacuum", the system indeed estimated one row out, just
> as you show above.  But after "vacuum analyze", it estimated 1360 rows
> out, which is a lot closer to reality (and would make a big difference
> in the plan selected for a join).
> 
> Now I know you said you did a "vacuum analyze" on the table, but
> I am wondering if maybe you got confused about what you did.
> Please try it again just to make sure.

I tried again and now it's working better. I think the first problem 
was due to low shared memory available and a special factor between my 
keyboard and my chair ;-)

Thanks for all your help!


Fernando P. Schapachnik
Administración de la red
VIA Net Works Argentina SA
Diagonal Roque Sáenz Peña 971, 4º y 5º piso.
1035 - Capital Federal, Argentina. 
(54-11) 4323-3333
http://www.via-net-works.net.ar