Thread: Problems with genetic optimizer
Hi all, I generated a table with a column type 'box', inserted some values and created an index (ops_box) on that column, run a vacuum and looked with the command EXPLAIN if the select will use the index. And the result was very astonishing: if I had inserted 120000 elements postgres will _NOT_ use the index, if I had inserted 100000 elements postgres will use the index! When I set the sequential search off ('set enable_seqscan off;') postgres is using the index in both cases. Has someone any idea what to do? Peter -- Bezirksfinanzdirektion Muenchen Vermessungsabteilung ......................................................... Peter Keller : Tel: (+49) 089-2190-2594 Vermessungsrat : Fax: (+49) 089-2190-2459 Alexandrastr. 3 : mailto:Peter.Keller@bvv.bayern.de 80538 Muenchen : web: http://www.bayern.de/vermessung
Peter Keller <peter.keller@bvv.bayern.de> writes: > I generated a table with a column type 'box', inserted some values and > created an index (ops_box) on that column, run a vacuum and looked with > the command EXPLAIN if the select will use the index. > And the result was very astonishing: > if I had inserted 120000 elements postgres will _NOT_ use the index, if > I had inserted 100000 elements postgres will use the index! Hmm. The system's knowledge of selectivities for R-tree indexes is essentially nil; perhaps someone will be motivated to improve that someday. In the meantime, the entirely bogus numbers returned by src/backend/utils/adt/geo_selfuncs.c are supposed to be small enough to ensure that R-trees are used if available. What was your test query exactly, and what do you get from EXPLAIN with and without forcing enable_seqscan off? regards, tom lane
Hi > > Hmm. The system's knowledge of selectivities for R-tree indexes is > essentially nil; perhaps someone will be motivated to improve that > someday. In the meantime, the entirely bogus numbers returned by > src/backend/utils/adt/geo_selfuncs.c are supposed to be small enough > to ensure that R-trees are used if available. What was your test > query exactly, and what do you get from EXPLAIN with and without > forcing enable_seqscan off? > > regards, tom lane Ok, I created a table with only one column (box), inserted 120000 elements, created an index and run a vacuum: convert=# select count(*) from box_tmp; count -------- 120000 (1 row) convert=# explain select * from box_tmp where ebre && box('(470758.555,354028.145),(470758.525,354028.115)'::box); NOTICE: QUERY PLAN: Seq Scan on box_tmp (cost=0.00..2500.00 rows=2400 width=32) EXPLAIN convert=# set enable_seqscan = off; SET VARIABLE convert=# explain select * from box_tmp where ebre && box('(470758.555,354028.145),(470758.525,354028.115)'::box); NOTICE: QUERY PLAN: Index Scan using idx on box_tmp (cost=0.00..2503.28 rows=2400 width=32) EXPLAIN I'm running PostgreSQL 7.0.2 on i686-pc-linux-gnu, compiled by gcc egcs-2.91.66 The result of the query if I set the sequential search on is this: convert=# select * from box_tmp where ebre && box('(470758.555,354028.145),(470758.525,354028.115)'::box); pqReadData() -- backend closed the channel unexpectedly. This probably means the backend terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. !# Thanks for helping, Peter > > Peter Keller <peter.keller@bvv.bayern.de> writes: > > I generated a table with a column type 'box', inserted some values and > > created an index (ops_box) on that column, run a vacuum and looked with > > the command EXPLAIN if the select will use the index. > > And the result was very astonishing: > > if I had inserted 120000 elements postgres will _NOT_ use the index, if > > I had inserted 100000 elements postgres will use the index! -- Bezirksfinanzdirektion Muenchen Vermessungsabteilung ......................................................... Peter Keller : Tel: (+49) 089-2190-2594 Vermessungsrat : Fax: (+49) 089-2190-2459 Alexandrastr. 3 : mailto:Peter.Keller@bvv.bayern.de 80538 Muenchen : web: http://www.bayern.de/vermessung
Peter Keller <peter.keller@bvv.bayern.de> writes: > Ok, I created a table with only one column (box), inserted 120000 > elements, created an index and run a vacuum: > convert=# explain select * from box_tmp where ebre && > box('(470758.555,354028.145),(470758.525,354028.115)'::box); > NOTICE: QUERY PLAN: > Seq Scan on box_tmp (cost=0.00..2500.00 rows=2400 width=32) > convert=# set enable_seqscan = off; > SET VARIABLE > convert=# explain select * from box_tmp where ebre && > box('(470758.555,354028.145),(470758.525,354028.115)'::box); > NOTICE: QUERY PLAN: > Index Scan using idx on box_tmp (cost=0.00..2503.28 rows=2400 width=32) Oy vey, only three points difference in estimated costs. What is the actual measured runtime of each approach? > The result of the query if I set the sequential search on is this: > convert=# select * from box_tmp where ebre && > box('(470758.555,354028.145),(470758.525,354028.115)'::box); > pqReadData() -- backend closed the channel unexpectedly. Urk. That's not supposed to happen. There should be a core file left in your PGDATA/base/dbname/ directory --- can you provide a backtrace from that file using gdb? regards, tom lane
> > The result of the query if I set the sequential search on is this: > > > convert=# select * from box_tmp where ebre && > > box('(470758.555,354028.145),(470758.525,354028.115)'::box); > > pqReadData() -- backend closed the channel unexpectedly. > > Urk. That's not supposed to happen. There should be a core file > left in your PGDATA/base/dbname/ directory --- can you provide a > backtrace from that file using gdb? > > regards, tom lane No, I'm sorry, but there is no core file. I tried to debug the query and this are the relults: /opt/local/DWH/bin/postmaster: ServerLoop: handling reading 5 /opt/local/DWH/bin/postmaster: ServerLoop: handling reading 5 /opt/local/DWH/bin/postmaster: ServerLoop: handling writing 5 /opt/local/DWH/bin/postmaster: BackendStartup: pid 679 user postgres db convert socket 5 /opt/local/DWH/bin/postmaster child[679]: starting with (/opt/local/DWH/bin/post gres -d2 -v131072 -p convert ) FindExec: found "/opt/local/DWH/bin/postgres" using argv[0] started: host=localhost user=postgres database=convert InitPostgres StartTransactionCommand query: SELECT usesuper FROM pg_user WHERE usename = 'postgres' ProcessQuery CommitTransactionCommand StartTransactionCommand query: select * from box_tmp where ebre && ('(470758.555,354028.145),(470758.525 ,354028.115)'::box); ProcessQuery /opt/local/DWH/bin/postmaster: reaping dead processes... /opt/local/DWH/bin/postmaster: CleanupProc: pid 679 exited with status 11 Server process (pid 679) exited with status 11 at Wed Oct 25 09:04:25 2000 Terminating any active server processes... Server processes were terminated at Wed Oct 25 09:04:25 2000 Reinitializing shared memory and semaphores shmem_exit(0) binding ShmemCreate(key=52e325, size=1104896) /opt/local/DWH/bin/postmaster: ServerLoop: handling reading 5 /opt/local/DWH/bin/postmaster: ServerLoop: handling reading 5 /opt/local/DWH/bin/postmaster: ServerLoop: handling writing 5 The Data Base System is starting up /opt/local/DWH/bin/postmaster: ServerLoop: handling writing 5 DEBUG: Data Base System is starting up at Wed Oct 25 09:04:25 2000 DEBUG: Data Base System was interrupted being in production at Wed Oct 25 09:03 :52 2000 DEBUG: Data Base System is in production state at Wed Oct 25 09:04:25 2000 proc_exit(0) shmem_exit(0) exit(0) /opt/local/DWH/bin/postmaster: reaping dead processes... Don't know if this is helping you Peter -- Bezirksfinanzdirektion Muenchen Vermessungsabteilung ......................................................... Peter Keller : Tel: (+49) 089-2190-2594 Vermessungsrat : Fax: (+49) 089-2190-2459 Alexandrastr. 3 : mailto:Peter.Keller@bvv.bayern.de 80538 Muenchen : web: http://www.bayern.de/vermessung
Peter Keller <peter.keller@bvv.bayern.de> writes: > No, I'm sorry, but there is no core file. You're probably running one of those setups where the postmaster is started with a ulimit setting that prevents core dumps. You might want to look into changing that for future debugging purposes. > query: select * from box_tmp where ebre && > ('(470758.555,354028.145),(470758.525 > ,354028.115)'::box); > ProcessQuery > /opt/local/DWH/bin/postmaster: reaping dead processes... > /opt/local/DWH/bin/postmaster: CleanupProc: pid 679 exited with status > 11 Now that I think about it, are there any NULL entries in box_tmp.ebre? The box_overlap function, like practically all of the geometric operators :-(, doesn't defend itself against NULL inputs in 7.0 and earlier releases. This is fixed for 7.1 but not in a way that could readily be back-patched into 7.0.*. If there are just specific operators you need to use with NULL data, you could patch them yourself in src/backend/utils/adt/geo_ops.c; for instance box_overlap would need to start out with something like if (box1 == NULL || box2 == NULL) return false; Alternatively, write your queries to avoid invoking && on a NULL, eg select * from box_tbl where case when ebre is not null then ebre && '(470758.555,354028.145),(470758.525,354028.115)' else false end; This last is not only ugly but non-indexable, so it's only useful as a very short-term hack... regards, tom lane
> No, I'm sorry, but there is no core file. > > You're probably running one of those setups where the postmaster is > started with a ulimit setting that prevents core dumps. You might > want to look into changing that for future debugging purposes. Yes, you are right, I changed it in /etc/profile. > > > query: select * from box_tmp where ebre && > > ('(470758.555,354028.145),(470758.525 > > ,354028.115)'::box); > > ProcessQuery > > /opt/local/DWH/bin/postmaster: reaping dead processes... > > /opt/local/DWH/bin/postmaster: CleanupProc: pid 679 exited with status > > 11 > > Now that I think about it, are there any NULL entries in box_tmp.ebre? > The box_overlap function, like practically all of the geometric > operators :-(, doesn't defend itself against NULL inputs in 7.0 and > earlier releases. YES!! I deleted the NULL entries. Now it works. Thanks!!! Peter -- Bezirksfinanzdirektion Muenchen Vermessungsabteilung ......................................................... Peter Keller : Tel: (+49) 089-2190-2594 Vermessungsrat : Fax: (+49) 089-2190-2459 Alexandrastr. 3 : mailto:Peter.Keller@bvv.bayern.de 80538 Muenchen : web: http://www.bayern.de/vermessung