Re: Forcing the right queryplan - Mailing list pgsql-general

From Henk van Lingen
Subject Re: Forcing the right queryplan
Date
Msg-id 20100902082333.GA5174@uu.nl
Whole thread Raw
In response to Forcing the right queryplan  (Henk van Lingen <H.G.K.vanLingen@uu.nl>)
List pgsql-general
No ideas on this one?

Regards,

On Tue, Aug 31, 2010 at 04:50:09PM +0200, Henk van Lingen wrote:
  >
  > Hi,
  >
  > I've the problem my database is not using the 'right' queryplan in all
  > cases. Is there a way I can force that and/or how should I tuned the
  > table statistics?
  >
  > I'm doing a rsyslog database in PostgreSQL with millions of records
  > (firewall logging). The db scheme is the so called 'MonitorWare' scheme,
  > to wich I added two extra indexes.
  >
  > syslog=# select version();
  >                                                      version
  >
  > --------------------------------------------------------------------------------
  > ----------------------------------
  >  PostgreSQL 8.4.4 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20
  > 080704 (Red Hat 4.1.2-48), 64-bit
  >
  >
  > syslog=# \d systemevents
  >                                          Table "public.systemevents"
  >        Column       |            Type             |                         Modi
  > fiers
  > --------------------+-----------------------------+-----------------------------
  > ------------------------------
  >  id                 | integer                     | not null default nextval('sy
  > stemevents_id_seq'::regclass)
  >  customerid         | bigint                      |
  >  receivedat         | timestamp without time zone |
  >  devicereportedtime | timestamp without time zone |
  >  facility           | smallint                    |
  >  priority           | smallint                    |
  >  fromhost           | character varying(60)       |
  >  message            | text                        |
  >  ntseverity         | integer                     |
  >  importance         | integer                     |
  >  eventsource        | character varying(60)       |
  >  eventuser          | character varying(60)       |
  >  eventcategory      | integer                     |
  >  eventid            | integer                     |
  >  eventbinarydata    | text                        |
  >  maxavailable       | integer                     |
  >  currusage          | integer                     |
  >  minusage           | integer                     |
  >  maxusage           | integer                     |
  >  infounitid         | integer                     |
  >  syslogtag          | character varying(60)       |
  >  eventlogtype       | character varying(60)       |
  >  genericfilename    | character varying(60)       |
  >  systemid           | integer                     |
  > Indexes:
  >     "systemevents_pkey" PRIMARY KEY, btree (id)
  >     "fromhost_idx" btree (fromhost)
  >     "msgs_idx" gin (to_tsvector('english'::regconfig, message))
  >
  > The GIN index is to do text searching (via LogAnalyzer).
  >
  > Now there are two types of query plans:
  >
  > syslog=# explain SELECT id, devicereportedtime, facility, priority, fromhost, syslogtag, infounitid, message FROM
systemeventsWHERE (  ( to_tsvector('english', message) @@ to_tsquery ( '131.211.112.9')) )  ORDER BY id DESC LIMIT 100;
                                                 QUERY PLAN                    
  >
  > --------------------------------------------------------------------------------
  > ---------------------------------
  >  Limit  (cost=0.00..10177.22 rows=100 width=159)
  >    ->  Index Scan Backward using systemevents_pkey on systemevents  (cost=0.00..
  > 1052934.86 rows=10346 width=159)
  >          Filter: (to_tsvector('english'::regconfig, message) @@ to_tsquery('131.
  > 211.112.9'::text))
  > (3 rows)
  >
  > This one is useless (takes very long). However this one:
  >
  > syslog=# explain SELECT id, devicereportedtime, facility, priority, fromhost, syslogtag, infounitid, message FROM
systemeventsWHERE (  ( to_tsvector('english', message) @@ to_tsquery ( '131.211.112.9')) )  ORDER BY id DESC LIMIT 500; 
  >                                                     QUERY PLAN
  >
  > --------------------------------------------------------------------------------
  > -----------------------------------
  >  Limit  (cost=40928.89..40930.14 rows=500 width=159)
  >    ->  Sort  (cost=40928.89..40954.76 rows=10346 width=159)
  >          Sort Key: id
  >          ->  Bitmap Heap Scan on systemevents  (cost=2898.06..40413.36 rows=1034
  > 6 width=159)
  >                Recheck Cond: (to_tsvector('english'::regconfig, message) @@ to_t
  > squery('131.211.112.9'::text))
  >                ->  Bitmap Index Scan on msgs_idx  (cost=0.00..2895.47 rows=10346
  >  width=0)
  >                      Index Cond: (to_tsvector('english'::regconfig, message) @@
  > to_tsquery('131.211.112.9'::text))
  > (7 rows)
  >
  > works acceptable.
  >
  > Stats:
  >
  > syslog=# SELECT relname, relkind, reltuples, relpages
  > FROM pg_class                                                      WHERE relname LIKE 'systemevents%';
                                          relname            | relkind |  reltuples  | relpages  
  > -------------------------------+---------+-------------+----------
  >  systemevents_pkey             | i       | 2.06915e+06 |    71985
  >  systemeventsproperties        | r       |           0 |        0
  >  systemeventsproperties_pkey   | i       |           0 |        1
  >  systemevents_id_seq           | S       |           1 |        1
  >  systemeventsproperties_id_seq | S       |           1 |        1
  >  systemevents                  | r       | 2.06915e+06 |   694826
  > (6 rows)
  >
  > syslog=# SELECT relname, relkind, reltuples, relpages
  > FROM pg_class
  > WHERE relname LIKE 'msg%';
  >  relname  | relkind |  reltuples  | relpages
  > ----------+---------+-------------+----------
  >  msgs_idx | i       | 2.06915e+06 |   128069
  > (1 row)
  >
  > How to use the right plan regardless of the 'LIMIT-size'?
  >
  > Cheers,
  > --
  > Henk van Lingen, ICT-SC Netwerk & Telefonie,                  (o-      -+
  > Universiteit Utrecht, Jenalaan 18a, room 0.12                 /\        |
  > phone: +31-30-2538453                                         v_/_      |
  > http://henk.vanlingen.net/             http://www.tuxtown.net/netiquette/
  >
  > --
  > Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
  > To make changes to your subscription:
  > http://www.postgresql.org/mailpref/pgsql-general

--
Henk van Lingen, ICT-SC Netwerk & Telefonie,                  (o-      -+
Universiteit Utrecht, Jenalaan 18a, room 0.12                 /\        |
phone: +31-30-2538453                                         v_/_      |
http://henk.vanlingen.net/             http://www.tuxtown.net/netiquette/

pgsql-general by date:

Previous
From: Gerhard Wiesinger
Date:
Subject: Double iteration in the buffer cache code
Next
From: Dimitri Fontaine
Date:
Subject: Re: table - sequence dependent informatio