Thread: Optimizing a query

Optimizing a query

From
James Cloos
Date:
I've currently got this table:

,----
| n=# \d nanpa
|          Table "public.nanpa"
|    Column   |     Type     | Modifiers
| ------------+--------------+-----------
|  state      | character(2) |
|  npa        | character(3) | not null
|  nxx        | character(3) | not null
|  ocn        | character(4) |
|  company    | text         |
|  ratecenter | text         |
|  switch     | text         |
|  effective  | date         |
|  use        | character(2) | not null
|  assign     | date         |
|  ig         | character(1) |
| Indexes:
|     "nanpa_pkey" PRIMARY KEY, btree (npa, nxx) CLUSTER
`----

and was doing queries of the form:

,----
| select * from nanpa where npa=775 and nxx=413;
`----

where were quite slow.  Explain showed that it was doing sequential
scans even though the primary key contained the two term I was
selecting on.

Today, looking at it again in prep to this post, I noticed that the
numbers were being converted to ::text, and a quick test showed that
queries of the form:

,----
| select * from nanpa where npa=775::bpchar and nxx=413::bpchar;
`----

used the index.

I specified char(3) when I created the table simple because npa and
nxx are defined as three-character strings.  Tagging the queies is
a pain, especially as I often do queries of that form in psql(1).

(Incidently, there are multiple similar tables, also keyed on
(npa,nxx), which show the same problem.  The nanpa table above is
just a good example.)

Should I convert the columns to text?  Or create an additional index
that expects ::text args?  (If so, how?)

Or is there some other way to ensure the indices get used w/o having
to tag data in the queries?

Thanks,

-JimC
--
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6

Re: Optimizing a query

From
"Tomeh, Husam"
Date:
Have you run vacuum/analyze on the table?

--
  Husam

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of James Cloos
Sent: Wednesday, December 13, 2006 10:48 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Optimizing a query

I've currently got this table:

,----
| n=# \d nanpa
|          Table "public.nanpa"
|    Column   |     Type     | Modifiers
| ------------+--------------+-----------
|  state      | character(2) |
|  npa        | character(3) | not null
|  nxx        | character(3) | not null
|  ocn        | character(4) |
|  company    | text         |
|  ratecenter | text         |
|  switch     | text         |
|  effective  | date         |
|  use        | character(2) | not null
|  assign     | date         |
|  ig         | character(1) |
| Indexes:
|     "nanpa_pkey" PRIMARY KEY, btree (npa, nxx) CLUSTER
`----

and was doing queries of the form:

,----
| select * from nanpa where npa=775 and nxx=413;
`----

where were quite slow.  Explain showed that it was doing sequential
scans even though the primary key contained the two term I was
selecting on.

Today, looking at it again in prep to this post, I noticed that the
numbers were being converted to ::text, and a quick test showed that
queries of the form:

,----
| select * from nanpa where npa=775::bpchar and nxx=413::bpchar;
`----

used the index.

I specified char(3) when I created the table simple because npa and
nxx are defined as three-character strings.  Tagging the queies is
a pain, especially as I often do queries of that form in psql(1).

(Incidently, there are multiple similar tables, also keyed on
(npa,nxx), which show the same problem.  The nanpa table above is
just a good example.)

Should I convert the columns to text?  Or create an additional index
that expects ::text args?  (If so, how?)

Or is there some other way to ensure the indices get used w/o having
to tag data in the queries?

Thanks,

-JimC
--
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster
**********************************************************************
This message contains confidential information intended only for the use of the addressee(s) named above and may
containinformation that is legally privileged.  If you are not the addressee, or the person responsible for delivering
itto the addressee, you are hereby notified that reading, disseminating, distributing or copying this message is
strictlyprohibited.  If you have received this message by mistake, please immediately notify us by replying to the
messageand delete the original message immediately thereafter. 

Thank you.

                                   FADLD Tag
**********************************************************************


Re: Optimizing a query

From
Tom Lane
Date:
James Cloos <cloos@jhcloos.com> writes:
> ... and was doing queries of the form:
> | select * from nanpa where npa=775 and nxx=413;

If those are char(3) columns, shouldn't you be quoting the constants?

select * from nanpa where npa='775' and nxx='413';

Any non-numeric input will fail entirely without the quotes, and I'm
also not too confident that inputs of less than three digits will work
as you expect (the blank-padding might not match what's in the table).
Leading zeroes would be troublesome too.

OTOH, if the keys are and always will be small integers, it's just
stupid not to be storing them as integers --- integer comparison
is far faster than string.

Postgres' data type capabilities are exceptional.  Learn to work with
them, not against them --- that means thinking about what the data
really is and declaring it appropriately.

            regards, tom lane

Re: Optimizing a query

From
James Cloos
Date:
>>>>> "Husam" == Tomeh, Husam <htomeh@firstam.com> writes:

Husam> Have you run vacuum/analyze on the table?

Yes, back when I first noticed how slow it was.
It did not make any difference.

explain analyze says:

,----
| n=# explain analyse  select * from nanpa where npa=775 and nxx=473;
|                                                QUERY PLAN
| --------------------------------------------------------------------------------------------------------
|  Seq Scan on nanpa  (cost=0.00..5344.60 rows=4 width=105) (actual time=371.718..516.816 rows=1 loops=1)
|    Filter: (((npa)::text = '775'::text) AND ((nxx)::text = '473'::text))
|  Total runtime: 516.909 ms
| (3 rows)
`----

vs:

,----
| n=# explain analyse  select * from nanpa where npa=775::char and nxx=473::char;
|                                                       QUERY PLAN

|
----------------------------------------------------------------------------------------------------------------------
|  Index Scan using nanpa_pkey on nanpa  (cost=0.00..4.33 rows=1 width=105) (actual time=64.831..64.831 rows=0 loops=1)
|    Index Cond: ((npa = '7'::bpchar) AND (nxx = '4'::bpchar))
|  Total runtime: 64.927 ms
| (3 rows)
`----

BTW, I forgot to mention I'm at 8.1.4 on that box.

-JimC
--
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6

Re: Optimizing a query

From
"Tomeh, Husam"
Date:
Your nap and nxx columns have character datatype, so you should use
quotes. Try:

        explain analyze  select * from nanpa where npa='775' and
nxx='473';

If that does not work, you could try to influence the planner's
execution plan to favor index scans over sequential scan by tweaking a
couple of the postgres parameters, particularly, the
effective_cache_size. This parameter primarily set the planner's
estimates of the relative likelihood of a particular table or index
being in memory, and will thus have a significant effect on whether the
planner chooses indexes over seqscans. Tweaking such parameters are
usually done as a last resort.

--
  Husam

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of James Cloos
Sent: Wednesday, December 13, 2006 2:35 PM
To: Tomeh, Husam
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Optimizing a query

>>>>> "Husam" == Tomeh, Husam <htomeh@firstam.com> writes:

Husam> Have you run vacuum/analyze on the table?

Yes, back when I first noticed how slow it was.
It did not make any difference.

explain analyze says:

,----
| n=# explain analyse  select * from nanpa where npa=775 and nxx=473;
|                                                QUERY PLAN

|
------------------------------------------------------------------------
--------------------------------
|  Seq Scan on nanpa  (cost=0.00..5344.60 rows=4 width=105) (actual
time=371.718..516.816 rows=1 loops=1)
|    Filter: (((npa)::text = '775'::text) AND ((nxx)::text =
'473'::text))
|  Total runtime: 516.909 ms
| (3 rows)
`----

vs:

,----
| n=# explain analyse  select * from nanpa where npa=775::char and
nxx=473::char;
|                                                       QUERY PLAN

|
------------------------------------------------------------------------
----------------------------------------------
|  Index Scan using nanpa_pkey on nanpa  (cost=0.00..4.33 rows=1
width=105) (actual time=64.831..64.831 rows=0 loops=1)
|    Index Cond: ((npa = '7'::bpchar) AND (nxx = '4'::bpchar))
|  Total runtime: 64.927 ms
| (3 rows)
`----

BTW, I forgot to mention I'm at 8.1.4 on that box.

-JimC
--
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq
**********************************************************************
This message contains confidential information intended only for the use of the addressee(s) named above and may
containinformation that is legally privileged.  If you are not the addressee, or the person responsible for delivering
itto the addressee, you are hereby notified that reading, disseminating, distributing or copying this message is
strictlyprohibited.  If you have received this message by mistake, please immediately notify us by replying to the
messageand delete the original message immediately thereafter. 

Thank you.

                                   FADLD Tag
**********************************************************************