Thread: Query performance problem
HI All,
I have a big performance issue concerning a PostgreSQL database.
I have the following server configuration:
Pentium 4 2.4 GHz
1 GB RAM
36 GB SCSI
And the following tables:
TABLES
--------------------------------------------------------------------------
================== r_cliente: 75816 records ============================
CREATE TABLE "public"."r_cliente" (
"pkcliente" INTEGER NOT NULL,
"cpfcnpj" VARCHAR(20) NOT NULL,
PRIMARY KEY("pkcliente")
) WITH OIDS;
CREATE UNIQUE INDEX "un_cliente_cpfcnpj" ON "public"."r_cliente"
USING btree ("cpfcnpj");
================== sav_cliente_lg: 65671 records =======================
CREATE TABLE "public"."sav_cliente_lg" (
"codigo" INTEGER NOT NULL,
"cpfcnpj" VARCHAR(15) NOT NULL,
PRIMARY KEY("codigo")
) WITH OIDS;
CREATE INDEX "ix_savclientelg_cpfcnpj" ON "public"."sav_cliente_lg"
USING btree ("cpfcnpj");
Which I would like to run the following query:
QUERY
--------------------------------------------------------------------------
SELECT
rc.pkcliente
FROM r_cliente AS rc
INNER JOIN sav_cliente_lg AS sc ON sc.cpfcnpj = sc.cpfcnpj;
The problem is, it takes a long time to run, I wait up to half an hour and I get no result.
So, I executed the explain on the query and got the following results:
QUERY PLAN
--------------------------------------------------------------------------
Nested Loop (cost=0.00..16696.87 rows=75816 width=4)
-> Seq Scan on sav_cliente_cf sc (cost=0.00..3047.55 rows=1 width=0)
Filter: ((cpfcnpj)::text = (cpfcnpj)::text)
-> Seq Scan on r_cliente rc (cost=0.00..12891.16 rows=75816 width=4)
And made the following modifications on my POSTGRESQL.CONF file:
POSTGRESQL.CONF
--------------------------------------------------------------------------
### VERSION: Postgresql 7.4.2 ###
shared_buffers = 7800
sort_mem = 4096
checkpoint_segments = 5
effective_cache_size = 12000
cpu_operator_cost = 0.0015
stats_start_collector = false
Hope you can help me, I really need to get this running faster, and I am out of ideas.
Since now, thanks a lot for your attention,
Danilo Mota
"Danilo Mota" <dmota@nexen.com.br> writes: > SELECT > rc.pkcliente > FROM r_cliente AS rc > INNER JOIN sav_cliente_lg AS sc ON sc.cpfcnpj = sc.cpfcnpj; Surely you meant INNER JOIN sav_cliente_lg AS sc ON rc.cpfcnpj = sc.cpfcnpj; I would also venture that your statistics are desperately out of date, because if the planner's estimates are close to reality, even this unconstrained-cross-product join shouldn't have taken that long. regards, tom lane
On Fri, Aug 20, 2004 at 13:25:30 -0300, Danilo Mota <dmota@nexen.com.br> wrote: > > And the following tables: > TABLES > ------------------------------------------------------------------------ > -- > ================== r_cliente: 75816 records > ============================ > CREATE TABLE "public"."r_cliente" ( > "pkcliente" INTEGER NOT NULL, > "cpfcnpj" VARCHAR(20) NOT NULL, > PRIMARY KEY("pkcliente") > ) WITH OIDS; > > CREATE UNIQUE INDEX "un_cliente_cpfcnpj" ON "public"."r_cliente" > USING btree ("cpfcnpj"); > > ================== sav_cliente_lg: 65671 records > ======================= > CREATE TABLE "public"."sav_cliente_lg" ( > "codigo" INTEGER NOT NULL, > "cpfcnpj" VARCHAR(15) NOT NULL, > PRIMARY KEY("codigo") > ) WITH OIDS; > > CREATE INDEX "ix_savclientelg_cpfcnpj" ON "public"."sav_cliente_lg" > USING btree ("cpfcnpj"); > > > > Which I would like to run the following query: > > QUERY > ------------------------------------------------------------------------ > -- > SELECT > rc.pkcliente > FROM r_cliente AS rc > INNER JOIN sav_cliente_lg AS sc ON sc.cpfcnpj = sc.cpfcnpj; I am going to assume that one of the sc.cpfcnpj's above is really rc.cpfcnpj since that corresponds to the explain below. sc.cpfcnpj and rc.cpfcnpj are different length varchars. You made need an explicit cast to allow the use of indexes. (Unless there is a real business rule that mandates the limits you have used, you probably want to make them both type 'text'.) Another potential problem is not having analyzed the tables. I don't think this can be ruled out based on what you have showed us so far. > > So, I executed the explain on the query and got the following results: Generally you want to run EXPLAIN ANALYZE results when submitting questions about performance problems rather than just EXPLAIN results.
Bruno Wolff III <bruno@wolff.to> writes: > I am going to assume that one of the sc.cpfcnpj's above is really rc.cpfcnpj > since that corresponds to the explain below. No, actually the explain plan corresponds to the sc.cpfcnpj = sc.cpfcnpj condition. I didn't twig to the typo until I started to wonder why the plan had the condition in the wrong place (attached to the seqscan and not the join step). > sc.cpfcnpj and rc.cpfcnpj are different length varchars. You made need > an explicit cast to allow the use of indexes. AFAIK the cross-type issues only apply to crossing actual types, not lengths. That does look like an error in the database schema, though. regards, tom lane