Thread: Noobie: Problems with a query
The query that I'm trying to execute is as follows:

SELECT card_names.card_name, card_sets.set_name
FROM card_names_in_sets, card_names, card_sets
WHERE card_names_in_sets.card_name_record_num = card_names.record_num
  AND card_names_in_sets.card_set_record_number = card_sets.record_num;

And the EXPLAIN for this query is:

Merge Join  (cost=100.37..186.36 rows=191 width=83)
  Merge Cond: ("outer".record_num = "inner".card_name_record_num)
  ->  Index Scan using card_names_record_num_idx on card_names  (cost=0.00..78.09 rows=1826 width=47)
  ->  Sort  (cost=100.37..100.85 rows=191 width=36)
        Sort Key: card_names_in_sets.card_name_record_num
        ->  Hash Join  (cost=1.14..93.16 rows=191 width=36)
              Hash Cond: ("outer".card_set_record_number = "inner".record_num)
              ->  Seq Scan on card_names_in_sets  (cost=0.00..63.65 rows=3465 width=8)
              ->  Hash  (cost=1.11..1.11 rows=11 width=28)
                    ->  Seq Scan on card_sets  (cost=0.00..1.11 rows=11 width=28)

which, sadly, is Greek to me. The problem is that the above query takes very close to 12 seconds to execute. Is there a better way I could write the query? The row counts for the tables are as follows:

1826 : card_names
3465 : card_names_in_sets
  11 : card_sets

On a side note, it takes almost 11 seconds just to display the 3500 rows in card_names_in_sets. Is there a better way to create that table? The table "card_names_in_sets" is a bridge between the tables "card_names" and "card_sets".

My 3 tables are (from pgAdminII):

CREATE TABLE public.card_names_in_sets (
  card_name_record_num int4 NOT NULL,
  card_set_record_number int4 NOT NULL,
  record_num int4 DEFAULT nextval('public.card_names_in_sets_record_num_seq'::text) NOT NULL,
  CONSTRAINT card_names_in_sets_record_num_idx UNIQUE (record_num),
  CONSTRAINT card_names_in_sets_pkey PRIMARY KEY (record_num),
  CONSTRAINT "$1" FOREIGN KEY (card_name_record_num) REFERENCES card_names (record_num) ON DELETE CASCADE ON UPDATE NO ACTION NOT DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT "$2" FOREIGN KEY (card_set_record_number) REFERENCES card_sets (record_num) ON DELETE CASCADE ON UPDATE NO ACTION NOT DEFERRABLE INITIALLY IMMEDIATE
) WITH OIDS;

CREATE UNIQUE INDEX card_names_in_sets_record_num_idx ON card_names_in_sets USING btree (record_num);
CREATE INDEX card_names_in_sets_card_name_record_num_idx ON card_names_in_sets USING btree (card_name_record_num);
CREATE INDEX card_names_in_sets_card_set_record_num_idx ON card_names_in_sets USING btree (card_set_record_number);

CREATE TABLE public.card_names (
  card_name varchar(50) DEFAULT '',
  record_num int4 DEFAULT nextval('public.card_names_record_num_seq'::text) NOT NULL,
  CONSTRAINT card_names_record_num_idx UNIQUE (record_num),
  CONSTRAINT card_names_pkey PRIMARY KEY (record_num),
  CONSTRAINT card_names_integrity CHECK (((card_name IS NOT NULL) AND (card_name <> ''::character varying)))
) WITH OIDS;

CREATE UNIQUE INDEX card_names_record_num_idx ON card_names USING btree (record_num);

CREATE TABLE public.card_sets (
  set_name varchar(20) DEFAULT '',
  record_num int4 DEFAULT nextval('public.card_sets_record_num_seq'::text) NOT NULL,
  CONSTRAINT card_sets_record_num_idx UNIQUE (record_num),
  CONSTRAINT card_sets_pkey PRIMARY KEY (record_num),
  CONSTRAINT card_sets_integrity CHECK (((set_name IS NOT NULL) AND (set_name <> ''::character varying)))
) WITH OIDS;

CREATE UNIQUE INDEX card_sets_record_num_idx ON card_sets USING btree (record_num);

Any help with this would be *greatly* appreciated!!

thnx,
Christoph
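As a point of comparison, here is a sketch of the same query in explicit JOIN syntax. It is logically equivalent to the implicit-join form above (same tables and join conditions, just moved into ON clauses), so it should return the same rows and essentially the same plan:

SELECT card_names.card_name, card_sets.set_name
FROM card_names_in_sets
JOIN card_names
  ON card_names_in_sets.card_name_record_num = card_names.record_num
JOIN card_sets
  ON card_names_in_sets.card_set_record_number = card_sets.record_num;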
On Mon, 2003-06-30 at 13.56, Chris Boget wrote:
> which, sadly, is Greek to me. The problem is that the above query
> takes very close to 12 seconds to execute. Is there a better way I
> could write the query? The row counts for the tables are as
> follows:

First of all you should try:

VACUUM ANALYZE;

HTH
--
e-Trolley Sayegh & John, Nabil Sayegh
Tel.: 0700 etrolley /// 0700 38765539
Fax.: +49 69 8299381-8
PGP : http://www.e-trolley.de
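For reference, VACUUM ANALYZE can also be pointed at individual tables rather than the whole database; a minimal sketch against the three tables from the original post:

VACUUM ANALYZE card_names;
VACUUM ANALYZE card_names_in_sets;
VACUUM ANALYZE card_sets;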
> On Mon, 2003-06-30 at 13.56, Chris Boget wrote:
> > which, sadly, is Greek to me. The problem is that the above query
> > takes very close to 12 seconds to execute. Is there a better way I
> > could write the query? The row counts for the tables are as
> > follows:
>
> First of all you should try: VACUUM ANALYZE;

What was that supposed to tell me?

As (problem) db owner:

=> vacuum analyze;
WARNING:  Skipping "pg_group" --- only table or database owner can VACUUM it
WARNING:  Skipping "pg_shadow" --- only table or database owner can VACUUM it
WARNING:  Skipping "pg_database" --- only table or database owner can VACUUM it
VACUUM
=>

As root:

=# vacuum analyze;
VACUUM
=#

What am I missing?

Christoph
On Mon, Jun 30, 2003 at 07:27:30 -0500, Chris Boget <chris@wild.net> wrote:
> > On Mon, 2003-06-30 at 13.56, Chris Boget wrote:
> > > which, sadly, is Greek to me. The problem is that the above query
> > > takes very close to 12 seconds to execute. Is there a better way I
> > > could write the query? The row counts for the tables are as
> > > follows:
> > First of all you should try: VACUUM ANALYZE;
>
> What was that supposed to tell me?

It was supposed to make sure that the planner had reliable data from which to plan your query.

When sending a query plan to the lists, you want to do an EXPLAIN ANALYZE so that we can see what really happens when your query is executed as well as what the planner thinks is going to happen.

The plan for your query showed a fairly small number of rows for it to be taking 12 seconds to run. This is one hint that an ANALYZE might not have been done.
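For illustration, the suggestion amounts to prefixing the original query with EXPLAIN ANALYZE; unlike plain EXPLAIN, this actually executes the statement and annotates each plan node with actual times and row counts:

EXPLAIN ANALYZE
SELECT card_names.card_name, card_sets.set_name
FROM card_names_in_sets, card_names, card_sets
WHERE card_names_in_sets.card_name_record_num = card_names.record_num
  AND card_names_in_sets.card_set_record_number = card_sets.record_num;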
On Mon, 2003-06-30 at 14.27, Chris Boget wrote:
> What was that supposed to tell me?
>
> As (problem) db owner:
>
> => vacuum analyze;
> WARNING:  Skipping "pg_group" --- only table or database owner can VACUUM it
> WARNING:  Skipping "pg_shadow" --- only table or database owner can VACUUM it
> WARNING:  Skipping "pg_database" --- only table or database owner can VACUUM it
> VACUUM
> =>
>
> As root:
>
> =# vacuum analyze;
> VACUUM
> =#
>
> What am I missing?

Nothing, maybe. The pg_* tables are only used internally and shouldn't have an impact on your queries, so the VACUUM should be OK as owner.

VACUUM ANALYZE collects the statistics the planner uses to optimize your queries. You should run it every now and then.

bye
--
e-Trolley Sayegh & John, Nabil Sayegh
Tel.: 0700 etrolley /// 0700 38765539
Fax.: +49 69 8299381-8
PGP : http://www.e-trolley.de
> > > First of all you should try: VACUUM ANALYZE;
> > What was that supposed to tell me?
> It was supposed to make sure that the planner had reliable data from
> which to plan your query.

Ok.

> When sending a query plan to the lists, you want to do an EXPLAIN ANALYZE
> so that we can see what really happens when your query is executed as
> well as what the planner thinks is going to happen.

So 'EXPLAIN ANALYZE your_query' gives more information than just EXPLAIN?

> The plan for your query showed a fairly small number of rows for it to
> be taking 12 seconds to run. This is one hint that an ANALYZE might not
> have been done.

Well, I've been doing all my work through pgAdminII as it's been a lot easier for me while learning. PG has a lot of really obscure commands and I'm still trying to get them all down. In any case, I did 'vacuum' and 'vacuum analyze' there but neither seemed to improve matters. However, this morning, after I did 'vacuum analyze' from the PG command prompt, I re-ran the query and the rows came back in milliseconds. I'll have to see if the rows come back that quickly in pgAdmin and, if they do, that tells me there is something wrong with the way pgAdmin does the 'vacuum' and 'vacuum analyze'. If the rows don't come back that fast, then it'll tell me that there is *a lot* of overhead when using pgAdmin to run queries.

Thank you and Nabil for your help and suggestions. I'll report back my findings wrt pgAdminII after I get home this evening.

thnx,
Christoph
On Mon, Jun 30, 2003 at 08:42:07 -0500, Chris Boget <chris@wild.net> wrote:
>
> So 'EXPLAIN ANALYZE your_query' gives more information than just EXPLAIN?

Yes. It actually runs the query, so you find out how long it really takes as well as see the actual number of rows handled in each step. This helps explain why a plan didn't work very well.

> In any case, I did 'vacuum' and 'vacuum analyze' there but neither seemed to
> improve matters. However, this morning, after I did 'vacuum analyze' from
> the PG command prompt, I re-ran the query and the rows came back in
> milliseconds. I'll have to see if the rows come back that quickly in pgAdmin
> and, if they do, that tells me there is something wrong with the way pgAdmin
> does the 'vacuum' and 'vacuum analyze'. If the rows don't come back that
> fast, then it'll tell me that there is *a lot* of overhead when using pgAdmin to
> run queries.

It seems odd that vacuum analyze didn't work from pgAdmin. Perhaps it wasn't run against all of the tables, or you ran it before loading the tables.
> milliseconds. I'll have to see if the rows come back that quickly in pgAdmin
> and, if they do, that tells me there is something wrong with the way pgAdmin
> does the 'vacuum' and 'vacuum analyze'. If the rows don't come back that
> fast, then it'll tell me that there is *a lot* of overhead when using pgAdmin to
> run queries.

Apparently, there is a lot of overhead in pgAdminII. *A lot*. When running the query from the command line, it comes back almost immediately. When running the query from pgAdminII, it takes anywhere from 9 to 12 seconds.

I ran an EXPLAIN ANALYZE and, from what I could tell, the query was to take .827 milliseconds (or something along those lines). I'd paste the results here but sadly the power went out at home last night, while I was working on this, due to the tropical storm that hit the Gulf.

Do other people on the list use pgAdminII? Do you notice a performance hit when running a query from pgAdminII versus running it from the command line? Is there anything I can do to optimize pgAdminII a bit? I doubt it's because of my machine, because it's pretty fast: Athlon XP 2100+, 512mb DDR...

thnx,
Christoph
"Chris Boget" <chris@wild.net> writes: > Apparently, there is alot of overhead on pgAdminII. *Alot*. When > running the query from the command line, it comes back almost > immediately. When run- ning the query from pgAdminII, it takes > anywhere from betwen 9 seconds to 12 seconds. Ouch. I imagine the pgAdmin guys would like to dig into this; maybe something about your environment is confusing their code? I'm not sure any of 'em read pgsql-novice though. Try reporting the problem to the pgadmin-hackers mailing list. regards, tom lane