Thread: Noobie: Problems with a query

Noobie: Problems with a query

From
"Chris Boget"
Date:
The query that I'm trying to execute is as follows:

SELECT
  card_names.card_name,
  card_sets.set_name
FROM
  card_names_in_sets,
  card_names,
  card_sets
WHERE
  card_names_in_sets.card_name_record_num =
    card_names.record_num
AND
  card_names_in_sets.card_set_record_number =
    card_sets.record_num;

And the explain for this query is:

Merge Join  (cost=100.37..186.36 rows=191 width=83)
  Merge Cond: ("outer".record_num = "inner".card_name_record_num)
  ->  Index Scan using card_names_record_num_idx on card_names
      (cost=0.00..78.09 rows=1826 width=47)
  ->  Sort  (cost=100.37..100.85 rows=191 width=36)
        Sort Key: card_names_in_sets.card_name_record_num
        ->  Hash Join  (cost=1.14..93.16 rows=191 width=36)
              Hash Cond:
              ("outer".card_set_record_number = "inner".record_num)
              ->  Seq Scan on card_names_in_sets  (cost=0.00..63.65
                  rows=3465 width=8)
              ->  Hash  (cost=1.11..1.11 rows=11 width=28)
                    ->  Seq Scan on card_sets  (cost=0.00..1.11
                         rows=11 width=28)

which, sadly, is Greek to me.  The problem is that the above query
takes very close to 12 seconds to execute.  Is there a better way I
could write the query?  The number of rows in each table is as
follows:

1826 : card_names
3465 : card_names_in_sets
11   : card_sets
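An equivalent way to write the join, using explicit JOIN syntax (PostgreSQL
plans both forms the same way, so this is a readability sketch rather than a
guaranteed speedup):

```sql
SELECT
  cn.card_name,
  cs.set_name
FROM card_names_in_sets AS cnis
JOIN card_names AS cn ON cnis.card_name_record_num   = cn.record_num
JOIN card_sets  AS cs ON cnis.card_set_record_number = cs.record_num;
```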

On a side note, it takes almost 11 seconds just to display the 3500
rows in card_names_in_sets.  Is there a better way to create that
table?

The table "card_names_in_sets" is a bridge between the tables
"card_names" and "card_sets".

My 3 tables are (from pgAdminII):


CREATE TABLE public.card_names_in_sets (
  card_name_record_num int4 NOT NULL,
  card_set_record_number int4 NOT NULL,
  record_num int4
    DEFAULT nextval('public.card_names_in_sets_record_num_seq'::text)
    NOT NULL,
  CONSTRAINT card_names_in_sets_record_num_idx UNIQUE (record_num),
  CONSTRAINT card_names_in_sets_pkey PRIMARY KEY (record_num),
  CONSTRAINT "$1" FOREIGN KEY (card_name_record_num) REFERENCES
    card_names (record_num) ON DELETE CASCADE ON UPDATE NO ACTION NOT
    DEFERRABLE INITIALLY IMMEDIATE,
  CONSTRAINT "$2" FOREIGN KEY (card_set_record_number) REFERENCES
    card_sets (record_num) ON DELETE CASCADE ON UPDATE NO ACTION NOT
    DEFERRABLE INITIALLY IMMEDIATE
) WITH OIDS;
CREATE UNIQUE INDEX card_names_in_sets_record_num_idx ON
  card_names_in_sets USING btree (record_num);
CREATE INDEX card_names_in_sets_card_name_record_num_idx ON
  card_names_in_sets USING btree (card_name_record_num);
CREATE INDEX card_names_in_sets_card_set_record_num_idx ON
  card_names_in_sets USING btree (card_set_record_number);

CREATE TABLE public.card_names (
  card_name varchar(50) DEFAULT '',
  record_num int4
    DEFAULT nextval('public.card_names_record_num_seq'::text) NOT NULL,
  CONSTRAINT card_names_record_num_idx UNIQUE (record_num),
  CONSTRAINT card_names_pkey PRIMARY KEY (record_num),
  CONSTRAINT card_names_integrity CHECK (((card_name IS NOT NULL) AND
    (card_name <> ''::character varying)))
) WITH OIDS;
CREATE UNIQUE INDEX card_names_record_num_idx ON card_names USING btree
  (record_num);

CREATE TABLE public.card_sets (
  set_name varchar(20) DEFAULT '',
  record_num int4
    DEFAULT nextval('public.card_sets_record_num_seq'::text) NOT NULL,
  CONSTRAINT card_sets_record_num_idx UNIQUE (record_num),
  CONSTRAINT card_sets_pkey PRIMARY KEY (record_num),
  CONSTRAINT card_sets_integrity CHECK (((set_name IS NOT NULL) AND
    (set_name <> ''::character varying)))
) WITH OIDS;
CREATE UNIQUE INDEX card_sets_record_num_idx ON card_sets USING btree
  (record_num);

Any help with this would be *greatly* appreciated!!

thnx,
Christoph


Re: Noobie: Problems with a query

From
Nabil Sayegh
Date:
On Mon, 2003-06-30 at 13:56, Chris Boget wrote:
> which, sadly, is Greek to me.  The problem is that the above query
> takes very close to 12 seconds to execute.  Is there a better way I
> could write the query?  The number of rows in each table is as
> follows:

First of all you should try: VACUUM ANALYZE;

HTH
--
 e-Trolley Sayegh & John, Nabil Sayegh
 Tel.: 0700 etrolley /// 0700 38765539
 Fax.: +49 69 8299381-8
 PGP : http://www.e-trolley.de


Re: Noobie: Problems with a query

From
"Chris Boget"
Date:
> On Mon, 2003-06-30 at 13:56, Chris Boget wrote:
> > which, sadly, is Greek to me.  The problem is that the above query
> > takes very close to 12 seconds to execute.  Is there a better way I
> > could write the query?  The number of rows in each table is as
> > follows:
> First of all you should try: VACUUM ANALYZE;

What was that supposed to tell me?

As (problem) db owner:

=> vacuum analyze;
WARNING:  Skipping "pg_group" --- only table or database owner can VACUUM it
WARNING:  Skipping "pg_shadow" --- only table or database owner can VACUUM it
WARNING:  Skipping "pg_database" --- only table or database owner can VACUUM it
VACUUM
=>

As root:

=# vacuum analyze;
VACUUM
=#

What am I missing?

Christoph



Re: Noobie: Problems with a query

From
Bruno Wolff III
Date:
On Mon, Jun 30, 2003 at 07:27:30 -0500,
  Chris Boget <chris@wild.net> wrote:
> > On Mon, 2003-06-30 at 13:56, Chris Boget wrote:
> > > which, sadly, is Greek to me.  The problem is that the above query
> > > takes very close to 12 seconds to execute.  Is there a better way I
> > > could write the query?  The number of rows in each table is as
> > > follows:
> > First of all you should try: VACUUM ANALYZE;
>
> What was that supposed to tell me?

It was supposed to make sure that the planner had reliable data from
which to plan your query.

When sending a query plan to the lists, you want to do an explain analyze
so that we can see what really happens when your query is executed as
well as what the planner thinks is going to happen.

The plan for your query shows a fairly small number of rows for it to be
taking 12 seconds to run.  That is one hint that an ANALYZE might not
have been done.

Re: Noobie: Problems with a query

From
Nabil Sayegh
Date:
On Mon, 2003-06-30 at 14:27, Chris Boget wrote:

> What was that supposed to tell me?
>
> As (problem) db owner:
>
> => vacuum analyze;
> WARNING:  Skipping "pg_group" --- only table or database owner can VACUUM it
> WARNING:  Skipping "pg_shadow" --- only table or database owner can VACUUM it
> WARNING:  Skipping "pg_database" --- only table or database owner can VACUUM it
> VACUUM
> =>
>
> As root:
>
> =# vacuum analyze;
> VACUUM
> =#
>
> What am I missing?

Nothing, maybe.
The pg_* tables are only used internally and shouldn't have an impact on
your queries.
So the VACUUM should be OK as owner.

VACUUM ANALYZE updates the statistics the planner uses to choose query
plans.  You should run it every now and then.
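For example, against the three tables from the original post (table names
taken from that message):

```sql
-- Update planner statistics for the whole database:
VACUUM ANALYZE;

-- Or target only the tables involved in the slow query:
VACUUM ANALYZE card_names_in_sets;
VACUUM ANALYZE card_names;
VACUUM ANALYZE card_sets;
```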

bye
--
 e-Trolley Sayegh & John, Nabil Sayegh
 Tel.: 0700 etrolley /// 0700 38765539
 Fax.: +49 69 8299381-8
 PGP : http://www.e-trolley.de


Re: Noobie: Problems with a query

From
"Chris Boget"
Date:
> > > First of all you should try: VACUUM ANALYZE;
> > What was that supposed to tell me?
> It was supposed to make sure that the planner had reliable data from
> which to plan your query.

Ok.

> When sending a query plan to the lists, you want to do an explain analyze
> so that we can see what really happens when your query is executed as
> well as what the planner thinks is going to happen.

so 'EXPLAIN ANALYZE your_query' gives more information than just EXPLAIN?

> The plan for your query had a fairly small number of rows for it to take
> 12 seconds to run the query. This is one hint that an analyze might not
> have been done.

Well, I've been doing all my work through pgAdminII as it's been a lot easier
for me while learning.  PG has a lot of really obscure commands and I'm still
trying to get them all down.
In any case, I did 'vacuum' and 'vacuum analyze' there but neither seemed to
improve matters.  However, this morning, after I did 'vacuum analyze' from
the PG command prompt, I re-ran the query and the rows came back in
milliseconds.  I'll have to see if the rows come back that quickly in pgAdmin
and, if they do, that tells me there is something wrong with the way pgAdmin
does the 'vacuum' and 'vacuum analyze'.  If the rows don't come back that
fast, it'll tell me that there is *a lot* of overhead when using pgAdmin to
run queries.

Thank you and Nabil for your help and suggestions.  I'll report back my findings
wrt pgAdminII after I get home this evening.

thnx,
Christoph


Re: Noobie: Problems with a query

From
Bruno Wolff III
Date:
On Mon, Jun 30, 2003 at 08:42:07 -0500,
  Chris Boget <chris@wild.net> wrote:
>
> so 'EXPLAIN ANALYZE your_query' gives more information than just EXPLAIN?

Yes. It actually runs the query, so you find out how long it really takes
to run as well as the actual number of rows handled at each step. This
helps show why a plan didn't work very well.
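Applied to the query from the first message, that would look like the
following (the actual-vs-estimated row counts and timings appear in the
output, which is omitted here):

```sql
EXPLAIN ANALYZE
SELECT
  card_names.card_name,
  card_sets.set_name
FROM
  card_names_in_sets,
  card_names,
  card_sets
WHERE
  card_names_in_sets.card_name_record_num = card_names.record_num
AND
  card_names_in_sets.card_set_record_number = card_sets.record_num;
```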

> In any case, I did 'vacuum' and 'vacuum analyze' there but neither seemed to
> improve matters.  However, this morning, after I did 'vacuum analyze' from
> the PG command prompt, I re-ran the query and the rows came back in
> milliseconds.  I'll have to see if the rows come back that quickly in pgAdmin
> and, if they do, that tells me there is something wrong with the way pgAdmin
> does the 'vacuum' and 'vacuum analyze'.  If the rows don't come back that
> fast, it'll tell me that there is *a lot* of overhead when using pgAdmin to
> run queries.

It seems odd that vacuum analyze didn't work from pgAdmin. Perhaps it
wasn't run against all of the tables, or you ran it before loading the
tables.

Re: Noobie: Problems with a query

From
"Chris Boget"
Date:
> milliseconds.  I'll have to see if the rows come back that quickly in pgAdmin
> and, if they do, that tells me there is something wrong with the way pgAdmin
> does the 'vacuum' and 'vacuum analyze'.  If the rows don't come back that
> fast, it'll tell me that there is *a lot* of overhead when using pgAdmin to
> run queries.

Apparently, there is a lot of overhead in pgAdminII.  *A lot*.  When running
the query from the command line, it comes back almost immediately.  When
running the query from pgAdminII, it takes anywhere from 9 to 12 seconds.
I ran an EXPLAIN ANALYZE and, from what I could tell, the query was supposed
to take 0.827 milliseconds (or something along those lines).  I'd paste the
results here, but sadly the power went out at home last night, while I was
working on this, due to the tropical storm that hit the Gulf.

Do other people on the list use pgAdminII?  Do you notice a performance hit
when running a query from pgAdminII versus running it from the command line?
Is there anything that I can do to optimize pgAdminII a bit?  I doubt it's because
of my machine because it's pretty fast.  Athlon XP 2100+, 512mb DDR...
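One way to separate server time from client overhead (a sketch using psql's
\timing meta-command; the pgAdminII side would be timed in its own query
window for comparison):

```sql
-- In psql, toggle client-side timing on, then run a query:
\timing
SELECT count(*) FROM card_names_in_sets;
-- psql reports the elapsed time after each statement;
-- if psql is fast and pgAdminII is slow, the overhead is client-side.
```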

thnx,
Christoph


Re: Noobie: Problems with a query

From
Tom Lane
Date:
"Chris Boget" <chris@wild.net> writes:
> Apparently, there is a lot of overhead in pgAdminII.  *A lot*.  When
> running the query from the command line, it comes back almost
> immediately.  When running the query from pgAdminII, it takes
> anywhere from 9 to 12 seconds.

Ouch.  I imagine the pgAdmin guys would like to dig into this; maybe
something about your environment is confusing their code?  I'm not sure
any of 'em read pgsql-novice though.  Try reporting the problem to the
pgadmin-hackers mailing list.

            regards, tom lane