Thread: Impact of foreign keys on a simple count(*) ?

Impact of foreign keys on a simple count(*) ?

From
"C. Bensend"
Date:
Hey folks (long email, my apologies),
  I am wrapping up my schema upgrade, and I just noticed a real
show-stopper for me...  Here is the "before" table structure:
email_id          | integer                     | not nulljoejob            | boolean                     | default
falsebayes_poison     | boolean                     | default falseperm_error        | boolean                     |
defaultfalsebait_id           | integer                     | default 0num_recip         | integer
|default 1product           | integer                     | default 1043date_received     | timestamp with time zone
|only_date        | date                        |only_time         | time with time zone         |maildir_file      |
charactervarying(64)       |errors_to         | character varying(512)      |reply_to          | character varying(512)
    |spammer           | character varying(512)      |return_path       | character varying(512)      |received_from
| character varying(512)      |message_id        | character varying(512)      |target_domain     | character
varying(512)     |mail_date         | character varying(512)      |x_priority        | character varying(512)
|x_msmail_priority| character varying(512)      |x_mimeole         | character varying(512)      |mime_version      |
charactervarying(512)      |subject           | character varying(1024)     |mail_to           | character
varying(2048)    |x_mailer          | character varying(2048)     |content_type      | character varying(2048)
|user_agent       | character varying(2048)     |cc                | character varying(2048)     |comments          |
charactervarying(8192)     |last_mod          | timestamp without time zone | default
 
('now'::text)::timest
amp(6) with time zone
Indexes:   "emails_pkey" primary key, btree (email_id)   "emails_idx_bait_id" btree (bait_id)   "emails_idx_mail_to"
btree(mail_to)   "emails_idx_only_date" btree (only_date)   "emails_idx_only_time" btree (only_time)
"emails_idx_product"btree (product)   "emails_idx_received_from" btree (received_from)   "emails_idx_subject" btree
(subject)  "emails_idx_target_domain" btree (target_domain)
 


And here is the "after" structure:
email_id          | integer                  | not nulljoejob            | boolean                  | default
falsebayes_poison     | boolean                  | default falseperm_error        | boolean                  | default
falsenum_recip        | integer                  | default 1mydom_id          | integer                  | default
0spamv_id         | integer                  | default 1053spammer_id        | integer                  | default
1003last_mod         | timestamp with time zone | default
 
('now'::text)::timestampwith time zonebait_id           | integer                  |product_id        | integer
        |date_received     | timestamp with time zone |only_date         | date                     |only_time
|time with time zone      |maildir_file      | character varying(128)   |x_priority        | character varying(128)
|x_msmail_priority| character varying(128)   |x_mimeole         | character varying(128)   |mime_version      |
charactervarying(512)   |received_from     | character varying(512)   |content_type      | character varying(512)
|errors_to        | character varying(512)   |user_agent        | character varying(512)   |mail_date         |
charactervarying(512)   |x_mailer          | character varying(512)   |return_path       | character varying(512)
|message_id       | character varying(512)   |reply_to          | character varying(512)   |subject           |
charactervarying(1024)  |mail_to           | character varying(1024)  |cc                | character varying(2048)  |
 
Indexes:   "emails_pkey" primary key, btree (email_id)   "emails_maildir_file_key" unique, btree (maildir_file)
Foreign-key constraints:   "$1" FOREIGN KEY (bait_id) REFERENCES bait(bait_id) ON UPDATE CASCADE ON
DELETE SET NULL   "$2" FOREIGN KEY (product_id) REFERENCES products(product_id) ON UPDATE
CASCADE ON DELETE SET NULL   "$3" FOREIGN KEY (mydom_id) REFERENCES my_domains(mydom_id) ON UPDATE
CASCADE ON DELETE SET NULL   "$4" FOREIGN KEY (spamv_id) REFERENCES spamvertisers(spamv_id) ON UPDATE
CASCADE ON DELETE SET NULL   "$5" FOREIGN KEY (spammer_id) REFERENCES spammers(spammer_id) ON UPDATE
CASCADE ON DELETE SET NULL

  I am absolutely aware of the fact that the new schema can still use
some work.  :)  But, it's a step in the right direction as I muddle my
way through this.  Not to mention, I really did need those foreign keys.
  The problem is this:

Original schema:

prod01=> EXPLAIN ANALYZE SELECT count(*) FROM emails;                                                    QUERY PLAN

--------------------------------------------------------------------------------------------------------------------Aggregate
(cost=6345.61..6345.61 rows=1 width=0) (actual
 
time=1066.727..1066.728 rows=1 loops=1)  ->  Seq Scan on emails  (cost=0.00..6121.49 rows=89649 width=0) (actual
time=18.214..980.040 rows=89649 loops=1)Total runtime: 1066.931 ms
(3 rows)



New schema:

prod01=> EXPLAIN ANALYZE SELECT count(*) FROM emails;                                                     QUERY PLAN

----------------------------------------------------------------------------------------------------------------------Aggregate
(cost=22.50..22.50 rows=1 width=0) (actual
 
time=40818.750..40818.752 rows=1 loops=1)  ->  Seq Scan on emails  (cost=0.00..20.00 rows=1000 width=0) (actual
time=38801.127..40458.369 rows=89649 loops=1)Total runtime: 40819.115 ms

  Is it the addition of the multiple foreign keys that is slowing this
down so much?  I have several calculations I need to do with the count,
so this is a big thing for me.
  Suggestions appreciated, both on the immediate problem or regarding
the new schema.  Even "you're an idiot" is fine, if it accompanies a
bit of advice.  :)

Thanks!

Benny


-- 
"Even if a man chops off your hand with a sword, you still have two nice,
sharp bones to stick in his eyes."                                                     -- .sig on Slashdot





Re: Impact of foreign keys on a simple count(*) ?

From
Tom Lane
Date:
"C. Bensend" <benny@bennyvision.com> writes:
>    Is it the addition of the multiple foreign keys that is slowing this
> down so much?

Foreign keys have zero, nada, zilch to do with the performance of
count(*).  The only plausible theory I can think of for the performance
difference is that in your "new" database the table has been through
several mass updates, leading to a whole lot of dead rows and a much
larger physical table size to scan through.  I'd suggest a VACUUM
VERBOSE on both old and new copies of the table to get an idea of the
relative physical sizes.  You might need a VACUUM FULL to get the new
table back down to a reasonable size...
        regards, tom lane


Re: Impact of foreign keys on a simple count(*) ?

From
"C. Bensend"
Date:
> Foreign keys have zero, nada, zilch to do with the performance of
> count(*).

OK, I just wanted to make sure.

> The only plausible theory I can think of for the performance
> difference is that in your "new" database the table has been through
> several mass updates, leading to a whole lot of dead rows and a much
> larger physical table size to scan through.  I'd suggest a VACUUM
> VERBOSE on both old and new copies of the table to get an idea of the
> relative physical sizes.  You might need a VACUUM FULL to get the new
> table back down to a reasonable size...

Oh good lord.

*waves the Idiot Flag as he slinks back under his rock*

Thank you, Tom, for your patience.  That was it.  I'm a potatohead,
but I'm learning.  :)

Benny, swearing he DID do a vacuum, but obviously not...


-- 
"Even if a man chops off your hand with a sword, you still have two nice,
sharp bones to stick in his eyes."                                                     -- .sig on Slashdot