Re: [dspam-users] Postgres vs. MySQL - Mailing list pgsql-performance

From Christopher Browne
Subject Re: [dspam-users] Postgres vs. MySQL
Date
Msg-id m3u0rbxhdo.fsf@knuth.knuth.cbbrowne.com
Whole thread Raw
In response to Postgres vs. MySQL  (Evilio del Rio <edelrio@cmima.csic.es>)
List pgsql-performance
Martha Stewart called it a Good Thing when cshobe@osss.net ("Casey Allen Shobe") wrote:
> I posted about this a couple days ago on dspam-dev...
>
> I am using DSpam with PostgreSQL, and like you discovered the horrible
> performance.  The reason is because the default PostgreSQL query planner
> settings determine that a sequence scan will be more efficient than an
> index scan, which is wrong.  To correct this behavior, adjust the query
> planner settings for the appropriate table/column with this command:
>
> alter table "dspam_token_data" alter "token" set statistics 200; analyze;
>
> Let me know if it help you.  It worked wonders for me.

That makes a great deal of sense; the number of tokens are likely to
be rather larger than 10, and are likely to be quite unevenly
distributed.  That fits with the need you found to collect more
statistics on that column.

Other cases where it seems plausible that it would be worthwhile to do
the same:

  alter table dspam_signature_data alter signature set statistics 200;
  alter table dspam_neural_data alter node set statistics 200;
  alter table dspam_neural_decisions alter signature set statistics 200;

Lionel's suggestion of having a functional index on dspam_token_data
(innocent_hits + spam_hits) also seems likely to be helpful.  Along
with that, it might prove necessary to alter stats on dspam_token_data
thus:

  alter table dspam_token_data alter innocent_hits set statistics 200;
  alter table dspam_token_data alter spam_hits set statistics 200;

None of these changes are likely to make things materially worse; if
they do help, they'll help rather a lot.
--
(format nil "~S@~S" "cbbrowne" "gmail.com")
http://www.ntlug.org/~cbbrowne/nonrdbms.html
Rules  of the  Evil  Overlord #112.  "I  will not  rely entirely  upon
"totally  reliable"  spells  that  can be  neutralized  by  relatively
inconspicuous talismans." <http://www.eviloverlord.com/>

pgsql-performance by date:

Previous
From: Lionel Bouton
Date:
Subject: Re: [dspam-users] Postgres vs. MySQL
Next
From: "Jim C. Nasby"
Date:
Subject: Re: Postgres vs. DSpam