Re: - Mailing list pgsql-performance
From | |
---|---|
Subject | Re: |
Date | |
Msg-id | 50235.216.80.95.13.1038861943.squirrel@www.l-i-e.com Whole thread Raw |
In response to | Re: (Rod Taylor <rbt@rbt.ca>) |
Responses |
Re:
|
List | pgsql-performance |
>> I don't understand why PostgreSQL sometimes chooses not to use the >> existing INDEXes to do an index scan instead of sequential scan -- >> Does it really think sequential will be faster, or does it eliminate >> an index scan > > Yes, and it's generally right. > >> because there won't be enough hard drive or swap space to do it? > > Nope. Simply because of time it takes to read from the disk. An index > scan makes ~ 1 read per tuple and sequential scans make one per page > (gross simplification). Hmmm. An "index" is apparently nothing like I expected it to be... Here I thought it would be some quick hash-table small data-set lookup with a reference to the OID -- and that most of the hash tables could just be loaded in one fell swoop. Oh well. >> Currently, full text search queries take on the order of 2 minutes to >> execute. >> We need them to be happening in 5 seconds, if at all possible. > > How about a couple of explains of the queries. Explains were posted previously, but I'll do a couple more. At its simplest, this takes 30 seconds: explain select article.* from article where lower(text) like '%einstein%'; NOTICE: QUERY PLAN: Seq Scan on article (cost=0.00..1155.01 rows=1 width=216) Or, slightly more complex: explain SELECT DISTINCT *, 0 + (0 + 10 * (lower(title) like '%einstein%') ::int + 10 * (lower(author_flattened) like '%einstein%') ::int + 30 * (lower(subject_flattened) like '%einstein%') ::int + 30 * (lower(text) LIKE '%einstein%') ::int + 9 * (substring(lower(title), 1, 20) like '%einstein%') ::int + 25 * (substring(lower(text), 1, 20) LIKE '%einstein%') ::int ) AS points FROM article WHERE TRUE AND (FALSE OR (lower(title) like '%einstein%') OR (lower(author_flattened) like '%einstein%') OR (lower(subject_flattened) like '%einstein%') OR (lower(text) LIKE '%einstein%') ) ORDER BY points desc, volume, number, article.article LIMIT 10, 0; NOTICE: QUERY PLAN: Limit (cost=1418.03..1418.08 rows=1 width=216) -> Unique (cost=1418.03..1418.08 rows=1 width=216) -> Sort (cost=1418.03..1418.03 rows=1 width=216) -> Seq Scan on article (cost=0.00..1418.02 rows=1 width=216) > What kind of tuning have > you done in postgresql.conf. None. Never really understood what that one memory setting would affect... And the rest of the options seemed to be about logging output (which I also can't seem to crank up to the level of getting query analysis out). I RTFM, but actually comprehending what was written ... :-^ > Whats your hardware like? processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 11 model name : Intel(R) Pentium(R) III CPU family 1400MHz stepping : 1 cpu MHz : 1406.005 cache size : 512 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse bogomips : 2804.94 total: used: free: shared: buffers: cached: Mem: 921235456 736669696 184565760 749568 75321344 592257024 Swap: 2097143808 15368192 2081775616 MemTotal: 899644 kB MemFree: 180240 kB MemShared: 732 kB Buffers: 73556 kB Cached: 573896 kB SwapCached: 4480 kB Active: 433776 kB Inact_dirty: 182208 kB Inact_clean: 36680 kB Inact_target: 229376 kB HighTotal: 0 kB HighFree: 0 kB LowTotal: 899644 kB LowFree: 180240 kB SwapTotal: 2047992 kB SwapFree: 2032984 kB > Have you > partitioned the data to separate disks in any way? No, except when attempting to do the PostgreSQL contrib/fulltextindex we clustered the _fti table by loading it in word order. > Are you doing mostly (all?) reads? Some writes? Perhaps clustering? Mostly reads. Some writes by: Admin fixing typos, adding new articles Nightly cron jobs to "flatten" large-scale JOINs into text contatenations (We could get rid of that and go back to the JOINs, now that we've figured out that it's really the full text search that's killing us, not the JOINs) > Is this on 7.2 or 7.3? 7.1.3 > What is the Locale? C or en_US or something > else? AFAIK, we didn't do anything to alter the locale from whatever the default would be...
pgsql-performance by date: