Thread: [PERFORM] Performance issue in PostgreSQL server...
Dear Experts,
I need your suggestions to resolve the performance issue reported on our PostgreSQL9.1 production database having 1.5 TB Size. I have observed that, some select queries with order by clause are taking lot of time in execution and forcing applications to give slow response.
The configuration of database server is :
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
CPU’s : 8
Core(s) per socket: 4
Socket(s): 2
Model name: Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
RAM : 32 GB
SWAP :8 Gb
Kernel parameter:
kernel.shmmax = 32212254720
kernel.shmall = 1073741824
Values of PostgreSQL.conf parameters are :
shared_buffers = 10GB
temp_buffers = 32MB
work_mem = 512MB
maintenance_work_mem = 2048MB
max_files_per_process = 2000
checkpoint_segments = 200
max_wal_senders = 5
wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers
Queries taking lot of time are:
==================================
2017-03-02 00:46:50 IST LOG: duration: 2492951.927 ms execute <unnamed>: SELECT DISTINCT feature_id FROM evidence.point p INNER JOIN evidence.observation_evidence oe ON p.feature_id = oe.evd_feature_id WHERE p.domain_class_id IN (11) AND (p.modification_time > '2015-05-10 00:06:56.056 IST' OR oe.modification_time > '2015-05-10 00:06:56.056 IST') ORDER BY feature_id
2017-03-02 01:05:16 IST LOG: duration: 516250.512 ms execute <unnamed>: SELECT DISTINCT feature_id FROM evidence.point p INNER JOIN evidence.observation_evidence oe ON p.feature_id = oe.evd_feature_id WHERE p.domain_class_id IN (3) AND (p.modification_time > '2015-05-10 01:22:59.059 IST' OR oe.modification_time > '2015-05-10 01:22:59.059 IST') ORDER BY feature_id
Top command output:
top - 15:13:15 up 66 days, 3:45, 8 users, load average: 1.84, 1.59, 1.57
Tasks: 830 total, 1 running, 828 sleeping, 0 stopped, 1 zombie
Cpu(s): 3.4%us, 0.7%sy, 0.0%ni, 81.7%id, 14.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32830016k total, 32142596k used, 687420k free, 77460k buffers
Swap: 8190972k total, 204196k used, 7986776k free, 27981268k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30639 postgres 20 0 10.5g 4.7g 4.7g S 13.5 14.9 10:20.95 postgres
18185 postgres 20 0 10.5g 603m 596m S 4.9 1.9 2:51.16 postgres
16543 postgres 20 0 10.5g 2.8g 2.8g S 4.3 8.8 1:34.04 postgres
14710 postgres 20 0 10.5g 2.9g 2.9g S 3.9 9.2 1:20.84 postgres
1214 root 20 0 15668 1848 896 S 1.0 0.0 130:46.43 top
13462 postgres 20 0 10.5g 1.4g 1.3g S 1.0 4.3 0:25.56 postgres
20081 root 20 0 15668 1880 936 R 1.0 0.0 0:00.12 top
13478 postgres 20 0 10.5g 2.1g 2.1g S 0.7 6.9 0:56.43 postgres
41107 root 20 0 416m 10m 4892 S 0.7 0.0 305:25.71 pgadmin3
2680 root 20 0 0 0 0 S 0.3 0.0 103:38.54 nfsiod
3558 root 20 0 13688 1100 992 S 0.3 0.0 45:00.36 gam_server
15576 root 20 0 0 0 0 S 0.3 0.0 0:01.16 flush-253:1
18430 postgres 20 0 10.5g 18m 13m S 0.3 0.1 0:00.64 postgres
20083 postgres 20 0 105m 1852 1416 S 0.3 0.0 0:00.01 bash
24188 postgres 20 0 102m 1856 832 S 0.3 0.0 0:23.39 sshd
28250 postgres 20 0 156m 1292 528 S 0.3 0.0 0:46.86 postgres
1 root 20 0 19356 1188 996 S 0.0 0.0 0:05.00 init
Regards,
Dinesh Chandra
|Database administrator (Oracle/PostgreSQL)|
DISCLAIMER:
This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. Check all attachments for viruses before opening them. All views or opinions presented in this e-mail are those of the author and may not reflect the opinion of Cyient or those of our affiliates.
Dear Experts,
I need your suggestions to resolve the performance issue reported on our PostgreSQL9.1 production database having 1.5 TB Size. I have observed that, some select queries with order by clause are taking lot of time in execution and forcing applications to give slow response.
The configuration of database server is :
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
CPU’s : 8
Core(s) per socket: 4
Socket(s): 2
Model name: Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
RAM : 32 GB
SWAP :8 Gb
Kernel parameter:
kernel.shmmax = 32212254720
kernel.shmall = 1073741824
Values of PostgreSQL.conf parameters are :
shared_buffers = 10GB
temp_buffers = 32MB
work_mem = 512MB
maintenance_work_mem = 2048MB
max_files_per_process = 2000
checkpoint_segments = 200
max_wal_senders = 5
wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers
Queries taking lot of time are:
==============================
====
2017-03-02 00:46:50 IST LOG: duration: 2492951.927 ms execute <unnamed>: SELECT DISTINCT feature_id FROM evidence.point p INNER JOIN evidence.observation_evidence oe ON p.feature_id = oe.evd_feature_id WHERE p.domain_class_id IN (11) AND (p.modification_time > '2015-05-10 00:06:56.056 IST' OR oe.modification_time > '2015-05-10 00:06:56.056 IST') ORDER BY feature_id
2017-03-02 01:05:16 IST LOG: duration: 516250.512 ms execute <unnamed>: SELECT DISTINCT feature_id FROM evidence.point p INNER JOIN evidence.observation_evidence oe ON p.feature_id = oe.evd_feature_id WHERE p.domain_class_id IN (3) AND (p.modification_time > '2015-05-10 01:22:59.059 IST' OR oe.modification_time > '2015-05-10 01:22:59.059 IST') ORDER BY feature_id
Top command output:
top - 15:13:15 up 66 days, 3:45, 8 users, load average: 1.84, 1.59, 1.57
Tasks: 830 total, 1 running, 828 sleeping, 0 stopped, 1 zombie
Cpu(s): 3.4%us, 0.7%sy, 0.0%ni, 81.7%id, 14.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32830016k total, 32142596k used, 687420k free, 77460k buffers
Swap: 8190972k total, 204196k used, 7986776k free, 27981268k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30639 postgres 20 0 10.5g 4.7g 4.7g S 13.5 14.9 10:20.95 postgres
18185 postgres 20 0 10.5g 603m 596m S 4.9 1.9 2:51.16 postgres
16543 postgres 20 0 10.5g 2.8g 2.8g S 4.3 8.8 1:34.04 postgres
14710 postgres 20 0 10.5g 2.9g 2.9g S 3.9 9.2 1:20.84 postgres
1214 root 20 0 15668 1848 896 S 1.0 0.0 130:46.43 top
13462 postgres 20 0 10.5g 1.4g 1.3g S 1.0 4.3 0:25.56 postgres
20081 root 20 0 15668 1880 936 R 1.0 0.0 0:00.12 top
13478 postgres 20 0 10.5g 2.1g 2.1g S 0.7 6.9 0:56.43 postgres
41107 root 20 0 416m 10m 4892 S 0.7 0.0 305:25.71 pgadmin3
2680 root 20 0 0 0 0 S 0.3 0.0 103:38.54 nfsiod
3558 root 20 0 13688 1100 992 S 0.3 0.0 45:00.36 gam_server
15576 root 20 0 0 0 0 S 0.3 0.0 0:01.16 flush-253:1
18430 postgres 20 0 10.5g 18m 13m S 0.3 0.1 0:00.64 postgres
20083 postgres 20 0 105m 1852 1416 S 0.3 0.0 0:00.01 bash
24188 postgres 20 0 102m 1856 832 S 0.3 0.0 0:23.39 sshd
28250 postgres 20 0 156m 1292 528 S 0.3 0.0 0:46.86 postgres
1 root 20 0 19356 1188 996 S 0.0 0.0 0:05.00 init
Regards,
Dinesh Chandra
|Database administrator (Oracle/PostgreSQL)|
DISCLAIMER:
This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. Check all attachments for viruses before opening them. All views or opinions presented in this e-mail are those of the author and may not reflect the opinion of Cyient or those of our affiliates.
Dear Nur,
The below is the output for psql=> EXPLAIN ANALYZE SELECT DISTINCT feature_id FROM evidence.point p INNER JOIN evidence.observation_evidence oe ON p.feature_id = oe.evd_feature_id WHERE p.domain_class_id IN (11) AND (p.modification_time > '2015-05-10 00:06:56.056 IST' OR oe.modification_time > '2015-05-10 00:06:56.056 IST') ORDER BY feature_id
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------
Unique (cost=1679730.32..1679837.46 rows=21428 width=8) (actual time=154753.528..155657.818 rows=1607489 loops=1)
-> Sort (cost=1679730.32..1679783.89 rows=21428 width=8) (actual time=154753.514..155087.734 rows=4053270 loops=1)
Sort Key: p.feature_id
Sort Method: quicksort Memory: 288302kB
-> Hash Join (cost=1501657.09..1678188.87 rows=21428 width=8) (actual time=144146.620..152050.311 rows=4053270 loops=1)
Hash Cond: (oe.evd_feature_id = p.feature_id)
Join Filter: ((p.modification_time > '2015-05-10 03:36:56.056+05:30'::timestamp with time zone) OR (oe.modification_time > '2015-05-10 03:36:5
6.056+05:30'::timestamp with time zone))
-> Seq Scan on observation_evidence oe (cost=0.00..121733.18 rows=5447718 width=16) (actual time=0.007..1534.905 rows=5434406 loops=1)
-> Hash (cost=1483472.70..1483472.70 rows=1454751 width=16) (actual time=144144.653..144144.653 rows=1607491 loops=1)
Buckets: 262144 Batches: 1 Memory Usage: 75352kB
-> Index Scan using point_domain_class_id_index on point p (cost=0.00..1483472.70 rows=1454751 width=16) (actual time=27.265..142101.1
59 rows=1607491 loops=1)
Index Cond: (domain_class_id = 11)
Total runtime: 155787.379 ms
(13 rows)
Regards,
Dinesh Chandra
|Database administrator (Oracle/PostgreSQL)| Cyient Ltd. Noida.
------------------------------------------------------------------
Mobile: +91-9953975849 | Ext 1078 |dinesh.chandra@cyient.com
Plot No. 7, NSEZ, Phase-II ,Noida-Dadri Road, Noida - 201 305,India.
From: Nur Agus [mailto:nuragus.linux@gmail.com]
Sent: 03 March, 2017 5:54 PM
To: Dinesh Chandra 12108 <Dinesh.Chandra@cyient.com>
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Performance issue in PostgreSQL server...
Hello Dinesh,
You can try the EXPLAIN tool
psql=> EXPLAIN ANALYZE SELECT DISTINCT feature_id FROM evidence.point p INNER JOIN evidence.observation_evidence oe ON p.feature_id = oe.evd_feature_id WHERE p.domain_class_id IN (11) AND (p.modification_time > '2015-05-10 00:06:56.056 IST' OR oe.modification_time > '2015-05-10 00:06:56.056 IST') ORDER BY feature_id
Then paste here the result.
Thanks
On Fri, Mar 3, 2017 at 5:29 PM, Dinesh Chandra 12108 <Dinesh.Chandra@cyient.com> wrote:
Dear Experts,
I need your suggestions to resolve the performance issue reported on our PostgreSQL9.1 production database having 1.5 TB Size. I have observed that, some select queries with order by clause are taking lot of time in execution and forcing applications to give slow response.
The configuration of database server is :
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
CPU’s : 8
Core(s) per socket: 4
Socket(s): 2
Model name: Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
RAM : 32 GB
SWAP :8 Gb
Kernel parameter:
kernel.shmmax = 32212254720
kernel.shmall = 1073741824
Values of PostgreSQL.conf parameters are :
shared_buffers = 10GB
temp_buffers = 32MB
work_mem = 512MB
maintenance_work_mem = 2048MB
max_files_per_process = 2000
checkpoint_segments = 200
max_wal_senders = 5
wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers
Queries taking lot of time are:
==================================
2017-03-02 00:46:50 IST LOG: duration: 2492951.927 ms execute <unnamed>: SELECT DISTINCT feature_id FROM evidence.point p INNER JOIN evidence.observation_evidence oe ON p.feature_id = oe.evd_feature_id WHERE p.domain_class_id IN (11) AND (p.modification_time > '2015-05-10 00:06:56.056 IST' OR oe.modification_time > '2015-05-10 00:06:56.056 IST') ORDER BY feature_id
2017-03-02 01:05:16 IST LOG: duration: 516250.512 ms execute <unnamed>: SELECT DISTINCT feature_id FROM evidence.point p INNER JOIN evidence.observation_evidence oe ON p.feature_id = oe.evd_feature_id WHERE p.domain_class_id IN (3) AND (p.modification_time > '2015-05-10 01:22:59.059 IST' OR oe.modification_time > '2015-05-10 01:22:59.059 IST') ORDER BY feature_id
Top command output:
top - 15:13:15 up 66 days, 3:45, 8 users, load average: 1.84, 1.59, 1.57
Tasks: 830 total, 1 running, 828 sleeping, 0 stopped, 1 zombie
Cpu(s): 3.4%us, 0.7%sy, 0.0%ni, 81.7%id, 14.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 32830016k total, 32142596k used, 687420k free, 77460k buffers
Swap: 8190972k total, 204196k used, 7986776k free, 27981268k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30639 postgres 20 0 10.5g 4.7g 4.7g S 13.5 14.9 10:20.95 postgres
18185 postgres 20 0 10.5g 603m 596m S 4.9 1.9 2:51.16 postgres
16543 postgres 20 0 10.5g 2.8g 2.8g S 4.3 8.8 1:34.04 postgres
14710 postgres 20 0 10.5g 2.9g 2.9g S 3.9 9.2 1:20.84 postgres
1214 root 20 0 15668 1848 896 S 1.0 0.0 130:46.43 top
13462 postgres 20 0 10.5g 1.4g 1.3g S 1.0 4.3 0:25.56 postgres
20081 root 20 0 15668 1880 936 R 1.0 0.0 0:00.12 top
13478 postgres 20 0 10.5g 2.1g 2.1g S 0.7 6.9 0:56.43 postgres
41107 root 20 0 416m 10m 4892 S 0.7 0.0 305:25.71 pgadmin3
2680 root 20 0 0 0 0 S 0.3 0.0 103:38.54 nfsiod
3558 root 20 0 13688 1100 992 S 0.3 0.0 45:00.36 gam_server
15576 root 20 0 0 0 0 S 0.3 0.0 0:01.16 flush-253:1
18430 postgres 20 0 10.5g 18m 13m S 0.3 0.1 0:00.64 postgres
20083 postgres 20 0 105m 1852 1416 S 0.3 0.0 0:00.01 bash
24188 postgres 20 0 102m 1856 832 S 0.3 0.0 0:23.39 sshd
28250 postgres 20 0 156m 1292 528 S 0.3 0.0 0:46.86 postgres
1 root 20 0 19356 1188 996 S 0.0 0.0 0:05.00 init
Regards,
Dinesh Chandra
|Database administrator (Oracle/PostgreSQL)|
DISCLAIMER:
This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. Check all attachments for viruses before opening them. All views or opinions presented in this e-mail are those of the author and may not reflect the opinion of Cyient or those of our affiliates.
Dinesh Chandra 12108 <Dinesh.Chandra@cyient.com> writes: > The below is the output for psql=> EXPLAIN ANALYZE SELECT DISTINCT feature_id FROM evidence.point p INNER JOIN evidence.observation_evidenceoe ON p.feature_id = oe.evd_feature_id WHERE p.domain_class_id IN (11) AND (p.modification_time> '2015-05-10 00:06:56.056 IST' OR oe.modification_time > '2015-05-10 00:06:56.056 IST') ORDER BY feature_id I think that's a fundamentally slow query and you're not going to be able to make it better without rethinking your requirements and/or data representation. As written, that requires the server to form the entire join of p to oe on feature_id, with the only filter before the join being the evidently-none-too-selective domain_class_id condition. Only after joining can it apply the OR condition. So this is inherently processing a lot of rows. If the OR arms were individually pretty selective you could rewrite this into a UNION of two joins, a la the discussion at https://www.postgresql.org/message-id/flat/7f70bd5a-5d16-e05c-f0b4-2fdfc8873489@BlueTreble.com but given the dates involved I'm betting that won't help very much. Or maybe you could try select feature_id from p where domain_class_id IN (11) AND p.modification_time > '2015-05-10 00:06:56.056 IST' intersect select feature_id from oe where oe.modification_time > '2015-05-10 00:06:56.056 IST' order by feature_id although I'm not entirely certain that that has exactly the same semantics (-ENOCAFFEINE), and it might still be none too quick. regards, tom lane
Dear Nur,
The below is the output for psql=> EXPLAIN ANALYZE SELECT DISTINCT feature_id FROM evidence.point p INNER JOIN evidence.observation_evidence oe ON p.feature_id = oe.evd_feature_id WHERE p.domain_class_id IN (11) AND (p.modification_time > '2015-05-10 00:06:56.056 IST' OR oe.modification_time > '2015-05-10 00:06:56.056 IST') ORDER BY feature_id
-> Index Scan using point_domain_class_id_index on point p (cost=0.00..1483472.70 rows=1454751 width=16) (actual time=27.265..142101.1
59 rows=1607491 loops=1)
Index Cond: (domain_class_id = 11)
On Sun, Mar 05, 2017 at 08:23:08PM -0800, Jeff Janes wrote: > On Fri, Mar 3, 2017 at 4:44 AM, Dinesh Chandra 12108 <Dinesh.Chandra@cyient.com> wrote: > > The below is the output for psql=> EXPLAIN ANALYZE SELECT DISTINCT > > feature_id FROM evidence.point p INNER JOIN evidence.observation_evidence > > oe ON p.feature_id = oe.evd_feature_id WHERE p.domain_class_id IN (11) AND > > (p.modification_time > '2015-05-10 00:06:56.056 IST' OR > > oe.modification_time > '2015-05-10 00:06:56.056 IST') ORDER BY feature_id > ... > > > -> Index Scan using point_domain_class_id_index on point p (cost=0.00..1483472.70 rows=1454751width=16) (actual time=27.265..142101.1 59 rows=1607491 loops=1) > > Index Cond: (domain_class_id = 11) > > Why wouldn't this be using a bitmap scan rather than a regular index scan? > It seems like it should prefer the bitmap scan, unless the table is well > clustered on domain_class_id. In which case, why isn't it just faster? Could you send: SELECT * FROM pg_stats WHERE tablename='point' AND attname='domain_class_id' ; .. or if that's too verbose or you don't want to share the histogram or MCV list: SELECT correlation FROM pg_stats WHERE tablename='point' AND attname='domain_class_id' ; Justin
Dear Justin, Below is the output of Query SELECT * FROM pg_stats WHERE tablename='point' AND attname='domain_class_id' ; schemaname | tablename | attname | inherited | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs| histogram_bounds | correlation "evidence"|"point"|"domain_class_id"|f|0|8|10|"{7,9,2,11,43,3,1,10,4,17}"|"{0.9322,0.0451333,0.0145,0.00393333,0.00183333,0.00146667,0.0005,0.0003,6.66667e-05,6.66667e-05}"|""|0.889078 Regards, Dinesh Chandra |Database administrator (Oracle/PostgreSQL)| Cyient Ltd. Noida. -----Original Message----- From: Justin Pryzby [mailto:pryzby@telsasoft.com] Sent: 06 March, 2017 10:54 AM To: Dinesh Chandra 12108 <Dinesh.Chandra@cyient.com> Cc: Nur Agus <nuragus.linux@gmail.com>; Jeff Janes <jeff.janes@gmail.com>; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Performance issue in PostgreSQL server... On Sun, Mar 05, 2017 at 08:23:08PM -0800, Jeff Janes wrote: > On Fri, Mar 3, 2017 at 4:44 AM, Dinesh Chandra 12108 <Dinesh.Chandra@cyient.com> wrote: > > The below is the output for psql=> EXPLAIN ANALYZE SELECT DISTINCT > > feature_id FROM evidence.point p INNER JOIN > > evidence.observation_evidence oe ON p.feature_id = oe.evd_feature_id > > WHERE p.domain_class_id IN (11) AND (p.modification_time > > > '2015-05-10 00:06:56.056 IST' OR oe.modification_time > '2015-05-10 > > 00:06:56.056 IST') ORDER BY feature_id > ... > > > -> Index Scan using point_domain_class_id_index on point p (cost=0.00..1483472.70 rows=1454751width=16) (actual time=27.265..142101.1 59 rows=1607491 loops=1) > > Index Cond: (domain_class_id = 11) > > Why wouldn't this be using a bitmap scan rather than a regular index scan? > It seems like it should prefer the bitmap scan, unless the table is > well clustered on domain_class_id. In which case, why isn't it just faster? Could you send: SELECT * FROM pg_stats WHERE tablename='point' AND attname='domain_class_id' ; .. or if that's too verbose or you don't want to share the histogram or MCV list: SELECT correlation FROM pg_stats WHERE tablename='point' AND attname='domain_class_id' ; Justin ________________________________ DISCLAIMER: This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information.Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient,please contact the sender by reply email and destroy all copies of the original message. Check all attachmentsfor viruses before opening them. All views or opinions presented in this e-mail are those of the author and maynot reflect the opinion of Cyient or those of our affiliates.
On Mon, Mar 06, 2017 at 12:17:22PM +0000, Dinesh Chandra 12108 wrote: > Below is the output of Query SELECT * FROM pg_stats WHERE tablename='point' AND attname='domain_class_id' ; > > > schemaname | tablename | attname | inherited | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs| histogram_bounds | correlation > > "evidence"|"point"|"domain_class_id"|f|0|8|10|"{7,9,2,11,43,3,1,10,4,17}"|"{0.9322,0.0451333,0.0145,0.00393333,0.00183333,0.00146667,0.0005,0.0003,6.66667e-05,6.66667e-05}"|""|0.889078 On Fri, Mar 03, 2017 at 12:44:07PM +0000, Dinesh Chandra 12108 wrote: > -> Index Scan using point_domain_class_id_index on point p (cost=0.00..1483472.70 rows=1454751 width=16) (actualtime=27.265..142101.1 59 rows=1607491 loops=1) On Sun, Mar 05, 2017 at 08:23:08PM -0800, Jeff Janes wrote: > Why wouldn't this be using a bitmap scan rather than a regular index scan? > It seems like it should prefer the bitmap scan, unless the table is well > clustered on domain_class_id. In which case, why isn't it just faster? I missed your response until now, and can't see that anybody else responded, but I suspect the issue is that the *table* is highly correlated WRT this column, but the index may not be, probably due to duplicated index keys. postgres only stores statistics on expression indices, and falls back to correlation of table column of a simple indices. If you're still fighting this, would you send result of: SELECT domain_class_id, count(1) FROM point GROUP BY 1 ORDER BY 2 DESC LIMIT 22; or, SELECT count(1) FROM point GROUP BY domain_class_id ORDER BY 1 DESC LIMIT 22; if there's much repetition in the index keys, then PG's planner thinks an index scan has low random_page_cost, and effective_cache_size has little effect on large tables, and it never uses bitmap scan, which blows up if the index is fragmented and has duplicate keys. The table reads end up costing something like 1454751*random_page_cost nonsequential reads and fseek() calls when it thinks it'll cost only 1454751*16*seq_page_cost. Is the query much faster if you first reindex point_domain_class_id_index ? This has come up before, see: > https://www.postgresql.org/message-id/flat/520D6610.8040907%40emulex.com#520D6610.8040907@emulex.com > https://www.postgresql.org/message-id/flat/20160524173914.GA11880%40telsasoft.com#20160524173914.GA11880@telsasoft.com > https://www.postgresql.org/message-id/flat/n6cmpug13b9rk1srebjvhphg0lm8dou1kn%404ax.com#n6cmpug13b9rk1srebjvhphg0lm8dou1kn@4ax.com Justin