Re: [9.4] SELECT repeatedly rewrites a table - Mailing list pgsql-bugs

From Marko Kreen
Subject Re: [9.4] SELECT repeatedly rewrites a table
Date
Msg-id 20150805112319.GA14023@gmail.com
Whole thread Raw
In response to [9.4] SELECT repeatedly rewrites a table  (Marko Kreen <markokr@gmail.com>)
List pgsql-bugs
On Fri, Jul 17, 2015 at 02:22:59PM +0300, Marko Kreen wrote:
> We recently upgraded 9.3 to 9.4 in live environment and one batch-job
> query floods system with write requests.  Query itself is basically
> seq-scan on insert-only table.

I managed to get EXPLAIN of bad run.  Seems bug is dependant on "DECLARE
.. NO SCROLL CURSOR WITH HOLD FOR ...".

1) On evening I launced 2 VM's, let's call then VMSELECT and VMCURSOR.
2) In the morning I run "EXPLAIN SELECT .." on VMSELECT twice - one after other.
   And on VMCURSOR "EXPLAIN DECLATE CURSOR .." twice - one after other.

VMCURSOR run #1 - the bad run
-----------------------------

chatdb=# explain (analyze true, buffers true, timing true)
chatdb-# DECLARE data_maint_cur NO SCROLL CURSOR WITH HOLD FOR
chatdb-# select account_id, max(event_nr) as first_event_nr from account.event where store_time < now() - '1
month'::intervalgroup by 1; 
                                                                   QUERY PLAN
                        

------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.56..8965079.90 rows=82 width=24) (actual time=37.752..9941575.182 rows=301857 loops=1)
   Group Key: account_id
   Buffers: shared hit=1417642 read=16900923 dirtied=15670349 written=11141824
   I/O Timings: read=1826689.835 write=8017738.089
   ->  Index Scan using event_pkey on event  (cost=0.56..8960823.53 rows=851110 width=24) (actual
time=30.470..9940359.172rows=859710 loops=1) 
         Filter: (store_time < (now() - '1 mon'::interval))
         Rows Removed by Filter: 28963464
         Buffers: shared hit=1417642 read=16900923 dirtied=15670349 written=11141824
         I/O Timings: read=1826689.835 write=8017738.089
 Planning time: 1.453 ms
 Execution time: 9941803.330 ms
(11 rows)

VMCURSOR run #2 - no so bad
---------------------------

chatdb=# explain (analyze true, buffers true, timing true) DECLARE data_maint_cur NO SCROLL CURSOR WITH HOLD FOR
chatdb-# select account_id, max(event_nr) as first_event_nr from account.event where store_time < now() - '1
month'::intervalgroup by 1; 
                                                                  QUERY PLAN
                      

----------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.56..9291947.71 rows=89 width=24) (actual time=0.055..126290.079 rows=304188 loops=1)
   Group Key: account_id
   Buffers: shared hit=1803475 read=16580860 dirtied=13723 written=10176
   I/O Timings: read=82540.306 write=1187.116
   ->  Index Scan using event_pkey on event  (cost=0.56..9287343.81 rows=920602 width=24) (actual
time=0.040..125176.023rows=957393 loops=1) 
         Filter: (store_time < (now() - '1 mon'::interval))
         Rows Removed by Filter: 28999419
         Buffers: shared hit=1803475 read=16580860 dirtied=13723 written=10176
         I/O Timings: read=82540.306 write=1187.116
 Planning time: 0.071 ms
 Execution time: 126510.051 ms
(11 rows)

VMSELECT run #1
---------------

For some reason it rewrites the table, once.

chatdb=# explain (analyze true, buffers true, timing true)
chatdb-# select account_id, max(event_nr) as first_event_nr from account.event
chatdb-# where store_time < now() - '1 month'::interval group by 1;
                                                         QUERY PLAN
    

----------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=974789.39..974790.27 rows=88 width=24) (actual time=276382.073..276652.235 rows=301857 loops=1)
   Group Key: account_id
   Buffers: shared hit=358 read=448265 dirtied=418624 written=418225
   I/O Timings: read=246250.610 write=15802.735
   ->  Seq Scan on event  (cost=0.00..970328.57 rows=892164 width=24) (actual time=627.299..275438.301 rows=855133
loops=1)
         Filter: (store_time < (now() - '1 mon'::interval))
         Rows Removed by Filter: 28955031
         Buffers: shared hit=358 read=448265 dirtied=418624 written=418225
         I/O Timings: read=246250.610 write=15802.735
 Planning time: 1.301 ms
 Execution time: 276869.555 ms
(11 rows)

VMSELECT run #2
---------------

chatdb=# explain (analyze true, buffers true, timing true)
select account_id, max(event_nr) as first_event_nr from account.event
where store_time < now() - '1 month'::interval group by 1;
                                                        QUERY PLAN
   

---------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=975156.40..975157.28 rows=88 width=24) (actual time=26036.503..26311.292 rows=301857 loops=1)
   Group Key: account_id
   Buffers: shared hit=553 read=448232 dirtied=117
   I/O Timings: read=14424.903
   ->  Seq Scan on event  (cost=0.00..970678.96 rows=895488 width=24) (actual time=833.044..25130.745 rows=858416
loops=1)
         Filter: (store_time < (now() - '1 mon'::interval))
         Rows Removed by Filter: 28962467
         Buffers: shared hit=553 read=448232 dirtied=117
         I/O Timings: read=14424.903
 Planning time: 0.102 ms
 Execution time: 26518.603 ms

VMSELECT run #3 - now with cursor here too - looks ok
-----------------------------------------------------

chatdb=# explain (analyze true, buffers true, timing true) DECLARE data_maint_cur NO SCROLL CURSOR WITH HOLD FOR
chatdb-# select account_id, max(event_nr) as first_event_nr from account.event where store_time < now() - '1
month'::intervalgroup by 1; 
                                                                  QUERY PLAN
                      

----------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.56..9230874.03 rows=94 width=24) (actual time=0.220..444906.246 rows=304184 loops=1)
   Group Key: account_id
   Buffers: shared hit=1790170 read=16588868 dirtied=78190 written=32718
   I/O Timings: read=398411.728 write=2608.564
   ->  Index Scan using event_pkey on event  (cost=0.56..9226058.00 rows=963019 width=24) (actual
time=0.183..443823.884rows=946397 loops=1) 
         Filter: (store_time < (now() - '1 mon'::interval))
         Rows Removed by Filter: 29000509
         Buffers: shared hit=1790170 read=16588868 dirtied=78190 written=32718
         I/O Timings: read=398411.728 write=2608.564
 Planning time: 12.062 ms
 Execution time: 445120.327 ms
(11 rows)


Does this ring any bells?  What more info is needed?

--
marko

pgsql-bugs by date:

Previous
From: Noah Misch
Date:
Subject: Re: BUG #13427: postgres.exe fails to start on Korean Windows Server 2008: cannot perform encoding conversion outsid
Next
From: maciek@heroku.com
Date:
Subject: BUG #13539: ERROR: record type has not been registered with CTE and subselect in target list