Use read streams in CREATE DATABASE command when the strategy is wal_log - Mailing list pgsql-hackers

From Nazir Bilal Yavuz
Subject Use read streams in CREATE DATABASE command when the strategy is wal_log
Date
Msg-id CAN55FZ0JKL6vk1xQp6rfOXiNFV1u1H0tJDPPGHWoiO3ea2Wc=A@mail.gmail.com
Whole thread Raw
Responses Re: Use read streams in CREATE DATABASE command when the strategy is wal_log
List pgsql-hackers
Hi,

I am working on using read streams in the CREATE DATABASE command when the strategy is wal_log. RelationCopyStorageUsingBuffer() function is used in this context. This function reads source buffers then copies them to the destination buffers. I used read streams only when reading source buffers because the destination buffers are read by 'RBM_ZERO_AND_LOCK' option, so it is not important.

I created a ~6 GB table [1] and created a new database with the wal_log strategy using the database that table was created in as a template [2]. My benchmarking results are:

a. Timings:

patched:
12955.027 ms
12917.475 ms
13177.846 ms
12971.308 ms
13059.985 ms

master:
13156.375 ms
13054.071 ms
13151.607 ms
13152.633 ms
13160.538 ms

There is no difference in timings, the patched version is a tiny bit better but it is negligible. I actually expected the patched version to be better because there was no prefetching before, but the read stream API detects sequential access and disables prefetching.

b. strace:

patched:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 68.02    3.749359           2   1285782           pwrite64
 18.54    1.021734          21     46730           preadv
  9.49    0.522889         826       633           fdatasync
  2.55    0.140339          59      2368           pwritev
  1.14    0.062583         409       153           fsync


master:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 59.71    3.831542           2   1288365           pwrite64
 29.84    1.914540           2    747936           pread64
  7.90    0.506843         837       605           fdatasync
  1.58    0.101575          54      1856           pwritev
  0.75    0.048431         400       121           fsync


There are fewer (~1/16) read system calls in the patched version.

c. perf:

patched:

-   97.83%     1.13%  postgres  postgres           [.] RelationCopyStorageUsingBuffer                                                    
   - 97.83% RelationCopyStorageUsingBuffer                                                                                                
      - 44.28% ReadBufferWithoutRelcache                                                                                                  
         + 42.20% GetVictimBuffer                                                                                                        
           0.81% ZeroBuffer                                                                                                              
      + 31.86% log_newpage_buffer                                                                                                        
      - 19.51% read_stream_next_buffer                                                                                                    
         - 17.92% WaitReadBuffers                                                                                                        
            + 17.61% mdreadv                                                                                                              
         - 1.47% read_stream_start_pending_read                                                                                          
            + 1.46% StartReadBuffers      

master:

-   97.68%     0.57%  postgres  postgres           [.] RelationCopyStorageUsingBuffer                                                    
   - RelationCopyStorageUsingBuffer                                                                                                      
      - 65.48% ReadBufferWithoutRelcache                                                                                                  
         + 41.16% GetVictimBuffer                                                                                                        
         - 20.42% WaitReadBuffers                                                                                                        
            + 19.90% mdreadv                                                                                                              
         + 1.85% StartReadBuffer                                                                                                          
           0.75% ZeroBuffer                                                                                                              
      + 30.82% log_newpage_buffer

Patched version spends less CPU time in read calls and more CPU time in other calls such as write.

There are three patch files attached. First two are optimization and adding a way to create a read stream object by using SMgrRelation, these are already proposed in the streaming I/O thread [3]. The third one is the actual patch file.

Any kind of feedback would be appreciated.

[1] CREATE TABLE t as select repeat('a', 100) || i || repeat('b', 500) as filler from generate_series(1, 9000000) as i;
[2] CREATE DATABASE test_1 STRATEGY 'wal_log' TEMPLATE test;
[3] https://www.postgresql.org/message-id/CAN55FZ1yGvCzCW_aufu83VimdEYHbG_zuOY3J9JL-nBptyJyKA%40mail.gmail.com

--
Regards,
Nazir Bilal Yavuz
Microsoft
Attachment

pgsql-hackers by date:

Previous
From: Amul Sul
Date:
Subject: Re: Add bump memory context type and use it for tuplesorts
Next
From: shveta malik
Date:
Subject: Re: Disallow changing slot's failover option in transaction block