Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up - Mailing list pgsql-hackers
From:           Bruce Momjian
Subject:        Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date:
Msg-id:         201005300233.o4U2XGp14143@momjian.us
In response to: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up  (Andres Freund <andres@anarazel.de>)
Responses:      Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
List:           pgsql-hackers
Added to TODO:

	Consider a faster CRC32 algorithm

	    * http://archives.postgresql.org/pgsql-hackers/2010-05/msg01112.php

---------------------------------------------------------------------------

Andres Freund wrote:
> Hi,
>
> I started to analyze XLogInsert because it was the major bottleneck when
> creating some materialized view/cached tables/whatever.
> Analyzing it I could see that the contents of the COMP_CRC32 macro were
> taking most of the time, which isn't immediately obvious when you profile
> because it doesn't show up as a separate function.
> I first put it into functions to make it easier to profile. I couldn't
> measure any difference for COPY, CTAS and a simple pgbench run on 3 kinds
> of hardware (Core2, older Xeon, older Sparc systems).
>
> I looked around a bit for faster implementations of CRC32 and found one in
> zlib. After adapting it (pg uses a slightly different, non-inverted
> computation) I found that it speeds up the CRC32 calculation itself 3-fold.
> It does that by using not just one lookup table but four (one for each byte
> of a word). Those four calculations are independent and thus considerably
> faster on somewhat recent hardware.
> It also does memory lookups in 4-byte steps instead of 1-byte steps as the
> pg version does (that's only about an ~8% benefit in itself).
>
> I wrote a preliminary patch which includes both the original implementation
> and the new one, switchable via a #define.
>
>
> I tested performance differences in a small number of scenarios:
> - CTAS/INSERT ... SELECT (8-30%)
> - COPY (3-20%)
> - pgbench (no real difference unless directly after a checkpoint)
>
> Setup:
>
> CREATE TABLE blub (ai int, bi int, aibi int);
> CREATE TABLE speedtest (ai int, bi int, aibi int);
>
>
> INSERT ... SELECT:
>
> Statement:
> INSERT INTO blub SELECT a.i, b.i, a.i * b.i FROM generate_series(1, 10000)
> a(i), generate_series(1, 1000) b(i);
>
> legacy crc:
>
> 11526.588
> 11406.518
> 11412.182
> 11430.245
>
> zlib:
> 9977.394
> 9945.408
> 9840.907
> 9842.875
>
>
> COPY:
> Statement:
> ('blub' enlarged here 4 times, as otherwise the variances were too large)
>
> COPY blub TO '/tmp/b' BINARY;
> ...
> CHECKPOINT; TRUNCATE speedtest; COPY speedtest FROM '/tmp/b' BINARY;
>
> legacy:
> 44835.840
> 44832.876
>
> zlib:
> 39530.549
> 39365.109
> 39295.167
>
> The performance differences are bigger if the table rows are significantly
> bigger.
>
> Do you think something like that is sensible? If yes, I will make it into a
> proper patch and such.
>
> Thanks,
>
> Andres
>
> INSERT ... SELECT profile before the patch:
>
>  20.22%  postgres  postgres  [.] comp_crc32
>   5.77%  postgres  postgres  [.] XLogInsert
>   5.55%  postgres  postgres  [.] LWLockAcquire
>   5.21%  postgres  [kernel.  [k] copy_user_generic_string
>   4.64%  postgres  postgres  [.] LWLockRelease
>   4.39%  postgres  postgres  [.] ReadBuffer_common
>   2.75%  postgres  postgres  [.] heap_insert
>   2.22%  postgres  libc-2.1  [.] memcpy
>   2.09%  postgres  postgres  [.] UnlockReleaseBuffer
>   1.85%  postgres  postgres  [.] hash_any
>   1.77%  postgres  [kernel.  [k] clear_page_c
>   1.69%  postgres  postgres  [.] hash_search_with_hash_value
>   1.61%  postgres  postgres  [.] heapgettup_pagemode
>   1.50%  postgres  postgres  [.] PageAddItem
>   1.42%  postgres  postgres  [.] MarkBufferDirty
>   1.28%  postgres  postgres  [.] RelationGetBufferForTuple
>   1.15%  postgres  postgres  [.] ExecModifyTable
>   1.06%  postgres  postgres  [.] RelationPutHeapTuple
>
>
> After:
>
>   9.97%  postgres  postgres  [.] comp_crc32
>   5.95%  postgres  [kernel.  [k] copy_user_generic_string
>   5.94%  postgres  postgres  [.] LWLockAcquire
>   5.64%  postgres  postgres  [.] XLogInsert
>   5.11%  postgres  postgres  [.] LWLockRelease
>   4.63%  postgres  postgres  [.] ReadBuffer_common
>   3.45%  postgres  postgres  [.] heap_insert
>   2.54%  postgres  libc-2.1  [.] memcpy
>   2.03%  postgres  postgres  [.] UnlockReleaseBuffer
>   1.94%  postgres  postgres  [.] hash_search_with_hash_value
>   1.84%  postgres  postgres  [.] hash_any
>   1.73%  postgres  [kernel.  [k] clear_page_c
>   1.68%  postgres  postgres  [.] PageAddItem
>   1.62%  postgres  postgres  [.] heapgettup_pagemode
>   1.52%  postgres  postgres  [.] RelationGetBufferForTuple
>   1.47%  postgres  postgres  [.] MarkBufferDirty
>   1.30%  postgres  postgres  [.] ExecModifyTable
>   1.23%  postgres  postgres  [.] RelationPutHeapTuple

[ Attachment, skipping... ]

>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
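For readers following the thread, below is a minimal, self-contained sketch of
the slicing-by-4 technique Andres describes above. The function names
(crc32_init_tables, crc32_sliced) are illustrative, not taken from the patch;
the constants use the common reflected CRC-32 polynomial with the usual ~0
initial value and final XOR, i.e. the zlib-style convention rather than
PostgreSQL's WAL CRC exactly. The point is the structure: four lookup tables
and a word-at-a-time inner loop whose four table lookups do not depend on each
other.

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Four lookup tables, one per byte of a 32-bit word (slicing-by-4). */
static uint32_t crc_table[4][256];

static void
crc32_init_tables(void)
{
    uint32_t    i;

    for (i = 0; i < 256; i++)
    {
        uint32_t    c = i;
        int         k;

        for (k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : (c >> 1);
        crc_table[0][i] = c;
    }
    /* Derive tables 1..3 so that four byte lookups can be done independently. */
    for (i = 0; i < 256; i++)
    {
        crc_table[1][i] = (crc_table[0][i] >> 8) ^ crc_table[0][crc_table[0][i] & 0xFF];
        crc_table[2][i] = (crc_table[1][i] >> 8) ^ crc_table[0][crc_table[1][i] & 0xFF];
        crc_table[3][i] = (crc_table[2][i] >> 8) ^ crc_table[0][crc_table[2][i] & 0xFF];
    }
}

/* Slicing-by-4: consume the input one 32-bit word at a time. */
static uint32_t
crc32_sliced(uint32_t crc, const unsigned char *p, size_t len)
{
    while (len >= 4)
    {
        uint32_t    w;

        /* Assumes little-endian byte order; big-endian needs swapped tables. */
        memcpy(&w, p, 4);
        crc ^= w;
        crc = crc_table[3][crc & 0xFF] ^
              crc_table[2][(crc >> 8) & 0xFF] ^
              crc_table[1][(crc >> 16) & 0xFF] ^
              crc_table[0][crc >> 24];
        p += 4;
        len -= 4;
    }
    /* Handle any trailing bytes with the classic one-byte-at-a-time loop. */
    while (len-- > 0)
        crc = (crc >> 8) ^ crc_table[0][(crc ^ *p++) & 0xFF];
    return crc;
}

int
main(void)
{
    const unsigned char msg[] = "123456789";
    uint32_t    crc;

    crc32_init_tables();
    /* ~0 initial value and final XOR, the usual zlib-style convention. */
    crc = crc32_sliced(0xFFFFFFFFu, msg, 9) ^ 0xFFFFFFFFu;
    printf("%08x\n", crc);      /* expected: cbf43926 */
    return 0;
}

Because the four lookups in the word loop have no data dependencies between
them, a superscalar CPU can overlap their execution, which is the main source
of the speedup Andres reports for the CRC computation itself; reading the
input in 4-byte steps contributes only the smaller part.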