Re: Non-decimal integer literals - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: Non-decimal integer literals |
Date | |
Msg-id | CAApHDvpBTHL58j6UmnUyrz=5h15NhY30_AvZ_5fQ7LJMhMcyxA@mail.gmail.com Whole thread Raw |
In response to | Re: Non-decimal integer literals (Peter Eisentraut <peter.eisentraut@enterprisedb.com>) |
List | pgsql-hackers |
On Sat, 26 Nov 2022 at 05:13, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > On 24.11.22 10:13, David Rowley wrote: > > I > > remember many years ago and several jobs ago when working with SQL > > Server being able to speed up importing data using hexadecimal > > DATETIMEs. I can't think why else you might want to represent a > > DATETIME as a hexstring, so I assumed this was a large part of the use > > case for INTs in PostgreSQL. Are you telling me that better > > performance is not something anyone will want out of this feature? > > This isn't about datetimes but about integers. I'm aware. My aim was to show that hex is commonly used as a more efficient way of getting integer numbers in and out of computers. Likely it's better for me to quantify this performance increase claim with some actual performance results. Here's master (@f0cd57f85) doing copy ab2 from '/tmp/ab.csv'; ab2 is a table with no indexes and just 2 int columns. 16.55% postgres [.] CopyReadLine 7.82% postgres [.] pg_strtoint32 7.60% postgres [.] CopyReadAttributesText 7.06% postgres [.] NextCopyFrom 4.40% postgres [.] CopyFrom The copy completes in 2512.5278 ms (average time over 10 runs) Patching master with your v11 patch and copying in hex numbers instead of decimal numbers shows: 14.39% postgres [.] CopyReadLine 8.60% postgres [.] pg_strtoint32 6.95% postgres [.] NextCopyFrom 6.79% postgres [.] CopyReadAttributesText 4.81% postgres [.] CopyFrom This shows that we're spending proportionally less time in CopyReadLine() and proportionally more time in pg_strtoint32(). There are probably two things going on there, CopyReadLine is likely faster due to having to read fewer bytes and pg_strtoint32() is likely slower due to additional branching and code size. This (copy ab2 from '/tmp/abhex.csv') saw an average time of 2720.1387 ms over 10 runs. Patching master with your v11 patch + more_efficient_hex_oct_and_binary_processing.diff 15.68% postgres [.] CopyReadLine 7.75% postgres [.] NextCopyFrom 7.73% postgres [.] pg_strtoint32 6.25% postgres [.] CopyReadAttributesText 4.76% postgres [.] CopyFrom The average time to import the hex version of the csv file comes down to 2385.7298 ms over 10 runs. I didn't run any tests to see how much the performance of importing the decimal representation slowed down from the v11 patch. I assume there will be a small performance hit due to the extra processing done in pg_strtoint32() David
pgsql-hackers by date: