Thread: Minor performance improvements
Four small optimizations in org.postgresql.core.PGStream:
1) The int to byte conversion is not needed in this method:
public void SendChar(int val) throws IOException
{
pg_output.write((byte)val);
}
(I found this while profiling a test that included about 5000 queries. This method was called half a million times, and actually had the third largest "own time" behind executeBatch and executeQuery)
2) The byte masks are not required in SendInteger4 and SendInteger2, as OutputStream.write(int) ignores the top 24 bits. Though if these are removed I think comments should be included instead to clarify that only the bottom eight bits get written.
3) The method with the signature: void Send(byte buf[], int off, int siz),
Could be rewritten so that the test occurs once instead of twice: (buf.length - off) < siz
4) two new parameterless methods should be added to optimize int ReceiveIntegerR(int siz) when called with size 2 and 4, as calls with these parameter values are very common. The call sites should change to use the new methods.
Stephen Denne
Java Developer
Corporate Development
Datamail Ltd
1 Victoria Street
Petone 5012
PO Box 31249
Lower Hutt 5040
Tel: +64 4 568 8200 extension 8638
Fax: +64 4 568 9600
Website: www.datamail.co.nz
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage.
This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately, destroy it and do not copy, disclose or use it in any way.
This email has been scanned by the DMZGlobal Business Quality Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
On Tue, 27 Feb 2007, Stephen Denne wrote: > Four small optimizations in org.postgresql.core.PGStream: > Could you share some more of your results with us? What sort of improvment did you see by making these changes? Or could you share your test case code so we can do some timing of our own? > 2) The byte masks are not required in SendInteger4 and SendInteger2, as > OutputStream.write(int) ignores the top 24 bits. Though if these are > removed I think comments should be included instead to clarify that only > the bottom eight bits get written. Would using a class member byte[] _int4buf = new byte[4], filling it, and passing this to Send(byte buf[]) be better by avoid repeated method calls? Kris Jurka
I'm sad to say that I have not created any micro-benchmark tests, and unfortunately the improvements are very minor, andfar overshadowed by the variability I get from my system. My code wasn't even using Send(byte buf[], int off, int siz). The portion of my application that I am testing is essentially doing bulk loading. I'm using PreparedStatement of insertSQL, a lot of setXXX() and addBatch() calls, and the occasional executeBatch(). Running in Java5, using jdbc driver 8.2.504.jdbc3 connecting to PostgreSQL 8.2.3 with about 40Gb data and indexes in it,all on WinXP. The test ends up performing about 5200 inserts. The improvement I do see (which prompted my email) is in fewer bytecodes being used to perform the same functionality. Adding profiling in to such small methods also results in a larger impact (I'm using YourKit Java Profiler). That said... this implementation was about 30% faster over about 88,000 calls: public void SendInteger4(int val) throws IOException { pg_output.write(val >> 24); pg_output.write(val >> 16); pg_output.write(val >> 8); pg_output.write(val); } SendInteger2 was similarly improved. These changes meant that SendChar usage was reduced from about 560,000 calls to about 40,000, making it difficult to measurethe time saved by removing the double up of the int to byte conversion. Interesting thought about using _int4buf, and it took me a while to make up my mind whether it would be faster or not (usingthe "Java Puzzlers" technique of deciding what I think some code would do before testing my assertions). I think the difference in the overall work you are doing is that you are adding a buffer to buffer copy. Why write the valuesto _int4buf if you can write them into BufferedOutputStream's buffer? Most of the time you won't be going beyond theend of the buffer, so in general you'd be replacing three method calls with a call to System.arraycopy, which while extemelyfast, I think would be slower than not calling it at all. Regards, Stephen Denne. -----Original Message----- From: Kris Jurka [mailto:books@ejurka.com] Sent: Tuesday, 27 February 2007 11:47 a.m. To: Stephen Denne Cc: pgsql-jdbc@postgresql.org Subject: Re: [JDBC] Minor performance improvements On Tue, 27 Feb 2007, Stephen Denne wrote: > Four small optimizations in org.postgresql.core.PGStream: > Could you share some more of your results with us? What sort of improvment did you see by making these changes? Or couldyou share your test case code so we can do some timing of our own? > 2) The byte masks are not required in SendInteger4 and SendInteger2, > as > OutputStream.write(int) ignores the top 24 bits. Though if these are > removed I think comments should be included instead to clarify that > only the bottom eight bits get written. Would using a class member byte[] _int4buf = new byte[4], filling it, and passing this to Send(byte buf[]) be better by avoidrepeated method calls? Kris Jurka Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal Business Quality Electronic Messaging Suite. Please see http://www.dmzglobal.com/services/bqem.htm for details. __________________________________________________________________
On Tue, 27 Feb 2007, Stephen Denne wrote: > I'm sad to say that I have not created any micro-benchmark tests, and > unfortunately the improvements are very minor, and far overshadowed by > the variability I get from my system. > I've created the attached test which tests the original code (Orig), your code (Two), and my suggestion of an int4buf (Three) and got the following surprising results: jurka@tony:~/pg/jdbc/projects/perf/micro$ java -classpath . Tester Orig | sort -n 11335 11370 11468 11484 11487 jurka@tony:~/pg/jdbc/projects/perf/micro$ java -classpath . Tester Two | sort -n 12472 12476 12489 12492 12619 jurka@tony:~/pg/jdbc/projects/perf/micro$ java -classpath . Tester Three | sort -n 4259 4562 4564 4611 4689 This shows your code is actually slower than the original code, although I have no idea why that could be. It shows the int4buf idea as a clear winner. I'm a little suspicious of the whole test because of your numbers going up. Could you take a look at this and possibly confirm the results? I'm not sure if windows has an equivalent to /dev/null, but I wanted to avoid any impact of disk io. Kris Jurka
Attachment
So it does. I used pg_output = new BufferedOutputStream(new OutputStream() {public void write(int b) throws IOException {}}, 8192); And confirm that I get similar slowdown for my code, and much faster operation for your code. The only thing I can think might be going on with your code is the hotspot compiler recognising the bytecode pattern in use,and using the fastest method it knows of filling that byte array. SendInteger2 behaves the same surprising way. Regards, Stephen. -----Original Message----- From: Kris Jurka [mailto:books@ejurka.com] Sent: Tuesday, 27 February 2007 5:24 p.m. To: Stephen Denne Cc: pgsql-jdbc@postgresql.org Subject: Re: [JDBC] Minor performance improvements On Tue, 27 Feb 2007, Stephen Denne wrote: > I'm sad to say that I have not created any micro-benchmark tests, and > unfortunately the improvements are very minor, and far overshadowed by > the variability I get from my system. > I've created the attached test which tests the original code (Orig), your code (Two), and my suggestion of an int4buf (Three)and got the following surprising results: jurka@tony:~/pg/jdbc/projects/perf/micro$ java -classpath . Tester Orig | sort -n 11335 11370 11468 11484 11487 jurka@tony:~/pg/jdbc/projects/perf/micro$ java -classpath . Tester Two | sort -n 12472 12476 12489 12492 12619 jurka@tony:~/pg/jdbc/projects/perf/micro$ java -classpath . Tester Three | sort -n 4259 4562 4564 4611 4689 This shows your code is actually slower than the original code, although I have no idea why that could be. It shows theint4buf idea as a clear winner. I'm a little suspicious of the whole test because of your numbers going up. Could youtake a look at this and possibly confirm the results? I'm not sure if windows has an equivalent to /dev/null, but I wanted to avoid any impact of disk io. Kris Jurka Disclaimer: At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachmentsis confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately,destroy it and do not copy, disclose or use it in any way. __________________________________________________________________ This email has been scanned by the DMZGlobal Business Quality Electronic Messaging Suite. Please see http://www.dmzglobal.com/services/bqem.htm for details. __________________________________________________________________
On Mon, 26 Feb 2007, Kris Jurka wrote: > I've created the attached test which tests the original code (Orig), your > code (Two), and my suggestion of an int4buf (Three) and got the following > surprising results: > Based on that test and a similar attached test for reading, I'm prepared to apply the attached patch. Let me know what you think. Kris Jurka
Attachment
Hi Kris I think that your patch could be better if instead of using the "signed" right shift operator (>>) followed by a bit mask, use directly the "unsigned" right shift operator (>>>) So ((val >> 24) & 0xFF) would better as val >>> 24 By the way great work folks Nelson Arapé A happy Postgres JDBC user in his first time post to the list PS: Sorry for my english, not my native language El Martes, 27 de Febrero de 2007 04:21, Kris Jurka escribió: > On Mon, 26 Feb 2007, Kris Jurka wrote: > > I've created the attached test which tests the original code (Orig), your > > code (Two), and my suggestion of an int4buf (Three) and got the following > > surprising results: > > Based on that test and a similar attached test for reading, I'm prepared > to apply the attached patch. Let me know what you think. > > Kris Jurka
Sorry for the formatting of this message....
I tested using a ByteBuffer wrapped as an IntBuffer... quite good, but not the fastest.
I had a different implementation in mind for Send(byte buf[], int off, int siz) along the lines of:
int
i = buf.length - off; if (i < siz) { pg_output.write(buf, off, i); for (; i < siz; ++i) { pg_output.write(0);}
}
else { pg_output.write(buf, off, siz);}
From: Kris Jurka [mailto:books@ejurka.com]
Sent: Tue 27/02/2007 9:21 p.m.
To: Stephen Denne
Cc: pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] Minor performance improvements
On Mon, 26 Feb 2007, Kris Jurka wrote:
> I've created the attached test which tests the original code (Orig), your
> code (Two), and my suggestion of an int4buf (Three) and got the following
> surprising results:
>
Based on that test and a similar attached test for reading, I'm prepared
to apply the attached patch. Let me know what you think.
Kris Jurka
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage.
This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately, destroy it and do not copy, disclose or use it in any way.
This email has been scanned by the DMZGlobal Business Quality Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
Post this time to the list, the last time I sent it to Kris directly, sorry for that Hi Kris I think that your patch could be better if instead of using the "signed" right shift operator (>>) followed by a bit mask, use directly the "unsigned" right shift operator (>>>) So ((val >> 24) & 0xFF) would better as val >>> 24 By the way great work folks Nelson Arapé A happy Postgres JDBC user in his first time post to the list PS: Sorry for my english, not my native language El Martes, 27 de Febrero de 2007 04:21, Kris Jurka escribió: > On Mon, 26 Feb 2007, Kris Jurka wrote: > > I've created the attached test which tests the original code (Orig), your > > code (Two), and my suggestion of an int4buf (Three) and got the following > > surprising results: > > Based on that test and a similar attached test for reading, I'm prepared > to apply the attached patch. Let me know what you think. > > Kris Jurka
On Tue, 27 Feb 2007, Stephen Denne wrote: > I had a different implementation in mind for Send(byte buf[], int off, > int siz) along the lines of: > I'm not convinced this will be significantly faster, but it is slightly clearer, so I've incorporated it. > Is there a reason for removing pg_input.ensureBytes(siz)? I see you're > checking the length of what was read instead. Is this always equivalent > or sufficient? Does it block in diferent ways? Internally VisibleBufferedInputStream will end up calling ensureBytes in the read call. I don't think the actual read length check is necessary, but it's the sort of thing that could easily break if we changed the VisbibleBufferedInputStream implementation. > When converting ints to bytes, you do not need to mask with 255 (see > int0 to int3 in java.nio.Bits). Yes, removed. > Is it possible other people have code that is calling > ReceiveIntegerR(int siz)? > No, PGStream is an internal only class. Kris Jurka
On Tue, 27 Feb 2007, Nelson Arape wrote: > I think that your patch could be better if instead of using the "signed" > right shift operator (>>) followed by a bit mask, use directly the > "unsigned" right shift operator (>>>) Indeed that does improve things slightly. I didn't even know that operator existed. I've applied this patch: http://gborg.postgresql.org/pipermail/pgjdbc-commit/2007-February/000594.html Kris Jurka