Thread: Performance penalty for remote access of postgresql (8.1.3)? any experiance?
Performance penalty for remote access of postgresql (8.1.3)? any experiance?
From
"Guoping Zhang"
Date:
Hello, I am seeking advice/comment/experience you may have had for the performance cost for remote access to postgresql 8.1.X? I have two servers, one is Sun V240 (say server A) and the other is dual intel Xeon (say Server B) and both installed Solaris 10. With Server A, there is postgresql 8.1.3 installed with pgpool (pgpool-3.0.2), with server B, there is a pgpool (v3.0.2) installed. The test program is installed on both A and B, where the test application on server B is accessing to DBMS on A through pgpool. Note that the test code is not fancy but can insert a large number of record (say 100k rows) with configurable transaction size. Following are the results (repeated many times with the mean value and shall be accurate) for various setting by fixed 100k insertion operation with a transaction size as 100 rows): ------------------------------------------ 1. Test program running on server A directly access to LOCAL postgresql: 24.03 seconds 2. Test progam running on server A access to LOCAL postgresql through pgpool: 30.05 seconds 3. Test progam running on server A access REMOTE postgresql through local pgpool: 74.06 seconds ------------------------------------------ I have to say both machines are very light load and interconnected with local LAN. From 1 and 2, pgpool add 20% overhead, it sounds reasonable but any way to reduce it??? From 2 and 3, it suggests the remote access is much slower than local access. My question is: a) Anyone has the similar experience? How do you deal with it? b) Why TCP stack imposes such big delay? any tuning point I shall do? The time call reports for test 2 is real 0m32.71s user 0m2.42s sys 0m2.65s for test 3 is real 1:14.0 user 2.5 sys 3.2 c) Obviously, CPU time for (user + sys) for both tests are very similar, but the overall time is quite different. I assume the time used on TCP stack makes the difference. Many thanks, Regards, Guoping Zhang
Re: Performance penalty for remote access of postgresql (8.1.3)? any experiance?
From
Florian Weimer
Date:
* Guoping Zhang: > a) Anyone has the similar experience? How do you deal with it? > b) Why TCP stack imposes such big delay? any tuning point I shall do? If you use INSERT, you'll incur a network round-trip delay for each record. Try using COPY FROM instead, possibly to a temporary table if you need more complex calculations. If you do this, there won't be a huge difference between local and remote access as long as the bandwidth is sufficient. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Durlacher Allee 47 tel: +49-721-96201-1 D-76131 Karlsruhe fax: +49-721-96201-99
Re: Performance penalty for remote access of postgresql (8.1.3)? any experiance?
From
"Guoping Zhang"
Date:
Hi, Florian Thanks for pointing me the cause, but we simply cannot use the COPY FROM solution. Currently, our application service is running with its own dedicated local database, IF Feasible, we want to separate the application services out of database server and run SEVERAL instances of applation serivice on its own server (one per server), and make them all shall one database server. This helps to the scalability and also reduce the device cost as only database server would need mirror/backup/UPS etc. Obviously, if there is no better solution, the TCP round trip penalty will stop us doing so as we do have performance requirement. I guess there shall be quite number of people out there facing the similar problem, right? No alternative solution? Regards, Guoping Zhang -----Original Message----- From: Florian Weimer [mailto:fweimer@bfk.de] Sent: 2006Ae7OA19EO 16:30 To: guoping.zhang@nec.com.au Cc: pgsql-performance@postgresql.org; Guoping Zhang (E-mail) Subject: Re: [PERFORM] Performance penalty for remote access of postgresql (8.1.3)? any experiance? * Guoping Zhang: > a) Anyone has the similar experience? How do you deal with it? > b) Why TCP stack imposes such big delay? any tuning point I shall do? If you use INSERT, you'll incur a network round-trip delay for each record. Try using COPY FROM instead, possibly to a temporary table if you need more complex calculations. If you do this, there won't be a huge difference between local and remote access as long as the bandwidth is sufficient. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Durlacher Allee 47 tel: +49-721-96201-1 D-76131 Karlsruhe fax: +49-721-96201-99
Re: Performance penalty for remote access of postgresql (8.1.3)? any experiance?
From
Florian Weimer
Date:
* Guoping Zhang: > Thanks for pointing me the cause, but we simply cannot use the COPY FROM > solution. Why not? Just do something like this: CREATE TEMPORARY TABLE tmp (col1 TEXT NOT NULL, col2 INTEGER NOT NULL); COPY tmp FROM STDIN; row1 1 row2 2 ... \. INSERT INTO target SELECT * FROM tmp; If you need some kind of SELECT/INSERT/UPDATE cycle, it's far more complex, of course, and I'm not quite happy with what I'm using right now. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Durlacher Allee 47 tel: +49-721-96201-1 D-76131 Karlsruhe fax: +49-721-96201-99
Re: Performance penalty for remote access of postgresql (8.1.3)? any experiance?
From
Stephen Frost
Date:
* Guoping Zhang (guoping.zhang@nec.com.au) wrote: > Obviously, if there is no better solution, the TCP round trip penalty will > stop us doing so as we do have performance requirement. Actually, can't you stick multiple inserts into a given 'statement'? ie: insert into abc (123); insert into abc (234); I'm not 100% sure if that solves the round-trip issue, but it might.. Also, it looks like we might have multi-value insert support in 8.2 (I truely hope so anyway), so you could do something like this: insert into abc (123),(234); > I guess there shall be quite number of people out there facing the similar > problem, right? No alternative solution? Havn't run into it myself... Quite often you either have large inserts being done using COPY commands (data warehousing and analysis work) or you have a relatively small number of one-off inserts (OLTP) per transaction. Enjoy, Stephen
Attachment
Re: Performance penalty for remote access of postgresql (8.1.3)? any experiance?
From
Florian Weimer
Date:
* Stephen Frost: > Actually, can't you stick multiple inserts into a given 'statement'? > ie: insert into abc (123); insert into abc (234); IIRC, this breaks with PQexecParams, which is the recommended method for executing SQL statements nowadays. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Durlacher Allee 47 tel: +49-721-96201-1 D-76131 Karlsruhe fax: +49-721-96201-99
Re: Performance penalty for remote access of postgresql (8.1.3)? any experiance?
From
Stephen Frost
Date:
* Florian Weimer (fweimer@bfk.de) wrote: > * Stephen Frost: > > Actually, can't you stick multiple inserts into a given 'statement'? > > ie: insert into abc (123); insert into abc (234); > > IIRC, this breaks with PQexecParams, which is the recommended method > for executing SQL statements nowadays. For prepared queries you're absolutely correct. It's also true that it's the recommended approach for large numbers of inserts. If the network delay is more of a problem than the processing speed then it might make sense. It does seem to me that with multi-value insert we might consider changes to libpq to be able to use multi-value prepared inserts... Or it might be interesting to see the performance of non-prepared multi-value inserts vs. prepared statements. Thanks, Stephen
Attachment
In response to "Guoping Zhang" <guoping.zhang@nec.com.au>: > > Thanks for pointing me the cause, but we simply cannot use the COPY FROM > solution. > > Currently, our application service is running with its own dedicated local > database, IF Feasible, we want to separate the application services out of > database server and run SEVERAL instances of applation serivice on its own > server (one per server), and make them all shall one database server. This > helps to the scalability and also reduce the device cost as only database > server would need mirror/backup/UPS etc. > > Obviously, if there is no better solution, the TCP round trip penalty will > stop us doing so as we do have performance requirement. > > I guess there shall be quite number of people out there facing the similar > problem, right? No alternative solution? I suppose I'm a little confused on two points: 1) What did you expect. 2) What is your network? On #1: networking adds overhead. Period. Always. I believe you earlier said you estimated around %20 perf hit. For small transactions, I wouldn't expect much better. TCP adds a good bit of header to each packet, plus the time in the kernel, and the RTT. 20% sounds about average to me. #2 falls into a number of different categories. For example: a) What is your topology? If you absolutely need blazing speed, you should have a dedicated gigabit switched network between the machines. b) Not all network hardware is created equal. Cheap switches seldom perform at their advertised speed. Stick with high-end stuff. NICs are the same way. On #2, you'll want to ensure that the problem is not in the hardware before you start complaining about PostgreSQL, or even TCP. If you've got a cheap, laggy switch, not amount of TCP or PostgreSQL tuning is going to overcome it. Hope some of this is helpful. -- Bill Moran Collaborative Fusion Inc.
Stephen Frost wrote: > * Guoping Zhang (guoping.zhang@nec.com.au) wrote: > >>Obviously, if there is no better solution, the TCP round trip penalty will >>stop us doing so as we do have performance requirement. > > Actually, can't you stick multiple inserts into a given 'statement'? > ie: insert into abc (123); insert into abc (234); > > I'm not 100% sure if that solves the round-trip issue, but it might.. > Also, it looks like we might have multi-value insert support in 8.2 (I > truely hope so anyway), so you could do something like this: > insert into abc (123),(234); Yeah, see my post from last night on PATCHES. Something like "insert into abc (123); insert into abc (234); ..." actually seems to work pretty well as long as you don't drive the machine into swapping. If you're doing a very large number of INSERTs, break it up into bite-sized chunks and you should be fine. Joe
Re: Performance penalty for remote access of postgresql (8.1.3)? any experiance?
From
"Guoping Zhang"
Date:
Thanks for all the replies from different ppl. As you pointed out, each INSERT/UPDATE operation will result in a TCP round-trip delay for postgresql (may well true for all DBMS), this is the big problem to challenge our requirements, as extensively modify the (legacy) applicatioin is not a preferable choice. I measured the round-trip (UDP) delay as below: a) SERVER A to SERVER B: 0.35ms SERVER A to itself (Local host): 0.022ms That is, in the tests I did yesterday, it is about 100k insert operations, which means added around 35 seconds of delay..... b) Also, using Iperf shows that TCP bandwidth between Server A and B is about 92.3 Mbits/sec TCP bandwidth between two ports at same Server A can reach 10.9Gbits/sec That indicates the performance impact for the networking.... There might be parameter in Solaris to tune the 'ack response delay', but I didn't try now. Thanks for all the answers... Regards, Guoping Zhang -----Original Message----- From: Florian Weimer [mailto:fweimer@bfk.de] Sent: 2006Ae7OA20EO 0:18 To: Guoping Zhang Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Performance penalty for remote access of postgresql (8.1.3)? any experiance? * Stephen Frost: > Actually, can't you stick multiple inserts into a given 'statement'? > ie: insert into abc (123); insert into abc (234); IIRC, this breaks with PQexecParams, which is the recommended method for executing SQL statements nowadays. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Durlacher Allee 47 tel: +49-721-96201-1 D-76131 Karlsruhe fax: +49-721-96201-99
Guoping Zhang wrote: > >a) SERVER A to SERVER B: 0.35ms > SERVER A to itself (Local host): 0.022ms > > > 0.35ms seems rather slow. You might try investigating what's in the path. For comparison, between two machines here (three GigE switches in the path), I see 0.10ms RTT. Between two machines on the same switch I get 0.08ms.