Thread: Slow network retrieves

Slow network retrieves

From
Date:
I'm seeing what seems like slow retrieval times over the network.  I am
retrieving a single field of about 100-120 characters per record.  I am
getting about 3 seconds per 1000 records - it takes 30 seconds to retrieve
10,000 records.  That's only about 36 KBytes/sec.

This is a 100BT switched network (not sure if it is vlan'd or through a
router).  Echo time averages 3ms.

The back end is pretty much idle.  It shows 'idle in transaction'.

05-08-2004.23:54:43 Records read: 10000
05-08-2004.23:55:17 Records read: 20000
05-08-2004.23:55:50 Records read: 30000
05-08-2004.23:56:22 Records read: 40000
05-08-2004.23:56:55 Records read: 50000
05-08-2004.23:57:32 Records read: 60000
05-08-2004.23:58:07 Records read: 70000
...

The code is an ecpg program like:

    EXEC SQL WHENEVER SQLERROR GOTO sql_error;
    EXEC SQL WHENEVER NOT FOUND DO break;
    EXEC SQL DECLARE message_cursor CURSOR FOR

        SELECT
            file_name
        FROM
            messages
        WHERE
            system_key=(select system_key from systems where
system_name=:systemName);

    EXEC SQL OPEN message_cursor;

    count = 0;
    while (1) {
        EXEC SQL FETCH message_cursor INTO
            :fileNameDB;

        memcpy (tempstr, fileNameDB.arr, fileNameDB.len);
        tempstr[fileNameDB.len] = '\0';

            [Action with tempstr removed for testing]

        count++;
        if ( (count % 10000) == 0) logmsg ("Records read: %d", count);
        }


How can I speed this thing up?

Wes


Re: Slow network retrieves

From
Tom Lane
Date:
<wespvp@syntegra.com> writes:
> I'm seeing what seems like slow retrieval times over the network.

Are you sure it is a network problem?  What performance do you get
if you run the same test program locally on the database machine?
How about issuing the same sort of FETCH commands via a psql script?

            regards, tom lane

Re: Slow network retrieves

From
Date:
On 5/9/04 9:32 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> Are you sure it is a network problem?  What performance do you get
> if you run the same test program locally on the database machine?
> How about issuing the same sort of FETCH commands via a psql script?

Yes, it is definitely due to the network latency even though that latency is
very small.  Here it is running locally:

05-09-2004.17:49:41  Records read: 10000
05-09-2004.17:49:41  Records read: 20000
05-09-2004.17:49:42  Records read: 30000
05-09-2004.17:49:42  Records read: 40000
05-09-2004.17:49:43  Records read: 50000
05-09-2004.17:49:43  Records read: 60000
05-09-2004.17:49:44  Records read: 70000
05-09-2004.17:49:45  Records read: 80000
05-09-2004.17:49:45  Records read: 90000
05-09-2004.17:49:46  Records read: 100000
05-09-2004.17:49:46  Records read: 110000
05-09-2004.17:49:47  Records read: 120000
05-09-2004.17:49:47  Records read: 130000
05-09-2004.17:49:48  Records read: 140000

My "outside looking in" observations seem to point to the fact that every
row has to be retrieved (or stored) with a separate request.  Network
latency, however small, becomes an issue when the volume is very high.

A Pro*C program I recently ported from Oracle to PostgreSQL showed this
difference.  In Pro*C you can load an array with rows to insert, then issue
a single INSERT request passing it the array.  As far as I can tell, in
PostgreSQL ecpg (or other) you have to execute one request per record.

Is there some way to batch insert/fetch requests?  How else can I improve
upon the performance?  It appears that COPY works like this, but you can't
control what is returned and you have to know the column order.

Wes


Re: Slow network retrieves

From
Tom Lane
Date:
<wespvp@syntegra.com> writes:
> On 5/9/04 9:32 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>> Are you sure it is a network problem?

> Yes, it is definitely due to the network latency even though that latency is
> very small.  Here it is running locally:
> [ about 20000 records/sec ]

Okay, I just wanted to verify that we weren't overlooking any other
sorts of bottleneck.  But the numbers you quote make sense as a network
issue: 33 seconds for 10000 records is 3.03 msec per record, and since
you say the measured ping time is 3 msec, it appears that FETCH has
just about the same response time as a ping ;-).  So you can't really
complain about it.  The only way to do better will be to batch multiple
fetches into one network round trip.

> A Pro*C program I recently ported from Oracle to PostgreSQL showed this
> difference.  In Pro*C you can load an array with rows to insert, then issue
> a single INSERT request passing it the array.  As far as I can tell, in
> PostgreSQL ecpg (or other) you have to execute one request per record.

The usual way to batch multiple insertions is with COPY IN.  The usual
way to batch a fetch is just to SELECT the whole thing; or if that is
too much data to snarf at once, use a cursor with "FETCH n" requests.
I am not sure how either of these techniques map into ecpg though.
If you want to use ecpg then I'd suggest bringing up the question on
pgsql-interfaces --- the ecpg gurus are more likely to be paying
attention over there.

> ... It appears that COPY works like this, but you can't
> control what is returned and you have to know the column order.

True, COPY OUT is only designed to return all the rows of a table.
However, in recent versions you can specify what columns you want
in a COPY.  It's still no substitute for SELECT...

            regards, tom lane

Re: Slow network retrieves

From
Gaetano Mendola
Date:
wespvp@syntegra.com wrote:

> The back end is pretty much idle.  It shows 'idle in transaction'.

Well, is not soo much idle, it's holding a transaction id!

That "idle in transaction" is not your problem but however I suggest you
take a look at why you have idle in transaction backend; do you
have back end sitting there days and days in that state ?



Regards
Gaetano Mendola




Re: Slow network retrieves

From
Date:
On 5/9/04 1:58 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> I am not sure how either of these techniques map into ecpg though.
> If you want to use ecpg then I'd suggest bringing up the question on
> pgsql-interfaces --- the ecpg gurus are more likely to be paying
> attention over there.

I got some sample code from someone on the pgsql-interfaces list on how to
do bulk FETCH's.  It is pretty much the same as with Pro*C.  You just can't
use that for INSERT/UPDATE (hint, hint...)  I was able to improve the
network retrieval rate from 1 million records per hour to 1 million records
per minute.

Wes