Re: [patch] libpq one-row-at-a-time API - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: [patch] libpq one-row-at-a-time API
Date
Msg-id CAHyXU0zb88WNPbDR2JhEs_fGapNh+pDnC98n1NOO7eSqwFzQAA@mail.gmail.com
In response to Re: [patch] libpq one-row-at-a-time API  (Marko Kreen <markokr@gmail.com>)
Responses Re: [patch] libpq one-row-at-a-time API  (Marko Kreen <markokr@gmail.com>)
List pgsql-hackers
On Mon, Jul 23, 2012 at 2:05 PM, Marko Kreen <markokr@gmail.com> wrote:
>
> Here is a simple test program that takes a SELECT
> query, reads it and outputs a COPY-formatted stream
> to standard output, to simulate some activity.
>
> It operates in 3 modes, specified by command-line switches:
>
> -f   Load full resultset at once - old way.
> -s   Single-Row mode using PQgetResult().
> -z   Single-Row mode using PQgetRowData().
>
> It is compiled with 2 different libpqs that correspond to
> single-row-modeX branches in my github repo:
>
> rowdump1 - libpq with rowBuf + PQgetRowData().   rowBuf is
>            required for PQgetRowData.
>            [ https://github.com/markokr/postgres/tree/single-row-mode1 ]
>
> rowdump2 - Plain libpq patched for single-row mode.
>            No PQgetRowData() here.
>            [ https://github.com/markokr/postgres/tree/single-row-mode2 ]
>
> Notes:
>
> * Hardest part is picking realistic queries that matter.
>   It's possible to construct artificial queries that make
>   results go either way.
>
> * It does not make sense to compare -f with others.  But it
>   does make sense to compare -f from differently patched libpqs
>   to detect any potential slowdowns.
>
> * The time measured is User Time of client process.
>
> -------------------------------------------------------
> QUERY: select 10000,200,300000,rpad('x',30,'z') from
> generate_series(1,5000000)
> ./rowdump1 -f:   3.90   3.90   3.93  avg:  3.91
> ./rowdump2 -f:   4.03   4.13   4.05  avg:  4.07
> ./rowdump1 -s:   6.26   6.33   6.49  avg:  6.36
> ./rowdump2 -s:   7.48   7.46   7.50  avg:  7.48
> ./rowdump1 -z:   2.88   2.90   2.79  avg:  2.86
> QUERY: select
> rpad('x',10,'z'),rpad('x',20,'z'),rpad('x',30,'z'),rpad('x',40,'z'),rpad('x',50,'z'),rpad('x',60,'z')
> from generate_series(1,3000000)
> ./rowdump1 -f:   6.29   6.36   6.14  avg:  6.26
> ./rowdump2 -f:   6.79   6.69   6.72  avg:  6.73
> ./rowdump1 -s:   7.71   7.72   7.80  avg:  7.74
> ./rowdump2 -s:   8.14   8.16   8.57  avg:  8.29
> ./rowdump1 -z:   6.45   5.15   5.16  avg:  5.59
> QUERY: select
> 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
> from generate_series(1,800000)
> ./rowdump1 -f:   5.68   5.66   5.72  avg:  5.69
> ./rowdump2 -f:   5.69   5.84   5.67  avg:  5.73
> ./rowdump1 -s:   7.68   7.76   7.67  avg:  7.70
> ./rowdump2 -s:   7.57   7.54   7.62  avg:  7.58
> ./rowdump1 -z:   2.78   2.82   2.72  avg:  2.77
> QUERY: select 1000,rpad('x', 400, 'z'),rpad('x', 4000, 'z') from
> generate_series(1,100000)
> ./rowdump1 -f:   2.71   2.66   2.58  avg:  2.65
> ./rowdump2 -f:   3.11   3.14   3.16  avg:  3.14
> ./rowdump1 -s:   2.64   2.61   2.64  avg:  2.63
> ./rowdump2 -s:   3.15   3.11   3.11  avg:  3.12
> ./rowdump1 -z:   2.53   2.51   2.46  avg:  2.50
> -------------------------------------------------------
>
> Test code attached.  Please play with it.
>
> By this test, both rowBuf and PQgetRowData() look good.

I agree on performance grounds.   It's important for libpq to be fast.
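
To put a number on the quoted results, here is a minimal sketch that reduces
the timings to a slowdown ratio. The helper names (mean, slowdown) and the
array names are illustrative only and are not taken from the attached test
code; the samples are copied from the first query above.

```c
#include <stddef.h>

/* Mean of n timing samples (seconds of client user time). */
static double mean(const double *samples, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return n ? sum / (double) n : 0.0;
}

/* How many times slower `other` is than `base`, comparing the means. */
static double slowdown(const double *base, const double *other, size_t n)
{
    return mean(other, n) / mean(base, n);
}

/* Samples copied from the first query in the quoted results. */
static const double q1_rowdump1_s[] = {6.26, 6.33, 6.49};   /* ./rowdump1 -s */
static const double q1_rowdump1_z[] = {2.88, 2.90, 2.79};   /* ./rowdump1 -z */
```

Calling slowdown(q1_rowdump1_z, q1_rowdump1_s, 3) gives roughly 2.2, so on
that query -s costs a bit more than twice the user time of -z.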

It seems odd (but maybe ok) that you have to set single-row mode on the
connection, only to have it reset whenever you call a send function;
maybe rename it to PQsetResultSingleRowMode?

Does PQgetRowData() break the ability to call PQgetvalue() against the
result as well as other functions like PQgetisnull()?  If so, I
object.
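
For context, a row dumper like the test program typically needs both calls
for every field: PQgetisnull() to detect NULL and PQgetvalue() to fetch the
text. Below is a minimal sketch of the COPY text-format escaping that such a
loop would feed. The function names are hypothetical and not taken from the
attached test code, and the value and null flag are passed as plain arguments
so the sketch stands alone without libpq. It covers only backslash, tab,
newline and carriage return.

```c
#include <stddef.h>

/* Append one byte to buf, keeping room for the terminating NUL. */
static int put_byte(char *buf, size_t cap, size_t *len, char c)
{
    if (*len + 1 >= cap)
        return -1;              /* buffer too small */
    buf[(*len)++] = c;
    buf[*len] = '\0';
    return 0;
}

/*
 * Append one field in COPY text format.
 * `value` and `isnull` stand in for what PQgetvalue() and PQgetisnull()
 * would return for the field.  Returns 0 on success, -1 if buf is full.
 */
static int copy_put_field(char *buf, size_t cap, size_t *len,
                          const char *value, int isnull)
{
    if (isnull)
    {
        /* NULL is written as backslash-N */
        if (put_byte(buf, cap, len, '\\') != 0)
            return -1;
        return put_byte(buf, cap, len, 'N');
    }

    for (const char *p = value; *p; p++)
    {
        char esc = 0;

        switch (*p)
        {
            case '\\': esc = '\\'; break;
            case '\t': esc = 't';  break;
            case '\n': esc = 'n';  break;
            case '\r': esc = 'r';  break;
        }
        /* Escaped bytes get a leading backslash; others pass through. */
        if (esc && put_byte(buf, cap, len, '\\') != 0)
            return -1;
        if (put_byte(buf, cap, len, esc ? esc : *p) != 0)
            return -1;
    }
    return 0;
}
```

A caller would emit a tab between fields and a newline after each row; if
PQgetRowData() bypasses the PGresult accessors, the same two inputs would
have to come from the raw row data instead.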

merlin

