Re: [patch] libpq one-row-at-a-time API - Mailing list pgsql-hackers
From | Merlin Moncure |
---|---|
Subject | Re: [patch] libpq one-row-at-a-time API |
Date | |
Msg-id | CAHyXU0zb88WNPbDR2JhEs_fGapNh+pDnC98n1NOO7eSqwFzQAA@mail.gmail.com |
In response to | Re: [patch] libpq one-row-at-a-time API (Marko Kreen <markokr@gmail.com>) |
Responses | Re: [patch] libpq one-row-at-a-time API (Marko Kreen <markokr@gmail.com>) |
List | pgsql-hackers |
On Mon, Jul 23, 2012 at 2:05 PM, Marko Kreen <markokr@gmail.com> wrote:
>
> Here is a simple test program that takes a SELECT
> query, reads it and outputs a COPY-formatted stream
> to standard output, to simulate some activity.
>
> It operates in 3 modes, specified by command-line switches:
>
>   -f   Load full resultset at once - old way.
>   -s   Single-Row mode using PQgetResult().
>   -z   Single-Row mode using PQgetRowData().
>
> It is compiled with 2 different libpqs that correspond to
> single-row-modeX branches in my github repo:
>
>   rowdump1 - libpq with rowBuf + PQgetRowData().  rowBuf is
>              required for PQgetRowData.
>              [ https://github.com/markokr/postgres/tree/single-row-mode1 ]
>
>   rowdump2 - Plain libpq patched for single-row mode.
>              No PQgetRowData() here.
>              [ https://github.com/markokr/postgres/tree/single-row-mode2 ]
>
> Notes:
>
> * Hardest part is picking realistic queries that matter.
>   It's possible to construct artificial queries that make
>   results go either way.
>
> * It does not make sense to compare -f with others.  But it
>   does make sense to compare -f from differently patched libpqs
>   to detect any potential slowdowns.
>
> * The time measured is User Time of client process.
>
> -------------------------------------------------------
> QUERY: select 10000,200,300000,rpad('x',30,'z') from
> generate_series(1,5000000)
> ./rowdump1 -f:  3.90  3.90  3.93  avg:  3.91
> ./rowdump2 -f:  4.03  4.13  4.05  avg:  4.07
> ./rowdump1 -s:  6.26  6.33  6.49  avg:  6.36
> ./rowdump2 -s:  7.48  7.46  7.50  avg:  7.48
> ./rowdump1 -z:  2.88  2.90  2.79  avg:  2.86
> QUERY: select
> rpad('x',10,'z'),rpad('x',20,'z'),rpad('x',30,'z'),rpad('x',40,'z'),rpad('x',50,'z'),rpad('x',60,'z')
> from generate_series(1,3000000)
> ./rowdump1 -f:  6.29  6.36  6.14  avg:  6.26
> ./rowdump2 -f:  6.79  6.69  6.72  avg:  6.73
> ./rowdump1 -s:  7.71  7.72  7.80  avg:  7.74
> ./rowdump2 -s:  8.14  8.16  8.57  avg:  8.29
> ./rowdump1 -z:  6.45  5.15  5.16  avg:  5.59
> QUERY: select
> 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
> from generate_series(1,800000)
> ./rowdump1 -f:  5.68  5.66  5.72  avg:  5.69
> ./rowdump2 -f:  5.69  5.84  5.67  avg:  5.73
> ./rowdump1 -s:  7.68  7.76  7.67  avg:  7.70
> ./rowdump2 -s:  7.57  7.54  7.62  avg:  7.58
> ./rowdump1 -z:  2.78  2.82  2.72  avg:  2.77
> QUERY: select 1000,rpad('x', 400, 'z'),rpad('x', 4000, 'z') from
> generate_series(1,100000)
> ./rowdump1 -f:  2.71  2.66  2.58  avg:  2.65
> ./rowdump2 -f:  3.11  3.14  3.16  avg:  3.14
> ./rowdump1 -s:  2.64  2.61  2.64  avg:  2.63
> ./rowdump2 -s:  3.15  3.11  3.11  avg:  3.12
> ./rowdump1 -z:  2.53  2.51  2.46  avg:  2.50
> -------------------------------------------------------
>
> Test code attached.  Please play with it.
>
> By this test, both rowBuf and PQgetRowData() look good.

I agree on performance grounds. It's important for libpq to be fast.
It seems odd (but maybe ok) that you have to set the single row mode on the connection only to have the server reset it whenever you call a send function: maybe rename to PQsetResultSingleRowMode?

Does PQgetRowData() break the ability to call PQgetvalue() against the result as well as other functions like PQgetisnull()? If so, I object.

merlin