Thread: libpq: How are result sets fetch behind the scene?
Hi all,

We are using libpq in our C program.

I was wondering how the client library / protocol manages to fetch the result set into the client application's memory. Is there some optimization? Are rows fetched in sequence up to the requested row?

I mean: if I run a SELECT that produces a huge result set and then call PQgetvalue() with row_number = 5000, are all 5000 rows fetched from the server, or is there some smart fast-forward done because I did not ask for the 4999 previous rows?

We know about server cursors (DECLARE + FETCH FORWARD), and we already fetch blocks of rows this way. But server cursors are slow (and certainly use more server resources) compared to direct SELECT execution. So we are wondering what the impact would be in terms of resources on the client application.

Is there any doc link or blog post that explains best practices for libpq result set programming?

Thanks!
Seb
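A minimal sketch of the block-fetch pattern mentioned above (server-side cursor plus FETCH FORWARD). The connection string, table and column names are placeholders, not taken from the thread:

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");   /* assumed conninfo */
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* A cursor only lives inside a transaction block. */
    PQclear(PQexec(conn, "BEGIN"));
    PQclear(PQexec(conn, "DECLARE c1 CURSOR FOR SELECT id, payload FROM big_table"));

    for (;;) {
        PGresult *res = PQexec(conn, "FETCH FORWARD 1000 FROM c1");
        if (PQresultStatus(res) != PGRES_TUPLES_OK) {
            fprintf(stderr, "FETCH failed: %s", PQerrorMessage(conn));
            PQclear(res);
            break;
        }
        int nrows = PQntuples(res);
        for (int i = 0; i < nrows; i++)
            printf("%s\n", PQgetvalue(res, i, 1));   /* payload column */
        PQclear(res);
        if (nrows == 0)                              /* cursor exhausted */
            break;
    }

    PQclear(PQexec(conn, "CLOSE c1"));
    PQclear(PQexec(conn, "COMMIT"));
    PQfinish(conn);
    return 0;
}

Only one block of rows (here, at most 1000) is held in client memory at a time, at the cost of one server round trip per FETCH.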
Sorry, the title should read "fetched".

A similar question: when executing a SELECT (returning a large result set), is the whole result set fetched to the client app, no matter what row number is provided to the first PQgetvalue() call (or a similar API call on result set data or metadata)?

Seb

On 8/29/19 6:32 PM, Sebastien FLAESCH wrote:
> [...]
Just tested (using valgrind --massif) how much memory is allocated on the client side.

As I expected, when fetching all rows with simple query execution (no server cursor with FETCH FORWARD to get rows in blocks), a lot of memory is allocated on the client side if you fetch all rows sequentially.

This makes sense, since the purpose of PQgetvalue() is to give access to any row / column of the whole result set.

The valgrind --massif charts sometimes look a bit strange to me:

[massif chart, simple query execution: peak 1.943 MB, x-axis 0 to 22.02 Gi]

That one is more what I expect:

[massif chart, simple query execution: peak 1.943 MB, x-axis 0 to 22.10 Gi]

Same query using server cursors, with FETCH FORWARD.
It's slower, but obviously less memory is used:

[massif chart, server cursor with FETCH FORWARD: peak 695.5 KB, x-axis 0 to 22.37 Gi]

Seb

On 8/29/19 6:39 PM, Sebastien FLAESCH wrote:
> [...]
By "slower" in the last sentence: It's slower because we execute the same query many times, it's not a single query execution. This is really about performances of server cursors when many DECLARE / CLOSE are done. Seb On 8/30/19 3:40 PM, Sebastien FLAESCH wrote: > Just tested (using valgrind --massif) how much memory is allocated on the client side. > > As I expected, when fetching all rows with simple query execution (no server cursor > using fetch forward to get rows in blocks), if you fetch all rows sequentially, a > lot of memory is allocated on the client side. > > This makes sense since PQgetvalue() purpose is to give access to any row / cols of > the whole result set. > > The valgrind --massif charts looks sometimes a bit strange to me: > > MB > 1.943^# > |#:::::::::::::: ::@:::::::@:::::: > |#:::::: :: : :: :: @:::::::@:::::: > |#:::::: :: : :: ::::: @:::::::@:::::: > |#:::::: :: : :: :::: :: @:::::::@:::::: > |#:::::: :: : :: :::::: :: @:::::::@:::::: > |#:::::: :: : :: ::::::::: :: @:::::::@:::::: > |#:::::: :: : :: :::: :::::: :: @:::::::@:::::: > |#:::::: :: : :: :::::: :::::: :: @:::::::@:::::: > |#:::::: :: : :: ::: :::: :::::: :: @:::::::@:::::: > |#:::::: :: : :: ::::::::::: :::: :::::: :: @:::::::@:::::: > |#:::::: :: : :: ::::@@::::: :: :::::: :::: :::::: :: @:::::::@:::::: > |#:::::: :: : :: :::: :@ : ::: :: :::::: :::: :::::: :: @:::::::@:::::: > |#:::::: :: : ::::::::: :@ : ::: :: :::::: :::: :::::: :: @:::::::@:::::: > |#:::::: :: : ::: ::::: :@ : ::: :: :::::: :::: :::::: :: @:::::::@:::::: > |#:::::: :: : ::: ::::: :@ : ::: :: :::::: :::: :::::: :: @:::::::@:::::: > |#:::::: :: : ::: ::::: :@ : ::: :: :::::: :::: :::::: :: @:::::::@:::::: > |#:::::: :: : ::: ::::: :@ : ::: :: :::::: :::: :::::: :: @:::::::@:::::: > |#:::::: :: : ::: ::::: :@ : ::: :: :::::: :::: :::::: :: @:::::::@:::::: > |#:::::: :: : ::: ::::: :@ : ::: :: :::::: :::: :::::: :: @:::::::@:::::: > 0 +----------------------------------------------------------------------->Gi > 0 22.02 > > That one is more what I expect: > > MB > 1.943^## > |# :::::::::::::: :: : :@:::::@::::::::::::::::::::::::::::::::::::::: > |# :::: : :: :: : : : :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: : : : :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: :::::::@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > |# :::: : :: :: :@: : ::: :@: :: @:: :::: :: :: :: :::: :: :: ::: ::::::@ > 0 
+----------------------------------------------------------------------->Gi > 0 22.10 > > Same query using server cursors, with fetch forward. > It's slower, but obviously less memory is used: > > > KB > 695.5^# : @@ > |#:::::::::::::::::@:@:::::@ @@:::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > |#:: ::::::::::::::@:@:::::@ @ :::::@::::::::@:::::@:::::@::::@:::::@:::: > 0 +----------------------------------------------------------------------->Gi > 0 22.37 > > > > > > Seb > > > > On 8/29/19 6:39 PM, Sebastien FLAESCH wrote: >> Sorry must read "fetched" in title. >> >> Similar question: >> >> When execution a SELECT (returning a large result set): >> >> Is the whole result set fetched to the client app, not matter what row number is provided to the first PQgetvalue() call(or similar API call on >> result set data or meta-data)? >> >> Seb >> >> On 8/29/19 6:32 PM, Sebastien FLAESCH wrote: >>> Hi all, >>> >>> We are using libpq in our C program. >>> >>> I was wondering how the client lib / protocol manage to fetch the result set to the client app memory. >>> >>> Is there some optimization ? >>> >>> Are rows fetched in sequence until the requested row? >>> >>> I mean if I do a SELECT that produces a huge result set, and then call PQgetvalue() with a row_number = 5000, are all5000 rows fetched from the >>> server or is there some smart fast-forward done because I did not ask the 4999 previous rows? >>> >>> We know about server cursors (declare + fetch forward), and we do already fetch blocks of rows with this. >>> >>> But server cursors are slow (and certainly use more server resources), compared to direct SELECT execution. >>> >>> So we are wondering what would be the impact in terms of resources on the client application. >>> >>> Is there any doc link or blog that would explain best practices with libpq result set programming? >>> >>> Thanks! >>> Seb >>> >>> >> >> >> > > >
Sebastien FLAESCH <sf@4js.com> writes:

> Is the whole result set fetched to the client app, no matter what row
> number is provided to the first PQgetvalue() call (or similar API call
> on result set data or meta-data)?

I have tested this as well and came to the same result as you. The entire result set seems to be fetched at once.

Attached is a test program: I loaded 30 MB of random strings and did a simple SELECT on the random data. I stopped the output with getchar() and looked at the network traffic and the memory usage with top(1). Everything gets allocated and fetched at once, as far as I can see.

However, I would not build an application that "depends" on this behavior. I think it is better to use a declared cursor and FETCH.

--
Christian Barthel <bch@online.de>
Attachment
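The attached program is not reproduced above; a test of the kind described might look roughly like the following sketch. The connection string and table name are assumptions, and the getchar() pause is where memory and network traffic can be inspected:

#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");   /* assumed conninfo */
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* PQexec returns only after the complete result set has been received,
     * so all row data is already buffered in client memory at this point. */
    PGresult *res = PQexec(conn, "SELECT data FROM random_strings");  /* hypothetical table */
    if (PQresultStatus(res) == PGRES_TUPLES_OK)
        printf("rows received: %d\n", PQntuples(res));
    else
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));

    /* Pause here to inspect memory usage with top(1) or the network traffic. */
    getchar();

    PQclear(res);
    PQfinish(conn);
    return 0;
}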
Christian Barthel <bch@online.de> writes:
> Sebastien FLAESCH <sf@4js.com> writes:
>> Is the whole result set fetched to the client app, no matter what row
>> number is provided to the first PQgetvalue() call (or similar API call
>> on result set data or meta-data)?

> I have tested this as well and came to the same result as you.
> The entire result set seems to be fetched at once.

This must be so, and is documented as being so, because the abstraction that libpq provides is that a query either succeeds or fails. It cannot fetch a few rows and then decide that the query has succeeded; it has to collect the whole input before it knows there will be no late failure.

Recent releases of libpq have an API to let you examine the rows as they come in, but then it's on your head to deal with the situation where an error occurs after you've already processed some rows. See

https://www.postgresql.org/docs/current/libpq-single-row-mode.html

You can alternatively use a cursor and FETCH a few rows at a time, as you mentioned. An error later than the first FETCH is still possible that way, but it's quantized in some sense --- any one FETCH either succeeds or fails.

			regards, tom lane
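A minimal sketch of the single-row mode referred to above (see the documentation link); the connection string and query are placeholders. Each row arrives in its own PGresult, so the client never buffers the whole result set, but an error can still be reported after some rows have already been consumed:

#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");   /* assumed conninfo */
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* Single-row mode must be requested right after sending the query,
     * before the first PQgetResult() call. */
    if (!PQsendQuery(conn, "SELECT id, payload FROM big_table") ||
        !PQsetSingleRowMode(conn)) {
        fprintf(stderr, "could not start query: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    PGresult *res;
    while ((res = PQgetResult(conn)) != NULL) {
        ExecStatusType st = PQresultStatus(res);
        if (st == PGRES_SINGLE_TUPLE) {
            printf("%s\n", PQgetvalue(res, 0, 1));   /* one row per PGresult */
        } else if (st != PGRES_TUPLES_OK) {
            /* The query can still fail after rows were already processed;
             * the application has to handle that case itself. */
            fprintf(stderr, "query failed mid-stream: %s", PQerrorMessage(conn));
        }
        PQclear(res);
    }

    PQfinish(conn);
    return 0;
}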
Thank you for your answers, Tom and Christian!

Our code is based on server cursors to fetch rows in groups with FETCH FORWARD. This is fine and optimal regarding the memory used by the result set.

I think my main message here is that the execution speed of DECLARE / server cursors should be optimized, if possible. We did some comparisons with other DB engines, and they are much faster with equivalent query execution API calls.

Seb

On 9/8/19 9:35 PM, Tom Lane wrote:
> [...]