Thread: where does postgres keep the query result until it is returned?
Hi,
Where does postgres keep the query result until it is returned?
In the shared_buffers?
Or in extra memory that was not previously allocated, or something else?
What if the query result becomes very large, so that it won't fit into memory?
cheers,
WBL
--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw
On Fri, Sep 17, 2010 at 03:00:36PM +0200, Willy-Bas Loos wrote:
> Where does postgres keep the query result until it is returned?
> In the shared_buffers?
> Or in extra memory that was not previously allocated, or something else?

Postgres, the server software, will spill large results (and any
intermediate working sets) to disk automatically as needed. I believe
any memory allocated for this task will be up to work_mem in size.

The client side, however, isn't as intelligent: libpq will, by default,
try to read all records into memory and will crash if the results are
too large. There are various ways of dealing with this, but I haven't
tried them myself.

--
  Sam  http://samason.me.uk/
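One of those ways is a server-side cursor, so the client only ever holds one
page of rows at a time. A minimal sketch; the table and cursor names
(big_table, big_cur) are made up for illustration:

    BEGIN;
    DECLARE big_cur CURSOR FOR SELECT * FROM big_table;
    FETCH FORWARD 1000 FROM big_cur;  -- repeat until no rows come back
    CLOSE big_cur;
    COMMIT;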
On Fri, Sep 17, 2010 at 02:14:57PM +0100, Sam Mason wrote:
> Postgres, the server software, will spill large results (and any
> intermediate working sets) to disk automatically as needed. I believe
> any memory allocated for this task will be up to work_mem in size.

That wasn't very clear, was it; when I said "this task" I meant the task
of accumulating results before things spill.

--
  Sam  http://samason.me.uk/
Sam Mason <sam@samason.me.uk> writes:
>> Postgres, the server software, will spill large results (and any
>> intermediate working sets) to disk automatically as needed. I believe
>> any memory allocated for this task will be up to work_mem in size.

> That wasn't very clear, was it; when I said "this task" I meant the task
> of accumulating results before things spill.

If the question was about the final query result sent to the client:
we don't spill that to disk, nor "hold" it anywhere. The backend
sends it to the client on-the-fly as each row is generated. It's the
client's responsibility to cope if the result set is huge. (As you
noted, libpq is none too bright about that.)

			regards, tom lane
> If the question was about the final query result sent to the client:

yes

> we don't spill that to disk, nor "hold" it anywhere. The backend
> sends it to the client on-the-fly as each row is generated.

thanks, i didn't know that.

I asked because i have a function that produces a result in xml.
that is one row, one value even, but it grows pretty large.
how is that handled?

--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw
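For what it's worth, that kind of single-row, single-value xml result might
look something like the following; query_to_xml() is a built-in, but the inner
table name is made up:

    -- one row, one xml value, potentially hundreds of megabytes
    SELECT query_to_xml('select * from big_table', true, false, '');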
On Fri, Sep 17, 2010 at 04:51:46PM +0200, Willy-Bas Loos wrote:
> i have a function that produces a result in xml.
> that is one row, one value even, but it grows pretty large.
> how is that handled?

Rows are sent back in their entirety, so the PG instance would need
enough memory to work with that row. When you're running a 32bit
version of PG, values whose size is beyond ~100MB are a bit touch and
go as to whether they will work.

--
  Sam  http://samason.me.uk/
On Fri, Sep 17, 2010 at 3:59 PM, Sam Mason <sam@samason.me.uk> wrote:
> On Fri, Sep 17, 2010 at 04:51:46PM +0200, Willy-Bas Loos wrote:
>> i have a function that produces a result in xml.
>> that is one row, one value even, but it grows pretty large.
>> how is that handled?
>
> Rows are sent back in their entirety, so the PG instance would need
> enough memory to work with that row. When you're running a 32bit
> version of PG, values whose size is beyond ~100MB are a bit touch and go
> whether it will work.

Well, the entire result has to be stored in memory on the client side,
not just the row. This is why certain operations are so much more
convenient on the server (like insert into foo select * from bar).

After working with pg for a while you just get used to paging through
results on the client (I try to keep the result size under 10mb or so)
using any one of a number of methods. This is probably 'good sql style'
anyways. One group of people that bump into this are xbase refugees who
could browse massive results without fear due to the serverless
architecture.

There is probably a good case to be made for a more flexible result
architecture in libpq that supports streaming, but I'd personally much
rather see features like wCTE and stored procedures that remove cases
where you have to send lots of data through the protocol to the client.

merlin
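One of those paging methods, sketched as keyset pagination; the table bar, its
integer primary key id, and the payload column are placeholders:

    -- fetch the next batch of rows after the last id already seen;
    -- start with id > 0 and repeat with the last id of each page
    SELECT id, payload
    FROM bar
    WHERE id > 10000
    ORDER BY id
    LIMIT 1000;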
On Mon, Sep 20, 2010 at 5:27 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Fri, Sep 17, 2010 at 3:59 PM, Sam Mason <sam@samason.me.uk> wrote:
> On Fri, Sep 17, 2010 at 04:51:46PM +0200, Willy-Bas Loos wrote:
>> i have a function that produces a result in xml.
>> that is one row, one value even, but it grows pretty large.
>> how is that handled?
>
> Rows are sent back in their entirety, so the PG instance would need
> enough memory to work with that row. When you're running a 32bit
> version of PG, values whose size is beyond ~100MB are a bit touch and go
> whether it will work.
Is'nt "work_mem" used for this on the server side (when sorts and hash-tables are required) and "temp_buffers" (when temporary tables are created) ?
> Rows are sent back in their entirety, so the PG instance would need
> enough memory to work with that row. When you're running a 32bit
> version of PG, values whose size is beyond ~100MB are a bit touch and go
> whether it will work.

?? I hear that it's possible to store up to a GiB as a value for a
"text" field. How would i possibly retrieve that data from the
database?

It would be logical to use work_mem for this.. (IMHO)

Cheers,

Willy-Bas
--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw
Willy-Bas Loos <willybas@gmail.com> writes:
>> Rows are sent back in their entirety, so the PG instance would need
>> enough memory to work with that row. When you're running a 32bit
>> version of PG, values whose size is beyond ~100MB are a bit touch and go
>> whether it will work.

> ?? I hear that it's possible to store up to a GiB as a value for a
> "text" field. How would i possibly retrieve that data from the database?

That's the *theoretical* limit. In a 32-bit build, the practical limit
is going to be quite a bit less.

			regards, tom lane
>>> Rows are sent back in their entirety, so the PG instance would need
>>> enough memory to work with that row. When you're running a 32bit
>>> version of PG, values whose size is beyond ~100MB are a bit touch and go
>>> whether it will work.
>
>> ?? I hear that it's possible to store up to a GiB as a value for a
>> "text" field. How would i possibly retrieve that data from the database?
>
> That's the *theoretical* limit. In a 32-bit build, the practical limit
> is going to be quite a bit less.

Sheesh..

- why?
- how about 64 bit?
- should this be documented?

--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw
On Mon, Sep 20, 2010 at 03:37:58PM +0200, Willy-Bas Loos wrote:
> > Rows are sent back in their entirety, so the PG instance would need
> > enough memory to work with that row. When you're running a 32bit
> > version of PG, values whose size is beyond ~100MB are a bit touch and go
> > whether it will work.
>
> ?? I hear that it's possible to store up to a GiB as a value for a
> "text" field. How would i possibly retrieve that data from the database?

http://www.postgresql.org/docs/9.0/static/functions-string.html

	│ substring(string [from int] [for int]) │ text │ Extract substring

@all: Regarding the HTML docs: Could index entries be made HTML anchors
such that I could have turned the above link into something like this:

http://www.postgresql.org/docs/9.0/static/bookindex.html#substring

?

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
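A minimal sketch of pulling a huge text value back in slices with substring();
the table and column names (docs, xml_text, id) are made up:

    -- retrieve ~10 MB at a time; advance the offset (1, 10000001, 20000001, ...)
    -- until the returned slice comes back empty
    SELECT substring(xml_text from 1 for 10000000)
    FROM docs
    WHERE id = 42;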
On Sep 20, 2010, at 8:06 AM, Willy-Bas Loos wrote:
> Sheesh..
>
> - why?

Because you start with 4GB of address space, immediately give up maybe
0.5GB for mapping executable code (all the system libraries that get
linked), plus a good chunk for the main thread stack, plus library data
structures. The end result is you likely have about 2GB (or less) of
address space available for heap allocations. And this of course will
get used, and at least somewhat fragmented, as your program runs. So it
is difficult to find a 1GB contiguous block for allocation.

> - how about 64 bit?

Well, you start with 17,179,869,184GB of address space, so of course
it's different. It tends to be pretty easy to find a contiguous 1GB
block in an address space that size.

> - should this be documented?

Doubtful. It has to do with very well known limits of 32-bit programs,
applies to any use of a 1GB block regardless of where it comes from
(network or disk), really has nothing to do with postgres, and the
actual size at which you'll start to have problems will vary a good bit
depending on: OS, libraries you link to, memory management library you
use, memory management techniques you use, and to a very great extent
your specific app and exactly what it does.

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice
On 20/09/2010 10:56 PM, Scott Ribe wrote:
>> - should this be documented?
>
> Doubtful. It has to do with very well known limits of 32-bit programs,
> applies to any use of a 1GB block regardless of where it comes from
> (network or disk), really has nothing to do with postgres, and the
> actual size at which you'll start to have problems will vary a good bit
> depending on: OS, libraries you link to, memory management library you
> use, memory management techniques you use, and to a very great extent
> your specific app and exactly what it does.

If nothing else, it'd be helpful to mention in the docs that storage of
individual values requires a contiguous memory block for each value.
It's not unusual to have data structures that can be used to store
large amounts of data structured as linked lists of extent blocks, so
this is really a perfectly reasonable question to ask.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/
On Mon, Sep 20, 2010 at 10:06 AM, Willy-Bas Loos <willybas@gmail.com> wrote:
>>>> Rows are sent back in their entirety, so the PG instance would need
>>>> enough memory to work with that row. When you're running a 32bit
>>>> version of PG, values whose size is beyond ~100MB are a bit touch and go
>>>> whether it will work.
>>
>>> ?? I hear that it's possible to store up to a GiB as a value for a
>>> "text" field. How would i possibly retrieve that data from the database?
>>
>> That's the *theoretical* limit. In a 32-bit build, the practical limit
>> is going to be quite a bit less.

> Sheesh..
>
> - why?

A database is not a filesystem, and a filesystem is not a database.
Postgres, like many databases, is designed to organize small pieces of
data for flexible retrieval. Filesystems are designed to store pieces
of any size in a relatively inflexible way. You can adapt either
technology to purposes it was not really designed for, but there are
going to be practical constraints. I expect these differences will
become especially glaring when the coming revolution in storage arrives
and most systems become cpu (or even network) bound again.

There are at least three specific features in postgres you can look at
to make dealing with large datums easier: toast (which is used
implicitly), large objects, and the automatic column data compression
features. You can also search the archives for a wealth of information
on the topic. If I ever wrote a book on things not to do as a dba,
though, storing 1gb+ xml files in a database would be in the first
chapter :-). How about breaking the file down?

> - how about 64 bit?

The various 32 bit based limits in postgres come from architecture
choices and the desire to give consistent behavior across all
platforms, not necessarily from hardware constraints.

merlin
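For the large-objects route mentioned above, a rough sketch using the built-in
server-side functions; the file paths and the oid shown are placeholders, and
the paths must be readable and writable by the server process:

    -- import a big file into a large object; returns its oid (say, 16405)
    SELECT lo_import('/tmp/big.xml');

    -- later, write it back out to a file on the server machine
    SELECT lo_export(16405, '/tmp/big_copy.xml');

    -- delete the large object once it is no longer needed
    SELECT lo_unlink(16405);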
One thing that i think works is to create a table from your results.
That is, if your result is not one big chunk in one row that won't fit
in memory, but lots of rows.

like this:

create table foo as select <humongous query>

If you use "screen" (on linux) you can log out while your humongous
query runs on the server.

Cheers,

WBL
--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw
On Thu, Sep 30, 2010 at 7:29 PM, Willy-Bas Loos <willybas@gmail.com> wrote:
> One thing that i think works is to create a table from your results.
> That is, if your result is not one big chunk in one row that won't fit
> in memory, but lots of rows.

On second thought, that helps for the scenario where your client loses
its connection or otherwise stops functioning before it has received
the whole query result; it isn't really related to the topic.

--
"Patriotism is the conviction that your country is superior to all others because you were born in it." -- George Bernard Shaw