Thread: trouble with a join on OS X
I am trying to do fairly simple joins on climate databases that should return ~7 million rows of data. However, I'm getting an error message on an OS X (10.4 Tiger Server) machine that seems to imply that I am running out of memory. The errors are:

psql(15811) malloc: *** vm_allocate(size=8421376) failed (error code=3)
psql(15811) malloc: *** error: can't allocate region
psql(15811) malloc: *** set a breakpoint in szone_error to debug

The query should return all data from all climate stations. In order to test the query I tried narrowing the SELECT statement to return data for a single station. This query worked (i.e. it did not cause the malloc errors) and returned the expected 200,000 or so rows. Since this worked, I don't think there is a problem with the join syntax.

This is a dual G5 box with 6 GB of RAM running PostgreSQL 8.1. I have not tried altering kernel resources (as described in http://www.postgresql.org/docs/8.1/interactive/kernel-resources.html#SHARED-MEMORY-PARAMETERS), or compiling for 64 bit. I'm just not sure what to try next. Does anyone have any suggestions?

Best Regards,

Kirk
On Fri, Feb 02, 2007 at 07:52:48AM -0600, Kirk Wythers wrote:
> psql(15811) malloc: *** vm_allocate(size=8421376) failed (error code=3)
> psql(15811) malloc: *** error: can't allocate region
> psql(15811) malloc: *** set a breakpoint in szone_error to debug

It sounds like you are out of memory. Have you tried reducing work_mem? Actually, what does your postgresql.conf look like with regard to memory settings?

> This is a dual G5 box with 6 GB of RAM running PostgreSQL 8.1. I have
> not tried altering kernel resources (as described in
> http://www.postgresql.org/docs/8.1/interactive/kernel-resources.html#SHARED-MEMORY-PARAMETERS),
> or compiling for 64 bit. I'm just not sure what to try next. Does
> anyone have any suggestions?

Compiling for 64 bit might very well help you, but it sounds odd to use several gigabytes of RAM for a sort.

Could you post EXPLAIN ANALYZE for the query with only one row, as well as your table schema?

/* Steinar */
--
Homepage: http://www.sesse.net/
Kirk Wythers wrote:
> I am trying to do fairly simple joins on climate databases that should
> return ~7 million rows of data. However, I'm getting an error message
> on an OS X (10.4 Tiger Server) machine that seems to imply that I am
> running out of memory. The errors are:
>
> psql(15811) malloc: *** vm_allocate(size=8421376) failed (error code=3)

Is this actually in psql - the client code rather than the backend?

Could it be that it's allocating memory for its 7 million result rows and running out of space for your user account?

--
Richard Huxton
Archonet Ltd
Richard Huxton wrote:
> Kirk Wythers wrote:
>> I am trying to do fairly simple joins on climate databases that should
>> return ~7 million rows of data. However, I'm getting an error message
>> on an OS X (10.4 Tiger Server) machine that seems to imply that I am
>> running out of memory. The errors are:
>>
>> psql(15811) malloc: *** vm_allocate(size=8421376) failed (error code=3)
>
> Is this actually in psql - the client code rather than the backend?
>
> Could it be that it's allocating memory for its 7 million result rows
> and running out of space for your user account?

Hi,

If you look at the message carefully, it looks to me like the client is running out of memory. It can't allocate that 8.4 MB :)

Regards,
Akos
Gábriel Ákos <akos.gabriel@i-logic.hu> writes:
> Richard Huxton wrote:
>> Kirk Wythers wrote:
>>> I am trying to do fairly simple joins on climate databases that should
>>> return ~7 million rows of data.

> If you look at the message carefully, it looks to me like the client is
> running out of memory. It can't allocate that 8.4 MB :)

Right, the join result doesn't fit in the client's memory limit. This is not too surprising, as the out-of-the-box ulimit settings on Tiger appear to be

$ ulimit -a
core file size        (blocks, -c) 0
data seg size         (kbytes, -d) 6144
file size             (blocks, -f) unlimited
max locked memory     (kbytes, -l) unlimited
max memory size       (kbytes, -m) unlimited
open files                    (-n) 256
pipe size          (512 bytes, -p) 1
stack size            (kbytes, -s) 8192
cpu time             (seconds, -t) unlimited
max user processes            (-u) 266
virtual memory        (kbytes, -v) unlimited
$

6 meg of memory isn't gonna hold 7 million rows ... so either raise "ulimit -d" (quite a lot) or else use a cursor to fetch the result in segments.

regards, tom lane
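For the cursor route Tom mentions, a minimal sketch in psql might look like the following (the cursor name and batch size are illustrative, and the column list is trimmed from Kirk's query):

BEGIN;
DECLARE met_cur CURSOR FOR
    SELECT sites.station_id, weather.date, weather.tmin, weather.tmax
    FROM sites LEFT OUTER JOIN weather
         ON sites.station_id = weather.station_id;
FETCH FORWARD 10000 FROM met_cur;   -- repeat until it returns zero rows
CLOSE met_cur;
COMMIT;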
Thanks for the reply Steinar,

On Feb 2, 2007, at 8:41 AM, Steinar H. Gunderson wrote:
> On Fri, Feb 02, 2007 at 07:52:48AM -0600, Kirk Wythers wrote:
>> psql(15811) malloc: *** vm_allocate(size=8421376) failed (error code=3)
>> psql(15811) malloc: *** error: can't allocate region
>> psql(15811) malloc: *** set a breakpoint in szone_error to debug
>
> It sounds like you are out of memory. Have you tried reducing work_mem?
> Actually, what does your postgresql.conf look like with regard to
> memory settings?

I have not altered postgresql.conf. I assume these are the defaults:

# - Memory -
shared_buffers = 300            # min 16 or max_connections*2, 8KB each
#temp_buffers = 1000            # min 100, 8KB each
#max_prepared_transactions = 5  # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
#work_mem = 1024                # min 64, size in KB
#maintenance_work_mem = 16384   # min 1024, size in KB
#max_stack_depth = 2048         # min 100, size in KB

What about altering the sysctl values in /etc/rc to:

sysctl -w kern.sysv.shmmax=167772160
sysctl -w kern.sysv.shmmin=1
sysctl -w kern.sysv.shmmni=32
sysctl -w kern.sysv.shmseg=8
sysctl -w kern.sysv.shmall=65536

Right now they are:

sysctl -w kern.sysv.shmmax=4194304 kern.sysv.shmmin=1 kern.sysv.shmmni=32 kern.sysv.shmseg=8 kern.sysv.shmall=1024

>> This is a dual G5 box with 6 GB of RAM running PostgreSQL 8.1. I have
>> not tried altering kernel resources (as described in
>> http://www.postgresql.org/docs/8.1/interactive/kernel-resources.html#SHARED-MEMORY-PARAMETERS),
>> or compiling for 64 bit. I'm just not sure what to try next. Does
>> anyone have any suggestions?
>
> Compiling for 64 bit might very well help you, but it sounds odd to use
> several gigabytes of RAM for a sort.
>
> Could you post EXPLAIN ANALYZE for the query with only one row, as well
> as your table schema?

met_data=# EXPLAIN ANALYSE
SELECT sites.station_id, sites.longname, sites.lat, sites.lon,
       sites.thepoint_meter, weather.date, weather.year, weather.month,
       weather.day, weather.doy, weather.precip, weather.tmin, weather.tmax,
       weather.snowfall, weather.snowdepth, weather.tmean
FROM sites LEFT OUTER JOIN weather ON sites.station_id = weather.station_id
WHERE weather.station_id = 210018
  AND weather.year = 1893
  AND weather.doy = 365;

                              QUERY PLAN
-----------------------------------------------------------------------------
 Nested Loop  (cost=0.00..33426.63 rows=1 width=96) (actual time=2.140..101.122 rows=1 loops=1)
   ->  Index Scan using sites_pkey on sites  (cost=0.00..5.25 rows=1 width=60) (actual time=0.106..0.111 rows=1 loops=1)
         Index Cond: (210018 = station_id)
   ->  Index Scan using weather_pkey on weather  (cost=0.00..33421.37 rows=1 width=40) (actual time=2.011..100.983 rows=1 loops=1)
         Index Cond: (station_id = 210018)
         Filter: (("year" = 1893) AND (doy = 365))
 Total runtime: 101.389 ms
(7 rows)

The schema is public, but I'm not sure how to do an EXPLAIN ANALYSE on a schema.
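What Steinar asked for are the table definitions rather than another EXPLAIN; in psql those can be printed with the \d command (table names taken from the query above):

met_data=# \d sites
met_data=# \d weather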
On Feb 2, 2007, at 9:46 AM, Tom Lane wrote:
> Gábriel Ákos <akos.gabriel@i-logic.hu> writes:
>> Richard Huxton wrote:
>>> Kirk Wythers wrote:
>>>> I am trying to do fairly simple joins on climate databases that
>>>> should return ~7 million rows of data.
>
>> If you look at the message carefully, it looks to me like the client
>> is running out of memory. It can't allocate that 8.4 MB :)
>
> Right, the join result doesn't fit in the client's memory limit.
> This is not too surprising, as the out-of-the-box ulimit settings
> on Tiger appear to be
>
> $ ulimit -a
> core file size        (blocks, -c) 0
> data seg size         (kbytes, -d) 6144
> file size             (blocks, -f) unlimited
> max locked memory     (kbytes, -l) unlimited
> max memory size       (kbytes, -m) unlimited
> open files                    (-n) 256
> pipe size          (512 bytes, -p) 1
> stack size            (kbytes, -s) 8192
> cpu time             (seconds, -t) unlimited
> max user processes            (-u) 266
> virtual memory        (kbytes, -v) unlimited
> $
>
> 6 meg of memory isn't gonna hold 7 million rows ... so either raise
> "ulimit -d" (quite a lot) or else use a cursor to fetch the result
> in segments.

Thanks Tom... Any suggestions as to how much to raise ulimit -d? And how to raise ulimit -d?
On Fri, Feb 02, 2007 at 10:05:29AM -0600, Kirk Wythers wrote:
> Thanks Tom... Any suggestions as to how much to raise ulimit -d? And
> how to raise ulimit -d?

Try multiplying it by 100 for a start:

ulimit -d 614400

/* Steinar */
--
Homepage: http://www.sesse.net/
"Steinar H. Gunderson" <sgunderson@bigfoot.com> writes: > On Fri, Feb 02, 2007 at 10:05:29AM -0600, Kirk Wythers wrote: >> Thanks Tom... Any suggestions as to how much to raise ulimit -d? And >> how to raise ulimit -d? > Try multiplying it by 100 for a start: > ulimit -d 614400 Or just "ulimit -d unlimited" regards, tom lane
Tom, I tried ulimit -d 614400, but the query ended with the same error:

psql(21522) malloc: *** vm_allocate(size=8421376) failed (error code=3)
psql(21522) malloc: *** error: can't allocate region
psql(21522) malloc: *** set a breakpoint in szone_error to debug
out of memory for query result

I thought the message was telling me that I needed 8421376 for the query, so I tried bumping ulimit -d up another factor of ten to 6144000. However, that attempt gave the error:

truffula:~ kwythers$ ulimit -d 6144000
-bash: ulimit: data seg size: cannot modify limit: Operation not permitted

So I tried re-setting ulimit -d back to 6144, which worked, but now I can not seem to get ulimit -d to change again. It will not even allow ulimit -d 614400 (even though that worked a second ago). This seems very odd.

On Feb 2, 2007, at 10:11 AM, Tom Lane wrote:
> "Steinar H. Gunderson" <sgunderson@bigfoot.com> writes:
>> On Fri, Feb 02, 2007 at 10:05:29AM -0600, Kirk Wythers wrote:
>>> Thanks Tom... Any suggestions as to how much to raise ulimit -d? And
>>> how to raise ulimit -d?
>
>> Try multiplying it by 100 for a start:
>> ulimit -d 614400
>
> Or just "ulimit -d unlimited"
>
> regards, tom lane
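The "cannot modify limit" behavior is expected: by default bash's ulimit sets both the soft and the hard limit, and an unprivileged shell can lower its hard limit but never raise it again; a fresh shell restores the defaults. Setting only the soft limit avoids painting yourself into that corner (a sketch; the value is illustrative):

ulimit -S -d 614400    # adjust only the soft limit, leaving the hard limit untouched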
Kirk Wythers <kwythers@umn.edu> writes:
> However, setting ulimit to unlimited does not seem to solve the
> issue. Output from ulimit -a is:

Possibly a silly question, but you are running the client code under the shell session that you adjusted ulimit for, yes?

regards, tom lane
At this point there are no silly questions. But I am running the query under the shell session that I adjusted. I did discover that ulimit -d only changes the shell session that you issue the command in. So I changed ulimit -d to unlimited, connected to the db with psql db_name, then ran the select command (all in the same shell).

On Feb 2, 2007, at 11:59 AM, Tom Lane wrote:
> Kirk Wythers <kwythers@umn.edu> writes:
>> However, setting ulimit to unlimited does not seem to solve the
>> issue. Output from ulimit -a is:
>
> Possibly a silly question, but you are running the client code under
> the shell session that you adjusted ulimit for, yes?
>
> regards, tom lane
On Feb 2, 2007, at 10:11 AM, Tom Lane wrote:
> "Steinar H. Gunderson" <sgunderson@bigfoot.com> writes:
>> On Fri, Feb 02, 2007 at 10:05:29AM -0600, Kirk Wythers wrote:
>>> Thanks Tom... Any suggestions as to how much to raise ulimit -d? And
>>> how to raise ulimit -d?
>
>> Try multiplying it by 100 for a start:
>> ulimit -d 614400
>
> Or just "ulimit -d unlimited"

Thanks to everyone so far. However, setting ulimit to unlimited does not seem to solve the issue. Output from ulimit -a is:

truffula:~ kwythers$ ulimit -a
core file size        (blocks, -c) 0
data seg size         (kbytes, -d) unlimited
file size             (blocks, -f) unlimited
max locked memory     (kbytes, -l) unlimited
max memory size       (kbytes, -m) unlimited
open files                    (-n) 256
pipe size          (512 bytes, -p) 1
stack size            (kbytes, -s) 8192
cpu time             (seconds, -t) unlimited
max user processes            (-u) 100
virtual memory        (kbytes, -v) unlimited

Also, changes to kernel settings in /etc/rc include:

sysctl -w kern.sysv.shmmax=167772160
sysctl -w kern.sysv.shmmin=1
sysctl -w kern.sysv.shmmni=32
sysctl -w kern.sysv.shmseg=8
sysctl -w kern.sysv.shmall=65536

However, I'm still getting the memory error:

met_data=# SELECT sites.station_id, sites.longname, sites.lat, sites.lon,
       sites.thepoint_meter, weather.date, weather.year, weather.month,
       weather.day, weather.doy, weather.precip, weather.tmin, weather.tmax,
       weather.snowfall, weather.snowdepth, weather.tmean
FROM sites LEFT OUTER JOIN weather ON sites.station_id = weather.station_id;
psql(532) malloc: *** vm_allocate(size=8421376) failed (error code=3)
psql(532) malloc: *** error: can't allocate region
psql(532) malloc: *** set a breakpoint in szone_error to debug
out of memory for query result

Any other ideas out there?
Kirk Wythers <kwythers@umn.edu> writes:
> However, setting ulimit to unlimited does not seem to solve the
> issue.

After some experimentation I'm left wondering exactly what ulimit's -d option is for on OS X, because it sure doesn't seem to be limiting process data size. (I should have been suspicious of a value as small as 6 meg, anyway.)

I tried selecting a large unconstrained join on my own OS X machine, and what I saw (watching with "top") was that the psql process VSIZE went up to 1.75Gb before it failed with the same error as Kirk got:

regression=# select * from tenk1 a, tenk1 b;
psql(16572) malloc: *** vm_allocate(size=8421376) failed (error code=3)
psql(16572) malloc: *** error: can't allocate region
psql(16572) malloc: *** set a breakpoint in szone_error to debug

Since this is just a bog-standard Mini with 512M memory, it was pretty thoroughly on its knees by this point :-(. I'm not sure how to find out about allocated swap space in OS X, but my bet is that the above message should be understood as "totally out of virtual memory".

My suggestion is to use a cursor to retrieve the data in more manageably-sized chunks than 7M rows. (If you don't want to mess with managing a cursor explicitly, as of 8.2 there's a psql variable FETCH_COUNT that can be set to make it happen behind the scenes.)

regards, tom lane
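The FETCH_COUNT route is a one-liner in an 8.2 psql session (the batch size of 10000 is arbitrary; the query is abbreviated from the one above):

met_data=# \set FETCH_COUNT 10000
met_data=# SELECT ... FROM sites LEFT OUTER JOIN weather
           ON sites.station_id = weather.station_id;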
Tom,

On 2/2/07 2:18 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> as of 8.2 there's a psql variable
> FETCH_COUNT that can be set to make it happen behind the scenes.)

FETCH_COUNT is a godsend and works beautifully for exactly this purpose.

Now he's got to worry about how to page through 8GB of results in something less than geological time with the space bar ;-)

- Luke
Luke Lonergan wrote:
> Tom,
>
> On 2/2/07 2:18 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>
>> as of 8.2 there's a psql variable
>> FETCH_COUNT that can be set to make it happen behind the scenes.)
>
> FETCH_COUNT is a godsend and works beautifully for exactly this purpose.
>
> Now he's got to worry about how to page through 8GB of results in
> something less than geological time with the space bar ;-)

\o /tmp/really_big_cursor_return

;)

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
> \o /tmp/really_big_cursor_return
>
> ;)

Tough crowd :-D

- Luke
On Feb 2, 2007, at 7:53 PM, Luke Lonergan wrote:
> Tough crowd :-D

No kidding ;-)
Luke Lonergan wrote:
>> \o /tmp/really_big_cursor_return
>>
>> ;)
>
> Tough crowd :-D

Yeah well Andrew probably would have said use sed and pipe it through awk to get the data you want.

Joshua D. Drake
On Feb 2, 2007, at 7:39 PM, Luke Lonergan wrote:
> Tom,
>
> On 2/2/07 2:18 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>
>> as of 8.2 there's a psql variable
>> FETCH_COUNT that can be set to make it happen behind the scenes.)
>
> FETCH_COUNT is a godsend and works beautifully for exactly this purpose.
>
> Now he's got to worry about how to page through 8GB of results in
> something less than geological time with the space bar ;-)

I actually have no intention of paging through the results, but rather need to use the query to get the results into a new table with UPDATE, so that a GIS system can do some interpolations with subsets of the results.
Joshua D. Drake wrote:
> Luke Lonergan wrote:
>>> \o /tmp/really_big_cursor_return
>>>
>>> ;)
>> Tough crowd :-D
>
> Yeah well Andrew probably would have said use sed and pipe it through
> awk to get the data you want.

Chances are, if you're using awk, you shouldn't need sed. :)

--
Until later, Geoffrey

Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety.
 - Benjamin Franklin
Geoffrey wrote:
> Joshua D. Drake wrote:
>> Luke Lonergan wrote:
>>>> \o /tmp/really_big_cursor_return
>>>>
>>>> ;)
>>> Tough crowd :-D
>>
>> Yeah well Andrew probably would have said use sed and pipe it through
>> awk to get the data you want.
>
> Chances are, if you're using awk, you shouldn't need sed. :)

Chances are.. if you are using awk or sed, you should use perl ;)

Joshua D. Drake
Kirk Wythers <kwythers@umn.edu> writes:
> On Feb 2, 2007, at 7:39 PM, Luke Lonergan wrote:
>> Now he's got to worry about how to page through 8GB of results in
>> something less than geological time with the space bar ;-)

> I actually have no intention of paging through the results, but
> rather need to use the query to get the results into a new table with
> UPDATE, so that a GIS system can do some interpolations with subsets
> of the results.

Er ... then why are you SELECTing the data at all? You can most likely get it done much faster if the data stays inside the database engine.

regards, tom lane
On Feb 2, 2007, at 8:32 PM, Tom Lane wrote:
> Kirk Wythers <kwythers@umn.edu> writes:
>> On Feb 2, 2007, at 7:39 PM, Luke Lonergan wrote:
>>> Now he's got to worry about how to page through 8GB of results in
>>> something less than geological time with the space bar ;-)
>
>> I actually have no intention of paging through the results, but
>> rather need to use the query to get the results into a new table with
>> UPDATE, so that a GIS system can do some interpolations with subsets
>> of the results.
>
> Er ... then why are you SELECTing the data at all? You can most likely
> get it done much faster if the data stays inside the database engine.

The new table needs to be filled with the results of the join. If there is a way to do this without a SELECT, please share.
Kirk Wythers wrote:
> On Feb 2, 2007, at 8:32 PM, Tom Lane wrote:
>> Er ... then why are you SELECTing the data at all? You can most likely
>> get it done much faster if the data stays inside the database engine.
>
> The new table needs to be filled with the results of the join. If there
> is a way to do this without a SELECT, please share.

INSERT INTO foo SELECT * FROM bar JOIN baz USING (id)

Joshua D. Drake
Kirk Wythers <kwythers@umn.edu> writes:
> The new table needs to be filled with the results of the join. If
> there is a way to do this without a SELECT, please share.

If it's an entirely new table, then you probably want to use INSERT ... SELECT. If what you want is to update existing rows using a join, you can use UPDATE ... FROM (not standard) or something involving a sub-select. You'd need to state your problem in some detail to get more help than that...

regards, tom lane
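Applied to the tables in this thread, the two forms Tom describes might look like the following sketch (the station_weather table and its column list are assumptions for illustration):

-- entirely new table, filled straight from the join:
CREATE TABLE station_weather AS
SELECT sites.station_id, sites.lat, sites.lon,
       weather.date, weather.tmin, weather.tmax, weather.tmean
FROM sites LEFT OUTER JOIN weather
     ON sites.station_id = weather.station_id;

-- or, updating existing rows via a join (PostgreSQL's UPDATE ... FROM):
UPDATE station_weather
SET lat = sites.lat, lon = sites.lon
FROM sites
WHERE station_weather.station_id = sites.station_id;

Either way, the 7 million rows never leave the server, so the client-side malloc failure can't occur.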
Tom Lane wrote:
> Since this is just a bog-standard Mini with 512M memory, it was pretty
> thoroughly on its knees by this point :-(. I'm not sure how to find out
> about allocated swap space in OS X, but my bet is that the above message
> should be understood as "totally out of virtual memory".

Just so you can look into it for your own curiosity ;-) - Mac OS X uses the startup disk for VM storage. You can find the files in /var/vm

You will find the swapfiles there; they get progressively larger - swapfile0 and 1 are 64M, then 2 is 128M, 3 is 256M, 4 is 512M, 5 is 1G.... Each is preallocated, so it only gives you a rough idea of how much vm is being used. You would run out when your startup disk is full, though most apps probably hit the wall at 4G of vm unless you have built a 64-bit version.

The 4G (32-bit) limit may be where you hit the out of memory errors (or does postgres get around that with its caching?).

--
Shane Ambler
pgSQL@007Marketing.com

Get Sheeky @ http://Sheeky.Biz
On Feb 3, 2007, at 9:59 AM, Shane Ambler wrote:
> Just so you can look into it for your own curiosity ;-) - Mac OS X
> uses the startup disk for VM storage. You can find the files in /var/vm
>
> You will find the swapfiles there; they get progressively larger -
> swapfile0 and 1 are 64M, then 2 is 128M, 3 is 256M, 4 is 512M, 5 is
> 1G.... Each is preallocated, so it only gives you a rough idea of how
> much vm is being used. You would run out when your startup disk is
> full, though most apps probably hit the wall at 4G of vm unless you
> have built a 64-bit version.
>
> The 4G (32-bit) limit may be where you hit the out of memory errors
> (or does postgres get around that with its caching?).

Any idea if postgres on OS X can truly access more than 4 gigs if the 64-bit version is built? I have tried building the 64-bit version of some other apps on OS X, and I have never been convinced that they behaved as true 64 bit.
Kirk Wythers wrote:
>> The 4G (32-bit) limit may be where you hit the out of memory errors
>> (or does postgres get around that with its caching?).
>
> Any idea if postgres on OS X can truly access more than 4 gigs if the
> 64-bit version is built? I have tried building the 64-bit version of
> some other apps on OS X, and I have never been convinced that they
> behaved as true 64 bit.

I haven't tried myself.

--
Shane Ambler