Thread: Reducing memory usage of insert into select operations?
Redhat 4, postgresql 8.3.3, Memory: 12GB

While doing a couple of operations of the type

insert into <table> select from <some other table>

the OS triggered the out of memory killer (oom-killer). After some research and trial/error I found it was the inserts. I see one of the inserts is using up 12GB! How can I reduce the usage?

postgresql.conf settings:
shared_buffers = 3GB
temp_buffers = 64MB           # min 800kB
work_mem = 256MB              # min 64kB
maintenance_work_mem = 1GB

Would reducing work_mem help?

The table I am selecting from has a few hundred million rows. The table I am inserting into has partitions. I am benchmarking breaking up a large table into smaller partitions.
On Thu, Jul 17, 2008 at 7:21 PM, Francisco Reyes <lists@stringsutils.com> wrote:
> Redhat 4
> postgresql 8.3.3
> Memory: 12GB
>
> While doing a couple of operations of the type
> insert into <table> select from <some other table>
>
> The OS triggered the out of memory killer (oom-killer).

Is this a 32-bit installation or 64-bit?  3GB of shared_buffers is way
too big for a 32-bit setup.

-Doug
On Thu, Jul 17, 2008 at 9:27 PM, Francisco Reyes <lists@stringsutils.com> wrote:
> Douglas McNaught writes:
>
>> Is this a 32-bit installation or 64-bit? 3GB of shared_buffers is way
>> too big for a 32-bit setup.
>
> 64-bit.
> The machine has 12GB of RAM so shared_buffers is about 1/3.
> Dedicated DB server.

Ahh, good.  Just wanted to answer the obvious question first.  Some
people set shared_buffers really high on 32-bit systems and then are
surprised when it doesn't work well.

It does seem that reducing work_mem might help you, but others on this
list are much more expert than I in diagnosing this sort of problem.
It would probably be helpful for you to post the EXPLAIN output from
your query, so they can see which part of the plan causes the large
memory usage.

-Doug
Douglas McNaught writes:
> It does seem that reducing work_mem might help you, but others on this

I reduced it from 256MB to 64MB and it seems to be helping. At 256MB the
usage per DB connection was upwards of 12GB. At 64MB so far it is around
7GB. I just reduced it further to 32MB to see how that works.

> It would probably be helpful for you to post the EXPLAIN output from

Hash Join  (cost=712213.57..27293913.33 rows=234402352 width=24)
  Hash Cond: (coh.id = ids.id)
  ->  Hash Join  (cost=551387.26..18799378.16 rows=234402352 width=22)
        Hash Cond: (coh.user_id = ca.user_id)
        ->  Seq Scan on customer_original_historical coh  (cost=0.00..6702501.40 rows=234402352 width=47)
              Filter: (yearmo > '200703'::bpchar)
        ->  Hash  (cost=268355.67..268355.67 rows=14637567 width=32)
              ->  Seq Scan on cards ca  (cost=0.00..268355.67 rows=14637567 width=32)
  ->  Hash  (cost=77883.25..77883.25 rows=5055525 width=6)
        ->  Seq Scan on customer_ids ids  (cost=0.00..77883.25 rows=5055525 width=6)
Douglas McNaught writes:
> Is this a 32-bit installation or 64-bit? 3GB of shared_buffers is way
> too big for a 32-bit setup.

64-bit.
The machine has 12GB of RAM so shared_buffers is about 1/3.
Dedicated DB server.
Francisco Reyes wrote:
> The OS triggered the out of memory killer (oom-killer).
> The table I am selecting from has a few hundred million rows.
> The table I am inserting into has partitions. I am benchmarking breaking
> up a large table into smaller partitions.

Is the partition split done with triggers or rules?

-- 
  Richard Huxton
  Archonet Ltd
On Fri, Jul 18, 2008 at 12:18 AM, Francisco Reyes <lists@stringsutils.com> wrote:
> Douglas McNaught writes:
>
>> It does seem that reducing work_mem might help you, but others on this
>
> I reduced it from 256MB to 64MB. It seems it is helping.

You should also look at your memory overcommit settings (in
/proc/sys/vm).  You can set things up so that Postgres gets a malloc()
failure (which it is generally prepared to cope with cleanly) when the
system runs out of RAM, rather than having the OOM killer go off and
hit it with SIGKILL.  Overcommit is useful in some contexts (Java apps
tend to map a lot more memory than they actually use) but for a
dedicated database server you really don't ever want to have the OOM
killer triggered.

-Doug
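[Editor's note: for readers following along, the overcommit behaviour Doug describes is controlled by a kernel setting; a minimal sketch of disabling it on a Linux box of this era (mode 2 means strict accounting; verify the exact values against your kernel's documentation):]

```shell
# Disable memory overcommit so a runaway backend gets a malloc()
# failure instead of a SIGKILL from the OOM killer (requires root).
sysctl -w vm.overcommit_memory=2

# To persist across reboots, add to /etc/sysctl.conf:
#   vm.overcommit_memory = 2
```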
On 9:53 am 07/18/08 "Douglas McNaught" <doug@mcnaught.org> wrote:
> dedicated database server you really don't ever want to have the OOM
> killer triggered.

Found that yesterday (vm.overcommit_memory=2).
Agree that this is better than OOM. I still ran out of memory last night,
and postgres just failed on the malloc(), which as you mentioned is better.

Reduced work_mem to 8MB and trying again.
On 8:13 am 07/18/08 Richard Huxton <dev@archonet.com> wrote:
> Is the partition split done with triggers or rules?

I have a single trigger+function combo that dynamically computes which
partition the data has to go to.
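[Editor's note: the poster's actual function is not shown; for readers unfamiliar with 8.3-style inheritance partitioning, a minimal sketch of a routing trigger follows. All table, column, and function names here are hypothetical, and a real "dynamic" version would compute the target partition rather than hard-coding branches:]

```sql
-- Hypothetical sketch of a partition-routing trigger for 8.3-style
-- inheritance partitioning (names invented for illustration).
CREATE OR REPLACE FUNCTION customer_transactions_route() RETURNS trigger AS $$
BEGIN
    IF NEW.groupid = 1 THEN
        INSERT INTO customer_transactions_g1 VALUES (NEW.*);
    ELSIF NEW.groupid = 2 THEN
        INSERT INTO customer_transactions_g2 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for groupid %', NEW.groupid;
    END IF;
    RETURN NULL;  -- row already routed; skip the insert into the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER route_insert BEFORE INSERT ON customer_transactions
    FOR EACH ROW EXECUTE PROCEDURE customer_transactions_route();
```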
Francisco Reyes wrote:
> On 8:13 am 07/18/08 Richard Huxton <dev@archonet.com> wrote:
>> Is the partition split done with triggers or rules?
>
> I have a single trigger+function combo that dynamically computes which
> partition the data has to go to.

I'm wondering whether it's memory usage either for the trigger itself, or
for the function (pl/pgsql?). If you're doing something like:

  INSERT INTO partitioned_table SELECT * FROM big_table

then that's not only taking place within a single transaction, but within
a single statement.

Without being a hacker, I'd say it's entirely plausible that PG might not
clean up trigger events until the end of a statement, meaning you would
need memory for 200million+ triggers.

Alternatively, it could be a memory leak somewhere in the pl/pgsql or
trigger code. It wouldn't have to be much to affect this particular case.

What happens if you do the insert/select in stages, but all in one
transaction? Do you see PG's memory requirement stay constant, or grow in
steps? That will show whether the memory is growing over the duration of a
statement or a transaction.

  BEGIN;
  INSERT ... SELECT ... WHERE id BETWEEN 0 AND 999999;
  INSERT ... SELECT ... WHERE id BETWEEN 1000000 AND 1999999;
  ...
  COMMIT;

-- 
  Richard Huxton
  Archonet Ltd
On 11:25 am 07/18/08 Richard Huxton <dev@archonet.com> wrote:
> I'm wondering whether it's memory usage either for the trigger
> itself, or for the function (pl/pgsql?).

Good point.

> If you're doing something like:
> INSERT INTO partitioned_table SELECT * FROM big_table
> then that's not only taking place within a single transaction, but
> within a single statement.

Correct.
I have kept decreasing work_mem and that does not seem to help.

> Without being a hacker, I'd say it's entirely plausible that PG might
> clean up triggers at the end of a statement meaning you would need
> memory for 200million+ triggers.

Sure hope that is not the case.

> Alternatively, it could be a memory-leak somewhere in the pl/pgsql or
> trigger code. Wouldn't have to be much to affect this particular case.

Will post an strace.

> What happens if you do the insert/select in stages but all in one
> transaction?

Will test.
The data is about a year's worth. I will try to do one month at a time,
within a single transaction. A single month finishes fine.

> Do you see PG's memory requirement stay constant or grow
> in steps. That will show whether the memory is growing over the
> duration of a statement or a transaction.

Right now, for the single statement/transaction (the one big process), it
is growing slowly over time. It may be a leak. It seems to start growing
somewhere between the first and second hour, and it always seems to fail
around four hours. I wrote a little process that shows the amount of free
memory every 15 minutes.

I will post an strace for the big process, and then will try breaking the
process down by month, but within a single transaction, and report results
later.
On 11:25 am 07/18/08 Richard Huxton <dev@archonet.com> wrote:

Strace of the single/large process.
Again, all the query is doing is

insert into <table> select <subquery>

The strace is pretty much a repetition of the lines below.

semop(557057, 0x7fbfffdfb0, 1) = 0
lseek(100, 0, SEEK_END) = 671719424
write(100, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
lseek(508, 0, SEEK_END) = 55697408
write(508, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
read(381, "\0\0\0&\0\224\21\0\225o\10\0\30\331c\0c\225%w(\0\0\0\0\0\0\0\0\0\5\0"..., 8192) = 8192
semop(557057, 0x7fbfffd1a0, 1) = 0
lseek(100, 0, SEEK_END) = 671727616
write(100, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
semop(557057, 0x7fbfffd1c0, 1) = 0
semop(557057, 0x7fbfffd1a0, 1) = 0
semop(557057, 0x7fbfffd1c0, 1) = 0
read(381, "w\317\21\0<]9\0\177\246eA(\0\0\0\0\0\0\0\0\0\5\0\2\0\30\0.\v\0\0"..., 8192) = 8192
semop(557057, 0x7fbfffd1a0, 1) = 0
lseek(512, 0, SEEK_END) = 48144384
write(512, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
semop(557057, 0x7fbfffd1c0, 1) = 0
lseek(100, 0, SEEK_END) = 671735808
write(100, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
lseek(517, 0, SEEK_END) = 89309184
write(517, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
semop(557057, 0x7fbfffd1c0, 1) = 0
semop(557057, 0x7fbfffddd0, 1) = 0
lseek(100, 0, SEEK_END) = 671744000
write(100, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
read(381, ">\212\225\202(\0\0\0\0\0\0\0\0\0\5\0\2\0\30\0000\v\0\0\1\0\23\2\0\0\0\t"..., 8192) = 8192
lseek(510, 0, SEEK_END) = 29351936
write(510, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
lseek(100, 0, SEEK_END) = 671752192
write(100, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
semop(557057, 0x7fbfffddf0, 1) = 0
read(381, "\0\0\0\0\0\0\5\0\2\0\30\0001\v\0\0\0\0\23\2\0\0\0\30\0\4\20\0\302\326\0\0"..., 8192) = 8192
lseek(513, 0, SEEK_END) = 19316736
write(513, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
lseek(100, 0, SEEK_END) = 671760384
write(100, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
read(381, "\2\0\30\0004\v\0\0\0\0\23\2\0\0\0\16\0\254\r\0\t+\35\0\301\217@\0\304\346U\241"..., 8192) = 8192
lseek(100, 0, SEEK_END) = 671768576
write(100, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
lseek(518, 0, SEEK_END) = 55025664
write(518, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = 8192
semop(557057, 0x7fbfffd1c0, 1) = 0
semop(557057, 0x7fbfffd1c0, 1) = 0
semop(557057, 0x7fbfffd1c0, 1) = 0
lseek(100, 0, SEEK_END) = 671776768
On Fri, Jul 18, 2008 at 10:40:02AM -0400, Francisco Reyes wrote:
> Found that yesterday (vm.overcommit_memory=2).
> Agree that this is better than OOM. I still ran out of memory last night
> and postgres just failed on the malloc(), which as you mentioned is better.
>
> Reduced work_mem to 8MB and trying again.

Perhaps you can try reducing shared_buffers, to see if that helps more?
8MB is quite small for work_mem. More shared_buffers is not necessarily
better.

Also, how much swap are you running? Overcommit disabled while not having
any swap set up is a great way to ensure you run out of memory quickly.

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
Francisco Reyes wrote:
> On 11:25 am 07/18/08 Richard Huxton <dev@archonet.com> wrote:
>
> Strace of the single/large process.
> Again, all the query is doing is
> insert into <table> select <subquery>
>
> The strace is pretty much a repetition of the lines below.

Do you have long-running transactions?  (For example transactions that
have been idle for a long time.)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
On 12:03 pm 07/18/08 Martijn van Oosterhout <kleptog@svana.org> wrote:
> Perhaps you can try reducing the shared_buffers, to see if that helps
> more?

Will try.

> 8MB is quite small for work_mem. More shared_buffers is not
> necessarily better.

Ok, but from everything I had read, shared_buffers of 1/4 of RAM seemed
like a reasonable starting point. Will try reducing it to 2GB.

> Also, how much swap are you running?

Started out with 12GB (same as memory) and last night I added 24GB more.
I had 2 instances of the inserts going, so each exhausted about 18GB of RAM!
On 12:23 pm 07/18/08 Alvaro Herrera <alvherre@commandprompt.com> wrote:
> Do you have long-running transactions? (For example transactions that
> have been idle for a long time).

No. The two inserts I was running were the only processes. I even did a
restart to make sure there was absolutely nothing else running and to make
sure all my postgresql.conf settings were in effect.

Given that memory grows over time, I am beginning to wonder if it is some
type of memory leak.

Just installed the postgresql debug rpm, but I am not sure it did
anything; the strace doesn't look any different:

read(81, "2\1\0\0\260~!\16\1\0\0\0\370\1\0\2\0 \4 \0\0\0\0\300\237r\0\200\237r\0"..., 8192) = 8192
write(191, "Q=J\313\253]1\0\0\0\1\0007\33\4\0\2\0\2\t\30\0\3\302\204\0;a1OjG"..., 8192) = 8192
write(160, "XQxbqQEx+yo=H\333o\2371\0\0\0\1\0.\33C\0\2\0\2\t\30\0"..., 8192) = 8192
read(81, "2\1\0\0\320(\301\17\1\0\0\0\370\1\0\2\0 \4 \0\0\0\0\300\237r\0\200\237r\0"..., 8192) = 8192
"Francisco Reyes" <lists@stringsutils.com> writes: > Given that memory grows over time I am beggining to wonder if it is some > type of memory leak. Are there any AFTER triggers (including foreign key constraints) on the table being inserted into? If so the list of pending trigger events might be your problem. If you can get Postgres to report an actual out-of-memory error (as opposed to crashing from OOM kill) then it should dump a memory usage map into the postmaster log. Looking at that would be informative. regards, tom lane
On 1:00 pm 07/18/08 Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Are there any AFTER triggers (including foreign key constraints)

I have two foreign key constraints.

> the table being inserted into? If so the list of pending trigger
> events might be your problem.

I guess I can try disabling the foreign keys, but that would be less than
ideal for production. This is an analytics environment so all operations
are in bulk.

> If you can get Postgres to report an actual out-of-memory error (as
> opposed to crashing from OOM kill)

Disabled oom with vm.overcommit_memory=2.

> then it should dump a memory usage
> map into the postmaster log. Looking at that would be informative.

Got it.
----------------------
AfterTriggerEvents: 10553909248 total in 1268 blocks; 20432 free (6 chunks); 10553888816 used
ExecutorState: 122880 total in 4 blocks; 68040 free (8 chunks); 54840 used
Operator lookup cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used
Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
MessageContext: 131072 total in 5 blocks; 50712 free (291 chunks); 80360 used
smgr relation table: 24576 total in 2 blocks; 3584 free (4 chunks); 20992 used
TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used
Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used
PortalHeapMemory: 1024 total in 1 blocks; 768 free (0 chunks); 256 used
ExecutorState: 98784 total in 8 blocks; 24064 free (22 chunks); 74720 used
ExprContext: 8192 total in 1 blocks; 8016 free (0 chunks); 176 used
HashTableContext: 8192 total in 1 blocks; 8064 free (1 chunks); 128 used
HashBatchContext: 532676656 total in 74 blocks; 1863936 free (5 chunks); 530812720 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: 415227952 total in 59 blocks; 6589744 free (5 chunks); 408638208 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8136 free (0 chunks); 56 used
Relcache by OID: 24576 total in 2 blocks; 8672 free (3 chunks); 15904 used
CacheMemoryContext: 2390256 total in 22 blocks; 751904 free (2 chunks); 1638352 used
CachedPlan: 1024 total in 1 blocks; 336 free (0 chunks); 688 used
CachedPlanSource: 1024 total in 1 blocks; 80 free (0 chunks); 944 used
SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
CachedPlan: 7168 total in 3 blocks; 3120 free (0 chunks); 4048 used
CachedPlanSource: 7168 total in 3 blocks; 1816 free (0 chunks); 5352 used
SPI Plan: 1024 total in 1 blocks; 784 free (0 chunks); 240 used
CachedPlan: 3072 total in 2 blocks; 792 free (0 chunks); 2280 used
CachedPlanSource: 7168 total in 3 blocks; 3600 free (0 chunks); 3568 used
SPI Plan: 1024 total in 1 blocks; 800 free (0 chunks); 224 used
pg_cast_source_target_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
pg_language_oid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_toast_2619_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
pg_amop_opr_fam_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
tcf_mnfoids_partid: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
tcf_mnfoids_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
cards_cardnum_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
cards_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
tcf_original_trans_partid_cardnum: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
tcf_original_trans_yearmo: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
pg_constraint_contypid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_constraint_conname_nsp_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
pg_operator_oprname_l_r_n_index: 2048 total in 1 blocks; 392 free (0 chunks); 1656 used
pg_proc_proname_args_nsp_index: 2048 total in 1 blocks; 584 free (0 chunks); 1464 used
pg_proc_oid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_shdepend_reference_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
pg_namespace_oid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_statistic_relid_att_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
pg_inherits_relid_seqno_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
pg_constraint_oid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_attribute_relid_attnam_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
pg_attrdef_oid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_shdepend_depender_index: 2048 total in 1 blocks; 584 free (0 chunks); 1464 used
pg_type_typname_nsp_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
pg_type_oid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_depend_depender_index: 2048 total in 1 blocks; 584 free (0 chunks); 1464 used
pg_depend_reference_index: 2048 total in 1 blocks; 584 free (0 chunks); 1464 used
pg_index_indrelid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_description_o_c_o_index: 2048 total in 1 blocks; 584 free (0 chunks); 1464 used
pg_constraint_conrelid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_attrdef_adrelid_adnum_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
pg_class_relname_nsp_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
pg_namespace_nspname_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_authid_oid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
pg_opclass_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
pg_amop_fam_strat_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used
pg_trigger_tgrelid_tgname_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
pg_attribute_relid_attnum_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
pg_class_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
pg_amproc_fam_proc_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used
pg_index_indexrelid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
pg_rewrite_rel_rulename_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
pg_operator_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
MdSmgr: 24576 total in 2 blocks; 15872 free (0 chunks); 8704 used
LOCALLOCK hash: 57344 total in 3 blocks; 36384 free (11 chunks); 20960 used
Timezones: 53584 total in 2 blocks; 3744 free (0 chunks); 49840 used
ErrorContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ERROR: out of memory
DETAIL: Failed on request of size 40.
"Francisco Reyes" <lists@stringsutils.com> writes: > On 1:00 pm 07/18/08 Tom Lane <tgl@sss.pgh.pa.us> wrote: >> If you can get Postgres to report an actual out-of-memory error (as >> opposed to crashing from OOM kill) >> then it should dump a memory usage >> map into the postmaster log. Looking at that would be informative. > Got it. > AfterTriggerEvents: 10553909248 total in 1268 blocks; 20432 free (6 > chunks); 10553888816 used Well, that's definitely your problem ... > HashBatchContext: 532676656 total in 74 blocks; 1863936 free (5 > chunks); 530812720 used > HashBatchContext: 415227952 total in 59 blocks; 6589744 free (5 > chunks); 408638208 used although these numbers seem way outta line too. What did you say you had work_mem set to? regards, tom lane
On 3:55 pm 07/18/08 Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> AfterTriggerEvents: 10553909248 total in 1268 blocks; 20432 free (6
>> chunks); 10553888816 used
>
> Well, that's definitely your problem ...

So I need to remove the foreign key constraints?

>> HashBatchContext: 415227952 total in 59 blocks; 6589744
>> free (5 chunks); 408638208 used
>
> although these numbers seem way outta line too. What did you say you
> had work_mem set to?

Initially, on the first crash, it was 256MB. I believe at the time of the
crash for which I got the dump it was down to 64MB or 8MB; I kept trying
lower values. Also tried reducing shared_buffers, as someone suggested.

I will bump my shared_buffers back to 3GB and work_mem back to 64MB.
On 3:55 pm 07/18/08 Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> AfterTriggerEvents: 10553909248 total in 1268 blocks; 20432 free (6
>> chunks); 10553888816 used
>
> Well, that's definitely your problem ...

What is the overhead for each AfterTriggerEvent? I guess I can write a
program to process so many rows at a time, if I know how much overhead
each AfterTriggerEvent uses. I know 15 million at a time worked fine, so
I could do 5 or 10 million at a time.

When does the memory usage for those AfterTriggerEvents get released? At
commit?
"Francisco Reyes" <lists@stringsutils.com> writes: > On 3:55 pm 07/18/08 Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> AfterTriggerEvents: 10553909248 total in 1268 blocks; 20432 free (6 >>> chunks); 10553888816 used >> >> Well, that's definitely your problem ... > So I need to remove the foreign constraints? Either that or do the update in sections. But working through umpteen gig of pending trigger events would take forever anyway --- dropping and re-adding the FK constraint is almost certainly a better way. >>> HashBatchContext: 415227952 total in 59 blocks; 6589744 >>> free (5 chunks); 408638208 used >> >> although these numbers seem way outta line too. What did you say you >> had work_mem set to? > Initially on the first crash it was 256MB. I believe at the time of the > crash I got the dump for it was down to 64MB or 8MB. Something fishy about that. The max size of a HashBatchContext should be work_mem, more or less (the accounting isn't perfectly accurate I think, but it's not off by an order of magnitude). The only thing I can think of is that you had a huge number of rows with all the same hash value, so that there wasn't any way to split the batch into smaller sections. What are the join keys exactly in this query, and what can you tell us about their data distributions? regards, tom lane
"Francisco Reyes" <lists@stringsutils.com> writes: > What is the overhead for each AfterTriggerEvent? On a 64-bit machine it looks like they'd cost you about 80 bytes each :-(. A good deal of that is palloc overhead --- I wonder if we should get rid of the separate-palloc-for-each-event design? > When does the memory usage for those AfterTriggerEvents gets released? At > commit? Whenever the check is done; you'd have to read the rules about deferred constraints ... regards, tom lane
On 4:55 pm 07/18/08 Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The only thing I can think of is that you had a huge number of rows
> with all the same hash value, so that there wasn't any way to split
> the batch into smaller sections. What are the join keys exactly in
> this query, and what can you tell us about their data distributions?

I can't put actual table or column names, so I am putting the actual
select and explain with all names changed.

insert into customer_transactions
    (record_id, date, type, amount, ids, groupid)
select ca.record_id, coh.date, coh.type, coh.amount, coh.ids, ids.groupid
from customer_original_historical coh,
     cards ca,
     customer_ids ids
where ca.natural_key = coh.natural_key
and ids.ids = coh.ids
and coh.yearmo > '200703';

Hash Join  (cost=712213.57..27293913.33 rows=234402352 width=24)
  Hash Cond: (coh.id = ids.id)
  ->  Hash Join  (cost=551387.26..18799378.16 rows=234402352 width=22)
        Hash Cond: (coh.user_id = ca.user_id)
        ->  Seq Scan on customer_original_historical coh  (cost=0.00..6702501.40 rows=234402352 width=47)
              Filter: (yearmo > '200703'::bpchar)
        ->  Hash  (cost=268355.67..268355.67 rows=14637567 width=32)
              ->  Seq Scan on cards ca  (cost=0.00..268355.67 rows=14637567 width=32)
  ->  Hash  (cost=77883.25..77883.25 rows=5055525 width=6)
        ->  Seq Scan on customer_ids ids  (cost=0.00..77883.25 rows=5055525 width=6)

There was a single table, customer_original_historical, which used a
natural key with a text field. Most queries used
customer_original_historical by itself, or joined against a single other
table, which we should call "area".

The new schema I am testing splits that one single table into 12 tables
per month. In addition I replaced the natural keys with a synthetic
integer key. I also replaced the "area" table with a customer_ids table
which has only two columns: a synthetic key for historical and a region.
In order to have 12 tables per month I grouped all the regions into 12
groups.
Queries are usually within a single region, so what I am trying to
benchmark is whether dividing 24 months of data into 24 sets of 12
regions will perform better than a single large table.

The distribution of the joins is:
There are about 1,000,000 unique natural keys.
Each natural key has on average 15 rows per month.
ids are regions where the natural_keys are. Figure tens of thousands of
natural_keys to an id.

Is that along the lines of what you were looking for?
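[Editor's note: as a sanity check, the distribution described above is consistent with the planner's row estimate from the earlier EXPLAIN, assuming yearmo > '200703' covers the period up to roughly mid-2008:]

```python
# Cross-check the stated data distribution against the EXPLAIN estimate.
keys = 1_000_000              # unique natural keys
rows_per_key_per_month = 15   # stated average
estimated_rows = 234_402_352  # Seq Scan row estimate from the plan

months = estimated_rows / (keys * rows_per_key_per_month)
print(round(months, 1))  # ~15.6 months of data after 2007-03
```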
On Fri, Jul 18, 2008 at 04:48:26PM -0400, Francisco Reyes wrote:
> On 3:55 pm 07/18/08 Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> AfterTriggerEvents: 10553909248 total in 1268 blocks; 20432 free (6
>>> chunks); 10553888816 used
>>
>> Well, that's definitely your problem ...
>
> What is the overhead for each AfterTriggerEvent?

Can you make them not deferred? Then you don't need the memory either.

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
Martijn van Oosterhout wrote:
> Can you make them not deferred?

How?

I found the issue. I had the foreign key on the master table instead of
the children. Deleted the RI from the master table and put it on the
inherited partitions.

My whole 230 million rows merged in about an hour! And I even had two of
those running at the same time (one setup with 14 partitions per month
and another with 5 partitions per month, to test the difference in
performance). It was so fast I even had to do a count(*) to make sure
both actually merged.

That is 117K rows per second for rows that were about 33 bytes long. That
only comes to about 3 MB/sec plus overhead, but still, 117K rows/sec is
not too shabby.

In case it is of interest to anyone:
2 AMD dual-core 2GHz CPUs
12GB of RAM
shared_buffers = 3GB
work_mem = 64MB
checkpoint_segments = 256
checkpoint_timeout = 10min
LSI controller with 128MB cache with BBU. Write cache enabled.

Many thanks to all who offered suggestions during the troubleshooting.
Francisco Reyes wrote:
> I had the foreign key in the master table instead of the children.
> Deleted RI from master table and put into the inherited partitions.
> My whole 230 million rows merged in about an hour!

Heh -- but are the FKs now checked?  Try inserting something that
violates the constraints and see if they are rejected.

-- 
Alvaro Herrera                       http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera writes:
> Heh -- but are the FKs now checked? Try inserting something that
> violates the constraints and see if they are rejected.

I knew it sounded too good to be true.
1- The trigger was not set on the master (i.e. nothing went to the children).
2- The master had no index and no RI, so it was a straight insert.

I corrected this (i.e. set the trigger on the master and the RI on the
children). It has been running for 10 hours and has not finished. The
good news is that memory doesn't seem to be going up.

I will give it till tomorrow AM, and if it hasn't finished I will turn
off the foreign keys on the children. I have already modified the scripts
so I can easily build/drop the foreign keys as needed.
Francisco Reyes wrote:
> I knew it sounded too good to be true.
> 1- The trigger was not set in the master (ie nothing went to the children).
> 2- The master had no index and no RI.. so it was a straight insert.
>
> I corrected (ie set the trigger in the master and RI in the children).
> Has been running for 10 hours and has not finished.

FWIW it tends to be faster to do the bulk load first and add the indexes
and constraints later.  (Though obviously you must be prepared to cope
with the failing rows, if any.)  However, if you do this INSERT/SELECT
thing frequently, this is probably not very workable.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
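[Editor's note: the load-first pattern Alvaro mentions usually takes this shape; all object names in the sketch are hypothetical:]

```sql
-- Sketch: bulk load with constraints and indexes off, then rebuild.
ALTER TABLE customer_transactions_g1 DROP CONSTRAINT ct_g1_ids_fkey;
DROP INDEX ct_g1_ids_idx;

INSERT INTO customer_transactions_g1
    SELECT * FROM staging_table;          -- the bulk load

CREATE INDEX ct_g1_ids_idx ON customer_transactions_g1 (ids);
ALTER TABLE customer_transactions_g1
    ADD CONSTRAINT ct_g1_ids_fkey
    FOREIGN KEY (ids) REFERENCES customer_ids (ids);  -- one-pass validation
```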