Reducing the memory footprint of large sets of pending triggers - Mailing list pgsql-hackers

From Tom Lane
Subject Reducing the memory footprint of large sets of pending triggers
Msg-id 16830.1224811959@sss.pgh.pa.us
List pgsql-hackers
We've occasionally talked about allowing pending-trigger-event lists to
spill to disk when there get to be so many events that it's a memory
problem.  I'm not especially interested in doing that right now, but
I noticed recently that we could alleviate the problem a lot by adopting
a more compact representation.

Currently, each event is a separately palloc'd instance of struct
AfterTriggerEventData.  On a 32-bit machine that struct takes 32 bytes,
plus 8 bytes palloc overhead = 40 bytes.  On a 64-bit machine the struct
takes 36 bytes, but palloc rounds that up to 64 bytes, plus there's 16
bytes palloc overhead = 80 bytes :-(.
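The arithmetic above can be sketched as follows. This is only an illustration, assuming (as the figures above imply) that small requests are rounded up to the next power of two and that a fixed per-chunk header is then added; round_up_pow2 and event_footprint are hypothetical helper names, not PostgreSQL functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round a request size up to the next power of two.
 * Assumption: this mimics how the allocator sizes small chunks.
 */
static size_t
round_up_pow2(size_t n)
{
    size_t p = 1;

    assert(n > 0);
    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Per-event memory cost: the rounded-up request plus the fixed
 * per-chunk header (8 bytes on 32-bit, 16 on 64-bit, per the text).
 */
static size_t
event_footprint(size_t struct_size, size_t chunk_overhead)
{
    return round_up_pow2(struct_size) + chunk_overhead;
}
```

With the numbers quoted above, event_footprint(32, 8) gives 40 and event_footprint(36, 16) gives 80.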

I see several things we could do here:

* Allocate the event structs in reasonably-large arrays instead of
separate palloc chunks.  This would require some data copying where we
now get away with pointer-swinging --- but on the other hand per-event
palloc'ing isn't exactly free either, so I suspect that this would net
out to a wash if not an actual speedup.

* Don't store the ate_tgoid and ate_relid fields in each individual
event struct.  Instead keep a separate array with one instance of these
values for each distinct trigger that's been fired in the current
transaction (in most cases this list should be pretty short, even if
there are many events).  We can commandeer the high order bits of
ate_event to store an index into that array.  Currently only 8 bits
of ate_event are actually used, so we'd have room for 16 million
distinct triggers fired in a transaction.  Even if we need a few more
ate_event flag bits later, I don't see a problem there.

* Don't store two ItemPointers in insert or delete events.  This would
make the array element stride variable, but since we don't need random
access into the arrays AFAICS, that doesn't seem to be a problem.
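A rough sketch of how the three ideas could fit together, not a proposed patch. All names here are hypothetical: SketchItemPointer stands in for the 6-byte ItemPointerData, EV_TWO_CTIDS is an invented flag bit marking update events, and the trigger index is assumed to occupy the upper 24 bits of the event word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a 6-byte tuple pointer (hypothetical layout). */
typedef struct { uint16_t blk_hi, blk_lo, offset; } SketchItemPointer;

#define EV_FLAG_BITS  8
#define EV_FLAG_MASK  ((uint32_t) 0xFF)
#define EV_TWO_CTIDS  ((uint32_t) 0x01)   /* hypothetical: update event */

/* Insert/delete events carry one tuple pointer, updates carry two. */
typedef struct { uint32_t event; uint32_t firing_id;
                 SketchItemPointer ctid1; } EventOneCtid;
typedef struct { uint32_t event; uint32_t firing_id;
                 SketchItemPointer ctid1, ctid2; } EventTwoCtids;

/* Low 8 bits: flags.  High 24 bits: index into the per-trigger array. */
static uint32_t
pack_event(uint32_t flags, uint32_t trigger_index)
{
    assert((flags & ~EV_FLAG_MASK) == 0);
    assert(trigger_index < ((uint32_t) 1 << 24));
    return (trigger_index << EV_FLAG_BITS) | flags;
}

static uint32_t event_flags(uint32_t ev)         { return ev & EV_FLAG_MASK; }
static uint32_t event_trigger_index(uint32_t ev) { return ev >> EV_FLAG_BITS; }

/* The stride of an element is determined by its own flag bits. */
static size_t
event_size(uint32_t ev)
{
    return (ev & EV_TWO_CTIDS) ? sizeof(EventTwoCtids) : sizeof(EventOneCtid);
}

/*
 * Copy one event into a chunk.  Returns the new used length, or 0 if
 * the event does not fit (the caller would then start a new chunk).
 */
static size_t
append_event(char *chunk, size_t used, size_t cap, const void *event)
{
    uint32_t ev;
    size_t   len;

    memcpy(&ev, event, sizeof ev);      /* event word is the first field */
    len = event_size(ev);
    if (used + len > cap)
        return 0;
    memcpy(chunk + used, event, len);
    return used + len;
}

/* Sequential scan: no random access needed, just step by each stride. */
static size_t
count_events(const char *chunk, size_t used)
{
    size_t n = 0, off = 0;

    while (off < used)
    {
        uint32_t ev;

        memcpy(&ev, chunk + off, sizeof ev);
        off += event_size(ev);
        n++;
    }
    return n;
}
```

On platforms where uint32_t has 4-byte alignment, the two structs come out at 16 and 20 bytes respectively.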

In combination these changes would get us down to 16 bytes per
insert/delete and 20 per update event, which represents a factor of 2.5
or 2 savings respectively on a 32-bit machine and a factor of 5 or 4 on
a 64-bit machine.  Seems worth doing to me, especially since it looks
like only about a 1-day project touching only a single source file.

It might be possible to go further and move the event status bits and
firing_id into the separate array, which would save a further four bytes
per event in the typical situation that a lot of events of the same
trigger are queued by a single command.  I think I'd want to tackle that
as a follow-on patch though, because it would be a change in the data
structure semantics not just rearranging the representation a tad.

Comments, better ideas?
        regards, tom lane

