Thread: Postgresql vs. aggregates
I have an application with a table that tracks objects with a "size"
attribute. What we want to do is to periodically report on the number of
these objects and the sum of the object sizes. The table will typically
have 1-5 million rows.

I know that the obvious "select count(*), sum(size) ..." is not a good idea,
and we have seen the impact such a query has on performance.

Then we tried creating triggers to maintain a table with object_count and
sum_size columns:

    create table summary(object_count bigint, sum_size bigint)

We populate the table with exactly one row, initialized to (0, 0). A trigger
runs an update like this:

    update summary
    set object_count = object_count + 1,
        sum_size = sum_size + new.size

The problem is that our application has periods where objects are being
created at a rapid rate. Updates to the summary table slow down over time,
and the slowdown is accompanied by very low CPU idle% as reported by vmstat.

I found that if I have a thread which vacuums the summary table every few
seconds (e.g. 30), then update times stay reasonable.

We're starting to reconsider this approach, wondering if the vacuuming or the
buildup of row versions (in those 30 seconds) is behind some variability in
performance that we're seeing. To that end, we are measuring the impact of
more frequent and less frequent vacuuming. I'll skip the details here since
this is not the main point of this email.

Another idea we had is to maintain object_count and sum_size in sequences.
The trigger would use currval and setval to maintain the values. The problem
with this is that when we try to examine the sequence on another connection,
we can't use currval before bumping the value with nextval, which would
interfere with maintenance of the totals. (The error message for a sequence s
is "ERROR: s.currval is not yet defined in this session".)

Yet another idea is to write C functions which maintain counts in shared
memory, simulating what we can't quite do with sequences.

I understand why asking for aggregate values computed over rapidly changing
data sets is not a completely meaningful thing to do. Yet we have the
requirement. Users of our product need to be able to look at object_count
and sum_size to get a sense of how an object load is proceeding. This
shouldn't be so difficult. I've seen, on this mailing list, the advice
"don't do that, use a trigger" when someone wants to do select count(*).
But I haven't seen any discussion of the consequences of using a trigger.

Can anyone recommend a good way to maintain aggregates using a trigger? Is
vacuuming every few seconds really the only way to go?

Jack Orenstein

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.
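For concreteness, a minimal sketch of the trigger setup being described. The
base table name ("objects") and the function/trigger names are assumptions
for illustration, not the actual schema:

    -- hypothetical names: "objects" is the tracked table with a "size" column
    create table summary(object_count bigint, sum_size bigint);
    insert into summary values (0, 0);

    create or replace function bump_summary() returns trigger as '
    begin
        -- every insert produces a new version of the single summary row
        update summary
        set object_count = object_count + 1,
            sum_size = sum_size + new.size;
        return new;
    end;
    ' language plpgsql;

    create trigger objects_bump_summary
        after insert on objects
        for each row execute procedure bump_summary();

With this setup every object insert leaves behind another dead version of the
one summary row, which is why the updates slow down unless that row is
vacuumed frequently.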
On Wed, 2004-06-09 at 16:32, jao@geophile.com wrote:
> I have an application with a table that tracks objects with a "size"
> attribute. What we want to do is to periodically report on the number
> of these objects and the sum of the object sizes. The table will
> typically have 1-5 million rows.
>
> I know that the obvious "select count(*), sum(size) ..." is not a good
> idea, and we have seen the impact such a query has on performance.
>
> Then we tried creating triggers to maintain a table with object_count
> and sum_size columns:
>
>     create table summary(object_count bigint, sum_size bigint)
>
> We populate the table with exactly one row, initialized to (0, 0).
> A trigger runs an update like this:
>
>     update summary
>     set object_count = object_count + 1,
>         sum_size = sum_size + new.size
>
> The problem is that our application has periods where objects are
> being created at a rapid rate. Updates to the summary table slow down
> over time, and the slowdown is accompanied by very low CPU idle% as
> reported by vmstat.

I'm wondering if these data are time sensitive, and if so, how sensitive.
If they could be up to say a minute old or something, using a materialized
view of some sort might work well here. Or a temp table to hold all the
incoming data until it needs to be dumped into the real table. Just a
thought, might not work for you.

Also, do you have a lot of parallel updates to the same rows, or are they
all different rows? If you're doing a lot of updating of the same rows over
and over, any database is going to be somewhat slow, either in vacuuming
like for postgresql, or locking as in row level locking databases.

If they're all different, then this should be no different for postgresql
than for a row locking database, since each tuple will only have two or
three versions, instead of some smaller percentage of rows having hundreds
of versions.

Just wondering.
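A rough sketch of the staging-table variant of that suggestion, assuming a
periodic job (cron or an application thread) does the flush; the staging
table name and columns are invented for illustration:

    create table objects_staging (id bigint, size bigint);

    -- writers insert into objects_staging; every minute or so a job folds
    -- the batch into the real table and touches the summary once per batch
    begin;
    insert into objects select * from objects_staging;
    update summary
       set object_count = object_count + (select count(*) from objects_staging),
           sum_size     = sum_size + (select coalesce(sum(size), 0) from objects_staging);
    delete from objects_staging;
    commit;

    -- note: a real version would bound the batch (e.g. by a captured max id)
    -- so rows inserted while the flush runs are not deleted without being copied

The summary row is then updated once per batch instead of once per object,
so far fewer dead versions accumulate between vacuums; readers who need
up-to-the-second numbers would have to combine the real table with the
staging table.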
Quoting Scott Marlowe <smarlowe@qwest.net>:

> On Wed, 2004-06-09 at 16:32, jao@geophile.com wrote:
> > Then we tried creating triggers to maintain a table with object_count
> > and sum_size columns:
> >
> >     create table summary(object_count bigint, sum_size bigint)
> >
> > We populate the table with exactly one row, initialized to (0, 0).
> > A trigger runs an update like this:
> >
> >     update summary
> >     set object_count = object_count + 1,
> >         sum_size = sum_size + new.size
> >
> > The problem is that our application has periods where objects are
> > being created at a rapid rate. Updates to the summary table slow down
> > over time, and the slowdown is accompanied by very low CPU idle% as
> > reported by vmstat.
>
> I'm wondering if these data are time sensitive, and if so, how
> sensitive. If they could be up to say a minute old or something, using
> a materialized view of some sort might work well here. Or a temp table
> to hold all the incoming data until it needs to be dumped into the real
> table.

So once a minute I'd run a stored procedure to compute statistics and copy
rows to the main table? That might be doable, but then queries would have to
go against a view combining the main table and the temp table. Can you
expand on the use of materialized views to solve this?

> Also, do you have a lot of parallel updates to the same rows, or are
> they all different rows? If you're doing a lot of updating of the same
> rows over and over, any database is going to be somewhat slow, either in
> vacuuming like for postgresql, or locking as in row level locking
> databases.
>
> If they're all different, then this should be no different for
> postgresql than for a row locking database, since each tuple will only
> have two or three versions, instead of some smaller percentage of rows
> having hundreds of versions.

The summary table has exactly one row, and whenever an object is added to
the main table, this one summary row is updated. An odd feature of my
application is that there is only one object creation going on at a time.
The summary row would not be a concurrency hotspot in a non-MVCC database
system.

But that raises an interesting idea. Suppose that instead of one summary
row, I had, let's say, 1000. When my application creates an object, I choose
one summary row at random (or round-robin) and update it. So now, instead of
one row with many versions, I have 1000 rows with 1000x fewer versions each.
When I want object counts and sizes, I'd sum up across the 1000 summary
rows. Would that allow me to maintain performance for summary updates with
less frequent vacuuming?

I'd find all this much easier to reason about if I understood how the
versions of a row are organized and accessed. How does postgresql locate the
correct version of a row?

Jack Orenstein

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.
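To make the idea concrete, a sketch of the many-summary-rows scheme. The slot
column and names are invented; generate_series may not exist on older
releases, and any way of seeding the 1000 rows will do:

    create table summary(slot int primary key, object_count bigint, sum_size bigint);
    insert into summary select i, 0, 0 from generate_series(0, 999) as i;

    -- in the trigger, update one slot picked at random instead of a single hot row
    update summary
       set object_count = object_count + 1,
           sum_size = sum_size + new.size
     where slot = floor(random() * 1000)::int;

    -- readers aggregate over the (still small) summary table
    select sum(object_count) as object_count, sum(sum_size) as sum_size from summary;

Each slot then receives roughly 1/1000th of the updates, so the version chain
behind any one row grows far more slowly; the total number of dead versions
is unchanged, though, so the table still needs vacuuming.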
jao@geophile.com writes:
> I'd find all this much easier to reason about if I understood how
> the versions of a row are organized and accessed. How does postgresql
> locate the correct version of a row?

It doesn't, particularly. A seqscan will of course visit all the versions of
a row, and an indexscan will visit all the versions matching the indexscan
key-column conditions. It's up to the "time qualification" tests (tqual.c)
to accept only the version that is visible to your transaction.

There are two or three implementation choices in the btree index routines
that are intended to increase the chances that you'll hit the version you
want sooner rather than later. But they're only heuristics. The bottom line
is that we check all the versions till we find the right one.

			regards, tom lane
jao@geophile.com wrote:

> But that raises an interesting idea. Suppose that instead of one
> summary row, I had, let's say, 1000. When my application creates
> an object, I choose one summary row at random (or round-robin) and update
> it. So now, instead of one row with many versions, I have 1000 with 1000x
> fewer versions each. When I want object counts and sizes, I'd sum up across
> the 1000 summary rows. Would that allow me to maintain performance
> for summary updates with less frequent vacuuming?

Perhaps the simplest approach might be to define the summary table as
containing a SERIAL and your count.

Every time you add another object insert (nextval(...), 1)
Every 10s summarise the table (i.e. replace 10 rows all "scored" 1 with
1 row scored 10)
Use sum() over the much smaller table to find your total.
Vacuum regularly.

-- 
  Richard Huxton
  Archonet Ltd
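A minimal sketch of that insert-then-collapse approach, with invented names;
:max_id is a psql-style placeholder for a high-water mark captured just
before the collapse so that rows inserted during it are left alone:

    create table size_log (id serial, n bigint, total_size bigint);

    -- trigger body: append a row per object instead of updating one row in place
    insert into size_log (n, total_size) values (1, new.size);

    -- periodic job (every 10s or so); :max_id = "select max(id) from size_log"
    -- taken just beforehand
    begin;
    insert into size_log (n, total_size)
        select coalesce(sum(n), 0), coalesce(sum(total_size), 0)
        from size_log where id <= :max_id;
    delete from size_log where id <= :max_id;
    commit;

    -- readers
    select sum(n) as object_count, sum(total_size) as sum_size from size_log;

The collapsed row gets a fresh serial id above :max_id, so the delete never
removes it; regular vacuuming then reclaims the deleted detail rows.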
Hi,

----- Original Message -----
From: "Richard Huxton" <dev@archonet.com>
To: <jao@geophile.com>
Cc: "Scott Marlowe" <smarlowe@qwest.net>; <pgsql-general@postgresql.org>
Sent: Thursday, June 10, 2004 8:03 AM
Subject: Re: [GENERAL] Postgresql vs. aggregates

> jao@geophile.com wrote:
>
> > But that raises an interesting idea. Suppose that instead of one
> > summary row, I had, let's say, 1000. When my application creates
> > an object, I choose one summary row at random (or round-robin) and update
> > it. So now, instead of one row with many versions, I have 1000 with 1000x
> > fewer versions each. When I want object counts and sizes, I'd sum up across
> > the 1000 summary rows. Would that allow me to maintain performance
> > for summary updates with less frequent vacuuming?
>
> Perhaps the simplest approach might be to define the summary table as
> containing a SERIAL and your count.
> Every time you add another object insert (nextval(...), 1)
> Every 10s summarise the table (i.e. replace 10 rows all "scored" 1 with
> 1 row scored 10)
> Use sum() over the much smaller table to find your total.
> Vacuum regularly.

Something along these lines, except using a SUM instead of a COUNT:

http://archives.postgresql.org/pgsql-performance/2004-01/msg00059.php

Nick
> I understand why asking for aggregate values computed over rapidly
> changing data sets is not a completely meaningful thing to do. Yet we
> have the requirement. Users of our product need to be able to look at
> object_count and sum_size to get a sense of how an object load is
> proceeding.

If the customer just needs to know how things are *proceeding*, will it not
suffice to look at reasonable estimates?

Karsten
-- 
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346