Thread: Re: [SQL] Yet Another (Simple) Case of Index not used

Re: [SQL] Yet Another (Simple) Case of Index not used

From: Josh Berkus
Denis,

> I'm running into a quite puzzling simple example where the index I've
> created on a fairly big table (465K entries) is not used, against all common
> sense expectations:
> The query I am trying to do (fast) is:
>
> select count(*) from addresses;

PostgreSQL is currently unable to use indexes on aggregate queries.   This is
because of two factors:
1) MVCC means that the number of rows must be recalculated for each
connection's current transaction, and cannot be "cached" anywhere by the
database system;
2) Our extensible model of user-defined aggregates means that each aggregate
is a "black box" whose internal operations are invisible to the planner.

This is a known performance issue for Postgres, and I believe that a couple of
people on Hackers are looking at modifying aggregate implementation for 8.0
to use appropriate available indexes, at least for MIN, MAX and COUNT.  Until
then, you will need to either put up with the delay, or create a
trigger-driven aggregates caching table.
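A trigger-driven caching table along those lines might look like the sketch below. All names are invented for illustration, and plpgsql must be installed in the database:

```sql
-- Invented sketch of a trigger-maintained count cache; not from the thread.
CREATE TABLE addresses_count (n bigint NOT NULL);
INSERT INTO addresses_count SELECT count(*) FROM addresses;

-- 7.3-era plpgsql uses quoted function bodies (no dollar quoting yet).
CREATE FUNCTION addresses_count_trig() RETURNS trigger AS '
BEGIN
    IF TG_OP = ''INSERT'' THEN
        UPDATE addresses_count SET n = n + 1;
    ELSE
        UPDATE addresses_count SET n = n - 1;
    END IF;
    RETURN NULL;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER addresses_count_trig AFTER INSERT OR DELETE ON addresses
    FOR EACH ROW EXECUTE PROCEDURE addresses_count_trig();

-- SELECT n FROM addresses_count;   -- replaces the slow count(*)
```

The obvious drawback: every insert or delete now updates the same row of the caching table, which serializes writers and generates dead tuples.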

If you are trying to do a correlated count, like "SELECT type, count(*) from
aggregates GROUP BY type", Tom Lane has already added a hash-aggregates
structure in the 7.4 source that will speed this type of query up
considerably for systems with lots of RAM.

(PS: in the future, please stick to posting questions to one list at a time,
thanks)

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: [SQL] Yet Another (Simple) Case of Index not used

From: "Denis @ Next2Me"
Josh,

I am no database expert, and even less knowledgeable about the internals
of PostgreSQL, so I'll trust you on the two points you make.

Are you saying the 7.4 'group by' trick would be faster than the simple select count(*)?
That seems hard to believe, being that the request now has to fetch / sort the data.
I must be missing something.

The kind of requests that I am really interested in are:
select count(*) from table where table.column like 'pattern%'
These seem to go much faster on MySQL (which I guess is not an MVCC database? or wasn't
InnoDB supposed to make it so?) than on PostgreSQL.

So, in the meantime, I've decided to split up my data into two sets,
the static big tables which are handled by mysql, and the rest of it handled
by postgresql....



ps: apologies for the cross-posting.

  > -----Original Message-----
  > From: Josh Berkus [mailto:josh@agliodbs.com]
  > Sent: Tuesday, April 08, 2003 2:53 PM
  > To: Denis; pgsql-performance@postgresql.org
  > Subject: Re: [SQL] Yet Another (Simple) Case of Index not used


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Stephan Szabo
On Tue, 8 Apr 2003, Denis @ Next2Me wrote:

> The kind of requests that I am really interested in are:
> select count(*) from table where table.column like 'pattern%'

If you think an index scan should be faster, you can try
set enable_seqscan=off;
and see if that changes the plan generated by explain and with analyze
you can compare the time used.  Without information on the estimated
selectivity it's hard to say what's right.

If it still doesn't use the index (ie, it's still using a sequential scan)
after enable_seqscan=off, it's likely that you didn't initdb in the "C"
locale, in which case LIKE won't use indexes currently (see the archives
for the long description; the short one is that some of the locale rules
can cause problems with using the index).
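A session along those lines might look like this (table and column names are stand-ins for Denis's actual schema):

```sql
-- Compare the planner's choices with and without sequential scans allowed.
-- "addresses" / "street" are invented names for your table and column.
EXPLAIN ANALYZE
SELECT count(*) FROM addresses WHERE street LIKE 'pattern%';

SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT count(*) FROM addresses WHERE street LIKE 'pattern%';
-- If this still shows "Seq Scan", the index cannot be used at all
-- (typically the non-C-locale problem described above).

SET enable_seqscan = on;   -- restore the default
```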


Re: [SQL] Yet Another (Simple) Case of Index not used

From: "Christopher Kings-Lynne"
Hi Denis,

> The kind of requests that I am really interested in are:
> select count(*) from table where table.column like 'pattern%'
> These seem to go much faster on MySQL (which I guess is not an MVCC database? or wasn't
> InnoDB supposed to make it so?) than on PostgreSQL.

A few things.

* MVCC in PostgreSQL allows us to be way faster than MySQL when you have
heaps of concurrent readers and writers.  The tradeoff is that count(*) is
slow, since PostgreSQL needs to check that each tuple is actually visible to
your query (eg. you start a transaction, someone else inserts a row, you do a
count(*) - should the result include that new row or not?  Answer: no.)

* Just avoid doing count(*) over the entire table with no WHERE clause!!!
It's as easy as that.

* The LIKE 'pattern%' is indexable in PostgreSQL.  You will need to create a
normal btree index over table.column.  So long as the index is returning a
small portion of the table (eg. say only 5-10% of the rows begin with the
pattern), the index will be used and it will be fast.

* If you want really fast full text indexing, check out contrib/tsearch - it's really, really, really fast.
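A minimal sketch of the btree-index suggestion above (names invented; as Stephan noted earlier, in a non-"C"-locale database a plain btree index will not help LIKE at all):

```sql
-- Invented names, for illustration only.
CREATE INDEX addresses_street_idx ON addresses (street);

-- Used only when the planner estimates the prefix is selective enough:
SELECT count(*) FROM addresses WHERE street LIKE 'pattern%';
```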

Chris

Re: [SQL] Yet Another (Simple) Case of Index not used

From: Josh Berkus
Denis,

> Are you saying the 7.4 'group by' trick would be faster than the simple
> select count(*)? That seems hard to believe, being that the request now has
> to fetch / sort the data. I must be missing something.

No, I'm saying that the 7.4 hash-aggregate is faster than the same query was
under 7.2 or 7.3.   Much faster.   But it does little to speed up a raw
count(*).

> The kind of requests that I am really interested in are:
> select count(*) from table where table.column like 'pattern%'

Hash-aggregates may, in fact, help with this.   Care to try downloading
the source from CVS?

> These seem to go much faster on MySQL (which I guess is not an MVCC
> database? or wasn't InnoDB supposed to make it so?),

They did incorporate a lot of MVCC logic into InnoDB tables, yes.  Which means
that if SELECT count(*) on an InnoDB table is just as fast as a MyISAM table,
then it is not accurate.  This would be in keeping with MySQL's design
philosophy, which values performance and simplicity over accuracy and
precision -- the opposite of our philosophy.

> So, in the meantime, I've decided to split up my data into two sets,
> the static big tables which are handled by mysql, and the rest of it
> handled by postgresql....

Hey, if it works for you, it's probably easier than dealing with the
PostgreSQL workarounds to this performance issue.  I'll ask you to give
PostgreSQL a try for those tables again when 7.4 comes out.

> ps: apologies for the cross-posting.

De nada.   The Performance list is the right place for this sort of question
in the future.

--
Josh Berkus
Aglio Database Solutions
San Francisco


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Kevin Brown
Josh Berkus wrote:
> Denis,
>
> > Are you saying the 7.4 'group by' trick would be faster than the simple
> > select count(*)? That seems hard to believe, being that the request now has
> > to fetch / sort the data. I must be missing something.
>
> No, I'm saying that the 7.4 hash-aggregate is faster than the same query was
> under 7.2 or 7.3.   Much faster.   But it does little to speed up a raw
> count(*).
>
> > The kind of requests that I am really interested in are:
> > select count(*) from table where table.column like 'pattern%'
>
> > These seem to go much faster on MySQL (which I guess is not an MVCC
> > database? or wasn't InnoDB supposed to make it so?),
>
> They did incorporate a lot of MVCC logic into InnoDB tables, yes.
> Which means that if SELECT count(*) on an InnoDB table is just as
> fast as a MyISAM table, then it is not accurate.

This is not necessarily true.  The trigger-based approach to tracking
the current number of rows in a table might well be implemented
internally, and that may actually be much faster than doing it using
triggers (the performance losses you saw may well have been the result
of PG's somewhat poor trigger performance, and not the result of the
approach itself.  It would be interesting to know how triggers affect
the performance of other databases).


--
Kevin Brown                          kevin@sysexperts.com


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Tom Lane
Kevin Brown <kevin@sysexperts.com> writes:
> Josh Berkus wrote:
>> They did incorporate a lot of MVCC logic into InnoDB tables, yes.
>> Which means that if SELECT count(*) on an InnoDB table is just as
>> fast as a MyISAM table, then it is not accurate.

> This is not necessarily true.  The trigger-based approach to tracking
> the current number of rows in a table might well be implemented
> internally, and that may actually be much faster than doing it using
> triggers

You missed the point of Josh's comment: in an MVCC system, the correct
COUNT() varies depending on which transaction is asking.  Therefore it
is not possible for a centrally maintained row counter to give accurate
results to everybody, no matter how cheap it is to maintain.

(The cheapness can be disputed as well, since it creates a single point
of contention for all inserts and deletes on the table.  But that's a
different topic.)

            regards, tom lane


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Josh Berkus
Kevin, Tom:

> (The cheapness can be disputed as well, since it creates a single point
> of contention for all inserts and deletes on the table.  But that's a
> different topic.)

Actually, this was the problem with the trigger method of maintaining COUNT
information in PostgreSQL.   The statistics table itself becomes a
significant source of delay: if table_A gets 10,000 rows updated, then
table_count_A must necessarily be updated 10,000 times ... creating a lot of
dead tuples and severely bloating the table on disk until the next vacuum
... resulting in update #10,000 to table_count_A taking 100+ times as long as
update #1 does, due to the required random seek time on disk.

I can personally think of two ways around this:

In MySQL: store table_count_A as a non-MVCC table or global variable.
Drawback: the count would not be accurate, as you would see changes due to
incomplete transactions and eventually the count would be knocked off
completely by an overload of multi-user activity.  However, this does fit
with MySQL's design philosophy of "Speed over accuracy", so I suspect that
that's what they're doing.

In PostgreSQL:
a) Put table_count_A on superfast media like a RAM card so that random seeks
after 10,000 updates do not become a significant delay;
b) create an asynchronous table-aggregates collector which would collect
programmed statistics (like count(*) from table_A) much in the same way that
the planner statistics collector does.  This would have the disadvantage of
only being up to date when the database is idle, but the advantage of not
imposing any significant overhead on updates.
    (Incidentally, I proposed this to one of my clients who complained about
Postgres' slow aggregate performance, but they declined to fund the effort.)

--
Josh Berkus
Aglio Database Solutions
San Francisco


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Tom Lane
Josh Berkus <josh@agliodbs.com> writes:
> In PostgreSQL:
> a) Put table_count_A on superfast media like a RAM card so that random seeks
> after 10,000 updates do not become a significant delay;

As long as we're talking ugly, here ;-)

You could use a sequence to hold the aggregate counter.  A sequence
isn't transactional and so does not accumulate dead tuples.  "setval()"
and "select last_value" should have constant-time performance.
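For example (sequence name invented; note that deletes are awkward, since there is no concurrency-safe decrement):

```sql
-- Seed the counter once from the real count.
CREATE SEQUENCE addresses_count_seq;
SELECT setval('addresses_count_seq', (SELECT count(*) FROM addresses));

-- A trigger would then bump it without creating dead tuples:
SELECT nextval('addresses_count_seq');       -- after each insert

-- Reading it back is constant-time:
SELECT last_value FROM addresses_count_seq;
```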

            regards, tom lane


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Kevin Brown
Tom Lane wrote:
> Kevin Brown <kevin@sysexperts.com> writes:
> > Josh Berkus wrote:
> >> They did incorporate a lot of MVCC logic into InnoDB tables, yes.
> >> Which means that if SELECT count(*) on an InnoDB table is just as
> >> fast as a MyISAM table, then it is not accurate.
>
> > This is not necessarily true.  The trigger-based approach to tracking
> > the current number of rows in a table might well be implemented
> > internally, and that may actually be much faster than doing it using
> > triggers
>
> You missed the point of Josh's comment: in an MVCC system, the correct
> COUNT() varies depending on which transaction is asking.  Therefore it
> is not possible for a centrally maintained row counter to give accurate
> results to everybody, no matter how cheap it is to maintain.

Hmm...true...but only if you really implement it as a faithful copy of
the trigger-based method.  Implementing it on the backend brings some
advantages to the table, to wit:

* The individual transactions don't need to update the
  externally-visible count on every insert or delete, they only need
  to update it at commit time.

* The transaction can keep a count of the number of inserted and
  deleted tuples it generates (on a per-table basis) during the life
  of the transaction.  The count value it returns to a client is the
  count value it reads from the table that stores the count value plus
  any differences that have been applied during the transaction.  This
  is fast, because the backend handling the transaction can keep this
  difference value in its own private memory.

* When a transaction commits, it only needs to apply the "diff value"
  it stores internally to the external count value.

Contention on the count value is only an issue if the external count
value is currently being written to by a transaction in the commit
phase.  But the only time a transaction will be interested in reading
that value is when it's performing a count(*) operation or when it's
committing inserts/deletes that happened on the table in question (and
then only if the number of tuples inserted differs from the number
deleted).  So the total amount of contention should be relatively low.


> (The cheapness can be disputed as well, since it creates a single point
> of contention for all inserts and deletes on the table.  But that's a
> different topic.)

That's true, but the single point of contention is only an issue at
transaction commit time (unless you're implementing READ UNCOMMITTED),
at least if you do something like what I described above.



--
Kevin Brown                          kevin@sysexperts.com


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Tom Lane
Kevin Brown <kevin@sysexperts.com> writes:
> Tom Lane wrote:
>> You missed the point of Josh's comment: in an MVCC system, the correct
>> COUNT() varies depending on which transaction is asking.  Therefore it
>> is not possible for a centrally maintained row counter to give accurate
>> results to everybody, no matter how cheap it is to maintain.

> Hmm...true...but only if you really implement it as a faithful copy of
> the trigger-based method.
> [ instead have transactions save up net deltas to apply at commit ]

Good try, but it doesn't solve the problem.  SERIALIZABLE transactions
should not see deltas applied by any transaction that commits after
they start.  READ COMMITTED transactions can see such deltas --- but not
deltas applied since the start of their current statement.  (And there
could be several different "current statements" with different snapshots
in progress in a single READ COMMITTED transaction.)

AFAICS, central-counter techniques could only work in an MVCC system
if each transaction copies every counter in the system at each snapshot
freeze point, in case it finds itself needing that counter value later
on.  This is a huge amount of mostly-useless overhead, and it makes the
problem of lock contention for access to the counters several orders of
magnitude worse than you'd first think.

Of course you can dodge lots of this overhead if you're willing to
accept approximate answers.  But I don't believe that central counters
are useful in an exact-MVCC-semantics system.

            regards, tom lane


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Kevin Brown
Tom Lane wrote:
> Kevin Brown <kevin@sysexperts.com> writes:
> > Tom Lane wrote:
> >> You missed the point of Josh's comment: in an MVCC system, the correct
> >> COUNT() varies depending on which transaction is asking.  Therefore it
> >> is not possible for a centrally maintained row counter to give accurate
> >> results to everybody, no matter how cheap it is to maintain.
>
> > Hmm...true...but only if you really implement it as a faithful copy of
> > the trigger-based method.
> > [ instead have transactions save up net deltas to apply at commit ]
>
> Good try, but it doesn't solve the problem.  SERIALIZABLE transactions
> should not see deltas applied by any transaction that commits after
> they start.  READ COMMITTED transactions can see such deltas --- but not
> deltas applied since the start of their current statement.  (And there
> could be several different "current statements" with different snapshots
> in progress in a single READ COMMITTED transaction.)

This is why I suspect the best way to manage this would be to manage
the counter itself using the MVCC mechanism (that is, you treat the
shared counter as a row in a table just like any other and, in fact,
it might be most beneficial for it to actually be exactly that), which
handles the visibility problem automatically.  But I don't know how
much contention there would be as a result.

> Of course you can dodge lots of this overhead if you're willing to
> accept approximate answers.  But I don't believe that central counters
> are useful in an exact-MVCC-semantics system.

No, but an MVCC-managed counter would be useful in such a system,
wouldn't it?  Or am I missing something there, too (the deltas
themselves would be managed as described, and would be applied as
described)?

So: how much contention would there be if the counter were managed in
exactly the same way as any row of a table is managed?  Because I'm
not terribly familiar with how PG manages MVCC (pointers to
documentation on it welcomed) I can't answer that question myself.


--
Kevin Brown                          kevin@sysexperts.com


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Andreas Pflug
Kevin Brown wrote:

>
>No, but an MVCC-managed counter would be useful in such a system,
>wouldn't it?  Or am I missing something there, too (the deltas
>themselves would be managed as described, and would be applied as
>described)?
>
>So: how much contention would there be if the counter were managed in
>exactly the same way as any row of a table is managed?  Because I'm
>not terribly familiar with how PG manages MVCC (pointers to
>documentation on it welcomed) I can't answer that question myself.
>
>
>
It looks to me that a "row number -1" really would solve this problem.

I think a row counter on each table would even be useful for some kind
of auto-vacuum mechanism, which could be triggered when pg_class.reltuples
deviates too far from the real row count. Observing this mailing list,
missing or outdated statistics still seem to be a major source of
performance degradation. We all know these 1000 row estimates from
EXPLAIN, don't we? A default vacuum strategy for pgsql newbies should
solve a lot of those problems, preventing a lot of "pgsql is slow" threads.

Regards,
Andreas


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Tom Lane
Kevin Brown <kevin@sysexperts.com> writes:
> This is why I suspect the best way to manage this would be to manage
> the counter itself using the MVCC mechanism (that is, you treat the
> shared counter as a row in a table just like any other and, in fact,
> it might be most beneficial for it to actually be exactly that), which
> handles the visibility problem automatically.  But I don't know how
> much contention there would be as a result.

Hm.  Contention probably wouldn't be the killer, since if transactions
don't try to update the count until they are about to commit, they won't
be holding the row lock for long.  (You'd have to beware of deadlocks
between transactions that need to update multiple counters, but that
seems soluble.)  What *would* be a problem is that such counter tables
would accumulate huge numbers of dead rows very quickly, making it
inefficient to find the live row.  Josh already mentioned this as a
problem with user-trigger-based counting.  You could stanch the bleeding
with sufficiently frequent vacuums, perhaps, but it just doesn't look
very appealing.

Ultimately what this comes down to is "how much overhead are we willing
to load onto all other operations in order to make SELECT-COUNT(*)-with-
no-WHERE-clause fast"?  Postgres has made a set of design choices that
favor the other operations.  If you've designed an application that
lives or dies by fast COUNT(*), perhaps you should choose another
database.

            regards, tom lane


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Tom Lane
Andreas Pflug <Andreas.Pflug@web.de> writes:
>  I think a row counter on each table would be even useful for some kind
> of auto-vacuum mechanism, that could be triggered if pg_class.reltuples
> deviates too far from the real row count.

It would be counting the wrong thing.  auto-vacuum needs to know how
many dead tuples are in a table, not how many live ones.  Example:
UPDATE doesn't change the live-tuple count (without this property,
I don't think the sort of count maintenance Kevin is proposing could
possibly be efficient enough to be interesting).  But it does create
a dead tuple that vacuum wants to know about.

            regards, tom lane


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Andreas Pflug
Tom Lane wrote:

>It would be counting the wrong thing.  auto-vacuum needs to know how
>many dead tuples are in a table, not how many live ones.  Example:
>UPDATE doesn't change the live-tuple count (without this property,
>I don't think the sort of count maintenance Kevin is proposing could
>possibly be efficient enough to be interesting).  But it does create
>a dead tuple that vacuum wants to know about.
>
>
>
I understand your point, but is this about VACUUM only or VACUUM ANALYZE
too? People wouldn't bother about big databases if it's still fast
(until the disk is full :-)

Do dead tuples affect query planning? I thought the planner only cares
about existing rows and their data patterns.
So count(*), pg_stat_all_tables.n_tup_ins, .n_tup_upd and .n_tup_del all
together can make a VACUUM ANALYZE necessary, right?

Regards,
Andreas


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Andrew Sullivan
On Sun, Apr 20, 2003 at 11:21:32AM -0400, Tom Lane wrote:

> favor the other operations.  If you've designed an application that
> lives or dies by fast COUNT(*), perhaps you should choose another
> database.

Or consider redesigning the application.  The "no where clause"
restriction sure smacks of poor database normalisation to me.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Kevin Brown
Tom Lane wrote:
> Kevin Brown <kevin@sysexperts.com> writes:
> > This is why I suspect the best way to manage this would be to manage
> > the counter itself using the MVCC mechanism (that is, you treat the
> > shared counter as a row in a table just like any other and, in fact,
> > it might be most beneficial for it to actually be exactly that), which
> > handles the visibility problem automatically.  But I don't know how
> > much contention there would be as a result.
>
> Hm.  Contention probably wouldn't be the killer, since if transactions
> don't try to update the count until they are about to commit, they won't
> be holding the row lock for long.  (You'd have to beware of deadlocks
> between transactions that need to update multiple counters, but that
> seems soluble.)  What *would* be a problem is that such counter tables
> would accumulate huge numbers of dead rows very quickly, making it
> inefficient to find the live row.

But that inefficiency is a problem for *all* oft-updated tables, is it
not?  I know that you'll end up with an additional n tuples per
transaction (where n is the average number of tables inserted into or
deleted from per transaction), so this isn't an insignificant problem,
but it's one faced by any application that often updates a small
table.

Causing a transaction which is already doing inserts/deletes to take
the hit of doing one additional update doesn't seem to me to be a
particularly large sacrifice, especially since the table it's updating
(the one that contains the counts) is likely to be cached in its
entirety.  The chances are reasonable that the other activity the
transaction is performing will dwarf the additional effort that
maintaining the count demands.

> Josh already mentioned this as a problem with user-trigger-based
> counting.

Right, but the trigger based mechanism probably magnifies the issue by
orders of magnitude, and thus can't necessarily be used as an argument
against an internally-implemented method.

> You could stanch the bleeding with sufficiently frequent vacuums,
> perhaps, but it just doesn't look very appealing.

I would say this is more a strong argument for automatic VACUUM
management than against count management, because what you say here is
true of any oft-updated, oft-referenced table.

> Ultimately what this comes down to is "how much overhead are we willing
> to load onto all other operations in order to make SELECT-COUNT(*)-with-
> no-WHERE-clause fast"?  Postgres has made a set of design choices that
> favor the other operations.  If you've designed an application that
> lives or dies by fast COUNT(*), perhaps you should choose another
> database.

Or perhaps a mechanism similar to the one being discussed should be
implemented and controlled with a GUC variable, so instead of forcing
someone to choose another database you force them to choose between
the performance tradeoffs involved.  We already give DBAs such choices
elsewhere, e.g. pg_stat_activity.

The real question in all this is whether or not fast COUNT(*)
operations are needed often enough to even justify implementing a
mechanism to make them possible in PG.  The question of getting fast
answers from COUNT(*) comes up often enough to be a FAQ, and that
suggests that there's enough demand for the feature that it may be
worth implementing just to shut those asking for it up.  :-)

Personally, I'd rather see such development effort go towards more
beneficial improvements, such as replication, 2PC, SQL/MED, etc. (or
even improving the efficiency of MVCC, since it was mentioned here as
a problem! :-).  I consider COUNT(*) without a WHERE clause to be a
corner case, despite the frequency of questions about it.  But I don't
think we should reject a patch to implement fast COUNT(*) just because
it represents a performance tradeoff, at least if it's GUC-controlled.


--
Kevin Brown                          kevin@sysexperts.com


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Tom Lane
Kevin Brown <kevin@sysexperts.com> writes:
> Personally, I'd rather see such development effort go towards more
> beneficial improvements, such as replication, 2PC, SQL/MED, etc. (or
> even improving the efficiency of MVCC, since it was mentioned here as
> a problem! :-).  I consider COUNT(*) without a WHERE clause to be a
> corner case, despite the frequency of questions about it.

Exactly.

> But I don't
> think we should reject a patch to implement fast COUNT(*) just because
> it represents a performance tradeoff, at least if it's GUC-controlled.

Well, this is moot since I see no one offering to provide such a patch.
But performance tradeoffs are only one of the costs involved.  I suspect
any such mechanism would be invasive enough to represent a nontrivial
ongoing maintenance cost, whether anyone uses it or not.  The extent
to which it munges core functionality would have to be a factor in
deciding whether to accept it.  It'd take lots more thought than we've
expended in this thread to get an accurate handle on just what would
be involved...

(BTW, if anyone actually is thinking about this, please make it a
per-table option not a global GUC option.)

            regards, tom lane


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Josh Berkus
Kevin,

> > Josh already mentioned this as a problem with user-trigger-based
> > counting.
>
> Right, but the trigger based mechanism probably magnifies the issue by
> orders of magnitude, and thus can't necessarily be used as an argument
> against an internally-implemented method.

I'm not sure about that, Kevin.   The production trigger test was written in C
(by Joe Conway), using some of the best memory/efficiency management he could
devise.   I could buy that the trigger mechanism adds a certain fixed
overhead to the process, but not the contention that we were seeing ...
especially not the geometric progression of inefficiency as the transaction
count went up.   We'll talk about this offlist; I may be able to get the
client to authorize letting you examine the database.

For further detail, our setup was sort of a "destruction test"; including:
1) a slightly underpowered server running too many processes;
2) a very high disk contention environment, with multiple applications
fighting for I/O.
3) running COUNT(*), GROUP BY x on a table with 1.4 million rows, which was
being updated in batches of 10,000 rows to 40,000 rows every few minutes.

As I said before, the overhead for c-trigger based accounting, within the MVCC
framework, was quite tolerable with small update batches, only 9-11% penalty
to the updates overall for batches of 100-300 updates.   However, as we
increased the application activity, the update penalty increased, up to
40-45% with the full production load.

It's not hard to figure out why; like most user's servers, the aggregate
caching table was on the same disk as the table(s) being updated.    The
result was a huge amount of disk-head-skipping between the updated table and
the aggregate caching table every time a commit hit the database, with random
seek times increasing the longer the time since the last VACUUM.

Now, on a better server with these tables on fast RAID or on different
spindles, I expect the result would be somewhat better.   However, I also
suspect that many of the users who complain the loudest about slow count(*)
are operating in single-spindle environments.

--
Josh Berkus
Aglio Database Solutions
San Francisco


Re: [SQL] Yet Another (Simple) Case of Index not used

From: "Jim C. Nasby"
On Sat, Apr 19, 2003 at 12:03:18PM -0700, Josh Berkus wrote:
> Kevin, Tom:
>
> > (The cheapness can be disputed as well, since it creates a single point
> > of contention for all inserts and deletes on the table.  But that's a
> > different topic.)
>
> Actually, this was the problem with the trigger method of maintaining COUNT
> information in PostgreSQL.   The statistics table itself becomes a
> significant source of delay: if table_A gets 10,000 rows updated, then
> table_count_A must necessarily be updated 10,000 times ... creating a lot of
> dead tuples and severely bloating the table on disk until the next vacuum
> ... resulting in update #10,000 to table_count_A taking 100+ times as long as
> update #1 does, due to the required random seek time on disk.

Once statement-level triggers are implemented, the performance would
probably be fine, assuming your update was a single statement.
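A statement-level trigger fires once per statement rather than once per row, so a 10,000-row statement would touch the count table once instead of 10,000 times. One hypothetical shape (all names invented; note that early statement triggers pass no transition data, so the trigger body would have to obtain the row delta some other way):

```sql
-- Hypothetical: refresh_addresses_count() is an invented function name.
CREATE TRIGGER addresses_count_stmt
    AFTER INSERT OR DELETE ON addresses
    FOR EACH STATEMENT
    EXECUTE PROCEDURE refresh_addresses_count();
```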
--
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


Re: [SQL] Yet Another (Simple) Case of Index not used

From: "Christopher Kings-Lynne"
> Once statement-level triggers are implemented, the performance would
> probably be fine, assuming your update was a single statement.

Statement triggers _are_ implemented in CVS.

Chris


Re: [SQL] Yet Another (Simple) Case of Index not used

From: Greg Stark
Tom Lane <tgl@sss.pgh.pa.us> writes:

> AFAICS, central-counter techniques could only work in an MVCC system
> if each transaction copies every counter in the system at each snapshot
> freeze point, in case it finds itself needing that counter value later
> on.  This is a huge amount of mostly-useless overhead, and it makes the
> problem of lock contention for access to the counters several orders of
> magnitude worse than you'd first think.

Well, one option would be to do it in a lazy way: if you do an update on a
table with cached aggregate data, just throw the cached value out. That way
you get to cache data on infrequently updated tables, and pay only a very
small penalty on frequently updated tables.

--
greg