Re: Seqscan in MAX(index_column) - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Seqscan in MAX(index_column)
Date
Msg-id 873cfcbl7d.fsf@stark.dyndns.tv
Whole thread Raw
In response to Re: Seqscan in MAX(index_column)  ("Shridhar Daithankar" <shridhar_daithankar@persistent.co.in>)
Responses Re: Seqscan in MAX(index_column)  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
"Shridhar Daithankar" <shridhar_daithankar@persistent.co.in> writes:

> On 4 Sep 2003 at 11:32, Paulo Scardine wrote:
> 
> > (Perhaps a newbie question, but I tried to google this out without success).
> > 
> > Why postgres does an expensive seqscan to find the max(value) for an indexed
> > column? I think MAX() does not know or cares if a column is indexed, but...
> > Should not it? BTW, is there some smarter trick to do that?
> 
> No. Postgresql uses MVCC which mean there could be multiple views of sample 
> tuple active at the same time. There is no way to tell which is max. value for 
> a column as definition of a committed value can be a moving target.

It has nothing to do with MVCC. It has to do with implementing this is hard in
the general case.

Think of examples like:

select max(foo) group by bar;

or

select max(foo) where xyz = z;

To do it properly max/min have to be special-cased and tightly integrated with
other code to handle index scans and aggregates. As it currently stands
they're implemented the same way as any other aggregate, which means they get
to see all the records in the grouping.

This is a frequently asked question, I'm surprised you didn't find stuff
searching with google. There have been numerous long discussions on this topic
not long ago. People are still trying to think about how to handle this
better.

-- 
greg



pgsql-hackers by date:

Previous
From: Neil Conway
Date:
Subject: FK type mismatches?
Next
From: Alvaro Herrera Munoz
Date:
Subject: Re: tablelevel and rowlevel locks