Re: Performance improvement hints + measurement - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Performance improvement hints + measurement
Msg-id 7484.968856456@sss.pgh.pa.us
In response to Re: Performance improvement hints + measurement  (devik@cdi.cz)
List pgsql-hackers
devik@cdi.cz writes:
> What do you think about this approach:

> 1) add all validity & tx fields from heap tuple into 
>    index tuple too

Non-starter I'm afraid.  That would mean that whenever we update a
tuple, we'd have to find and update all the index entries that refer to
it.  You'd be taking a tremendous performance hit on all update
operations in the hope of saving time on only a relatively small number
of inquiries.
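
To put the cost in concrete terms, here is a rough sketch (not PostgreSQL
code; the names are invented) of what an index entry would have to carry
under that scheme, and what expiring one old row version would then require:

typedef unsigned int TransactionId;

typedef struct HypotheticalIndexEntry
{
    char          key[32];      /* the indexed value */
    TransactionId xmin;         /* duplicated visibility info ... */
    TransactionId xmax;         /* ... which must now be kept in sync */
    unsigned long heap_tid;     /* link back to the heap tuple */
} HypotheticalIndexEntry;

/*
 * Marking one row's old version dead could no longer touch just the
 * heap tuple header: every index on the table would have to be
 * descended to find its copy of the entry and rewrite it.
 */
static void
mark_old_version_dead(TransactionId xid,
                      HypotheticalIndexEntry *entries[], int nindexes)
{
    int         i;

    for (i = 0; i < nindexes; i++)
        entries[i]->xmax = xid; /* one extra index search and page write apiece */
}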

This has been discussed before (repeatedly, IIRC).  Please peruse the
pghackers archives.

> I regularly need to run this query against it:
> select name, sum(cnt) from bigrel group by name;
> With an index, pg still has the big overhead of reading the
> heap tuples - mssql uses data directly from the scanned index.

How exactly is MSSQL going to do that with only an index on "name"?
You need to have access to the cnt field as well, which wouldn't be
present in an index entry for name.
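
For the index alone to answer it, the entries would have to carry cnt
alongside name --- in effect a two-column index on (name, cnt).  As a toy
illustration (plain C, not PostgreSQL code), summing per group straight off
such entries in index order, with no heap visits at all:

#include <stdio.h>
#include <string.h>

typedef struct IndexEntry
{
    const char *name;           /* the indexed column */
    long        cnt;            /* only present if cnt is part of the index */
} IndexEntry;

/*
 * Walk the entries in index (name) order and emit one sum per group.
 * With an index on name alone, cnt would have to be fetched from the
 * heap tuple for every single entry.
 */
static void
index_only_group_sum(const IndexEntry *entries, int n)
{
    int         i = 0;

    while (i < n)
    {
        const char *group = entries[i].name;
        long        sum = 0;

        while (i < n && strcmp(entries[i].name, group) == 0)
            sum += entries[i++].cnt;
        printf("%s\t%ld\n", group, sum);
    }
}

int
main(void)
{
    const IndexEntry entries[] = {
        {"apple", 3}, {"apple", 5}, {"pear", 2}, {"pear", 7}, {"plum", 1}
    };

    index_only_group_sum(entries, 5);
    return 0;
}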

> I'm not sure how complex the proposed changes are. Another
> way would be to implement another kind of aggregate node, HashAgg,
> which would use hashing.

That would be worth looking at --- we have no such plan type now.

> But it could be even more complicated, as one has to use a
> temp relation to store all the hash buckets ...

You could probably generalize the existing code for hashjoin tables
to support hash aggregation as well.  Now that I think about it, that
sounds like a really cool idea.  Should put it on the TODO list.
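
For what it's worth, the guts of such a step are simple enough to sketch in
a few dozen lines of plain C (just an illustration of the idea, nothing like
the real executor code): each input row bumps a running sum kept in an
in-memory hash table keyed by the grouping column, so the input needs no
sort and is scanned only once.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64

typedef struct Group
{
    char         *name;         /* grouping key */
    long          sum;          /* running sum(cnt) for this group */
    struct Group *next;         /* collision chain */
} Group;

static Group *buckets[NBUCKETS];

static unsigned
hash_name(const char *s)
{
    unsigned    h = 0;

    while (*s)
        h = h * 31 + (unsigned char) *s++;
    return h % NBUCKETS;
}

/* Find or create the group for this name and add cnt to its sum. */
static void
hash_agg_row(const char *name, long cnt)
{
    unsigned    h = hash_name(name);
    Group      *g;

    for (g = buckets[h]; g != NULL; g = g->next)
        if (strcmp(g->name, name) == 0)
            break;
    if (g == NULL)
    {
        g = malloc(sizeof(Group));
        g->name = strdup(name);
        g->sum = 0;
        g->next = buckets[h];
        buckets[h] = g;
    }
    g->sum += cnt;
}

int
main(void)
{
    int         i;
    Group      *g;

    /* rows may arrive in any order */
    hash_agg_row("pear", 2);
    hash_agg_row("apple", 3);
    hash_agg_row("pear", 7);
    hash_agg_row("apple", 5);

    for (i = 0; i < NBUCKETS; i++)
        for (g = buckets[i]; g != NULL; g = g->next)
            printf("%s\t%ld\n", g->name, g->sum);
    return 0;
}
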
        regards, tom lane

