Thread: low cardinality column

low cardinality column

From
"Rong Wu"
Date:
Hi,

I have a select like this:

SELECT MAX(transactionid) FROM cbntransaction WHERE transactiontypeid=0;

in the query:
transactionid is the primary key of cbntransaction table,
But transactiontypeid is a low cardinality column, there're over 100,000
records has the same trnsactiontypeid.
I was trying to create an index on (transactiontypeid, transactionid), but
no luck on that, postgresql will still scan the table.
I'm wondering if there's solution for this query:
Maybe something like if I can partition the table using transactiontypeid,
and do a local index on transactionid on each partition, but I couldnt'
find any doc on postgresql to do that.

Thanks in advance,
rong :-)



Re: low cardinality column

From
Josh Berkus
Date:
Rong,

> I have a select like this:
>
> SELECT MAX(transactionid) FROM cbntransaction WHERE transactiontypeid=0;

Simple workaround:

Create an mulit-column index on transactiontypeid, transactionid.

SELECT transactionid FROM cbtransaction
WHERE transactiontypeid=0
ORDER BY transactionid DESC LIMIT 1;

This approach will use the index.

Of course, if the reason you are selecting the max id is to get the next id,
there are much better ways to do that.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: low cardinality column

From
Rod Taylor
Date:
On Thu, 2003-10-02 at 14:30, Rong Wu wrote:
> Hi,
>
> I have a select like this:
>
> SELECT MAX(transactionid) FROM cbntransaction WHERE transactiontypeid=0;

For various reasons (primarily MVCC and the ability to make custom
aggregates making it difficult)  MAX() is not optimized in this fashion.

Try:

  SELECT transactionid
    FROM ...
   WHERE ...
ORDER BY transactionid DESC
   LIMIT 1;

Attachment

Re: low cardinality column

From
Bill Moran
Date:
Rod Taylor wrote:
> On Thu, 2003-10-02 at 14:30, Rong Wu wrote:
>
>>Hi,
>>
>>I have a select like this:
>>
>>SELECT MAX(transactionid) FROM cbntransaction WHERE transactiontypeid=0;
>
>
> For various reasons (primarily MVCC and the ability to make custom
> aggregates making it difficult)  MAX() is not optimized in this fashion.
>
> Try:
>
>   SELECT transactionid
>     FROM ...
>    WHERE ...
> ORDER BY transactionid DESC
>    LIMIT 1;

Despite this good suggestion, if you're using this technique to generate
the next transaction ID, you're going to have errors as concurrency rises.

Use a SERIAL, which guarantees that you won't have two processes generate
the same number.

--
Bill Moran
Potential Technologies
http://www.potentialtech.com


Thanks - Re: low cardinality column

From
"Rong Wu"
Date:
Thanks, Rod, Josh and Bill, That' fantastic.

have a nice day,
rong :-)

> Rod Taylor wrote:
>> On Thu, 2003-10-02 at 14:30, Rong Wu wrote:
>>
>>>Hi,
>>>
>>>I have a select like this:
>>>
>>>SELECT MAX(transactionid) FROM cbntransaction WHERE transactiontypeid=0;
>>
>>
>> For various reasons (primarily MVCC and the ability to make custom
>> aggregates making it difficult)  MAX() is not optimized in this fashion.
>>
>> Try:
>>
>>   SELECT transactionid
>>     FROM ...
>>    WHERE ...
>> ORDER BY transactionid DESC
>>    LIMIT 1;
>
> Despite this good suggestion, if you're using this technique to generate
> the next transaction ID, you're going to have errors as concurrency rises.
>
> Use a SERIAL, which guarantees that you won't have two processes generate
> the same number.
>
> --
> Bill Moran
> Potential Technologies
> http://www.potentialtech.com
>
>