Thread: help with dual indexing
I've got a table with about 10 million events in it. Each has a user_id (about 1000 users) and a event_time timestamp covering a 4 year period with about 50% of the events being in the last year. Some users have only dozens of events. A few have hundreds of thousands. The queries usually are in the form of, where "user_id = something and event_time between something and something". Half of my queries index off of the user_id and half index off the event_time. I was thinking this would be a perfect opportunity to use a dual index of (user_id,event_time) but I'm confused as to weather this will help considering the size of this index given that there very few tuples that have the exact same timestamp as another and I'm not sure which order to put the user_id/event_time as I don't know what is meant when people on this list ask which is more selective. Also, would it make sense for me to raise my ANALYZE value and how would I go about doing this? Thanks for the help. -- Orion Henry <orion@trustcommerce.com>
Attachment
Orion Henry <orion@trustcommerce.com> writes: > The queries usually are in the form of, where "user_id = something and > event_time between something and something". > Half of my queries index off of the user_id and half index off the > event_time. I was thinking this would be a perfect opportunity to use a > dual index of (user_id,event_time) but I'm confused as to weather this > will help Probably. Put the user_id as the first column of the index --- if you think about the sort ordering of a multicolumn index, you will see why. With user_id first, a constraint as above describes a contiguous subrange of the index; with event_time first it does not. regards, tom lane
Thanks Tom! You're a life-saver. On Fri, 2004-01-23 at 17:08, Tom Lane wrote: > Orion Henry <orion@trustcommerce.com> writes: > > The queries usually are in the form of, where "user_id = something and > > event_time between something and something". > > > Half of my queries index off of the user_id and half index off the > > event_time. I was thinking this would be a perfect opportunity to use a > > dual index of (user_id,event_time) but I'm confused as to weather this > > will help > > Probably. Put the user_id as the first column of the index --- if you > think about the sort ordering of a multicolumn index, you will see why. > With user_id first, a constraint as above describes a contiguous > subrange of the index; with event_time first it does not. > > regards, tom lane -- Orion Henry <orion@trustcommerce.com>