Thread: Re: Expressional Indexes
Josh Berkus <josh@agliodbs.com> writes: >> I'm not particularly happy with the phrase "expressional index", and >> would like to think of something else to use before the 7.4 docs go >> final. Any ideas? > Hmmm ... well, actually, I like "Expressional Indexes". What's wrong > with it? Mainly that "expressional" is a made-up word. I have been considering using "calculated index" or "computed index" but dunno if that really conveys anything. regards, tom lane
Tom, > Mainly that "expressional" is a made-up word. So? We're in the tech biz, Tom. New-coined words are expected. And that way nobody will expect it to mean something else, since we made it up. -- H. Dumpty, Q.E.D. > I have been considering using "calculated index" or "computed index" > but dunno if that really conveys anything. Well, "Expression Indexes" is the most accurate. Or "Expression-Based Indexes." -- -Josh BerkusAglio Database SolutionsSan Francisco
Josh Berkus wrote: > Tom, > > > Mainly that "expressional" is a made-up word. > > So? We're in the tech biz, Tom. New-coined words are expected. And that > way nobody will expect it to mean something else, since we made it up. > -- H. Dumpty, Q.E.D. > > > I have been considering using "calculated index" or "computed index" > > but dunno if that really conveys anything. > > Well, "Expression Indexes" is the most accurate. Or "Expression-Based > Indexes." Computed index sound too much like there is computation done during the index lookup, which there isn't, and it sounds like it would behave differently from a normal index, which it doesn't. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Tom Lane writes: > Mainly that "expressional" is a made-up word. At least it's better than "functional index", because I had always wondered where the dysfunctional indexes went. :) I like "expression index". -- Peter Eisentraut peter_e@gmx.net
[sNip] >> I have been considering using "calculated index" or "computed index" >> but dunno if that really conveys anything. > > Well, "Expression Indexes" is the most accurate. Or "Expression-Based > Indexes." What is the proposed definition of an "Expression Index?" When I see this term, I get the impression I can create an index that's based on the results of a SELECT, such as for selecting data with specific values or ranges of values... For example, if I want to index on a date field but only have the index keep track of the most recent 30 days (and then create a secondary index for all dates) so as to improve performance on more heavily loaded systems. Am I understanding this new terminology correctly? Thanks in advance. -- Randolf Richardson - rr@8x.ca Inter-Corporate Computer & Network Services, Inc. Vancouver, British Columbia, Canada http://www.8x.ca/ This message originated from within a secure, reliable, high-performance network ... a Novell NetWare network.
"Randolf Richardson, DevNet SysOp 29" <rr@8x.ca> writes: > For example, if I want to index on a date field but only have the index > keep track of the most recent 30 days (and then create a secondary index for > all dates) so as to improve performance on more heavily loaded systems. > > Am I understanding this new terminology correctly? Thanks in advance. No, you could do the above using "partial indexes" but it wouldn't work very well in this case because the "last 30 days" keeps moving and you would have to keep redefining the index periodically. It also wouldn't really help performance. Expression Indexes are just more powerful "functional indexes". In 7.3 they could be used for indexing expressions like "lower(foo)". In 7.4 they're more powerful and you can index expressions other than simple function calls. They still should be things that always return the same value, which excludes subqueries. -- greg
>> For example, if I want to index on a date field but only have the index >> keep track of the most recent 30 days (and then create a secondary >> index for all dates) so as to improve performance on more heavily >> loaded systems. >> >> Am I understanding this new terminology correctly? Thanks in advance. > > No, you could do the above using "partial indexes" but it wouldn't work > very well in this case because the "last 30 days" keeps moving and you > would have to keep redefining the index periodically. For the application I will need to develop in the future, it would be okay for this index to hold data more than 30 days old, and then be redefined on a monthly basis along with regular database vacuuming (and other maintenance). Could this be done with a "partial index" as follows? CREATE INDEX my_index on my_table (create_date) WHERE (create_date > age(timestamp '30 days')); If I've made any mistakes here, please don't hesitate to let me know because the age() function is new to me. > It also wouldn't really help performance. Really? A smaller index would result in fewer comparisons behind-the- scenes though, wouldn't it? > Expression Indexes are just more powerful "functional indexes". In 7.3 > they could be used for indexing expressions like "lower(foo)". In 7.4 > they're more powerful and you can index expressions other than simple > function calls. [sNip] So an "Expression Index" could, for example, be used to sort alpha- numeric data in a case-insensitive manner? I just want to make sure I'm understanding this correctly. Thanks. -- Randolf Richardson - rr@8x.ca Inter-Corporate Computer & Network Services, Inc. Vancouver, British Columbia, Canada http://www.8x.ca/ This message originated from within a secure, reliable, high-performance network ... a Novell NetWare network.
"Randolf Richardson, DevNet SysOp 29" <rr@8x.ca> writes: > For the application I will need to develop in the future, it would be > okay for this index to hold data more than 30 days old, and then be > redefined on a monthly basis along with regular database vacuuming (and > other maintenance). Okay ... > Could this be done with a "partial index" as follows? > CREATE INDEX my_index on my_table (create_date) > WHERE (create_date > age(timestamp '30 days')); No, because the above represents a moving cutoff; it will (and should) be rejected as a non-immutable predicate condition. You could do something like CREATE INDEX my_Nov_03_index on my_table (create_date) WHERE (create_date >= date '2003-11-01'); and then a month from now replace this with CREATE INDEX my_Dec_03_index on my_table (create_date) WHERE (create_date >= date '2003-12-01'); bearing in mind that this index can be used with queries that contain WHERE conditions like "create_date >= some-date-constant". The planner must be able to convince itself that the right-hand side of the WHERE condition is >= the cutoff in the index's predicate condition. Since the planner is not very bright, both items had better be simple DATE constants, or it won't be able to figure it out ... > So an "Expression Index" could, for example, be used to sort alpha- > numeric data in a case-insensitive manner? I just want to make sure I'm > understanding this correctly. It won't do anything you could not have done in prior releases using a custom-built function as the named function of a functional index. In that sense it's just a notational improvement rather than a real advance in capability. (Actually I guess there is an advance in capability: 7.4 lets you build a multi-column index on multiple expressions, whereas in prior releases a functional index could only have one index column. Not sure how important this is in practice though.) regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > No, because the above represents a moving cutoff; it will (and should) > be rejected as a non-immutable predicate condition. You could do > something like > > CREATE INDEX my_Nov_03_index on my_table (create_date) > WHERE (create_date >= date '2003-11-01'); > > and then a month from now replace this with > > CREATE INDEX my_Dec_03_index on my_table (create_date) > WHERE (create_date >= date '2003-12-01'); > > bearing in mind that this index can be used with queries that contain > WHERE conditions like "create_date >= some-date-constant". The planner > must be able to convince itself that the right-hand side of the WHERE > condition is >= the cutoff in the index's predicate condition. Since > the planner is not very bright, both items had better be simple DATE > constants, or it won't be able to figure it out ... Note that if you're just doing this to speed up regular queries where you have create_date in some small range, then you'll likely not see much of an increase. Mainly you'll just save space. What can be interesting is to create a partial index like this but over a second unrelated column. Something like: CREATE INDEX my_dec_03_index on my_table (userid)WHERE (create_date >= date '2003-11-02'); Then you can do queries like SELECT * FROM my_table WHERE userid = ? AND create_date >= date '2003-11-02' And it'll be able to efficiently pull out just those records, even if there are thousands more records that are older than 2003-11-02. This avoids having to create a two-column index with a low-selectivity column like "month". -- greg
[sNip] >> bearing in mind that this index can be used with queries that contain >> WHERE conditions like "create_date >= some-date-constant". The planner >> must be able to convince itself that the right-hand side of the WHERE >> condition is >= the cutoff in the index's predicate condition. Since >> the planner is not very bright, both items had better be simple DATE >> constants, or it won't be able to figure it out ... > > Note that if you're just doing this to speed up regular queries where > you have create_date in some small range, then you'll likely not see > much of an increase. Mainly you'll just save space. Saving space is not my objective for this particular problem, but of course it's definitely good news as I strive to optimize from every angle whenever possible. What I'm aiming for is to get an increase, no matter how small, because the volume is expected to be extremely heavy duty (unfortunately I can't discuss too many details of the project due to agreements I've made with others), so even if the end result seems insignificant from the perspective of a single query, the advantages become obvious when the queries are performed repeatedly simultaneously for wide variety of massive numbers of clients. > What can be interesting is to create a partial index like this but over > a second unrelated column. Something like: > > CREATE INDEX my_dec_03_index on my_table (userid) > WHERE (create_date >= date '2003-11-02'); > > Then you can do queries like > > SELECT * FROM my_table WHERE userid = ? AND create_date >= date > '2003-11-02' > > And it'll be able to efficiently pull out just those records, even if > there are thousands more records that are older than 2003-11-02. What a fascinating trick. I could also use dummy data instead of "userid" (which has obvious functionality as implied by its name), or even data that the applications can even specify first because they'll be able to determine things on the client-side that will make index selection more appropriate. I haven't checked into this yet, but is there a way to specify which index PostgreSQL use as a parameter in a SELECT? > This avoids having to create a two-column index with a low-selectivity > column like "month". Thanks, both of you. I've got some re-thinking to do for this project I'm working on (since it's a low priority project at the moment, I've got plenty of time to re-work the plan over and over again). -- Randolf Richardson - rr@8x.ca Inter-Corporate Computer & Network Services, Inc. Vancouver, British Columbia, Canada http://www.8x.ca/ This message originated from within a secure, reliable, high-performance network ... a Novell NetWare network.