Thread: queries on 2+ indices
How does postgresql perform queries on one table using more than one index? For example, assuming the following: create table t1 ( f1 int, f2 int); create index t1_f1 on t1 (f1); create index t1_f2 on t1 (f2); select * from t1 where f1=123 and f2=456; By default, both indices will be btree's making them sorted. Therefore, does postgresql retrieve all matching records from t1_f1 and t1_f2 into intermediate tables and then performs somekind of merge sort before retrieving the final results from t1? If so or if not, are intermediate files created for this kind of operation or does the postgresql support queries to multiple fields directly in its indexing system (perhaps aided by "analyze")? Or, does this kind of operation rely much on memory? I have tried making heads or tails out of the source code, but postgresql is far more daunting than I had expected. Nevertheless, for future reference, how could I find answers to questions about query management by postgresql? Many thanks for your excellent ordbms, Marc Tardif
Marc Tardif <admin@wtbwts.com> writes: > How does postgresql perform queries on one table using more than one > index? It doesn't. Simple enough, eh? For example, assuming the following: > create table t1 ( f1 int, f2 int); > create index t1_f1 on t1 (f1); > create index t1_f2 on t1 (f2); > select * from t1 where f1=123 and f2=456; The optimizer will attempt to guess which index is more selective (will return fewer tuples for its part of the WHERE clause). That index would be used for the indexscan, and the rest of the WHERE clause would be applied as a "qpqual", ie actually evaluated as an expression against each tuple found by the index. As you note, there's not any really efficient way to make use of independent indexes to evaluate an AND condition like this one. While Postgres' approach is pretty simplistic, I'm not sure that a more-complicated approach would actually be any faster. If you have a multi-column index, eg create index t1_f1_f2 on t1 (f1, f2); then the system can and will use both clauses of the WHERE with that single index. But again, it's not entirely clear that that's all that much faster than just using the more-selective clause in a smaller index. Furthermore, a multi-column index is more specialized than single-column indexes because it is useful for only a narrower range of queries; so you have to consider the extra work done at insert/update to manage the extra index, and decide if it's really a win overall for your application. > how could I find answers to questions about query management by > postgresql? Asking questions on the mailing lists isn't a bad way to start. Seeing what EXPLAIN says about how queries will be executed is another nice learning tool. There is some high-level implementation info in the SGML documentation, and more scattered in various README files, but you won't really understand a lot until you start burrowing into the source code. regards, tom lane
I'm not sure I understand what is "qpqual" in your explanation. Once the first indexscan is performed, is a temporary table created (in-file or in-memory) containing the relevant tuples? If not, how can the remaining part of the WHERE clause be evaluated against the previously selected tuples during the first indexscan? Or, is the remaining part of the WHERE clause re-evaluated again and again for each of the found tuples in the first indexscan? On Fri, 18 Feb 2000, Tom Lane wrote: > Marc Tardif <admin@wtbwts.com> writes: > > > For example, assuming the following: > > > > create table t1 ( f1 int, f2 int); > > create index t1_f1 on t1 (f1); > > create index t1_f2 on t1 (f2); > > select * from t1 where f1=123 and f2=456; > > The optimizer will attempt to guess which index is more selective > (will return fewer tuples for its part of the WHERE clause). That > index would be used for the indexscan, and the rest of the WHERE > clause would be applied as a "qpqual", ie actually evaluated as > an expression against each tuple found by the index. > > As you note, there's not any really efficient way to make use of > independent indexes to evaluate an AND condition like this one. > While Postgres' approach is pretty simplistic, I'm not sure that > a more-complicated approach would actually be any faster. >
Marc Tardif <admin@wtbwts.com> writes: > I'm not sure I understand what is "qpqual" in your explanation. Once the > first indexscan is performed, is a temporary table created (in-file or > in-memory) containing the relevant tuples? No, it's all done on-the-fly as each tuple is scanned. > is the remaining part of the WHERE > clause re-evaluated again and again for each of the found tuples in the > first indexscan? That's what I said. regards, tom lane