Thread: query optimizer questions
Why do the following three queries use different scan types under 7.1.2? Under 6.5.3 they all used index (and my system ran MUCH faster, since I often use the trailing %) I've done a vacuum analyze. I installed 7.1.2 on RedHat 6.2 using the RPMs. anderson=# explain select value from s_n_grantor where value='foobar'; NOTICE: QUERY PLAN: Index Scan using s_n_grantor_key on s_n_grantor (cost=0.00..823.58 rows=247 width=12) EXPLAIN anderson=# explain select value from s_n_grantor where value like 'foobar'; NOTICE: QUERY PLAN: Seq Scan on s_n_grantor (cost=0.00..8178.46 rows=247 width=12) anderson=# explain select value from s_n_grantor where value like 'foobar%'; NOTICE: QUERY PLAN: Seq Scan on s_n_grantor (cost=0.00..8178.46 rows=1 width=12) EXPLAIN anderson=# \d s_n_grantor_key Index "s_n_grantor_key" Attribute | Type -----------+----------------------- value | character varying(80) btree anderson=# \d s_n_grantor Table "s_n_grantor" Attribute | Type | Modifier -----------+-----------------------+---------- doc | character varying(16) | not null entry | smallint | not null value | character varying(80) | Indices: s_n_grantor_key, s_n_grantor_pkey
Robert Berger writes: > Why do the following three queries use different scan types under 7.1.2? > Under > 6.5.3 they all used index (and my system ran MUCH faster, since I often > use the trailing %) In 7.1, LIKE optimization (i.e., possibly using an index) has been disabled unless you are in the C locale because it does not work correctly. You could set your locale to C and re-initdb to get it to work again. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
Thanks. How do I set the locale? > In 7.1, LIKE optimization (i.e., possibly using an index) has been > disabled unless you are in the C locale because it does not work > correctly. You could set your locale to C and re-initdb to get it to work > again. > > -- > Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster
Robert Berger <rwb@vtiscan.com> writes: > Why do the following three queries use different scan types under 7.1.2? As Peter points out, index optimization of LIKE queries is disabled unless your database is in C locale (if you're not sure, contrib/pg_controldata can tell you what locale your database is using). However, the estimated row counts seem peculiar to me in any case: > anderson=# explain select value from s_n_grantor where value='foobar'; > NOTICE: QUERY PLAN: > Index Scan using s_n_grantor_key on s_n_grantor (cost=0.00..823.58 > rows=247 width=12) > EXPLAIN > anderson=# explain select value from s_n_grantor where value like > 'foobar'; > NOTICE: QUERY PLAN: > Seq Scan on s_n_grantor (cost=0.00..8178.46 rows=247 width=12) > anderson=# explain select value from s_n_grantor where value like > 'foobar%'; > NOTICE: QUERY PLAN: > Seq Scan on s_n_grantor (cost=0.00..8178.46 rows=1 width=12) I'd expect the LIKE 'foobar' to produce the same estimate as = 'foobar', because the LIKE estimator recognizes this as a fixed pattern. But why is that considered *less* selective than LIKE 'foobar%'? Something fishy about that. Could we see the output of select attname,attdispersion,s.* from pg_statistic s, pg_attribute a, pg_class c where starelid = c.oid and attrelid = c.oid and staattnum = attnum and relname = 's_n_grantor'; (If you get no output, it means you didn't VACUUM ANALYZE ...) regards, tom lane
> Thanks. How do I set the locale? Short answer: bash$ LANG=C initdb Longer answer: I find locale support in PostgreSQL confusing. The locale of the current user at the time of initialising the database determinesthe collating order (e.g. is 'a' < 'A'). This in turns affect things like indexing (as described below). Thelocale of the account that is running the current postmaster defines format conversions (e.g. the currency symbol into_char(1,'L99')). The locale of the current process requesting a query defines how the user expects to see the result,and is (of course) ignored by PostgreSQL. To make matters worse: On Red Hat 7.1 (US boxed set) the default locale is set in /etc/sysconf/i18n to en_US, i.e. AmericanEnglish. Fine, except some Americans may be surprised to learn that 'A' > 'a'. When you start the postgresql servicewithout doing an initdb first, the script will helpfully copy i18n to ~postgres/initdb.i18n so you have a recordof the locale (and can re-set it). Except it never uses this file. In fact, the postmaster service runs in the defaultlocale *for root*. A previous version of RH did go through the trouble of sourcing initdb.i18n in .bash_profile,but it does nothing for postmaster which is started as su -l postgres -s /bin/sh ... In order to start postmaster on RH7.1 with a locale that is different from the default (actually: different from root) youshould create ~postgres/.profile with the content PGLIB=/usr/lib/pgsql PGDATA=/var/lib/pgsql/data export PGLIB PGDATA [ -f $PGDATA/../initdb.i18n ] && source $PGDATA/../initdb.i18n You may, of course, have to change the path. You may want to add that last line to your ~postgres/.bash_profile as well,in order to avoid future confusion.... If you dont have initdb.i18n, just delete the content of the data directory and start the postgres service. The result shouldlook something like LANG="en_US" export LANG LC_ALL LC_CTYPE LC_COLLATE LC_NUMERIC LC_CTYPE LC_TIME for a default installation, but of course you want LANG="C" Hope this makes sense to somebody....e-mail if I've managed to confuse anybody, including myself :-) Allan.
Allan Engelhardt writes: > I find locale support in PostgreSQL confusing. The locale of the > current user at the time of initialising the database determines the > collating order (e.g. is 'a' < 'A'). This in turns affect things like > indexing (as described below). This is because if you were allowed to change this setting after the database initialization, your indexes would get corrupted. (Recall that (some) indexes are sorted, and the index scanning code relies on indexes being sorted a certain way.) > The locale of the account that is running the current postmaster > defines format conversions (e.g. the currency symbol in > to_char(1,'L99')). This is a courtesy because the above contraints do not apply to this locale category. I agree it's confusing, but I'm not sure what can be done about it. > The locale of the current process requesting a query defines how the > user expects to see the result, and is (of course) ignored by > PostgreSQL. I guess it would make some sense if the client would pass along its locale settings (in the allowable categories) to the server. > To make matters worse: On Red Hat 7.1 (US boxed set) the default > locale is set in /etc/sysconf/i18n to en_US, i.e. American English. > Fine, except some Americans may be surprised to learn that 'A' > 'a'. What's so surprising about that? Yeah, ASCII order, blah. Remember that computers are for humans, not for computers. Btw., actually the rules are more complicated: ab < Ab < ac -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
Peter Eisentraut wrote: > Allan Engelhardt writes: > > > I find locale support in PostgreSQL confusing. The locale of the > > current user at the time of initialising the database determines the > > collating order (e.g. is 'a' < 'A'). This in turns affect things like > > indexing (as described below). > > This is because if you were allowed to change this setting after the > database initialization, your indexes would get corrupted. (Recall that > (some) indexes are sorted, and the index scanning code relies on indexes > being sorted a certain way.) Yes, I understand WHY it is, I just wish it was documented somewhere obvious. But nevermind - that what these mailing listsare for! Thanks, all. Allan.
> This is because if you were allowed to change this setting after the > database initialization Why can't it be set at database creation time, so different databases can have different locales? Why is it a global per-server setting?