Thread: BUG #5126: convert_to preventing index scan

BUG #5126: convert_to preventing index scan

From
"Roman Kapusta"
Date:
The following bug has been logged online:

Bug reference:      5126
Logged by:          Roman Kapusta
Email address:      roman.kapusta@gmail.com
PostgreSQL version: 8.3.8-1
Operating system:   fedora 11 i586 (32bit)
Description:        convert_to preventing index scan
Details:

I have table with bytea column, which is indexed (1)
I want to use index during pattern matching (eg. dir like someDirectoryName
|| '/%'), but concatenation of two strings cause error (2)
So I have to use function convert_to (converting text to bytea), but this
has awful explain plan (3)
If I rewrite string concatenation to just one string (4) query plan is
optimal
I found workaround (5), but still it looks like convert_to is causing full
table scan where it should not


(1) # \d paths
Table "paths"
    Column     |            Type             |                     Modifiers

---------------+-----------------------------+------------------------------
----------------------
 dev_id        | bigint                      | not null
 valid_to      | bigint                      | not null default
9223372036854775807::bigint
 name          | character varying(300)      | not null
 dir           | bytea                       | not null
Indexes:
    "paths_dev_id_key" UNIQUE, btree (dev_id, dir, name, valid_to)


(2) # explain select * from paths p where p.dir like E'Multimedia/Videos' ||
E'/%' and p.dev_id = 14 and p.valid_to >= 486629;
ERROR:  operator does not exist: bytea ~~ text
LINE 1: explain select * from paths p where p.dir like E'Multimedia/...
                                                  ^
HINT:  No operator matches the given name and argument type(s). You might
need to add explicit type casts.


(3) # explain select * from paths p where p.dir like
convert_to(E'Multimedia/Videos' || E'/%', 'UTF8') and p.dev_id = 14 and
p.valid_to >= 486629;
QUERY PLAN
----------------------------------------------------------------------------
-------------------------------------------
 Seq Scan on paths p  (cost=0.00..212065.54 rows=1 width=333)
   Filter: ((valid_to >= 486629) AND (dev_id = 14) AND (dir ~~
convert_to('Multimedia/Videos/%'::text, 'UTF8'::name)))



(4) # explain select * from paths p where p.dir like E'Multimedia/Videos/%'
and p.dev_id = 14 and p.valid_to >= 486629;
QUERY PLAN
----------------------------------------------------------------------------
-------------------------------------------------------------
 Index Scan using paths_dev_id_key on paths p  (cost=0.00..10.02 rows=1
width=333)
   Index Cond: ((dev_id = 14) AND (dir >= 'Multimedia/Videos/'::bytea) AND
(dir < 'Multimedia/Videos0'::bytea) AND (valid_to >= 486629))
   Filter: (dir ~~ 'Multimedia/Videos/%'::bytea)


(5) # explain select * from paths p where p.dir like
E'Multimedia/Videos'::bytea || E'/%'::bytea and p.dev_id = 14 and p.valid_to
>= 486629;
QUERY PLAN
----------------------------------------------------------------------------
-------------------------------------------------------------
 Index Scan using paths_dev_id_key on paths p  (cost=0.00..10.02 rows=1
width=333)
   Index Cond: ((dev_id = 14) AND (dir >= 'Multimedia/Videos/'::bytea) AND
(dir < 'Multimedia/Videos0'::bytea) AND (valid_to >= 486629))
   Filter: (dir ~~ 'Multimedia/Videos/%'::bytea)

Re: BUG #5126: convert_to preventing index scan

From
Peter Eisentraut
Date:
On Mon, 2009-10-19 at 11:05 +0000, Roman Kapusta wrote:
> I have table with bytea column, which is indexed (1)
> I want to use index during pattern matching (eg. dir like
> someDirectoryName
> || '/%'), but concatenation of two strings cause error (2)
> So I have to use function convert_to (converting text to bytea), but
> this
> has awful explain plan (3)

You haven't told us how the convert_to function is defined.

Re: BUG #5126: convert_to preventing index scan

From
Andrew Gierth
Date:
>>>>> "Peter" == Peter Eisentraut <peter_e@gmx.net> writes:

 >> I have table with bytea column, which is indexed (1)
 >> I want to use index during pattern matching (eg. dir like
 >> someDirectoryName
 >> || '/%'), but concatenation of two strings cause error (2)
 >> So I have to use function convert_to (converting text to bytea), but
 >> this
 >> has awful explain plan (3)

 Peter> You haven't told us how the convert_to function is defined.

convert_to is a builtin function. If there's a bug here, it's that
convert_to is defined as stable rather than immutable. (Sure it depends
on server_encoding, but that can't exactly change... if there's any
other reason why it's not immutable, I can't think what it is.)

Example (5) from the original message is the correct approach in any
case; as long as either operand of the || is explicitly passed as, or
cast to, a bytea, then it should work.

--
Andrew (irc:RhodiumToad)

Re: BUG #5126: convert_to preventing index scan

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>  Peter> You haven't told us how the convert_to function is defined.

> convert_to is a builtin function. If there's a bug here, it's that
> convert_to is defined as stable rather than immutable. (Sure it depends
> on server_encoding, but that can't exactly change... if there's any
> other reason why it's not immutable, I can't think what it is.)

The conversion itself is dependent on changeable catalog entries,
ie, pg_conversion.  So "stable" seems the appropriate marking to me.

            regards, tom lane

Re: BUG #5126: convert_to preventing index scan

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> convert_to is a builtin function. If there's a bug here, it's that
 >> convert_to is defined as stable rather than immutable. (Sure it
 >> depends on server_encoding, but that can't exactly change... if
 >> there's any other reason why it's not immutable, I can't think
 >> what it is.)

 Tom> The conversion itself is dependent on changeable catalog
 Tom> entries, ie, pg_conversion.  So "stable" seems the appropriate
 Tom> marking to me.

That sounds like a bit of a stretch to me... we treat lots of stuff as
immutable which is actually easier to change than pg_conversion entries
(OS locale definitions for example).

--
Andrew.

Re: BUG #5126: convert_to preventing index scan

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> The conversion itself is dependent on changeable catalog
>  Tom> entries, ie, pg_conversion.  So "stable" seems the appropriate
>  Tom> marking to me.

> That sounds like a bit of a stretch to me... we treat lots of stuff as
> immutable which is actually easier to change than pg_conversion entries
> (OS locale definitions for example).

Um ... locale *is* fixed within a given database, or at least LC_COLLATE
and LC_CTYPE are.  If you see cases where we have this wrong, they may
need to be revisited.

            regards, tom lane

Re: BUG #5126: convert_to preventing index scan

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> The conversion itself is dependent on changeable catalog
 Tom> entries, ie, pg_conversion.  So "stable" seems the appropriate
 Tom> marking to me.

 >> That sounds like a bit of a stretch to me... we treat lots of
 >> stuff as immutable which is actually easier to change than
 >> pg_conversion entries (OS locale definitions for example).

 Tom> Um ... locale *is* fixed within a given database, or at least
 Tom> LC_COLLATE and LC_CTYPE are.  If you see cases where we have
 Tom> this wrong, they may need to be revisited.

The value of LC_CTYPE etc. is fixed, but the meaning that the OS
assigns to that value can be changed (arguably more easily than
changing pg_conversion, now that we don't allow builtin conversion
funcs to be used for conversions other than the one they are coded
for).

Of course, changing the definition of a locale will break everything
until you reindex, etc., but we put up with that because the
alternatives are clearly silly.

--
Andrew.

Re: BUG #5126: convert_to preventing index scan

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> Um ... locale *is* fixed within a given database, or at least
>  Tom> LC_COLLATE and LC_CTYPE are.  If you see cases where we have
>  Tom> this wrong, they may need to be revisited.

> The value of LC_CTYPE etc. is fixed, but the meaning that the OS
> assigns to that value can be changed (arguably more easily than
> changing pg_conversion, now that we don't allow builtin conversion
> funcs to be used for conversions other than the one they are coded
> for).

[ shrug... ]  So?  We have no way of detecting that.  By that logic,
float8pl isn't immutable because the hardware's FP behavior might
change under us.

My cutoff for this sort of issue is that if we provide a SQL command
to change something, we should expect that it will change.  Otherwise,
it's the user's responsibility to deal with the consequences of a change.

            regards, tom lane