Thread: bytea_ops
Not sure if this needs discussion on hackers or not, but I am sure someone will redirect me if it does ;)

I found that I had a need to create and use an index on bytea columns for a project I'm currently working on. In the archives, I found this message by Tom -- http://fts.postgresql.org/db/mw/msg.html?mid=93112 -- in which he indicated that operators for bytea were needed but no one had gotten to them yet.

So here's a patch against cvs HEAD. Seems to work correctly in my limited testing, and passes all regression tests. I would appreciate it if someone more experienced could give it a look.

Thanks,

-- Joe
Attachment
"Joe Conway" <joseph.conway@home.com> writes: > I found that I had a need to create and use an index on bytea columns for a > project I'm currently working on. Looks pretty good. I have only three gripes, two trivial. In order of decreasing trivialness: +bytealt(PG_FUNCTION_ARGS) +{ + bytea *arg1 = PG_GETARG_BYTEA_P(0); + VarChar *arg2 = PG_GETARG_BYTEA_P(1); Looks like you missed one VarChar -> bytea. + if ((cmp < 0) || ((cmp == 0) && (len1 < len2))) + PG_RETURN_BOOL(TRUE); + else + PG_RETURN_BOOL(FALSE); I think it's better style to code stuff like this as PG_RETURN_BOOL((cmp < 0) || ((cmp == 0) && (len1 < len2))); BoolGetDatum already does the work of ensuring that the returned Datum is exactly TRUE or FALSE, so why do it over again? The biggie is that you missed adding support for bytea to scalarltsel, which puts a severe crimp on the optimizer's ability to make any intelligent decisions about using your index. See src/backend/utils/adt/selfuncs.c, about line 580. You need to add a case to the convert_to_scalar() function in that file. I'd say that bytea should be a separate type category --- don't try to cram it into convert_to_scalar's string category, because the string-handling code assumes null-termination is safe. But the implementation should be modeled on the handling of strings: you want to strip any common prefix, then use the next few bytes as base-256 digits to form numeric values. The comments for convert_string_to_scalar should give you an idea. regards, tom lane
> Looks pretty good. I have only three gripes, two trivial. In order of decreasing trivialness:
>
> +bytealt(PG_FUNCTION_ARGS)
> +{
> +	bytea	   *arg1 = PG_GETARG_BYTEA_P(0);
> +	VarChar    *arg2 = PG_GETARG_BYTEA_P(1);
>
> Looks like you missed one VarChar -> bytea.

Ouch! You are of course correct.

> +	if ((cmp < 0) || ((cmp == 0) && (len1 < len2)))
> +		PG_RETURN_BOOL(TRUE);
> +	else
> +		PG_RETURN_BOOL(FALSE);
>
> I think it's better style to code stuff like this as
>
>	PG_RETURN_BOOL((cmp < 0) || ((cmp == 0) && (len1 < len2)));

OK

> BoolGetDatum already does the work of ensuring that the returned Datum is exactly TRUE or FALSE, so why do it over again?
>
> The biggie is that you missed adding support for bytea to scalarltsel, which puts a severe crimp on the optimizer's ability to make any intelligent decisions about using your index. See src/backend/utils/adt/selfuncs.c, about line 580. You need to add a case to the convert_to_scalar() function in that file. I'd say that bytea should be a separate type category --- don't try to cram it into convert_to_scalar's string category, because the string-handling code assumes null-termination is safe. But the implementation should be modeled on the handling of strings: you want to strip any common prefix, then use the next few bytes as base-256 digits to form numeric values. The comments for convert_string_to_scalar should give you an idea.

I thought I must have missed something here. I noticed that with ~40000 rows inserted (and the table vacuumed), explain on a < or > query showed the optimizer tended to not use the index unless severely restricted, i.e.

	select * from foobar where f1 < '\\000\\015';

would do a table scan, while

	select * from foobar where f1 < '\\000\\011';

would use the index. Hopefully fixing the above will improve this.

Thank you for looking at this so quickly :-) I'll take another pass and send in a new patch soon.

-- Joe
Here's a new patch.

> Looks like you missed one VarChar -> bytea.

Fixed.

> I think it's better style to code stuff like this as
>
>	PG_RETURN_BOOL((cmp < 0) || ((cmp == 0) && (len1 < len2)));

Done.

> The biggie is that you missed adding support for bytea to scalarltsel, which puts a severe crimp on the optimizer's ability to make any intelligent decisions about using your index.

Hopefully done correctly ;-)

-- Joe
Attachment
"Joe Conway" <joseph.conway@home.com> writes: >> The biggie is that you missed adding support for bytea to scalarltsel, >> which puts a severe crimp on the optimizer's ability to make any >> intelligent decisions about using your index. > Hopefully done correctly ;-) I'm inclined to think that convert_bytea_to_scalar should just always assume that the appropriate base is 256, ie, all possible byte values can appear in the data. The logic that convert_string_to_scalar uses to guess at a suitable base depends strongly on the assumption that ASCII text is much more likely than any other kind of data. That doesn't seem like the right assumption to make for bytea, I'd think. You've removed the more blatant ASCII dependencies from the code, but I still wonder whether it makes any sense to assume that the byte values seen in the given strings should be used as predictors of the overall distribution of byte values in the column. On the other hand, the reason that convert_string_to_scalar makes all those difficult-to-justify assumptions is that using a large base leads to overly optimistic selectivity estimates. (Example: suppose the histogram bounds are 'aa' and 'zz', and we are trying to estimate the selectivity of the range 'bb' to 'cc'. If we assume the data range is 'a'..'z' then we get scalar equivalents of aa = 0, zz = 0.9985, bb = 0.03994, cc = 0.079881 leading to a selectivity estimate of 0.03994. If we use a data range of 0..255 then we get aa = 0.380386, bb = 0.384307, cc = 0.388229, zz = 0.478424 leading to selectivity = 0.00392, more than a factor of 10 smaller.) Depending on how you are using bytea, 0..255 might be too large for its data range too. Thoughts? BTW, I think that convert_bytea_datum is probably unnecessary, and definitely it's a waste of cycles to palloc a copy of the input values. convert_string_datum exists to (a) unify the representations of the different datatypes that we consider strings, and (b) apply strxfrm if necessary. Neither of those motivations will ever apply to bytea AFAICS. So you could just as easily pass the given Datums directly to convert_bytea_to_scalar and let it work directly on them. regards, tom lane
> selectivity of the range 'bb' to 'cc'. If we assume the data range is 'a'..'z' then we get scalar equivalents of aa = 0, zz = 0.9985, bb = 0.03994, cc = 0.079881 leading to a selectivity estimate of 0.03994. If we use a data range of 0..255 then we get aa = 0.380386, bb = 0.384307, cc = 0.388229, zz = 0.478424 leading to selectivity = 0.00392, more than a factor of 10 smaller.) Depending on how you are using bytea, 0..255 might be too large for its data range too. Thoughts?

The specific case that I am using bytea for is storage of completely random binary data generated for use as cryptographic keys and account identifiers. For this application, you generally are only interested in the selectivity of an exact match, i.e. "where vl_guid = 'bytea-data'". But in any case, for this type of data assuming a 0..255 range for any particular byte is completely appropriate. And in testing, I found that the current calculation is reasonably accurate. For example, with some randomly generated data:

vsreg_192=# select count(*) from vs_lkup;
 count
-------
 24950
(1 row)

vsreg_192=# explain select vl_guid from vs_lkup where vl_guid < '\\300\\300';
NOTICE:  QUERY PLAN:

Seq Scan on vs_lkup  (cost=0.00..727.88 rows=319 width=24)

EXPLAIN
vsreg_192=# select count(*) from vs_lkup where vl_guid < '\\300\\300';
 count
-------
   287
(1 row)

vsreg_192=# explain select vl_guid from vs_lkup where vl_guid < '\\300\\377';
NOTICE:  QUERY PLAN:

Seq Scan on vs_lkup  (cost=0.00..727.88 rows=425 width=24)

EXPLAIN
vsreg_192=# select count(*) from vs_lkup where vl_guid < '\\300\\377';
 count
-------
   390
(1 row)

vsreg_192=# explain select vl_guid from vs_lkup where vl_guid < '\\300\\010';
NOTICE:  QUERY PLAN:

Index Scan using vs_lkup_vl_guid_key on vs_lkup  (cost=0.00..43.80 rows=11 width=24)

EXPLAIN
vsreg_192=# select count(*) from vs_lkup where vl_guid < '\\300\\010';
 count
-------
    14
(1 row)

(**Note that this database *only* has data >= \300 in vl_guid**)

I think there are other scenarios in which you might want to use bytea where the distribution is less random (I'm thinking in terms of any compressible binary data like executables or some image types), but I can't think of any offhand where ordering of the data is all that meaningful. On the other hand, I suppose if you wanted to use bytea to store some sort of bitmapped data it might be highly skewed, and interesting to select distinct ranges from. Given that, it might make sense to leave the range estimate as-is.

> BTW, I think that convert_bytea_datum is probably unnecessary, and definitely it's a waste of cycles to palloc a copy of the input values. convert_string_datum exists to (a) unify the representations of the different datatypes that we consider strings, and (b) apply strxfrm if necessary. Neither of those motivations will ever apply to bytea AFAICS. So you could just as easily pass the given Datums directly to convert_bytea_to_scalar and let it work directly on them.

OK -- I almost did just that, but I was unsure of the reason for working with copies. I'll fix it on the next round.

-- Joe
"Joe Conway" <joseph.conway@home.com> writes: > But in any case, for this type of data assuming a 0..255 range for any > particular byte is completely appropriate. And in testing, I found that the > current calculation is reasonably accurate. Well, it *is* accurate, if and only if that assumption is correct. Obviously it's correct for random-byte data. > I think there are other scenarios in which you might want to use bytea where > the distribution is less random (I'm thinking in terms of any compressible > binary data like executables or some image types), but I can't think of any > offhand where ordering of the data is all that meaningful. Yeah. Compressed data would show a pretty even byte-value distribution as well, but the real question is what sort of data might be found in a bytea column for which "x < y" is an interesting comparison. I'm not sure either. > On the other hand, I suppose if you wanted to use bytea to store some sort > of bitmapped data it might be highly skewed, and interesting to select > distinct ranges from. Given that, it might make sense to leave the > range estimate as-is. I don't like the estimate as-is. For textual data it makes some sense to classify characters into a small number of categories (letters, digits, other), and with so few categories it's not completely ridiculous to suppose that the three available strings might tell you which categories are present in a column. For bytea data, there are no natural categories and thus no justification for extrapolating byte-value distribution from the info available to scalarltsel. So I think there's no defensible argument for using anything but 0..255. (I suppose we could consider adding more info to pg_statistic for these types of columns, but I'm not eager to do that right at the moment.) regards, tom lane
> which categories are present in a column. For bytea data, there are no natural categories and thus no justification for extrapolating byte-value distribution from the info available to scalarltsel. So I think there's no defensible argument for using anything but 0..255. (I suppose we could consider adding more info to pg_statistic for these types of columns, but I'm not eager to do that right at the moment.)

Changes in this rev:

- convert_bytea_to_scalar uses the 0..255 range always
- eliminated convert_bytea_datum; just pass and use the Datums directly

Results still look good on my random data, and the patch passes all regression tests.

-- Joe
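[To make the agreed-on behavior concrete, here is a rough sketch of the conversion Tom described -- strip the prefix shared with the histogram bounds, then treat the following bytes as fractional base-256 digits. This is illustrative only; the function and parameter names are made up and are not the committed code.]

/* Map a bytea value to a scalar in [0,1), assuming all 256 byte values
 * can occur.  common_prefix_len is the length of the prefix shared by
 * the value and both histogram bounds, which carries no information. */
static double
bytea_to_scalar_sketch(const unsigned char *bytes, int len, int common_prefix_len)
{
	double		num = 0.0;
	double		denom = 256.0;
	int			i;

	/* skip the prefix common to the value and the low/high bounds */
	bytes += common_prefix_len;
	len -= common_prefix_len;

	/* use the next few bytes as base-256 digits after the radix point */
	for (i = 0; i < len && i < 10; i++)
	{
		num += bytes[i] / denom;
		denom *= 256.0;
	}
	return num;
}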
Attachment
"Joe Conway" <joseph.conway@home.com> writes: > [ bytea comparison & indexing patch, rev 2 ] Checked and applied. I renumbered the OID assignments to avoid creating gaps in the set of used OIDs, but otherwise it looked great. Many thanks! regards, tom lane
Tom has applied your patch.

> > which categories are present in a column. For bytea data, there are no natural categories and thus no justification for extrapolating byte-value distribution from the info available to scalarltsel. So I think there's no defensible argument for using anything but 0..255. (I suppose we could consider adding more info to pg_statistic for these types of columns, but I'm not eager to do that right at the moment.)
>
> Changes in this rev:
>
> - convert_bytea_to_scalar uses the 0..255 range always
> - eliminated convert_bytea_datum; just pass and use the Datums directly
>
> Results still look good on my random data, and the patch passes all regression tests.
>
> -- Joe
>
> [ Attachment, skipping... ]

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026