Re: Efficient slicing/substring of TOAST values (for - Mailing list pgsql-patches
From | John Gray |
---|---|
Subject | Re: Efficient slicing/substring of TOAST values (for |
Date | |
Msg-id | 1002669495.23073.637.camel@adzuki Whole thread Raw |
In response to | Re: Efficient slicing/substring of TOAST values (for comment) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Efficient slicing/substring of TOAST values (for comment)
|
List | pgsql-patches |
On Mon, 2001-10-08 at 16:06, Tom Lane wrote: > John Gray <jgray@azuli.co.uk> writes: > > 2. Changed e.g. textsubstr and byteasubstr to use this method. > > textsubstr is complicated by the multibyte support -the fast method is > > only applicable in a non-multibyte environment. > > More accurately, it's only applicable in single-byte character sets, > which is a useful fraction of the MULTIBYTE world too. Keep in mind > that MULTIBYTE is likely to default to ON in the near future, so there's > little point in such an exercise if you are going to ifdef it out. > I believe there's a routine somewhere that will tell you whether the > current database encoding is single-byte or multi-byte. > I've found pg_database_encoding_max_length() which seems to be the right thing. There may be some prospect of performance improvement even for multibyte cases. For a substr(attr,start,len) call, I could fetch the segment starting from the start of the attribute through to pg_database_encoding_max_length()*(start+len). If this is less than the length of attr then we can save time and memory. The existing multibyte substring operation is then applied to this returned string. > > 3. Added any facility to force a column to have attstorage 'e'. At > > present it appears to be defaulted from typstorage, but I couldn't see > > any problem with changing it after table creation. Would a keyword to > > CREATE TABLE to override the default attstorage be useful? > > I've been envisioning adding an ALTER TABLE variant to do this, rather > than mucking up CREATE TABLE --- compare the handling of attstattarget. > I've had a look, and it seems that AlterTableAlterColumnStatistics contains a lot of template code (and only a small amount actually dealing with stats). If we had different Node types for the different Column flag-changing operations (especially those which don't touch tuple data at all) then the current AlterTableAlterColumnStatistics might become AlterTableAlterColumnFlags (?) and take several kinds of nodes. Thus all the inheritance-supporting code wouldn't need to be duplicated. Alternatively, cloning the routine wouldn't be too much work. > No, this has been a to-do item for awhile. I think it's a tad too late > to think in terms of squeezing it into 7.2, but keep working on it for > 7.3. > Shall do. I'll develop something for substr operations in line with above. > Have you got any ideas about the update side of the problem? > Well, if a user is only updating a small part of a toasted value, then only the corresponding chunks need be updated -assuming that the update is Procrustean i.e. that the substitute string is coerced to the same length as the segment it replaces, by truncation or blank padding. What I'm not sure about is the syntax for specifying such an update. Would we just overload the syntax for an array? e.g. UPDATE objects SET obj_data[32:47]='0123456789abcdef'; I can see the other problem being that an updated toast value currently gets a new valueid. Maybe I'll need to work through this more, but I can't see why that need be the case -i.e. the toast update could just perform a normal update on each chunk rather than creating new ones and deleting old ones? Regards John
pgsql-patches by date: