Re: [PATCH] [v8.5] Security checks on largeobjects - Mailing list pgsql-hackers

From KaiGai Kohei
Subject Re: [PATCH] [v8.5] Security checks on largeobjects
Date
Msg-id 4A4BF87E.7010107@ak.jp.nec.com
In response to Re: [PATCH] [v8.5] Security checks on largeobjects  (KaiGai Kohei <kaigai@ak.jp.nec.com>)
Responses Re: [PATCH] [v8.5] Security checks on largeobjects
List pgsql-hackers
I found one more issue with applying largeobject-style interfaces
to generic toasted varlena data.

When we fetch a toasted datum, the code scans pg_toast_%u with SnapshotToast,
because it assumes that toasted chunks never have multiple versions and that
visibility of the toast pointer always implies visibility of the toast chunks.

However, if we provide largeobject-style interfaces which allow partial
updates on toasted varlena data, it seems to me this assumption will no
longer hold.
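
To make the concern concrete, here is a toy sketch in C. It is not
PostgreSQL code: ToyChunk and toy_count_versions() are hypothetical names,
and the scan simply ignores tuple versions, which is only a rough model of
what the assumption behind SnapshotToast permits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of a toast chunk; not the real tuple layout. */
typedef struct ToyChunk
{
    unsigned    value_id;   /* toast value the chunk belongs to */
    int         seq;        /* chunk sequence number within the value */
    unsigned    xmin;       /* inserting transaction, ignored by the scan */
} ToyChunk;

/*
 * Count the chunks a version-blind scan returns for (value_id, seq).
 * A result greater than 1 means the reader sees several versions of
 * the same chunk and cannot choose between them.
 */
int
toy_count_versions(const ToyChunk *chunks, size_t n,
                   unsigned value_id, int seq)
{
    int         count = 0;
    size_t      i;

    assert(chunks != NULL || n == 0);
    for (i = 0; i < n; i++)
    {
        if (chunks[i].value_id == value_id && chunks[i].seq == seq)
            count++;
    }
    return count;
}
```

In this model, a partial update that writes a new version of chunk seq 1
under the same value id makes the scan return two chunks for that seq.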

Does anyone have a good idea?

KaiGai Kohei wrote:
> I concluded that the following issues should be solved when we apply
> largeobject-like interfaces on the big toasted data within general
> relations, not only pg_largeobject system catalog.
> 
> At first, we need to add a new strategy to store the given varlena data
> on the external toast relation.
> If we try to seek to and fetch a certain range of data, we must be able
> to compute which chunks store the data specified by the offset and
> length. So the external chunks must always be uncompressed. This is a
> common requirement for both read and write operations.
> If we try to update a part of the toasted data, the datum must not be
> inlined, regardless of its length, because otherwise we would need to
> update the whole tuple which contains the inlined data.
> If we open the toasted varlena in read-only mode, an inlined datum does
> not cause any problem. It is an issue only for write operations.
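
The chunk lookup described above can be sketched as follows. This is only
an illustration under assumptions: TOY_CHUNK_SIZE is a made-up constant
(not the real toast chunk size), and ToyChunkRange and toy_chunk_range()
are hypothetical names. The arithmetic works only because every chunk is
uncompressed and has the same fixed size.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed payload size of each chunk; illustrative value only. */
#define TOY_CHUNK_SIZE 2000

typedef struct ToyChunkRange
{
    int64_t     first_seq;  /* first chunk touched by the request */
    int64_t     last_seq;   /* last chunk touched by the request */
    int32_t     first_off;  /* starting byte offset inside the first chunk */
    int32_t     last_end;   /* exclusive end offset inside the last chunk */
} ToyChunkRange;

/* Map a byte range (offset, length) onto fixed-size chunk numbers. */
ToyChunkRange
toy_chunk_range(int64_t offset, int64_t length)
{
    ToyChunkRange r;
    int64_t     last_byte;

    assert(offset >= 0 && length > 0);
    last_byte = offset + length - 1;

    r.first_seq = offset / TOY_CHUNK_SIZE;
    r.last_seq = last_byte / TOY_CHUNK_SIZE;
    r.first_off = (int32_t) (offset % TOY_CHUNK_SIZE);
    r.last_end = (int32_t) (last_byte % TOY_CHUNK_SIZE) + 1;
    return r;
}
```

For example, with this chunk size a request for 300 bytes at offset 1900
touches chunks 0 and 1, while 1000 bytes at offset 4500 stay inside chunk 2.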
> 
> I would like to add a new strategy on pg_type.typstorage with the following
> characteristics:
>  1. It always stores the given varlena data without any compression.
>     So, the given data is stored as a set of fixed-length chunks.
>  2. It always stores the given varlena data on external toast relation.
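
As a rough sketch of how such a strategy would differ from the existing
ones: the codes 'p', 'e', 'm' and 'x' below are the existing typstorage
values, while 'b' is a hypothetical letter chosen here for the proposed
strategy. ToyStoragePolicy and toy_policy_for() are illustrative names,
not existing code.

```c
#include <assert.h>
#include <stdbool.h>

/* What the toaster is allowed to do with a value, in this toy model. */
typedef struct ToyStoragePolicy
{
    bool        may_compress;       /* value may be compressed */
    bool        may_inline;         /* value may stay in the main tuple */
    bool        may_externalize;    /* value may move to the toast relation */
} ToyStoragePolicy;

ToyStoragePolicy
toy_policy_for(char typstorage)
{
    /* Start from 'p' (plain): inline only, never compressed. */
    ToyStoragePolicy p = {false, true, false};

    switch (typstorage)
    {
        case 'p':
            break;
        case 'e':               /* external: out of line, no compression */
            p.may_externalize = true;
            break;
        case 'm':               /* main: compress, external as last resort */
        case 'x':               /* extended: compress and move out of line */
            p.may_compress = true;
            p.may_externalize = true;
            break;
        case 'b':               /* hypothetical: always external, raw */
            p.may_inline = false;
            p.may_externalize = true;
            break;
        default:
            assert(!"unknown typstorage code");
            break;
    }
    return p;
}
```

The point of the sketch is that 'e' still allows a short value to stay
inline, which the proposed strategy would never do.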
> 
> I suggest a new built-in type named BLOB which has an identical definition
> to BYTEA type, except for its attstorage.
> 
> Next, a different version of lo_open() should be provided to accept
> BLOB type as follows:
> 
>   SELECT pictname, lo_open(pictdata, x'20000'::int) FROM my_picture;
> 
> It will allocate a largeobject descriptor for the given BLOB data,
> and the user can read and write it using the loread() and lowrite()
> interfaces.
> 
> issue:
>   In this case, should it hold the relation handler and locks on the
>   "my_picture" relation, not only its toast relation?
> issue:
>   Should the lo_open() with read-only mode be available on the existing
>   TEXT or BYTEA types? I could not find any reason to deny them.
> 
> Next, pg_largeobject system catalog can be redefined using the BLOB
> type as follows:
> 
>   CATALOG(pg_largeobject,2613)
>   {
>       Oid         loowner;        /* OID of the largeobject owner */
>       Oid         lonsp;          /* OID of the largeobject namespace */
>       aclitem     loacl[1];       /* access permissions */
>       blob        lodata;         /* contents of the largeobject */
>   } FormData_pg_largeobject;
> 
> The existing largeobject interfaces would operate on the
> pg_largeobject.lodata specified by the largeobject identifier.
> The rest of the metadata can be used for access control purposes.
> 
> Thanks,
> 
> KaiGai Kohei wrote:
>> Tom Lane wrote:
>>> Bernd Helmle <mailings@oopsware.de> writes:
>>>> It might be interesting to dig into your proposal deeper in conjunction 
>>>> with TOAST (you've already mentioned this TODO). Having serial access with 
>>>> a nice interface into TOAST would be eliminating the need for 
>>>> pg_largeobject completely (i'm not a big fan of this one-big-system-table 
>>>> approach the old LO interface currently is).
>>> Yeah, it would be more useful probably to fix that than to add
>>> decoration to the LO facility.  Making LO more usable is just going to
>>> encourage people to bump into its other limitations (32-bit OIDs,
>>> 32-bit object size, finite maximum size of pg_largeobject, lack of
>>> dead-object cleanup, etc etc).
>> The reason why I mentioned the named largeobject feature is
>> that DAC security checks on largeobjects require them to belong to
>> a certain schema, so I thought it quite natural for them to have a
>> string name. However, obviously, it is not a significant theme for me.
>>
>> I also agree with your opinion that the largeobject interfaces should
>> be redefined to access parts of TOAST'ed varlena data structures,
>> not only pg_largeobject.
>>
>> In this case, we will need a new pg_type.typstorage option which
>> forces the given varlena data to be stored in an external relation
>> without compression, because we cannot compute the data offset in
>> inlined or compressed external varlena data.
>>
>> I'll try to submit a design within a few days.
>> Thanks,
> 
> 


-- 
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai@ak.jp.nec.com>

