Re: 64-bit API for large object - Mailing list pgsql-hackers
From | Amit kapila |
---|---|
Subject | Re: 64-bit API for large object |
Date | |
Msg-id | 6C0B27F7206C9E4CA54AE035729E9C38285378EC@szxeml509-mbs Whole thread Raw |
In response to | Re: 64-bit API for large object (Tatsuo Ishii <ishii@postgresql.org>) |
Responses |
Re: 64-bit API for large object
|
List | pgsql-hackers |
>On Sunday, October 07, 2012 5:42 AM Tatsuo Ishii wrote: >Ok, committed with minor editings(fix header comments in testlo64.c). >Thank you Kaigai-san for review! Hello Tatsuo Ishii San, Today when I tried to build the latest code on my windows m/c, I got few errors from the checkin of this patch. lo_hton64 (due to -- unint32_t).\src\interfaces\libpq\fe-lobj.c(1049) : error C2065: 'uint32_t' : undeclared identifier inv_seek (due to -- MAX_LARGE_OBJECT_SIZE) \src\backend\storage\large_object\inv_api.c(389) : error C2065: 'LOBLKSIZELL' : undeclared identifier inv_read ((due to -- MAX_LARGE_OBJECT_SIZE)) \src\backend\storage\large_object\inv_api.c(441) : error C2065: 'LOBLKSIZELL' : undeclared identifier It may be some settings problem of my m/c if it is okay on some other windows m/c. With Regards, Amit Kapila. > As a committer, I have looked into the patch and it seems it's good to > commit. However I want to make a small enhancement in the > documentation part: > > 1) lo_open section needs to mention about new 64bit APIs. Also it > should include description about lo_truncate, but this is not 64bit > APIs author's fault since it should had been there when lo_truncate > was added. > > 2) Add mention that 64bit APIs are only available in PostgreSQL 9.3 or > later and if the API is requested against older version of servers > it will fail. > > If there's no objection, I would like commit attached patches. > -- > Tatsuo Ishii > SRA OSS, Inc. Japan > English: http://www.sraoss.co.jp/index_en.php > Japanese: http://www.sraoss.co.jp > >> Hi Anzai-san, >> >> The latest patch is fair enough for me, so let me hand over its reviewing >> for comitters. >> >> Thanks, >> >> 2012/10/1 Nozomi Anzai <anzai@sraoss.co.jp>: >>> Here is 64-bit API for large object version 3 patch. >>> >>>> I checked this patch. It looks good, but here are still some points to be >>>> discussed. >>>> >>>> * I have a question. What is the meaning of INT64_IS_BUSTED? >>>> It seems to me a marker to indicate a platform without 64bit support. >>>> However, the commit 901be0fad4034c9cf8a3588fd6cf2ece82e4b8ce >>>> says as follows: >>>> | Remove all the special-case code for INT64_IS_BUSTED, per decision that >>>> | we're not going to support that anymore. >>> >>> Removed INT64_IS_BUSTED. >>> >>> >>>> * At inv_seek(), it seems to me it checks offset correctness with wrong way, >>>> as follows: >>>> | case SEEK_SET: >>>> | if (offset < 0) >>>> | elog(ERROR, "invalid seek offset: " INT64_FORMAT, offset); >>>> | obj_desc->offset = offset; >>>> | break; >>>> It is a right assumption, if large object size would be restricted to 2GB. >>>> But the largest positive int64 is larger than expected limitation. >>>> So, it seems to me it should be compared with (INT_MAX * PAGE_SIZE) >>>> instead. >>> >>> Fixed. >>> >>> >>>> * At inv_write(), it definitely needs a check to prevent data-write upper 4TB. >>>> In case when obj_desc->offset is a bit below 4TB, an additional 1GB write >>>> will break head of the large object because of "pageno" overflow. >>> >>> Added a such check. >>> >>> >>>> * Please also add checks on inv_read() to prevent LargeObjectDesc->offset >>>> unexpectedly overflows 4TB boundary. >>> >>> Added a such check. >>> >>> >>>> * At inv_truncate(), variable "off" is re-defined to int64. Is it really needed >>>> change? All its usage is to store the result of "len % LOBLKSIZE". >>> >>> Fixed and back to int32. >>> >>> >>>> Thanks, >>>> >>>> 2012/9/24 Nozomi Anzai <anzai@sraoss.co.jp>: >>>> > Here is 64-bit API for large object version 2 patch. >>>> > >>>> >> I checked this patch. It can be applied onto the latest master branch >>>> >> without any problems. My comments are below. >>>> >> >>>> >> 2012/9/11 Tatsuo Ishii <ishii@postgresql.org>: >>>> >> > Ok, here is the patch to implement 64-bit API for large object, to >>>> >> > allow to use up to 4TB large objects(or 16TB if BLCKSZ changed to >>>> >> > 32KB). The patch is based on Jeremy Drake's patch posted on September >>>> >> > 23, 2005 >>>> >> > (http://archives.postgresql.org/pgsql-hackers/2005-09/msg01026.php) >>>> >> > and reasonably updated/edited to adopt PostgreSQL 9.3 by Nozomi Anzai >>>> >> > for the backend part and Yugo Nagata for the rest(including >>>> >> > documentation patch). >>>> >> > >>>> >> > Here are changes made in the patch: >>>> >> > >>>> >> > 1) Frontend lo_* libpq functions(fe-lobj.c)(Yugo Nagata) >>>> >> > >>>> >> > lo_initialize() gathers backend 64-bit large object handling >>>> >> > function's oid, namely lo_lseek64, lo_tell64, lo_truncate64. >>>> >> > >>>> >> > If client calls lo_*64 functions and backend does not support them, >>>> >> > lo_*64 functions return error to caller. There might be an argument >>>> >> > since calls to lo_*64 functions can automatically be redirected to >>>> >> > 32-bit older API. I don't know this is worth the trouble though. >>>> >> > >>>> >> I think it should definitely return an error code when user tries to >>>> >> use lo_*64 functions towards the backend v9.2 or older, because >>>> >> fallback to 32bit API can raise unexpected errors if application >>>> >> intends to seek the area over than 2GB. >>>> >> >>>> >> > Currently lo_initialize() throws an error if one of oids are not >>>> >> > available. I doubt we do the same way for 64-bit functions since this >>>> >> > will make 9.3 libpq unable to access large objects stored in pre-9.2 >>>> >> > PostgreSQL servers. >>>> >> > >>>> >> It seems to me the situation to split the case of pre-9.2 and post-9.3 >>>> >> using a condition of "conn->sversion >= 90300". >>>> > >>>> > Fixed so, and tested it by deleteing the lo_tell64's row from pg_proc. >>>> > >>>> > >>>> >> > To pass 64-bit integer to PQfn, PQArgBlock is used like this: int *ptr >>>> >> > is a pointer to 64-bit integer and actual data is placed somewhere >>>> >> > else. There might be other way: add new member to union u to store >>>> >> > 64-bit integer: >>>> >> > >>>> >> > typedef struct >>>> >> > { >>>> >> > int len; >>>> >> > int isint; >>>> >> > union >>>> >> > { >>>> >> > int *ptr; /* can't use void (dec compiler barfs) */ >>>> >> > int integer; >>>> >> > int64 bigint; /* 64-bit integer */ >>>> >> > } u; >>>> >> > } PQArgBlock; >>>> >> > >>>> >> > I'm a little bit worried about this way because PQArgBlock is a public >>>> >> > interface. >>>> >> > >>>> >> I'm inclined to add a new field for the union; that seems to me straight >>>> >> forward approach. >>>> >> For example, the manner in lo_seek64() seems to me confusable. >>>> >> It set 1 on "isint" field even though pointer is delivered actually. >>>> >> >>>> >> + argv[1].isint = 1; >>>> >> + argv[1].len = 8; >>>> >> + argv[1].u.ptr = (int *) &len; >>>> > >>>> > Your proposal was not adopted per discussion. >>>> > >>>> > >>>> >> > Also we add new type "pg_int64": >>>> >> > >>>> >> > #ifndef NO_PG_INT64 >>>> >> > #define HAVE_PG_INT64 1 >>>> >> > typedef long long int pg_int64; >>>> >> > #endif >>>> >> > >>>> >> > in postgres_ext.h per suggestion from Tom Lane: >>>> >> > http://archives.postgresql.org/pgsql-hackers/2005-09/msg01062.php >>>> >> > >>>> >> I'm uncertain about context of this discussion. >>>> >> >>>> >> Does it make matter if we include <stdint.h> and use int64_t instead >>>> >> of the self defined data type? >>>> > >>>> > Your proposal was not adopted per discussion. >>>> > Per discussion, endiannness translation was moved to fe-lobj.c. >>>> > >>>> > >>>> >> > 2) Backend lo_* functions (be-fsstubs.c)(Nozomi Anzai) >>>> >> > >>>> >> > Add lo_lseek64, lo_tell64, lo_truncate64 so that they can handle >>>> >> > 64-bit seek position and data length. loread64 and lowrite64 are not >>>> >> > added because if a program tries to read/write more than 2GB at once, >>>> >> > it would be a sign that the program need to be re-designed anyway. >>>> >> > >>>> >> I think it is a reasonable. >>>> >> >>>> >> > 3) Backend inv_api.c functions(Nozomi Anzai) >>>> >> > >>>> >> > No need to add new functions. Just extend them to handle 64-bit data. >>>> >> > >>>> >> > BTW , what will happen if older 32-bit libpq accesses large objects >>>> >> > over 2GB? >>>> >> > >>>> >> > lo_read and lo_write: they can read or write lobjs using 32-bit API as >>>> >> > long as requested read/write data length is smaller than 2GB. So I >>>> >> > think we can safely allow them to access over 2GB lobjs. >>>> >> > >>>> >> > lo_lseek: again as long as requested offset is smaller than 2GB, there >>>> >> > would be no problem. >>>> >> > >>>> >> > lo_tell:if current seek position is beyond 2GB, returns an error. >>>> >> > >>>> >> Even though iteration of lo_lseek() may move the offset to 4TB, it also >>>> >> makes unavailable to use lo_tell() to obtain the current offset, so I think >>>> >> it is reasonable behavior. >>>> >> >>>> >> However, error code is not an appropriate one. >>>> >> >>>> >> + if (INT_MAX < offset) >>>> >> + { >>>> >> + ereport(ERROR, >>>> >> + (errcode(ERRCODE_UNDEFINED_OBJECT), >>>> >> + errmsg("invalid large-object >>>> >> descriptor: %d", fd))); >>>> >> + PG_RETURN_INT32(-1); >>>> >> + } >>>> >> >>>> >> According to the manpage of lseek(2) >>>> >> EOVERFLOW >>>> >> The resulting file offset cannot be represented in an off_t. >>>> >> >>>> >> Please add a new error code such as ERRCODE_BLOB_OFFSET_OVERFLOW. >>>> > >>>> > Changed the error code and error message. We added a new error code >>>> > "ERRCODE_UNDEFINED_OBJECT (22P07)". >>>> > >>>> > >>>> >> > 4) src/test/examples/testlo64.c added for 64-bit API example(Yugo Nagata) >>>> >> > >>>> >> > Comments and suggestions are welcome. >>>> >> > >>>> >> miscellaneous comments are below. >>>> >> >>>> >> Regression test is helpful. Even though no need to try to create 4TB large >>>> >> object, it is helpful to write some chunks around the design boundary. >>>> >> Could you add some test cases that writes some chunks around 4TB offset. >>>> > >>>> > Added 64-bit lobj test items into regression test and confirmed it worked >>>> > rightly. >>>> > >>>> > >>>> >> Thanks, >>>> >> -- >>>> >> KaiGai Kohei <kaigai@kaigai.gr.jp> >>>> >> >>>> >> >>>> >> -- >>>> >> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >>>> >> To make changes to your subscription: >>>> >> http://www.postgresql.org/mailpref/pgsql-hackers >>>> > >>>> > >>>> > -- >>>> > Nozomi Anzai >>>> > SRA OSS, Inc. Japan >>>> > >>>> > >>>> > -- >>>> > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >>>> > To make changes to your subscription: >>>> > http://www.postgresql.org/mailpref/pgsql-hackers >>>> > >>>> >>>> >>>> >>>> -- >>>> KaiGai Kohei <kaigai@kaigai.gr.jp> >>>> >>>> >>>> -- >>>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >>>> To make changes to your subscription: >>>> http://www.postgresql.org/mailpref/pgsql-hackers >>> >>> >>> -- >>> Nozomi Anzai >>> SRA OSS, Inc. Japan >>> >>> >>> -- >>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >>> To make changes to your subscription: >>> http://www.postgresql.org/mailpref/pgsql-hackers >>> >> >> >> >> -- >> KaiGai Kohei <kaigai@kaigai.gr.jp> >> >> >> -- >> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-hackers by date: