Re: Inconvenience of pg_read_binary_file() - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Inconvenience of pg_read_binary_file()
Date
Msg-id 20220729.162145.1850303588308173872.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: Inconvenience of pg_read_binary_file()  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Inconvenience of pg_read_binary_file()
List pgsql-hackers
Thanks for taking a look!

At Thu, 28 Jul 2022 16:22:17 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in 
> Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> > - Simplified the implementation (by complexifying argument handling..).
> > - REVOKEd EXECUTE from the new functions.
> > - Edited the signature of the two functions.
> 
> >> - pg_read_file ( filename text [, offset bigint, length bigint [, missing_ok boolean ]] ) → text
> >> + pg_read_file ( filename text [, offset bigint, length bigint ] [, missing_ok boolean ] ) → text
> 
> I'm okay with allowing this variant of the functions.  Since there's
> no implicit cast between bigint and bool, plus the fact that you
> can't give just offset without length, there shouldn't be much risk
> of confusion as to which variant to invoke.

Grad to hear that.

> I don't really like the implementation style though.  That mess of
> PG_NARGS tests is illegible code already and this makes it worse.

Ah..., I have to admit that I faintly felt that feeling while on it...

> I think it'd be way cleaner to have all the PG_GETARG calls in the
> bottom SQL-callable functions (which are already one-per-signature)
> and then pass them on to a common function that has an ordinary C
> call signature, along the lines of
> 
> static Datum
> pg_read_file_common(text *filename_t,
>                     int64 seek_offset, int64 bytes_to_read,
>                     bool read_to_eof, bool missing_ok)
> {
>     if (read_to_eof)
>         bytes_to_read = -1;    // or just Assert that it's -1 ?

I prefer assertion since that parameter cannot be passed by users.

>     else if (bytes_to_read < 0)
>         ereport(...);
>     ...
> }

This function cannot return NULL directly. Without the ability to
return NULL, it is pointless for the function to return Datum.  In the
attached the function returns text*.

> Datum
> pg_read_file_off_len(PG_FUNCTION_ARGS)
> {
>     text       *filename_t = PG_GETARG_TEXT_PP(0);
>     int64       seek_offset = PG_GETARG_INT64(1);
>     int64       bytes_to_read = PG_GETARG_INT64(2);
> 
>     return pg_read_file_common(filename_t, seek_offset, bytes_to_read,
>                                false, false);

As the result this function need to return NULL or TEXT_P according to
the returned value from pg_read_file_common.

+    if (!ret)
+        PG_RETURN_NULL();
+
+    PG_RETURN_TEXT_P(ret);
> }

# I'm tempted to call read_text_file() directly from each SQL functions..

Please find the attached.  I added some regression tests for both
pg_read_file() and pg_read_binary_file().

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Next
From: Pavel Borisov
Date:
Subject: Re: POC: Lock updated tuples in tuple_update() and tuple_delete()