Home > mailing lists
Re: PATCH: Attempt to make dbsize a bit more consistent - Mailing list pgsql-hackers

From	gkokolatos@pm.me
Subject	Re: PATCH: Attempt to make dbsize a bit more consistent
Date	February 19, 2021 15:57:22
Msg-id	GtoPR_W8-xGE3VMXNFaxckpN6KFDXWuJbMNdOZV_QhCImUhGDdrnA0W33CWcO11a0yiQz04FvHmVr692FYjzHENR7Fu7GWtir6a4n8GrnLU=@pm.me Whole thread
In response to	Re: PATCH: Attempt to make dbsize a bit more consistent (Masahiko Sawada <sawada.mshk@gmail.com>)
List	pgsql-hackers
Tree view




‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, February 1, 2021 1:18 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> On Thu, Nov 12, 2020 at 2:54 AM Soumyadeep Chakraborty
> soumyadeep2007@gmail.com wrote:
>
> > Hey Georgios,
> > On Tue, Nov 10, 2020 at 6:20 AM gkokolatos@pm.me wrote:
> >
> > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > On Monday, November 9, 2020 7:50 PM, Soumyadeep Chakraborty soumyadeep2007@gmail.com wrote:
> > >
> > > > Hey Georgios,
> > > > Thanks for looking for more avenues to invoke tableAM APIS! Please find
> > > > my review below:
> > >
> > > A great review Soumyadeep, it is much appreciated.
> > > Please remember to add yourself as a reviewer in the commitfest
> > > [https://commitfest.postgresql.org/30/2701/]
> >
> > Ah yes. Sorry, I forgot that!
> >
> > > > On Tue, Oct 13, 2020 at 6:28 AM gkokolatos@pm.me wrote:
> > > >
> > > > 1.
> > > >
> > > > > /*
> > > > >
> > > > > -   -   heap size, including FSM and VM
> > > > > -   -   table size, including FSM and VM
> > > > >         */
> > > > >
> > > >
> > > > We should not mention FSM and VM in dbsize.c at all as these are
> > > > heap AM specific. We can say:
> > > > table size, excluding TOAST relation
> > >
> > > Yeah, I was thinking that the notion that FSM and VM are still taken
> > > into account should be stated. We are iterating over ForkNumber
> > > after all.
> > > How about phrasing it as:
> > >
> > > -   table size, including all implemented forks from the AM (e.g. FSM, VM)
> > > -   excluding TOAST relations
> > >
> > > Thoughts?
> >
> > Yes, I was thinking along the same lines. The concept of a "fork" forced
> > should not be forced down into the tableAM. But that is a discussion for
> > another day. We can probably say:
> >
> > -   table size, including all implemented forks from the AM (e.g. FSM, VM
> > -   for the heap AM) excluding TOAST relations
> >
> > > > 2.
> > > >
> > > > > /*
> > > > >
> > > > > -   Size of toast relation
> > > > >     */
> > > > >     if (OidIsValid(rel->rd_rel->reltoastrelid))
> > > > >
> > > > > -   size += calculate_toast_table_size(rel->rd_rel->reltoastrelid);
> > > > >
> > > > > -   {
> > > > >
> > > > > -   Relation toastRel;
> > > > >
> > > > > -
> > > > > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> > > > >
> > > >
> > > > We can replace the OidIsValid check with relation_needs_toast_table()
> > > > and then have the OidIsValid() check in an Assert. Perhaps, that will
> > > > make things more readable.
> > >
> > > Please, allow me to kindly disagree.
> > > Relation is already open at this stage. Even create_toast_table(), the
> > > internal workhorse for creating toast relations, does check reltoastrelid
> > > to test if the relation is already toasted.
> > > Furthermore, we do call:
> > >
> > > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> > >
> > > and in order to avoid elog() errors underneath us, we ought to have
> > > verified the validity of reltoastrelid.
> > > In short, I think that the code in the proposal is not too unreadable
> > > nor that it breaks the coding patterns throughout the codebase.
> > > Am I too wrong?
> >
> > No not at all. The code in the proposal is indeed adhering to the
> > codebase. What I was going for here was to increase the usage of
> > relation_needs_toast_table(). What I meant was:
> > if (table_relation_needs_toast_table(rel))
> > {
> > if (!OidIsValid(rel->rd_rel->reltoastrelid))
> > {
> > elog(ERROR, <errmsg that toast table wasn't found>);
> > }
> > //size calculation here..
> > }
> > We want to error out if the toast table can't be found and the relation
> > is expected to have one, which the existing code doesn't handle.
> >
> > > > 3.
> > > >
> > > > > -   if (rel->rd_rel->relkind == RELKIND_RELATION ||
> > > > > -   rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
> > > > > -   rel->rd_rel->relkind == RELKIND_MATVIEW)
> > > > > -   size = calculate_table_size(rel);
> > > > > -   else
> > > > > -   {
> > > > > -   relation_close(rel, AccessShareLock);
> > > > > -   PG_RETURN_NULL();
> > > > > -   }
> > > >
> > > > This leads to behavioral changes:
> > > > I am talking about the case where one calls pg_table_size() on an index.
> > > > W/o your patch, it returns the size of the index. W/ your patch it
> > > > returns NULL. Judging by the documentation, this function should not
> > > > ideally apply to indexes but it does! I have a sinking feeling that lots
> > > > of users use it for this purpose, as there is no function to calculate
> > > > the size of a single specific index (except pg_relation_size()).
> > > > The same argument I have made above applies to sequences. Users may have
> > > > trial-and-errored their way into finding that pg_table_size() can tell
> > > > them the size of a specific index/sequence! I don't know how widespread
> > > > the use is in the user community, so IMO maybe we should be conservative
> > > > and not introduce this change? Alternatively, we could call out that
> > > > pg_table_size() is only for tables by throwing an error if anything
> > > > other than a table is passed in.
> > > > If we decide to preserve the existing behavior of the pg_table_size():
> > > > It seems that for things not backed by the tableAM (indexes and
> > > > sequences), they should still go through calculate_relation_size().
> > > > We can call table_relation_size() based on if relkind is
> > > > RELKIND_RELATION, RELKIND_TOASTVALUE or RELKIND_MATVIEW. Perhaps it
> > > > might be worthwhile making a new macro RELKIND_HAS_TABLE_STORAGE to
> > > > capture these three cases (See RELKIND_HAS_STORAGE). This also ensures
> > > > that we return 0 for things that don't qualify as RELKIND_HAS_STORAGE,
> > > > such as a partitioned table (Currently w/ the patch applied, we return
> > > > NULL for those cases, which is another behavior change)
> > >
> > > Excellent point. This is the discussion I was longing to have.
> > > I stand by the decision coded in the patch, that pg_table_size() should
> > > return NULL for other kinds of relations, such as indexes, sequences
> > > etc.
> > > It is a conscious decision based on the following:
> > >
> > > -   Supported by the documentation, pg_table_size() applies to tables only.
> > >     For other uses the higher-level functions pg_total_relation_size() or
> > >     pg_relation_size() should be used.
> > >
> > > -   Commit fa352d662e taught pg_relation_size() and friends to return NULL if the object doesn't exist. This
makesperfect sense for the 
> > >     scenarios described in the commit:
> > >     That avoids errors when the functions are used in queries like
> > >     "SELECT pg_relation_size(oid) FROM pg_class",
> > >     and a table is dropped concurrently.
> > >
> > >
> > > IMHO: It is more consistent to return NULL when the relation does exist
> > > OR it is not a table kind.
> > >
> > > -   Returning 0 for things that do not have storage, is nonsensical because
> > >     it implies that it can be NON zero at some point. Things that do not
> > >     have storage have an unknown size.
> > >
> >
> > Fair. We will have to document the behavior change.
> >
> > > As far as for the argument that users might have trialed and errored
> > > their way into undocumented behaviour, I do not think it is strong
> > > enough to stop us from implementing the documented behaviour.
> >
> > Fair. I would strongly vote for having two additional functions
> > (pg_index_size() and pg_sequence_size()) to strongly signal our intent
> > of banning that kind of use of pg_table_size(). I think it would help
> > users a lot. It is not easy to find what function to call when you want
> > the size of a single index/sequence.
> >
> > > > 4.
> > > >
> > > > > @@ -3776,10 +3776,24 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSys
> > > > > gettext_noop("Access Method"));
> > > > >
> > > > >          /*
> > > > >
> > > > >
> > > > > -             * As of PostgreSQL 14, do not use pg_table_size() for indexes and
> > > > >
> > > > >
> > > > > -             * sequences as it does not behave sanely for those.
> > > > >
> > > > >
> > > > > -             *
> > > > >               * As of PostgreSQL 9.0, use pg_table_size() to show a more accurate
> > > > >               * size of a table, including FSM, VM and TOAST tables.
> > > > >               */
> > > > >
> > > > >
> > > > > -            if (pset.sversion >= 90000)
> > > > >
> > > > >
> > > > > -            if (pset.sversion >= 140000)
> > > > >
> > > > >
> > > > > -                appendPQExpBuffer(&buf,
> > > > >
> > > > >
> > > > > -                                  ",\\\\n CASE"
> > > > >
> > > > >
> > > > > -                                  " WHEN c.relkind in ("CppAsString2(RELKIND_INDEX)", "
> > > > >
> > > > >
> > > > > -                                                        CppAsString2(RELKIND_PARTITIONED_INDEX)", "
> > > > >
> > > > >
> > > > > -                                                        CppAsString2(RELKIND_SEQUENCE)") THEN"
> > > > >
> > > > >
> > > > > -                                  "     pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid))"
> > > > >
> > > > >
> > > > > -                                  " ELSE"
> > > > >
> > > > >
> > > > > -                                  "     pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid))"
> > > > >
> > > > >
> > > > > -                                  " END as \\\\"%s\\\\"",
> > > > >
> > > > >
> > > > > -                                  gettext_noop("Size"));
> > > > >
> > > > >
> > > > > -            else if (pset.sversion >= 90000)
> > > > >                  appendPQExpBuffer(&buf,
> > > > >                                    ",\\\\n  pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as
\\\\"%s\\\\"",
> > > > >                                    gettext_noop("Size"));
> > > > >
> > > > >
> > > >
> > > > Following on from point 3, if we decide to preserve the existing behavior,
> > > > we wouldn't need this change, as it would be internalized within
> > > > pg_table_size().
> > >
> > > We really should not decide to preserve the existing behaviour.
> > > I will reiterate my point: Returning 0 for things that do not have
> > > storage, implies that it can be NON zero at some point. We should not
> > > treat pg_table_size() as an alias for pg_relation_size().
> >
> > +1
> >
> > > > 4.
> > > >
> > > > > > -   return size;
> > > > > >
> > > > > > -   Assert(size < PG_INT64_MAX);
> > > > > >
> > > > > > -
> > > > > > -   return (int64)size;
> > > > > >     I assume that this change, and the other ones like that, aim to handle int64
> > > > > >     overflow? Using the extra legroom of uint64 can still lead to an overflow,
> > > > > >     however theoretical it may be. Wouldn't it be better to check for overflow
> > > > > >     before adding to size rather than after the fact? Something along the lines of
> > > > > >     checking for headroom left:
> > > > > >     rel_size = table_relation_size(..);
> > > > > >     if (rel_size > (PG_INT64_MAX - total_size))
> > > > > >     < error codepath >
> > > > > >
> > > > > >
> > > > > > total_size += rel_size;
> > > > >
> > > > > Actually not, the intention is not to handle overflow. The table_relation_size() returns a uint64 and the
callingfunction returns int64. 
> > > > > The Assert() has been placed in order to denote that it is acknowledged that the two functions return
differenttypes. I was of the opinion that a run time check will not be needed as even the > smaller type can cover more
than9200 PetaByte tables. 
> > > > > If we were to change anything, then I would prefer to change the signature of the pg_*_size family of
functionsto return uint64 instead. 
> > > >
> > > > Changing the signature would be the ideal change for all of this here.
> > > > But Postgres does not have support for an unsigned 64bit integer (bigint
> > > > is signed). One would need to turn to extensions such as [1]. Thus, +1
> > > > to what Daniel said above.
> > >
> > > Apologies, I do not follow. Are you suggesting that we should
> > > introduce overflow tests?
> >
> > Yes, to introduce the same overflow test that Daniel suggested above as
> > returning a uint64 in PG does not really return a uint64 AFAIU (since
> > the pg_**size() functions all return bigint which is signed and there
> > is no uint64 user-facing type).
>
> Status update for a commitfest entry.
>
> This patch gets review comments and there was some discussion. It
> seems we're waiting for the patch update. So I've moved this patch to
> the next commitfest and set it to "Waiting on Author".

Thank you for your patience.

Please find v3 attached. I will reset the status to 'Ready for review'.

>
> Regards,
>
>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Masahiko Sawada
> EDB: https://www.enterprisedb.com/
Attachment

v3-0001-Attempt-to-make-dbsize-a-bit-more-consistent.patch
pgsql-hackers by date:
From: Pavel Stehule
Date: 19 February 2021, 15:28:02
Subject: Re: Problem with accessing TOAST data in stored procedures
From: Dilip Kumar
Date: 19 February 2021, 16:12:29
Subject: Re: [HACKERS] Custom compression methods
Re: PATCH: Attempt to make dbsize a bit more consistent - Mailing list pgsql-hackers

Attachment

Previous

Next