Re: TRUNCATE on foreign table - Mailing list pgsql-hackers

From Kohei KaiGai
Subject Re: TRUNCATE on foreign table
Date
Msg-id CAOP8fzYgM=bxsDgtnucx4k_QUdUknri4UEdPxr9bkkfqRYoRog@mail.gmail.com
In response to Re: TRUNCATE on foreign table  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: TRUNCATE on foreign table  (Kohei KaiGai <kaigai@heterodb.com>)
List pgsql-hackers
On Fri, Apr 9, 2021 at 22:51, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> On 2021/04/09 12:33, Kohei KaiGai wrote:
> > On Thu, Apr 8, 2021 at 22:14, Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >>
> >> On 2021/04/08 22:02, Kohei KaiGai wrote:
> >>>> Anyway, attached is the updated version of the patch. This is still based on the latest Kazutaka-san's patch. That is, extra list for ONLY is still passed to FDW. What about committing this version at first? Then we can continue the discussion and change the behavior later if necessary.
> >>
> >> Pushed! Thank all involved in this development!!
> >> For record, I attached the final patch I committed.
> >>
> >>
> >>> Ok, it's fair enough for me.
> >>>
> >>> I'll try to sort out my thoughts, then raise a follow-up discussion if necessary.
> >>
> >> Thanks!
> >>
> >> The following are the open items and discussion points that I'm thinking of.
> >>
> >> 1. Currently the extra information (TRUNCATE_REL_CONTEXT_NORMAL, TRUNCATE_REL_CONTEXT_ONLY or TRUNCATE_REL_CONTEXT_CASCADING) about how a foreign table was specified as the target to truncate in the TRUNCATE command is collected and passed to the FDW. Does this really need to be passed to the FDW? It seems Stephen, Michael and I think that's necessary, but Kaigai-san does not. I also think that TRUNCATE_REL_CONTEXT_CASCADING can be removed because there seems to be no use case for it.
> >>
> >> 2. Currently, when the same foreign table is specified multiple times in the command, the extra information is collected only for the occurrence found first. For example, when "TRUNCATE ft, ONLY ft" is executed, TRUNCATE_REL_CONTEXT_NORMAL is collected and _ONLY is ignored because "ft" is found first. Is this OK? Or should we collect all of them, e.g., both _NORMAL and _ONLY in that example? I think that the current approach (i.e., collect the extra info about the table found first if the same table is specified multiple times) is good because local tables are treated the same way. But Kaigai-san does not.
> >>
> >> 3. Currently postgres_fdw specifies the ONLY clause in the TRUNCATE command that it constructs. That is, if the foreign table is specified with ONLY, postgres_fdw also issues the TRUNCATE command for the corresponding remote table with ONLY to the remote server. Then only the root table is truncated on the remote server side, and the tables inheriting from it are not truncated. Is this behavior desirable? It seems Michael and I think this behavior is OK. But Kaigai-san does not.
> >>
> > Prior to the discussion of 1-3, I'd like to clarify the role of foreign tables.
> > (Likely, it will lead to a natural conclusion for the above open items.)
> >
> > As the name SQL/MED (Management of External Data) literally says, a
> > foreign table is a representation of external data in PostgreSQL.
> > It allows us to read and (optionally) write the external data wrapped
> > by FDW drivers, just as we usually read / write heap tables.
> > Through the FDW APIs, the core PostgreSQL does not care about the
> > structure, location, volume and other characteristics of
> > the external data itself. It expects that an FDW API invocation will
> > behave as if we were accessing a regular heap table.
> >
> > On the other hand, we can say local tables are a representation of
> > "internal" data in PostgreSQL.
> > A heap table consists of one or more files (one segment per BLCKSZ *
> > RELSEG_SIZE bytes), and the table AM mediates between
> > the on-disk data and the in-memory structure (TupleTableSlot).
> > There is no big difference in the concept. OK?
> >
> > As you know, the ONLY clause controls whether the TRUNCATE command
> > also runs on the child tables, not only on the parent.
> > If "ONLY parent_table" is given, its child tables are not picked up by
> > ExecuteTruncate(), unless the child tables are
> > listed individually.
> > Then, once ExecuteTruncate() has picked up the relations, it makes the
> > relations empty using the table AM
> > (relation_set_new_filenode), and the callee
> > (heapam_relation_set_new_filenode) does not care about whether the
> > table was specified with ONLY or not. It just makes the data
> > represented by the table empty (in a transactional way).
> >
> > So, how should foreign tables behave?
> >
> > Once ExecuteTruncate() has picked up a foreign table according to the
> > ONLY clause, should the FDW driver consider
> > the context in which the foreign table was specified? And what behavior
> > is consistent?
> > I think that the FDW driver should make the external data represented by
> > the foreign table empty, regardless of its
> > structure, location, volume and so on.
> >
> > Therefore, if we follow the above assumption, we don't need to pass
> > the context in which foreign tables are
> > picked up (TRUNCATE_REL_CONTEXT_*), so postgres_fdw should not vary
> > the remote TRUNCATE query
> > according to the flags. It always truncates the entire tables (if
> > multiple) on behalf of the foreign tables.
>
> This makes me wonder if the information about CASCADE/RESTRICT (maybe also RESTART/CONTINUE) also should not be passed to FDW. You're thinking that? Or only ONLY clause should be ignored for a foreign table?
>
I think the above information (DropBehavior and restart_seqs) is
valuable to pass.

The CASCADE/RESTRICT clause controls whether the truncate command also
eliminates the rows that would otherwise block the removal (FK
references in an RDBMS). Only the FDW driver can know whether the
external data has such a "removal-blocker", thus we need to pass the
DropBehavior to the callback.

The RESTART/CONTINUE clause likewise controls whether the truncate
command restarts the relevant resources that are associated with the
target table (sequences in an RDBMS).
Only the FDW driver can know whether the external data has relevant
resources to reset, thus we need to pass "restart_seqs" to the callback.

Unlike the above two parameters, the role of the ONLY clause is already
finished at the time when ExecuteTruncate() has picked up the target
relations, from the standpoint of the above understanding of foreign
tables and external data.
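
To make the idea concrete, here is a minimal, self-contained sketch of how
an FDW could build the remote statement from only those two parameters.
It is not the committed postgres_fdw code: SketchDropBehavior and
sketch_deparse_truncate() are hypothetical names made up for this
illustration, and the relation name is assumed to be already quoted.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for PostgreSQL's DropBehavior (illustration only). */
typedef enum
{
    SKETCH_DROP_RESTRICT,
    SKETCH_DROP_CASCADE
} SketchDropBehavior;

/*
 * Build a remote TRUNCATE statement for one remote table.
 *
 * There is intentionally no "only" argument: under the proposal above,
 * ONLY has already done its job when the target relations were chosen,
 * so the remote statement is driven by behavior and restart_seqs alone.
 *
 * Returns what snprintf() returns (the length the full string needs).
 */
static int
sketch_deparse_truncate(char *buf, size_t buflen,
                        const char *remote_relname, /* assumed pre-quoted */
                        SketchDropBehavior behavior,
                        bool restart_seqs)
{
    return snprintf(buf, buflen, "TRUNCATE %s %s IDENTITY %s",
                    remote_relname,
                    restart_seqs ? "RESTART" : "CONTINUE",
                    behavior == SKETCH_DROP_CASCADE ? "CASCADE" : "RESTRICT");
}
```

With this shape, "TRUNCATE ft" and "TRUNCATE ONLY ft" on the local side
would produce the same remote statement for the table behind "ft".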

Thoughts?

Best regards,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai@heterodb.com>


