Thread: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
hi. generally COPY TO``COPY table`` is faster than ``COPY (select * from table)``. in BeginCopyTo. we have """ else if (rel->rd_rel->relkind == RELKIND_MATVIEW) ereport(ERROR, (errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg("cannot copy from materialized view \"%s\"", RelationGetRelationName(rel)), errhint("Try the COPY (SELECT ...) TO variant."))); """ Since materialized views have physical storage, we can make materialized views also using COPY table_name, instead of COPY(query). Some simple tests show around %3.7 or 4.3% speed up.
On Fri, Dec 20, 2024 at 8:02 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Dec 19, 2024 at 02:28:21PM +0800, jian he wrote: > > Since materialized views have physical storage, > > > > we can make materialized views also using COPY table_name, instead of > > COPY(query). > > > > Some simple tests show around %3.7 or 4.3% speed up. > > This restriction comes from 3bf3ab8c5636 as such relations may not be > scannable when they have no data, no? Perhaps this restriction could > be lifted, but I'd suggest to dig more into the lists, there should be > arguments and ideas explaining what could be done in this case > (spoiler: I did not look at that). > -- Thanks for the suggestion. it was mentioned in link [1] and [2]. [1] https://www.postgresql.org/message-id/flat/8967.1353167301%40sss.pgh.pa.us#f5e947cfa9357dba780d238f1c5f6932 [2] https://www.postgresql.org/message-id/20121116162558.90150%40gmx.com Basically we want to have the two directions of COPY. so copy the_materialized_view to stdout; copy the_materialized_view from stdin; both will work fine. obviously "copy the_materialized_view from stdin; " will not work.
hi. about this issue, last email in 2012 (https://postgr.es/m/8967.1353167301@sss.pgh.pa.us) """ Even if it happens to be trivial in the current patch, it's an added functional requirement that we might later regret having cavalierly signed up for. And, as noted upthread, relations that support only one direction of COPY don't exist at the moment; that would be adding an asymmetry that we might later regret, too. regards, tom lane """ but now we have numerous COPY options that work solely in a single direction of COPY. I think now we can make some kind of relation (pg_class.relkind) that only works in one direction of COPY.
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
The following review has been posted through the commitfest application: make installcheck-world: not tested Implements feature: not tested Spec compliant: not tested Documentation: not tested Hi, Tested the latest patch that allows direct `COPY` operations on Materialized Views, removing the need for `COPY (SELECT ...)`.This enhancement reduces query overhead, improving performance by **4–5%**. Example: Previous approach: COPY (SELECT * FROM staff_summary) TO STDOUT WITH CSV HEADER; Optimized approach: COPY staff_summary TO STDOUT WITH CSV HEADER; Performance tests were conducted using a Materialized View containing around 80,000 records, confirming that the new approachis faster and more efficient for exporting data. Regards, Newt Global PostgreSQL Contributors
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On Tue, 28 Jan 2025 at 07:49, jian he <jian.universality@gmail.com> wrote: > > On Mon, Jan 6, 2025 at 5:34 PM jian he <jian.universality@gmail.com> wrote: > > > > hi. > > > > about this issue, > > last email in 2012 (https://postgr.es/m/8967.1353167301@sss.pgh.pa.us) > > """ > > Even if it happens to be trivial in the current patch, it's an added > > functional requirement that we might later regret having cavalierly > > signed up for. And, as noted upthread, relations that support only > > one direction of COPY don't exist at the moment; that would be adding > > an asymmetry that we might later regret, too. > > > > regards, tom lane > > """ > > > > but now we have numerous COPY options that work solely in a single > > direction of COPY. > > I think now we can make some kind of relation (pg_class.relkind) that > > only works in one direction of COPY. > > hi. > patch attached. > also cc to Tom, > since at that time, you are against the idea of ``COPY matview TO``. Hi! With this patch it is possible to COPY matview TO, but not regular view, which is surprising. Let's fix that? -- Best regards, Kirill Reshke
On Thu, Mar 27, 2025 at 3:04 AM Kirill Reshke <reshkekirill@gmail.com> wrote: > > > > hi. > > patch attached. > > also cc to Tom, > > since at that time, you are against the idea of ``COPY matview TO``. > > Hi! With this patch it is possible to COPY matview TO, but not regular > view, which is surprising. Let's fix that? create view v1 as select 1; copy v1 to stdout; if you specifying table name, not query, then { cstate = BeginCopyTo(pstate, rel, query, relid, stmt->filename, stmt->is_program, NULL, stmt->attlist, stmt->options); *processed = DoCopyTo(cstate); /* copy from database to file * } will use {table_beginscan, table_scan_getnextslot, table_endscan} to output the data. but views don't have storage, table_beginscan mechanism won't work. so i don't think this is possible for view.
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On Sat, 29 Mar 2025 at 09:47, jian he <jian.universality@gmail.com> wrote: > > will use {table_beginscan, table_scan_getnextslot, table_endscan} > to output the data. > but views don't have storage, table_beginscan mechanism won't work. > > so i don't think this is possible for view. Well... So you are saying that let us have inconsistent features because of how things are implemented in core... I don't sure I'm buying that, but whatever, let's hear some other voices from the community. My argument is that while we are working on it, perhaps we should revise certain implementation specifics along the way. However, this is merely my opinion on the matter. -- Best regards, Kirill Reshke
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On Sat, 29 Mar 2025 at 09:47, jian he <jian.universality@gmail.com> wrote:
>
> will use {table_beginscan, table_scan_getnextslot, table_endscan}
> to output the data.
> but views don't have storage, table_beginscan mechanism won't work.
>
> so i don't think this is possible for view.
Well... So you are saying that let us have inconsistent features
because of how things are implemented in core... I don't sure I'm
buying that, but whatever, let's hear some other voices from the
community. My argument is that while we are working on it, perhaps we
should revise certain implementation specifics along the way. However,
this is merely my opinion on the matter.
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On Saturday, March 29, 2025, Kirill Reshke <reshkekirill@gmail.com> wrote:On Sat, 29 Mar 2025 at 09:47, jian he <jian.universality@gmail.com> wrote:
>
> will use {table_beginscan, table_scan_getnextslot, table_endscan}
> to output the data.
> but views don't have storage, table_beginscan mechanism won't work.
>
> so i don't think this is possible for view.
Well... So you are saying that let us have inconsistent features
because of how things are implemented in core... I don't sure I'm
buying that, but whatever, let's hear some other voices from the
community. My argument is that while we are working on it, perhaps we
should revise certain implementation specifics along the way. However,
this is merely my opinion on the matter.At present copy {table} to only exists to support pg_dump. It is not marketed as a general purpose export facility.
*ahem*
What is your evidence for that proposition? If this were true we would not support CSV mode, which pg_dump does not use. It might have limitations, but its use goes far beyond just pg_dump, both in theory and practice.
cheers
andew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On 2025-03-29 Sa 10:40 AM, David G. Johnston wrote:On Saturday, March 29, 2025, Kirill Reshke <reshkekirill@gmail.com> wrote:On Sat, 29 Mar 2025 at 09:47, jian he <jian.universality@gmail.com> wrote:
>
> will use {table_beginscan, table_scan_getnextslot, table_endscan}
> to output the data.
> but views don't have storage, table_beginscan mechanism won't work.
>
> so i don't think this is possible for view.
Well... So you are saying that let us have inconsistent features
because of how things are implemented in core... I don't sure I'm
buying that, but whatever, let's hear some other voices from the
community. My argument is that while we are working on it, perhaps we
should revise certain implementation specifics along the way. However,
this is merely my opinion on the matter.At present copy {table} to only exists to support pg_dump. It is not marketed as a general purpose export facility.
*ahem*
What is your evidence for that proposition? If this were true we would not support CSV mode, which pg_dump does not use. It might have limitations, but its use goes far beyond just pg_dump, both in theory and practice.
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On Sat, Mar 29, 2025 at 9:06 AM Andrew Dunstan <andrew@dunslane.net> wrote:
On 2025-03-29 Sa 10:40 AM, David G. Johnston wrote:On Saturday, March 29, 2025, Kirill Reshke <reshkekirill@gmail.com> wrote:On Sat, 29 Mar 2025 at 09:47, jian he <jian.universality@gmail.com> wrote:
>
> will use {table_beginscan, table_scan_getnextslot, table_endscan}
> to output the data.
> but views don't have storage, table_beginscan mechanism won't work.
>
> so i don't think this is possible for view.
Well... So you are saying that let us have inconsistent features
because of how things are implemented in core... I don't sure I'm
buying that, but whatever, let's hear some other voices from the
community. My argument is that while we are working on it, perhaps we
should revise certain implementation specifics along the way. However,
this is merely my opinion on the matter.At present copy {table} to only exists to support pg_dump. It is not marketed as a general purpose export facility.
*ahem*
What is your evidence for that proposition? If this were true we would not support CSV mode, which pg_dump does not use. It might have limitations, but its use goes far beyond just pg_dump, both in theory and practice.
"copy {subquery} to" is a general-purpose exporter that makes use of those additional features. Sure, they also work for the narrowed case of "copy {relation/table} to" but I make my claim on the very fact that {relation} cannot be stuff like foreign tables or partitioned tables, which pg_dump has no need to target.
I don't believe that the premise supports the conclusion.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
I don't believe that the premise supports the conclusion.
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On Sat, 29 Mar 2025 at 19:59, David G. Johnston <david.g.johnston@gmail.com> wrote: > Regardless, I do support this patch and probably any similar ones proposed in the future. Do you have an opinion on that? > > David J. > I do also support what this patch aims to do, how do you like v1? -- Best regards, Kirill Reshke
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On Sat, 29 Mar 2025 at 19:59, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> Regardless, I do support this patch and probably any similar ones proposed in the future. Do you have an opinion on that?
>
I do also support what this patch aims to do, how do you like v1?
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On Sat, Mar 29, 2025 at 11:56 AM Andrew Dunstan <andrew@dunslane.net> wrote:I don't believe that the premise supports the conclusion.
Regardless, I do support this patch and probably any similar ones proposed in the future. Do you have an opinion on that?
In principle I think it would be good to have COPY materialized_view TO ...
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On 2025/03/30 5:01, Andrew Dunstan wrote: > > On 2025-03-29 Sa 2:58 PM, David G. Johnston wrote: >> On Sat, Mar 29, 2025 at 11:56 AM Andrew Dunstan <andrew@dunslane.net> wrote: >> >> I don't believe that the premise supports the conclusion. >> >> >> Regardless, I do support this patch and probably any similar ones proposed in the future. Do you have an opinion on that? >> >> > > In principle I think it would be good to have COPY materialized_view TO ... I haven't found any reasons to object to this patch for now, so I have no objections to this change. Regarding the patch, here are some review comments: + errmsg("cannot copy from materialized view when the materialized view is not populated"), How about including the object name for consistency with other error messages in BeginCopyTo(), like this? errmsg("cannot copy from unpopulated materialized view \"%s\"", RelationGetRelationName(rel)), + errhint("Use the REFRESH MATERIALIZED VIEW command populate the materialized view first.")); There seems to be a missing "to" just after "command". Should it be "Use the REFRESH MATERIALIZED VIEW command to populate the materialized view first."? Or we could simplify the hint to match what SELECT on an unpopulated materialized view logs: "Use the REFRESH MATERIALIZED VIEW command.". The copy.sgml documentation should clarify that COPY TO can be used with a materialized view only if it is populated. Wouldn't it be beneficial to add a regression test to check whether COPY matview TO works as expected? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Mon, Mar 31, 2025 at 11:27 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > > Regarding the patch, here are some review comments: > > + errmsg("cannot copy from materialized view when the materialized view isnot populated"), > > How about including the object name for consistency with > other error messages in BeginCopyTo(), like this? > > errmsg("cannot copy from unpopulated materialized view \"%s\"", > RelationGetRelationName(rel)), > > > + errhint("Use the REFRESH MATERIALIZED VIEW command populate the materializedview first.")); > > There seems to be a missing "to" just after "command". > Should it be "Use the REFRESH MATERIALIZED VIEW command to > populate the materialized view first."? Or we could simplify > the hint to match what SELECT on an unpopulated materialized > view logs: "Use the REFRESH MATERIALIZED VIEW command.". > based on your suggestion, i changed it to: if (!RelationIsPopulated(rel)) ereport(ERROR, errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot copy from unpopulated materialized view \"%s\"", RelationGetRelationName(rel)), errhint("Use the REFRESH MATERIALIZED VIEW command to populate the materialized view first.")); > > The copy.sgml documentation should clarify that COPY TO can > be used with a materialized view only if it is populated. > "COPY TO can be used only with plain tables, not views, and does not copy rows from child tables or child partitions" i changed it to "COPY TO can be used with plain tables and materialized views, not regular views, and does not copy rows from child tables or child partitions" Another alternative wording I came up with: "COPY TO can only be used with plain tables and materialized views, not regular views. It also does not copy rows from child tables or child partitions." > > Wouldn't it be beneficial to add a regression test to check > whether COPY matview TO works as expected? sure.
Attachment
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On Mon, Mar 31, 2025 at 11:27 PM Fujii Masao
>
> The copy.sgml documentation should clarify that COPY TO can
> be used with a materialized view only if it is populated.
>
"COPY TO can be used only with plain tables, not views, and does not
copy rows from child tables or child partitions"
i changed it to
"COPY TO can be used with plain tables and materialized views, not
regular views, and does not copy rows from child tables or child
partitions"
Another alternative wording I came up with:
"COPY TO can only be used with plain tables and materialized views,
not regular views. It also does not copy rows from child tables or
child partitions."
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On Tue, 1 Apr 2025 at 08:43, jian he <jian.universality@gmail.com> wrote: > > On Mon, Mar 31, 2025 at 11:27 PM Fujii Masao > <masao.fujii@oss.nttdata.com> wrote: > > > > Regarding the patch, here are some review comments: > > > > + errmsg("cannot copy from materialized view when the materialized viewis not populated"), > > > > How about including the object name for consistency with > > other error messages in BeginCopyTo(), like this? > > > > errmsg("cannot copy from unpopulated materialized view \"%s\"", > > RelationGetRelationName(rel)), > > > > > > + errhint("Use the REFRESH MATERIALIZED VIEW command populate the materializedview first.")); > > > > There seems to be a missing "to" just after "command". > > Should it be "Use the REFRESH MATERIALIZED VIEW command to > > populate the materialized view first."? Or we could simplify > > the hint to match what SELECT on an unpopulated materialized > > view logs: "Use the REFRESH MATERIALIZED VIEW command.". > > > based on your suggestion, i changed it to: > > if (!RelationIsPopulated(rel)) > ereport(ERROR, > errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > errmsg("cannot copy from unpopulated > materialized view \"%s\"", > RelationGetRelationName(rel)), > errhint("Use the REFRESH MATERIALIZED VIEW > command to populate the materialized view first.")); > > > > > > The copy.sgml documentation should clarify that COPY TO can > > be used with a materialized view only if it is populated. > > > "COPY TO can be used only with plain tables, not views, and does not > copy rows from child tables or child partitions" > i changed it to > "COPY TO can be used with plain tables and materialized views, not > regular views, and does not copy rows from child tables or child > partitions" > > Another alternative wording I came up with: > "COPY TO can only be used with plain tables and materialized views, > not regular views. It also does not copy rows from child tables or > child partitions." One thing I noticed was that if the materialized view is not refreshed user will get stale data: postgres=# create table t1(c1 int); CREATE TABLE postgres=# create materialized view mv2 as select * from t1; SELECT 0 postgres=# insert into t1 values(10); INSERT 0 1 postgres=# select * from t1; c1 ---- 10 (1 row) -- Before refresh the data will not be selected postgres=# copy mv2 to stdout with (header); c1 -- After refresh the data will be available postgres=# refresh materialized view mv2; REFRESH MATERIALIZED VIEW postgres=# copy mv2 to stdout with (header); c1 10 Should we document this? The following can be changed to keep it consistent: +copy matview1(id) TO stdout with (header); +copy matview2 TO stdout with (header); To: COPY matview1(id) TO stdout with (header); COPY matview2 TO stdout with (header); Regards, Vignesh
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
One thing I noticed was that if the materialized view is not refreshed
user will get stale data
Should we document this?
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On Tue, 1 Apr 2025 at 15:49, Kirill Reshke <reshkekirill@gmail.com> wrote: > > On Tue, 1 Apr 2025, 11:45 vignesh C, <vignesh21@gmail.com> wrote: >> >> >> One thing I noticed was that if the materialized view is not refreshed >> user will get stale data >> >> Should we document this? > > Does this patch alter thus behaviour? User will get stale data even on HEAD, why should we take a care within this thread? We are not changing the existing behavior. However, since copying data from large tables can take a significant amount of time, would it be helpful to add a cautionary note advising users to refresh the materialized view before running copy command to avoid stale data? This could prevent users from realizing the issue only after running the copy operation, which would then require them to run it again. If you think this is already obvious, then the note may not be necessary. Regards, Vignesh
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On Tue, 1 Apr 2025 at 15:52, vignesh C <vignesh21@gmail.com> wrote: > > On Tue, 1 Apr 2025 at 15:49, Kirill Reshke <reshkekirill@gmail.com> wrote: > > > > On Tue, 1 Apr 2025, 11:45 vignesh C, <vignesh21@gmail.com> wrote: > >> > >> > >> One thing I noticed was that if the materialized view is not refreshed > >> user will get stale data > >> > >> Should we document this? > > > > Does this patch alter thus behaviour? User will get stale data even on HEAD, why should we take a care within this thread? > > We are not changing the existing behavior. However, since copying data > from large tables can take a significant amount of time, would it be > helpful to add a cautionary note advising users to refresh the > materialized view before running copy command to avoid stale data? > This could prevent users from realizing the issue only after running > the copy operation, which would then require them to run it again. Yes, agree, +1 on that. > If > you think this is already obvious, then the note may not be necessary. I don't think this is already obvious, but my objection is that we should maybe discuss this as a separate issue (in a separate patch). Looks like fixing this together with code commit is too much at once. I prefer a one-commit-for-one-purpose style. > Regards, > Vignesh -- Best regards, Kirill Reshke
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
We are not changing the existing behavior. However, since copying data
from large tables can take a significant amount of time, would it be
helpful to add a cautionary note advising users to refresh the
materialized view before running copy command to avoid stale data?
Re: in BeginCopyTo make materialized view using COPY TO instead of COPY (query).
On 2025/04/01 12:12, jian he wrote: > On Mon, Mar 31, 2025 at 11:27 PM Fujii Masao > <masao.fujii@oss.nttdata.com> wrote: >> >> Regarding the patch, here are some review comments: >> >> + errmsg("cannot copy from materialized view when the materialized viewis not populated"), >> >> How about including the object name for consistency with >> other error messages in BeginCopyTo(), like this? >> >> errmsg("cannot copy from unpopulated materialized view \"%s\"", >> RelationGetRelationName(rel)), >> >> >> + errhint("Use the REFRESH MATERIALIZED VIEW command populate the materializedview first.")); >> >> There seems to be a missing "to" just after "command". >> Should it be "Use the REFRESH MATERIALIZED VIEW command to >> populate the materialized view first."? Or we could simplify >> the hint to match what SELECT on an unpopulated materialized >> view logs: "Use the REFRESH MATERIALIZED VIEW command.". >> > based on your suggestion, i changed it to: Thanks for updating the patch! > > if (!RelationIsPopulated(rel)) > ereport(ERROR, > errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > errmsg("cannot copy from unpopulated > materialized view \"%s\"", > RelationGetRelationName(rel)), > errhint("Use the REFRESH MATERIALIZED VIEW > command to populate the materialized view first.")); I think it's better to use the same hint message as the one output by "COPY (SELECT * FROM <unpopulated matview>) TO", specifically: "Use the REFRESH MATERIALIZED VIEW command," for consistency. >> The copy.sgml documentation should clarify that COPY TO can >> be used with a materialized view only if it is populated. >> > "COPY TO can be used only with plain tables, not views, and does not > copy rows from child tables or child partitions" > i changed it to > "COPY TO can be used with plain tables and materialized views, not > regular views, and does not copy rows from child tables or child > partitions" It would be clearer to specify that "COPY TO" applies to *populated* materialized views rather than just "materialized views"? > Another alternative wording I came up with: > "COPY TO can only be used with plain tables and materialized views, > not regular views. It also does not copy rows from child tables or > child partitions." If we split the first description into two as you suggested, I'm tempted to propose the following improvements to enhance the overall descriptions: ------------- "COPY TO" can be used with plain tables and populated materialized views. For example, "COPY table TO" copies the same rowsas "SELECT * FROM ONLY table." However, it doesn't directly support other relation types, such as partitioned tables,inheritance child tables, or views. To copy all rows from these relations, use "COPY (SELECT * FROM table) TO." ------------- >> Wouldn't it be beneficial to add a regression test to check >> whether COPY matview TO works as expected? > sure. The tests seem to have been placed under the category "COPY FROM ... DEFAULT", which feels a bit misaligned. How about adding them to the end of copy.sql instead? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION