Re: Make COPY format extendable: Extract COPY TO format implementations - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Make COPY format extendable: Extract COPY TO format implementations |
Date | |
Msg-id | 9172d4eb-6de0-4c6d-beab-8210b7a2219b@enterprisedb.com Whole thread Raw |
In response to | Re: Make COPY format extendable: Extract COPY TO format implementations (Sutou Kouhei <kou@clear-code.com>) |
Responses |
Re: Make COPY format extendable: Extract COPY TO format implementations
|
List | pgsql-hackers |
On 7/22/24 09:45, Sutou Kouhei wrote: > Hi Tomas, > > Thanks for joining this thread! > > In <257d5573-07da-48c3-ac07-e047e7a65e99@enterprisedb.com> > "Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 19 Jul 2024 14:40:05 +0200, > Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > >> I think it'd be helpful if you could post a patch status, i.e. a message >> re-explaininig what it aims to achieve, summary of the discussion so >> far, and what you think are the open questions. Otherwise every reviewer >> has to read the whole thread to learn this. > > It makes sense. It seems your questions covers all important > points in this thread. So my answers of your questions > summarize the latest information. > Thanks for the summary/responses. I still think it'd be better to post a summary as a separate message, not as yet another post responding to someone else. If I was reading the thread, I would not have noticed this is meant to be a summary. I'd even consider putting a "THREAD SUMMARY" title on the first line, or something like that. Up to you, of course. As for the patch / decisions, thanks for the responses and explanations. But I still find it hard to review / make judgements about the approach based on the current version of the patch :-( Yes, it's entirely possible earlier versions did something interesting - e.g. it might have implemented the existing formats to the new approach. Or it might have a private pointer in v6. But how do I know why it was removed? Was it because it's unnecessary for the initial version? Or was it because it turned out to not work? And when reviewing a patch, I really don't want to scavenge through old patch versions, looking for random parts. Not only because I don't know what to look for, but also because it'll be harder and harder to make those old versions work, as the patch moves evolves. My suggestions would be to maintain this as a series of patches, making incremental changes, with the "more complex" or "more experimental" parts larger in the series. For example, I can imagine doing this: 0001 - minimal version of the patch (e.g. current v17) 0002 - switch existing formats to the new interface 0003 - extend the interface to add bits needed for columnar formats 0004 - add DML to create/alter/drop custom implementations 0005 - minimal patch with extension adding support for Arrow Or something like that. The idea is that we still have a coherent story of what we're trying to do, and can discuss the incremental changes (easier than looking at a large patch). It's even possible to commit earlier parts before the later parts are quite cleanup up for commit. And some changes changes may not be even meant for commit (e.g. the extension) but as guidance / validation for the earlier parts. I do realize this might look like I'm requiring you to do more work. Sorry about that. I'm just thinking about how to move the patch forward and convince myself the approach is OK. Also, it's what I think works quite well for other patches discussed on this mailing list (I do this for various patches I submitted, for example). And I'm not even sure it actually is more work. As for the performance / profiling issues, I've read the reports and I'm not sure I see something tremendously wrong. Yes, there are differences, but 5% change can easily be noise, shift in binary layout, etc. Unfortunately, there's not much information about what exactly the tests did, context (hardware, ...). So I don't know, really. But if you share enough information on how to reproduce this, I'm willing to take a look and investigate. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: