Re: question regarding copyData containers - Mailing list pgsql-hackers
From: Andres Freund
Subject: Re: question regarding copyData containers
Msg-id: 20200603220816.bq6kwazzmxgc6aty@alap3.anarazel.de
In response to: question regarding copyData containers (Jerome Wagner <jerome.wagner@laposte.net>)
List: pgsql-hackers
Hi,

On 2020-06-03 19:28:12 +0200, Jerome Wagner wrote:
> I have been working on a node.js streaming client for different COPY
> scenarios.
> usually, during CopyOut, clients tend to buffer network chunks until they
> have gathered a full copyData message and pass that to the user.
>
> In some cases, this can lead to very large copyData messages. when there
> are very long text fields or bytea fields it will require a lot of memory
> to be handled (up to 1GB I think in the worst case scenario)
>
> In COPY TO, I managed to relax that requirement, considering that copyData
> is simply a transparent container. For each network chunk, the relevant
> message content is forwarded, which makes for 64KB chunks at most.

Uhm.

> We lose the semantics of the "row" that copyData has according to the
> documentation
> https://www.postgresql.org/docs/10/protocol-flow.html#PROTOCOL-COPY
>
> > The backend sends a CopyOutResponse message to the frontend, followed by
> > zero or more CopyData messages (**always one per row**), followed by
> > CopyDone
>
> but it is not a problem because the raw bytes are still parsable (rows +
> fields) in text mode (tsv) and in binary mode

This seems like an extremely bad idea to me. Are we really going to ask
clients to incur the overhead (both in complexity and runtime) of parsing
incoming data just to detect row boundaries? Given the number of options
there are for COPY, that's a seriously complicated task. I think that's a
complete no-go.

Leaving error handling aside (see the paragraph below), what does this
actually get you? Either your client cares about getting a row in one
sequential chunk, or it doesn't. If it doesn't care, then there's no need
to allocate a buffer that can contain the whole 'd' message; you can just
hand the client the chunks incrementally. If it does care, then it has to
reassemble the row either way (or worse, you force every client to
reimplement that).
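For context, the framing being discussed can be sketched as follows. A CopyData message on the wire is one type byte ('d') followed by an Int32 length (which includes the length field itself) and the payload, so a client can forward payload bytes as network chunks arrive instead of buffering the whole message. This is an illustrative sketch in JavaScript (matching the node.js client discussed), not code from the actual client; the function and callback names are made up:

```javascript
// Minimal sketch: stream CopyData ('d') payloads incrementally, without
// ever holding a whole message in memory. Frame layout per the protocol:
// Byte1('d') + Int32 length (includes itself) + payload.
function makeCopyDataStreamer(onChunk) {
  let header = Buffer.alloc(0); // partially accumulated 5-byte frame header
  let remaining = 0;            // payload bytes still expected for current frame

  return function feed(buf) {
    while (buf.length > 0) {
      if (remaining === 0) {
        // Accumulate the 5-byte header (may span several network chunks).
        const need = 5 - header.length;
        header = Buffer.concat([header, buf.subarray(0, need)]);
        buf = buf.subarray(Math.min(need, buf.length));
        if (header.length < 5) return; // wait for more data
        if (header[0] !== 0x64) throw new Error("expected CopyData ('d')");
        remaining = header.readInt32BE(1) - 4; // length includes itself
        header = Buffer.alloc(0);
      }
      // Forward whatever part of the payload this chunk holds.
      const part = buf.subarray(0, remaining);
      buf = buf.subarray(part.length);
      remaining -= part.length;
      if (part.length > 0) onChunk(part);
    }
  };
}
```

Note this deliberately drops the per-message boundaries, which is exactly the semantic loss being debated above.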
I assume what you're trying to get at is being able to send CopyData
messages before an entire row is assembled? And you want to send separate
CopyData messages to allow for error handling? I think that's a quite
worthwhile goal, but I don't think it can sensibly be solved by just
removing protocol-level framing of row boundaries. And that will mean
evolving the protocol in a non-compatible way.

> Now I started working on copyBoth and logical decoding scenarios. In this
> case, the server sends a series of copyData, 1 copyData containing 1
> message:
>
> at the network chunk level, in the case of large fields, we can observe
>
> in: CopyData Int32 XLogData Int64 Int64 Int64 Byten1
> in: Byten2
> in: CopyData Int32 XLogData Int64 Int64 Int64 Byten3
> in: CopyData Int32 XLogData Int64 Int64 Int64 Byten4
>
> out: XLogData Int64 Int64 Int64 Byten1
> out: Byten2
> out: XLogData Int64 Int64 Int64 Byten3
> out: XLogData Int64 Int64 Int64 Byten4
>
> but at the XLogData level, the protocol is not self-describing its length,
> so there is no real way of knowing where the first XLogData ends apart from
> - knowing the length of the first copyData (4 + 1 + 3*8 + n1 + n2)
> - knowing the internals of the output plugin and benefiting from a plugin
>   that self-describes its span
>
> when a network chunk contains several copyDatas
>
> in: CopyData Int32 XLogData Int64 Int64 Int64 Byten1 CopyData Int32
>     XLogData Int64 Int64 Int64 Byten2
>
> we have
>
> out: XLogData Int64 Int64 Int64 Byten1 XLogData Int64 Int64 Int64 Byten2

Right now all 'w' messages should be contained in one CopyData/'d' that
doesn't contain anything but the XLogData/'w'. Do you just mean that if we
changed the server-side code to split 'w' messages across multiple 'd'
messages, then we couldn't make much sense of the data anymore?

If so, then I don't really see a problem. Unless you do a much larger
change, what would be the point in allowing a 'w' to be split across
multiple 'd' chunks?
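To make the length arithmetic above concrete: since today each 'd' carries exactly one 'w', the XLogData payload length is implied by the CopyData length, i.e. copyDataLen = 4 (length field) + 1 ('w') + 3*8 (walStart, walEnd, sendTime) + n, so n = copyDataLen - 29. A sketch of decoding under that assumption (illustrative only, not the actual client code):

```javascript
// Parse one CopyData ('d') frame whose payload is a single XLogData ('w')
// message, per the streaming-replication protocol layout quoted above:
//   'd' | Int32 len | 'w' | Int64 walStart | Int64 walEnd | Int64 sendTime | data
// The WAL data length n is not self-described; it falls out of the
// CopyData length: n = len - 4 - 1 - 3*8 = len - 29.
function parseXLogData(buf) {
  if (buf[0] !== 0x64) throw new Error("expected CopyData ('d')");
  const copyDataLen = buf.readInt32BE(1); // includes the length field itself
  if (buf[5] !== 0x77) throw new Error("expected XLogData ('w')");
  return {
    walStart: buf.readBigInt64BE(6),
    walEnd: buf.readBigInt64BE(14),
    sendTime: buf.readBigInt64BE(22),
    data: buf.subarray(30, 30 + (copyDataLen - 29)),
  };
}
```

If the server split a 'w' across several 'd' messages, this calculation would no longer hold, which is the incompatibility Andres points out.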
The input data exists in a linear buffer already, so you're not going to
reduce peak memory usage by sending smaller CopyData chunks.

Sure, we could evolve the logical decoding interface to be able to output
data in a much more incremental way than the typical per-row basis. But I
think that would quite substantially increase complexity. And the message
framing seems to be the easier part of such a change.

Greetings,

Andres Freund