Re: Proposal: http2 wire format - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: Proposal: http2 wire format
Msg-id CAMsr+YHT+wOutgvmsmGFKWqQHzTZ6tzaBjs3272mVANOa-eyVw@mail.gmail.com
In response to Re: Proposal: http2 wire format  (Damir Simunic <damir.simunic@wa-research.ch>)
Responses Re: Proposal: http2 wire format  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On 26 March 2018 at 21:05, Damir Simunic <damir.simunic@wa-research.ch> wrote:
> On 26 Mar 2018, at 11:06, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:
>
> Hi,
>
> >If anyone finds the idea of Postgres speaking http2 appealing
>
> HTTP/2 sounds interesting.
> What do you think of https://grpc.io/ ?
>
> Have you evaluated it?
> It does sound like a ready RPC on top of HTTP/2 with support for lots of languages.
>
> The idea of reimplementing the protocol for multiple languages from scratch does not sound too appealing.

This proposal takes the stance that having an HTTP2 wire protocol in place will enable wide experimentation with, and implementation of, many new features and content types, but it is not concerned with the specifics of those.

---
Let me illustrate with an example how it would look if we already had HTTP2 as proposed.

Let’s say you have a building automation device on your network that happens to speak grpc, and you decide to use Postgres to store published topics in the database.

Your grpc-speaking device might connect to Postgres and issue a request like this:

HEADERS (flags = END_HEADERS)
:method = POST
:scheme = http
:path = /CreateTopic
pg-database = Publisher
content-type = application/grpc+proto
grpc-encoding = gzip
authorization = Bearer y235.wef315yfh138vh31hv93hv8h3v

DATA (flags = END_STREAM)
<Length-Prefixed Message>

(This is from grpc.io homepage; uppercase HEADERS and DATA are frame names from the HTTP2 specification).
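
For readers unfamiliar with the <Length-Prefixed Message> placeholder: in gRPC's HTTP/2 framing, each message in a DATA frame is preceded by a 1-byte compressed flag and a 4-byte big-endian length. A minimal sketch of that framing follows; the function names are made up for illustration and are not part of any gRPC library.

```python
import struct


def encode_grpc_message(payload: bytes, compressed: bool = False) -> bytes:
    """Prefix payload with the 5-byte gRPC message header.

    Header layout: 1-byte compressed flag, 4-byte big-endian length.
    (Hypothetical helper name, for illustration only.)
    """
    return struct.pack(">BI", 1 if compressed else 0, len(payload)) + payload


def decode_grpc_message(data: bytes):
    """Return (compressed, payload) for a single length-prefixed message."""
    if len(data) < 5:
        raise ValueError("need at least 5 header bytes")
    flag, length = struct.unpack(">BI", data[:5])
    body = data[5:5 + length]
    if len(body) != length:
        raise ValueError("truncated message body")
    return bool(flag), body
```

With `grpc-encoding = gzip` as in the request above, a message whose flag is set would carry a gzip-compressed body; the sketch leaves decompression to the caller.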

Postgres would take care of TLS negotiation, unpack the frames, decompress the headers (:method, :path, etc. are transferred compressed with a lookup table), copy the payload into memory, and make it all available to the backend. If this were the first request, it would start the backend for you as well.
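
To make "unpack the frames" concrete: every HTTP2 frame starts with a fixed 9-byte header (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit and a 31-bit stream identifier). The sketch below parses just that header. It is only an illustration in Python, not proposed server code; the class and function names are assumptions, and header decompression (HPACK) is not shown.

```python
from typing import NamedTuple

# Frame type codes and flag bits from the HTTP2 specification.
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS", 0x3: "RST_STREAM", 0x4: "SETTINGS"}
END_STREAM = 0x1
END_HEADERS = 0x4


class FrameHeader(NamedTuple):
    length: int      # payload length in bytes
    type_name: str   # e.g. "HEADERS" or "DATA"
    flags: int       # raw flag bits
    stream_id: int   # 31-bit stream identifier


def parse_frame_header(buf: bytes) -> FrameHeader:
    """Parse the fixed 9-byte header at the start of an HTTP2 frame."""
    if len(buf) < 9:
        raise ValueError("frame header is 9 bytes")
    length = int.from_bytes(buf[0:3], "big")
    frame_type = buf[3]
    flags = buf[4]
    # The top bit of the last four bytes is reserved and must be ignored.
    stream_id = int.from_bytes(buf[5:9], "big") & 0x7FFFFFFF
    name = FRAME_TYPES.get(frame_type, f"UNKNOWN(0x{frame_type:x})")
    return FrameHeader(length, name, flags, stream_id)
```

For the DATA frame in the example request, `flags & END_STREAM` would be non-zero, telling the receiver that the request body is complete.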

Postgres doesn’t know about grpc, so it would just conveniently return “406 Not Acceptable” to your client and close the stream (but not the connection). Still connected and authenticated, the device could retry the request with `content-type: application/json`, and if you somehow programmed a function that accepts json, the request would go through. (Let’s imagine we have some kind of mechanism to associate functions with requests and content types, maybe through some function attributes in the catalog).
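
The imagined mechanism could be pictured roughly like this. Everything here is hypothetical: the registry, the decorator and the handler are stand-ins for whatever catalog attributes might eventually exist, and nothing like it is in Postgres today.

```python
import json
from typing import Callable, Dict, Tuple

Handler = Callable[[bytes], bytes]

# Hypothetical stand-in for a catalog lookup keyed by (:path, content-type).
HANDLERS: Dict[Tuple[str, str], Handler] = {}


def register(path: str, content_type: str):
    """Associate a function with a request path and content type."""
    def decorator(fn: Handler) -> Handler:
        HANDLERS[(path, content_type)] = fn
        return fn
    return decorator


@register("/CreateTopic", "application/json")
def create_topic_json(body: bytes) -> bytes:
    # Made-up handler: echo back the name of the topic that was "created".
    topic = json.loads(body)
    return json.dumps({"created": topic["name"]}).encode()


def dispatch(path: str, content_type: str, body: bytes) -> Tuple[int, bytes]:
    """Return (status, response body); 406 when no function is registered."""
    handler = HANDLERS.get((path, content_type))
    if handler is None:
        return 406, b""
    return 200, handler(body)
```

In this sketch the grpc request from the example is rejected with 406, while a retry of the same path with `application/json` reaches the registered function.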


This seems to have gone pretty pie-in-the-sky overnight. If I understand correctly, what you're getting at is "eventually I'd like content negotiation that lets us support alternate query representations and response representations".

If so, me too. And HTTP2 has some features that are interesting there. But it doesn't have a great deal to do with the immediate issues with v3, or concrete benefits to uses that are already possible with v3.

Again, if your proposed protocol implementation adds significant overhead it's probably a nonstarter.
 
The same goes for the ‘authorization’ header. Postgres does not support Bearer token authorization today. But maybe you’ll be able to define a function that knows how to deal with the token, and somehow signal to Postgres that you want it to call this function when it sees such a header. Or maybe someone wrote a plugin that does that, and you configure your server to use it.

You've consistently ignored my comments re authentication and authorization.

How would a multi-step handshake authentication like GSSAPI or SSPI be implemented with HTTP2? Efficiently?

You also mentioned Pg "starting a backend or using an existing one". Er, no. You're assuming the presence of a connection pooler of sorts within Pg itself. Many people want that, myself included, but it's a fairly tricky problem with Pg's architecture, and definitely not something you should assume with any new protocol proposal.

I'm increasingly convinced that you're pursuing your interesting use cases and disregarding the need to solve the specific problems with the current protocol and server architecture. You also seem to be handwaving away impediments like the strongly tcp-session-based connection structure. That's not going to fly.

IMO, you should really:

* Read https://wiki.postgresql.org/wiki/Todo#Wire_Protocol_Changes_.2F_v4_Protocol and explain how this protocol does/doesn't address those items

* Explain how you see handshake-based auth fitting into this. Remember that we currently support strong authentication on cleartext protocols.

* Explain how query-cancels will work. Does the protocol help? Retaining the current make-a-second-connection model is tolerable, but gross; a new protocol should ideally address this.

* Explain how sync recovery will work when the data stream is interrupted by a cancel or error, WITHOUT terminating the session

* Explain what a MINIMAL implementation delivers. Touching on extensibility is good, but let's focus on what can be done soon.

* Explain how sessions will work across multiple request/response cycles. You should assume that 1 session = 1 TCP connection for now. If you want to change that, more power to you, but it's a whole separate project. You'll need to learn Pg's guts in great detail and Windows will become your nightmare.
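
On the query-cancel item above, for reference: the mechanism HTTP2 itself offers for abandoning one request without closing the connection is an RST_STREAM frame carrying the CANCEL error code. The sketch below only builds such a frame. Whether a backend could act on it promptly, and how the session resynchronises afterwards, is exactly what would need explaining; the function name is an assumption.

```python
import struct

RST_STREAM = 0x3   # frame type code from the HTTP2 specification
CANCEL = 0x8       # error code: the stream is no longer needed


def build_rst_stream(stream_id: int, error_code: int = CANCEL) -> bytes:
    """Build a complete RST_STREAM frame (9-byte header + 4-byte error code)."""
    if not 0 < stream_id <= 0x7FFFFFFF:
        raise ValueError("stream id must be a non-zero 31-bit value")
    header = (
        (4).to_bytes(3, "big")         # payload length is always 4
        + bytes([RST_STREAM, 0])       # type, no flags defined
        + stream_id.to_bytes(4, "big")
    )
    return header + struct.pack(">I", error_code)
```

The frame travels on the same TCP connection as the request it cancels, unlike the current make-a-second-connection model.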


Personally, I increasingly think that what you really want to do is better done in a proxy, at least for now.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
