Thread: [HACKERS] Notes on testing Postgres 10b1

[HACKERS] Notes on testing Postgres 10b1

From
Josh Berkus
Date:
Folks,

I've put together some demos on PostgreSQL 10beta1.  Here's a few
feedback notes based on my experience with it.

Things I tested
----------------

* Logical replication pub/sub with replicating only two tables out of a
12-table FK heirarchy, including custom data types

* Partitioning a log-structured table, including a range type, exclusion
constraint, and foreign key.

* Various Parallel index queries on a 100m-row pgbench table

* Full text JSON search in a books database

* SCRAM authentication for local connections and replication

Positive changes beyond the obvious
-----------------------------------

* Yay defaults with replication on!

* Having defaults on the various _workers all devolve from max_workers
is also great.

* Constraint exclusion + partitioning Just Worked.

Questions
----------

Q1. Why does wal_level default to "replica" and not "logical"?

Q2: I thought we were going to finally change the pg_dump default to
"custom" format in this release?  No?

Problems
--------

P1. On the publishing node, logical replication relies on the *implied*
correspondence of the application_name and the replication_slot both
being named the same as the publication in order to associate a
particular publication with a particular replication connection.
However, there's absolutely nothing preventing me from also creating a
binary replication connection by the same name  It really seems like we
need a field in pg_stat_replication or pg_replication_slots which lists
the publication.


P2: If I create a subscription on a table with no primary key, I do not
recieve a warning.  There should be a warning, since in most cases such
a subscription will not work.  I suggest the text:

"logical replication target relation "public.fines" has no primary key.
Either create one, or set REPLICA IDENTITY index and set the published
relation to REPLICA IDENTITY FULL."


P3: apparently jsonb_to_tsvector with lang parameter isn't immutable?
This means that it can't be used for indexing:

libdata=# create index bookdata_fts on bookdata using gin ((
to_tsvector('english',bookdata)));
ERROR:  functions in index expression must be marked IMMUTABLE

... and indeed it's not:

select proname, prosrc, proargtypes, provolatile from pg_proc where
proname = 'to_tsvector';  proname   |         prosrc         | proargtypes | provolatile
-------------+------------------------+-------------+-------------to_tsvector | jsonb_to_tsvector      | 3802        |
sto_tsvector| to_tsvector_byid       | 3734 25     | ito_tsvector | to_tsvector            | 25          | sto_tsvector
|json_to_tsvector       | 114         | sto_tsvector | jsonb_to_tsvector_byid | 3734 3802   | sto_tsvector |
json_to_tsvector_byid | 3734 114    | s
 

... can we fix that?


-- 
Josh Berkus
Containers & Databases Oh My!



Re: [HACKERS] Notes on testing Postgres 10b1

From
Michael Paquier
Date:
On Wed, Jun 7, 2017 at 2:01 PM, Josh Berkus <josh@berkus.org> wrote:
> Q1. Why does wal_level default to "replica" and not "logical"?

The difference of WAL generated is way higher between
archive->hot_standby than hot_standby->logical. And unlike replica,
logical decoding is not something that is widely spread in user's
deployments to justify changing to such a default. At least that's
what I recall on the matter.

> Q2: I thought we were going to finally change the pg_dump default to
> "custom" format in this release?  No?

I don't recall any discussion on this matter, but my memory may fail me.
-- 
Michael



Re: [HACKERS] Notes on testing Postgres 10b1

From
Andres Freund
Date:
On 2017-06-07 14:29:04 +0900, Michael Paquier wrote:
> On Wed, Jun 7, 2017 at 2:01 PM, Josh Berkus <josh@berkus.org> wrote:
> > Q1. Why does wal_level default to "replica" and not "logical"?
> 
> The difference of WAL generated is way higher between
> archive->hot_standby than hot_standby->logical. And unlike replica,
> logical decoding is not something that is widely spread in user's
> deployments to justify changing to such a default. At least that's
> what I recall on the matter.

Right.  I think what we really want there is some form of magic
switching to logical when a slot is present.  Thats easy enough on the
master, a good bit harder when we allow decoding on standbys, which
Craig's working on.


> > Q2: I thought we were going to finally change the pg_dump default to
> > "custom" format in this release?  No?
> 
> I don't recall any discussion on this matter, but my memory may fail me.

Nothing here either.


- Andres



Re: [HACKERS] Notes on testing Postgres 10b1

From
Peter Eisentraut
Date:
On 6/7/17 01:01, Josh Berkus wrote:
> * Having defaults on the various _workers all devolve from max_workers
> is also great.

I'm not aware of anything like that happening.

> P1. On the publishing node, logical replication relies on the *implied*
> correspondence of the application_name and the replication_slot both
> being named the same as the publication in order to associate a
> particular publication with a particular replication connection.
> However, there's absolutely nothing preventing me from also creating a
> binary replication connection by the same name  It really seems like we
> need a field in pg_stat_replication or pg_replication_slots which lists
> the publication.

I'm not quite sure what you are getting at here.  The application_name
seen on the publisher side is the subscription name.  You can create a
binary replication connection using the same application_name, but
that's already been possible before.  But the publications don't care
about any of this.

> P2: If I create a subscription on a table with no primary key, I do not
> recieve a warning.  There should be a warning, since in most cases such
> a subscription will not work.  I suggest the text:
> 
> "logical replication target relation "public.fines" has no primary key.
> Either create one, or set REPLICA IDENTITY index and set the published
> relation to REPLICA IDENTITY FULL."

At that point, we don't know what is being published.  If only inserts
are being published or REPLICA IDENTITY FULL is set, then it will work.
We don't want to give warnings about things that might not be true.

More guidance on some of the potential failure cases would be good, but
it would need more refinement.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Notes on testing Postgres 10b1

From
Petr Jelinek
Date:
Hi,

On 07/06/17 07:01, Josh Berkus wrote:
> Folks,
> 
> I've put together some demos on PostgreSQL 10beta1.  Here's a few
> feedback notes based on my experience with it.
> [...snip...]
> 
> Problems
> --------
> 
> P1. On the publishing node, logical replication relies on the *implied*
> correspondence of the application_name and the replication_slot both
> being named the same as the publication in order to associate a
> particular publication with a particular replication connection.
> However, there's absolutely nothing preventing me from also creating a
> binary replication connection by the same name  It really seems like we
> need a field in pg_stat_replication or pg_replication_slots which lists
> the publication.
> 

What do you mean implied correspondence of application_name and the
replication_slot? We only use subscription_name as default value for
those when user does not specify something else, all three of those can
have different value if user sets it up that way. And there is no
correspondence whatsoever to names of publications. The upstream only
knows which publications to replicate because subscription gives list of
requested publications as option to START_REPLICATION walsender command.
The list of publications associated with a subscription are only stored
on the subscriber and publisher has no idea what those are.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: [HACKERS] Notes on testing Postgres 10b1

From
Josh Berkus
Date:
Peter and Petr:

On 06/07/2017 05:24 PM, Peter Eisentraut wrote:
> On 6/7/17 01:01, Josh Berkus wrote:
>> * Having defaults on the various _workers all devolve from max_workers
>> is also great.
> 
> I'm not aware of anything like that happening.
> 
>> P1. On the publishing node, logical replication relies on the *implied*
>> correspondence of the application_name and the replication_slot both
>> being named the same as the publication in order to associate a
>> particular publication with a particular replication connection.
>> However, there's absolutely nothing preventing me from also creating a
>> binary replication connection by the same name  It really seems like we
>> need a field in pg_stat_replication or pg_replication_slots which lists
>> the publication.
> 
> I'm not quite sure what you are getting at here.  The application_name
> seen on the publisher side is the subscription name.  You can create a
> binary replication connection using the same application_name, but
> that's already been possible before.  But the publications don't care
> about any of this.

My point is that there is no system view where I can see, on the origin
node, what subscribers are subscribing to which publications.  You can
kinda guess that from pg_stat_replication etc., but it's not dependable
information.


>> P2: If I create a subscription on a table with no primary key, I do not
>> recieve a warning.  There should be a warning, since in most cases such
>> a subscription will not work.  I suggest the text:
>>
>> "logical replication target relation "public.fines" has no primary key.
>> Either create one, or set REPLICA IDENTITY index and set the published
>> relation to REPLICA IDENTITY FULL."
> 
> At that point, we don't know what is being published.  If only inserts
> are being published or REPLICA IDENTITY FULL is set, then it will work.
> We don't want to give warnings about things that might not be true.
> 
> More guidance on some of the potential failure cases would be good, but
> it would need more refinement.

Hmmm, yah, I see.  Let me explain why this is a UX issue as-is though:

1. User forgets to create a PK on the subscriber node.

2. User starts a subscription to the tables.

3. Subscription is successful.

4. First update hits the publisher node.

5. Subscription fails and disconnects.

The user's first thought is going to be a network issue, or a bug, or
some other problem, not a missing PK.  Yeah, they can find that
information in the logs, but only if they think to look for it in the
first place, and in some environments (AWS, containers, etc.) logs can
be very hard to access.

We really need the subscription to fail at step (2), not wait for the
first update to fail.  And if it doesn't fail at step 2, then we should
at least give a warning.

-- 
Josh Berkus
Containers & Databases Oh My!



Re: [HACKERS] Notes on testing Postgres 10b1

From
Petr Jelinek
Date:
On 08/06/17 03:19, Josh Berkus wrote:
> 
> Peter and Petr:
> 
> On 06/07/2017 05:24 PM, Peter Eisentraut wrote:
>> On 6/7/17 01:01, Josh Berkus wrote:
>>> * Having defaults on the various _workers all devolve from max_workers
>>> is also great.
>>
>> I'm not aware of anything like that happening.
>>
>>> P1. On the publishing node, logical replication relies on the *implied*
>>> correspondence of the application_name and the replication_slot both
>>> being named the same as the publication in order to associate a
>>> particular publication with a particular replication connection.
>>> However, there's absolutely nothing preventing me from also creating a
>>> binary replication connection by the same name  It really seems like we
>>> need a field in pg_stat_replication or pg_replication_slots which lists
>>> the publication.
>>
>> I'm not quite sure what you are getting at here.  The application_name
>> seen on the publisher side is the subscription name.  You can create a
>> binary replication connection using the same application_name, but
>> that's already been possible before.  But the publications don't care
>> about any of this.
> 
> My point is that there is no system view where I can see, on the origin
> node, what subscribers are subscribing to which publications.  You can
> kinda guess that from pg_stat_replication etc., but it's not dependable
> information.
> 

That's like wanting the foreign server to show you which foreign tables
exist on the local server. This is not a tightly coupled system and you
are able to setup both sides without them being connected to each other
at the time of setup, so there is no way publisher can know anything.

> 
>>> P2: If I create a subscription on a table with no primary key, I do not
>>> recieve a warning.  There should be a warning, since in most cases such
>>> a subscription will not work.  I suggest the text:
>>>
>>> "logical replication target relation "public.fines" has no primary key.
>>> Either create one, or set REPLICA IDENTITY index and set the published
>>> relation to REPLICA IDENTITY FULL."
>>
>> At that point, we don't know what is being published.  If only inserts
>> are being published or REPLICA IDENTITY FULL is set, then it will work.
>> We don't want to give warnings about things that might not be true.
>>
>> More guidance on some of the potential failure cases would be good, but
>> it would need more refinement.
> 
> Hmmm, yah, I see.  Let me explain why this is a UX issue as-is though:
> 
> 1. User forgets to create a PK on the subscriber node.
> 
> 2. User starts a subscription to the tables.
> 
> 3. Subscription is successful.
> 
> 4. First update hits the publisher node.
> 
> 5. Subscription fails and disconnects.
> 
> The user's first thought is going to be a network issue, or a bug, or
> some other problem, not a missing PK.  Yeah, they can find that
> information in the logs, but only if they think to look for it in the
> first place, and in some environments (AWS, containers, etc.) logs can
> be very hard to access.
> 
> We really need the subscription to fail at step (2), not wait for the
> first update to fail.  And if it doesn't fail at step 2, then we should
> at least give a warning.
> 

Yes, I actually mentioned somewhere at some point that we should call
the checks we call during the replication also from the appropriate DDL
commands when possible (the information might not be available when the
DDL is executed), but never got to actually implementing it.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: [HACKERS] Notes on testing Postgres 10b1

From
Peter Eisentraut
Date:
On 6/7/17 21:19, Josh Berkus wrote:
> The user's first thought is going to be a network issue, or a bug, or
> some other problem, not a missing PK.  Yeah, they can find that
> information in the logs, but only if they think to look for it in the
> first place, and in some environments (AWS, containers, etc.) logs can
> be very hard to access.

You're not going to get very far with using this feature if you are not
looking in the logs for errors.  These are asynchronously operating
background workers, so the only way they can communicate problems is
through the log.

I don't disagree with your general premise.  We have done a fair amount
of fiddling already to show some errors as early as possible.  But we
can't know all of them, and we shouldn't give the impression that we do.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Notes on testing Postgres 10b1

From
Josh Berkus
Date:
On 06/07/2017 06:25 PM, Petr Jelinek wrote:
> On 08/06/17 03:19, Josh Berkus wrote:
>>
>> Peter and Petr:
>>
>> On 06/07/2017 05:24 PM, Peter Eisentraut wrote:
>>> On 6/7/17 01:01, Josh Berkus wrote:
>>>> * Having defaults on the various _workers all devolve from max_workers
>>>> is also great.
>>>
>>> I'm not aware of anything like that happening.
>>>
>>>> P1. On the publishing node, logical replication relies on the *implied*
>>>> correspondence of the application_name and the replication_slot both
>>>> being named the same as the publication in order to associate a
>>>> particular publication with a particular replication connection.
>>>> However, there's absolutely nothing preventing me from also creating a
>>>> binary replication connection by the same name  It really seems like we
>>>> need a field in pg_stat_replication or pg_replication_slots which lists
>>>> the publication.
>>>
>>> I'm not quite sure what you are getting at here.  The application_name
>>> seen on the publisher side is the subscription name.  You can create a
>>> binary replication connection using the same application_name, but
>>> that's already been possible before.  But the publications don't care
>>> about any of this.
>>
>> My point is that there is no system view where I can see, on the origin
>> node, what subscribers are subscribing to which publications.  You can
>> kinda guess that from pg_stat_replication etc., but it's not dependable
>> information.
>>
> 
> That's like wanting the foreign server to show you which foreign tables
> exist on the local server. This is not a tightly coupled system and you
> are able to setup both sides without them being connected to each other
> at the time of setup, so there is no way publisher can know anything.

Why wouldn't the publisher know who's connected once the replication
connection as been made and the subscription has started?  Or is it just
a log position, and the publisher really has no idea how many
publications are being consumed?


-- 
Josh Berkus
Containers & Databases Oh My!



Re: [HACKERS] Notes on testing Postgres 10b1

From
Petr Jelinek
Date:
On 08/06/17 03:50, Josh Berkus wrote:
> On 06/07/2017 06:25 PM, Petr Jelinek wrote:
>> On 08/06/17 03:19, Josh Berkus wrote:
>>>
>>> Peter and Petr:
>>>
>>> On 06/07/2017 05:24 PM, Peter Eisentraut wrote:
>>>> On 6/7/17 01:01, Josh Berkus wrote:
>>>>> * Having defaults on the various _workers all devolve from max_workers
>>>>> is also great.
>>>>
>>>> I'm not aware of anything like that happening.
>>>>
>>>>> P1. On the publishing node, logical replication relies on the *implied*
>>>>> correspondence of the application_name and the replication_slot both
>>>>> being named the same as the publication in order to associate a
>>>>> particular publication with a particular replication connection.
>>>>> However, there's absolutely nothing preventing me from also creating a
>>>>> binary replication connection by the same name  It really seems like we
>>>>> need a field in pg_stat_replication or pg_replication_slots which lists
>>>>> the publication.
>>>>
>>>> I'm not quite sure what you are getting at here.  The application_name
>>>> seen on the publisher side is the subscription name.  You can create a
>>>> binary replication connection using the same application_name, but
>>>> that's already been possible before.  But the publications don't care
>>>> about any of this.
>>>
>>> My point is that there is no system view where I can see, on the origin
>>> node, what subscribers are subscribing to which publications.  You can
>>> kinda guess that from pg_stat_replication etc., but it's not dependable
>>> information.
>>>
>>
>> That's like wanting the foreign server to show you which foreign tables
>> exist on the local server. This is not a tightly coupled system and you
>> are able to setup both sides without them being connected to each other
>> at the time of setup, so there is no way publisher can know anything.
> 
> Why wouldn't the publisher know who's connected once the replication
> connection as been made and the subscription has started?  Or is it just
> a log position, and the publisher really has no idea how many
> publications are being consumed?
> 

Plugin knows while the connection exists, but that's the thing, it goes
through pluggable interface (that can be used by other plugins, without
publications) so there would have to be some abstracted way for plugins
to give some extra information for the pg_stat_replication or similar
view. I am afraid it's bit too late to design something like that in
PG10 cycle.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



Re: [HACKERS] Notes on testing Postgres 10b1

From
Josh Berkus
Date:
On 06/07/2017 07:01 PM, Petr Jelinek wrote:
> On 08/06/17 03:50, Josh Berkus wrote:
>> On 06/07/2017 06:25 PM, Petr Jelinek wrote:
>>> On 08/06/17 03:19, Josh Berkus wrote:
>>>>
>>>> Peter and Petr:
>>>>
>>>> On 06/07/2017 05:24 PM, Peter Eisentraut wrote:
>>>>> On 6/7/17 01:01, Josh Berkus wrote:
>>>>>> * Having defaults on the various _workers all devolve from max_workers
>>>>>> is also great.
>>>>>
>>>>> I'm not aware of anything like that happening.
>>>>>
>>>>>> P1. On the publishing node, logical replication relies on the *implied*
>>>>>> correspondence of the application_name and the replication_slot both
>>>>>> being named the same as the publication in order to associate a
>>>>>> particular publication with a particular replication connection.
>>>>>> However, there's absolutely nothing preventing me from also creating a
>>>>>> binary replication connection by the same name  It really seems like we
>>>>>> need a field in pg_stat_replication or pg_replication_slots which lists
>>>>>> the publication.
>>>>>
>>>>> I'm not quite sure what you are getting at here.  The application_name
>>>>> seen on the publisher side is the subscription name.  You can create a
>>>>> binary replication connection using the same application_name, but
>>>>> that's already been possible before.  But the publications don't care
>>>>> about any of this.
>>>>
>>>> My point is that there is no system view where I can see, on the origin
>>>> node, what subscribers are subscribing to which publications.  You can
>>>> kinda guess that from pg_stat_replication etc., but it's not dependable
>>>> information.
>>>>
>>>
>>> That's like wanting the foreign server to show you which foreign tables
>>> exist on the local server. This is not a tightly coupled system and you
>>> are able to setup both sides without them being connected to each other
>>> at the time of setup, so there is no way publisher can know anything.
>>
>> Why wouldn't the publisher know who's connected once the replication
>> connection as been made and the subscription has started?  Or is it just
>> a log position, and the publisher really has no idea how many
>> publications are being consumed?
>>
> 
> Plugin knows while the connection exists, but that's the thing, it goes
> through pluggable interface (that can be used by other plugins, without
> publications) so there would have to be some abstracted way for plugins
> to give some extra information for the pg_stat_replication or similar
> view. I am afraid it's bit too late to design something like that in
> PG10 cycle.

OK, consider it a feature request for PG11, then.


-- 
Josh Berkus
Containers & Databases Oh My!



Re: [HACKERS] Notes on testing Postgres 10b1

From
Josh Berkus
Date:
On 06/07/2017 06:37 PM, Peter Eisentraut wrote:
> On 6/7/17 21:19, Josh Berkus wrote:
>> The user's first thought is going to be a network issue, or a bug, or
>> some other problem, not a missing PK.  Yeah, they can find that
>> information in the logs, but only if they think to look for it in the
>> first place, and in some environments (AWS, containers, etc.) logs can
>> be very hard to access.
> 
> You're not going to get very far with using this feature if you are not
> looking in the logs for errors.  These are asynchronously operating
> background workers, so the only way they can communicate problems is
> through the log.

Well, we *could* provide a system view, as we now do for archiving, and
for the same reasons.

The issue isn't that the error detail is in the log.  It's somehow
letting the user know that they need to look at the log, as opposed to
somewhere else.  Consider that this is asynchonous for the user as well;
they are likely to find out about the broken replication well after it
happens, and thus have a lot of log to search through.

Activity logs are a *terrible* UI for debugging systems problems.  I
realize that there is information it's hard for us to provide any other
way.  But the logs should be our "monitoring of last resort", where we
put stuff after we've run out of ideas on where else to put it, because
they are the hardest thing to access for a user.

-- 
Josh Berkus
Containers & Databases Oh My!



Re: [HACKERS] Notes on testing Postgres 10b1

From
Peter Eisentraut
Date:
On 6/8/17 13:15, Josh Berkus wrote:
> Well, we *could* provide a system view, as we now do for archiving, and
> for the same reasons.

Which view are you referring to here?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Notes on testing Postgres 10b1

From
Greg Stark
Date:
On 7 June 2017 at 01:01, Josh Berkus <josh@berkus.org> wrote:
> P3: apparently jsonb_to_tsvector with lang parameter isn't immutable?
> This means that it can't be used for indexing:
>
> libdata=# create index bookdata_fts on bookdata using gin ((
> to_tsvector('english',bookdata)));
> ERROR:  functions in index expression must be marked IMMUTABLE

I don't have a machine handy to check on but isn't this a strange
thing to do? Isn't there a GIN opclass on jsonb itself which would be
the default if you didn't have that to_tsvector() call  -- and which
would also work properly with the jsonb operators?

-- 
greg



Re: [HACKERS] Notes on testing Postgres 10b1

From
Josh Berkus
Date:
On 06/09/2017 07:54 PM, Greg Stark wrote:
> On 7 June 2017 at 01:01, Josh Berkus <josh@berkus.org> wrote:
>> P3: apparently jsonb_to_tsvector with lang parameter isn't immutable?
>> This means that it can't be used for indexing:
>>
>> libdata=# create index bookdata_fts on bookdata using gin ((
>> to_tsvector('english',bookdata)));
>> ERROR:  functions in index expression must be marked IMMUTABLE
> 
> I don't have a machine handy to check on but isn't this a strange
> thing to do? Isn't there a GIN opclass on jsonb itself which would be
> the default if you didn't have that to_tsvector() call  -- and which
> would also work properly with the jsonb operators?
> 

The above is the documented way to create an FTS index on a JSONB field.

-- 
Josh Berkus
Containers & Databases Oh My!