Thread: [HACKERS] bytea_output vs make installcheck

[HACKERS] bytea_output vs make installcheck

From

Jeff Janes

Date:

15 February 2017, 00:50:28

make installcheck currently fails against a server running with bytea_output = escape.

Making it succeed is fairly easy, and I think it is worth doing.

Attached are two options for doing that. One overrides bytea_output locally where-ever needed, and the other overrides it for the entire 'regression' database.

Cheers,

Jeff

Attachment

Re: [HACKERS] bytea_output vs make installcheck

From

neha khatri

Date:

15 February 2017, 02:04:24

On Wed, Feb 15, 2017 at 8:50 AM, Jeff Janes <jeff.janes@gmail.com> wrote:

make installcheck currently fails against a server running with bytea_output = escape.

Making it succeed is fairly easy, and I think it is worth doing.

Attached are two options for doing that. One overrides bytea_output locally where-ever needed, and the other overrides it for the entire 'regression' database.

The solution that overrides bytea_output locally looks appropriate. It may not be required to change the format for entire database.

Had there been a way to convert bytea_output from 'hex' to 'escape' internally, that could have simplified this customisation even more.

Regards,
Neha

Cheers,
Neha

On Wed, Feb 15, 2017 at 8:50 AM, Jeff Janes <jeff.janes@gmail.com> wrote:

make installcheck currently fails against a server running with bytea_output = escape.

Making it succeed is fairly easy, and I think it is worth doing.

Attached are two options for doing that. One overrides bytea_output locally where-ever needed, and the other overrides it for the entire 'regression' database.

Cheers,

Jeff

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] bytea_output vs make installcheck

From

neha khatri

Date:

15 February 2017, 08:02:14

On Wed, Feb 15, 2017 at 10:04 AM, neha khatri <nehakhatri5@gmail.com> wrote:.

Attached are two options for doing that. One overrides bytea_output locally where-ever needed, and the other overrides it for the entire 'regression' database.

The solution that overrides bytea_output locally looks appropriate. It may not be required to change the format for entire database.
Had there been a way to convert bytea_output from 'hex' to 'escape' internally, that could have simplified this customization even more.

Well, the conversion from 'hex' to 'escape' is available using the function encode().

So the queries that are failing due to the setting bytea_output = escape, can be wrapped under encode(), to obtain the result in 'escape' format.

Here is another way to resolve the same problem. The patch is attached.

Regards,
Neha

Attachment

intallcheck_bytea_output_escape.patch

Re: [HACKERS] bytea_output vs make installcheck

From

Andres Freund

Date:

15 February 2017, 08:17:59


On February 14, 2017 9:02:14 PM PST, neha khatri <nehakhatri5@gmail.com> wrote:
>On Wed, Feb 15, 2017 at 10:04 AM, neha khatri <nehakhatri5@gmail.com>
> wrote:.
>>
>>
>>> Attached are two options for doing that.  One overrides bytea_output
>>> locally where-ever needed, and the other overrides it for the entire
>>> 'regression' database.
>>>
>>
>> The solution that overrides bytea_output locally looks appropriate.
>It may
>> not be required to change the format for entire database.
>> Had there been a way to convert  bytea_output from 'hex' to 'escape'
>> internally, that could have simplified this customization even more.
>>
>
>Well, the conversion from 'hex' to 'escape' is available using the
>function
>encode().
>So the queries that are failing due to the setting bytea_output =
>escape,
>can be wrapped under encode(), to obtain the result in 'escape' format.
>Here is another way to resolve the same problem. The patch is attached.

I don't quite see the point of this - there's a lot of settings that cause spurious test failures. I don't see any
pointfixing random cases of that.  And I don't think the continual cost of doing so overall is worth the minimal gain. 

What's your reason to get this fixed?

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: [HACKERS] bytea_output vs make installcheck

From

Tom Lane

Date:

15 February 2017, 08:30:49

Andres Freund <andres@anarazel.de> writes:
> I don't quite see the point of this - there's a lot of settings that cause spurious test failures. I don't see any
pointfixing random cases of that.  And I don't think the continual cost of doing so overall is worth the minimal gain. 

> What's your reason to get this fixed?

FWIW, I'm inclined to do something about Jeff's nearby complaint about
operator_precedence_warning, because the cause of that failure is pretty
obscure.  I'm less excited about this one, because it should be obvious
what happened to anyone who looks at the regression diffs.

In general I think there's value in "make installcheck" passing when
it reasonably can, but you're quite right that there's a lot of setting
changes that would break it, and not all are going to be practical to
fix.
        regards, tom lane

Re: [HACKERS] bytea_output vs make installcheck

From

Jeff Janes

Date:

15 February 2017, 20:30:39

On Tue, Feb 14, 2017 at 9:17 PM, Andres Freund <andres@anarazel.de> wrote:

On February 14, 2017 9:02:14 PM PST, neha khatri <nehakhatri5@gmail.com> wrote:
>On Wed, Feb 15, 2017 at 10:04 AM, neha khatri <nehakhatri5@gmail.com>
> wrote:.
>>
>>
>>> Attached are two options for doing that. One overrides bytea_output
>>> locally where-ever needed, and the other overrides it for the entire
>>> 'regression' database.
>>>
>>
>> The solution that overrides bytea_output locally looks appropriate.
>It may
>> not be required to change the format for entire database.
>> Had there been a way to convert bytea_output from 'hex' to 'escape'
>> internally, that could have simplified this customization even more.
>>
>
>Well, the conversion from 'hex' to 'escape' is available using the
>function
>encode().
>So the queries that are failing due to the setting bytea_output =
>escape,
>can be wrapped under encode(), to obtain the result in 'escape' format.
>Here is another way to resolve the same problem. The patch is attached.

I don't quite see the point of this - there's a lot of settings that cause spurious test failures. I don't see any point fixing random cases of that. And I don't think the continual cost of doing so overall is worth the minimal gain.

What's your reason to get this fixed?

More testing is better than less testing, and a good way to get less testing is requiring the tester to memorize a list of false positives they might encounter. I'd like to systematically clone my production system, run it through pg_upgrade, and then do installcheck (plus my own app-specific) on the result, and I'd like others to do that as well with their own production systems.

I don't really see the cost here. It took me longer to satisfy myself that this was not actually a real bug than it did to write the patch. Now much of that time was just because there were 3 other problems as well, which makes isolating and evaluating them exponentially harder--which itself is a good reason for fixing the ones that are easy to fix, so we don't have to get distracted by those while investigating the other ones. And if we go with the option 2 patch, it is one line which seems pretty self-documenting and easy to maintain. I'd rather light a candle than to curse the darkness, at least when the candle is this easy to light.

Cheers,

Jeff

Re: [HACKERS] bytea_output vs make installcheck

From

neha khatri

Date:

16 February 2017, 01:51:59

Agreed with Jeff, false alarms should be avoided, whenever it is easy to put the avoiding mechanism in place.

Regards,
Neha

Re: [HACKERS] bytea_output vs make installcheck

From

Andres Freund

Date:

16 February 2017, 02:04:31

Hi,

On 2017-02-15 09:30:39 -0800, Jeff Janes wrote:
> On Tue, Feb 14, 2017 at 9:17 PM, Andres Freund <andres@anarazel.de> wrote:
> > What's your reason to get this fixed?
> >
> 
> More testing is better than less testing, and a good way to get less
> testing is requiring the tester to memorize a list of false positives they
> might encounter.  I'd like to systematically clone my production system,
> run it through pg_upgrade, and then do installcheck (plus my own
> app-specific) on the result, and I'd like others to do that as well with
> their own production systems.

> I don't really see the cost here.

Because that means we essentially need to make sure that our tests pass
with a combination of about ~20-30 behaviour changing gucs, and ~5
different compilation settings that change output.  Either we do that
systematically - which'd be a fair amount of effort - or we're not
getting anywhere, because the setting around the next corner breaks a
bunch of different tests.

Should tests pass with random client_min_messages settings? Random
enable_? Switched default_with_oids? statement_timeout? Could go on a
long while.

Having to think about all GUCs/compile time settings that could
potentially affect a test and manually set everything up so they don't
makes writing tests a lot harder, and we'll fail anyway unless it's
checked in an automated fashion.  Unless that testing is done you're not
really benefiting, because you can't rely on test failures meaning much.

If we want to have a list of GUCs that we want tests to pass under, then
the proponents of that at least should bring up a buildfarm animal
afterwards that test that reasonable combinations of those pass and
continue to pass.

Alternatively we could ALTER DATABASE a bunch of settings on the
regression database at the start, but I'm not sure that's nice either,
because it makes ad-hoc tests with unusual settings harder.

Greetings,

Andres Freund

Re: [HACKERS] bytea_output vs make installcheck

From

Tom Lane

Date:

16 February 2017, 02:30:30

Andres Freund <andres@anarazel.de> writes:
> On 2017-02-15 09:30:39 -0800, Jeff Janes wrote:
>> I don't really see the cost here.

> Because that means we essentially need to make sure that our tests pass
> with a combination of about ~20-30 behaviour changing gucs, and ~5
> different compilation settings that change output.

Yeah, the problem with addressing this non-systematically is that it'll
never stay fixed for long.

> Alternatively we could ALTER DATABASE a bunch of settings on the
> regression database at the start, but I'm not sure that's nice either,
> because it makes ad-hoc tests with unusual settings harder.

I'd definitely be -1 on that.

I think that it is worth fixing cases where a parameter change leads to
surprising results, like the operator_precedence_warning case just now.
But people should not be surprised when, say, changes in planner
parameters lead to different EXPLAIN output or different row ordering.
If we tried to lock that down it'd be counterproductive for the reason
Andres mentions: sometimes you *want* to see what you get for other
settings.
        regards, tom lane

Re: [HACKERS] bytea_output vs make installcheck

From

Andres Freund

Date:

16 February 2017, 02:32:50

On 2017-02-15 18:30:30 -0500, Tom Lane wrote:
> If we tried to lock that down it'd be counterproductive for the reason
> Andres mentions: sometimes you *want* to see what you get for other
> settings.

We could kinda address that by doing it in a separate file early in the
schedule, which could just be commented out when doing something like
this.  But I'm still unconvinced it's worth caring.

Re: [HACKERS] bytea_output vs make installcheck

From

Tom Lane

Date:

16 February 2017, 02:51:03

Andres Freund <andres@anarazel.de> writes:
> On 2017-02-15 18:30:30 -0500, Tom Lane wrote:
>> If we tried to lock that down it'd be counterproductive for the reason
>> Andres mentions: sometimes you *want* to see what you get for other
>> settings.

> We could kinda address that by doing it in a separate file early in the
> schedule, which could just be commented out when doing something like
> this.  But I'm still unconvinced it's worth caring.

Actually, that idea might be worth pursuing.  Right now pg_regress.c has a
hard-wired notion that it should ALTER DATABASE SET certain parameters on
the regression database.  The best you can say for that is it's ugly as
sin.  It'd definitely be nicer if we could move that into someplace where
it's more readily adjustable.  Having done that, locking down most stuff
by default might become more practical.

However, I'm not sure that just dumping the responsibility into an initial
test script is very workable.  We'd need another copy for each contrib
and PL test suite, the isolation tests, yadda yadda.  And maintaining
that many copies would be a nightmare.  I think we'd need some way to have
pg_regress pick up the desired settings from a central place.
        regards, tom lane

Re: [HACKERS] bytea_output vs make installcheck

From

Peter Eisentraut

Date:

09 March 2017, 22:14:30

On 2/14/17 16:50, Jeff Janes wrote:
> make installcheck currently fails against a server running
> with bytea_output = escape.
> 
> Making it succeed is fairly easy, and I think it is worth doing.
> 
> Attached are two options for doing that.  One overrides bytea_output
> locally where-ever needed, and the other overrides it for the entire
> 'regression' database.

I would use option 2 here (ALTER DATABASE) and be done with it.  Some
people didn't like using ALTER DATABASE, but it's consistent with
existing use.  If someone wants to change that, that can be independent
of this issue.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [HACKERS] bytea_output vs make installcheck

From

Neha Khatri

Date:

10 March 2017, 02:45:37

On Fri, Mar 10, 2017 at 6:14 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

On 2/14/17 16:50, Jeff Janes wrote:
> make installcheck currently fails against a server running
> with bytea_output = escape.
>
> Making it succeed is fairly easy, and I think it is worth doing.
>
> Attached are two options for doing that. One overrides bytea_output
> locally where-ever needed, and the other overrides it for the entire
> 'regression' database.

I would use option 2 here (ALTER DATABASE) and be done with it. Some
people didn't like using ALTER DATABASE, but it's consistent with
existing use. If someone wants to change that, that can be independent
of this issue.

Sorry about the naive question, but if someone has set the GUC bytea_output = 'escape', then the intention seem to be to obtain the output in 'escape' format for bytea.

With this, if an installcheck is done, that might also have been done with the expectation that the output will be in 'escape' format. In that case, how much is it justified to hard code the format for regression database? However, I agree that there are not many bytea outputs in the current regression suite

Regards,
Neha

Re: [HACKERS] bytea_output vs make installcheck

From

"David G. Johnston"

Date:

10 March 2017, 02:57:01

On Thu, Mar 9, 2017 at 4:45 PM, Neha Khatri <nehakhatri5@gmail.com> wrote:

On Fri, Mar 10, 2017 at 6:14 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 2/14/17 16:50, Jeff Janes wrote:
> make installcheck currently fails against a server running
> with bytea_output = escape.
>
> Making it succeed is fairly easy, and I think it is worth doing.
>
> Attached are two options for doing that. One overrides bytea_output
> locally where-ever needed, and the other overrides it for the entire
> 'regression' database.

I would use option 2 here (ALTER DATABASE) and be done with it. Some
people didn't like using ALTER DATABASE, but it's consistent with
existing use. If someone wants to change that, that can be independent
of this issue.

Sorry about the naive question, but if someone has set the GUC bytea_output = 'escape', then the intention seem to be to obtain the output in 'escape' format for bytea.
With this, if an installcheck is done, that might also have been done with the expectation that the output will be in 'escape' format. In that case, how much is it justified to hard code the format for regression database? However, I agree that there are not many bytea outputs in the current regression suite

At a high level (which is all I know here) If we leave behind tests that at least exercise bytea_output='escape' mode to ensure it is functioning properly then I'd have no problem having the testing of other features dependent upon bytea_output, but that are coded to compare against the now-default output format, set that runtime configurable mode to that which they require. If the choice of output mode is simply a byproduct we should be free to set it to whatever we need for the currently executing test.

If a simple way of doing this involves fixing the default to what the suite expects and one-off changing it when testing escape mode stuff that seems like a reasonable position to take. Having to set bytea_output when it isn't the item under test seems like its just going to add noise.

David J.

Re: [HACKERS] bytea_output vs make installcheck

From

Robert Haas

Date:

10 March 2017, 06:31:53

On Thu, Mar 9, 2017 at 6:45 PM, Neha Khatri <nehakhatri5@gmail.com> wrote:
> Sorry about the naive question, but if someone has set the GUC bytea_output
> = 'escape', then the intention seem to be to obtain the output in 'escape'
> format for bytea.
> With this, if an installcheck is done, that might also have been done with
> the expectation that the output will be in 'escape' format. In that case,
> how much is it justified to hard code the format for regression database?
> However, I agree that there are not many bytea outputs in the current
> regression suite

I don't understand this.  People don't run the regression tests to get
the output.  They run the regression tests to see whether they pass.
While it may not be possible to make them pass with arbitrarily-crazy
settings, that's not a reason not to patch up the cases we can handle
sanely.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] bytea_output vs make installcheck

From

Tom Lane

Date:

14 March 2017, 01:36:28

Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Mar 9, 2017 at 6:45 PM, Neha Khatri <nehakhatri5@gmail.com> wrote:
>> With this, if an installcheck is done, that might also have been done with
>> the expectation that the output will be in 'escape' format. In that case,
>> how much is it justified to hard code the format for regression database?
>> However, I agree that there are not many bytea outputs in the current
>> regression suite

> I don't understand this.  People don't run the regression tests to get
> the output.  They run the regression tests to see whether they pass.
> While it may not be possible to make them pass with arbitrarily-crazy
> settings, that's not a reason not to patch up the cases we can handle
> sanely.

I think the question that has to be settled to move this forward is
whether we're content with hard-wiring something for bytea_output
(as in Jeff's installcheck_bytea_fix_2.patch, which I concur with
Peter is superior to installcheck_bytea_fix_1.patch), or whether
we want to hold out for a more flexible solution, probably about like
what I sketched in
https://www.postgresql.org/message-id/30246.1487202663%40sss.pgh.pa.us

I think the more flexible solution is definitely a desirable place to
get to, but somehow I doubt it's going to happen for v10.  Meanwhile
the question is whether adding more hard-wired behavior in pg_regress
is desirable or not.

I tend to vote with Andres that it's not worth the trouble, but
considering that it's only a 2-line change, I won't stand in the
way if some other committer is convinced this is an improvement.
        regards, tom lane