2018-03 Commitfest Summary (Andres #1)

From
Andres Freund
Date:
Hi,


Going through all non bugfix CF entries. Here's the summary for the
entries I could stomach tonight:

RFC: ready for committer
NR: needs review
WOA: waiting on author.

- pgbench - allow to store query results into variables

  RFC, has been worked on for a while.  Patch doesn't look 100% there,
  but not that far away from it either.

  A bit concerned that we're turning pgbench into a kitchen sink.


- pgbench - another attempt at tap test for time-related options

  NR, another attempt at adding tests. I'd personally boot this to the
  next fest; it's definitely not critical.


- Tab completion for SELECT in psql

  NR. Submitted in November.

  There is discussion around version compatibility of tab completion,
  which doesn't seem like we necessarily want to conclude this late in
  the cycle.

  I'd personally kick this to the next fest.


- pgbench - add \if support

  RFC.

  I'm personally unconvinced we want this. Others on the thread seem
  unsure as well.


- pgbench - option to build using ppoll() for larger connection counts

  RFC. Seems like a reasonable idea. Looks like it needs a little bit
  of tlc, but seems reasonable to get into 11.


- WIP: Pgbench Errors and serialization/deadlock retries

  WOA. Has been waiting since mid January.

  Boot to next, even though I really would like to have this feature.


- General purpose hashing func in pgbench

  RFC. Does mostly look ready, although I'm not sure the docs are at the
  right level of detail.

  I have similar concerns to earlier pgbench patches adding new
  operators, though to a somewhat lesser degree.


- pgbench - add ability to control random seed

  NR. Patch in current form has only been submitted in January. Seems to
  need some work, although not too much.

  Probably doable for 11, arguably a bit late.


- Find additional connection service files in pg_service.conf.d
  directory

  NR. Patch submitted for the first time in January.

  Boot to next, as bad as that is for new contributors. Did a short
  review.


- pgbench - break out initialization timing info by init phase

  WOA. Patch submitted for the first time in January. Not updated.

  Boot to next. IMO this shouldn't even *be* in this commitfest.


- csv output format for psql

  WOA. Patch submitted for the first time end of January.

  It's unclear to me whether we want this, there's not been much debate
  on that so far. The use case seems fairly minimal to me.

  I'd boot.


- Using base backup exclusion filters to reduce data transferred with pg_rewind

  NR. Nontrivial patchset, submitted in February.

  50 files changed, 588 insertions(+), 351 deletions(-)

  There've been no comments on the patch so far. There is new
  infrastructure.

  I'd boot.


- pgbench - allow to specify scale as a size

  NR. Submitted first time mid February.  There's debate about whether
  this is the right thing to do.

  I'd boot.


- pgbench - test whether a variable exists

  NR. Submitted first time mid February.

  Quoting from author:
  "Note that it is not really that useful for benchmarking, although it
  does not harm."

  I'd reject or boot.


- Support target_session_attrs=read-only and eliminate the added round-trip to detect session status.

  This is a new CF entry for a patch that was removed from CF 2017-11;
  it still doesn't apply and hasn't been updated since October. Marked
  as returned.


- Parallel Dump to /dev/null

  NR: Allows directory dumps to /dev/null. Added to CF yesterday.

  As this is a late unreviewed nontrivial patch I'd argue this is too
  late, and we should move to the next CF. Message sent to thread.


- Comment of formdesc() forgets pg_subscription.

  NR. Trivial cleanup, will get resolved in some manner.


- Mention connection parameter "replication" in libpq section

  RFC. Doc fix. Should be doable. Commented.


- Correct the calculation of vm.nr_hugepages on Linux

  RFC. Doc fix. Should be doable. Commented.


- Updating parallel.sgml's treatment of parallel joins

  NR. Doc fix for committed change. Should get in.


- GUC for cleanup index threshold

  NR. Open for over a year.  Alexander Korotkov just tried to re-start
  discussion.

  Not quite sure if it's realistic unless somebody familiar with the
  code invests some time.


- default namespace support [for xpath]

  RFC.  Open for a while.  Nontrivial.

  I don't think it's quite RFC, comments and docs at least need a fair
  bit of polish.

  This probably deserves some attention.


- Vacuum: Update FSM more frequently

  NR. Older patch, but CF entry just created in January.

  Nice improvement, but it hasn't gotten a whole lot of review (starting
  end of Jan).  Possible, but somebody would need to be really
  interested.


- Test partition-wise join with partitioned tables containing default
  partition

  RFC. Tests for new feature. Should get in.


- Custom signals and handlers for extension

  WOA. Patch submitted 2017-12-22.

  There's not been much activity. This integrates with very fragile
  pieces of code, and I'm not quite sure we want this in the current
  form. But the feature is hugely useful for extensions. Will try to
  have a look in the next few days.

  Think it'd be reasonable to not try to merge this in v11.


- TAP test module - PostgresClient

  NR. I'm unclear why this has a CF entry in the current fest.

  Suggested marking as returned with feedback in thread.


- taking stdbool.h into use

  RFC. Doesn't seem to be the actual status. Patch by committer.
  Inquired on thread.


- (tiny) polish BGWorker example code and documentation

  NR. Minor code change. Should either commit or reject.


- Applying PMDK to WAL operations for persistent memory

  NR. Added in mid Jan.

  There is no way we can do anything actionable based on this thread in
  this CF. Commented on thread.

  Should move to next CF.


- Minor fixes for reloptions tests

  NR.

  Not sure if anybody will care. Could just apply.


- Failed to request an autovacuum work-item in silence

  NR. This seems more like a bugfix for recent code.

  Pinged Alvaro.

  Context:
  7526e10224f0792201e99631567bbe44492bbde4 : BRIN auto-summarization


- Control better SSL and LDAP test suites

  NR.  I can't get myself to care.


- Cast jsonb to numeric, int, float, bool

  RFC.  I have no opinion on whether we want this, but it does not seem
  unreasonable and is fairly simple. Might need a bit of error message
  polish.


- symlink installs

  NR.   Added last second.

  No idea whether we want this. Tom, you added behaviour this
  reverts. If you care ...


- amcheck heap verification

  NR. Been submitted for a while, a modicum of code review has been
  performed.

  Adds a fair bit of code:
   21 files changed, 1032 insertions(+), 64 deletions(-)

  but is low risk, being a readonly contrib module.

  Can probably be committed if somebody has the energy.


- Enhancement of pg_stat_wal_receiver view to display connected host

  NR. Submitted late December. Has gotten some review.

  Seems like a reasonable idea.  Adds a new libpq function, so deserves
  some careful attention.


Ok, bedtime. Will continue tomorrow.  Committed a few patches along the
way.



Greetings,

Andres Freund


Re: 2018-03 Commitfest Summary (Andres #1)

From
Fabien COELHO
Date:
Hello Andres,

Thanks for managing this CF.

> - pgbench - allow to store query results into variables
>
>  RFC, has been worked on for a while.  Patch doesn't look 100% there,
>  but not that far away from it either.

What would be missing to look 100% there?

>  A bit concerned that we're turning pgbench into a kitchen sink.

I do not understand "kitchen sink" expression in this context, and your 
general concerns about pgbench in various comments in your message.

Currently pgbench is a one-way thing: one can send queries but there is 
no way to manipulate the result and act upon it, which limits the kind of 
scenario that can be implemented to unconditional data-independent 
transactions.

This makes the tool a half-baked thing: it provides a lot of detailed 
logging and reporting features, which are definitely useful for any 
benchmarking, but then you cannot write anything useful with it, which 
is just too bad.

So this setting-variable-from-query patch goes with having boolean 
expressions (already committed), having conditions (\if in the queue), 
improving the available functions (eg hashes, in the queue)... so that 
existing, data-dependent, realistic benchmarks can be implemented, and 
benefit from the great performance data collection provided by the tool.
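
For instance, a script along these lines would become possible (a
sketch assuming the syntax from the patches in the queue; exact
details may differ):

```
\set aid random(1, 100000 * :scale)
SELECT abalance FROM pgbench_accounts WHERE aid = :aid \gset
\if :abalance < 0
    UPDATE pgbench_accounts SET abalance = 0 WHERE aid = :aid;
\else
    \set h hash(:aid)
    SELECT :h;
\endif
```

That is: capture a query result into a variable, then branch on it, so
the transaction depends on the data rather than being unconditional.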

> - pgbench - another attempt at tap test for time-related options
>
>  NR, another attempt at adding tests. I'd personally boot this to the
>  next fest, it's definitely not critical.

Indeed. The main point is to improve code coverage.

> - pgbench - add \if support
>
>  RFC.
>
>  I'm personally unconvinced we want this. Others on the thread seem
>  unsure as well.

See above.

> - General purpose hashing func in pgbench
>
>  RFC. Does mostly look ready, although I'm not sure the docs are at the
>  right level of detail.
>
>  I've similar concerns to earlier pgbench patches adding new
>  operators, even though to a bit lower degree.

See above. Such simple functions are used in actual benchmarks.

> - pgbench - test whether a variable exists
>
>  NR. Submitted first time mid February.
>
>  Quoting from author:
>  "Note that it is not really that useful for benchmarking, although it
>  does not harm."
>
>  I'd reject or boot.

As already said, the motivation is that it is a preparation for a (much) 
larger patch which would move pgbench expressions to fe utils and use them 
in "psql". If you do not want the final feature, there is no point in 
preparing, and you can reject it. However ISTM that the final feature is 
desired, hence the submission of this necessary step.

-- 
Fabien.


Re: 2018-03 Commitfest Summary (Andres #1)

From
Andres Freund
Date:
Hi,

On 2018-03-01 14:09:02 +0100, Fabien COELHO wrote:
> >  A bit concerned that we're turning pgbench into a kitchen sink.
> 
> I do not understand "kitchen sink" expression in this context, and your
> general concerns about pgbench in various comments in your message.

We're adding a lot of stuff to pgbench that only a few people
use. There's a lot of duplication with similar parts of code in other
parts of the codebase. pgbench in my opinion is a tool to facilitate
postgres development, not a goal in itself.

It's a bad measure, but the code growth shows my concerns somewhat:
master:        5660 +817
REL_10_STABLE: 4843 +266
REL9_6_STABLE: 4577 +424
REL9_5_STABLE: 4153 +464
REL9_4_STABLE: 3689 +562
REL9_3_STABLE: 3127 +338
REL9_2_STABLE: 2789 +96
REL9_1_STABLE: 2693


> So this setting-variable-from-query patch goes with having boolean
> expressions (already committed), having conditions (\if in the queue),
> improving the available functions (eg hashes, in the queue)... so that
> existing, data-dependent, realistic benchmarks can be implemented, and
> benefit for the great performance data collection provided by the tool.

I agree that they're useful in a few cases, but one has to consider
that they need to be reviewed and maintained, and the project is quite
resource constrained in that regard.


> > - pgbench - test whether a variable exists
> > 
> >  NR. Submitted first time mid February.
> > 
> >  Quoting from author:
> >  "Note that it is not really that useful for benchmarking, although it
> >  does not harm."
> > 
> >  I'd reject or boot.
> 
> As already said, the motivation is that it is a preparation for a (much)
> larger patch which would move pgbench expressions to fe utils and use them
> in "psql".

You could submit it together with that. But I don't see in the first
place why we need to add the feature with duplicate code, just so we can
unify. We can gain it via the unification, no?


Greetings,

Andres Freund


Re: 2018-03 Commitfest Summary (Andres #1)

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2018-03-01 14:09:02 +0100, Fabien COELHO wrote:
>>> A bit concerned that we're turning pgbench into a kitchen sink.

>> I do not understand "kitchen sink" expression in this context, and your
>> general concerns about pgbench in various comments in your message.

> We're adding a lot of stuff to pgbench that only a few people
> use. There's a lot of duplication with similar parts of code in other
> parts of the codebase. pgbench in my opinion is a tool to facilitate
> postgres development, not a goal in itself.

FWIW, I share Andres' concern that pgbench is being extended far past
what anyone has shown a need for.  If we had infinite resources this
wouldn't be a big problem, but it's eating into limited committer hours
and I'm not really convinced that we're getting adequate return.

            regards, tom lane


Re: 2018-03 Commitfest Summary (Andres #1)

From
Peter Geoghegan
Date:
On Thu, Mar 1, 2018 at 3:03 AM, Andres Freund <andres@anarazel.de> wrote:
> - amcheck heap verification
>
>   NR. Been submitted for a while, a modicum of code review has been
>   performed.
>
>   Adds a fair bit of code:
>    21 files changed, 1032 insertions(+), 64 deletions(-)
>
>   but is low risk, being a readonly contrib module.
>
>   Can probably committed if somebody has the energy.

A lot of that code is boilerplate test harness code. If you actually
look at the changes to amcheck, there are only a couple of hundred
lines of code, many of which are comments (there is also another 300
lines for the Bloom filter implementation). The changes to the
IndexBuildHeapScan() interface made by parallel CREATE INDEX allowed
me to simplify things considerably in the most recent revision.

-- 
Peter Geoghegan


Re: 2018-03 Commitfest Summary (Andres #2)

From
Andres Freund
Date:
On 2018-03-01 03:03:44 -0800, Andres Freund wrote:
> Going through all non bugfix CF entries. Here's the summary for the
> entries I could stomach tonight:
>
> RFC: ready for committer
> NR: needs review
> WOA: waiting on author.

Second round.

- Sample values for pg_stat_statements

  NR. Submitted to CF -1.

  Nontrivial, some issues. Needs some action by reviewers to get to
  being committable...


- pg_stat_statements with plans (v02)

  NR. Submitted a year ago and immediately booted, re-submitted this
  January.

  Cool but decidedly nontrivial feature. Should get some review, but I
  can't see us getting this in.


- "failed to find parent tuple for heap-only tuple" error as an ERRCODE_DATA_CORRUPTION ereport()

  NR. Should probably just get applied. Can't quite make myself care
  enough to interrupt right now.


- Function to track shmem reinit time

  Tracks time since the last crash restart.

  NR. Submitted first 2018-02-28.

  Submitted very late, OTOH it's pretty simple and seems to serve a
  valid monitoring need.


- Vacuum: allow usage of more than 1GB of work mem

  RFC. Has been alive since 2016-11.

  I think this is definitely due some serious attention.


- Protect syscache from bloating with negative cache entries

  WOA. Has been alive in some form for a year, but current patch form is
  recent.

  The patch affects a central part of the code, has not gotten any
  code review until just now. Likely problematic behaviour. No
  benchmarks done. I don't see this going in.


- Convert join OR clauses into UNION queries

  NR. ~1 year old.

  Feature has been requested several times independently since. Patch
  has been rebased, but otherwise not meaningfully changed.

  I think some committers need to bite the bullet and review this one.


- SERIALIZABLE with parallel query

  NR. A good year old, updated several times.

  This is complicated stuff. Robert mentioned a month ago
  (CA+TgmoYTiSsLOKG+wPdm2EQ8i+h_3reK4ub_RM-4zX0314r-SQ@mail.gmail.com)
  that he might commit barring some issues.  So I think it has a chance.


- Incremental sort

  NR. This is a *very* old "patch" (obviously has evolved).

  Looks to be in a reasonable shape. This needs at the very least a few
  planner & executor skilled committer cycles to do a review so it can
  progress.


- Moving relation extension locks out of heavyweight lock manager

  NR.

  Patch currently stalled a bit due to some edge case performance
  concerns we aren't yet sure matter. Otherwise I think close to commit.


- Improve compactify_tuples and PageRepairFragmentation

  NR.

  Current patch doesn't show that large gains afaict. Makes the
  specialized versions of qsort more maintainable though.


- Full merge join on comparison clause

  NR. Patch originates from a few months ago, current incarnation a few
  days old.

  I suspect this might need more high-level review than it has gotten,
  so it seems debatable whether it can get into 11.


- Implicit prepare of statements: replace literals with parameters and store cached plans

  NR. Patch is large, invasive, has significant potential to cause performance
  regressions. Has gotten very little code level review.

  I can't see this getting in this release.


- Surjective indexes

  NR. Older patch.

  Simon says he likes the patch, and thinks it might be committable.

  I still basically think what Tom and I wrote in
  https://www.postgresql.org/message-id/15995.1495730260@sss.pgh.pa.us
  and following is the right thing. IOW, I don't think we want the
  feature with the current implementation. But others obviously
  disagree.

  I think a few more committers (Tom?) taking a look at the current
  patch would be good.


- Gather speed-up

  RFC. We'd some discussions about other potential implementations about
  one of the pending improvements, Robert evaluated and found it
  unconvincing. I think Robert just needs to find tuits to commit.


- Fix LWLock degradation on NUMA

  RFC. I should probably take a look at this and see if I can get it
  committed.


- Partition-wise aggregation/grouping

  NR. Older patch origins, development ongoing.

  Can't quite judge how likely this is to get in, but seems feasible.


- Exclude partition constraint checking in query execution plan for
  partitioned table queries

  NR. A few months old.

  This patch, not obvious from thread title, removes restrictions that
  are already implied by constraints.

  Code doesn't look bad, but I've some concern about potential for perf
  regression.  Seems doable, but needs review work.


- faster partition pruning in planner

  NR. A few months old. Actively being reviewed.


- Lazy hash table for snapshot's "xid in progress"

  NR. A few months old. Has not gotten any love in the last few fests.

  Unless somebody spends some serious time reviewing and evaluating I
  don't see this going anywhere.


- Better estimate for filter cost

  NR. A few months old, but only recently updated based on feedback.

  Not a large patch, but basically unreviewed.


- Pipelining/batch mode support for libpq.

  NR. Patch of old origins, largely unreviewed in the last CFs.

  The latest version of the patch doesn't look bad, but would at least
  need 2-3 polishing rounds. I think the feature is pretty important...


- Runtime Partition Pruning

  NR. A few months old. Still quite some churn.

  Potentially doable, but would require close attention.


- Removing [Merge]Append nodes which contain a single subpath

  NR. A few months old.

  This introduces a new concept of a "proxy path" which doesn't seem to
  have gotten a lot of design review.

  Color me a bit sceptical that this is doable for v11.


- Remove LEFT JOINs in more cases

  WOA. Since last fest.

  Pinged thread about marking it as RWF.


- Removing useless DISTINCT clauses

  WOA. Since last fest.

  Pinged thread about marking it as RWF.


- verify ALTER TABLE SET NOT NULL by valid constraints

  NR. Patch a couple months old.

  Has gotten some design level review (no spi) and a small amount of
  code review. Patch looks fairly non-intrusive, albeit needing a bit
  of polish.  Should be doable.


- ALTER TABLE ADD COLUMN fast default

  WOA. Recent form of patch created December.  Heavy churn.

  Possibly can get ready, but that'll be some work.


- MCV lists for highly skewed distributions

  WOA, but Dean seems to feel it's ready to be committed and plans to do
  so.

  Pinged.


- Range Merge Join

  NR. Old patch origins.

  This patch hasn't gotten that much code level review in the last
  iterations, and has basically been in needs-review for a CF+.  It's
  not a huge patch, but certainly also not trivial.


- WIP: Precalculate stable functions

  WOA. Patch hasn't been updated since last fest, even though there's
  pending review.

  I'd RWF.


- Parallel Aggregates for string_agg and array_agg

  WOA, patch hails from mid December.

  Hasn't gotten any sort of review yet. It's not a particularly invasive
  patch though.


- Optimize Arm64 crc32c implementation in Postgresql

  WOA, patch from January, hadn't gotten any review until just now.

  Seems a simple enough idea, but there's no benchmarks or anything
  yet. I'd not necessarily aim to merge this fest, but giving some
  feedback seems like a good plan.


- Faster inserts with mostly-monotonically increasing values

  NR, patch from Dec, CF entry created Feb.

  Nice performance improvement. Hasn't gotten that much review, and it
  appears Peter Geoghegan has some correctness concerns.  Not sure.


- OFFSET optimisation for IndexScan using visibility map

  NR, patch from 2018-01-31.

  This hasn't gotten code level review yet, and there's some executor
  structure implications.  It seems unlikely to be wise to target this
  for v11.  Some feedback would be good though.


- Speed up WAL file removal during recovery

  NR. Patch from 2017-11-16, CF entry from 2018-02-20.

  Hasn't quite gotten enough attention yet, but the original proposal is
  a fairly short patch.


- file cloning in pg_upgrade and CREATE DATABASE

  NR. Patch from 2018-02-23.

  This patch obviously is fairly late, OTOH it's by a committer and not
  hugely complicated...


- hash joins with bloom filters

  NR, patch from 2018-02-20.

  This is a major new patch, arriving just before the last CF. I think
  this should be moved to the next fest.


- Nested ConvertRowtypeExpr optimization

  NR, patch arrived 2018-02-27.

  This is a medium sized patch, with open questions, submitted two days
  before the last CF starts. I think this should be moved.


- JIT compiling expressions & tuple deforming

  NR, current CF entry is from 2018-02-28, but there have been previous
  ones, work has been ongoing all over the last two years.

  I'm not neutral on this one obviously ;). I plan to continue working
  towards commit on this one.


- new plpgsql extra_checks

  WOA, but recently set to that status. Patch essentially from
  2017-01-11.

  I'm not really sure there's agreement we want this.


- "Possibility to controll plpgsql plan cache behave"

  NR, current incarnation is from late last year.  Note that the patch
  doesn't at all do anymore what the subject says. It's GUCs that can
  force custom / generic plans.

  Seems simple, if we want it.


- Jsonb transform for pl/perl & pl/python

  Peter claimed this as a committer, so I assume they're going to be
  dealt with.


- GET DIAGNOSTICS FUNCTION_NAME

  WOA.

  Conclusion of thread seems to be that we do not want this. Move to
  rejected?


- Add missing type conversion functions for PL/Python

  NR. Added only 2018-01-31.

  This adds conversions between python's int and float types and
  int{2,4,8}, numeric, float{4,8}.

  Pretty simple patch, and it seems reasonable to do this. I think
  there's a bit too much duplication in the code.  One could argue that
  this ought to be done via transforms, but I'm not fully convinced.


- various procedure related patches

  NR, Peter is working on them.


- Remove special wraparound code for pg_serial SLRU

  NR, old origins.

  I think this is actually closer to RFC than NR. I think we should
  just commit this.


- 64-bit transaction identifiers

  NR, old-ish.

  This certainly hasn't gotten enough review. But I see no chance that
  we can get this into v11 at the current state. It's too invasive and
  there's barely been any analysis.  There's some preliminary patches
  (e.g. 64bit int GUCs) that could get in, if somebody wants to look at
  that.

  I think we should give the patch a bit of review and then move it.


Stopping here for a coffee, will try to get the rest done
afterwards. Also, there are too many patches!

- Andres


Re: 2018-03 Commitfest Summary (Andres #2)

From
Peter Geoghegan
Date:
On Thu, Mar 1, 2018 at 2:45 PM, Andres Freund <andres@anarazel.de> wrote:
> - "failed to find parent tuple for heap-only tuple" error as an ERRCODE_DATA_CORRUPTION ereport()
>
>   NR. Should probably just get applied. Can't quite make myself care
>   enough to interrupt right now.

Tom just committed this.

> - Convert join OR clauses into UNION queries
>
>   NR. ~1 year old.
>
>   Feature has been requested several times independently since. Patch
>   has been rebased, but otherwise not meaningfully changed.
>
>   I think some committers need to bite the bullet and review this one.

I think that this is really important, too. Greg Stark expressed
interest recently, in a thread where we asked about this topic without
actually being aware of the existence of the patch. Perhaps there is
some chance he'll help with it.

> - Incremental sort
>
>   NR. This is a *very* old "patch" (obviously has evolved).
>
>   Looks to be in a reasonable shape. This needs at the very least a few
>   planner & executor skilled committer cycles to do a review so it can
>   progress.

This one has been bounced way too many times already. I really hope it
doesn't fall through the cracks for v11.

> - Faster inserts with mostly-monotonically increasing values
>
>   NR, patch from Dec, CF entry created Feb.
>
>   Nice performance improvement. Hasn't gotten that much review, and it
>   appears Peter Geoghegan has some correctness concerns.  Not sure.

I think that the idea is basically sound, but I would like to see a
justification for not caching page LSN instead of testing if the leaf
page is still rightmost and so on. If the LSN changed, that means that
the cached block number is stale. That seems simpler, and more or less
matches what we already do within _bt_killitems().
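
A rough sketch of that alternative, in C-like pseudocode (helper names
are hypothetical, not taken from the actual patch):

```
/* Fast path: try the cached rightmost leaf page first. */
buf = ReadBuffer(index, cached_blkno);
LockBuffer(buf, BT_WRITE);
if (PageGetLSN(BufferGetPage(buf)) == cached_lsn)
{
    /*
     * Page unmodified since we cached it, so it is still the
     * rightmost leaf and the insert can proceed here directly.
     */
    do_insert_on(buf);
}
else
{
    /*
     * Page changed, so the cached block number may be stale.
     * Fall back to a normal descent from the root.
     */
    UnlockReleaseBuffer(buf);
    descend_from_root(index);
}
```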

> - hash joins with bloom filters
>
>   NR, patch from 2018-02-20.
>
>   This is a major new patch, arriving just before the last CF. I think
>   this should be moved to the next fest.

I also think it should be bumped. I like this patch, but it's not
gonna happen for 11.

-- 
Peter Geoghegan


Re: 2018-03 Commitfest Summary (Andres #2)

From
Pavel Stehule
Date:



> - new plpgsql extra_checks
>
>   WOA, but recently set to that status. Patch essentially from
>   2017-01-11.
>
>   I'm not really sure there's agreement we want this.


This patch is simple and has benefit for users with basic plpgsql
skills, and some benefit for everyone.

In more complex cases plpgsql_check is probably used everywhere today,
which is a better but significantly more complex solution. But not all
cases can be solved by plpgsql_check, because it only does static
analysis. This patch adds some runtime warnings and checks.

Regards

Pavel

> - "Possibility to controll plpgsql plan cache behave"
>
>   NR, current incarnation is from late last year.  Note that the patch
>   doesn't at all do anymore what the subject says. It's GUCs that can
>   force custom / generic plans.
>
>   Seems simple, if we want it.


Re: 2018-03 Commitfest Summary (Andres #3)

From
Andres Freund
Date:
On 2018-03-01 14:45:15 -0800, Andres Freund wrote:
> Second round.

And last round. [ work ]. Scratch that. There'll be one more after this
;)


- Subscription code improvements

  WOA.  This is some minor cleanup stuff afaics. Don't think it's
  terribly release bound.


- Bootstrap data conversion

  NR.

  Not quite sure if it's realistic to do something about this for v11.
  If so, it'd probably be reasonable to merge this *after* the formal
  code freeze, to avoid breaking ~all pending patches.


- reorganize partitioning code

  NR. Created recently, but split off an older patch.

  Seems like a generally reasonable idea. Wonder if it conflicts with
  some other partition related patches?


- get rid of StdRdOptions, use individual binary reloptions
  representation for each relation kind instead

  NR. Created days ago, split off another patch.

  I'm unclear why this is particularly desirable, but I might just be
  missing something.


- Rewrite of pg_dump TAP tests

  WOA. Recently created.

  As it's "just" test infrastructure and makes things easier to
  understand, I think it makes sense to push this ASAP.


- remove pg_class.relhaspkey

  NR. Recently created.

  Minor cleanup, might break a few clients. Probably should do it, don't
  think it matters whether it's v11 or v12.


- pg_proc.prokind column

  RFC. Cleans up some sql procedure stuff.

  Obviously should get in.


- Unlogged tables re-initialization tests

  NR.

  Some questions about windows stuff that probably can easily be
  resolved within CF.


- fixing more format truncation issues

  NR.  Warnings cleanup / addition by configure.


- Make async slave to wait for lsn to be replayed

  WOA.

  There's been some design discussion a month ago, but no other
  activity. Given that this seems unlikely to be a candidate for
  v11. Should RWF.


- Logical decoding of two-phase transactions

  NR. Old thread, but lots of churn.

  I personally don't see how all of this patch series could get into
  v11. There's a bunch of fairly finicky details here.


- Replication status in logical replication

  RFC.  Pretty small patch, Simon has claimed it.


- Restricting maximum keep segments by repslots

  NR.  Submitted a good while ago.

  I've some doubts about the design here, and there's certainly not been
  a lot of low level review.  Would need some reviewer focus to make it
  possible to get into v11.


- Exclude unlogged tables from base backups

  RFC, for quite a while.

  I still have some correctness doubts around this, will try to have a
  look.


- logical_work_mem limit for reorderbuffer
- logical streaming for large in-progress transactions

  NR.  I'm not quite sure why these are in this CF, inquired on thread.


- TRUNCATE behavior when session_replication_role = replica

  RFC. Small patch, RFC since last CF. Should be merged.


- Logical decoding of TRUNCATE

  RFC.

  I think this isn't quite pretty, but I think that's largely because
  the DML vs DDL separation itself isn't really clean.


- Changing WAL Header to reduce contention during ReserveXLogInsertLocation()

  NR.

  We don't really seem to have agreement on what precisely to do, and
  there's no patch implementing the latest state of what's being
  discussed.  I'm not really certain why this is in this CF.


- Checksums for slru files

  WOA.  Patch clearly not close to commit, and there's not much
  activity. Proposed to mark as RWF.


- handling of heap rewrites in logical decoding

  NR. Sent 2018-02-24, CFed 2018-02-28.

  This is an important feature, but I think it's going to have to wait
  for v12.


- Exclude temp relations from base backup

  NR. Sent 2018-02-28.

  But it's essentially a trivial extension from the other "exclude"
  patch above.  If we merge the above, we should just merge this as
  well.


- Verify Checksums during Basebackups

  NR. Sent 2018-02-28.

  It's a pretty simple patch, but it's been submitted on the day before
  the last CF. So I think there's a good case to be made for just moving
  it.


- GnuTLS support

  WOA.  I'm not quite sure what the status here is. If we want it, there
  seems to be some work to make it fully compatible, including scram
  work.


- Group Read Access for Data Directory

  WOA, although I don't quite know what for.

  Seems mostly ready, but I'm not sure how detailed a code review this
  got. A lot of the discussion in the thread is about tangential stuff.


- macOS Secure Transport SSL Support

  NR, CF entry created recently, but thread started much earlier.

  This is a fairly large patch, and there seem to be some open
  issues. Being pre-auth and network-exposed, it's also an area of code
  that is more security critical than a lot of other stuff. OTOH, there
  seems to be quite some desire to get off openssl on macs...


- pg_hba.conf : new auth option : clientcert=verify-full

  NR. Submitted first 2018-02-16.

  Fairly trivial patch, but submitted to the last CF.  I think this is
  pretty borderline.


- Correct space parsing in to_timestamp()

  WOA. Based on the thread, that'd more appropriately be RFC.

  This is probably ready.


- Generic type subscripting

  NR. Has been worked on for a year.

  I'm not quite sure where this stands. Tom had done quite a bit of
  review, but there's been a fair bit of change in the version since
  then.


- Improve geometric types

  RFC. This has been marked ready for a long while, but has gone
  through several revisions since, so the status doesn't look accurate
  to me. Thread pinged.


- Predicate locking in Gist index

  RFC. Seems pretty small and has tests.


- Predicate locking in hash index

  NR.  There has been some back & forth between NR and WOA here, and as
  far as I can tell the patch isn't close to being done.

  pinged thread.


- Add support for tuple routing to foreign partitions

  NR. Patch is still evolving, the latest versions haven't gotten a lot
  of review.

  It seems a bit ambitious to try to get this into v11.


- New function for tsquery creation

  RFC.  But I'm not sure how much detailed review this actually has got.


- Implement NULL-related checks in object address functions to prevent cache lookup errors

  RFC.


- Creating backup history files for backups taken from standbys

  RFC, claimed by Fujii. Since nothing moved in the last month, I've
  pinged thread.


- multivariate MCV lists and histograms

  NR, for a good while.

  This hasn't moved much in the last few fests, largely due to lack of
  review afaics.


- Push aggregation down to base relations and joins

  NR.

  Based on the discussion in the last two months it seems quite unlikely
  to make it into 11.


- Pluggable storage API

  NR.  This unfortunately still seems far off. It's possible though that
  we can split off some smaller pieces of work and get them committed?


- Custom compression methods

  RFC. While marked as ready for committer, I think this is mostly
  because some high level design input is needed.  It's huge, and so far
  only seems to add already existing compression support.

  Not seeing this for v11.


- BRIN bloom and multi-minmax indexes

  NR. This unfortunately hasn't gotten much review.  I don't think it's
  realistic to get into v11, but there should be at least some useful
  feedback.


- Covering B-tree indexes (aka INCLUDE)

  RFC. This has been around for a long while. Seems a bit closer to
  being committable than last time round, but I'm a bit doubtful it has
  gotten quite enough detailed review.


- UPDATE of partition key : Restrict concurrent update/delete

  RFC.  This item IMO is kind of a must-have after
  2f178441044be430f6b4d626e4dae68a9a6f6cec, lest we violate visibility
  semantics.

  I'll try to take a look over the next few days.


- Flexible configuration for full-text search

  NR.

  This is a huge patch that hasn't gotten a whole lot of code review,
  nor a lot of design discussion / review (has a huge user facing
  surface).  I can't see this going into v11.


- FOR EACH ROW triggers on partitioned tables

  WOA, for about a week.

  Not sure if this has any chance; it doesn't yet seem to have gotten
  much review, and there are known outstanding issues...


- Shared Ispell dictionaries

  NR.  I'm not sure this feature is good enough that we want it, but it
  does solve a need I've (back when I interacted with users) heard a
  couple times.  I responded on the thread with suggestions for how we
  could make the configuration less painful.

  The code also doesn't look quite there yet, even if we were to go with
  the current design. Thus I'm inclined to return with feedback soon-ish.

- Andres


Re: 2018-03 Commitfest Summary (Andres #3)

From
Amit Langote
Date:
Hi Andres.

On 2018/03/02 13:34, Andres Freund wrote:
> - reorganize partitioning code
> 
>   NR. Created recently, but split off an older patch.
> 
>   Seems like a generally reasonable idea. Wonder if it conflicts with
>   some other partition related patches?

It actually does.  There are at least a few other functionality patches
that touch the same files, and it may not be a good idea for them to have
to worry about conflicts with this.

I gave up on rebasing this patch yesterday as I couldn't finish it in 5
minutes, but maybe I will try later this month.  Gotta focus on the faster
pruning stuff for now...

> - Add support for tuple routing to foreign partitions
> 
>   NR. Patch is still evolving, the latest versions haven't gotten a lot
>   of review.
> 
>   It seems a bit ambitious to try to get this into v11.

I looked at this a bit yesterday and plan to review it sometime this
month.  Having skimmed it, the new FDW API added by the patch looks well
documented and seems to do the very specific job assigned to it well
enough, but then again I haven't done a full review yet.

Thanks,
Amit



Re: 2018-03 Commitfest Summary (Andres #4)

From
Andres Freund
Date:
Hi,

On 2018-03-01 20:34:11 -0800, Andres Freund wrote:
> On 2018-03-01 14:45:15 -0800, Andres Freund wrote:
> > Second round.
>
> And last round. [ work ]. Scratch that. There'll be one more after this
> ;)

Let's do this. Hell, this CF is large.  I'll have a glass of wine at
some point of this.

- Add default role pg_access_server_files

  WOA. For a couple weeks now.  Should kinda be punted, but I think the
  amount of changes required isn't that large...


- foreign keys and partitioned tables

  NR, but should probably be WOA.

  This still seems fairly rough, hasn't been reviewed etc.  I don't see
  this going in unfortunately.


- Predicate locking in gin index

  NR.

  I don't quite think this is going in v11. There's not been much
  review, and the patch is nontrivial.


- SQL/JSON: jsonpath

  NR, could also be WOA.

  I don't see this realistically targeting v11. There's been very
  little code review of the main patch.   It's a bit sad that this got
  kickstarted quite late, after people argued about getting an earlier
  version of this into last release...


- Refuse setting toast.* reloptions when TOAST relation does not exist

  NR.  I think the conclusion here is that we're not sure what we want
  to do.  I'd reject and let the issue live for another day.


- Add enum relation option type

  NR.

  I don't see any urgency here.


- Handling better supported channel binding types for SSL
  implementations

  RFC. Seems simple enough.


- Rewriting the test of pg_upgrade as a TAP test - take two~

  NR. Looks like it'd be doable to get this in, but depends a bit on
  interactions with Andrew Dunstan / buildfarm.


- btree_gin, add support for uuid, bool, name, bpchar and anyrange types

  NR. Definitely submitted late (2018-02-20). But also not a complicated
  patch.


- ICU as default collation provider

  NR.  This has been posted 2018-02-10, submitted 2018-02-26. It's a
  large patch. Therefore I don't think this is eligible for v11.
  Commented on thread.


- Advanced partition matching for partition-wise join

  NR.  While the CF entry was created late (2018-02-27), the patch has
  been in development for much longer; the thread started 2017-08-21.
  Apparently because it is dependent on partitionwise join?

  Given the level of review and size of patch I've a bit of a hard time
  seeing this getting into v11.


- Nepali Snowball dictionary

  NR. Referred to new snowball upstream. Will need sync with new code,
  which the CF entry doesn't yet do. Therefore I think this should be
  marked RWF, as proposed on thread.


- Remove DSM_IMPL_NONE

  NR. Code removal, so we probably want this despite being submitted
  late.


- Transactions involving multiple postgres foreign servers

  NR. This is an *old* thread.

  This is large and not reviewed a lot (Robert started some serious
  reviewing in Feb, and large changes ensued). I unfortunately don't see
  this in v11.


- ON CONFLICT DO UPDATE for partitioned tables

  NR. CF entry was created recently (2018-02-28), but is based on an
  earlier patch.

  This is relatively large and not reviewed much. I don't quite see it.


- kNN for SP-GiST

  NR. Current CF entry created 2018-02-28, but there was one previous
  entry that got one cycle of review 2017-03-09.

  This appears to be a large patch dropped on the eve of the current CF,
  despite some older ancestry.  I think we should just move it.


- SQL/JSON support in PostgreSQL

  This appears to be the older version of "SQL/JSON support in
  PostgreSQL", plus some further patches, which all also have new
  entries. I think we should close this entry; even the referenced
  patch seems a challenge, and we certainly can't bite off even more.
  RWF?


- Foreign Key Arrays

  RFC. This is a fairly large, but not huge patch.  While it's marked
  RFC, I'm not quite sure it's gotten enough detailed review.


- Support to COMMENT ON DATABASE CURRENT_DATABASE

  NR.  Tom proposes to reject, and I can't fault him.


- Handling of the input data errors in COPY FROM

  NR.  I can't see this going anywhere, the current incarnation uses
  PG_TRY/CATCH around InputFunctionCall.


- Add NOWAIT option to VACUUM and ANALYZE

  RFC.  Looks pretty trivial.


- Boolean partition syntax

  WOA.  There seems to be pretty strong consensus that the current
  approach isn't the right one. I think this should be RWF. Proposed on
  thread.


- Lockable views

  RFC.  While marked as RFC, I think there's actually quite an
  uncertainty whether we want the feature in the current form at all.

  I think it'd be good for others to weigh in.


- generated columns

  NR.  This hasn't yet gotten much detailed review.  Not sure if
  realistic.


- MERGE

  NR.  There's quite some review activity. Seems possible to get
  there. We'll have to see.


- SQL/JSON: functions

  NR.  Extracted from a bigger patch, submitted 2018-01-10.

  This has not gotten any review so far and is huge.  I don't see it.

  I think the SQL/JSON folks gotta prioritize. There's a number of
  features and max one of them can get in. Choose. Jsonpath seems to
  have gotten the most attention so far.


- SQL/JSON: JSON_TABLE

  Same story as above, albeit slightly smaller.


- prefix operator for text type and spgist index support

  NR.  Submitted 2018-02-02.  Not huge, but also completely new. I'd
  just move.


- Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT

  NR. Submitted 2018-02-20.  I think this should just be moved to the
  next fest. Proposed on thread.


- chained transactions

  NR. Submitted 2018-02-28.  This is a large, complicated patch
  submitted just before the last CF. I think it should be promptly
  moved.


- log_destination=file

  NR.  My perception is that we don't know precisely enough what we
  want. I think we should just return this with feedback and see where
  a potential discussion leads. Certainly doesn't seem v11 material.


- Early locking option to parallel backup

  WOA.  I think there's consensus we don't want this in the current
  incarnation. Outlines of other approaches described.  Should mark as
  RWF.


- Support optional message in backend cancellation/termination

  NR.  Current incarnation of patch fairly new (2018-01-24).

  Seems a bit late, and unripe, but possibly doable?


- Allow changing WAL segment size with pg_resetwal

  NR.  While there was an earlier patch, the current incarnation
  (redone based on review) is from 2018-02-07.

  The patch seems simple enough, so I think we might try to get this in?


- autovacuum: add possibility to change priority of vacuumed tables

  WOA. Submitted 2018-02-08.

  This seems to be more at the design stage than something ready for v11.


- Kerberos test suite

  WOA.  Submitted late, but just tests.


- Online enabling of checksums

  NR.  Submitted late 2018-02-21.

  This hasn't yet gotten that much review and is definitely not a
  trivial patch. I'm not sure it's fair to target this to v11, even
  taking committerness into account.


- SSL passphrase prompt external command

  NR. Submitted late 2018-02-24.

  While not large, it's also not trivial...


- Zero headers of remaining pages in WAL segment after switch

  NR. Submitted decidedly late (2018-02-25), but at least it's a
  different implementation of an older proposal.

  The patch is fairly simple though.


- Reopen logfile on SIGHUP

  NR.  There's debate about whether we want this.


- Changing the autovacuum launcher scheduling; oldest table first
  algorithm

  NR, submitted late: 2018-02-28.

  Proposed moving it to the next CF.


Yay!  Also, sorry for all the spam today.

Greetings,

Andres Freund


Re: 2018-03 Commitfest Summary (Andres #4)

From
Craig Ringer
Date:


On 2 March 2018 at 15:52, Andres Freund <andres@anarazel.de> wrote:

Yay!  Also, sorry for all the spam today.


Quite the opposite, thanks for keeping everyone up to date and keeping things on track. 

I'm hoping to resurrect the ProcSignal based memory context dump soon, maybe for final CF since it's minor. Still no time for libpq pipelining :(

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: 2018-03 Commitfest Summary (Andres #4)

From
Andres Freund
Date:
Hi,

On 2018-03-02 16:02:35 +0800, Craig Ringer wrote:
> Quite the opposite, thanks for keeping everyone up to date and keeping
> things on track.

Thanks.


> I'm hoping to resurrect the ProcSignal based memory context dump soon,
> maybe for final CF since it's minor.

Say what? That's definitely too late. The CF has already started and is
closed for new submissions.

Greetings,

Andres Freund


Re: 2018-03 Commitfest Summary (Andres #3)

From
Ildus Kurbangaliev
Date:
On Thu, 1 Mar 2018 20:34:11 -0800
Andres Freund <andres@anarazel.de> wrote:


> 
> - Custom compression methods
> 
>   RFC. While marked as ready for committer, I think this is mostly
>   because some high level design input is needed.  It's huge, and so
> far only seems to add already existing compression support.
> 
>   Not seing this for v11.
> 

Hi,

This patch is not about adding new compression algorithms, it's about
adding a new access method type which could be used for new compression
methods.

It's quite big, but a large part of it is changes in regression tests
(because it adds a new field to \d+) and new tests.

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: 2018-03 Commitfest Summary (Andres #1)

From
Fabien COELHO
Date:
Hello andres & Tom,

>>>  A bit concerned that we're turning pgbench into a kitchen sink.
>>
>> I do not understand "kitchen sink" expression in this context, and your
>> general concerns about pgbench in various comments in your message.
>
> We're adding a lot of stuff to pgbench that only a few people
> use. There's a lot of duplication with similar parts of code in other
> parts of the codebase. pgbench in my opinion is a tool to facilitate
> postgres development, not a goal in itself.

I disagree.

I think that pgbench should *also* allow testing postgres performance in
realistic scenarios that allow communicating about performance, and
reassuring users about their use cases, not just a simplified tpc-b.

Even if you would want to restrict it to internal postgres development, 
which I would see as a shame, recently added features are still useful.

For instance, I extensively used the tps throttling, latency and timeout
measures when developing and testing the checkpointer sorting &
throttling patch.

Some people are just proposing a new storage engine which changes the cost
of basic operations (improves committed transactions, makes rolling back
more expensive). What is the actual impact depending on the rollback rate?
How do you plan to measure that? Pgbench needs capabilities to be useful
there, and the good news is that some recently added ones would come in
handy.

> It's a bad measure, but the code growth shows my concerns somewhat:
> master:        5660 +817
> REL_10_STABLE: 4843 +266
> REL9_6_STABLE: 4577 +424
> REL9_5_STABLE: 4153 +464
> REL9_4_STABLE: 3689 +562
> REL9_3_STABLE: 3127 +338
> REL9_2_STABLE: 2789 +96
> REL9_1_STABLE: 2693

A significant part of this growth is the expression engine, which is
mostly trivial code, although alas not necessarily devoid of bugs. If
moved to fe-utils, pgbench's code footprint would be reduced by about
2000 lines.

Also, code has been removed (eg the fork-based implementation) and 
significant restructuring which has greatly improved code maintenance, 
even if the number of lines has possibly increased in passing.

>> So this setting-variable-from-query patch goes with having boolean
>> expressions (already committed), having conditions (\if in the queue),
>> improving the available functions (eg hashes, in the queue)... so that
>> existing, data-dependent, realistic benchmarks can be implemented, and
>> benefit for the great performance data collection provided by the tool.
>
> I agree that they're useful in a few cases, but they have to consider
> that they need to be reviewed and maintained, and the project is quite
> resource constrained in that regard.

Currently I do most of the reviewing & maintenance of pgbench, apart from
the patches I submit.

I can stop doing both if the project decides that improving pgbench 
capabilities is against its interest.

Tom said:

> FWIW, I share Andres' concern that pgbench is being extended far past 
> what anyone has shown a need for.  If we had infinite resources
> this wouldn't be a big problem, but it's eating into limited
> committer hours and I'm not really convinced that we're getting 
> adequate return.

As pgbench patches can stay ready-for-committer for half a dozen CFs, I'm
not sure the strain on committer time is that heavy:-) There are not so
many of them, and most of them are trivial. If you drop them on the ground
because you do not want them, it will not change anything about the lack
of reviewing resources and the project's incapacity to process submitted
patches, which in my opinion is a wider issue, not related to the few
pgbench-related submissions.

On the "adequate return" point, my opinion is that currently pgbench is 
just below the feature set needed to be generally usable, so not improving 
it is a self-fullfilling ensurance that it will not be used further. Once 
the "right" feature set is reached (for me, at least extracting query 
output into variables, having conditionals, possibly a few more functions 
if some benches use them), whether it would be actually more widely used 
by both developers and users is an open question.

Now, as I said, if pgbench improvements are not seen as desirable, I can 
mark submissions as "rejected" and do other things with my little 
available time than try to contribute to postgres.

>>> - pgbench - test whether a variable exists
>>
>> As already said, the motivation is that it is a preparation for a (much)
>> larger patch which would move pgbench expressions to fe utils and use them
>> in "psql".
>
> You could submit it together with that.

Sure, I could. My previous experience is that maintaining a set of 
dependent patches is tiresome, and it does not help much with testing and 
reviewing either. So I'm doing things one (slow) step at a time, 
especially as each time I've submitted patches which were doing more than 
one thing I was asked to disentangle features and restructuring.

> But I don't see in the first place why we need to add the feature with 
> duplicate code, just so we can unify.

It is not duplicate code. In psql the variable-exists-test is currently 
performed on the fly by the lexer. With the expression engine, it needs to 
be lexed, parsed and finally evaluated, so this is necessarily new code.

> We can gain it via the unification, no?

Well, this would be a re-implementation anyway. I'm not sure the old one
would disappear completely, because it depends on backslash commands which
have different lexing assumptions (eg currently the variable-exists-test 
is performed from both "psqlscan.l" and "psqlscanslash.l" independently).

-- 
Fabien.


Re: 2018-03 Commitfest Summary (Andres #1)

From
Alexander Korotkov
Date:
Hi!

On Thu, Mar 1, 2018 at 2:03 PM, Andres Freund <andres@anarazel.de> wrote:
- GUC for cleanup index threshold

  NR. Open for over a year.  Alexander Korotkov just tried to re-start
  discussion.

  Not quite sure if it's realistic unless somebody familiar with the
  code invests some time.

Right.  Assuming that it's a small and non-invasive patch, I'd like to ask you not to mark it RWF too early.  Let's see what happens during the commitfest.  In particular, I'm planning to spend some more time on this patch.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: 2018-03 Commitfest Summary (Andres #3)

From
Alexander Korotkov
Date:
On Fri, Mar 2, 2018 at 7:34 AM, Andres Freund <andres@anarazel.de> wrote:
- Pluggable storage API

  NR.  This unfortunately still seems far off. It's possible though that
  we can split off some smaller pieces of work and get them committed?

I'm working on reordering the patchset so that refactoring patches go before the API introduction.  I'm planning to publish the patchset this weekend.  Hopefully, some pieces of it could be committed.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: 2018-03 Commitfest Summary (Andres #1)

From
Fabien COELHO
Date:
Hello Tom,

> FWIW, I share Andres' concern that pgbench is being extended far past 
> what anyone has shown a need for.  If we had infinite resources this 
> wouldn't be a big problem, but it's eating into limited committer hours 
> and I'm not really convinced that we're getting adequate return.

Another specific point about the CF patch management:

A lot of patches do not even get a review: no immediate interest, or more
often no resources currently available, so the patch stays put. I'm fine
with that.

Now, some happy patches actually get reviews and are switched to ready, 
which shows that somebody saw enough interest in them to spend some time 
to improve them.

If committers ditch these reviewed patches on weak grounds (eg "I do not
need this feature so nobody should need it"), it is both in contradiction
with the fact that someone took the time to review them, and highly
demotivating for people who do participate in the reviewing process and
contribute to hopefully improving these patches, because the reviewing
time just goes down the drain in the end even when the patch is okay.

So for me, killing ready patches at the end of the process and on weak
grounds can only make the process worse. The project is shooting itself in
the foot, and you cannot complain later that there are not enough
reviewers.

-- 
Fabien.


Re: 2018-03 Commitfest Summary (Andres #4)

From
David Steele
Date:
On 3/2/18 2:52 AM, Andres Freund wrote:
> 
> Let's do this. Hell, this CF is large.  

Yeah it is.

> I'll have a glass of wine at some point of this.

Hope you did!

I've gone through all your notes and will follow up on your
recommendations where you have not already done so.

Thanks,
-- 
-David
david@pgmasters.net


Re: 2018-03 Commitfest Summary (Andres #1)

From
Andres Freund
Date:
Hi,

On 2018-03-02 11:06:12 +0100, Fabien COELHO wrote:
> A lot of patches do not even get a review: no immediate interest or more
> often no ressources currently available, patch stays put, I'm fine with
> that.

Well, even if that's the case, it's not free of cost to transport them
from fest to fest. Going through them continually isn't free. At some
point we just gotta decide it's an undesirable feature.


> Now, some happy patches actually get reviews and are switched to ready,
> which shows that somebody saw enough interest in them to spend some time to
> improve them.
> 
> If committers ditch these reviewed patches on weak ground (eg "I do not need
> this feature so nobody should need it"), it is both in contradiction with
> the fact that someone took the time to review it, and is highly demotivating
> for people who do participate to the reviewing process and contribute to
> hopefully improve these patches, because the reviewing time just goes to the
> drain in the end even when the patch is okay.

The consequence of this appears to be that we should integrate
everything that anybody deemed worthy of review. That just
doesn't make sense, we can't maintain the project that way, nor will the
design be even remotely coherent.

Sure it's a balancing act, but nobody denies that.


> So for me killing ready patches in the end of the process and on weak ground
> can only make the process worse. The project is shooting itself in the foot,
> and you cannot complain later that there is not enough reviewers.

How would you want it to work otherwise? Merge everything that somebody
found review worthy? Have everything immediately reviewed by committers?
Neither of those seem realistic.

Greetings,

Andres Freund


Re: 2018-03 Commitfest Summary (Andres #1)

From
Andres Freund
Date:

Hi,

On 2018-03-02 10:47:01 +0100, Fabien COELHO wrote:
> For instance, I used extensively tps throttling, latencies and timeouts
> measures when developping and testing the checkpointer sorting & throttling
> patch.

That doesn't say that much about the proposed feature additions; we
didn't say that feature isn't useful?


> I can stop doing both if the project decides that improving pgbench
> capabilities is against its interest.

That's not what we said. There's a difference between "we do not want to
improve pgbench" and "the cost/benefit balance seems to have shifted
over time, and the marginal benefit of proposed features isn't that
high".


> As pgbench patches can stay ready-to-committers for half a dozen CF, I'm not
> sure the strain on the committer time is that heavy:-)

That's just plain wrong. Even patches that linger cost time and attention.



> > > As already said, the motivation is that it is a preparation for a (much)
> > > larger patch which would move pgbench expressions to fe utils and use them
> > > in "psql".
> > 
> > You could submit it together with that.
> 
> Sure, I could. My previous experience is that maintaining a set of dependent
> patches is tiresome, and it does not help much with testing and reviewing
> either.

I'm exactly of the opposite opinion. Submitting things out of context,
without seeing at least drafts of later patches, is a lot more work and
doesn't allow one to see the big picture.


> So I'm doing things one (slow) step at a time, especially as each
> time I've submitted patches which were doing more than one thing I was asked
> to disentangle features and restructuring.

There's a difference between maintaining a set of patches in a queue,
nicely split up, and submitting them entirely independently.


Greetings,

Andres Freund


Re: 2018-03 Commitfest Summary (Andres #3)

From
Andres Freund
Date:
On 2018-03-02 12:22:15 +0300, Ildus Kurbangaliev wrote:
> On Thu, 1 Mar 2018 20:34:11 -0800
> Andres Freund <andres@anarazel.de> wrote:
> > - Custom compression methods
> > 
> >   RFC. While marked as ready for committer, I think this is mostly
> >   because some high level design input is needed.  It's huge, and so
> > far only seems to add already existing compression support.
> > 
> >   Not seing this for v11.

> This patch is not about adding new compression algorithms, it's about
> adding a new access method type which could be used for new compression
> methods.

Yes, but that's not a contradiction to what I wrote? Currently we don't
gain any practical improvements for users with this patch, no?  It'd
"just" allow extension authors to provide benefit.

Greetings,

Andres Freund


Re: 2018-03 Commitfest Summary (Andres #1)

From
Peter Geoghegan
Date:
On Fri, Mar 2, 2018 at 1:47 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> On the "adequate return" point, my opinion is that currently pgbench is just
> below the feature set needed to be generally usable, so not improving it is
> a self-fullfilling ensurance that it will not be used further. Once the
> "right" feature set is reached (for me, at least extracting query output
> into variables, having conditionals, possibly a few more functions if some
> benches use them), whether it would be actually more widely used by both
> developers and users is an open question.

FWIW, I think that pgbench would become a lot more usable if someone
maintained a toolset for managing pgbench. Something similar to Greg
Smith's pgbench-tools project, but with additional features for
instrumenting the server. There would be a lot of value in integrating
it with third party tooling, such as perf and BCC, and in making it
easy for non-experts to run relevant, representative tests.

Things like the rate limiting and alternative distributions were
sorely needed, but there are diminishing returns. It's pretty clear to
me that much of the remaining low hanging fruit is outside of pgbench
itself. None of the more recent pgbench enhancements seem to make it
easier to use.

-- 
Peter Geoghegan


Re: 2018-03 Commitfest Summary (Andres #1)

From
Fabien COELHO
Date:
Hello,

> [...] The consequence of this appears to be that we should integrate 
> everything that anybody decided worthy enough to review. That just 
> doesn't make sense, we can't maintain the project that way, nor will the 
> design be even remotely coherent.

Sure. The pgbench capabilities we are discussing are consistent, though, 
which is why I'm arguing.

> Sure it's a balancing act, but nobody denies that.

Yep. I'm arguing on the balance.

>> So for me killing ready patches in the end of the process and on weak ground
>> can only make the process worse. The project is shooting itself in the foot,
>> and you cannot complain later that there is not enough reviewers.
>
> How would you want it to work otherwise? Merge everything that somebody
> found review worthy?

No, but I think that there should be stronger technical/design/... reject 
arguments than the over-conservatism shown on some patches.

I'll take as an example the pgbench hash functions patch, about which you
seemed to show some reservations: it adds a few hash functions, a
necessary feature to reproduce a YCSB load (Yahoo! Cloud Serving
Benchmark), together with zipfian distributions (already accepted,
thanks).

Why would committers prevent pgbench from being used to reproduce this
kind of load? The point of pgbench is to be able to test different kinds
of loads/scenarios... Accepting such features, if the implementation is
good enough, should be a no-brainer, and alas it is not.
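To make that concrete, a YCSB-style skewed access pattern needs roughly
this kind of pgbench script (a sketch only: random_zipfian() is already
committed, while hash() is the name proposed by the patch under
discussion, so treat the exact spelling as an assumption):

```
-- draw a zipfian-distributed "popularity rank", then use a hash to
-- scatter the hot ranks over the whole key space, as YCSB does
\set aid_max 100000 * :scale
\set rank random_zipfian(1, :aid_max, 1.07)
\set aid abs(hash(:rank)) % :aid_max + 1
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
```

Without the hash step, the zipfian draw would always concentrate the load
on the same low-numbered aids, which is not what the benchmark specifies.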

> Have everything immediately reviewed by committers? Neither of those 
> seem realistic.

Sure. I'm aiming at an ambitious "eventually":-)

Note that I'd wish that at least the ready-for-committer bug fixes would
be processed by committers in a timely fashion once tagged as ready (say
within 1 CF), which is not currently the case.

-- 
Fabien.


Re: 2018-03 Commitfest Summary (Andres #1)

From
Fabien COELHO
Date:
Hello Peter,

>> On the "adequate return" point, my opinion is that currently pgbench is just
>> below the feature set needed to be generally usable, so not improving it is
>> a self-fullfilling ensurance that it will not be used further. Once the
>> "right" feature set is reached (for me, at least extracting query output
>> into variables, having conditionals, possibly a few more functions if some
>> benches use them), whether it would be actually more widely used by both
>> developers and users is an open question.
>
> FWIW, I think that pgbench would become a lot more usable if someone
> maintained a toolset for managing pgbench. Something similar to Greg
> Smith's pgbench-tools project, but with additional features for
> instrumenting the server. There would be a lot of value in integrating
> it with third party tooling, such as perf and BCC, and in making it
> easy for non-experts to run relevant, representative tests.
>
> Things like the rate limiting and alternative distributions were
> sorely needed, but there are diminishing returns. It's pretty clear to
> me that much of the remaining low hanging fruit is outside of pgbench
> itself.

It happens that I might start something along the lines of what you are 
suggesting above.

However, there is a minimal set of features needed in pgbench itself, 
especially on the scripting side (functions, variables, conditions, error 
handling... which are currently works in progress). I do not think it would 
make any sense to re-implement all the detailed data collection, load 
throttling, and client & thread handling outside pgbench, just because a 
basic feature such as a particular hash function, or a stupid \if on the 
result of a query, is missing to implement a simple benchmark.

> None of the more recent pgbench enhancements seem to make it easier to 
> use.

I agree that "easier to use" is a worthy objective, and that the answer is 
probably partly outside pgbench (although parts could be inside, e.g. 
JSON/CSV/... outputs to help collect performance data).

-- 
Fabien.


Re: 2018-03 Commitfest Summary (Andres #1)

From
Fabien COELHO
Date:
Hello Andres,

>> For instance, I used extensively tps throttling, latency and timeout
>> measures when developing and testing the checkpointer sorting & throttling
>> patch.
>
> That doesn't say that much about proposed feature additions, we didn't
> say that feature isn't useful?

Sure.

The point I am trying to make is that adding capabilities to pgbench 
enables new kinds of performance tests, and how useful they will be cannot 
be foreseen from the starting point. Moreover, the usability increases when 
these capabilities can be combined.

For the latency & timeout measures, my initial point was to check the 
state of postgres for a performance subject I'm interested in, with the 
idea that people put too much emphasis on tps and not enough on latency. I 
knew there were some issues there, but I had no idea how terrible it was 
before I was able to get detailed measures. So the actual process was: add 
capabilities (a lot of argumentation, because people do not see the 
point...), then use them to collect data, spot a larger-than-expected 
problem, and then try to fix it.

>> I can stop doing both if the project decides that improving pgbench
>> capabilities is against its interest.
>
> That's not what we said. There's a difference between "we do not want to
> improve pgbench" and "the cost/benefit balance seems to have shifted
> over time, and the marginal benefit of proposed features isn't that
> high".

I'm probably exaggerating a bit for the sake of argument, but I still think 
that the project is over-conservative about the pgbench feature set, which 
IMHO is "nearly there", but not quite sufficient for running actual 
benchmarks against postgres.

I've been pushing for a consistent set of basic features. Pgbench got 
boolean expressions, but they cannot be used in a conditional (eh, no \if, 
low marginal benefit) and they cannot test anything related to the data 
(no \gset, low marginal benefit). The current result is a half-baked 
inconsistent tool, which is just too bad.
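
For concreteness, here is the kind of hypothetical script the missing 
combination would allow: branch on actual data, e.g. skip work when a table 
is empty. The \gset-style result capture shown was only proposed, not 
committed, at the time; the table and variable names are illustrative:

```
SELECT count(*) AS naccounts FROM pgbench_accounts \gset
\if :naccounts > 0
    \set aid random(1, :naccounts)
    SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
\endif
```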

>> As pgbench patches can stay ready-to-committers for half a dozen CF, I'm not
>> sure the strain on the committer time is that heavy:-)
>
> That's just plain wrong. Even patches that linger cost time and attention.

Hmmm. A few seconds per month to move it to the next CF? A few seconds per 
month to skip these few "ready" patches while reading the very long list, 
cluttered with a hundred "needs review" items anyway?


> [...] I'm exactly of the opposite opinion. Submitting things out of 
> context, without seeing at least drafts of later patches, is a lot more 
> work and doesn't allow to see the big picture.

Hmmm.

The (trivial) big picture is to allow client-side expressions in psql 
(which has a \if :-) by reusing the pgbench expression engine, so that one 
could write things like

   \let i :j + 12 * :k

or

   \if :VERSION_NUM < 140000

in a psql script. For this purpose, the expression engine must somehow 
support the existing syntax, including testing whether a variable exists, 
hence the small 30-line (including doc & tests) patch submission.
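
For comparison, released psql can already branch, but only by evaluating 
the expression server-side and capturing the result; the point of reusing 
the pgbench engine is to do this client-side, with no round-trip. A sketch 
of today's workaround, using features psql already has (\gset and \if):

```
-- bounce the comparison off the server, capture it into a psql variable
SELECT :VERSION_NUM < 140000 AS version_is_old \gset
\if :version_is_old
    \echo 'client is older than 14'
\endif
```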

Obviously I should first check to get at least a committer's agreement 
that the feature is desirable: I have no issue with having a patch 
rejected after a long process because of bad programming and/or bad 
design, but if the thing is barred from the beginning there is no point in 
trying.

>> So I'm doing things one (slow) step at a time, especially as each time 
>> I've submitted patches which were doing more than one thing I was asked 
>> to disentangle features and restructuring.
>
> There's a difference between maintaining a set of patches in a queue, 
> nicely split up, and submitting them entirely independently.

Sure. I had a bad experience before on that subject, but I may be 
masochistic enough to try again:-)

-- 
Fabien.


Re: 2018-03 Commitfest Summary (Andres #1)

From
"Tels"
Date:
Moin,

On Fri, March 2, 2018 2:48 pm, Andres Freund wrote:
> Hi,
>
> On 2018-03-02 11:06:12 +0100, Fabien COELHO wrote:
>> A lot of patches do not even get a review: no immediate interest or more
>> often no resources currently available, patch stays put, I'm fine with
>> that.
>
> Well, even if that's the case that's not free of cost to transport them
> from fest to fest. Going through them continually isn't free. At some
> point we just gotta decide it's an undesirable feature.

Erm, I don't think that B: "This is an undesirable feature" necessarily
follows from A: "There are not enough reviewer capacities."

That PG doesn't have enough reviewers doesn't mean users/authors don't want
or need features - it just means there are not enough people who are capable
of doing a review, or they haven't been found yet, or worse, have been
turned away by the current process.

Sure, sometimes a lack of interest just means a lack of interest - but not
every user of PG reads or follows the -hackers list or does (or can) even
contribute to PG.

And the strongest point to me is: somebody showed so much interest that
they got up and wrote a patch. Even if nobody reviewed it, that is already
a high hurdle cleared, isn't it?

>> Now, some happy patches actually get reviews and are switched to ready,
>> which shows that somebody saw enough interest in them to spend some time
>> to
>> improve them.
>>
>> If committers ditch these reviewed patches on weak ground (eg "I do not
>> need
>> this feature so nobody should need it"), it is both in contradiction
>> with
>> the fact that someone took the time to review it, and is highly
>> demotivating
>> for people who do participate to the reviewing process and contribute to
>> hopefully improve these patches, because the reviewing time just goes to
>> the
>> drain in the end even when the patch is okay.
>
> The consequence of this appears to be that we should integrate
> everything that anybody decided worthy enough to review.

And likewise, I don't think the reverse follows, either.

[snipabit]
>> So for me killing ready patches in the end of the process and on weak
>> ground
>> can only make the process worse. The project is shooting itself in the
>> foot,
>> and you cannot complain later that there is not enough reviewers.

That would also be my opinion.

> How would you want it to work otherwise? Merge everything that somebody
> found review worthy? Have everything immediately reviewed by committers?
> Neither of those seem realistic.

True, but the process could probably be streamlined quite a bit. As an
outsider, I do wonder why sometimes long periods go by without any
action - but then, when a deadline comes, action is suddenly taken
without the involved parties having had time to respond first (I mean
"again"; of course they could have responded earlier, but you know how it
is in a busy world, with PG being a "side-project").

That is unsatisfactory for the authors (who either rebase patches
needlessly, or find their patch not getting a review because it has
bitrotted again), for reviewers (because they find patches have bitrotted),
and for everyone else (because even the simplest features or
problem-solving fixes can easily take a year or two, if they don't get
rejected outright).

Maybe an idea would be to send automatic notifications about patches that
need rebasing, or upcoming deadlines like starting commit fests?

In addition, clear rules and well-formulated project goals would help a lot.

Also, the discussion about "needs of the project" vs. the "needs of the
users" [0] should be separate from the "what do we do about the lack of
manpower" discussion.

Because if you argue about what you want on the basis of what you have, you
usually end up with nothing in either department.

Best regards,

Tels

[0]: E.g. how much is it worth to have a clean, pristine state of source
code vs. having some features and fixes for the most pressing items in a
somewhat timely manner? I do think that there should be at least different
levels between PG core and utilities like pgbench.



Re: 2018-03 Commitfest Summary (Andres #1)

From
Andres Freund
Date:
Hi,

On 2018-03-03 10:36:18 +0100, Fabien COELHO wrote:
> Why would committers prevent pgbench from being used to reproduce this kind
> of load?

The goal of *discussing* whether a feature is worth the cost is
obviously not to deprive users of features. I find that is a fairly
absurd, and frankly insulting, ascription of motives.


I didn't "veto" the patch or anything, nor did Tom. I wondered whether
we're adding more cost than overall gains.  We have very few people that
actually show up when there are bugs and fix them, and adding more code
tends to make maintenance harder.


> The point of pgbench is to be able to test different kinds of
> loads/scenarios... Accepting such features, if the implementation is
> good enough, should be a no-brainer, and alas it is not.

Reviewing whether the implementation is good enough *does* use
resources.  Our scarcest resource isn't patch contributions, it's
dealing with review and maintenance.


A lot of contributors, including serial ones, don't even remotely put in
as much resources reviewing other people's patches as they use up in
reviewer and committer bandwidth.  You certainly have contributed more
patches than you've reviewed for example.  That fundamentally can't
scale, unless some individuals contribute way more review resources than
they use up, and that's not something many people can afford or want.

And while possibly not universally seen that way, in my opinion, and I'm
not alone in seeing things this way, contributors that contribute more
review resources than they "use" are granted more latitude in what they
want to do, because they help the project scale.


Note that pgbench code does add work, even if one is not using the new
features. As you know, I was working on performance and robustness
improvements, and to make sure they are and stay correct I was
attempting to compile postgres with -fsanitize=overflow - which fails
because pgbench isn't overflow safe. I reported that, but you didn't
follow up with fixes.


> Note that I'd wish that at least the ready-for-committer bug-fixes would be
> processed by committers when tagged as ready in a timely fashion (say under
> 1 CF), which is not currently the case.

Yes, we're not fast enough to integrate fixes. That's largely because
there's few committers that are active. I fail to see how that is an
argument to integrate *more* code that needs fixes.


I entirely agree that not getting one's patches in can be very
demoralizing. But unless somebody manages to find a few seasoned
developers and pays them to focus on review and maintenance, I don't see
how that is going to change.  And given scant resources, we need to
prioritize.

Greetings,

Andres Freund


Re: 2018-03 Commitfest Summary (Andres #1)

From
Craig Ringer
Date:
On 2 March 2018 at 02:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
> On 2018-03-01 14:09:02 +0100, Fabien COELHO wrote:
>>> A bit concerned that we're turning pgbench into a kitchen sink.

>> I do not understand "kitchen sink" expression in this context, and your
>> general concerns about pgbench in various comments in your message.

> We're adding a lot of stuff to pgbench that only a few people
> use. There's a lot of duplication with similar parts of code in other
> parts of the codebase. pgbench in my opinion is a tool to facilitate
> postgres development, not a goal in itself.

FWIW, I share Andres' concern that pgbench is being extended far past
what anyone has shown a need for.  If we had infinite resources this
wouldn't be a big problem, but it's eating into limited committer hours
and I'm not really convinced that we're getting adequate return.

I have similar worries about us growing an ad-hoc scripting language in psql. 

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: 2018-03 Commitfest Summary (Andres #1)

From
Craig Ringer
Date:
On 2 March 2018 at 17:47, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
 

For instance, I used extensively tps throttling, latency and timeout measures when developing and testing the checkpointer sorting & throttling patch.

I have to admit, I've found tps throttling and latency measurement useful when working with logical replication. It's really handy to find a stable, sustainable throughput on master at which a replica can keep up.
 
PostgreSQL is about more than raw TPS. Users care about latency. Things we change affect latency. New index tricks like batching updates; sync commit changes for standby consistency, etc.

That's not a reason to throw anything and everything into pgbench. But there's value to more than measuring raw tps.

Also, I'm not the one doing the work.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: 2018-03 Commitfest Summary (Andres #1)

From
Craig Ringer
Date:
On 3 March 2018 at 17:56, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

The (trivial) big picture is to allow client-side expressions in psql (which has a \if :-) by reusing the pgbench expression engine, so that one could write things like

  \let i :j + 12 * :k

or

  \if :VERSION_NUM < 140000


I still haven't really grasped why this isn't done by embedding a client-side scripting language interpreter, giving us vastly greater capabilities with only the maintenance of the glue instead of writing our own ad-hoc scripting tool. Something confine-able like JavaScript or Lua.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: 2018-03 Commitfest Summary (Andres #1)

From
Pavel Stehule
Date:


2018-03-04 3:09 GMT+01:00 Craig Ringer <craig@2ndquadrant.com>:
On 3 March 2018 at 17:56, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

The (trivial) big picture is to allow client-side expressions in psql (which has a \if :-) by reusing the pgbench expression engine, so that one could write things like

  \let i :j + 12 * :k

or

  \if :VERSION_NUM < 140000


I still haven't really grasped why this isn't done by embedding a client-side scripting language interpreter, giving us vastly greater capabilities with only the maintenance of the glue instead of writing our own ad-hoc scripting tool. Something confine-able like JavaScript or Lua.

I am primarily a psql user, so I'll talk about psql. I don't need more functionality there than C macros have. So \if :VERSION_NUM < x is good enough; && and || operators are nice to have.

For this I don't need to embed any VM; it would be overkill. The scripting functionality we can do in psql now, and probably in the long run, is:

a) setting the prompt
b) some deployment - version checks
c) implementation of some simple regression scenarios

Things could be different if we started a new rich TUI client based on ncurses, with new features - but that is a different story.

Moreover, implementing simple expression evaluation in psql doesn't block later integration of some VM. Maybe it prepares the way for it.

I can imagine a psql command \lua taking any Lua expression, with the possibility to call any Lua function, iterate over an SQL result, show any result, and read and write psql variables - I prefer Lua over JavaScript, but the same could be done there too.

But this is not an argument against this patch. This patch is not too big, too complex, or too hard to maintain - and some simple issues of today can be solved simply.

Regards

Pavel

 

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: 2018-03 Commitfest Summary (Andres #1)

From
Craig Ringer
Date:
On 4 March 2018 at 14:58, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2018-03-04 3:09 GMT+01:00 Craig Ringer <craig@2ndquadrant.com>:
On 3 March 2018 at 17:56, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

The (trivial) big picture is to allow client-side expressions in psql (which has a \if :-) by reusing the pgbench expression engine, so that one could write things like

  \let i :j + 12 * :k

or

  \if :VERSION_NUM < 140000


I still haven't really grasped why this isn't done by embedding a client-side scripting language interpreter, giving us vastly greater capabilities with only the maintenance of the glue instead of writing our own ad-hoc scripting tool. Something confine-able like JavaScript or Lua.

I am primarily a psql user, so I'll talk about psql. I don't need more functionality there than C macros have. So \if :VERSION_NUM < x is good enough; && and || operators are nice to have.

For this I don't need to embed any VM; it would be overkill. The scripting functionality we can do in psql now, and probably in the long run, is:

a) setting the prompt
b) some deployment - version checks
c) implementation of some simple regression scenarios

Things could be different if we started a new rich TUI client based on ncurses, with new features - but that is a different story.

Moreover, implementing simple expression evaluation in psql doesn't block later integration of some VM. Maybe it prepares the way for it.

I can imagine a psql command \lua taking any Lua expression, with the possibility to call any Lua function, iterate over an SQL result, show any result, and read and write psql variables - I prefer Lua over JavaScript, but the same could be done there too.

But this is not an argument against this patch. This patch is not too big, too complex, or too hard to maintain - and some simple issues of today can be solved simply.

 
Fine by me so long as it remains a manageable scope, rather than incrementally turning into some horror scripting language. 

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: 2018-03 Commitfest Summary (Andres #1)

From
Michael Paquier
Date:
On Sat, Mar 03, 2018 at 05:13:25PM -0800, Andres Freund wrote:
> Reviewing whether the implementation is good enough *does* use
> resources.  Our scarcest resource isn't patch contributions, it's
> dealing with review and maintenance.

This is true.  One thing that patch authors tend to forget easily is
that a patch merged into the development branch in no way means that the
work is done; it is only the beginning.  Even if it is the committer's
responsibility to maintain a feature, because by committing he accepts the
maintenance load, the author should also help, as the person who knows the
pushed code as well as the committer himself.
The more patches pushed, the more maintenance load.  Careful peer review
is critical to measuring whether a feature is going to cost much
in maintenance in the years following its commit.

> A lot of contributors, including serial ones, don't even remotely put in
> as much resources reviewing other people's patches as they use up in
> reviewer and committer bandwidth.  You certainly have contributed more
> patches than you've reviewed for example.  That fundamentally can't
> scale, unless some individual contribute way more review resources than
> they use up, and that's not something many people afford nor want.

This problem has existed for years, since I managed my first commit fest.
In my opinion, it is easier in a company to assign value to new and shiny
features than to bug tasks, so people tend to give priority to features
over maintenance, because features give more value to their own careers:
new features are mainly seen as individual achievements, while maintenance
is a collective achievement, as it involves most of the time fixing a
problem that somebody else introduced.  That's sad, I think; maintenance
should be given more value, as it falls in the area of teamwork-related
metrics.
--
Michael


Re: 2018-03 Commitfest Summary (Andres #1)

From
Pavel Stehule
Date:


2018-03-04 8:29 GMT+01:00 Craig Ringer <craig@2ndquadrant.com>:
On 4 March 2018 at 14:58, Pavel Stehule <pavel.stehule@gmail.com> wrote:

2018-03-04 3:09 GMT+01:00 Craig Ringer <craig@2ndquadrant.com>:
On 3 March 2018 at 17:56, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

The (trivial) big picture is to allow client-side expressions in psql (which has a \if :-) by reusing the pgbench expression engine, so that one could write things like

  \let i :j + 12 * :k

or

  \if :VERSION_NUM < 140000


I still haven't really grasped why this isn't done by embedding a client-side scripting language interpreter, giving us vastly greater capabilities with only the maintenance of the glue instead of writing our own ad-hoc scripting tool. Something confine-able like JavaScript or Lua.

I am primarily a psql user, so I'll talk about psql. I don't need more functionality there than C macros have. So \if :VERSION_NUM < x is good enough; && and || operators are nice to have.

For this I don't need to embed any VM; it would be overkill. The scripting functionality we can do in psql now, and probably in the long run, is:

a) setting the prompt
b) some deployment - version checks
c) implementation of some simple regression scenarios

Things could be different if we started a new rich TUI client based on ncurses, with new features - but that is a different story.

Moreover, implementing simple expression evaluation in psql doesn't block later integration of some VM. Maybe it prepares the way for it.

I can imagine a psql command \lua taking any Lua expression, with the possibility to call any Lua function, iterate over an SQL result, show any result, and read and write psql variables - I prefer Lua over JavaScript, but the same could be done there too.

But this is not an argument against this patch. This patch is not too big, too complex, or too hard to maintain - and some simple issues of today can be solved simply.

 
Fine by me so long as it remains a manageable scope, rather than incrementally turning into some horror scripting language. 

It is Postgres development limited by commitfest cycles - step by step. I can imagine growing these scripting possibilities, but programming any non-trivial task in \xxx commands is anything but nice and friendly. That is a natural and practical limit, so the scope is limited to expression evaluation, nothing more.

Maybe, if there is agreement and spirit, we can talk about some new concepts for psql and pgbench to allow more customization or some modernization. Personally, I have been almost happy with psql since I wrote pspg :). We can get some inspiration from pgcli (https://github.com/dbcli/pgcli). These people do good work - but due to Python it is slower in visualisation. After this commitfest I'll start a discussion about what is missing in psql.

Maybe integration of some VM could reduce a lot of code inside psql and could help with a better autocomplete implementation - but that is a pretty long story, and it means a direct dependency on the selected VM.

Regards

Pavel

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: 2018-03 Commitfest Summary (Andres #1)

From
Fabien COELHO
Date:
Hello Andres,

> [...] I find that is a fairly absurd, and frankly insulting, ascription 
> of motives.

I did not think that I was insulting anyone, and I apologise if anyone 
felt insulted by my message.


> I didn't "veto" the patch or anything, nor did Tom.

Good. My past experience is that if Tom shows a reservation against the 
purpose of a patch, the patch is nearly dead, so I interpret that as a 
veto, even if it is not strictly one.


> I wondered whether we're adding more cost than overall gains.  We have 
> very few people that actually show up when there are bugs and fix them, 
> and adding more code tends to make maintenance harder.

As far as pgbench is concerned, I've been doing most of the maintenance, 
but it necessarily involves a committer in the end, obviously.


> A lot of contributors, including serial ones, don't even remotely put in 
> as much resources reviewing other people's patches as they use up in 
> reviewer and committer bandwidth. You certainly have contributed more 
> patches than you've reviewed for example.

Please check your numbers before criticising someone unduly.

Last time I checked I was more than even. According to data from the last 
four CFs, it seems that I'm quite okay over one year:

  2018-01: 4 submitted vs 7 reviewed
  2017-11: 8 submitted vs 7 reviewed
  2017-09: 8 submitted vs 12 reviewed
  2017-03: 5 submitted vs 3 reviewed

Total: 25 "submitted" and 29 reviewed.

I am counting a patch of mine that stays "ready" over half a dozen CFs as 
half a dozen submissions, which unfairly inflates the count, because those 
were not new submissions and did not require any reviewing in the CFs 
where they stayed put. I'm not sure that bug fixes should really be 
counted either.

Basically, I'm spending much more time reviewing & testing than 
developing. The patches I submit are usually pretty small and low-impact, 
with exceptions. I tend to review similar patches, which seems fair.


> That fundamentally can't scale, unless some individuals contribute way
> more review resources than they use up, and that's not something many
> people can afford or want.

Hmmm. Sure. Please note that this is what I think I am doing:-)

Now, if you feel that my overall contribution is detrimental to the 
project, feel free to ask me to stop contributing so as to help it, which 
is my primary objective anyway.


> And while possibly not universally seen that way, in my opinion, and I'm
> not alone in seeing things this way, contributors that contribute more
> review resources than they "use" are granted more latitude in what they
> want to do, because they help the project scale.

Sure.

> Note that pgbench code does add work, even if one is not using the new 
> features. As you know, I was working on performance and robustness 
> improvements, and to make sure they are and stay correct I was 
> attempting to compile postgres with -fsanitize=overflow - which fails 
> because pgbench isn't overflow safe. I reported that, but you didn't 
> follow up with fixes.

Indeed. AFAICR you did it before, I think that I reviewed it, it was not a 
period for which I had a lot of available time, and I did not feel it was 
something that urgent to fix because there was no practical impact. I 
would have done it later, probably.

-- 
Fabien.


Re: 2018-03 Commitfest Summary (Andres #1)

From
Fabien COELHO
Date:
Hello Craig,

> I still haven't really grasped why this isn't done by embedding a
> client-side scripting language interpreter, giving us vastly greater
> capabilities with only the maintenance of the glue instead of writing our
> own ad-hoc scripting tool. Something confine-able like JavaScript or Lua.

ISTM that what you are asking for cannot work well: Pg needs a client 
whose primary language is SQL, so that:

   calvin> SELECT * FROM Stuff;

does what you expect.

How would you mix/embed another language with that? The answer is not 
obvious to me, because you need some lexical convention to switch between 
SQL and the "other" language. A PHP-looking solution, or putting all SQL 
in strings, would not be that great in interactive mode.

Moreover, I do not think that a full language (including loops) is 
desirable for interactive scripting and simple database creations. The 
needed level is more like cpp, i.e. not really a "full" language.

Also, which language should it be? I've been practicing perl, tcl, python, 
bash... do I want to learn lua as well?

So ISTM that the current solution is the reasonable one: limited 
SQL-language tools (psql & pgbench) for basic and interactive operations, 
and if you really need scripting you can always write a script in your 
favorite language which is interfaced with SQL through some API.

The limit between what can be done with the client tool and with scripting 
is a fuzzy one.

-- 
Fabien.


Re: 2018-03 Commitfest Summary (Andres #1)

From
Fabien COELHO
Date:
> Fine by me so long as it remains a manageable scope, rather than
> incrementally turning into some horror scripting language.

Pavel's description of what is needed really sums it up.

The target is "cpp" level simple scripting: expressions, variables, 
conditions, echoing, quitting on error... things like that.

Also, for me it should be the same for psql & pgbench.

Whether the resulting thing constitutes a "horror scripting language" is 
obviously debatable, and a matter of taste:-)

-- 
Fabien.


Re: 2018-03 Commitfest Summary (Andres #1)

From
Andres Freund
Date:
Hi,

On 2018-03-04 11:03:05 +0100, Fabien COELHO wrote:
> > A lot of contributors, including serial ones, don't even remotely put in
> > as much resources reviewing other people's patches as they use up in
> > reviewer and committer bandwidth. You certainly have contributed more
> > patches than you've reviewed for example.
> 
> Please check your numbers before criticising someone unduly.

I did.  I filtered emails by threads, and counted the number of
messages.


> > Note that pgbench code does add work, even if one is not using the new
> > features. As you know, I was working on performance and robustness
> > improvements, and to make sure they are and stay correct I was
> > attempting to compile postgres with -fsanitize=overflow - which fails
> > because pgbench isn't overflow safe. I reported that, but you didn't
> > follow up with fixes.
> 
> Indeed. AFAICR you did it before, I think that I reviewed it, it was not a
> period for which I had a lot of available time, and I did not feel it was
> something that urgent to fix because there was no practical impact. I would
> have done it later, probably.

It's still not fixed.

Greetings,

Andres Freund


Re: 2018-03 Commitfest Summary (Andres #1)

From
Fabien COELHO
Date:
Hello,

>> Please check your numbers before criticising someone unduly.
>
> I did.  I filtered emails by threads, and counted the number of
> messages.

I do not see how this is related to the number of patch submissions or the 
number of reviews posted, but it is certainly counting something.

The CF reviewer data are mostly accurate for me, and show that I do more 
reviews than submissions.

Now, I'm not paid for this, and I only want to help. If the project thinks 
that I can help more by not contributing, it just has to ask.

>>> because pgbench isn't overflow safe. I reported that, but you didn't
>>> follow up with fixes.
>>
>> Indeed. AFAICR you did it before, I think that I reviewed it, it was not a
>> period for which I had a lot of available time, and I did not feel it was
>> something that urgent to fix because there was no practical impact. I would
>> have done it later, probably.
>
> It's still not fixed.

Then I apologise: I definitely missed something. I'll look into it, 
although it may be yet another patch submission.

-- 
Fabien.


Re: 2018-03 Commitfest Summary (Andres #1)

From
Fabien COELHO
Date:
>>>> because pgbench isn't overflow safe. I reported that, but you didn't
>>>> follow up with fixes.
>>> 
>>> Indeed. AFAICR you did it before, I think that I reviewed it, it was not a
>>> period for which I had a lot of available time, and I did not feel it was
>>> something that urgent to fix because there was no practical impact. I 
>>> would
>>> have done it later, probably.
>> 
>> It's still not fixed.
>
> Then I apologise: I definitely missed something. I'll look into it, although 
> it may be yet another patch submission.

After investigation, my memory was indeed partly failing. I mixed up your 
point with the handling of the int_min / -1 special case, which was 
committed some time ago.
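For context, the int_min / -1 case mentioned above is the one division that traps or overflows even though both operands are representable: -INT64_MIN does not fit in an int64. A minimal sketch of the kind of guard needed (function name hypothetical, not pgbench's actual code):

```c
#include <stdint.h>
#include <stdbool.h>

/* Divide two int64 values, refusing the two cases that trap or
 * overflow: division by zero, and INT64_MIN / -1, whose mathematical
 * result (2^63) does not fit in int64_t. Returns false on failure. */
static bool
safe_int64_div(int64_t lhs, int64_t rhs, int64_t *result)
{
    if (rhs == 0)
        return false;           /* division by zero */
    if (lhs == INT64_MIN && rhs == -1)
        return false;           /* -INT64_MIN > INT64_MAX */
    *result = lhs / rhs;
    return true;
}
```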

In your initial mail you stated that you were going to send a patch for 
that shortly, and I concluded that I would certainly review it. I would 
not start developing a patch when someone has said they would do it 
themselves. No patch has been sent in the 3 months since. I can do it 
sometime in the future, although it would be yet another small patch 
submission from me, of the kind you like to criticise.

I'll answer the technical point in the initial thread. I agree that a 
strtoint64 function which does not handle min int is not great.
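The min-int problem comes from parsing the digits into a positive accumulator and negating at the end: the positive value 9223372036854775808 overflows before the negation. One common fix, sketched below under hypothetical names (this is not the pgbench code), is to accumulate into a negative value instead, which can represent INT64_MIN directly:

```c
#include <stdint.h>
#include <stdbool.h>
#include <ctype.h>

/* Overflow-safe string-to-int64 parser (sketch). Accumulating into a
 * non-positive value lets it represent INT64_MIN, which a positive
 * accumulator cannot hold. Returns false on syntax error or overflow. */
static bool
parse_int64(const char *s, int64_t *result)
{
    bool    neg = false;
    int64_t acc = 0;            /* accumulated as a non-positive value */

    if (*s == '-')
    {
        neg = true;
        s++;
    }
    else if (*s == '+')
        s++;

    if (!isdigit((unsigned char) *s))
        return false;           /* no digits */

    for (; isdigit((unsigned char) *s); s++)
    {
        int     digit = *s - '0';

        /* would "acc * 10 - digit" underflow? check before computing */
        if (acc < (INT64_MIN + digit) / 10)
            return false;
        acc = acc * 10 - digit;
    }
    if (*s != '\0')
        return false;           /* trailing garbage */

    if (!neg)
    {
        if (acc == INT64_MIN)
            return false;       /* +9223372036854775808 overflows */
        acc = -acc;
    }
    *result = acc;
    return true;
}
```

Note that the underflow check relies on C's truncation-toward-zero division, which rounds the negative threshold up, making the comparison exact for integer accumulators.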

-- 
Fabien.


Re: 2018-03 Commitfest Summary (Andres #1)

From
Bruce Momjian
Date:
On Fri, Mar  2, 2018 at 12:13:26PM -0800, Peter Geoghegan wrote:
> FWIW, I think that pgbench would become a lot more usable if someone
> maintained a toolset for managing pgbench. Something similar to Greg
> Smith's pgbench-tools project, but with additional features for
> instrumenting the server. There would be a lot of value in integrating
> it with third party tooling, such as perf and BCC, and in making it
> easy for non-experts to run relevant, representative tests.
> 
> Things like the rate limiting and alternative distributions were
> sorely needed, but there are diminishing returns. It's pretty clear to
> me that much of the remaining low hanging fruit is outside of pgbench
> itself. None of the more recent pgbench enhancements seem to make it
> easier to use.

Has anyone considered moving pgbench out of our git tree and into a
separate project where a separate team could maintain and improve it?

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +


Re: 2018-03 Commitfest Summary (Andres #1)

From
Pavel Stehule
Date:


2018-03-29 16:07 GMT+02:00 Bruce Momjian <bruce@momjian.us>:
On Fri, Mar  2, 2018 at 12:13:26PM -0800, Peter Geoghegan wrote:
> FWIW, I think that pgbench would become a lot more usable if someone
> maintained a toolset for managing pgbench. Something similar to Greg
> Smith's pgbench-tools project, but with additional features for
> instrumenting the server. There would be a lot of value in integrating
> it with third party tooling, such as perf and BCC, and in making it
> easy for non-experts to run relevant, representative tests.
>
> Things like the rate limiting and alternative distributions were
> sorely needed, but there are diminishing returns. It's pretty clear to
> me that much of the remaining low hanging fruit is outside of pgbench
> itself. None of the more recent pgbench enhancements seem to make it
> easier to use.

Has anyone considered moving pgbench out of our git tree and into a
separate project where a separate team could maintain and improve it?

It shares code with psql.

Regards

Pavel
 



Re: 2018-03 Commitfest Summary (Andres #1)

From
Fabien COELHO
Date:
Hello Bruce,

> Has anyone considered moving pgbench out of our git tree and into a
> separate project where a separate team could maintain and improve it?

The movement has been the exact reverse: it was initially in contrib, 
where it had some independence, and was promoted to the main source tree 
by Peter Eisentraut in March 2015, effective in the 9.5 release.

Not sure why... Handy dev tool for testing? One regression test in pgbench 
really tests a postgres feature, and it could be used more for this 
purpose (eg generating a continuous stream of SQL to test failover, 
replication, ...).

As pointed out by Pavel, there is significant code sharing with psql 
(scanner, \if stuff), which may grow even more if pgbench client-side 
expressions are moved there as well (whether this is actually desired is 
pretty unclear, though).

So I do not think it would be desirable or practical to have it outside.

However, it helps explain the disagreements about pgbench features: 
pgbench's internal, developer-oriented use for postgres is somewhat 
limited, and a lot of new features are suggested with external performance 
benchmarking in mind, in which core committers do not seem much 
interested.

-- 
Fabien.


Re: 2018-03 Commitfest Summary (Andres #1)

From
Bruce Momjian
Date:
On Sat, Mar 31, 2018 at 10:08:34AM +0200, Fabien COELHO wrote:
> However, it helps explain the disagreements about pgbench features:
> pgbench's internal, developer-oriented use for postgres is somewhat
> limited, and a lot of new features are suggested with external
> performance benchmarking in mind, in which core committers do not seem
> much interested.

Agreed.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +