Thread: add modulo (%) operator to pgbench

add modulo (%) operator to pgbench

From
Fabien
Date:
This patch is pretty trivial.

Add modulo operator to pgbench.

This is useful to compute a permutation for tests with non uniform 
accesses (exponential or gaussian), so as to avoid trivial correlations 
between neighbour keys.

-- 
Fabien.

Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:

> This patch is pretty trivial.

Another slightly less trivial but more useful version.

The issue is that there are 3 definitions of modulo, two of which are fine 
(Knuth floored division and Euclidian), and the last one much less useful. 
Alas, C (%) & SQL (MOD) choose the bad definition:-( I really need any of 
the other two. The attached patch adds all versions, with "%" and "mod" 
consistent with their C and SQL unfortunate counterparts, and "fmod" and 
"emod" the sane ones.

> Add modulo operator to pgbench.
>
> This is useful to compute a permutation for tests with non uniform 
> accesses (exponential or gaussian), so as to avoid trivial correlations 
> between neighbour keys.

-- 
Fabien.

Re: add modulo (%) operator to pgbench

From
Robert Haas
Date:
On Mon, Aug 4, 2014 at 5:20 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> This patch is pretty trivial.
> Another slightly less trivial but more useful version.
>
> The issue is that there are 3 definitions of modulo, two of which are fine
> (Knuth floored division and Euclidian), and the last one much less useful.
> Alas, C (%) & SQL (MOD) choose the bad definition:-( I really need any of
> the other two. The attached patch adds all versions, with "%" and "mod"
> consistent with their C and SQL unfortunate counterparts, and "fmod" and
> "emod" the sane ones.

Three different modulo operators seems like a lot for a language that
doesn't even have a real expression syntax, but I'll yield to whatever
the consensus is on this one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add modulo (%) operator to pgbench

From
Alvaro Herrera
Date:
Robert Haas wrote:
> On Mon, Aug 4, 2014 at 5:20 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> >> This patch is pretty trivial.
> > Another slightly less trivial but more useful version.
> >
> > The issue is that there are 3 definitions of modulo, two of which are fine
> > (Knuth floored division and Euclidian), and the last one much less useful.
> > Alas, C (%) & SQL (MOD) choose the bad definition:-( I really need any of
> > the other two. The attached patch adds all versions, with "%" and "mod"
> > consistent with their C and SQL unfortunate counterparts, and "fmod" and
> > "emod" the sane ones.
> 
> Three different modulo operators seems like a lot for a language that
> doesn't even have a real expression syntax, but I'll yield to whatever
> the consensus is on this one.

I wonder if it would be necessary to offer the division operator
semantics corresponding to whatever additional modulo operator we choose
to offer.  That is, if we add emod, do we need "ediv" as well?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello Robert,

>> The issue is that there are 3 definitions of modulo, two of which are fine
>> (Knuth floored division and Euclidian), and the last one much less useful.
>> Alas, C (%) & SQL (MOD) choose the bad definition:-( I really need any of
>> the other two. The attached patch adds all versions, with "%" and "mod"
>> consistent with their C and SQL unfortunate counterparts, and "fmod" and
>> "emod" the sane ones.
>
> Three different modulo operators seems like a lot for a language that
> doesn't even have a real expression syntax, but I'll yield to whatever
> the consensus is on this one.

I agree that it is overkill.

In fact there is a link: if there was a real expression syntax, the 
remainder sign could be fixed afterwards, so the standard C/SQL version 
would do. If it is not available, the modulo operator must be right.

If there is only one modulo added, I would rather have the Knuth version. 
However I was afraid that someone would object if "%" does not return the 
same result than the C/PostgreSQL versions (even if I think that nearly 
nobody has a clue about what % returns when arguments are negative), hence 
the 3 modulo version to counter this potential critic.

But I would prefer just one version with the Knuth (or Euclidian)
definitions.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello Alvaro,

> I wonder if it would be necessary to offer the division operator
> semantics corresponding to whatever additional modulo operator we choose
> to offer.  That is, if we add emod, do we need "ediv" as well?

I would make sense, however I do not need it, and I'm not sure of a use 
case where it would be needed, so I do not think that it is "necessary". 
If it happens to be, it could be added then quite easily.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
> Three different modulo operators seems like a lot for a language that
> doesn't even have a real expression syntax, but I'll yield to whatever
> the consensus is on this one.

Here is a third simpler patch which only implements the Knuth's modulo, 
where the remainder has the same sign as the divisor.

I would prefer this version 3 (one simple modulo based on Knuth 
definition) or if there is a problem version 2 (all 3 modulos). Version 1 
which provides a modulo compatible with C & SQL is really useless to me.

-- 
Fabien.

Re: add modulo (%) operator to pgbench

From
Robert Haas
Date:
On Tue, Aug 5, 2014 at 5:53 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Robert Haas wrote:
>> On Mon, Aug 4, 2014 at 5:20 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> >> This patch is pretty trivial.
>> > Another slightly less trivial but more useful version.
>> >
>> > The issue is that there are 3 definitions of modulo, two of which are fine
>> > (Knuth floored division and Euclidian), and the last one much less useful.
>> > Alas, C (%) & SQL (MOD) choose the bad definition:-( I really need any of
>> > the other two. The attached patch adds all versions, with "%" and "mod"
>> > consistent with their C and SQL unfortunate counterparts, and "fmod" and
>> > "emod" the sane ones.
>>
>> Three different modulo operators seems like a lot for a language that
>> doesn't even have a real expression syntax, but I'll yield to whatever
>> the consensus is on this one.
>
> I wonder if it would be necessary to offer the division operator
> semantics corresponding to whatever additional modulo operator we choose
> to offer.  That is, if we add emod, do we need "ediv" as well?

Maybe we ought to break down and support a real expression syntax.
Sounds like that would be better all around.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
> Maybe we ought to break down and support a real expression syntax.
> Sounds like that would be better all around.

Adding operators is more or less orthogonal with providing a new 
expression syntax. I'm not sure that there is currently a crying need for 
it (a syntax). It would be a significant project. It would raise the 
question "where to stop"... And I just really need a few more 
functions/operators which can be fitted in the current implementation 
quite easily.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Mitsumasa KONDO
Date:

2014-08-06 23:38 GMT+09:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Three different modulo operators seems like a lot for a language that
doesn't even have a real expression syntax, but I'll yield to whatever
the consensus is on this one.

Here is a third simpler patch which only implements the Knuth's modulo, where the remainder has the same sign as the divisor.

I would prefer this version 3 (one simple modulo based on Knuth definition) or if there is a problem version 2 (all 3 modulos). Version 1 which provides a modulo compatible with C & SQL is really useless to me.
I like version 3, it is simple and practical. And it's enough in pgbench.
If someone wants to use other implementation of modulo algorithm, he just changes his source code.

Best regards,
--
Mitsumasa KONDO

Re: add modulo (%) operator to pgbench

From
Mitsumasa KONDO
Date:
Hi,

Here is the review result.

#1. Patch compatibility
Little bit hunk, but it can patch to latest master.

#2. Functionality
No problem.

#3. Documentation
I think modulo operator explanation should put at last at the doc line.
Because the others are more frequently used.

#4. Algorithm 
You proposed three modulo algorithm, that are 
1. general modulo, 2. floor modulo and 3. euclidian modulo.

They calculate different value when modulo2 or reminder is negative number.
Calculate examples are here, 

1. general modulo (patch1)
    5 %  3 = 2
    5 % -3 = 1
   -5 %  3 = -1

2. floor modulo (patch2, 3)
   5 %  3  =  2
   5 % -3  = -2
  -5 %  3  =  2

3. euclidian modulo (patch2)
   5 %  3  =  2
   5 % -3  =  4
  -5 %  3  =  2

That's all.

I think if we want to create equal possibility and inutuitive random generator, we select floor modulo, as you see the upper example. It can create contrapositive random number. 1 and 2 method cannot.

I think euclidian modulo doesn't need by a lot of people. If we add it, many people will confuse, because they doesn't know the mathematic algorithms.
 
So I like patch3 which is simple and practical.

If you agree or reply my comment, I will mark ready for commiter.

Best Regards,
--
Mitsumsasa KONDO

Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello Mutsumara-san,

> #3. Documentation
> I think modulo operator explanation should put at last at the doc line.
> Because the others are more frequently used.

> So I like patch3 which is simple and practical.

Ok.

> If you agree or reply my comment, I will mark ready for commiter.

Please find attached v4, which is v3 plus an improved documentation
which is clearer about the sign of the remainder.

-- 
Fabien.

Re: add modulo (%) operator to pgbench

From
Mitsumasa KONDO
Date:
Hi Fabien-san,

Thank you for your fast work!

2014-09-08 23:08 GMT+09:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Mutsumara-san,

#3. Documentation
I think modulo operator explanation should put at last at the doc line.
Because the others are more frequently used.

So I like patch3 which is simple and practical.

Ok.

If you agree or reply my comment, I will mark ready for commiter.

Please find attached v4, which is v3 plus an improved documentation
which is clearer about the sign of the remainder.

 The attached is seemed no problem. But I'd like to say about order of explanation in five formulas.

Fix version is here. Please confirm, and I mark it for ready for commiter.

Best regards,
--
Mitsumasa KONDO
Attachment

Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
> The attached is seemed no problem. But I'd like to say about order of
> explanation in five formulas.
>
> Fix version is here. Please confirm, and I mark it for ready for 
> commiter.

I'm ok with this version.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Robert Haas
Date:
On Wed, Aug 6, 2014 at 3:22 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> Maybe we ought to break down and support a real expression syntax.
>> Sounds like that would be better all around.
>
> Adding operators is more or less orthogonal with providing a new expression
> syntax. I'm not sure that there is currently a crying need for it (a
> syntax). It would be a significant project. It would raise the question
> "where to stop"... And I just really need a few more functions/operators
> which can be fitted in the current implementation quite easily.

Writing a simple expression parser for pgbench using flex and bison
would not be an inordinately difficult problem.  And it would let you
add abs() and ?: and thereby avoid the need to hard-code multiple
versions of the modulo operator in the patch.  The fact that you're
agonizing about which modulo to add is a sign that the language is too
impoverished to let you do anything non-trivial.  That's why C only
has one modulo operator: as the patch demonstrates, the differences
between the version can be patched over with a very short C
expression.  If we had real expressions in pgbench, you could do the
same thing there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello Robert,

> Writing a simple expression parser for pgbench using flex and bison 
> would not be an inordinately difficult problem.

Sure. Note that there is not only the parsing but also the evaluating to 
think of, which mean a data structure to represent the expressions which 
would be more complex than the current one. Although it is not difficult 
as such, it would mean breaking pgbench heavily, which would mean 
more time and energy than I have available.

> And it would let you add abs() and ?: and thereby avoid the need to 
> hard-code multiple versions of the modulo operator in the patch.

It would also mean to *implement* abs and ?: in the "execution" code to 
compute the parsed expression.

> The fact that you're agonizing about which modulo to add is a sign that 
> the language is too impoverished to let you do anything non-trivial.

I'm not agonizing about which modulo to use:-) I know I do not want the 
C/SQL version which is never really useful if you have signed data. I 
overlooked this detail when submitting the first patch, and produced a 
stupid patch with all 3 possible modulos, before going to the sane 
solution which is to use the floored division Knuth version. If I had 
thought a bit, I would have sent v3 as v1 directly.

> That's why C only has one modulo operator: as the patch demonstrates, 
> the differences between the version can be patched over with a very 
> short C expression. If we had real expressions in pgbench, you could do 
> the same thing there.

Sure. I agree that pgbench is limited and that real expressions would have 
helped, but it is also quite useful and easy to extend "as is". Maybe the 
"add an expression parser to pgbench" could be added in the postgresql 
todo wiki?

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Robert Haas
Date:
On Tue, Sep 9, 2014 at 11:07 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> The fact that you're agonizing about which modulo to add is a sign that
>> the language is too impoverished to let you do anything non-trivial.
>
> I'm not agonizing about which modulo to use:-) I know I do not want the
> C/SQL version which is never really useful if you have signed data. I
> overlooked this detail when submitting the first patch, and produced a
> stupid patch with all 3 possible modulos, before going to the sane solution
> which is to use the floored division Knuth version. If I had thought a bit,
> I would have sent v3 as v1 directly.

Sure, and I would have looked at that patch and complained that you
were implementing a modulo operator with different semantics than C.
And then we'd be right back where we are now.

The problem here isn't that I don't know what you want, or that what
you want is unreasonable.  The problem is that we can't cater to every
slightly-different thing that somebody wants to do with pgbench.  If
we try, we'll end up with a neverending jumble of features most of
which never get used by 1 or 2 people.  Features for any part of
PostgreSQL need to be of reasonably general interest, not something
that is super-specialized to one particular set of needs.  If I commit
what you're asking me to commit here, I think that's exactly what I'll
be doing, and I don't think that's a good idea.

In all seriousness, sometimes the right solution to these problems is
just to patch your own copy.  I've done that with pgbench at least
once and with PostgreSQL in general more times than I can conveniently
count.  I've learned a lot of useful things that way, but I can't
expect all of the debugging instrumentation and trial features that
I've created to get incorporated into the product.  It's not
reasonable, and it's not worth the time it would take me to make the
code general and maintainable, so the code just gets dropped on the
floor.  In a perfect world, that wouldn't happen, but in a perfect
world they'd pay me the same salary they pay Linus Torvalds.

In this particular instance, we got onto this topic in the first place
because of the Gaussian and exponential patches, and the desire to
have the "hot" portion of the keys distributed in some random way
through the data set rather than having everything be clustered at the
front.  As you yourself said, the best case for this patch is to allow
a poor-man's approximation of that.  Adding a weird non-standard
operator for a poor approximation of the feature we're really looking
for just doesn't feel like the right call.  I recognize that you have
a different view, and if enough people agree with you, that argument
may win the day.  But my opinion is that we are too far down in the
weeds.  We should be trying to create features that will have general
appeal to pgbench users, not features that solve narrow problems for
individual developers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello Robert,

> Sure, and I would have looked at that patch and complained that you
> were implementing a modulo operator with different semantics than C.
> And then we'd be right back where we are now.
> [...]

Probably.

To be clear about my intent, which is a summary of what you already know: 
I would like to be able to generate a reasonable non uniform throttled 
workload with pgbench.

I do think such a feature is worth having for anybody who would like to do 
something realistic with pgbench, and that it is in the "general 
interest" of postgresql to have such features.

In particular, given the current state of abysmal performance under some 
trivial load with pg, I really think that it is really worth having a 
better tool, and I think that my effort with rate and progress helped to 
put these hidden problems into the light, and I do hope that they are 
going to be solved quickly.

Now if I submitted a big patch with all the necessary features in it, 
someone would ask to break it into pieces. So they are submitted one by 
one (progress, throttling, limit, modulo, ...).

Note I did not start with the non uniform stuff, but Mitsumasa-san sent a 
gaussian distribution patch and I jumped onto the wagon to complement it 
with an exponential distribution patch. I knew when doing it that is was 
not enough, but as I said "one piece at a time", given the effort required 
to pass simple patch.

What is still needed for the overall purpose is the ability to scatter the 
distribution. This is really:
 (1) a positive remainder modulo, which is the trivial patch under     discussion
 (2) a fast but good quality for the purpose hash function     also a rather small patch, not submitted yet.
 (3) maybe the '|' operator to do a TP*-like non-uniform load,     which is really periodic so I do not like it.     a
trivialpatch, not submitted yet.
 

If you do not want one of these pieces (1 & 2), basically the interest of 
gaussian/exponential addition is much reduced, and this is just a half 
baked effort aborted because you did not want what was required to make it 
useful. Well, I can only disagree, but you are a committer and I'm not!

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Robert Haas
Date:
On Wed, Sep 10, 2014 at 4:48 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> Note I did not start with the non uniform stuff, but Mitsumasa-san sent a
> gaussian distribution patch and I jumped onto the wagon to complement it
> with an exponential distribution patch. I knew when doing it that is was not
> enough, but as I said "one piece at a time", given the effort required to
> pass simple patch.
>
> What is still needed for the overall purpose is the ability to scatter the
> distribution. This is really:
>
>  (1) a positive remainder modulo, which is the trivial patch under
>      discussion
>
>  (2) a fast but good quality for the purpose hash function
>      also a rather small patch, not submitted yet.
>
>  (3) maybe the '|' operator to do a TP*-like non-uniform load,
>      which is really periodic so I do not like it.
>      a trivial patch, not submitted yet.
>
> If you do not want one of these pieces (1 & 2), basically the interest of
> gaussian/exponential addition is much reduced, and this is just a half baked
> effort aborted because you did not want what was required to make it useful.
> Well, I can only disagree, but you are a committer and I'm not!

I am not objecting to the functionality; I'm objecting to bolting on
ad-hoc operators one at a time.  I think an expression syntax would
let us do this in a much more scalable way.  If I had time, I'd go do
that, but I don't.  We could add abs(x) and hash(x) and it would all
be grand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello Robert,

> I am not objecting to the functionality; I'm objecting to bolting on
> ad-hoc operators one at a time.  I think an expression syntax would
> let us do this in a much more scalable way.  If I had time, I'd go do
> that, but I don't.  We could add abs(x) and hash(x) and it would all
> be grand.

Ok. I do agree that an expression syntax would be great!

However, that would not diminish nor change much the amount and kind of 
code necessary to add an operator or a function: the parser would have to 
be updated, and the expression structure, and the executor. Currently the 
pgbench "parser" and expression are very limited, but they are also very 
generic so that nothing is needed to add a new operator there, only the 
execution code needs to be updated, and the update would be basically the 
same (if this is this function or operator, actually do it...) if the 
execution part of a real expression syntax would have to be updated.

So although I agree that a real expression syntax would be great "nice to 
have", I do not think that it is valid objection to block this patch.

Moreover, upgrading pgbench to handle an actual expression syntax is not 
so trivial, because for instance some parts of the text needs to be parsed 
(set syntax) while others would need not to be pased (SQL commands), so 
some juggling would be needed in the lexer, or maybe only call the parser 
on some lines and not others... Everything is possible, but this one would 
require some careful thinking.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Mitsumasa KONDO
Date:
2014-09-11 15:47 GMT+09:00 Fabien COELHO <coelho@cri.ensmp.fr>:

Hello Robert,

I am not objecting to the functionality; I'm objecting to bolting on
ad-hoc operators one at a time.  I think an expression syntax would
let us do this in a much more scalable way.  If I had time, I'd go do
that, but I don't.  We could add abs(x) and hash(x) and it would all
be grand.

Ok. I do agree that an expression syntax would be great!
Yes, it's not bad.

However, will we need some method of modulo calculation?
I don't think so much. I think most intuitive modulo calculation method is floor modulo,
Because if we use the negative value in modulo calculation, it just set negative value as both positive values,
it is easy to expect the result than others. This strong point is good for benchmark script users.

But I don't have any strong opinion about this patch, not blocking:)

Best Regards
--
Mistumasa KONDO

Re: add modulo (%) operator to pgbench

From
Robert Haas
Date:
On Thu, Sep 11, 2014 at 2:47 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> Ok. I do agree that an expression syntax would be great!
>
> However, that would not diminish nor change much the amount and kind of code
> necessary to add an operator or a function

That's not really true.  You can't really add abs(x) or hash(x) right
now because the current code only supports this syntax:

\set varname operand1 [ operator operand2 ]

There's no way to add support for a unary operator with that syntax.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
>> However, that would not diminish nor change much the amount and kind of 
>> code necessary to add an operator or a function
>
> That's not really true.  You can't really add abs(x) or hash(x) right
> now because the current code only supports this syntax:
>
> \set varname operand1 [ operator operand2 ]
>
> There's no way to add support for a unary operator with that syntax.

Hmmm. If you accept a postfix syntax, there is:-)

But this is not convincing. Adding a unary function with a clean syntax 
indeed requires doing something with the parser.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Stephen Frost
Date:
Fabien,

* Fabien COELHO (coelho@cri.ensmp.fr) wrote:
> >That's not really true.  You can't really add abs(x) or hash(x) right
> >now because the current code only supports this syntax:
> >
> >\set varname operand1 [ operator operand2 ]
> >
> >There's no way to add support for a unary operator with that syntax.
>
> Hmmm. If you accept a postfix syntax, there is:-)
>
> But this is not convincing. Adding a unary function with a clean
> syntax indeed requires doing something with the parser.

Based on the discussion so far, it sounds like you're coming around to
agree with Robert (as I'm leaning towards also) that we'd be better off
building a real syntax here instead.  If so, do you anticipate having a
patch to do so in the next few days, or...?
Thanks!
    Stephen

Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello Stephen,

>> But this is not convincing. Adding a unary function with a clean
>> syntax indeed requires doing something with the parser.
>
> Based on the discussion so far, it sounds like you're coming around to
> agree with Robert (as I'm leaning towards also) that we'd be better off
> building a real syntax here instead.

Not exactly.

My actual opinion is that it is really an orthogonal issue. ISTM that a 
similar code would be required somehow for the executor part of an 
hypothetical real syntax if it was to support modulo.

> If so, do you anticipate having a patch to do so in the next few days, 
> or...?

Developping a full expression syntax is a significant project that I'm not 
planing to undertake in the short or medium term, especially as I'm not 
convinced that it is worth it: It would involve many changes, and I'm 
afraid that the likelyhood of the patch being rejected on some ground is 
high.

So my opinion is that this small modulo operator patch is both useful and 
harmless, so it should be committed. If someday there is a nice real 
syntax implemented, that will be great as well.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Tom Lane
Date:
Fabien COELHO <coelho@cri.ensmp.fr> writes:
> So my opinion is that this small modulo operator patch is both useful and 
> harmless, so it should be committed.

You've really failed to make that case --- in particular, AFAICS there is
not even consensus on the exact semantics that the operator should have.
So I'm inclined to reject rather than put in something that may cause
surprises down the road.  The usefulness doesn't seem great enough to
take that risk.

The way forward, if we think there is enough value in it (I'm not
sure there is), would be to build enough expression infrastructure
so that we could inexpensively add both operators and functions.
Then we could add a modulo operator with whatever semantics seem
most popular, and some function(s) for the other semantics, and
there would not be so much riding on choosing the "right" semantics.
        regards, tom lane



Re: add modulo (%) operator to pgbench

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Fabien COELHO <coelho@cri.ensmp.fr> writes:
> > So my opinion is that this small modulo operator patch is both useful and
> > harmless, so it should be committed.
>
> You've really failed to make that case --- in particular, AFAICS there is
> not even consensus on the exact semantics that the operator should have.
> So I'm inclined to reject rather than put in something that may cause
> surprises down the road.  The usefulness doesn't seem great enough to
> take that risk.

Agreed.

> The way forward, if we think there is enough value in it (I'm not
> sure there is), would be to build enough expression infrastructure
> so that we could inexpensively add both operators and functions.
> Then we could add a modulo operator with whatever semantics seem
> most popular, and some function(s) for the other semantics, and
> there would not be so much riding on choosing the "right" semantics.

Indeed and there's plenty of time to make it happen for 9.5.
Personally, I'd really like to see as I feel it'd help with the
performance farm goal which has been discussed many times over.

Fabien, I'd ask that you not be discouraged by this and continue to work
with pgbench and work on such an improvement, if you're able to take it
on with your other committments.  I do see value in it and I feel it
will help reproducability, a key and important aspect of performance
analysis, much more so than just hacking a local copy of pgbench would.
Thanks!
    Stephen

Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
>> So my opinion is that this small modulo operator patch is both useful and
>> harmless, so it should be committed.
>
> You've really failed to make that case --- in particular, AFAICS there is
> not even consensus on the exact semantics that the operator should have.

There is. Basically whatever with a positive remainder when the divisor is 
positive is fine. The only one to avoid is the dividend signed version, 
which happen to be the one of C and SQL, a very unfortunate choice in both 
case as soon as you have negative numbers.

> So I'm inclined to reject rather than put in something that may cause
> surprises down the road.  The usefulness doesn't seem great enough to
> take that risk.

If you reject it, you can also remove the gaussian and exponential random 
distribution which is near useless without a mean to add a minimal 
pseudo-random stage, for which "(x * something) % size" is a reasonable 
approximation, hence the modulo submission.

> The way forward, if we think there is enough value in it (I'm not
> sure there is), would be to build enough expression infrastructure
> so that we could inexpensively add both operators and functions.

The modulo patch is basically 10 lines of code, I would not call that 
"expensive"...

As I explained earlier, it would NOT be any different with an "expression 
infrastructure", as you would have to add a lex for the operator, then a 
yacc rule for the construction, the operator would need to be represented 
in a data structure, and the executor should be able to handle the case 
including errors... there is no way that this would be less that 10 lines 
of code. It would basically include the very same lines for the executor 
part.

> Then we could add a modulo operator with whatever semantics seem
> most popular, and some function(s) for the other semantics, and
> there would not be so much riding on choosing the "right" semantics.

The semantics is clear. I just choose the wrong one in the first patch:-)

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Heikki Linnakangas
Date:
On 09/23/2014 09:15 PM, Fabien COELHO wrote:
>> So I'm inclined to reject rather than put in something that may cause
>> surprises down the road.  The usefulness doesn't seem great enough to
>> take that risk.

Marked as "Returned with feedback".

> If you reject it, you can also remove the gaussian and exponential random
> distribution which is near useless without a mean to add a minimal
> pseudo-random stage, for which "(x * something) % size" is a reasonable
> approximation, hence the modulo submission.

I'm confused. The gaussian and exponential patch was already committed. 
Are you saying that it's not actually useful, and should be reverted? 
That doesn't make sense to me, gaussian and exponential distributed 
values seems quite useful to me in test scripts.

I don't understand what that pseudo-random stage you're talking about 
is. Can you elaborate?

- Heikki




Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello Heikki,

>> If you reject it, you can also remove the gaussian and exponential random
>> distribution which is near useless without a mean to add a minimal
>> pseudo-random stage, for which "(x * something) % size" is a reasonable
>> approximation, hence the modulo submission.
>
> I'm confused. The gaussian and exponential patch was already committed.

Yes.

They are significant patches that really involved significant work, and 
which are only really useful with a complementary stupid 10 lines patch 
which is being rejected without understanding why it is needed.

> Are you saying that it's not actually useful, and should be reverted? 
> That doesn't make sense to me, gaussian and exponential distributed 
> values seems quite useful to me in test scripts.

Yes and no.

Currently these distributions are achieved by mapping a continuous 
function onto integers, so that neighboring integers get neighboring 
number of draws, say with size=7:
  #draws     10 6 3 1 0 0 0  // some exponential distribution  int drawn   0 1 2 3 4 5 6

Although having an exponential distribution of accesses on tuples is quite 
reasonable, the likelyhood there would be so much correlation between 
neighboring values is not realistic at all. You need some additional 
shuffling to get there.

> I don't understand what that pseudo-random stage you're talking about is. Can 
> you elaborate?

The pseudo random stage is just a way to scatter the values. A basic 
approach to achieve this is "i' = (i * large-prime) % size", if you have a 
modulo. For instance with prime=5 you may get something like:
  #draws     10 6 3 1 0 0 0  int drawn   0 1 2 3 4 5 6 (i)  scattered   0 5 3 1 6 4 2 (i' = 5 i % 7)

So the distribution becomes:
  #draws     10 1 0 3 0 6 0  scattered   0 1 2 3 4 5 6

Which is more interesting from a testing perspective because it removes 
the neighboring value correlation.

A better approach is to use a hash function. "i' = hash(i) % size",
although it skews the distribution some more, but the quality of the 
shuffling is much better than with the mult-modulo version above.
Note that you need a modulo as well...

I must say that I'm appaled by a decision process which leads to such 
results, with significant patches passed, and the tiny complement to make 
it really useful (I mean not on the paper or on the feature list, but in 
real life) is rejected...

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Heikki Linnakangas
Date:
On 09/24/2014 10:45 AM, Fabien COELHO wrote:
> Currently these distributions are achieved by mapping a continuous
> function onto integers, so that neighboring integers get neighboring
> number of draws, say with size=7:
>
>     #draws     10 6 3 1 0 0 0  // some exponential distribution
>     int drawn   0 1 2 3 4 5 6
>
> Although having an exponential distribution of accesses on tuples is quite
> reasonable, the likelyhood there would be so much correlation between
> neighboring values is not realistic at all. You need some additional
> shuffling to get there.
>
>> I don't understand what that pseudo-random stage you're talking about is. Can
>> you elaborate?
>
> The pseudo random stage is just a way to scatter the values. A basic
> approach to achieve this is "i' = (i * large-prime) % size", if you have a
> modulo. For instance with prime=5 you may get something like:
>
>     #draws     10 6 3 1 0 0 0
>     int drawn   0 1 2 3 4 5 6 (i)
>     scattered   0 5 3 1 6 4 2 (i' = 5 i % 7)
>
> So the distribution becomes:
>
>     #draws     10 1 0 3 0 6 0
>     scattered   0 1 2 3 4 5 6
>
> Which is more interesting from a testing perspective because it removes
> the neighboring value correlation.

Depends on what you're testing. Yeah, shuffling like that makes sense 
for a primary key. Or not: very often, recently inserted rows are also 
queried more often, so that there is indeed a strong correlation between 
the integer key and the access frequency. Or imagine that you have a 
table that stores the height of people in centimeters. To populate that, 
you would want to use a gaussian distributed variable, without shuffling.

For shuffling, perhaps we should provide a pgbench function or operator 
that does that directly, instead of having to implement it using * and 
%. Something like hash(x, min, max), where x is the input variable 
(gaussian distributed, or whatever you want), and min and max are the 
range to map it to.

> I must say that I'm appaled by a decision process which leads to such
> results, with significant patches passed, and the tiny complement to make
> it really useful (I mean not on the paper or on the feature list, but in
> real life) is rejected...

The idea of a modulo operator was not rejected, we'd just like to have 
the infrastructure in place first.

- Heikki




Re: add modulo (%) operator to pgbench

From
Kevin Grittner
Date:
Fabien COELHO <coelho@cri.ensmp.fr> wrote:

>>> So my opinion is that this small modulo operator patch is both useful and
>>> harmless, so it should be committed.
>>
>> You've really failed to make that case --- in particular, AFAICS there is
>> not even consensus on the exact semantics that the operator should have.
>
> There is. Basically whatever with a positive remainder when the divisor is
> positive is fine. The only one to avoid is the dividend signed version,
> which happen to be the one of C and SQL, a very unfortunate choice in both
> case as soon as you have negative numbers.

No, it depends totally on the application.  For financial and
physical inventory purposes where I have had occasion to use it,
the properties which were important were:

Assuming that all values are integers, for:
 x = a / b; y = a % b;
 If b is zero either statement must generate an error. If a and b have the same sign, x must be positive; else x must
benegative. It must hold that abs(x) is equal to abs(a) / abs(b). It must hold that ((x * b) + y) is equal to a.
 

This is exactly what the languages I was using did, and I was glad.
I find it convenient that C and SQL behave this way.  You are
proposing that these not all hold, which in a lot of situations
could cause big problems.  You seem familiar enough with your own
use case that I believe you when you say you don't want these
semantics, but that just points out the need to support both.

>> Then we could add a modulo operator with whatever semantics seem
>> most popular, and some function(s) for the other semantics, and
>> there would not be so much riding on choosing the "right"
> semantics.
>
> The semantics is clear.

I disagree.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add modulo (%) operator to pgbench

From
Tom Lane
Date:
Kevin Grittner <kgrittn@ymail.com> writes:
> Assuming that all values are integers, for:

>   x = a / b;
>   y = a % b;

>   If b is zero either statement must generate an error.
>   If a and b have the same sign, x must be positive; else x must be negative.
>   It must hold that abs(x) is equal to abs(a) / abs(b).
>   It must hold that ((x * b) + y) is equal to a.

Not sure about the third of those statements, but the last one is
definitely a requirement.

I think the only defensible choice, really, is that % should be defined
so that a = ((a / b) * b) + (a % b).  It is perfectly reasonable to
provide other division/modulus semantics as functions, preferably in
matching pairs that also satisfy this axiom.  But the two operators need
to agree, or you'll have surprised users.
        regards, tom lane



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:

> No, it depends totally on the application.  For financial and
> physical inventory purposes where I have had occasion to use it,
> the properties which were important were:
> [...]

Hmmm. Probably I'm biased towards my compiler with an integer linear 
flavor field, where C-like "%" is always a pain, however you look at it.

I'm not sure of physical inventories with negative numbers though. In 
accounting, I thought that a negative number was a positive number with 
"debit" written above. In finance, no problem to get big deficits:-)

Now about the use case, I'm not sure that you would like to write your 
financial and physical inventory stuff within a pgbench test script, 
whereas in such a script I do expect when doing a modulo with the size of 
a table to have a positive result so that I can expect to find a tuple 
there, hence the desired "positive remainder" property for negative 
dividends.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
> The idea of a modulo operator was not rejected, we'd just like to have the 
> infrastructure in place first.

Sigh.

How to transform a trivial 10 lines patch into a probably 500+ lines 
project involving flex & bison & some non trivial data structures, and 
which may get rejected on any ground...

Maybe I'll set that as a student project.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Heikki Linnakangas
Date:
On 09/24/2014 09:34 PM, Fabien COELHO wrote:
>
>> The idea of a modulo operator was not rejected, we'd just like to have the
>> infrastructure in place first.
>
> Sigh.
>
> How to transform a trivial 10 lines patch into a probably 500+ lines
> project involving flex & bison & some non trivial data structures, and
> which may get rejected on any ground...

That's how we get good features. It's very common for someone to need X, 
and to post a patch that does X. Other people pop up that need Y and Z 
which are similar, and you end up implementing those too. It's more work 
for you in the short term, but it benefits everyone in the long run.

- Heikki




Re: add modulo (%) operator to pgbench

From
Robert Haas
Date:
On Wed, Sep 24, 2014 at 2:34 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> Sigh.
>
> How to transform a trivial 10 lines patch into a probably 500+ lines project
> involving flex & bison & some non trivial data structures, and which may get
> rejected on any ground...
>
> Maybe I'll set that as a student project.

I think you're making a mountain out of a molehill.  I implemented
this today in about three hours.  The patch is attached.  It needs
more testing, documentation, and possibly some makefile adjustments,
but it seems to basically work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: add modulo (%) operator to pgbench

From
Gregory Smith
Date:
On 9/24/14, 10:10 PM, Robert Haas wrote:
> I think you're making a mountain out of a molehill. I implemented this 
> today in about three hours.

I think you're greatly understating your own efficiency at 
shift/reducing parser mountains down to molehills.  Fabien even guessed 
the LOC size of the resulting patch with less than a 9% error.  That's 
some top notch software metrics and development work there boys; kudos 
all around.

Let's get this operator support whipped into shape, then we can add the 
2 to 3 versions of the modulo operator needed to make the major use 
cases work.  (There was never going to be just one hacked in with a 
quick patch that satisfied the multiple ways you can do this)

Then onto the last missing pieces of Fabien's abnormally distributed 
test workload vision.  He seems kind of stressed about the process 
lately; not sure what to say about it.  Yes, the PostgreSQL community is 
hostile to short, targeted feature improvements unless they come already 
fit into a large, standards compliant strategy for improving the 
database.  That doesn't work well when approached by scratch an itch 
stye development.  Nothing we can really do about it

-- 
Greg Smith greg.smith@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello Robert,

> I think you're making a mountain out of a molehill.

Probably. I tend to try the minimum effort first.

> I implemented this today in about three hours.  The patch is attached.

Great!

Your patch is 544 lines, my size evaluation was quite good:-)

Note that I probably spent 10 minutes on the 10 lines patch, but I 
probably spent hours writing mails about it.

> It needs more testing, documentation, and possibly some makefile 
> adjustments, but it seems to basically work.

I'll try to do that, but not in the short term.

Note that the modulo is not the one I need, but probably I can make do 
with an abs() function and/or a conditional.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
>> I think you're making a mountain out of a molehill. I implemented this 
>> today in about three hours.
>
> I think you're greatly understating your own efficiency at shift/reducing 
> parser mountains down to molehills.  Fabien even guessed the LOC size of the 
> resulting patch with less than a 9% error.

As I do research and also teach compilers, it was not that hard for me to 
provide a reasonable size estimation.

A also noticed Robert's 3 lines per minutes outstanding productivity. Once 
you include argumenting, testing, debuging and documenting, that may go 
down a bit, but it is very good anyway.

You can also notice that in the patch the handling of '%' involves 11 
lines (1 lexer, 1 parser, 9 executor), basically the very same size as the 
patch I submitted, for a very similar code.

> [...].  Yes, the PostgreSQL community is hostile to 
> short, targeted feature improvements unless they come already fit into a 
> large, standards compliant strategy for improving the database.  That doesn't 
> work well when approached by scratch an itch stye development.

Indeed I tend to avoid spending hours on something when I can spend 
minutes, and if I spend hours I'm really cross if a patch is rejected, 
whereas I'm less crossed if I just lost minutes.

I had bad experiences with patch submissions in the past when "things are 
changed and someone does not want it", and it seems that this patch falls
within this risk, hence my reluctance to spend the time.

Now, as the patch is submitted by a core dev, I think the likelyhood that 
some version will be committed is high, so all is well, and I gladly 
review and test it when I have time, alas not in the short term.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Robert Haas
Date:
On Thu, Sep 25, 2014 at 1:27 AM, Gregory Smith <gregsmithpgsql@gmail.com> wrote:
> On 9/24/14, 10:10 PM, Robert Haas wrote:
>> I think you're making a mountain out of a molehill. I implemented this
>> today in about three hours.
>
> I think you're greatly understating your own efficiency at shift/reducing
> parser mountains down to molehills.  Fabien even guessed the LOC size of the
> resulting patch with less than a 9% error.  That's some top notch software
> metrics and development work there boys; kudos all around.

Well, I blame Heikki.  I used to think that this kind of thing was
really hard, and a few years ago I might have had Fabien's reaction,
but then I saw Heikki bust out a shift/reduce parser for the isolation
tester in no time, so I decided it must not be that hard.  So it
proved.  I copied all that hard parts from other parts of the
PostgreSQL code base - my references were the isolation tester lexer
and parser, the contrib/seg parser, and the main parser.  I couldn't
do it that fast from scratch, not even close, but adapting something
that's already known to work is much easier.

> Let's get this operator support whipped into shape, then we can add the 2 to
> 3 versions of the modulo operator needed to make the major use cases work.
> (There was never going to be just one hacked in with a quick patch that
> satisfied the multiple ways you can do this)

I don't think adding more versions of the modulo operator is the right
way forward: I think we should add ? : for conditionals and some kind
of function thing like abs(x).  Or maybe come up with a more
sophisticated rehashing algorithm and expose that directly as hash(x).
That's my whole reason for not wanting to adopt Fabien's approach in
the first place: I was cool with exposing C's modulo operator, but any
other modulo semantics seem like they should be built up from
general-purpose primitives.

Anyway, I think the first thing is that somebody needs to spend some
time testing, polishing, and documenting this patch, before we start
adding to it.  I'm hoping someone else will volunteer - other tasks
beckon.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add modulo (%) operator to pgbench

From
Gregory Smith
Date:
On 9/25/14, 8:38 AM, Robert Haas wrote:
> That's my whole reason for not wanting to adopt Fabien's approach in 
> the first place: I was cool with exposing C's modulo operator, but any 
> other modulo semantics seem like they should be built up from 
> general-purpose primitives. 

Maybe; I don't quite understand his requirements well enough yet to know 
if that's possible, or if it's easier to give him a full special 
operator of his own.  But since what you did makes that easier, too, 
forward progress regardless.

> Anyway, I think the first thing is that somebody needs to spend some 
> time testing, polishing, and documenting this patch, before we start 
> adding to it. I'm hoping someone else will volunteer - other tasks 
> beckon. 

I bouncing it to here for you, and I expect to help with those parts 
presumably in addition to Fabien's help: 
https://commitfest.postgresql.org/action/patch_view?id=1581



Re: add modulo (%) operator to pgbench

From
Heikki Linnakangas
Date:
On 09/25/2014 05:10 AM, Robert Haas wrote:
> On Wed, Sep 24, 2014 at 2:34 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> Sigh.
>>
>> How to transform a trivial 10 lines patch into a probably 500+ lines project
>> involving flex & bison & some non trivial data structures, and which may get
>> rejected on any ground...
>>
>> Maybe I'll set that as a student project.
>
> I think you're making a mountain out of a molehill.  I implemented
> this today in about three hours.  The patch is attached.  It needs
> more testing, documentation, and possibly some makefile adjustments,
> but it seems to basically work.

Looks good to me. The new modulo operator needs documentation, and it 
could use a pgindent run, but other than that I think this is ready for 
commit.

It would be nice to go even further, and replace process_file, 
process_builtin, and process_commands altogether with a Bison parser. 
But this is definitely a step in the right direction.

- Heikki



Re: add modulo (%) operator to pgbench

From
Andrew Dunstan
Date:
On 11/24/2014 07:26 AM, Heikki Linnakangas wrote:
> On 09/25/2014 05:10 AM, Robert Haas wrote:
>> On Wed, Sep 24, 2014 at 2:34 PM, Fabien COELHO <coelho@cri.ensmp.fr> 
>> wrote:
>>> Sigh.
>>>
>>> How to transform a trivial 10 lines patch into a probably 500+ lines 
>>> project
>>> involving flex & bison & some non trivial data structures, and which 
>>> may get
>>> rejected on any ground...
>>>
>>> Maybe I'll set that as a student project.
>>
>> I think you're making a mountain out of a molehill.  I implemented
>> this today in about three hours.  The patch is attached.  It needs
>> more testing, documentation, and possibly some makefile adjustments,
>> but it seems to basically work.
>
> Looks good to me. The new modulo operator needs documentation, and it 
> could use a pgindent run, but other than that I think this is ready 
> for commit.
>
> It would be nice to go even further, and replace process_file, 
> process_builtin, and process_commands altogether with a Bison parser. 
> But this is definitely a step in the right direction.
>


As it doesn't have documentation, I'm inclined to say we should mark 
this as Waiting on Author or Returned with Feedback.

cheers

andrew



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
> As it doesn't have documentation, I'm inclined to say we should mark this as 
> Waiting on Author or Returned with Feedback.

I'm planing to have a detailed look at Robert's patch before the end of 
the year. I could update pgbench documentation while doing that.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Andrew Dunstan
Date:
On 12/13/2014 01:19 PM, Fabien COELHO wrote:
>
>> As it doesn't have documentation, I'm inclined to say we should mark 
>> this as Waiting on Author or Returned with Feedback.
>
> I'm planing to have a detailed look at Robert's patch before the end 
> of the year. I could update pgbench documentation while doing that.
>

Ok, I think for now that means Returned with Feedback.

cheers

andrew



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Here is a review & updated version of the patch.

Patch applies and compiles without problem on current head,
and worked for the various tests I made with custom scripts.

This patch is a good thing, and I recommand warmly its inclusion. It 
extends greatly pgbench custom capabilities by allowing arbitrary integer 
expressions, and will allow to add other functions (I'll send abs() & a 
hash(), and possibly others).

I'm not sure about what Makefile changes could be necessary for
various environments, it looks ok to me as it is.

I have included the following additional changes in v2:

(1) small update to pgbench documentation. English proof reading is welcome.

(2) improve error reporting to display the file and line from where an error    is raised, and also the column on
syntaxerrors (yyline is always 1...).    The prior status was that there were no way to get this information, which
wasquite annoying.  It could probably be improved further.
 

(3) spacing fixed in a few places.

If Robert is ok with these changes, I think it could be applied.

-- 
Fabien.

Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
> I'm not sure about what Makefile changes could be necessary for
> various environments, it looks ok to me as it is.

Found one. EXTRA_CLEAN is needed for generated C files. Added to this v3.

-- 
Fabien.

Re: add modulo (%) operator to pgbench

From
David Rowley
Date:
On 1 January 2015 at 04:02, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

I'm not sure about what Makefile changes could be necessary for
various environments, it looks ok to me as it is.

Found one. EXTRA_CLEAN is needed for generated C files. Added to this v3.


I was about to go and look at this, but I had a problem when attempting to compile with MSVC.

The attached patch seems to fix it.

Regards

David Rowley 
Attachment

Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
> I was about to go and look at this, but I had a problem when attempting to
> compile with MSVC.

Thanks! Here is a v4 which includes your fix.

-- 
Fabien.

Re: add modulo (%) operator to pgbench

From
David Rowley
Date:
On 1 January 2015 at 21:23, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

I was about to go and look at this, but I had a problem when attempting to
compile with MSVC.

Thanks! Here is a v4 which includes your fix.

Thank you.

I've had a quick look at the patch as I'm quite interested in seeing some improvements in this area.

At the moment I feel the patch is a bit half done. I really think that since the gaussian and exponential stuff was added in commit ed802e7d, that this should now be changed so that we have functions like random(), erandom() and grandom() and the way to use this becomes:

\set name random(1,10)
\set name erandom(1,10)

And we completely pull out the new \\setrandom additions added in that commit. We'd likely keep \\setrandom name 1 10 for backwards compatibility.

Does anyone else feel strongly about fixing this now, while we have the chance?

Regards

David Rowley

Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello David,

> At the moment I feel the patch is a bit half done. I really think that
> since the gaussian and exponential stuff was added in commit ed802e7d, that
> this should now be changed so that we have functions like random(),
> erandom() and grandom() and the way to use this becomes:

> \set name random(1,10)
> \set name erandom(1,10)
>
> And we completely pull out the new \\setrandom additions added in that
> commit. We'd likely keep \\setrandom name 1 10 for backwards compatibility.

> Does anyone else feel strongly about fixing this now, while we have the
> chance?

I thought about adding functions, possibly random, very probably abs & 
some hash, but I felt it would be more for a second round.

The other point is that although uniform random is fine, the gaussian and 
exponential ones require an additional floating point argument which means 
handling some typing.

The current patch is "just" about handling operators as before, although 
in a much nicer and extensible way, thus I would suggest to let Robert's 
patch more or less as it is, and people will be able to propose new 
extensions later on.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Alvaro Herrera
Date:
David Rowley wrote:

> At the moment I feel the patch is a bit half done. I really think that
> since the gaussian and exponential stuff was added in commit ed802e7d, that
> this should now be changed so that we have functions like random(),
> erandom() and grandom() and the way to use this becomes:
> 
> \set name random(1,10)
> \set name erandom(1,10)
> 
> And we completely pull out the new \\setrandom additions added in that
> commit.

Sounds good to me.  The current syntax is rather messy IMV, and
presumably a function-based syntax might be better.

> We'd likely keep \\setrandom name 1 10 for backwards compatibility.

Yep.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: add modulo (%) operator to pgbench

From
David Rowley
Date:
On 1 January 2015 at 21:23, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

I was about to go and look at this, but I had a problem when attempting to
compile with MSVC.

Thanks! Here is a v4 which includes your fix.


I'm also just looking at you ERROR() macro, it seems that core code is quite careful not to use __VA_ARGS__ on compilers where HAVE__VA_ARGS is not defined. I'd say this needs to be fixed too. Have a look at the trick used in elog.h for when HAVE__VA_ARGS  is not defined.

Regards

David Rowley

Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
> I'm also just looking at you ERROR() macro, it seems that core code is
> quite careful not to use __VA_ARGS__ on compilers where HAVE__VA_ARGS is
> not defined. I'd say this needs to be fixed too. Have a look at the trick
> used in elog.h for when HAVE__VA_ARGS  is not defined.

Indeed.

The simplest solution seems to expand this macro so that these is no 
macro:-) I'll do that.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
> I'm also just looking at you ERROR() macro, it seems that core code is
> quite careful not to use __VA_ARGS__ on compilers where HAVE__VA_ARGS is
> not defined. I'd say this needs to be fixed too. Have a look at the trick
> used in elog.h for when HAVE__VA_ARGS  is not defined.

Here is a v5 with the vararg macro expanded.

-- 
Fabien.

Re: add modulo (%) operator to pgbench

From
Alvaro Herrera
Date:
David Rowley wrote:

> I'm also just looking at you ERROR() macro, it seems that core code is
> quite careful not to use __VA_ARGS__ on compilers where HAVE__VA_ARGS is
> not defined. I'd say this needs to be fixed too. Have a look at the trick
> used in elog.h for when HAVE__VA_ARGS  is not defined.

Hm, I just looked at the previous version which used ERROR rather than
LOCATE and LOCATION, and I liked that approach better.  Can we get that
back?  I understand that for the purposes of this patch it's easier to
not change existing fprintf()/exit() calls, though.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: add modulo (%) operator to pgbench

From
Alvaro Herrera
Date:
Fabien COELHO wrote:
> 
> >I'm also just looking at you ERROR() macro, it seems that core code is
> >quite careful not to use __VA_ARGS__ on compilers where HAVE__VA_ARGS is
> >not defined. I'd say this needs to be fixed too. Have a look at the trick
> >used in elog.h for when HAVE__VA_ARGS  is not defined.
> 
> Here is a v5 with the vararg macro expanded.

Thanks.

On top of evaluateExpr() we need a comment (generally I think pgbench
could do with more comments; not saying your patch should add them, just
expressing an opinion.)  Also, intuitively I would say that the return
values of that function should be reversed: return true if things are
good. Can we cause a stack overflow in this function with a complex
enough expression?

I wonder about LOCATE and LOCATION.  Can we do away with the latter, and
keep only LOCATE perhaps with a better name such as PRINT_ERROR_AT or
similar?  I would just expand an ad-hoc fprintf in the single place
where the other macro is used.

Are we okay with only integer operands?  Is this something we would
expand in the future?  Is the gaussian/exp random stuff going to work
with integer operands, if we want to change it to use function syntax,
as expressed elsewhere?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello Alvaro,

> On top of evaluateExpr() we need a comment (generally I think pgbench
> could do with more comments; not saying your patch should add them, just
> expressing an opinion.)

Having spent some time in pgbench, I agree that more comments are a good 
thing.

> Also, intuitively I would say that the return values of that function 
> should be reversed: return true if things are good.

Ok.

> Can we cause a stack overflow in this function with a complex
> enough expression?

Not for any practical purpose, IMO.

> I wonder about LOCATE and LOCATION.  Can we do away with the latter, and
> keep only LOCATE perhaps with a better name such as PRINT_ERROR_AT or
> similar?  I would just expand an ad-hoc fprintf in the single place
> where the other macro is used.

I think that all "location" information should always be the same, so 
having it defined only once helps maintenance. If someone fixes the macro 
and there is one expanded version it is likely that it would not be 
changed. Maybe we could do with only one macro, though.

> Are we okay with only integer operands?  Is this something we would 
> expand in the future?

Probably

> Is the gaussian/exp random stuff going to work with integer operands,

No, it will need a floating point parameter, but I was thinking of only 
adding constants floats as function arguments in a first approach, and not 
allow an expression syntax on these, something like:
  \set n exprand(1, :size+3, 2.0) + 1

But not
  \set n exprand(1, :size+3, :size/3.14159) + 1

That is debatable. Otherwise we have to take care of typing issues, which 
would complicate the code significantly with two dynamic types (int & 
float) to handle, propagate and so in the expression evaluation. It is 
possible though, but it seems to me that it is a lot of bother for a small 
added value.

Anyway, I suggest to keep that for another round and keep the Robert's
isofunctional patch as it is before extending.

> if we want to change it to use function syntax, as expressed elsewhere?

I think I'll add a function syntax, and add a new node type to handle 
these, but the current syntax should/might be preserved for upward 
compatibility.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
>> I'm also just looking at you ERROR() macro, it seems that core code is
>> quite careful not to use __VA_ARGS__ on compilers where HAVE__VA_ARGS is
>> not defined. I'd say this needs to be fixed too. Have a look at the trick
>> used in elog.h for when HAVE__VA_ARGS  is not defined.
>
> Hm, I just looked at the previous version which used ERROR rather than
> LOCATE and LOCATION, and I liked that approach better.  Can we get that
> back?

It seems that postgresql must be able to compile with a preprocessor which 
does not handle varargs macros, so I thought it simpler to just expand the 
macro. If we must have it without a vararg macro, it means creating an 
adhoc vararg function to handle the vfprintf and exit. Oviously it can be 
done, if vfprintf is available. The prior style was to repeat fprintf/exit 
everywhere, I wanted to improve a little, but not to bother too much about 
portability constraints with pre C99 compilers.

> I understand that for the purposes of this patch it's easier to
> not change existing fprintf()/exit() calls, though.

The issue is really the portability constraints. C99 is only 16 years old, 
so it is a minor language:-)

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello Alvaro,

Here is a v6 with most of your suggestions applied.

> On top of evaluateExpr() we need a comment (generally I think pgbench
> could do with more comments; not saying your patch should add them, just
> expressing an opinion.)  Also, intuitively I would say that the return
> values of that function should be reversed: return true if things are
> good.

Comment & inverted return value done.

> I wonder about LOCATE and LOCATION.  Can we do away with the latter, and
> keep only LOCATE perhaps with a better name such as PRINT_ERROR_AT or
> similar?  I would just expand an ad-hoc fprintf in the single place
> where the other macro is used.

I've used just one PRINT_ERROR_AT() macro consistently.

> Are we okay with only integer operands?  Is this something we would
> expand in the future?  Is the gaussian/exp random stuff going to work
> with integer operands, if we want to change it to use function syntax,
> as expressed elsewhere?

Nothing for now, I feel it is for a later round.

> [other mail] bring ERROR() macro back

I also prefer the code with it, but the cost-benefit of a pre-C99 
compatible implementation seems quite low, and it does imply less (style) 
changes with the previous situation as it is.

-- 
Fabien.

Re: add modulo (%) operator to pgbench

From
Robert Haas
Date:
On Mon, Jan 5, 2015 at 10:37 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> Anyway, I suggest to keep that for another round and keep the Robert's
> isofunctional patch as it is before extending.

+1.  Let's please get the basic thing committed, and then people can
write more patches to extend and improve it.  There is no reason to
squash-merge every enhancement someone might want here into what I
wrote.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
> On Mon, Jan 5, 2015 at 10:37 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> Anyway, I suggest to keep that for another round and keep the Robert's
>> isofunctional patch as it is before extending.
>
> +1.  Let's please get the basic thing committed, and then people can
> write more patches to extend and improve it.  There is no reason to
> squash-merge every enhancement someone might want here into what I
> wrote.

From my point of view the latest v6 of the patch is pretty "ready for 
committers", but I'm not sure whom should decide that...

Should I just update the state in the commitfest interface?

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Stephen Frost
Date:
* Fabien COELHO (coelho@cri.ensmp.fr) wrote:
>
> >On Mon, Jan 5, 2015 at 10:37 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> >>Anyway, I suggest to keep that for another round and keep the Robert's
> >>isofunctional patch as it is before extending.
> >
> >+1.  Let's please get the basic thing committed, and then people can
> >write more patches to extend and improve it.  There is no reason to
> >squash-merge every enhancement someone might want here into what I
> >wrote.
>
> >From my point of view the latest v6 of the patch is pretty "ready
> >for
> committers", but I'm not sure whom should decide that...
>
> Should I just update the state in the commitfest interface?

I took a look through the patch and the discussion and it certainly
seems ready to me.  I agree with Robert- let's go ahead and get this in
and then folks can build on top of it.  I'm guessing it was added as
"Needs Review" since that's the default for a new entry, but it's
clearly had review from multiple people, committers and non-committers
alike, so I went ahead and marked it as 'ready for committer' to make
that clear to folks looking at the CF app.

Robert, as this is mostly your code (and you're marked as author on the
CF app), do you want to do the actual commit, or are you impartial, or
would you prefer someone else handle it, or..?  I'm interested in this
also and would be happy to help in any way I can.
Thanks!
    Stephen

Re: add modulo (%) operator to pgbench

From
Robert Haas
Date:
On Sat, Feb 28, 2015 at 12:06 PM, Stephen Frost <sfrost@snowman.net> wrote:
> I took a look through the patch and the discussion and it certainly
> seems ready to me.  I agree with Robert- let's go ahead and get this in
> and then folks can build on top of it.  I'm guessing it was added as
> "Needs Review" since that's the default for a new entry, but it's
> clearly had review from multiple people, committers and non-committers
> alike, so I went ahead and marked it as 'ready for committer' to make
> that clear to folks looking at the CF app.
>
> Robert, as this is mostly your code (and you're marked as author on the
> CF app), do you want to do the actual commit, or are you impartial, or
> would you prefer someone else handle it, or..?  I'm interested in this
> also and would be happy to help in any way I can.

Yeah, I'll take care of it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add modulo (%) operator to pgbench

From
Robert Haas
Date:
On Sun, Mar 1, 2015 at 1:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Feb 28, 2015 at 12:06 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> I took a look through the patch and the discussion and it certainly
>> seems ready to me.  I agree with Robert- let's go ahead and get this in
>> and then folks can build on top of it.  I'm guessing it was added as
>> "Needs Review" since that's the default for a new entry, but it's
>> clearly had review from multiple people, committers and non-committers
>> alike, so I went ahead and marked it as 'ready for committer' to make
>> that clear to folks looking at the CF app.
>>
>> Robert, as this is mostly your code (and you're marked as author on the
>> CF app), do you want to do the actual commit, or are you impartial, or
>> would you prefer someone else handle it, or..?  I'm interested in this
>> also and would be happy to help in any way I can.
>
> Yeah, I'll take care of it.

Done.  I removed some of the new error-reporting stuff because (1) I
wasn't sure it was right and (2) it touched more places than just the
ones directly relevant to the patch.  I think those things can be
resubmitted as a separate patch, but I'd like to have a more robust
discussion about what we want the error reporting to look like rather
than just sliding it into this patch.  I also removed the bit about
using "uniform" as an argument to setrandom, which may or may not be
something we want but doesn't seem related to the rest of this.  The
rest, I committed with minor wordsmithing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
Hello Robert,

> Done.

Great!

> I removed some of the new error-reporting stuff because (1) I wasn't 
> sure it was right

Hmmm. Although I'm not sure it was right, I'm sure that not enough error 
reporting is too rough:
  sh> ./pgbench -f bad.sql  syntax error at column 25  set: parse error

> and (2) it touched more places than just the ones directly relevant to 
> the patch.

It just tried to report errors consistently from all places.

> I think those things can be resubmitted as a separate patch,

Yep, I'll do that.

> but I'd like to have a more robust discussion about what we want the 
> error reporting to look like rather than just sliding it into this 
> patch.

As an input, I suggest that the error reporting feature should provide a 
clue about where the issue is, including a line number and possibly the 
offending line. Not sure what else is needed.

-- 
Fabien.



Re: add modulo (%) operator to pgbench

From
Robert Haas
Date:
On Mon, Mar 2, 2015 at 5:41 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>> but I'd like to have a more robust discussion about what we want the error
>> reporting to look like rather than just sliding it into this patch.
>
> As an input, I suggest that the error reporting feature should provide a
> clue about where the issue is, including a line number and possibly the
> offending line. Not sure what else is needed.

I agree.  But I think it might be better to try to put everything
related to a single error on one line, in a consistent format, e.g.:

bad.sql:3: syntax error in set command at column 25

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: add modulo (%) operator to pgbench

From
Fabien COELHO
Date:
> On Mon, Mar 2, 2015 at 5:41 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>>> but I'd like to have a more robust discussion about what we want the error
>>> reporting to look like rather than just sliding it into this patch.
>>
>> As an input, I suggest that the error reporting feature should provide a
>> clue about where the issue is, including a line number and possibly the
>> offending line. Not sure what else is needed.
>
> I agree.  But I think it might be better to try to put everything
> related to a single error on one line, in a consistent format, e.g.:
>
> bad.sql:3: syntax error in set command at column 25

Hmmm... sure that's better. I'll look into that.

-- 
Fabien.