Thread: pgbench - allow backslash-continuations in custom scripts

pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
Add backslash continuations to pgbench custom scripts.

The benefit of this approach is that it is upward compatible, and it is 
also pretty simple to implement. The downside is that backslash 
continuation is not the best syntax ever invented, but then you do not 
have to use it if you do not like it.

The alternative would be to make semi-colon a mandatory end-of-line 
marker, which would introduce an incompatibility and requires more efforts 
to implement, including some kind of SQL-compatible lexer.

IMHO this approach is the best compromise.

-- 
Fabien.

Re: pgbench - allow backslash-continuations in custom scripts

From
Josh Berkus
Date:
On 05/14/2015 12:10 PM, Fabien COELHO wrote:
> 
> Add backslash continuations to pgbench custom scripts.
> 
> The benefit of this approach is that it is upward compatible, and it is
> also pretty simple to implement. The downside is that backslash
> continuation is not the best syntax ever invented, but then you do not
> have to use it if you do not like it.
> 
> The alternative would be to make semi-colon a mandatory end-of-line
> marker, which would introduce an incompatibility and requires more
> efforts to implement, including some kind of SQL-compatible lexer.
> 
> IMHO this approach is the best compromise.

I don't personally agree.  I believe that it it worth breaking backwards
compatibility to support line breaks in pgbench statements, and that if
we're not going to do that, supporting \ continuations is of little value.

As someone who actively uses pgbench to write custom benchmarks, I need
to write queries which I can test.  \ continuation does NOT work on the
psql command line, so that's useless for testing my queries; I still
have to reformat and troubleshoot.  If we added \ continuation, I
wouldn't use it.

I think we should support line breaks, and require semicolons for
end-of-statement.  Backwards-compatability in custom pgbench scripts is
not critical; pgbench scripts are neither used in produciton, nor used
in automated systems much that I know of.

I'm not clear on why we'd need a full SQL lexer.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: pgbench - allow backslash-continuations in custom scripts

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> On 05/14/2015 12:10 PM, Fabien COELHO wrote:
>> Add backslash continuations to pgbench custom scripts.

> I don't personally agree.  I believe that it it worth breaking backwards
> compatibility to support line breaks in pgbench statements, and that if
> we're not going to do that, supporting \ continuations is of little value.

I tend to agree on that bottom line; having this be inconsistent with psql
does not seem like a win.

> I'm not clear on why we'd need a full SQL lexer.

So you don't get fooled by semicolons embedded in string literals or
comments.
        regards, tom lane



Re: pgbench - allow backslash-continuations in custom scripts

From
Josh Berkus
Date:
On 06/19/2015 02:51 PM, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> On 05/14/2015 12:10 PM, Fabien COELHO wrote:
>>> Add backslash continuations to pgbench custom scripts.
> 
>> I don't personally agree.  I believe that it it worth breaking backwards
>> compatibility to support line breaks in pgbench statements, and that if
>> we're not going to do that, supporting \ continuations is of little value.
> 
> I tend to agree on that bottom line; having this be inconsistent with psql
> does not seem like a win.
> 
>> I'm not clear on why we'd need a full SQL lexer.
> 
> So you don't get fooled by semicolons embedded in string literals or
> comments.

I take it we ignore those now?  I mean, personally, it wouldn't break
anything for me but since some other benhcmarks involve random text
generators ....


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
>> I tend to agree on that bottom line; having this be inconsistent with psql
>> does not seem like a win.
>>
>>> I'm not clear on why we'd need a full SQL lexer.
>>
>> So you don't get fooled by semicolons embedded in string literals or
>> comments.
>
> I take it we ignore those now?  I mean, personally, it wouldn't break
> anything for me but since some other benhcmarks involve random text
> generators ....

If backward compatibility is not an issue (I'm surprised:-), and failure 
is acceptable in contrived cases, a simple implementation would be to 
accumulate lines till one ends with ";\s*$",

Otherwise maybe the "states" management or the lexer are enough (in simple 
quotes, in double quotes, in comment, in stuff), so this can implemented 
without actually requiring another lexer in pgbench and be robust.

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
Hello Josh,

>> Add backslash continuations to pgbench custom scripts.
>> [...]
>> IMHO this approach is the best compromise.
>
> I don't personally agree.  I believe that it it worth breaking backwards
> compatibility to support line breaks in pgbench statements, and that if
> we're not going to do that, supporting \ continuations is of little value.
>
> As someone who actively uses pgbench to write custom benchmarks, I need
> to write queries which I can test.  \ continuation does NOT work on the
> psql command line, so that's useless for testing my queries; I still
> have to reformat and troubleshoot.  If we added \ continuation, I
> wouldn't use it.
>
> I think we should support line breaks, and require semicolons for
> end-of-statement.  Backwards-compatability in custom pgbench scripts is
> not critical; pgbench scripts are neither used in produciton, nor used
> in automated systems much that I know of.
>
> I'm not clear on why we'd need a full SQL lexer.


Attached is a "without lexer" version which does ;-terminated SQL commands 
and \-continuated meta commands (may be useful for \shell and long \set 
expressions).


Also attached is a small pgbench script to test the feature.


Without a lexer it is possible to fool pgbench with somehow contrived 
examples, say with:
  SELECT 'hello;    world';

The ";" within the string will be considered as end-of-line.

Also, comments intermixed with sql on the same line would generate errors.
  SELECT 1 -- one    + 3;

Would fail, but comments on lines of their own are ok.

It may be argued that these are not a likely scripts and that this 
behavior could be declared as a "feature" for keeping the code simple.

ISTM that it would be an overall improvement, but also the ;-termination 
requirement breaks backward compatibility.

-- 
Fabien.

Re: pgbench - allow backslash-continuations in custom scripts

From
Josh Berkus
Date:
Fabien,

> Without a lexer it is possible to fool pgbench with somehow contrived
> examples, say with:
> 
>   SELECT 'hello;
>     world';
> 
> The ";" within the string will be considered as end-of-line.
> 
> Also, comments intermixed with sql on the same line would generate errors.
> 
>   SELECT 1 -- one
>     + 3;
> 
> Would fail, but comments on lines of their own are ok.
> 
> It may be argued that these are not a likely scripts and that this
> behavior could be declared as a "feature" for keeping the code simple.

Yeah, these seem pretty contrived.  I would personally be OK with
breaking them.

> ISTM that it would be an overall improvement, but also the ;-termination
> requirement breaks backward compatibility.

Look, how many people in the world develop their own pgbench scripts?
And how many of those aren't on this list right now, reading this
thread?  I expect I could count them on my fingers and toes.
Backwards-compatability for pgdump, pg_basebackup, initdb, etc. matters.The worst case with pgbench is that we break
twopeople's test scripts,
 
they read the release notes, and fix them.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
> The worst case with pgbench is that we break two people's test scripts,
> they read the release notes, and fix them.

Sure, I agree that breaking pgbench custom scripts compatibility is no big 
deal, and having pgbench consistent with psql is useful.

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
Josh Berkus
Date:
On 06/21/2015 01:37 PM, Fabien COELHO wrote:
> 
>> The worst case with pgbench is that we break two people's test scripts,
>> they read the release notes, and fix them.
> 
> Sure, I agree that breaking pgbench custom scripts compatibility is no
> big deal, and having pgbench consistent with psql is useful.

... apparently nobody disagrees ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: pgbench - allow backslash-continuations in custom scripts

From
Tatsuo Ishii
Date:
> Look, how many people in the world develop their own pgbench scripts?
> And how many of those aren't on this list right now, reading this
> thread?  I expect I could count them on my fingers and toes.

I'm not against you break the backward compatibility of pgbench custom
scripts.

However I just want to let you know that PostgreSQL Enterprise
Consortium has been published couple of scripts along with reports on
evaluating PostgreSQL, which have been downloaded considerable
numbers.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: pgbench - allow backslash-continuations in custom scripts

From
Jeff Janes
Date:
On Wed, Jun 24, 2015 at 1:22 PM, Josh Berkus <josh@agliodbs.com> wrote:
On 06/21/2015 01:37 PM, Fabien COELHO wrote:
>
>> The worst case with pgbench is that we break two people's test scripts,
>> they read the release notes, and fix them.
>
> Sure, I agree that breaking pgbench custom scripts compatibility is no
> big deal, and having pgbench consistent with psql is useful.

... apparently nobody disagrees ...

I'm fine re-punctuating my current scripts.  And since pgbench doesn't have to be version-matched to the server it connects to, people can just keep using the older version if they want to against a new server.

Are there other breaking changes we have been wanting to make?  If so, should we try to get them all in during the same release?

Cheers,

Jeff

Re: pgbench - allow backslash-continuations in custom scripts

From
Michael Paquier
Date:
On Thu, Jun 25, 2015 at 10:51 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>> Look, how many people in the world develop their own pgbench scripts?
>> And how many of those aren't on this list right now, reading this
>> thread?  I expect I could count them on my fingers and toes.
>
> I'm not against you break the backward compatibility of pgbench custom
> scripts.
>
> However I just want to let you know that PostgreSQL Enterprise
> Consortium has been published couple of scripts along with reports on
> evaluating PostgreSQL, which have been downloaded considerable
> numbers.

pgbench is a tool for dedicated to developers. Hence people who should
be able to fix scripts properly as long as we inform them by
documenting it in the release notes so I am not sure that this is
actually a problem.
-- 
Michael



Re: pgbench - allow backslash-continuations in custom scripts

From
Tatsuo Ishii
Date:
>> I'm not against you break the backward compatibility of pgbench custom
>> scripts.
>>
>> However I just want to let you know that PostgreSQL Enterprise
>> Consortium has been published couple of scripts along with reports on
>> evaluating PostgreSQL, which have been downloaded considerable
>> numbers.
> 
> pgbench is a tool for dedicated to developers. Hence people who should
> be able to fix scripts properly as long as we inform them by
> documenting it in the release notes so I am not sure that this is
> actually a problem.

I'm not sure what you mean by "developers" here but if that means
people who are developing PostgreSQL itself, I am strongly against
"pgbench is a tool for dedicated to developers" concept. Pgbench has
been heavily used by users who want to measure PostgreSQL's
performance.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: pgbench - allow backslash-continuations in custom scripts

From
Michael Paquier
Date:
On Fri, Jun 26, 2015 at 9:01 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:
>>> I'm not against you break the backward compatibility of pgbench custom
>>> scripts.
>>>
>>> However I just want to let you know that PostgreSQL Enterprise
>>> Consortium has been published couple of scripts along with reports on
>>> evaluating PostgreSQL, which have been downloaded considerable
>>> numbers.
>>
>> pgbench is a tool for dedicated to developers. Hence people who should
>> be able to fix scripts properly as long as we inform them by
>> documenting it in the release notes so I am not sure that this is
>> actually a problem.
>
> I'm not sure what you mean by "developers" here but if that means
> people who are developing PostgreSQL itself, I am strongly against
> "pgbench is a tool for dedicated to developers" concept. Pgbench has
> been heavily used by users who want to measure PostgreSQL's
> performance.

I meant "people who can write SQL". Sorry for the misunderstanding.
Please do not take any offense ;)
-- 
Michael



Re: pgbench - allow backslash-continuations in custom scripts

From
Heikki Linnakangas
Date:
On 06/21/2015 11:12 AM, Fabien COELHO wrote:
>
> Hello Josh,
>
>>> Add backslash continuations to pgbench custom scripts.
>>> [...]
>>> IMHO this approach is the best compromise.
>>
>> I don't personally agree.  I believe that it it worth breaking backwards
>> compatibility to support line breaks in pgbench statements, and that if
>> we're not going to do that, supporting \ continuations is of little value.
>>
>> As someone who actively uses pgbench to write custom benchmarks, I need
>> to write queries which I can test.  \ continuation does NOT work on the
>> psql command line, so that's useless for testing my queries; I still
>> have to reformat and troubleshoot.  If we added \ continuation, I
>> wouldn't use it.
>>
>> I think we should support line breaks, and require semicolons for
>> end-of-statement.  Backwards-compatability in custom pgbench scripts is
>> not critical; pgbench scripts are neither used in produciton, nor used
>> in automated systems much that I know of.
>>
>> I'm not clear on why we'd need a full SQL lexer.
>
> Attached is a "without lexer" version which does ;-terminated SQL commands
> and \-continuated meta commands (may be useful for \shell and long \set
> expressions).

As Tom pointed out, you need the full lexer to do this correctly. You 
can argue that something that handles the most common cases is enough, 
but realistically, by the time you've handled all the common cases 
correctly, you've just re-invented the lexer.

The home-grown lexer is missing e.g. dollar-quoting support, so this is 
not be parsed correctly:

do $$
begin  ...
end;
$$;

That would be very nice to handle correctly, I've used DO-blocks in 
pgbench scripts many times, and it's a pain to have to write them in a 
single line.

Also worth noting that you can currently test so-called multi-statements 
with pgbench, by putting multiple statements on a single line. Your 
patch seems to still do that, but if we went with a full-blown SQL 
lexer, they would probably be split into two statements.

I think we should either bite the bullet and include the full SQL lexer 
in pgbench, or come up with some new syntax for marking the beginning 
and end of a statement. We could do something like bash here-documents 
or Postgres dollar-quoting, for example:

\set ...
select 1234; -- A statement on a single line, no change here

-- Begin a multi-line statement
\multi-line-statement END_TOKEN
select *  from complicated;
END_TOKEN

- Heikki




Re: pgbench - allow backslash-continuations in custom scripts

From
Andres Freund
Date:
On 2015-07-03 13:50:02 +0300, Heikki Linnakangas wrote:
> As Tom pointed out, you need the full lexer to do this correctly. You can
> argue that something that handles the most common cases is enough, but
> realistically, by the time you've handled all the common cases correctly,
> you've just re-invented the lexer.

Yes.

> I think we should either bite the bullet and include the full SQL lexer in
> pgbench, or come up with some new syntax for marking the beginning and end
> of a statement.

I'm pretty clearly in favor of doing correct lexing. I think we should
generalize that and make it reusable. psql has it's own hacked up
version already, there seems little point in having variedly good copies
around.

> We could do something like bash here-documents or Postgres
> dollar-quoting, for example:
> 
> \set ...
> select 1234; -- A statement on a single line, no change here
> 
> -- Begin a multi-line statement
> \multi-line-statement END_TOKEN
> select *
>   from complicated;
> END_TOKEN

Not pretty imo. I could see including something esimpler, in addition to
the lexer, to allow sending multiple statements in one go.



Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
Hello Heikki,

>>> I'm not clear on why we'd need a full SQL lexer.
>> 
>> Attached is a "without lexer" version which does ;-terminated SQL commands
>> and \-continuated meta commands (may be useful for \shell and long \set
>> expressions).
>
> As Tom pointed out, you need the full lexer to do this correctly. You can 
> argue that something that handles the most common cases is enough, but 
> realistically, by the time you've handled all the common cases correctly, 
> you've just re-invented the lexer.

Sure. I understand that part of Josh argument is that we are discussing 
pgbench test scripts here, not real full-blown applications, and these are 
expected to be quite basic, plain mostly SQL things.

> The home-grown lexer is missing e.g. dollar-quoting support, so this is not 
> be parsed correctly:
>
> do $$
> begin
>  ...
> end;
> $$;

Hmmm, good one, if indeed you want to use PL/pgSQL or even any arbitrary 
language in a pgbench scripts... I would rather have created functions 
(once, outside of pgbench) and would call them from the script, so that 
would be a simple SELECT.

> That would be very nice to handle correctly, I've used DO-blocks in pgbench 
> scripts many times, and it's a pain to have to write them in a single line.

Yep. With some languages I'm not sure that it is even possible.

> Also worth noting that you can currently test so-called multi-statements 
> with pgbench, by putting multiple statements on a single line.

Yes indeed, behind the hood pgbench expects just one line, or you have to 
change significantly the way statements are handled, which is way beyond 
my initial intentions on this one, and this would mean quite a lot of 
changes for more or less corner cases.

> Your patch seems to still do that, but if we went with a full-blown SQL 
> lexer, they would probably be split into two statements.

> I think we should either bite the bullet and include the full SQL lexer in 
> pgbench, or come up with some new syntax for marking the beginning and end of 
> a statement. We could do something like bash here-documents or Postgres 
> dollar-quoting, for example:
>
> \set ...
> select 1234; -- A statement on a single line, no change here
>
> -- Begin a multi-line statement
> \multi-line-statement END_TOKEN
> select *
>  from complicated;
> END_TOKEN

I do not like the aesthetic of the above much. I really liked the idea of 
simply writing SQL queries just as in psql.

So maybe just handling $$-quoting would be enough to handle reasonable 
use-cases without troubling pgbench internal working too much? That would 
be a simple local changes in the line reader.

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
> I'm pretty clearly in favor of doing correct lexing. I think we should
> generalize that and make it reusable. psql has it's own hacked up
> version already, there seems little point in having variedly good copies
> around.

I must admit that I do not know how to share lexer rules but have 
different actions on them (psql vs sql parser vs ...), as the action code 
is intrinsically intertwined with expressions. Maybe some hack is 
possible. Having yet another SQL-lexer to maintain seems highly 
undesirable, especially just for pgbench.

> I could see including something esimpler, in addition to the lexer, to 
> allow sending multiple statements in one go.

Currently, probably
  SELECT 1; SELECT 1;

Does 2 statements in one go, but it is on one line.

May by allowing both continuations and ";" at the same time:
  -- two statements in one go  SELECT 1; \  SELECT 1;  -- next statement on it's own  SELECT    1;

Which could be reasonnably neat, and easy to implement.

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
> The home-grown lexer is missing e.g. dollar-quoting support, so this is not 
> be parsed correctly:
>
> do $$
> begin
>  ...
> end;
> $$;

> That would be very nice to handle correctly, I've used DO-blocks in pgbench 
> scripts many times, and it's a pain to have to write them in a single line.

Attached is a version which does that (I think), and a test script.
 - backslash-commands can be \-continuated - sql-commands may include $$-quotes and must end with a ';' - double-dash
commentsand blank line are skipped.
 

Obviously it is still a non-lexer hack which can be easily defeated, but 
ISTM that it handles non-contrived cases well. Anyway ISTM that dollar 
quoting cannot be handle as such and simply by a lexer, it is really an 
exception mechanism.

-- 
Fabien.

Re: pgbench - allow backslash-continuations in custom scripts

From
Tom Lane
Date:
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> I'm pretty clearly in favor of doing correct lexing. I think we should
>> generalize that and make it reusable. psql has it's own hacked up
>> version already, there seems little point in having variedly good copies
>> around.

> I must admit that I do not know how to share lexer rules but have 
> different actions on them (psql vs sql parser vs ...), as the action code 
> is intrinsically intertwined with expressions.

Obviously this is scope creep of the first magnitude, but ISTM that
it would be possible to share a lexer between psql and pgbench, since
in both of them the basic requirement is "break SQL commands apart and
identify newline-terminated backslash commands".  If we're gonna break
pgbench's backwards compatibility anyway, there would be a whole lot
to be said for just going over to psql's input parsing rules, lock
stock 'n barrel; and this would be a good way to achieve that.

As it stands, psqlscan.l has some external dependencies on the rest of
psql, but we could perhaps refactor some of those away, and provide dummy
implementations to satisfy others (eg pgbench could provide a dummy
GetVariable() that just always returns NULL).

So I'm imagining symlinking psqlscan.l into src/bin/pgbench and using it
as-is (possibly after refactoring in psql).  A possible issue is avoiding
unnecessary invocations of flex, though.  Maybe symlinking the .c file
would work better.
        regards, tom lane



Re: pgbench - allow backslash-continuations in custom scripts

From
Tom Lane
Date:
I wrote:
> As it stands, psqlscan.l has some external dependencies on the rest of
> psql, but we could perhaps refactor some of those away, and provide dummy
> implementations to satisfy others (eg pgbench could provide a dummy
> GetVariable() that just always returns NULL).

> So I'm imagining symlinking psqlscan.l into src/bin/pgbench and using it
> as-is (possibly after refactoring in psql).  A possible issue is avoiding
> unnecessary invocations of flex, though.  Maybe symlinking the .c file
> would work better.

A quick experiment with compiling psqlscan inside pgbench yields the
following failures:

pgbench.o: In function `psql_scan_setup':
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1239: undefined reference to `pset'
pgbench.o: In function `escape_variable':
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1950: undefined reference to `pset'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1950: undefined reference to `GetVariable'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1956: undefined reference to `pset'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1957: undefined reference to `psql_error'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1971: undefined reference to `pset'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1973: undefined reference to `psql_error'
pgbench.o: In function `evaluate_backtick':
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1701: undefined reference to `psql_error'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1712: undefined reference to `psql_error'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1722: undefined reference to `psql_error'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1728: undefined reference to `psql_error'
pgbench.o: In function `yylex':
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:511: undefined reference to `standard_strings'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:743: undefined reference to `pset'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:743: undefined reference to `GetVariable'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:751: undefined reference to `psql_error'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1037: undefined reference to `pset'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1037: undefined reference to `GetVariable'
pgbench.o: In function `psql_scan_slash_option':
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1619: undefined reference to `pset'
/home/postgres/pgsql/src/bin/pgbench/psqlscan.l:1628: undefined reference to `psql_error'

The pset references are to pset.encoding, pset.db, or pset.vars.  I'd
think the best way to deal with the encoding and connection are to pass
them as parameters to psql_scan_setup() which'd store them in 
the PsqlScanState.  pset.vars is only passed to GetVariable.  We could
refactor that away somehow (although actually, why wouldn't we want to
just implement variable substitution exactly like it is in psql?  Maybe
the right answer is to import psql/variables.c lock stock n barrel too...)
psql_error() and standard_strings() wouldn't be hard to provide.

So this is looking *eminently* doable.
        regards, tom lane



Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
> (although actually, why wouldn't we want to just implement variable 
> substitution exactly like it is in psql?

Pgbench variable substitution is performed when the script is run, not 
while the file is being processed for being split, which is when a lexer 
would be used. The situation is not the same with psql. The most it could 
do would be to keep track of what substitution are done in queries.

> So this is looking *eminently* doable.

Possibly.  How much more effort would be involved compared to the quick 
patch I did, I wonder:-)

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
Hello Tom,

>> (although actually, why wouldn't we want to just implement variable 
>> substitution exactly like it is in psql?
>
> Pgbench variable substitution is performed when the script is run, not while 
> the file is being processed for being split, which is when a lexer would be 
> used. The situation is not the same with psql. The most it could do would be 
> to keep track of what substitution are done in queries.
>
>> So this is looking *eminently* doable.
>
> Possibly.  How much more effort would be involved compared to the quick patch 
> I did, I wonder:-)

I had a quick look at the code, and although it seems doable to hack the 
psql lexer for pgbench benefit, I do not think it is a good idea:
 - improving pgbench scripts is really a small feature which requires   a light weight approach in my opinion. There is
noreal benefit   of having a lexer solution which can handle contrived cases, because   they would be contrived cases
andnot the kind of tests really written   by pgbench users.
 
 - the solution would probably be fragile to changes in psql, or   could prevent such changes because of the pgbench
dependency,  and this does not look desirable.
 
 - it would involve much more time than I'm ready to give on such a   small feature.

So the current patch, or possibly variants of this patch to fix issues 
that may be raised, is what I'm ready to put forward on this.

If you feel that this feature only deserve a lexer solution, then the 
patch should be "returned with feedback".

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
Hi,

> If you feel that this feature only deserve a lexer solution, then the
> patch should be "returned with feedback".

It's unfortunate to abandon this idea so I tried this and made it
run with psql's parser.  I think it works as expected.

The attached files are as follwoing.

- 0001-Prepare-for-share-psqlscan-with-pgbench.patchA patch to modify psql so that psqlscan can be shared with other
modules.

- 0002-Make-use-of-psqlscan-in-pgbench.patchA patch to use psqlscan in pgbench.

- hoge.sqlA sample custom script including multilne statement and line comment 

I can't judge wheter this is a new version of Febien's patch
following Tom's suggestion or brand-new one. Anyway I'd like to
post on this thread.


======
At Fri, 17 Jul 2015 21:26:44 +0200 (CEST), Fabien COELHO <coelho@cri.ensmp.fr> wrote in
<alpine.DEB.2.10.1507172113080.31314@sto>
> > Pgbench variable substitution is performed when the script is run, not
> > while the file is being processed for being split, which is when a
> > lexer would be used. The situation is not the same with psql. The most
> > it could do would be to keep track of what substitution are done in
> > queries.
> >
> >> So this is looking *eminently* doable.
> >
> > Possibly.  How much more effort would be involved compared to the
> > quick patch I did, I wonder:-)


The patch set consists of two parts.

The first modifies psqlscan.l to work in pgbench. Almost along on
Tom's suggestion.
 - Eliminate direct reading of pset and store them into   PsqlScanState in psql_scan_setup.
 - variables, common, settings and prompt in pgbench are the   shrinked version from that of psql.

The second part modifies pgbench to use the modified version of
psqlscan.l. As the result,

- Multiline SQLs (not backslash continuation) in custom script is allowed. (also for builtins but it's no use).

- backslash commands is handled as the same as before: multiline is not allowed.

A sample script is also attached.

Suggestions? Opinions?

I don't have idea how to deal with the copy of psqlscan.[lh] from
psql. Currently they are simply the dead copies of those of psql.

- Modifying psqlscan in psql requires consideration on how it is used in pgbench.

- They are rather small but common, variables, prompt are essentially needeless files..



reagsrds,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
Hello Kyotaro-san,

>> If you feel that this feature only deserve a lexer solution, then the
>> patch should be "returned with feedback".
>
> It's unfortunate to abandon this idea so I tried this and made it run 
> with psql's parser. I think it works as expected.

Wow, you are much more courageous than I am!:-)

> - 0001-Prepare-for-share-psqlscan-with-pgbench.patch
> A patch to modify psql so that psqlscan can be shared with other modules.
>
> - 0002-Make-use-of-psqlscan-in-pgbench.patch
> A patch to use psqlscan in pgbench.
>
> - hoge.sql
> A sample custom script including multilne statement and line comment
>
> I can't judge wheter this is a new version of Febien's patch
> following Tom's suggestion or brand-new one. Anyway I'd like to
> post on this thread.

I think it is really a new patch, but posting it is seems logical because 
that is where the discussion was lead.

> - backslash commands is handled as the same as before: multiline
>  is not allowed.

Hmm... that is really the feature I wanted to add initially, too bad it is 
the dropped one:-)

> Suggestions? Opinions?
>
> I don't have idea how to deal with the copy of psqlscan.[lh] from
> psql. Currently they are simply the dead copies of those of psql.

I think that there should be no copies, but it should use relative 
symbolic links so that the files are kept synchronized.

> - Modifying psqlscan in psql requires consideration on how it is
>  used in pgbench.

Yep, that is one of the reason why I did not want to go this way, bar my 
natural lazyness.

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
Hi, all.

Attatched is the revised version of this patch.

The first patch is not changed from before.

The second is fixed a kind of bug.

Ths third is the new one to allow backslash continuation for
backslash commands.

hoge.sql is the test custom script.

======
At Fri, 24 Jul 2015 07:39:16 +0200 (CEST), Fabien COELHO <coelho@cri.ensmp.fr> wrote in
<alpine.DEB.2.10.1507240731050.12839@sto>
> > - backslash commands is handled as the same as before: multiline
> >  is not allowed.
> 
> Hmm... that is really the feature I wanted to add initially, too bad
> it is the dropped one:-)

Ouch. The story has been derailed somewhere.

Since SQL statments could be multilined without particluar
marker, we cannot implement multilined backslash commands in the
same way..

The attached revised patch allows backslash continuation for
backslash comands. I suppose this is the same as what you did in
behavior. But SQL statements are still can be continued as psql
does.

I'm not satisfied by the design but I don't see another way..

> >
> > I don't have idea how to deal with the copy of psqlscan.[lh] from
> > psql. Currently they are simply the dead copies of those of psql.
> 
> I think that there should be no copies, but it should use relative
> symbolic links so that the files are kept synchronized.

Yeah, I think so but symlinks could harm on git and Windows. The
another way would be make copies it from psql directory. They
live next door to each other.

> > - Modifying psqlscan in psql requires consideration on how it is
> >  used in pgbench.
> 
> Yep, that is one of the reason why I did not want to go this way, bar
> my natural lazyness.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
<oops, sorry, stalled post because of wrong from, posting again...>

> Attatched is the revised version of this patch.
>
> The first patch is not changed from before.
>
> The second is fixed a kind of bug.
>
> Ths third is the new one to allow backslash continuation for
> backslash commands.

Ah, thanks:-)

Would you consider adding the patch to the next commitfest? I may put 
myself as a reviewer...

> [...] I'm not satisfied by the design but I don't see another way..

I'll try to have a look.

>>> I don't have idea how to deal with the copy of psqlscan.[lh] from
>>> psql. Currently they are simply the dead copies of those of psql.
>>
>> I think that there should be no copies, but it should use relative
>> symbolic links so that the files are kept synchronized.
>
> Yeah, I think so but symlinks could harm on git and Windows.
> The another way would be make copies it from psql directory. They live 
> next door to each other.

Indeed there are plenty of links already which are generated by makefiles 
(see src/bin/pg_xlogdump/*), and probably a copy is made on windows. There 
should no file duplication within the source tree.

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
Hello,

> > Attatched is the revised version of this patch.
> >
> > The first patch is not changed from before.
> >
> > The second is fixed a kind of bug.
> >
> > Ths third is the new one to allow backslash continuation for
> > backslash commands.
> 
> Ah, thanks:-)
> 
> Would you consider adding the patch to the next commitfest? I may put
> myself as a reviewer...

No problem.

> > [...] I'm not satisfied by the design but I don't see another way..
> 
> I'll try to have a look.

Thanks.

> >>> I don't have idea how to deal with the copy of psqlscan.[lh] from
> >>> psql. Currently they are simply the dead copies of those of psql.
> >>
> >> I think that there should be no copies, but it should use relative
> >> symbolic links so that the files are kept synchronized.
> >
> > Yeah, I think so but symlinks could harm on git and Windows.
> > The another way would be make copies it from psql directory. They live
> > next door to each other.
> 
> Indeed there are plenty of links already which are generated by
> makefiles (see src/bin/pg_xlogdump/*), and probably a copy is made on
> windows. There should no file duplication within the source tree.

Thank you for pointing out that. Makefile of pg_xlogdump
re-creates symlinks to those files and they are replaced with cp
for the platforms where symlinks are not usable. But the files
are are explicitly added to .sln file on msvc. Anyway I'll
address it.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
> Attatched is the revised version of this patch.
>
> The first patch is not changed from before.
>
> The second is fixed a kind of bug.
>
> Ths third is the new one to allow backslash continuation for
> backslash commands.

Ah, thanks:-)

Would you consider adding the patch to the next commitfest? I may put 
myself as a reviewer...

> [...] I'm not satisfied by the design but I don't see another way..

I'll try to have a look.

>>> I don't have idea how to deal with the copy of psqlscan.[lh] from
>>> psql. Currently they are simply the dead copies of those of psql.
>>
>> I think that there should be no copies, but it should use relative
>> symbolic links so that the files are kept synchronized.
>
> Yeah, I think so but symlinks could harm on git and Windows.
> The another way would be make copies it from psql directory. They live 
> next door to each other.

Indeed there are plenty of links already which are generated by makefiles 
(see src/bin/pg_xlogdump/*), and probably a copy is made on windows. There 
should no file duplication within the source tree.

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
Heikki Linnakangas
Date:
On 07/24/2015 11:36 AM, Kyotaro HORIGUCHI wrote:
> At Fri, 24 Jul 2015 07:39:16 +0200 (CEST), Fabien COELHO <coelho@cri.ensmp.fr> wrote in
<alpine.DEB.2.10.1507240731050.12839@sto>
>>> - backslash commands is handled as the same as before: multiline
>>>   is not allowed.
>>
>> Hmm... that is really the feature I wanted to add initially, too bad
>> it is the dropped one:-)
>
> Ouch. The story has been derailed somewhere.
>
> Since SQL statments could be multilined without particluar
> marker, we cannot implement multilined backslash commands in the
> same way..

I don't think we actually want backslash-continuations. The feature we 
want is "allow SQL statements span multiple lines", and using the psql 
lexer solves that. We don't need the backslash-continuations when we 
have that.

On 07/25/2015 05:53 PM, Fabien COELHO wrote:
>>>> I don't have idea how to deal with the copy of psqlscan.[lh] from
>>>> psql. Currently they are simply the dead copies of those of psql.
>>>
>>> I think that there should be no copies, but it should use relative
>>> symbolic links so that the files are kept synchronized.
>>
>> Yeah, I think so but symlinks could harm on git and Windows.
>> The another way would be make copies it from psql directory. They live
>> next door to each other.
>
> Indeed there are plenty of links already which are generated by makefiles
> (see src/bin/pg_xlogdump/*), and probably a copy is made on windows. There
> should no file duplication within the source tree.

Yeah, following the example of pg_xlogdump and others is the way to go.

Docs need updating, and there's probably some cleanup to do before this 
is ready for committing, but overall I think this is definitely the 
right direction.

I complained upthread that this makes it impossible to use 
"multi-statements" in pgbench, as they would be split into separate 
statements, but looking at psqlscan.l there is actually a syntax for 
that in psql already. You escape the semicolon as \;, e.g. "SELECT 1 \; 
SELECT 2;", and then both queries will be sent to the server as one. So 
even that's OK.

- Heikki



Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
Hello Heikki,

> I don't think we actually want backslash-continuations. The feature we want 
> is "allow SQL statements span multiple lines", and using the psql lexer 
> solves that. We don't need the backslash-continuations when we have that.

Sure. The feature *I* initially wanted was to have multi-line 
meta-commands. For this feature ISTM that continuations are, alas, the 
solution.

>> Indeed there are plenty of links already which are generated by makefiles
>> (see src/bin/pg_xlogdump/*), and probably a copy is made on windows. There
>> should no file duplication within the source tree.
>
> Yeah, following the example of pg_xlogdump and others is the way to go.
>
> Docs need updating, and there's probably some cleanup to do before this is 
> ready for committing, but overall I think this is definitely the right 
> direction.

I've created an entry for the next commitfest, and put the status to 
"waiting on author".

> I complained upthread that this makes it impossible to use "multi-statements" 
> in pgbench, as they would be split into separate statements, but looking at 
> psqlscan.l there is actually a syntax for that in psql already. You escape 
> the semicolon as \;, e.g. "SELECT 1 \; SELECT 2;", and then both queries will 
> be sent to the server as one. So even that's OK.

Good!

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
Hi, all.

> > I don't think we actually want backslash-continuations. The feature we
> > want is "allow SQL statements span multiple lines", and using the psql
> > lexer solves that. We don't need the backslash-continuations when we
> > have that.
> 
> Sure. The feature *I* initially wanted was to have multi-line
> meta-commands. For this feature ISTM that continuations are, alas, the
> solution.
> 
> >> Indeed there are plenty of links already which are generated by
> >> makefiles
> >> (see src/bin/pg_xlogdump/*), and probably a copy is made on
> >> windows. There
> >> should no file duplication within the source tree.
> >
> > Yeah, following the example of pg_xlogdump and others is the way to
> > go.
> >
> > Docs need updating, and there's probably some cleanup to do before
> > this is ready for committing, but overall I think this is definitely
> > the right direction.
> 
> I've created an entry for the next commitfest, and put the status to
> "waiting on author".
> 
> > I complained upthread that this makes it impossible to use
> > "multi-statements" in pgbench, as they would be split into separate
> > statements, but looking at psqlscan.l there is actually a syntax for
> > that in psql already. You escape the semicolon as \;, e.g. "SELECT 1
> > \; SELECT 2;", and then both queries will be sent to the server as
> > one. So even that's OK.
> 
> Good!

Hmm. psqlscan.l handles multistatement naturally.
I worked on that and the attached patche set does,
- backslash continuation for pgbench metacommands.
  set variable \      <some value>
- SQL statement natural continuation lines.
  SELECT :foo   FROM  :bar;
- SQL multi-statement.
  SELECT 1; SELECT 2;


The work to be left is eliminating double-format of Command
struct.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
Hello, Thank you for registering this to CF-Sep.
I missed the regtration window.. It ended earlier than usual..

Most troubles have gone and I'll be back next week.

> The work to be left is eliminating double-format of Command
> struct.

This is done as the additional fourth patch, not merged into
previous ones, to show what's changed in the manner of command
storing.

I repost on this thread the new version of this patch including
this and posted before. This is rebased to current master.

The changes in behaviors brought by this patch has been described
in the privous mail as the following,

> Hmm. psqlscan.l handles multistatement naturally.
> I worked on that and the attached patche set does,
> 
>  - backslash continuation for pgbench metacommands.
> 
>    set variable \
>        <some value>
> 
>  - SQL statement natural continuation lines.
> 
>    SELECT :foo
>     FROM  :bar;
> 
>  - SQL multi-statement.
> 
>    SELECT 1; SELECT 2;

Each of the four patches does the following thigs,

1. 0001-Prepare-to-share-psqlscan-with-pgbench.patch
 The global variable pset, VariableSpace and backslash syntax of psql are the obstacles for psqlscan.l from being used
bypgbench.  This patch eliminates direct reference to pset and masks VariableSpace feature (looks ugry..), and enables
backslashsyntax in psqlscan.l to be hidden from outside psql by defining the symbol OUTSIDE_PSQL.
 
 No behavioral changes of pasql are introduced by ths patch.

2. 0002-Make-use-of-psqlscan-for-parsing-of-custom-script.patch
 This is the core of this patch, which makes pgbench to use psqlscan.l and enables multi-statements,
multiline-SQL-statementand backslash-continuation of metacommands.
 
 The struct Command is modified that it can be chained in order to convey multistatement in one line. But the commands
arestored in an array of Command just outside process_commands as of old. This double-formatting will be removed by the
fourthpatch.
 
 psqlscan.c is compiled as a part of mainloop.c for some reason described at the end of the file. I haven't confirmed
thatthe same thing will happen in pgbench, but I did the same thing for pgbenc.c.
 
 Compilation will fail on Windows as of this patch.

3. 0003-Change-MSVC-Build-script.patch
 Changes the build script for Windows platform. It certainly works but might look a bit odd because of the anormaly of
thecompilation way of psqlscan.l
 

4. 0004-Change-the-way-to-hold-command-list.patch
 Changes the way to hold commad list from an array to linked list, to remove the double formatting of Command-list
introducedby the second patch. This removes the explicit limitation on the number of commands in scripts, as a
side-effect.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
Hello,

Here is a review, sorry for the delay...

> This is done as the additional fourth patch, not merged into
> previous ones, to show what's changed in the manner of command
> storing.
> [...]
>>  - SQL multi-statement.
>>
>>    SELECT 1; SELECT 2;

I think this is really "SELECT 1\; SELECT 2;"

I join a test script I used.


The purpose of this 4 parts patch is to reuse psql scanner from pgbench
so that commands are cleanly separated by ";", including managing dollar
quoting, having \ continuations in backslash-commands, having 
multi-statement commands...

This review is about 4 part v4 of the patch. The patches apply and compile 
cleanly.

I think that the features are worthwhile. I would have prefer more limited
changes to get them, but my earlier attempt was rejected, and the scanner
sharing with psql results in reasonably limited changes, so I would go for
it.


* 0001 patch about psql scanner reworking

The relevant features lexer which can be reused by pgbench are isolated
and adapted thanks to ifdefs, guards, and putting some stuff in the
current state.

I'm not sure of the "OUTSIDE_PSQL" macro name. ISTM that "PGBENCH_SCANNER"
would be better, as it makes it very clear that it is being used for pgbench,
and if someone wants to use it for something else they should define and handle
their own case explicitely.

Use "void *" instead of "int *" for VariableSpace?

Rule ":{variable_char}+": the ECHO works more or less as a side effect,
and most of the code is really ignored by pgbench. Instead of the different
changes which rely on NULL values, what about a simple ifdef/else/endif
to replace the whole stuff by ECHO for pgbench, without changing the current
code?

Also, on the same part of the code, I'm not sure about having two assignments
on the "const char * value" variable, because of "const".

The "db" parameter is not used by psql_scan_setup, so the state's db field
is never initialized, so probably "psql" does not work properly because
it seems used in several places.

I'm not sure what would happen if there are reconnections from psql (is 
that possible? Without reseting the scanner state?), as there are two 
connections, one in pset and one in the scanner state?

Variable lookup guards: why is a database connection necessary for doing
":variables" lookups? It seemed not to be required in the initial version,
and is not really needed.

Avoid changing style without clear motivation, for instance the 
PQerrorMessage/psql_error on escape_value==NULL?


*** 0002 patch to use the scanner in pgbench

There is no documentation nor examples about the new features.
I think that the internal builtin commands and the documentation should
be used to show the new features where appropriate, and insist on that
";" are now required at the end of sql commands.

If the file starts with a -- comment followed by a backslash-command, eg:  -- there is only one foo currently available
\set foo 1
 
an error is reported: the comment should probably just be ignored.

I'm not sure that the special "next" command parsing management is useful.
I do not see a significant added value to managing especially a list of
commands for commands which happened to be on the same line but separated
by a simple ";". Could not the code be simplified by just restarting
the scanner where it left, instead of looping in "process_commands"?

It seems that part of the changes are just reindentations, especially
all the parsing code for backslash commands. This should be avoided if
possible.

Some spurious spaces after "next_command:".


*** 0003 patch for ms build

I don't do windows:-)

The perl script changes look pretty reasonable, although the copied
comments refer to pg_xlogdump, I guess it should rather refer to pgbench.


*** 0004 command list patch

This patch changes the command array to use a linked list.

As the command number is needed in several places and has to be replaced by a
function call which scans the list, I do not think it is a good idea, and
I recommand not to consider this part for inclusion.

-- 
Fabien.

Re: pgbench - allow backslash-continuations in custom scripts

From
Michael Paquier
Date:
On Wed, Oct 14, 2015 at 5:49 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>
> Hello,
>
> Here is a review, sorry for the delay...
>
>> This is done as the additional fourth patch, not merged into
>> previous ones, to show what's changed in the manner of command
>> storing.
>> [...]
>>>
>>>  - SQL multi-statement.
>>>
>>>    SELECT 1; SELECT 2;
>
>
> I think this is really "SELECT 1\; SELECT 2;"
>
> I join a test script I used.
>
>
> The purpose of this 4 parts patch is to reuse psql scanner from pgbench
> so that commands are cleanly separated by ";", including managing dollar
> quoting, having \ continuations in backslash-commands, having
> multi-statement commands...
>
> This review is about 4 part v4 of the patch. The patches apply and compile
> cleanly.
>
> I think that the features are worthwhile. I would have prefer more limited
> changes to get them, but my earlier attempt was rejected, and the scanner
> sharing with psql results in reasonably limited changes, so I would go for
> it.

Regarding that:
+#if !defined OUTSIDE_PSQL
+#include "variables.h"
+#else
+typedef int * VariableSpace;
+#endif

And that:
+/* Provide dummy macros when no use of psql variables */
+#if defined OUTSIDE_PSQL
+#define GetVariable(space,name) NULL
+#define standard_strings() true
+#define psql_error(fmt,...) do { \
+    fprintf(stderr, "psql_error is called. abort.\n");\
+    exit(1);\
+} while(0)
+#endif
That's ugly... Wouldn't it be better with something say in src/common
which is frontend-only? We could start with a set of routines allowing
commands to be parsed. That gives us more room for future improvement.

+    # fix up pg_xlogdump once it's been set up
+    # files symlinked on Unix are copied on windows
+    my $pgbench = AddSimpleFrontend('pgbench');
+    $pgbench->AddDefine('FRONTEND');
+    $pgbench->AddDefine('OUTSIDE_PSQL');
+    $pgbench->AddFile('src/bin/psql/psqlscan.l');
+    $pgbench->AddIncludeDir('src/bin/psql');
This is a simple copy-paste, with an incorrect comment at least
(haven't tested compilation with MSVC, I suspect that this is going to
fail still the flags are correctly set).

This patch is waiting for input from its author for quite some time
now, and the structure of this patch needs a rework. Are folks on this
thread fine if it is returned with feedback?
-- 
Michael



Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
I'm terribly sorry to have overlooked Febien's comment.

I'll rework on this considering Febien's previous comment and
Michael's this comment in the next CF.


At Tue, 22 Dec 2015 16:17:07 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in
<CAB7nPqRX_-VymeEH-3ChoPrQLgKh=EGgQ2GUtZ53ccO9uLGmxA@mail.gmail.com>
> On Wed, Oct 14, 2015 at 5:49 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> >
> > Hello,
> >
> > Here is a review, sorry for the delay...
> >
> >> This is done as the additional fourth patch, not merged into
> >> previous ones, to show what's changed in the manner of command
> >> storing.
> >> [...]
> >>>
> >>>  - SQL multi-statement.
> >>>
> >>>    SELECT 1; SELECT 2;
> >
> >
> > I think this is really "SELECT 1\; SELECT 2;"

Mmm. It is differenct from my recognition. I'll confirm that.

> > I join a test script I used.

Thank you. I'll work on with it.

> > The purpose of this 4 parts patch is to reuse psql scanner from pgbench
> > so that commands are cleanly separated by ";", including managing dollar
> > quoting, having \ continuations in backslash-commands, having
> > multi-statement commands...
> >
> > This review is about 4 part v4 of the patch. The patches apply and compile
> > cleanly.
> >
> > I think that the features are worthwhile. I would have prefer more limited
> > changes to get them, but my earlier attempt was rejected, and the scanner
> > sharing with psql results in reasonably limited changes, so I would go for
> > it.
> 
> Regarding that:
> +#if !defined OUTSIDE_PSQL
> +#include "variables.h"
> +#else
> +typedef int * VariableSpace;
> +#endif
> 
> And that:
> +/* Provide dummy macros when no use of psql variables */
> +#if defined OUTSIDE_PSQL
> +#define GetVariable(space,name) NULL
> +#define standard_strings() true
> +#define psql_error(fmt,...) do { \
> +    fprintf(stderr, "psql_error is called. abort.\n");\
> +    exit(1);\
> +} while(0)
> +#endif
> That's ugly...

I have to admit that I think just the same.

> Wouldn't it be better with something say in src/common
> which is frontend-only? We could start with a set of routines allowing
> commands to be parsed. That gives us more room for future improvement.

If I read you correctly, I should cut it out into a new file and
include it. Is it correct?

> +    # fix up pg_xlogdump once it's been set up
> +    # files symlinked on Unix are copied on windows
> +    my $pgbench = AddSimpleFrontend('pgbench');
> +    $pgbench->AddDefine('FRONTEND');
> +    $pgbench->AddDefine('OUTSIDE_PSQL');
> +    $pgbench->AddFile('src/bin/psql/psqlscan.l');
> +    $pgbench->AddIncludeDir('src/bin/psql');
> This is a simple copy-paste, with an incorrect comment at least
> (haven't tested compilation with MSVC, I suspect that this is going to
> fail still the flags are correctly set).

Oops. Thank you for pointing out. It worked for me but, honestly
saying, I couldn't another clean way to do that but I'll
reconsider it.

> This patch is waiting for input from its author for quite some time
> now, and the structure of this patch needs a rework. Are folks on this
> thread fine if it is returned with feedback?

It's fine for me.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: pgbench - allow backslash-continuations in custom scripts

From
Michael Paquier
Date:
On Tue, Dec 22, 2015 at 5:34 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> I wrote:
>> Wouldn't it be better with something say in src/common
>> which is frontend-only? We could start with a set of routines allowing
>> commands to be parsed. That gives us more room for future improvement.
>
> If I read you correctly, I should cut it out into a new file and
> include it. Is it correct?

Not really, I meant to see if it would be possible to include this set
of routines directly in libpqcommon (as part of OBJS_FRONTEND). This
way any client applications could easily reuse it. If we found that
what was in psql is now useful to pgbench, I have little doubt that
some other folks would make good use of that. I honestly have not
looked at the code to see if that's doable or not, but soft-linking
directly in pgbench a file of psql will not help future maintenance
for sure. This increases the complexity of the code.
-- 
Michael



Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
Hello Michaël,

>> If I read you correctly, I should cut it out into a new file and
>> include it. Is it correct?
>
> Not really, I meant to see if it would be possible to include this set
> of routines directly in libpqcommon (as part of OBJS_FRONTEND). This
> way any client applications could easily reuse it. If we found that
> what was in psql is now useful to pgbench, I have little doubt that
> some other folks would make good use of that. I honestly have not
> looked at the code to see if that's doable or not, but soft-linking
> directly in pgbench a file of psql will not help future maintenance
> for sure. This increases the complexity of the code.

Just my 0.02€:

I understand that you suggest some kind of dynamic parametrization... this 
is harder to do and potentially as fragile as the link/ifdef option, with 
an undertermined set of callbacks to provide... the generic version would 
be harder to debug, and this approach would prevent changing lexer 
options. Basically I'm not sure that doing all that for improving the 
handling of pgbench scripts is worth the effort. I would go with the 
simpler option.

-- 
Fabien.

Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
Hello,

At Thu, 24 Dec 2015 07:40:19 +0100 (CET), Fabien COELHO <coelho@cri.ensmp.fr> wrote in
<alpine.DEB.2.10.1512240729160.17411@sto>
>
> Hello Michaël,
>
> >> If I read you correctly, I should cut it out into a new file and
> >> include it. Is it correct?
> >
> > Not really, I meant to see if it would be possible to include this set
> > of routines directly in libpqcommon (as part of OBJS_FRONTEND). This
> > way any client applications could easily reuse it. If we found that
> > what was in psql is now useful to pgbench, I have little doubt that
> > some other folks would make good use of that. I honestly have not
> > looked at the code to see if that's doable or not, but soft-linking
> > directly in pgbench a file of psql will not help future maintenance
> > for sure. This increases the complexity of the code.

Thanks. I understand it and I agree to the last sentense. I don't
want them to be exposed as generic features.

Instaed, I'd like to separate backslash commands from psqlscan.l
and use callbacks to access variables by ":name" syntax (it is a
common syntax between pgbench and psql). Current psqlscan.l
simply exits noticing of leading backslash of backslash commands
to the caller so it would be separated without heavy
surgery. This can put away the ugliness of VariableSpace
overriding.

standard_strings() checks standard_comforming_strins on the
session. This is necessarily called on parsing so it can be moved
out into PsqlScanState.

psql_error() redirects messages according to queryFout and adds
input file information. They are heavily dependent to psql. So
I'd like to make them a callback in PsqlScanState and do fprintf
as the default behavior (=NULL).

These measures will get rid of the ugliness. What do you think
about this?

> Just my 0.02€:
>
> I understand that you suggest some kind of dynamic
> parametrization... this is harder to do and potentially as fragile as
> the link/ifdef option, with an undertermined set of callbacks to
> provide... the generic version would be harder to debug, and this
> approach would prevent changing lexer options. Basically I'm not sure
> that doing all that for improving the handling of pgbench scripts is
> worth the effort. I would go with the simpler option.

Undetermined set of callbacks would do so but it seems to me a
set of few known functions to deal with as shown above. The
shared lexer deals only with SQL and a backslash at the top of a
command.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center





Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
Hello,

At Fri, 25 Dec 2015 14:18:24 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20151225.141824.75912397.horiguchi.kyotaro@lab.ntt.co.jp>
> > > Not really, I meant to see if it would be possible to include this set
> > > of routines directly in libpqcommon (as part of OBJS_FRONTEND). This
> > > way any client applications could easily reuse it. If we found that
> > > what was in psql is now useful to pgbench, I have little doubt that
> > > some other folks would make good use of that. I honestly have not
> > > looked at the code to see if that's doable or not, but soft-linking
> > > directly in pgbench a file of psql will not help future maintenance
> > > for sure. This increases the complexity of the code.
>
> Thanks. I understand it and I agree to the last sentense. I don't
> want them to be exposed as generic features.
>
> Instaed, I'd like to separate backslash commands from psqlscan.l
> and use callbacks to access variables by ":name" syntax (it is a
> common syntax between pgbench and psql). Current psqlscan.l
> simply exits noticing of leading backslash of backslash commands
> to the caller so it would be separated without heavy
> surgery. This can put away the ugliness of VariableSpace
> overriding.

Done. Lexer for backslash commands is moved out of SQL lexer as a
standalone lexer.  Finally the SQL lexer left behind no longer is
aware of VariableSpace and simply used from pgbench by linking
the object file in psql directory.  Addition to that, PGconn
required by escape_variable() so the core of the escape feature
is moved out into callback function side.

> standard_strings() checks standard_comforming_strins on the
> session. This is necessarily called on parsing so it can be moved
> out into PsqlScanState.

Such behavior is not compatible with the current behavior. So I
made starndard_strings() a callback, too.

> psql_error() redirects messages according to queryFout and adds
> input file information. They are heavily dependent to psql. So
> I'd like to make them a callback in PsqlScanState and do fprintf
> as the default behavior (=NULL).

Done.

> These measures will get rid of the ugliness. What do you think
> about this?
>
> > Just my 0.02€:
> >
> > I understand that you suggest some kind of dynamic
> > parametrization... this is harder to do and potentially as fragile as
> > the link/ifdef option, with an undertermined set of callbacks to
> > provide... the generic version would be harder to debug, and this
> > approach would prevent changing lexer options. Basically I'm not sure
> > that doing all that for improving the handling of pgbench scripts is
> > worth the effort. I would go with the simpler option.
>
> Undetermined set of callbacks would do so but it seems to me a
> set of few known functions to deal with as shown above. The
> shared lexer deals only with SQL and a backslash at the top of a
> command.

Finally, PsqlScanState has four callback funcions and all pgbench
needs to do to use it is setting NULL to all of them and link the
object file in psql directory. No link switch/ifdef are necessary.

|  const char get_variable(const char *, bool escape, bool as_ident, void (**free_fn)(void *));
|  int enc_mblen(const char *);
|  bool standard_strings(void);
|  void error_out(const char *fmt, ...)

The attached patches are the following.

- 0001-Prepare-for-sharing-psqlscan-with-pgbench.patch
 This diff looks a bit large but most of them is cut'n-paste work and the substantial change is rather small.
 This refactors psqlscan.l into two .l files. The additional psqlscan_slash.l is a bit tricky in the point that
recreatingscan_state on transition between psqlscan.l. 
 .c files generated from .c are no longer complied as a part of mainloop.c. Thay have dedicated envelope .c files so
thatthe sql lexer can easily used outside psql. 

- 0002-Change-the-access-method-to-shell-variables.patch
 Detaches VariableSpace from psqlscan.

- 0003-Detach-common.c-from-psqlscan.patch
 Detaches PGconn from psqlscan by throwing out the session encoding stuff by ading two callbacks.

- 0004-pgbench-uses-common-frontend-SQL-parser.patch
 This looks larger than what actually it does because an if branch with a large block in process_commands() was
removed.'git diff -w' shows far small, substantial changes. 

- 0005-Change-the-way-to-hold-command-list.patch
 In the changes in 0004, SQL multistatement is passed as a linked list to the caller, then the caller assigns them on
anarray with the same data type. This patch changes the way to hold commands entirely to linked list. 

--
Kyotaro Horiguchi
NTT Open Source Software Center

Re: pgbench - allow backslash-continuations in custom scripts

From
Michael Paquier
Date:
On Thu, Jan 7, 2016 at 5:36 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Finally, PsqlScanState has four callback funcions and all pgbench
> needs to do to use it is setting NULL to all of them and link the
> object file in psql directory. No link switch/ifdef are necessary.

Am I missing something? This patch is not registered in the CF app.
Horiguchi-san, if you expect feedback, it would be good to get it
there.
-- 
Michael



Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
Mmm. I believed that this is on CF app..

At Tue, 19 Jan 2016 15:41:54 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in
<CAB7nPqQFnORG=LjDiGgD_hy_M00scx1ihn89QHx_1B9+3Vz7tQ@mail.gmail.com>
> On Thu, Jan 7, 2016 at 5:36 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > Finally, PsqlScanState has four callback funcions and all pgbench
> > needs to do to use it is setting NULL to all of them and link the
> > object file in psql directory. No link switch/ifdef are necessary.
> 
> Am I missing something? This patch is not registered in the CF app.
> Horiguchi-san, if you expect feedback, it would be good to get it
> there.

Thank you very much Michael but the CF app doesn't allow me to
regsiter new one. Filling the Description field with "pgbench -
allow backslash-continuations in custom scripts" and chose a
topic then "Find thread" shows nothing. Filling the search text
field on the "Attach thread" dialogue with the description or
giving the exact message-id gave me nothing to choose.

Maybe should I repost the patch so that the "Attach thread" can
find it as a "recent" email?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
Hello Kyotaro-san,

> Thank you very much Michael but the CF app doesn't allow me to
> regsiter new one. Filling the Description field with "pgbench -
> allow backslash-continuations in custom scripts" and chose a
> topic then "Find thread" shows nothing. Filling the search text
> field on the "Attach thread" dialogue with the description or
> giving the exact message-id gave me nothing to choose.

Strange.

You could try taking the old entry and selecting state "move to next CF"?

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
Michael Paquier
Date:
On Tue, Jan 26, 2016 at 6:51 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Mmm. I believed that this is on CF app..
>
> At Tue, 19 Jan 2016 15:41:54 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in
<CAB7nPqQFnORG=LjDiGgD_hy_M00scx1ihn89QHx_1B9+3Vz7tQ@mail.gmail.com>
>> On Thu, Jan 7, 2016 at 5:36 PM, Kyotaro HORIGUCHI
>> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> > Finally, PsqlScanState has four callback funcions and all pgbench
>> > needs to do to use it is setting NULL to all of them and link the
>> > object file in psql directory. No link switch/ifdef are necessary.
>>
>> Am I missing something? This patch is not registered in the CF app.
>> Horiguchi-san, if you expect feedback, it would be good to get it
>> there.
>
> Thank you very much Michael but the CF app doesn't allow me to
> regsiter new one.

That's perhaps a bit late I am afraid for this CF, we are at the end
of January already...

> Filling the Description field with "pgbench -
> allow backslash-continuations in custom scripts" and chose a
> topic then "Find thread" shows nothing. Filling the search text
> field on the "Attach thread" dialogue with the description or
> giving the exact message-id gave me nothing to choose.

Really? That's because the patch is marked as returned with feedback here:
https://commitfest.postgresql.org/7/319/

> Maybe should I repost the patch so that the "Attach thread" can
> find it as a "recent" email?

What if you just add it to next CF with a new entry? You are actually
proposing an entirely new patch.
-- 
Michael



Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
Hello, thank you, Febien, Micael.

# Though I have made almost no activity in the last month...

At Tue, 26 Jan 2016 13:53:33 +0100 (CET), Fabien COELHO <coelho@cri.ensmp.fr> wrote in
<alpine.DEB.2.10.1601261352210.6482@sto>
> 
> Hello Kyotaro-san,
> 
> > Thank you very much Michael but the CF app doesn't allow me to
> > regsiter new one. Filling the Description field with "pgbench -
> > allow backslash-continuations in custom scripts" and chose a
> > topic then "Find thread" shows nothing. Filling the search text
> > field on the "Attach thread" dialogue with the description or
> > giving the exact message-id gave me nothing to choose.
> 
> Strange.
> 
> You could try taking the old entry and selecting state "move to next
> CF"?

Hmm. The state of the old entry in CF2015-11 is already "Move*d*
to next CF" and it is not found in CF2016-01, as far as I saw.

At Tue, 26 Jan 2016 22:21:49 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in
<CAB7nPqRtaz2nGb-7wQJ+w1-rFyFFxkruesNBM3RcPCgXaCoSmQ@mail.gmail.com>
> > Filling the Description field with "pgbench -
> > allow backslash-continuations in custom scripts" and chose a
> > topic then "Find thread" shows nothing. Filling the search text
> > field on the "Attach thread" dialogue with the description or
> > giving the exact message-id gave me nothing to choose.
> 
> Really? That's because the patch is marked as returned with feedback here:
> https://commitfest.postgresql.org/7/319/

Ah, I have many candidates in "Attach thread" dialog. That would
be a temporary symptom of a kind of the CF-seaon-wise
meaintenance.

> > Maybe should I repost the patch so that the "Attach thread" can
> > find it as a "recent" email?
> 
> What if you just add it to next CF with a new entry? You are actually
> proposing an entirely new patch.

So, I finally could register an entry for CF2016-3.
Thank you all for the suggestion.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: pgbench - allow backslash-continuations in custom scripts

From
Robert Haas
Date:
On Thu, Jan 7, 2016 at 3:36 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> - 0001-Prepare-for-sharing-psqlscan-with-pgbench.patch
>
>   This diff looks a bit large but most of them is cut'n-paste
>   work and the substantial change is rather small.
>
>   This refactors psqlscan.l into two .l files. The additional
>   psqlscan_slash.l is a bit tricky in the point that recreating
>   scan_state on transition between psqlscan.l.

I've looked at this patch a few times now but find it rather hard to
verify.  I am wondering if you would be willing to separate 0001 into
subpatches.  For example, maybe there could be one or two patches that
ONLY move code around and then one or more patches that make the
changes to that code.  Right now, for example, psql_scan_setup() is
getting three additional arguments, but it's also being moved to a
different file.  Perhaps those two things could be done one at a time.

I also think this patch could really benefit from a detailed set of
submission notes that specifically lay out why each change was made
and why.  For instance, I see that psqlscan.l used yyless() while
psqlscanbody.l uses a new my_yyless() you've defined.  There is
probably a great reason for that and I'm sure if I stare at this for
long enough I can figure out what that reason is, but it would be
better if you had a list of bullet points explaining what was changed
and why.

I would really like to see this patch committed; my problem is that I
don't have enough braincells to be sure that it's correct in the
present form.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
Hello, thank you for reviewing this.

> On Thu, Jan 7, 2016 at 3:36 AM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > - 0001-Prepare-for-sharing-psqlscan-with-pgbench.patch
> >
> >   This diff looks a bit large but most of them is cut'n-paste
> >   work and the substantial change is rather small.
> >
> >   This refactors psqlscan.l into two .l files. The additional
> >   psqlscan_slash.l is a bit tricky in the point that recreating
> >   scan_state on transition between psqlscan.l.
>
> I've looked at this patch a few times now but find it rather hard to
> verify.  I am wondering if you would be willing to separate 0001 into
> subpatches.  For example, maybe there could be one or two patches that
> ONLY move code around and then one or more patches that make the
> changes to that code.  Right now, for example, psql_scan_setup() is
> getting three additional arguments, but it's also being moved to a
> different file.  Perhaps those two things could be done one at a time.

I tried to split it into patches with some meaningful (I thought)
steps, but I'll arrange them if it is not easy to read.

> I also think this patch could really benefit from a detailed set of
> submission notes that specifically lay out why each change was made
> and why.  For instance, I see that psqlscan.l used yyless() while
> psqlscanbody.l uses a new my_yyless() you've defined.  There is
> probably a great reason for that and I'm sure if I stare at this for
> long enough I can figure out what that reason is, but it would be
> better if you had a list of bullet points explaining what was changed
> and why.

I'm sorry, but I didn't understood the 'submission notes' exactly
means. Is it precise descriptions in source comments? or commit
message of git-commit?

> I would really like to see this patch committed; my problem is that I
> don't have enough braincells to be sure that it's correct in the
> present form.

Thank you. I'll send the rearranged patch sooner.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






Re: pgbench - allow backslash-continuations in custom scripts

From
Robert Haas
Date:
On Mon, Feb 15, 2016 at 1:04 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> I'm sorry, but I didn't understood the 'submission notes' exactly
> means. Is it precise descriptions in source comments? or commit
> message of git-commit?

Write a detailed email explaining each change that is part of the
patch and why it is there.  Attach the patch to that same email.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
Hello,

At Tue, 16 Feb 2016 08:05:10 -0500, Robert Haas 
<robertmhaas@gmail.com> wrote in 
<CA+TgmoavXzXVV_k-89SbgMKB-Eyp+RpSKu_0tPGqx_ceEk=kCQ@mail.gmail.com>
> On Mon, Feb 15, 2016 at 1:04 AM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > I'm sorry, but I didn't understood the 'submission notes' exactly
> > means. Is it precise descriptions in source comments? or commit
> > message of git-commit?
>
> Write a detailed email explaining each change that is part of the
> patch and why it is there.  Attach the patch to that same email.

Sorry for the silly question. I'll try to describe this in
detail. I hope you are patient enough to read my clumsy (but a
bit long) text.

First, I rebased the previous patch set and merged three of
them. Now they are of three patches.


1. Making SQL parser part of psqlscan independent from psql.
  Moved psql's baskslsh command stuff out of original psqlscan.l  and some psql stuff the parser directly looked are
usedvia a  set of callback functions, which can be all NULL for usages  from other than psql.
 

2. Making pgbench to use the new psqlscan parser.

3. Changing the way to hold SQL/META commands from array to   linked list.
   The #2 introduced linked list to store SQL multistatement but   immediately the caller moves the elements into an
array.This   patch totally changes the way to linked list.
 

I'll explain in detail of each patch.


1. Make SQL parser part of psqlscan independent from psql.

The new file psqlscan_slashbody.l is the moved-out part of
psqlscan.l. The prefix of flex symbols are changed to "yys".

The psqlscan(_slash).c are container C source which does what is
previously done by mainloop.c. This makes other bin/ programs to
link psqlscan.o directly, without compilation.

1.1. The psqlscanbody.l

It is the SQL part of old psqlscan.l but the difference between
them is a bit bothersome to see. I attached the diff between them
as "psqlscanbody.l.diff" for convenience.

1.1.1. Switching between two lexers.

The SQL parser and the metacommand parser should be alternately
callable each other but yyparse() cannot parse from intermediate
position of the input text. So I provided functions to remake a
parser state using an input text beginning at just after the
position already parsed.  psql_scan_switch_lexer() and
psql_scan_slash_command_switch_lexer() do that for psqlscan.l and
psqlscan_backslash.l respectively.

I haven't found no variable available to know how many characters
the parser has eaten so I decided to have a counter for the usage
as PsqlScanState.curpos and keep it pointing just after the last
letter already parsed. It is rather a common way to advance it
using YY_USER_ACTION and I did so, but the parser occasionally
steps back using yyless() when it reads some sequences. Hence
yyless() should decrement .curpos, but flex has no hook point for
the purpose. I defined my_yyless() (the name should need to be
changed) macro to do this.


1.1.2.  Detaching psqlscan from psql stuff

psqlscan.l had depended on psql through pset.vars, pset.encoding,
pset.db and psql_erorr(). I have modified the parser to access
them via callback functions. psql_scan_setup() now has the new
fourth parameter PsqlScanCallbacks to receive them. The two
callbacks of them, standard_strings and error_out wouldn't need
detailed explanation, but enc_mblen and get_variable would need.

pset.encoding was used only to check if PQmblen is applicable
then to be given to PQmblen. So I provided the callback enc_mblen
which could be given if strings can be encoded using it. The
another one, get_variable() is an equivalent of a wrapper
function of GetVariable(). GetVariable() was called directly in
lexer definitions and indirectly via escape_variable() in
psqlscan.l but escape_variable() was accessing pset.db only in
order to check existence of an active connection. I could give it
via another callback, but I have moved out the part of the
function accessing it because it is accessed only in the
function.

Finally, I defined the callbacks in common.c.

1.2. The reason for the names psqlscanbody.l and psqlscan_slashbody.l.

psqlscan.l was finally included as psqlscan.c by mainloop.c. The
reason is postgresql_fe.h must be read before psqlscan.c on some
platform, according to the comment at the end of mainloop.c. But
it is an annoyance when using it from other bin/
programs. Therefore, I provided dedicated .c files to do so for
the two lexer .c files. In order to make the name of the file to
be linked from outside psql be psqlscan.o, I have renamed the *.l
files to *body.l.

1.3 The psqlscan_int.h file

As the new psqlscan.o is used from outside psql, psqlscan.h
should have only definitions needed to do so. psqlscan_int.h
defines some stuffs used by the lexers but not necessary to use
them.

1.4 Other files

Other files that are not mentioned so far, Makefile, common.h,
psqlscan_slash.h and startup.c would'nt be neccesary to be
explained.



2. Making pgbench to use the new psqlscan parser.

By the patch #2, pgbench.c gets mainly two major
modifications. Splitting of process_commands() and adding
backslash-continuation feature to the added function
process_backslash_commands().

2.1. process_commands() has been splitted into two functions

The function process_commands() has been splitted into
process_commands() and new function
process_backslash_commands(). The former has been made to use
psqlscan. In contrast to psql, pgbench first checks if the input
on focus is a backslash command or not in process_commands(),
then parses it using psqlscan if it was not a backslash
command. process_backslash_commands() is a cut out from the
original process_command() and it is modified to handle backslash
continuation.  Both functions read multiple lines in a context so
the processing context is to be made by the caller
(i.e. process_file) and kept throughout all input lines.

2.2 backslash continuation in process_backslash_commands().

The loop over input lines in the old process_commands() is
refactored to handle SQL statements and backslash commands
separately. The most part of the new process_backslash_commands()
is almost the same with the corresponding part in the old
process_commands() except indentation. "git diff -b -patience"
gives the far-less-noisy differences so I attached it as the
"pgbench.c.patient.diff" for convenience.


3. Changing the way to hold SQL/META commands from array to   linked list.

The patch #2 adds a new member "next" to Command_t. It is
necessary to handle SQL multistatements. But the caller
process_file holds the commands in an array of Commant_t. This is
apparently not in undesirable form. Since the liked list seems
simpler for this usage (for me), I decided to unify them to the
linked list. Most of this patch is rather simple one by one
replacement.


I hope this submission note makes sense and this patch becomes
easy to read.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: pgbench - allow backslash-continuations in custom scripts

From
David Steele
Date:
On 2/18/16 6:54 AM, Kyotaro HORIGUCHI wrote:

> First, I rebased the previous patch set and merged three of
> them. Now they are of three patches.
>
>
> 1. Making SQL parser part of psqlscan independent from psql.
>
>     Moved psql's baskslsh command stuff out of original psqlscan.l
>     and some psql stuff the parser directly looked are used via a
>     set of callback functions, which can be all NULL for usages
>     from other than psql.
>
> 2. Making pgbench to use the new psqlscan parser.
>
> 3. Changing the way to hold SQL/META commands from array to
>      linked list.
>
>      The #2 introduced linked list to store SQL multistatement but
>      immediately the caller moves the elements into an array. This
>      patch totally changes the way to linked list.

Any takers to review this updated patch?

-- 
-David
david@pgmasters.net



Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
Hello David,

> Any takers to review this updated patch?

I intend to have a look at it, I had a look at a previous instance, but 
I'm ok if someone wants to proceed.

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
David Steele
Date:
Hi Fabien,

On 3/14/16 3:27 PM, Fabien COELHO wrote:

>> Any takers to review this updated patch?
>
> I intend to have a look at it, I had a look at a previous instance, but
> I'm ok if someone wants to proceed.

There's not exactly a long line of reviewers at the moment so if you
could do a followup review that would be great.

Thanks,
--
-David
david@pgmasters.net


Re: pgbench - allow backslash-continuations in custom scripts

From
Fabien COELHO
Date:
>> I intend to have a look at it, I had a look at a previous instance, but
>> I'm ok if someone wants to proceed.
>
> There's not exactly a long line of reviewers at the moment so if you
> could do a followup review that would be great.

Ok. It is in the queue, not for right know, though.

-- 
Fabien.



Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
At Tue, 15 Mar 2016 14:55:52 +0100 (CET), Fabien COELHO <coelho@cri.ensmp.fr> wrote in
<alpine.DEB.2.10.1603151455170.23831@sto>
> 
> >> I intend to have a look at it, I had a look at a previous instance,
> >> but
> >> I'm ok if someone wants to proceed.
> >
> > There's not exactly a long line of reviewers at the moment so if you
> > could do a followup review that would be great.
> 
> Ok. It is in the queue, not for right know, though.

Thank you.

This patchset needs "make maintainer-clean" before applying
because it adds src/bin/psql/psqlscan.c, which is currently
generated by flex. All of the patches apply but with many offsets
so I rebased them. The two subsidiary diffs are the same to the
previous patch set.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


Re: pgbench - allow backslash-continuations in custom scripts

From
Robert Haas
Date:
<div dir="ltr">On Thu, Feb 18, 2016 at 6:54 AM, Kyotaro HORIGUCHI <span dir="ltr"><<a
href="mailto:horiguchi.kyotaro@lab.ntt.co.jp"target="_blank">horiguchi.kyotaro@lab.ntt.co.jp</a>></span> wrote:<br
/><divclass="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1pxsolid rgb(204,204,204);padding-left:1ex">It is the SQL part of old psqlscan.l but the difference
between<br/> them is a bit bothersome to see. I attached the diff between them<br /> as "psqlscanbody.l.diff" for
convenience.<br/></blockquote></div><br /></div><div class="gmail_extra">This is a huge diff, and I don't see that
you'veexplained the reason for all the changes.  For example:<br /><br />-/*<br />- * We use a stack of flex buffers to
handlesubstitution of psql variables.<br />- * Each stacked buffer contains the as-yet-unread text from one psql
variable.<br/>- * When we pop the stack all the way, we resume reading from the outer buffer<br />- * identified by
scanbufhandle.<br/>- */<br />-typedef struct StackElem<br />-{<br />-       YY_BUFFER_STATE buf;            /* flex
inputcontrol structure */<br />-       char       *bufstring;          /* data actually being scanned by flex *<br
/>/<br/>-       char       *origstring;         /* copy of original data, if needed */<br />-       char      
*varname;           /* name of variable providing data, or N<br />ULL */<br />-       struct StackElem *next;<br />-}
StackElem;<br/><br /></div><div class="gmail_extra">Perhaps we could separate this part of the code motion into its own
preliminarypatch?  I see this went to psqlscan_int.h, but there's no obvious reason for that particular name, and the
commentsdon't explain it; in fact, they say that's psqlscan.h.  psqlscan_slash.h has the same problem; perhaps moving
thingsthere could be another preliminary patch.<br /><br />-                                       yyless(0);<br
/>+                                      my_yyless(0);<br /><br clear="all" /></div><div class="gmail_extra">Why do we
needto do this?  Is "my_" really the best prefix?  Is this another change that could be its own patch?<br /><br
/></div><divclass="gmail_extra">-- <br /><div class="gmail_signature">Robert Haas<br />EnterpriseDB: <a
href="http://www.enterprisedb.com"target="_blank">http://www.enterprisedb.com</a><br />The Enterprise PostgreSQL
Company</div></div></div>

Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
Thank you for the comment, but could you please tell me what kind
of criteria should I take to split this patch? The discussion
about splitting criteria is in the following reply (in the
sentence begins with "By the way").

At Wed, 16 Mar 2016 11:57:25 -0400, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+Tgmobvu1aBRdRaKvqMVp0ifhQpgvnOEZa2Rg3AHfRWPE5-Tg@mail.gmail.com>
> On Thu, Feb 18, 2016 at 6:54 AM, Kyotaro HORIGUCHI <
> horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> 
> > It is the SQL part of old psqlscan.l but the difference between
> > them is a bit bothersome to see. I attached the diff between them
> > as "psqlscanbody.l.diff" for convenience.
> >
> 
> This is a huge diff, and I don't see that you've explained the reason for
> all the changes.  For example:
> 
> -/*
> - * We use a stack of flex buffers to handle substitution of psql variables.
> - * Each stacked buffer contains the as-yet-unread text from one psql
> variable.
> - * When we pop the stack all the way, we resume reading from the outer
> buffer
> - * identified by scanbufhandle.
> - */
> -typedef struct StackElem
> -{
> -       YY_BUFFER_STATE buf;            /* flex input control structure */
> -       char       *bufstring;          /* data actually being scanned by
> flex *
> /
> -       char       *origstring;         /* copy of original data, if needed
> */
> -       char       *varname;            /* name of variable providing data,
> or N
> ULL */
> -       struct StackElem *next;
> -} StackElem;
> 
> Perhaps we could separate this part of the code motion into its own
> preliminary patch?

The "preliminary patch" seems to mean the same thing with the
first patch in the following message.

http://www.postgresql.org/message-id/20160107.173603.31865003.horiguchi.kyotaro@lab.ntt.co.jp

The commit log says as the following.

| Subject: [PATCH 1/5] Prepare for sharing psqlscan with pgbench.
| 
| Lexer is no longer compiled as a part of mainloop.c.  The slash
| command lexer is brought out from the command line lexer.  psql_scan
| no longer accesses directly to pset struct and VariableSpace. This
| change allows psqlscan to be used without these things.

The following two patches are the follwings.

| Subject: [PATCH 2/5] Change the access method to shell variables
| 
| Access to shell variables via a callback function so that the lexer no
| longer need to be aware of VariableSpace.

| Subject: [PATCH 3/5] Detach common.c from psqlscan
| 
| Call standard_strings() and psql_error() via callback functions so
| that psqlscan.l can live totally without common.c stuff. They are
| bundled up with get_variable() callback in one struct since now we
| have as many as four callback functions.

These patches are made so as to keep the compilable and workable
state of the source files. It might be a bit more readable if
unshackled from such restriction.

>  I see this went to psqlscan_int.h, but there's no
> obvious reason for that particular name, and the comments don't explain it;

I assumed that is a convention of naming by looking libpq-int.h
(though it doesn't use underscore, but hyphen). But the file
needs a comment like libpq-int.h. I'll rename it and add such
comment to it.

By the way, the patch set mentioned above gives such preliminary
steps separately. Should I make the next patch set based on the
older one which is separating the preliminary steps? Or in new
splitting criteria that is free from the compilable-workable
restriction is preferable?

> in fact, they say that's psqlscan.h.  psqlscan_slash.h has the same
> problem; perhaps moving things there could be another preliminary patch.

It is also included in the 0001 patch.

> -                                       yyless(0);
> +                                       my_yyless(0);
> 
> Why do we need to do this?  Is "my_" really the best prefix?  Is this
> another change that could be its own patch?

Oops! Sorry for the silly name. I was not able to think up a
proper name for it. Does psqlscan_yyless seems good?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: pgbench - allow backslash-continuations in custom scripts

From
Robert Haas
Date:
On Thu, Mar 17, 2016 at 4:12 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Thank you for the comment, but could you please tell me what kind
> of criteria should I take to split this patch? The discussion
> about splitting criteria is in the following reply (in the
> sentence begins with "By the way").

Well, I'm trying to find a piece of this patch that is small enough
that I can understand it and in good enough shape that I can commit it
independently, but I am having some difficulty with that.  I keep
hoping some other committer is going to come along and be able to grok
this well enough to apply it based on what you've already done, but so
far it seems to be the all-me show.

> These patches are made so as to keep the compilable and workable
> state of the source files. It might be a bit more readable if
> unshackled from such restriction.

Keeping it compilable and workable after each patch is essential, but
the first patch is still big and doing a lot of stuff.  I'm wondering
if it can be further decomposed.

>>  I see this went to psqlscan_int.h, but there's no
>> obvious reason for that particular name, and the comments don't explain it;
>
> I assumed that is a convention of naming by looking libpq-int.h
> (though it doesn't use underscore, but hyphen). But the file
> needs a comment like libpq-int.h. I'll rename it and add such
> comment to it.

OK.

>> -                                       yyless(0);
>> +                                       my_yyless(0);
>>
>> Why do we need to do this?  Is "my_" really the best prefix?  Is this
>> another change that could be its own patch?
>
> Oops! Sorry for the silly name. I was not able to think up a
> proper name for it. Does psqlscan_yyless seems good?

That does sound better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pgbench - allow backslash-continuations in custom scripts

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Well, I'm trying to find a piece of this patch that is small enough
> that I can understand it and in good enough shape that I can commit it
> independently, but I am having some difficulty with that.  I keep
> hoping some other committer is going to come along and be able to grok
> this well enough to apply it based on what you've already done, but so
> far it seems to be the all-me show.

This is mostly a flex/bison hack, isn't it?  If you like I'll take it.
        regards, tom lane



Re: pgbench - allow backslash-continuations in custom scripts

From
Robert Haas
Date:
On Fri, Mar 18, 2016 at 10:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Well, I'm trying to find a piece of this patch that is small enough
>> that I can understand it and in good enough shape that I can commit it
>> independently, but I am having some difficulty with that.  I keep
>> hoping some other committer is going to come along and be able to grok
>> this well enough to apply it based on what you've already done, but so
>> far it seems to be the all-me show.
>
> This is mostly a flex/bison hack, isn't it?  If you like I'll take it.

I would be delighted if you would.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pgbench - allow backslash-continuations in custom scripts

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Mar 18, 2016 at 10:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This is mostly a flex/bison hack, isn't it?  If you like I'll take it.

> I would be delighted if you would.

I've committed changes equivalent to Horiguchi-san's 0001 and 0002
patches, though rather different in detail.  I concur with the upthread
opinion that 0003 doesn't seem really necessary.

This solves the problem of allowing SQL commands in scripts to span
lines, but it doesn't do anything about backslash commands, which was
the original point according to the thread title ;-).  I can think of
two somewhat-independent changes we might want to make at this point,
since we're breaking exact script compatibility for 9.6 anyway:

* Allow multiple backslash commands on one line, eg\set foo 5 \set bar 6
The main reason for that is that psql allows it, and one of the things
we're supposedly trying to do here is reduce the behavioral distance
between psql and pgbench parsing rules.

* Allow backslash commands to span lines, probably by adopting the
rule that backslash immediately followed by newline is to be ignored
within a backslash command.  This would not be compatible with psql,
though, at least not unless we wanted to change psql too.

I don't have strong feelings about either.
        regards, tom lane



Re: pgbench - allow backslash-continuations in custom scripts

From
"David G. Johnston"
Date:
On Sunday, March 20, 2016, Tom Lane <tgl@sss.pgh.pa.us> wrote:

* Allow backslash commands to span lines, probably by adopting the
rule that backslash immediately followed by newline is to be ignored
within a backslash command.  This would not be compatible with psql,
though, at least not unless we wanted to change psql too.


This would be appreciated.  The main case I find wanting this is writing out long \copy expressions.  Solving really complex ones using temporary tables works but being able to spread it out over multiple lines would be a welcomed addition.

David J.

Re: pgbench - allow backslash-continuations in custom scripts

From
Robert Haas
Date:
On Sun, Mar 20, 2016 at 1:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Mar 18, 2016 at 10:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> This is mostly a flex/bison hack, isn't it?  If you like I'll take it.
>
>> I would be delighted if you would.
>
> I've committed changes equivalent to Horiguchi-san's 0001 and 0002
> patches, though rather different in detail.  I concur with the upthread
> opinion that 0003 doesn't seem really necessary.
>
> This solves the problem of allowing SQL commands in scripts to span
> lines, ...

Excellent.

> but it doesn't do anything about backslash commands, which was
> the original point according to the thread title ;-).

Wait, was it really?  I'd been thinking it was mostly to continue
queries, not metacommands, but maybe I missed the boat.

> I can think of
> two somewhat-independent changes we might want to make at this point,
> since we're breaking exact script compatibility for 9.6 anyway:
>
> * Allow multiple backslash commands on one line, eg
>         \set foo 5 \set bar 6
> The main reason for that is that psql allows it, and one of the things
> we're supposedly trying to do here is reduce the behavioral distance
> between psql and pgbench parsing rules.

This seems to me to be going in the wrong direction.

> * Allow backslash commands to span lines, probably by adopting the
> rule that backslash immediately followed by newline is to be ignored
> within a backslash command.  This would not be compatible with psql,
> though, at least not unless we wanted to change psql too.

This might have some point to it, though, if you want to say \set i
<incredibly long expression not easily contained on a single line>

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pgbench - allow backslash-continuations in custom scripts

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Mar 20, 2016 at 1:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This solves the problem of allowing SQL commands in scripts to span
>> lines, ...

> Excellent.

>> but it doesn't do anything about backslash commands, which was
>> the original point according to the thread title ;-).

> Wait, was it really?  I'd been thinking it was mostly to continue
> queries, not metacommands, but maybe I missed the boat.

Nah, you're right, it was about continuing queries.  Still, we've had
complaints about the other thing too, and I think if we're going to
change anything here, we should change it all in the same release.

>> I can think of
>> two somewhat-independent changes we might want to make at this point,
>> since we're breaking exact script compatibility for 9.6 anyway:
>> 
>> * Allow multiple backslash commands on one line, eg
>> \set foo 5 \set bar 6
>> The main reason for that is that psql allows it, and one of the things
>> we're supposedly trying to do here is reduce the behavioral distance
>> between psql and pgbench parsing rules.

> This seems to me to be going in the wrong direction.

Um, why exactly?  That psql behavior is of really ancient standing, and
we have not had complaints about it.

>> * Allow backslash commands to span lines, probably by adopting the
>> rule that backslash immediately followed by newline is to be ignored
>> within a backslash command.  This would not be compatible with psql,
>> though, at least not unless we wanted to change psql too.

> This might have some point to it, though, if you want to say \set i
> <incredibly long expression not easily contained on a single line>

Shall I make a patch that allows backslash-newline to be handled this way
in both psql and pgbench backslash commands?  At least in psql, there
would be no backwards compatibility problem, since right now the case
just fails:

regression=# \set x y \
Invalid command \. Try \? for help.
        regards, tom lane



Re: pgbench - allow backslash-continuations in custom scripts

From
Robert Haas
Date:
On Mon, Mar 21, 2016 at 3:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Wait, was it really?  I'd been thinking it was mostly to continue
>> queries, not metacommands, but maybe I missed the boat.
>
> Nah, you're right, it was about continuing queries.  Still, we've had
> complaints about the other thing too, and I think if we're going to
> change anything here, we should change it all in the same release.

Fair enough.

>>> * Allow multiple backslash commands on one line, eg
>>> \set foo 5 \set bar 6
>>> The main reason for that is that psql allows it, and one of the things
>>> we're supposedly trying to do here is reduce the behavioral distance
>>> between psql and pgbench parsing rules.
>
>> This seems to me to be going in the wrong direction.
>
> Um, why exactly?  That psql behavior is of really ancient standing, and
> we have not had complaints about it.

I think that's mostly because the psql metacommands are ridiculously
impoverished.  I'm guessing that pgbench's expression language is
eventually going to support strings as a data type, for example, and
those strings might want to contain backlashes.  There's basically no
value in cramming multiple metacommands onto a single line, but there
is the risk of creating unnecessary lexing or parsing difficulties in
the future.

>>> * Allow backslash commands to span lines, probably by adopting the
>>> rule that backslash immediately followed by newline is to be ignored
>>> within a backslash command.  This would not be compatible with psql,
>>> though, at least not unless we wanted to change psql too.
>
>> This might have some point to it, though, if you want to say \set i
>> <incredibly long expression not easily contained on a single line>
>
> Shall I make a patch that allows backslash-newline to be handled this way
> in both psql and pgbench backslash commands?  At least in psql, there
> would be no backwards compatibility problem, since right now the case
> just fails:
>
> regression=# \set x y \
> Invalid command \. Try \? for help.

I certainly don't object to such a patch, although if it's between you
writing that patch and you getting Tomas Vondra's multivariate
statistics stuff committed, I'll take the latter.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pgbench - allow backslash-continuations in custom scripts

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Mar 21, 2016 at 3:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Um, why exactly?  That psql behavior is of really ancient standing, and
>> we have not had complaints about it.

> I think that's mostly because the psql metacommands are ridiculously
> impoverished.  I'm guessing that pgbench's expression language is
> eventually going to support strings as a data type, for example, and
> those strings might want to contain backlashes.

Sure, but once we define strings, they'll be quoted, and the behavior
can/should/will be different for a backslash inside quotes than one
outside them --- as it is already in psql.  Moreover, if you're on
board with the backslash-newline proposal, you've already bought into
the idea that backslashes outside quotes will behave differently from
those inside.

>> Shall I make a patch that allows backslash-newline to be handled this way
>> in both psql and pgbench backslash commands?

> I certainly don't object to such a patch, although if it's between you
> writing that patch and you getting Tomas Vondra's multivariate
> statistics stuff committed, I'll take the latter.  :-)

I'll get to that, but I'd like to get this area fully dealt with before
context-swapping to that one.
        regards, tom lane



Re: pgbench - allow backslash-continuations in custom scripts

From
Robert Haas
Date:
On Mon, Mar 21, 2016 at 3:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Mar 21, 2016 at 3:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Um, why exactly?  That psql behavior is of really ancient standing, and
>>> we have not had complaints about it.
>
>> I think that's mostly because the psql metacommands are ridiculously
>> impoverished.  I'm guessing that pgbench's expression language is
>> eventually going to support strings as a data type, for example, and
>> those strings might want to contain backlashes.
>
> Sure, but once we define strings, they'll be quoted, and the behavior
> can/should/will be different for a backslash inside quotes than one
> outside them --- as it is already in psql.  Moreover, if you're on
> board with the backslash-newline proposal, you've already bought into
> the idea that backslashes outside quotes will behave differently from
> those inside.

Mmph.  I just don't see any benefit in being able to start a command
in the middle of a line.  Even if it's not dangerous to future things
we might want o implement, I'd argue it's still poor style, and of no
practical utility.  But if I lose that argument, then I do.

>>> Shall I make a patch that allows backslash-newline to be handled this way
>>> in both psql and pgbench backslash commands?
>
>> I certainly don't object to such a patch, although if it's between you
>> writing that patch and you getting Tomas Vondra's multivariate
>> statistics stuff committed, I'll take the latter.  :-)
>
> I'll get to that, but I'd like to get this area fully dealt with before
> context-swapping to that one.

Understood.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pgbench - allow backslash-continuations in custom scripts

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Mmph.  I just don't see any benefit in being able to start a command
> in the middle of a line.

I agree that there's no functional benefit; it's a matter of consistency.
In particular, psql has always allowed you to write multiple SQL commands
per line:
   SELECT 2+2; SELECT x FROM tab; SELECT y FROM othertab;

and as of yesterday pgbench supports that as well.  So allowing multiple
backslash commands on a line improves consistency both with psql and with
pgbench's own behavior, IMV.
        regards, tom lane



Re: pgbench - allow backslash-continuations in custom scripts

From
Tom Lane
Date:
So I looked into this, and found that persuading psql to let backslash
commands cross line boundaries is a much bigger deal than just fixing the
lexer.  The problem is that MainLoop would need to grow an understanding
of having received only a partial backslash command and needing to go back
to readline() for another line.  And probably HandleSlashCmds would need
to be changed to stop parsing and back out without doing anything when it
hits backslash-newline.  It's do-able no doubt, but it's not going to be a
small and simple patch.

However, since pgbench is already set up to slurp the entire file and
lex it in one go, it is just a trivial adjustment to the lexer rules in
that program.  The only thing I found that made it complicated is that
syntax_error() had too simplistic an understanding of how to position
the error cursor usefully, so that needed a bit of work.

I think it'd be okay to commit this without fixing psql at the same time;
if you try it in psql you get an error, so on that side it's unimplemented
behavior rather than an actual incompatibility.  Perhaps somebody will be
motivated to fix it later, but I'm not going to spend that kind of time
on it right now.

I've not written a docs update, but otherwise I think this is committable.
Comments?

            regards, tom lane

diff --git a/src/bin/pgbench/exprscan.l b/src/bin/pgbench/exprscan.l
index 825dacc..12b5f7e 100644
*** a/src/bin/pgbench/exprscan.l
--- b/src/bin/pgbench/exprscan.l
***************
*** 6,17 ****
   *
   * This lexer supports two operating modes:
   *
!  * In INITIAL state, just parse off whitespace-separated words (this mode
!  * is basically equivalent to strtok(), which is what we used to use).
   *
   * In EXPR state, lex for the simple expression syntax of exprparse.y.
   *
!  * In either mode, stop upon hitting newline or end of string.
   *
   * Note that this lexer operates within the framework created by psqlscan.l,
   *
--- 6,18 ----
   *
   * This lexer supports two operating modes:
   *
!  * In INITIAL state, just parse off whitespace-separated words.  (This mode
!  * is basically equivalent to strtok(), which is what we used to use.)
   *
   * In EXPR state, lex for the simple expression syntax of exprparse.y.
   *
!  * In either mode, stop upon hitting newline, end of string, or unquoted
!  * backslash (except that backslash-newline is silently swallowed).
   *
   * Note that this lexer operates within the framework created by psqlscan.l,
   *
*************** extern void expr_yyset_column(int column
*** 61,69 ****
  alpha            [a-zA-Z_]
  digit            [0-9]
  alnum            [a-zA-Z0-9_]
! /* {space} + {nonspace} + {newline} should cover all characters */
  space            [ \t\r\f\v]
! nonspace        [^ \t\r\f\v\n]
  newline            [\n]

  /* Exclusive states */
--- 62,71 ----
  alpha            [a-zA-Z_]
  digit            [0-9]
  alnum            [a-zA-Z0-9_]
! /* {space} + {nonspace} + {backslash} + {newline} should cover all characters */
  space            [ \t\r\f\v]
! nonspace        [^ \t\r\f\v\\\n]
! backslash        [\\]
  newline            [\n]

  /* Exclusive states */
*************** newline            [\n]
*** 98,103 ****
--- 100,113 ----

  {space}+        { /* ignore */ }

+ {backslash}{newline}    { /* ignore */ }
+
+ {backslash}        {
+                     /* do not eat, and report end of command */
+                     yyless(0);
+                     return 0;
+                 }
+
  {newline}        {
                      /* report end of command */
                      last_was_newline = true;
*************** newline            [\n]
*** 130,143 ****
                      return FUNCTION;
                  }

  {newline}        {
                      /* report end of command */
                      last_was_newline = true;
                      return 0;
                  }

- {space}+        { /* ignore */ }
-
  .                {
                      /*
                       * must strdup yytext so that expr_yyerror_more doesn't
--- 140,161 ----
                      return FUNCTION;
                  }

+ {space}+        { /* ignore */ }
+
+ {backslash}{newline}    { /* ignore */ }
+
+ {backslash}        {
+                     /* do not eat, and report end of command */
+                     yyless(0);
+                     return 0;
+                 }
+
  {newline}        {
                      /* report end of command */
                      last_was_newline = true;
                      return 0;
                  }

  .                {
                      /*
                       * must strdup yytext so that expr_yyerror_more doesn't
*************** expr_yyerror_more(yyscan_t yyscanner, co
*** 177,183 ****
      /*
       * While parsing an expression, we may not have collected the whole line
       * yet from the input source.  Lex till EOL so we can report whole line.
!      * (If we're at EOF, it's okay to call yylex() an extra time.)
       */
      if (!last_was_newline)
      {
--- 195,201 ----
      /*
       * While parsing an expression, we may not have collected the whole line
       * yet from the input source.  Lex till EOL so we can report whole line.
!      * (If we're at backslash/EOF, it's okay to call yylex() an extra time.)
       */
      if (!last_was_newline)
      {
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 4196b0e..e947f77 100644
*** a/src/bin/pgbench/pgbench.c
--- b/src/bin/pgbench/pgbench.c
*************** syntax_error(const char *source, int lin
*** 2413,2426 ****
      fprintf(stderr, "\n");
      if (line != NULL)
      {
!         fprintf(stderr, "%s\n", line);
!         if (column >= 0)
          {
              int            i;

!             for (i = 0; i < column; i++)
!                 fprintf(stderr, " ");
!             fprintf(stderr, "^ error found here\n");
          }
      }
      exit(1);
--- 2413,2455 ----
      fprintf(stderr, "\n");
      if (line != NULL)
      {
!         /*
!          * Multi-line backslash commands make this harder than you'd think; we
!          * have to identify which line to put the error cursor on.  So print
!          * one line at a time.
!          */
!         for (;;)
          {
+             const char *nlpos = strchr(line, '\n');
+             int            len;
              int            i;

!             if (nlpos)
!             {
!                 /*
!                  * It's tempting to use fprintf("%.*s"), but that can fail if
!                  * glibc has a different idea of the encoding than we do.
!                  */
!                 len = nlpos - line + 1;
!                 for (i = 0; i < len; i++)
!                     fputc(line[i], stderr);
!             }
!             else
!             {
!                 len = column + 1;        /* ensure we print ^ if not done */
!                 fprintf(stderr, "%s\n", line);
!             }
!             if (column >= 0 && column < len)
!             {
!                 for (i = 0; i < column; i++)
!                     fputc(' ', stderr);
!                 fprintf(stderr, "^ error found here\n");
!             }
!             column -= len;
!             if (nlpos)
!                 line = nlpos + 1;
!             else
!                 break;
          }
      }
      exit(1);

Re: pgbench - allow backslash-continuations in custom scripts

From
Kyotaro HORIGUCHI
Date:
First, thank you all involved, and thank you for polishing this
and committing, Tom.

At Mon, 21 Mar 2016 17:15:18 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in <1596.1458594918@sss.pgh.pa.us>
> So I looked into this, and found that persuading psql to let backslash
> commands cross line boundaries is a much bigger deal than just fixing the
> lexer.  The problem is that MainLoop would need to grow an understanding
> of having received only a partial backslash command and needing to go back
> to readline() for another line.  And probably HandleSlashCmds would need
> to be changed to stop parsing and back out without doing anything when it
> hits backslash-newline.  It's do-able no doubt, but it's not going to be a
> small and simple patch.

I agree.

> However, since pgbench is already set up to slurp the entire file and
> lex it in one go, it is just a trivial adjustment to the lexer rules in
> that program.  The only thing I found that made it complicated is that
> syntax_error() had too simplistic an understanding of how to position
> the error cursor usefully, so that needed a bit of work.

The modified lexer treats {backslash}{newline} as the same with
whitespace and it looks ok for me.

> /test.sql:6: syntax error, unexpected FUNCTION, expecting $end in command "set"
> \set naccounts\
> 10x0
>    ^ error found here

The error message seems fine. (The mysterious message would be
another problem.) But it prints the lines after the error indicator.

> \set naccounts\
> 10x0\
>    ^ error found here
> * :scale

I suppose that the trailing lines might be better not be
printed. (gcc doesn't seem to do so.)

> I think it'd be okay to commit this without fixing psql at the same time;
> if you try it in psql you get an error, so on that side it's unimplemented
> behavior rather than an actual incompatibility.  Perhaps somebody will be
> motivated to fix it later, but I'm not going to spend that kind of time
> on it right now.
> 
> I've not written a docs update, but otherwise I think this is committable.
> Comments?
> 
>             regards, tom lane

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center





Re: pgbench - allow backslash-continuations in custom scripts

From
Alvaro Herrera
Date:
Tom Lane wrote:
> So I looked into this, and found that persuading psql to let backslash
> commands cross line boundaries is a much bigger deal than just fixing the
> lexer.  The problem is that MainLoop would need to grow an understanding
> of having received only a partial backslash command and needing to go back
> to readline() for another line.  And probably HandleSlashCmds would need
> to be changed to stop parsing and back out without doing anything when it
> hits backslash-newline.  It's do-able no doubt, but it's not going to be a
> small and simple patch.

FWIW, I would love to see this in some future release: particularly for
\copy lines with large queries, the limitation that only single-line
input is accepted is very annoying -- much more so when the query comes
pasted from some other input, or when you have a file with a query and
just want to add a quick \copy prefix.

(Hmm, a much simpler alternative would be to allow \g-like syntax, i.e.
the query is already in the query buffer and the \copy line just
specifies the output file.  In fact, for queries in history this is much
more convenient than the current syntax ...)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services