Thread: Concurrent psql API

Concurrent psql API

From

Tom Lane

Date:

08 April 2008, 18:11:11

[ redirecting to -hackers since -patches isn't the place for general
discussion of feature specifications ]

Gregory Stark <stark@enterprisedb.com> writes:
> So based on the feedback and suggestions here this is the interface I suggest:

> \connect&   - to open a new connection keeping the existing one
> \g&         - to submit a command asynchronously (like & in the shell)
> \S [Sess#]  - to _S_witch to a different _S_ession
>             - if no connection # specified list available _S_essions
> \D          - _D_isconnect from current session (like ^D in the shell)

This is still the latest API suggestion for concurrent psql, right?

After reflecting on it for awhile it seems to me that the use of
automatically assigned numbers as connection IDs is kind of a wart.
It makes it difficult if not impossible to write context-insensitive
script fragments, and even for interactive use it doesn't seem
especially convenient.  How about naming connections with user-assigned
strings, instead, eg
    \connect& name [ optional connect params ]
    \S name

This would require choosing a name for the default session, maybe "-".
Or you could use "1" if you figured that people really would prefer
numbers as IDs.

I'm not real thrilled with overloading \S with two fundamentally
different behaviors, either.  Can't we find a different string to assign
to the listing purpose?  Maybe \S without parameter should mean to
switch to the default session.

> Another thought I had for the future is a \C command to simulate C-c and send
> a query cancel. That would let us have regression tests that query
> cancellation worked.

Do we really need a regression test for that?  I notice \C is already
taken.  In general the space of backslash command names is taken up
densely enough that eating single-letter names for marginal functions
doesn't seem wise.  (I also question giving \D a single-letter name.)

But the part of the API that really seems like a wart is

+       <varlistentry>
+         <term><varname>ASYNC_DELAY</varname></term>
+         <listitem>
+         <para>
+         Wait up to this period of time (in milliseconds) for output prior to
+         any connection switch. If no asynchronous command is pending or if any
+         output arrives <application>psql</> may not wait the full specified
+         time.
+         </para>

There is no way to select a correct value for ASYNC_DELAY --- any value
you might pick could be too small if the machine is heavily loaded.
In any case for safety's sake you'd need to pick values much larger than
(you think) are really needed, which is not cool for something we hope to
use in regression tests.  Those of us who routinely run the tests many
times a day will scream pretty loudly if they start spending most of
their time waiting --- and the prospect of random failures on heavily
loaded buildfarm members is not appetizing either.

What seems possibly more useful is to reintroduce \cwait (or hopefully
some better name) and give it the semantics of "wait for a response from
any active connection; switch to the first one to respond, printing its
name, and print its result".

This would lead to code like, say,
    \c& conn1
    \c& conn2
    ...
    \S conn1
    CREATE INDEX ...  \g&
    \S conn2
    CREATE INDEX ...  \g&
    ...
    \cwait
    \cwait

The number of \cwaits you need is exactly equal to the number of
async commands you've issued.  For regression testing purposes
you'd need to design the script to ensure that only one of the
connections is expected to respond next, but that seems necessary
anyway --- and you don't need any extra checks to catch the case
that you get an unexpected early response from another one.

Hmm, this still seems a bit notation-heavy, doesn't it?  What if \g&
takes an arg indicating which connection to issue the command on:
    \c& conn1
    \c& conn2
    ...
    CREATE INDEX ...  \g& conn1
    CREATE INDEX ...  \g& conn2
    ...
    \cwait
    \cwait

Not totally sure about that one, but issuing a command on a background
connection seems appealing for scripting purposes.  It eliminates the
risk that the query response comes back before you manage to switch away
from the connection; which would be bad because it would mess up your
count of how many cwait's you need.  It seems a bit more analogous to
the use of & in shell scripts, too, where you implicitly fork away from
the async command.  (Maybe c& shouldn't make the new connection
foreground either?)
        regards, tom lane
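
The counting rule above (one \cwait per async command, no fixed delay) can be tried out today without psql. What follows is only a minimal Python sketch of the "wait for whichever responds first" semantics, using threads as stand-ins for background connections. The names fake_create_index and run_script are made up for the illustration, and nothing here talks to a real server.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def fake_create_index(name, seconds):
    # Hypothetical stand-in for a long-running command on a background connection.
    time.sleep(seconds)
    return f"CREATE INDEX ({name})"


def run_script():
    responses = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Two submissions in the spirit of "\g& connN"; neither blocks the caller.
        pending = {
            pool.submit(fake_create_index, "conn1", 0.2): "conn1",
            pool.submit(fake_create_index, "conn2", 0.05): "conn2",
        }
        # Block until some connection answers, report which one it was, repeat.
        # No delay value has to be guessed, however loaded the machine is.
        while pending:
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for fut in done:
                responses.append((pending.pop(fut), fut.result()))
    return responses


if __name__ == "__main__":
    for name, result in run_script():
        print(name, result)
```

The loop ends exactly when every submitted command has been collected, which is the property the proposal relies on: the script waits as long as needed and no longer.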

Re: Concurrent psql API

From

Tom Lane

Date:

08 April 2008, 18:56:24

I wrote:
> What seems possibly more useful is to reintroduce \cwait (or hopefully
> some better name) and give it the semantics of "wait for a response from
> any active connection; switch to the first one to respond, printing its
> name, and print its result".

It strikes me that with these semantics, \cwait is a lot like a thread
join operation, so we could call it \join or \j.
        regards, tom lane

Re: Concurrent psql API

From

Alvaro Herrera

Date:

08 April 2008, 19:19:18

Tom Lane wrote:
> I wrote:
> > What seems possibly more useful is to reintroduce \cwait (or hopefully
> > some better name) and give it the semantics of "wait for a response from
> > any active connection; switch to the first one to respond, printing its
> > name, and print its result".
> 
> It strikes me that with these semantics, \cwait is a lot like a thread
> join operation, so we could call it \join or \j.

FWIW on POSIX shell there's something similar called "wait".

http://www.opengroup.org/onlinepubs/009695399/utilities/wait.html

Perhaps we should define the operator after these semantics -- these
guys have probably hashed up a good interface.  Basically it means we
would have a "\cwait [n ...]" command meaning "wait for the connection
'n' to return".

If we do that, we can then have multiple commands in flight on
regression tests, and wait for them in whatever deterministic order we
choose, regardless of which one finishes execution first.

However, the no-operands version of POSIX wait means "wait for all
commands" instead of "wait for any command".  Perhaps we could have
"\cwait -" as meaning "wait for any command".

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
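
As a rough illustration of the "wait for connection n" idea, here is a small Python sketch in which the script decides the order results are collected in, regardless of which command finishes first. Threads stand in for connections; the labels "a" and "b" and the function names are assumptions made for the example only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def command(label, seconds):
    # Hypothetical stand-in for a query running on a named connection.
    time.sleep(seconds)
    return f"{label} done"


def wait_in_fixed_order():
    with ThreadPoolExecutor(max_workers=2) as pool:
        in_flight = {
            "a": pool.submit(command, "a", 0.1),   # slower command
            "b": pool.submit(command, "b", 0.01),  # faster command
        }
        # Analogous to "\cwait a" followed by "\cwait b": "b" finishes first,
        # but its result is only reported when the script asks for it.
        return [in_flight[name].result() for name in ("a", "b")]


if __name__ == "__main__":
    print(wait_in_fixed_order())
```

Because the collection order is written into the script, the expected output of a regression test would not depend on timing.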

Re: Concurrent psql API

From

Tom Lane

Date:

08 April 2008, 20:05:42

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> It strikes me that with these semantics, \cwait is a lot like a thread
>> join operation, so we could call it \join or \j.

> FWIW on POSIX shell there's something similar called "wait".
> http://www.opengroup.org/onlinepubs/009695399/utilities/wait.html
> Perhaps we should define the operator after these semantics -- these
> guys have probably hashed up a good interface.  Basically it means we
> would have a "\cwait [n ...]" command meaning "wait for the connection
> 'n' to return".

I was thinking about this some more while out running an errand, and
came to the same conclusion that "\cwait connID" would be a good thing
to have.

> However, the no-operands version of POSIX wait means "wait for all
> commands" instead of "wait for any command".  Perhaps we could have
> "\cwait -" as meaning "wait for any command".

That would require prohibiting "-" as a connection ID, but maybe that's
an OK price for acting like a known standard.

Another thought that came to me while driving around is that it seems
bogus to offer a prompt when attached to a connection that can't
actually accept a command right now.  I know that psql can get into that
state after a connection dies, but it's still a wart, and not really
something we should design into normal operations.  Furthermore, I don't
see the reason for switching to a connection with an active async
command unless you are desiring to wait for that command's result.
So I'm thinking we could unify \S with \cwait.  This leads to the
following proposals:

sql-command \g& connID

    If connID is idle and not the current connection, issue
    sql-command on it, but do *not* switch to that connection.
    (There are various possibilities on what to do in the
    corner cases where it's busy or the current connection.
    If it's busy, we could throw error, or do a forced \join
    before issuing the command.  If it's the current connection,
    my inclination is to treat this exactly like \g, ie wait
    for the result.)
    Also, if connID is not a known ID, we could automatically
    create it as a clone of the current connection; which'd
    eliminate the need for explicit \connect& in many cases.
    (OTOH that might be too vulnerable to typos.)

\join connID

    Switch to connection connID.  If it is busy, wait for
    command completion and print the result before offering
    a new command prompt.

\join    (or \join - as per Alvaro)

    Wait for any currently busy connection's command to finish,
    then \join to it.  Error if there is no busy connection.

While there's still a possible use for \D (disconnect) in this
scheme, I'm not sure how interesting it is.  In any case disconnecting
the active session is a bogus behavior; you should only be able
to disconnect a non-active, idle one.
        regards, tom lane
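
The rules proposed above can be written down as a toy state model, which may make the corner cases easier to argue about. This is a sketch only, not psql code: the class SessionModel, the default name "main", and the complete() hook (which plays the part of the server finishing a command) are all invented for the illustration. Where the proposal leaves a choice open, the sketch picks the stricter option: an error for a busy or unknown connID.

```python
class SessionModel:
    """Toy model of the proposed \\g& connID and \\join rules (no real connections)."""

    def __init__(self, default="main"):
        self.current = default
        self.state = {default: "idle"}  # connection name -> "idle" or "busy"
        self.results = {}               # connection name -> finished result

    def connect(self, name):
        if name in self.state:
            raise ValueError(f"connection {name!r} already exists")
        self.state[name] = "idle"

    def submit(self, conn_id, sql):
        # Models: sql-command \g& connID
        if conn_id not in self.state:
            raise KeyError(conn_id)           # assumed: no automatic clone
        if conn_id == self.current:
            return f"ran synchronously: {sql}"  # treated exactly like \g
        if self.state[conn_id] == "busy":
            raise RuntimeError(f"{conn_id} is busy")  # assumed: throw error
        self.state[conn_id] = "busy"          # note: self.current is unchanged
        return None

    def complete(self, conn_id, result):
        # Test hook standing in for the server finishing the command.
        if self.state[conn_id] != "busy":
            raise RuntimeError(f"{conn_id} has no command in flight")
        self.results[conn_id] = result

    def join(self, conn_id=None):
        # Models: \join connID, and \join with no argument.
        if conn_id is None:
            busy = [n for n, s in self.state.items() if s == "busy"]
            if not busy:
                raise RuntimeError("no busy connection")
            finished = [n for n in busy if n in self.results]
            if not finished:
                raise RuntimeError("model cannot block: nothing finished yet")
            conn_id = finished[0]
        elif conn_id not in self.state:
            raise KeyError(conn_id)
        if self.state[conn_id] == "busy" and conn_id not in self.results:
            raise RuntimeError("model cannot block: command still running")
        self.current = conn_id
        if self.state[conn_id] == "busy":
            self.state[conn_id] = "idle"
            return self.results.pop(conn_id)
        return None
```

A real implementation would block where the model raises "cannot block"; the model only captures which connection is current and which are busy after each step.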

Re: Concurrent psql API

From

Shane Ambler

Date:

08 April 2008, 23:36:50

Tom Lane wrote:

>     \connect& name [ optional connect params ]
>     \S name
> 
> This would require choosing a name for the default session, maybe "-".
> Or you could use "1" if you figured that people really would prefer
> numbers as IDs.

+1 with name as a string, when an empty string is passed a numerical 
sequence is used as default.

> I'm not real thrilled with overloading \S with two fundamentally
> different behaviors, either.  Can't we find a different string to assign
> to the listing purpose?  Maybe \S without parameter should mean to
> switch to the default session.

I think it seems fine. Fits with \h and \d behaviour.


> Hmm, this still seems a bit notation-heavy, doesn't it?  What if \g&
> takes an arg indicating which connection to issue the command on:
> 
>     \c& conn1
>     \c& conn2
>     ...
>     CREATE INDEX ...  \g& conn1
>     CREATE INDEX ...  \g& conn2
>     ...
>     \cwait
>     \cwait

+1 on the \g& but I would reverse the syntax -

\g& conn1 CREATE INDEX...;

> Not totally sure about that one, but issuing a command on a background
> connection seems appealing for scripting purposes.  It eliminates the
> risk that the query response comes back before you manage to switch away
> from the connection; which would be bad because it would mess up your
> count of how many cwait's you need.  It seems a bit more analogous to
> the use of & in shell scripts, too, where you implicitly fork away from
> the async command.  (Maybe c& shouldn't make the new connection
> foreground either?)

\c& for a new foreground connection
\cb& for a new background connection?




-- 

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz

Re: Concurrent psql API

From

Shane Ambler

Date:

08 April 2008, 23:42:23

Tom Lane wrote:

> \join connID
> 
>     Switch to connection connID.  If it is busy, wait for
>     command completion and print the result before offering
>     a new command prompt.

When switching to a conn we also need a non-destructive way out if it is 
busy.

> \join    (or \join - as per Alvaro)
> 
>     Wait for any currently busy connection's command to finish,
>     then \join to it.  Error if there is no busy connection.
> 

So what you suggest is that if you have 10 busy conns running \join will 
send you to the next conn to return a result?

On that - listing the current conns could be useful to have some status 
info with the list to indicate idle or running what command.

-- 

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz

Re: Concurrent psql API

From

Tom Lane

Date:

09 April 2008, 00:37:46

Shane Ambler <pgsql@Sheeky.Biz> writes:
> +1 on the \g& but I would reverse the syntax -
> \g& conn1 CREATE INDEX...;

No, not good.  If the command requires multiple lines then this creates
an action-at-a-distance behavior.  Thought experiment: what would you
expect here:
    \g& conn1
    CREATE INDEX z (
    <oops, made a mistake>
    \r
    CREATE INDEX q ...;

And whichever behavior you'd "expect", how would you get the other
one when you needed it?  Hidden state sucks.

(Yeah, this argument probably appeals to people who like RPN calculators
more than those who don't...)

psql's established behavior is that \g is issued after the command
it affects, and we should not change that.
        regards, tom lane
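
The hidden-state objection can be made concrete with a toy interpreter for the prefix form. This is a sketch under assumptions, not a model of psql's actual parser: run_prefix and its reset_clears_target switch are invented here to show that two equally plausible rules send the same script to different connections.

```python
def run_prefix(lines, reset_clears_target):
    """Interpret a prefix-style script; return (target, sql) pairs that get sent."""
    target, buffer, sent = None, [], []
    for line in lines:
        if line.startswith(r"\g& "):
            target = line.split()[1]      # hidden state, set before any SQL exists
        elif line == r"\r":
            buffer = []                   # reset the query buffer
            if reset_clears_target:
                target = None             # one possible rule: \r forgets the target
        else:
            buffer.append(line)
            if line.endswith(";"):
                sent.append((target or "current", " ".join(buffer)))
                target, buffer = None, []
    return sent


script = [r"\g& conn1", "CREATE INDEX z (", r"\r", "CREATE INDEX q ...;"]

if __name__ == "__main__":
    print(run_prefix(script, reset_clears_target=True))
    print(run_prefix(script, reset_clears_target=False))
```

With the postfix form there is nothing to carry across the \r, so the question of which rule applies never comes up.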

Re: Concurrent psql API

From

Tom Lane

Date:

09 April 2008, 00:42:27

Shane Ambler <pgsql@Sheeky.Biz> writes:
> When switching to a conn we also need a non-destructive way out if it is 
> busy.

Uh, why?  Why would you switch to a connection at all, if you didn't
want its result?

This is a pretty fundamental issue, and insisting that you want that
behavior will make both the user's mental model and the implementation
a whole lot more complex.  I'm not going to accept unsupported arguments
that it might be a nice thing to have.

> So what you suggest is that if you have 10 busy conns running \join will 
> send you to the next conn to return a result?

Right.

> On that - listing the current conns could be useful to have some status 
> info with the list to indicate idle or running what command.

Sure, some status-inquiry commands could be added without fundamentally
affecting anything.
        regards, tom lane

Re: Concurrent psql API

From

Gregory Stark

Date:

09 April 2008, 08:24:37

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Tom Lane wrote:
>>> It strikes me that with these semantics, \cwait is a lot like a thread
>>> join operation, so we could call it \join or \j.
>
>> FWIW on POSIX shell there's something similar called "wait".
>> http://www.opengroup.org/onlinepubs/009695399/utilities/wait.html
>> Perhaps we should define the operator after these semantics -- these
>> guys have probably hashed up a good interface.  Basically it means we
>> would have a "\cwait [n ...]" command meaning "wait for the connection
>> 'n' to return".
>
> I was thinking about this some more while out running an errand, and
> came to the same conclusion that "\cwait connID" would be a good thing
> to have.

I threw out cwait because it seemed to me that to write any kind of reliable
regression test you would end up having to put a cwait with a timeout on every
connection switch.

Consider a simple regression test to test that update locks out concurrent
updaters:

1 begin;
1 update t where i=1
UPDATE 1
<switch to connection 2>
2 begin;
2 update t where i=1
<switch to connection 1>
1 commit;
COMMIT
<switch to connection 2>
UPDATE 1

So here what you really want to test is that the second update blocks. If we
don't wait at all we might very well miss the UPDATE message because we just
flew past it too fast. In fact IIRC that's exactly what I saw.

> While there's still a possible use for \D (disconnect) in this
> scheme, I'm not sure how interesting it is.  In any case disconnecting
> the active session is a bogus behavior; you should only be able
> to disconnect a non-active, idle one.

Unless you're specifically trying to test that things get cleaned up properly
when the session rolls back... But yeah, I only put it in for the sake of
completeness at the time.

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
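
The "does the second update block?" test can be imitated with a plain lock, which shows where a timeout necessarily enters. This is a sketch only: a threading.Lock stands in for the row lock, and the function second_update_blocks and its timeout parameter are assumptions for the illustration.

```python
import threading


def second_update_blocks(timeout=0.2):
    row_lock = threading.Lock()     # stand-in for the row lock on i=1
    finished = threading.Event()

    row_lock.acquire()              # session 1: begin; update (lock held)

    def session2():
        with row_lock:              # session 2: update, has to wait for session 1
            finished.set()

    worker = threading.Thread(target=session2)
    worker.start()

    # All a test can observe is "no response within the timeout".
    blocked = not finished.wait(timeout)

    row_lock.release()              # session 1: commit
    worker.join()                   # session 2's update now completes
    return blocked, finished.is_set()


if __name__ == "__main__":
    print(second_update_blocks())
```

The first value only says that session 2 had not answered within the timeout. If the update wrongly failed to block but was slow to respond, a short timeout would still report it as blocked, which is the difficulty with choosing any fixed delay.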

Re: Concurrent psql API

From

Shane Ambler

Date:

09 April 2008, 11:30:41

Tom Lane wrote:
> Shane Ambler <pgsql@Sheeky.Biz> writes:
>> When switching to a conn we also need a non-destructive way out if it is 
>> busy.
> 
> Uh, why?  Why would you switch to a connection at all, if you didn't
> want its result?

What if you switch to the wrong connection and it hasn't finished. Do 
you then have to wait until you get the results before you can issue 
another command? Or will we be able to type commands while we wait for 
results?

I am thinking as currently happens - you can't type a command as you are 
waiting for a result. So if the connection you switch to is busy but you 
want to go to another connection then how do you?

This may tie into an 'auto new connection'. You start psql enter a 
command that will take a while then think of something else you can do 
as you wait. Do you open another shell and start psql again, or send the 
working task to the background and enter another command in a new 
connection?

Think jobs in a shell, you can suspend a long running process then send 
it to the background to work and go on with something else.

So I am thinking something like C-z that will allow you to switch out of 
a task that is waiting for results without having to stop it with C-c.

-- 

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz

Re: Concurrent psql API

From

Alvaro Herrera

Date:

09 April 2008, 12:40:02

Shane Ambler wrote:

> Think jobs in a shell, you can suspend a long running process then send  
> it to the background to work and go on with something else.
>
> So I am thinking something like C-z that will allow you to switch out of  
> a task that is waiting for results without having to stop it with C-c.

I agree -- we would need to have a mode on which it is "not on any
connection", to which we could switch on C-z.  If all connections are
busy, there's no way to create a new one otherwise.

It makes sense if we continue with the shell analogy: the shell prompt
is not any particular task.  Either there is a task running in
foreground (in which case we have no prompt, but we can press C-z to
suspend the current task and get a prompt), or there isn't (in which
case we have a prompt.)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Concurrent psql API

From

Tom Lane

Date:

09 April 2008, 14:28:08

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Shane Ambler wrote:
>> So I am thinking something like C-z that will allow you to switch out of  
>> a task that is waiting for results without having to stop it with C-c.

> I agree -- we would need to have a mode on which it is "not on any
> connection", to which we could switch on C-z.  If all connections are
> busy, there's no way to create a new one otherwise.

That would work okay for interactive use and not at all for scripts,
which makes it kind of a nonstarter.  I'm far from convinced that the
case must be handled anyway.  If you fat-finger a SQL command the
consequences are likely to be far worse than having to wait a bit,
so why is it so critical to be able to recover from a typo in a \join
argument?

(I'm also unconvinced that there won't be severe implementation
difficulties in supporting a control-Z-like interrupt --- we don't have
any terminal signals left to use AFAIK.  And what about Windows?)

> It makes sense if we continue with the shell analogy: the shell prompt
> is not any particular task.  Either there is a task running in
> foreground (in which case we have no prompt, but we can press C-z to
> suspend the current task and get a prompt), or there isn't (in which
> case we have a prompt.)

This is nonsense.  When you have a shell prompt, you are connected to a
shell that will take a command right now.
        regards, tom lane

Re: Concurrent psql API

From

Decibel!

Date:

09 April 2008, 16:17:35

On Apr 9, 2008, at 12:27 PM, Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Shane Ambler wrote:
>>> So I am thinking something like C-z that will allow you to switch  
>>> out of
>>> a task that is waiting for results without having to stop it with  
>>> C-c.
>
>> I agree -- we would need to have a mode on which it is "not on any
>> connection", to which we could switch on C-z.  If all connections are
>> busy, there's no way to create a new one otherwise.
>
> That would work okay for interactive use and not at all for scripts,
> which makes it kind of a nonstarter.

I can't see any need to do this in a script, and in fact I don't  
think shell scripting supports it. Totally different story for  
interactive use. Anyone using *nix is likely to be familiar with how  
job control works in shells and expecting psql to work the same way.  
We should try and follow the shell standard as much as possible just  
so that people don't have to re-train themselves.

> I'm far from convinced that the
> case must be handled anyway.  If you fat-finger a SQL command the
> consequences are likely to be far worse than having to wait a bit,
> so why is it so critical to be able to recover from a typo in a \join
> argument?

I find myself doing this frequently with any long-running command,  
but currently it's a PITA because I'm doing it at the shell level and  
firing up a new psql: more work than should be necessary, and psql  
sometimes gets confused when you resume it from the background in  
interactive mode (stops echoing characters, though maybe this has  
been fixed).

> (I'm also unconvinced that there won't be severe implementation
> difficulties in supporting a control-Z-like interrupt --- we don't  
> have
> any terminal signals left to use AFAIK.  And what about Windows?)

That might be true. I don't know if we could use ^z anyway; the shell  
might have different ideas there.

>> It makes sense if we continue with the shell analogy: the shell  
>> prompt
>> is not any particular task.  Either there is a task running in
>> foreground (in which case we have no prompt, but we can press C-z to
>> suspend the current task and get a prompt), or there isn't (in which
>> case we have a prompt.)
>
> This is nonsense.  When you have a shell prompt, you are connected  
> to a
> shell that will take a command right now.

You're always connected to the shell, but if you background something  
in the shell it becomes a stand-alone job that you're not connected  
to. You could even think of it as every command you run being a job,  
it's just a question of if you're actually connected to it or not.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Re: Concurrent psql API

From

Shane Ambler

Date:

09 April 2008, 16:33:27

Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Shane Ambler wrote:
>>> So I am thinking something like C-z that will allow you to switch out of  
>>> a task that is waiting for results without having to stop it with C-c.
> 
>> I agree -- we would need to have a mode on which it is "not on any
>> connection", to which we could switch on C-z.  If all connections are
>> busy, there's no way to create a new one otherwise.
> 
> That would work okay for interactive use and not at all for scripts,
> which makes it kind of a nonstarter.  I'm far from convinced that the
> case must be handled anyway.  If you fat-finger a SQL command the
> consequences are likely to be far worse than having to wait a bit,
> so why is it so critical to be able to recover from a typo in a \join
> argument?

I can see that a non-connected prompt would interfere with a script but 
I would think that a prompt should always be linked to a connection. It 
may work to get an un-connected prompt made available from C-z which 
could be limited to only allow new connections or \join commands which 
would also be limited to interactive input.

My first thoughts were that C-z would either drop back to the previous 
connection or create a new connection either based on the initial login 
or the connection you are C-z'ing out of. This would be the tricky 
decider though which may make a limited prompt viable.

C-z input detection may also be limited to the wait for query response 
loop so that it is only available if the current connection is without a 
prompt.

I do think it is useful for more than typos in the \join command. What 
about a slip where you forget to \g& the command. Or you start a query 
that seems to be taking too long, background it and look into what is 
happening. This would be more helpful to those that ssh into a machine 
then run psql from there.

> (I'm also unconvinced that there won't be severe implementation
> difficulties in supporting a control-Z-like interrupt --- we don't have
> any terminal signals left to use AFAIK.  And what about Windows?)

That may be so and could be the decider over whether this can be added 
or not.

Unless Windows steals the input before psql gets it I don't see there 
will be a problem there. Windows may be a factor in deciding which key 
to use for this command if it is to be uniform across platforms.

-- 

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz

Re: Concurrent psql API

From

Csaba Nagy

Date:

10 April 2008, 05:17:42

On Thu, 2008-04-10 at 05:03 +0930, Shane Ambler wrote:
> I do think it is useful for more than typos in the \join command. What 
> about a slip where you forget to \g& the command. Or you start a query 
> that seems to be taking too long, background it and look into what is 
> happening. This would be more helpful to those that ssh into a machine 
> then run psql from there.

For interactive use in the above mentioned scenario you can use the
'screen' command and start as many psqls as needed ('man screen' to see
what it can do). I would probably always use screen instead of psql's
multisession capability in interactive use. I do want to instantly see
what is currently running, and a psql screen cluttered with multiple
results will not make that easier. Even a list method of what is running
will only help if it actually shows the complete SQL for all running
sessions and that will be a PITA if the SQLs are many and big. Multiple
screens are much better at that.

So from my POV scripting should be the main case for such a feature...
and there it would be welcome if it would be made easy to synchronize
the different sessions.

Cheers,
Csaba.

Re: [OT] Concurrent psql API

From

Csaba Nagy

Date:

10 April 2008, 05:32:03

> I find myself doing this frequently with any long-running command,  
> but currently it's a PITA because I'm doing it at the shell level and  
> firing up a new psql: more work than should be necessary, and psql  
> sometimes gets confused when you resume it from the background in  
> interactive mode (stops echoing characters, though maybe this has  
> been fixed).

I would recommend trying out the 'screen' utility (see my other post
too). And here you find a nice .screenrc too which will show you a
status bar of your active session, I find it super cool (and it's well
commented if you don't like it as it is):

http://home.insightbb.com/~bmsims1/Scripts/Screenrc.html

The man page has all commands you need, the most used by me:

Ctrl-a Ctrl-c -> open a new session;
Ctrl-a A -> name the session (will show up with that name in the status
bar, note that the second key is a capital A not a);
Ctrl-a Ctrl-a -> switch to the last viewed session;
Ctrl-a <n> -> switch to the <n>th session, where <n> is a digit 0-9

I usually leave the screen sessions running and detach only the
terminal, and then I can connect again to the already set up sessions
using "screen -R". It's a real time saver.

It has many more facilities, and creating a new psql session is just
Ctrl-a Ctrl-c and then type in psql... and you're good to go... I don't
think you can beat that by a large margin with psql-intern commands (you
still need to type in something extra), and you do have added benefits
of clearly separated workflows and a nice overview of it.

Cheers,
Csaba.

Re: Concurrent psql API

From

Gregory Stark

Date:

10 April 2008, 06:02:39

"Csaba Nagy" <nagy@ecircle-ag.com> writes:

> For interactive use in the above mentioned scenario you can use the
> 'screen' command and start as many psqls as needed

Sure, or you could just start multiple xterms or emacs shell buffers 
(my preferred setup).

But I'm sure there are people who would prefer C-z too.

> So from my POV scripting should be the main case for such a feature...
> and there it would be welcome if it would be made easy to synchronize
> the different sessions.

I think it's the main case, that's why I didn't implement C-z at all. But I
think we should keep it as a design consideration and not preclude it in the
future.

Hm. I had a thought though. Perhaps C-z should just immediately start a new
connection. That would perhaps maintain the shell metaphor the way Tom was
thinking where you're always at a usable prompt. That might suck if you're at
a password-authenticated connection.

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

Re: Concurrent psql API

From

Alvaro Herrera

Date:

10 April 2008, 11:22:57

So, Greg, after all this feedback, are you going to rework the patch?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Concurrent psql API

From

Gregory Stark

Date:

10 April 2008, 11:46:20

"Alvaro Herrera" <alvherre@commandprompt.com> writes:

> So, Greg, after all this feedback, are you going to rework the patch?

I'm a bit busy now but yes, eventually.

I had in mind that it would probably make sense to start over, stealing code
as appropriate. The main thing is that the logic is a bit twisted now since I
originally had it as a prefix command you gave before issuing the sql. As a
postfix command, \g&, the logic could be a bit simpler.

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

Re: Concurrent psql API

From

Tom Lane

Date:

10 April 2008, 14:04:30

Gregory Stark <stark@enterprisedb.com> writes:
> "Csaba Nagy" <nagy@ecircle-ag.com> writes:
>> For interactive use in the above mentioned scenario you can use the
>> 'screen' command and start as many psqls as needed

> Sure, or you could just start multiple xterms or emacs shell buffers 
> (my preferred setup).

Yeah, that's an awfully good point, and I have to admit I'd generally
prefer multiple xterms too.

> But I'm sure there are people who would prefer C-z too.

AFAICT, supporting C-z will add a pretty significant increment of
definitional complexity, implementation complexity, and portability
risks to what otherwise could be a relatively small patch.  I don't
want to buy into that just because "some people might use it".

I note also that if we start trapping C-z, it would stop working
for what it works for now, namely suspending psql so you can do
something else in that window.

So, +1 for thinking about this entirely as a scripting feature.
        regards, tom lane

Re: Concurrent psql API

From

Simon Riggs

Date:

23 April 2008, 11:17:31

On Tue, 2008-04-08 at 17:10 -0400, Tom Lane wrote:

> What seems possibly more useful is to reintroduce \cwait (or hopefully
> some better name) and give it the semantics of "wait for a response from
> any active connection; switch to the first one to respond, printing its
> name, and print its result".
> 
> This would lead to code like, say,
> 
>     \c& conn1
>     \c& conn2
>     ...
>     \S conn1
>     CREATE INDEX ...  \g&
>     \S conn2
>     CREATE INDEX ...  \g&
>     ...
>     \cwait
>     \cwait
> 
> The number of \cwaits you need is exactly equal to the number of
> async commands you've issued.  For regression testing purposes
> you'd need to design the script to ensure that only one of the
> connections is expected to respond next, but that seems necessary
> anyway --- and you don't need any extra checks to catch the case
> that you get an unexpected early response from another one.
> 
> Hmm, this still seems a bit notation-heavy, doesn't it?  What if \g&
> takes an arg indicating which connection to issue the command on:
> 
>     \c& conn1
>     \c& conn2
>     ...
>     CREATE INDEX ...  \g& conn1
>     CREATE INDEX ...  \g& conn2
>     ...
>     \cwait
>     \cwait
> 
> Not totally sure about that one, but issuing a command on a background
> connection seems appealing for scripting purposes.  It eliminates the
> risk that the query response comes back before you manage to switch away
> from the connection; which would be bad because it would mess up your
> count of how many cwait's you need.  It seems a bit more analogous to
> the use of & in shell scripts, too, where you implicitly fork away from
> the async command.  (Maybe c& shouldn't make the new connection
> foreground either?)

Yes, I think the \g& conn syntax seems useful. Good thinking.

I agree also that the \S syntax has problems and we would wish to avoid
them. I would still like a way to change the default background session.
That will considerably reduce the number of changes people would need to
make to long scripts in order to be able to use this facility.

For example, if we have a script with 100 commands in it, we may find that
commands 1-50 and 51-100 are in two groups. Commands 1-50 are each
dependent upon the previous command, as are 51-100. But the two groups
are independent of each other.

If we use the \g& syntax only, we would need to make 100 changes to the
script to send commands to the right session. If we had the capability
to say "use this background session as the default session to send
commands to", then we would be able to add parallelism to the script by
just making 2 changes: one prior to command 1 and one prior to command
51.
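
The arithmetic above can be illustrated with a toy Python model. This is only a sketch and not psql code: `SessionRouter`, `set_default`, and `submit` are hypothetical names standing in for the proposed "default background session" command and for `\g& conn`. It counts how many script annotations each style needs for 100 commands split into two groups.

```python
from collections import defaultdict


class SessionRouter:
    """Toy model of routing commands to named sessions (hypothetical, not psql)."""

    def __init__(self):
        self.queues = defaultdict(list)  # session name -> commands, in order
        self.default = None              # current default background session

    def set_default(self, name):
        # Models the proposed "use this background session by default" command.
        self.default = name

    def submit(self, command, session=None):
        # An explicit session models "\g& conn"; otherwise use the default.
        target = session if session is not None else self.default
        if target is None:
            raise ValueError("no session named and no default session set")
        self.queues[target].append(command)


def route_explicit(commands, split_at):
    """Every command names its session: one annotation per command."""
    router, annotations = SessionRouter(), 0
    for i, cmd in enumerate(commands):
        router.submit(cmd, "conn1" if i < split_at else "conn2")
        annotations += 1
    return router, annotations


def route_with_default(commands, split_at):
    """Only the group boundaries are annotated."""
    router, annotations = SessionRouter(), 0
    for i, cmd in enumerate(commands):
        if i == 0 or i == split_at:
            router.set_default("conn1" if i == 0 else "conn2")
            annotations += 1
        router.submit(cmd)
    return router, annotations


commands = [f"command {n}" for n in range(1, 101)]
explicit, n_explicit = route_explicit(commands, 50)
default, n_default = route_with_default(commands, 50)
print(n_explicit, n_default)  # 100 2
```

Both styles put the same commands on the same sessions in the same order; only the number of edits to the script differs.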

The original \S command had that capability, but was designed to
actually change into that session, giving the problems discussed.
Something like \S (don't care what syntax, though) would definitely
simplify scripting, which I think will translate directly into fewer
bugs for users.

I note \b is available... short for "background". I really don't care
what we call that command, though; I just want the capability.

Also, I don't want to have to count cwaits, so I'd like a command to say
"wait for all background sessions that have active statements" and for
that to be the default. For simplicity, \cwait would do this by default.
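
The difference between the two waiting styles can be sketched in Python. This is an analogy only, not how psql would implement it: threads stand in for background sessions, and `run_statement`, `cwait_any`, and `cwait_all` are hypothetical names. `cwait_any` models "wait for the first connection to respond", which has to be called once per outstanding command; `cwait_all` models the proposed default of waiting for every session with an active statement.

```python
import time
from concurrent.futures import (
    ALL_COMPLETED,
    FIRST_COMPLETED,
    ThreadPoolExecutor,
    wait,
)


def run_statement(name, seconds):
    """Stand-in for a long-running statement on the session called `name`."""
    time.sleep(seconds)
    return name


def cwait_any(pending):
    """Block until one session responds; return its name and what is left."""
    done, not_done = wait(pending, return_when=FIRST_COMPLETED)
    first = done.pop()
    # Sessions that finished but were not reported stay pending for later waits.
    return first.result(), not_done | done


def cwait_all(pending):
    """Block until every session with an active statement has responded."""
    done, _ = wait(pending, return_when=ALL_COMPLETED)
    return sorted(f.result() for f in done)


with ThreadPoolExecutor(max_workers=2) as pool:
    # Style 1: one wait per asynchronous command, so the script must count them.
    pending = {
        pool.submit(run_statement, "conn1", 0.05),
        pool.submit(run_statement, "conn2", 0.01),
    }
    first, pending = cwait_any(pending)
    second, pending = cwait_any(pending)

    # Style 2: a single wait covers all outstanding commands.
    pending = {
        pool.submit(run_statement, "conn1", 0.05),
        pool.submit(run_statement, "conn2", 0.01),
    }
    finished = cwait_all(pending)

print(sorted([first, second]), finished)
```

With the counting style, adding or removing an asynchronous command means adjusting the number of waits; with the wait-for-all style the final wait is unchanged.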

So this script

    \c& conn1
    \c& conn2
    ...
    ALTER TABLE ... ADD PRIMARY KEY \g& conn1
    ALTER TABLE ... ADD FOREIGN KEY \g& conn1
    ALTER TABLE ... ADD FOREIGN KEY \g& conn1
    ALTER TABLE ... ADD FOREIGN KEY \g& conn1
    ...

    ALTER TABLE ... ADD PRIMARY KEY \g& conn2
    ALTER TABLE ... ADD FOREIGN KEY \g& conn2
    ALTER TABLE ... ADD FOREIGN KEY \g& conn2
    ALTER TABLE ... ADD FOREIGN KEY \g& conn2
    ALTER TABLE ... ADD FOREIGN KEY \g& conn2
    ...

    \cwait
    \cwait

would now become

    \c& conn1
    \c& conn2
    ...
    \b conn1
    ALTER TABLE ... ADD PRIMARY KEY ...
    ALTER TABLE ... ADD FOREIGN KEY
    ALTER TABLE ... ADD FOREIGN KEY
    ALTER TABLE ... ADD FOREIGN KEY
    ...

    \b conn2
    ALTER TABLE ... ADD PRIMARY KEY ...
    ALTER TABLE ... ADD FOREIGN KEY
    ALTER TABLE ... ADD FOREIGN KEY
    ALTER TABLE ... ADD FOREIGN KEY
    ALTER TABLE ... ADD FOREIGN KEY
    ...

    \cwait

Which seems much cleaner.

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com

Re: Concurrent psql API

From

Simon Riggs

Date:

07 May 2008, 08:04:36

Greg,

Not sure whether you're working on this or not?

If so, what do you think of the slightly modified syntax I proposed?

I'm fairly keen on getting this patch completed fairly early on in the
8.4 cycle because it allows a new class of concurrent test case. I think
many people will be happy to submit concurrent test cases once the
syntax is known. That seems likely to reveal a few bugs we've not seen
before, especially when we are able to get that into the build farm. It
seems prudent to do that as early as possible so we have time to fix the
many bugs that emerge, some of them port specific.

Would you like any help?

------------------------------------------------------------------------

On Wed, 2008-04-23 at 15:18 +0100, Simon Riggs wrote:
> On Tue, 2008-04-08 at 17:10 -0400, Tom Lane wrote:
> 
> > What seems possibly more useful is to reintroduce \cwait (or hopefully
> > some better name) and give it the semantics of "wait for a response from
> > any active connection; switch to the first one to respond, printing its
> > name, and print its result".
> > 
> > This would lead to code like, say,
> > 
> >     \c& conn1
> >     \c& conn2
> >     ...
> >     \S conn1
> >     CREATE INDEX ...  \g&
> >     \S conn2
> >     CREATE INDEX ...  \g&
> >     ...
> >     \cwait
> >     \cwait
> > 
> > The number of \cwaits you need is exactly equal to the number of
> > async commands you've issued.  For regression testing purposes
> > you'd need to design the script to ensure that only one of the
> > connections is expected to respond next, but that seems necessary
> > anyway --- and you don't need any extra checks to catch the case
> > that you get an unexpected early response from another one.
> > 
> > Hmm, this still seems a bit notation-heavy, doesn't it?  What if \g&
> > takes an arg indicating which connection to issue the command on:
> > 
> >     \c& conn1
> >     \c& conn2
> >     ...
> >     CREATE INDEX ...  \g& conn1
> >     CREATE INDEX ...  \g& conn2
> >     ...
> >     \cwait
> >     \cwait
> > 
> > Not totally sure about that one, but issuing a command on a background
> > connection seems appealing for scripting purposes.  It eliminates the
> > risk that the query response comes back before you manage to switch away
> > from the connection; which would be bad because it would mess up your
> > count of how many cwait's you need.  It seems a bit more analogous to
> > the use of & in shell scripts, too, where you implicitly fork away from
> > the async command.  (Maybe c& shouldn't make the new connection
> > foreground either?)
> 
> Yes, I think the \g& conn syntax seems useful. Good thinking.
> 
> I agree also that the \S syntax has problems and we would wish to avoid
> them. I would still like a way to change the default background session.
> That will considerably reduce the number of changes people would need to
> make to long scripts in order to be able to use this facility.
> 
> For example, if we have a script with 100 commands in, we may find that
> commands 1-50 and 51-100 are in two groups. Commands 1-50 are each
> dependent upon the previous command, as are 51-100. But the two groups
> are independent of each other.
> 
> If we use the \g& syntax only, we would need to make 100 changes to the
> script to send commands to the right session. If we had the capability
> to say "use this background session as the default session to send
> commands to", then we would be able to add parallelism to the script by
> just making 2 changes: one prior to command 1 and one prior to command
> 51.
> 
> The original \S command had that capability, but was designed to
> actually change into that session, giving the problems discussed.
> Something like \S (don't care what syntax, though) would definitely
> simplify scripting, which I think will translate directly into fewer
> bugs for users.
> 
> I note \b is available... short for "background". Though I really don't
> care what we call that command though, just want the capability.
> 
> Also, I don't want to have to count cwaits, so I'd like a command to say
> "wait for all background sessions that have active statements" and for
> that to be the default. For simplicity, \cwait would do this by default.
> 
> So this script
> 
>     \c& conn1
>     \c& conn2
>     ...
>     ALTER TABLE ... ADD PRIMARY KEY \g& conn1
>     ALTER TABLE ... ADD FOREIGN KEY    \g& conn1
>     ALTER TABLE ... ADD FOREIGN KEY \g& conn1
>     ALTER TABLE ... ADD FOREIGN KEY \g& conn1
>     ...
> 
>     ALTER TABLE ... ADD PRIMARY KEY \g& conn2  
>     ALTER TABLE ... ADD FOREIGN KEY \g& conn2
>     ALTER TABLE ... ADD FOREIGN KEY \g& conn2
>     ALTER TABLE ... ADD FOREIGN KEY \g& conn2
>     ALTER TABLE ... ADD FOREIGN KEY \g& conn2
>     ...
> 
>     \cwait
>     \cwait
> 
> would now become
> 
>     \c& conn1
>     \c& conn2
>     ...
>     \b conn1
>     ALTER TABLE ... ADD PRIMARY KEY ...  
>     ALTER TABLE ... ADD FOREIGN KEY
>     ALTER TABLE ... ADD FOREIGN KEY
>     ALTER TABLE ... ADD FOREIGN KEY
>     ...
> 
>     \b conn2
>     ALTER TABLE ... ADD PRIMARY KEY ...  
>     ALTER TABLE ... ADD FOREIGN KEY
>     ALTER TABLE ... ADD FOREIGN KEY
>     ALTER TABLE ... ADD FOREIGN KEY
>     ALTER TABLE ... ADD FOREIGN KEY
>     ...
> 
>     \cwait
> 
> Which seems much cleaner.

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com