Thread: psql as an execve(2) interpreter

psql as an execve(2) interpreter

From
Date:
I would like to use psql as an interpreter (in the sense of
execve(2)).  In short, if a file begins with the line
     #! /path/to/psql -f

it should be interpretable by psql.  The normal semantics of execve(2)
ensure that this will work perfectly (indeed a file containing
"#!/path/to/psql -l" works as expected), except for psql's nasty habit
of not interpreting the first line as a comment.

It seems that a simple fix to the following function in
src/bin/psql/input.c would do the trick.
    char *
    gets_fromFile(FILE *source)
    {
        PQExpBufferData buffer;
        char        line[1024];

        initPQExpBuffer(&buffer);

        while (fgets(line, sizeof(line), source) != NULL)
        {
            appendPQExpBufferStr(&buffer, line);
            if (buffer.data[buffer.len - 1] == '\n')
            {
                buffer.data[buffer.len - 1] = '\0';
                return buffer.data;
            }
        }

        if (buffer.len > 0)
            return buffer.data;        /* EOF after reading some bufferload(s) */

        /* EOF, so return null */
        termPQExpBuffer(&buffer);
        return NULL;
    }

For example, this feature could be achieved by 1) including a static
variable to differentiate the first from subsequent calls and 2)
discarding the first line (and returning the second) on the first call
if the first line begins with #!.
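
The two steps above can be sketched as follows.  This is a minimal,
standalone illustration under stated assumptions, not a patch to psql:
read_line() is a stand-in for gets_fromFile() that uses realloc()
instead of PQExpBuffer, read_script_line() is a hypothetical name, and
the first-call state is passed in by the caller rather than kept in a
static variable.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Stand-in for gets_fromFile(): read one line of any length and return it
 * as a malloc'd string without the trailing newline.  Returns NULL at EOF
 * (and also on allocation failure, which this sketch does not distinguish).
 */
char *
read_line(FILE *source)
{
    char    chunk[1024];
    char   *buf = NULL;
    size_t  len = 0;

    while (fgets(chunk, sizeof(chunk), source) != NULL)
    {
        size_t  n = strlen(chunk);
        char   *tmp = realloc(buf, len + n + 1);

        if (tmp == NULL)
        {
            free(buf);
            return NULL;
        }
        buf = tmp;
        memcpy(buf + len, chunk, n + 1);    /* copy the terminating NUL too */
        len += n;

        if (len > 0 && buf[len - 1] == '\n')
        {
            buf[len - 1] = '\0';
            return buf;
        }
    }

    return buf;     /* NULL at EOF, or a final line that lacks a newline */
}

/*
 * Hypothetical wrapper: on the first call only, discard the line if it
 * begins with "#!" and return the following line instead.
 */
char *
read_script_line(FILE *source, int *first_call)
{
    char   *line = read_line(source);

    if (*first_call)
    {
        *first_call = 0;
        if (line != NULL && strncmp(line, "#!", 2) == 0)
        {
            free(line);
            line = read_line(source);
        }
    }
    return line;
}
```

The caller initializes the flag to 1 before the first read and frees
each returned line.  A "#!" that appears on any later line is returned
unchanged, so only the very first line of the input is treated
specially.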

Thus, I have two questions.

- Is this a feature that would be generally accepted and useful for
  the PostgreSQL community (i.e., would it be incorporated into the
  code base)?

- Is this the correct solution, or are there other portions of the
  code that need to be considered?

I appreciate any feedback you can give me on this.

Thank you very much.

Cheers,
Brook


Re: psql as an execve(2) interpreter

From
Tom Lane
Date:
<brook@biology.nmsu.edu> writes:
> I would like to use psql as an interpreter (in the sense of
> execve(2)).  In short, if a file begins with the line

>          #! /path/to/psql -f

> it should be interpretable by psql.  The normal semantics of execve(2)
> ensure that this will work perfectly (indeed a file containing
> "#!/path/to/psql -l" works as expected), except for psql's nasty habit
> of not interpreting the first line as a comment.

Given that # is not a comment introducer in SQL, I would consider
it a bug if it did.

You should instead write a shell script that invokes psql.
        regards, tom lane


Re: psql as an execve(2) interpreter

From
Date:
Tom Lane writes:
> Given that # is not a comment introducer in SQL, I would consider
> it a bug if it did.

I understand that # is not a comment introducer in SQL.  I am
wondering if it would be sensible to introduce an exception for the
first line of a file.  To prevent problems the behavior should be
controlled by a command line option (-i?) so that it would never have
this behavior unless explicitly asked for.

I guess you see no value in this and instead would solve the issue
with a separate interpreter that has this property?  Note that a shell
script cannot be an interpreter for execve(2); thus, this would
require another binary executable.  
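
For illustration, the core of such a separate binary could look
roughly like the sketch below.  Everything here is an assumption made
for the example: the function names are hypothetical, psql is assumed
to be on PATH with connection settings coming from the environment,
and the approach relies on psql reading SQL from standard input
starting at the descriptor's current offset.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Leave fd positioned just past the first line if that line starts with
 * "#!"; otherwise rewind to the beginning.  The rest of the line is read
 * one byte at a time so the descriptor's offset ends up exactly after the
 * newline.  Returns 0 on success, -1 on error.
 */
int
skip_shebang_fd(int fd)
{
    char    head[2];
    char    c;
    ssize_t n;

    n = read(fd, head, sizeof(head));
    if (n < 0)
        return -1;
    if (n < 2 || memcmp(head, "#!", 2) != 0)
        return lseek(fd, 0, SEEK_SET) == (off_t) -1 ? -1 : 0;

    /* Consume the remainder of the "#!" line, newline included. */
    while ((n = read(fd, &c, 1)) == 1)
    {
        if (c == '\n')
            break;
    }
    return n < 0 ? -1 : 0;
}

/*
 * Hypothetical wrapper body: make the script, minus its "#!" line, the
 * standard input of psql and replace this process with psql.
 * Returns only on failure.
 */
int
exec_psql_on_script(const char *script_path)
{
    char *const psql_argv[] = {"psql", NULL};
    int         fd = open(script_path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (skip_shebang_fd(fd) != 0 || dup2(fd, STDIN_FILENO) < 0)
    {
        close(fd);
        return -1;
    }
    if (fd != STDIN_FILENO)
        close(fd);

    execvp("psql", psql_argv);
    return -1;      /* reached only if the exec failed */
}
```

A main() for the wrapper would call exec_psql_on_script(argv[1]),
since execve(2) passes the script's path as an argument to the
interpreter named on the "#!" line.  The sketch ignores any further
arguments.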

My own feeling was that psql could be easily taught to have this
behavior in a way that would not interfere with any existing
applications.  I at least can see benefits to having that capability,
but perhaps others do not.  For example, some of my large database
applications are built by running a large collection of scripts (some
are shell scripts, some sql, etc.), each of which is responsible for a
portion of the task.  It would be very handy to execute each member of
the collection in a uniform manner, i.e., as a direct execution with
execve(2) figuring out which interpreter to use on a script-by-script
basis.  Currently, that is not possible, but it could be with a small
modification to psql or the addition of a completely new interpreter.

Thanks for the comments.

Cheers,
Brook


Re: psql as an execve(2) interpreter

From
"Jonah H. Harris"
Date:
Brook,

I have a lot of shell scripts that run as cron jobs and have considered 
this option.  However, if you look at it carefully, SQL is totally 
different from say perl, php, bash, etc. for scripts which execute from 
the shell.  Tom is right, it is much more valuable and supportable to 
call psql from within a shell script than add the functionality to psql 
itself.

-Jonah

brook@biology.nmsu.edu wrote:

>Tom Lane writes:
> > Given that # is not a comment introducer in SQL, I would consider
> > it a bug if it did.
>
>I understand that # is not a comment introducer in SQL.  I am
>wondering if it would be sensible to introduce an exception for the
>first line of a file.  To prevent problems the behavior should be
>controlled by a command line option (-i?) so that it would never have
>this behavior unless explicitly asked for.
>
>I guess you see no value in this and instead would solve the issue
>with a separate interpreter that has this property?  Note that a shell
>script cannot be an interpreter for execve(2); thus, this would
>require another binary executable.  
>
>My own feeling was that psql could be easily taught to have this
>behavior in a way that would not interfere with any existing
>applications.  I at least can see benefits to having that capability,
>but perhaps others do not.  For example, some of my large database
>applications are built by running a large collection of scripts (some
>are shell scripts, some sql, etc.), each of which is responsible for a
>portion of the task.  It would be very handy to execute each member of
>the collection in a uniform manner, i.e., as a direct execution with
>execve(2) figuring out which interpreter to use on a script-by-script
>basis.  Currently, that is not possible, but it could be with a small
>modification to psql or the addition of a completely new interpreter.
>
>Thanks for the comments.
>
>Cheers,
>Brook
>



Re: psql as an execve(2) interpreter

From
Date:
Jonah,

Thanks for your comments.

Jonah H. Harris writes:
> I have a lot of shell scripts that run as cron jobs and have considered
> this option.  However, if you look at it carefully, SQL is totally
> different from say perl, php, bash, etc. for scripts which execute from
> the shell.  Tom is right, it is much more valuable and supportable to
> call psql from within a shell script than add the functionality to psql
> itself.

I, too, have thought a lot about this and I suppose we are reaching
different conclusions.  I would very much appreciate hearing your
logic, as the underlying reasoning you imply is not transparent to me.
For what it is worth, here is an outline of part of my thinking.

It is true that the nature of SQL as a language is different from
other traditional programming languages, as it does not have concepts
such as variables and flow control.  To my way of thinking, however,
that simply dictates what is possible to express in the language.
Importantly, I do not see how it influences how one might wish to
execute the commands given in the language (or any language for that
matter).  In my mind the decision about how to execute something is
based on what it does and what the larger context is, not what the
language can express.

Suppose I have a script S in some language L that is interpretable by
some interpreter I.  The language L might be SQL and the interpreter I
might be psql, but nothing that follows depends on that.  Indeed, the
language could be perl and the interpreter perl.  The fundamental
questions are:
    - What are useful ways to have the interpreter I carry out the
      instructions contained within S?
    - What determines why those are useful?
    - How can those means be achieved?

For most scripting languages L (e.g., shell commands, perl, etc.) the
prior art has identified two useful means of having the interpreter
execute the instructions in S: 1) explicit execution (i.e., execute
the interpreter and explicitly pass the appropriate script S to it)
and 2) implicit execution (i.e., execute the script and magically have
the system invoke the interpreter on it).  Interpreting SQL scripts
stands out as one exception to this.

Why would one choose one method over another?  In all cases that I can
think of, the decision to use one method over another depends entirely
on considerations that are external to the nature of the language L
itself.  I would venture to say that they are governed primarily by
the nature of the external interface one is trying to create.  In some
cases, depending on what the script actually does, it is much more
natural to invoke a script directly.  An example would be one that is
a wrapper to something else, but must take the responsibility for
setting up the environment first.  In other cases, the other mechanism
is more natural.  The decision does not bear on what the _language_ is
capable of expressing, but rather on what the particular script is
doing and how it fits into a larger external context.

In my mind, the same is true for SQL.  In some cases it is appropriate
to execute the interpreter (i.e., psql) explicitly (that is currently
our only option).  In other cases it is appropriate to execute it
implicitly.  I see no fundamental difference between this and any
other interpreter.

Clearly, an implicit execution mechanism for SQL cannot work quite as
transparently as for languages that use the hash mark (#) as a comment
introducer.  However, supporting this option need not interfere with
other uses of the interpreter nor need it be costly.  What is required
is 1) a command line option that differentiates traditional behavior
from the "implicit interpreter" behavior, and 2) a modification of how
the first line of the file is handled depending on the mode.  No other
changes are required; no interaction with the vast majority of the
code is needed.

Thus, my analysis suggests no fundamental difference between the set
of invocations that are useful for the majority of interpreters and
the set that would be useful for interpreters of SQL.  I also can
envision a means of expanding that set for psql that would have no
impact on either its normal use or its ongoing maintenance and
development.  Consequently, I see no compelling reason not to move in
this direction.  However, I must be missing something obvious, as
there seems to be conflicting sentiment.  I would be very interested
to learn more about what is behind those ideas.

Cheers,
Brook