Thread: Shared Objects (Dynamic loading)

Shared Objects (Dynamic loading)

From
"Jasbinder Bali"
Date:
Hi,
I have a function in which I dynamically load my shared object. The function definition is as follows:

----------------------------------------------------

CREATE OR REPLACE FUNCTION sp_trigger_raw_email(int4, text)
  RETURNS bool AS
'/usr/local/pgsql/jsbali/parser', 'parse_email'
  LANGUAGE 'c' VOLATILE STRICT;
ALTER FUNCTION sp_trigger_raw_email(int4,text ) OWNER TO postgres;

---------------------------------------------



The function parse_email(int caseno, char *rawemail)
first populates a few global variables and then
calls another function, parse_header().
parse_header() uses those global variables and then stores values in a database table using ECPG.

My question is: when we call a specific function of a shared object that is dynamically loaded as shown above,
can that function access all global variables populated elsewhere in the program, or are global variables inaccessible inside that function of the shared object?


Also, in the above function definition,
the signature of the parse_email function is
parse_email(int, char*), and I am passing (int4, text) to it, as seen in the function definition pasted above.
Will text in pgsql match char*, or should I use some other data type?

Thanks and regards,
Jas

Re: [GENERAL] Shared Objects (Dynamic loading)

From
"Jasbinder Bali"
Date:
Well, the server-side code is in ECPG because that's the easiest choice I could see.
I really don't know how difficult or beneficial SPI would be. I haven't heard of SPI before.
I was under the impression that ECPG and libpq are the only two choices a developer has in PostgreSQL for database-related activities.

I am using char in the postgres function as an analogue for char* in C. Is this correct?
The link that you gave says varchar* in C has varchar as its analogue in PostgreSQL.

Thanks and regards,
Jas

On 8/24/06, Michael Fuhr <mike@fuhr.org> wrote:
On Thu, Aug 24, 2006 at 01:03:43AM -0400, Jasbinder Bali wrote:
> CREATE OR REPLACE FUNCTION sp_trigger_raw_email(int4, text)
>  RETURNS bool AS
> '/usr/local/pgsql/jsbali/parser', 'parse_email'
>  LANGUAGE 'c' VOLATILE STRICT;
> ALTER FUNCTION sp_trigger_raw_email(int4,text ) OWNER TO postgres;
>
> function parse_email(int caseno, char *rawemail)
> populates a few global variables first and then
> call another function parse_header().
> function parse_header() makes use of  the  global variables and then using
> ECPG stores values in a table in the database.

Is there a reason this server-side code is using ECPG instead of SPI?

http://www.postgresql.org/docs/8.1/interactive/spi.html

> My question is, when we try to make use of a specific function of a shared
> object dynamically loaded as shown above, then
> would that function be able to access all global variables populated
> elsewhere in the program or all the global variables can't be accessed
> inside that function of the shared object.

A function should be able to access any global symbol and any static
symbol in the same object file.  Are you having trouble doing so?
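
To illustrate the answer above, here is a minimal sketch in plain C, not the poster's actual code. parse_email() fills two file-scope variables and parse_header() reads them. Only the two function names come from the thread; the variable names and the placeholder logic are assumptions made for illustration. Keep in mind that each PostgreSQL connection is served by its own backend process, so such variables hold per-session state that persists between calls within the same session.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* File-scope state shared by the functions below (hypothetical names). */
static int current_caseno;      /* static: visible only inside this source file */
char       current_subject[64]; /* global: visible to every file linked into the object */

/* Reads the variables that parse_email() filled in. */
static bool parse_header(void)
{
    return current_caseno > 0 && current_subject[0] != '\0';
}

/* Populates the variables, then calls parse_header(), as described in the thread. */
bool parse_email(int caseno, const char *rawemail)
{
    assert(rawemail != NULL);
    current_caseno = caseno;
    /* Copy at most 63 bytes and always terminate the buffer. */
    strncpy(current_subject, rawemail, sizeof current_subject - 1);
    current_subject[sizeof current_subject - 1] = '\0';
    return parse_header();
}
```

Both functions live in the same object file, so parse_header() sees the static variable as well as the global one.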

> Also, in the above function definition,
> the signature of parse_email function is
> parse_email(int, char*) and I am passing (int4, text) to it as seen in the
> function code pasted above.
> Is text in pgsql going to match with char* or i should use some other
> datatype?

See "C-Language Functions" in the documentation, in particular what
it says about version 1 calling conventions.

http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html

Is there a reason you're coding in C instead of a higher-level
language like PL/Perl?  If you're parsing email messages then coding
in Perl, Python, Ruby, etc., would probably be easier than C.

--
Michael Fuhr

Re: [GENERAL] Shared Objects (Dynamic loading)

From
"Jasbinder Bali"
Date:
Also, when I dynamically load a shared library, later change the code, rebuild the shared library under the same name, and run the function that loads it, the old version of the shared library is still used.
Why does this happen, and how can I get rid of it?

Thanks and regards,
Jas


Re: [GENERAL] Shared Objects (Dynamic loading)

From
Michael Fuhr
Date:
On Thu, Aug 24, 2006 at 02:51:50AM -0400, Jasbinder Bali wrote:
> Well, the server-side code is in ECPG because that's the easiest choice I
> could see.
> I really don't know how difficult or beneficial SPI would be. Haven't heard
> of SPI before.
> I was under the impression that ECPG and libpq are the only two choices a
> developer has in PostgreSQL for database-related activities.

ECPG and libpq are client libraries.  Server-side functions can use
those libraries to make connections to the same or a different
database, but a function can use SPI to execute commands in the
same backend in which it's running without having to make a separate
client connection.  That's more efficient and the commands the
function runs will be executed in the same transaction as the
function itself, so if the calling transaction rolls back then
statements the function executed will roll back.  If the function
had executed statements over a separate connection, committed them,
and closed the connection, then those statements wouldn't roll back
even if the function's transaction rolled back.

> I am using char in postgres function as an analogue for char* in C. Is this
> correct?
> The link that you gave says varchar* in C has varchar as its analogue in
> postgresql.

If the function accepts a text argument then see the copytext() and
concat_text() examples that show how to work with such data.  Look
at the examples that use the version-1 calling conventions, the
ones that are declared like this:

PG_FUNCTION_INFO_V1(copytext);

Datum
copytext(PG_FUNCTION_ARGS)
{
...
}
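
A text value is not a NUL-terminated char *: it is a variable-length value that carries its own length. The following is a minimal sketch of that idea in plain C. The demo_text type and both helper functions are hypothetical stand-ins, not PostgreSQL's real structures or macros; the copytext() example in the documentation shows the real ones (VARSIZE, VARDATA, VARHDRSZ).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed value; NOT PostgreSQL's actual text struct. */
typedef struct demo_text {
    size_t len;    /* number of payload bytes */
    char   data[]; /* payload bytes, with no terminating NUL */
} demo_text;

/* Builds a length-prefixed value from raw bytes. Caller frees the result. */
demo_text *demo_text_from_bytes(const char *bytes, size_t len)
{
    demo_text *t = malloc(sizeof(demo_text) + len);
    if (t == NULL)
        return NULL;
    t->len = len;
    memcpy(t->data, bytes, len);
    return t;
}

/* Copies the payload into a fresh NUL-terminated C string. Caller frees it. */
char *demo_text_to_cstring(const demo_text *t)
{
    assert(t != NULL);
    char *s = malloc(t->len + 1);
    if (s == NULL)
        return NULL;
    memcpy(s, t->data, t->len);
    s[t->len] = '\0'; /* the terminator is what plain char * code expects */
    return s;
}
```

Code written for char * (strlen, strcmp, and so on) can only be used safely on the converted copy, not on the length-prefixed payload itself.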

--
Michael Fuhr

Re: [GENERAL] Shared Objects (Dynamic loading)

From
"Jasbinder Bali"
Date:
Actually my function accepts char, so what should be the SQL analogue for that in postgres?
Thanks,
Jas

Re: [GENERAL] Shared Objects (Dynamic loading)

From
Michael Fuhr
Date:
On Thu, Aug 24, 2006 at 03:29:55AM -0400, Jasbinder Bali wrote:
> Also, when I dynamically load a shared library and then later on change the
> code, create the same shared library (same name) and run my function where
> in the shared library is loaded, it takes the reference of the old shared
> library.
> Why does this happen and how to get rid of this?

The "C-Language Functions" documentation explains:

http://www.postgresql.org/docs/8.1/interactive/xfunc-c.html

"After it is used for the first time, a dynamically loaded object
file is retained in memory.  Future calls in the same session to
the function(s) in that file will only incur the small overhead of
a symbol table lookup.  If you need to force a reload of an object
file, for example after recompiling it, use the LOAD command or
begin a fresh session."

--
Michael Fuhr

Re: [GENERAL] Shared Objects (Dynamic loading)

From
Michael Fuhr
Date:
On Thu, Aug 24, 2006 at 03:51:28AM -0400, Jasbinder Bali wrote:
> Actually my function accepts char, so what should be the SQL analogue for
> that in postgres?

You might be able to declare the SQL function to accept a cstring
argument but you really should write the code to conform to
PostgreSQL's preferred (version-1) calling conventions.  That might
mean writing a simple wrapper function that calls the real function.
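
The wrapper idea can be sketched in plain C as follows. This is an illustration under stated assumptions, not PostgreSQL code: demo_call_args is a hypothetical stand-in for the argument bundle a version-1 function receives, and parse_email_impl() contains placeholder logic. In a real version-1 wrapper the unpacking would be done with the fmgr macros (for example PG_GETARG_INT32, PG_GETARG_TEXT_P, PG_RETURN_BOOL) described in the documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* The "real" function: a plain C signature that expects a NUL-terminated string. */
static bool parse_email_impl(int caseno, const char *rawemail)
{
    /* Placeholder logic: accept a positive case number with a non-empty body. */
    return caseno > 0 && rawemail[0] != '\0';
}

/* Hypothetical stand-in for the arguments handed to a version-1 function. */
typedef struct demo_call_args {
    int         caseno;  /* first SQL argument (int4) */
    const char *raw;     /* second SQL argument: bytes, not NUL-terminated */
    size_t      raw_len; /* number of bytes in raw */
} demo_call_args;

/* Wrapper: unpack the arguments, convert, delegate, clean up. */
bool parse_email_wrapper(const demo_call_args *args)
{
    assert(args != NULL);
    char *copy = malloc(args->raw_len + 1);
    if (copy == NULL)
        return false;
    memcpy(copy, args->raw, args->raw_len);
    copy[args->raw_len] = '\0';
    bool ok = parse_email_impl(args->caseno, copy);
    free(copy);
    return ok;
}
```

The design point is that the existing function keeps its ordinary C signature, and only the thin wrapper knows about the calling convention.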

If you're doing text manipulation then it would probably be a lot
easier in PL/Perl than in C.

--
Michael Fuhr

Re: [GENERAL] Shared Objects (Dynamic loading)

From
"Jasbinder Bali"
Date:
Is there any way to check which shared objects are loaded in memory?
Also, when I say LOAD 'parser', where parser.so is the shared object I've loaded dynamically
using CREATE FUNCTION, it says

ERROR:  could not access file "parser": No such file or directory

Why would it give me this error?

Thanks and regards,
~Jas

Re: [GENERAL] Shared Objects (Dynamic loading)

From
Tom Lane
Date:
"Jasbinder Bali" <jsbali@gmail.com> writes:
> Also, when I say LOAD 'parser' where parser.so is the shared object I've
> loaded dynamically
> using CREATE FUNCTION, it says
> ERROR:  could not access file "parser": No such file or directory
> Why would it give me this error?

Probably because you didn't give it a path to the file.  You need either
an absolute path or something involving the special symbol $libdir.
This is not different from what's required in CREATE FUNCTION.

            regards, tom lane

Re: [GENERAL] Shared Objects (Dynamic loading)

From
"Jasbinder Bali"
Date:
Do I put this version-1 syntax in the .pgc file or in the C file that I get after running ecpg on the .pgc file?
Thanks
~Jas

Re: [GENERAL] Shared Objects (Dynamic loading)

From
Michael Fuhr
Date:
On Mon, Aug 28, 2006 at 03:26:55PM -0400, Jasbinder Bali wrote:
> Do I include this Version 1 syntax in the .pgc file or C file that I get
> after doing ECPG to the .pgc file?

You shouldn't modify the .c file that ecpg generates; the .pgc file
is the source code.  However, as Martijn and I have pointed out,
you should probably be using SPI instead of ECPG.  And as Tom and
I have mentioned, you probably shouldn't be using C at all because
everything you've said you're doing would be easier in other languages
like PL/pgSQL and PL/Perl.

--
Michael Fuhr

Re: [GENERAL] Shared Objects (Dynamic loading)

From
"Jasbinder Bali"
Date:
Well, I'm using C because later on I'll have to do socket programming for the same project (opening a socket between the PostgreSQL database and a Unix server, though I am not at all sure how I or anyone else is going to do it), and that's something I'll do in C. So, to keep things straightforward and stick to a single programming language due to other factors, I have no choice but to use C. That's one of the design decisions taken for our project.

Now, for version-1 code, I think I'll have to write it in the .pgc file, right?
And only by using version-1 conventions will I be able to pass and read the right values back and forth between the DB and the shared object, right?
 
Thanks,
~Jas

 
Re: [GENERAL] Shared Objects (Dynamic loading)

From
"Andrej Ricnik-Bay"
Date:
On 8/29/06, Jasbinder Bali <jsbali@gmail.com> wrote:

> though I am not at all sure how I or anyone else is going to do it) and that's
> something I'll do in C. So to keep things pretty straight and stick to one
> single programming language due to other factors, I have no choice but to use
> C. That's one of the design decisions taken for our project.
You could stick with the basic Unix design idea and have two
parts to the program, one that does the socket-part (written in
C) and one Postgres-part, written in whatever?

> Thanks,
> ~Jas
Cheers,
Andrej

--
Please don't top post, and don't use HTML e-Mail :}  Make your quotes concise.

http://www.american.edu/econ/notes/htmlmail.htm