Thread: How do I use the backend APIs

How do I use the backend APIs

From
"Chad"
Date:
Hi,

In Postgres, is there a C language API which would give me access to
BTrees like Berkeley DB does? eg to seek to a particular key/value pair
and iterate forward from there? If not whats the nearest thing to this
in Postgres?

Cheers.


Re: How do I use the backend APIs

From
Michael Fuhr
Date:
On Thu, Feb 16, 2006 at 07:41:09AM -0800, Chad wrote:
> In Postgres, is there a C language API which would give me access to
> BTrees like Berkeley DB does? eg to seek to a particular key/value pair
> and iterate forward from there? If not whats the nearest thing to this
> in Postgres?

Could you tell us about the problem you're trying to solve?  Are
you writing client-side or server-side code?

--
Michael Fuhr

Re: How do I use the backend APIs

From
"Qingqing Zhou"
Date:
"Chad" <chadzakary@hotmail.com> wrote
>
> In Postgres, is there a C language API which would give me access to
> BTrees like Berkeley DB does? eg to seek to a particular key/value pair
> and iterate forward from there?

AFAIK there is no such API for this purpose. The reason is that to access
BTree, you have to setup complex enough environment to enable so. For
example, the buffer pool support, the WAL support etc. So though exporting
such API is easy to do, there is no pratical usage of it.

Regards,
Qingqing



Re: How do I use the backend APIs

From
Neil Conway
Date:
On Fri, 2006-02-17 at 11:34 +0800, Qingqing Zhou wrote:
> AFAIK there is no such API for this purpose. The reason is that to access
> BTree, you have to setup complex enough environment to enable so. For
> example, the buffer pool support, the WAL support etc. So though exporting
> such API is easy to do, there is no pratical usage of it.

Well, if the API is going to be invoked by C UDFs, it could assume that
the environment has been appropriately initialized.

I think it would be possible to provide such an API (although it would
take a considerable amount of work). However, I don't see the point --
why would an application want to use the API? SQL is much more flexible.

-Neil



Re: How do I use the backend APIs

From
Christopher Browne
Date:
A long time ago, in a galaxy far, far away, neilc@samurai.com (Neil Conway) wrote:
> On Fri, 2006-02-17 at 11:34 +0800, Qingqing Zhou wrote:
>> AFAIK there is no such API for this purpose. The reason is that to access
>> BTree, you have to setup complex enough environment to enable so. For
>> example, the buffer pool support, the WAL support etc. So though exporting
>> such API is easy to do, there is no pratical usage of it.
>
> Well, if the API is going to be invoked by C UDFs, it could assume that
> the environment has been appropriately initialized.
>
> I think it would be possible to provide such an API (although it would
> take a considerable amount of work). However, I don't see the point --
> why would an application want to use the API? SQL is much more flexible.

It would probably make more sense to create an API that runs the
activities via a translation into SQL...
--
let name="cbbrowne" and tld="ntlug.org" in String.concat "@" [name;tld];;
http://linuxdatabases.info/info/slony.html
Rules of the  Evil Overlord #46. "If an advisor says  to me "My liege,
he  is but  one man.  What can  one man  possibly do?",  I  will reply
"This." and kill the advisor." <http://www.eviloverlord.com/>

Re: How do I use the backend APIs

From
Martijn van Oosterhout
Date:
On Thu, Feb 16, 2006 at 07:41:09AM -0800, Chad wrote:
> Hi,
>
> In Postgres, is there a C language API which would give me access to
> BTrees like Berkeley DB does? eg to seek to a particular key/value pair
> and iterate forward from there? If not whats the nearest thing to this
> in Postgres?

Well, in the backend you can do things like open a btree index, setup
an ScanKey to indicate which values you want and then keep calling
getnext(). If you set your scankey to (col1 >= 'A') it will start at
'A' and go up from there...

Most of the time though you just create a query and use SPI_exec. Then
you don't actually have to worry about details like names of the
indexes, OIDs, types, comparison functions, etc...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: How do I use the backend APIs

From
"Chad"
Date:
Thanks Martijn,

This is exactly what I am looking for. I'm wondering how easy it is to
sit on top of this backend. Does anybody have any stand alone sample
code? Is it a library that can be linked or do you need to produce a
modified version of the postgres server? Can it be used in shared
library form and if so will it support multiple processes using it on
the same machine?

----------------------

"Well, in the backend you can do things like open a btree index, setup
an ScanKey to indicate which values you want and then keep calling
getnext(). If you set your scankey to (col1 >= 'A') it will start at
'A' and go up from there... "


Re: How do I use the backend APIs

From
"Chad"
Date:
In a word: The kind of problems people use Berkeley DB for.

People use BDB for more fine grained cursor access to BTrees. Stuff you
CANNOT do with SQL. There is a market for this. See their website. I'd
like something similar from Postgres so that the data would be stored
in a full fledged RDBMS but I could use the cursor methods for
searching more efficient than SQL. Best of both worlds.

I've had a quick browse around the Postgres code and found some
functions like "_bt_first()" but no sample code to use it. BTW its for
developing an alternative server based access to the underlying
relational data.


Re: How do I use the backend APIs

From
Tom Lane
Date:
"Chad" <chadzakary@hotmail.com> writes:
> This is exactly what I am looking for. I'm wondering how easy it is to
> sit on top of this backend.

You can't, and you'll get exactly zero community support for trying.
We don't believe in embedded databases --- or at least, we don't
believe in trying to use Postgres as one.  We like a hard-and-fast
separation between client and database server, so that client
programming mistakes can't corrupt the database.

You could possibly do what you are thinking of in the form of
user-defined functions executing in the backend, but the communication
overhead to the client side is likely more than you want, and you'll
be relying on APIs that we consider backend-internal and feel free to
whack around at the drop of a hat.

I'd suggest looking for something that's actually intended to be an
embedded database.  sqlite maybe, though I'm no expert on the subject.
For that matter, have you looked at good old dbm?

            regards, tom lane

Re: How do I use the backend APIs

From
Martijn van Oosterhout
Date:
On Fri, Feb 17, 2006 at 01:06:16AM -0800, Chad wrote:
> In a word: The kind of problems people use Berkeley DB for.
>
> People use BDB for more fine grained cursor access to BTrees. Stuff you
> CANNOT do with SQL. There is a market for this. See their website. I'd
> like something similar from Postgres so that the data would be stored
> in a full fledged RDBMS but I could use the cursor methods for
> searching more efficient than SQL. Best of both worlds.

Well, just the brief look at the docs doesn't immediatly reveal
anything that couldn't be done with straight SQL and server side
functions. It would be helpful if you could give an example of what you
actually want to do.

> I've had a quick browse around the Postgres code and found some
> functions like "_bt_first()" but no sample code to use it. BTW its for
> developing an alternative server based access to the underlying
> relational data.

Well, that function is several levels below where you need to be
looking. Using it directly will probably get you into a world of hurt.
BTW, what does "alternative server based access to the underlying
relational data" mean?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: How do I use the backend APIs

From
Christopher Browne
Date:
A long time ago, in a galaxy far, far away, "Chad" <chadzakary@hotmail.com> wrote:
> In a word: The kind of problems people use Berkeley DB for.

> People use BDB for more fine grained cursor access to BTrees. Stuff you
> CANNOT do with SQL. There is a market for this. See their website. I'd
> like something similar from Postgres so that the data would be stored
> in a full fledged RDBMS but I could use the cursor methods for
> searching more efficient than SQL. Best of both worlds.

I daresay we get enough challenges to fill the day when we use the
"coarse graining" of SQL.

I'm generally keener on getting aggregate results that let me not
bother needing to search in fantastical ways...

As far as I'm concerned, you're not pointing at a better world; you're
pointing at a worse one.  I've seen far too many bugs falling out of
the navigational complexities of navigation-oriented data structures.

The sheer scope of bugginess of that is why my ears perk up when
mention of languages like R and APL and such come up; I don't want to
navigate through data; I want to parallel process it :-).

> I've had a quick browse around the Postgres code and found some
> functions like "_bt_first()" but no sample code to use it. BTW its
> for developing an alternative server based access to the underlying
> relational data.

Those sorts of functions are intended as internals, and public usage
can be expected to break gloriously badly as changing them is fair
game as PostgreSQL progresses to new versions.

For things for "public use," you should look at what is offered in
libpq.

If you could outline some usage that might make it more powerful, it
is not implausible that people would listen.  There are doubtless ways
that cursors could be enhanced, and that might be the direction you
would want to go.  But you're not too likely to see PostgreSQL
rewritten for the sake of attracting the "market" of people who need
to manipulate the fine semantics of B-tree navigation.
--
If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me
http://cbbrowne.com/info/rdbms.html
"For those  of you who are  into writing programs that  are as obscure
and complicated  as possible, there are opportunities  for... real fun
here" -- Arthur Norman

Re: How do I use the backend APIs

From
Alban Hertroys
Date:
Chad wrote:
> Thanks Martijn,

> "Well, in the backend you can do things like open a btree index, setup
> an ScanKey to indicate which values you want and then keep calling
> getnext(). If you set your scankey to (col1 >= 'A') it will start at
> 'A' and go up from there... "

That looks suspiciously much like cursors. Isn't this what you're
looking for?:

OPEN curs1 FOR SELECT * FROM foo WHERE key = mykey;
FETCH cursor INTO target;
CLOSE cursor;

(from:
http://www.postgresql.org/docs/8.1/static/plpgsql-cursors.html#PLPGSQL-CURSOR-DECLARATIONS
)

I figure the other PL/ languages feature cursors as well, but I've never
used those.

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: How do I use the backend APIs

From
"Chad"
Date:
Firstly thank you to all who have taken the time to reply so far.
I need to clarify a few things based on the feedback I have received.

1. I understand the concerns you have about people using internal APIs
that the developers are free to change. I also understand the risks I
take if I use an undocumented API. I would prefer to use a supported
public API with full support of the community. I want a clean API as
you have put it. My understanding is that the level of abstraction at
which I want to use Postgres is not supported by a public API today.
This API may exist at a less official level for internal use. Lets
revisit the argument as to whether I NEED this level of abstraction in
a later point as people have made good suggestions in this regard to
counter my opinion.

2. I want to base my development on a fully fledged relational DB (not
just an embedded DB). This is because I believe it will increase the
appeal of my product.

3. We see cleanly defined layers in Networking Software and Operating
Systems Software. For example, I can send stuff over a network as
Ethernet Frames if I want to use a library at that level of
abstraction, OR I can send stuff as TCP packets, this assuming I have
at my disposal libraries exposing these levels of abstraction. I
believe there is a case for DBMS software to support this concept of
layering. I'm not aware of any standards based database software stack
but I believe that if it existed it could result in better reuse of
database core technology. This stack would maybe begin with a buffer
cache at level 0, a data structures layer at level 1 (Btrees, heaps
etc), a data model layer at level 2...eventually you work up to client
APIs and so on. Whatever this stack looks like, I would like to be able
to link to a library that exposed the data structure layer of a fully
fledged relational DB. I'm not going to try to implement that from
scratch and I want what I develop to have full SQL support.

4. Why I think I need to work at this level of abstraction. I believe,
certain types of queries can be realized more efficiently by code that
can seek to specific parts of an index and immediately returning a
small number of rows to the client. The query would remain active and
the client would only retrieve the next rows after the previous cursor
position when the client was ready to do so. With a huge underlying
table set this would enable very free incremental browsing of the data.
Imagine a table browser application sitting on top of a multi
million-row table. The user types in a partial name match ("Mal").
They want the table to scroll to the first name matching the name they
have typed ("Malcolm"). It should then be cheap to interactively
scroll downwards from here. It should also be cheap to scroll upwards
to view records immediately preceding "Malcolm". The most natural
way I can think of doing this is an API to cursor seek to "Malcolm"
and navigate forwards or backwards from that cursor pos. I feel a
B-Tree cursor API matches the way you would implement a table like
this, only retrieving data as the user scrolls to it.


Re: How do I use the backend APIs

From
Martijn van Oosterhout
Date:
On Mon, Feb 20, 2006 at 05:08:33AM -0800, Chad wrote:
> 4. Why I think I need to work at this level of abstraction. I believe,
> certain types of queries can be realized more efficiently by code that
> can seek to specific parts of an index and immediately returning a
> small number of rows to the client. The query would remain active and
> the client would only retrieve the next rows after the previous cursor
> position when the client was ready to do so. With a huge underlying

<snip>

You realise you can acheive almost exactly what you want here by saying
things like:

DECLARE c AS CURSOR FOR select * from table where name >= 'Mal' order by name;
FETCH FORWARD 1;  -- Go down
FETCH FORWARD 1;  -- Go down
FETCH BACKWARD 2; -- Go back again

If you have an index it will just step through the index to find the
rows you want. While you're free to try and make this work by using the
index code directly, you're not asking anything that's otherwise
impossible.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: How do I use the backend APIs

From
Scott Marlowe
Date:
On Mon, 2006-02-20 at 16:40, Martijn van Oosterhout wrote:
> On Mon, Feb 20, 2006 at 05:08:33AM -0800, Chad wrote:
> > 4. Why I think I need to work at this level of abstraction. I believe,
> > certain types of queries can be realized more efficiently by code that
> > can seek to specific parts of an index and immediately returning a
> > small number of rows to the client. The query would remain active and
> > the client would only retrieve the next rows after the previous cursor
> > position when the client was ready to do so. With a huge underlying
>
> <snip>
>
> You realise you can acheive almost exactly what you want here by saying
> things like:
>
> DECLARE c AS CURSOR FOR select * from table where name >= 'Mal' order by name;
> FETCH FORWARD 1;  -- Go down
> FETCH FORWARD 1;  -- Go down
> FETCH BACKWARD 2; -- Go back again
>
> If you have an index it will just step through the index to find the
> rows you want. While you're free to try and make this work by using the
> index code directly, you're not asking anything that's otherwise
> impossible.

I think the problem here is that when you have a hammer, everything
looks like a nail.  We see it with people coming from Oracle or DB2 as
well.  "I learned it this way" is often a bigger obstacle to learning
postgresql than no knowledge at all.

Re: How do I use the backend APIs

From
"Chad"
Date:
Thanks Martijn/Alban,

This look interesting. I'll make some time to try this problem out
using your approach.
I have a few questions like:
-Could I skip the FETCH FORWARD and go straight to the FETCH BACKWARD
i.e. declare cursor to be at "Mal" and go backwards from there or is
the cursor limited to going backward only as far as "Mal"?
-Does the DB avoid transferring the data until the FETCH command?
-When rows change in between opening the cursor and fetching the
changed rows, will the FETCH retrieve the new data or is a snapshot
taken when the cursor is declared ?
-What about concurrency? If a cursor is kept open while other
transactions change the same table or does it cause those writer
transactions to block? Perhaps this is configurable.


Re: How do I use the backend APIs

From
Alban Hertroys
Date:
Chad wrote:
> Thanks Martijn/Alban,
>
> This look interesting. I'll make some time to try this problem out
> using your approach.
> I have a few questions like:
> -Could I skip the FETCH FORWARD and go straight to the FETCH BACKWARD
> i.e. declare cursor to be at "Mal" and go backwards from there or is
> the cursor limited to going backward only as far as "Mal"?
> -Does the DB avoid transferring the data until the FETCH command?

Indeed it does.

> -When rows change in between opening the cursor and fetching the
> changed rows, will the FETCH retrieve the new data or is a snapshot
> taken when the cursor is declared ?
> -What about concurrency? If a cursor is kept open while other
> transactions change the same table or does it cause those writer
> transactions to block? Perhaps this is configurable.

Transactions are isolated. If data is changed in other transactions,
they don't influence the data in the transaction you're currently
looking at. I assume the same goes for cursors, it's still a query
after all.

I don't know what happens if you change the data you're looping over in
a way that the result set changes, but a test case can't be too hard to
think up. I'm talking about something like this:

OPEN cur FOR SELECT val FROM values WHERE val BETWEEN 1 AND 10 ORDER BY val;
LOOP
    FETCH cur INTO record;

    -- This reverses the order in which the record would be fetched
    UPDATE values SET val = 11 - val WHERE val = record.val;
END LOOP;

Makes me kind of curious what happens... Would it only get to halfway
the values 1..10 and then go backwards again?

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: How do I use the backend APIs

From
Martijn van Oosterhout
Date:
On Tue, Feb 21, 2006 at 02:41:13AM -0800, Chad wrote:
> Thanks Martijn/Alban,
>
> This look interesting. I'll make some time to try this problem out
> using your approach.
> I have a few questions like:
> -Could I skip the FETCH FORWARD and go straight to the FETCH BACKWARD
> i.e. declare cursor to be at "Mal" and go backwards from there or is
> the cursor limited to going backward only as far as "Mal"?

Only as far back as "Mal". However, if you set your original query to
ORDER BY blah DESC, then FETCH FORWARD would scan backward through the
index.

> -Does the DB avoid transferring the data until the FETCH command?

Of course. It would kind of defeat the purpose to do otherwise.

> -When rows change in between opening the cursor and fetching the
> changed rows, will the FETCH retrieve the new data or is a snapshot
> taken when the cursor is declared ?

Standard visibility rules apply. READ COMMITTED shows anything
committed, even after you've started. SERIALIZABLE gives you a
consistant snapshot.

> -What about concurrency? If a cursor is kept open while other
> transactions change the same table or does it cause those writer
> transactions to block? Perhaps this is configurable.

Some as normal. PostgreSQL doesn't acquire any locks for plain SELECTs
so no risk there...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: How do I use the backend APIs

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Tue, Feb 21, 2006 at 02:41:13AM -0800, Chad wrote:
>> -When rows change in between opening the cursor and fetching the
>> changed rows, will the FETCH retrieve the new data or is a snapshot
>> taken when the cursor is declared ?

> Standard visibility rules apply. READ COMMITTED shows anything
> committed, even after you've started. SERIALIZABLE gives you a
> consistant snapshot.

I believe that a cursor always shows a snapshot --- whether the
transaction is READ COMMITTED or SERIALIZABLE only affects whether
the snapshot is current as of the DECLARE CURSOR command or the
transaction's BEGIN command.  In either case you won't see changes
occurring after the cursor is opened.

            regards, tom lane

Re: How do I use the backend APIs

From
"Chad"
Date:
Thanks Martijn.