Thread: PostgreSQL extension API? Documentation?
Hi.
I have a newbie question about extension development. Extensions provide an entry point, and are dynamically linked to PostgreSQL. But what APIs/functions are really available for extensions to call?
The most obvious API is SPI. You could also implement hooks. Of course, functions, types, aggregates, whatever. But can an extension call other "internal" PostgreSQL functions? Is there any restriction on what an extension could --or should-- call? Is there any specific API, or any documentation which states what is available to use?
In other words: what is the API surface exposed by PostgreSQL to extension developers? The assumption is that no PostgreSQL code should be modified, just adding your own and calling existing functions.
Thanks,
Álvaro
--
Álvaro Hernández Tortosa
-----------
8Kdata
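As an illustration of the hook mechanism mentioned in the question, here is a minimal standalone sketch of the usual pattern: the server exposes a global function pointer, and a loaded module saves the previous value and installs its own function, chaining to whatever was there before. All `demo_*` names are hypothetical stand-ins so the file compiles without PostgreSQL headers; in a real extension the installation would typically happen in `_PG_init()` against an actual hook variable such as `ExecutorStart_hook`.

```c
#include <stddef.h>

/* ---- "server" side (hypothetical names, not real PostgreSQL symbols) ---- */

typedef int (*demo_hook_type) (int value);

/* Global function pointer that a loaded module is allowed to overwrite. */
demo_hook_type demo_hook = NULL;

/* Default behavior when no module has installed a hook. */
int
demo_standard_process(int value)
{
    return value + 1;
}

/* Server entry point: call the hook if one is set, else the default. */
int
demo_process(int value)
{
    if (demo_hook != NULL)
        return demo_hook(value);
    return demo_standard_process(value);
}

/* ---- "extension" side ---- */

static demo_hook_type prev_demo_hook = NULL;
static int  calls_seen = 0;

/* The module's hook: do its own work, then chain to the previous hook. */
static int
my_demo_hook(int value)
{
    calls_seen++;
    if (prev_demo_hook != NULL)
        return prev_demo_hook(value);
    return demo_standard_process(value);
}

/* Plays the role of the module's load-time initializer. */
void
demo_module_init(void)
{
    prev_demo_hook = demo_hook; /* remember whoever was installed before */
    demo_hook = my_demo_hook;
}

/* Accessor so callers can observe that the hook actually ran. */
int
demo_calls_seen(void)
{
    return calls_seen;
}
```

Saving and calling the previous hook value is what lets several modules share one hook without knowing about each other.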
On Sat, Feb 27, 2016 at 10:37 AM, Álvaro Hernández Tortosa <aht@8kdata.com> wrote:
>
>
> Hi.
>
> I have a newbie question for extension development. Extensions provide an entry point, and are dynamically linked to PostgreSQL. But what APIs/functions are really available for extensions to call?
>
> The most obvious API is SPI. You could also implement hooks. Of course, functions, types, aggregates, whatever. But can an extension call other "internal" PostgreSQL functions? Is there any restriction on what an extension could --or should-- call? Is there any specific API, or any documentation which states what is available to use?
>
> In other words: what is the API surface exposed by PostgreSQL to extension developers? The assumption is that no PostgreSQL code should be modified, just adding your own and calling existing functions.
>
Writing a C extension, you can access a lot of internal code, as long as it is exposed through the .h headers. For example, some time ago I was thinking of writing an extension to show more internal information about autovacuum (the internal queue, etc.; something like pg_stat_autovacuum). But today that is impossible without changing core, because some internal structures are not exposed, so we would need to define an internal API to expose this kind of information.
So, depending on the problem you want to solve, you may be able to write an extension for it. Unfortunately, the short answer is then "it depends".
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
On 02/27/16 08:37, Álvaro Hernández Tortosa wrote:
> In other words: what is the API surface exposed by PostgreSQL to
> extension developers? The assumption is that no PostgreSQL code should be
> modified, just adding your own and calling existing functions.
That's an excellent question that repeatedly comes up, in particular because of the difference between the way the MSVC linker works on Windows, and the way most other linkers work on other platforms.
The issue there is ... on most non-Windows platforms, there are only the general C rules to think about: if a symbol is static (or auto of course) it is not visible to extensions, but otherwise it is.
For MSVC, in contrast, symbols need to have a certain decoration (look for PGDLLIMPORT in various PostgreSQL .h files) for an MSVC-built extension to be able to see it, otherwise it isn't accessible.
Well, that's not quite right. It turns out (and it may have taken some work on the build process to make it turn out this way) ... *functions* are accessible from MSVC (as long as they would be accessible under normal C rules) whether or not they have PGDLLIMPORT. It's just data symbols/variables that have to have PGDLLIMPORT or they aren't available on Windows/MSVC.
And *that* arrangement is the result of a long thread in 2014 that unfolded after discovering that what was really happening in MSVC *before* that was that MSVC would silently pretend to link your non-PGDLLIMPORT data symbols, and then give you the wrong data.
http://www.postgresql.org/message-id/flat/52FAB90B.6020302@2ndquadrant.com
In that long thread, there are a few messages in the middle that probably give the closest current answer to your API question. Craig Ringer has consistently favored making other platforms work more like Windows/MSVC, so that the PGDLLIMPORT business would serve to limit and more clearly define the API surface:
http://www.postgresql.org/message-id/52EF1468.6080107@2ndquadrant.com
Andres Freund had the pragmatic reply:
http://www.postgresql.org/message-id/20140203103701.GA1225@awork2.anarazel.de
> I think that'd be an exercise in futility. ... We'd break countless
> extensions people have written. ... we'd need to have a really
> separate API layer ... doesn't seem likely to arrive anytime soon,
> if ever.
which was ultimately concurred in by Tom, and Craig too:
http://www.postgresql.org/message-id/29286.1391436782@sss.pgh.pa.us
http://www.postgresql.org/message-id/52EFA654.8010609@2ndquadrant.com
Andres characterized it as "We have a (mostly) proper API. Just not an internal/external API split."
http://www.postgresql.org/message-id/20140203142514.GD1225@awork2.anarazel.de
-Chap
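The visibility rules described in the message above can be sketched in a single standalone C file. This is only an illustration under stated assumptions: `DEMO_DLLIMPORT`, `DEMO_BUILDING_CORE`, and every `demo_*` symbol are hypothetical, and the macro merely mimics the idea behind PGDLLIMPORT rather than reproducing PostgreSQL's actual definition. The file plays the role of the server; the comments describe what a separately built extension would or would not be able to reference.

```c
/* This one file plays the role of the server ("core") build. */
#define DEMO_BUILDING_CORE 1

/*
 * Hypothetical decoration mimicking the idea behind PGDLLIMPORT:
 * on MSVC, data symbols must be explicitly exported by the server and
 * imported by the module; elsewhere the macro expands to nothing.
 */
#if defined(_MSC_VER) && defined(DEMO_BUILDING_CORE)
#define DEMO_DLLIMPORT __declspec(dllexport)
#elif defined(_MSC_VER)
#define DEMO_DLLIMPORT __declspec(dllimport)
#else
#define DEMO_DLLIMPORT
#endif

/* ---- what a header would declare ---- */

/* Decorated data symbol: usable from an extension on every platform. */
extern DEMO_DLLIMPORT int demo_exported_counter;

/* Undecorated data symbol: visible under plain C rules, but per the
 * thread not usable from an MSVC-built extension. */
extern int  demo_undecorated_counter;

/* Non-static function: visible under plain C rules, decoration or not. */
extern int  demo_get_counter(void);

/* ---- what the .c file would define ---- */

int         demo_exported_counter = 7;
int         demo_undecorated_counter = 3;

/* static: never visible outside this file, on any platform. */
static int  demo_private_counter = 100;

static int
demo_private_helper(void)
{
    return demo_private_counter;
}

int
demo_get_counter(void)
{
    return demo_exported_counter + demo_private_helper();
}
```

An extension could call `demo_get_counter()` everywhere, read `demo_exported_counter` everywhere, and never reach `demo_private_helper()`; the undecorated variable is the case that differs between MSVC and other toolchains.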
On 27/02/16 15:01, Fabrízio de Royes Mello wrote:
> I don't know what kind of problem you want to solve, but maybe you should ask yourself:
Good point. I don't know. More precisely: no specific problem as of today. But if I knew all the "exposed API" I could more clearly think of what problems could be solved.
In other words: I see it's not clear what an extension could "extend". And knowing that would help extension developers to create new solutions.
> 1) Do I need to change some current PostgreSQL behavior?
If that can be done without changing the current code, it might well be an option.
> 2) Do I need to add a new feature to PostgreSQL without changing the current behavior?
> Writing a C extension, you can access a lot of internal code, as long as it is exposed through the .h headers. For example, some time ago I was thinking of writing an extension to show more internal information about autovacuum (the internal queue, etc.; something like pg_stat_autovacuum). But today that is impossible without changing core, because some internal structures are not exposed, so we would need to define an internal API to expose this kind of information.
So, calling any code exposed by the headers is OK for an extension? Is the set of all .h files then the "exposed API"? Or are some of those functions ones that should never be called?
> So, depending on the problem you want to solve, you may be able to write an extension for it. Unfortunately, the short answer is then "it depends".
Hope that we can find a more general answer :) Thanks for your opinion!
Álvaro
-- Álvaro Hernández Tortosa ----------- 8Kdata
Chapman Flack <chap@anastigmatix.net> writes:
> On 02/27/16 08:37, Álvaro Hernández Tortosa wrote:
>> In other words: what is the API surface exposed by PostgreSQL to
>> extension developers? The assumption is that no PostgreSQL code should be
>> modified, just adding your own and calling existing functions.
> That's an excellent question that repeatedly comes up, in particular
> because of the difference between the way the MSVC linker works on Windows,
> and the way most other linkers work on other platforms.
Yeah. It would be a fine thing to have a document defining what we consider to be the exposed API for extensions. In most cases we could not actually stop extension developers from relying on stuff outside the defined API, and I don't particularly feel a need to try. But it would be clear to all concerned that if you rely on something not in the API, it's your problem if we remove it or whack it around in some future release. On the other side, it would be clearer to core-code developers which changes should be avoided because they would cause pain to extension authors.
Unfortunately, it would be a lot of work to develop such a thing, and no one has wanted to take it on.
regards, tom lane
On 27/02/16 15:10, Chapman Flack wrote:
> On 02/27/16 08:37, Álvaro Hernández Tortosa wrote:
>> In other words: what is the API surface exposed by PostgreSQL to
>> extension developers? The assumption is that no PostgreSQL code should be
>> modified, just adding your own and calling existing functions.
> That's an excellent question that repeatedly comes up, in particular
> because of the difference between the way the MSVC linker works on Windows,
> and the way most other linkers work on other platforms.
>
> The issue there is ... on most non-Windows platforms, there are only the
> general C rules to think about: if a symbol is static (or auto of course)
> it is not visible to extensions, but otherwise it is.
>
> For MSVC, in contrast, symbols need to have a certain decoration
> (look for PGDLLIMPORT in various PostgreSQL .h files) for an MSVC-built
> extension to be able to see it, otherwise it isn't accessible.
>
> Well, that's not quite right. It turns out (and it may have taken some
> work on the build process to make it turn out this way) ... *functions*
> are accessible from MSVC (as long as they would be accessible under
> normal C rules) whether or not they have PGDLLIMPORT. It's just
> data symbols/variables that have to have PGDLLIMPORT or they aren't
> available on Windows/MSVC.
>
> And *that* arrangement is the result of a long thread in 2014 that
> unfolded after discovering that what was really happening in MSVC
> *before* that was that MSVC would silently pretend to link your
> non-PGDLLIMPORT data symbols, and then give you the wrong data.
>
> http://www.postgresql.org/message-id/flat/52FAB90B.6020302@2ndquadrant.com
>
> In that long thread, there are a few messages in the middle that probably
> give the closest current answer to your API question. Craig Ringer has
> consistently favored making other platforms work more like Windows/MSVC,
> so that the PGDLLIMPORT business would serve to limit and more clearly
> define the API surface:
>
> http://www.postgresql.org/message-id/52EF1468.6080107@2ndquadrant.com
>
> Andres Freund had the pragmatic reply:
>
> http://www.postgresql.org/message-id/20140203103701.GA1225@awork2.anarazel.de
>
>> I think that'd be an exercise in futility. ... We'd break countless
>> extensions people have written. ... we'd need to have a really
>> separate API layer ... doesn't seem likely to arrive anytime soon,
>> if ever.
> which was ultimately concurred in by Tom, and Craig too:
>
> http://www.postgresql.org/message-id/29286.1391436782@sss.pgh.pa.us
> http://www.postgresql.org/message-id/52EFA654.8010609@2ndquadrant.com
>
> Andres characterized it as "We have a (mostly) proper API. Just not
> an internal/external API split."
>
> http://www.postgresql.org/message-id/20140203142514.GD1225@awork2.anarazel.de
>
> -Chap
Hi Chapman.
Thank you very much for your detailed message and all the references. They were very appropriate.
However, I still lack a list of functions that might be callable from an extension's point of view (I understand that not even all of those labeled with PGDLLIMPORT are good candidates, and that some good candidates might not be labeled as such). Have you come across such a list in any of these threads? I haven't been able to find it.
Thanks for your input!
Álvaro
--
Álvaro Hernández Tortosa
-----------
8Kdata
On 27/02/16 15:43, Tom Lane wrote:
> Chapman Flack <chap@anastigmatix.net> writes:
>> On 02/27/16 08:37, Álvaro Hernández Tortosa wrote:
>>> In other words: what is the API surface exposed by PostgreSQL to
>>> extension developers? The assumption is that no PostgreSQL code should be
>>> modified, just adding your own and calling existing functions.
>> That's an excellent question that repeatedly comes up, in particular
>> because of the difference between the way the MSVC linker works on Windows,
>> and the way most other linkers work on other platforms.
> Yeah. It would be a fine thing to have a document defining what we
> consider to be the exposed API for extensions. In most cases we could
> not actually stop extension developers from relying on stuff outside the
> defined API, and I don't particularly feel a need to try. But it would be
> clear to all concerned that if you rely on something not in the API, it's
> your problem if we remove it or whack it around in some future release.
> On the other side, it would be clearer to core-code developers which
> changes should be avoided because they would cause pain to extension
> authors.
>
> Unfortunately, it would be a lot of work to develop such a thing, and no
> one has wanted to take it on.
Why would it be so much work? Creating a function list, and maybe documenting those, doesn't sound like a daunting task.
I wouldn't mind volunteering for this work, but I guess I would need some help to understand and identify the candidate parts of the API. If anyone could help me here, please let me know.
Álvaro
--
Álvaro Hernández Tortosa
-----------
8Kdata
Álvaro Hernández Tortosa wrote:
> I wouldn't mind volunteering for this work, but I guess I would
> need some help to understand and identify the candidate parts of
> the API. If anyone could help me here, please let me know.
When you write an extension, you often regret that someone declared this or that function as static. I am not sure that such a list could ever be complete. In Postgres there are no clear boundaries between the subsystems.
--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 02/27/16 13:51, Álvaro Hernández Tortosa wrote:
> ... I still lack a list of functions that might be callable (I
> understand not even those labeled with PGDLLIMPORT are all good candidates
> and some good candidates might not be labeled as such) from an extension
> point of view. Have you come across such a list over any of these threads?
On my best understanding, there isn't really such a thing exactly. If the formulation by Andres is persuasive ("We have a (mostly) proper API. Just not an internal/external API split"), then the good references for hacking an extension will be essentially the same as the good references for hacking PostgreSQL, such as the "Hacking PostgreSQL Resources" found on the "So, you want to be a developer?" wiki page:
https://wiki.postgresql.org/wiki/So,_you_want_to_be_a_developer%3F
Also, the PostgreSQL code repository has a lot of README files in subdirectories where important pieces of the architecture happen, and they are very informative and worth reading, and also the comments are often quite comprehensive in the .h or .c files pertaining to the parts of the system you need to interact with.
The extra ingredients for being an *extension* author, in the absence of any formalized "this is the extension API" documentation, seem to be those unformalized qualities like taste or restraint, in looking over the available interfaces and judging which ones seem to be fundamental, useful, stable, less likely to be whacked around later, etc. Those qualities also can be called "enlightened self-interest" because you are not looking forward to fixing your busted extension when something you have relied on changes.
Another piece of the puzzle seems to be participating on -hackers so that you may see what changes are coming, or possibly advocate for why a particular interface really is useful to your extension and is worth committing to.
If there is some subspace of possible extensions where you are interested in working, taking on some maintenance of an existing extension in that space, thereby getting familiar with what interfaces it relies on and why, seems to be an effective baptism-by-fire. :) The danger to avoid would be then drawing overbroad conclusions about what should or shouldn't be extension API, based on what is useful for the subspace of imaginable extensions in which you are working.
-Chap
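One common way extension authors cope when an interface they rely on changes between releases is a compile-time guard on `PG_VERSION_NUM`, which PostgreSQL defines in its generated configuration header. The following is a minimal sketch, not taken from the thread: the fallback `#define` exists only so the file compiles without PostgreSQL headers, and the two `demo_internal_*` functions are hypothetical stand-ins for an internal routine whose signature is assumed to have gained an argument in 9.5.

```c
/* Assumption: provide a value only when building outside PostgreSQL. */
#ifndef PG_VERSION_NUM
#define PG_VERSION_NUM 90500
#endif

/* Hypothetical "newer" form of an internal routine: takes a flags argument. */
int
demo_internal_new(int value, int flags)
{
    return value + flags;
}

/* Hypothetical "older" form of the same routine: no flags argument. */
int
demo_internal_old(int value)
{
    return value;
}

/*
 * Wrapper the rest of the extension calls.  Only this function needs to
 * know which signature the server being built against provides.
 */
int
demo_compat_call(int value)
{
#if PG_VERSION_NUM >= 90500
    return demo_internal_new(value, 0);
#else
    return demo_internal_old(value);
#endif
}
```

Keeping the version test inside one wrapper confines the breakage to a single place when the relied-upon interface is changed again.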
On 02/27/16 14:11, Álvaro Hernández Tortosa wrote:
> Why would it be so much work? Creating a function list, and maybe
> documenting those, doesn't sound like a daunting task.
>
> I wouldn't mind volunteering for this work, but I guess I would need
> some help to understand and identify the candidate parts of the API.
I guess one daunting part is that the first approximation to "candidate parts of the API" is something like "that which is useful to extensions" and there are a lot of those, adding a really wide variety of capabilities, and not all of their maintainers may be close followers of -hackers or in a position to promptly answer if you asked "what are all the PostgreSQL interfaces your extension relies on and why?".
My experience in working on PL/Java has been, sort of recurringly, that I may appear on -hackers needing to advocate that PGDLLIMPORT be put on some recently-added variable, or that there be some way to hook into the extension dependency mechanism (to cite a couple recent examples) and face initial questions on why such a need crops up in an extension. So it takes some more explaining, and I don't think that reflects in any way on the perspicacity of the -hackers readership; it's just that any piece you're not personally immersed in is likely to have details that won't have jumped out at you. Such things probably lurk in the corners of most existing extensions, of which there are a lot.
-Chap