Thread: No longer possible to query catalogs for index capabilities?
With the gutting of pg_am in 9.6, there seems to be no longer any way for a query of the system catalogs to discover any of the index capabilities that were formerly columns in pg_am (notably amcanorder, amcanorderbyop, amclusterable, amsearcharray, amsearchnulls). Am I missing something or is this a significant oversight? -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > With the gutting of pg_am in 9.6, there seems to be no longer any way > for a query of the system catalogs to discover any of the index > capabilities that were formerly columns in pg_am (notably amcanorder, > amcanorderbyop, amclusterable, amsearcharray, amsearchnulls). > Am I missing something or is this a significant oversight? It's absolutely not an oversight. We asked when 65c5fcd35 went in whether there was still any need for that information to be available at the SQL level, and nobody appeared to care. We could in theory expose a view to show the data --- but since a large part of the point of that change was to not need initdb for AM API changes, and to not be constrained to exactly SQL-compatible representations within that API, I'm disinclined to do so without a fairly compelling argument why it's needed. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: >> With the gutting of pg_am in 9.6, there seems to be no longer any>> way for a query of the system catalogs to discoverany of the index>> capabilities that were formerly columns in pg_am (notably>> amcanorder, amcanorderbyop, amclusterable,amsearcharray,>> amsearchnulls). >> Am I missing something or is this a significant oversight? Tom> It's absolutely not an oversight. We asked when 65c5fcd35 went inTom> whether there was still any need for that informationto beTom> available at the SQL level, and nobody appeared to care. Perhaps you were asking the wrong people? Tom> We could in theory expose a view to show the data --- but since aTom> large part of the point of that change was tonot need initdb forTom> AM API changes, and to not be constrained to exactlyTom> SQL-compatible representations withinthat API, I'm disinclined toTom> do so without a fairly compelling argument why it's needed. It could easily be exposed as a function interface of the form index_has_capability(oid,name) or indexam_has_capability(oid,name) without any initdb worries. That would surely be better than the present condition of being completely unable to get this information from SQL. -- Andrew (irc:RhodiumToad)
* Andrew Gierth (andrew@tao11.riddles.org.uk) wrote: > >>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > > >> With the gutting of pg_am in 9.6, there seems to be no longer any > >> way for a query of the system catalogs to discover any of the index > >> capabilities that were formerly columns in pg_am (notably > >> amcanorder, amcanorderbyop, amclusterable, amsearcharray, > >> amsearchnulls). > > >> Am I missing something or is this a significant oversight? > > Tom> It's absolutely not an oversight. We asked when 65c5fcd35 went in > Tom> whether there was still any need for that information to be > Tom> available at the SQL level, and nobody appeared to care. > > Perhaps you were asking the wrong people? The capabilities strike me as useful to expose, they're pretty useful to know. I believe we were right to hide the APIs/functions and don't see any need to expose those to the SQL level. > Tom> We could in theory expose a view to show the data --- but since a > Tom> large part of the point of that change was to not need initdb for > Tom> AM API changes, and to not be constrained to exactly > Tom> SQL-compatible representations within that API, I'm disinclined to > Tom> do so without a fairly compelling argument why it's needed. > > It could easily be exposed as a function interface of the form > index_has_capability(oid,name) or indexam_has_capability(oid,name) > without any initdb worries. Hmm, that seems pretty reasonable. > That would surely be better than the present condition of being > completely unable to get this information from SQL. Agreed. Thanks! Stephen
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Tom> We could in theory expose a view to show the data --- but since a > Tom> large part of the point of that change was to not need initdb for > Tom> AM API changes, and to not be constrained to exactly > Tom> SQL-compatible representations within that API, I'm disinclined to > Tom> do so without a fairly compelling argument why it's needed. > It could easily be exposed as a function interface of the form > index_has_capability(oid,name) or indexam_has_capability(oid,name) > without any initdb worries. You missed the "compelling argument why it's needed" part. What is the need for this? I'm not going to be persuaded by "it was there before". We've gotten along fine without such inspection functions for FDWs and tablesample methods, so I doubt that we really need them for index AMs. Nobody's writing applications that make decisions about which AM to use based on what they see in pg_am. And anyone who's concerned whether their AM is reporting the right info is going to be much better served by gdb than by some functions that can present only a subset of what's in the info struct. Moreover, I think you are missing the point about initdb. The issue there is that anytime in future that we make a change to the AM API, we'd need to have a debate about whether and how to expose such a change for SQL inspection. Defining the exposure mechanism as a new function rather than a new view column changes neither the need for a debate, nor the need for an initdb unless we decide that we don't need to expose anything. But if your proposal is merely that we freeze the set of information available as some subset of what used to be available from pg_am, then it sounds an awful lot like a backwards-compatibility hack rather than an honest attempt to describe AM capabilities. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: >> It could easily be exposed as a function interface of the form>> index_has_capability(oid,name) or indexam_has_capability(oid,name)>>without any initdb worries. Tom> You missed the "compelling argument why it's needed" part. WhatTom> is the need for this? I'm not going to be persuadedby "it wasTom> there before". How about "it was there before, and people did use it"? In fact I notice you participated in a discussion of this a couple of months back on the JDBC list, in which your solution was to suggest hardcoding the name 'btree' into the query: https://www.postgresql.org/message-id/24504.1463237368%40sss.pgh.pa.us Doesn't that strike you as an indication that something is wrong? Tom> We've gotten along fine without such inspection functions for FDWsTom> and tablesample methods, which are new and not especially interesting to code doing introspection Tom> so I doubt that we really need them for index AMs. People write catalog queries for indexes a whole lot more than they do for FDWs or tablesample methods. This whole discussion started because I wrote a catalog query for someone on IRC, and found I couldn't do it on 9.6 because amcanorder was gone. Tom> Nobody's writing applications that make decisions about which AMTom> to use based on what they see in pg_am. That's not the issue. The issue is finding information about _existing_ indexes that is not otherwise exposed. Tom> Moreover, I think you are missing the point about initdb. TheTom> issue there is that anytime in future that we makea change to theTom> AM API, we'd need to have a debate about whether and how to exposeTom> such a change for SQL inspection. Defining the exposure mechanismTom> as a new function rather than a new view column changes neitherTom> the needfor a debate, nor the need for an initdb unless we decideTom> that we don't need to expose anything. I'm not proposing a new function for each capability. I'm proposing ONE function (or two, one starting from the index rather than the AM, for convenience). Adding more capability names would not require an initdb. -- Andrew (irc:RhodiumToad)
On Mon, Jul 25, 2016 at 10:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > You missed the "compelling argument why it's needed" part. What is the > need for this? It's self-evident that this thread wouldn't exist if it were not the case that people had queries that no longer work because of these new changes. You can hold your breath and pretend that every single one of those queries is probably misdesigned, but I do not think anyone else will find that argument convincing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Jul 25, 2016 at 10:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> You missed the "compelling argument why it's needed" part. What is the >> need for this? > It's self-evident that this thread wouldn't exist if it were not the > case that people had queries that no longer work because of these new > changes. You can hold your breath and pretend that every single one > of those queries is probably misdesigned, but I do not think anyone > else will find that argument convincing. We've already broken existing queries against pg_am, simply because the columns are not there anymore; and that decision is not getting undone at this point. I'm willing to consider putting back some substitute capability, but I'd like to see as much evidence for adding that as we'd expect for any other new feature. Andrew still hasn't shown a concrete example of what he needs to do and why. regards, tom lane
Here is my proposed code (first cut; obviously it needs docs too). Opinions? -- Andrew (irc:RhodiumToad)
Attachment
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> Andrew still hasn't shown a concrete example of what he needs toTom> do and why. The issue I ran into was the exact same one as in the JDBC thread I linked to earlier: correctly interpreting pg_index.indoption (to get the ASC / DESC and NULLS FIRST/LAST settings), which requires knowing whether amcanorder is true to determine whether to look at the bits at all. The guy I was helping was using an earlier pg version, so it didn't affect him (yet); I hit it when trying to test the query on 9.6. -- Andrew (irc:RhodiumToad)
On 07/25/2016 12:19 PM, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Mon, Jul 25, 2016 at 10:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> You missed the "compelling argument why it's needed" part. What is the >>> need for this? > >> It's self-evident that this thread wouldn't exist if it were not the >> case that people had queries that no longer work because of these new >> changes. You can hold your breath and pretend that every single one >> of those queries is probably misdesigned, but I do not think anyone >> else will find that argument convincing. > > We've already broken existing queries against pg_am, simply because the > columns are not there anymore; and that decision is not getting undone > at this point. I'm willing to consider putting back some substitute > capability, but I'd like to see as much evidence for adding that as we'd > expect for any other new feature. Andrew still hasn't shown a concrete > example of what he needs to do and why. I think that Andrew and other people who have commented on this thread made it pretty obvious why it is useful. JD > > regards, tom lane > > -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them. Unless otherwise stated, opinions are my own.
"Joshua D. Drake" <jd@commandprompt.com> writes: > On 07/25/2016 12:19 PM, Tom Lane wrote: >> Andrew still hasn't shown a concrete >> example of what he needs to do and why. > I think that Andrew and other people who have commented on this thread > made it pretty obvious why it is useful. Both Andrew and Robert have asserted without proof that it'd be useful to be able to get at some of that data. Given the lack of any supporting evidence, it's impossible to know which data needs to be exposed, and that's why I find their statements insufficient. "Emulate 9.5's pg_am exactly" is not in the cards, and short of that I'd like to see some concrete reasons why we do or do not need to expose particular bits of data. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > "Joshua D. Drake" <jd@commandprompt.com> writes: > > On 07/25/2016 12:19 PM, Tom Lane wrote: > >> Andrew still hasn't shown a concrete > >> example of what he needs to do and why. > > > I think that Andrew and other people who have commented on this thread > > made it pretty obvious why it is useful. > > Both Andrew and Robert have asserted without proof that it'd be useful > to be able to get at some of that data. Given the lack of any supporting > evidence, it's impossible to know which data needs to be exposed, and > that's why I find their statements insufficient. "Emulate 9.5's pg_am > exactly" is not in the cards, and short of that I'd like to see some > concrete reasons why we do or do not need to expose particular bits of > data. I believe the response to "what" is the patch which Andrew provided, and the use-case is illustrated by the query which he wrote that used those columns in much the same way that the JDBC driver used them (and which was also broken by their removal). This isn't just academic "gee, I wish we hadn't removed those columns", there are clearly cases where they were useful and were used. I do not believe hard-coding the name of index types as a definitive list of which indexes support what capabilities is an appropriate approach (as was suggested, and evidently done, for the JDBC driver). Additional use-cases include query analysis, by which one might want to see what capabilities an index has to understand why it may or may not be useful for a given query. I would also suggest that relying on pg_get_indexdef() is a poor solution and we should be considering how to expose the necessary information for pg_dump through the catalog instead of asking users who are interested to use a function that returns the result as an SQL DDL statement. We don't do that for table definitions and have argued time and time again why we shouldn't. Thanks! Stephen
On 25/07/16 21:20, Andrew Gierth wrote: > Here is my proposed code (first cut; obviously it needs docs too). > Opinions? I support the future of this patch, for 9.6. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
And a doc patch to go with it: -- Andrew (irc:RhodiumToad)
Attachment
On 7/25/16 3:26 PM, Andrew Gierth wrote: > The issue I ran into was the exact same one as in the JDBC thread I > linked to earlier: correctly interpreting pg_index.indoption (to get the > ASC / DESC and NULLS FIRST/LAST settings), which requires knowing > whether amcanorder is true to determine whether to look at the bits at > all. Maybe we should provide a facility to decode those bits then? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > On 7/25/16 3:26 PM, Andrew Gierth wrote: >> The issue I ran into was the exact same one as in the JDBC thread I >> linked to earlier: correctly interpreting pg_index.indoption (to get the >> ASC / DESC and NULLS FIRST/LAST settings), which requires knowing >> whether amcanorder is true to determine whether to look at the bits at >> all. > Maybe we should provide a facility to decode those bits then? Yeah. I'm not very impressed by the underlying assumption that it's okay for client-side code to hard-wire knowledge about what indoption bits mean, but not okay for it to hard-wire knowledge about which index AMs use which indoption bits. There's something fundamentally wrong in that. We don't let psql or pg_dump look directly at indoption, so why would we think that third-party client-side code should do so? Andrew complained upthread that pg_get_indexdef() was too heavyweight for his purposes, but it's not clear to me what he wants instead. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > > On 7/25/16 3:26 PM, Andrew Gierth wrote: > >> The issue I ran into was the exact same one as in the JDBC thread I > >> linked to earlier: correctly interpreting pg_index.indoption (to get the > >> ASC / DESC and NULLS FIRST/LAST settings), which requires knowing > >> whether amcanorder is true to determine whether to look at the bits at > >> all. > > > Maybe we should provide a facility to decode those bits then? > > Yeah. I'm not very impressed by the underlying assumption that it's > okay for client-side code to hard-wire knowledge about what indoption > bits mean, but not okay for it to hard-wire knowledge about which index > AMs use which indoption bits. There's something fundamentally wrong > in that. We don't let psql or pg_dump look directly at indoption, so > why would we think that third-party client-side code should do so? > > Andrew complained upthread that pg_get_indexdef() was too heavyweight > for his purposes, but it's not clear to me what he wants instead. I guess I'm missing something because it seems quite clear to me. He wants to know if the index was built with ASC or DESC, and if it was built with NULLS FIRST or NULLS LAST, just like the JDBC driver. pg_get_indexdef() will return that information, but as an SQL statement with a lot of other information that isn't relevant and is difficult to deal with when all you're trying to do is write an SQL query (no, I don't believe the solution here is to use pg_get_indexef() ~ 'DESC'). For my 2c, I'd like to see pg_dump able to use the catalog tables to derive the index definition, just as they manage to figure out table definitions without (for the most part) using functions. More generally, I believe we should be working to reach a point where we can reconstruct all objects in the database using just the catalog, without any SQL bits being provided from special functions which access information that isn't available at the SQL level. I don't see any problem with what Andrew has proposed as the information returned informs the creation of the DDL statement, but does not provide a textual "drop-in"/black-box component to include in the statement to recreate the object, the way pg_get_indexdef() does. Thanks! Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> Yeah. I'm not very impressed by the underlying assumption that it's >> okay for client-side code to hard-wire knowledge about what indoption >> bits mean, but not okay for it to hard-wire knowledge about which index >> AMs use which indoption bits. There's something fundamentally wrong >> in that. We don't let psql or pg_dump look directly at indoption, so >> why would we think that third-party client-side code should do so? > For my 2c, I'd like to see pg_dump able to use the catalog tables to > derive the index definition, just as they manage to figure out table > definitions without (for the most part) using functions. More > generally, I believe we should be working to reach a point where we can > reconstruct all objects in the database using just the catalog, without > any SQL bits being provided from special functions which access > information that isn't available at the SQL level. No, I reject that entirely. It would be insane for example to expect that random client-side code should be able to interpret the node trees stored in places like pg_index.indexprs. It's barely possible that we could maintain such logic in pg_dump, though having to maintain a different version for each supported server branch would be a giant PITA. But do you also want to maintain translated-into-Java copies of each of those libraries for the benefit of JDBC? Or any other language that client code might be written in? Now, obviously knowing which bit in pg_index.indoption does what would be a few orders of magnitude less of a maintenance hazard than knowing what expression node trees contain. But that doesn't make it a good future-proof thing for clients to be doing. If the answer to the question "why do you need access to pg_am.amcanorder?" is "so I can interpret the bits in pg_index.indoption", I think it's clear that we've got an abstraction failure that is not going to be fixed by just exposing something equivalent to the old pg_am definition. Building on the has-property approach Andrew suggested, I wonder if we need something like pg_index_column_has_property(indexoid, colno, propertyname) with properties like "sortable", "desc", "nulls first". regards, tom lane
Tom, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > * Tom Lane (tgl@sss.pgh.pa.us) wrote: > >> Yeah. I'm not very impressed by the underlying assumption that it's > >> okay for client-side code to hard-wire knowledge about what indoption > >> bits mean, but not okay for it to hard-wire knowledge about which index > >> AMs use which indoption bits. There's something fundamentally wrong > >> in that. We don't let psql or pg_dump look directly at indoption, so > >> why would we think that third-party client-side code should do so? > > > For my 2c, I'd like to see pg_dump able to use the catalog tables to > > derive the index definition, just as they manage to figure out table > > definitions without (for the most part) using functions. More > > generally, I believe we should be working to reach a point where we can > > reconstruct all objects in the database using just the catalog, without > > any SQL bits being provided from special functions which access > > information that isn't available at the SQL level. > > No, I reject that entirely. It would be insane for example to expect that > random client-side code should be able to interpret the node trees stored > in places like pg_index.indexprs. It's barely possible that we could > maintain such logic in pg_dump, though having to maintain a different > version for each supported server branch would be a giant PITA. But do > you also want to maintain translated-into-Java copies of each of those > libraries for the benefit of JDBC? Or any other language that client > code might be written in? Honestly, I anticipated the focus on the pg_get_expr() and should have explicitly commented on it. I agree that we shouldn't look to have pg_dump or client utilities be able to understand node trees and that, instead, we should continue to provide a way for those to be reconstructed into SQL expressions. > Now, obviously knowing which bit in pg_index.indoption does what would be > a few orders of magnitude less of a maintenance hazard than knowing what > expression node trees contain. But that doesn't make it a good > future-proof thing for clients to be doing. If the answer to the question > "why do you need access to pg_am.amcanorder?" is "so I can interpret the > bits in pg_index.indoption", I think it's clear that we've got an > abstraction failure that is not going to be fixed by just exposing > something equivalent to the old pg_am definition. I agree- asking clients to interpret the bits in pg_index.indoption isn't the right answer either. > Building on the has-property approach Andrew suggested, I wonder if > we need something like pg_index_column_has_property(indexoid, colno, > propertyname) with properties like "sortable", "desc", "nulls first". Right, this makes sense to me. The point which I was trying to get at above is that we should be able to replace most of what is provided in pg_get_indexdef() by using this function to rebuild the CREATE INDEX command- again, similar to how we build a CREATE TABLE command rather than simply provide a 'pg_get_tabledef()'. Thanks! Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> Building on the has-property approach Andrew suggested, I wonder if >> we need something like pg_index_column_has_property(indexoid, colno, >> propertyname) with properties like "sortable", "desc", "nulls first". > Right, this makes sense to me. The point which I was trying to get at > above is that we should be able to replace most of what is provided in > pg_get_indexdef() by using this function to rebuild the CREATE INDEX > command- again, similar to how we build a CREATE TABLE command rather > than simply provide a 'pg_get_tabledef()'. Uh, what would be the point? You're just replacing a call to one backend function with calls to N backend functions, and creating version dependencies and opportunities for errors of omission on the client side. (That is, there's exactly no chance that the set of functions you'd need to call to construct a CREATE INDEX command would stay static forever. We keep adding new index features.) As far as I understood Andrew's use case, he was specifically *not* interested in a complete representation of an index definition, but rather about whether it had certain properties that would be of interest to query-constructing applications. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > * Tom Lane (tgl@sss.pgh.pa.us) wrote: > >> Building on the has-property approach Andrew suggested, I wonder if > >> we need something like pg_index_column_has_property(indexoid, colno, > >> propertyname) with properties like "sortable", "desc", "nulls first". > > > Right, this makes sense to me. The point which I was trying to get at > > above is that we should be able to replace most of what is provided in > > pg_get_indexdef() by using this function to rebuild the CREATE INDEX > > command- again, similar to how we build a CREATE TABLE command rather > > than simply provide a 'pg_get_tabledef()'. > > Uh, what would be the point? You're just replacing a call to one backend > function with calls to N backend functions, and creating version > dependencies and opportunities for errors of omission on the client side. > (That is, there's exactly no chance that the set of functions you'd need > to call to construct a CREATE INDEX command would stay static forever. > We keep adding new index features.) We also keep adding table-level options too, and is therefore hardly a reason to argue that we shouldn't provide the information through the catalog for a client-side application to rebuild a table, as pg_dump does. > As far as I understood Andrew's use case, he was specifically *not* > interested in a complete representation of an index definition, but > rather about whether it had certain properties that would be of > interest to query-constructing applications. I'm not convinced that the two are actually different. As we add new index features, query-constructing applications may be interested in those new features and therefore we should be exposing that information. If we were using a capabilities function to build up the CREATE INDEX command in pg_dump, we never would have ended up in the situation which we find ourselves now- specifically, that we've removed information that applications were using. Consider the RLS case. If we were using some hypothetical pg_get_tabledef() in pg_dump, and that function handled everything about building the table definition, we might not have considered how to expose the policy information for RLS and could have stored things like "what command is this policy for?" as an opaque column that clients wouldn't easily understand. That would have been unfortunate, as there are clients which are definitely interested in the policies that have been defined on tables, for auditing purposes. In other words, for my 2c, pg_dump provides a great definition of what we should provide in the way of database introspection and we should try to minimize the cases where we're providing special server-side functions that pg_dump needs to perform its role. That this information is needed by client applications and we don't provide an easy way to programatically access it demonstrates how pg_get_indexdef() really went too far in the direction of the server handing opaque SQL commands for the client to run to recreate the object, without giving the client any understanding of the definition of the object. Thanks! Stephen
Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> Building on the has-property approach Andrew suggested, I wonder if
>> we need something like pg_index_column_has_property(indexoid, colno,
>> propertyname) with properties like "sortable", "desc", "nulls first".
> Right, this makes sense to me. The point which I was trying to get at
> above is that we should be able to replace most of what is provided in
> pg_get_indexdef() by using this function to rebuild the CREATE INDEX
> command- again, similar to how we build a CREATE TABLE command rather
> than simply provide a 'pg_get_tabledef()'.
As far as I understood Andrew's use case, he was specifically *not*
interested in a complete representation of an index definition, but
rather about whether it had certain properties that would be of
interest to query-constructing applications.
+1
I guess it might matter whether the query be constructed is CREATE INDEX or SELECT
The later seems to be more useful. The former should probably be something the server provides as a whole when requested.
David J.
On Mon, Aug 1, 2016 at 10:19:43AM -0400, Tom Lane wrote: > As far as I understood Andrew's use case, he was specifically *not* > interested in a complete representation of an index definition, but > rather about whether it had certain properties that would be of > interest to query-constructing applications. Would it be helpful to output an array of strings representing the index definition? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
>>>>> "Bruce" == Bruce Momjian <bruce@momjian.us> writes: >> As far as I understood Andrew's use case, he was specifically *not*>> interested in a complete representation of an indexdefinition, but>> rather about whether it had certain properties that would be of>> interest to query-constructing applications. Well, I wouldn't limit it to query-constructing applications. I'll give another random example that I thought of. Suppose an administrative GUI (I have no idea if any of the existing GUIs do this) has an option to do CLUSTER on a table; how should it know which indexes to offer the user to cluster on, without access to amclusterable? Bruce> Would it be helpful to output an array of strings representingBruce> the index definition? Why would that help, if the point is to enable programmatic access to information? Anyway, what I haven't seen in this thread is any implementable counter-proposal other than the "just hardcode the name 'btree'" response that was given in the JDBC thread, which I don't consider acceptable in any sense. Is 9.6 going to go out like this or is action going to be taken before rc1? -- Andrew (irc:RhodiumToad)
On Sat, Aug 6, 2016 at 01:00:15PM +0100, Andrew Gierth wrote: > >>>>> "Bruce" == Bruce Momjian <bruce@momjian.us> writes: > > >> As far as I understood Andrew's use case, he was specifically *not* > >> interested in a complete representation of an index definition, but > >> rather about whether it had certain properties that would be of > >> interest to query-constructing applications. > > Well, I wouldn't limit it to query-constructing applications. > > I'll give another random example that I thought of. Suppose an > administrative GUI (I have no idea if any of the existing GUIs do this) > has an option to do CLUSTER on a table; how should it know which indexes > to offer the user to cluster on, without access to amclusterable? > > Bruce> Would it be helpful to output an array of strings representing > Bruce> the index definition? > > Why would that help, if the point is to enable programmatic access to > information? I was thinking an array of strings would avoid problems in having to re-scan the output for tokens. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
>>>>> "Bruce" == Bruce Momjian <bruce@momjian.us> writes: Bruce> Would it be helpful to output an array of strings representingBruce> the index definition? >> Why would that help, if the point is to enable programmatic access>> to information? Bruce> I was thinking an array of strings would avoid problems inBruce> having to re-scan the output for tokens. OK, but that still leaves the issue of how to interpret each string, yes? -- Andrew (irc:RhodiumToad)
On Sat, Aug 6, 2016 at 04:04:41PM +0100, Andrew Gierth wrote: > >>>>> "Bruce" == Bruce Momjian <bruce@momjian.us> writes: > > Bruce> Would it be helpful to output an array of strings representing > Bruce> the index definition? > > >> Why would that help, if the point is to enable programmatic access > >> to information? > > Bruce> I was thinking an array of strings would avoid problems in > Bruce> having to re-scan the output for tokens. > > OK, but that still leaves the issue of how to interpret each string, > yes? Yes, you still have to parse it, just not scan/tokenize it. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On 8/6/16 10:13 AM, Bruce Momjian wrote: > On Sat, Aug 6, 2016 at 04:04:41PM +0100, Andrew Gierth wrote: >>>>>>> "Bruce" == Bruce Momjian <bruce@momjian.us> writes: >> >> Bruce> Would it be helpful to output an array of strings representing >> Bruce> the index definition? >> >> >> Why would that help, if the point is to enable programmatic access >> >> to information? >> >> Bruce> I was thinking an array of strings would avoid problems in >> Bruce> having to re-scan the output for tokens. >> >> OK, but that still leaves the issue of how to interpret each string, >> yes? > > Yes, you still have to parse it, just not scan/tokenize it. That's an improvement. For some scenarios maybe it's enough. But I also don't see what's wrong with having the ability to probe for specific capabilities. I've needed to do this for unit tests, and for some items it's a real bear. Trigger definitions are an example. I've done this in the past in unit tests to ensure that a trigger was defined in a particular way. Some data is available directly in the catalog, but getting at other info would have required hard-coding knowledge of specific bit patterns into the query. Instead of doing that I elected to parse pg_get_triggerdef[1], but that's hardly satisfying either. 1: https://github.com/decibel/cat_tools/blob/master/sql/cat_tools.sql#L339 -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
On Sat, Aug 6, 2016 at 8:00 AM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: > Anyway, what I haven't seen in this thread is any implementable > counter-proposal other than the "just hardcode the name 'btree'" > response that was given in the JDBC thread, which I don't consider > acceptable in any sense. Is 9.6 going to go out like this or is action > going to be taken before rc1? Well, at this point, I think 9.6 is going to go out like this, unless Tom is willing to do something today. Multiple people have expressed clear support for adding something along the lines you've suggested, I too am in favor, and I think it's unfortunate that Tom didn't do something about it before now. But I'm neither willing to commit a patch to fix the day before rc1 nor to argue that the whole release cycle should be put back by several weeks on account of this issue. Once we open the tree for 10, I'm willing to pick this up if nobody else has gotten to it before then. I realize that's probably not the answer you were hoping for, and I'm sorry about that. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
-----BEGIN PGP SIGNED MESSAGE----- Hash: RIPEMD160 Robert Haas wrote: > But I'm neither willing to commit a patch to fix the day before rc1 > nor to argue that the whole release cycle should be put back by > several weeks on account of this issue. Seriously? First, not sure why this would put the whole release cycle back by 'several weeks'. Second, this is removing functionality, so what are apps supposed to do - have a three-choice case in the code to handle pg_am for < 9.6, do some ugly parsing for 9.6, and use the new functions when 10.0 comes out?! This issue was raised on July 25th, and the OP has gone out of his way to present the case and provide patches. It's hardly fair to discard it now. - -- Greg Sabino Mullane greg@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201608071606 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEAREDAAYFAlenlWUACgkQvJuQZxSWSsjjeACfVrThYGx+4DnBwO2ZAOYGoK7s wdgAoOoxdVo0RM7smSr3CJg8J4dM3YMo =+m9i -----END PGP SIGNATURE-----
On Sun, Aug 7, 2016 at 4:09 PM, Greg Sabino Mullane <greg@turnstep.com> wrote: > Robert Haas wrote: >> But I'm neither willing to commit a patch to fix the day before rc1 >> nor to argue that the whole release cycle should be put back by >> several weeks on account of this issue. > > Seriously? First, not sure why this would put the whole release cycle > back by 'several weeks'. It's a combination of three things. First, it is unhelpful to users and to the project and burdensome to packagers to do lots of releases in quick succession. If we issue beta4 this week, we should really wait about 3 weeks before we consider issuing beta5 or rc1. Otherwise, we're going to have little time to get any meaningful feedback on the release, and possibly annoy some of our all-volunteer packaging team. Second, we are doing a set of minor releases this week and those minor releases will include some security fixes. We do not want to do minor releases for the back branches without releasing a new version of 9.6, because that leaves people who are trying to help with beta-testing for 9.6 running close with disclosed security vulnerabilities. Third, I don't think it's appropriate to have a catversion bump between rc1 and final; rc1 should be very, very close to final; if it's not, we weren't really ready for rc1. Because of point #2, we have to release either 9.6beta4 or 9.6rc1 this week. If we pick 9.6rc1, then because of point #3, we cannot make this change before final. If we pick 9.6beta4, then because of point #1, we cannot reasonably release 9.6rc1 for about another 3 weeks. These issues have been discussed by the release team, which includes the core team and the pool of active committers, and this is the consensus. Of course, you may not agree. > Second, this is removing functionality, so what > are apps supposed to do - have a three-choice case in the code to handle > pg_am for < 9.6, do some ugly parsing for 9.6, and use the new functions > when 10.0 comes out?! This issue was raised on July 25th, and the OP has > gone out of his way to present the case and provide patches. It's hardly > fair to discard it now. I understand your concern and I wish things had come out differently than they have with respect to this issue. However, I do not agree that rushing a patch into the tree less than 24 hours before rc1 is likely to improve our chances of a stable, on-time release, and I also do not agree that this one issue is so important that it justifies holding up the final release by another three weeks. You may feel differently and that is fine, but I do not think that my position is unreasonable. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Sat, Aug 6, 2016 at 8:00 AM, Andrew Gierth > <andrew@tao11.riddles.org.uk> wrote: >> Anyway, what I haven't seen in this thread is any implementable >> counter-proposal other than the "just hardcode the name 'btree'" >> response that was given in the JDBC thread, which I don't consider >> acceptable in any sense. Is 9.6 going to go out like this or is action >> going to be taken before rc1? > Well, at this point, I think 9.6 is going to go out like this, unless > Tom is willing to do something today. Multiple people have expressed > clear support for adding something along the lines you've suggested, I > too am in favor, and I think it's unfortunate that Tom didn't do > something about it before now. FWIW, this thread started on 25-July, less than two weeks ago. I've been fully engaged in either stabilizing 9.6 or doing necessary back-branch maintenance since then (including, I might add, putting in full days both days this weekend). There has not been time to work on this request, and as far as I've seen from the thread we are not very close to a consensus on what the API details should be anyway. Had the complaint been raised sooner, maybe there would've been time to get a well-thought-out API into 9.6. The fact that it wasn't raised till more than 6 months after we committed the pg_am changes, and more than 2 months after 9.6beta1 was released, makes me feel that it's not all that critical a problem. Having said all that, it is unfortunate that 9.6 is going to go out without any good solution to this need. But as Robert already pointed out, trying to fix it now would force delaying 9.6rc1 by several weeks (and that's assuming that it doesn't take very long to get consensus on a solution). There's not, AFAICT, desire on the part of the release team to do that. We'd like to ship 9.6 on time for a change. regards, tom lane
On Sun, Aug 07, 2016 at 07:19:39PM -0400, Tom Lane wrote: > Had the complaint been raised sooner, maybe there would've been time > to get a well-thought-out API into 9.6. The fact that it wasn't raised > till more than 6 months after we committed the pg_am changes, and more > than 2 months after 9.6beta1 was released, makes me feel that it's not > all that critical a problem. > > Having said all that, it is unfortunate that 9.6 is going to go out > without any good solution to this need. But as Robert already pointed > out, trying to fix it now would force delaying 9.6rc1 by several weeks > (and that's assuming that it doesn't take very long to get consensus > on a solution). There's not, AFAICT, desire on the part of the release > team to do that. We'd like to ship 9.6 on time for a change. I agree with all that.
Tom Lane wrote: > Building on the has-property approach Andrew suggested, I wonder if > we need something like pg_index_column_has_property(indexoid, colno, > propertyname) with properties like "sortable", "desc", "nulls first". This seems simple enough, on the surface. Why not run with this design? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tom Lane <tgl@sss.pgh.pa.us>:
FWIW, this thread started on 25-July, less than two weeks ago.
Technically speaking, there was a pgsql-jdbc thread started on May 14: https://www.postgresql.org/message-id/nh72v6%24582%241%40ger.gmane.org
9.6beta1 was released on May 12
The fact that it wasn't raised
till more than 6 months after we committed the pg_am changes
This means that nobody was testing compatibility of "postgresql's master branch with existing third-party clients".
Testing against well-known clients makes sense to catch bugs early.
I've added "build postgresql from master branch" test to the pgjdbc's regression suite a week ago, so I hope it would highlight issues early (even before the official postgresql beta is released).
However, pgjdbc tests are executed only for pgjdbc commits, so if there's no pgjdbc changes, then there is no logic to trigger "try newer postgres with current pgjdbc".
Ideally, postgresql's regression suite should validate well-known clients as well.
I've no idea how long would it take to add something to postgresql's buildfarm, so I just went ahead and created Travis test configuration at https://github.com/vlsi/postgres
Do you think those Travis changes can be merged to the upstream?
I mean the following:
1) Activate TravisCI integration for https://github.com/postgres/postgres mirror.
2) Add relevant Travis CI file so it checks postgresql's regression suite, and the other
3) Add build badge to the readme (link to travis ci build) for simplified navigation to test results.
Here's how test results look like: https://travis-ci.org/vlsi/postgres
Vladimir
On 8 August 2016 at 03:49, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:
Tom Lane <tgl@sss.pgh.pa.us>:FWIW, this thread started on 25-July, less than two weeks ago.Technically speaking, there was a pgsql-jdbc thread started on May 14: https://www.postgresql.org/message-id/nh72v6%24582% 241%40ger.gmane.org 9.6beta1 was released on May 12The fact that it wasn't raised
till more than 6 months after we committed the pg_am changesThis means that nobody was testing compatibility of "postgresql's master branch with existing third-party clients".Testing against well-known clients makes sense to catch bugs early.I've added "build postgresql from master branch" test to the pgjdbc's regression suite a week ago, so I hope it would highlight issues early (even before the official postgresql beta is released).However, pgjdbc tests are executed only for pgjdbc commits, so if there's no pgjdbc changes, then there is no logic to trigger "try newer postgres with current pgjdbc".Ideally, postgresql's regression suite should validate well-known clients as well.I've no idea how long would it take to add something to postgresql's buildfarm, so I just went ahead and created Travis test configuration at https://github.com/vlsi/postgres Do you think those Travis changes can be merged to the upstream?I mean the following:1) Activate TravisCI integration for https://github.com/postgres/postgres mirror. 2) Add relevant Travis CI file so it checks postgresql's regression suite, and the other3) Add build badge to the readme (link to travis ci build) for simplified navigation to test results.Here's how test results look like: https://travis-ci.org/vlsi/postgres
Nice work +!
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> Building on the has-property approach Andrew suggested, I wonder if >> we need something like pg_index_column_has_property(indexoid, colno, >> propertyname) with properties like "sortable", "desc", "nulls first". > This seems simple enough, on the surface. Why not run with this design? Well, the next step is to fill in the blanks: what properties need to be queryable? regards, tom lane
On 08/08/16 15:38, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> Tom Lane wrote: >>> Building on the has-property approach Andrew suggested, I wonder if >>> we need something like pg_index_column_has_property(indexoid, colno, >>> propertyname) with properties like "sortable", "desc", "nulls first". > >> This seems simple enough, on the surface. Why not run with this design? > > Well, the next step is to fill in the blanks: what properties need to be > queryable? That's less urgent; adding missing properties does not require a catversion bump. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Vik Fearing <vik@2ndquadrant.fr> writes: > On 08/08/16 15:38, Tom Lane wrote: >> Well, the next step is to fill in the blanks: what properties need to be >> queryable? > That's less urgent; adding missing properties does not require a > catversion bump. If the complaint is "I can do X before 9.6.0 and after 9.6.0, but not in 9.6.0", it doesn't really matter whether it would take a catversion bump to resolve; the problem is lack of a time machine. Once there's a released 9.6.0 out there that is missing something important, applications that care about version portability are going to have to deal with it. So we need to get this right the first time. regards, tom lane
I wrote: > Having said all that, it is unfortunate that 9.6 is going to go out > without any good solution to this need. But as Robert already pointed > out, trying to fix it now would force delaying 9.6rc1 by several weeks > (and that's assuming that it doesn't take very long to get consensus > on a solution). There's not, AFAICT, desire on the part of the release > team to do that. We'd like to ship 9.6 on time for a change. After some back-and-forth among the release team, there's been a decision to change this position: today's wrap will be 9.6beta4, and we'll try to get something done about the AM property access issue before issuing rc1. We only have a week or two to get this done without starting to impact v10 development as well as hurting our chances of shipping 9.6.0 in September. So let's start getting down to details. regards, tom lane
On Sun, Aug 7, 2016 at 11:53 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> Building on the has-property approach Andrew suggested, I wonder if >> we need something like pg_index_column_has_property(indexoid, colno, >> propertyname) with properties like "sortable", "desc", "nulls first". > > This seems simple enough, on the surface. Why not run with this design? Andrew's patch, plus this, covers everything I can think of. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
>>>>> "Kevin" == Kevin Grittner <kgrittn@gmail.com> writes: >>> Building on the has-property approach Andrew suggested, I wonder if>>> we need something like pg_index_column_has_property(indexoid,colno,>>> propertyname) with properties like "sortable", "desc", "nulls first".>> >>This seems simple enough, on the surface. Why not run with this design? Kevin> Andrew's patch, plus this, covers everything I can think of. Where'd be a good place to put that function? ruleutils? catalog/index.c ? (ruleutils is way too big already) -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Where'd be a good place to put that function? ruleutils? catalog/index.c ? > (ruleutils is way too big already) Agreed. catalog/index.c is not a place that implements SQL-visible functions, so I don't like that either. One idea is utils/adt/misc.c. Or we could make a new file under utils/adt/ though I'm not very sure what to name it. amaccess.c? catutils.c? If there's only ever likely to be one or two functions of this ilk, maybe a new file is overkill and we should just use misc.c. regards, tom lane
Tom Lane wrote: > Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > > Where'd be a good place to put that function? ruleutils? catalog/index.c ? > > > (ruleutils is way too big already) > > Agreed. catalog/index.c is not a place that implements SQL-visible > functions, so I don't like that either. > > One idea is utils/adt/misc.c. Or we could make a new file under > utils/adt/ though I'm not very sure what to name it. amaccess.c? > catutils.c? If there's only ever likely to be one or two functions > of this ilk, maybe a new file is overkill and we should just use misc.c. I like the idea of a new file; I have a hunch that it will grow, given that we're expanding in this area, and perhaps we can find some existing stuff to relocate there in the future. I don't think a small file is a problem, anyway. How about amfuncs.c? Maybe it can live in catalog/ instead of utils/adt? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>>> "Alvaro" == Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> One idea is utils/adt/misc.c. Or we could make a new file under>> utils/adt/ though I'm not very sure what to name it. amaccess.c?>> catutils.c? If there's only ever likely to be one or two functions>> of this ilk, maybe a new file isoverkill and we should just use>> misc.c. Alvaro> I like the idea of a new file; I have a hunch that it willAlvaro> grow, given that we're expanding in this area,and perhaps weAlvaro> can find some existing stuff to relocate there in the future.Alvaro> I don't think a small fileis a problem, anyway. Alvaro> How about amfuncs.c? Maybe it can live in catalog/ instead ofAlvaro> utils/adt? Well, the existing patch used access/index/amapi.c for the AM capability functions. There may be some merit in keeping everything together - I asked because it didn't seem at first glance that the index column property function belonged there, but on second thought there's some overlap in that in future, if indoptions ever acquires any AM-specific flags, it may be necessary for pg_index_column_has_property to call into an AM-specific function. So, here are some options: 1. Put everything in access/index/amapi.c 2. Move either all of access/index/amapi.c, or just the SQL-callable part of it (amvalidate), to utils/adt/amfuncs.c andput new stuff in there 3. put pg_index[am]_has_capability in access/index/amapi.c and pg_index_column_has_property in utils/adt/misc.c -- Andrew (irc:RhodiumToad)
Updated patch. Changes: - returns NULL rather than "cache lookup failed" - added pg_index_column_has_property (incl. docs) - added regression tests Not changed / need consideration: - this still has everything in amapi.c rather than creating any new files. Also, the regression tests are in create_index.sql for lack of any obviously better place. The list of column properties is: ordered - (same as "amcanorder" AM capability) ordered_asc ordered_desc ordered_nulls_first ordered_nulls_last If "ordered" is true then exactly one of _asc/_desc and exactly one of _nulls_first/_last will be true; if "ordered" is false then all the others will be false too. The intended usage is something like CASE WHEN pg_index_column_has_property(idx, attno, 'ordered_asc') THEN 'ASC' WHEN pg_index_column_has_property(idx, attno, 'ordered_desc') THEN 'DESC' ELSE '' -- or NULL END Comments? -- Andrew (irc:RhodiumToad)
Attachment
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > - this still has everything in amapi.c rather than creating any new > files. Also, the regression tests are in create_index.sql for lack > of any obviously better place. This more than doubles the size of amapi.c, so it has a definite feel of tail-wags-dog for me, even if that seemed like an appropriate place otherwise which it doesn't really. I think a new .c file under adt/ is the way to go, with extern declarations in builtins.h. Maybe we need a new regression test case file too. I don't much like adding this to create_index because (a) it doesn't particularly seem to match that file's purpose of setting up indexes on the standard regression test tables, and (b) that file is a bottleneck in parallel regression runs because it can't run in parallel with much else. > The list of column properties is: > ordered - (same as "amcanorder" AM capability) > ordered_asc > ordered_desc > ordered_nulls_first > ordered_nulls_last > If "ordered" is true then exactly one of _asc/_desc and exactly one of > _nulls_first/_last will be true; if "ordered" is false then all the > others will be false too. Check, but do we need the "ordered_" prefixes on the last four? Seems like mostly useless typing. And the name space here is never going to be so large that it'd be confusing. > Comments? Why did you go with "capability" rather than "property" in the exposed function names? The latter seems much closer to le mot juste, especially for things like asc/desc. I'd expect the property keywords to be recognized case-insensitively, as for example are the keywords in has_table_privilege() and its ilk. That being the case, you should be using pg_strcasecmp(). This stuff:if (namelen == 10 && memcmp(nameptr, "amcanorder",10) == 0) is ugly and hard to maintain as well as being unnecessarily non-user- friendly. Just convert the input to a C string and be done with it. It's not like saving a couple of nanoseconds is critical here. I'd personally cut the list of pg_am replacement properties way back, as I believe much of what you've got there is not actually of use to applications, and some of it is outright counterproductive. An example is exposing amcanreturn as an index-AM boolean. That's just wrong these days. If you want that, it should be an index column property. And because what used to be an index-AM property no longer is in that case, I think it's a fine example of why we shouldn't be reporting more than the absolute minimum here. It would tie our hands with respect to making other improvements of that kind later. (This might be a good argument for moving as much as we can over to the index-column-property function, even if it can't be set per-column today. If there's a chance that it could be per-column in the future, let's not call it an AM property.) Also, while I see the point of the amgettuple and amgetbitmap properties, I would call them something like "index_scan" and "bitmap_scan" because that's the user's-eye view of what is supported. I do not think we should slavishly tie the property names to the old, never-intended-as-user-facing column names. I'd drop the "am" prefix, for starters. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: >> - this still has everything in amapi.c rather than creating any new>> files. Also, the regression tests are in create_index.sqlfor lack>> of any obviously better place. Tom> This more than doubles the size of amapi.c, so it has a definiteTom> feel of tail-wags-dog for me, even if that seemedlike anTom> appropriate place otherwise which it doesn't really. I think aTom> new .c file under adt/ is the way togo, with extern declarationsTom> in builtins.h. Yeah, I'm fine with that. adt/amutils.c unless I see some better suggestion. Tom> Maybe we need a new regression test case file too. I don't muchTom> like adding this to create_index because (a) itdoesn'tTom> particularly seem to match that file's purpose of setting upTom> indexes on the standard regression test tables,and (b) that fileTom> is a bottleneck in parallel regression runs because it can't runTom> in parallel with much else. Good point. I looked around to see if anything was testing pg_get_indexdef, thinking that that would be a good place, but it seems that pg_get_indexdef is tested only in incidental ways (in collate and rules, the latter of which tests it only with invalid input). I'll do the next version with a new file, unless a better idea shows up. >> Comments? Tom> Why did you go with "capability" rather than "property" in theTom> exposed function names? The latter seems much closerto le motTom> juste, especially for things like asc/desc. The first version (which dealt only with AMs) went with "capability" because it was dealing with what the AM _could_ do rather than what was defined on any specific index. The second version added pg_index_column_has_property because that was the name that had been used in discussion. Changing them all to "property" would be more consistent I suppose. Tom> I'd personally cut the list of pg_am replacement properties wayTom> back, as I believe much of what you've got thereis not actuallyTom> of use to applications, and some of it is outrightTom> counterproductive. An example is exposingamcanreturn as anTom> index-AM boolean. For AM-wide properties, it may be that they have to be considered "lossy" when tested against the AM oid rather than on an individual index or column - at the AM level, "false" might mean "this won't work" while "true" would mean "this might work sometimes, not guaranteed to work on every index". The documentation should probably indicate this. So these properties (I've changed all the names here, suggestions welcome for better ones) I think should be testable on the AM, each with an example of why: can_order - if this is false, an admin tool shouldn't try and put ASC or DESC in a CREATE INDEX can_unique - if this is false, an admin tool might, for example, want to not offer the user the option of CREATE UNIQUEINDEX with this AM can_multi_col - if this is false, an admin tool might want to allow the user to select only one column can_exclude - a new property that indicates whether the AM can be used for exclusion constraints; at present thismatches "amgettuple" but that implementation detail should of course be hidden (One possible refinement here could be to invert the sense of all of these, making them no_whatever, so that "false" and "null" could be treated the same by clients. Or would that be too confusing?) These could be limited to being testable only on a specified index, and not AM-wide: can_backward clusterable index_scan bitmap_scan optional_key (? maybe) predicate_locks (? maybe) And these for individual columns: can_return search_array (? maybe) search_nulls (? maybe) operator_orderable (or distance_orderable? what's a good name?)orderable asc desc nulls_first nulls_last A question: implementing can_return as a per-column property looks like it requires actually opening the index rel, rather than just consulting the syscache the way that most pg_get_* functions do. Should it always open it, or only for properties that need it? -- Andrew (irc:RhodiumToad)
On Wed, Aug 10, 2016 at 12:31 PM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote: > These could be limited to being testable only on a specified index, and > not AM-wide: > predicate_locks (? maybe) That one seems like it should either be at the AM level or not included at all. Where it would be interesting to know is if you are a hacker looking for an AM to enhance with support, or (when there is more than just btree supported, so it is not so easy to remember) if you are a DBA investigating a high rate of serialization failures and want to know whether indexes of a certain type have index-relation predicate locking granularity or something more fine-grained. The latter use case seems plausible once there is more of a mix of support among the AMs. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Kevin Grittner <kgrittn@gmail.com> writes: > That one seems like it should either be at the AM level or not > included at all. Where it would be interesting to know is if you > are a hacker looking for an AM to enhance with support, or (when > there is more than just btree supported, so it is not so easy to > remember) if you are a DBA investigating a high rate of > serialization failures and want to know whether indexes of a > certain type have index-relation predicate locking granularity or > something more fine-grained. The latter use case seems plausible > once there is more of a mix of support among the AMs. TBH, that line of thought impresses me not at all, because I do not see a reason for SQL queries to need to see internal behaviors of AMs, and especially not at levels as crude as boolean properties of entire AMs, because that's just about guaranteed to become a lie (or at least not enough of the truth) in a year or two. If you are a DBA wanting to know how fine-grained the locking is in a particular index type, you really need to read the source code or ask a hacker. We have been bit by the "representation not good enough to describe actual behavior" problem *repeatedly* over the years that pg_am had all this detail. First it was amstrategies and amsupport, which have never usefully described the set of valid proc/op strategy numbers for any index type more complicated than btree. Then there was amorderstrategy, which we got rid of in favor of amcanorder, and later added amcanbackward to that (not to mention amcanorderbyop). And amconcurrent, which went away for reasons I don't recall. Then we added amstorage, which later had to be supplemented with amkeytype, and still isn't a very accurate guide to what's actually in an index. amcanreturn actually was a boolean rather than a function for awhile (though it looks like we never shipped a release with that definition). There's still a lot of stuff with obviously limited life expectancy, like amoptionalkey, which is at best a really crude guide to what are valid index qualifications; someday that will likely have to go away in favor of a "check proposed index qual for supportability" AM callback. So I don't think I'm being unreasonable in wanting to minimize, not maximize, the amount of info exposed through this interface. There is enough history to make me pretty sure that a lot of things that might be simple boolean properties today are going to be less simple tomorrow, and then we'll be stuck having to invent arbitrary definitions for what the property-test function is going to return for those. And if there are any queries out there that are depending on simplistic interpretations of those property flags, they'll be broken in some respect no matter what we do. In short, I do not see a good reason to expose ampredlocks at the SQL level, and I think there needs to be a darn good reason to expose any of this stuff, not just "maybe some DBA will think he needs to query this". regards, tom lane
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > So these properties (I've changed all the names here, suggestions > welcome for better ones) I think should be testable on the AM, each with > an example of why: > can_order > can_unique > can_multi_col > can_exclude Check, flags that indicate what you can put in CREATE INDEX obviously have to be testable on the AM. I wonder though whether this behavior of can_order should be distinct from the notion of "can produce ordered scan output"? Maybe that's silly. Or maybe can_order needs to be something you test at the opclass level not the AM level --- I can sort of imagine an index type in which some opclasses support ordering and others don't. Certainly that behavior is possible today for amcanorderbyop. > (One possible refinement here could be to invert the sense of all of > these, making them no_whatever, so that "false" and "null" could be > treated the same by clients. Or would that be too confusing?) Hmm? I think true as the "has capability" case is fine from that perspective, null would be taken as "doesn't work". > These could be limited to being testable only on a specified index, and > not AM-wide: That would require adding a third function, but maybe we should just do that. In a lot of cases you'd rather not have to worry about which AM underlies a given index, so a simple index_has_property(regclass, text) function would be nice. (That means can_order ought to be supported in the per-index function even if you don't believe that it'd ever be opclass-specific, or in the per-column function if you do.) > can_backward As above, that could conceivably need to be per-column. > clusterable Call this can_cluster, maybe? Or just cluster? > index_scan > bitmap_scan > optional_key (? maybe) > predicate_locks (? maybe) As noted in my response to Kevin, I don't like the last two. They are internal properties and it's hard to see them being of much use to applications even if they weren't subject to change. > And these for individual columns: > can_return > search_array (? maybe) > search_nulls (? maybe) > operator_orderable (or distance_orderable? what's a good name?) distance_orderable seems not bad. > orderable > asc > desc > nulls_first > nulls_last OK > A question: implementing can_return as a per-column property looks like > it requires actually opening the index rel, rather than just consulting > the syscache the way that most pg_get_* functions do. Should it always > open it, or only for properties that need it? Probably only if needed, on performance grounds, and because opening the rel adds to chances of failure. regards, tom lane
On Wed, Aug 10, 2016 at 6:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Kevin Grittner <kgrittn@gmail.com> writes: >> That one seems like it should either be at the AM level or not >> included at all. Where it would be interesting to know is if you >> are a hacker looking for an AM to enhance with support, or (when >> there is more than just btree supported, so it is not so easy to >> remember) if you are a DBA investigating a high rate of >> serialization failures and want to know whether indexes of a >> certain type have index-relation predicate locking granularity or >> something more fine-grained. The latter use case seems plausible >> once there is more of a mix of support among the AMs. > > TBH, that line of thought impresses me not at all, because I do not see > a reason for SQL queries to need to see internal behaviors of AMs, and > especially not at levels as crude as boolean properties of entire AMs, > because that's just about guaranteed to become a lie (or at least not > enough of the truth) in a year or two. If you are a DBA wanting to know > how fine-grained the locking is in a particular index type, you really > need to read the source code or ask a hacker. > > We have been bit by the "representation not good enough to describe actual > behavior" problem *repeatedly* over the years that pg_am had all this > detail. First it was amstrategies and amsupport, which have never > usefully described the set of valid proc/op strategy numbers for any index > type more complicated than btree. Then there was amorderstrategy, which > we got rid of in favor of amcanorder, and later added amcanbackward to > that (not to mention amcanorderbyop). And amconcurrent, which went away > for reasons I don't recall. Then we added amstorage, which later had to > be supplemented with amkeytype, and still isn't a very accurate guide to > what's actually in an index. amcanreturn actually was a boolean rather > than a function for awhile (though it looks like we never shipped a > release with that definition). There's still a lot of stuff with > obviously limited life expectancy, like amoptionalkey, which is at best a > really crude guide to what are valid index qualifications; someday that > will likely have to go away in favor of a "check proposed index qual for > supportability" AM callback. > > So I don't think I'm being unreasonable in wanting to minimize, not > maximize, the amount of info exposed through this interface. There is > enough history to make me pretty sure that a lot of things that might be > simple boolean properties today are going to be less simple tomorrow, and > then we'll be stuck having to invent arbitrary definitions for what the > property-test function is going to return for those. And if there are > any queries out there that are depending on simplistic interpretations > of those property flags, they'll be broken in some respect no matter > what we do. > > In short, I do not see a good reason to expose ampredlocks at the SQL > level, and I think there needs to be a darn good reason to expose any of > this stuff, not just "maybe some DBA will think he needs to query this". I don't think you're being unreasonable, but I don't agree with your approach. I think that we should expose everything we reasonably can, and if we have to change it later then it will be a backward compatibility break. Making it unqueryable in the hopes that people won't try to query it is futile. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Aug 10, 2016 at 6:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> In short, I do not see a good reason to expose ampredlocks at the SQL >> level, and I think there needs to be a darn good reason to expose any of >> this stuff, not just "maybe some DBA will think he needs to query this". > I don't think you're being unreasonable, but I don't agree with your > approach. I think that we should expose everything we reasonably can, > and if we have to change it later then it will be a backward > compatibility break. Making it unqueryable in the hopes that people > won't try to query it is futile. Well, if it's unqueryable they won't be able to query it no matter how hard they try ;-). But my point here is that up to now, we never had the opportunity to draw a line between user-visible and non-user-visible AM properties; if it needed to be in pg_am, that's where it went. Now that we do have an opportunity, we should draw the line in an intelligent fashion, not blindly assume that everything that was in pg_am should remain exposed. I think that neither amoptionalkey nor ampredlocks is of real use to applications, and there are easily foreseeable reasons why they would disappear or change behavior substantially. So I feel we should leave them out of the API. regards, tom lane
On Thu, Aug 11, 2016 at 11:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, Aug 10, 2016 at 6:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> In short, I do not see a good reason to expose ampredlocks at the SQL >>> level, and I think there needs to be a darn good reason to expose any of >>> this stuff, not just "maybe some DBA will think he needs to query this". > >> I don't think you're being unreasonable, but I don't agree with your >> approach. I think that we should expose everything we reasonably can, >> and if we have to change it later then it will be a backward >> compatibility break. Making it unqueryable in the hopes that people >> won't try to query it is futile. > > Well, if it's unqueryable they won't be able to query it no matter how > hard they try ;-). Sure, but as this thread already demonstrates, they may also complain forcefully about having lost that capability. You argued when removing the pg_am columns that nobody cared about the ability to query any of them; when that turned out to be false, you argued that they could hard-code the index types instead of querying for capabilities; when there was opposition to that, you fell back to your present position of arguing that only the smallest possible subset of it should be made queryable. This is basically the same argument we have every time somebody wants to remove a "static" from a function prototype or stick a PGDLLIMPORT on a variable. You argue against these things on the grounds that they might change later, but the overwhelming evidence from posts on this list is that people would prefer to have access to APIs that might not be stable rather than have no access at all. I don't expect this post or any other to convince you that such a view is in fact sensible, but I could hope that at some point you might be willing to admit that it's widely-held. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Aug 11, 2016 at 5:24 PM, Robert Haas <robertmhaas@gmail.com> wrote: > You argue against > these things on the grounds that they might change later, but the > overwhelming evidence from posts on this list is that people would > prefer to have access to APIs that might not be stable rather than > have no access at all. I don't think it's useful to take this ultimatum approach. I would say abstraction boundaries are a fairly well-proven C.S. tool at this point -- and indeed by sometime last century. The real question is where do the benefits outweigh the costs and that's going to be a question of balancing conflicting priorities. Not one where an ultimatum is justified. So the real question is, are index access methods a place where we want to take short cuts and just expose internals or is this a place where we should spend the effort to design good abstractions? At face value it certainly seems like a line worth defending but historically it's been a kind of half-hearted abstraction since it was never clear what new access methods might need and how to abstractly define every possible attribute they might have. And the push away from SQL defined attributes seems to be conceding defeat on that front. -- greg
On Wed, Aug 10, 2016 at 5:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Kevin Grittner <kgrittn@gmail.com> writes: >> Where [whether the AM supports predicate locking at a >> granularity finer than the provided default of index relation] >> would be interesting to know is [ ... ] (when there is more than >> just btree supported, so it is not so easy to remember) if you >> are a DBA investigating a high rate of serialization failures >> and want to know whether indexes of a certain type have >> index-relation predicate locking granularity or something more >> fine-grained. [This] use case seems plausible once there is >> more of a mix of support among the AMs. > > TBH, that line of thought impresses me not at all, because I do not see > a reason for SQL queries to need to see internal behaviors of AMs, and > especially not at levels as crude as boolean properties of entire AMs, > because that's just about guaranteed to become a lie (or at least not > enough of the truth) in a year or two. But a DBA who has a problem doesn't care what the truth will be in a year or two -- the interest is in *right now* on one particular cluster. > If you are a DBA wanting to know how fine-grained the locking is > in a particular index type, you really need to read the source code > or ask a hacker. Really? > In short, I do not see a good reason to expose ampredlocks at the SQL > level, and I think there needs to be a darn good reason to expose any of > this stuff, not just "maybe some DBA will think he needs to query this". As I said before, there's probably not a lot of benefit exposing it *now*, when only btree indexes support fine-grained predicate locks; but if we were to add support for one or two more AMs per release for the next several releases, it could become a significant benefit to a DBA who's trying to figure out a problem -- especially if that DBA is not a skilled C programmer. So, the next question is, if it is somewhat likely to become useful as an exposed property in some future release, are we better off exposing it now, even though it might not be referenced much, or wait until we see demand popping up on the lists by people needing the information to solve problems they are having? -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 08/11/2016 10:46 AM, Kevin Grittner wrote: > On Wed, Aug 10, 2016 at 5:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Kevin Grittner <kgrittn@gmail.com> writes: > But a DBA who has a problem doesn't care what the truth will be in > a year or two -- the interest is in *right now* on one particular > cluster. > >> If you are a DBA wanting to know how fine-grained the locking is >> in a particular index type, you really need to read the source code >> or ask a hacker. > This has to be a joke. With the greatest respect, this show a world of disconnect from the people that actually use this software. JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them.
Kevin Grittner <kgrittn@gmail.com> writes: > On Wed, Aug 10, 2016 at 5:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> In short, I do not see a good reason to expose ampredlocks at the SQL >> level, and I think there needs to be a darn good reason to expose any of >> this stuff, not just "maybe some DBA will think he needs to query this". > As I said before, there's probably not a lot of benefit exposing it > *now*, when only btree indexes support fine-grained predicate > locks; but if we were to add support for one or two more AMs per > release for the next several releases, it could become a > significant benefit to a DBA who's trying to figure out a problem > -- especially if that DBA is not a skilled C programmer. By then, we might have a better idea of whether a per-AM boolean flag is a sensible long-term representation or not. Right now, I say that this does little except lock us into something we may well wish to get out of. Also, I'm very skeptical of the implied position that pg_am properties should explain everything a DBA needs to know about the different AMs. That's never been even remotely true, and if that's the sort of standard we wish to strive for, an API that can return a few boolean properties ain't gonna cut it. I think we should limit our ambitions here to exposing properties that *applications* demonstrably have uses for. regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Aug 11, 2016 at 11:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Well, if it's unqueryable they won't be able to query it no matter how >> hard they try ;-). > Sure, but as this thread already demonstrates, they may also complain > forcefully about having lost that capability. You argued when > removing the pg_am columns that nobody cared about the ability to > query any of them; Sir, that's just historical revisionism. I/we asked at the time whether people needed any of that info, and heard nothing but crickets. It was mentioned multiple times during the development thread, see for example https://www.postgresql.org/message-id/17342.1439225415%40sss.pgh.pa.us or https://www.postgresql.org/message-id/19218.1440604259%40sss.pgh.pa.us or even the commit message in question (65c5fcd35): A disadvantage is that SQL-level code can no longer see attributes of index AMs; in particular, some of the crosschecks in the opr_sanity regression test are no longer possiblefrom SQL. We've addressed that by adding a facility for the index AM to perform such checks instead. (Much morecould be done in that line, but for now we're content if the amvalidate functions more or less replace what opr_sanityused to do.) We might also want to expose some sort of reporting functionality, but this patch doesn't do that. I will admit that I'd rather minimize than maximize the amount of information we expose here, but I think that's an entirely defensible position. > ... You argue against > these things on the grounds that they might change later, but the > overwhelming evidence from posts on this list is that people would > prefer to have access to APIs that might not be stable rather than > have no access at all. That doesn't stop them from bitching when we do change things they were depending on. I'm fine with exposing things there is a clear use-case for, but I do not see that there is a reasonable use-case for exposing ampredlocks. regards, tom lane
So I'm tidying up and doing docs for the next version of this patch, but here for comment is the current functionality: select cap, pg_indexam_has_property(a.oid, cap) as "AM", pg_index_has_property('onek_hundred'::regclass, cap) as"Index", pg_index_column_has_property('onek_hundred'::regclass, 1, cap) as "Column" from pg_am a, unnest(array['asc','desc', 'nulls_first', 'nulls_last', 'orderable', 'distance_orderable', 'can_order', 'can_unique', 'can_multi_col', 'can_exclude', 'can_backward', 'can_cluster','index_scan', 'bitmap_scan', 'can_return', 'search_array', 'search_nulls']) with ordinality as u(cap,ord)where a.amname='btree'order by ord; cap | AM | Index |Column --------------------+----+-------+--------asc | | | tdesc | | | fnulls_first | | | fnulls_last | | | torderable | | | tdistance_orderable| | | fcan_order | t | t | tcan_unique | t | t | tcan_multi_col | t | t | tcan_exclude | t | t | tcan_backward | | t | tcan_cluster | | t | tindex_scan | | t | tbitmap_scan | | t | tcan_return | | | tsearch_array | | | tsearch_nulls | | | t (17 rows) This table shows what properties are exposed at the AM-wide level, the per-index level and the per-column level. distance_orderable now returns true/false depending on the opclass, not just on the amcanorderbyop field. In order to do this, I've added an optional amproperty function to the AM api, which if it exists, gets first dibs on all property calls so it can override the result as it sees fit. can_return likewise reflects the result of index_can_return. -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > This table shows what properties are exposed at the AM-wide level, the > per-index level and the per-column level. +1 mostly, but I'm a bit bemused by can_order and can_backward having different scopes --- how come? Also, not sure about allowing things like can_multi_col as index or column properties. That seems a bit silly: whether or not the AM can do multi columns, a specific index is what it is. I'd be a bit inclined to have those return null except at the AM level. In particular, I'm not sure what you intend to mean by applying can_unique at the index or column level. Is that supposed to mean that the index or column *is* unique? If so, I'd prefer a different name for that, on the same principle that "orderable" is not the same thing as "can_order". > distance_orderable now returns true/false depending on the opclass, not > just on the amcanorderbyop field. In order to do this, I've added an > optional amproperty function to the AM api, which if it exists, gets > first dibs on all property calls so it can override the result as it > sees fit. Hmm, seems like for that case, it'd be easier to look into pg_amop and see if the opclass has any suitably-marked operators. The AMs that support this would have to duplicate such code anyway, so it seems better to just have one copy. Or we could leave it to the client to do that join, which is the sort of approach that Stephen was advocating anyway, IIUC. Adding an AM override method might be a good idea anyway, but I'm not convinced that it's an appropriate way to address this particular point. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: >> distance_orderable now returns true/false depending on the opclass,>> not just on the amcanorderbyop field. In order todo this, I've>> added an optional amproperty function to the AM api, which if it>> exists, gets first dibs on all propertycalls so it can override the>> result as it sees fit. Tom> Hmm, seems like for that case, it'd be easier to look into pg_amopTom> and see if the opclass has any suitably-markedoperators. I thought about that, but it seemed like it could get painful. The planner is working forwards from a known operator and matching it against the index column, whereas we'd have to work backwards from the opfamily, and there's no good existing index for this; in the presence of binary-compatible types, I don't think even amoplefttype can be assumed (e.g. a varchar column can be ordered by pg_trgm's <-> operator which is declared for text). So it'd have to be the equivalent of: get index column's opclass oid look it up in pg_opclass to get opfamily for r in select * from pg_amop where amopfamily=?and amoppurpose='o' if r.amoplefttype is binary-coercible from the index column's type then return true As opposed to what I have now, which is: get index column's opclass oid look it up in pg_opclass to get opfamily/opcintype result = SearchSysCacheExists4(AMPROCNUM,...) (in theory this could produce a false positive if there's a distance function but no actual operators to reference it, but I think that's the opclass author's issue) -- Andrew (irc:RhodiumToad)
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: >> This table shows what properties are exposed at the AM-wide level,>> the per-index level and the per-column level. Tom> +1 mostly, but I'm a bit bemused by can_order and can_backwardTom> having different scopes --- how come? That's where they were in the previous list, a couple of messages up in the thread... Currently can_backward is always the same for all indexes in the same AM, but I guess there's no guarantee that won't change. In the previous message you suggested it might have to be per-column, even, but I think that makes no sense (either the whole index is scannable in both directions or it is not, it can't be different per column). Tom> Also, not sure about allowing things like can_multi_col as indexTom> or column properties. That seems a bit silly:whether or not theTom> AM can do multi columns, a specific index is what it is. I'd be aTom> bit inclined to havethose return null except at the AM level. I thought it would be cleaner to be able to query all properties at the most specific level. Tom> In particular, I'm not sure what you intend to mean by applyingTom> can_unique at the index or column level. Is thatsupposed to meanTom> that the index or column *is* unique? No. (We could add properties like is_unique, is_exclusion which clients currently query in pg_index; should we?) -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Tom> +1 mostly, but I'm a bit bemused by can_order and can_backward > Tom> having different scopes --- how come? > That's where they were in the previous list, a couple of messages up in > the thread... > Currently can_backward is always the same for all indexes in the same > AM, but I guess there's no guarantee that won't change. In the previous > message you suggested it might have to be per-column, even, but I think > that makes no sense (either the whole index is scannable in both > directions or it is not, it can't be different per column). OK, fuzzy thinking on my part --- I was conflating amcanbackward with whether you can get a descending-order scan. I can imagine an index type in which some opclasses can produce btree-ordered output and some can't, in which case "orderable" has to be per-index and maybe even per-column. (If, say, the leading column is orderable and the next is not, then you could expect to produce output that is sorted by the first column but not by the first two.) However, the ability to back up in a scan is entirely independent of that; see the hash AM, which sets amcanbackward even though its output has no definable order. Right offhand it's pretty hard to see how amcanbackward wouldn't be dependent on AM only. However, we earlier posited that the only AM-level properties should be those needed to figure out what you can say in CREATE INDEX, so if we believe that then can_backward is properly positioned. > Tom> In particular, I'm not sure what you intend to mean by applying > Tom> can_unique at the index or column level. Is that supposed to mean > Tom> that the index or column *is* unique? > No. (We could add properties like is_unique, is_exclusion which clients > currently query in pg_index; should we?) No, I don't feel a need to replace that. But we need to be clear in the documentation about what this property actually means. My objection to having it answer at the index or column level is basically that that encourages confusion as to what it means. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: Tom> But we need to be clear in the documentation about what thisTom> property actually means. My objection to having itanswer at theTom> index or column level is basically that that encourages confusionTom> as to what it means. OK. Here's new output with the scope changes, and some of the names changed in an attempt to make them clearer: cap | AM | Index | Column --------------------+----+-------+--------asc | | | tdesc | | | fnulls_first | | | fnulls_last | | | torderable | | | tdistance_orderable| | | freturnable | | | tsearch_array | | | tsearch_nulls | | | tclusterable | | t | backward_scan | | t | index_scan | | t |bitmap_scan | | t | can_order | t | | can_unique | t | | can_multi_col | t | | can_exclude | t | | (17 rows) (The can_* names are reserved for the AM level where we're asking about the abstract capabilities of the AM.) Better? Worse? Any more improvements to the names? -- Andrew (irc:RhodiumToad)
Latest patch. Names and scopes are as per discussion. New files for code and regression test. Docs included. -- Andrew (irc:RhodiumToad)
Attachment
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Latest patch. > Names and scopes are as per discussion. New files for code and > regression test. Docs included. I'm starting to review this now. If anyone has objections to the property names/behaviors shown in Andrew's previous message, speak up now ... regards, tom lane
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Latest patch. > Names and scopes are as per discussion. New files for code and > regression test. Docs included. Pushed with (mostly) cosmetic adjustments. regards, tom lane