Re: NAMEDATALEN increase because of non-latin languages - Mailing list pgsql-hackers
From | Julien Rouhaud |
---|---|
Subject | Re: NAMEDATALEN increase because of non-latin languages |
Date | |
Msg-id | 20220626024824.qnlpp6vikzjvuxs3@jrouhaud |
In response to | Re: NAMEDATALEN increase because of non-latin languages (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: NAMEDATALEN increase because of non-latin languages |
List | pgsql-hackers |
On Thu, Jun 23, 2022 at 10:19:44AM -0400, Robert Haas wrote:
> On Thu, Jun 23, 2022 at 6:13 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> > And should record_in / record_out use the logical position, as in:
> > SELECT ab::text FROM ab / SELECT (a, b)::ab;
> >
> > I would think not, as relying on a possibly dynamic order could break things if
> > you store the results somewhere, but YMMV.
>
> I think here the answer is yes again. I mean, consider that you could
> also ALTER TABLE DROP COLUMN and then ALTER TABLE ADD COLUMN with the
> same name. That is surely going to affect the meaning of such things.
> I don't think we want to have one meaning if you reorder things that
> way and a different meaning if you reorder things using whatever
> commands we create for changing the display column positions.

It indeed would, but ALTER TABLE DROP COLUMN is a destructive operation, and
I'm assuming that anyone doing that is aware that it will have an impact on
stored data and such.  I initially thought that changing the display order of
columns shouldn't have the same impact on the stability of an otherwise
unchanged record definition, as that would make such a reorder much more
impactful.  But I agree that having different behaviors seems worse.

> > Then, what about joinrels expansion? I learned that the column ordering rules
> > are far from being obvious, and I didn't find those in the documentation (note
> > that I don't know if that something actually described in the SQL standard).
> > So for instance, if a join is using an explicit USING clause rather than an ON
> > clause, the merged columns are expanded first, so:
> >
> > SELECT * FROM ab ab1 JOIN ab ab2 USING (b)
> >
> > should unexpectedly expand to (b, a, a). Is this order a strict requirement?
>
> I dunno, but I can't see why it creates a problem for this patch to
> maintain the current behavior. I mean, just use the logical column
> position instead of the physical one here and forget about the details
> of how it works beyond that.

I'm not that familiar with this part of the code so I may have missed
something, but I didn't see any place where I could simply do that.

To be clear, the approach I used is to change the expansion ordering but
otherwise keep the current behavior, to try to minimize the changes.  This is
done by keeping the attributes in physical order pretty much everywhere,
including in the nsitems, and just logically reordering them during the
expansion.  In other words, all the code still knows that the 1st column is
the first physical column and so on.

So in that query, the ordering is supposed to happen when handling the
"SELECT *", which makes it impossible to retain that order.

I'm assuming that what you meant is to change the ordering when processing the
JOIN and retain the old "SELECT *" behavior, which is to emit items in the
order they're found.  But IIUC the only way to do that would be to change the
order when building the nsitems themselves, and have the code believe that the
attributes are physically stored in the logical order.  That's probably
doable, but it looks like a far more invasive change.

Or did you mean to keep the approach I used, and just have some special case
for "SELECT *" when referring to a joinrel and instead try to handle the
logical expansion in the join?  AFAICS it would require adding some extra info
to the parsing structures, as the code doesn't really store any position; it
just relies on array offset / list position and maps things that way.

> > Another problem (that probably wouldn't be a problem for system catalogs) is
> > that defaults are evaluated in the physical position. This example from the
> > regression test will clearly have a different behavior if the columns are in a
> > different physical order:
> >
> > CREATE TABLE INSERT_TBL (
> >     x INT DEFAULT nextval('insert_seq'),
> >     y TEXT DEFAULT '-NULL-',
> >     z INT DEFAULT -1 * currval('insert_seq'),
> >     CONSTRAINT INSERT_TBL_CON CHECK (x >= 3 AND y <> 'check failed' AND x < 8),
> >     CHECK (x + z = 0));
> >
> > But changing the behavior to rely on the logical position seems quite
> > dangerous.
>
> Why?

It feels to me like a POLA violation, and probably people wouldn't expect it
to behave this way (even if this is clearly some corner case problem).  Even
if you argue that this is not simply a default display order but something
more like the real column order, the physical position being some
implementation detail, it still doesn't really feel right.

The main reason for having the possibility to change the logical position is
to have "better looking", easier to work with relations even if you have some
requirements on the real physical order, like trying to optimize things as
much as possible (reordering columns to avoid padding space, putting
non-nullable columns first...).  The order in which defaults are evaluated
looks like the same kind of requirement.  How useful would it be if you could
choose a logical order, but not be able to choose the one you actually want
because it would break your default values?

Anyway, per the nearby discussions I don't see much interest, especially not
in the context of varlena identifiers (I should have started a different
thread, sorry about that), so I don't think it's worth investing more effort
into it.
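
For illustration only, here is a minimal Python toy model of the orderings discussed above.  It is a sketch under stated assumptions, not PostgreSQL code and not the patch: the names Sequence, Column, logical_pos, expand_star, expand_using_join, insert_defaults and make_insert_tbl are all hypothetical.  It assumes that "SELECT *" is reordered by logical position only at expansion time, that defaults are evaluated in physical order, and that a USING join emits the merged columns first, followed by the remaining left and then right columns.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class Sequence:
    """Toy stand-in for a sequence with nextval/currval semantics."""

    def __init__(self) -> None:
        self._value: Optional[int] = None

    def nextval(self) -> int:
        self._value = 1 if self._value is None else self._value + 1
        return self._value

    def currval(self) -> int:
        if self._value is None:
            # currval is an error until nextval has been called once.
            raise RuntimeError("currval is not yet defined in this session")
        return self._value


@dataclass
class Column:
    name: str
    attnum: int        # physical position (1-based)
    logical_pos: int   # display position (hypothetical field)
    default: Callable[[], object]


def expand_star(columns: List[Column]) -> List[str]:
    """Assumed approach: reorder by logical position only when expanding *."""
    return [c.name for c in sorted(columns, key=lambda c: c.logical_pos)]


def expand_using_join(left: List[str], right: List[str],
                      using: List[str]) -> List[str]:
    """Merged USING columns first, then the rest of left, then the rest of right."""
    rest_left = [c for c in left if c not in using]
    rest_right = [c for c in right if c not in using]
    return list(using) + rest_left + rest_right


def insert_defaults(columns: List[Column]) -> dict:
    """Evaluate every default in physical (attnum) order."""
    row = {}
    for c in sorted(columns, key=lambda c: c.attnum):
        row[c.name] = c.default()
    return row


def make_insert_tbl(physical_order: List[str]) -> List[Column]:
    """Model of INSERT_TBL with a chosen physical order; logical order is x, y, z."""
    seq = Sequence()
    defaults = {
        "x": seq.nextval,
        "y": lambda: "-NULL-",
        "z": lambda: -1 * seq.currval(),
    }
    logical = {"x": 1, "y": 2, "z": 3}
    return [Column(name, i + 1, logical[name], defaults[name])
            for i, name in enumerate(physical_order)]


if __name__ == "__main__":
    same_order = make_insert_tbl(["x", "y", "z"])
    swapped = make_insert_tbl(["z", "y", "x"])
    print(expand_star(same_order), expand_star(swapped))
    print(insert_defaults(same_order))
    try:
        insert_defaults(swapped)
    except RuntimeError as exc:
        print("swapped physical order:", exc)
    print(expand_using_join(["a", "b"], ["a", "b"], ["b"]))
```

In this model both tables expand "SELECT *" to the same logical order, but with the physical order (z, y, x) the default of z runs before the default of x, so currval is reached before nextval and the insert fails.  Evaluating defaults in logical order instead would avoid that, at the price of making default evaluation depend on the display order, which is the trade-off under discussion.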