Re: multiple UNIQUE indices for FK - Mailing list pgsql-general
From | Rafal Pietrak |
---|---|
Subject | Re: multiple UNIQUE indices for FK |
Date | |
Msg-id | 56DD3BE6.7050909@ztk-rp.eu Whole thread Raw |
In response to | Re: multiple UNIQUE indices for FK (Francisco Olarte <folarte@peoplecall.com>) |
Responses |
Re: multiple UNIQUE indices for FK
|
List | pgsql-general |
W dniu 05.03.2016 o 19:53, Francisco Olarte pisze: > Hi Rafal: > > On Fri, Mar 4, 2016 at 11:46 PM, Rafal Pietrak <rafal@ztk-rp.eu> wrote: >> W dniu 04.03.2016 o 18:59, Francisco Olarte pisze: >>> Make sender_person_id NOT NULL in messages if you want to insure every >>> message ahs exactly ONE SENDER, leave it out if you want to allow >>> senderless messages. An FK column must either link to a record or be >>> null. >>> >>> Then, if you want to have a msgs-person ''table'' I would use a view: >>> >>> CREATE VIEW msgs_persons as >>> SELECT message_id, sender_person_id as person_id, 'SENDER' as role from messages >>> UNION ALL >>> SELECT message_id, recipient_person_id as person_id, 'RECIPIENT' as >>> role from recipients >> This was my initial schema .. with the addition of one "super table", >> that the two above (sender_person_id and recipient_person_id) both >> inharited from (to avoid the UNION ALL when selecting everything). > > Wuf. I do not like it. I would add a column named sender_person_id to > messages ( to distinguish its role ) and put a recipient_person_id, or > just person_id, in recipients ( the role is clear in that table ) to > avoid problems. Otherwise, what do you call the parent table and the > fields? It's a naming issue, nut I've found the hard way naming is > important in this things. Bear in mind you do only avoid TYPING the > union all when selecting everything ( as inheritance DOES do a union > all, it would have to do it with both kids AND the parent, so it MAY > be slower ). And you introduce several problems, the naming ones, a > very strange foreign-key relationship between kids, the possibility of > having a row inserted in the parent. No, no. It was OK. the parent table was appropriately "ruled on insert" for inherited tables to work as partitions. and the table was called "messages" :) [--------------] >> broke the NEXT functionality and I cannot imagine any way to reintroduce >> it into the new table layouts. > > Which is exactly the functionality of the NEXT column ? I mean, I see > you have messages with ONE sender and MANY? (Can they be zero? ) > recipients. What are you trying to achieve with it? How are you > planning to maintain it in your dessign? NULL NEXT indicates the last message inserted, and as I explain below, this is necessary to access/update the row that looses it's "most recently inserted" status just after that happens. The access/update of the row that looses it's "last" status after new insert is sufficiently expensive, that it pays to do so after an insert. Otherwise I'd have to perform it for pretty much every row that is selected, every time it is selected ... which will be an overkill. and pls note, that when performance of locating one NULL field in millions of records becomes a problem, I'm prepared to use a reserved value (a reserved record, like: "a null message") for NEXT, instead of current NULL. > >> Now I'm quite stuck here. > > I ask these questions because I think we are in a case of > http://xyproblem.info/ . :) this is a good one!!! Actually I'm very, very acquainted with this "XY problem". i.e quite often when "end-users" ask me for additional functionality, they (like in the XY case) suggest a "technical solution". And (just like you :) I always ask: pls tell me what you do "traditionally", e.g "when you are currently doing this on paper: how does it go - step by step", then I'll find a good IT solution for you. But there is a flip side of this coin. In case of a complex inter-mangled systems, where a well defined "critical point" shows up, it's more efficient to extract the "show case" that causes the problem and focus on this, instead of digressing on overall design. (which may be flowed, but cannot be rewritten at this point). > >> BTW: I'm considering your sugestion of replaceing NEXT with the >> timestamp. The primary reason for the NEXT is to be able to fetch a row >> "just preceeding" currently inserted new one AFTER the insert is done >> (in trigger after), so that some elaborated "statistics" get updated in >> that "one before" message record. May be timestap would do instead... > > If you are planning on updating the previous row for a message ( or a > person ? ) on a trigger, this smells fishy. You may have a reason, not Yes it does. > knowing what you are exactly planning to do, I cannot tell, but it > sounds really weird. I can understand that. But all I can tell you without full documentation is that it's statistics gathering, which results are subsequently used for almost every query made; and that it's a major performance hog of the system, so it have to be done just once, at the moment is becomes well defined, which is just after insert of a "next" row. And It actually does not matter (to me) how the "single row" is located within the system (i.e using NEXT or not), but it have to be located robustly - there may not be a situation, where such last record is missed or more then one get updated. I like FK as NEXT because I get that guarantee from the postgresql itself. BTW: using timestamp instead of FK to message_id does not work quite so well. To see the problem, ponder a question: what time resolution should such timestamp have, to be as robust as FK ... irrespectively of the transaction load? I did however considered NEXT becoming a binary field: FALSE for "the last message, and TRUE for the message "just before the last" ... with UNIQUE index on it end every other message having this field NULL; but the system does have an operation of "popping/purging the last message" from the pool and in such case I don't see how to restore TRUE for the message immediately before the one that became the last one after such operation. After that "pop" operation I'm currently adjusting "row before last" statistics, but possibly this could be avoided. So as of now (being afraid of breaking a working system) I will need to update statistics within a row that just have become "the one just before the last", so I will need to know which row becomes that .... and thus I ruled out NEXT becoming a BOOL field. ... again in dead waters. -R
pgsql-general by date: