Thread: What have I done!?!?!? :-)
Rather than explain how I got here, I’ll just explain the state I’m in.
From psql:
files_development=# \d files
Table "public.files"
Column | Type | Collation | Nullable | Default
------------+--------------------------------+-----------+----------+-----------------------------------
id | bigint | | not null | nextval('files_id_seq'::regclass)
basename | character varying | | not null |
parent_id | bigint | | not null |
dev | bigint | | not null |
ftype | character varying | | not null |
uid | bigint | | not null |
gid | bigint | | not null |
ino | bigint | | not null |
mode | bigint | | not null |
mtime | time without time zone | | not null |
nlink | bigint | | not null |
size | bigint | | not null |
created_at | timestamp(6) without time zone | | not null |
updated_at | timestamp(6) without time zone | | not null |
Indexes:
"files_pkey" PRIMARY KEY, btree (id)
"index_files_on_parent_id" btree (parent_id)
Foreign-key constraints:
"fk_rails_15605042e6" FOREIGN KEY (parent_id) REFERENCES files(id)
Referenced by:
TABLE "files" CONSTRAINT "fk_rails_15605042e6" FOREIGN KEY (parent_id) REFERENCES files(id)
Notice that parent_id is suppose to refer to an id in the same table — at least, that is what I’m trying to do. I’m trying to create a “root” entry whose parent points to themselves and I botched the code first time around and now I have this:
Table "public.files"
Column | Type | Collation | Nullable | Default
------------+--------------------------------+-----------+----------+-----------------------------------
id | bigint | | not null | nextval('files_id_seq'::regclass)
basename | character varying | | not null |
parent_id | bigint | | not null |
dev | bigint | | not null |
ftype | character varying | | not null |
uid | bigint | | not null |
gid | bigint | | not null |
ino | bigint | | not null |
mode | bigint | | not null |
mtime | time without time zone | | not null |
nlink | bigint | | not null |
size | bigint | | not null |
created_at | timestamp(6) without time zone | | not null |
updated_at | timestamp(6) without time zone | | not null |
Indexes:
"files_pkey" PRIMARY KEY, btree (id)
"index_files_on_parent_id" btree (parent_id)
Foreign-key constraints:
"fk_rails_15605042e6" FOREIGN KEY (parent_id) REFERENCES files(id)
Referenced by:
TABLE "files" CONSTRAINT "fk_rails_15605042e6" FOREIGN KEY (parent_id) REFERENCES files(id)
Notice that parent_id is suppose to refer to an id in the same table — at least, that is what I’m trying to do. I’m trying to create a “root” entry whose parent points to themselves and I botched the code first time around and now I have this:
files_development=# select * from files;
id | basename | parent_id | dev | ftype | uid | gid | ino | mode | mtime | nlink | size | created_at | updated_at
----+----------+-----------+-----------+-----------+------+------+-----+-------+----------------+-------+------+----------------------------+----------------------------
11 | pedz | 1234 | 687931150 | directory | 1000 | 1002 | 2 | 16877 | 18:43:29.65271 | 31 | 34 | 2022-04-06 21:58:43.570539 | 2022-04-06 21:58:43.570539
12 | pedz | 12 | 687931150 | directory | 1000 | 1002 | 2 | 16877 | 18:43:29.65271 | 31 | 34 | 2022-04-06 22:00:29.087417 | 2022-04-06 22:00:29.115021
(2 rows)
id | basename | parent_id | dev | ftype | uid | gid | ino | mode | mtime | nlink | size | created_at | updated_at
----+----------+-----------+-----------+-----------+------+------+-----+-------+----------------+-------+------+----------------------------+----------------------------
11 | pedz | 1234 | 687931150 | directory | 1000 | 1002 | 2 | 16877 | 18:43:29.65271 | 31 | 34 | 2022-04-06 21:58:43.570539 | 2022-04-06 21:58:43.570539
12 | pedz | 12 | 687931150 | directory | 1000 | 1002 | 2 | 16877 | 18:43:29.65271 | 31 | 34 | 2022-04-06 22:00:29.087417 | 2022-04-06 22:00:29.115021
(2 rows)
The record with id 11 has a parent id of 1234 which doesn’t exist.
My question isn’t how do I fix it, my question is why didn’t Postgres back out the botched record? Why isn’t it complaining?
I’m using Active Record with the psql adapter. It has a disable_referential_integrity which takes a block of code. When the block of code exists, the constraints are put back. At least, that is what I thought.
I’m wondering if the disabled constraints are still disabled somehow. If so, how would I check for that and how would I turn them back on? Or am I way off in the weeds?
Thank you for your time
Perry Smith
Attachment
On Wednesday, April 6, 2022, Perry Smith <pedz@easesoftware.com> wrote:
I’m using Active Record with the psql adapter. It has a disable_referential_integrity which takes a block of code. When the block of code exists, the constraints are put back. At least, that is what I thought.
Constraints basically are only ever evaluated when DML (insert, update, delete) commands commit and ensure the rows affected by those commands are valid at that moment in time. If you manage to insert invalid data, because you say disabled validation, it will not be checked again. Even an “update” doesn’t re-check the existing record, it effectively deletes it, and then checks its replacement.
So, yes, the constraints were probably “put back”, but it was too late, the invalid data was already saved.
David J.
Hi Perry,
Le 07/04/2022 à 00:25, Perry Smith a écrit :
Le 07/04/2022 à 00:25, Perry Smith a écrit :
[...]Notice that parent_id is suppose to refer to an id in the same table — at least, that is what I’m trying to do. I’m trying to create a “root” entry whose parent points to themselves
Note that you don't usually define a root as having a parent_id being the same as its id (hard to manage especially when you use a sequence nextval() to auto-fill the "id" primary keys).
The usual way is to have parent_id being nullable and roots are then rows with no parent_id. This matches the intuitive idea of a root which makes code more maintainable.
and I botched the code first time around and now I have this:files_development=# select * from files;
id | basename | parent_id | dev | ftype | uid | gid | ino | mode | mtime | nlink | size | created_at | updated_at
----+----------+-----------+-----------+-----------+------+------+-----+-------+----------------+-------+------+----------------------------+----------------------------
11 | pedz | 1234 | 687931150 | directory | 1000 | 1002 | 2 | 16877 | 18:43:29.65271 | 31 | 34 | 2022-04-06 21:58:43.570539 | 2022-04-06 21:58:43.570539
12 | pedz | 12 | 687931150 | directory | 1000 | 1002 | 2 | 16877 | 18:43:29.65271 | 31 | 34 | 2022-04-06 22:00:29.087417 | 2022-04-06 22:00:29.115021
(2 rows)The record with id 11 has a parent id of 1234 which doesn’t exist.My question isn’t how do I fix it, my question is why didn’t Postgres back out the botched record? Why isn’t it complaining?
Disabling referential integrity in Active Record explicitly disables triggers that would have made PostgreSQL return an error.
I’m using Active Record with the psql adapter. It has a disable_referential_integrity which takes a block of code. When the block of code exists, the constraints are put back. At least, that is what I thought.
If you look at ActiveRecord's code (https://www.rubydoc.info/docs/rails/ActiveRecord/ConnectionAdapters/PostgreSQL/ReferentialIntegrity#disable_referential_integrity-instance_method) :
before the block of code the triggers are disabled, then the block is executed and finally the triggers are enabled again (but only after they would have had a chance to be used).
I don't think this code is meant for general use (I believe I only used it in data migrations on rare occasions). I would bet that this isn't safe to use in many cases : unless I missed something you could kill your process before the triggers are enabled again leaving your application with 0 constraints until disable_referential_integrity is used again. What happens when several processes are using it simultaneously is probably not what you want either (triggers being enabled again by another process in the middle of the execution of your code).
I’m wondering if the disabled constraints are still disabled somehow.
Constraints are implemented using triggers so they aren't meant to ensure a consistent global state, they only check that the modifications are OK at the moment they are done.
If you disable constraints temporarily nothing prevents your data from being inconsistent with your constraints.
If so, how would I check for that and how would I turn them back on? Or am I way off in the weeds?
I'd say the later : in your case I would use a NULL parent_id for root(s). Your way leads you to bend PostgreSQL until its back brakes.
Best regards,
-- Lionel Bouton gérant de JTEK SARL https://www.linkedin.com/in/lionelbouton/
> On Apr 6, 2022, at 18:05, Lionel Bouton <lionel.bouton@jtek.fr> wrote: > > Hi Perry, > > Le 07/04/2022 à 00:25, Perry Smith a écrit : >> [...] > I'd say the later : in your case I would use a NULL parent_id for root(s). Your way leads you to bend PostgreSQL untilits back brakes Yea. This is definitely walking up the down escalator. But… I did learn something. So… bandage up my wounds, learn, and grow… Thank you, Perry
Attachment
On 4/6/22 18:25, Perry Smith wrote: > Rather than explain how I got here, I’ll just explain the state I’m in. > ... > > I’m using Active Record with the psql adapter. It has a > disable_referential_integrity which takes a block of code. When the > block of code exists, the constraints are put back. At least, that is > what I thought. > > I’m wondering if the disabled constraints are still disabled somehow. > If so, how would I check for that and how would I turn them back on? > Or am I way off in the weeds? That depends on how exactly Active Record does this disabling of constraints. There is a GUC in PostgreSQL 'session_replication_role'. Setting that to value 'replica' will do precisely that as a side effect. Its primary purpose is for logical replication systems (like Londiste, Slony and logical decoding based ones) to disable user triggers and referential integrity actions (like on delete cascade) as well as integrity checking under the assumption that those actions have been performed on the origin database and will be replicated as well or are unnecessary. Note that changing that setting requires PostgreSQL superuser privilege. Precisely because of the danger of getting your database into an inconsistent state. So **IF** Active Record is using that feature, then it can dump any amount of garbage into your PostgreSQL database and PostgreSQL will happily accept it with zero integrity checking. Best Regards, Jan
On Thu, Apr 7, 2022 at 8:10 AM Jan Wieck <jan@wi3ck.info> wrote:
So **IF** Active Record is using that feature, then it can dump any
amount of garbage into your PostgreSQL database and PostgreSQL will
happily accept it with zero integrity checking.
On 4/8/22 01:57, Nikolay Samokhvalov wrote: > On Thu, Apr 7, 2022 at 8:10 AM Jan Wieck <jan@wi3ck.info > <mailto:jan@wi3ck.info>> wrote: > > So **IF** Active Record is using that feature, then it can dump any > amount of garbage into your PostgreSQL database and PostgreSQL will > happily accept it with zero integrity checking. > > > It's DISABLE TRIGGER ALL > https://github.com/rails/rails/blob/831031a8cec5bfe59ef653ae2857d4fe64c5698d/activerecord/lib/active_record/connection_adapters/postgresql/referential_integrity.rb#L12 > <https://github.com/rails/rails/blob/831031a8cec5bfe59ef653ae2857d4fe64c5698d/activerecord/lib/active_record/connection_adapters/postgresql/referential_integrity.rb#L12> Similar poison, same side effect. Looking further at that code it also directly updates the PostgreSQL system catalog. This is a big, red flag. Why do the Rails developers think they need a sledgehammer like that? It seems to be doing that for over 7 years, so it is hard to tell from the commit log why they need to disable RI at all. Regards, Jan
On Fri, Apr 8, 2022 at 7:57 AM Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:
On Thu, Apr 7, 2022 at 8:10 AM Jan Wieck <jan@wi3ck.info> wrote:So **IF** Active Record is using that feature, then it can dump any
amount of garbage into your PostgreSQL database and PostgreSQL will
happily accept it with zero integrity checking.
A side-note on this, which of course won't help the OP at this point, but if the general best practice of not running the application with a highly privileged account is followed, the problem won't occur (it will just fail early before breaking things). DISABLE TRIGGER ALL requires either ownership of the table or superuser permissions, none of which it's recommended that the application run with. Doesn't help once the problem has occurred of course, but can help avoid it happening in the future.
--
On 4/8/22 08:58, Magnus Hagander wrote: > A side-note on this, which of course won't help the OP at this point, > but if the general best practice of not running the application with a > highly privileged account is followed, the problem won't occur (it will > just fail early before breaking things). DISABLE TRIGGER ALL requires > either ownership of the table or superuser permissions, none of which > it's recommended that the application run with. Doesn't help once the > problem has occurred of course, but can help avoid it happening in the > future. It gets even better further down in that code, where it UPDATEs pg_constraint directly. That not only requires superuser but also catupd permissions (which are separate from superuser for a reason). Regards, Jan
On Fri, Apr 8, 2022 at 3:07 PM Jan Wieck <jan@wi3ck.info> wrote:
On 4/8/22 08:58, Magnus Hagander wrote:
> A side-note on this, which of course won't help the OP at this point,
> but if the general best practice of not running the application with a
> highly privileged account is followed, the problem won't occur (it will
> just fail early before breaking things). DISABLE TRIGGER ALL requires
> either ownership of the table or superuser permissions, none of which
> it's recommended that the application run with. Doesn't help once the
> problem has occurred of course, but can help avoid it happening in the
> future.
It gets even better further down in that code, where it UPDATEs
pg_constraint directly. That not only requires superuser but also catupd
permissions (which are separate from superuser for a reason).
Indeed.The fact that's in the code is sadly an indicator of how many people run their app as superuser :(
On Apr 8, 2022, at 07:47, Jan Wieck <jan@wi3ck.info> wrote:On 4/8/22 01:57, Nikolay Samokhvalov wrote:On Thu, Apr 7, 2022 at 8:10 AM Jan Wieck <jan@wi3ck.info <mailto:jan@wi3ck.info>> wrote:
So **IF** Active Record is using that feature, then it can dump any
amount of garbage into your PostgreSQL database and PostgreSQL will
happily accept it with zero integrity checking.
It's DISABLE TRIGGER ALL https://github.com/rails/rails/blob/831031a8cec5bfe59ef653ae2857d4fe64c5698d/activerecord/lib/active_record/connection_adapters/postgresql/referential_integrity.rb#L12 <https://github.com/rails/rails/blob/831031a8cec5bfe59ef653ae2857d4fe64c5698d/activerecord/lib/active_record/connection_adapters/postgresql/referential_integrity.rb#L12>
Similar poison, same side effect.
Looking further at that code it also directly updates the PostgreSQL system catalog. This is a big, red flag.
Why do the Rails developers think they need a sledgehammer like that? It seems to be doing that for over 7 years, so it is hard to tell from the commit log why they need to disable RI at all.
It has been a long time since I’ve done Rails stuff. What follows is the best I can recall but please take it with a grain of salt.
The first problem is that generally Rails does not put constraints in the database. There were others like me who thought that was insane and would put constraints in the database — this includes foreign key constraints, check constraints, etc. About the only constraint that could be added into the DB using native Rails was the “not null” constraint.
When foreign and other constraints were added, it broke something they call “Fixtures” which are present db states that are plopped into the DB during testing. To “fix” that, I (and others) would add this into our code base: (I’m adding this to see what you guys think of it — is it safer / better or just as insane?)
def disable_referential_integrity(&block)
transaction {
begin
execute "SET CONSTRAINTS ALL DEFERRED"
yield
ensure
execute "SET CONSTRAINTS ALL IMMEDIATE"
end
}
end
transaction {
begin
execute "SET CONSTRAINTS ALL DEFERRED"
yield
ensure
execute "SET CONSTRAINTS ALL IMMEDIATE"
end
}
end
The above would only be used during testing.
For this project, I wondered if it was in the AR code base and found a method with the same name. Note that my method was all done under one transaction. The code they have uses multiple transactions. I’m curious to know if either method is better / safer.
For the longest time, the Rails developers refused to put constraints into the DB insisting that the app could do it. Eventually (SURPRISE!!!!) they were proven wrong. Users can now add foreign key constraints using native AR constructs as well as general check constraints and indexes. Before it had to be done using SQL. Rails did have a general purpose “execute” statement (shown above).
Attachment
On Fri, Apr 8, 2022 at 3:23 PM Perry Smith <pedz@easesoftware.com> wrote:
On Apr 8, 2022, at 07:47, Jan Wieck <jan@wi3ck.info> wrote:On 4/8/22 01:57, Nikolay Samokhvalov wrote:On Thu, Apr 7, 2022 at 8:10 AM Jan Wieck <jan@wi3ck.info <mailto:jan@wi3ck.info>> wrote:
So **IF** Active Record is using that feature, then it can dump any
amount of garbage into your PostgreSQL database and PostgreSQL will
happily accept it with zero integrity checking.
It's DISABLE TRIGGER ALL https://github.com/rails/rails/blob/831031a8cec5bfe59ef653ae2857d4fe64c5698d/activerecord/lib/active_record/connection_adapters/postgresql/referential_integrity.rb#L12 <https://github.com/rails/rails/blob/831031a8cec5bfe59ef653ae2857d4fe64c5698d/activerecord/lib/active_record/connection_adapters/postgresql/referential_integrity.rb#L12>
Similar poison, same side effect.
Looking further at that code it also directly updates the PostgreSQL system catalog. This is a big, red flag.
Why do the Rails developers think they need a sledgehammer like that? It seems to be doing that for over 7 years, so it is hard to tell from the commit log why they need to disable RI at all.It has been a long time since I’ve done Rails stuff. What follows is the best I can recall but please take it with a grain of salt.The first problem is that generally Rails does not put constraints in the database. There were others like me who thought that was insane and would put constraints in the database — this includes foreign key constraints, check constraints, etc. About the only constraint that could be added into the DB using native Rails was the “not null” constraint.When foreign and other constraints were added, it broke something they call “Fixtures” which are present db states that are plopped into the DB during testing. To “fix” that, I (and others) would add this into our code base: (I’m adding this to see what you guys think of it — is it safer / better or just as insane?)def disable_referential_integrity(&block)
transaction {
begin
execute "SET CONSTRAINTS ALL DEFERRED"
yield
ensure
execute "SET CONSTRAINTS ALL IMMEDIATE"
end
}
end
This is perfectly normal code and nothing wrong with it. DEFERRED constraints are how you are *supposed* to handle such things. It defers the check of the foreign key to the end of the transaction, but it will still fail to commit if the foreign key is broken *at that point*. But it lets you do things like modify multiple tables that refer to each other, and have the changes only checked when they're all done.
On Apr 8, 2022, at 08:10, Magnus Hagander <magnus@hagander.net> wrote:On Fri, Apr 8, 2022 at 3:07 PM Jan Wieck <jan@wi3ck.info> wrote:On 4/8/22 08:58, Magnus Hagander wrote:
> A side-note on this, which of course won't help the OP at this point,
> but if the general best practice of not running the application with a
> highly privileged account is followed, the problem won't occur (it will
> just fail early before breaking things). DISABLE TRIGGER ALL requires
> either ownership of the table or superuser permissions, none of which
> it's recommended that the application run with. Doesn't help once the
> problem has occurred of course, but can help avoid it happening in the
> future.
It gets even better further down in that code, where it UPDATEs
pg_constraint directly. That not only requires superuser but also catupd
permissions (which are separate from superuser for a reason).Indeed.The fact that's in the code is sadly an indicator of how many people run their app as superuser :(
Interesting conversation. Yes, the developer generally has superuser DB because they need to create the DB, etc. And, during testing, I’m sure their test user has super user privileges as well. I bet they don’t set up the DB with one user and then test with a non-SU user during test and the corollary is I bet the DB user used in production is also a super user in the majority of the installations. I’ve never seen this discussed. I’m not hugely active in the Rails community. They never liked me (or perhaps that was projection on my part).
Attachment
On Fri, Apr 8, 2022 at 3:27 PM Magnus Hagander <magnus@hagander.net> wrote:
On Fri, Apr 8, 2022 at 3:23 PM Perry Smith <pedz@easesoftware.com> wrote:On Apr 8, 2022, at 07:47, Jan Wieck <jan@wi3ck.info> wrote:On 4/8/22 01:57, Nikolay Samokhvalov wrote:On Thu, Apr 7, 2022 at 8:10 AM Jan Wieck <jan@wi3ck.info <mailto:jan@wi3ck.info>> wrote:
So **IF** Active Record is using that feature, then it can dump any
amount of garbage into your PostgreSQL database and PostgreSQL will
happily accept it with zero integrity checking.
It's DISABLE TRIGGER ALL https://github.com/rails/rails/blob/831031a8cec5bfe59ef653ae2857d4fe64c5698d/activerecord/lib/active_record/connection_adapters/postgresql/referential_integrity.rb#L12 <https://github.com/rails/rails/blob/831031a8cec5bfe59ef653ae2857d4fe64c5698d/activerecord/lib/active_record/connection_adapters/postgresql/referential_integrity.rb#L12>
Similar poison, same side effect.
Looking further at that code it also directly updates the PostgreSQL system catalog. This is a big, red flag.
Why do the Rails developers think they need a sledgehammer like that? It seems to be doing that for over 7 years, so it is hard to tell from the commit log why they need to disable RI at all.It has been a long time since I’ve done Rails stuff. What follows is the best I can recall but please take it with a grain of salt.The first problem is that generally Rails does not put constraints in the database. There were others like me who thought that was insane and would put constraints in the database — this includes foreign key constraints, check constraints, etc. About the only constraint that could be added into the DB using native Rails was the “not null” constraint.When foreign and other constraints were added, it broke something they call “Fixtures” which are present db states that are plopped into the DB during testing. To “fix” that, I (and others) would add this into our code base: (I’m adding this to see what you guys think of it — is it safer / better or just as insane?)def disable_referential_integrity(&block)
transaction {
begin
execute "SET CONSTRAINTS ALL DEFERRED"
yield
ensure
execute "SET CONSTRAINTS ALL IMMEDIATE"
end
}
endThis is perfectly normal code and nothing wrong with it. DEFERRED constraints are how you are *supposed* to handle such things. It defers the check of the foreign key to the end of the transaction, but it will still fail to commit if the foreign key is broken *at that point*. But it lets you do things like modify multiple tables that refer to each other, and have the changes only checked when they're all done.
Oh, I should add. The code is correct. The name of the function is wrong, because it doesn't actually disable referential integrity. It only postpones it.
On 4/8/22 09:27, Magnus Hagander wrote: > > > On Fri, Apr 8, 2022 at 3:23 PM Perry Smith <pedz@easesoftware.com > <mailto:pedz@easesoftware.com>> wrote: > It has been a long time since I’ve done Rails stuff. What follows > is the best I can recall but please take it with a grain of salt. > > The first problem is that generally Rails does not put constraints > in the database. There were others like me who thought that was > insane and would put constraints in the database — this includes > foreign key constraints, check constraints, etc. About the only > constraint that could be added into the DB using native Rails was > the “not null” constraint. > > When foreign and other constraints were added, it broke something > they call “Fixtures” which are present db states that are plopped > into the DB during testing. To “fix” that, I (and others) would add > this into our code base: (I’m adding this to see what you guys think > of it — is it safer / better or just as insane?) > > def disable_referential_integrity(&block) > transaction { > begin > execute "SET CONSTRAINTS ALL DEFERRED" > yield > ensure > execute "SET CONSTRAINTS ALL IMMEDIATE" > end > } > end > > > This is perfectly normal code and nothing wrong with it. DEFERRED > constraints are how you are *supposed* to handle such things. It defers > the check of the foreign key to the end of the transaction, but it will > still fail to commit if the foreign key is broken *at that point*. But > it lets you do things like modify multiple tables that refer to each > other, and have the changes only checked when they're all done. Indeed, especially because this code does not require any elevated permissions, guarantees referential integrity at commit time and guarantees that no inconsistent, intermediate state will ever be visible to another, concurrent session. It only affects constraints that have been declared DEFERRABLE. Those that are not are silently ignored (as per SQL standard). A lot of frameworks didn't support foreign keys because one of the most popular databases at that time didn't support them. Well, the SQL parser of that particular database would accept the syntax, but the engine would not enforce anything. Even the manual of that database stated that "foreign keys are mostly for documentation purposes and are not needed as long as the application does all operations in the correct order." They changed that part of the documentation when support for InnoDB was added. Therefore I would not put all blame on the Rails developers. Best Regards, Jan
On 4/8/22 09:58, Jan Wieck wrote: > It only affects constraints that have been declared DEFERRABLE. Those > that are not are silently ignored (as per SQL standard). I should have said "... silently ignored by this statement and still fire IMMEDIATE". Just to be clear, Jan