Thread: Self-referential records

Self-referential records

From
Ovid
Date:
Assuming I have the following table:

    CREATE TABLE refers (
      id        SERIAL  PRIMARY KEY,
      name      VARCHAR(255) NOT NULL,
      parent_id INTEGER NOT NULL,
      FOREIGN KEY (parent_id) REFERENCES refers(id)
  );
I need to insert two records so that "select * from refers" looks like this:

    =# select * from refers;
     id | name | parent_id
    ----+------+-----------
      1 | xxxx |         1
      2 | yyy  |         2

The first record can't be inserted because I don't yet know the parent_id. The second record can be inserted after the
first,but I since this is merely a large .sql file that I intend to shove into the PG, I'd much rather declare a
variablein the script to get this done.  I'm thinking something like the following pseudo-code: 

    INSERT INTO refers (name, parent_id) VALUES ('xxxx', :id);
    SELECT id INTO :parent_id FROM refers WHERE name='xxxx';
    INSERT INTO refers (name, parent_id) VALUES ('yyy', :parent_id);

Obviously the above is gibberish, but hopefully it makes clear what I'm trying to do :)

Oh, and "parent_id" is NOT NULL because I hate the logical inconsistencies associated with NULL values.

Cheers,
Ovid--
Buy the book         - http://www.oreilly.com/catalog/perlhks/
Tech blog            - http://use.perl.org/~Ovid/journal/
Twitter              - http://twitter.com/OvidPerl
Official Perl 6 Wiki - http://www.perlfoundation.org/perl6



Re: Self-referential records

From
Leif Biberg Kristensen
Date:
On Sunday 24. January 2010 14.43.10 Ovid wrote:
> Assuming I have the following table:
>
>     CREATE TABLE refers (
>       id        SERIAL  PRIMARY KEY,
>       name      VARCHAR(255) NOT NULL,
>       parent_id INTEGER NOT NULL,
>       FOREIGN KEY (parent_id) REFERENCES refers(id)
>   );
> I need to insert two records so that "select * from refers" looks like this:
>
>     =# select * from refers;
>      id | name | parent_id
>     ----+------+-----------
>       1 | xxxx |         1
>       2 | yyy  |         2
>
> The first record can't be inserted because I don't yet know the parent_id.

I've got a similar structure. I just declared the root node with both id and
parent_id=0.

regards,
--
Leif Biberg Kristensen
http://solumslekt.org/

Re: Self-referential records

From
Bill Moran
Date:
On 1/24/10 8:43 AM, Ovid wrote:
> Assuming I have the following table:
>
>      CREATE TABLE refers (
>        id        SERIAL  PRIMARY KEY,
>        name      VARCHAR(255) NOT NULL,
>        parent_id INTEGER NOT NULL,
>        FOREIGN KEY (parent_id) REFERENCES refers(id)
>    );
> I need to insert two records so that "select * from refers" looks like this:
>
>      =# select * from refers;
>       id | name | parent_id
>      ----+------+-----------
>        1 | xxxx |         1
>        2 | yyy  |         2
>
> The first record can't be inserted because I don't yet know the parent_id. The second record can be inserted after
thefirst, but I since this is merely a large .sql file that I intend to shove into the PG, I'd much rather declare a
variablein the script to get this done.  I'm thinking something like the following pseudo-code: 
>
>      INSERT INTO refers (name, parent_id) VALUES ('xxxx', :id);
>      SELECT id INTO :parent_id FROM refers WHERE name='xxxx';
>      INSERT INTO refers (name, parent_id) VALUES ('yyy', :parent_id);
>
> Obviously the above is gibberish, but hopefully it makes clear what I'm trying to do :)
>
> Oh, and "parent_id" is NOT NULL because I hate the logical inconsistencies associated with NULL values.

You could always remove the NOT NULL or the FOREIGN KEY constraints
during data load, then add them back on afterward.

If the problem is with everyday usage and not just data load, you
can still do this trick, since DDL can be transactionalized (is
that a word).  Just start a transaction, remove the NOT NULL
constraint, add your new records, then update the parent_key as
appropriate, then add the NOT NULL back.  If any point during the
process fails, just rollback the transaction.  You may want to
set the isolation level to serializable, but I'm not sure if
that's necessary.

-Bill

Re: Self-referential records

From
Thomas Kellerer
Date:
Ovid wrote on 24.01.2010 14:43:
> Assuming I have the following table:
>
>      CREATE TABLE refers (
>        id        SERIAL  PRIMARY KEY,
>        name      VARCHAR(255) NOT NULL,
>        parent_id INTEGER NOT NULL,
>        FOREIGN KEY (parent_id) REFERENCES refers(id)
>    );
> I need to insert two records so that "select * from refers" looks like this:
>
>      =# select * from refers;
>       id | name | parent_id
>      ----+------+-----------
>        1 | xxxx |         1
>        2 | yyy  |         2
>
> The first record can't be inserted because I don't yet know the parent_id.

I ususally identify the root record by setting the parent_id to NULL.
In my experience creating a cycle in the tree creates a lot of trouble that is hard to come by.

Thomas

Re: Self-referential records

From
Andreas Kretschmer
Date:
Ovid <curtis_ovid_poe@yahoo.com> wrote:

> Assuming I have the following table:
>
>     CREATE TABLE refers (
>       id        SERIAL  PRIMARY KEY,
>       name      VARCHAR(255) NOT NULL,
>       parent_id INTEGER NOT NULL,
>       FOREIGN KEY (parent_id) REFERENCES refers(id)
>   );
> I need to insert two records so that "select * from refers" looks like this:
>
>     =# select * from refers;
>      id | name | parent_id
>     ----+------+-----------
>       1 | xxxx |         1
>       2 | yyy  |         2

I think you mean id=1, parent_id=2 and id=2, parent_id=1, or?

>
> The first record can't be inserted because I don't yet know the parent_id. The second record can be inserted after
thefirst, but I since this is merely a large .sql file that I intend to shove into the PG, I'd much rather declare a
variablein the script to get this done.  I'm thinking something like the following pseudo-code: 
>
>     INSERT INTO refers (name, parent_id) VALUES ('xxxx', :id);
>     SELECT id INTO :parent_id FROM refers WHERE name='xxxx';
>     INSERT INTO refers (name, parent_id) VALUES ('yyy', :parent_id);
>
> Obviously the above is gibberish, but hopefully it makes clear what I'm trying to do :)
>
> Oh, and "parent_id" is NOT NULL because I hate the logical inconsistencies associated with NULL values.

To handle that you can set the constzraint deferrable, initially
deferred:

test=# CREATE TABLE refers ( id SERIAL  PRIMARY KEY, name VARCHAR(255) NOT NULL, parent_id INTEGER NOT NULL, FOREIGN
KEY(parent_id) REFERENCES refers(id) deferrable initially deferred); 
NOTICE:  CREATE TABLE will create implicit sequence "refers_id_seq" for serial column "refers.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "refers_pkey" for table "refers"
CREATE TABLE
Zeit: 25,599 ms
test=*# insert into refers (name, parent_id) values ('xxx',0);
INSERT 0 1
Zeit: 0,662 ms
test=*# insert into refers (name, parent_id) select 'yyy', id from refers where name = 'xxx';
INSERT 0 1
Zeit: 0,436 ms
test=*# update refers set parent_id = (select id from refers where name = 'yyy') where name = 'xxx';
UPDATE 1
Zeit: 0,431 ms
test=*# select * from refers;
 id | name | parent_id
----+------+-----------
  2 | yyy  |         1
  1 | xxx  |         2
(2 Zeilen)


The next release 9.0 contains (i hope) writes CTE, with this featue you can do:

test=# CREATE TABLE refers ( id SERIAL  PRIMARY KEY, name VARCHAR(255) NOT NULL, parent_id INTEGER NOT NULL, FOREIGN
KEY(parent_id) REFERENCES refers(id) deferrable initially deferred); 
NOTICE:  CREATE TABLE will create implicit sequence "refers_id_seq" for serial column "refers.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "refers_pkey" for table "refers"
CREATE TABLE
Time: 3,753 ms
test=*#

with
  t1 as (select nextval('refers_id_seq') as id),
  t2 as (insert into refers (id, name, parent_id) select nextval('refers_id_seq'), 'yyy', t1.id from t1 returning *),
  t3 as (insert into refers (id, name, parent_id) select t1.id, 'xxx', t2.id from t1, t2)
select true;
 bool
------
 t
(1 row)

Time: 0,853 ms
test=*# select * from refers;
 id | name | parent_id
----+------+-----------
  2 | yyy  |         1
  1 | xxx  |         2
(2 rows)


That's (the two insert's) are now one single statement ;-)


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: Self-referential records

From
Andreas Kretschmer
Date:
Xi Shen <davidshen84@googlemail.com> wrote:

> > To handle that you can set the constzraint deferrable, initially
> > deferred:
> >
> > test=# CREATE TABLE refers ( id SERIAL  PRIMARY KEY, name VARCHAR(255) NOT NULL, parent_id INTEGER NOT NULL,
FOREIGNKEY (parent_id) REFERENCES refers(id) deferrable initially deferred); 
> > NOTICE:  CREATE TABLE will create implicit sequence "refers_id_seq" for serial column "refers.id"
> > NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "refers_pkey" for table "refers"
> > CREATE TABLE
> > Zeit: 25,599 ms
> > test=*# insert into refers (name, parent_id) values ('xxx',0);
> > INSERT 0 1
> > Zeit: 0,662 ms
>
> to Andreas,
>
> this 'deferrable' thing is really cool. but i have a question. at this
> point, where i insert this text, if i select the refers table, what
> would the parent_id looks like?

0, as i inserted:

test=*# insert into refers (name, parent_id) values ('xxx',0);
INSERT 0 1
Zeit: 0,636 ms
test=*# select * from refers;
 id | name | parent_id
----+------+-----------
  1 | xxx  |         0
(1 Zeile)




Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: Self-referential records

From
Xi Shen
Date:
On Sun, Jan 24, 2010 at 10:36 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:
> Ovid <curtis_ovid_poe@yahoo.com> wrote:
>
>> Assuming I have the following table:
>>
>>     CREATE TABLE refers (
>>       id        SERIAL  PRIMARY KEY,
>>       name      VARCHAR(255) NOT NULL,
>>       parent_id INTEGER NOT NULL,
>>       FOREIGN KEY (parent_id) REFERENCES refers(id)
>>   );
>> I need to insert two records so that "select * from refers" looks like this:
>>
>>     =# select * from refers;
>>      id | name | parent_id
>>     ----+------+-----------
>>       1 | xxxx |         1
>>       2 | yyy  |         2
>
> I think you mean id=1, parent_id=2 and id=2, parent_id=1, or?
>
>>
>> The first record can't be inserted because I don't yet know the parent_id. The second record can be inserted after
thefirst, but I since this is merely a large .sql file that I intend to shove into the PG, I'd much rather declare a
variablein the script to get this done.  I'm thinking something like the following pseudo-code: 
>>
>>     INSERT INTO refers (name, parent_id) VALUES ('xxxx', :id);
>>     SELECT id INTO :parent_id FROM refers WHERE name='xxxx';
>>     INSERT INTO refers (name, parent_id) VALUES ('yyy', :parent_id);
>>
>> Obviously the above is gibberish, but hopefully it makes clear what I'm trying to do :)
>>
>> Oh, and "parent_id" is NOT NULL because I hate the logical inconsistencies associated with NULL values.
>
> To handle that you can set the constzraint deferrable, initially
> deferred:
>
> test=# CREATE TABLE refers ( id SERIAL  PRIMARY KEY, name VARCHAR(255) NOT NULL, parent_id INTEGER NOT NULL, FOREIGN
KEY(parent_id) REFERENCES refers(id) deferrable initially deferred); 
> NOTICE:  CREATE TABLE will create implicit sequence "refers_id_seq" for serial column "refers.id"
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "refers_pkey" for table "refers"
> CREATE TABLE
> Zeit: 25,599 ms
> test=*# insert into refers (name, parent_id) values ('xxx',0);
> INSERT 0 1
> Zeit: 0,662 ms

to Andreas,

this 'deferrable' thing is really cool. but i have a question. at this
point, where i insert this text, if i select the refers table, what
would the parent_id looks like?


> test=*# insert into refers (name, parent_id) select 'yyy', id from refers where name = 'xxx';
> INSERT 0 1
> Zeit: 0,436 ms
> test=*# update refers set parent_id = (select id from refers where name = 'yyy') where name = 'xxx';
> UPDATE 1
> Zeit: 0,431 ms
> test=*# select * from refers;
>  id | name | parent_id
> ----+------+-----------
>  2 | yyy  |         1
>  1 | xxx  |         2
> (2 Zeilen)
>
>
> The next release 9.0 contains (i hope) writes CTE, with this featue you can do:
>
> test=# CREATE TABLE refers ( id SERIAL  PRIMARY KEY, name VARCHAR(255) NOT NULL, parent_id INTEGER NOT NULL, FOREIGN
KEY(parent_id) REFERENCES refers(id) deferrable initially deferred); 
> NOTICE:  CREATE TABLE will create implicit sequence "refers_id_seq" for serial column "refers.id"
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "refers_pkey" for table "refers"
> CREATE TABLE
> Time: 3,753 ms
> test=*#
>
> with
>  t1 as (select nextval('refers_id_seq') as id),
>  t2 as (insert into refers (id, name, parent_id) select nextval('refers_id_seq'), 'yyy', t1.id from t1 returning *),
>  t3 as (insert into refers (id, name, parent_id) select t1.id, 'xxx', t2.id from t1, t2)
> select true;
>  bool
> ------
>  t
> (1 row)
>
> Time: 0,853 ms
> test=*# select * from refers;
>  id | name | parent_id
> ----+------+-----------
>  2 | yyy  |         1
>  1 | xxx  |         2
> (2 rows)
>
>
> That's (the two insert's) are now one single statement ;-)
>
>
> Andreas
> --
> Really, I'm not out to destroy Microsoft. That will just be a completely
> unintentional side effect.                              (Linus Torvalds)
> "If I was god, I would recompile penguin with --enable-fly."   (unknown)
> Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


to Ovid,

if you are going to create a tree in the table, you should certainly
allow the parent_id be null, otherwise, what would be the parent of
root? if you are meant to create a circle...god, what is that for? may
i have your story?


--
Best Regards,
David Shen

http://twitter.com/davidshen84/
http://meme.yahoo.com/davidshen84/

Re: Self-referential records

From
Peter Geoghegan
Date:
What is the preferred way to enforce that there is at least one orphan
record if any at all, and that a record is not a Marty McFly type
descendent of itself? I would suggest that a statement level after
trigger is the way to go, but I myself have never actually had to
enforce this.

Regards,
Peter Geoghegan

Re: Self-referential records

From
Andreas Kretschmer
Date:
Xi Shen <davidshen84@googlemail.com> wrote:

> what if you insert other values like '1', '999'? will the insertion
> successful? if so, what's the difference between a deferred reference
> and no reference at all?

Nice question ;-)

Okay, recreate the table but without NOT NULL:

test=# CREATE TABLE refers ( id SERIAL  PRIMARY KEY, name VARCHAR(255) NOT NULL, parent_id INTEGER , FOREIGN KEY
(parent_id)REFERENCES refers(id) deferrable initially deferred); 
NOTICE:  CREATE TABLE will create implicit sequence "refers_id_seq" for serial column "refers.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "refers_pkey" for table "refers"
CREATE TABLE
Zeit: 63,477 ms
test=*# insert into refers (name, parent_id) values ('xxx',null);
INSERT 0 1
Zeit: 0,686 ms

I think, NULL is more sensible than inserting a 'random' value like 0.



Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: Self-referential records

From
"Wayne E. Pfeffer"
Date:
If you do not use null to represent a root node, when you go to unwind the data from the table to generate a hierarchy tree, you could end up with an infinite loop. The query will always be looking for the next parent in the hierarchy. Meaning, you will want to find the parent of a node using the given parent_id, the query will find the parent of 1 to be 1, then it will look again for the parent of 1 it will find 1, etc. etc. ad nauseum. I enjoy using recursion as much as the next guy, but this could cause some serious issues with the PostgreSQL query engine eating up system resources.
 
Wayne E. Pfeffer

On Sun, Jan 24, 2010 at 9:13 AM, Andreas Kretschmer <akretschmer@spamfence.net> wrote:
> what if you insert other values like '1', '999'? will the insertion
> successful? if so, what's the difference between a deferred reference
> and no reference at all?

Nice question ;-)

Okay, recreate the table but without NOT NULL:

test=# CREATE TABLE refers ( id SERIAL  PRIMARY KEY, name VARCHAR(255) NOT NULL, parent_id INTEGER , FOREIGN KEY (parent_id) REFERENCES refers(id) deferrable initially deferred);
NOTICE:  CREATE TABLE will create implicit sequence "refers_id_seq" for serial column "refers.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "refers_pkey" for table "refers"
CREATE TABLE
Zeit: 63,477 ms
test=*# insert into refers (name, parent_id) values ('xxx',null);
INSERT 0 1
Zeit: 0,686 ms

I think, NULL is more sensible than inserting a 'random' value like 0.



Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



--
Red Leader 1 -- Out
URL: http://apps.facebook.com/faceblogged/?uid=674333666

Re: Self-referential records

From
Leif Biberg Kristensen
Date:
On Sunday 24. January 2010 16.22.00 Wayne E. Pfeffer wrote:
> If you do not use null to represent a root node, when you go to unwind the
> data from the table to generate a hierarchy tree, you could end up with an
> infinite loop. The query will always be looking for the next parent in the
> hierarchy. Meaning, you will want to find the parent of a node using the
> given parent_id, the query will find the parent of 1 to be 1, then it will
> look again for the parent of 1 it will find 1, etc. etc. ad nauseum. I enjoy
> using recursion as much as the next guy, but this could cause some serious
> issues with the PostgreSQL query engine eating up system resources.

It doesn't really matter if the root node is NULL or 0 or whatever. You just
have to realize that the root node is a special case and program accordingly.
An adjacency tree is not a normalized structure, and will never be. It's the
programmer's responsibility to ensure that circular references can't occur.

regards,
--
Leif Biberg Kristensen
http://solumslekt.org/