Thread: How to represent a bi-directional list in db?

How to represent a bi-directional list in db?

From

Pankaj Jangid

Date:

22 September 2019, 17:25:00

I am creating an application for a manufacturing scenario. To represent
stages in an assembly line, I wanted to create following table,

CREATE TABLE stages (
       id SERIAL PRIMARY KEY,
       name  VARCHAR(80) NOT NULL,
       created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
       updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
       prev_stage_id SERIAL REFERENCES stages NULL,
       next_stage_id SERIAL REFERENCES stages NULL,
       process_id SERIAL REFERENCES processes NOT NULL
);

But it:

Failed with: conflicting NULL/NOT NULL declarations for column
"prev_stage_id" of table "stages"

Is it not possible to create "nullable" self referencing foreign keys?

-- 
Pankaj Jangid

Re: How to represent a bi-directional list in db?

From

Francisco Olarte

Date:

22 September 2019, 18:39:11

Pankaj:

On Sun, Sep 22, 2019 at 4:25 PM Pankaj Jangid <pankaj.jangid@gmail.com> wrote:
> CREATE TABLE stages (
>        id SERIAL PRIMARY KEY,
>        name  VARCHAR(80) NOT NULL,
>        created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
>        updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
>        prev_stage_id SERIAL REFERENCES stages NULL,
>        next_stage_id SERIAL REFERENCES stages NULL,
>        process_id SERIAL REFERENCES processes NOT NULL
> );
> Failed with: conflicting NULL/NOT NULL declarations for column
> "prev_stage_id" of table "stages"
> Is it not possible to create "nullable" self referencing foreign keys?

Serial seems wrong. It means integer, not null, defaul next value from
a sequence.

What you probably want is just "prev_stage_id INTEGER" ( NULL by
default ), as you do not want the prev/next stage ids to be generated,
you normally would want to assign values from other tuples.

Also, you may have problems populating this kind of table, as you will
not have the ids from either prev or next stage when building it.

And lastly, in SQL you do not really need a doubly linked list, just
populate prev_stage_id, and index it and you can query next stage of a
tuple using it.

Francisco Olarte.

Re: How to represent a bi-directional list in db?

From

Pankaj Jangid

Date:

23 September 2019, 05:07:22

Francisco Olarte <folarte@peoplecall.com> writes:

> On Sun, Sep 22, 2019 at 4:25 PM Pankaj Jangid <pankaj.jangid@gmail.com> wrote:
>> CREATE TABLE stages (
>>        id SERIAL PRIMARY KEY,
>>        name  VARCHAR(80) NOT NULL,
>>        created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
>>        updated_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
>>        prev_stage_id SERIAL REFERENCES stages NULL,
>>        next_stage_id SERIAL REFERENCES stages NULL,
>>        process_id SERIAL REFERENCES processes NOT NULL
>> );
>> Failed with: conflicting NULL/NOT NULL declarations for column
>> "prev_stage_id" of table "stages"
>> Is it not possible to create "nullable" self referencing foreign keys?
>
> Serial seems wrong. It means integer, not null, defaul next value from
> a sequence.
>
> What you probably want is just "prev_stage_id INTEGER" ( NULL by
> default ), as you do not want the prev/next stage ids to be generated,
> you normally would want to assign values from other tuples.
>

Thanks. This resolved my problem of NULL/NOT NULL conflict. I wasn't
aware that SERIAL is by default NOT NULL.

> Also, you may have problems populating this kind of table, as you will
> not have the ids from either prev or next stage when building it.
>

If NULL value is allowed I can fill it up with NULL initially. Right? Or
is there something wrong here.

> And lastly, in SQL you do not really need a doubly linked list, just
> populate prev_stage_id, and index it and you can query next stage of a
> tuple using it.
>

Could you please elaborate? Suppose I have this table,

CREATE TABLE stages (
    id SERIAL PRIMARY KEY,
    name VARCHAR(80) NOT NULL,
    next_id INTEGER REFERENCE stages NULL,
);

What would be the backward query in that case? Forward is clear. This is
forward query,

SELECT name FROM stages WHERE next_id = 123;

-- 
Pankaj Jangid

Re: How to represent a bi-directional list in db?

From

Francisco Olarte

Date:

23 September 2019, 11:54:48

Pankaj:

On Mon, Sep 23, 2019 at 4:07 AM Pankaj Jangid <pankaj.jangid@gmail.com> wrote:
> Thanks. This resolved my problem of NULL/NOT NULL conflict. I wasn't
> aware that SERIAL is by default NOT NULL.

Not only that. Once you strip the annoying NOT NULL the only thing
remaining on a serial column is a "default nextval", which you
normally do not want ( you could think of populating the table in
creative ways, but they are on a different sequence than the one you
use for the ID column ).

> > Also, you may have problems populating this kind of table, as you will
> > not have the ids from either prev or next stage when building it.
> If NULL value is allowed I can fill it up with NULL initially. Right? Or
> is there something wrong here.

There is not, you can use (id,prev,next) = (1,null,null) and then
update, but you are going to need to travel up and down a lot, or
store a lot of data. If you use the trick I comment later of just
using "prev", you can do, on a table having (id=serial, prev=int),
build a sequence by doing "prev_id=null"; insert (id,prev,other_data)
returning id; copy return value to prev_id, rinse and repeat.

Also note that you can query the sequence AND advance it and then
insert all rows without default values.

> > And lastly, in SQL you do not really need a doubly linked list, just
> > populate prev_stage_id, and index it and you can query next stage of a
> > tuple using it.
> Could you please elaborate? Suppose I have this table,
> CREATE TABLE stages (
>     id SERIAL PRIMARY KEY,
>     name VARCHAR(80) NOT NULL,
>     next_id INTEGER REFERENCE stages NULL,
> );
> What would be the backward query in that case? Forward is clear. This is
> forward query,
> SELECT name FROM stages WHERE next_id = 123;

No. That is a BACKWARDS QUERY. You are in row 123, you go BACK to its
preceedeing one.
If you need a traversable list containing (ListID, id,name) = x,1,A;
x,2,b; x,3;c ( I've added the ListId column to make it more
interesting/reallistic, you normally do not have a single table)
In sql you can build a (ListId, id, prev_id, name ) table ( PREV is
easier, as when you insert a row, in a normal application, you know
the previous one, but not the next one ) with the data
(x,1,null,a),(x,2,1,b),(x,3,2,c) ( the last one is a synthetic
sentinel and assumes nullable id), you can do it in a lot of ways.

To traverse it forward you just querying "select id where listid=x and
next_id is null" to locate the head (1), and then just go forward by
selecting with prev_id = last got id until you hit zero results.

To traverse backwards there are several ways. In the real cases I've
used I always had a "list row" where I could store the node for the
1st stage. In that cases i linked them circularly, (id=1, prev=3), so
bidirectional traversing was easy. Or you can use a special sentinel
node ( with a marker, like name=null). The thing is you locate the
last row, and then just query with id=last got prev_id. I do not
remember the details, but probably your "stages" are stages of
something which has a row, which can readily have a "first_stage_id"
or something similar.

Lists in tables are not the same as in C, where you directly store
pointers which point outwards. In this case any unique data serves as
a pointer, slow ( table scan ) by default, faster if you index the
column.

Anyway, unless you need the "linked list" functionality for something
( really heavy manipulation of large stage lists, splicing things
around ), I've normally found it's easier, in sql, to model this kind
of thing with a master-detail + order column.
( whatever = (id, ...., first_stage_id), stages=(id, order, .... ) )

Francisco Olarte.

Re: How to represent a bi-directional list in db?

From

Pankaj Jangid

Date:

23 September 2019, 17:07:02

Francisco Olarte <folarte@peoplecall.com> writes:

>> Could you please elaborate? Suppose I have this table,
>> CREATE TABLE stages (
>>     id SERIAL PRIMARY KEY,
>>     name VARCHAR(80) NOT NULL,
>>     next_id INTEGER REFERENCE stages NULL,
>> );
>> What would be the backward query in that case? Forward is clear. This is
>> forward query,
>> SELECT name FROM stages WHERE next_id = 123;
>
> No. That is a BACKWARDS QUERY. You are in row 123, you go BACK to its
> preceedeing one.
> If you need a traversable list containing (ListID, id,name) = x,1,A;
> x,2,b; x,3;c ( I've added the ListId column to make it more
> interesting/reallistic, you normally do not have a single table)
> In sql you can build a (ListId, id, prev_id, name ) table ( PREV is
> easier, as when you insert a row, in a normal application, you know
> the previous one, but not the next one ) with the data
> (x,1,null,a),(x,2,1,b),(x,3,2,c) ( the last one is a synthetic
> sentinel and assumes nullable id), you can do it in a lot of ways.
>
> To traverse it forward you just querying "select id where listid=x and
> next_id is null" to locate the head (1), and then just go forward by
> selecting with prev_id = last got id until you hit zero results.
>
> To traverse backwards there are several ways. In the real cases I've
> used I always had a "list row" where I could store the node for the
> 1st stage. In that cases i linked them circularly, (id=1, prev=3), so
> bidirectional traversing was easy. Or you can use a special sentinel
> node ( with a marker, like name=null). The thing is you locate the
> last row, and then just query with id=last got prev_id. I do not
> remember the details, but probably your "stages" are stages of
> something which has a row, which can readily have a "first_stage_id"
> or something similar.
>
> Lists in tables are not the same as in C, where you directly store
> pointers which point outwards. In this case any unique data serves as
> a pointer, slow ( table scan ) by default, faster if you index the
> column.
>
> Anyway, unless you need the "linked list" functionality for something
> ( really heavy manipulation of large stage lists, splicing things
> around ), I've normally found it's easier, in sql, to model this kind
> of thing with a master-detail + order column.
> ( whatever = (id, ...., first_stage_id), stages=(id, order, .... ) )
>
Thanks a lot Francisco. This is great help.

My stages are stages of processes. So yes processes are also stored in a
table. I got the idea. I'll add another column in the processes table
which points to the first stage (first_stage_id). And quries

Forward pass:
1. select name from stages where id = <first_stage_id from
process table>
2. select name from stages where prev_id = <last fetched id>
3. repeat (2)

Backward pass:
1. select name from stages where prev_id = <first_stage_id from
process table>
2. select name from stages where id = <last fetched prev_id>
3. repeat (2)

This is assuming I also create a circular list. I can also store
last_stage_id in the process table if we don't want to create circular
list in db.

Regards.
-- 
Pankaj Jangid

Re: How to represent a bi-directional list in db?

From

Francisco Olarte

Date:

24 September 2019, 10:29:54

Pankaj:

On Mon, Sep 23, 2019 at 4:07 PM Pankaj Jangid <pankaj.jangid@gmail.com> wrote:
...
> My stages are stages of processes. So yes processes are also stored in a
> table. I got the idea. I'll add another column in the processes table
> which points to the first stage (first_stage_id). And quries
> Forward pass:
...
> Backward pass:
..
> This is assuming I also create a circular list. I can also store
> last_stage_id in the process table if we don't want to create circular
> list in db.

That's exactly the idea. A pointer in C becomes a pair of fields with
corresponding value, and it can be traversed in any direction, slowly
by table scan or fastly with indexes. In fact a pair of values becomes
equivalent to two pointers, as it can be traversed either way ( but
the indexing acceleration has to be applied to each direction ).

That being said, linked lists are procedural data structures, SQL is
declarative, so they are not a great match, that's one of the reasons
why they are rarely seen. Things like master-detail have less
impedance mismatch.

Francisco Olarte.

Re: How to represent a bi-directional list in db?

From

Pankaj Jangid

Date:

24 September 2019, 12:37:53

Francisco Olarte <folarte@peoplecall.com> writes:

> That being said, linked lists are procedural data structures, SQL is
> declarative, so they are not a great match, that's one of the reasons
> why they are rarely seen. Things like master-detail have less
> impedance mismatch.

Thanks Francisco. Got the idea.

-- 
Pankaj Jangid