Thread: Storing an ordered list

Storing an ordered list

From
"Michael Artz"
Date:
What is the best way to store and ordered list that can be updated
OLTP-style?  A simplified problem is that I have an event, and the
event has an ordered list of predicates and I need to preserve the
order of the predicates.  All of the data is entered via a web
application, and I would like to support the new flashy ajax
drag-droppy thingies, meaning that there could be a significant amount
of updates if the user is dragging things all over the place.

I figure that one choice is to explicitly code the order as an integer
column in the predicate table which has the advantage of being very
easy and fast to query/order but *very* slow to reorder as all of the
predicates need to be updated.  This would seem to be a postgres/MVCC
weak spot as well.  Example:

create table event (event_id integer);
create table predicate (event_id integer not null references
event(event_id), name varchar, order integer);
insert into event (event_id) values (1);
insert into predicate (1, 'first event', 1);
insert into predicate (1, 'second predicate', 2);
select * from predicate p where p.event_id = 1 order by p.order;

I'm also thinking about a linked list, i.e.

create table event (event_id integer);
create table predicate (predicate_id integer, event_id integer not
null references event(event_id), name varchar, next_predicate integer
references predicate (predicate_id));
insert into predicate (101, 1, 'second predicate', NULL);
insert into predicate (102, 1, 'first predicate', 101);

The downside is that I'm not quite sure how to efficiently query the
linked list.  Any suggestions?

Are there any known best practices for storing ordered lists in
relational databases?  Are there any tricks that I can use with
postgres?

Thanks
-Mike


Re: Storing an ordered list

From
"Aaron Bono"
Date:
On 7/25/06, Michael Artz <mlartz@gmail.com> wrote:
What is the best way to store and ordered list that can be updated
OLTP-style?  A simplified problem is that I have an event, and the
event has an ordered list of predicates and I need to preserve the
order of the predicates.  All of the data is entered via a web
application, and I would like to support the new flashy ajax
drag-droppy thingies, meaning that there could be a significant amount
of updates if the user is dragging things all over the place.

I figure that one choice is to explicitly code the order as an integer
column in the predicate table which has the advantage of being very
easy and fast to query/order but *very* slow to reorder as all of the
predicates need to be updated.  This would seem to be a postgres/MVCC
weak spot as well.  Example:

create table event (event_id integer);
create table predicate (event_id integer not null references
event(event_id), name varchar, order integer);
insert into event (event_id) values (1);
insert into predicate (1, 'first event', 1);
insert into predicate (1, 'second predicate', 2);
select * from predicate p where p.event_id = 1 order by p.order;

I'm also thinking about a linked list, i.e.

create table event (event_id integer);
create table predicate (predicate_id integer, event_id integer not
null references event(event_id), name varchar, next_predicate integer
references predicate (predicate_id));
insert into predicate (101, 1, 'second predicate', NULL);
insert into predicate (102, 1, 'first predicate', 101);

The downside is that I'm not quite sure how to efficiently query the
linked list.  Any suggestions?

Are there any known best practices for storing ordered lists in
relational databases?  Are there any tricks that I can use with
postgres?

 
Even the linked list will require a lot of updates if there are is a lot of reshuffling - perhaps less though in certain circumstances, especially if the list is large and there is very little reshuffling.

If you use the linked list, remember this: to reduce the updates you are going to need more code in the application as it will have to keep track of what to update and what to not update.  It will also be more difficult to order the items using SQL so your application may have to take on that burden.  As a result, your application will become more complicated and writing reports that use the ordering will become difficult.

Another thing to think about with a linked list:  What if two people are reordering the items at the same time - they load the items at the same time, then reorder at the same time (with their own separate cache of the data) and finally save.  If you update everything, the last man to save wins but if you only update only what they change, you could end up with a mess:

Example:
    Start with:
    1 -> 2 -> 3 -> 4 -> 5
   
    Person 1 reorders to:
    1 -> 5 -> 2 -> 3 -> 4 (only update 1 -> 5, 5 -> 2 and 4 -> null)
   
    Person 2 reorders to:
    1 -> 2 -> 5 -> 3 -> 4 (only update 2 -> 5, 5 -> 3 and 4 -> null)
   
    If they then both save (assume person 1 saves and then person 2 saves) you get:
    1 -> 5
    2 -> 5
    This is going to be a big problem.

When I need something like this I go with your first approach, a simple order field.  Unless the user is reordering a small number of items in a very large list and doing it frequently, is there really a need to worry about the number of updates?  Are you worrying about a performance problem you will never have?

==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
==================================================================

Re: Storing an ordered list

From
Bruno Wolff III
Date:
On Tue, Jul 25, 2006 at 22:58:47 -0400, Michael Artz <mlartz@gmail.com> wrote:
> 
> I figure that one choice is to explicitly code the order as an integer
> column in the predicate table which has the advantage of being very
> easy and fast to query/order but *very* slow to reorder as all of the
> predicates need to be updated.  This would seem to be a postgres/MVCC
> weak spot as well.  Example:

If you use numeric instead of int, then it is easy to insert new values.


Re: Storing an ordered list

From
"Michael Artz"
Date:
On 7/26/06, Aaron Bono <postgresql@aranya.com> wrote:
> If you use the linked list, remember this: to reduce the updates you are
> going to need more code in the application as it will have to keep track of
> what to update and what to not update.  It will also be more difficult to
> order the items using SQL so your application may have to take on that
> burden.  As a result, your application will become more complicated and
> writing reports that use the ordering will become difficult.

Yeah, that was what I wanted to avoid.  I wasn't sure if there was a
common approach to this sort of problem.

> When I need something like this I go with your first approach, a simple
> order field.  Unless the user is reordering a small number of items in a
> very large list and doing it frequently, is there really a need to worry
> about the number of updates?  Are you worrying about a performance problem
> you will never have?

Perhaps it is unnecessary, however I wanted to start out with a decent
design that could scale if need be.  Also, I'd like to support
"real-time" reordering, i.e. the user won't have to click "save" ...
as soon as they drag the item to a new position the list is updated.


Re: Storing an ordered list

From
"Michael Artz"
Date:
On 7/26/06, Bruno Wolff III <bruno@wolff.to> wrote:
> If you use numeric instead of int, then it is easy to insert new values.

Hmm, hadn't thought about that.  How would you normally implement it?
I'm thinking that, if I wanted to insert between A and B, I could take
(A.order + B.order)/2, which would be pretty simple.  Is there a
better way?


Re: Storing an ordered list

From
Bruno Wolff III
Date:
On Wed, Jul 26, 2006 at 20:13:03 -0400, Michael Artz <mlartz@gmail.com> wrote:
> On 7/26/06, Bruno Wolff III <bruno@wolff.to> wrote:
> >If you use numeric instead of int, then it is easy to insert new values.
> 
> Hmm, hadn't thought about that.  How would you normally implement it?
> I'm thinking that, if I wanted to insert between A and B, I could take
> (A.order + B.order)/2, which would be pretty simple.  Is there a
> better way?

I think that will depend. To keep the size of the number down, you will
probably want to use the number with the fewest digits to the right of the
decimal point that gives you a number between the two values. That will be
a bit more complicated than the above formula. But you will want to do
something to keep the size of the numerics down since it seems like reordering
will be common.

Another issue to consider is concurrency. You may want to lock the table
against concurrent reordering, as doing two at once may lead to some unexpected
events.


Re: Storing an ordered list

From
"Aaron Bono"
Date:
On 7/26/06, Michael Artz <mlartz@gmail.com> wrote:
On 7/26/06, Bruno Wolff III <bruno@wolff.to> wrote:
> If you use numeric instead of int, then it is easy to insert new values.

Hmm, hadn't thought about that.  How would you normally implement it?
I'm thinking that, if I wanted to insert between A and B, I could take
(A.order + B.order)/2, which would be pretty simple.  Is there a
better way?

 This is a good idea.  Then you can add a scheduled process to read through these values and turn them back to integer values on a regular basis (sort of a reindexing) to keep your numbers from becoming small enough that you start experiencing round off problems.  Perhaps you could add a trigger that says if the value entered into the order field is going out to too many decimal places, it renumbers everything. to keep the values clean.  Or better yet, add a stored procedure you call to reorder the elements that decides how to do it for you so you can easily rewrite the implementation without having to change the application.

Just some ideas...

==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
==================================================================