Thread: Sorting by related tables
I have a few beginner questions about using related tables for sorting. create table region { id SERIAL PRIMARY KEY, name text, -- order this table should be sorted in -- a "1" is the top sort level sort_order integer ); create table city { id SERIAL PRIMARY KEY, name text, region integer REFERENCES region ); I want a way to adjust the sort order of the "region" table ("move item up" or "move item down" in a web interface) without requiring knowledge of the existing "sort_order" for the rows in the region table. (i.e. requiring them to already be in an order). Here's my "move up" (which is a lower sort_order value) statement. (__TABLE__ is "region" and ? are $\d bind parameters) UPDATE __TABLE__ SET sort_order = CASE -- subtract one from the item's sort, unless it's already "1" WHEN id = ? AND sort_order > 1 THEN sort_order-1 -- for other items that are greater or equal to sort-1 WHEN id != ? AND sort_order >= (select sort_order from __TABLE__ where id = ?)-1 THEN sort_order+1 -- all others, leave alone ELSE sort_order END; This works reasonably well for small tables, but doesn't scale and the logic likely has holes. And behavior when adding new rows to the region table is not defined. 1) How do most people do this? Use linked lists? create table region { id SERIAL PRIMARY KEY name text, list_head boolean, -- flag if this is the head of the linked list next integer REFERENCES region ); 2) As a SQL beginner, I'm not seeing how to display rows from "city" sorted in order based on the order in the "region" table. 3) Oh, and I have also this for checking IF there are items in "region" that are "above" the item in question -- to see IF an item can or cannot be moved up in the sort order relative to others. SELECT id FROM __TABLE__ WHERE sort_order <= (SELECT sort_order FROM __TABLE__ WHERE id = ?) AND id != ?; If that returns any rows then I know I can call the UPDATE to move the item up. Again, a very basic question: What method should be used to be sure that nothing changes between the SELECT and the UPDATE? -- Bill Moseley moseley@hank.org
Bill Moseley schrob: > create table region { > id SERIAL PRIMARY KEY, > name text, > -- order this table should be sorted in > -- a "1" is the top sort level > sort_order integer > ); > > create table city { > id SERIAL PRIMARY KEY, > name text, > region integer REFERENCES region > ); > > I want a way to adjust the sort order of the "region" table ("move > item up" or "move item down" in a web interface) without requiring > knowledge of the existing "sort_order" for the rows in the region > table. (i.e. requiring them to already be in an order). > > Here's my "move up" (which is a lower sort_order value) statement. > (__TABLE__ is "region" and ? are $\d bind parameters) > > UPDATE __TABLE__ > SET sort_order = > CASE > -- subtract one from the item's sort, unless it's already "1" > WHEN id = ? AND sort_order > 1 THEN sort_order-1 > > -- for other items that are greater or equal to sort-1 > WHEN id != ? AND sort_order >= (select sort_order from __TABLE__ where id = ?)-1 > THEN sort_order+1 > > -- all others, leave alone > ELSE sort_order > END; > > This works reasonably well for small tables, but doesn't scale and the > logic likely has holes. And behavior when adding new rows to the > region table is not defined. I guess your approach of maintaining a special attribute for the custom sort order could be quite fast when using floating point numbers instead of integers. You then could easily move a record around without having to update _every_ other record by using proper fractions. E.g. to move a record A between B and C, just update its sort_order to (B.sort_order + C.sort_order) / 2. However, with IEEE754-floats this is only guaranteed to work 1023 times in the worst case when using double precision and "seeding" the sort_order with integers. So one would have to be careful and "normalize" the column from time to time. > 1) How do most people do this? Use linked lists? I haven't had this problem yet, so I don't know if there's a standard answer. > create table region { > id SERIAL PRIMARY KEY > name text, > list_head boolean, -- flag if this is the head of the linked list > next integer REFERENCES region > ); The "problem" with linked lists is that they don't fit into the relational model well, and since SQL isn't turing-complete by design, you'd have to write a proper function with one of the procedural languages PostgreSQL offers to iterate the list. Searching the list archives should yield some examples. > 2) As a SQL beginner, I'm not seeing how to display rows from "city" > sorted in order based on the order in the "region" table. IMHO a simple join should do here, e.g. select city.name from city, region where region.id = city.region order by region.sort_order Joining is explained here: <http://www.postgresql.org/docs/8.0/static/queries-table-expressions.html> > 3) Oh, and I have also this for checking IF there are items in > "region" that are "above" the item in question -- to see IF an item > can or cannot be moved up in the sort order relative to others. > > SELECT id FROM __TABLE__ > WHERE > sort_order <= (SELECT sort_order FROM __TABLE__ WHERE id = ?) > AND id != ?; > > If that returns any rows then I know I can call the UPDATE to move the > item up. I guess you want a boolean value here? SELECT EXISTS around your above query as a subselect should do the trick. You also want to use LIMIT 1 in the statement, to avoid fetching unnecessary records. > Again, a very basic question: What method should be used to be sure > that nothing changes between the SELECT and the UPDATE? You can achieve that using transactions. Concurrency control is explained here: <http://www.postgresql.org/docs/8.0/static/mvcc.html>. regards Andreas --
On Sat, Aug 13, 2005 at 06:44:09PM +0200, Andreas Seltenreich wrote: > > 3) Oh, and I have also this for checking IF there are items in > > "region" that are "above" the item in question -- to see IF an item > > can or cannot be moved up in the sort order relative to others. > > > > SELECT id FROM __TABLE__ > > WHERE > > sort_order <= (SELECT sort_order FROM __TABLE__ WHERE id = ?) > > AND id != ?; > > > > If that returns any rows then I know I can call the UPDATE to move the > > item up. > > I guess you want a boolean value here? SELECT EXISTS around your above > query as a subselect should do the trick. You also want to use LIMIT 1 > in the statement, to avoid fetching unnecessary records. Is there much of a difference between using LIMIT 1 and using an EXISTS subselect? Frankly, I'm not clear what you are specifically suggestion with EXISTS. I'm using Perl's Class::DBI object mapping module so returning a single row is an easy way to check this as a boolean result in Perl. > > Again, a very basic question: What method should be used to be sure > > that nothing changes between the SELECT and the UPDATE? > > You can achieve that using transactions. Concurrency control is > explained here: <http://www.postgresql.org/docs/8.0/static/mvcc.html>. My comment was that I want to do the above SELECT and then *only* do an UPDATE if the SELECT returns at least one row. So, I should do: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; Before the SELECT. And when I UPDATE I need to be prepared to do a ROLLBACK if I get an error and repeat the process. (And, I assume, take some precaution to give up after some number of tries.) Does that seem reasonable? -- Bill Moseley moseley@hank.org
Bill Moseley schrob: > On Sat, Aug 13, 2005 at 06:44:09PM +0200, Andreas Seltenreich wrote: >> > 3) Oh, and I have also this for checking IF there are items in >> > "region" that are "above" the item in question -- to see IF an item >> > can or cannot be moved up in the sort order relative to others. >> > >> > SELECT id FROM __TABLE__ >> > WHERE >> > sort_order <= (SELECT sort_order FROM __TABLE__ WHERE id = ?) >> > AND id != ?; >> > >> > If that returns any rows then I know I can call the UPDATE to move the >> > item up. >> >> I guess you want a boolean value here? SELECT EXISTS around your above >> query as a subselect should do the trick. You also want to use LIMIT 1 >> in the statement, to avoid fetching unnecessary records. > > Is there much of a difference between using LIMIT 1 and using an > EXISTS subselect? LIMIT 1 does reduce the cost, EXISTS AFAIK only makes the result boolean and doesn't stop the execution of the subselect by itself when the first record is found. > Frankly, I'm not clear what you are specifically > suggestion with EXISTS. I'm using Perl's Class::DBI object mapping module so > returning a single row is an easy way to check this as a boolean > result in Perl. Uups, this wasn't question number three yet, and I wrongly inferred from your uppercase-ifs that you wanted a boolean result here :-/ >> > Again, a very basic question: What method should be used to be sure >> > that nothing changes between the SELECT and the UPDATE? >> >> You can achieve that using transactions. Concurrency control is >> explained here: <http://www.postgresql.org/docs/8.0/static/mvcc.html>. > > My comment was that I want to do the above SELECT and then *only* do > an UPDATE if the SELECT returns at least one row. > > So, I should do: > > SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; > > Before the SELECT. And when I UPDATE I need to be prepared to do a > ROLLBACK if I get an error and repeat the process. (And, I assume, > take some precaution to give up after some number of tries.) > > Does that seem reasonable? This would be one possibility. If you don't want your application to deal with transactions being aborted because of non-serializable transactions, you could alternatively use explicit locking (SELECT ... FOR UPDATE) combined with the Read Committed isolation level (the default). Explicit locking is documented here: <http://www.postgresql.org/docs/8.0/static/explicit-locking.html#LOCKING-ROWS> regards Andreas --
On Mon, Aug 15, 2005 at 11:30:32PM +0200, Andreas Seltenreich wrote: > > This would be one possibility. If you don't want your application to > deal with transactions being aborted because of non-serializable > transactions, you could alternatively use explicit locking (SELECT ... > FOR UPDATE) combined with the Read Committed isolation level (the > default). SELECT FOR UPDATE locks just the rows that are selected, right? If I understand correctly, that would not work for my case because I'm updating different rows than I'm selecting. My tables are small, so I'm thinking of just manually updating all the rows in sequence to adjust the order when needed -- to make things a bit more simple. But it is a problem that I am curious about how best to solve in a scalable way. Thanks very much for your feedback. -- Bill Moseley moseley@hank.org