Re: Problem with UPDATE and UNIQUE - Mailing list pgsql-general
From | Michael Glaesemann |
---|---|
Subject | Re: Problem with UPDATE and UNIQUE |
Date | |
Msg-id | F2AB0CCC-A90F-43A0-B136-F54E2C147BE9@seespotcode.net Whole thread Raw |
In response to | Problem with UPDATE and UNIQUE ("Frank Millman" <frank@chagford.com>) |
Responses |
Re: Problem with UPDATE and UNIQUE
|
List | pgsql-general |
On Aug 22, 2007, at 1:02 , Frank Millman wrote: > I want to store data in a 'tree' form, with a fixed number of > levels, so > that each level has a defined role. First thought: fixed, predetermined levels, separate tables for each level. If a more general approach is desired, your options are generally adjacency list, nested sets, or contrib/ltree. Each has their own strengths and weaknesses. > I have the following (simplified) table - > > CREATE TABLE treedata ( > rowid serial primary key, > levelno int not null, > parentid int references treedata, > seq int not null, > code varchar not null, > description varchar not null > ); rowid + parentid looks like adjacency list to me. Note that you're storing redundant data (the levelno, which can be derived from the rowid/parentid relationships), which you may want to do for performance reasons, but does make things more complicated: you're essentially caching data which brings with it problems of cache invalidation. In this case, you need to make sure you're updating levelno whenever it needs to be updated. (Which I'm sure you've already thought of.) > To describe each of the levels in the tree, I have the following > table - > > CREATE TABLE treelevels ( > levelno int primary key, > code varchar unique not null, > description varchar not null > ); Having each level as its own table would make this redundant, but again, that might not fit with what you're modeling. > Typical values for this table could be - > (0,'Prod','Product code') > (1,'Cat','Product category') > (2,'*','All products') This makes me think you'll want to rethink your schema a bit, as you're mixing different types of data: categories and products. I'd at least separate this out into a products table and a categories table. The categories table may in fact still require some kind of tree structure, but I don't think products belongs as part of it. CREATE TABLE products ( product_code text PRIMARY KEY , product_name text NOT NULL UNIQUE , product_description TEXT NOT NULL ); CREATE TABLE categories ( category_id INTEGER PRIMARY KEY , category_name TEXT NOT NULL UNIQUE , category_description TEXT NOT NULL , parent_category_id INTEGER NOT NULL REFERENCES categories , level INTEGER NOT NULL , seq INTEGER NOT NULL ); One difference here is that I have a NOT NULL constraint on parent_category_id, and I'd define the root as a category_id that has a parent_category = category_id. You should be able to add a check constraint that makes sure only level = 0 has parent_category_id = category_id and a unique index so only one category can have level 0. Something like: -- check constraint CHECK ( CASE WHEN category_id = parent_category_id THEN level = 0 ELSE category_id <> parent_category_id END) -- unique index CREATE UNIQUE INDEX categories_root_uniq_idx ON categories (level) WHERE level = 0; These are untested, but I think they should work. Given that you'd like to determine the order of the items, I think you might be interested in using nested sets rather than adjacency list, as this information is automatically encoded for you. This would look something like: CREATE FUNCTION valid_nested_interval_bounds(INTEGER, INTEGER) RETURNS BOOLEAN IMMUTABLE LANGUAGE SQL AS $body$ SELECT $1 < $2 -- make sure they're properly ordered AND ($2 - $1 - 1) % 2 = 0 -- useful if using the more strict versions of nested set (where there no gaps) -- and you can easily calculate the total number of descendants -- from just the upper and lower bounds of the parent node $body$; I generally separate the validity test into a separate function as I often have a couple different nested set hierarchies in the same database, and it makes the table definition clearer. I believe the function is inlined in the check constraint, but I'm not sure. You can, of course, move the SQL in the function straight into table definition. CREATE TABLE categories ( , category_name TEXT PRIMARY KEY , category_description TEXT NOT NULL , category_lower INTEGER NOT NULL UNIQUE -- often called "left" , category_upper INTEGER NOT NULL UNIQUE -- often called "right" , CHECK (valid_nested_interval_bounds(category_lower, category_upper)) ); Another advantage of nested sets is that selecting all of the descendants of a particular node is very easy. The downside of the nested set model is that inserts and updates are a bit more complicated and db intensive. If your categories aren't going to change much, this shouldn't be a problem. Again, you could add level information into the categories table, but you'd need to make sure to update it appropriately. CREATE TABLE category_products ( category_name TEXT NOT NULL REFERENCES categories , product_code TEXT NOT NULL REFERENCES products , UNIQUE (category_name, product_code) ); My comments below refer to your original design. > Now for the problem. I want to insert or delete levels dynamically. > I can > insert or delete levels in 'treedata' without a problem. However, I > also > want to insert or delete a level in 'treelevels'. > > Say I want to insert a level between 'code' and 'category' called > 'group' - > > INSERT INTO treelevels VALUES (1,'Group','Product group'); It's a good habit to *always* explicitly list your columns: it's self- documenting and more robust in the face of schema changes. > Obviously this will fail with a duplicate levelno. Therefore before > the > insert statement I want to do this - > > UPDATE treelevels SET levelno = (levelno+1) WHERE levelno >= 1; > > The problem is that if there are a number of levels, and they are in > indeterminate order, I can get duplicate level numbers while the > command is > being executed. > > My workaround at present is the following - > > UPDATE treelevels SET levelno = (levelno+10001) WHERE levelno >= 1; > UPDATE treelevels SET levelno = (levelno-10000) WHERE levelno >= 1; This is a general problem with nested sets and your situation where you're caching the levelno, and you're work around is similar to the two generally recommended solutions. One is to make updates using an offset such as what you're doing, and the other is to utilize negative levels. I'm keen on the latter, as I feel it's a bit more flexible: you don't need to make sure your offset is large enough. I'd do something like this: BEGIN; UPDATE treelevels SET levelno = -1 * (levelno + 1) WHERE levelno >= 1; INSERT INTO treelevels (levelno, code, description) VALUES (1, 'Group', 'Product group'); UPDATE treelevels SET levelno = -1 * levelno WHERE levelno < 0; COMMIT; And you can wrap the common operations (insert, move, delete) in functions for convenience. Anyway, hope this gives you something to think about. Michael Glaesemann grzm seespotcode net
pgsql-general by date: