Re: A general database question! - Mailing list pgsql-general
From | Dmitry Tkach |
---|---|
Subject | Re: A general database question! |
Date | |
Msg-id | 3C9B64EA.5010103@openratings.com Whole thread Raw |
In response to | A general database question! (Dmitry Tkach <dmitry@openratings.com>) |
List | pgsql-general |
It is simple so far - I'd just create an an index on= both location andname... I am not quite understanding why a multi-colu= mn index won't work. Can you explain a little more?Here is what= I was thinking: create table entity( type int, location int,<b= r> name varchar(100) not null, ); create unique index my_index on e= ntity(location,name);It seems as though an "and" query would run qu= ickly. It also seems as though if you needed to get a few records that = were one of several names, a query such as the following would run quic= kly:select * from entity where location =3D 5 and (name =3D 'joe' or na= me=3D'joseph');or are the aliases a seperate attribute? I am not cl= ear on exactly where the alias comes in. Well... Yes, that's the problem - aliases ARE a separate attribute. The SAME entry could be known as both 'joe' and 'joseph', so, as far as I can see, I have to choices to do this, as I explained earlier - either create two tables, one having just entry id and location, and the other one, having id, name and type (or 'nametype' if you will), liked to the first one, or, I could have one table with multiple rows, corresponding to the same entry.= The problem is that I don't =9Alike any of these solutions :-( The first one isn't good enough because I can't create an index accross two tables, so the search by location and name would be problematic... The second one sucks, because it duplicates all those locations, and,=9A be= cause it is not normalized. Your suggestion looks like the second of these solutions... First of all, the table you suggest, would need another column, say, entityID, telling which entity this name entry belongs to. This column can't even be a serial or a primary key, because there will have to be multiple rows with the same id (to hold aliases)... Now, suppose, a paritcular entity has 10 different names - this would create 10 different rows in the table, all of which are supposed to have the same id and location... First of all, I'll waste 9*(sizeof(location)) bytes per each such entity (10 identical location values, instead of only one really needed)... And secondly, how am I going to enforce the consistency when an entity 'moves' (location gets changed)? All those ten rows have to be updated... In other words, I would need to have some means of ensuring that for every given pair of rows in that table if row1.id=3Drow2.id then row1.location=3Drow2.locati= on... This calls for another complication of the schema - putting a trigger on the table to ensure the consistency.... While this all seems to be doable, please keep in mind that what we are dis= cussing here is just a simple example... This gets more complicated, if you take into consideration, for example, the fact, that name and location are likely to be not the only two attributes of an entity - there could be many more of them... Also, consider that you have LOTS of those entitites (tens of millions), each having, say, 20 attributes - imagine how much space would be wasted on duplication, and how much time would be wasted executing that trigger... :-( Does this clarify things? Thanks! Dima
pgsql-general by date: