Re: Querying multiple tables - Mailing list pgsql-novice
From | Paul Malherbe |
---|---|
Subject | Re: Querying multiple tables |
Date | |
Msg-id | 46B14FD1.6020209@tartan.co.za Whole thread Raw |
In response to | Re: Querying multiple tables (Michael Glaesemann <grzm@seespotcode.net>) |
List | pgsql-novice |
Michael Glaesemann wrote: > > On Aug 1, 2007, at 14:04 , Paul Malherbe wrote: > >> Sorry about that, here is an example of what I am doing and what I >> would like to achieve. > > Thanks. Providing an example of the results you're seeking is very > useful, as is providing example data. > > From your definitions (and the fact that you're posting to the -novice > list), I suspect you're quite new to SQL. I'll point out some things I > think you should do to improve your code. > >> Create table rcaowm (rom_owner varchar(7), rom_name varchar(30)); >> Create unique index rcaowm_key1 on rcaowm (rom_owner); > > SQL syntax is case insensitive (except for identifiers if they're > quoted). Capitalizing the first letter of the statement is > unnecessary. It doesn't matter, but it looks a little odd. > > Also, I think you should take some time to learn about database > normalization constraints such as NOT NULL, PRIMARY KEY and FOREIGN > KEY. These are really fundamental to using a relational database > server. I would rewrite the above statements as: > > CREATE TABLE rcaowm > ( > rom_owner TEXT PRIMARY KEY > , rom_name TEXT NOT NULL > ); > > For the most part, PRIMARY KEY is shorthand for UNIQUE NOT NULL. I > recommend setting at NOT NULL constraint on your columns by default as > well, as it will prevent perhaps unexpected query results, especially > if you're just starting out. > > As for using TEXT instead of varchar(7), there's no performance > advantage to specifying the length of the varchar column. Unless you > have a compelling business reason to restrict the number of > characters, I'd leave it as TEXT. > >> Insert into rcaowm values ('OOOOOOO', 'Owners Name'); > > Be explicit when naming columns in an INSERT statement. This increases > readability and prevents unexpected bugs caused by changes in schema > definition: > > INSERT INTO rcaowm (rom_owner, rom_name) VALUES ('OOOOOOO', 'Owners > Name'); > >> Create table rcaowt (rot_owner varchar(7), rot_date int4, rot_tramt >> decimal); > > CREATE TABLE rcaowt ( > rot_owner TEXT NOT NULL > REFERENCES rcaowm > , rot_date DATE NOT NULL > , rot_tramt DECIMAL NOT NULL > ); > > I assume that the value of the rot_owner column must appear in the > rcaowm table in the rom_owner column. If this is so, set a foreign key > constraint so the database server can enforce it. That's what the > "REFERENCES rcaowm" clause does. If no column is listed in the > REFERENCES clause, it references the primary key of the referenced > table, in this case rom_owner. To be explicit, it could have been > written REFERENCES rcaowm (rom_owner). > > Also, use appropriate datatypes. rot_date is a date, so use a DATE > column. This adds an additional constraint to be sure only valid date > values can be submitted. Right now, there's nothing to prevent someone > from doing something like: > > INSERT INTO rcaowt (rot_owner, rot_date, rot_tramt) VALUES ('OOOOOOO', > '20074533', 5000); > > Using a date column will prevent that. Also, there are a lot of > functions that are available for handling dates which you might find > useful. > >> Create index rcaowt_key1 on rcaowt (rot_owner); > > A plain index doesn't really add anything to the schema definition, > though it can be used by the database server to look up information in > the table. > >> Create table rcatnm (rtm_owner varchar(7), rtm_tenant varchar(7), >> rtm_name varchar(30)); >> Create unique index rcatnm_key1 on rcatnm (rtm_owner, rtm_tenant); >> Insert into rcatnm values ('OOOOOOO', 'TTTTTTT', 'Tenants Name'); > > Same as above: use TEXT instead of VARCHAR(n), use NOT NULL, PRIMARY > KEY and REFERENCES, list columns in INSERT statements. > >> Create table rcatnt (rtt_owner varchar(7), rtt_tenant varchar(7), >> rtt_date int4, rtt_tramt decimal); > > This is a question I had in my first response: can rtt_owner be > anything other than the rot_owner for rtt_tenant? If so, your table > definition should probably be > > CREATE TABLE rcatnt > ( > rtt_owner TEXT NOT NULL > REFERENCES rcaowm > , rtt_tenant TEXT NOT NULL > REFERENCES rcaowt > , rtt_date DATE NOT NULL > , rtt_tramt DECIMAL NOT NULL > ); > > If the rtt_owner must be the same as rot_owner, just leave it out. You > can always look up the owner by joining through the rtt_tenant table. > > CREATE TABLE rcatnt > ( > rtt_tenant TEXT NOT NULL > REFERENCES rcaowt > , rtt_date DATE NOT NULL > , rtt_tramt DECIMAL NOT NULL > ); > > In either case, it's not obvious to me what the primary key on this > table should be. You need some way to uniquely identify each row in > the table. > >> Create index rcatnt_key1 on rcatnt (rtt_owner, rtt_tenant); > > Again, this index doesn't do anything for your schema definition. > >> Select rom_name as "Owner_Name", rtm_name as "Tenant_Name", >> sum(rot_tramt) as "Sum_Owner_Value" , sum(rtt_tramt) as >> "Sum_Tenant_Value" from rcaowm join rcaowt on (rom_owner = rot_owner) >> join rcatnt on (rom_owner = rtt_owner) join (select rtm_tenant, >> rtm_name from rcatnm) as tenants on (rtm_tenant = rtt_tenant) group >> by rom_name, rtm_name; > > Do you find this easy to read? Personally, I don't. It helps if you > can format the query to make it easy for others (and yourself!) to read. > >> This query returns: >> >> Owner_Name | Tenant_Name | Sum_Owner_Value | Sum_Tenant_Value >> -------------+--------------+-----------------+------------------ >> Owners Name | Tenants Name | -7202.0 | 12000.0 >> >> >> I would like to return: >> >> Owner_Name | Tenant_Name | Sum_Owner_Value | Sum_Tenant_Value >> -------------+--------------+-----------------+------------------ >> Owners Name | Tenants Name | -3601.0 | 4000.0 > > This should get you most of the way there, as it gives the amounts you > want. However, it's not clear to me that this is *really* the result > you're looking for because of the ambiguity as to what owner you're > looking for (see my questions about the rcatnt table). > > SELECT rot_owner, sum_rot_tramt, rtt_tenant, sum_rtt_tramt > FROM rcatnm > JOIN ( > SELECT rot_owner, sum(rot_tramt) as sum_rot_tramt > FROM rcaowt > GROUP BY rot_owner) o ON (rtm_owner = rot_owner) > JOIN ( > SELECT rtt_tenant, sum(rtt_tramt) as sum_rtt_tramt > FROM rcatnt > GROUP BY rtt_tenant ) t ON (rtm_tenant = rtt_tenant); > rot_owner | sum_rot_tramt | rtt_tenant | sum_rtt_tramt > -----------+---------------+------------+--------------- > OOOOOOO | -3601.0 | TTTTTTT | 4000.0 > (1 row) > > Hope this gets you pointed in the right direction. And I still > recommend naming your tables and columns with less cryptic names. I > know I had to repeatedly look up the table and column names when > working on this, and that's not time well spent. Letters are cheap! > > Good luck! > > Michael Glaesemann > grzm seespotcode net > > > Thanks for all your suggestions and explanations, they are very helpfull! Regards, Paul
pgsql-novice by date: