Re: Querying multiple tables - Mailing list pgsql-novice

From Paul Malherbe
Subject Re: Querying multiple tables
Date
Msg-id 46B14FD1.6020209@tartan.co.za
Whole thread Raw
In response to Re: Querying multiple tables  (Michael Glaesemann <grzm@seespotcode.net>)
List pgsql-novice
Michael Glaesemann wrote:
>
> On Aug 1, 2007, at 14:04 , Paul Malherbe wrote:
>
>> Sorry about that, here is an example of what I am doing and what I
>> would like to achieve.
>
> Thanks. Providing an example of the results you're seeking is very
> useful, as is providing example data.
>
> From your definitions (and the fact that you're posting to the -novice
> list), I suspect you're quite new to SQL. I'll point out some things I
> think you should do to improve your code.
>
>> Create table rcaowm (rom_owner varchar(7), rom_name varchar(30));
>> Create unique index rcaowm_key1 on rcaowm (rom_owner);
>
> SQL syntax is case insensitive (except for identifiers if they're
> quoted). Capitalizing the first letter of the statement is
> unnecessary. It doesn't matter, but it looks a little odd.
>
> Also, I think you should take some time to learn about database
> normalization constraints such as NOT NULL, PRIMARY KEY and FOREIGN
> KEY. These are really fundamental to using a relational database
> server. I would rewrite the above statements as:
>
> CREATE TABLE rcaowm
> (
>     rom_owner TEXT PRIMARY KEY
>     , rom_name TEXT NOT NULL
> );
>
> For the most part, PRIMARY KEY is shorthand for UNIQUE NOT NULL. I
> recommend setting at NOT NULL constraint on your columns by default as
> well, as it will prevent perhaps unexpected query results, especially
> if you're just starting out.
>
> As for using TEXT instead of varchar(7), there's no performance
> advantage to specifying the length of the varchar column. Unless you
> have a compelling business reason to restrict the number of
> characters, I'd leave it as TEXT.
>
>> Insert into rcaowm values ('OOOOOOO', 'Owners Name');
>
> Be explicit when naming columns in an INSERT statement. This increases
> readability and prevents unexpected bugs caused by changes in schema
> definition:
>
> INSERT INTO rcaowm (rom_owner, rom_name)  VALUES  ('OOOOOOO', 'Owners
> Name');
>
>> Create table rcaowt (rot_owner varchar(7), rot_date int4, rot_tramt
>> decimal);
>
> CREATE TABLE rcaowt (
>     rot_owner TEXT NOT NULL
>         REFERENCES rcaowm
>     , rot_date DATE NOT NULL
>     , rot_tramt DECIMAL NOT NULL
> );
>
> I assume that the value of the rot_owner column must appear in the
> rcaowm table in the rom_owner column. If this is so, set a foreign key
> constraint so the database server can enforce it. That's what the
> "REFERENCES rcaowm" clause does. If no column is listed in the
> REFERENCES clause, it references the primary key of the referenced
> table, in this case rom_owner. To be explicit, it could have been
> written REFERENCES rcaowm (rom_owner).
>
> Also, use appropriate datatypes. rot_date is a date, so use a DATE
> column. This adds an additional constraint to be sure only valid date
> values can be submitted. Right now, there's nothing to prevent someone
> from doing something like:
>
> INSERT INTO rcaowt (rot_owner, rot_date, rot_tramt) VALUES ('OOOOOOO',
> '20074533', 5000);
>
> Using a date column will prevent that. Also, there are a lot of
> functions that are available for handling dates which you might find
> useful.
>
>> Create index rcaowt_key1 on rcaowt (rot_owner);
>
> A plain index doesn't really add anything to the schema definition,
> though it can be used by the database server to look up information in
> the table.
>
>> Create table rcatnm (rtm_owner varchar(7), rtm_tenant varchar(7),
>> rtm_name varchar(30));
>> Create unique index rcatnm_key1 on rcatnm (rtm_owner, rtm_tenant);
>> Insert into rcatnm values ('OOOOOOO', 'TTTTTTT', 'Tenants Name');
>
> Same as above: use TEXT instead of VARCHAR(n), use NOT NULL, PRIMARY
> KEY and REFERENCES, list columns in INSERT statements.
>
>> Create table rcatnt (rtt_owner varchar(7), rtt_tenant varchar(7),
>> rtt_date int4, rtt_tramt decimal);
>
> This is a question I had in my first response: can rtt_owner be
> anything other than the rot_owner for rtt_tenant? If so, your table
> definition should probably be
>
> CREATE TABLE rcatnt
> (
>     rtt_owner TEXT NOT NULL
>         REFERENCES rcaowm
>     , rtt_tenant TEXT NOT NULL
>         REFERENCES rcaowt
>     , rtt_date DATE NOT NULL
>     , rtt_tramt DECIMAL NOT NULL
> );
>
> If the rtt_owner must be the same as rot_owner, just leave it out. You
> can always look up the owner by joining through the rtt_tenant table.
>
> CREATE TABLE rcatnt
> (
>     rtt_tenant TEXT NOT NULL
>         REFERENCES rcaowt
>     , rtt_date DATE NOT NULL
>     , rtt_tramt DECIMAL NOT NULL
> );
>
> In either case, it's not obvious to me what the primary key on this
> table should be. You need some way to uniquely identify each row in
> the table.
>
>> Create index rcatnt_key1 on rcatnt (rtt_owner, rtt_tenant);
>
> Again, this index doesn't do anything for your schema definition.
>
>> Select rom_name as "Owner_Name", rtm_name as "Tenant_Name",
>> sum(rot_tramt) as "Sum_Owner_Value" , sum(rtt_tramt) as
>> "Sum_Tenant_Value" from rcaowm join rcaowt on (rom_owner = rot_owner)
>> join rcatnt on (rom_owner = rtt_owner) join (select rtm_tenant,
>> rtm_name from rcatnm) as tenants on (rtm_tenant = rtt_tenant) group
>> by rom_name, rtm_name;
>
> Do you find this easy to read? Personally, I don't. It helps if you
> can format the query to make it easy for others (and yourself!) to read.
>
>> This query returns:
>>
>> Owner_Name  | Tenant_Name  | Sum_Owner_Value | Sum_Tenant_Value
>> -------------+--------------+-----------------+------------------
>> Owners Name | Tenants Name |         -7202.0 |          12000.0
>>
>>
>> I would like to return:
>>
>> Owner_Name  | Tenant_Name  | Sum_Owner_Value | Sum_Tenant_Value
>> -------------+--------------+-----------------+------------------
>> Owners Name | Tenants Name |         -3601.0 |           4000.0
>
> This should get you most of the way there, as it gives the amounts you
> want. However, it's not clear to me that this is *really* the result
> you're looking for because of the ambiguity as to what owner you're
> looking for (see my questions about the rcatnt table).
>
> SELECT rot_owner, sum_rot_tramt, rtt_tenant, sum_rtt_tramt
> FROM rcatnm
> JOIN (
>     SELECT rot_owner,  sum(rot_tramt) as sum_rot_tramt
>     FROM rcaowt
>     GROUP BY rot_owner) o ON (rtm_owner = rot_owner)
> JOIN (
>     SELECT rtt_tenant, sum(rtt_tramt) as sum_rtt_tramt
>     FROM rcatnt
>     GROUP BY rtt_tenant ) t ON (rtm_tenant = rtt_tenant);
>  rot_owner | sum_rot_tramt | rtt_tenant | sum_rtt_tramt
> -----------+---------------+------------+---------------
> OOOOOOO   |       -3601.0 | TTTTTTT    |        4000.0
> (1 row)
>
> Hope this gets you pointed in the right direction. And I still
> recommend naming your tables and columns with less cryptic names. I
> know I had to repeatedly look up the table and column names when
> working on this, and that's not time well spent. Letters are cheap!
>
> Good luck!
>
> Michael Glaesemann
> grzm seespotcode net
>
>
>
Thanks for all your suggestions and explanations, they are very helpfull!

Regards, Paul

pgsql-novice by date:

Previous
From: Michael Glaesemann
Date:
Subject: Re: Querying multiple tables
Next
From: "Ralf Strandell"
Date:
Subject: Postgresql on 64bit Debian 4.0: libz.so.1 => not found