Thread: combining tables
I have a situation now where I have a group of 4 tables within one database. All tables make use of the code defined in the first column as the primary key. 1 of the 4 tables has the complete range of keys, in this case about 1,000. The other 3 tables have data representing most but not all of those keys, and in different sets. I want to end up with a new 'combined' table which will allow me to immediately 'see' missing data from the 3 smaller tables and take the necessary steps to 'fill in the blanks'. I realize I would of course have to create that table without constrictions other than the primary key. My <<< shaky >>> understanding from the manual is that I could use something like: SELECT * FROM tmain,tsub1,tsub2,tsub3 NATURAL INNER JOIN tcombo Am I on the right track? Thanks for any heads up you'd be kind enough to provide. George
G. J. Walsh wrote: > 1 of the 4 tables has the complete range of keys, in this case about > 1,000. The other 3 tables have data representing most but not all of > those keys, and in different sets. I want to end up with a new > 'combined' table which will allow me to immediately 'see' missing data > from the 3 smaller tables and take the necessary steps to 'fill in the > blanks'. I realize I would of course have to create that table without > constrictions other than the primary key. Do you mean you want to find keys that exist in the 'main' table but have no match in the other three tables? -- Cliff Nieuwenhuis
Hi, Cliff, and thanks for offering your help. All possible keys are in the main table. The remaining tables are organized the same way but are incomplete; they lack some matching rows and or they lack some data. I thought it would be better to combine the columns into 1 table because all the data is available for search purposes without having to complicate a high volume application with many shifting variables in the query. To arrive at this stage I have already optimized nearly 400,00 lines of tab- delimited text in 15 files (tables) to get to the present stage of 4 tables of under 4,000 rows. Once established, the data will remain virtually constant for a period of one year between scheduled updates. So the answer to your question is 'yes', but that is only a 'side benefit' as I see it. All of this comes from years of work with C-ISAM which is why I am so unsure of working with unions and joins. George On Sat, 2008-10-25 at 23:59 -0500, Cliff Nieuwenhuis wrote: > G. J. Walsh wrote: > > > 1 of the 4 tables has the complete range of keys, in this case about > > 1,000. The other 3 tables have data representing most but not all of > > those keys, and in different sets. I want to end up with a new > > 'combined' table which will allow me to immediately 'see' missing data > > from the 3 smaller tables and take the necessary steps to 'fill in the > > blanks'. I realize I would of course have to create that table without > > constrictions other than the primary key. > > Do you mean you want to find keys that exist in the 'main' table but > have no match in the other three tables? > > -- > Cliff Nieuwenhuis >
G. J. Walsh writes: > > 1 of the 4 tables has the complete range of keys, in this case about > > 1,000. The other 3 tables have data representing most but not all of > > those keys, and in different sets. I want to end up with a new > > 'combined' table which will allow me to immediately 'see' missing > > data from the 3 smaller tables and take the necessary steps to > > 'fill in the blanks'. I realize I would of course have to create > > that table without constrictions other than the primary key. and > My <<< shaky >>> understanding from the manual is that I could use > something like: > > SELECT * FROM tmain,tsub1,tsub2,tsub3 NATURAL INNER JOIN tcombo > > Am I on the right track? I asked > Do you mean you want to find keys that exist in the 'main' table but > have no match in the other three tables? .. and you said 'yes'. There is more to it on your side, but (to me) those details are unique to what you are doing with your application. For me (and I'm a novice as well and speak under correction here), I will start will a query I can easily understand and then work towards one that is more efficient. In your case I might try something like: SELECT * FROM tmain WHERE tmain.primarykey NOT IN ( SELECT DISTINCT foreignkey FROM tsub1 ); ..to get records that are in the 'main' table but have no related information in tsub1. You would need to repeat this for the other tables of course. If you wanted a big table with all 'missing' fields, I would try an OUTER JOIN instead of the NATURAL INNER JOIN, because you want the records in the 'main' table to appear in the results even if there is no match in one of the 'sub' tables. Perhaps someone on this list has a better idea or understands your situation better than I do.
Excellent suggestion, Cliff! Actually, working from your suggestion I was able to get a handle on what missing data I'd have to generate in the secondary tables. Then I would be able to unite what I need into one main table for the search queries. Note also the suggestion of Bastiaan regarding the use of left outer joins, which you also have suggested over the natural inner join. I am learning a lot of useful tools here, and I am the richer for it and for your helpfulness. Much appreciated, Cliff! George On Sun, 2008-10-26 at 20:44 -0500, Cliff Nieuwenhuis wrote: > G. J. Walsh writes: > > > > 1 of the 4 tables has the complete range of keys, in this case about > > > 1,000. The other 3 tables have data representing most but not all of > > > those keys, and in different sets. I want to end up with a new > > > 'combined' table which will allow me to immediately 'see' missing > > > data from the 3 smaller tables and take the necessary steps to > > > 'fill in the blanks'. I realize I would of course have to create > > > that table without constrictions other than the primary key. > > and > > > My <<< shaky >>> understanding from the manual is that I could use > > something like: > > > > SELECT * FROM tmain,tsub1,tsub2,tsub3 NATURAL INNER JOIN tcombo > > > > Am I on the right track? > > I asked > > > Do you mean you want to find keys that exist in the 'main' table but > > have no match in the other three tables? > > .. and you said 'yes'. There is more to it on your side, but (to me) > those details are unique to what you are doing with your application. > > For me (and I'm a novice as well and speak under correction here), I > will start will a query I can easily understand and then work towards > one that is more efficient. In your case I might try something like: > > SELECT * FROM tmain WHERE tmain.primarykey NOT IN ( > SELECT DISTINCT foreignkey FROM tsub1 ); > > ..to get records that are in the 'main' table but have no related > information in tsub1. You would need to repeat this for the other > tables of course. > > If you wanted a big table with all 'missing' fields, I would try an > OUTER JOIN instead of the NATURAL INNER JOIN, because you want the > records in the 'main' table to appear in the results even if there is > no match in one of the 'sub' tables. > > Perhaps someone on this list has a better idea or understands your > situation better than I do. >
--- On Sun, 10/26/08, Cliff Nieuwenhuis <cliff@nieusite.com> wrote: > G. J. Walsh writes: > > > > 1 of the 4 tables has the complete range of keys, > in this case about > > > 1,000. The other 3 tables have data representing > most but not all of > > > those keys, and in different sets. I want to end > up with a new > > > 'combined' table which will allow me to > immediately 'see' missing > > > data from the 3 smaller tables and take the > necessary steps to > > > 'fill in the blanks'. I realize I would > of course have to create > > > that table without constrictions other than the > primary key. > > and > > > My <<< shaky >>> understanding from > the manual is that I could use > > something like: > > > > SELECT * FROM tmain,tsub1,tsub2,tsub3 NATURAL INNER > JOIN tcombo > > > > Am I on the right track? > > I asked > > > Do you mean you want to find keys that exist in the > 'main' table but > > have no match in the other three tables? > > .. and you said 'yes'. There is more to it on your > side, but (to me) > those details are unique to what you are doing with your > application. > > For me (and I'm a novice as well and speak under > correction here), I > will start will a query I can easily understand and then > work towards > one that is more efficient. In your case I might try > something like: > > SELECT * FROM tmain WHERE tmain.primarykey NOT IN ( > SELECT DISTINCT foreignkey FROM tsub1 ); > > ..to get records that are in the 'main' table but > have no related > information in tsub1. You would need to repeat this for the > other > tables of course. > > If you wanted a big table with all 'missing' > fields, I would try an > OUTER JOIN instead of the NATURAL INNER JOIN, because you > want the > records in the 'main' table to appear in the > results even if there is > no match in one of the 'sub' tables. > > Perhaps someone on this list has a better idea or > understands your > situation better than I do. > I'm no expert either but it seems to me the situation calls for a view. It will always show up-to-date data. Unless you want to store additional data for the lines in with no matches. Then you would indeed need another table. Youmay have to handle updates to tmain and the tsubs in this case, probably with triggers. Bruce