Thread: Postgresql and programming
Hello all, as usual in CS, there can be many ways to do the same thing. For instance, I can uppercase a string in a function in POstgreSQL or I can uppercase it before sending the query to the database. I am reading Bruce Monjian's book right now (page 204, Functions & triggers/triggers), and I am wondering in terms of performance and speed what way is prefered (option A or B). In terms of performance also, is it preferable to desing a database as: create table articles ( refarticle text, title text, authorID integer, authorname text, editorID integer, editorname text, ... etc... ); OR : create articles( refarticle text, title text, authorID integer, editorID integer, ... etc... ); create table authors( authorname text, authorID integer, ...etc... ); create table editors( editorID integer, editorname text, ...etc... ); as the critical fields that searches will primarily be based on are obvious here. TIA. T.J. (P.S. some of you might remember design B from biblio.mdb :-) . )
If you're only consideration is raw performance of lookups on text fields, I suppose there might be some measureable performance advantage in putting everything in one table, but I doubt you'd really notice the difference. If you *did* do this, what is the significance of the authorID field? Normalize, normalize, normalize. Use a relational database for it's strengths. Don't duplicate data. "T.J.Farrell" wrote: > > In terms of performance also, is it preferable to desing a database as: > > create table articles ( > refarticle text, > title text, > authorID integer, > authorname text, > editorID integer, > editorname text, > ... etc... > ); > > OR : > > create articles( > refarticle text, > title text, > authorID integer, > editorID integer, > ... etc... > ); > > create table authors( > authorname text, > authorID integer, > ...etc... > ); > > create table editors( > editorID integer, > editorname text, > ...etc... > ); ________________________ Ron Peterson rpeterson@yellowbank.com
T.J.Farrell wrote: > Hello all, > > as usual in CS, there can be many ways to do the same thing. > For instance, I can uppercase a string in a function in POstgreSQL or I can > uppercase it before sending the query to the database. I am reading Bruce > Monjian's book right now (page 204, Functions & triggers/triggers), and I am > wondering in terms of performance and speed what way is prefered (option A > or B). > > In terms of performance also, is it preferable to desing a database as: > > create table articles ( > refarticle text, > title text, > authorID integer, > authorname text, > editorID integer, > editorname text, > ... etc... > ); > > OR : What will you do when a book is written by two authors, eg. Richard Patterson and John Hennesy ? Some technical books are even written by more authors ! > > > create articles( > refarticle text, > title text, > authorID integer, > editorID integer, > ... etc... > ); > > create table authors( > authorname text, > authorID integer, > ...etc... > ); > > create table editors( > editorID integer, > editorname text, > ...etc... > ); > > as the critical fields that searches will primarily be based on are obvious > here. > > TIA. > > T.J. > > (P.S. some of you might remember design B from biblio.mdb :-) . ) Jurgen Defurne
Well I think the problem here is normalization which is more important than perfomance. The anomalies of the first schema could take us into an extremely poor situation, even in terms of perfomance. Could you imagine a situation where you have 1 million records and you have to update the author or editorname? How many replacements in the first case? Imagine the same situation in the second case? Jesus. On Wed, 7 Jun 2000, T.J.Farrell wrote: > Hello all, > > as usual in CS, there can be many ways to do the same thing. > For instance, I can uppercase a string in a function in POstgreSQL or I can > uppercase it before sending the query to the database. I am reading Bruce > Monjian's book right now (page 204, Functions & triggers/triggers), and I am > wondering in terms of performance and speed what way is prefered (option A > or B). > > In terms of performance also, is it preferable to desing a database as: > > create table articles ( > refarticle text, > title text, > authorID integer, > authorname text, > editorID integer, > editorname text, > ... etc... > ); > > OR : > > create articles( > refarticle text, > title text, > authorID integer, > editorID integer, > ... etc... > ); > > create table authors( > authorname text, > authorID integer, > ...etc... > ); > > create table editors( > editorID integer, > editorname text, > ...etc... > ); > > as the critical fields that searches will primarily be based on are obvious > here. > > TIA. > > T.J. > > (P.S. some of you might remember design B from biblio.mdb :-) . ) > >
On Wed, 14 Jun 2000, Ron Peterson wrote: > Normalize, normalize, normalize. Use a relational database for it's > strengths. Don't duplicate data. On a related note, I have a similar scheme where I have a "master" table which includes integer references to other tables like the previous example had with authors. So, like the following: CREATE TABLE master_device_list ( device_name CHAR(80), -- Do people use CHARs or VARCHARs? device_param1 INTEGER, ... device_config_code INTEGER REFERENCES device_config NOT NULL DEFAULT 1 /* Sequences seem to start at a default of 1. Is this always true? Will it remain this way in the future? */ ); CREATE TABLE device_config ( device_config_code SERIAL PRIMARY KEY, param_1 INTEGER, ... param_n INTEGER, UNIQUE (param_1, ... , param_n) ); First, is this an acceptible way to construct a relational database? 99.9% of all of these devices will use the exact same configuration. Also, addition and removal of data is not that important in terms of speed. Data lookup is more important for my application. Now I want to do updates and adds. The issue is how to determine if the device_configuration that I want already exists. If it does exist, I can use the following: UPDATE master_device_list SET device_config_code = ( SELECT device_config_code FROM device_config WHERE param_1 = integer_value AND ... param_n = integer_value ) WHERE device_param1 = some_value; But this will fail if the entry doesn't exist yet. Is there some standard way of handling this sort of thing? Is it some sort of transaction? I appreciate any and all help. Thanks, R.