Thread: couple of general questions
Hi, I am in the process of developing my first database with PostgreSQL and was hoping to find help with a few questions: 1. Is it possible to use syntax like "if object exists then drop it" before running your ddl statements? Which system table do you hit? 2. In the documentation it says that of the character data types text is best, over varchar(). Why is this? What if any are the benefits of using an array data type over, say, a child table holding all the values of the array? 3. OK the CLUSTER statement. Say you cluster on the last name of an employee table. When you then do an insert into this table for someone named Smith is the record logically written between the R's and the T's? Just need to verify this is right :) 4. Lastly I wanted to confirm that the best way to regularly run a piece of sql is to schedule a OS script to handle it. Is this right? Thanks for any help! Culley __________________________________________________ Do You Yahoo!? Get email at your own domain with Yahoo! Mail. http://personal.mail.yahoo.com/
> 1. Is it possible to use syntax like "if object > exists then drop it" before running your ddl > statements? Which system table do you hit? If there is I don't know about it... but if you want to find the tables, they're in pg_class. Along with a bunch of other stuff. > 2. In the documentation it says that of the character > data types text is best, over varchar(). Why is this? Best, depending on the scenario. In cases where you are using a fixed number of characters, there's no need for the overhead of a varchar. For instance if you are storing state abbreviations, they will ALWAYS be 2 characters. The database can look up those fixed fields faster. But if you are storing full state names, it would be a waste to have all those passing spaces so that you could fit Mississippi with Maine. All that being said, it's my understanding that there will be no benefit to using the CHAR type over the VARCHAR type in 7.1 due to architectural changes. > What if any are the benefits of using an array data > type over, say, a child table holding all the values > of the array? One, it reduces the complexity of the database. Two, you don't have to perform a join or extra query to extract those child values. I'd personally use them more if there were better ways to query array information. Such as the length of the array, etc. Perhaps there is and I'm just ignorant... > 3. OK the CLUSTER statement. Say you cluster on the > last name of an employee table. When you then do an > insert into this table for someone named Smith is the > record logically written between the R's and the T's? > Just need to verify this is right :) Can't comment much here, but CLUSTERing seems to be somewhat broken currently. I'm not planning on doing anything with it until I hear that everything works dandy. > 4. Lastly I wanted to confirm that the best way to > regularly run a piece of sql is to schedule a OS > script to handle it. Is this right? There are a number of ways to handle it... I would handle it in whatever way you feel most comfortable. If that's running psql from crontab, so be it. Greg
------------------------ Best, depending on the scenario. In cases where you are using a fixed number of characters, there's no need for the overhead of a varchar. For instance if you are storing state abbreviations, they will ALWAYS be 2 characters. The database can look up those fixed fields faster. But if you are storing full state names, it would be a waste to have all those passing spaces so that you could fit Mississippi with Maine. All that being said, it's my understanding that there will be no benefit to using the CHAR type over the VARCHAR type in 7.1 due to architectural changes. ----------------------------------- Is there any difference between varchar and text other than varchar places a cap on the number of characters? Culley __________________________________________________ Do You Yahoo!? Get email at your own domain with Yahoo! Mail. http://personal.mail.yahoo.com/
> Is there any difference between varchar and text other > than varchar places a > cap on the number of characters? That I know of? No. But I would hope someone more knowledgeable would correct me if I'm wrong. Greg
Culley Harrelson writes: > 1. Is it possible to use syntax like "if object > exists then drop it" before running your ddl > statements? Which system table do you hit? There is no such syntax. You just use DROP and ignore the error message if appropriate. > 2. In the documentation it says that of the character > data types text is best, over varchar(). I think the text type was carried over from the old Postquel language whereas the varchar type was added later for SQL compliance. All in all there isn't much difference between them. > What if any are the benefits of using an array data > type over, say, a child table holding all the values > of the array? Don't use arrays. > 3. OK the CLUSTER statement. Say you cluster on the > last name of an employee table. When you then do an > insert into this table for someone named Smith is the > record logically written between the R's and the T's? > Just need to verify this is right :) No. The CLUSTER only affects the existing data. New records are appended to the end until the next CLUSTER. > 4. Lastly I wanted to confirm that the best way to > regularly run a piece of sql is to schedule a OS > script to handle it. Is this right? A cron job, sure. -- Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
Gregory Wood wrote: > Best, depending on the scenario. In cases where you are using a fixed number > of characters, there's no need for the overhead of a varchar. For instance > if you are storing state abbreviations, they will ALWAYS be 2 characters. Does anyone else get annoyed when going on to an american site to register or buy something and find that the state field is only 2 characters long? Remember, America != World. Be careful when you say always... -- Martijn van Oosterhout <kleptog@cupid.suninternet.com> http://cupid.suninternet.com/~kleptog/
> > Best, depending on the scenario. In cases where you are using a fixed number > > of characters, there's no need for the overhead of a varchar. For instance > > if you are storing state abbreviations, they will ALWAYS be 2 characters. > > Does anyone else get annoyed when going on to an american site to > register or buy something and find that the state field is only > 2 characters long? Sorry, I didn't realize that many other countries had states... the only other frame of reference that I have is Canadian Provinces, which also have 2 character codes. Since we only do business with countries in the United States and Canada I'm woefully ignorant of international standards. I'd also like to apologise for our backwards use of the Imperial measurement system... but that one is not my fault. > Remember, America != World. Be careful when you say always... Sorry, I was trying to give what I thought was a somewhat more concrete example without having to stipulate a number of qualifications. Let me rephrase to be more internationally correct: "If you are storing United States of America state abbreviations, they will ALWAYS be 2 characters, as long as the U.S. Postal Service decides to stick with the current standards, the United States remains an autonomous state capable of determining internal boundaries and their designations, the programming community doesn't agree on another standard of naming states, you don't decide to use a numbering system for faster table joins, and you don't decide to make up your own abbreviations consisting of a number of characters other than 2." Does anyone else get annoyed when people jump all over someone because they didn't spend 20 minutes proofreading something for political correctness when they were trying to help someone else? Greg
> Does anyone else get annoyed when people jump all over > someone because they > didn't spend 20 minutes proofreading something for political > correctness > when they were trying to help someone else? > I agree with you, but I'm an American so I don't think my opinion counts! ;-) > Greg > Cheers, Craig
On Fri, 19 Jan 2001 17:02:43 Gregory Wood wrote: >> Does anyone else get annoyed when going on to an american site to >> register or buy something and find that the state field is only >> 2 characters long? > [snip] > >Does anyone else get annoyed when people jump all over someone because they >didn't spend 20 minutes proofreading something for political correctness >when they were trying to help someone else? No, but I get annoyed when I can't enter my address in a form because the form's creator did not consider all possible valid responses -- even for American addresses. For instance there are 50 U.S. states, but there's also the District of Columbia, U.S. territories, possessions, and military overseas addresses (APO/FPO) that have a "state" code. You'd be surprised how many sites fail to allow for all these variations of "American" addresses. Some things really are a fixed length of 2 (the number of digits needed to specify hour of day, day of month, or month of year). As for the state code, it wasn't political correctness, it was a valid point. Tony -- Anthony E. Greene <agreene@pobox.com> <http://www.pobox.com/~agreene/> PGP Key: 0x6C94239D/7B3D BD7D 7D91 1B44 BA26 C484 A42A 60DD 6C94 239D Chat: AOL/Yahoo: TonyG05 ICQ: 91183266 Linux. The choice of a GNU Generation. <http://www.linux.org/>
"Anthony E . Greene" wrote: > On Fri, 19 Jan 2001 17:02:43 Gregory Wood wrote: > >> Does anyone else get annoyed when going on to an american site to > >> register or buy something and find that the state field is only > >> 2 characters long? > [snip] > >Does anyone else get annoyed when people jump all over someone because they > >didn't spend 20 minutes proofreading something for political correctness > >when they were trying to help someone else? > No, but I get annoyed when I can't enter my address in a form because the > form's creator did not consider all possible valid responses -- even for > American addresses..... > Some things really are a fixed length of 2 (the number of digits > needed to specify hour of day, day of month, or month of year). > As for the state code, it wasn't political correctness, it was a valid > point. I'm currently doing a website for 5 continents, 18 languages... all from *one* set of PHP/PgSQL code. I have this to say about states/regions/etc: Yes, it sucks when UI fails to consider localized versions. I would treat them the say way I would treat anybody else who didn't want to make extra effort to get my business... I'd find somebody who _did_ want it. For this reason, I have both a two letter state/code entry, a county/area/region entry, and, depending on the site/area/language used, it will validate and work with different fields. It's usually a matter of design. Many folks in the US simply *can't* imagine a multinational world. So let them live in their little bubble, and buy from somebody else. Heck, have you ever noticed how many sites only support one form of currency, or list all dates in US style, or still use (gasp) the english measurement system (when even the english have given up on most of it)? Yes, it's annoying when somebody makes a crusade of it, and makes the lives of others miserable. But the good db-design issue is this: Target your designs for growth outside of your current market, and you will spend less time redesigning and rewriting. Make your zip/postal/delivery code column at least 18 chars, and alphanumeric (many countries use letters in their postal systems) in some way. Allow "state" to be blank (what is somebody living in luxembourg going to put in?) Phone number fields should support 22 chars (to allow for double growth in this space), and so on. It used to be that "good, tight, db design" saved as much as 20 or 50 whole megabytes, which was worth $5000-10,000 dollars (USD) per field. Now, that same design decision may save about $10 dollars (USD), and cost you 4-40 hours in redesign, drop, and reload, when you need to grow... which is *much* more expensive than adding another cheap 40 GB disk. Disk is cheap. Redesign is not. Make your fields large, and plentiful. If you need to trim then down for fast reporting and searches, again, disk is cheap, redesign is not, so make another (tighter) table with triggered entries for your searching/sorting. -Ronabop -- Personal: ron@opus1.com, 520-326-6109, http://www.opus1.com/ron/ Work: rchmara@pnsinc.com, 520-546-8993, http://www.pnsinc.com/ The opinions expressed in this email are not necessarily those of myself, my employers, or any of the other little voices in my head.
Gregory Wood wrote: > > > Does anyone else get annoyed when going on to an american site to > > register or buy something and find that the state field is only > > 2 characters long? > > Sorry, I didn't realize that many other countries had states... the only > other frame of reference that I have is Canadian Provinces, which also have > 2 character codes. Since we only do business with countries in the United > States and Canada I'm woefully ignorant of international standards. I'd also > like to apologise for our backwards use of the Imperial measurement > system... but that one is not my fault. Umm, sorry. I must have come over somewhat stronger than I intended. It was supposed to be just a passing comment. The reason I picked up on it is because it's the first thing people think of when looking for a reason for fixed length fields and (as pointed out on this thread) it's not even valid for the whole of the US. Personally, for all DB systems I build now I just make *every* text field type text and never use char()/varchar(). Everything time I think I've made it long enough, someone comes up with an example that's longer. The performance difference is marginal or non-existant. Don't think I'm attacking you, I'm just trying to help... -- Martijn van Oosterhout <kleptog@cupid.suninternet.com> http://cupid.suninternet.com/~kleptog/
Martijn van Oosterhout wrote: > > I must have come over somewhat stronger than I intended. > It was supposed to be just a passing comment. The reason > I picked up on it is because it's the first thing people > think of when looking for a reason for fixed length fields > and (as pointed out on this thread) it's not even valid > for the whole of the US. > Correction. The two character field *is* valid for the whole US. All the places the USPS delivers have legit 2 character codes, listed conveniently at: http://www.usps.com/ncsc/lookups/usps_abbreviations.html#states This includes DC, PR, AE, etc. The USPS even refers to them as "states" (their quotes) when referring to non-state entities such as the Armed Forces. That all being said, I've seen too many Web sites that have only the 50 states (plus usually DC) available in a state drop down. Increase that to include all 60 or so from the above URL, and it's good design for, say, an online company that only delivers via the USPS, because it eliminates data entry errors. On the other hand, a site that caters to an international base, needs more than just that. -- Richard Seymour : Anarchy Software, Inc. - * - - * - - - * -+- * - - - * - - * - `°º¤ø,¸ ¸,ø¤º°' `°º¤ø,¸¸,ø¤º°
> Umm, sorry. Sorry again, bad day mixed with feeling rubbed the wrong way. > and (as pointed out on this thread) it's not even valid > for the whole of the US. That's new on me... I have a list of U.S. Postal Codes that all consist of two letters. This includes all the U.S. states, territories and Canadian Provinces. I'd love an example of an invalid code. > Personally, for all DB systems I build now I just make > *every* text field type text and never use char()/varchar(). > Everything time I think I've made it long enough, someone > comes up with an example that's longer. The performance > difference is marginal or non-existant. That wouldn't be too bad an idea, except the way my interface handles TEXT fields. I use Delphi with a client/server architecture called Asta. Some Delphi visual components (the [in]famous grid control in particular) only display TEXT fields as "(MEMO)" (or "(memo)" if the value is NULL). In addition, Asta allows me to choose whether or not to send Blob and Memo fields. This can be incredibly handy for reducing bandwidth on tables with large fields. If I did choose to make all my fields TEXT, I would lose that advantage :( If it weren't for those little implementation details, I think all my fields would be TEXT as well :) > Don't think I'm attacking you, I'm just trying to help... Sorry again... was feeling a bit defensive. Thank god it's the weekend... Greg
Ron Chmara wrote: > It's usually a matter of design. Many folks in the US simply *can't* > imagine a multinational world. So let them live in their little bubble, > and buy from somebody else. Heck, have you ever noticed how many sites > only support one form of currency, or list all dates in US style, or > still use (gasp) the english measurement system (when even the english > have given up on most of it)? Please, this is still the PostgreSQL GENERAL list, not some place where <insert_favorite_minority>-bashing is wanted or tolerated. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com # _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
On Sat, 20 Jan 2001, Martijn van Oosterhout wrote: > Gregory Wood wrote: > > Best, depending on the scenario. In cases where you are using a fixed number > > of characters, there's no need for the overhead of a varchar. For instance > > if you are storing state abbreviations, they will ALWAYS be 2 characters. > > Does anyone else get annoyed when going on to an american site to > register or buy something and find that the state field is only > 2 characters long? Nope. But then again my e-commerce site doesn't accept orders from outside of the US or Canada. Shipping's too high. Vince. -- ========================================================================== Vince Vielhaber -- KA8CSH email: vev@michvhf.com http://www.pop4.net 128K ISDN from $22.00/mo - 56K Dialup from $16.00/mo at Pop4 Networking Online Campground Directory http://www.camping-usa.com Online Giftshop Superstore http://www.cloudninegifts.com ==========================================================================
------------------------ Best, depending on the scenario. In cases where you are using a fixed number of characters, there's no need for the overhead of a varchar. For instance if you are storing state abbreviations, they will ALWAYS be 2 characters. The database can look up those fixed fields faster. But if you are storing full state names, it would be a waste to have all those passing spaces so that you could fit Mississippi with Maine. All that being said, it's my understanding that there will be no benefit to using the CHAR type over the VARCHAR type in 7.1 due to architectural changes. ----------------------------------- Is there any difference between varchar and text other than varchar places a cap on the number of characters? Culley
Harrelson, CulleyX writes: > Is there any difference between varchar and text other than varchar places a > cap on the number of characters? Varchar is SQL compliant, Text is not. You can use Varchar without a character limit, but that is not SQL compliant either. -- Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
Gregory Wood wrote: > Sorry, I was trying to give what I thought was a somewhat more concrete > example without having to stipulate a number of qualifications. Let me > rephrase to be more internationally correct: "If you are storing United > States of America state abbreviations, they will ALWAYS be 2 characters, as > long as the U.S. Postal Service decides to stick with the current standards, > the United States remains an autonomous state capable of determining > internal boundaries and their designations, the programming community > doesn't agree on another standard of naming states, you don't decide to use > a numbering system for faster table joins, and you don't decide to make up > your own abbreviations consisting of a number of characters other than 2." 1) You're right about the unusefulness of "politically correct" mails. A waste of time ... unless it means sometimes missing your interlocutor's point ... 2) However, the point made by MvO is 100% valid : an awful lot of software is built with US specifics hardcoded : a character is 1 byte wide, a phone number is 7+3 digits wide, the decimal separator is a dot, a ZIP code is 2 letters and 5 digits, time is expressed with 0-11 hours + AM/PM, paper size is 8.5"x11" and so on ... Quite a pain to use or translate if you happen to live in one of those backward countries where people still live 24-hours days ... <Venting> As to the point of the Imperial Units System : If Anglo-Saxons are dumb enough to let the IUS imposed on them, fine for them, but they should not expect the rest of the world to cope up with such nonsense ... Having to use a calculator to convert between pounds per square inch to tons per sqare mile ... Baaah ! The IS unit is Newtons per square meter, and all decimal powers of it. </Venting> Emanuel Charpentier Decimal since 1798 (Year VII of the Republic) -- Emmanuel Charpentier