Thread: weird duplicate data problem
a table was created with: create table bills (id serial not null,account_no int4 not null,date_issued date not null,..... ); The following SQL script returns only 1 row of data: select oid, * from bills where id = xxxxx' order by oid; However with the following SQL: select oid, * from bills where date_issued = '01/01/2001' order by oid; some of the data is duplicated, including the OID. When trying to 'vaccum' the database, get NOTICE: Index bills_id_key: number of index tupples (4755) is not the same as heap (7800). What is happening? What could have caused this? Database version used is V6.50. I know it's old, but i hope it's not the version that's causing this problem. Thanks in advance for any suggestions.
On Fri, 14 Dec 2001, Carolyn Lu Wong wrote: > a table was created with: > > create table bills ( > id serial not null, > account_no int4 not null, > date_issued date not null, > ..... > ); There is no reason to set id to not null. It is a serial field. It will automatically increase. Your account_no should probably be set to a char field. My old rule of thumb is not to make something int unless you plan to perform some type of math on it. > > The following SQL script returns only 1 row of data: > > select oid, * from bills where id = xxxxx' order by oid; > If your id field is serial, all the records will have unique id's in them. Therefore you are only going to get one row of data. > > > However with the following SQL: > > select oid, * from bills where date_issued = '01/01/2001' > order by oid; > Can't help you here. I'm running 7.1.3 and the query works fine. > some of the data is duplicated, including the OID. > > When trying to 'vaccum' the database, get > > NOTICE: Index bills_id_key: number of index tupples (4755) is not the > same as heap (7800). > > What is happening? What could have caused this? > > Database version used is V6.50. I know it's old, but i hope it's not the > version that's causing this problem. > > Thanks in advance for any suggestions. > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > -- Jeff Self Powered by Debian GNU/Linux http://members.home.net/jocknerd
Carolyn, Jeff, > > a table was created with: > > > > create table bills ( > > id serial not null, > > account_no int4 not null, > > date_issued date not null, > > ..... > > ); > There is no reason to set id to not null. It is a serial field. It > will > automatically increase. Your account_no should probably be set to a > char > field. My old rule of thumb is not to make something int unless you > plan > to perform some type of math on it. I have to disagree with this, Jeff. A six-digit integer takes up only 20% of the storage space of a 6-character string. As such, it will be faster to index and faster to retrieve, and faster to perform JOINs and sorts. So if Carolyn will never want to store anything but numbers in the field, integer is ideal. > > The following SQL script returns only 1 row of data: > > > > select oid, * from bills where id = xxxxx' order by oid; > > > If your id field is serial, all the records will have unique id's in > them. > Therefore you are only going to get one row of data. Yup. Why, BTW, is Carloyn interested in the OID? It is not a useful piece of information for query purposes ... the OID is a tool for the Postgres DB engine and should not be used as data, since it may change unexpectedly. > > However with the following SQL: > > > > select oid, * from bills where date_issued = '01/01/2001' > > order by oid; > > > > Can't help you here. I'm running 7.1.3 and the query works fine. Try date_issued = '2001-01-01'. Some platforms (Unix for one) get fussy about using universal date format. > > some of the data is duplicated, including the OID. > > > > When trying to 'vaccum' the database, get > > > > NOTICE: Index bills_id_key: number of index tupples (4755) is not > the > > same as heap (7800). > > > > What is happening? What could have caused this? > > > > Database version used is V6.50. I know it's old, but i hope it's > not the > > version that's causing this problem. Sorry, Carolyn, can't help you on Vacuum problems unless you upgrade. 6.5 does indeed have several major problems with Vacuum; time to get 7.1.3. The core team ain't releasing new versions for the fun alone, y'know! -Josh ______AGLIO DATABASE SOLUTIONS___________________________ Josh Berkus Complete informationtechnology josh@agliodbs.com and data management solutions (415) 565-7293 for law firms, small businesses fax 621-2533 and non-profit organizations. San Francisco
Josh Berkus wrote: > Carolyn, Jeff, > > I have to disagree with this, Jeff. A six-digit integer takes up only > 20% of the storage space of a 6-character string. As such, it will be > faster to index and faster to retrieve, and faster to perform JOINs and > sorts. So if Carolyn will never want to store anything but numbers in > the field, integer is ideal. That is all true, and still you should not restrict the possible content to something that got nothing to do withthe nature of the data. Things like account numbers, zip codes, phone numbers, whatnot ARE CHARACTER SEQUENCES - period. How sure can Carolyn be that there will never be an account number where leading zeroes are significant? That true already for zip codes (and by the way, european zipcodes are alphanumeric). And what is the cosmic significance of my phone number plus your creditcard number divided by Carolyn's shoesize?:-) Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com # _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
-----Original Message----- From: Carolyn Lu Wong [mailto:carolyn@greatpacific.com.au] Sent: 14 December 2001 07:29 To: pgsql-sql@postgresql.org Subject: weird duplicate data problem a table was created with: create table bills (id serial not null,account_no int4 not null,date_issued date not null,..... ); The following SQL script returns only 1 row of data: select oid, * from bills where id = xxxxx' order by oid; -- Seems reasonable -- However with the following SQL: select oid, * from bills where date_issued = '01/01/2001' order by oid; some of the data is duplicated, including the OID. -- OIDs are not guaranteed unique and can wraparound (unlesss there's a unique index). However this seems a little unlikely in the same day so the only thing I can think of is that there is a corrupt index on date_issued. Try dropping and recreating any indexes.However I havn't really used 6.5 so couldn't be sure. -- When trying to 'vaccum' the database, get NOTICE: Index bills_id_key: number of index tupples (4755) is not the same as heap (7800). What is happening? What could have caused this? Database version used is V6.50. I know it's old, but i hope it's not the version that's causing this problem. Thanks in advance for any suggestions.