Re: Naming-scheme for db-files - Mailing list pgsql-general
From | Oleg Bartunov |
---|---|
Subject | Re: Naming-scheme for db-files |
Date | |
Msg-id | Pine.GSO.4.44.0208291401340.8881-100000@ra.sai.msu.su Whole thread Raw |
In response to | Re: Naming-scheme for db-files (Oleg Bartunov <oleg@sai.msu.su>) |
List | pgsql-general |
To continue my previous message. I just load test data (i've used in README.tsearch) and looked at sizes: -rw------- 1 postgres users 23191552 Aug 29 14:08 53016937 -rw------- 1 postgres users 81059840 Aug 29 14:08 52639027 Table titles occupies 80Mb, index on txtidx column occupies 22Mb. After doing test=# select title into titles_tmp from titles; SELECT I got size of table 'titles' without txtidx field -rw------- 1 postgres users 30105600 Aug 29 14:14 53016938 So, txtidx column itself occupies about 50Mb. I don't see any giga-sizes ! after running 'vacuum full analyze' I got: -rw------- 1 postgres users 30105600 Aug 29 14:26 53016938 -rw------- 1 postgres users 36880384 Aug 29 14:26 53016937 -rw------- 1 postgres users 51494912 Aug 29 14:26 52639027 pg@zen:/usr/local/pgsql/data/base/52638986$ oid2name -d test -x|grep 53016937 53016937 = t_idx pg@zen:/usr/local/pgsql/data/base/52638986$ oid2name -d test -x|grep 52639027 53016934 = pg_toast_52639027 53016936 = pg_toast_52639027_idx 52639027 = titles pg@zen:/usr/local/pgsql/data/base/52638986$ oid2name -d test -x|grep 53016938 53016938 = titles_tmp So, actual size of 'txtidx' field is 20 Mb ! --------------------------------------------------------------------------- test=# select count(*) from titles; s count -------- 377905 (1 row) test=# select version(); version --------------------------------------------------------------- PostgreSQL 7.2.2 on i686-pc-linux-gnu, compiled by GCC 2.95.3 (1 row) test=# \d titles Table "titles" Column | Type | Modifiers ----------+------------------------+----------- title | character varying(256) | titleidx | txtidx | Indexes: t_idx There is one performance tips: Use CLUSTER command to speedup search query ! There was thread about recently in -hackers. In 7.2 CLUSTER works but requires some addtional operations, in 7.3 it should works fine without problems. Regards, Oleg On Thu, 29 Aug 2002, Oleg Bartunov wrote: > I didn't follow this thread, but I'm a little bit surprised ! > txtdix itself is just a parsed data from corresponding column ! > and txtidx stores only unique lexems, so size of txtidx column should > be lesser than original one ! I'd be interested to see more details > (if you could send me your data and sql script). Also, > did you use oid2name from contrib to see sizes of tables and indices ? > What is an output of 'select version()' ? > 100,000 rows is not a big deal. > > regards, > > Oleg > > > On Thu, 29 Aug 2002, Markus Wollny wrote: > > > Just a try on using txtidx for a speedy search over the e-mail-fields. > > No success, though - fti would have been better for the task (like > > finding all users who have got an aol.com-adress). > > > > > -----UrsprNngliche Nachricht----- > > > Von: Martijn van Oosterhout [mailto:kleptog@svana.org] > > > Gesendet: Donnerstag, 29. August 2002 01:18 > > > An: Markus Wollny > > > Cc: pgsql-general@postgresql.org > > > Betreff: Re: [GENERAL] Naming-scheme for db-files > > > > > > > > > On Wed, Aug 28, 2002 at 11:25:44PM +0200, Markus Wollny wrote: > > > > After dropping the mentioned column emailindex of type txtidx (via > > > > copying the data to a temporary table and renaming this table > > > > afterwards), disk usage of that table evaporated from > > > 1862MB to a humble > > > > 61MB. And emailindex was completely empty... Very strange, > > > but ~60MB is > > > > very much more more like what I had estimated the size of > > > that table to > > > > be. > > > > > > What on earth is: > > > > > > emailindex txtidx, > > > > > > Anyway? > > > > > > > -----UrsprУМngliche Nachricht----- > > > > Von: Markus Wollny > > > > Gesendet: Mi 28.08.2002 17:04 > > > > An: Martijn van Oosterhout > > > > Cc: pgsql-general@postgresql.org > > > > Betreff: Re: [GENERAL] Naming-scheme for db-files > > > > > > > > > > > > > > > > Hi! > > > > > > > > Yes, I run vacuum every night - and log-output indicates no > > > > errors at > > > > all. Yet I am indeed quite puzzled about the size of this table. > > > > Is > > > > there some way of finding out which column consumes so much > > > > space? > > > > > > > > select count(*) from ct_com_user return 95858 rows. > > > > The CREATE-statement for this table is as follows: > > > > > > > > CREATE TABLE ct_com_user ( > > > > user_id numeric(10, 0), > > > > login varchar(1000), > > > > password varchar(1000), > > > > status numeric(10, 0), > > > > rights varchar(20) DEFAULT 'r', > > > > firstname varchar(1000), > > > > firstname_visible numeric(1, 0) DEFAULT 0, > > > > lastname varchar(1000), > > > > lastname_visible numeric(1, 0) DEFAULT 0, > > > > clan varchar(1000), > > > > clan_visible numeric(1, 0) DEFAULT 0, > > > > street varchar(1000), > > > > street_visible numeric(1, 0) DEFAULT 0, > > > > zipcode varchar(1000), > > > > zipcode_visible numeric(1, 0) DEFAULT 0, > > > > city varchar(1000), > > > > city_visible numeric(1, 0) DEFAULT 0, > > > > country varchar(1000), > > > > country_visible numeric(1, 0) DEFAULT 0, > > > > phone1 varchar(1000), > > > > phone1_visible numeric(1, 0) DEFAULT 0, > > > > phone2 varchar(1000), > > > > phone2_visible numeric(1, 0) DEFAULT 0, > > > > mobile varchar(1000), > > > > mobile_visible numeric(1, 0) DEFAULT 0, > > > > fax varchar(1000), > > > > fax_visible numeric(1, 0) DEFAULT 0, > > > > email1 varchar(1000), > > > > email1_visible numeric(1, 0) DEFAULT 0, > > > > email2 varchar(1000), > > > > email2_visible numeric(1, 0) DEFAULT 0, > > > > icq varchar(1000), > > > > icq_visible numeric(1, 0) DEFAULT 0, > > > > homepage varchar(1000), > > > > homepage_visible numeric(1, 0) DEFAULT 0, > > > > description varchar(1000), > > > > description_visible numeric(1, 0) DEFAULT 0, > > > > hobbies varchar(1000), > > > > hobbies_visible numeric(1, 0) DEFAULT 0, > > > > signature1 varchar(4000), > > > > signature2 varchar(4000), > > > > signature3 varchar(4000), > > > > forum_view varchar(20) DEFAULT 'flat', > > > > temp_password varchar(100), > > > > registered timestamptz, > > > > last_login timestamptz, > > > > created timestamptz DEFAULT ('now'::text)::timestamp(6) with > > > > time > > > > zone, > > > > lines numeric(3, 0) DEFAULT 400, > > > > lines_forum numeric(3, 0) DEFAULT 20, > > > > forum_lines numeric(3, 0) DEFAULT 20, > > > > forum_smileys varchar(50) DEFAULT 'enhanced', > > > > site_id numeric(10, 0) DEFAULT 0, > > > > flag_id numeric(10, 0) DEFAULT 0, > > > > forum_quoting varchar(50) DEFAULT 'enhanced', > > > > forum_flatpostings numeric(3, 0) DEFAULT 10, > > > > forum_images numeric(1, 0) DEFAULT 2, > > > > user_image numeric(1, 0) DEFAULT 0, > > > > user_image_visible numeric(1, 0) DEFAULT 0, > > > > chat_password varchar(50), > > > > chat_last_visit timestamptz, > > > > authorpoints_visible numeric(1, 0) DEFAULT 0, > > > > emailindex txtidx, > > > > CONSTRAINT idx_pk_ct_com_user UNIQUE (user_id), > > > > CONSTRAINT idx_u_ct_com_user_login UNIQUE (login) > > > > ) WITH OIDS; > > > > > > > > As hardly anyone actually uses a signature, hobby or > > > > description, I > > > > cannot imagine how this should amount to almost 2 GB of data > > > > (1862MB) - > > > > this would mean roughly 20k of data per row average - and no way > > > > have we > > > > got this amount of data - the absolute maximum data per row as > > > > defined > > > > would be about 30k, I guess, but that would mean that nearly all > > > > of our > > > > users would use nearly all available space and I know that this > > > > is quite > > > > far from true. I suspect it may have something to do with > > > > emailindex of > > > > the txtidx-type... As it's not absolutely necessary, I'll try > > > > and drop > > > > this column and see what I get... > > > > > > > > Regards, > > > > > > > > Markus > > > > > > > > > -----UrsprУМngliche Nachricht----- > > > > > Von: Martijn van Oosterhout [mailto:kleptog@svana.org] > > > > > Gesendet: Mittwoch, 28. August 2002 16:30 > > > > > An: Markus Wollny > > > > > Cc: pgsql-general@postgresql.org > > > > > Betreff: Re: [GENERAL] Naming-scheme for db-files > > > > > > > > > > > > > > > On Wed, Aug 28, 2002 at 03:54:45PM +0200, Markus Wollny wrote: > > > > > > Hi! > > > > > > > > > > > > As I was just checking disk-usage of a database (PostgreSQL > > > > > 7.2.1), I > > > > > > stumbled over some files named with a trailing .1 added to > > > > the usual > > > > > > oid. Now if a table 'example' with oid 12345 exists, what > > > > > does the file > > > > > > 12345.1 contain exactly? I didn't find anything about > > > > > .1-files in the > > > > > > documentation... > > > > > > > > > > Postgres splits files at 1GB. The .1 file would be the second > > > > > part of the > > > > > file. When that also reaches 1GB, you'll get a .2 file. > > > > > > > > > > You're doing VACUUM [FULL] regularly, right? > > > > > > > > > > -- > > > > > Martijn van Oosterhout <kleptog@svana.org> > > > > > http://svana.org/kleptog/ > > > > > > There are 10 kinds of people in the world, those that can do > > > > binary > > > > > > arithmetic and those that can't. > > > > > > > > > > > > > ---------------------------(end of > > > > broadcast)--------------------------- > > > > TIP 2: you can get off all lists at once with the unregister > > > > command > > > > (send "unregister YourEmailAddressHere" to > > > > majordomo@postgresql.org) > > > > > > > > > > > > > > > > ---------------------------(end of > > > broadcast)--------------------------- > > > > TIP 1: subscribe and unsubscribe commands go to > > > majordomo@postgresql.org > > > > > > -- > > > Martijn van Oosterhout <kleptog@svana.org> > > > http://svana.org/kleptog/ > > > > There are 10 kinds of people in the world, those that can do binary > > > > arithmetic and those that can't. > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 5: Have you checked our extensive FAQ? > > > > http://www.postgresql.org/users-lounge/docs/faq.html > > > > Regards, > Oleg > _____________________________________________________________ > Oleg Bartunov, sci.researcher, hostmaster of AstroNet, > Sternberg Astronomical Institute, Moscow University (Russia) > Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ > phone: +007(095)939-16-83, +007(095)939-23-83 > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
pgsql-general by date: