Re: Storage consumption - Mailing list pgsql-general
From | Bruce Momjian |
---|---|
Subject | Re: Storage consumption |
Date | |
Msg-id | 200311141826.hAEIQCD23265@candle.pha.pa.us Whole thread Raw |
In response to | Storage consumption (Troels Arvin <troels@arvin.dk>) |
List | pgsql-general |
Did you see the FAQ item on estimating disk space? --------------------------------------------------------------------------- Troels Arvin wrote: > Hello, > > For some very data-intensive projects it's interesting how much space the > DBMS uses for the storage of data, so I'm investigating how space > efficient different DBMSes are. > > In the PostgreSQL manual, it's written that values of the type INTEGER > take op four bytes. I was curious about how close to real-World this > number is, so I did a test: How much space does PostgreSQL use when > storing 100000 rows where each row consists of a single INTEGER value? > > With help from http://random.org/ I created a file with 100000 random > integer insertions. The SQL used to do that is available at > http://troels.arvin.dk/db/tests/storage-sizes/randomints.zip > > About installation: PostgreSQL v. 7.3.4 on Red Hat Linux 9, file system > ext3. PostgreSQL data-area in /var/lib/pgsql/data. > > For this test, PostgreSQL is being used for nothing else. > > Before test start: > ----------------- > Access to a default database ('psql' brings you right into a working > database) from psql. > Access to do a 'du' (disk usage unix-command) on /var/lib/pgsql/data from > the command line. > No existing table 'inttab' in database. PostgreSQL stopped. > > Test starts. > ----------- > Output of 'du -sb /var/lib/pgsql/data': 77946519. > Start PostgreSQL. > Do: "CREATE TABLE inttab (intval INT) WITHOUT OIDS;" > psql -q -f random_ints.sql > (Wait for a long time.) > Do: "VACUUM FULL;" > Shut down PostgreSQL. > Output of 'du -sb /var/lib/pgsql/data': 81190551. > > Result: > ------ > Real difference: 81190551-77946519 = 3244032 > Optimal difference: 100000*4 = 400000 > Storage consumption rate ((real/optimal)*100)% = 811% > > I'm surprised by an overhead _that_ high. Any comments on my methology? > Does it need adjustments? If you think it's rotten: What methology would > you use to measure space overhead for a DBMS? (Again: Space overhead is > seldomly interesting, but sometimes it is.) > > I guess that transaction log files are a joker in this context, but then > again: A number which reflects the DBMS' disk usage before and after an > operation does have real-World meaning, I think. > > (Of course, I'll need another methology for DBMSes which preallocate a > fixed amount of storage for a database.) > > -- > Greetings from Troels Arvin, Copenhagen, Denmark > > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
pgsql-general by date: