Re: PostgreSQL and mySQL database size question - Mailing list pgsql-general

From Bruce Momjian
Subject Re: PostgreSQL and mySQL database size question
Date
Msg-id 200104271747.f3RHlZU17818@candle.pha.pa.us
Whole thread Raw
In response to PostgreSQL and mySQL database size question  ("Fred Ingham" <ingham@erols.com>)
List pgsql-general
Did you see the FAQ items on estimating database sizes?  Does MySQL have
less overhead per row?

[ Charset ISO-8859-1 unsupported, converting... ]
> To all,
>
> Before I begin, this is NOT a PostgreSQL verses mySQL post.  I have used
> both databases and have found both of these products to be excellent choices
> depending on the specific requirements for a  given project.   That said, I
> am trying to understand the database size differences between PostgreSQL and
> mySQL for a current project I am working on.  I have attached the schemas
> for the PostgreSQL and mySQL databases below (minus three tables that are
> not important for this discussion and have no data in them).
>
> I created the database as defined in the schemas below in both PostgreSQL
> v7.1 (on Windows 2000 via cygwin) and mySQL 3.23.37 (on Windows 2000 native)
> and populated them both with the same data: PIN table with 1,740 tuples and
> the PINNDX table with 92,488 tuples.  Looking at the number of files and
> size of the files, I get the following (after doing a vacuumdb with
> PostgreSQL):
>
> PostgreSQL
>   Files  109
>   Size   60.7MB
>
> mySQL
>   Files  24 (3 x 8 tables)
>   Size   11.1MB
>
> Why is there such a discrepancy in the database sizes?  I know that
> PostgreSQL has additional capabilities such as logging however, without
> being able to ascribe a function to each of the numbered files in the
> PostgreSQL database, it is unclear to me what is taking up so much space.
> Performance and capabilities (and the cost of disk space) aside, this is a
> major problem for me.  If anyone can shed some light on this I would
> appreciate it.
>
> Thanks, Fred
>
> ---- PostgreSQL schema ----
>
> CREATE SEQUENCE pins_seq;
>
> CREATE TABLE pins (
>   pinnum    INTEGER   NOT NULL DEFAULT nextval('pins_seq') PRIMARY KEY,
>   modified  TIMESTAMP NOT NULL DEFAULT now(),
>   indexed   CHAR(1)   NOT NULL DEFAULT 'N',
>   sindexed  CHAR(1)   NOT NULL DEFAULT 'N',
>   pin       TEXT      NOT NULL
> )
> ;
>
> CREATE INDEX indexed_ndx  ON pins (indexed);
> CREATE INDEX sindexed_ndx ON pins (sindexed);
>
> CREATE SEQUENCE pinndx_seq;
>
> CREATE TABLE pinndx (
>   pinndxnum INTEGER NOT NULL DEFAULT nextval('pinndx_seq') PRIMARY KEY,
>   parent    INTEGER NOT NULL,
>   tagpath   INTEGER NOT NULL,
>   tagtype   CHAR(1)          NOT NULL,
>   tagname   INTEGER NOT NULL,
>   atrname   INTEGER NOT NULL,
>   pinnum    INTEGER NOT NULL,
>   nvalue    FLOAT4,
>   value     TEXT
> )
> ;
>
> CREATE INDEX parent_ndx  ON pinndx (parent);
> CREATE INDEX tagpath_ndx ON pinndx (tagpath);
> CREATE INDEX tagname_ndx ON pinndx (tagname);
> CREATE INDEX atrname_ndx ON pinndx (atrname);
> CREATE INDEX pinnum_ndx  ON pinndx (pinnum);
> CREATE INDEX nvalue_ndx  ON pinndx (nvalue);
> CREATE INDEX value_ndx   ON pinndx (value);
>
> CREATE SEQUENCE tagpath_seq;
>
> CREATE TABLE tagpathtbl (
>   vkey   INTEGER NOT NULL DEFAULT nextval('tagpath_seq') PRIMARY KEY,
>   value VARCHAR(255) NOT NULL
> );
>
> CREATE INDEX tagpathtbl_ndx ON tagpathtbl (value);
>
> CREATE SEQUENCE tagname_seq;
>
> CREATE TABLE tagnametbl (
>   vkey  INTEGER NOT NULL DEFAULT nextval('tagname_seq') PRIMARY KEY,
>   value VARCHAR(255) NOT NULL
> );
>
> CREATE INDEX tagnametbl_ndx ON tagnametbl (value);
>
> CREATE SEQUENCE atrname_seq;
>
> CREATE TABLE atrnametbl (
>   vkey  INTEGER NOT NULL DEFAULT nextval('atrname_seq') PRIMARY KEY,
>   value VARCHAR(255) NOT NULL
> );
>
> CREATE INDEX atrnametbl_ndx ON atrnametbl (value);
>
> ---- mySQL schema ----
>
> CREATE TABLE pins (
>   pinnum    INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
>   modified  DATETIME         NOT NULL,
>   indexed   CHAR(1)          NOT NULL DEFAULT 'N',
>   INDEX indexedndx (indexed),
>   sindexed  CHAR(1)          NOT NULL DEFAULT 'N',
>   INDEX sindexedndx (sindexed),
>   pin       MEDIUMTEXT       NOT NULL
> )
> AVG_ROW_LENGTH=5000
> MAX_ROWS=315360000
> PACK_KEYS=1
> ROW_FORMAT=compressed
> ;
>
> CREATE TABLE pinndx (
>   pinndxnum INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
>   parent    INTEGER UNSIGNED NOT NULL,
>   INDEX parentndx (parent),
>   tagpath   INTEGER UNSIGNED NOT NULL,
>   INDEX tagpathndx (tagpath),
>   tagtype   CHAR(1)          NOT NULL,
>   tagname   INTEGER UNSIGNED NOT NULL,
>   INDEX tagnamendx (tagname),
>   atrname   INTEGER UNSIGNED NOT NULL,
>   INDEX atrnamendx (atrname),
>   pinnum    INTEGER UNSIGNED NOT NULL,
>   INDEX pinnumndx (pinnum),
>   nvalue    DOUBLE,
>   INDEX nvaluendx (nvalue),
>   value     MEDIUMTEXT,
>   INDEX valuendx (value(30))
> )
> AVG_ROW_LENGTH=500
> MAX_ROWS=3153600000
> PACK_KEYS=1
> ;
>
> CREATE TABLE tagpathtbl (
>   vkey   INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
>   value VARCHAR(255) NOT NULL,
>   INDEX tagpathndx (value(30))
> );
>
> CREATE TABLE tagnametbl (
>   vkey   INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
>   value VARCHAR(255) NOT NULL,
>   INDEX tagnamendx (value(30))
> );
>
> CREATE TABLE atrnametbl (
>   vkey   INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
>   value VARCHAR(255) NOT NULL,
>   INDEX atrnamendx (value(30))
> );
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

pgsql-general by date:

Previous
From: Stephan Szabo
Date:
Subject: Re: ON DELETE Behavior after the fact
Next
From: Bruce Momjian
Date:
Subject: Re: PostgreSQL and mySQL database size question