Thread: Can't explain db size

Can't explain db size

From
Matteo Grolla
Date:
Hi,
     I hope you can help me understand why the db is so big and if there's anything I can do.
It's the DB of an Enterprise Content Management application, Alfresco. Here are some data I collected, after executing a vaccum from pg admin.

A) Largest tables sizes relation total_size
  SELECT nspname || '.' || relname AS "relation",
    pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
  FROM pg_class C
  LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
  WHERE nspname NOT IN ('pg_catalog', 'information_schema')
    AND C.relkind <> 'i'
    AND nspname !~ '^pg_toast'
  ORDER BY pg_total_relation_size(C.oid) DESC
  LIMIT 20; public.alf_node_properties 17 GB
public.alf_node 2331 MB
public.alf_node_aspects 1939 MB
public.alf_child_assoc 1160 MB
public.alf_content_data 338 MB
public.alf_content_url 296 MB
public.alf_transaction 207 MB
public.alf_node_assoc 140 MB
public.alf_activity_feed 6016 kB
public.alf_activity_post 1464 kB
public.alf_acl_member 1056 kB
public.alf_access_control_list 712 kB
public.act_hi_detail 352 kB
public.act_hi_varinst 248 kB
public.alf_prop_value 240 kB
public.act_ru_variable 232 kB
public.alf_access_control_entry 208 kB
public.alf_lock 144 kB
public.alf_authority 144 kB
public.jbpm_log 120 kB

B) size size of alf_node_properties
select pg_catalog.pg_size_pretty(sum(pg_column_size(alf_node_properties.*)))
      from alf_node_properties 6322 MB

C) size of all the columns of alf_node_properties
Dimensioni delle singole colonne (MB)node_idactual_type_npersisted_type_nqname_idlist_indexlocale_idboolean_valuelong_valuefloat_valuedouble_valuestring_valueserializable_valuetotale (MB)totale values
select pg_catalog.pg_size_pretty(sum(pg_column_size(node_id))) node_id,
pg_catalog.pg_size_pretty(sum(pg_column_size(actual_type_n))) actual_type_n,
pg_catalog.pg_size_pretty(sum(pg_column_size(persisted_type_n))) persisted_type_n,
pg_catalog.pg_size_pretty(sum(pg_column_size(qname_id))) qname_id,
pg_catalog.pg_size_pretty(sum(pg_column_size(list_index))) list_index,
pg_catalog.pg_size_pretty(sum(pg_column_size(locale_id))) locale_id,
pg_catalog.pg_size_pretty(sum(pg_column_size(boolean_value))) boolean_value,
pg_catalog.pg_size_pretty(sum(pg_column_size(long_value))) long_value,
pg_catalog.pg_size_pretty(sum(pg_column_size(float_value))) float_value,
pg_catalog.pg_size_pretty(sum(pg_column_size(double_value))) double_value,
pg_catalog.pg_size_pretty(sum(pg_column_size(string_value))) string_value,
pg_catalog.pg_size_pretty(sum(pg_column_size(serializable_value))) serializable_value
from alf_node_properties
419210210419210419524192104196686237171830

-------Questions----------------

1) Can you explain me the big difference between the result in A for table alf_node_properties: 17GB and the result in B: ~6GB ?

2) Can you explain me the difference between the result in B: ~6GB and the result in C, the sum of all column sizes, 3717MB ?

Thanks

Re: Can't explain db size

From
Andreas Kretschmer
Date:
Matteo Grolla <matteo.grolla@gmail.com> wrote:

>
> -------Questions----------------
>
> 1) Can you explain me the big difference between the result in A for table
> alf_node_properties: 17GB and the result in B: ~6GB ?
>
> 2) Can you explain me the difference between the result in B: ~6GB and the
> result in C, the sum of all column sizes, 3717MB ?

Maybe there are some dead tuples, run a VACUUM FULL (be careful, it
requires an explicit lock). And please keep in mind that a table
can contains indexes and other objects. A nice explanation and some ways
to gather informations on table-, index- and database sizes can you find
here:
http://andreas.scherbaum.la/blog/archives/282-table-size,-database-size.html


Regards, Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°


Re: Can't explain db size

From
Matteo Grolla
Date:
Thanks Andreas,
     Il try

2015-12-15 11:07 GMT+01:00 Andreas Kretschmer <akretschmer@spamfence.net>:
Matteo Grolla <matteo.grolla@gmail.com> wrote:

>
> -------Questions----------------
>
> 1) Can you explain me the big difference between the result in A for table
> alf_node_properties: 17GB and the result in B: ~6GB ?
>
> 2) Can you explain me the difference between the result in B: ~6GB and the
> result in C, the sum of all column sizes, 3717MB ?

Maybe there are some dead tuples, run a VACUUM FULL (be careful, it
requires an explicit lock). And please keep in mind that a table
can contains indexes and other objects. A nice explanation and some ways
to gather informations on table-, index- and database sizes can you find
here:
http://andreas.scherbaum.la/blog/archives/282-table-size,-database-size.html


Regards, Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: Can't explain db size

From
Matteo Grolla
Date:
have news,
      the pg version is 9.1.3
      a vaccum full, not a plain vaccum, was performed.
      o.s. is red hat 7
      filesystem: xfs with block size 4k

could it be a problem regarding the block size?
thanks

2015-12-15 12:11 GMT+01:00 Matteo Grolla <matteo.grolla@gmail.com>:
Thanks Andreas,
     Il try

2015-12-15 11:07 GMT+01:00 Andreas Kretschmer <akretschmer@spamfence.net>:
Matteo Grolla <matteo.grolla@gmail.com> wrote:

>
> -------Questions----------------
>
> 1) Can you explain me the big difference between the result in A for table
> alf_node_properties: 17GB and the result in B: ~6GB ?
>
> 2) Can you explain me the difference between the result in B: ~6GB and the
> result in C, the sum of all column sizes, 3717MB ?

Maybe there are some dead tuples, run a VACUUM FULL (be careful, it
requires an explicit lock). And please keep in mind that a table
can contains indexes and other objects. A nice explanation and some ways
to gather informations on table-, index- and database sizes can you find
here:
http://andreas.scherbaum.la/blog/archives/282-table-size,-database-size.html


Regards, Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: Can't explain db size

From
Matteo Grolla
Date:
also, 
serializable_value is of type bytea

2015-12-17 16:12 GMT+01:00 Matteo Grolla <matteo.grolla@gmail.com>:
have news,
      the pg version is 9.1.3
      a vaccum full, not a plain vaccum, was performed.
      o.s. is red hat 7
      filesystem: xfs with block size 4k

could it be a problem regarding the block size?
thanks

2015-12-15 12:11 GMT+01:00 Matteo Grolla <matteo.grolla@gmail.com>:
Thanks Andreas,
     Il try

2015-12-15 11:07 GMT+01:00 Andreas Kretschmer <akretschmer@spamfence.net>:
Matteo Grolla <matteo.grolla@gmail.com> wrote:

>
> -------Questions----------------
>
> 1) Can you explain me the big difference between the result in A for table
> alf_node_properties: 17GB and the result in B: ~6GB ?
>
> 2) Can you explain me the difference between the result in B: ~6GB and the
> result in C, the sum of all column sizes, 3717MB ?

Maybe there are some dead tuples, run a VACUUM FULL (be careful, it
requires an explicit lock). And please keep in mind that a table
can contains indexes and other objects. A nice explanation and some ways
to gather informations on table-, index- and database sizes can you find
here:
http://andreas.scherbaum.la/blog/archives/282-table-size,-database-size.html


Regards, Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



Re: Can't explain db size

From
Jim Nasby
Date:
On 12/15/15 3:52 AM, Matteo Grolla wrote:
> 1) Can you explain me the big difference between the result in A for
> table alf_node_properties: 17GB and the result in B: ~6GB ?

11GB of indexes would explain it.

> 2) Can you explain me the difference between the result in B: ~6GB and
> the result in C, the sum of all column sizes, 3717MB ?

Probably per-page and per-tuple overhead.

What does SELECT reltuples, relpages FROM pg_class WHERE oid =
'public.alf_node_properties'::regclass show?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com