Unexpected table size usage for small composite arrays - Mailing list pgsql-hackers
From | Erik Sjoblom |
---|---|
Subject | Unexpected table size usage for small composite arrays |
Date | |
Msg-id | CAAW=00XvzL9q6m2jgsd2YJJqzjjmBYyM3A1uaXuMMr19SpJ57A@mail.gmail.com |
Responses | Re: Unexpected table size usage for small composite arrays |
List | pgsql-hackers |
Hello PostgreSQL developers,
I'm observing storage behavior with arrays in a table that differs from my expectations, and I'd appreciate your insights. I want to store key/value pairs in a very dense data model. I don't have a search requirement, which is why I was thinking an array of a composite type would work well. I can see that padding might be involved with the int4 and int8 combination, but there is more overhead than that. Does anyone know where the following is coming from?
Context
I have defined a composite type and a table as follows:
CREATE TYPE property_entry_data_type AS (
    property_key_id   int4,
    property_value_id int8
);

CREATE TABLE property_set_data (
    property_set_data_id int8 PRIMARY KEY,
    parent_set_id        int8 NULL,
    owner_id             uuid NOT NULL,
    property_entry_data  property_entry_data_type[] NULL,
    created_at           timestamptz DEFAULT now(),
    modified_at          timestamptz DEFAULT now()
);
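For context, the test rows can be populated with something along these lines (an illustrative sketch rather than my exact script; gen_random_uuid() assumes PostgreSQL 13+ or the pgcrypto extension):

-- Illustrative: array length cycles between 1 and 300 elements
INSERT INTO property_set_data (property_set_data_id, owner_id, property_entry_data)
SELECT
    g,
    gen_random_uuid(),
    ARRAY(SELECT ROW(e, e::int8 * 1000)::property_entry_data_type
          FROM generate_series(1, (g % 300) + 1) AS e)
FROM generate_series(1, 10000) AS g;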
Observations
I inserted 10,000 rows with varying numbers of elements in the property_entry_data array and noticed the following pattern:
- For a small number of elements (up to around 40), each array element consumes roughly 40–50 bytes.
- After reaching the 2 KB threshold, PostgreSQL appears to start compressing the array data within the main table.
- When the number of elements exceeds 220, the data starts getting stored in the TOAST table.
Expected Behavior
Based on my understanding, each element in the array should take 12-16 bytes: 4 bytes for property_key_id (possibly padded to 8) plus 8 bytes for property_value_id. Including some additional overhead, I expected the storage requirement to be roughly 24 + 12 * N bytes per row for N elements in the array. However, the actual usage is significantly higher (~40–50 bytes per element), which I didn't anticipate.
My Question
Could you please help me understand the following:
- Why is PostgreSQL using more than the expected 12 bytes per element? What factors contribute to the additional storage overhead?
- How does PostgreSQL handle compression and alignment for arrays stored in composite types, and could these factors explain the discrepancy?
- Is there a way to minimize the per-element overhead for such arrays in a table, or is this behavior expected given PostgreSQL’s internal storage mechanisms?
This is the query I use to see the table size:

WITH table_info AS (
SELECT
n.nspname AS schema_name, -- Schema name
c.oid AS main_oid,
c.relname AS table_name,
c.relkind,
c.reltoastrelid AS toast_oid,
c.relispartition
FROM
pg_class c
JOIN
pg_namespace n ON n.oid = c.relnamespace -- Join to get schema information
WHERE
c.relname = 'property_set_data' -- Replace with your table name
)
SELECT
ti.schema_name, -- Add schema to the output
ti.table_name,
CASE
WHEN ti.relispartition THEN 'Partitioned Table'
ELSE 'Regular Table'
END AS table_type,
pg_relation_size(ti.main_oid) AS main_table_size,
-- pg_size_pretty(pg_relation_size(ti.main_oid)) AS main_table_size,
pg_size_pretty(pg_indexes_size(ti.main_oid)) AS indexes_size,
CASE
WHEN ti.toast_oid = 0 THEN 'No TOAST table'
ELSE pg_total_relation_size(ti.toast_oid)::text END AS toast_size,
--ELSE pg_size_pretty(pg_total_relation_size(ti.toast_oid)) END AS toast_size,
pg_size_pretty(pg_total_relation_size(ti.main_oid)) AS total_size
FROM
table_info ti;
Thank you for your assistance, and I appreciate any insights you can provide!