Re: How are null's stored? -- Some numbers - Mailing list pgsql-performance
| From | Jim C. Nasby |
|---|---|
| Subject | Re: How are null's stored? -- Some numbers |
| Date | |
| Msg-id | 20030513215541.GC40542@flake.decibel.org Whole thread Raw |
| In response to | How are null's stored? ("Jim C. Nasby" <jim@nasby.net>) |
| List | pgsql-performance |
I did two experiments. First, as someone mentioned, changing between
char and varchar made absolutely no difference size-wise. In some other
RDBMSes, performance wise char might still win out because the database
wouldn't have to do the math to figure out where in the tuple the fields
are. I know it's splitting hairs, but on what will be a 40M row table...
Second, I modified the table (see below; all fields were originally
nullable):
Before:
usps=# vacuum full analyze verbose zip4_detail;
INFO: --Relation public.zip4_detail--
INFO: Pages 12728: Changed 0, reaped 1, Empty 0, New 0; Tup 467140: Vac
0, Keep/VTL 0/0, UnUsed 19, MinLen 154, MaxLen 302; Re-using:
Free/Avail. Space 1009820/264028; EndEmpty/Avail. Pages 0/1521.
CPU 0.65s/0.86u sec elapsed 1.51 sec.
INFO: Rel zip4_detail: Pages: 12728 --> 12728; Tuple(s) moved: 0.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: Analyzing public.zip4_detail
After:
INFO: --Relation public.zip4_detail--
INFO: Pages 13102: Changed 0, reaped 6961, Empty 0, New 0; Tup 467140:
Vac 0, Keep/VTL 0/0, UnUsed 31795, MinLen 166, MaxLen 306; Re-using:
Free/Avail. Space 1136364/190188; EndEmpty/Avail. Pages 0/1056.
CPU 0.41s/0.79u sec elapsed 1.20 sec.
INFO: Rel zip4_detail: Pages: 13102 --> 13102; Tuple(s) moved: 0.
CPU 0.59s/10.02u sec elapsed 18.17 sec.
INFO: Analyzing public.zip4_detail
As you can see, space useage actually went up, by 2.9% (pages). In other
words, it appears to be more efficient to store a null than to store an
empty string in a varchar.
usps=# select count(*) from zip4_detail where street_pre_drctn_abbrev=''
and street_suffix_abbrev='' and street_post_drctn_abbrev='';
-------
9599
usps=# select count(*) from zip4_detail where street_pre_drctn_abbrev=''
or street_suffix_abbrev='';
--------
128434
(all rows have at least one of the 3 fields empty)
Hope someone finds this info useful... :)
Table "public.zip4_detail"
Column | Type | Modifiers
---------------------------+-----------------------+-----------
zip_code | character varying(5) |
update_key_no | character varying(10) |
action_code | character varying(1) |
record_type_code | character varying(1) |
carrier_route_id | character varying(4) |
street_pre_drctn_abbrev | character varying(2) | not null
street_name | character varying(28) |
street_suffix_abbrev | character varying(4) | not null
street_post_drctn_abbrev | character varying(2) | not null
addr_primary_low_no | character varying(10) |
addr_primary_high_no | character varying(10) |
addr_prmry_odd_even_code | character varying(1) |
building_or_firm_name | character varying(40) |
addr_secondary_abbrev | character varying(4) |
addr_secondary_low_no | character varying(8) |
addr_secondary_high_no | character varying(8) |
addr_secny_odd_even_code | character varying(1) |
zip_add_on_low_no | character varying(4) |
zip_add_on_high_no | character varying(4) |
base_alt_code | character varying(1) |
lacs_status_ind | character varying(1) |
govt_bldg_ind | character varying(1) |
finance_no | character varying(6) |
state_abbrev | character varying(2) |
county_no | character varying(3) |
congressional_dist_no | character varying(2) |
muncipality_ctyst_key | character varying(6) |
urbanization_ctyst_key | character varying(6) |
prefd_last_line_ctyst_key | character varying(6) |
--
Jim C. Nasby (aka Decibel!) jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828
Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
pgsql-performance by date: