Home > mailing lists

Removing nulls with 6NF - Mailing list pgsql-performance

From	Robert DiFalco
Subject	Removing nulls with 6NF
Date	January 23, 2014 16:34:39
Msg-id	CAAXGW-w=gbJp3eWbT1m7DGakW3Uent=fsj10nUghyZ1vCjiBHA@mail.gmail.com Whole thread
List	pgsql-performance

Tree view

I have a question about replacing NULL columns with tables. Say I have a table called Contacts that represents the Contacts (say from Facebook) of my Users. Right now Contacts has several columns and in addition to these it has an Owner (the user that owns this contact) and User. The User is NULL if there is no user in my system that matches the contact. The Owner can never be NULL.

If I want to get all the contacts for a user and know which are and which are not themselves users I would do an OUTER join with the Users table on user_id (which may or may not be null). Or I may not want the user itself and not need to perform the OUTER join at all. I just want to see if they are NULL or a valid ID from the Users table.

So, if I endeavor to get rid of this kind of usage of NULL I could create another table called ContactUsers with two fields (ContactID, UserID).

I'd go from this:

Contacts (ContactID, OwnerID, UserID, ...)

To this:

Contacts(ContactID, OwnerID, ... )

ContactUsers( ContactID, UserID)

So my questions are these:

I assume my new queries of all contacts with or without correlated users would now change to all need a User column that has a SELECT query on ContactUsers for that column.

How would this perform relative to the approach that uses a field and NULL.

I *think* updates will actually be faster because instead of updating I will simply insert into ContactUsers and not have to lookup the Contact row and update its UserID field. But I'm concerned about queries.

I'm also concerned about when a User changes their details. Now I will have to DELETE from the ContactUsers table then INSERT if there are different correlations. Before I would just update the contact rows with the modified user to NULL and re-correlate.

Should I be concerned about the performance of this approach? In addition to getting rid of NULLs and approach like this would help me reduce the number of columns in some tables. But that SELECT for each column thing has me concerned since the queries are very fast now using a NULL field.

Thanks!

pgsql-performance by date:

From: Josh Berkus
Date: 22 January 2014, 20:01:46
Subject: Re: Time of query result delivery

From:
Date: 24 January 2014, 16:23:45
Subject: PostgreSQL 9.3.2 Performance issues

Removing nulls with 6NF - Mailing list pgsql-performance

Previous

Next