Thread: Whither 1:1?

Whither 1:1?

From
Guyren Howe
Date:
It’s come to my attention that what seems an obvious and useful database design pattern — 1:1 relations between tables by having a shared primary key — is hardly discussed or used.

It would seem to be a very simple pattern, and useful to avoid storing nulls or for groups of fields that tend to be used together.

Thoughts? Is there some downside I can’t see?
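
For concreteness, a minimal sketch of the pattern in question (all names are purely illustrative): the second table's primary key is also a foreign key to the first, so at most one row can exist on each side for a given key.

-- The "main" table.
CREATE TABLE thing (
    thing_id  bigserial PRIMARY KEY,
    name      text NOT NULL
);

-- The 1:1 extension table: its primary key doubles as a foreign key
-- to the main table, so each thing has at most one extension row.
CREATE TABLE thing_extra (
    thing_id  bigint PRIMARY KEY REFERENCES thing (thing_id),
    notes     text
);

-- Things with no extension simply have no row in thing_extra.
SELECT t.name, x.notes
FROM thing t
LEFT JOIN thing_extra x USING (thing_id);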

Re: Whither 1:1?

From
"David G. Johnston"
Date:
On Fri, Jun 1, 2018 at 9:52 AM, Guyren Howe <guyren@gmail.com> wrote:
It’s come to my attention that what seems an obvious and useful database design pattern — 1:1 relations between tables by having a shared primary key — is hardly discussed or used.

It would seem to be a very simple pattern, and useful to avoid storing nulls or for groups of fields that tend to be used together.

Thoughts? Is there some downside I can’t see?

I make considerable use of one-to-one tables.  There isn't any inherent, absolute downside to doing this that I can think of (potentially more joins are neither inherent nor absolute), though the specific benefits or concerns depend upon the specific model being implemented.

The two ways I've done this are a "static vs dynamic data" split and, basically, "attribute grouping".  The latter is the one that can avoid storing nulls, though in my particular usage I still have them, since I'm basically implementing a key-value JSON document and so have a record for the ID in every table even if no data is present.  The advantage is reducing model complexity by attaching meta-data to all of the contained fields.
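
A hedged sketch of the kind of "static vs dynamic" split described above (the tables and columns are invented for illustration, not taken from any actual schema):

-- "Static" attributes: written once, rarely changed.
CREATE TABLE device (
    device_id     bigserial PRIMARY KEY,
    serial_no     text NOT NULL UNIQUE,
    manufactured  date
);

-- "Dynamic" attributes: updated frequently.  One common motivation for
-- splitting them into a separate 1:1 table (same primary key) is that
-- the frequent updates then rewrite only these narrow rows rather than
-- the full static rows.
CREATE TABLE device_state (
    device_id  bigint PRIMARY KEY REFERENCES device (device_id),
    last_seen  timestamptz,
    status     text
);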

David J.

Re: Whither 1:1?

From
James Keener
Date:
I don't think I fully understand. Do you mean all PKs using a single sequence? I'm not sure how this would avoid nulls or help with grouping fields.

Jim

On Fri, Jun 1, 2018 at 12:52 PM, Guyren Howe <guyren@gmail.com> wrote:
It’s come to my attention that what seems an obvious and useful database design pattern — 1:1 relations between tables by having a shared primary key — is hardly discussed or used.

It would seem to be a very simple pattern, and useful to avoid storing nulls or for groups of fields that tend to be used together.

Thoughts? Is there some downside I can’t see?

Re: Whither 1:1?

From
Pavel Stehule
Date:


2018-06-01 18:52 GMT+02:00 Guyren Howe <guyren@gmail.com>:
It’s come to my attention that what seems an obvious and useful database design pattern — 1:1 relations between tables by having a shared primary key — is hardly discussed or used.

It would seem to be a very simple pattern, and useful to avoid storing nulls or for groups of fields that tend to be used together.

Thoughts? Is there some downside I can’t see?

Postgres does not have multi-table statistics - every JOIN increases the estimation error.
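
A hedged way to see this effect, reusing the illustrative thing / thing_extra tables sketched near the top of the thread: compare the planner's estimated row counts against the actual ones.

-- The planner has statistics for each table individually, but nothing
-- describing how the two tables' contents correlate, so each extra join
-- is another place where the "rows=" estimate in the plan can drift
-- away from the actual count.
EXPLAIN (ANALYZE, TIMING OFF)
SELECT t.name, x.notes
FROM thing t
JOIN thing_extra x USING (thing_id)
WHERE t.name LIKE 'a%'
  AND x.notes IS NOT NULL;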

Regards

Pavel

Re: Whither 1:1?

From
Guyren Howe
Date:
On Jun 1, 2018, at 10:00 , James Keener <jim@jimkeener.com> wrote:

I don't think I fully understand. Do you mean all PKs using a single sequence? I'm not sure how this would avoid nulls or help with grouping fields.

Presumably, you would populate the multiple tables together, or right after some main one, at a point where you already know the PK you're going to use. Perhaps you create a view with triggers to present them as a single table.

It could avoid nulls by not having rows in some of the 1:1 tables. Say you have a users table with login credentials and also personal information (phone number, email and such). But some of your users are strictly API users and so have no personal information. If you split the personal information into a separate table, you can simply not have rows for those users.
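
A minimal sketch of that arrangement (the names are illustrative): API-only users just never get a row in the personal-information table, and a view can present the pair as one table for readers.

CREATE TABLE users (
    user_id  bigserial PRIMARY KEY,
    login    text NOT NULL UNIQUE,
    pw_hash  text NOT NULL
);

-- Personal details live in their own 1:1 table; strictly-API users
-- have no row here at all, so no NULL phone/email is ever stored.
CREATE TABLE user_personal (
    user_id  bigint PRIMARY KEY REFERENCES users (user_id),
    phone    text,
    email    text
);

-- Present the pair as a single relation for convenience.  Making this
-- view writable would take INSTEAD OF triggers, as mentioned above.
CREATE VIEW users_full AS
SELECT u.user_id, u.login, p.phone, p.email
FROM users u
LEFT JOIN user_personal p USING (user_id);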

One of the originators of the relational model didn’t want to have null at all, presumably wanting folks to use this pattern instead.

Re: Whither 1:1?

From
Olivier Gautherot
Date:
On Fri, Jun 1, 2018 at 12:52 PM, Guyren Howe <guyren@gmail.com> wrote:
It’s come to my attention that what seems an obvious and useful database design pattern — 1:1 relations between tables by having a shared primary key — is hardly discussed or used.

It would seem to be a very simple pattern, and useful to avoid storing nulls or for groups of fields that tend to be used together.

Thoughts? Is there some downside I can’t see?

You will get a benefit in terms of space only if the optional fields in the second table are populated for only a small fraction of the rows - and the second table is significantly wider. This can make a difference on big tables, but the gain may be offset by the cost of the join. From this perspective, I don't think that there is a clear benefit or drawback: it should be evaluated on a case-by-case basis.

Olivier Gautherot


Re: Whither 1:1?

From
Guyren Howe
Date:
On Jun 1, 2018, at 10:16 , Olivier Gautherot <olivier@gautherot.net> wrote:

You will get a benefit in terms of space only if the optional fields in the second table are populated for only a small fraction of the rows - and the second table is significantly wider. This can make a difference on big tables, but the gain may be offset by the cost of the join. From this perspective, I don't think that there is a clear benefit or drawback: it should be evaluated on a case-by-case basis.

It seems to me that people take time to catch up with modern hardware reality. SSDs reduce seek time to virtually zero. Surely, joins are now much, much cheaper. If so, I’m inclined to describe wide tables as a premature optimization.

Re: Whither 1:1?

From
Olivier Gautherot
Date:
On Fri, Jun 1, 2018 at 1:25 PM, Guyren Howe <guyren@gmail.com> wrote:
On Jun 1, 2018, at 10:16 , Olivier Gautherot <olivier@gautherot.net> wrote:

You will get a benefit in terms of space only if the optional fields in the second table are populated for only a small fraction of the rows - and the second table is significantly wider. This can make a difference on big tables, but the gain may be offset by the cost of the join. From this perspective, I don't think that there is a clear benefit or drawback: it should be evaluated on a case-by-case basis.

It seems to me that people take time to catch up with modern hardware reality. SSDs reduce seek time to virtually zero. Surely, joins are now much, much cheaper. If so, I’m inclined to describe wide tables as a premature optimization.

Guyren, I'm only saying it is not a one-size-fits-all solution and we don't all have last-generation hardware to run our systems on. I actually have a use-case for the structure you describe, which supports optional fields in a table: the main one is partitioned into 15GB monthly tables with 30+ million rows. The parallel table is used for devices that generate additional information but are a small subset of the fleet. This way, I could avoid the overhead of 150 columns across the board (the main table has around 30). By keeping things simple (I mean a single table), you get more bloat, but you can reduce the response time and may eventually be able to run on a smaller, cheaper server. It is really a project-specific design issue.
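
A rough, hypothetical sketch of that kind of layout (every name, size and column here is invented; the real schema is of course different and much wider):

-- Main measurement table, partitioned by month (PostgreSQL 10+ syntax).
CREATE TABLE measurement (
    device_id  bigint NOT NULL,
    taken_at   timestamptz NOT NULL,
    value      double precision
) PARTITION BY RANGE (taken_at);

CREATE TABLE measurement_2018_06 PARTITION OF measurement
    FOR VALUES FROM ('2018-06-01') TO ('2018-07-01');

-- Parallel table for the small subset of devices that report extra
-- fields; the other devices never get a row here, so the wide optional
-- columns are not carried by every row of the main table.
CREATE TABLE measurement_extended (
    device_id  bigint NOT NULL,
    taken_at   timestamptz NOT NULL,
    extra_a    double precision,
    extra_b    double precision,  -- ...plus many more optional columns
    PRIMARY KEY (device_id, taken_at)
);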

Olivier Gautherot

Re: Whither 1:1?

From
Ron
Date:
On 06/01/2018 12:25 PM, Guyren Howe wrote:
On Jun 1, 2018, at 10:16 , Olivier Gautherot <olivier@gautherot.net> wrote:

You will get a benefit in terms of space only if the optional fields in the second table are populated for only a small fraction of the rows - and the second table is significantly wider. This can make a difference on big tables, but the gain may be offset by the cost of the join. From this perspective, I don't think that there is a clear benefit or drawback: it should be evaluated on a case-by-case basis.

It seems to me that people take time to catch up with modern hardware reality. SSDs reduce seek time to virtually zero. Surely, joins are now much, much cheaper. If so, I’m inclined to describe wide tables as a premature optimization.

Sure, SSDs are uber-wonderful, but a rack full of rotating media is still going to be a lot cheaper and have a lot more capacity than a rack full of SSDs, and that makes all the difference...

--
Angular momentum makes the world go 'round.

Re: Whither 1:1?

From
Tim Cross
Date:
Olivier Gautherot <olivier@gautherot.net> writes:

> On Fri, Jun 1, 2018 at 12:52 PM, Guyren Howe <guyren@gmail.com> wrote:
>
>> It’s come to my attention that what seems an obvious and useful database
>> design pattern — 1:1 relations between tables by having a shared primary
>> key — is hardly discussed or used.
>>
>> It would seem to be a very simple pattern, and useful to avoid storing
>> nulls or for groups of fields that tend to be used together.
>>
>> Thoughts? Is there some downside I can’t see?
>>
>
> You will get a benefit in terms of space only if the optional fields in the
> second table are populated for only a small fraction of the rows - and the
> second table is significantly wider. This can make a difference on big
> tables, but the gain may be offset by the cost of the join. From this
> perspective, I don't think that there is a clear benefit or drawback: it
> should be evaluated on a case-by-case basis.
>

Well said. As with many database design decisions, there are more
guidelines than rules. Nearly all decisions have pros and
cons, so you need to assess them on a per-project basis. In addition
to the structure of the data (size, null frequency etc.), you also need
to consider how the data is used. It is also important to consider who
will be using the data, how they will access it and what level of
sophistication/understanding they have. The clarity of your data model
matters as well, since future decisions may be made by others, and the
better they understand the design/model, the higher the likelihood the
system will evolve in a sane and maintainable manner.

The other point to remember is that designs often have an element
of surprise - you make a decision under an assumption which turns out
not to hold due to variables you didn't consider or don't have control
over. Sometimes there are alternatives which may actually perform better
because they are optimised within the system - for example, some of the
benefits of this approach could be realised using partitions.

I have used this technique, but from memory, this was done as the system
evolved and we found there was a benefit from having a smaller 'main'
table. It isn't a design decision I recall adopting during an initial
modelling of the system, but it may be something to consider once you
find a performance problem (along with other options) which needs to be
addressed.

I'm not aware of any guideline or school of thought which rules this
out as an option. Possibly the reason it appears to be used infrequently
is that it doesn't realise the benefits you might expect, or is simply
not a problem in the majority of use cases.

Tim



--
Tim Cross