Re: Primary keys for companies and people - Mailing list pgsql-general

From Leif B. Kristensen
Subject Re: Primary keys for companies and people
Date
Msg-id 200602022311.53757.leif@solumslekt.org
In response to Re: Primary keys for companies and people  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
On Thursday 02 February 2006 21:09, Martijn van Oosterhout wrote:
>To the GP, your page is an interesting one and raises several
>interesting points. In particular the one about the "person" being the
>conclusion of the rest of the database. You essentially have a set of
>facts "A married B in C on date D" and you're trying to correlate
>these. In the end it's just a certain amount of guess work, especially
>since back then they weren't as particular about spelling as they are
>today.
>
>My naive view is that you're basically assigning trust values to each
>fact and the chance that two citations refer to the same person. In
>principle you'd be able to cross-reference all these citations and
>build the structure quasi-automatically. I suppose in practice this is
>done by hand.

Yes it is. As I stated in the article, I'd like to quantify a
'participant' of an 'event' as a "vector in genealogy space", but I
haven't really figured out a sensible entry mode for that evidence yet.
For now, I'm trying to enter as much information as possible into the
source citations.

>As for your question, I think you're stuck with having a person ID.
>Basically because you need to identify a person somehow. Given you
>still have the original citations, you can split a person into
>multiple if the situation appears to not work out.
>
>One thing I find odd though, your "person" objects have no birthdate
> or deathdate. Or birth place either.

I've appropriated the model from my previous program, The Master
Genealogist. I like the approach that the "person" entity should
contain the least possible number of assertions. I've got views and
functions that retrieve a primary birth and death date from the
database automatically.
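
To illustrate the idea, here is a minimal sketch of a "person" row that
carries no assertions of its own, with a view that looks up the primary
birth and death dates from an events table. This is not my actual schema:
the table, column and view names (persons, events, is_primary,
person_dates) are assumptions made up for the example, and it uses
Python's sqlite3 module rather than PostgreSQL so that it runs on its own.

```python
import sqlite3

# Hypothetical schema: all table, column and view names are illustrative only.
SCHEMA = """
CREATE TABLE persons (
    person_id INTEGER PRIMARY KEY   -- deliberately no name, date or place
);
CREATE TABLE events (
    event_id   INTEGER PRIMARY KEY,
    person_id  INTEGER NOT NULL REFERENCES persons(person_id),
    event_type TEXT NOT NULL,       -- e.g. 'birth', 'death'
    event_date TEXT,                -- ISO 8601 text; NULL when unknown
    is_primary INTEGER NOT NULL DEFAULT 0
);
CREATE VIEW person_dates AS
SELECT p.person_id,
       (SELECT e.event_date FROM events e
         WHERE e.person_id = p.person_id
           AND e.event_type = 'birth' AND e.is_primary = 1
         ORDER BY e.event_id LIMIT 1) AS birth_date,
       (SELECT e.event_date FROM events e
         WHERE e.person_id = p.person_id
           AND e.event_type = 'death' AND e.is_primary = 1
         ORDER BY e.event_id LIMIT 1) AS death_date
FROM persons p;
"""


def open_demo_db():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def primary_dates(conn, person_id):
    """Return (birth_date, death_date) for a person, or None if no such person."""
    return conn.execute(
        "SELECT birth_date, death_date FROM person_dates WHERE person_id = ?",
        (person_id,),
    ).fetchone()


if __name__ == "__main__":
    conn = open_demo_db()
    conn.execute("INSERT INTO persons (person_id) VALUES (1)")
    conn.executemany(
        "INSERT INTO events (person_id, event_type, event_date, is_primary)"
        " VALUES (?, ?, ?, ?)",
        [(1, "birth", "1742-03-14", 1), (1, "birth", "1743-01-01", 0)],
    )
    print(primary_dates(conn, 1))
```

The point of the design is that competing birth assertions can coexist
as separate events, and changing which one is "primary" never touches
the person row itself.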

> I would have thought these
> elements would be fundamental in determining if two people are the
> same, given that they can't change and people are unlikely to forget
> them.

Yes. But in 18th century genealogy, at least in Norway, you're unlikely
to find a birth date and place in any record other than the christening.
As a matter of fact, the birth date wasn't usually recorded either, but
as the christening usually took place within a week after birth, you've
got a pretty good approximation.
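
As a rough sketch of that approximation: given a christening date, the
birth can be bracketed by a window reaching back about a week. The
function names and the seven-day default below are assumptions for
illustration, not a rule taken from any source.

```python
from datetime import date, timedelta


def birth_window(christening, max_delay_days=7):
    """Return (earliest, latest) plausible birth dates for a christening date.

    Assumes the christening took place no more than max_delay_days
    after the birth; the default of 7 is only a rule of thumb.
    """
    if max_delay_days < 0:
        raise ValueError("max_delay_days must be non-negative")
    return christening - timedelta(days=max_delay_days), christening


def windows_overlap(window_a, window_b):
    """True when two (earliest, latest) birth windows share at least one day."""
    return window_a[0] <= window_b[1] and window_b[0] <= window_a[1]


if __name__ == "__main__":
    earliest, latest = birth_window(date(1742, 3, 21))
    print(earliest.isoformat(), latest.isoformat())
```

Comparing two such windows with windows_overlap() only says that two
records are compatible on the birth date; it is not evidence by itself
that they refer to the same person.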

>Put another way, two people with the same birthday in the same place
>with similar names are very likely to be the same. If you can
>demonstrate this is not the case, that's another fact. In the end you're
>dealing with probabilities, you can never know for sure.

18th century genealogy has a lot in common with crime investigation.
You've basically got a few clues, and you try to piece together a picture from
the sparse evidence that may be found. I love that challenge, but it
may be quite taxing sometimes.
--
Leif Biberg Kristensen | Registered Linux User #338009
http://solumslekt.org/ | Cruising with Gentoo/KDE
