Re: My honours project - databases using dynamically attached entity-properties - Mailing list pgsql-hackers

From Richard Huxton
Subject Re: My honours project - databases using dynamically attached entity-properties
Date
Msg-id 45F66F81.7090201@archonet.com
Whole thread Raw
In response to Re: My honours project - databases using dynamically attached entity-properties  ("Sean Utt" <sean@strateja.com>)
Responses Re: My honours project - databases using dynamically attached entity-properties  (David Fetter <david@fetter.org>)
List pgsql-hackers
Sean Utt wrote:
> And then what? Make the search box on www.postgresql.org able to handle 
> an email address as search text without throwing a shoe?
> 
> Search for linus@torvalds.com or any other 'email' address from the 
> postgres home page. Barfage every time.
> Easy for some isn't easy for all, apparently. Left that out as a test 
> case did we? Someone searching a mailing list for an email address? Who 
> wudda thunk it? It works without the . -- don't know why, but then I 
> also don't know why someone hasn't tried that before me.

Had a bad day?

And I'll be damned if you're not right - it doesn't return results for 
dev@archonet.com but does for dev@archonet. Presumably something to do 
with (not ?) splitting the email address on ".". Can't believe no-one 
has noticed this before (me, for example). I guess that even though I 
search a lot, it's not on email addrs.

Have you reported it to the www team?

> "Sure, sounds like a simple solution to me..." Richard said 
> sarcastically. Would be funnier if the search on the website wasn't 
> broken in a completely stupid, almost ironic way. Ah, irony and sarcasm 
> -- the ugly twins.

Actually, it was Greg who said that. And it was *me* the (really very 
gentle) sarcasm was directed at.

> Yeah, we have to dynamically generate queries day in and day out. But 
> then some of us actually work for a living.

Umm - like all of us?

> Since we already have to do that, maybe someone could make that easier? 

Good idea!

> Isn't that really the point here?

Not as I was reading the discussion.
> Someone asked if something would be
> useful, and the people who use the database to do real work said YES, 
> and here's how I might use it. Like full text seach and recursive 
> queries, user defined (fields|attributes|properties) and the ability to 
> manage them would be BUTTER! Is it a difficult problem? YES, but if it 
> wasn't, why should it be worth an advanced degree?

I think the main discussion has been around:
1. Whether Edward's final-year project is basically EAV (in which case 
he'll probably need to work hard to get good marks).
2. Whether dynamically generating DDL is safe/practical in a business 
setting. This seems to split some *very* experienced people about 50:50. 
Half of them agree with me and the other half are inexplicably wrong ;-)

If you read section 0.3 of Edward's project proposal he argues that 
dynamic DDL causes problems for the application because: "However, SQL 
does not provide an easy syntax for querying these properties." (meaning 
the changed structure of the database). I'm not saying this part is 
easy, but I'm not convinced it's harder than doing it the "wrong" way. 
At least not if you do as Edward does and enforce types.

Now, in 0.3.1 he describes a normalised webpage=>tags table pair and 
shows some cumbersome-looking SQL. However, I'd argue this is due to the 
verbose nature of the SQL rather than the underlying expressions.

He then looks at what I think of as the "system settings table"* 
problem, where you have a bunch of configuration-settings you'd tend to 
store in a single table (setting_name, value), except you'd like to have 
different types for each setting (a further wrinkle is that you might 
like lists of settings - do you use arrays or a sub-table?). This is 
your classic EAV scenario.

Now, he proposes a set of tables - one per property, so you can enforce 
type constraints, but he will need to create syntax to make this easier 
to query. Presumably it'll need an automatically-generated view over the 
top. (Edward - tip: read up on Date's thoughts on automatically 
determining what table you can insert a row in based on its type).

This certainly looks like a variant on EAV to me, and I'm not convinced 
that it's gaining much since you'll have possibly huge numbers of joins 
going on in the background while not having any semantics to your table 
definitions.

The other variant (which I've used myself) is to have a "type" column 
and a trigger to enforce the text-value is valid for "type".

Now, if you do it "properly", that is define tables as you would as a 
human designer, then you do have the problem of informing your 
application on the relevant structure. Now, the syntax issues of this 
are largely solved - plenty of ActiveRecord-style classes out there for 
Ruby,Python,Perl,PHP,...

The interesting question then becomes where the practical limits in such 
flexibility are. Simple attribute/value pairs are easy enough. How do 
you feel about changes to existing tables though? How about new tables 
that add relationships between existing tables? At what point does the 
application just become pgAdmin?


* Another good example is the "questionnaire". Users need to be able to 
define their own lists of questions and if "driving_licence=no" then 
don't bother to ask "type of car=?". Edward - if you want to see a 
schema that implements this sort of thing, contact me off list and I'll 
see what I can do. The client for that project will probably be fine 
with sharing it with one student.

--   Richard Huxton  Archonet Ltd


pgsql-hackers by date:

Previous
From: Csaba Nagy
Date:
Subject: Re: My honours project - databases using dynamically attached entity-properties
Next
From: "Simon Riggs"
Date:
Subject: Re: Bug: Buffer cache is not scan resistant