Re: Google's Summer of Code ... - Mailing list pgsql-hackers

From Meredith L. Patterson
Subject Re: Google's Summer of Code ...
Date
Msg-id 429E31CC.8070907@thesmartpolitenerd.com
In response to Re: Google's Summer of Code ...  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Google's Summer of Code ...
List pgsql-hackers
Simon Riggs wrote:
> Is it possible that you could put sufficient of the application into
> PostgreSQL to genericise some features? Stonebraker's Third Wave was
> *all* about putting data intensive operations closer to where the data
> is stored/accessed.

And just like that, a lightbulb goes off in my head.

I'd been reluctant to push the training step inside the engine, because
I couldn't come up with a good way of doing it, but now it seems so
obvious. A ranking support vector machine takes as input a series of
partial orders -- think of it as several "buckets" into which data items
are thrown. Or, if you will, a list of lists of unique identifiers. And
that would be *easy* to pass as part of a query string. I'm envisioning
a syntax like:

ORDER BY SVM linear KEY foo ((1, 2, 3), (4, 5), (6, 7, 8), (9))

So this would be a partial ordering where each number is the key (PK is
column 'foo') of some tuple in a table, and (1, 2, 3) > (4, 5) > (6, 7, 8)
> (9) in terms of the user's preference. Items within the same bucket are
not ranked against each other.
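
To make the bucket notation concrete, here is a minimal Python sketch of
how such a list of lists could be expanded into the pairwise preferences a
ranking learner typically consumes. The function name preference_pairs is
hypothetical and purely illustrative; nothing like it exists in PostgreSQL.

```python
from itertools import combinations, product

def preference_pairs(buckets):
    """Expand ordered buckets into (preferred, less_preferred) key pairs.

    Buckets are listed from most to least preferred. Keys inside the same
    bucket are left unordered, so they produce no pairs.
    """
    pairs = []
    # combinations() keeps the input order, so `better` always precedes `worse`.
    for better, worse in combinations(buckets, 2):
        pairs.extend(product(better, worse))
    return pairs

# The partial ordering from the proposed ORDER BY clause.
buckets = [(1, 2, 3), (4, 5), (6, 7, 8), (9,)]
pairs = preference_pairs(buckets)
print(len(pairs))   # number of cross-bucket pairs
print(pairs[:3])    # first few (preferred, less_preferred) pairs
```

Every key in an earlier bucket is paired with every key in a later one, so
the four buckets above carry far more training signal than their size
suggests.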

Use that (much more human-readable than I had originally envisioned)
input to learn the actual ranking function inside the database, apply
that ranking to the results, and boom -- an ORDER BY clause extrapolated
directly from a partial ranking, with no pesky outside-the-database
learning step.
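
As a rough illustration of that pipeline, the sketch below learns a linear
scoring function from the buckets and then sorts rows by it. It is not a
real ranking SVM: it uses a plain perceptron-style subgradient update on
the pairwise hinge loss with no regularization, as a simplified stand-in.
The feature vectors, function names, learning rate and epoch count are all
assumptions made up for the example.

```python
from itertools import combinations, product

def preference_pairs(buckets):
    """Expand ordered buckets into (preferred, less_preferred) key pairs."""
    pairs = []
    for better, worse in combinations(buckets, 2):
        pairs.extend(product(better, worse))
    return pairs

def score(w, x):
    """Linear ranking function: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_ranker(features, pairs, lr=0.1, epochs=200):
    """Fit weights so that score(preferred) exceeds score(other) by >= 1.

    Simplified stand-in for a ranking SVM: subgradient steps on the
    pairwise hinge loss, without a regularization term.
    """
    dim = len(next(iter(features.values())))
    w = [0.0] * dim
    for _ in range(epochs):
        updated = False
        for a, b in pairs:
            diff = [xa - xb for xa, xb in zip(features[a], features[b])]
            if score(w, diff) < 1.0:          # margin violated
                w = [wi + lr * di for wi, di in zip(w, diff)]
                updated = True
        if not updated:                        # every pair satisfied
            break
    return w

# Hypothetical two-column feature vectors for the tuples keyed 1..9.
features = {k: (10.0 - k, float(k % 3)) for k in range(1, 10)}
buckets = [(1, 2, 3), (4, 5), (6, 7, 8), (9,)]
pairs = preference_pairs(buckets)

w = train_linear_ranker(features, pairs)
# The learned function plays the role of the ORDER BY expression.
ranked = sorted(features, key=lambda k: score(w, features[k]), reverse=True)
print(ranked)
```

Because the learned function is defined on feature values rather than on
keys, the same score could be applied to rows that never appeared in the
buckets, which is the extrapolation step described above.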

(Nonlinear kernels have some additional parameters, and tuning them can
be something of a black art, but the syntax can be extended to let
people specify them. Default values would also be necessary.)
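
For a sense of what those extra parameters look like, here is a small
Python sketch of three common kernels with default parameter values. The
defaults shown (degree=3, coef0=1.0, gamma=0.5) and the make_kernel helper
are arbitrary choices for illustration, not a proposal for what the
database should use.

```python
import math

def linear_kernel(x, y):
    """Plain dot product; no extra parameters."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """(x . y + coef0) ** degree; degree and coef0 need tuning."""
    return (linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2); gamma controls the kernel width."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

KERNELS = {
    "linear": linear_kernel,
    "polynomial": polynomial_kernel,
    "rbf": rbf_kernel,
}

def make_kernel(name, **params):
    """Look up a kernel by name; unspecified parameters keep their defaults."""
    base = KERNELS[name]
    return lambda x, y: base(x, y, **params)

default_rbf = make_kernel("rbf")                  # uses gamma=0.5
quadratic = make_kernel("polynomial", degree=2)   # overrides only degree
print(default_rbf((1.0, 2.0), (1.0, 2.0)))
print(quadratic((1.0, 0.0), (1.0, 0.0)))
```

An extended ORDER BY syntax could follow the same pattern: name the kernel,
optionally override individual parameters, and fall back to defaults for
the rest.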

I'll continue to think on this, but already this approach strikes me as
a lot more useful to the average user. Thanks, Simon!

Cheers,
Meredith

