Thread: effective SELECT from child tables

effective SELECT from child tables

From

"Ilia Kantor"

Date:

27 September 2005, 15:05:25

<div class="Section1"><p class="MsoNormal"><font face="Arial" size="2"><span lang="EN-US" style="font-size:
10.0pt;font-family:Arial">Let table A be inherited by A1, A2, A3.</span></font><p class="MsoNormal"><font face="Arial"
size="2"><spanlang="EN-US" style="font-size: 
10.0pt;font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span lang="EN-US"
style="font-size:
10.0pt;font-family:Arial">How to select from A records where actual relations are A1, A2 ?</span></font><p
class="MsoNormal"><fontface="Arial" size="2"><span lang="EN-US" style="font-size: 
10.0pt;font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span lang="EN-US"
style="font-size:
10.0pt;font-family:Arial">I found a way somewhere, it sounds like SELECT  … WHERE tableoid IN (a1.oid,
a2.oid),</span></font><pclass="MsoNormal"><font face="Arial" size="2"><span lang="EN-US" style="font-size: 
10.0pt;font-family:Arial">but tableoid checks actually do seq scan.</span></font><p class="MsoNormal"><font
face="Arial"size="2"><span lang="EN-US" style="font-size: 
10.0pt;font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span lang="EN-US"
style="font-size:
10.0pt;font-family:Arial">Like: SELECT * FROM sometable WHERE tableoid =anything will do seq. scan on
sometable..</span></font><pclass="MsoNormal"><font face="Arial" size="2"><span lang="EN-US" style="font-size: 
10.0pt;font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span lang="EN-US"
style="font-size:
10.0pt;font-family:Arial">So such way seems very ineffective: it seq scans and filters records..</span></font><p
class="MsoNormal"><fontface="Arial" size="2"><span lang="EN-US" style="font-size: 
10.0pt;font-family:Arial"> </span></font><p class="MsoNormal"><font face="Arial" size="2"><span lang="EN-US"
style="font-size:
10.0pt;font-family:Arial">Maybe new constraint_exclusion staff could help to exclude non-matching tables from
inheritancequery ?</span></font></div>

Re: effective SELECT from child tables

From

Josh Berkus

Date:

27 September 2005, 15:30:53

Ilia,

> Maybe new constraint_exclusion staff could help to exclude non-matching
> tables from inheritance query ?

Yes, that's exactly what it's for.  Your testing is welcome.  Download 8.1 
and try it today.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

Re: effective SELECT from child tables

From

Tom Lane

Date:

27 September 2005, 15:38:24

"Ilia Kantor" <ilia@obnovlenie.ru> writes:
> Let table A be inherited by A1, A2, A3.

> How to select from A records where actual relations are A1, A2 ?

Why not just select directly from the child tables?  I can't get excited
about optimizing the case you propose.
        regards, tom lane

Re: effective SELECT from child tables

From

"Ilia Kantor"

Date:

28 September 2005, 05:51:02

>> Maybe new constraint_exclusion staff could help to exclude non-matching
>> tables from inheritance query ?

> Yes, that's exactly what it's for.  Your testing is welcome.  Download 8.1

> and try it today.

Great, I'm developing on 8.1b2 now... 
But could you be more particular about the solution ?


Only the way I can think of is to add "relname" field into parent table, add
"BEFORE INSERT" trigger to each child that will set it appropriately and
CHECK (relname=<here goes table name>).
It works in this case..
More than that.. I can create indexes on each table with "WHERE
relname!=<table name>", then they are used with DELETE/UPDATE WHERE relname
IN(..). 

But the whole idea to adding an extra field, trigger and duplicating table
name multiple times.. Feels a bit stinky to me..

Could you suggest another path ?

Re: effective SELECT from child tables

From

"Ilia Kantor"

Date:

28 September 2005, 05:51:20

>> Let table A be inherited by A1, A2, A3.
>> How to select from A records where actual relations are A1, A2 ?

>Why not just select directly from the child tables?  I can't get excited
>about optimizing the case you propose.

Because "WHERE concrete_class IN (a,b,c)" is much more convenient and
flexible way of forming select then manually split request into many unions.


Also, this query runs on top of "abstract class", so inheritance really
assists me here.

Re: effective SELECT from child tables

From

mark@mark.mielke.cc

Date:

28 September 2005, 10:58:53

On Tue, Sep 27, 2005 at 09:30:55PM +0400, Ilia Kantor wrote:
> Let table A be inherited by A1, A2, A3.
> How to select from A records where actual relations are A1, A2 ?

If A1 and A2 will be naturally together, where compared to A, or A3,
why not introduce an intermediate table?

A would be inherited by A12, and A3. A12 would be inherited by A1, and A2.

You can do the UNION yourself, as well.

Cheers,
mark

-- 
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada
 One ring to rule them all, one ring to find them, one ring to bring them all                      and in the darkness
bindthem...
 
                          http://mark.mielke.cc/

Re: effective SELECT from child tables

From

Simon Riggs

Date:

28 September 2005, 15:26:19

On Wed, 2005-09-28 at 12:13 +0400, Ilia Kantor wrote:
> >> Let table A be inherited by A1, A2, A3.
> >> How to select from A records where actual relations are A1, A2 ?
> 
> >Why not just select directly from the child tables?  I can't get excited
> >about optimizing the case you propose.
> 
> Because "WHERE concrete_class IN (a,b,c)" is much more convenient and
> flexible way of forming select then manually split request into many unions.

> Also, this query runs on top of "abstract class", so inheritance really
> assists me here.
> 

If you treat the sub-class Discriminator as a data item rather than some
additional syntax for class membership then you will find this works
very straightforwardly for you.

Include the Discriminator as a column in A and it will be inherited by
all A1, A2, A3. 
e.g. concrete_class    char(1) not null

then setup constraints on each table like so

ALTER TABLE A1 ADD CHECK (concrete_class = 'A')
ALTER TABLE A2 ADD CHECK (concrete_class = 'B')
ALTER TABLE A3 ADD CHECK (concrete_class = 'C')

then when you run a query like

SELECT * FROM A WHERE concrete_class IN ('A','B')

you will find that table A3, which corresponds to concrete_class C has
been excluded from your query.

Presumably A, B, C are all mutually exclusive, so the end result will be
the same as if you had used a UNION ALL set query.

This will add 1 byte per row in your superclass... and requires no
index. You can even add this as a DEFAULT value for each child table, so
the actual column concrete_class need not be mentioned in an INSERT
statement.

(I've got plans to add an ABSTRACT keyword to tables to follow along the
same set of OO terminology in describing this situation. In next
release, not 8.1)

Best Regards, Simon Riggs

Re: effective SELECT from child tables

From

"Ilia Kantor"

Date:

28 September 2005, 16:09:36


> If you treat the sub-class Discriminator as a data item rather than some
> additional syntax for class membership then you will find this works
> very straightforwardly for you.

Your suggestion is essentially the same as mine.. 
There exists tableoid, pretty much suited to tell between tables in the case
of inheritance.. I can't see a "real" need to add a special "class
classifier" to each table..
This solution is a workaround. It will work, just can't make myself love it.

Re: effective SELECT from child tables

From

Martijn van Oosterhout

Date:

28 September 2005, 17:24:35

On Wed, Sep 28, 2005 at 11:00:46PM +0400, Ilia Kantor wrote:
> Your suggestion is essentially the same as mine..
> There exists tableoid, pretty much suited to tell between tables in the case
> of inheritance.. I can't see a "real" need to add a special "class
> classifier" to each table..
> This solution is a workaround. It will work, just can't make myself love it.

I wonder if it would be possible to tweak the constraints exclusion
code so that if it sees something of the form "tableoid = X" to exclude
other tables...

You know, assume each table has a constraint "tableoid = OID".

Still, it is a fairly unusual feature.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: effective SELECT from child tables

From

Josh Berkus

Date:

29 September 2005, 13:35:36

Ilia,

Well, Simon is still writing the CE docs.  In the meantime:

http://archives.postgresql.org/pgsql-hackers/2005-07/msg00461.php

--josh

Re: effective SELECT from child tables

From

"Jim C. Nasby"

Date:

30 September 2005, 20:26:51

On Wed, Sep 28, 2005 at 10:24:18PM +0200, Martijn van Oosterhout wrote:
> On Wed, Sep 28, 2005 at 11:00:46PM +0400, Ilia Kantor wrote:
> > Your suggestion is essentially the same as mine.. 
> > There exists tableoid, pretty much suited to tell between tables in the case
> > of inheritance.. I can't see a "real" need to add a special "class
> > classifier" to each table..
> > This solution is a workaround. It will work, just can't make myself love it.
> 
> I wonder if it would be possible to tweak the constraints exclusion
> code so that if it sees something of the form "tableoid = X" to exclude
> other tables...
> 
> You know, assume each table has a constraint "tableoid = OID".
> 
> Still, it is a fairly unusual feature.

Well, it's possibly a good way to do list partitioning where you can
drop the partitioning key out of each partition, something I would love
to have for stats.distributed.net (I'm actually working on a project
that does exactly this with a UNION ALL view and rules...)
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: effective SELECT from child tables

From

"Jim C. Nasby"

Date:

30 September 2005, 20:30:12

On Wed, Sep 28, 2005 at 07:25:46PM +0100, Simon Riggs wrote:
> Include the Discriminator as a column in A and it will be inherited by
> all A1, A2, A3. 
> e.g. concrete_class    char(1) not null
<snip>
> This will add 1 byte per row in your superclass... and requires no

I thought char was actually stored variable-length...? I know there's a
type that actually acts like char does on most databases, but I can't
remember what it is off-hand (it should be mentioned in docs 8.3...)
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: effective SELECT from child tables

From

Simon Riggs

Date:

01 October 2005, 05:21:17

On Wed, 2005-09-28 at 22:24 +0200, Martijn van Oosterhout wrote:
> On Wed, Sep 28, 2005 at 11:00:46PM +0400, Ilia Kantor wrote:
> > Your suggestion is essentially the same as mine.. 
> > There exists tableoid, pretty much suited to tell between tables in the case
> > of inheritance.. I can't see a "real" need to add a special "class
> > classifier" to each table..
> > This solution is a workaround. It will work, just can't make myself love it.
> 
> I wonder if it would be possible to tweak the constraints exclusion
> code so that if it sees something of the form "tableoid = X" to exclude
> other tables...
> 
> You know, assume each table has a constraint "tableoid = OID".
> 
> Still, it is a fairly unusual feature.

As I pointed out, the solution I proposed works with CE, in comparison
with selecting a tableoid, which does not. It also costs 1 byte per row.

Jim Nasby requested a similar feature a couple of months back.
Essentially this request reduces to the idea of having attributes that
are constant for all rows of a table. That doesn't have any side or
additional benefits AFAICS, so improving that case isn't going to be at
the top of a worthwhile-improvements list for a long time yet,
especially since the CE list already has at least 10 items on it.

Best Regards, Simon Riggs

Re: effective SELECT from child tables

From

Simon Riggs

Date:

01 October 2005, 05:42:44

On Fri, 2005-09-30 at 18:30 -0500, Jim C. Nasby wrote:

> I thought char was actually stored variable-length...? I know there's a
> type that actually acts like char does on most databases, but I can't
> remember what it is off-hand (it should be mentioned in docs 8.3...)

You are correct on that point, but IMHO that doesn't change the
situation sufficiently for the main part of this thread.

Best Regards, Simon Riggs

Re: effective SELECT from child tables

From

Martijn van Oosterhout

Date:

01 October 2005, 09:14:55

On Fri, Sep 30, 2005 at 06:30:10PM -0500, Jim C. Nasby wrote:
> On Wed, Sep 28, 2005 at 07:25:46PM +0100, Simon Riggs wrote:
> > Include the Discriminator as a column in A and it will be inherited by
> > all A1, A2, A3.
> > e.g. concrete_class    char(1) not null
> <snip>
> > This will add 1 byte per row in your superclass... and requires no
>
> I thought char was actually stored variable-length...? I know there's a
> type that actually acts like char does on most databases, but I can't
> remember what it is off-hand (it should be mentioned in docs 8.3...)

IIRC, this is the difference between "char" and char(1). The latter is
variable length and can store any character per current encoding, hence
the variable length. "char" on the other hand is a one byte (presumably
ASCII) character. It's used mainly in the system catalogs...
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: effective SELECT from child tables

From

mark@mark.mielke.cc

Date:

01 October 2005, 11:19:45

On Fri, Sep 30, 2005 at 09:54:39PM +0100, Simon Riggs wrote:
> On Wed, 2005-09-28 at 22:24 +0200, Martijn van Oosterhout wrote:
> > I wonder if it would be possible to tweak the constraints exclusion
> > code so that if it sees something of the form "tableoid = X" to exclude
> > other tables...
> > You know, assume each table has a constraint "tableoid = OID".
> > Still, it is a fairly unusual feature.
> As I pointed out, the solution I proposed works with CE, in comparison
> with selecting a tableoid, which does not. It also costs 1 byte per row.

I can't see the use of an extra char column in a row, that is static
for an entire table, as anything except an ugly hack. It might work.
It doesn't make it right.

> Jim Nasby requested a similar feature a couple of months back.
> Essentially this request reduces to the idea of having attributes that
> are constant for all rows of a table. That doesn't have any side or
> additional benefits AFAICS, so improving that case isn't going to be at
> the top of a worthwhile-improvements list for a long time yet,
> especially since the CE list already has at least 10 items on it.

It has the 'side or additional benefit' being requested here. The ability
to filter the child table by some attribute. For example, if the child
tables are used for partitioning, and the attribute were to keep a date
range, the field restriction optimization could be used to automatically
determine the set of tables to use for the date range specified. With
such a change, it would even work automatically if the date ranges
overlapped for some reason. Selecting a table name by date is hacky. This
sort of solution would be a general solution to the problem.

If the original code suggested - the use of the table oid, and comparing
it to the rows table oid, is supported by PostgreSQL (I have never tried
it), it seems like a miss for the optimizer not to understand what it
means. Each row has one table that it is associated with. It's quite
obviously criteria that could be used to restrict the table search, and
belongs in any COMPLETE implementation of a restriction optimizer.

I still prefer Jim Nasby's model, though. I think it easily extends such
that the table row oid IS an automatic constant for all rows of a table.
It's a natural extension, and takes away the exceptional nature of the
table row oid.

Cheers,
mark

-- 
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada
 One ring to rule them all, one ring to find them, one ring to bring them all                      and in the darkness
bindthem...

                          http://mark.mielke.cc/

Re: effective SELECT from child tables

From

Martijn van Oosterhout

Date:

01 October 2005, 11:37:52

On Sat, Oct 01, 2005 at 10:05:22AM -0400, mark@mark.mielke.cc wrote:
> It has the 'side or additional benefit' being requested here. The ability
> to filter the child table by some attribute. For example, if the child
> tables are used for partitioning, and the attribute were to keep a date
> range, the field restriction optimization could be used to automatically
> determine the set of tables to use for the date range specified. With
> such a change, it would even work automatically if the date ranges
> overlapped for some reason. Selecting a table name by date is hacky. This
> sort of solution would be a general solution to the problem.

This is what "Constraint Exclusion" does. It uses CHECK constraints on
a table to filter out tables that obviously don't apply to a query.
It's just the the specific case of "tableoid = XXX" is not supported
right now.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: effective SELECT from child tables

From

"Jim C. Nasby"

Date:

01 October 2005, 12:57:34

To clarify, this is a hard-coded implementation of what I'm asking for:
http://cvs.distributed.net/viewcvs.cgi/stats-sql/logdb/ in a nutshell:

CREATE TABLE log_other (   project_id  smallint NOT NULL   ...
)

CREATE TABLE log_8 (   -- No project_id   ...
)
CREATE TABLE log_24, log_25, log_5...
CREATE VIEW log AS   SELECT * FROM log_other   UNION ALL SELECT 8 AS project_id, * FROM log_8   ...

So the end result is that for cases where project_id is 5, 8, 24, or 25,
the data will be stored in tables that don't have the project_id.

If I were to use this on the main table for
http://stats.distributed.net, which has ~130M rows, I would be able to
save 130M*4 bytes (4 instead of 2 due to alignment), or 520MB. The logdb
will have many times that number of rows, so the savings will be even
larger.

Note that this technique wouldn't help at all for something like date
partitioning, because you have to store the date in the partitioned
table.

On Sat, Oct 01, 2005 at 04:35:49PM +0200, Martijn van Oosterhout wrote:
> On Sat, Oct 01, 2005 at 10:05:22AM -0400, mark@mark.mielke.cc wrote:
> > It has the 'side or additional benefit' being requested here. The ability
> > to filter the child table by some attribute. For example, if the child
> > tables are used for partitioning, and the attribute were to keep a date
> > range, the field restriction optimization could be used to automatically
> > determine the set of tables to use for the date range specified. With
> > such a change, it would even work automatically if the date ranges
> > overlapped for some reason. Selecting a table name by date is hacky. This
> > sort of solution would be a general solution to the problem.
> 
> This is what "Constraint Exclusion" does. It uses CHECK constraints on
> a table to filter out tables that obviously don't apply to a query.
> It's just the the specific case of "tableoid = XXX" is not supported
> right now.
> 
> Have a nice day,
> -- 
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> > tool for doing 5% of the work and then sitting around waiting for someone
> > else to do the other 95% so you can sue them.

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: effective SELECT from child tables

From

"Jim C. Nasby"

Date:

01 October 2005, 12:58:53

On Sat, Oct 01, 2005 at 09:43:03AM +0100, Simon Riggs wrote:
> On Fri, 2005-09-30 at 18:30 -0500, Jim C. Nasby wrote:
> 
> > I thought char was actually stored variable-length...? I know there's a
> > type that actually acts like char does on most databases, but I can't
> > remember what it is off-hand (it should be mentioned in docs 8.3...)
> 
> You are correct on that point, but IMHO that doesn't change the
> situation sufficiently for the main part of this thread.

Depends on how many tuples and how wide they are. 4 extra bytes over a
couple hundred million rows isn't something to sneeze at. :)
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: effective SELECT from child tables

From

"Jim C. Nasby"

Date:

01 October 2005, 12:59:32

On Sat, Oct 01, 2005 at 02:13:09PM +0200, Martijn van Oosterhout wrote:
> On Fri, Sep 30, 2005 at 06:30:10PM -0500, Jim C. Nasby wrote:
> > On Wed, Sep 28, 2005 at 07:25:46PM +0100, Simon Riggs wrote:
> > > Include the Discriminator as a column in A and it will be inherited by
> > > all A1, A2, A3. 
> > > e.g. concrete_class    char(1) not null
> > <snip>
> > > This will add 1 byte per row in your superclass... and requires no
> > 
> > I thought char was actually stored variable-length...? I know there's a
> > type that actually acts like char does on most databases, but I can't
> > remember what it is off-hand (it should be mentioned in docs 8.3...)
> 
> IIRC, this is the difference between "char" and char(1). The latter is
> variable length and can store any character per current encoding, hence
> the variable length. "char" on the other hand is a one byte (presumably
> ASCII) character. It's used mainly in the system catalogs...

According to the docs, char == char(1).
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: effective SELECT from child tables

From

Andrew Dunstan

Date:

01 October 2005, 13:13:13


Jim C. Nasby wrote:

>On Sat, Oct 01, 2005 at 02:13:09PM +0200, Martijn van Oosterhout wrote:
>  
>
>>On Fri, Sep 30, 2005 at 06:30:10PM -0500, Jim C. Nasby wrote:
>>    
>>
>>>On Wed, Sep 28, 2005 at 07:25:46PM +0100, Simon Riggs wrote:
>>>      
>>>
>>>>Include the Discriminator as a column in A and it will be inherited by
>>>>all A1, A2, A3. 
>>>>e.g. concrete_class    char(1) not null
>>>>        
>>>>
>>><snip>
>>>      
>>>
>>>>This will add 1 byte per row in your superclass... and requires no
>>>>        
>>>>
>>>I thought char was actually stored variable-length...? I know there's a
>>>type that actually acts like char does on most databases, but I can't
>>>remember what it is off-hand (it should be mentioned in docs 8.3...)
>>>      
>>>
>>IIRC, this is the difference between "char" and char(1). The latter is
>>variable length and can store any character per current encoding, hence
>>the variable length. "char" on the other hand is a one byte (presumably
>>ASCII) character. It's used mainly in the system catalogs...
>>    
>>
>
>According to the docs, char == char(1).
>  
>

The docs also say:

The type "char" (note the quotes) is different from char(1) in that it 
only uses one byte of storage. It is internally used in the system 
catalogs as a poor-man's enumeration type.

cheers

andrew

Re: effective SELECT from child tables

From

Martijn van Oosterhout

Date:

01 October 2005, 13:29:56

Hmm, I'm trying to understand here. If every row in log_8 should have
the same project_id, couldn't this be acheived by having each row in log_other
contain the tableoid of the table it refers to. Then a join will return
the info you're looking for.

Or am I missing something?

On Sat, Oct 01, 2005 at 10:57:27AM -0500, Jim C. Nasby wrote:
> To clarify, this is a hard-coded implementation of what I'm asking for:
> http://cvs.distributed.net/viewcvs.cgi/stats-sql/logdb/ in a nutshell:
>
> CREATE TABLE log_other (
>     project_id  smallint NOT NULL
>     ...
> )
>
> CREATE TABLE log_8 (
>     -- No project_id
>     ...
> )
> CREATE TABLE log_24, log_25, log_5...
> CREATE VIEW log AS
>     SELECT * FROM log_other
>     UNION ALL SELECT 8 AS project_id, * FROM log_8
>     ...
>
> So the end result is that for cases where project_id is 5, 8, 24, or 25,
> the data will be stored in tables that don't have the project_id.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: effective SELECT from child tables

From

"Jim C. Nasby"

Date:

01 October 2005, 13:34:23

On Sat, Oct 01, 2005 at 06:28:03PM +0200, Martijn van Oosterhout wrote:
> Hmm, I'm trying to understand here. If every row in log_8 should have
> the same project_id, couldn't this be acheived by having each row in log_other
> contain the tableoid of the table it refers to. Then a join will return
> the info you're looking for.
> 
> Or am I missing something?

log_other will only contain rows where project_id NOT IN (5,8,24,25).
The UNION ALL view is the general purpose way to read data from the
tables. (Actually, since it has rules on it, it's also the way to write
data to the tables as well).

> On Sat, Oct 01, 2005 at 10:57:27AM -0500, Jim C. Nasby wrote:
> > To clarify, this is a hard-coded implementation of what I'm asking for:
> > http://cvs.distributed.net/viewcvs.cgi/stats-sql/logdb/ in a nutshell:
> > 
> > CREATE TABLE log_other (
> >     project_id  smallint NOT NULL
> >     ...
> > )
> > 
> > CREATE TABLE log_8 (
> >     -- No project_id
> >     ...
> > )
> > CREATE TABLE log_24, log_25, log_5...
> > CREATE VIEW log AS
> >     SELECT * FROM log_other
> >     UNION ALL SELECT 8 AS project_id, * FROM log_8
> >     ...
> > 
> > So the end result is that for cases where project_id is 5, 8, 24, or 25,
> > the data will be stored in tables that don't have the project_id.
> -- 
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> > tool for doing 5% of the work and then sitting around waiting for someone
> > else to do the other 95% so you can sue them.



-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: effective SELECT from child tables

From

Simon Riggs

Date:

01 October 2005, 15:58:58

On Sat, 2005-10-01 at 10:57 -0500, Jim C. Nasby wrote:
> To clarify, this is a hard-coded implementation of what I'm asking for:
> http://cvs.distributed.net/viewcvs.cgi/stats-sql/logdb/ in a nutshell:
> 
> CREATE TABLE log_other (
>     project_id  smallint NOT NULL
>     ...
> )
> 
> CREATE TABLE log_8 (
>     -- No project_id
>     ...
> )
> CREATE TABLE log_24, log_25, log_5...
> CREATE VIEW log AS
>     SELECT * FROM log_other
>     UNION ALL SELECT 8 AS project_id, * FROM log_8
>     ...
> 
> So the end result is that for cases where project_id is 5, 8, 24, or 25,
> the data will be stored in tables that don't have the project_id.
> 
> If I were to use this on the main table for
> http://stats.distributed.net, which has ~130M rows, I would be able to
> save 130M*4 bytes (4 instead of 2 due to alignment), or 520MB. The logdb
> will have many times that number of rows, so the savings will be even
> larger.
> 
> Note that this technique wouldn't help at all for something like date
> partitioning, because you have to store the date in the partitioned
> table.

Jim,

Your idea was noted before and actually; I mentioned it to show that I
listen and take note of ideas from any source.

For everybody, I would note that the current behaviour is exactly the
way that List Partitioning works on other systems.

The cost of this technique is only paid if you choose to partition on
something that you would not otherwise have included in your table. In
many cases, you'll choose a column that would have been in the table if
you created one big table so the additional cost is zero.

In your example, I would expect to see project_id in a superclass table
and so there would be no cost.

The idea is neat, but IMHO the potential saving of this idea is not big
enough for me to prioritise that very highly over other items at this
time.

Best Regards, Simon Riggs

Re: effective SELECT from child tables

From

Hannu Krosing

Date:

01 October 2005, 16:33:22

On L, 2005-10-01 at 19:59 +0100, Simon Riggs wrote:

> Jim,
> 
> Your idea was noted before and actually; I mentioned it to show that I
> listen and take note of ideas from any source.
> 
> For everybody, I would note that the current behaviour is exactly the
> way that List Partitioning works on other systems.
> 
> The cost of this technique is only paid if you choose to partition on
> something that you would not otherwise have included in your table. In
> many cases, you'll choose a column that would have been in the table if
> you created one big table so the additional cost is zero.

If we had not disabled SELECT rules on ordinary tables some time back
(reserving them exclusively for VIEWs), then most of the benefit of not
storing static tables would have been obtained by storing NULL in the
constant column (via RULE or TRIGGER) and creating an ON SELECT rule on
the subtable that returns the desired constant value.

I also often wish that this would be possible when someone adds a column
with a default value to a multi-million row table on a 24/7 production
system and insists on filling all existing columns with the default.

A rule "ON SELECT FROM table_x WHERE col_x IS NULL return
col_x=default_for_col_x" would solve that nicely.

This would even not require adding null bitmap to existing tuples with
no null values.

> In your example, I would expect to see project_id in a superclass table
> and so there would be no cost.
> 
> The idea is neat, but IMHO the potential saving of this idea is not big
> enough for me to prioritise that very highly over other items at this
> time.

I think that bringing the ON SELECT rules of form "ON SELECT ... RETURN
DEFAUL FOR COLUMN x" would be the cleanest and easiest way to do this.

Another use of SELECT rules would be introducing computed columns, which
can also be done by a NULL-filled column and ON SELECT rule using a
function.

We could additionally require the column on which this is defined to
have a "MUST BE NULL" constraint :) 

-- 
Hannu Krosing <hannu@skype.net>

Re: effective SELECT from child tables

From

Greg Stark

Date:

02 October 2005, 02:25:02

Hannu Krosing <hannu@skype.net> writes:

> I also often wish that this would be possible when someone adds a column
> with a default value to a multi-million row table on a 24/7 production
> system and insists on filling all existing columns with the default.
> 
> A rule "ON SELECT FROM table_x WHERE col_x IS NULL return
> col_x=default_for_col_x" would solve that nicely.

Of course that only works if the reason they want to set fill the rows with
the default value isn't precisely because NULL is a perfectly reasonable thing
for the column to have (but not what they want for the existing rows).

-- 
greg

Re: effective SELECT from child tables

From

mark@mark.mielke.cc

Date:

02 October 2005, 05:23:44

On Sat, Oct 01, 2005 at 04:35:49PM +0200, Martijn van Oosterhout wrote:
> On Sat, Oct 01, 2005 at 10:05:22AM -0400, mark@mark.mielke.cc wrote:
> > It has the 'side or additional benefit' being requested here. The ability
> > to filter the child table by some attribute. For example, if the child
> > tables are used for partitioning, and the attribute were to keep a date
> > range, the field restriction optimization could be used to automatically
> > determine the set of tables to use for the date range specified. With
> > such a change, it would even work automatically if the date ranges
> > overlapped for some reason. Selecting a table name by date is hacky. This
> > sort of solution would be a general solution to the problem.
> This is what "Constraint Exclusion" does. It uses CHECK constraints on
> a table to filter out tables that obviously don't apply to a query.

Good point. I'll have to invent another use case. :-)

> It's just the the specific case of "tableoid = XXX" is not supported
> right now.

Yes. This is what I was looking for Simon to admit. :-)

Cheers,
mark

-- 
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada
 One ring to rule them all, one ring to find them, one ring to bring them all                      and in the darkness
bindthem...
 
                          http://mark.mielke.cc/

Re: effective SELECT from child tables

From

Hannu Krosing

Date:

02 October 2005, 09:57:44

On P, 2005-10-02 at 01:24 -0400, Greg Stark wrote:
> Hannu Krosing <hannu@skype.net> writes:
> 
> > I also often wish that this would be possible when someone adds a column
> > with a default value to a multi-million row table on a 24/7 production
> > system and insists on filling all existing columns with the default.
> > 
> > A rule "ON SELECT FROM table_x WHERE col_x IS NULL return
> > col_x=default_for_col_x" would solve that nicely.
> 
> Of course that only works if the reason they want to set fill the rows with
> the default value isn't precisely because NULL is a perfectly reasonable thing
> for the column to have (but not what they want for the existing rows).

Sure.

What would be needed for adding new colums with default filling would be
some end-of-tuple marker or stored column count or tuple version nr, and
then a rule (or just default behaviour) of showing default value for
*missing* columns (colno > nr of stored columns).

-- 
Hannu Krosing <hannu@skype.net>

Re: effective SELECT from child tables

From

Martijn van Oosterhout

Date:

02 October 2005, 10:32:16

On Sun, Oct 02, 2005 at 03:57:37PM +0300, Hannu Krosing wrote:
> On P, 2005-10-02 at 01:24 -0400, Greg Stark wrote:
> > Of course that only works if the reason they want to set fill the rows with
> > the default value isn't precisely because NULL is a perfectly reasonable thing
> > for the column to have (but not what they want for the existing rows).
>
> Sure.
>
> What would be needed for adding new colums with default filling would be
> some end-of-tuple marker or stored column count or tuple version nr, and
> then a rule (or just default behaviour) of showing default value for
> *missing* columns (colno > nr of stored columns).

Actually, PostgreSQL does know the number of columns in a tuple. It
would be possible get change heap_getattr to return the default value.

However, from a semantic point of view, it would be a bit strange. If
you added a column, updated some rows then set a default, that default
might end up applying to every row, except the ones you already
modified. With careful coding you may be able to get around this.

However, a good argument can be made that setting the DEFAULT for a
column shouldn't change data anywhere.  What about if I want to change
the default for new values but not for old ones. That wouldn't work if
the database starts adding values randomly, depending on when they are
read...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: effective SELECT from child tables

From

Greg Stark

Date:

02 October 2005, 12:44:06

Martijn van Oosterhout <kleptog@svana.org> writes:

> However, from a semantic point of view, it would be a bit strange. If
> you added a column, updated some rows then set a default, that default
> might end up applying to every row, except the ones you already
> modified. With careful coding you may be able to get around this.

I suppose the update would have to check for these new columns and fill in the
default value when first updated.

But what do you do if the DEFAULT expression isn't immutable. Volatile DEFAULT
expressions are extremely common in databases... namely sequences.

It's too bad though. An efficient way to add a new column with a default would
clear up one of the missing feature of Postgres. Virtually every time I add a
new column I have to do this ALTER TABLE, ALTER COLUMN SET DEFAULT, UPDATE
TABLE, VACUUM FULL, ALTER COLUMN SET NOT NULL dance. It would be real nice if
you could add a new NOT NULL column with a default with a single command and
just move on.

-- 
greg

Re: effective SELECT from child tables

From

mark@mark.mielke.cc

Date:

02 October 2005, 18:13:12

If one defines a restriction such that 'COLUMN = VALUE' for a specific
table, in a theoretical sort of model that completely ignores
implementation difficulty, or changes to the restriction, I think it
would be safe to not store COLUMN in the tuple. If the tuple is
stored, then COLUMN = VALUE, so when fetched, the value will be VALUE.

Back to the real world, this would be difficult to implement without
treating the column special from the point of table creation, and
preventing the restriction from being altered without re-building
the table... :-)

Cheers,
mark

-- 
mark@mielke.cc / markm@ncf.ca / markm@nortel.com     __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada
 One ring to rule them all, one ring to find them, one ring to bring them all                      and in the darkness
bindthem...
 
                          http://mark.mielke.cc/

Re: effective SELECT from child tables

From

Martijn van Oosterhout

Date:

02 October 2005, 18:29:10

On Sun, Oct 02, 2005 at 11:51:27PM +0300, Hannu Krosing wrote:
> Right. Actually the "default" value returned for missing columns should
> be different from the default for new values and should be settable only
> once, when adding the column, else issues would become really really
> weird.

Right, the only way I could really imagine it working is have a flag,
attautodefault which if set would return the default instead of NULL.
It would only ever be set if the ADD COLUMN SET DEFAULT happened
together. But does this mean you have one magic default fixed for the
life of the column and the actual default which can be changed anytime?
Seems messy, though possible...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Re: effective SELECT from child tables

From

Hannu Krosing

Date:

02 October 2005, 18:37:14

On P, 2005-10-02 at 15:30 +0200, Martijn van Oosterhout wrote:
> On Sun, Oct 02, 2005 at 03:57:37PM +0300, Hannu Krosing wrote:
> > On P, 2005-10-02 at 01:24 -0400, Greg Stark wrote:
> > > Of course that only works if the reason they want to set fill the rows with
> > > the default value isn't precisely because NULL is a perfectly reasonable thing
> > > for the column to have (but not what they want for the existing rows).
> > 
> > Sure.
> > 
> > What would be needed for adding new colums with default filling would be
> > some end-of-tuple marker or stored column count or tuple version nr, and
> > then a rule (or just default behaviour) of showing default value for
> > *missing* columns (colno > nr of stored columns).
> 
> Actually, PostgreSQL does know the number of columns in a tuple. It
> would be possible get change heap_getattr to return the default value.

Good. (I have not looked at tuple layout for a few years :)

> However, from a semantic point of view, it would be a bit strange. If
> you added a column, updated some rows then set a default, that default
> might end up applying to every row, except the ones you already
> modified. With careful coding you may be able to get around this.

If I do the add column and set default in the same transaction, this
should not be a problem.

> However, a good argument can be made that setting the DEFAULT for a
> column shouldn't change data anywhere.  What about if I want to change
> the default for new values but not for old ones. That wouldn't work if
> the database starts adding values randomly, depending on when they are
> read...

Right. Actually the "default" value returned for missing columns should
be different from the default for new values and should be settable only
once, when adding the column, else issues would become really really
weird.

-- 
Hannu Krosing <hannu@skype.net>

Re: effective SELECT from child tables

From

Greg Stark

Date:

02 October 2005, 23:50:14

Martijn van Oosterhout <kleptog@svana.org> writes:

> On Sun, Oct 02, 2005 at 11:51:27PM +0300, Hannu Krosing wrote:
> > Right. Actually the "default" value returned for missing columns should
> > be different from the default for new values and should be settable only
> > once, when adding the column, else issues would become really really
> > weird.
> 
> Right, the only way I could really imagine it working is have a flag,
> attautodefault which if set would return the default instead of NULL.

No, I think redefining NULL is a non-starter. This whole thing only has legs
if Postgres can distinguish between a column that has never been set and a
column that has NULL.

Actually the only case where I don't see bizarre semantic consequences is the
case of a newly created column. It would be nice to be able to do:
 ALTER TABLE ADD foo integer DEFAULT 1

And there's no question of what what the semantics of this are. No question of
columns magically acquiring a value where they were NULL before.

The *main* reason I would be happy about this is that it would let me add NOT
NULL columns efficiently. I would expect NOT NULL to be allowed whenever a
default is provided since there's obviously no way it can cause a problem.

(I don't follow the "NULL is evil all the time" philosophy but I do try to set
columns NOT NULL wherever I can. It always annoys me that when adding a new
column I have to choose between a massive disk intensive batch job or
compromising the design.)

On the other hand if you do
ALTER TABLE ADD foo integer

and then later do
ALTER TABLE ALTER foo SET DEFAULT 1

then there is a window where all those foos are NULL and then they magically
become 1? That doesn't seem tenable.

-- 
greg

Re: effective SELECT from child tables

From

Tom Lane

Date:

03 October 2005, 00:02:16

Greg Stark <gsstark@mit.edu> writes:
> It would be nice to be able to do:
>   ALTER TABLE ADD foo integer DEFAULT 1
> And there's no question of what what the semantics of this are.

Sure, but you can only optimize this if the default expression is
immutable...

> On the other hand if you do
>  ALTER TABLE ADD foo integer
> and then later do
>  ALTER TABLE ALTER foo SET DEFAULT 1
> then there is a window where all those foos are NULL and then they magically
> become 1? That doesn't seem tenable.

It'd also be contrary to the SQL spec, AFAICS.

Here's another interesting case to think about:
ALTER TABLE ADD foo integer DEFAULT 1...ALTER TABLE ALTER foo SET DEFAULT 2

You'll have to pay the table-traversal cost on one step or the other.
        regards, tom lane

Re: effective SELECT from child tables

From

Greg Stark

Date:

03 October 2005, 02:36:32

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Here's another interesting case to think about:
> 
>     ALTER TABLE ADD foo integer DEFAULT 1
>     ...
>     ALTER TABLE ALTER foo SET DEFAULT 2
> 
> You'll have to pay the table-traversal cost on one step or the other.

A good point.

I wonder if this could be solved by recording the xid of the ALTER transaction
along with the default value. I imagine that might get complicated quickly
though.

-- 
greg

Re: effective SELECT from child tables

From

Hannu Krosing

Date:

03 October 2005, 17:34:17

On P, 2005-10-02 at 22:49 -0400, Greg Stark wrote:

> No, I think redefining NULL is a non-starter. This whole thing only has legs
> if Postgres can distinguish between a column that has never been set and a
> column that has NULL.
> 
> Actually the only case where I don't see bizarre semantic consequences is the
> case of a newly created column. It would be nice to be able to do:
> 
>   ALTER TABLE ADD foo integer DEFAULT 1

Probably a different syntax would be better here, perhaps

ALTER TABLE ADD foo integer AS 1 WHEN MISSING;

or somesuch.

> And there's no question of what what the semantics of this are. No question of
> columns magically acquiring a value where they were NULL before.
> 
> The *main* reason I would be happy about this is that it would let me add NOT
> NULL columns efficiently. I would expect NOT NULL to be allowed whenever a
> default is provided since there's obviously no way it can cause a problem.

> (I don't follow the "NULL is evil all the time" philosophy but I do try to set
> columns NOT NULL wherever I can. It always annoys me that when adding a new
> column I have to choose between a massive disk intensive batch job or
> compromising the design.)
> 
> 
> On the other hand if you do
> 
>  ALTER TABLE ADD foo integer
> 
> and then later do
> 
>  ALTER TABLE ALTER foo SET DEFAULT 1
> 
> then there is a window where all those foos are NULL and then they magically
> become 1? That doesn't seem tenable.

Not if you require these two to happen in one transaction to affect
added/missing values. 

-- 
Hannu Krosing <hannu@skype.net>

Re: effective SELECT from child tables

From

Hannu Krosing

Date:

03 October 2005, 17:44:58

On P, 2005-10-02 at 23:00 -0400, Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
> > It would be nice to be able to do:
> >   ALTER TABLE ADD foo integer DEFAULT 1
> > And there's no question of what what the semantics of this are.
> 
> Sure, but you can only optimize this if the default expression is
> immutable...
> 
> > On the other hand if you do
> >  ALTER TABLE ADD foo integer
> > and then later do
> >  ALTER TABLE ALTER foo SET DEFAULT 1
> > then there is a window where all those foos are NULL and then they magically
> > become 1? That doesn't seem tenable.
> 
> It'd also be contrary to the SQL spec, AFAICS.
> 
> Here's another interesting case to think about:
> 
>     ALTER TABLE ADD foo integer DEFAULT 1
>     ...
>     ALTER TABLE ALTER foo SET DEFAULT 2
>
> You'll have to pay the table-traversal cost on one step or the other.

The second, ALTER ... SET DEFAULT, would only set default for newly
inserted columns, not the ones which are missing due to tuples being
created before the column existed.

But completely different syntax may be more clear.

ALTER TABLE ADD foo integer WITH DEFAULT 1;

Or whatever 

-- 
Hannu Krosing <hannu@skype.net>

Re: effective SELECT from child tables

From

Greg Stark

Date:

04 October 2005, 00:22:53

Hannu Krosing <hannu@skype.net> writes:

> Probably a different syntax would be better here, perhaps
> 
> ALTER TABLE ADD foo integer AS 1 WHEN MISSING;
>
> or somesuch.

Uhm, if you're adding the column they're *all* "missing". That's the whole
point. If you start inventing a new user-visible concept "missing" and try to
distinguish it from NULL you're going to have a hell of a time defining the
semantics.

The goal has to be to provide the *exact* same user-visible semantics as
actually setting the default. That means setting all the existing rows if
you're adding a new column.

It also unfortunately means tackling the much trickier gotcha that Tom raised
about what happens if you want to later change the default.

-- 
greg

Re: effective SELECT from child tables

From

Greg Stark

Date:

04 October 2005, 00:24:38

Hannu Krosing <hannu@skype.net> writes:

> On P, 2005-10-02 at 23:00 -0400, Tom Lane wrote:
> > 
> > Here's another interesting case to think about:
> > 
> >     ALTER TABLE ADD foo integer DEFAULT 1
> >     ...
> >     ALTER TABLE ALTER foo SET DEFAULT 2
> >
> > You'll have to pay the table-traversal cost on one step or the other.
> 
> The second, ALTER ... SET DEFAULT, would only set default for newly
> inserted columns, not the ones which are missing due to tuples being
> created before the column existed.

Hm. So you're saying there are only ever exactly two types of defaults. The
"initial" default that applies to all tuples that were created before the
column was added. And the "current" default that only ever applies to newly
created tuples.

That does seem to cleanly close this hole.

-- 
greg

Re: effective SELECT from child tables

From

Rod Taylor

Date:

04 October 2005, 00:52:14

On Mon, 2005-10-03 at 23:24 -0400, Greg Stark wrote:
> Hannu Krosing <hannu@skype.net> writes:
> 
> > On P, 2005-10-02 at 23:00 -0400, Tom Lane wrote:
> > > 
> > > Here's another interesting case to think about:
> > > 
> > >     ALTER TABLE ADD foo integer DEFAULT 1
> > >     ...
> > >     ALTER TABLE ALTER foo SET DEFAULT 2
> > >
> > > You'll have to pay the table-traversal cost on one step or the other.
> > 
> > The second, ALTER ... SET DEFAULT, would only set default for newly
> > inserted columns, not the ones which are missing due to tuples being
> > created before the column existed.
> 
> Hm. So you're saying there are only ever exactly two types of defaults. The
> "initial" default that applies to all tuples that were created before the
> column was added. And the "current" default that only ever applies to newly
> created tuples.
> 
> That does seem to cleanly close this hole.

I don't think so.

ALTER TABLE tab ADD foo integer DEFAULT 1;
INSERT INTO tab DEFAULT VALUES;

ALTER TABLE tab ALTER foo SET DEFAULT 2
INSERT INTO tab DEFAULT VALUES;

ALTER TABLE tab ALTER foo SET DEFAULT 3
INSERT INTO tab DEFAULT VALUES;

SELECT foo FROM tab;
--

Re: effective SELECT from child tables

From

Rod Taylor

Date:

04 October 2005, 01:06:42

On Mon, 2005-10-03 at 23:51 -0400, Rod Taylor wrote:
> On Mon, 2005-10-03 at 23:24 -0400, Greg Stark wrote:
> > Hannu Krosing <hannu@skype.net> writes:
> > 
> > > On P, 2005-10-02 at 23:00 -0400, Tom Lane wrote:
> > > > 
> > > > Here's another interesting case to think about:
> > > > 
> > > >     ALTER TABLE ADD foo integer DEFAULT 1
> > > >     ...
> > > >     ALTER TABLE ALTER foo SET DEFAULT 2
> > > >
> > > > You'll have to pay the table-traversal cost on one step or the other.
> > > 
> > > The second, ALTER ... SET DEFAULT, would only set default for newly
> > > inserted columns, not the ones which are missing due to tuples being
> > > created before the column existed.
> > 
> > Hm. So you're saying there are only ever exactly two types of defaults. The
> > "initial" default that applies to all tuples that were created before the
> > column was added. And the "current" default that only ever applies to newly
> > created tuples.
> > 
> > That does seem to cleanly close this hole.
> 
> I don't think so.

Ignore me. The thread seems to be about allowing fast addition of
columns, not decreasing storage space.

For some reason I was thinking of a bitmap like the NULL bitmap for
compressing out all default values.


--

Re: effective SELECT from child tables

From

Greg Stark

Date:

04 October 2005, 01:13:02

Rod Taylor <pg@rbt.ca> writes:

> > Hm. So you're saying there are only ever exactly two types of defaults. The
> > "initial" default that applies to all tuples that were created before the
> > column was added. And the "current" default that only ever applies to newly
> > created tuples.
> > 
> > That does seem to cleanly close this hole.
> 
> I don't think so.
> 
> ALTER TABLE tab ADD foo integer DEFAULT 1;
> INSERT INTO tab DEFAULT VALUES;

This inserts a physical "1" in the record (the "current" default").

> ALTER TABLE tab ALTER foo SET DEFAULT 2
> INSERT INTO tab DEFAULT VALUES;

This inserts a physical "2" in the record.

> ALTER TABLE tab ALTER foo SET DEFAULT 3
> INSERT INTO tab DEFAULT VALUES;

This inserts a physical "3" in the record.

> SELECT foo FROM tab;

This checks for any old records that predate the column and use the "initial"
default of 1 for those records. The three records above all postdate the
column addition so they have values present, namely 1, 2, and 3.

-- 
greg

Re: effective SELECT from child tables

From

"Jim C. Nasby"

Date:

04 October 2005, 12:14:45

Is there enough for a TODO here?

On Mon, Oct 03, 2005 at 11:24:30PM -0400, Greg Stark wrote:
> 
> Hannu Krosing <hannu@skype.net> writes:
> 
> > On P, 2005-10-02 at 23:00 -0400, Tom Lane wrote:
> > > 
> > > Here's another interesting case to think about:
> > > 
> > >     ALTER TABLE ADD foo integer DEFAULT 1
> > >     ...
> > >     ALTER TABLE ALTER foo SET DEFAULT 2
> > >
> > > You'll have to pay the table-traversal cost on one step or the other.
> > 
> > The second, ALTER ... SET DEFAULT, would only set default for newly
> > inserted columns, not the ones which are missing due to tuples being
> > created before the column existed.
> 
> Hm. So you're saying there are only ever exactly two types of defaults. The
> "initial" default that applies to all tuples that were created before the
> column was added. And the "current" default that only ever applies to newly
> created tuples.
> 
> That does seem to cleanly close this hole.
> 
> -- 
> greg
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
> 

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: effective SELECT from child tables

From

"Jim C. Nasby"

Date:

04 October 2005, 12:27:34

On Sat, Oct 01, 2005 at 07:59:11PM +0100, Simon Riggs wrote:
> On Sat, 2005-10-01 at 10:57 -0500, Jim C. Nasby wrote:
> > To clarify, this is a hard-coded implementation of what I'm asking for:
> > http://cvs.distributed.net/viewcvs.cgi/stats-sql/logdb/ in a nutshell:
> > 
> > CREATE TABLE log_other (
> >     project_id  smallint NOT NULL
> >     ...
> > )
> > 
> > CREATE TABLE log_8 (
> >     -- No project_id
> >     ...
> > )
> > CREATE TABLE log_24, log_25, log_5...
> > CREATE VIEW log AS
> >     SELECT * FROM log_other
> >     UNION ALL SELECT 8 AS project_id, * FROM log_8
> >     ...
> > 
> > So the end result is that for cases where project_id is 5, 8, 24, or 25,
> > the data will be stored in tables that don't have the project_id.
> > 
> > If I were to use this on the main table for
> > http://stats.distributed.net, which has ~130M rows, I would be able to
> > save 130M*4 bytes (4 instead of 2 due to alignment), or 520MB. The logdb
> > will have many times that number of rows, so the savings will be even
> > larger.
> > 
> > Note that this technique wouldn't help at all for something like date
> > partitioning, because you have to store the date in the partitioned
> > table.
> 
> Jim,
> 
> Your idea was noted before and actually; I mentioned it to show that I
> listen and take note of ideas from any source.
> 
> For everybody, I would note that the current behaviour is exactly the
> way that List Partitioning works on other systems.
> 
> The cost of this technique is only paid if you choose to partition on
> something that you would not otherwise have included in your table. In
> many cases, you'll choose a column that would have been in the table if
> you created one big table so the additional cost is zero.

Well, the idea is to be more space efficient than if one big table was
used. This is unique to this class of partitioning problems.

> In your example, I would expect to see project_id in a superclass table
> and so there would be no cost.

Superclass table?

> The idea is neat, but IMHO the potential saving of this idea is not big
> enough for me to prioritise that very highly over other items at this
> time.

Certainly. I only chimed in with a specific example so people could
better understand what the idea was. I know it's on the list and might
be addressed at some point. In the mean time it's not too horrible to
hard-code a solution.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461