Thread: Minor TODO list changes

Minor TODO list changes

From
Simon Riggs
Date:
A few minor typos/notes:

INDEXES

1. On 2nd bullet...
"The main difficulty with this item is the problem of creating an index
that can spam more than one table."

should be span, not spam

2. On 6th bullet * "Use index to restrict rows returned by multi-key index when used   with non-consecutive keys to
reduceheap accesses 
 

For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and col3
= 9, spin though the index checking for col1 and col3 matches, rather
than just col1 "

This is also known as "skip-scanning", so it would be good to use that
phrase in the TODO

3. Whatever happened to the FILLFACTOR todo item? AFAICS it disappeared
sometime at the beginning of August, but I'm not sure why?

REF INTEGRITY

...Didn't we just get rid of deferred triggers?? Perhaps I read that
wrong.


CACHE

1. 1st bullet....prefix it with "Use posix_fadvise() API to let OS know
how files will be used by PostgreSQL. This would allow, for example,
a..." (replacing the word "add").



-- 
Best Regards, Simon Riggs



Re: Minor TODO list changes

From
Simon Riggs
Date:
On Thu, 2004-11-04 at 09:31, Simon Riggs wrote:
> A few minor typos/notes:
> 
> INDEXES
> 
> 1. On 2nd bullet...
> "The main difficulty with this item is the problem of creating an index
> that can spam more than one table."
> 
> should be span, not spam
> 
> 2. On 6th bullet
>   * "Use index to restrict rows returned by multi-key index when used
>     with non-consecutive keys to reduce heap accesses 
> 
> For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and col3
> = 9, spin though the index checking for col1 and col3 matches, rather
> than just col1 "
> 
> This is also known as "skip-scanning", so it would be good to use that
> phrase in the TODO
> 
> 3. Whatever happened to the FILLFACTOR todo item? AFAICS it disappeared
> sometime at the beginning of August, but I'm not sure why?

4. Multiple column index statistics

Allow accurate statistics to be collected on indexes that have more than
one column, so that they are more frequently selected for use.

(following on from Manfred Koizar's exploratory patch to provide
this...)

> 
> REF INTEGRITY
> 
> ...Didn't we just get rid of deferred triggers?? Perhaps I read that
> wrong.
> 
> 
> CACHE
> 
> 1. 1st bullet....prefix it with "Use posix_fadvise() API to let OS know
> how files will be used by PostgreSQL. This would allow, for example,
> a..." (replacing the word "add").
-- 
Best Regards, Simon Riggs



Re: Minor TODO list changes

From
Stephan Szabo
Date:
On Thu, 4 Nov 2004, Simon Riggs wrote:

> REF INTEGRITY
>
> ...Didn't we just get rid of deferred triggers?? Perhaps I read that
> wrong.

We got rid of deferred referential actions.  Constraint check triggers
for referential integrity (insert/update to fk table, NO ACTION on pk
table) are still deferrable.


Re: Minor TODO list changes

From
Bruce Momjian
Date:
OK, I updated all your items.  I removed fillfactor because I thought I
was the only one who thought it was valuable and as I remember it was
mostly useful for ISAM, which we don't support.  Can you think of a use
for a non-100% fillfactor?

---------------------------------------------------------------------------

Simon Riggs wrote:
> A few minor typos/notes:
> 
> INDEXES
> 
> 1. On 2nd bullet...
> "The main difficulty with this item is the problem of creating an index
> that can spam more than one table."
> 
> should be span, not spam
> 
> 2. On 6th bullet
>   * "Use index to restrict rows returned by multi-key index when used
>     with non-consecutive keys to reduce heap accesses 
> 
> For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and col3
> = 9, spin though the index checking for col1 and col3 matches, rather
> than just col1 "
> 
> This is also known as "skip-scanning", so it would be good to use that
> phrase in the TODO
> 
> 3. Whatever happened to the FILLFACTOR todo item? AFAICS it disappeared
> sometime at the beginning of August, but I'm not sure why?
> 
> REF INTEGRITY
> 
> ...Didn't we just get rid of deferred triggers?? Perhaps I read that
> wrong.
> 
> 
> CACHE
> 
> 1. 1st bullet....prefix it with "Use posix_fadvise() API to let OS know
> how files will be used by PostgreSQL. This would allow, for example,
> a..." (replacing the word "add").
> 
> 
> 
> -- 
> Best Regards, Simon Riggs
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Minor TODO list changes

From
Bruce Momjian
Date:
Simon Riggs wrote:
> 
> 4. Multiple column index statistics
> 
> Allow accurate statistics to be collected on indexes that have more than
> one column, so that they are more frequently selected for use.
> 
> (following on from Manfred Koizar's exploratory patch to provide
> this...)

Added.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Minor TODO list changes

From
Bruce Momjian
Date:
Added:
* Add fillfactor to control reserved free space during index  creation


---------------------------------------------------------------------------

Kenneth Marshall wrote:
> Bruce,
> 
> Just to chime in. I also agree that fillfactor is useful. I have
> been investigating different index variants and different fill
> factors can greatly influence the performance of the index. I
> also think it may play a key role in minimizing the small table/
> many inserts/updates performance problem.
> 
> --Ken
> 
> On Thu, Nov 04, 2004 at 11:51:15AM -0500, Bruce Momjian wrote:
> > 
> > OK, I updated all your items.  I removed fillfactor because I thought I
> > was the only one who thought it was valuable and as I remember it was
> > mostly useful for ISAM, which we don't support.  Can you think of a use
> > for a non-100% fillfactor?
> > 
> > ---------------------------------------------------------------------------
> > 
> > Simon Riggs wrote:
> > > A few minor typos/notes:
> > > 
> > > INDEXES
> > > 
> > > 1. On 2nd bullet...
> > > "The main difficulty with this item is the problem of creating an index
> > > that can spam more than one table."
> > > 
> > > should be span, not spam
> > > 
> > > 2. On 6th bullet
> > >   * "Use index to restrict rows returned by multi-key index when used
> > >     with non-consecutive keys to reduce heap accesses 
> > > 
> > > For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and col3
> > > = 9, spin though the index checking for col1 and col3 matches, rather
> > > than just col1 "
> > > 
> > > This is also known as "skip-scanning", so it would be good to use that
> > > phrase in the TODO
> > > 
> > > 3. Whatever happened to the FILLFACTOR todo item? AFAICS it disappeared
> > > sometime at the beginning of August, but I'm not sure why?
> > > 
> > > REF INTEGRITY
> > > 
> > > ...Didn't we just get rid of deferred triggers?? Perhaps I read that
> > > wrong.
> > > 
> > > 
> > > CACHE
> > > 
> > > 1. 1st bullet....prefix it with "Use posix_fadvise() API to let OS know
> > > how files will be used by PostgreSQL. This would allow, for example,
> > > a..." (replacing the word "add").
> > > 
> > > 
> > > 
> > > -- 
> > > Best Regards, Simon Riggs
> > > 
> > > 
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 2: you can get off all lists at once with the unregister command
> > >     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> > > 
> > 
> > -- 
> >   Bruce Momjian                        |  http://candle.pha.pa.us
> >   pgman@candle.pha.pa.us               |  (610) 359-1001
> >   +  If your life is a hard drive,     |  13 Roberts Road
> >   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
> > 
> > ---------------------------(end of broadcast)---------------------------
> > TIP 8: explain analyze is your friend
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Minor TODO list changes

From
Simon Riggs
Date:
On Thu, 2004-11-04 at 16:51, Bruce Momjian wrote:
> OK, I updated all your items.  

Thanks

> I removed fillfactor because I thought I
> was the only one who thought it was valuable and as I remember it was
> mostly useful for ISAM, which we don't support.  Can you think of a use
> for a non-100% fillfactor?
> 

I was under the impression the factor was 67% for data loaded on the
leading-edge of an index, and 50% for other INSERTs.
(backend/access/nbtree/nbtinsert.c)

Not sure, without checking, what CREATE INDEX and COPY do, but I'm
guessing it is similar?

Other RDBMS use a higher leading-edge/standard fill factor.

There are situations where I'd want to set it at 90%, or even 100%. If I
know the update rate is likely to be zero, then I'd like my indexes to
fit in 10-30% less memory and disk, please.

Or am I missing something?

-- 
Best Regards, Simon Riggs



Re: Minor TODO list changes

From
Bruce Momjian
Date:
Simon Riggs wrote:
> On Thu, 2004-11-04 at 16:51, Bruce Momjian wrote:
> > OK, I updated all your items.  
> 
> Thanks
> 
> > I removed fillfactor because I thought I
> > was the only one who thought it was valuable and as I remember it was
> > mostly useful for ISAM, which we don't support.  Can you think of a use
> > for a non-100% fillfactor?
> > 
> 
> I was under the impression the factor was 67% for data loaded on the
> leading-edge of an index, and 50% for other INSERTs.
> (backend/access/nbtree/nbtinsert.c)
> 
> Not sure, without checking, what CREATE INDEX and COPY do, but I'm
> guessing it is similar?
> 
> Other RDBMS use a higher leading-edge/standard fill factor.
> 
> There are situations where I'd want to set it at 90%, or even 100%. If I
> know the update rate is likely to be zero, then I'd like my indexes to
> fit in 10-30% less memory and disk, please.
> 
> Or am I missing something?

Oh, good point.  I was thinking of just the leaf pages which I think are
100% filled.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Minor TODO list changes

From
"Darren King"
Date:
In my data warehousing situation, I'd like to be able to specify that
the indexes be as compact as possible (fillfactor = 100%) in order to
hit as few index pages as necessary.

For summary tables there will not be any more inserts or deletions so
the index will not change either.  In that case, there's no point to
leaving any extra room for page-splitting.

At some point it would also be nice to be able to mark tables as
read-only and then any indexes created on that table after that would
have a fillfactor of 100%.  Then I'd be able to load the table, alter it
to be read-only, then add the appropriate indexes that are automatically
compacted.

Darren

-----Original Message-----
From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
Sent: Thursday, November 04, 2004 12:19 PM
To: Simon Riggs
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Minor TODO list changes

Simon Riggs wrote:
> On Thu, 2004-11-04 at 16:51, Bruce Momjian wrote:
> > OK, I updated all your items.
>
> Thanks
>
> > I removed fillfactor because I thought I was the only one who
> > thought it was valuable and as I remember it was mostly useful for
> > ISAM, which we don't support.  Can you think of a use for a non-100%

> > fillfactor?
> >
>
> I was under the impression the factor was 67% for data loaded on the
> leading-edge of an index, and 50% for other INSERTs.
> (backend/access/nbtree/nbtinsert.c)
>
> Not sure, without checking, what CREATE INDEX and COPY do, but I'm
> guessing it is similar?
>
> Other RDBMS use a higher leading-edge/standard fill factor.
>
> There are situations where I'd want to set it at 90%, or even 100%. If

> I know the update rate is likely to be zero, then I'd like my indexes
> to fit in 10-30% less memory and disk, please.
>
> Or am I missing something?

Oh, good point.  I was thinking of just the leaf pages which I think are
100% filled.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania
19073

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
              http://www.postgresql.org/docs/faqs/FAQ.html




Re: Minor TODO list changes

From
Rod Taylor
Date:
> At some point it would also be nice to be able to mark tables as
> read-only and then any indexes created on that table after that would
> have a fillfactor of 100%.  Then I'd be able to load the table, alter it
> to be read-only, then add the appropriate indexes that are automatically
> compacted.

If it's read-only, you might as well remove a number of the internal
fields used for visibility as well.




Re: Minor TODO list changes

From
Simon Riggs
Date:
On Thu, 2004-11-04 at 17:59, Darren King wrote:
> In my data warehousing situation, I'd like to be able to specify that
> the indexes be as compact as possible (fillfactor = 100%) in order to
> hit as few index pages as necessary.
> 

Yes, that's my intent.

> At some point it would also be nice to be able to mark tables as
> read-only and then any indexes created on that table after that would
> have a fillfactor of 100%.  Then I'd be able to load the table, alter it
> to be read-only, then add the appropriate indexes that are automatically
> compacted.
> 

Sounds cool. Good idea,

-- 
Best Regards, Simon Riggs



Re: Minor TODO list changes

From
Simon Riggs
Date:
On Thu, 2004-11-04 at 18:15, Rod Taylor wrote:
> > At some point it would also be nice to be able to mark tables as
> > read-only and then any indexes created on that table after that would
> > have a fillfactor of 100%.  Then I'd be able to load the table, alter it
> > to be read-only, then add the appropriate indexes that are automatically
> > compacted.
> 
> If it's read-only, you might as well remove a number of the internal
> fields used for visibility as well.

Yes, should be able to save 16 bytes/row for an INSERT only table that
would still allow multiple simultaneous COPY jobs against it, with no
more than 1 statement per transaction. I'd like to create an additional
tuple layout using the tuple version bits, so you'd be able to set a
flag at CREATE TABLE time to use that as an optional alternative from
the standard one. UPDATEs and DELETEs would be permanently disallowed
against such tables, just as if privileges had not been granted.
TRUNCATE would still work, however.

Call it something like NOMODIFY? You could then alter VACUUM to skip
such tables, so you'd be able to do a VACUUM database without scanning
all of the largest tables in your system.

Darren's first idea would then be interpreted as automatically setting
FILLFACTOR=100 on indexes of NOMODIFY tables.

Darren's second idea was dynamic: i.e. an ALTER TABLE READONLY after
loading, rather than using a different tuple layout, which would need to
be done before loading, probably at CREATE TABLE time.

Darren's second idea of READONLY tables is related, but not necessarily
the same as the NOMODIFY concept that Rod brings up.

-- 
Best Regards, Simon Riggs



Re: Minor TODO list changes

From
"Darren King"
Date:
You are correct.

I would envision being able to alter a table "read-write" at any point.
If the index(es) on the table are completely filled from being created
in read-only mode, then the affected pages should be split with the
default fillfactor when/if a row is inserted or updated.  Altering the
table back to read-only would simple leave the index as is with a few
pages not filled, but still fully functional.

The insert-only index is intriguing as well though.

Darren

-----Original Message-----
From: Simon Riggs [mailto:simon@2ndquadrant.com]
Sent: Thursday, November 04, 2004 1:50 PM
To: Rod Taylor
Cc: Darren King; Bruce Momjian; PostgreSQL Development
Subject: Re: [HACKERS] Minor TODO list changes

On Thu, 2004-11-04 at 18:15, Rod Taylor wrote:
> > At some point it would also be nice to be able to mark tables as
> > read-only and then any indexes created on that table after that
> > would have a fillfactor of 100%.  Then I'd be able to load the
> > table, alter it to be read-only, then add the appropriate indexes
> > that are automatically compacted.
>
> If it's read-only, you might as well remove a number of the internal
> fields used for visibility as well.

Yes, should be able to save 16 bytes/row for an INSERT only table that
would still allow multiple simultaneous COPY jobs against it, with no
more than 1 statement per transaction. I'd like to create an additional
tuple layout using the tuple version bits, so you'd be able to set a
flag at CREATE TABLE time to use that as an optional alternative from
the standard one. UPDATEs and DELETEs would be permanently disallowed
against such tables, just as if privileges had not been granted.
TRUNCATE would still work, however.

Call it something like NOMODIFY? You could then alter VACUUM to skip
such tables, so you'd be able to do a VACUUM database without scanning
all of the largest tables in your system.

Darren's first idea would then be interpreted as automatically setting
FILLFACTOR=100 on indexes of NOMODIFY tables.

Darren's second idea was dynamic: i.e. an ALTER TABLE READONLY after
loading, rather than using a different tuple layout, which would need to
be done before loading, probably at CREATE TABLE time.

Darren's second idea of READONLY tables is related, but not necessarily
the same as the NOMODIFY concept that Rod brings up.

--
Best Regards, Simon Riggs





Re: Minor TODO list changes

From
Kenneth Marshall
Date:
Bruce,

Just to chime in. I also agree that fillfactor is useful. I have
been investigating different index variants and different fill
factors can greatly influence the performance of the index. I
also think it may play a key role in minimizing the small table/
many inserts/updates performance problem.

--Ken

On Thu, Nov 04, 2004 at 11:51:15AM -0500, Bruce Momjian wrote:
> 
> OK, I updated all your items.  I removed fillfactor because I thought I
> was the only one who thought it was valuable and as I remember it was
> mostly useful for ISAM, which we don't support.  Can you think of a use
> for a non-100% fillfactor?
> 
> ---------------------------------------------------------------------------
> 
> Simon Riggs wrote:
> > A few minor typos/notes:
> > 
> > INDEXES
> > 
> > 1. On 2nd bullet...
> > "The main difficulty with this item is the problem of creating an index
> > that can spam more than one table."
> > 
> > should be span, not spam
> > 
> > 2. On 6th bullet
> >   * "Use index to restrict rows returned by multi-key index when used
> >     with non-consecutive keys to reduce heap accesses 
> > 
> > For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and col3
> > = 9, spin though the index checking for col1 and col3 matches, rather
> > than just col1 "
> > 
> > This is also known as "skip-scanning", so it would be good to use that
> > phrase in the TODO
> > 
> > 3. Whatever happened to the FILLFACTOR todo item? AFAICS it disappeared
> > sometime at the beginning of August, but I'm not sure why?
> > 
> > REF INTEGRITY
> > 
> > ...Didn't we just get rid of deferred triggers?? Perhaps I read that
> > wrong.
> > 
> > 
> > CACHE
> > 
> > 1. 1st bullet....prefix it with "Use posix_fadvise() API to let OS know
> > how files will be used by PostgreSQL. This would allow, for example,
> > a..." (replacing the word "add").
> > 
> > 
> > 
> > -- 
> > Best Regards, Simon Riggs
> > 
> > 
> > ---------------------------(end of broadcast)---------------------------
> > TIP 2: you can get off all lists at once with the unregister command
> >     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> > 
> 
> -- 
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 359-1001
>   +  If your life is a hard drive,     |  13 Roberts Road
>   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend