Re: Dead Space Map version 2 - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Dead Space Map version 2
Date
Msg-id 1172563897.3760.540.camel@silverbirch.site
In response to Dead Space Map version 2  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
Responses Re: Dead Space Map version 2  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
List pgsql-hackers
On Tue, 2007-02-27 at 12:05 +0900, ITAGAKI Takahiro wrote:

> If we combine this with the HOT patch, pages with HOT tuples are probably
> marked as UNFROZEN because we don't bother vacuuming HOT tuples. They can
> be removed incrementally and don't require explicit vacuums.

Perhaps avoid DSM entries for HOT updates completely?

> VACUUM commands
> ---------------
> 
> VACUUM now scans only the pages that may contain dead tuples.
> VACUUM ALL, a new syntax, behaves the same as VACUUM did before.
> 
> - VACUUM FULL : Not changed. Scans all pages and compacts them.
> - VACUUM ALL  : Scans all pages; behaves like the previous VACUUM.
> - VACUUM      : Usually scans only HIGH pages, but also scans LOW and
>                 UNFROZEN pages when vacuuming to prevent XID wraparound.

Sounds good.
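
As a rough illustration of the scan rules quoted above, here is a minimal sketch in C. The enum and function names are made up for this example and are not taken from the patch; DSM_CLEAN in particular is a hypothetical name for any page that is in none of the three states the proposal mentions.

```c
#include <assert.h>
#include <stdbool.h>

/* HIGH, LOW and UNFROZEN are the states named in the proposal.
 * DSM_CLEAN is a hypothetical stand-in for "nothing to do on this page". */
typedef enum { DSM_CLEAN, DSM_UNFROZEN, DSM_LOW, DSM_HIGH } DsmPageState;

/* The three commands discussed above. */
typedef enum { VACUUM_PLAIN, VACUUM_ALL, VACUUM_FULL } VacuumMode;

/*
 * Returns true if a vacuum of the given mode should visit a page in the
 * given state. for_wraparound is true when the vacuum is being run to
 * prevent XID wraparound.
 */
bool
dsm_should_scan(VacuumMode mode, DsmPageState state, bool for_wraparound)
{
    /* VACUUM ALL and VACUUM FULL ignore the map and scan every page. */
    if (mode == VACUUM_ALL || mode == VACUUM_FULL)
        return true;

    switch (state)
    {
        case DSM_HIGH:
            return true;            /* always worth visiting */
        case DSM_LOW:
        case DSM_UNFROZEN:
            return for_wraparound;  /* only for anti-wraparound vacuums */
        case DSM_CLEAN:
            return false;
    }
    assert(!"unreachable: unknown DsmPageState");
    return false;
}
```

With these rules a plain VACUUM touches LOW and UNFROZEN pages only on an anti-wraparound run, which is where the saving over VACUUM ALL comes from.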

> Performance issues
> ------------------
> 
> * Enable/Disable DSM tracking per table
>     DSM requires some additional work. If we know of specific tables
>     where DSM does not work well, e.g. heavily updated small tables, we can
>     disable DSM for them. The syntax is:
>       ALTER TABLE name SET (dsm=true/false);

How about a dsm_tracking_limit GUC? (Better name please)
It would set the number of pages a table must have before we start
tracking DSM entries for it. DSM only gives worthwhile benefits for larger
tables anyway, so let the user define what large means for them.
dsm_tracking_limit = 1000 by default.
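
A minimal sketch of how such a limit might be checked, in C. dsm_tracking_limit is only the name suggested here, not an existing setting, and dsm_should_track() is a hypothetical helper. The sketch assumes tracking starts once the table has reached the limit; at the default 8 kB block size, 1000 pages is roughly 8 MB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical GUC: minimum table size, in pages, before DSM tracking
 * starts. 1000 is the default suggested above. */
static int dsm_tracking_limit = 1000;

/*
 * Returns true if a relation of rel_pages pages is large enough to be
 * worth tracking in the DSM. Assumes "at least the limit" semantics.
 */
bool
dsm_should_track(uint32_t rel_pages)
{
    assert(dsm_tracking_limit >= 0);
    return rel_pages >= (uint32_t) dsm_tracking_limit;
}
```

Setting the limit to 0 would track every table, so the per-table switch and the limit could coexist.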

> * Dead Space State Cache
>     The DSM management module is guarded by one LWLock, DeadSpaceLock.
>     Almost all accesses to DSM require only a shared lock, but the frequency
>     of shared locking was very high (tied with BufMappingLock) in my research.
>     To avoid the lock contention, I added a cache of dead space state in
>     BufferDesc flags. Backends see the flags first, and avoid locking if no
>     need to 

ISTM there should be a point at which the DSM is so full that we don't
bother to keep track any longer, so we can drop that information. For
example, if a user runs UPDATE without a WHERE clause, there's no point in
tracking the whole relation.
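
One way the "stop tracking" point could look, as a sketch only. The struct, the function and the 50 percent threshold are all hypothetical, and the sketch assumes the caller reports each page once, when it first gains dead space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold: give up once this share of pages is dirty. */
#define DSM_GIVE_UP_PERCENT 50

/* Hypothetical per-relation summary, not a structure from the patch. */
typedef struct
{
    uint32_t total_pages;   /* pages in the relation */
    uint32_t dirty_pages;   /* pages recorded as holding dead space */
    bool     track_all;     /* entries dropped; treat every page as dirty */
} DsmRelSummary;

/*
 * Record that one more page of the relation now has dead space. Once the
 * dirty share reaches the threshold, per-page tracking is abandoned and
 * the whole relation is flagged for a full scan instead.
 */
void
dsm_note_dirty_page(DsmRelSummary *rel)
{
    assert(rel != NULL);

    if (rel->track_all)
        return;                 /* already given up, nothing to record */

    if (rel->dirty_pages < rel->total_pages)
        rel->dirty_pages++;

    /* 64-bit arithmetic so the multiplication cannot overflow. */
    if ((uint64_t) rel->dirty_pages * 100 >=
        (uint64_t) rel->total_pages * DSM_GIVE_UP_PERCENT)
    {
        rel->track_all = true;
        rel->dirty_pages = 0;   /* per-page detail is no longer kept */
    }
}
```

An UPDATE without a WHERE clause would trip the flag partway through, after which further page reports for that relation cost nothing.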

> Memory management
> -----------------
> 
> In the current implementation, DSM allocates a block of memory at startup
> and we cannot resize it while running. That may be enough, because DSM
> consumes very little memory -- 32MB of memory per 1TB of database.

That sounds fine.
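
For reference, the 32MB per 1TB figure works out to 2 bits per page at the default 8 kB block size. The 2 bits is inferred from the quoted numbers rather than stated in the proposal, and dsm_bytes_for() is a hypothetical helper that just does the arithmetic:

```c
#include <assert.h>
#include <stdint.h>

#define BLCKSZ            8192  /* PostgreSQL's default block size */
#define DSM_BITS_PER_PAGE 2     /* inferred from 32MB per 1TB, an assumption */

/*
 * Bytes of DSM needed to cover db_bytes of heap, rounding up to whole
 * pages and whole bytes.
 */
uint64_t
dsm_bytes_for(uint64_t db_bytes)
{
    assert(db_bytes <= UINT64_MAX - BLCKSZ);    /* keep the round-up safe */

    uint64_t pages = (db_bytes + BLCKSZ - 1) / BLCKSZ;
    uint64_t bits = pages * DSM_BITS_PER_PAGE;

    return (bits + 7) / 8;
}
```

For 1TB that is 2^27 pages, 2^28 bits, and so 2^25 bytes, which is 32MB.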

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com



