Thread: TODO Item - Add system view to show free space map contents

TODO Item - Add system view to show free space map contents

From
Mark Kirkwood
Date:
This patch implements a view to display the free space map contents - e.g.:

     regression=# SELECT c.relname, m.relblocknumber, m.blockfreebytes
                  FROM pg_freespacemap m INNER JOIN pg_class c
                  ON c.relfilenode = m.relfilenode LIMIT 10;
         relname             | relblocknumber | blockfreebytes
     ------------------------+----------------+----------------
     sql_features            |              5 |           2696
     sql_implementation_info |              0 |           7104
     sql_languages           |              0 |           8016
     sql_packages            |              0 |           7376
     sql_sizing              |              0 |           6032
     pg_authid               |              0 |           7424
     pg_toast_2618           |             13 |           4588
     pg_toast_2618           |             12 |           1680
     pg_toast_2618           |             10 |           1436
     pg_toast_2618           |              7 |           1136
     (10 rows)

[I found being able to display the FSM pretty cool, even if I say so
myself....].

It is written as a contrib module (similar to pg_buffercache) so that
any revisions do not require an initdb.

The code needs to know about several of the (currently) internal data
structures in freespace.c, so I moved these into freespace.h. Similarly
for the handy macros to actually compute the free space. Let me know if
this was the wrong way to proceed!

Additionally, access to the FSM pointer itself is required. I added a
function in freespace.c to return this, rather than making it globally
visible. Again, if the latter is a better approach, it is easily changed.

cheers

Mark

P.S.: I currently don't have access to a Windows box, so I had to just 'take
a stab' at what DLLIMPORTs were required.


diff -Ncar pgsql.orig/contrib/pg_freespacemap/Makefile pgsql/contrib/pg_freespacemap/Makefile
*** pgsql.orig/contrib/pg_freespacemap/Makefile    Thu Jan  1 12:00:00 1970
--- pgsql/contrib/pg_freespacemap/Makefile    Thu Oct 27 17:52:10 2005
***************
*** 0 ****
--- 1,17 ----
+ # $PostgreSQL$
+
+ MODULE_big = pg_freespacemap
+ OBJS    = pg_freespacemap.o
+
+ DATA_built = pg_freespacemap.sql
+ DOCS = README.pg_freespacemap
+
+ ifdef USE_PGXS
+ PGXS := $(shell pg_config --pgxs)
+ include $(PGXS)
+ else
+ subdir = contrib/pg_freespacemap
+ top_builddir = ../..
+ include $(top_builddir)/src/Makefile.global
+ include $(top_srcdir)/contrib/contrib-global.mk
+ endif
diff -Ncar pgsql.orig/contrib/pg_freespacemap/README.pg_freespacemap pgsql/contrib/pg_freespacemap/README.pg_freespacemap
*** pgsql.orig/contrib/pg_freespacemap/README.pg_freespacemap    Thu Jan  1 12:00:00 1970
--- pgsql/contrib/pg_freespacemap/README.pg_freespacemap    Thu Oct 27 18:06:20 2005
***************
*** 0 ****
--- 1,98 ----
+ Pg_freespacemap - Real time queries on the free space map (FSM).
+ ---------------
+
+   This module consists of a C function 'pg_freespacemap()' that returns
+   a set of records, and a view 'pg_freespacemap' to wrap the function.
+
+   The module provides the ability to examine the contents of the free space
+   map, without having to restart or rebuild the server with additional
+   debugging code.
+
+   By default public access is REVOKED from both of these, just in case there
+   are security issues lurking.
+
+
+ Installation
+ ------------
+
+   Build and install the main PostgreSQL source, then this contrib module:
+
+   $ cd contrib/pg_freespacemap
+   $ gmake
+   $ gmake install
+
+
+   To register the functions:
+
+   $ psql -d <database> -f pg_freespacemap.sql
+
+
+ Notes
+ -----
+
+   The definition of the columns exposed in the view is:
+
+        Column     |  references          | Description
+   ----------------+----------------------+------------------------------------
+    blockid        |                      | Id, 0 .. max_fsm_pages - 1
+    relfilenode    | pg_class.relfilenode | Relfilenode of the relation.
+    reltablespace  | pg_tablespace.oid    | Tablespace oid of the relation.
+    reldatabase    | pg_database.oid      | Database for the relation.
+    relblocknumber |                      | Offset of the page in the relation.
+    blockfreebytes |                      | Free bytes in the block/page.
+
+
+   There is one row for each page in the free space map.
+
+   Because the map is shared by all the databases, there are pages from
+   relations not belonging to the current database.
+
+   When the pg_freespacemap view is accessed, internal free space map locks are
+   taken, and a copy of the map data is made for the view to display.
+   This ensures that the view produces a consistent set of results, while not
+   blocking normal activity longer than necessary.  Nonetheless there
+   could be some impact on database performance if this view is read often.
+
+
+ Sample output
+ -------------
+
+   regression=# \d pg_freespacemap
+       View "public.pg_freespacemap"
+       Column     |  Type   | Modifiers
+   ---------------+---------+-----------
+   blockid        | integer |
+   relfilenode    | oid     |
+   reltablespace  | oid     |
+   reldatabase    | oid     |
+   relblocknumber | bigint  |
+   blockfreebytes | integer |
+  View definition:
+   SELECT p.blockid, p.relfilenode, p.reltablespace, p.reldatabase, p.relblocknumber, p.blockfreebytes
+     FROM pg_freespacemap() p(blockid integer, relfilenode oid, reltablespace oid, reldatabase oid, relblocknumber bigint, blockfreebytes integer);
+
+   regression=# SELECT c.relname, m.relblocknumber, m.blockfreebytes
+                FROM pg_freespacemap m INNER JOIN pg_class c
+                ON c.relfilenode = m.relfilenode LIMIT 10;
+       relname             | relblocknumber | blockfreebytes
+   ------------------------+----------------+----------------
+   sql_features            |              5 |           2696
+   sql_implementation_info |              0 |           7104
+   sql_languages           |              0 |           8016
+   sql_packages            |              0 |           7376
+   sql_sizing              |              0 |           6032
+   pg_authid               |              0 |           7424
+   pg_toast_2618           |             13 |           4588
+   pg_toast_2618           |             12 |           1680
+   pg_toast_2618           |             10 |           1436
+   pg_toast_2618           |              7 |           1136
+   (10 rows)
+
+   regression=#
+
+
+ Author
+ ------
+
+   * Mark Kirkwood <markir@paradise.net.nz>
+
diff -Ncar pgsql.orig/contrib/pg_freespacemap/pg_freespacemap.c pgsql/contrib/pg_freespacemap/pg_freespacemap.c
*** pgsql.orig/contrib/pg_freespacemap/pg_freespacemap.c    Thu Jan  1 12:00:00 1970
--- pgsql/contrib/pg_freespacemap/pg_freespacemap.c    Thu Oct 27 18:07:05 2005
***************
*** 0 ****
--- 1,231 ----
+ /*-------------------------------------------------------------------------
+  *
+  * pg_freespacemap.c
+  *      display some contents of the free space map.
+  *
+  *      $PostgreSQL$
+  *-------------------------------------------------------------------------
+  */
+ #include "postgres.h"
+ #include "funcapi.h"
+ #include "catalog/pg_type.h"
+ #include "storage/freespace.h"
+ #include "utils/relcache.h"
+
+ #define        NUM_FREESPACE_PAGES_ELEM     6
+
+ #if defined(WIN32) || defined(__CYGWIN__)
+ extern DLLIMPORT volatile uint32 InterruptHoldoffCount;
+ #endif
+
+ Datum        pg_freespacemap(PG_FUNCTION_ARGS);
+
+
+ /*
+  * Record structure holding the to be exposed free space data.
+  */
+ typedef struct
+ {
+
+     uint32                blockid;
+     uint32                relfilenode;
+     uint32                reltablespace;
+     uint32                reldatabase;
+     uint32                relblocknumber;
+     uint32                blockfreebytes;
+
+ }    FreeSpacePagesRec;
+
+
+ /*
+  * Function context for data persisting over repeated calls.
+  */
+ typedef struct
+ {
+
+     AttInMetadata         *attinmeta;
+     FreeSpacePagesRec    *record;
+     char                   *values[NUM_FREESPACE_PAGES_ELEM];
+
+ }    FreeSpacePagesContext;
+
+
+ /*
+  * Function returning data from the Free Space Map (FSM).
+  */
+ PG_FUNCTION_INFO_V1(pg_freespacemap);
+ Datum
+ pg_freespacemap(PG_FUNCTION_ARGS)
+ {
+
+     FuncCallContext            *funcctx;
+     Datum                    result;
+     MemoryContext             oldcontext;
+     FreeSpacePagesContext    *fctx;                /* User function context. */
+     TupleDesc                tupledesc;
+     HeapTuple                tuple;
+
+     FSMHeader                *FreeSpaceMap;         /* FSM main structure. */
+     FSMRelation                *fsmrel;            /* Individual relation. */
+
+
+     if (SRF_IS_FIRSTCALL())
+     {
+         uint32                i;
+         uint32                numPages;    /* Max possible no. of pages in map. */
+         int                    nPages;        /* Mapped pages for a relation. */
+
+         /*
+          * Get the free space map data structure.
+          */
+         FreeSpaceMap = GetFreeSpaceMap();
+
+         numPages = MaxFSMPages;
+
+         funcctx = SRF_FIRSTCALL_INIT();
+
+         /* Switch context when allocating stuff to be used in later calls */
+         oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+         /* Construct a tuple to return. */
+         tupledesc = CreateTemplateTupleDesc(NUM_FREESPACE_PAGES_ELEM, false);
+         TupleDescInitEntry(tupledesc, (AttrNumber) 1, "blockid",
+                            INT4OID, -1, 0);
+         TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
+                            OIDOID, -1, 0);
+         TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
+                            OIDOID, -1, 0);
+         TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
+                            OIDOID, -1, 0);
+         TupleDescInitEntry(tupledesc, (AttrNumber) 5, "relblocknumber",
+                            INT8OID, -1, 0);
+         TupleDescInitEntry(tupledesc, (AttrNumber) 6, "blockfreebytes",
+                            INT4OID, -1, 0);
+
+         /* Generate attribute metadata needed later to produce tuples */
+         funcctx->attinmeta = TupleDescGetAttInMetadata(tupledesc);
+
+         /*
+          * Create a function context for cross-call persistence and initialize
+          * the counters.
+          */
+         fctx = (FreeSpacePagesContext *) palloc(sizeof(FreeSpacePagesContext));
+         funcctx->user_fctx = fctx;
+
+         /* Set an upper bound on the calls */
+         funcctx->max_calls = numPages;
+
+
+         /* Allocate numPages worth of FreeSpacePagesRec records, this is also
+          * an upper bound.
+          */
+         fctx->record = (FreeSpacePagesRec *) palloc(sizeof(FreeSpacePagesRec) * numPages);
+
+         /* allocate the strings for tuple formation */
+         fctx->values[0] = (char *) palloc(3 * sizeof(uint32) + 1);
+         fctx->values[1] = (char *) palloc(3 * sizeof(uint32) + 1);
+         fctx->values[2] = (char *) palloc(3 * sizeof(uint32) + 1);
+         fctx->values[3] = (char *) palloc(3 * sizeof(uint32) + 1);
+         fctx->values[4] = (char *) palloc(3 * sizeof(uint32) + 1);
+         fctx->values[5] = (char *) palloc(3 * sizeof(uint32) + 1);
+
+
+         /* Return to original context when allocating transient memory */
+         MemoryContextSwitchTo(oldcontext);
+
+
+         /*
+          * Lock the free space map and scan through all the relations;
+          * for each relation, get all of its mapped pages.
+          */
+         LWLockAcquire(FreeSpaceLock, LW_EXCLUSIVE);
+
+
+         i = 0;
+
+         for (fsmrel = FreeSpaceMap->usageList; fsmrel; fsmrel = fsmrel->nextUsage)
+         {
+
+             if (fsmrel->isIndex)
+             {    /* Index relation. */
+                 IndexFSMPageData *page;
+
+                 page = (IndexFSMPageData *)
+                         (FreeSpaceMap->arena + fsmrel->firstChunk * CHUNKBYTES);
+
+                 for (nPages = 0; nPages < fsmrel->storedPages; nPages++)
+                 {
+
+                     fctx->record[i].blockid = i;
+                     fctx->record[i].relfilenode = fsmrel->key.relNode;
+                     fctx->record[i].reltablespace = fsmrel->key.spcNode;
+                     fctx->record[i].reldatabase = fsmrel->key.dbNode;
+                     fctx->record[i].relblocknumber = IndexFSMPageGetPageNum(page);
+                     fctx->record[i].blockfreebytes = 0;    /* index.*/
+
+                     page++;
+                     i++;
+                 }
+             }
+             else
+             {    /* Heap relation. */
+                 FSMPageData *page;
+
+                 page = (FSMPageData *)
+                         (FreeSpaceMap->arena + fsmrel->firstChunk * CHUNKBYTES);
+
+                 for (nPages = 0; nPages < fsmrel->storedPages; nPages++)
+                 {
+                     fctx->record[i].blockid = i;
+                     fctx->record[i].relfilenode = fsmrel->key.relNode;
+                     fctx->record[i].reltablespace = fsmrel->key.spcNode;
+                     fctx->record[i].reldatabase = fsmrel->key.dbNode;
+                     fctx->record[i].relblocknumber = FSMPageGetPageNum(page);
+                     fctx->record[i].blockfreebytes = FSMPageGetSpace(page);
+
+                     page++;
+                     i++;
+                 }
+
+             }
+
+         }
+
+         /* Set the real no. of calls as we know it now! */
+         funcctx->max_calls = i;
+
+         /* Release free space map. */
+         LWLockRelease(FreeSpaceLock);
+     }
+
+     funcctx = SRF_PERCALL_SETUP();
+
+     /* Get the saved state */
+     fctx = funcctx->user_fctx;
+
+
+     if (funcctx->call_cntr < funcctx->max_calls)
+     {
+         uint32        i = funcctx->call_cntr;
+
+
+         sprintf(fctx->values[0], "%u", fctx->record[i].blockid);
+         sprintf(fctx->values[1], "%u", fctx->record[i].relfilenode);
+         sprintf(fctx->values[2], "%u", fctx->record[i].reltablespace);
+         sprintf(fctx->values[3], "%u", fctx->record[i].reldatabase);
+         sprintf(fctx->values[4], "%u", fctx->record[i].relblocknumber);
+         sprintf(fctx->values[5], "%u", fctx->record[i].blockfreebytes);
+
+
+
+         /* Build and return the tuple. */
+         tuple = BuildTupleFromCStrings(funcctx->attinmeta, fctx->values);
+         result = HeapTupleGetDatum(tuple);
+
+
+         SRF_RETURN_NEXT(funcctx, result);
+     }
+     else
+         SRF_RETURN_DONE(funcctx);
+
+ }
diff -Ncar pgsql.orig/contrib/pg_freespacemap/pg_freespacemap.sql.in pgsql/contrib/pg_freespacemap/pg_freespacemap.sql.in
*** pgsql.orig/contrib/pg_freespacemap/pg_freespacemap.sql.in    Thu Jan  1 12:00:00 1970
--- pgsql/contrib/pg_freespacemap/pg_freespacemap.sql.in    Thu Oct 27 18:07:43 2005
***************
*** 0 ****
--- 1,17 ----
+ -- Adjust this setting to control where the objects get created.
+ SET search_path = public;
+
+ -- Register the function.
+ CREATE OR REPLACE FUNCTION pg_freespacemap()
+ RETURNS SETOF RECORD
+ AS 'MODULE_PATHNAME', 'pg_freespacemap'
+ LANGUAGE 'C';
+
+ -- Create a view for convenient access.
+ CREATE VIEW pg_freespacemap AS
+     SELECT P.* FROM pg_freespacemap() AS P
+      (blockid int4, relfilenode oid, reltablespace oid, reldatabase oid, relblocknumber int8, blockfreebytes int4);
+
+ -- Don't want these to be available to public.
+ REVOKE ALL ON FUNCTION pg_freespacemap() FROM PUBLIC;
+ REVOKE ALL ON pg_freespacemap FROM PUBLIC;
diff -Ncar pgsql.orig/src/backend/storage/freespace/freespace.c pgsql/src/backend/storage/freespace/freespace.c
*** pgsql.orig/src/backend/storage/freespace/freespace.c    Thu Oct 20 12:25:06 2005
--- pgsql/src/backend/storage/freespace/freespace.c    Thu Oct 27 17:51:33 2005
***************
*** 71,114 ****
  #include "storage/shmem.h"


- /* Initial value for average-request moving average */
- #define INITIAL_AVERAGE ((Size) (BLCKSZ / 32))
-
- /*
-  * Number of pages and bytes per allocation chunk.    Indexes can squeeze 50%
-  * more pages into the same space because they don't need to remember how much
-  * free space on each page.  The nominal number of pages, CHUNKPAGES, is for
-  * regular rels, and INDEXCHUNKPAGES is for indexes.  CHUNKPAGES should be
-  * even so that no space is wasted in the index case.
-  */
- #define CHUNKPAGES    16
- #define CHUNKBYTES    (CHUNKPAGES * sizeof(FSMPageData))
- #define INDEXCHUNKPAGES ((int) (CHUNKBYTES / sizeof(IndexFSMPageData)))
-
-
- /*
-  * Typedefs and macros for items in the page-storage arena.  We use the
-  * existing ItemPointer and BlockId data structures, which are designed
-  * to pack well (they should be 6 and 4 bytes apiece regardless of machine
-  * alignment issues).  Unfortunately we can't use the ItemPointer access
-  * macros, because they include Asserts insisting that ip_posid != 0.
-  */
- typedef ItemPointerData FSMPageData;
- typedef BlockIdData IndexFSMPageData;
-
- #define FSMPageGetPageNum(ptr)    \
-     BlockIdGetBlockNumber(&(ptr)->ip_blkid)
- #define FSMPageGetSpace(ptr)    \
-     ((Size) (ptr)->ip_posid)
- #define FSMPageSetPageNum(ptr, pg)    \
-     BlockIdSet(&(ptr)->ip_blkid, pg)
- #define FSMPageSetSpace(ptr, sz)    \
-     ((ptr)->ip_posid = (OffsetNumber) (sz))
- #define IndexFSMPageGetPageNum(ptr) \
-     BlockIdGetBlockNumber(ptr)
- #define IndexFSMPageSetPageNum(ptr, pg) \
-     BlockIdSet(ptr, pg)
-
  /*----------
   * During database shutdown, we store the contents of FSM into a disk file,
   * which is re-read during startup.  This way we don't have a startup
--- 71,76 ----
***************
*** 156,218 ****
      int32        storedPages;    /* # of pages stored in arena */
  } FsmCacheRelHeader;

-
- /*
-  * Shared free-space-map objects
-  *
-  * The per-relation objects are indexed by a hash table, and are also members
-  * of two linked lists: one ordered by recency of usage (most recent first),
-  * and the other ordered by physical location of the associated storage in
-  * the page-info arena.
-  *
-  * Each relation owns one or more chunks of per-page storage in the "arena".
-  * The chunks for each relation are always consecutive, so that it can treat
-  * its page storage as a simple array.    We further insist that its page data
-  * be ordered by block number, so that binary search is possible.
-  *
-  * Note: we handle pointers to these items as pointers, not as SHMEM_OFFSETs.
-  * This assumes that all processes accessing the map will have the shared
-  * memory segment mapped at the same place in their address space.
-  */
- typedef struct FSMHeader FSMHeader;
- typedef struct FSMRelation FSMRelation;
-
- /* Header for whole map */
- struct FSMHeader
- {
-     FSMRelation *usageList;        /* FSMRelations in usage-recency order */
-     FSMRelation *usageListTail; /* tail of usage-recency list */
-     FSMRelation *firstRel;        /* FSMRelations in arena storage order */
-     FSMRelation *lastRel;        /* tail of storage-order list */
-     int            numRels;        /* number of FSMRelations now in use */
-     double        sumRequests;    /* sum of requested chunks over all rels */
-     char       *arena;            /* arena for page-info storage */
-     int            totalChunks;    /* total size of arena, in chunks */
-     int            usedChunks;        /* # of chunks assigned */
-     /* NB: there are totalChunks - usedChunks free chunks at end of arena */
- };
-
- /*
-  * Per-relation struct --- this is an entry in the shared hash table.
-  * The hash key is the RelFileNode value (hence, we look at the physical
-  * relation ID, not the logical ID, which is appropriate).
-  */
- struct FSMRelation
- {
-     RelFileNode key;            /* hash key (must be first) */
-     FSMRelation *nextUsage;        /* next rel in usage-recency order */
-     FSMRelation *priorUsage;    /* prior rel in usage-recency order */
-     FSMRelation *nextPhysical;    /* next rel in arena-storage order */
-     FSMRelation *priorPhysical; /* prior rel in arena-storage order */
-     bool        isIndex;        /* if true, we store only page numbers */
-     Size        avgRequest;        /* moving average of space requests */
-     int            lastPageCount;    /* pages passed to RecordRelationFreeSpace */
-     int            firstChunk;        /* chunk # of my first chunk in arena */
-     int            storedPages;    /* # of pages stored in arena */
-     int            nextPage;        /* index (from 0) to start next search at */
- };
-
-
  int            MaxFSMRelations;    /* these are set by guc.c */
  int            MaxFSMPages;

--- 118,123 ----
***************
*** 1835,1840 ****
--- 1740,1756 ----
          Assert(fsmrel->firstChunk < 0 && fsmrel->storedPages == 0);
          return 0;
      }
+ }
+
+
+ /*
+  * Return the FreeSpaceMap structure for examination.
+  */
+ FSMHeader *
+ GetFreeSpaceMap(void)
+ {
+
+     return FreeSpaceMap;
  }


diff -Ncar pgsql.orig/src/include/storage/freespace.h pgsql/src/include/storage/freespace.h
*** pgsql.orig/src/include/storage/freespace.h    Tue Aug 23 15:56:23 2005
--- pgsql/src/include/storage/freespace.h    Thu Oct 27 17:51:33 2005
***************
*** 16,21 ****
--- 16,22 ----

  #include "storage/block.h"
  #include "storage/relfilenode.h"
+ #include "storage/itemptr.h"


  /*
***************
*** 28,33 ****
--- 29,129 ----
  } PageFreeSpaceInfo;


+ /* Initial value for average-request moving average */
+ #define INITIAL_AVERAGE ((Size) (BLCKSZ / 32))
+
+ /*
+  * Number of pages and bytes per allocation chunk.    Indexes can squeeze 50%
+  * more pages into the same space because they don't need to remember how much
+  * free space on each page.  The nominal number of pages, CHUNKPAGES, is for
+  * regular rels, and INDEXCHUNKPAGES is for indexes.  CHUNKPAGES should be
+  * even so that no space is wasted in the index case.
+  */
+ #define CHUNKPAGES    16
+ #define CHUNKBYTES    (CHUNKPAGES * sizeof(FSMPageData))
+ #define INDEXCHUNKPAGES ((int) (CHUNKBYTES / sizeof(IndexFSMPageData)))
+
+
+ /*
+  * Typedefs and macros for items in the page-storage arena.  We use the
+  * existing ItemPointer and BlockId data structures, which are designed
+  * to pack well (they should be 6 and 4 bytes apiece regardless of machine
+  * alignment issues).  Unfortunately we can't use the ItemPointer access
+  * macros, because they include Asserts insisting that ip_posid != 0.
+  */
+ typedef ItemPointerData FSMPageData;
+ typedef BlockIdData IndexFSMPageData;
+
+ #define FSMPageGetPageNum(ptr)    \
+     BlockIdGetBlockNumber(&(ptr)->ip_blkid)
+ #define FSMPageGetSpace(ptr)    \
+     ((Size) (ptr)->ip_posid)
+ #define FSMPageSetPageNum(ptr, pg)    \
+     BlockIdSet(&(ptr)->ip_blkid, pg)
+ #define FSMPageSetSpace(ptr, sz)    \
+     ((ptr)->ip_posid = (OffsetNumber) (sz))
+ #define IndexFSMPageGetPageNum(ptr) \
+     BlockIdGetBlockNumber(ptr)
+ #define IndexFSMPageSetPageNum(ptr, pg) \
+     BlockIdSet(ptr, pg)
+
+ /*
+  * Shared free-space-map objects
+  *
+  * The per-relation objects are indexed by a hash table, and are also members
+  * of two linked lists: one ordered by recency of usage (most recent first),
+  * and the other ordered by physical location of the associated storage in
+  * the page-info arena.
+  *
+  * Each relation owns one or more chunks of per-page storage in the "arena".
+  * The chunks for each relation are always consecutive, so that it can treat
+  * its page storage as a simple array.    We further insist that its page data
+  * be ordered by block number, so that binary search is possible.
+  *
+  * Note: we handle pointers to these items as pointers, not as SHMEM_OFFSETs.
+  * This assumes that all processes accessing the map will have the shared
+  * memory segment mapped at the same place in their address space.
+  */
+ typedef struct FSMHeader FSMHeader;
+ typedef struct FSMRelation FSMRelation;
+
+ /* Header for whole map */
+ struct FSMHeader
+ {
+     FSMRelation *usageList;        /* FSMRelations in usage-recency order */
+     FSMRelation *usageListTail; /* tail of usage-recency list */
+     FSMRelation *firstRel;        /* FSMRelations in arena storage order */
+     FSMRelation *lastRel;        /* tail of storage-order list */
+     int            numRels;        /* number of FSMRelations now in use */
+     double        sumRequests;    /* sum of requested chunks over all rels */
+     char       *arena;            /* arena for page-info storage */
+     int            totalChunks;    /* total size of arena, in chunks */
+     int            usedChunks;        /* # of chunks assigned */
+     /* NB: there are totalChunks - usedChunks free chunks at end of arena */
+ };
+
+ /*
+  * Per-relation struct --- this is an entry in the shared hash table.
+  * The hash key is the RelFileNode value (hence, we look at the physical
+  * relation ID, not the logical ID, which is appropriate).
+  */
+ struct FSMRelation
+ {
+     RelFileNode key;            /* hash key (must be first) */
+     FSMRelation *nextUsage;        /* next rel in usage-recency order */
+     FSMRelation *priorUsage;    /* prior rel in usage-recency order */
+     FSMRelation *nextPhysical;    /* next rel in arena-storage order */
+     FSMRelation *priorPhysical; /* prior rel in arena-storage order */
+     bool        isIndex;        /* if true, we store only page numbers */
+     Size        avgRequest;        /* moving average of space requests */
+     int            lastPageCount;    /* pages passed to RecordRelationFreeSpace */
+     int            firstChunk;        /* chunk # of my first chunk in arena */
+     int            storedPages;    /* # of pages stored in arena */
+     int            nextPage;        /* index (from 0) to start next search at */
+ };
+
+
+
  /* GUC variables */
  extern int    MaxFSMRelations;
  extern int    MaxFSMPages;
***************
*** 62,67 ****
--- 158,164 ----

  extern void DumpFreeSpaceMap(int code, Datum arg);
  extern void LoadFreeSpaceMap(void);
+ extern FSMHeader *GetFreeSpaceMap(void);

  #ifdef FREESPACE_DEBUG
  extern void DumpFreeSpace(void);


Re: TODO Item - Add system view to show free space map contents

From
"Jim C. Nasby"
Date:
Shouldn't the DDL in pg_freespacemap.sql.in be wrapped in a transaction?
Specifically I'm considering the case of the script stopping before the
REVOKEs.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: TODO Item - Add system view to show free space map

From
Mark Kirkwood
Date:
Jim C. Nasby wrote:
> Shouldn't the DDL in pg_freespacemap.sql.in be wrapped in a transaction?
> Specifically I'm considering the case of the script stopping before the
> REVOKEs.

That's a nice idea (I probably should have done it in pg_buffercache...)!

Re: TODO Item - Add system view to show free space map

From
Christopher Kings-Lynne
Date:
Want to host it on pgfoundry until 8.2 is released?

Mark Kirkwood wrote:
> This patch implements a view to display the free space map contents - e.g :
>
>     regression=# SELECT c.relname, m.relblocknumber, m.blockfreebytes
>                  FROM pg_freespacemap m INNER JOIN pg_class c
>                  ON c.relfilenode = m.relfilenode LIMIT 10;
>         relname             | relblocknumber | blockfreebytes
>     ------------------------+----------------+----------------
>     sql_features            |              5 |           2696
>     sql_implementation_info |              0 |           7104
>     sql_languages           |              0 |           8016
>     sql_packages            |              0 |           7376
>     sql_sizing              |              0 |           6032
>     pg_authid               |              0 |           7424
>     pg_toast_2618           |             13 |           4588
>     pg_toast_2618           |             12 |           1680
>     pg_toast_2618           |             10 |           1436
>     pg_toast_2618           |              7 |           1136
>     (10 rows)
>
> [I found being able to display the FSM pretty cool, even if I say so
> myself....].
>
> It is written as a contrib module (similar to pg_buffercache) so as to
> make any revisions non-initdb requiring.
>
> The code needs to know about several of the (currently) internal data
> structures in freespace.c, so I moved these into freespace.h. Similarly
> for the handy macros to actually compute the free space. Let me know if
> this was the wrong way to proceed!
>
> Additionally access to the FSM pointer itself is required, I added a
> function in freespace.c to return this, rather than making it globally
> visible, again if the latter is a better approach, it is easily changed.
>
> cheers
>
> Mark
>
> P.s : Currently don't have access to a windows box, so had to just 'take
> a stab' at what DLLIMPORTs were required.
>
>
>
> ------------------------------------------------------------------------
>
> diff -Ncar pgsql.orig/contrib/pg_freespacemap/Makefile pgsql/contrib/pg_freespacemap/Makefile
> *** pgsql.orig/contrib/pg_freespacemap/Makefile    Thu Jan  1 12:00:00 1970
> --- pgsql/contrib/pg_freespacemap/Makefile    Thu Oct 27 17:52:10 2005
> ***************
> *** 0 ****
> --- 1,17 ----
> + # $PostgreSQL$
> +
> + MODULE_big = pg_freespacemap
> + OBJS    = pg_freespacemap.o
> +
> + DATA_built = pg_freespacemap.sql
> + DOCS = README.pg_freespacemap
> +
> + ifdef USE_PGXS
> + PGXS := $(shell pg_config --pgxs)
> + include $(PGXS)
> + else
> + subdir = contrib/pg_freespacemap
> + top_builddir = ../..
> + include $(top_builddir)/src/Makefile.global
> + include $(top_srcdir)/contrib/contrib-global.mk
> + endif
> diff -Ncar pgsql.orig/contrib/pg_freespacemap/README.pg_freespacemap
pgsql/contrib/pg_freespacemap/README.pg_freespacemap
> *** pgsql.orig/contrib/pg_freespacemap/README.pg_freespacemap    Thu Jan  1 12:00:00 1970
> --- pgsql/contrib/pg_freespacemap/README.pg_freespacemap    Thu Oct 27 18:06:20 2005
> ***************
> *** 0 ****
> --- 1,98 ----
> + Pg_freespacemap - Real time queries on the free space map (FSM).
> + ---------------
> +
> +   This module consists of a C function 'pg_freespacemap()' that returns
> +   a set of records, and a view 'pg_freespacemap' to wrapper the function.
> +
> +   The module provides the ability to examine the contents of the free space
> +   map, without having to restart or rebuild the server with additional
> +   debugging code.
> +
> +   By default public access is REVOKED from both of these, just in case there
> +   are security issues lurking.
> +
> +
> + Installation
> + ------------
> +
> +   Build and install the main Postgresql source, then this contrib module:
> +
> +   $ cd contrib/pg_freespacemap
> +   $ gmake
> +   $ gmake install
> +
> +
> +   To register the functions:
> +
> +   $ psql -d <database> -f pg_freespacemap.sql
> +
> +
> + Notes
> + -----
> +
> +   The definition of the columns exposed in the view is:
> +
> +        Column     |  references          | Description
> +   ----------------+----------------------+------------------------------------
> +    blockid        |                      | Id, 1.. max_fsm_pages
> +    relfilenode    | pg_class.relfilenode | Refilenode of the relation.
> +    reltablespace  | pg_tablespace.oid    | Tablespace oid of the relation.
> +    reldatabase    | pg_database.oid      | Database for the relation.
> +    relblocknumber |                      | Offset of the page in the relation.
> +    blockfreebytes |                      | Free bytes in the block/page.
> +
> +
> +   There is one row for each page in the free space map.
> +
> +   Because the map is shared by all the databases, there are pages from
> +   relations not belonging to the current database.
> +
> +   When the pg_freespacemap view is accessed, internal free space map locks are
> +   taken, and a copy of the map data is made for the view to display.
> +   This ensures that the view produces a consistent set of results, while not
> +   blocking normal activity longer than necessary.  Nonetheless there
> +   could be some impact on database performance if this view is read often.
> +
> +
> + Sample output
> + -------------
> +
> +   regression=# \d pg_freespacemap
> +       View "public.pg_freespacemap"
> +       Column     |  Type   | Modifiers
> +   ---------------+---------+-----------
> +   blockid        | integer |
> +   relfilenode    | oid     |
> +   reltablespace  | oid     |
> +   reldatabase    | oid     |
> +   relblocknumber | bigint  |
> +   blockfreebytes | integer |
> +  View definition:
> +   SELECT p.blockid, p.relfilenode, p.reltablespace, p.reldatabase, p.relblocknumber, p.blockfreebytes
> +     FROM pg_freespacemap() p(blockid integer, relfilenode oid, reltablespace oid, reldatabase oid, relblocknumber bigint, blockfreebytes integer);
> +
> +   regression=# SELECT c.relname, m.relblocknumber, m.blockfreebytes
> +                FROM pg_freespacemap m INNER JOIN pg_class c
> +                ON c.relfilenode = m.relfilenode LIMIT 10;
> +       relname             | relblocknumber | blockfreebytes
> +   ------------------------+----------------+----------------
> +   sql_features            |              5 |           2696
> +   sql_implementation_info |              0 |           7104
> +   sql_languages           |              0 |           8016
> +   sql_packages            |              0 |           7376
> +   sql_sizing              |              0 |           6032
> +   pg_authid               |              0 |           7424
> +   pg_toast_2618           |             13 |           4588
> +   pg_toast_2618           |             12 |           1680
> +   pg_toast_2618           |             10 |           1436
> +   pg_toast_2618           |              7 |           1136
> +   (10 rows)
> +
> +   regression=#
> +
> +
> + Author
> + ------
> +
> +   * Mark Kirkwood <markir@paradise.net.nz>
> +
> diff -Ncar pgsql.orig/contrib/pg_freespacemap/pg_freespacemap.c pgsql/contrib/pg_freespacemap/pg_freespacemap.c
> *** pgsql.orig/contrib/pg_freespacemap/pg_freespacemap.c    Thu Jan  1 12:00:00 1970
> --- pgsql/contrib/pg_freespacemap/pg_freespacemap.c    Thu Oct 27 18:07:05 2005
> ***************
> *** 0 ****
> --- 1,231 ----
> + /*-------------------------------------------------------------------------
> +  *
> +  * pg_freespacemap.c
> +  *      display some contents of the free space map.
> +  *
> +  *      $PostgreSQL$
> +  *-------------------------------------------------------------------------
> +  */
> + #include "postgres.h"
> + #include "funcapi.h"
> + #include "catalog/pg_type.h"
> + #include "storage/freespace.h"
> + #include "utils/relcache.h"
> +
> + #define        NUM_FREESPACE_PAGES_ELEM     6
> +
> + #if defined(WIN32) || defined(__CYGWIN__)
> + extern DLLIMPORT volatile uint32 InterruptHoldoffCount;
> + #endif
> +
> + Datum        pg_freespacemap(PG_FUNCTION_ARGS);
> +
> +
> + /*
> +  * Record structure holding the to be exposed free space data.
> +  */
> + typedef struct
> + {
> +
> +     uint32                blockid;
> +     uint32                relfilenode;
> +     uint32                reltablespace;
> +     uint32                reldatabase;
> +     uint32                relblocknumber;
> +     uint32                blockfreebytes;
> +
> + }    FreeSpacePagesRec;
> +
> +
> + /*
> +  * Function context for data persisting over repeated calls.
> +  */
> + typedef struct
> + {
> +
> +     AttInMetadata         *attinmeta;
> +     FreeSpacePagesRec    *record;
> +     char                   *values[NUM_FREESPACE_PAGES_ELEM];
> +
> + }    FreeSpacePagesContext;
> +
> +
> + /*
> +  * Function returning data from the Free Space Map (FSM).
> +  */
> + PG_FUNCTION_INFO_V1(pg_freespacemap);
> + Datum
> + pg_freespacemap(PG_FUNCTION_ARGS)
> + {
> +
> +     FuncCallContext            *funcctx;
> +     Datum                    result;
> +     MemoryContext             oldcontext;
> +     FreeSpacePagesContext    *fctx;                /* User function context. */
> +     TupleDesc                tupledesc;
> +     HeapTuple                tuple;
> +
> +     FSMHeader                *FreeSpaceMap;         /* FSM main structure. */
> +     FSMRelation                *fsmrel;            /* Individual relation. */
> +
> +
> +     if (SRF_IS_FIRSTCALL())
> +     {
> +         uint32                i;
> +         uint32                numPages;    /* Max possible no. of pages in map. */
> +         int                    nPages;        /* Mapped pages for a relation. */
> +
> +         /*
> +          * Get the free space map data structure.
> +          */
> +         FreeSpaceMap = GetFreeSpaceMap();
> +
> +         numPages = MaxFSMPages;
> +
> +         funcctx = SRF_FIRSTCALL_INIT();
> +
> +         /* Switch context when allocating stuff to be used in later calls */
> +         oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
> +
> +         /* Construct a tuple to return. */
> +         tupledesc = CreateTemplateTupleDesc(NUM_FREESPACE_PAGES_ELEM, false);
> +         TupleDescInitEntry(tupledesc, (AttrNumber) 1, "blockid",
> +                            INT4OID, -1, 0);
> +         TupleDescInitEntry(tupledesc, (AttrNumber) 2, "relfilenode",
> +                            OIDOID, -1, 0);
> +         TupleDescInitEntry(tupledesc, (AttrNumber) 3, "reltablespace",
> +                            OIDOID, -1, 0);
> +         TupleDescInitEntry(tupledesc, (AttrNumber) 4, "reldatabase",
> +                            OIDOID, -1, 0);
> +         TupleDescInitEntry(tupledesc, (AttrNumber) 5, "relblocknumber",
> +                            INT8OID, -1, 0);
> +         TupleDescInitEntry(tupledesc, (AttrNumber) 6, "blockfreebytes",
> +                            INT4OID, -1, 0);
> +
> +         /* Generate attribute metadata needed later to produce tuples */
> +         funcctx->attinmeta = TupleDescGetAttInMetadata(tupledesc);
> +
> +         /*
> +          * Create a function context for cross-call persistence and initialize
> +          * the counters.
> +          */
> +         fctx = (FreeSpacePagesContext *) palloc(sizeof(FreeSpacePagesContext));
> +         funcctx->user_fctx = fctx;
> +
> +         /* Set an upper bound on the calls */
> +         funcctx->max_calls = numPages;
> +
> +
> +         /* Allocate numPages worth of FreeSpacePagesRec records, this is also
> +          * an upper bound.
> +          */
> +         fctx->record = (FreeSpacePagesRec *) palloc(sizeof(FreeSpacePagesRec) * numPages);
> +
> +         /* allocate the strings for tuple formation */
> +         fctx->values[0] = (char *) palloc(3 * sizeof(uint32) + 1);
> +         fctx->values[1] = (char *) palloc(3 * sizeof(uint32) + 1);
> +         fctx->values[2] = (char *) palloc(3 * sizeof(uint32) + 1);
> +         fctx->values[3] = (char *) palloc(3 * sizeof(uint32) + 1);
> +         fctx->values[4] = (char *) palloc(3 * sizeof(uint32) + 1);
> +         fctx->values[5] = (char *) palloc(3 * sizeof(uint32) + 1);
> +
> +
> +         /* Return to original context when allocating transient memory */
> +         MemoryContextSwitchTo(oldcontext);
> +
> +
> +         /*
> +          * Lock the free space map and scan through all the relations;
> +          * for each relation, get all its mapped pages.
> +          */
> +         LWLockAcquire(FreeSpaceLock, LW_EXCLUSIVE);
> +
> +
> +         i = 0;
> +
> +         for (fsmrel = FreeSpaceMap->usageList; fsmrel; fsmrel = fsmrel->nextUsage)
> +         {
> +
> +             if (fsmrel->isIndex)
> +             {    /* Index relation. */
> +                 IndexFSMPageData *page;
> +
> +                 page = (IndexFSMPageData *)
> +                         (FreeSpaceMap->arena + fsmrel->firstChunk * CHUNKBYTES);
> +
> +                 for (nPages = 0; nPages < fsmrel->storedPages; nPages++)
> +                 {
> +
> +                     fctx->record[i].blockid = i;
> +                     fctx->record[i].relfilenode = fsmrel->key.relNode;
> +                     fctx->record[i].reltablespace = fsmrel->key.spcNode;
> +                     fctx->record[i].reldatabase = fsmrel->key.dbNode;
> +                     fctx->record[i].relblocknumber = IndexFSMPageGetPageNum(page);
> +                     fctx->record[i].blockfreebytes = 0;    /* index.*/
> +
> +                     page++;
> +                     i++;
> +                 }
> +             }
> +             else
> +             {    /* Heap relation. */
> +                 FSMPageData *page;
> +
> +                 page = (FSMPageData *)
> +                         (FreeSpaceMap->arena + fsmrel->firstChunk * CHUNKBYTES);
> +
> +                 for (nPages = 0; nPages < fsmrel->storedPages; nPages++)
> +                 {
> +                     fctx->record[i].blockid = i;
> +                     fctx->record[i].relfilenode = fsmrel->key.relNode;
> +                     fctx->record[i].reltablespace = fsmrel->key.spcNode;
> +                     fctx->record[i].reldatabase = fsmrel->key.dbNode;
> +                     fctx->record[i].relblocknumber = FSMPageGetPageNum(page);
> +                     fctx->record[i].blockfreebytes = FSMPageGetSpace(page);
> +
> +                     page++;
> +                     i++;
> +                 }
> +
> +             }
> +
> +         }
> +
> +         /* Set the real no. of calls as we know it now! */
> +         funcctx->max_calls = i;
> +
> +         /* Release free space map. */
> +         LWLockRelease(FreeSpaceLock);
> +     }
> +
> +     funcctx = SRF_PERCALL_SETUP();
> +
> +     /* Get the saved state */
> +     fctx = funcctx->user_fctx;
> +
> +
> +     if (funcctx->call_cntr < funcctx->max_calls)
> +     {
> +         uint32        i = funcctx->call_cntr;
> +
> +
> +         sprintf(fctx->values[0], "%u", fctx->record[i].blockid);
> +         sprintf(fctx->values[1], "%u", fctx->record[i].relfilenode);
> +         sprintf(fctx->values[2], "%u", fctx->record[i].reltablespace);
> +         sprintf(fctx->values[3], "%u", fctx->record[i].reldatabase);
> +         sprintf(fctx->values[4], "%u", fctx->record[i].relblocknumber);
> +         sprintf(fctx->values[5], "%u", fctx->record[i].blockfreebytes);
> +
> +
> +
> +         /* Build and return the tuple. */
> +         tuple = BuildTupleFromCStrings(funcctx->attinmeta, fctx->values);
> +         result = HeapTupleGetDatum(tuple);
> +
> +
> +         SRF_RETURN_NEXT(funcctx, result);
> +     }
> +     else
> +         SRF_RETURN_DONE(funcctx);
> +
> + }
> diff -Ncar pgsql.orig/contrib/pg_freespacemap/pg_freespacemap.sql.in pgsql/contrib/pg_freespacemap/pg_freespacemap.sql.in
> *** pgsql.orig/contrib/pg_freespacemap/pg_freespacemap.sql.in    Thu Jan  1 12:00:00 1970
> --- pgsql/contrib/pg_freespacemap/pg_freespacemap.sql.in    Thu Oct 27 18:07:43 2005
> ***************
> *** 0 ****
> --- 1,17 ----
> + -- Adjust this setting to control where the objects get created.
> + SET search_path = public;
> +
> + -- Register the function.
> + CREATE OR REPLACE FUNCTION pg_freespacemap()
> + RETURNS SETOF RECORD
> + AS 'MODULE_PATHNAME', 'pg_freespacemap'
> + LANGUAGE 'C';
> +
> + -- Create a view for convenient access.
> + CREATE VIEW pg_freespacemap AS
> +     SELECT P.* FROM pg_freespacemap() AS P
> +      (blockid int4, relfilenode oid, reltablespace oid, reldatabase oid, relblocknumber int8, blockfreebytes int4);
> +
> + -- Don't want these to be available at public.
> + REVOKE ALL ON FUNCTION pg_freespacemap() FROM PUBLIC;
> + REVOKE ALL ON pg_freespacemap FROM PUBLIC;
> diff -Ncar pgsql.orig/src/backend/storage/freespace/freespace.c pgsql/src/backend/storage/freespace/freespace.c
> *** pgsql.orig/src/backend/storage/freespace/freespace.c    Thu Oct 20 12:25:06 2005
> --- pgsql/src/backend/storage/freespace/freespace.c    Thu Oct 27 17:51:33 2005
> ***************
> *** 71,114 ****
>   #include "storage/shmem.h"
>
>
> - /* Initial value for average-request moving average */
> - #define INITIAL_AVERAGE ((Size) (BLCKSZ / 32))
> -
> - /*
> -  * Number of pages and bytes per allocation chunk.    Indexes can squeeze 50%
> -  * more pages into the same space because they don't need to remember how much
> -  * free space on each page.  The nominal number of pages, CHUNKPAGES, is for
> -  * regular rels, and INDEXCHUNKPAGES is for indexes.  CHUNKPAGES should be
> -  * even so that no space is wasted in the index case.
> -  */
> - #define CHUNKPAGES    16
> - #define CHUNKBYTES    (CHUNKPAGES * sizeof(FSMPageData))
> - #define INDEXCHUNKPAGES ((int) (CHUNKBYTES / sizeof(IndexFSMPageData)))
> -
> -
> - /*
> -  * Typedefs and macros for items in the page-storage arena.  We use the
> -  * existing ItemPointer and BlockId data structures, which are designed
> -  * to pack well (they should be 6 and 4 bytes apiece regardless of machine
> -  * alignment issues).  Unfortunately we can't use the ItemPointer access
> -  * macros, because they include Asserts insisting that ip_posid != 0.
> -  */
> - typedef ItemPointerData FSMPageData;
> - typedef BlockIdData IndexFSMPageData;
> -
> - #define FSMPageGetPageNum(ptr)    \
> -     BlockIdGetBlockNumber(&(ptr)->ip_blkid)
> - #define FSMPageGetSpace(ptr)    \
> -     ((Size) (ptr)->ip_posid)
> - #define FSMPageSetPageNum(ptr, pg)    \
> -     BlockIdSet(&(ptr)->ip_blkid, pg)
> - #define FSMPageSetSpace(ptr, sz)    \
> -     ((ptr)->ip_posid = (OffsetNumber) (sz))
> - #define IndexFSMPageGetPageNum(ptr) \
> -     BlockIdGetBlockNumber(ptr)
> - #define IndexFSMPageSetPageNum(ptr, pg) \
> -     BlockIdSet(ptr, pg)
> -
>   /*----------
>    * During database shutdown, we store the contents of FSM into a disk file,
>    * which is re-read during startup.  This way we don't have a startup
> --- 71,76 ----
> ***************
> *** 156,218 ****
>       int32        storedPages;    /* # of pages stored in arena */
>   } FsmCacheRelHeader;
>
> -
> - /*
> -  * Shared free-space-map objects
> -  *
> -  * The per-relation objects are indexed by a hash table, and are also members
> -  * of two linked lists: one ordered by recency of usage (most recent first),
> -  * and the other ordered by physical location of the associated storage in
> -  * the page-info arena.
> -  *
> -  * Each relation owns one or more chunks of per-page storage in the "arena".
> -  * The chunks for each relation are always consecutive, so that it can treat
> -  * its page storage as a simple array.    We further insist that its page data
> -  * be ordered by block number, so that binary search is possible.
> -  *
> -  * Note: we handle pointers to these items as pointers, not as SHMEM_OFFSETs.
> -  * This assumes that all processes accessing the map will have the shared
> -  * memory segment mapped at the same place in their address space.
> -  */
> - typedef struct FSMHeader FSMHeader;
> - typedef struct FSMRelation FSMRelation;
> -
> - /* Header for whole map */
> - struct FSMHeader
> - {
> -     FSMRelation *usageList;        /* FSMRelations in usage-recency order */
> -     FSMRelation *usageListTail; /* tail of usage-recency list */
> -     FSMRelation *firstRel;        /* FSMRelations in arena storage order */
> -     FSMRelation *lastRel;        /* tail of storage-order list */
> -     int            numRels;        /* number of FSMRelations now in use */
> -     double        sumRequests;    /* sum of requested chunks over all rels */
> -     char       *arena;            /* arena for page-info storage */
> -     int            totalChunks;    /* total size of arena, in chunks */
> -     int            usedChunks;        /* # of chunks assigned */
> -     /* NB: there are totalChunks - usedChunks free chunks at end of arena */
> - };
> -
> - /*
> -  * Per-relation struct --- this is an entry in the shared hash table.
> -  * The hash key is the RelFileNode value (hence, we look at the physical
> -  * relation ID, not the logical ID, which is appropriate).
> -  */
> - struct FSMRelation
> - {
> -     RelFileNode key;            /* hash key (must be first) */
> -     FSMRelation *nextUsage;        /* next rel in usage-recency order */
> -     FSMRelation *priorUsage;    /* prior rel in usage-recency order */
> -     FSMRelation *nextPhysical;    /* next rel in arena-storage order */
> -     FSMRelation *priorPhysical; /* prior rel in arena-storage order */
> -     bool        isIndex;        /* if true, we store only page numbers */
> -     Size        avgRequest;        /* moving average of space requests */
> -     int            lastPageCount;    /* pages passed to RecordRelationFreeSpace */
> -     int            firstChunk;        /* chunk # of my first chunk in arena */
> -     int            storedPages;    /* # of pages stored in arena */
> -     int            nextPage;        /* index (from 0) to start next search at */
> - };
> -
> -
>   int            MaxFSMRelations;    /* these are set by guc.c */
>   int            MaxFSMPages;
>
> --- 118,123 ----
> ***************
> *** 1835,1840 ****
> --- 1740,1756 ----
>           Assert(fsmrel->firstChunk < 0 && fsmrel->storedPages == 0);
>           return 0;
>       }
> + }
> +
> +
> + /*
> +  * Return the FreeSpaceMap structure for examination.
> +  */
> + FSMHeader *
> + GetFreeSpaceMap(void)
> + {
> +
> +     return FreeSpaceMap;
>   }
>
>
> diff -Ncar pgsql.orig/src/include/storage/freespace.h pgsql/src/include/storage/freespace.h
> *** pgsql.orig/src/include/storage/freespace.h    Tue Aug 23 15:56:23 2005
> --- pgsql/src/include/storage/freespace.h    Thu Oct 27 17:51:33 2005
> ***************
> *** 16,21 ****
> --- 16,22 ----
>
>   #include "storage/block.h"
>   #include "storage/relfilenode.h"
> + #include "storage/itemptr.h"
>
>
>   /*
> ***************
> *** 28,33 ****
> --- 29,129 ----
>   } PageFreeSpaceInfo;
>
>
> + /* Initial value for average-request moving average */
> + #define INITIAL_AVERAGE ((Size) (BLCKSZ / 32))
> +
> + /*
> +  * Number of pages and bytes per allocation chunk.    Indexes can squeeze 50%
> +  * more pages into the same space because they don't need to remember how much
> +  * free space on each page.  The nominal number of pages, CHUNKPAGES, is for
> +  * regular rels, and INDEXCHUNKPAGES is for indexes.  CHUNKPAGES should be
> +  * even so that no space is wasted in the index case.
> +  */
> + #define CHUNKPAGES    16
> + #define CHUNKBYTES    (CHUNKPAGES * sizeof(FSMPageData))
> + #define INDEXCHUNKPAGES ((int) (CHUNKBYTES / sizeof(IndexFSMPageData)))
> +
> +
> + /*
> +  * Typedefs and macros for items in the page-storage arena.  We use the
> +  * existing ItemPointer and BlockId data structures, which are designed
> +  * to pack well (they should be 6 and 4 bytes apiece regardless of machine
> +  * alignment issues).  Unfortunately we can't use the ItemPointer access
> +  * macros, because they include Asserts insisting that ip_posid != 0.
> +  */
> + typedef ItemPointerData FSMPageData;
> + typedef BlockIdData IndexFSMPageData;
> +
> + #define FSMPageGetPageNum(ptr)    \
> +     BlockIdGetBlockNumber(&(ptr)->ip_blkid)
> + #define FSMPageGetSpace(ptr)    \
> +     ((Size) (ptr)->ip_posid)
> + #define FSMPageSetPageNum(ptr, pg)    \
> +     BlockIdSet(&(ptr)->ip_blkid, pg)
> + #define FSMPageSetSpace(ptr, sz)    \
> +     ((ptr)->ip_posid = (OffsetNumber) (sz))
> + #define IndexFSMPageGetPageNum(ptr) \
> +     BlockIdGetBlockNumber(ptr)
> + #define IndexFSMPageSetPageNum(ptr, pg) \
> +     BlockIdSet(ptr, pg)
> +
> + /*
> +  * Shared free-space-map objects
> +  *
> +  * The per-relation objects are indexed by a hash table, and are also members
> +  * of two linked lists: one ordered by recency of usage (most recent first),
> +  * and the other ordered by physical location of the associated storage in
> +  * the page-info arena.
> +  *
> +  * Each relation owns one or more chunks of per-page storage in the "arena".
> +  * The chunks for each relation are always consecutive, so that it can treat
> +  * its page storage as a simple array.    We further insist that its page data
> +  * be ordered by block number, so that binary search is possible.
> +  *
> +  * Note: we handle pointers to these items as pointers, not as SHMEM_OFFSETs.
> +  * This assumes that all processes accessing the map will have the shared
> +  * memory segment mapped at the same place in their address space.
> +  */
> + typedef struct FSMHeader FSMHeader;
> + typedef struct FSMRelation FSMRelation;
> +
> + /* Header for whole map */
> + struct FSMHeader
> + {
> +     FSMRelation *usageList;        /* FSMRelations in usage-recency order */
> +     FSMRelation *usageListTail; /* tail of usage-recency list */
> +     FSMRelation *firstRel;        /* FSMRelations in arena storage order */
> +     FSMRelation *lastRel;        /* tail of storage-order list */
> +     int            numRels;        /* number of FSMRelations now in use */
> +     double        sumRequests;    /* sum of requested chunks over all rels */
> +     char       *arena;            /* arena for page-info storage */
> +     int            totalChunks;    /* total size of arena, in chunks */
> +     int            usedChunks;        /* # of chunks assigned */
> +     /* NB: there are totalChunks - usedChunks free chunks at end of arena */
> + };
> +
> + /*
> +  * Per-relation struct --- this is an entry in the shared hash table.
> +  * The hash key is the RelFileNode value (hence, we look at the physical
> +  * relation ID, not the logical ID, which is appropriate).
> +  */
> + struct FSMRelation
> + {
> +     RelFileNode key;            /* hash key (must be first) */
> +     FSMRelation *nextUsage;        /* next rel in usage-recency order */
> +     FSMRelation *priorUsage;    /* prior rel in usage-recency order */
> +     FSMRelation *nextPhysical;    /* next rel in arena-storage order */
> +     FSMRelation *priorPhysical; /* prior rel in arena-storage order */
> +     bool        isIndex;        /* if true, we store only page numbers */
> +     Size        avgRequest;        /* moving average of space requests */
> +     int            lastPageCount;    /* pages passed to RecordRelationFreeSpace */
> +     int            firstChunk;        /* chunk # of my first chunk in arena */
> +     int            storedPages;    /* # of pages stored in arena */
> +     int            nextPage;        /* index (from 0) to start next search at */
> + };
> +
> +
> +
>   /* GUC variables */
>   extern int    MaxFSMRelations;
>   extern int    MaxFSMPages;
> ***************
> *** 62,67 ****
> --- 158,164 ----
>
>   extern void DumpFreeSpaceMap(int code, Datum arg);
>   extern void LoadFreeSpaceMap(void);
> + extern FSMHeader *GetFreeSpaceMap(void);
>
>   #ifdef FREESPACE_DEBUG
>   extern void DumpFreeSpace(void);
>
>
>

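For anyone reviewing the structures the quoted patch moves into freespace.h, here is a minimal standalone sketch of the chunk arithmetic. It assumes the sizes stated in the header comment (6 bytes per heap entry, 4 bytes per index entry). The Demo* types and demo_* helpers are hypothetical stand-ins invented for illustration; they are not the PostgreSQL definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for BlockIdData (4 bytes) and ItemPointerData (6 bytes). */
typedef struct { uint16_t bi_hi; uint16_t bi_lo; } DemoBlockId;
typedef struct { DemoBlockId ip_blkid; uint16_t ip_posid; } DemoHeapEntry;

/* Same arithmetic as CHUNKPAGES / CHUNKBYTES / INDEXCHUNKPAGES in the patch. */
#define DEMO_CHUNKPAGES      16
#define DEMO_CHUNKBYTES      (DEMO_CHUNKPAGES * sizeof(DemoHeapEntry))
#define DEMO_INDEXCHUNKPAGES ((int) (DEMO_CHUNKBYTES / sizeof(DemoBlockId)))

/* Split a block number into two 16-bit halves. */
static inline void
demo_block_set(DemoBlockId *b, uint32_t blkno)
{
    b->bi_hi = (uint16_t) (blkno >> 16);
    b->bi_lo = (uint16_t) (blkno & 0xffff);
}

/* Reassemble the block number from its two halves. */
static inline uint32_t
demo_block_number(const DemoBlockId *b)
{
    return ((uint32_t) b->bi_hi << 16) | b->bi_lo;
}

/*
 * Address of the first heap entry for a relation whose storage starts at
 * chunk number firstChunk; mirrors "arena + firstChunk * CHUNKBYTES".
 */
static inline DemoHeapEntry *
demo_first_entry(char *arena, int firstChunk)
{
    return (DemoHeapEntry *) (arena + (size_t) firstChunk * DEMO_CHUNKBYTES);
}
```

With these sizes a chunk is 96 bytes, so it holds 16 heap entries or 24 index entries, which is the "50% more pages" mentioned in the header comment.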

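The quoted pg_freespacemap.c sizes each values[] string as 3 * sizeof(uint32) + 1 bytes before calling sprintf with "%u". A small sketch of why that is enough room for any 32-bit value is below. The demo_format_uint32 helper and the DEMO_UINT32_BUFLEN name are hypothetical, and snprintf is used here only so the bound is checked explicitly; the patch itself uses sprintf.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Same sizing rule as the patch: three characters per byte, plus the NUL. */
#define DEMO_UINT32_BUFLEN (3 * sizeof(uint32_t) + 1)

/*
 * Format v as unsigned decimal into buf.  Returns the number of characters
 * written (excluding the terminator), or -1 if the text would not fit.
 */
static inline int
demo_format_uint32(char *buf, size_t buflen, uint32_t v)
{
    int n = snprintf(buf, buflen, "%" PRIu32, v);

    return (n >= 0 && (size_t) n < buflen) ? n : -1;
}
```

The largest uint32 value needs 10 digits, so the 13-byte buffer leaves some slack. The same would not hold for a true 64-bit value, which can need up to 20 digits.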
Re: TODO Item - Add system view to show free space map

From
Mark Kirkwood
Date:
Christopher Kings-Lynne wrote:
> Want to host it on pgfoundry until 8.2 is released?
>

Absolutely - I'll let it run the gauntlet of feedback to fix the silly
mistakes I put in :-), then do patches for 8.1 and 8.0 (maybe 7.4 and
7.3 as well - if it rains a lot....).

cheers

Mark


Re: TODO Item - Add system view to show free space map

From
Simon Riggs
Date:
On Fri, 2005-10-28 at 13:21 +1300, Mark Kirkwood wrote:

>      regression=# SELECT c.relname, m.relblocknumber, m.blockfreebytes
>                   FROM pg_freespacemap m INNER JOIN pg_class c
>                   ON c.relfilenode = m.relfilenode LIMIT 10;
>          relname             | relblocknumber | blockfreebytes
>      ------------------------+----------------+----------------
>      sql_features            |              5 |           2696
>      sql_implementation_info |              0 |           7104
>      sql_languages           |              0 |           8016
>      sql_packages            |              0 |           7376
>      sql_sizing              |              0 |           6032
>      pg_authid               |              0 |           7424
>      pg_toast_2618           |             13 |           4588
>      pg_toast_2618           |             12 |           1680
>      pg_toast_2618           |             10 |           1436
>      pg_toast_2618           |              7 |           1136
>      (10 rows)
>
> [I found being able to display the FSM pretty cool, even if I say so
> myself....].

I like this, but not because I want to read it myself, but because I
want to make autovacuum responsible for re-allocating free space when it
runs out. This way we can have an autoFSM feature in 8.2

Best Regards, Simon Riggs


Re: TODO Item - Add system view to show free space map

From
Alvaro Herrera
Date:
Simon Riggs wrote:
> On Fri, 2005-10-28 at 13:21 +1300, Mark Kirkwood wrote:
>
> >      regression=# SELECT c.relname, m.relblocknumber, m.blockfreebytes
> >                   FROM pg_freespacemap m INNER JOIN pg_class c
> >                   ON c.relfilenode = m.relfilenode LIMIT 10;
>
>
> I like this, but not because I want to read it myself, but because I
> want to make autovacuum responsible for re-allocating free space when it
> runs out. This way we can have an autoFSM feature in 8.2

What do you mean, re-allocating free space?  I don't understand what you
are proposing.

--
Alvaro Herrera                  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
"Use it up, wear it out, make it do, or do without"

Re: TODO Item - Add system view to show free space map

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Simon Riggs wrote:
>> I like this, but not because I want to read it myself, but because I
>> want to make autovacuum responsible for re-allocating free space when it
>> runs out. This way we can have an autoFSM feature in 8.2

> What do you mean, re-allocating free space?  I don't understand what you
> are proposing.

And even less why autovacuum would go through a view to get at the info.

            regards, tom lane

Re: TODO Item - Add system view to show free space map

From
Mark Kirkwood
Date:
Simon Riggs wrote:
>
>
> I like this, but not because I want to read it myself, but because I
> want to make autovacuum responsible for re-allocating free space when it
> runs out. This way we can have an autoFSM feature in 8.2
>
>

Not wanting to denigrate the value of the interesting but slightly OT
direction this thread has taken - but does anybody want to
comment/review the patch itself :-) ....?

Cheers

Mark

Re: TODO Item - Add system view to show free space map

From
Bruce Momjian
Date:
Mark Kirkwood wrote:
> Simon Riggs wrote:
> >
> >
> > I like this, but not because I want to read it myself, but because I
> > want to make autovacuum responsible for re-allocating free space when it
> > runs out. This way we can have an autoFSM feature in 8.2
> >
> >
>
> Not wanting to denigrate value of the interesting but slightly OT
> direction this thread has taken - but does anybody want to
> comment/review the patch itself :-) ....?

I saw this question about a transaction block and your reply:

    http://archives.postgresql.org/pgsql-patches/2005-10/msg00226.php

but no new patch.  I know someone suggested pgfoundry but it seems most
natural in /contrib.  Do you want to update the patch?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: TODO Item - Add system view to show free space map

From
Mark Kirkwood
Date:
Bruce Momjian wrote:
> Mark Kirkwood wrote:
>
>>Simon Riggs wrote:
>>
>>>
>>>I like this, but not because I want to read it myself, but because I
>>>want to make autovacuum responsible for re-allocating free space when it
>>>runs out. This way we can have an autoFSM feature in 8.2
>>>
>>>
>>
>>Not wanting to denigrate value of the interesting but slightly OT
>>direction this thread has taken - but does anybody want to
>>comment/review the patch itself :-) ....?
>
>
> I saw this question about a transaction block and your reply:
>
>     http://archives.postgresql.org/pgsql-patches/2005-10/msg00226.php
>
> but no new patch.  I know someone suggested pgfoundry but it seems most
> natural in /contrib.  Do you want to update the patch?
>

In the expectation of further revisions, I was going to batch that one
in with the 'rest' - given that there have not been any, I'll submit a
revised patch.

Cheers

Mark

Re: TODO Item - Add system view to show free space map

From
Mark Kirkwood
Date:
Mark Kirkwood wrote:
> Bruce Momjian wrote:
>
>> Mark Kirkwood wrote:
>>
>>> Simon Riggs wrote:
>>>
>>>>
>>>> I like this, but not because I want to read it myself, but because I
>>>> want to make autovacuum responsible for re-allocating free space
>>>> when it
>>>> runs out. This way we can have an autoFSM feature in 8.2
>>>>
>>>>
>>>
>>> Not wanting to denigrate value of the interesting but slightly OT
>>> direction this thread has taken - but does anybody want to
>>> comment/review the patch itself :-) ....?
>>
>>
>>
>> I saw this question about a transaction block and your reply:
>>
>>     http://archives.postgresql.org/pgsql-patches/2005-10/msg00226.php
>>
>> but no new patch.  I know someone suggested pgfoundry but it seems most
>> natural in /contrib.  Do you want to update the patch?
>>
>
> In the expectation of further revisions, I was going to batch that one
> in with the 'rest' - given that there have not been any, I'll submit a
> revised patch.
>

Here it is - I seem to have had trouble sending any attachments to this
list recently. Bruce has the patch (sent privately), so it's in the patches
queue, but I thought I would have another go at getting it to -patches so
others can review it more easily!

cheers

Mark

Attachment

Re: TODO Item - Add system view to show free space map

From
Bruce Momjian
Date:
Your patch has been added to the PostgreSQL unapplied patches list at:

    http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------


Mark Kirkwood wrote:
> Mark Kirkwood wrote:
> > Bruce Momjian wrote:
> >
> >> Mark Kirkwood wrote:
> >>
> >>> Simon Riggs wrote:
> >>>
> >>>>
> >>>> I like this, but not because I want to read it myself, but because I
> >>>> want to make autovacuum responsible for re-allocating free space
> >>>> when it
> >>>> runs out. This way we can have an autoFSM feature in 8.2
> >>>>
> >>>>
> >>>
> >>> Not wanting to denigrate value of the interesting but slightly OT
> >>> direction this thread has taken - but does anybody want to
> >>> comment/review the patch itself :-) ....?
> >>
> >>
> >>
> >> I saw this question about a transaction block and your reply:
> >>
> >>     http://archives.postgresql.org/pgsql-patches/2005-10/msg00226.php
> >>
> >> but no new patch.  I know someone suggested pgfoundry but it seems most
> >> natural in /contrib.  Do you want to update the patch?
> >>
> >
> > In the expectation of further revisions, I was going to batch that one
> > in with the 'rest' - given that there have not been any, I'll submit a
> > revised patch.
> >
>
> Here it is - I seem to have had trouble sending any attachments to this
> list recently. Bruce has the patch (sent privately), so it's in the patches
> queue, but I thought I would have another go at getting it to -patches so
> others can review it more easily!
>
> cheers
>
> Mark

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: TODO Item - Add system view to show free space map

From
Bruce Momjian
Date:
Patch applied.  Thanks.
