Thread: VACUUM

VACUUM

From
Anderson dos Santos Donda
Date:
Can anyone explain this:

INFO:  free space map contains 5848 pages in 5000 relations
DETAIL:  A total of 83728 page slots are in use (including overhead).
83728 page slots are required to track all free space.
Current limits are:  153600 page slots, 5000 relations, using 1228 kB.
NOTICE:  max_fsm_relations(5000) equals the number of relations checked


I increased the relations setting, but it doesn't work... and if I increase it to more than 7000, PostgreSQL won't start because of the kernel shared memory limit...

What should I do?

Re: VACUUM

From
John R Pierce
Date:
Anderson dos Santos Donda wrote:
> Can anyone explain this:
>
> INFO:  free space map contains 5848 pages in 5000 relations
> DETAIL:  A total of 83728 page slots are in use (including overhead).
> 83728 page slots are required to track all free space.
> Current limits are:  153600 page slots, 5000 relations, using 1228 kB.
> NOTICE:  max_fsm_relations(5000) equals the number of relations checked
>
>
> I increased the relations setting, but it doesn't work... and if I
> increase it to more than 7000, PostgreSQL won't start because of the
> kernel shared memory limit...
>
> What should I do?

you probably have to increase SHMMAX in the kernel; how you do this
depends on the OS and distribution.

for instance...  on a Red Hat-style Linux distribution, add a line to
/etc/sysctl.conf...

    kernel.shmmax = 500000000

(that's about 500MB; the value you use should be appropriate for the
system... I generally allow around 1/2 of physical RAM for this. It's
just a limit, and won't have any impact if you don't actually use that
much.)
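
to make the change take effect without a reboot, you can reload the
file by hand (as root):

    # sysctl -p

which re-reads /etc/sysctl.conf and applies the settings.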


on an older Solaris version, add to /etc/system

    set shmsys:shminfo_shmmax=500000000

on a newer Solaris system...  execute the command...

 # projadd -U postgres -c "PostgreSQL Server" \
    -K "project.max-shm-memory=(priv,500MB,deny)" \
    postgresql


which sets that limit just for processes running as the postgres user
rather than globally.
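
and remember the NOTICE itself points at the free space map settings
in postgresql.conf; once the kernel allows enough shared memory, raise
them and restart the server. something along these lines (the numbers
are just an example -- size them from your VACUUM VERBOSE output; note
these settings exist in 8.3 and earlier, the FSM is managed
automatically as of 8.4):

    max_fsm_relations = 10000    # must exceed the number of relations (was 5000)
    max_fsm_pages = 200000       # must cover the page slots required (was 153600)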

[Q] Cluster design for geographically separated dbs

From
"V S P"
Date:
Hello,
I am designing a db to hold often changed user data.

I just wanted to write down what I am thinking and ask people on the
list to comment
if they have any experiences in that area.

My thought is to have, say, a west-coast and an east-coast data
center, and each user will go to either the East Coast or the West
Coast.

Then, within each coast, I would want to partition by hash on a user
id.

I am reviewing the Skype paper on the subject


http://kaiv.wordpress.com/2007/07/27/postgresql-cluster-partitioning-with-plproxy-part-i/


And I wanted to ask what the main challenges I am facing would be --
from the experience of the users on this list.

Especially, I am not sure how, for example, to manage 'overlapping
unique ID' data.

First, say I have a user who is trying to register with the same ID
as somebody else, only in a different data center -- that means I
always have to check in each data center first whether the ID exists.
Then, based on his/her IP address, I decide which data center is
closest (but IP addresses are often not a good indication of the
user's geographical location either, so I will also give them a
'manual' select option).

Then, if I have, say, a BIGSERIAL in my tables -- since there is more
than one database, the bigserial values in one database can well
overlap those in another database.

So, if I have any tables that must contain data from different
databases, I have to add something else to the foreign key besides the
reference to the bigserial. And so on...

Right now, on paper, I have quite a few 'extra' fields in my tables
just to support uniqueness of the record across clusters; a sketch of
what I mean is below.
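
For example (just a sketch -- the table and column names here are
made up):

    -- every row carries the data center it was created in
    CREATE TABLE users (
        site_id int NOT NULL,    -- e.g. 1 = east coast, 2 = west coast
        user_id bigserial,
        login   text NOT NULL,
        PRIMARY KEY (site_id, user_id)
    );

    -- and any referencing table has to carry both columns
    CREATE TABLE orders (
        order_id bigserial PRIMARY KEY,
        site_id  int    NOT NULL,
        user_id  bigint NOT NULL,
        FOREIGN KEY (site_id, user_id) REFERENCES users (site_id, user_id)
    );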

I am not sure if I am doing it the right way (because then, at some
point in time, I also have to 'defragment' the IDs, as data with
bigserial keys can be deleted).

It looks to me that if I design things to take advantage of Skype's
plproxy, I will be able to leverage what appears to be a relatively
easy way to get data between databases (for reports that span
clusters).


thanks in advance for any comments,
Vlad
--
  V S P
  toreason@fastmail.fm



Re: [Q] Cluster design for geographically separated dbs

From
Scott Marlowe
Date:
On Sat, Mar 7, 2009 at 2:03 PM, V S P <toreason@fastmail.fm> wrote:

> And I wanted to ask what the main challenges I am facing would be --
> from the experience of the users on this list.
>
> Especially, I am not sure how, for example, to manage 'overlapping
> unique ID' data.

I'm no expert on a lot of what you're doing, but the standard trick
here is to partition your bigserials.

The max value for the underlying sequence is 9223372036854775807,
which should give you plenty of space to work in.  So, when creating
your bigserials, you can alter the sequence underneath each one to use
a different range on each machine.

smarlowe=# create table a1 (id bigserial, info text);
NOTICE:  CREATE TABLE will create implicit sequence "a1_id_seq" for
serial column "a1.id"
smarlowe=# create table a2 (id bigserial, info text);
NOTICE:  CREATE TABLE will create implicit sequence "a2_id_seq" for
serial column "a2.id"
smarlowe=# alter sequence a1_id_seq minvalue 10000000000 maxvalue
19999999999 start 10000000000;
ALTER SEQUENCE
smarlowe=# alter sequence a2_id_seq minvalue 20000000000 maxvalue
29999999999 start 20000000000;
ALTER SEQUENCE

Now those two sequences can't run into each other, and if you move a
record from one machine to another it won't bump into what's already
there.  Partitioning by 10 billion gives you 922337203 possible
partitions, so if you need bigger but fewer partitions there's plenty
of wiggle room to play with.
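
As a quick sanity check, the first nextval on each fresh sequence
comes back at the start of its range, so you should see something
like:

smarlowe=# select nextval('a1_id_seq'), nextval('a2_id_seq');
   nextval   |   nextval
-------------+-------------
 10000000000 | 20000000000
(1 row)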

Re: [Q] Cluster design for geographically separated dbs

From
"V S P"
Date:
Thank you,
Is there a way, in the same  idea,
to make postgresql 'skip' say every 100 numbers when generating
a 'next' in bigserial?
(or to insure that every number generated is evenly divisible by 100,
and then another db would be 99 and so on)

In oracle, if I remember right, there was something called a 'Step'
for the sequence values.



Vlad



On Sun, 08 Mar 2009 01:13 -0700, "Scott Marlowe"
<scott.marlowe@gmail.com> wrote:
> On Sat, Mar 7, 2009 at 2:03 PM, V S P <toreason@fastmail.fm> wrote:
>
> > And I wanted to ask what the main challenges I am facing would be --
> > from the experience of the users on this list.
> >
> > Especially, I am not sure how, for example, to manage 'overlapping
> > unique ID' data.
>
> I'm no expert on a lot of what you're doing, but the standard trick
> here is to partition your bigserials.
>
> The max value for the underlying sequence is 9223372036854775807,
> which should give you plenty of space to work in.  So, when creating
> your bigserials, you can alter the sequence underneath each one to use
> a different range on each machine.
>
> smarlowe=# create table a1 (id bigserial, info text);
> NOTICE:  CREATE TABLE will create implicit sequence "a1_id_seq" for
> serial column "a1.id"
> smarlowe=# create table a2 (id bigserial, info text);
> NOTICE:  CREATE TABLE will create implicit sequence "a2_id_seq" for
> serial column "a2.id"
> smarlowe=# alter sequence a1_id_seq minvalue 10000000000 maxvalue
> 19999999999 start 10000000000;
> ALTER SEQUENCE
> smarlowe=# alter sequence a2_id_seq minvalue 20000000000 maxvalue
> 29999999999 start 20000000000;
> ALTER SEQUENCE
>
> Now those two sequences can't run into each other, and if you move a
> record from one machine to another it won't bump into what's already
> there.  Partitioning by 10 billion gives you 922337203 possible
> partitions, so if you need bigger but fewer partitions there's plenty
> of wiggle room to play with.
--
  V S P
  toreason@fastmail.fm



Re: [Q] Cluster design for geographically separated dbs

From
Scott Marlowe
Date:
On Sun, Mar 8, 2009 at 1:01 PM, V S P <toreason@fastmail.fm> wrote:
> Thank you.
> Is there a way, along the same lines,
> to make PostgreSQL 'skip', say, 100 numbers when generating
> the 'next' value of a bigserial?
> (Or to ensure that every number generated is evenly divisible by 100,
> and then another db would use numbers ending in 99, and so on.)
>
> In Oracle, if I remember right, there was something called a 'step'
> for the sequence values.

To find this stuff in pgsql you can either wander through the manual,
or fire up psql and ask it:

smarlowe=# \h create sequence
Command:     CREATE SEQUENCE
Description: define a new sequence generator
Syntax:
CREATE [ TEMPORARY | TEMP ] SEQUENCE name [ INCREMENT [ BY ] increment ]
    [ MINVALUE minvalue | NO MINVALUE ] [ MAXVALUE maxvalue | NO MAXVALUE ]
    [ START [ WITH ] start ] [ CACHE cache ] [ [ NO ] CYCLE ]
    [ OWNED BY { table.column | NONE } ]

smarlowe=# \h alter sequence
Command:     ALTER SEQUENCE
Description: change the definition of a sequence generator
Syntax:
ALTER SEQUENCE name [ INCREMENT [ BY ] increment ]
    [ MINVALUE minvalue | NO MINVALUE ] [ MAXVALUE maxvalue | NO MAXVALUE ]
    [ RESTART [ WITH ] start ] [ CACHE cache ] [ [ NO ] CYCLE ]
    [ OWNED BY { table.column | NONE } ]
ALTER SEQUENCE name RENAME TO new_name
ALTER SEQUENCE name SET SCHEMA new_schema

As you can see, there's an INCREMENT setting that does exactly what
you're after.
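
So, for instance, to stripe your databases by increment rather than
by range, you'd create the same sequence on each database with a
different start (a sketch -- the sequence name is made up):

-- on the first database: generates 1, 101, 201, ...
create sequence user_id_seq increment by 100 start with 1;

-- on the second database: generates 2, 102, 202, ...
create sequence user_id_seq increment by 100 start with 2;

The two can never collide, and the scheme leaves room for up to 100
databases.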