Thread: PostgreSQL win32 fragmentation issue

PostgreSQL win32 fragmentation issue

From
"Joshua D. Drake"
Date:
Hello,

I ran a simple test... I ran defrag on my drives. Then I initialized 
pgbench with a -s 1000.

11% fragmentation. I dropped the bench database, and my fragmentation is 1%.

I know this isn't *our* fault :) but I am curious if there is anything 
we can do about the way postgresql writes files to help limit fragmentation.

Essentially, this makes win32 impossible in a 24x7 environment (jokes 
aside about Win32 in general) because we *have* to defrag on Windows and
Windows won't defrag open files (thus anything PostgreSQL is using).

Sincerely,

Joshua D. Drake



Re: PostgreSQL win32 fragmentation issue

From
Martijn van Oosterhout
Date:
On Fri, Dec 01, 2006 at 10:58:44PM -0800, Joshua D. Drake wrote:
> I know this isn't *our* fault :) but I am curious if there is anything
> we can do about the way postgresql writes files to help limit fragmentation.
>
> Essentially, this makes win32 impossible in a 24x7 environment (jokes
> aside about Win32 in general) because we *have* to defrag on Windows and
> Windows won't defrag open files (thus anything PostgreSQL is using).

I thought fragmentation was something that disappeared with the FAT
filesystem. Isn't NTFS smart enought o avoid fragmentation in the first
place?

The way you avoid fragmentation is by preallocating larger blocks, but
most filesystems are smart enough to handle that somehwta
automatically.

BTW, do you know what 11% fragmentation means? Does that mean each file
is on average split in 9 pieces, because for a 1GB file, 9 pieces isn't
all that bad.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: PostgreSQL win32 fragmentation issue

From
"Matthew T. O'Connor"
Date:
Martijn van Oosterhout wrote:
> On Fri, Dec 01, 2006 at 10:58:44PM -0800, Joshua D. Drake wrote:
>   
>> I know this isn't *our* fault :) but I am curious if there is anything 
>> we can do about the way postgresql writes files to help limit fragmentation.
>>
>> Essentially, this makes win32 impossible in a 24x7 environment (jokes 
>> aside about Win32 in general) because we *have* to defrag on Windows and
>> Windows won't defrag open files (thus anything PostgreSQL is using).
>>     
>
> BTW, do you know what 11% fragmentation means? Does that mean each file
> is on average split in 9 pieces, because for a 1GB file, 9 pieces isn't
> all that bad.


Or more specifically do you have any idea what it means in terms of 
performance?  And why do you say that you *have to* defrag under 
windows, isn't this only a performance issue and not necessarily a huge 
one at that?  Also, as a solution, I would think that CLUSTER might help.

Matt



Re: PostgreSQL win32 fragmentation issue

From
"Joshua D. Drake"
Date:
On Sat, 2006-12-02 at 14:31 +0100, Martijn van Oosterhout wrote:
> On Fri, Dec 01, 2006 at 10:58:44PM -0800, Joshua D. Drake wrote:
> > I know this isn't *our* fault :) but I am curious if there is anything 
> > we can do about the way postgresql writes files to help limit fragmentation.
> > 
> > Essentially, this makes win32 impossible in a 24x7 environment (jokes 
> > aside about Win32 in general) because we *have* to defrag on Windows and
> > Windows won't defrag open files (thus anything PostgreSQL is using).
> 
> I thought fragmentation was something that disappeared with the FAT
> filesystem. Isn't NTFS smart enought o avoid fragmentation in the first
> place?

Nope... you still have the good old defrag command (well button now).

> BTW, do you know what 11% fragmentation means? Does that mean each file
> is on average split in 9 pieces, because for a 1GB file, 9 pieces isn't
> all that bad.

No, the report had Total Fragmentation, and File Fragmentation. I was
reporting on Total, which I assume is some aggregation.

My concern is that this is over a single bench run. I could imagine that
after a week or two weeks of stead PostgreSQL use, the IO would
gradually get worse and worse.

Joshua D. Drake



> 
> Have a nice day,
-- 
     === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997            http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate





Re: PostgreSQL win32 fragmentation issue

From
"Joshua D. Drake"
Date:
> Or more specifically do you have any idea what it means in terms of 
> performance?  And why do you say that you *have to* defrag under 
> windows, isn't this only a performance issue and not necessarily a huge 
> one at that?  Also, as a solution, I would think that CLUSTER might help.

Yes it is a performance issue. And all performance issues if they can
not be resolved within the database or application, eventually become an
outage. 

A CLUSTER may, but that puts us back to an outage. A CLUSTER is an
exclusive lock. Usually for long periods of time.

Joshua D. Drake


> 
> Matt
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
> 
-- 
     === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997            http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate





Re: PostgreSQL win32 fragmentation issue

From
Alvaro Herrera
Date:
Joshua D. Drake wrote:

> My concern is that this is over a single bench run. I could imagine that
> after a week or two weeks of stead PostgreSQL use, the IO would
> gradually get worse and worse.

Well, we mostly do random reads (seek) all the time anyway, so is this
really a concern?

Are you using NTFS or FAT?  You didn't answer that question.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: PostgreSQL win32 fragmentation issue

From
"Joshua D. Drake"
Date:
On Sat, 2006-12-02 at 13:34 -0300, Alvaro Herrera wrote:
> Joshua D. Drake wrote:
> 
> > My concern is that this is over a single bench run. I could imagine that
> > after a week or two weeks of stead PostgreSQL use, the IO would
> > gradually get worse and worse.
> 
> Well, we mostly do random reads (seek) all the time anyway, so is this
> really a concern?

Wouldn't it be for writes?

> 
> Are you using NTFS or FAT?  You didn't answer that question.

NTFS of course :) (sorry).

Joshua D. Drake


> 
-- 
     === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997            http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate





Re: PostgreSQL win32 fragmentation issue

From
"Andrew Dunstan"
Date:
Alvaro Herrera wrote:
> Joshua D. Drake wrote:
>
>> My concern is that this is over a single bench run. I could imagine that
>> after a week or two weeks of stead PostgreSQL use, the IO would
>> gradually get worse and worse.
>
> Well, we mostly do random reads (seek) all the time anyway, so is this
> really a concern?
>
> Are you using NTFS or FAT?  You didn't answer that question.
>

If it's FAT then you deserve what you get.

If you really need this, a Slony setup would probably work well. Shut down
replica, defrag, start up replica, wait for sync completion, switchover
and repeat. The new Slony release is said to support Windows.

cheers

andrew



Re: PostgreSQL win32 fragmentation issue

From
"Thomas H."
Date:
> Martijn van Oosterhout wrote:
>> On Fri, Dec 01, 2006 at 10:58:44PM -0800, Joshua D. Drake wrote:
>>
>>> I know this isn't *our* fault :) but I am curious if there is anything 
>>> we can do about the way postgresql writes files to help limit 
>>> fragmentation.
>>>
>>> Essentially, this makes win32 impossible in a 24x7 environment (jokes 
>>> aside about Win32 in general) because we *have* to defrag on Windows and
>>> Windows won't defrag open files (thus anything PostgreSQL is using).
>>>
>>
>> BTW, do you know what 11% fragmentation means? Does that mean each file
>> is on average split in 9 pieces, because for a 1GB file, 9 pieces isn't
>> all that bad.

in our win32/ntfs environment, only 6 pgsql data-files are fragmented. but 
they are heavily fragmented. fragmentiation ranges from 1369 fragments for a 
14mb file to 4548 fragments for a 628mb one... the database is only 1 week 
old.

- thomas 




Re: PostgreSQL win32 fragmentation issue

From
"Joshua D. Drake"
Date:
> > Well, we mostly do random reads (seek) all the time anyway, so is this
> > really a concern?
> >
> > Are you using NTFS or FAT?  You didn't answer that question.
> >
> 
> If it's FAT then you deserve what you get.

I am investigating this for the wider community. I do not, use
PostgreSQL Win32 for anything but testing, but guess what -- our silent
majority does. I have a customer with 5000 Win32 installations and they
are not the only ones with those types of deployments.

I am trying to get as much information as possible so that:

1. We can update the documentation to correctly reflect issues with
Win32.

2. We can hopefully become the dominant database on Win32.

> 
> If you really need this, a Slony setup would probably work well.

The idea of Slony with Windows is a bit terryfying considering the
target Admin audience and the performance hit one would take. 

>  Shut down
> replica, defrag, start up replica, wait for sync completion, switchover
> and repeat. The new Slony release is said to support Windows.

O.k., doesn't this seem like killing a squirrel with a 50mm tank gun?
The fact that this was suggested as anything but a last resort isn't to
inspiring.

Sincerely,

Joshua D. Drake



> 
> cheers
> 
> andrew
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
> 
>                http://archives.postgresql.org
> 
-- 
     === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997            http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate





Re: PostgreSQL win32 fragmentation issue

From
"Andrew Dunstan"
Date:
Joshua D. Drake wrote:
>
> O.k., doesn't this seem like killing a squirrel with a 50mm tank gun?
> The fact that this was suggested as anything but a last resort isn't to
> inspiring.
>

Before you start ringing alarm bells, you need to instrument the actual
performance effect. So far all I have seen is an assumption that the
effect will be serious. Let's see some performance metrics that
demonstrate the problem you think might exist.

The Windows port has been out for nearly 2 years. The fact that we have
not seen complaints about this leads me to be somewhat skeptical. You
could be right, but I want hard evidence. (And why wouldn't this be a
problem for any DBMS running on Windows? There are just huge numbers of
24/7 Windows servers running SQLServer or Oracle.)

cheers

andrew




Re: PostgreSQL win32 fragmentation issue

From
"Joshua D. Drake"
Date:
On Sat, 2006-12-02 at 11:05 -0600, Andrew Dunstan wrote:
> Joshua D. Drake wrote:
> >
> > O.k., doesn't this seem like killing a squirrel with a 50mm tank gun?
> > The fact that this was suggested as anything but a last resort isn't to
> > inspiring.
> >
> 
> Before you start ringing alarm bells, you need to instrument the actual
> performance effect.

Alarm bells? I saw a potential issue, I brought it to the community.
That is all. 

>  So far all I have seen is an assumption that the
> effect will be serious. Let's see some performance metrics that
> demonstrate the problem you think might exist.

It is not an assumption that defragmentation causes performance issues.
The assumption is that the above fact will cause problems with
PostgreSQL performance.

Yes, the PostgreSQL portion is an assumption which is why I have brought
it to the community for further analysis.

> 
> The Windows port has been out for nearly 2 years. The fact that we have
> not seen complaints about this leads me to be somewhat skeptical.

Fair enough but I don't hold too much weight to this argument as even
the remote idea that anything but the slimmest <1% of our Wndows users
even read this list (let alone post to it) beyond a possible Google
search is far fetched.

>  You
> could be right, but I want hard evidence.

Sure, which is why I opened the dialog.


>  (And why wouldn't this be a
> problem for any DBMS running on Windows? There are just huge numbers of
> 24/7 Windows servers running SQLServer or Oracle.)

Well I don't know about SQLServer but Oracle uses a single file (at
least it does on Unix). I don't know if that would make a difference or
not.

Sincerely,

Joshua D. Drake


> 
> cheers
> 
> andrew
> 
> 
-- 
     === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997            http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate





Re: PostgreSQL win32 fragmentation issue

From
"Merlin Moncure"
Date:
On 12/2/06, Joshua D. Drake <jd@commandprompt.com> wrote:
> Well I don't know about SQLServer but Oracle uses a single file (at
> least it does on Unix). I don't know if that would make a difference or
> not.

ditto sql server.  I agree with Andrew though, let's determine there
to be negative performance impact before assuming the worst.

merlin


Re: PostgreSQL win32 fragmentation issue

From
"Andrew Dunstan"
Date:
Joshua D. Drake wrote:
> On Sat, 2006-12-02 at 11:05 -0600, Andrew Dunstan wrote:
>> Joshua D. Drake wrote:
>> >
>> > O.k., doesn't this seem like killing a squirrel with a 50mm tank gun?
>> > The fact that this was suggested as anything but a last resort isn't
>> to
>> > inspiring.
>> >
>>
>> Before you start ringing alarm bells, you need to instrument the actual
>> performance effect.
>
> Alarm bells? I saw a potential issue, I brought it to the community.
> That is all.

You said:

"Essentially, this makes win32 impossible in a 24x7 environment (jokes
aside about Win32 in general) because we *have* to defrag on Windows and
Windows won't defrag open files (thus anything PostgreSQL is using)."

What I am asking for is evidence to back up that assertion. At the moment
it is, as Tom likes to say, "a fact not in evidence".

(I should have thought a single file DB would be MORE liable to
fragmentation, BTW).

cheers

andrew



Re: PostgreSQL win32 fragmentation issue

From
"Thomas H."
Date:
> (I should have thought a single file DB would be MORE liable to
> fragmentation, BTW).

AFAIK what mssql does against fragmentation is: it preallocates its 
tablespace in chunks of several mb/gb and then starts filling the free 
allocated space until it has to reallocate more. the fragmentation for such 
a single file (actually in most times 2 files for mssql) is practically 
ignorable.

- thomas 




Re: PostgreSQL win32 fragmentation issue

From
"Simon Riggs"
Date:
On Sat, 2006-12-02 at 09:22 -0800, Joshua D. Drake wrote:
> On Sat, 2006-12-02 at 11:05 -0600, Andrew Dunstan wrote:
> > Joshua D. Drake wrote:
> > >
> > > O.k., doesn't this seem like killing a squirrel with a 50mm tank gun?
> > > The fact that this was suggested as anything but a last resort isn't to
> > > inspiring.
> > >
> > 
> > Before you start ringing alarm bells, you need to instrument the actual
> > performance effect.
> 
> Alarm bells? I saw a potential issue, I brought it to the community.
> That is all. 

That's understood and appreciated. In case no one else says it: thanks.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com




Re: PostgreSQL win32 fragmentation issue

From
"Joshua D. Drake"
Date:
Starting point: 1% Fragmentation
transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 50
number of transactions per client: 1000
number of transactions actually processed: 50000/50000
tps = 46.001927 (including connections establishing)
tps = 46.045008 (excluding connections establishing)

Total Fragmentation: 2%
File Fragmentation: 5%

transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 50
number of transactions per client: 1000
number of transactions actually processed: 50000/50000
tps = 50.507082 (including connections establishing)
tps = 50.558191 (excluding connections establishing)

Total Fragmentation: 2%
File Fragmentation: 5%

transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 100
number of transactions per client: 1000
number of transactions actually processed: 100000/100000
tps = 45.976461 (including connections establishing)
tps = 46.026986 (excluding connections establishing)

Total Fragmentation: 2%
File Fragmentation: 5%

Initialize with scaling of 100

Total Fragmentation: 2%
File Fragmentation: 5%

transaction type: TPC-B (sort of)
scaling factor: 100
number of clients: 50
number of transactions per client: 1000
number of transactions actually processed: 50000/50000
tps = 43.054164 (including connections establishing)
tps = 43.090163 (excluding connections establishing)

Total Fragmentation: 2%
File Fragmentation: 5%


O.k. my guess here is that autovacuum is keeping up with the actual
pgbench runs and thus causing reusable tuples? Does that make sense?

Joshua D. Drake




-- 
     === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997            http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate





Re: PostgreSQL win32 fragmentation issue

From
"Magnus Hagander"
Date:
> >>> I know this isn't *our* fault :) but I am curious if there is
> >>> anything we can do about the way postgresql writes files to help
> >>> limit fragmentation.
> >>>
> >>> Essentially, this makes win32 impossible in a 24x7 environment
> >>> (jokes aside about Win32 in general) because we *have* to
> defrag on
> >>> Windows and Windows won't defrag open files (thus
> anything PostgreSQL is using).
> >>>
> >>
> >> BTW, do you know what 11% fragmentation means? Does that mean each
> >> file is on average split in 9 pieces, because for a 1GB file, 9
> >> pieces isn't all that bad.
>
> in our win32/ntfs environment, only 6 pgsql data-files are
> fragmented. but they are heavily fragmented. fragmentiation
> ranges from 1369 fragments for a 14mb file to 4548 fragments
> for a 628mb one... the database is only 1 week old.

Which relations do these files represent?

//Magnus


Re: PostgreSQL win32 fragmentation issue

From
"Thomas H."
Date:
>> in our win32/ntfs environment, only 6 pgsql data-files are
>> fragmented. but they are heavily fragmented. fragmentiation
>> ranges from 1369 fragments for a 14mb file to 4548 fragments
>> for a 628mb one... the database is only 1 week old.

> Which relations do these files represent?

all 6 of them tables are tables. makes sense as reindex would recreate the 
index file more or less at once and thus without much fragmentation...

- thomas 




Re: PostgreSQL win32 fragmentation issue

From
Mark Kirkwood
Date:
Andrew Dunstan wrote:

> 
> (I should have thought a single file DB would be MORE liable to
> fragmentation, BTW).
> 

The likes of Oracle/SQLServer preallocate the files(s) at database 
creation time, so there will be little or no initial fragmentation.

However, they are typically allowed to grow to accommodate extra data - 
this means that fragmentation over time *can* become a problem for these 
guys too - but looks like we can suffer it from day 1.

Cheers

Mark


Re: PostgreSQL win32 fragmentation issue

From
Mark Kirkwood
Date:
Mark Kirkwood wrote:
> Andrew Dunstan wrote:
> 
>>
>> (I should have thought a single file DB would be MORE liable to
>> fragmentation, BTW).
>>
> 
> The likes of Oracle/SQLServer preallocate the files(s) at database 
> creation time, so there will be little or no initial fragmentation.
> 

... To clarify - preallocated with free space in the files :-) ...

A related point comes to mind, consider the single "tablespace" file for 
example: It may have no fragmentation from a *file* perspective, but 
supposing you have several busy relations being inserted or updated then 
pages (or groups of pages) for these could be "fragmented" throughout 
the tablespace file. This can reduce scan performance, since seeks are 
now involved to "jump over" the page chunks for relations you are not 
interested in.

Some Oracle DBA's I knew used to regularly export|rebuild|import their 
databases precisely to "defragment" their relation|file page distribution.

I'm not trying to say that we don't have an issue, just pointing out 
that preallocated files are not necessarily a complete or ideal solution.

Cheers

Mark


Re: PostgreSQL win32 fragmentation issue

From
"Thomas H."
Date:
by the way... at least on windows 2003 file fragmentation is *not* really an 
issue. thanks to the vssvc (shadowcopy) even files that are in use are being 
defragmented. as far as i know, vssvc is also there in xp, but i can't 
verify.

running a quick defrag on the db data partition defragmented all the 
beforementioned heavily fragmented relation files without any problems, 
eventhough the tables are actually being updated and in use while 
defragmenting.

a regular defrag will be sufficient to keep the performance hit low - if 
there is any noticable in first place, that is.

- thomas 




Re: PostgreSQL win32 fragmentation issue

From
Shane Ambler
Date:
Joshua D. Drake wrote:

> I am trying to get as much information as possible so that:
> 

Well I have a small postgresql setup on my girlfriends Win XP Pro 
machine that only had a little testing use a few months ago.

The data folder is 84MB and even though there is 25% total fragmentation 
and 45% file fragmentation on the drive - the largest being the 
pc-cillin log with 75,104 fragments - none of the fragmented files are 
postgresql data files. (Been a while since I defraged this)

If we stop and think about how windows (and mac os x) allocate file 
space as compared to *nix file systems then we can assume a few things 
that could be tested to verify.

The first test you mentioned you defraged with an existing db then ran 
the tests and then checked fragmentation again. Try this instead - 
remove the contents of the data folder - defrag - initdb - run test and 
see what fragmentation you get.

Try installing the data folder on a separate drive with nothing else on 
it and run the test and then look at fragmentation.

How about some performance tests to measure the difference between badly 
fragmented data files and un-fragmented data files. It may not make a 
significant difference.


-- 

Shane Ambler
pgSQL@007Marketing.com

Get Sheeky @ http://Sheeky.Biz


Re: PostgreSQL win32 fragmentation issue

From
Tom Lane
Date:
Mark Kirkwood <markir@paradise.net.nz> writes:
> A related point comes to mind, consider the single "tablespace" file for 
> example: It may have no fragmentation from a *file* perspective, but 
> supposing you have several busy relations being inserted or updated then 
> pages (or groups of pages) for these could be "fragmented" throughout 
> the tablespace file.

+1 ... what was said upthread sounds to me like those other databases
are just hiding the fragmentation issue within their
huge-files-you-can't-see-into.  I would very much like to see some proof
of performance problems before we worry about this.
        regards, tom lane


Re: PostgreSQL win32 fragmentation issue

From
"Gurjeet Singh"
Date:
On 12/2/06, Joshua D. Drake <jd@commandprompt.com> wrote:
On Sat, 2006-12-02 at 13:34 -0300, Alvaro Herrera wrote:
> Joshua D. Drake wrote:
>
> > My concern is that this is over a single bench run. I could imagine that
> > after a week or two weeks of stead PostgreSQL use, the IO would
> > gradually get worse and worse.
>
> Well, we mostly do random reads (seek) all the time anyway, so is this
> really a concern?

Wouldn't it be for writes?

Also, huge index range scans can be badly affected by file level fragmentation.

Regards,

--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | yahoo }.com

Re: PostgreSQL win32 fragmentation issue

From
"Gurjeet Singh"
Date:
On 12/3/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Mark Kirkwood <markir@paradise.net.nz> writes:
> A related point comes to mind, consider the single "tablespace" file for
> example: It may have no fragmentation from a *file* perspective, but
> supposing you have several busy relations being inserted or updated then
> pages (or groups of pages) for these could be "fragmented" throughout
> the tablespace file.

+1 ... what was said upthread sounds to me like those other databases
are just hiding the fragmentation issue within their
huge-files-you-can't-see-into.  I would very much like to see some proof
of performance problems before we worry about this.


Does this raise the need for an in-postgres de-fragmenter?

--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | yahoo }.com
GTODO

Re: PostgreSQL win32 fragmentation issue

From
"Magnus Hagander"
Date:
> > A related point comes to mind, consider the single
> "tablespace" file
> > for
> > example: It may have no fragmentation from a *file*
> perspective, but
> > supposing you have several busy relations being inserted or updated
> > then pages (or groups of pages) for these could be "fragmented"
> > throughout the tablespace file.
>
> +1 ... what was said upthread sounds to me like those other databases
> are just hiding the fragmentation issue within their
> huge-files-you-can't-see-into.  I would very much like to see
> some proof of performance problems before we worry about this.

Yes, they definitly do. That's why for exapmle SQLServer provides the
commands "DBCC CHECKCONTIG" and "DBCC INDEXDEFRAG". Along with comments
like this:

"DBCC SHOWCONTIG determines whether the table is heavily fragmented.
Table fragmentation occurs through the process of data modifications
(INSERT, UPDATE, and DELETE statements) made against the table. Because
these modifications are not ordinarily distributed equally among the
rows of the table, the fullness of each page can vary over time. For
queries that scan part or all of a table, such table fragmentation can
cause additional page reads. This hinders parallel scanning of data."



//Magnus


Re: PostgreSQL win32 fragmentation issue

From
Martijn van Oosterhout
Date:
On Sun, Dec 03, 2006 at 11:54:05AM +0100, Magnus Hagander wrote:
> Yes, they definitly do. That's why for exapmle SQLServer provides the
> commands "DBCC CHECKCONTIG" and "DBCC INDEXDEFRAG". Along with comments
> like this:
>
> "DBCC SHOWCONTIG determines whether the table is heavily fragmented.
> Table fragmentation occurs through the process of data modifications
> (INSERT, UPDATE, and DELETE statements) made against the table. Because
> these modifications are not ordinarily distributed equally among the
> rows of the table, the fullness of each page can vary over time. For
> queries that scan part or all of a table, such table fragmentation can
> cause additional page reads. This hinders parallel scanning of data."

But that's measuring something else I think. That's not looking at how
the pages are physically mapped on disk, but at how tuples are spread
across pages.. Maybe in sqlserver tuples can span pages?

That's not a kind of fragmentation relevent to postgresql. This is what
the FSM is for.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: PostgreSQL win32 fragmentation issue

From
"Magnus Hagander"
Date:
> > "DBCC SHOWCONTIG determines whether the table is heavily fragmented.
> > Table fragmentation occurs through the process of data
> modifications
> > (INSERT, UPDATE, and DELETE statements) made against the table.
> > Because these modifications are not ordinarily distributed equally
> > among the rows of the table, the fullness of each page can
> vary over
> > time. For queries that scan part or all of a table, such table
> > fragmentation can cause additional page reads. This hinders
> parallel scanning of data."
>
> But that's measuring something else I think. That's not
> looking at how the pages are physically mapped on disk, but
> at how tuples are spread across pages.. Maybe in sqlserver
> tuples can span pages?

I don't beleive they can (except for IMAGE and TEXT data, which is
handled like our TOAST data).

That said, it returns two numbers:
Scan Density, which shows how many more pages it needs to hit than what
would be ideal
and
Locical Scan Fragmentation, which shows the percentage of "out-of-order
pages" it hits. And isn't "out-of-order pages" exactly what file system
fragmentation would leave us?  The difference between the physical page
and the logical page location.

Given that they preallocate files, they only have this kind of
fragmentation at one level. Since we don't, we can have this both inside
the file and in the fliesystem. But it's still the same thing, isn't it?


//Magnus


Re: PostgreSQL win32 fragmentation issue

From
jhaile@gmail.com
Date:
I have a production database that is running on Win32 NTFS with
PostgreSQL 8.2.1.  The drive that the data is stored on is dedicated to
PostgreSQL data files (transaction log stored on a separate disk)  The
data partition is currently 44% fragmented, which I believe is
resulting in performance degradation.  I don't have any benchmarks
though.


"Joshua D. Drake" wrote:
> Hello,
>
> I ran a simple test... I ran defrag on my drives. Then I initialized
> pgbench with a -s 1000.
>
> 11% fragmentation. I dropped the bench database, and my fragmentation is 1%.
>
> I know this isn't *our* fault :) but I am curious if there is anything
> we can do about the way postgresql writes files to help limit fragmentation.
>
> Essentially, this makes win32 impossible in a 24x7 environment (jokes
> aside about Win32 in general) because we *have* to defrag on Windows and
> Windows won't defrag open files (thus anything PostgreSQL is using).
>
> Sincerely,
>
> Joshua D. Drake
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend