Thread: sorting big tables :(

sorting big tables :(

From: Michael Richards

I think soon people are going to start calling me Mr. Big...Tables...

I have a big table. 40M rows.
On disk, its size is:
 2,090,369,024 bytes. So 2 gigs. On a 9 gig drive I can't sort this table.
How should one decide, based on table size, how much room is needed?

Also, this simple table, consisting of only two int4 values, is the exact
size of a table with the same number of rows consisting of only one int2.
There seems to be too much overhead here. I realise there are extra things
that have to be saved, but I am not getting the size/performance I had hoped
for... I am starting to think this segment of the database would be better
implemented without a dbms, because it is not expected to change at all...

-Mike


Re: [HACKERS] sorting big tables :(

From: Bruce Momjian

>
> I think soon people are going to start calling me Mr. Big...Tables...
>
> I have a big table. 40M rows.
> On the disk, it's size is:
>  2,090,369,024 bytes. So 2 gigs. On a 9 gig drive I can't sort this table.
> How should one decide based on table size how much room is needed?
>
> Also, this simple table consisting of only 2 int4 values is the exact size
> of an equally sized table consisting of only one int2. There seems to be
> too much overhead here. I realise there are extra things that have to be
> saved, but I am not getting the size/performance I had hoped for... I am
> starting to think this segment of the database would be better implemented
> without a dbms because it is not expected to change at all...
>

It is taking so much disk space because it is using a TAPE sorting
method: it breaks the file into tape chunks, sorts the pieces, then
merges them.

Can you try increasing your postgres -S parameter to some huge amount,
like 32MB, and see if that helps?  It should.

e.g.

    postmaster -i -B 400 $DEBUG -o '-F -S 1024' "$@" >server.log 2>&1
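
A minimal sketch of the tape-sort idea being described, in Python; this is
only an illustration, not the actual backend psort code, and the function
name and run size below are made up for the example:

    import heapq, os, tempfile

    def external_sort(infile, outfile, run_bytes=64 * 1024 * 1024):
        # Pass 1: read the input in memory-sized chunks, sort each chunk,
        # and write it out as a temporary "tape" (run) file.
        runs = []
        with open(infile) as f:
            while True:
                chunk = f.readlines(run_bytes)   # about run_bytes worth of lines
                if not chunk:
                    break
                chunk.sort()
                tmp = tempfile.NamedTemporaryFile("w", delete=False, suffix=".run")
                tmp.writelines(chunk)
                tmp.close()
                runs.append(tmp.name)
        # Pass 2: merge all the runs into the output.  Peak disk usage is
        # input + run files + output, i.e. a few times the input size.
        readers = [open(r) for r in runs]
        with open(outfile, "w") as out:
            out.writelines(heapq.merge(*readers))
        for r in readers:
            r.close()
        for name in runs:
            os.remove(name)

A larger -S means bigger in-memory chunks, hence fewer and larger runs to
merge, which is why raising it should help.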



--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] sorting big tables :(

From: Michael Richards

On Fri, 15 May 1998, Bruce Momjian wrote:

> > I have a big table. 40M rows.
> > On the disk, it's size is:
> >  2,090,369,024 bytes. So 2 gigs. On a 9 gig drive I can't sort this table.
> > How should one decide based on table size how much room is needed?

> It is taking so much disk space because it is using a TAPE sorting
> method, by breaking the file into tape chunks and sorting in pieces, the
The files grow until I have 6 files of almost a gig each. At that point, I
start running out of space...
This TAPE sorting method: is it a simple merge sort? Do you know of a way
this could be done using constant space and no more complexity in the
algorithm? Even if it is a little slower, the DBMS could decide, based
on the table size, whether it should use the tape sort or another one...
Bubble sort would not be my first choice, though :)

-Mike


Re: [HACKERS] sorting big tables :(

From: Bruce Momjian

>
> On Fri, 15 May 1998, Bruce Momjian wrote:
>
> > > I have a big table. 40M rows.
> > > On the disk, it's size is:
> > >  2,090,369,024 bytes. So 2 gigs. On a 9 gig drive I can't sort this table.
> > > How should one decide based on table size how much room is needed?
>
> > It is taking so much disk space because it is using a TAPE sorting
> > method, by breaking the file into tape chunks and sorting in pieces, the
> The files grow until I have 6 files of almost a gig each. At that point, I
> start running out of space...
> This TAPE sotring method. It is a simple merge sort? Do you know of a way
> this could be done while using constant space and no more complexity in
> the algorithim. Even if it is a little slower, the DBMS could decide based
> on the table size whether it should use the tape sort or another one...
> Bubble sort would not be my first choice tho :)

Tape sort is a standard Knuth sort.  It basically sorts in pieces,
and merges.  If you don't do this, the access pattern gets very poor
as you page-fault all over the file, and the cache becomes useless.

There is something optimal about having seven sort files.  Not sure what
to suggest.  No one has complained about this before.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] sorting big tables :(

From: dg@illustra.com (David Gould)

>
> >
> > On Fri, 15 May 1998, Bruce Momjian wrote:
> >
> > > > I have a big table. 40M rows.
> > > > On the disk, it's size is:
> > > >  2,090,369,024 bytes. So 2 gigs. On a 9 gig drive I can't sort this table.
> > > > How should one decide based on table size how much room is needed?
> >
> > > It is taking so much disk space because it is using a TAPE sorting
> > > method, by breaking the file into tape chunks and sorting in pieces, the
> > The files grow until I have 6 files of almost a gig each. At that point, I
> > start running out of space...
> > This TAPE sotring method. It is a simple merge sort? Do you know of a way
> > this could be done while using constant space and no more complexity in
> > the algorithim. Even if it is a little slower, the DBMS could decide based
> > on the table size whether it should use the tape sort or another one...
> > Bubble sort would not be my first choice tho :)
>
> Tape sort is a standard Knuth sorting.  It basically sorts in pieces,
> and merges.  If you don't do this, the accessing around gets very poor
> as you page fault all over the file, and the cache becomes useless.
>
> There is something optimal about having seven sort files.  Not sure what
> to suggest.  No one has complained about this before.

I think this is a bug. There is no reason to use more than a little bit over
three times the input size for a sort, that is: input file, run files, output
file. If we are not able to sort a 2 gig table on a 9 gig partition, we need
to fix it. I suspect we have a bug in the implementation, but perhaps we
need to look at our choice of algorithm. In any case, this problem should go
on the TODO list.
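
As a rough sanity check of that three-times bound, using the numbers from
the original post (a quick Python calculation, nothing more):

    table_bytes = 2_090_369_024            # reported size of the table on disk
    peak = 3 * table_bytes                 # input file + run files + output file
    print(f"{peak / 2**30:.1f} GiB")       # about 5.8 GiB, well under a 9 gig drive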

-dg

David Gould            dg@illustra.com           510.628.3783 or 510.305.9468
Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
"Of course, someone who knows more about this will correct me if I'm wrong,
 and someone who knows less will correct me if I'm right."
               --David Palmer (palmer@tybalt.caltech.edu)

Re: [HACKERS] sorting big tables :(

From: The Hermit Hacker

On Sun, 17 May 1998, David Gould wrote:

> I think this is a bug. There is no reason to use more than a little bit over
> three times the input size for a sort. This is: input file, run files, output
> file. If we are not able to sort a 2 gig table on a 9 gig partition we need
> to fix it. I suspect we have a bug in the implementation, but perhaps we
> need to look at our choice of algorithm. In any case this problem should go
> on the todo list.

    Have to agree here...

    Michael...if you were to dump that table into a text file, how big
would it turn out to be?  Much smaller than 2 gig, no?  Then perform a Unix
sort on that; how long would that take?  Then reload the data...

    Needing more than 7 gig to sort a 2 gig table sounds slightly off to
me as well :(

Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org


Re: [HACKERS] sorting big tables :(

From: Michael Richards

On Sun, 17 May 1998, Bruce Momjian wrote:

> > > > I have a big table. 40M rows.
> > > > On the disk, it's size is:
> > > >  2,090,369,024 bytes. So 2 gigs. On a 9 gig drive I can't sort this table.
> > > > How should one decide based on table size how much room is needed?
>
> Tape sort is a standard Knuth sorting.  It basically sorts in pieces,
> and merges.  If you don't do this, the accessing around gets very poor
> as you page fault all over the file, and the cache becomes useless.
Right. I wasn't reading the right chapter. Internal sorting is much
different from external sorting; internal suggests the use of a quicksort
algorithm.
Marc and I discussed this over lunch. If I did a select * into, would it not
make more sense to sort the results directly into the resulting table rather
than into pieces that are then copied into a table? From my limited
knowledge, I think this should save about 8/7 N of the space.
On this issue, I think there must be a lot more overhead than necessary.
The table consists of only
int4, int4, int2
so I read that as 10 bytes / row of actual data.
Instead, 2 gigs / 40M rows is about
50 bytes / record.
What is there other than the oid (4? bytes)?
-Mike


Re: [HACKERS] sorting big tables :(

From: Bruce Momjian

>
> On Sun, 17 May 1998, Bruce Momjian wrote:
>
> > > > > I have a big table. 40M rows.
> > > > > On the disk, it's size is:
> > > > >  2,090,369,024 bytes. So 2 gigs. On a 9 gig drive I can't sort this table.
> > > > > How should one decide based on table size how much room is needed?
> >
> > Tape sort is a standard Knuth sorting.  It basically sorts in pieces,
> > and merges.  If you don't do this, the accessing around gets very poor
> > as you page fault all over the file, and the cache becomes useless.
> Right. I wasn't reading the right chapter. Internal sorting is much
> different than external sorts. Internal suggests the use of a Quicksort
> algorithim.
> Marc and I discussed over lunch. If I did a select * into, would it not
> make more sense to sort the results into the resulting table rather than
> into pieces and then copy into a table? From my limited knowlege, I think
> this should save 8/7 N the space.
> In this issue, I think there must be a lot more overhead than necessary.

Not sure if the internal tape is the same structure as a real table, but
I doubt it.  I seem to remember there is less overhead.

> The table consists of only
> int4, int4, int2
> I read 10 bytes / row of actual data here.
> Instead, 40M/2gigs is about
> 50 bytes / record
> What is there other than oid (4? bytes)

There is internal stuff so that it looks like a real table, even though it
is only a result, I think.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] sorting big tables :(

From: The Hermit Hacker

On Tue, 19 May 1998, Bruce Momjian wrote:

> >
> > On Sun, 17 May 1998, Bruce Momjian wrote:
> >
> > > > > > I have a big table. 40M rows.
> > > > > > On the disk, it's size is:
> > > > > >  2,090,369,024 bytes. So 2 gigs. On a 9 gig drive I can't sort this table.
> > > > > > How should one decide based on table size how much room is needed?
> > >
> > > Tape sort is a standard Knuth sorting.  It basically sorts in pieces,
> > > and merges.  If you don't do this, the accessing around gets very poor
> > > as you page fault all over the file, and the cache becomes useless.
> > Right. I wasn't reading the right chapter. Internal sorting is much
> > different than external sorts. Internal suggests the use of a Quicksort
> > algorithim.
> > Marc and I discussed over lunch. If I did a select * into, would it not
> > make more sense to sort the results into the resulting table rather than
> > into pieces and then copy into a table? From my limited knowlege, I think
> > this should save 8/7 N the space.
> > In this issue, I think there must be a lot more overhead than necessary.
>
> Not sure if the internal tape is the same structure as a real table, but
> I doubt it.  I seem to remember there is less overhead.
>
> > The table consists of only
> > int4, int4, int2
> > I read 10 bytes / row of actual data here.
> > Instead, 40M/2gigs is about
> > 50 bytes / record
> > What is there other than oid (4? bytes)
>
> Internal stuff so it looks like a real table, even though it is a
> result, I think.

Okay...I get to jump in here with both feet and arms flailing :)

Michael and I had lunch today and talked about this, and I asked him to
send an email in to the list about it...unfortunately, he didn't translate
our chat very well for here :)

This whole thing makes absolutely no sense to me, as far as why it takes
2.5 times more space to *sort* the table than the table size itself.

He starts with a 2 gig table, and it runs out of disk space on a 9 gig file
system...

Now, looking at question 3.26 in the FAQ, we have:

   40 bytes   each row header (approximate)
 + 10 bytes   two int4 fields + one int2 field
 +  4 bytes   pointer on page to tuple
 ----------
   54 bytes per row

The data page size in PostgreSQL is 8192 (8k) bytes, so:

8192 bytes per page
-------------------  =  151 rows per database page (rounded down)
 54 bytes per row

40000000 data rows
------------------  =  264901 database pages (rounded up)
151 rows per page

264901 database pages * 8192 bytes per page  =  2,170,068,992 or ~2.0 gig
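
The same estimate as a quick Python script, using the FAQ's approximate
figures (nothing here is measured from the actual table):

    PAGE = 8192                    # PostgreSQL data page size
    ROW = 40 + 10 + 4              # row header + (int4, int4, int2) + item pointer
    ROWS = 40_000_000

    rows_per_page = PAGE // ROW            # 151, rounded down (rows must fit whole)
    pages = -(-ROWS // rows_per_page)      # 264,901, rounded up
    print(pages, pages * PAGE)             # 264901 pages, 2,170,068,992 bytes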

Now, as a text file, this would amount to, what...~50MB?

So, if I were to do a 'copy out' to a text file, a Unix sort and then a
'copy in', I would use up *less* disk space (by several orders of
magnitude) than doing the sort inside of PostgreSQL?

Why?

Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org


Re: [HACKERS] sorting big tables :(

From: Michal Mosiewicz

The Hermit Hacker wrote:

> Now, as a text file, this would amount to, what...~50MB?
40M records producing a 50MB text file? How would you sort such a
*compressed* file? ;-)

> So, if I were to do a 'copy out' to a text file, a Unix sort and then a
> 'copy in', I would use up *less* disk space (by several orders of
> magnitude) then doing the sort inside of PostgreSQL?

Well, I think it might be optimised slightly. Am I right that postgres
uses heap files (i.e. they look like tables) during sorting? While this
is a merge sort, those files don't have to be table-like files.
Certainly, they might use variable-length records without pages (aren't
they read sequentially?). Moreover, we could consider packing the tape
files before writing them out if necessary. Of course it will cost some
performance, but it's better to lose some performance than to be unable
to sort at all.
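
A sketch of what such a packed tape file could look like for this particular
table: just the three fields back to back, 10 bytes per record, with no page
headers, row headers or item pointers, since runs are only ever read
sequentially. This is a hypothetical format, purely for illustration:

    import struct

    REC = struct.Struct("<iih")            # int4, int4, int2  ->  10 bytes/record

    def write_run(path, records):
        # Write one sorted run as tightly packed fixed-size records.
        with open(path, "wb") as f:
            for a, b, c in sorted(records):
                f.write(REC.pack(a, b, c))

    def read_run(path):
        # Read the run back sequentially, one record at a time.
        with open(path, "rb") as f:
            while (raw := f.read(REC.size)):
                yield REC.unpack(raw)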

Last question... what's the purpose of such a big sort? If somebody gets
40M sorted records as the result of some query, what will he do with
them? Is he going to spend the next few years reading the listing? I
mean, isn't it better to query the database for just the necessary
information and then sort that?

Mike

--
WWW: http://www.lodz.pdi.net/~mimo  tel: Int. Acc. Code + 48 42 148340
add: Michal Mosiewicz  *  Bugaj 66 m.54 *  95-200 Pabianice  *  POLAND

Re: [HACKERS] sorting big tables :(

From: The Hermit Hacker

On Wed, 20 May 1998, Michal Mosiewicz wrote:

> The Hermit Hacker wrote:
>
> > Now, as a text file, this would amount to, what...~50MB?
> 40M of records to produce a 50MB text file? How would you sort such a
> *compressed* file? ;-)

My math off?  40M rows at 11 bytes each (2 x int4 + int2 + \n?)  oops...yeah,
just off by a factor of ten...still, 500MB is a quarter of the size of the
2 gig file we started with...

> > So, if I were to do a 'copy out' to a text file, a Unix sort and then a
> > 'copy in', I would use up *less* disk space (by several orders of
> > magnitude) then doing the sort inside of PostgreSQL?
>
> Well, I think it might be optimised slightly. Am I right that postgres
> uses heap (i.e. they look like tables) files during sorting? While this
> is a merge sort, those files doesn't have to be a table-like files.
> Certainly, they might variable length records without pages (aren't they
> used sequentially). Moreover we would consider packing tape files before
> writting them down if necessary. Of course it will result in some
> performance dropdown. However it's better to have less performance that
> being unable to sort it at all.
>
> Last question... What's the purpose of such a big sort? If somebody gets
> 40M of sorted records in a result of some query, what would he do with
> it? Is he going to spent next years on reading this lecture? I mean,
> isn't it worth to query the database for necessary informations only and
> then sort it?

    this I don't know...I never even really thought about that,
actually...Michael? :)  Only you can answer that one.



Re: [HACKERS] sorting big tables :(

From: Bruce Momjian

>
> On Wed, 20 May 1998, Michal Mosiewicz wrote:
>
> > The Hermit Hacker wrote:
> >
> > > Now, as a text file, this would amount to, what...~50MB?
> > 40M of records to produce a 50MB text file? How would you sort such a
> > *compressed* file? ;-)
>
> My math off?  40M rows at 11bytes each (2xint4+int2+\n?)  oops...ya, just
> off by a factor of ten...still, 500MB is a quarter of the size of the 2gig
> file we started with...

Actually, my description of the use of tape files was somewhat off.
The file is sorted by putting several batches in each tape file, then
reading those batches to make new tape files with bigger batches, until
there is one tape file holding one big sorted batch.  Also, if the data
is already sorted, it can be done in one pass, without making all those
small batches, because of the way the in-memory data structure orders
them.  Only Knuth can do the description justice, but suffice it to say
that the data can appear in up to two places at once.
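
The "two places at once" trick is replacement selection: a heap tracks which
run each record belongs to, on random input the runs come out roughly twice
the size of memory, and already-sorted input collapses into a single run. A
toy Python sketch of just the run-forming step (not the actual psort code;
the names are invented for the example):

    import heapq
    from itertools import islice

    def form_runs(records, memory_slots):
        it = iter(records)
        heap = [(0, r) for r in islice(it, memory_slots)]   # all start in run 0
        heapq.heapify(heap)
        runs, out, current = [], [], 0
        while heap:
            run, r = heapq.heappop(heap)
            if run != current:        # smallest record left belongs to the next run
                runs.append(out)
                out, current = [], run
            out.append(r)
            nxt = next(it, None)
            if nxt is not None:
                # A record smaller than the one just written must wait for
                # the next run; otherwise it can still extend this run.
                heapq.heappush(heap, (run + (nxt < r), nxt))
        if out:
            runs.append(out)
        return runs

    # form_runs([5, 1, 9, 2, 8, 3], memory_slots=3)  ->  [[1, 2, 5, 8, 9], [3]]
    # form_runs(range(100), memory_slots=3)          ->  one single sorted run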

This is the first time I remember someone complaining about it.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] sorting big tables :(

From: Bruce Momjian

> > Well, I think it might be optimised slightly. Am I right that postgres
> > uses heap (i.e. they look like tables) files during sorting? While this
> > is a merge sort, those files doesn't have to be a table-like files.
> > Certainly, they might variable length records without pages (aren't they
> > used sequentially). Moreover we would consider packing tape files before
> > writting them down if necessary. Of course it will result in some
> > performance dropdown. However it's better to have less performance that
> > being unable to sort it at all.
> >
> > Last question... What's the purpose of such a big sort? If somebody gets
> > 40M of sorted records in a result of some query, what would he do with
> > it? Is he going to spent next years on reading this lecture? I mean,
> > isn't it worth to query the database for necessary informations only and
> > then sort it?
>
>     this I don't know...I never even really thought about that,
> actually...Michael? :)  Only you can answer that one.

I have an idea.  Can he run CLUSTER on the data?  If so, the sort will
not use small batches, and the disk space during sort will be reduced.
However, I think CLUSTER will NEVER finish on such a file, unless it is
already pretty well sorted.


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] sorting big tables :(

From: The Hermit Hacker

On Wed, 20 May 1998, Bruce Momjian wrote:

> > > Well, I think it might be optimised slightly. Am I right that postgres
> > > uses heap (i.e. they look like tables) files during sorting? While this
> > > is a merge sort, those files doesn't have to be a table-like files.
> > > Certainly, they might variable length records without pages (aren't they
> > > used sequentially). Moreover we would consider packing tape files before
> > > writting them down if necessary. Of course it will result in some
> > > performance dropdown. However it's better to have less performance that
> > > being unable to sort it at all.
> > >
> > > Last question... What's the purpose of such a big sort? If somebody gets
> > > 40M of sorted records in a result of some query, what would he do with
> > > it? Is he going to spent next years on reading this lecture? I mean,
> > > isn't it worth to query the database for necessary informations only and
> > > then sort it?
> >
> >     this I don't know...I never even really thought about that,
> > actually...Michael? :)  Only you can answer that one.
>
> I have an idea.  Can he run CLUSTER on the data?  If so, the sort will
> not use small batches, and the disk space during sort will be reduced.
> However, I think CLUSTER will NEVER finish on such a file, unless it is
> already pretty well sorted.

    Okay...then we *do* have a table size limit problem?  Tables that
just get too large to be manageable?  Maybe this is one area we should be
looking at as far as performance is concerned?

    One thing that just popped to mind, concerning the above CLUSTER
command...what would it take to have *auto-clustering*?  Maybe provide a
means of marking a field in a table for this purpose?

    One of the things that the Unix FS does is auto-defragmenting, at
least the UFS one does.  Whenever the system is idle (from my
understanding), the kernel uses that time to clean up the file systems, to
reduce the file system fragmentation.

    This is by no means SQL92, but it would be a neat
"extension"...let me specify a "CLUSTER on" field.  Then, as I'm entering
data into the database, periodically check for fragmentation of the data
and clean up accordingly.  If done by the system, reasonably often, it
shouldn't take up *too* much time, as most of the data should already be
in order...

    That would have the side-benefit of speeding up the "ORDER by" on
that field also...




Re: [HACKERS] sorting big tables :(

From: Bruce Momjian

> > I have an idea.  Can he run CLUSTER on the data?  If so, the sort will
> > not use small batches, and the disk space during sort will be reduced.
> > However, I think CLUSTER will NEVER finish on such a file, unless it is
> > already pretty well sorted.
>
>     Okay...then we *do* have a table size limit problem?  Tables that
> just get too large to be manageable?  Maybe this is one area we should be
> looking at as far as performance is concerned?

Well, cluster moves one row at a time, so if the table is very
fragmented, the code is slow because it is seeking all over the table.
See the cluster manual page for an alternate solution that uses ORDER
BY.


>
>     One thing that just pop'd to mind, concerning the above CLUSTER
> command...what would it take to have *auto-cluster'ng*?  Maybe provide a
> means of marking a field in a table for this purpose?

Hard to do.  That's what we have indexes for.

>
>     One of the things that the Unix FS does is auto-defragmenting, at
> least the UFS one does.  Whenever the system is idle (from my
> understanding), the kernel uses that time to clean up the file systems, to
> reduce the file system fragmentation.
>
>     This is by no means SQL92, but it would be a neat
> "extension"...let me specify a "CLUSTER on" field.  Then, as I'm entering
> data into the database, periodically check for fragmentation of the data
> and clean up accordingly.  If done by the system, reasonably often, it
> shouldn't take up *too* much time, as most of the data should already be
> in order...
>
>     That would have the side-benefit of speeding up the "ORDER by" on
> that field also...

We could actually have a CLUSTER ALL command that does this.  No one has
implemented it yet.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] sorting big tables :(

From: Tom

On Wed, 20 May 1998, The Hermit Hacker wrote:

>     One of the things that the Unix FS does is auto-defragmenting, at
> least the UFS one does.  Whenever the system is idle (from my
> understanding), the kernel uses that time to clean up the file systems, to
> reduce the file system fragmentation.

  No, that doesn't happen.  The only way to eliminate fragmentation is a
dump/newfs/restore cycle.  UFS does do fragmentation avoidance (which is
the reason UFS filesystems have a 10% reserve).

Tom


Re: [HACKERS] sorting big tables :(

From: The Hermit Hacker

On Wed, 20 May 1998, Tom wrote:

>
> On Wed, 20 May 1998, The Hermit Hacker wrote:
>
> >     One of the things that the Unix FS does is auto-defragmenting, at
> > least the UFS one does.  Whenever the system is idle (from my
> > understanding), the kernel uses that time to clean up the file systems, to
> > reduce the file system fragmentation.
>
>   No, that doesn't happen.  The only way to eliminate fragmentation is a
> dump/newfs/restore cycle.  UFS does do fragmentation avoidance (which is
> reason UFS filesystems have a 10% reserve).

    Okay, then we have two different understandings of this.  My
understanding was that the 10% reserve gave the OS a 'temp area' in which
to move blocks to/from so that it could defrag on the fly...

    Am CC'ing this to freebsd-hackers@freebsd.org for a "third
opinion"...am willing to admit I'm wrong *grin*



Re: [HACKERS] sorting big tables :(

From: Bruce Momjian

>     Okay, then we have two different understandings of this.  My
> understanding was that the 10% reserve gave the OS a 'temp area' in which
> to move blocks to/from so that it could defrag on the fly...
>
>     Am CC'ng this into freebsd-hackers@freebsd.org for a "third
> opinion"...am willing to admit I'm wrong *grin*

You are wrong.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] sorting big tables :(

From: The Hermit Hacker

On Wed, 20 May 1998, Bruce Momjian wrote:

> >     Okay, then we have two different understandings of this.  My
> > understanding was that the 10% reserve gave the OS a 'temp area' in which
> > to move blocks to/from so that it could defrag on the fly...
> >
> >     Am CC'ng this into freebsd-hackers@freebsd.org for a "third
> > opinion"...am willing to admit I'm wrong *grin*
>
> You are wrong.

    I just love short answers *roll eyes*



Re: [HACKERS] sorting big tables :(

From: Bruce Momjian

>
> On Wed, 20 May 1998, Bruce Momjian wrote:
>
> > >     Okay, then we have two different understandings of this.  My
> > > understanding was that the 10% reserve gave the OS a 'temp area' in which
> > > to move blocks to/from so that it could defrag on the fly...
> > >
> > >     Am CC'ng this into freebsd-hackers@freebsd.org for a "third
> > > opinion"...am willing to admit I'm wrong *grin*
> >
> > You are wrong.
>
>     I just love short answers *roll eyes*
>
>
>


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] sorting big tables :(

From: Eivind Eklund

On Wed, May 20, 1998 at 01:17:34PM -0400, The Hermit Hacker wrote:
> On Wed, 20 May 1998, Tom wrote:
> >   No, that doesn't happen.  The only way to eliminate fragmentation is a
> > dump/newfs/restore cycle.  UFS does do fragmentation avoidance (which is
> > reason UFS filesystems have a 10% reserve).
>
>     Okay, then we have two different understandings of this.  My
> understanding was that the 10% reserve gave the OS a 'temp area' in which
> to move blocks to/from so that it could defrag on the fly...

No.  What is done is (quite correctly) fragmentation avoidance.  Big
files are even sometimes fragmented on purpose, to allow small files
that are written later to avoid being fragmented.

Eivind.

RE: [HACKERS] sorting big tables :(

From: "Stupor Genius"

> Last question... What's the purpose of such a big sort? If somebody gets
> 40M of sorted records in a result of some query, what would he do with
> it? Is he going to spent next years on reading this lecture? I mean,
> isn't it worth to query the database for necessary informations only and
> then sort it?

Not all query results are for human eyes.  I'd venture to say that more
queries are fed into report generators for formatting than are looked at
directly from psql.

A sort is required in some cases even when one is not explicitly requested.

For example, a GROUP BY clause.  You _could_ get the data back ungrouped,
but then you'd have to pipe it to another application or script to do the
sorting and then the grouping (or perhaps group on the fly).  But then
perhaps that app/script will eat all the memory or disk space and you'd
be in the same pickle as before.
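
A tiny illustration of the point, in Python: sort-based grouping only works
when equal keys are adjacent, which is exactly why a GROUP BY can force this
kind of big sort:

    from itertools import groupby

    rows = [("b", 2), ("a", 1), ("b", 3), ("a", 4)]
    rows.sort(key=lambda r: r[0])          # groupby only merges *adjacent* equal keys
    totals = {key: sum(v for _, v in grp)
              for key, grp in groupby(rows, key=lambda r: r[0])}
    print(totals)                          # {'a': 5, 'b': 5}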

darrenk

Re: [HACKERS] sorting big tables :(

From: Oleg Broytmann

Hello!

On Wed, 20 May 1998, The Hermit Hacker wrote:
> >   No, that doesn't happen.  The only way to eliminate fragmentation is a
> > dump/newfs/restore cycle.  UFS does do fragmentation avoidance (which is
> > reason UFS filesystems have a 10% reserve).
>
>     Okay, then we have two different understandings of this.  My
> understanding was that the 10% reserve gave the OS a 'temp area' in which
> to move blocks to/from so that it could defrag on the fly...

   No, you are wrong. This 10% is a reserve kept for emergency
situations - when root brings the system down to single-user mode and does
system maintenance.

Oleg.
----
  Oleg Broytmann     http://members.tripod.com/~phd2/     phd2@earthling.net
           Programmers don't die, they just GOSUB without RETURN.


Re: [HACKERS] sorting big tables :(

From: The Hermit Hacker

On Thu, 21 May 1998, Oleg Broytmann wrote:

> Hello!
>
> On Wed, 20 May 1998, The Hermit Hacker wrote:
> > >   No, that doesn't happen.  The only way to eliminate fragmentation is a
> > > dump/newfs/restore cycle.  UFS does do fragmentation avoidance (which is
> > > reason UFS filesystems have a 10% reserve).
> >
> >     Okay, then we have two different understandings of this.  My
> > understanding was that the 10% reserve gave the OS a 'temp area' in which
> > to move blocks to/from so that it could defrag on the fly...
>
>    No, you are wrong. This 10% is temp area reserved for emergent
> situations - when root bring system down to single-user and do system
> maintainance.

    Actually, in this one you are only partly right.  Only root has
*access* to that extra 10%, but, as I've been corrected by several
people, including a couple on the FreeBSD list, that 10% is meant to
*prevent/reduce* fragmentation.