Re: ext3 filesystem / linux 7.3 - Mailing list pgsql-performance

From	Kevin Brown
Subject	Re: ext3 filesystem / linux 7.3
Date	April 8, 2003 19:22:46
Msg-id	20030408232247.GH1847@filer
In response to	Re: ext3 filesystem / linux 7.3 (Josh Berkus <josh@agliodbs.com>)
Responses	Re: ext3 filesystem / linux 7.3
List	pgsql-performance

Josh Berkus wrote:
> Jeffery,
>
> > Can't we generate data?  Random data stored in random formats at random
> > sizes would stress the file system wouldn't it?
>
> In my experience, randomly generated data tends to resemble real data very
> little in distribution patterns and data types.  This is one of the
> limitations of PGBench.

Okay, from this it sounds like what we need is information on the data
types typically used for real world applications and information on
the distribution patterns for each type (the latter could get
quite complex and varied, I'm sure, but since we're after something
that's typical, we only need a few examples).

So perhaps the first step in this is to write something that will show
what the distribution pattern for data in a table is?  With that
information, we *could* randomly generate data that would conform to
the statistical patterns seen in the real world.
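
A minimal sketch of that first step, assuming a column has already
been read into a Python list. The helper names (profile_column,
generate_column) and the sample values are hypothetical and only
illustrate the idea; this per-value approach only suits
low-cardinality columns.

```python
import random
from collections import Counter


def profile_column(values):
    """Summarize a column as a mapping of value -> relative frequency.

    NULLs are represented as None and counted like any other value.
    """
    counts = Counter(values)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def generate_column(profile, n, rng=None):
    """Draw n synthetic values whose frequencies follow the profile."""
    rng = rng or random.Random()
    values = list(profile)
    weights = [profile[v] for v in values]
    return rng.choices(values, weights=weights, k=n)


# Hypothetical sample standing in for a real column.
real = ["open"] * 70 + ["closed"] * 25 + [None] * 5
profile = profile_column(real)
synthetic = generate_column(profile, 1000, random.Random(42))
```

The generated column reproduces the observed frequencies only
approximately, and it ignores correlations between columns, which a
fuller profile would also have to capture.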

In fact, even though the databases you have access to are all
proprietary, I'm pretty sure their owners would agree to let you run a
program that would gather statistical distribution information about
them.  Then (as long as they agree) you could copy the schema itself,
recreate it on the test system, and randomly generate the data.
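
One way such a program could work for numeric columns, sketched under
the assumption that only bucket counts (never the raw values) leave
the proprietary system. The function names, the equal-width bucketing,
and the JSON hand-off are illustrative choices, not something proposed
in this thread.

```python
import json
import random


def histogram_profile(values, buckets=10):
    """Equal-width histogram of a non-empty numeric column.

    Only the lower bound, bucket width, and per-bucket counts are kept.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / buckets or 1.0  # avoid zero width for constant columns
    counts = [0] * buckets
    for v in values:
        idx = min(int((v - lo) / width), buckets - 1)
        counts[idx] += 1
    return {"lo": lo, "width": width, "counts": counts}


def generate_from_histogram(profile, n, rng=None):
    """Pick buckets in proportion to their counts, then a uniform point inside each."""
    rng = rng or random.Random()
    counts = profile["counts"]
    picks = rng.choices(range(len(counts)), weights=counts, k=n)
    return [profile["lo"] + (i + rng.random()) * profile["width"] for i in picks]


# On the proprietary system: profile a (hypothetical) column and export it.
profile = histogram_profile([1, 2, 2, 3, 3, 3, 10], buckets=3)
exported = json.dumps(profile)

# On the test system: load the profile and generate data from it.
shipped = json.loads(exported)
synthetic = generate_from_histogram(shipped, 500, random.Random(7))
```

Whether bucket counts alone are acceptable to a database owner is a
separate question; even coarse histograms can reveal something about
the underlying data.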

--
Kevin Brown                          kevin@sysexperts.com
