Re: Script to compute random page cost - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Script to compute random page cost
Date
Msg-id 200209090613.g896Dsj12335@candle.pha.pa.us
Whole thread Raw
In response to Script to compute random page cost  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Script to compute random page cost  ("Christopher Kings-Lynne" <chriskl@familyhealth.com.au>)
Re: Script to compute random page cost  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Re: Script to compute random page cost  (Oliver Elphick <olly@lfix.co.uk>)
Re: Script to compute random page cost  (Rod Taylor <rbt@rbt.ca>)
Re: Script to compute random page cost  ("Nick Fankhauser" <nickf@ontko.com>)
Re: Script to compute random page cost  ("scott.marlowe" <scott.marlowe@ihs.com>)
Re: Script to compute random page cost  (Oliver Elphick <olly@lfix.co.uk>)
List pgsql-hackers
OK, turns out that the loop for sequential scan ran fewer times and was
skewing the numbers.  I have a new version at:
ftp://candle.pha.pa.us/pub/postgresql/randcost

I get _much_ lower numbers now for random_page_cost.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Because we have seen many complains about sequential vs index scans, I
> wrote a script which computes the value for your OS/hardware
> combination.
> 
> Under BSD/OS on one SCSI disk, I get a random_page_cost around 60.  Our
> current postgresql.conf default is 4.
> 
> What do other people get for this value?
> 
> Keep in mind if we increase this value, we will get a more sequential
> scans vs. index scans.
> 
> One flaw in this test is that it randomly reads blocks from different
> files rather than randomly reading from the same file.  Do people have a
> suggestion on how to correct this?  Does it matter?
> 
> -- 
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 359-1001
>   +  If your life is a hard drive,     |  13 Roberts Road
>   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

> #!/bin/bash
> 
> trap "rm -f /tmp/$$" 0 1 2 3 15
> 
> BLCKSZ=8192
> 
> if [ "$RANDOM" = "$RANDOM" ]
> then    echo "Your shell does not support \$RANDOM.  Try using bash." 1>&2
>     exit 1
> fi
> 
> # XXX We assume 0 <= random <= 32767
> 
> echo "Collecting sizing information ..."
> 
> TEMPLATE1=`du -s "$PGDATA/base/1" | awk '{print $1}'`
> FULL=`du -s "$PGDATA/base" | awk '{print $1}'`
> if [ "$FULL" -lt `expr "$TEMPLATE1" \* 4` ]
> then    echo "Your installation should have at least four times the data stored in template1 to yield meaningful
results"1>&2
 
>     exit 1 
> fi
> 
> find "$PGDATA/base" -type f -exec ls -ld {} \; |
> awk '$5 % '"$BLCKSZ"' == 0 {print $5 / '"$BLCKSZ"', $9}' |
> grep -v '^0 ' > /tmp/$$
> 
> TOTAL=`awk 'BEGIN    {sum=0}
>             {sum += $1}
>         END        {print sum}' /tmp/$$`
> 
> echo "Running random access timing test ..."
> 
> START=`date '+%s'`
> PAGES=1000
> 
> while [ "$PAGES" -ne 0 ]
> do
>     BIGRAND=`expr "$RANDOM" \* 32768 + "$RANDOM"`
>     
>     OFFSET=`awk 'BEGIN{printf "%d\n", ('"$BIGRAND"' / 2^30) * '"$TOTAL"'}'`
>     
>     RESULT=`awk '    BEGIN    {offset = 0}
>         offset + $1 > '"$OFFSET"' \
>             {print $2, '"$OFFSET"' - offset ; exit}
>             {offset += $1}' /tmp/$$`
>     FILE=`echo "$RESULT" | awk '{print $1}'`
>     OFFSET=`echo "$RESULT" | awk '{print $2}'`
>     
>     dd bs="$BLCKSZ" seek="$OFFSET" count=1 if="$FILE" of="/dev/null" >/dev/null 2>&1
>     PAGES=`expr "$PAGES" - 1`
> done
> 
> STOP=`date '+%s'`
> RANDTIME=`expr "$STOP" - "$START"`
> 
> echo "Running sequential access timing test ..."
> 
> START=`date '+%s'`
> # We run the random test 10 times more because it is quicker and
> # we need it to run for a while to get accurate results.
> PAGES=10000
> 
> while [ "$PAGES" -ne 0 ]
> do
>     BIGRAND=`expr "$RANDOM" \* 32768 + "$RANDOM"`
>     
>     OFFSET=`awk 'BEGIN{printf "%d\n", ('"$BIGRAND"' / 2^30) * '"$TOTAL"'}'`
>     
>     RESULT=`awk '    BEGIN    {offset = 0}
>         offset + $1 > '"$OFFSET"' \
>             {print $2, $1; exit}
>             {offset += $1}' /tmp/$$`
>     FILE=`echo "$RESULT" | awk '{print $1}'`
>     FILEPAGES=`echo "$RESULT" | awk '{print $2}'`
> 
>     if [ "$FILEPAGES" -gt "$PAGES" ]
>     then    FILEPAGES="$PAGES"
>     fi
>     
>     dd bs="$BLCKSZ" count="$FILEPAGES" if="$FILE" of="/dev/null" >/dev/null 2>&1
>     PAGES=`expr "$PAGES" - "$FILEPAGES"`
> done
> 
> STOP=`date '+%s'`
> SEQTIME=`expr "$STOP" - "$START"`
> 
> echo
> awk 'BEGIN    {printf "random_page_cost = %f\n", ('"$RANDTIME"' / '"$SEQTIME"') * 10}'

> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


pgsql-hackers by date:

Previous
From: Joe Conway
Date:
Subject: Re: Schemas not available for pl/pgsql %TYPE....
Next
From: "Christopher Kings-Lynne"
Date:
Subject: Re: Script to compute random page cost