Script to compute random page cost - Mailing list pgsql-hackers

From Bruce Momjian
Subject Script to compute random page cost
Date
Msg-id 200209090505.g8955nq14584@candle.pha.pa.us
Whole thread Raw
Responses Re: Script to compute random page cost
Re: Script to compute random page cost
List pgsql-hackers
Because we have seen many complains about sequential vs index scans, I
wrote a script which computes the value for your OS/hardware
combination.

Under BSD/OS on one SCSI disk, I get a random_page_cost around 60.  Our
current postgresql.conf default is 4.

What do other people get for this value?

Keep in mind if we increase this value, we will get a more sequential
scans vs. index scans.

One flaw in this test is that it randomly reads blocks from different
files rather than randomly reading from the same file.  Do people have a
suggestion on how to correct this?  Does it matter?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#!/bin/bash

trap "rm -f /tmp/$$" 0 1 2 3 15

BLCKSZ=8192

if [ "$RANDOM" = "$RANDOM" ]
then    echo "Your shell does not support \$RANDOM.  Try using bash." 1>&2
    exit 1
fi

# XXX We assume 0 <= random <= 32767

echo "Collecting sizing information ..."

TEMPLATE1=`du -s "$PGDATA/base/1" | awk '{print $1}'`
FULL=`du -s "$PGDATA/base" | awk '{print $1}'`
if [ "$FULL" -lt `expr "$TEMPLATE1" \* 4` ]
then    echo "Your installation should have at least four times the data stored in template1 to yield meaningful
results"1>&2 
    exit 1
fi

find "$PGDATA/base" -type f -exec ls -ld {} \; |
awk '$5 % '"$BLCKSZ"' == 0 {print $5 / '"$BLCKSZ"', $9}' |
grep -v '^0 ' > /tmp/$$

TOTAL=`awk 'BEGIN    {sum=0}
            {sum += $1}
        END        {print sum}' /tmp/$$`

echo "Running random access timing test ..."

START=`date '+%s'`
PAGES=1000

while [ "$PAGES" -ne 0 ]
do
    BIGRAND=`expr "$RANDOM" \* 32768 + "$RANDOM"`

    OFFSET=`awk 'BEGIN{printf "%d\n", ('"$BIGRAND"' / 2^30) * '"$TOTAL"'}'`

    RESULT=`awk '    BEGIN    {offset = 0}
        offset + $1 > '"$OFFSET"' \
            {print $2, '"$OFFSET"' - offset ; exit}
            {offset += $1}' /tmp/$$`
    FILE=`echo "$RESULT" | awk '{print $1}'`
    OFFSET=`echo "$RESULT" | awk '{print $2}'`

    dd bs="$BLCKSZ" seek="$OFFSET" count=1 if="$FILE" of="/dev/null" >/dev/null 2>&1
    PAGES=`expr "$PAGES" - 1`
done

STOP=`date '+%s'`
RANDTIME=`expr "$STOP" - "$START"`

echo "Running sequential access timing test ..."

START=`date '+%s'`
# We run the random test 10 times more because it is quicker and
# we need it to run for a while to get accurate results.
PAGES=10000

while [ "$PAGES" -ne 0 ]
do
    BIGRAND=`expr "$RANDOM" \* 32768 + "$RANDOM"`

    OFFSET=`awk 'BEGIN{printf "%d\n", ('"$BIGRAND"' / 2^30) * '"$TOTAL"'}'`

    RESULT=`awk '    BEGIN    {offset = 0}
        offset + $1 > '"$OFFSET"' \
            {print $2, $1; exit}
            {offset += $1}' /tmp/$$`
    FILE=`echo "$RESULT" | awk '{print $1}'`
    FILEPAGES=`echo "$RESULT" | awk '{print $2}'`

    if [ "$FILEPAGES" -gt "$PAGES" ]
    then    FILEPAGES="$PAGES"
    fi

    dd bs="$BLCKSZ" count="$FILEPAGES" if="$FILE" of="/dev/null" >/dev/null 2>&1
    PAGES=`expr "$PAGES" - "$FILEPAGES"`
done

STOP=`date '+%s'`
SEQTIME=`expr "$STOP" - "$START"`

echo
awk 'BEGIN    {printf "random_page_cost = %f\n", ('"$RANDTIME"' / '"$SEQTIME"') * 10}'

pgsql-hackers by date:

Previous
From: Stephan Szabo
Date:
Subject: Re: Proposal: Solving the "Return proper effected tuple
Next
From: Joe Conway
Date:
Subject: Re: Schemas not available for pl/pgsql %TYPE....