Re: Netflix Prize data - Mailing list pgsql-hackers

From Mark Woodward
Subject Re: Netflix Prize data
Date
Msg-id 21735.24.91.171.78.1160002678.squirrel@mail.mohawksoft.com
In response to Re: Netflix Prize data  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
> "Mark Woodward" <pgsql@mohawksoft.com> writes:
>> The one thing I notice is that it is REAL slow.
>
> How fast is your disk?  Counting on my fingers, I estimate you are
> scanning the table at about 47MB/sec, which might or might not be
> disk-limited...
>
>> I'm using 8.1.4. The "rdate" field looks something like: "2005-09-06"
>
> So why aren't you storing it as type "date"?
>

You are assuming I gave it any thought at all. :-)

I converted it to a date type (create table ratings2 as ....)
markw@snoopy:~/netflix/download$ time psql -c "select count(*) from ratings" netflix
   count
-----------
 100480507
(1 row)


real    1m29.852s
user    0m0.002s
sys     0m0.005s

That's about the right increase based on the reduction in data size.

OK, I guess I am crying wolf; 47MB/sec isn't all that bad for this system.
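
As a rough cross-check of the timing above, here is a minimal sketch that turns the row count and the wall-clock time into a rows-per-second figure. The variable names are just illustrative, and it assumes the whole 1m29.852s was spent on the scan itself.

```shell
#!/bin/sh
# Figures taken from the psql run above.
rows=100480507     # result of count(*)
secs=89.852        # "real 1m29.852s" expressed in seconds

# awk does the floating-point division; the result is rounded to a whole number.
rows_per_sec=$(awk -v r="$rows" -v s="$secs" 'BEGIN { printf "%.0f", r / s }')

echo "$rows_per_sec rows/sec"
```

Turning this into an MB/sec number would also need the on-disk size of the table (for example from pg_relation_size()), which isn't given in this message.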

