Re: Air-traffic benchmark - Mailing list pgsql-performance
From | Lefteris |
---|---|
Subject | Re: Air-traffic benchmark |
Date | |
Msg-id | 852badbc1001070547p48b47175vf87efc473e3bd41b@mail.gmail.com Whole thread Raw |
In response to | Re: Air-traffic benchmark (Arjen van der Meijden <acmmailing@tweakers.net>) |
Responses |
Re: Air-traffic benchmark
Re: Air-traffic benchmark |
List | pgsql-performance |
Hi Arjen, so I understand from all of you that you don't consider the use of 25k for sorting to be the cause of the slowdown? Probably I am missing something on the specific sort algorithm used by PG. My RAM does fill up, mainly by file buffers from linux, but postgres process remains to 0.1% consumption of main memory. There is no way to force sort to use say blocks of 128MB ? wouldn't that make a difference? lefteris p.s. i already started the analyze verbose again as Flavio suggested and reset the parrameters, although I think some of Flavioo's suggestions have to do with multiple users/queries and not 1 long running query, like shared_buffers, or not? On Thu, Jan 7, 2010 at 2:36 PM, Arjen van der Meijden <acmmailing@tweakers.net> wrote: > On 7-1-2010 13:38 Lefteris wrote: >> >> I decided to run the benchmark over postgres to get some more >> experience and insights. Unfortunately, the query times I got from >> postgres were not the expected ones: > > Why were they not expected? In the given scenario, column databases are > having a huge advantage. Especially the given simple example is the type of > query a column database *should* excel. > You should, at the very least, compare the queries to MyISAM: > http://www.mysqlperformanceblog.com/2009/11/05/air-traffic-queries-in-myisam-and-tokutek-tokudb/ > > But unfortunately, that one also beats your postgresql-results. > >> The hardware characteristics are: >> Platform Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz with 8GB RAM and >> ample disk space (2x 500 GB SATA disk @ 7200 RPM as SW-RAID-0) > > Unfortunately, the blogpost fails to mention the disk-subsystem. So it may > well be much faster than yours, although its not a new, big or fast server, > so unless it has external storage, it shouldn't be too different for > sequential scans. > >> SELECT "DayOfWeek", count(*) AS c FROM ontime WHERE "Year" BETWEEN >> 2000 AND 2008 GROUP BY "DayOfWeek" ORDER BY c DESC; >> >> Reported query times are (in sec): >> MonetDB 7.9s >> InfoBright 12.13s >> LucidDB 54.8s >> >> For pg-8.4.2 I got with 3 consecutive runs on the server: >> 5m52.384s >> 5m55.885s >> 5m54.309s > > Maybe an index of the type 'year, dayofweek' will help for this query. But > it'll have to scan about half the table any way, so a seq scan isn't a bad > idea. > In this case, a partitioned table with partitions per year and constraint > exclusion enabled would help a bit more. > > Best regards, > > Arjen >
pgsql-performance by date: