Thread: Benchmark-Comparison PostGreSQL vs. SQL Server

Benchmark-Comparison PostGreSQL vs. SQL Server

From
"Robert Soeding"
Date:

Hi, this is my first question here, and also, it's somewhat delicate. So please be patient.
 
My question is, CAN PostGreSQL perform in the SQL Server area when it comes to speed?
In other words, are there explanations for the results I found (see below)?
 
Thanks,
Robert
 
-----
Background:
1. I read people were using PostGreSQL with TeraBytes of data sometimes, or thousands of users. These are things that could easily break SQL Server. - So I thought PostGreSQL might be similar fast to SQL Server.
 
2. I did some tests:
Windows XP SP2
Several GIGs free harddisk, ~400 MB free RAM
Java 1.5 / JDBC
PostGreSQL 8.0 beta (through Windows Installer), default configuration, default driver
SQL Server 2000 SP3a, default configuration, JDTS driver
Tablespaces of both databases on the same partition
Write-Test: Creating tables (slightly modified TCP-W benchmark)
Read-Test: Simple SELECT statements on all tables, returning the first 1000 rows (cursor variants: read-only and non-locking, resp. updatable and locking)
 
Results:
Writing: SQL Server 25 times faster.
Reading: SQL Server 100 times faster.

Re: Benchmark-Comparison PostGreSQL vs. SQL Server

From
Richard Huxton
Date:
Robert Soeding wrote:
> Hi, this is my first question here, and also, it's somewhat delicate.
> So please be patient.
>
> My question is, CAN PostGreSQL perform in the SQL Server area when it
> comes to speed? In other words, are there explanations for the
> results I found (see below)?

Faster in some cases, slower in others in my experience. Oh, and
publishing performance comparisons with another database might be in
breach of your ms-sql server licencing.

> Thanks, Robert
>
> ----- Background: 1. I read people were using PostGreSQL with
> TeraBytes of data sometimes, or thousands of users. These are things
> that could easily break SQL Server. - So I thought PostGreSQL might
> be similar fast to SQL Server.

Some people have very large installations. This obviously isn't on
Windows, and not necessarily on x86 hardware.

> 2. I did some tests: Windows XP SP2 Several GIGs free harddisk, ~400
> MB free RAM Java 1.5 / JDBC PostGreSQL 8.0 beta (through Windows
> Installer), default configuration, default driver SQL Server 2000
> SP3a, default configuration, JDTS driver Tablespaces of both
> databases on the same partition Write-Test: Creating tables (slightly
> modified TCP-W benchmark) Read-Test: Simple SELECT statements on all
> tables, returning the first 1000 rows (cursor variants: read-only and
> non-locking, resp. updatable and locking)
>
> Results: Writing: SQL Server 25 times faster. Reading: SQL Server 100
> times faster.

The figures sound wrong. The Windows port isn't likely to be as fast as
the *nix versions (certainly not yet) but those figures don't match for
my experience with PG on Linux.

Unfortunately, although you provide a lot of information, almost none of
it tells us what the problem is. So -
1. What configuration changes have you made?
2. How many concurrent connections was this?
3. Were you selecting 1000 rows (LIMIT 1000), selecting all the rows
(and only fetching 1000) or actually defining an SQL cursor.
4. What was the load on the machine - CPU or DISK peaking?
5. What was the RAM usage like?

--
   Richard Huxton
   Archonet Ltd

Re: Benchmark-Comparison PostGreSQL vs. SQL Server

From
"Robert Soeding"
Date:
> 1. What configuration changes have you made?

None, both installations are default configured.

> 2. How many concurrent connections was this?

One.

> 3. Were you selecting 1000 rows (LIMIT 1000), selecting all the rows 
> (and only fetching 1000) or actually defining an SQL cursor.

I used "LIMIT 1000", resp. "TOP 1000" statements.

> 4. What was the load on the machine - CPU or DISK peaking?
> 5. What was the RAM usage like? 

CPU (< 10%) and RAM (<10%) usage were very low.

- I guess (as you mentioned below) it's the NT file system.
When running PostgreSQL queries I can hear the harddisk buzzing, but not with SQL Server queries.

On the other hand, if an application has to "fight" against the file system, I would suppose it to increase RAM and CPU
usagesignificantly.
 


-----Original Message-----
From: Richard Huxton [mailto:dev@archonet.com] 
Sent: Thursday, November 25, 2004 12:34 PM
To: Robert Soeding
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Benchmark-Comparison PostGreSQL vs. SQL Server

Robert Soeding wrote:
> Hi, this is my first question here, and also, it's somewhat delicate.
> So please be patient.
> 
> My question is, CAN PostGreSQL perform in the SQL Server area when it 
> comes to speed? In other words, are there explanations for the results 
> I found (see below)?

Faster in some cases, slower in others in my experience. Oh, and publishing performance comparisons with another
databasemight be in breach of your ms-sql server licencing.
 

> Thanks, Robert
> 
> ----- Background: 1. I read people were using PostGreSQL with 
> TeraBytes of data sometimes, or thousands of users. These are things 
> that could easily break SQL Server. - So I thought PostGreSQL might be 
> similar fast to SQL Server.

Some people have very large installations. This obviously isn't on Windows, and not necessarily on x86 hardware.

> 2. I did some tests: Windows XP SP2 Several GIGs free harddisk, ~400 
> MB free RAM Java 1.5 / JDBC PostGreSQL 8.0 beta (through Windows 
> Installer), default configuration, default driver SQL Server 2000 
> SP3a, default configuration, JDTS driver Tablespaces of both databases 
> on the same partition Write-Test: Creating tables (slightly modified 
> TCP-W benchmark) Read-Test: Simple SELECT statements on all tables, 
> returning the first 1000 rows (cursor variants: read-only and 
> non-locking, resp. updatable and locking)
> 
> Results: Writing: SQL Server 25 times faster. Reading: SQL Server 100 
> times faster.

The figures sound wrong. The Windows port isn't likely to be as fast as the *nix versions (certainly not yet) but those
figuresdon't match for my experience with PG on Linux.
 

Unfortunately, although you provide a lot of information, almost none of it tells us what the problem is. So - 1. What
configurationchanges have you made?
 
2. How many concurrent connections was this?
3. Were you selecting 1000 rows (LIMIT 1000), selecting all the rows (and only fetching 1000) or actually defining an
SQLcursor.
 
4. What was the load on the machine - CPU or DISK peaking?
5. What was the RAM usage like?

--
   Richard Huxton
   Archonet Ltd

Re: Benchmark-Comparison PostGreSQL vs. SQL Server

From
Richard Huxton
Date:
Robert Soeding wrote:
>> 1. What configuration changes have you made?
>
> None, both installations are default configured.

You'll want to do at least some tuning on PG. Try the URL below for a
quick introduction - just the basic stuff is a good start.
   http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

>> 2. How many concurrent connections was this?
> One.

OK. That keeps things nice and simple.

>> 3. Were you selecting 1000 rows (LIMIT 1000), selecting all the
>> rows (and only fetching 1000) or actually defining an SQL cursor.
>
>
> I used "LIMIT 1000", resp. "TOP 1000" statements.

OK. The reason I asked was that if you ask for 1000000 rows, then PG
will find them all and return them all in a bunch. Most other servers
return the first row once it's available.

>> 4. What was the load on the machine - CPU or DISK peaking? 5. What
>> was the RAM usage like?
>
> CPU (< 10%) and RAM (<10%) usage were very low.
>
> - I guess (as you mentioned below) it's the NT file system. When
> running PostgreSQL queries I can hear the harddisk buzzing, but not
> with SQL Server queries.

If you're seeing hard-disk activity that means the data isn't cached.

> On the other hand, if an application has to "fight" against the file
> system, I would suppose it to increase RAM and CPU usage
> significantly.

It shouldn't be fighting the file system, but it does use it, and rely
on it for caching (rather than bypassing your filesystem cache). For
MS-SQL server I'm guessing you're allocating a lot of memory to SQL
server and not much to the file-cache. For PG you'll want it the other
way around.

--
   Richard Huxton
   Archonet Ltd

Re: Benchmark-Comparison PostGreSQL vs. SQL Server

From
Shridhar Daithankar
Date:
On Thursday 25 Nov 2004 6:22 pm, Richard Huxton wrote:

> > On the other hand, if an application has to "fight" against the file
> > system, I would suppose it to increase RAM and CPU usage
> > significantly.
>
> It shouldn't be fighting the file system, but it does use it, and rely
> on it for caching (rather than bypassing your filesystem cache). For
> MS-SQL server I'm guessing you're allocating a lot of memory to SQL
> server and not much to the file-cache. For PG you'll want it the other
> way around.

Do we have some sort of document about how caching on windows works? It is
very simple on linux and BSDs but for other OSs, I haven't seen many
suggestions.

I am sure OS specific hints would help OP a lot.

 Shridhar



Re: Benchmark-Comparison PostGreSQL vs. SQL Server

From
Richard Huxton
Date:
Shridhar Daithankar wrote:
>
> Do we have some sort of document about how caching on windows works? It is
> very simple on linux and BSDs but for other OSs, I haven't seen many
> suggestions.
>
> I am sure OS specific hints would help OP a lot.

Microsoft's own resources are good for this sort of stuff (apologies if
URLs wrap):

http://www.microsoft.com/technet/prodtechnol/windows2000serv/maintain/optimize/wperfch7.mspx

http://www.microsoft.com/resources/documentation/Windows/2000/server/reskit/en-us/Default.asp?url=/resources/documentation/Windows/2000/server/reskit/en-us/core/fnec_evl_ACKS.asp

The simplest setting is the simple application/file-sharing switch
detailed in article #2 (which was introduced in NT4 iirc).

--
   Richard Huxton
   Archonet Ltd

table name in pl/pgsql

From
"ON.KG"
Date:
New question:

i have tables like
table_20041124,
table_20041125,
etc...

i'm trying to make function (for example):
=====================================
CREATE FUNCTION get_count(text, text)
RETURNS int2 AS '
  DECLARE
    cnt int4;
  BEGIN
    SELECT INTO cnt COUNT(*)
    FROM table_$1           -- That doesn't work
    WHERE key = $2;

    RETURN cnt;
  END;'
LANGUAGE 'plpgsql';
=====================================

call this function by:

=====================================
SELECT get_count("20041124", "something");
=====================================

string in funstion -  FROM table_$1

how could i get a final correct table name here?

Thanx!


Re: table name in pl/pgsql

From
Adam Witney
Date:
I think you would have to do it something like this, although whether the
SELECT INTO works in an EXECUTE context I am not sure (note, completely
untested code!)

CREATE FUNCTION get_count(text, text) RETURNS int2 AS '
DECLARE
  cnt int4;
BEGIN

  EXECUTE ''SELECT INTO cnt COUNT(*) FROM table_'' || $1 || '' WHERE key =
'' || $2;

  RETURN cnt;
END;'
LANGUAGE 'plpgsql';





> New question:
>
> i have tables like
> table_20041124,
> table_20041125,
> etc...
>
> i'm trying to make function (for example):
> =====================================
> CREATE FUNCTION get_count(text, text)
> RETURNS int2 AS '
> DECLARE
>   cnt int4;
> BEGIN
>   SELECT INTO cnt COUNT(*)
>   FROM table_$1           -- That doesn't work
>   WHERE key = $2;
>
>   RETURN cnt;
> END;'
> LANGUAGE 'plpgsql';
> =====================================
>
> call this function by:
>
> =====================================
> SELECT get_count("20041124", "something");
> =====================================
>
> string in funstion -  FROM table_$1
>
> how could i get a final correct table name here?
>
> Thanx!
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>              http://archives.postgresql.org


--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.


Re: table name in pl/pgsql

From
Richard Huxton
Date:
Adam Witney wrote:
> I think you would have to do it something like this, although whether the
> SELECT INTO works in an EXECUTE context I am not sure (note, completely
> untested code!)
>
> CREATE FUNCTION get_count(text, text) RETURNS int2 AS '
> DECLARE
>   cnt int4;
> BEGIN
>
>   EXECUTE ''SELECT INTO cnt COUNT(*) FROM table_'' || $1 || '' WHERE key =
> '' || $2;

That won't work either, you'll need to user FOR..IN..EXECUTE:

CREATE TABLE exectest (a integer, b text, PRIMARY KEY (a));
COPY exectest FROM stdin;
1   aaa
2   bbb
3   ccc
\.

CREATE FUNCTION demo_exec_fn() RETURNS boolean AS '
DECLARE
     r     RECORD;
BEGIN
     FOR r IN EXECUTE ''SELECT * FROM exectest''
     LOOP
         RAISE NOTICE ''a=%, b=%'', r.a, r.b;
     END LOOP;
     RETURN true;
END
' LANGUAGE plpgsql;

SELECT demo_exec_fn();

--
   Richard Huxton
   Archonet Ltd

Re: table name in pl/pgsql

From
Tino Wildenhain
Date:
Hi,

Am Donnerstag, den 25.11.2004, 19:42 +0300 schrieb ON.KG:
> New question:
>
> i have tables like
> table_20041124,
> table_20041125,
> etc...
>
> i'm trying to make function (for example):
> =====================================
> CREATE FUNCTION get_count(text, text)
> RETURNS int2 AS '
>   DECLARE
>     cnt int4;
>   BEGIN
>     SELECT INTO cnt COUNT(*)
>     FROM table_$1           -- That doesn't work
>     WHERE key = $2;
>
>     RETURN cnt;
>   END;'
> LANGUAGE 'plpgsql';
> =====================================
>
> call this function by:
>
> =====================================
> SELECT get_count("20041124", "something");
> =====================================
>
> string in funstion -  FROM table_$1
>
> how could i get a final correct table name here?

You need to build your query in your function
as a big string and pass it to EXECUTE

(See
http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN
and
http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-RECORDS-ITERATING
)

however encoding data into table names does not sound so elegant
for me - why not make it a real column?
The advantage would be you'd have a real query and let postgres
compile a plan for it - which is not possible for execute.

If you are concerned on index size you can always use partial
indices based on your "table number".

Regards
Tino


Re: table name in pl/pgsql

From
CoL
Date:
hi,

ON.KG wrote:
> New question:
>
> i have tables like
> table_20041124,
> table_20041125,
> etc...
>
> i'm trying to make function (for example):
> =====================================
> CREATE FUNCTION get_count(text, text)
> RETURNS int2 AS '
>   DECLARE
>     cnt int4;
>   BEGIN
>     SELECT INTO cnt COUNT(*)
>     FROM table_$1           -- That doesn't work
>     WHERE key = $2;
>
>     RETURN cnt;
>   END;'
> LANGUAGE 'plpgsql';
> =====================================
>
> call this function by:
>
> =====================================
> SELECT get_count("20041124", "something");
> =====================================
>
> string in funstion -  FROM table_$1
>
> how could i get a final correct table name here?

You can use execute for dynamic sql.
CREATE FUNCTION get_count(text, text) RETURNS int2 AS '
declare rec record;
begin
  for rec in execute ''select COUNT(*) as num from table_''||$1||''
where key=''''||$2'''' '';
  loop
   return rec.num;
  end loop;
return;
end;

PS: anyway, you want returns int2 , but you declared int4 :)

C.

Re: Benchmark-Comparison PostGreSQL vs. SQL Server

From
Russ Brown
Date:
On Thu, 25 Nov 2004 11:36:33 +0100
"Robert Soeding" <robert.soeding@lisocon.de> wrote:

> Hi, this is my first question here, and also, it's somewhat delicate.
> So please be patient.
> My question is, CAN PostGreSQL perform in the SQL Server area when it
> comes to speed? In other words, are there explanations for the
> results I found (see below)?
> Thanks,
> Robert
>
> -----
> Background:
> 1. I read people were using PostGreSQL with TeraBytes of data
> sometimes, or thousands of users. These are things that could easily
> break SQL Server. - So I thought PostGreSQL might be similar fast to
> SQL Server. 2. I did some tests: Windows XP SP2
> Several GIGs free harddisk, ~400 MB free RAM
> Java 1.5 / JDBC
> PostGreSQL 8.0 beta (through Windows Installer), default
> configuration, default driver SQL Server 2000 SP3a, default
> configuration,

Here's your problem right here. You're never going to get a fair
comparison unless you tune the crap out of *both* contenders.
PostgreSQL's default configuration is extremely conservative to allow
it to be run on very limited resources.

> JDTS driver Tablespaces of both databases on the same
> partition Write-Test: Creating tables (slightly modified TCP-W
> benchmark) Read-Test: Simple SELECT statements on all tables,
> returning the first 1000 rows (cursor variants: read-only and
> non-locking, resp. updatable and locking) Results:
> Writing: SQL Server 25 times faster.
> Reading: SQL Server 100 times faster.


--

Russ

Re: Benchmark-Comparison PostGreSQL vs. SQL Server

From
"Merlin Moncure"
Date:
> > Background:
> > 1. I read people were using PostGreSQL with TeraBytes of data
> > sometimes, or thousands of users. These are things that could easily
> > break SQL Server. - So I thought PostGreSQL might be similar fast to
> > SQL Server. 2. I did some tests: Windows XP SP2
> > Several GIGs free harddisk, ~400 MB free RAM
> > Java 1.5 / JDBC
> > PostGreSQL 8.0 beta (through Windows Installer), default
> > configuration, default driver SQL Server 2000 SP3a, default
> > configuration,

SQL Server like mysql, is very 'noob friendly'...it doesn't require a
lot of configuration to be quick is forgiving of lousy access
strategies to the database (these are good things btw).

PostgreSQL 8.0b was still using old style win32 fsync which maps to
flushfilebuffers()...this is an an ultra conservative way of flushing
ot disk so this is not apples to apples comparison.  That said, SQL
Server will be very hard to beat with write performance on windows
because it is very tuned to the win32 api and has very efficient disk
access.  In the mean time for a fairer test you have to turn fsync off
in postgresql.conf and you definately want to be running pg 8.1 which
has tons of improvements for win32 and a more efficient
syncer...although in windows for production database I would highly
suggest targeting a caching raid controller.

I am right now in the development planning phase of a PostgreSQL
databse which has to scale well into the terabyte range.  I can tell
you without hesitation that the database can handle it and deserves
the reputation of a workhorse that it has.  Administrating pg is a bit
harder to get into but well help you understand what a database is
really all about and how to make your data access elegant and
efficient.

Merlin