Thread: MySQL HandlerSocket - Is this possible in PG?

MySQL HandlerSocket - Is this possible in PG?

From
Michael Ben-Nes
Date:
Hi,

Just stumbled on the following post:
http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html

The post claim that MySQL can do more qps then MemCahed or any other NoSQL when doing simple queries like: SELECT * FROM table WHERE id=num;

And I wonder if:

1. Currently, is it possbile to achive the same using PG 9.0.x
2. Is it possible at all?

It seems to me that if such gain is possible, PG should benefit from that significantly when it comes to Key/Value queries.


Best,
Miki

--------------------------------------------------
Michael Ben-Nes - Internet Consultant and Director.
http://www.epoch.co.il - weaving the Net.
Cellular: 054-4848113
--------------------------------------------------

Re: MySQL HandlerSocket - Is this possible in PG?

From
Pavel Stehule
Date:
Hello

you can emulate it now.

a) try to do a simple stored procedure, where you can wrap your query
b) use a FAST CALL API to call this procedure
c) use a some pool tool for pooling and persisting sessions

Regards

Pavel Stehule

2010/12/21 Michael Ben-Nes <michael@epoch.co.il>:
> Hi,
>
> Just stumbled on the following post:
> http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html
>
> The post claim that MySQL can do more qps then MemCahed or any other NoSQL
> when doing simple queries like: SELECT * FROM table WHERE id=num;
>
> And I wonder if:
>
> 1. Currently, is it possbile to achive the same using PG 9.0.x
> 2. Is it possible at all?
>
> It seems to me that if such gain is possible, PG should benefit from that
> significantly when it comes to Key/Value queries.
>
>
> Best,
> Miki
>
> --------------------------------------------------
> Michael Ben-Nes - Internet Consultant and Director.
> http://www.epoch.co.il - weaving the Net.
> Cellular: 054-4848113
> --------------------------------------------------
>

Re: MySQL HandlerSocket - Is this possible in PG?

From
Pavel Stehule
Date:
2010/12/21 Michael Ben-Nes <michael@epoch.co.il>:
> Hi Pavel,
>
> Thanks for your quick answer. Can you please elaborate a bit more about the
> points bellow.
>
> On Tue, Dec 21, 2010 at 1:31 PM, Pavel Stehule <pavel.stehule@gmail.com>
> wrote:
>>
>> Hello
>>
>> you can emulate it now.
>>
>> a) try to do a simple stored procedure, where you can wrap your query
>
> Do you mean I should use PREPARE?

yes

>
>> b) use a FAST CALL API to call this procedure
>
> Currently I use PHP to access the DB which use libpq. Is that cosidered a
> fast call API ? if not, can you please refer me to the right info.
>
>>

sorry it is a fast-path interface

http://www.postgresql.org/docs/8.1/static/libpq-fastpath.html

but php hasn't a adequate API :(


>> c) use a some pool tool for pooling and persisting sessions
>
> PHP pg_pconnect command open a persistent PostgreSQL connection. Is it
> enough or I better use PgPool2 or something similar?
>

probably it's enough

>
> Considering the points above, will I be able to get such high QPS from
> PostgreSQL ? If so, it will be my pleasure to dump Reddis and work solely
> with PG :)
>

There is a lot of unknown factors, but I believe so speed is limited
by IO more than by sw. The PostgreSQL engine isn't specially optimised
for access with primary key (InnoDB has this optimization, PostgreSQL
hasn't clustered index) , so probably Pg will be slower.

Regards

Pavel Stehule

>
> Thanks,
> Miki
>
>>
>> Regards
>>
>> Pavel Stehule
>>
>> 2010/12/21 Michael Ben-Nes <michael@epoch.co.il>:
>> > Hi,
>> >
>> > Just stumbled on the following post:
>> >
>> > http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html
>> >
>> > The post claim that MySQL can do more qps then MemCahed or any other
>> > NoSQL
>> > when doing simple queries like: SELECT * FROM table WHERE id=num;
>> >
>> > And I wonder if:
>> >
>> > 1. Currently, is it possbile to achive the same using PG 9.0.x
>> > 2. Is it possible at all?
>> >
>> > It seems to me that if such gain is possible, PG should benefit from
>> > that
>> > significantly when it comes to Key/Value queries.
>> >
>> >
>> > Best,
>> > Miki
>> >
>> >
>
> --------------------------------------------------
> Michael Ben-Nes - Internet Consultant and Director.
> http://www.epoch.co.il - weaving the Net.
> Cellular: 054-4848113
> --------------------------------------------------
>

Re: MySQL HandlerSocket - Is this possible in PG?

From
Merlin Moncure
Date:
On Tue, Dec 21, 2010 at 10:50 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2010/12/21 Michael Ben-Nes <michael@epoch.co.il>:
>> Hi Pavel,
>>
>> Thanks for your quick answer. Can you please elaborate a bit more about the
>> points bellow.
>>
>> On Tue, Dec 21, 2010 at 1:31 PM, Pavel Stehule <pavel.stehule@gmail.com>
>> wrote:
>>>
>>> Hello
>>>
>>> you can emulate it now.
>>>
>>> a) try to do a simple stored procedure, where you can wrap your query
>>
>> Do you mean I should use PREPARE?
>
> yes
>
>>
>>> b) use a FAST CALL API to call this procedure
>>
>> Currently I use PHP to access the DB which use libpq. Is that cosidered a
>> fast call API ? if not, can you please refer me to the right info.
>>
>>>
>
> sorry it is a fast-path interface
>
> http://www.postgresql.org/docs/8.1/static/libpq-fastpath.html
>
> but php hasn't a adequate API :(


I don't think fastpath interface is going to get you there.  What they
are doing with mysql is bypassing both the parser and the protocol.
As soon as you use libpq, you've lost the battle...you can't see
anywhere close to to that performance before you become network
bottlenecked.

If you want to see postgres doing this in action, you could fire up
the database in single user mode and run raw queries against the
backend.   Another way to do it is to hack tcop/postgres.c and inject
protocol messages manually.  Right now, the only way to get that close
to the metal using standard techniques is via SPI (plpgsql, etc).  A
proper transaction free stored procedure implementation would open a
lot of doors for fast query execution.

merlin

Re: MySQL HandlerSocket - Is this possible in PG?

From
Michael Ben-Nes
Date:
Hi Pavel,

Thanks for your quick answer. Can you please elaborate a bit more about the points bellow.

On Tue, Dec 21, 2010 at 1:31 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hello

you can emulate it now.

a) try to do a simple stored procedure, where you can wrap your query

Do you mean I should use PREPARE?

b) use a FAST CALL API to call this procedure

Currently I use PHP to access the DB which use libpq. Is that cosidered a fast call API ? if not, can you please refer me to the right info.
 
c) use a some pool tool for pooling and persisting sessions

PHP pg_pconnect command open a persistent PostgreSQL connection. Is it enough or I better use PgPool2 or something similar?
 
Considering the points above, will I be able to get such high QPS from PostgreSQL ? If so, it will be my pleasure to dump Reddis and work solely with PG :)


Thanks,
Miki


Regards

Pavel Stehule

2010/12/21 Michael Ben-Nes <michael@epoch.co.il>:
> Hi,
>
> Just stumbled on the following post:
> http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html
>
> The post claim that MySQL can do more qps then MemCahed or any other NoSQL
> when doing simple queries like: SELECT * FROM table WHERE id=num;
>
> And I wonder if:
>
> 1. Currently, is it possbile to achive the same using PG 9.0.x
> 2. Is it possible at all?
>
> It seems to me that if such gain is possible, PG should benefit from that
> significantly when it comes to Key/Value queries.
>
>
> Best,
> Miki
>
>
--------------------------------------------------
Michael Ben-Nes - Internet Consultant and Director.
http://www.epoch.co.il - weaving the Net.
Cellular: 054-4848113
--------------------------------------------------

Re: MySQL HandlerSocket - Is this possible in PG?

From
"Pierre C"
Date:

> Do you mean I should use PREPARE?
>
> Currently I use PHP to access the DB which use libpq. Is that cosidered a
> fast call API ? if not, can you please refer me to the right info.
>
> PHP pg_pconnect command open a persistent PostgreSQL connection. Is it
> enough or I better use PgPool2 or something similar?
>
> Considering the points above, will I be able to get such high QPS from
> PostgreSQL ? If so, it will be my pleasure to dump Reddis and work solely
> with PG :)

I suppose you already have a web server like lighttpd, zeus, or nginx,
using php as fastcgi, or apache behind a proxy ? In that case, since the
number of php processes is limited (usually to something like 2x your
number of cores), the number of postgres connections a web server
generates is limited, and you can do without pgpool and use pg_pconnect.
Be wary of the pg_pconnect bugs though (like if you restart pg, you also
have to restart php, I suppose you know that).

Here are some timings (Core 2 Q6600) for a simple SELECT on PK query :

using tcp (localhost)
    218 µs / query :  pg_query
    226 µs / query :  pg_query_params
    143 µs / query :  pg_execute

using unix sockets
    107 µs / query :  pg_query
    122 µs / query :  pg_query_params
     63 µs / query :  pg_execute

query inside plpgsql function
     17 µs / query

Don't use PDO, it is 2x-3x slower.

TCP overhead is quite large...

If you have a named prepared statement (created with PREPARE) use
pg_execute(), which is much faster than pg_query (for very simple queries).

Of course you need to prepare the statements... you can do that with
pg_pool which can execute a script upon connection initialization.

Re: MySQL HandlerSocket - Is this possible in PG?

From
Scott Marlowe
Date:
On Tue, Dec 21, 2010 at 2:09 AM, Michael Ben-Nes <michael@epoch.co.il> wrote:
> Hi,
>
> Just stumbled on the following post:
> http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html
>
> The post claim that MySQL can do more qps then MemCahed or any other NoSQL
> when doing simple queries like: SELECT * FROM table WHERE id=num;

No it does not.  They use an interface that bypasses SQL and is much
more primitive.

Re: MySQL HandlerSocket - Is this possible in PG?

From
Marti Raudsepp
Date:
On Tue, Dec 21, 2010 at 11:09, Michael Ben-Nes <michael@epoch.co.il> wrote:
> Just stumbled on the following post:
> http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html
>
> The post claim that MySQL can do more qps then MemCahed or any other NoSQL
> when doing simple queries like: SELECT * FROM table WHERE id=num;
>
> And I wonder if:
>
> 1. Currently, is it possbile to achive the same using PG 9.0.x
> 2. Is it possible at all?

I was curious what could be done currently, without any modifications
to PostgreSQL itself, so I ran a simple benchmark.

Table:
create table usr (user_id int primary key not null, user_name text not
null, user_email text not null, created timestamp not null);
insert into usr select generate_series(1, 1000000), 'Yukari Takeba',
'yukari.takeba@dena.jp', '2010-02-03 11:22:33';

<?php
$db = pg_connect('');
$res = pg_prepare($db, 'get_user', 'select user_name, user_email,
created from usr where user_id=$1');
$res = pg_query($db, 'begin');

$args = array();
for($i = 0; $i < 250000; $i++)
{
  $args[0] = rand(1, 1000000);
  $res = pg_execute($db, 'get_user', $args);
  $row = pg_fetch_row($res);
}
?>

Each process does 250k queries, so when I run 4 in parallel it's 1M
queries total.

I'm running PostgreSQL 9.1alpha2, PHP 5.3.4, kernel 2.6.36.2 on Arch
Linux; AMD Phenom II X4 955.
The only tuning I did was setting shared_buffers=256M

Results:
% time php pg.php & time php pg.php &time php pg.php &time php pg.php & sleep 11
[1] 29792
[2] 29793
[3] 29795
[4] 29797
php pg.php  1,99s user 0,97s system 30% cpu 9,678 total
[2]    done       time php pg.php
php pg.php  1,94s user 1,06s system 30% cpu 9,731 total
[3]  - done       time php pg.php
php pg.php  1,92s user 1,07s system 30% cpu 9,746 total
[1]  - done       time php pg.php
php pg.php  2,00s user 1,04s system 31% cpu 9,777 total
[4]  + done       time php pg.php

So around 10 seconds to run the test in total.
These numbers aren't directly comparable to their test -- I tested
over a local UNIX socket, with PHP client on the same machine -- but
it's a datapoint nevertheless.

Bottom line, you can expect up to 100 000 QPS using pg_execute() on a
cheap quad-core gamer CPU. You won't be beating memcached with current
PostgreSQL, but I think it's a respectable result.

Regards,
Marti

Re: MySQL HandlerSocket - Is this possible in PG?

From
"Pierre C"
Date:
On Wed, 22 Dec 2010 14:17:21 +0100, Michael Ben-Nes <michael@epoch.co.il>
wrote:

> Thanks, it is most interesting
>
> --------------------------------------------------
> Michael Ben-Nes - Internet Consultant and Director.
> http://www.epoch.co.il - weaving the Net.
> Cellular: 054-4848113
> --------------------------------------------------
>

In fact, it would be possible to implement something like MySQL
HandlerSocket, using the following Very Ugly Hack :

This would only work for ultra simple "SELECT 1 row WHERE primary key =
constant" queries.

- a pooler (separate process) waits for connections
- clients connect to the pooler and send queries
- pooler accumulates enough queries to justify the overhead of what's
going to come next
- pooler takes a bunch of queries and encodes them in some custom ad-hoc
format (not SQL)
- pooler says to postgres "SELECT do_my_queries( serialized data )"
- do_my_queries() is a user function (in C) which uses postgres access
methods directly (like index access method on primary key), processes
queries, and sends results back as binary data
- repeat for next batch

Nested Loop Index Scan processes about 400.000 rows/s which is 2.5
us/query, maybe you could get into that ballpark (per core).

Of course it's a rather extremely ugly hack.

-------------------

Note that you could very possibly have almost the same benefits with
"almost" none of the ugliness by doing the following :

same as above :
- a pooler (separate process) waits for connections
- clients connect to the pooler and send queries in the format query +
parameters (which libpq uses if you ask)
- pooler accumulates enough queries to justify the overhead of what's
going to come next

different :
- pooler looks at each query, and if it has not seen it yet on this
particular pg connection, issues a "PREPARE" on the query
- pooler sends, in one TCP block, a begin, then a bunch of "execute named
prepared statement with parameters" commands, then a rollback
- postgres executes all of those and returns all replies in one TCP block
(this would need a modification)
- pooler distributes results back to clients

This would need a very minor change to postgres (coalescing output
blocks). It would make the pooler pay TCP overhead instead of postgres,
and greatly improve cache locality in postgres.

Since error handling would be "problematic" (to say the least...) and
expensive it would only work on simple selects.

Re: MySQL HandlerSocket - Is this possible in PG?

From
Michael Ben-Nes
Date:
I think this might be a game changing feature.
For the first time after 10 years I have reason to consider MySQL, as the cost per performance in such scenario is amazing. Morever I wont have to run it in single mod or loose other functionality by using this feautre. as I can access the ordinary interface on port 3306 and the fast interface on other port.

I wonder if PostgreSQL should replicate this functionality somehow. How can I represent this idea to the developers? They will probably know if this feature worth something.

Thanks,
Miki

--------------------------------------------------
Michael Ben-Nes - Internet Consultant and Director.
http://www.epoch.co.il - weaving the Net.
Cellular: 054-4848113
--------------------------------------------------


On Tue, Dec 21, 2010 at 11:07 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Tue, Dec 21, 2010 at 10:50 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2010/12/21 Michael Ben-Nes <michael@epoch.co.il>:
>> Hi Pavel,
>>
>> Thanks for your quick answer. Can you please elaborate a bit more about the
>> points bellow.
>>
>> On Tue, Dec 21, 2010 at 1:31 PM, Pavel Stehule <pavel.stehule@gmail.com>
>> wrote:
>>>
>>> Hello
>>>
>>> you can emulate it now.
>>>
>>> a) try to do a simple stored procedure, where you can wrap your query
>>
>> Do you mean I should use PREPARE?
>
> yes
>
>>
>>> b) use a FAST CALL API to call this procedure
>>
>> Currently I use PHP to access the DB which use libpq. Is that cosidered a
>> fast call API ? if not, can you please refer me to the right info.
>>
>>>
>
> sorry it is a fast-path interface
>
> http://www.postgresql.org/docs/8.1/static/libpq-fastpath.html
>
> but php hasn't a adequate API :(


I don't think fastpath interface is going to get you there.  What they
are doing with mysql is bypassing both the parser and the protocol.
As soon as you use libpq, you've lost the battle...you can't see
anywhere close to to that performance before you become network
bottlenecked.

If you want to see postgres doing this in action, you could fire up
the database in single user mode and run raw queries against the
backend.   Another way to do it is to hack tcop/postgres.c and inject
protocol messages manually.  Right now, the only way to get that close
to the metal using standard techniques is via SPI (plpgsql, etc).  A
proper transaction free stored procedure implementation would open a
lot of doors for fast query execution.

merlin

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: MySQL HandlerSocket - Is this possible in PG?

From
Dimitri Fontaine
Date:
Michael Ben-Nes <michael@epoch.co.il> writes:
> I wonder if PostgreSQL should replicate this functionality somehow. How can
> I represent this idea to the developers? They will probably know if this
> feature worth something.

As I didn't have enough time to follow this thread in detail I'm not
sure how closely it is related, but have you tried preprepare?

  https://github.com/dimitri/preprepare

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support