Thread: MySQL HandlerSocket - Is this possible in PG?
Hi,
Just stumbled on the following post:
http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html
The post claims that MySQL can do more QPS than memcached or any other NoSQL store when doing simple queries like: SELECT * FROM table WHERE id=num;
And I wonder if:
1. Currently, is it possible to achieve the same using PG 9.0.x?
2. Is it possible at all?
It seems to me that if such a gain is possible, PG should benefit from it significantly when it comes to key/value queries.
Best,
Miki
--------------------------------------------------
Michael Ben-Nes - Internet Consultant and Director.
http://www.epoch.co.il - weaving the Net.
Cellular: 054-4848113
--------------------------------------------------
Hello

you can emulate it now.

a) try to do a simple stored procedure, where you can wrap your query
b) use a FAST CALL API to call this procedure
c) use some pool tool for pooling and persisting sessions

Regards
Pavel Stehule

2010/12/21 Michael Ben-Nes <michael@epoch.co.il>:
> Just stumbled on the following post:
> http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html
>
> The post claims that MySQL can do more QPS than memcached or any other NoSQL
> store when doing simple queries like: SELECT * FROM table WHERE id=num;
>
> And I wonder if:
>
> 1. Currently, is it possible to achieve the same using PG 9.0.x?
> 2. Is it possible at all?
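To make point (a) concrete, here is a minimal sketch; the kv table, its columns, and the statement names are illustrative, not from the thread:

  -- Point (a): wrap the primary-key lookup in a trivial SQL function.
  -- Table, column, and statement names are illustrative.
  create table kv (id int primary key, value text not null);

  create or replace function get_value(int) returns text as $$
    select value from kv where id = $1;
  $$ language sql stable;

  -- Points (b)/(c): call it through a named prepared statement held
  -- on a pooled, long-lived connection.
  PREPARE get_value_stmt (int) AS SELECT get_value($1);
  EXECUTE get_value_stmt(42);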
2010/12/21 Michael Ben-Nes <michael@epoch.co.il>:
> Hi Pavel,
>
> Thanks for your quick answer. Can you please elaborate a bit more about the
> points below.
>
>> a) try to do a simple stored procedure, where you can wrap your query
>
> Do you mean I should use PREPARE?

yes

>> b) use a FAST CALL API to call this procedure
>
> Currently I use PHP to access the DB, which uses libpq. Is that considered a
> fast call API? If not, can you please refer me to the right info.

sorry, it is the fast-path interface

http://www.postgresql.org/docs/8.1/static/libpq-fastpath.html

but PHP hasn't an adequate API :(

>> c) use some pool tool for pooling and persisting sessions
>
> PHP's pg_pconnect command opens a persistent PostgreSQL connection. Is it
> enough, or am I better off using PgPool2 or something similar?

probably it's enough

> Considering the points above, will I be able to get such high QPS from
> PostgreSQL? If so, it will be my pleasure to dump Redis and work solely
> with PG :)

There are a lot of unknown factors, but I believe so; speed is limited
by IO more than by software. The PostgreSQL engine isn't specially
optimised for access by primary key (InnoDB has this optimisation,
PostgreSQL has no clustered index), so probably Pg will be slower.

Regards
Pavel Stehule
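Putting (b) and (c) together on the PHP side, a minimal sketch; it reuses the illustrative get_value() function from the sketch above:

  <?php
  // Persistent connection (point c) plus a named prepared statement
  // (point b at the SQL level, since PHP lacks a fast-path API).
  // get_value() is the illustrative function from the earlier sketch.
  $db = pg_pconnect('dbname=test');

  // A recycled persistent connection may already have the statement
  // prepared, so check the catalog before preparing again.
  $res = pg_query_params($db,
      'select 1 from pg_prepared_statements where name = $1',
      array('get_value_stmt'));
  if (pg_num_rows($res) == 0) {
      pg_prepare($db, 'get_value_stmt', 'select get_value($1)');
  }

  $res = pg_execute($db, 'get_value_stmt', array(42));
  $row = pg_fetch_row($res);
  ?>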
On Tue, Dec 21, 2010 at 10:50 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> sorry, it is the fast-path interface
>
> http://www.postgresql.org/docs/8.1/static/libpq-fastpath.html
>
> but PHP hasn't an adequate API :(

I don't think the fastpath interface is going to get you there. What
they are doing with MySQL is bypassing both the parser and the
protocol. As soon as you use libpq, you've lost the battle... you
can't get anywhere close to that performance before you become
network bottlenecked.

If you want to see postgres doing this in action, you could fire up
the database in single-user mode and run raw queries against the
backend. Another way to do it is to hack tcop/postgres.c and inject
protocol messages manually. Right now, the only way to get that close
to the metal using standard techniques is via SPI (plpgsql, etc). A
proper transaction-free stored procedure implementation would open a
lot of doors for fast query execution.

merlin
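The SPI route Merlin mentions just means pushing the lookup into a server-side function; plpgsql runs its embedded query through SPI with a cached plan, so repeated calls skip the parser. A minimal sketch, again against the illustrative kv table:

  -- SPI route: plpgsql executes the embedded query through SPI with a
  -- cached plan, so repeated calls avoid re-parsing and re-planning.
  -- The kv table is the illustrative one from the earlier sketch.
  create or replace function get_value_spi(p_id int) returns text as $$
  declare
    v text;
  begin
    select value into v from kv where id = p_id;
    return v;
  end;
  $$ language plpgsql stable;

  select get_value_spi(42);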
> Do you mean I should use PREPARE?
>
> Currently I use PHP to access the DB, which uses libpq. Is that considered a
> fast call API? If not, can you please refer me to the right info.
>
> PHP's pg_pconnect command opens a persistent PostgreSQL connection. Is it
> enough, or am I better off using PgPool2 or something similar?

I suppose you already have a web server like lighttpd, zeus, or nginx,
using PHP as FastCGI, or Apache behind a proxy? In that case, since
the number of PHP processes is limited (usually to something like 2x
your number of cores), the number of postgres connections a web server
generates is limited, and you can do without pgpool and use
pg_pconnect. Be wary of the pg_pconnect bugs though (if you restart
pg, you also have to restart PHP, but I suppose you know that).

Here are some timings (Core 2 Q6600) for a simple SELECT on a primary key:

using TCP (localhost):
  218 µs / query : pg_query
  226 µs / query : pg_query_params
  143 µs / query : pg_execute

using Unix sockets:
  107 µs / query : pg_query
  122 µs / query : pg_query_params
   63 µs / query : pg_execute

query inside a plpgsql function:
   17 µs / query

Don't use PDO, it is 2x-3x slower. TCP overhead is quite large...

If you have a named prepared statement (created with PREPARE), use
pg_execute(), which is much faster than pg_query() (for very simple
queries). Of course you need to prepare the statements first... you
can do that with pgpool, which can execute a script upon connection
initialization.
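For reference, a minimal sketch of the three call styles timed above; the items table and its columns are illustrative:

  <?php
  // The three libpq call styles timed above. "items" is illustrative.
  $db = pg_connect('dbname=test');

  // pg_query: the query text is parsed and planned on every call
  $res = pg_query($db, 'SELECT * FROM items WHERE id = 42');

  // pg_query_params: parameters travel separately from the query text
  $res = pg_query_params($db,
      'SELECT * FROM items WHERE id = $1', array(42));

  // pg_execute: runs a statement prepared once per connection, so
  // later calls skip parsing and planning entirely
  pg_prepare($db, 'get_item', 'SELECT * FROM items WHERE id = $1');
  $res = pg_execute($db, 'get_item', array(42));
  ?>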
On Tue, Dec 21, 2010 at 2:09 AM, Michael Ben-Nes <michael@epoch.co.il> wrote:
> The post claims that MySQL can do more QPS than memcached or any other NoSQL
> store when doing simple queries like: SELECT * FROM table WHERE id=num;

No, it does not. They use an interface that bypasses SQL and is much
more primitive.
On Tue, Dec 21, 2010 at 11:09, Michael Ben-Nes <michael@epoch.co.il> wrote:
> Just stumbled on the following post:
> http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html
>
> The post claims that MySQL can do more QPS than memcached or any other NoSQL
> store when doing simple queries like: SELECT * FROM table WHERE id=num;
>
> And I wonder if:
>
> 1. Currently, is it possible to achieve the same using PG 9.0.x?
> 2. Is it possible at all?

I was curious what could be done currently, without any modifications
to PostgreSQL itself, so I ran a simple benchmark.

Table:

  create table usr (
    user_id    int primary key not null,
    user_name  text not null,
    user_email text not null,
    created    timestamp not null
  );
  insert into usr
    select generate_series(1, 1000000), 'Yukari Takeba',
           'yukari.takeba@dena.jp', '2010-02-03 11:22:33';

Test script:

  <?php
  $db = pg_connect('');
  $res = pg_prepare($db, 'get_user',
    'select user_name, user_email, created from usr where user_id=$1');

  $res = pg_query($db, 'begin');

  $args = array();
  for($i = 0; $i < 250000; $i++) {
    $args[0] = rand(1, 1000000);
    $res = pg_execute($db, 'get_user', $args);
    $row = pg_fetch_row($res);
  }
  ?>

Each process does 250k queries, so when I run 4 in parallel it's 1M
queries total.

I'm running PostgreSQL 9.1alpha2, PHP 5.3.4, kernel 2.6.36.2 on Arch
Linux; AMD Phenom II X4 955. The only tuning I did was setting
shared_buffers=256M.

Results:

  % time php pg.php & time php pg.php & time php pg.php & time php pg.php & sleep 11
  [1] 29792
  [2] 29793
  [3] 29795
  [4] 29797
  php pg.php  1,99s user 0,97s system 30% cpu 9,678 total
  [2]    done       time php pg.php
  php pg.php  1,94s user 1,06s system 30% cpu 9,731 total
  [3]  - done       time php pg.php
  php pg.php  1,92s user 1,07s system 30% cpu 9,746 total
  [1]  - done       time php pg.php
  php pg.php  2,00s user 1,04s system 31% cpu 9,777 total
  [4]  + done       time php pg.php

So around 10 seconds to run the test in total. These numbers aren't
directly comparable to their test -- I tested over a local Unix
socket, with the PHP client on the same machine -- but it's a data
point nevertheless.

Bottom line: you can expect up to 100 000 QPS using pg_execute() on a
cheap quad-core gamer CPU. You won't be beating memcached with current
PostgreSQL, but I think it's a respectable result.

Regards,
Marti
On Wed, 22 Dec 2010 14:17:21 +0100, Michael Ben-Nes <michael@epoch.co.il> wrote:
> Thanks, it is most interesting

In fact, it would be possible to implement something like MySQL
HandlerSocket, using the following Very Ugly Hack. This would only
work for ultra-simple "SELECT 1 row WHERE primary key = constant"
queries:

- a pooler (separate process) waits for connections
- clients connect to the pooler and send queries
- pooler accumulates enough queries to justify the overhead of what's
  going to come next
- pooler takes a bunch of queries and encodes them in some custom
  ad-hoc format (not SQL)
- pooler says to postgres "SELECT do_my_queries( serialized data )"
- do_my_queries() is a user function (in C) which uses postgres access
  methods directly (like the index access method on the primary key),
  processes queries, and sends results back as binary data
- repeat for next batch

A Nested Loop Index Scan processes about 400,000 rows/s, which is
2.5 µs/query, so maybe you could get into that ballpark (per core).
Of course it's a rather extremely ugly hack.

-------------------

Note that you could very possibly get almost the same benefits with
"almost" none of the ugliness by doing the following:

same as above:
- a pooler (separate process) waits for connections
- clients connect to the pooler and send queries in the format query +
  parameters (which libpq uses if you ask)
- pooler accumulates enough queries to justify the overhead of what's
  going to come next

different:
- pooler looks at each query and, if it has not seen it yet on this
  particular pg connection, issues a PREPARE for the query
- pooler sends, in one TCP block, a BEGIN, then a bunch of "execute
  named prepared statement with parameters" commands, then a ROLLBACK
- postgres executes all of those and returns all replies in one TCP
  block (this would need a modification)
- pooler distributes results back to clients

This would need a very minor change to postgres (coalescing output
blocks). It would make the pooler pay the TCP overhead instead of
postgres, and greatly improve cache locality in postgres. Since error
handling would be "problematic" (to say the least...) and expensive,
it would only work on simple selects.
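A minimal sketch of the batching idea from the client side, under the assumption that the usr table and get_user statement from the earlier benchmark exist; the pooler process itself is not shown, and (per the caveat above) replies still come back per the current protocol rather than coalesced:

  <?php
  // Send one BEGIN, several EXECUTEs of a named prepared statement,
  // and a ROLLBACK as a single multi-statement string, so the whole
  // batch costs one client round trip.
  $db = pg_connect('');
  pg_query($db, 'PREPARE get_user (int) AS ' .
    'SELECT user_name, user_email, created FROM usr WHERE user_id = $1');

  $ids = array(42, 1337, 99999);
  $batch = 'BEGIN;';
  foreach ($ids as $id) {
    $batch .= ' EXECUTE get_user(' . (int)$id . ');';
  }
  $batch .= ' ROLLBACK;';

  // pg_send_query() submits the whole string at once; one result
  // comes back per statement.
  pg_send_query($db, $batch);
  while ($res = pg_get_result($db)) {
    if (pg_result_status($res) == PGSQL_TUPLES_OK) {
      var_dump(pg_fetch_row($res));
    }
  }
  ?>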
I think this might be a game-changing feature.
For the first time in 10 years I have a reason to consider MySQL, as the cost per performance in such a scenario is amazing. Moreover, I won't have to run it in single mode or lose other functionality by using this feature, as I can access the ordinary interface on port 3306 and the fast interface on another port.
I wonder if PostgreSQL should replicate this functionality somehow. How can I present this idea to the developers? They will probably know whether this feature is worth something.
Thanks,
Miki
--------------------------------------------------
Michael Ben-Nes - Internet Consultant and Director.
http://www.epoch.co.il - weaving the Net.
Cellular: 054-4848113
--------------------------------------------------
Michael Ben-Nes <michael@epoch.co.il> writes:
> I wonder if PostgreSQL should replicate this functionality somehow. How can
> I present this idea to the developers? They will probably know whether this
> feature is worth something.

As I didn't have enough time to follow this thread in detail I'm not
sure how closely it is related, but have you tried preprepare?

  https://github.com/dimitri/preprepare

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
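From memory of the preprepare README, the idea is to list statements in a table and have them prepared automatically at session start, which pairs well with persistent connections; the GUC name, table layout, and prepare_all() function below are assumptions to verify against the project docs:

  -- Sketch from memory of the preprepare README; check names against
  -- https://github.com/dimitri/preprepare before relying on them.
  -- In postgresql.conf (assumed settings):
  --   local_preload_libraries = 'preprepare'
  --   preprepare.relation = 'preprepare.statements'

  insert into preprepare.statements(name, statement)
  values ('get_value_stmt',
          'PREPARE get_value_stmt (int) AS SELECT get_value($1)');

  -- Then, at session start (or automatically via the preload hook):
  select prepare_all();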