Thread: Re: [Dbdpg-general] benchmarking old Pg and DBD::Pg
Hi there, we've noticed that too here and quick explanation is - with the new "prepare" implementation, the query get prepared on the server side, which is good; the problem is that all of the bind values passed to the sql server during execute call are in form of ***varchar***, i.e. even if a field is of int type in postgresql table, DBD::Pg passes your bind value as char - as a result, postgresql doesn't use indexes. I don't know if developers of new DBD::Pg have some game plan to fix this problem... but it won't be easy: since perl is a no-var-type language, so either DBD::Pg needs to analyze the table columns data types first before executing a query and then use correct type for bind values, or let coder pass those types explicitly... though I'm not aware of a method in DBI that would let them do that... Anyone from DBD::Pg can comment this please? I was going to post an example to the list later today. On Apr 8, 2005 12:30 PM, Brandon Metcalf <bmetcalf@nortel.com> wrote: > Has anyone done any benchmarking between the old Pg interface and > DBI/DBD::Pg? I have two versions of some code one of which uses Pg > and the other DBI/DBD::Pg and the latter appears to cause much more of > load on the system where it's running than the former. However, I > haven't done any benchmarking to support this claim. > > -- > Brandon > _______________________________________________ > Dbdpg-general mailing list > Dbdpg-general@gborg.postgresql.org > http://gborg.postgresql.org/mailman/listinfo/dbdpg-general > -- Vlad
> m> I don't know if developers of new DBD::Pg have some game plan to fix > m> this problem... but it won't be easy: since perl is a no-var-type > m> language, so either DBD::Pg needs to analyze the table columns data > m> types first before executing a query and then use correct type for > m> bind values, or let coder pass those types explicitly... though I'm > m> not aware of a method in DBI that would let them do that... > > Wouldn't bind_param() do just that? > just looked up DBI man - you are right, bind_param accepts column type... the question is - does DBD::Pg really process those types ? Checkout lines 1569 ... 1576 of dbdimp.c - it only checks BYTEAOID, everything else goes in as varchar :( While In other place ( line 1682) it analyzes the types... It's better to double check that with DBD::Pg developer, since I'm not familiar with its code deeply. Ideallly a programmer doesn't need to care to specify column type when putting a code like this (assuming C is of an "int" type ): my $sth = $dbh->prepare_cached( "select A from B where C=?"); $sth->execute( 123 ); ... and DBD::Pg would guess (don't ask me how) the column types and pass it to postgresql so the query executed in most efficient way. With DBD::Pg ver 1.40-1.41, if a programmer didn't specify a column type, it looks like it gonna be executed by postgresql w/o using indexes because of the bind values and table fields types conflict. -- Vlad
On Fri, 2005-04-08 at 14:12 -0400, Vlad wrote: > we've noticed that too here and quick explanation is - with the new > "prepare" implementation, the query get prepared on the server side, > which is good; the problem is that all of the bind values passed to > the sql server during execute call are in form of ***varchar***, i.e. > even if a field is of int type in postgresql table, DBD::Pg passes > your bind value as char - as a result, postgresql doesn't use indexes. A work around would be in the sql to cast the param to the appropriate type. EG: SELECT * FROM Person WHERE personid = %::int -- David Stanaway <david@stanaway.net>
David, it's acceptible (actually better in form of passing columns type via bind_values) as a quick workaround for a small scripts only. Actually as a quick workarround it's easier to downgrade DBD::Pg. Think of numerouse DB-OO Mappers and Object Persistance tools... it won't be easy to fix all of them. As of right now a lot of times you get screwed if you upgrade DBD::Pg to 1.40-1.41. On Apr 8, 2005 2:44 PM, David Stanaway <david@stanaway.net> wrote: > On Fri, 2005-04-08 at 14:12 -0400, Vlad wrote: > > we've noticed that too here and quick explanation is - with the new > > "prepare" implementation, the query get prepared on the server side, > > which is good; the problem is that all of the bind values passed to > > the sql server during execute call are in form of ***varchar***, i.e. > > even if a field is of int type in postgresql table, DBD::Pg passes > > your bind value as char - as a result, postgresql doesn't use indexes. > > A work around would be in the sql to cast the param to the appropriate > type. > > EG: SELECT * FROM Person WHERE personid = %::int > > -- > David Stanaway <david@stanaway.net> -- Vlad
Vlad <marchenko@gmail.com> writes: > Ideallly a programmer doesn't need to care to specify column type when > putting a code like this (assuming C is of an "int" type ): > my $sth = $dbh->prepare_cached( "select A from B where C=?"); > $sth->execute( 123 ); > ... > and DBD::Pg would guess (don't ask me how) the column types and pass > it to postgresql so the query executed in most efficient way. DBD::Pg shouldn't need to do any such thing --- the backend can do it a whole lot better than DBD::Pg could. What should be happening here is that if DBD::Pg hasn't been told a specific type for the parameter, it should pass the parameter type to the backend as "UNKNOWN" (yes, there is such a type ID) and let the backend try to infer the type. There are cases where there's not enough information to infer the type, and then the programmer has to provide it, either by writing a cast in the query text or by passing the info to bind_param. It sounds like bind_param could do with some work too ... but 80% of this issue would go away if UNKNOWN were being used correctly. regards, tom lane
Tom, Thanks for suggestion - I'm gonna try this patch: --- dbdimp.c.bak Wed Apr 6 16:40:20 2005 +++ dbdimp.c Fri Apr 8 15:21:29 2005 @@ -1415,7 +1415,7 @@ } else if (NULL == currph->bind_type) { /* "sticky" data type */ /* Thisis the default type, but we will honor defaultval if we can */ - currph->bind_type = pg_type_data(VARCHAROID); + currph->bind_type = pg_type_data(UNKNOWNOID); if (!currph->bind_type) croak("Default type is bad!!!!???"); } On Apr 8, 2005 3:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Vlad <marchenko@gmail.com> writes: > > Ideallly a programmer doesn't need to care to specify column type when > > putting a code like this (assuming C is of an "int" type ): > > > my $sth = $dbh->prepare_cached( "select A from B where C=?"); > > $sth->execute( 123 ); > > ... > > > and DBD::Pg would guess (don't ask me how) the column types and pass > > it to postgresql so the query executed in most efficient way. > > DBD::Pg shouldn't need to do any such thing --- the backend can do it a > whole lot better than DBD::Pg could. What should be happening here is > that if DBD::Pg hasn't been told a specific type for the parameter, it > should pass the parameter type to the backend as "UNKNOWN" (yes, there > is such a type ID) and let the backend try to infer the type. > > There are cases where there's not enough information to infer the type, > and then the programmer has to provide it, either by writing a cast in > the query text or by passing the info to bind_param. It sounds like > bind_param could do with some work too ... but 80% of this issue would > go away if UNKNOWN were being used correctly. > > regards, tom lane > -- Vlad
Brandon, try this patch instead. diff -ru DBD-Pg-1.41/dbdimp.c DBD-Pg-1.41-patched/dbdimp.c --- DBD-Pg-1.41/dbdimp.c Fri Apr 8 16:06:12 2005 +++ DBD-Pg-1.41-patched/dbdimp.c Fri Apr 8 15:46:50 2005 @@ -1415,7 +1415,7 @@ } else if (NULL == currph->bind_type) { /* "sticky" data type */ /* Thisis the default type, but we will honor defaultval if we can */ - currph->bind_type = pg_type_data(VARCHAROID); + currph->bind_type = pg_type_data(UNKNOWNOID); if (!currph->bind_type) croak("Default type is bad!!!!???"); } diff -ru DBD-Pg-1.41/types.c DBD-Pg-1.41-patched/types.c --- DBD-Pg-1.41/types.c Fri Apr 8 16:06:23 2005 +++ DBD-Pg-1.41-patched/types.c Fri Apr 8 15:46:37 2005 @@ -55,7 +55,7 @@ {ABSTIMEOID, "abstime", DBDPG_TRUE, null_quote, null_dequote, {0}}, {RELTIMEOID, "reltime",DBDPG_TRUE, null_quote, null_dequote, {0}}, {TINTERVALOID, "tinterval", DBDPG_TRUE, null_quote, null_dequote,{0}}, - {UNKNOWNOID, "unknown", DBDPG_FALSE, null_quote, null_dequote, {0}}, + {UNKNOWNOID, "unknown", DBDPG_FALSE, quote_varchar, null_dequote, {0}}, {CIRCLEOID, "circle", DBDPG_FALSE,null_quote, null_dequote, {0}}, {CASHOID, "money", DBDPG_TRUE, null_quote, null_dequote, {0}}, {MACADDROID,"MAC address", DBDPG_TRUE, quote_varchar,dequote_varchar, {0}}, On Apr 8, 2005 4:14 PM, Brandon Metcalf <bmetcalf@nortel.com> wrote: > m == marchenko@gmail.com writes: > > m> --- dbdimp.c.bak Wed Apr 6 16:40:20 2005 > m> +++ dbdimp.c Fri Apr 8 15:21:29 2005 > m> @@ -1415,7 +1415,7 @@ > m> } > m> else if (NULL == currph->bind_type) { /* "sticky" data type */ > m> /* This is the default type, but we will honor > m> defaultval if we can */ > m> - currph->bind_type = pg_type_data(VARCHAROID); > m> + currph->bind_type = pg_type_data(UNKNOWNOID); > m> if (!currph->bind_type) > m> croak("Default type is bad!!!!???"); > m> } > > Using this patch results in error messages like: > > DBD::Pg::st execute failed: ERROR: syntax error at or near "15" at character 158 > > Not sure why yet. > > -- > Brandon > -- Vlad
Vlad <marchenko@gmail.com> writes: > diff -ru DBD-Pg-1.41/types.c DBD-Pg-1.41-patched/types.c > --- DBD-Pg-1.41/types.c Fri Apr 8 16:06:23 2005 > +++ DBD-Pg-1.41-patched/types.c Fri Apr 8 15:46:37 2005 > @@ -55,7 +55,7 @@ > {ABSTIMEOID, "abstime", DBDPG_TRUE, null_quote, null_dequote, {0}}, > {RELTIMEOID, "reltime", DBDPG_TRUE, null_quote, null_dequote, {0}}, > {TINTERVALOID, "tinterval", DBDPG_TRUE, null_quote, null_dequote, {0}}, > - {UNKNOWNOID, "unknown", DBDPG_FALSE, null_quote, null_dequote, {0}}, > + {UNKNOWNOID, "unknown", DBDPG_FALSE, quote_varchar, null_dequote, {0}}, > {CIRCLEOID, "circle", DBDPG_FALSE, null_quote, null_dequote, {0}}, > {CASHOID, "money", DBDPG_TRUE, null_quote, null_dequote, {0}}, > {MACADDROID, "MAC address", DBDPG_TRUE, > quote_varchar,dequote_varchar, {0}}, Hmm ... if those columns mean what it looks like they mean, quite a few of the other entries seem wrong too. All of the types visible in that fragment ought to be quoted when used as SQL literals. Also, if the MACADDROID entry is any precedent, shouldn't you be changing the dequote field too? regards, tom lane
Brandon, could you let us know YOUR results with DBD::Pg 1.41 with and w/o patch that I've posted earlier? I just did some surface-deep testing here and here is what I've found: DBD-1.40 - slow DBD-1.41 - fast DBD-1.41, patched - fast. I can't see difference between patched and non patched. so it actually turned out that stock 1.41 already has that problem fixed... Though using UNKNOWN instead of VARCHAR for cases when no type was supplied (as Tom Lane suggested) is still a good idea, I think. My patch needs to be careful reviewed - I'm not familiar with DBD:Pg on a deep enough level to gurantee that it's not screwing anything. On Apr 8, 2005 4:24 PM, Vlad <marchenko@gmail.com> wrote: > Brandon, > > try this patch instead. > > diff -ru DBD-Pg-1.41/dbdimp.c DBD-Pg-1.41-patched/dbdimp.c > --- DBD-Pg-1.41/dbdimp.c Fri Apr 8 16:06:12 2005 > +++ DBD-Pg-1.41-patched/dbdimp.c Fri Apr 8 15:46:50 2005 > @@ -1415,7 +1415,7 @@ > } > else if (NULL == currph->bind_type) { /* "sticky" data type */ > /* This is the default type, but we will honor > defaultval if we can */ > - currph->bind_type = pg_type_data(VARCHAROID); > + currph->bind_type = pg_type_data(UNKNOWNOID); > if (!currph->bind_type) > croak("Default type is bad!!!!???"); > } > diff -ru DBD-Pg-1.41/types.c DBD-Pg-1.41-patched/types.c > --- DBD-Pg-1.41/types.c Fri Apr 8 16:06:23 2005 > +++ DBD-Pg-1.41-patched/types.c Fri Apr 8 15:46:37 2005 > @@ -55,7 +55,7 @@ > {ABSTIMEOID, "abstime", DBDPG_TRUE, null_quote, null_dequote, {0}}, > {RELTIMEOID, "reltime", DBDPG_TRUE, null_quote, null_dequote, {0}}, > {TINTERVALOID, "tinterval", DBDPG_TRUE, null_quote, null_dequote, {0}}, > - {UNKNOWNOID, "unknown", DBDPG_FALSE, null_quote, null_dequote, {0}}, > + {UNKNOWNOID, "unknown", DBDPG_FALSE, quote_varchar, null_dequote, {0}}, > {CIRCLEOID, "circle", DBDPG_FALSE, null_quote, null_dequote, {0}}, > {CASHOID, "money", DBDPG_TRUE, null_quote, null_dequote, {0}}, > {MACADDROID, "MAC address", DBDPG_TRUE, > quote_varchar,dequote_varchar, {0}}, > > > On Apr 8, 2005 4:14 PM, Brandon Metcalf <bmetcalf@nortel.com> wrote: > > m == marchenko@gmail.com writes: > > > > m> --- dbdimp.c.bak Wed Apr 6 16:40:20 2005 > > m> +++ dbdimp.c Fri Apr 8 15:21:29 2005 > > m> @@ -1415,7 +1415,7 @@ > > m> } > > m> else if (NULL == currph->bind_type) { /* "sticky" data type */ > > m> /* This is the default type, but we will honor > > m> defaultval if we can */ > > m> - currph->bind_type = pg_type_data(VARCHAROID); > > m> + currph->bind_type = pg_type_data(UNKNOWNOID); > > m> if (!currph->bind_type) > > m> croak("Default type is bad!!!!???"); > > m> } > > > > Using this patch results in error messages like: > > > > DBD::Pg::st execute failed: ERROR: syntax error at or near "15" at character 158 > > > > Not sure why yet. > > > > -- > > Brandon > > > > -- > Vlad > -- Vlad
Tom, > > diff -ru DBD-Pg-1.41/types.c DBD-Pg-1.41-patched/types.c > > --- DBD-Pg-1.41/types.c Fri Apr 8 16:06:23 2005 > > +++ DBD-Pg-1.41-patched/types.c Fri Apr 8 15:46:37 2005 > > @@ -55,7 +55,7 @@ > > {ABSTIMEOID, "abstime", DBDPG_TRUE, null_quote, null_dequote, {0}}, > > {RELTIMEOID, "reltime", DBDPG_TRUE, null_quote, null_dequote, {0}}, > > {TINTERVALOID, "tinterval", DBDPG_TRUE, null_quote, null_dequote, {0}}, > > - {UNKNOWNOID, "unknown", DBDPG_FALSE, null_quote, null_dequote, {0}}, > > + {UNKNOWNOID, "unknown", DBDPG_FALSE, quote_varchar, null_dequote, {0}}, > > {CIRCLEOID, "circle", DBDPG_FALSE, null_quote, null_dequote, {0}}, > > {CASHOID, "money", DBDPG_TRUE, null_quote, null_dequote, {0}}, > > {MACADDROID, "MAC address", DBDPG_TRUE, > > quote_varchar,dequote_varchar, {0}}, > > Hmm ... if those columns mean what it looks like they mean, quite a few > of the other entries seem wrong too. All of the types visible in that > fragment ought to be quoted when used as SQL literals. > > Also, if the MACADDROID entry is any precedent, shouldn't you be > changing the dequote field too? I'm not with the development team of DBD::Pg to make those changes, but I'm confident that it will be much appreciated by them (and myself and other users of DBD::Pg for sure) if you could review its code in regards to the type conversion issue and give your suggestions. http://search.cpan.org/~dbdpg/DBD-Pg-1.41/ -- Vlad
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Let's slow down and back up a bit. First, what version of PostgreSQL are you using to compile DBD::Pg, and what version are you connecting to? Many of the things being discussed have already been solved if you are running a modern version of PG (e.g. 7.4 or greater). The binding is already handled quite well: we use PQprepare whenever possible. The default of type "VARCHAROID" is mostly cosmetic, so that some other things work: we actually send a type "unknown" (0) if no type is specified for a placeholder via bind_param. (see the "defaultval" of the placeholder struct). Older servers may benefit from the VARCHAROID change however, so I will see about making that change. The stuff in types.c could certainly use some cleaning up, but it is also not really used anymore - we let the server do all the quoting for us now via PQexecPrepared and PQexecParams. - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200504081815 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iD8DBQFCVwLNvJuQZxSWSsgRAtmwAKC4e2bPApAYXj7cz2M2fOwtmPEk1ACgzz9C AqwZr6PYj+/sUmomA7dKlqE= =MdcS -----END PGP SIGNATURE-----
Greg, just re-run my tests with freshly compiled PostgreSQL 8.0.2 (congrats on release, btw) DBD:Pg 1.32 - fast DBD:Pg 1.40 - slooow and buggy, won't work w/o patching DBD:Pg 1.41 - fast DBD:Pg 1.41 patched - fast the patch I've applied in last test: diff -ru DBD-Pg-1.41/dbdimp.c DBD-Pg-1.41-patched/dbdimp.c --- DBD-Pg-1.41/dbdimp.c Fri Apr 8 16:06:12 2005 +++ DBD-Pg-1.41-patched/dbdimp.c Fri Apr 8 15:46:50 2005 @@ -1415,7 +1415,7 @@ } else if (NULL == currph->bind_type) { /* "sticky" data type */ /* Thisis the default type, but we will honor defaultval if we can */ - currph->bind_type = pg_type_data(VARCHAROID); + currph->bind_type = pg_type_data(UNKNOWNOID); if (!currph->bind_type) croak("Default type is bad!!!!???"); } diff -ru DBD-Pg-1.41/types.c DBD-Pg-1.41-patched/types.c --- DBD-Pg-1.41/types.c Fri Apr 8 16:06:23 2005 +++ DBD-Pg-1.41-patched/types.c Fri Apr 8 17:23:48 2005 @@ -55,7 +55,7 @@ {ABSTIMEOID, "abstime", DBDPG_TRUE, null_quote, null_dequote, {0}}, {RELTIMEOID, "reltime",DBDPG_TRUE, null_quote, null_dequote, {0}}, {TINTERVALOID, "tinterval", DBDPG_TRUE, null_quote, null_dequote,{0}}, - {UNKNOWNOID, "unknown", DBDPG_FALSE, null_quote, null_dequote, {0}}, + {UNKNOWNOID, "unknown", DBDPG_FALSE, quote_varchar, dequote_varchar, {0}}, {CIRCLEOID, "circle", DBDPG_FALSE, null_quote, null_dequote, {0}}, {CASHOID, "money",DBDPG_TRUE, null_quote, null_dequote, {0}}, {MACADDROID, "MAC address", DBDPG_TRUE, quote_varchar,dequote_varchar, {0}}, On Apr 8, 2005 6:17 PM, Greg Sabino Mullane <greg@turnstep.com> wrote: > Let's slow down and back up a bit. First, what version of > PostgreSQL are you using to compile DBD::Pg, and what version > are you connecting to? Many of the things being discussed > have already been solved if you are running a modern version > of PG (e.g. 7.4 or greater). -- Vlad
m == marchenko@gmail.com writes: m> could you let us know YOUR results with DBD::Pg 1.41 with and w/om> patch that I've posted earlier? I just did some surface-deeptestingm> here and here is what I've found: m> DBD-1.40 - slowm> DBD-1.41 - fastm> DBD-1.41, patched - fast. I can't see difference between patched andm> non patched. Hm. What I'm seeing is that both DBD-Pg-1.40 and DBD-Pg-1.41 are _much_ slower than the old Pg module. I see no difference between 1.40 and 1.41. In order to test your patches for performance, I'll need to put together a test environment that simulates the load in our production environment. I'll let you know. -- Brandon
m == marchenko@gmail.com writes: m> Thanks for suggestion - I'm gonna try this patch: m> --- dbdimp.c.bak Wed Apr 6 16:40:20 2005m> +++ dbdimp.c Fri Apr 8 15:21:29 2005m> @@ -1415,7 +1415,7 @@m> }m> else if (NULL == currph->bind_type) { /* "sticky" data type */m> /* This is the defaulttype, but we will honorm> defaultval if we can */m> - currph->bind_type = pg_type_data(VARCHAROID);m>+ currph->bind_type = pg_type_data(UNKNOWNOID);m> if (!currph->bind_type)m> croak("Default type is bad!!!!???");m> } Yeah, looks good. Let me know how it goes. I'll try the same thing over here. -- Brandon
m == marchenko@gmail.com writes: m> --- dbdimp.c.bak Wed Apr 6 16:40:20 2005m> +++ dbdimp.c Fri Apr 8 15:21:29 2005m> @@ -1415,7 +1415,7 @@m> }m> else if (NULL == currph->bind_type) { /* "sticky" data type */m> /* This is the defaulttype, but we will honorm> defaultval if we can */m> - currph->bind_type = pg_type_data(VARCHAROID);m>+ currph->bind_type = pg_type_data(UNKNOWNOID);m> if (!currph->bind_type)m> croak("Default type is bad!!!!???");m> } Using this patch results in error messages like: DBD::Pg::st execute failed: ERROR: syntax error at or near "15" at character 158 Not sure why yet. -- Brandon
m == marchenko@gmail.com writes: m> we've noticed that too here and quick explanation is - with the newm> "prepare" implementation, the query get preparedon the server side,m> which is good; the problem is that all of the bind values passed tom> the sql server duringexecute call are in form of ***varchar***, i.e.m> even if a field is of int type in postgresql table, DBD::Pg passesm>your bind value as char - as a result, postgresql doesn't use indexes. m> I don't know if developers of new DBD::Pg have some game plan to fixm> this problem... but it won't be easy: since perlis a no-var-typem> language, so either DBD::Pg needs to analyze the table columns datam> types first before executinga query and then use correct type form> bind values, or let coder pass those types explicitly... though I'mm> notaware of a method in DBI that would let them do that... Wouldn't bind_param() do just that? -- Brandon
m == marchenko@gmail.com writes: m> it's acceptible (actually better in form of passing columns type viam> bind_values) as a quick workaround for a smallscripts only. Actuallym> as a quick workarround it's easier to downgrade DBD::Pg. m> Think of numerouse DB-OO Mappers and Object Persistance tools... itm> won't be easy to fix all of them. As of right nowa lot of times youm> get screwed if you upgrade DBD::Pg to 1.40-1.41. When you say downgrade DBD::Pg, is there a version that doesn't have this problem? That is, the problem of everything being treated as SQL_VARCHAR? -- Brandon
g == greg@turnstep.com writes: g> Let's slow down and back up a bit. First, what version ofg> PostgreSQL are you using to compile DBD::Pg, and what versiong>are you connecting to? Many of the things being discussedg> have already been solved if you are running a modernversiong> of PG (e.g. 7.4 or greater). g> The binding is already handled quite well: we use PQprepareg> whenever possible. The default of type "VARCHAROID" is mostlyg>cosmetic, so that some other things work: we actually sendg> a type "unknown" (0) if no type is specified for a placeholderg>via bind_param. (see the "defaultval" of the placeholderg> struct). Older servers may benefit from the VARCHAROIDchangeg> however, so I will see about making that change. g> The stuff in types.c could certainly use some cleaning up, butg> it is also not really used anymore - we let the serverdo allg> the quoting for us now via PQexecPrepared and PQexecParams. This is consistent with what I'm seeing. That is, I see no performance change using Vlad's patch to DBD-Pg-1.41. What I am seeing, however, is my code that uses the old Pg interface is much faster than my version that uses DBI/DBD::Pg. But this is probably in the way I'm using it. For example, there are places where I should be using execute_array() instead of wrapping a loop around execute(). -- Brandon