Thread: Large Object problems (was Re: JDBC int8 hack)

Large Object problems (was Re: JDBC int8 hack)

From
Peter Mount
Date:
At 18:30 09/04/01 -0700, Kyle VanderBeek wrote:
>On Thu, Apr 05, 2001 at 04:08:48AM -0400, Peter T Mount wrote:
> > Quoting Kyle VanderBeek <kylev@yaga.com>:
> >
> >
> > > Please consider applying my patch to the 7.0 codebase as a stop-gap
> > > measure until such time as the optimizer can be improved to notice
> > > indecies on INT8 columns and cast INT arguments up.
> >
> > This will have to wait until after 7.1 is released. As this is a "new"
> feature,
> > this can not be included into 7.1 as it's now in the final Release
> Candidate
> > phase.
>
>This is a new feature?  Using indecies is "new"?  I guess I really beg to
>differ.  Seems like a bugfix to me (in the "workaround" category).

Yes they are. INT8 is not a feature/type yet supported by the driver, hence
it's "new".

Infact the jdbc driver supports no array's at this time (as PostgreSQL &
SQL3 arrays are different beasts).

If it's worked in the past, then that was sheer luck.

>I'm going to start digging around in the optimizer code so such hacks as
>mine aren't needed.  It's really haenous to find out your production
>server is freaking out and doing sequential scans for EVERYTHING.

Are you talking about the optimiser in the backend as there isn't one in
the jdbc driver.


>Another hack I need to work on (or someone else can) is to squish in a
>layer of filesystem hashing for large objects.  We tried to use large
>objects and got destroyed.  40,000 rows and the server barely functioned.
>I think this is because of 2 things:
>
>1) Filehandles not being closed.  This was an oversite I've seen covered
>in the list archives somewhere.

Ok, ensure you are closing the large objects within JDBC. If you are then
this is a backend problem.

One thing to try is to commit the transaction a bit more often (if you are
running within a single transaction for all 40k objects). Committing the
transaction will force the backend to close all open large objects on that
connection.

>2) The fact that all objects are stored in a the single data directory.
>Once you get up to a good number of objects, directory scans really take a
>long, long time.  This slows down any subsequent openings of large
>objects.  Is someone working on this problem?  Or have a patch already?

Again not JDBC. Forwarding to the hackers list on this one. The naming
conventions were changed a lot in 7.1, and it was for more flexability.

Peter


Re: Large Object problems (was Re: JDBC int8 hack)

From
Kyle VanderBeek
Date:
Sorry, meant to hit all of these.

On Tue, Apr 10, 2001 at 02:24:24PM +0100, Peter Mount wrote:
> >I'm going to start digging around in the optimizer code so such hacks as
> >mine aren't needed.  It's really haenous to find out your production
> >server is freaking out and doing sequential scans for EVERYTHING.
>
> Are you talking about the optimiser in the backend as there isn't one in
> the jdbc driver.

Yeah, in the backend.  My patch to the JDBC driver only helps people using
JDBC to get to a database (obviously).  From any other access method, a
statement like:

SELECT * FROM Foo where bar=1234

will do a sequential scan even if there is an index Foo_bar_idx on "bar"
if bar is INT8.  It seems to me that the optimizer should be able to
notice the index Foo_bar_idx and convert the argument "1234" to an INT8 in
order to use Foo_bar_idx over doing a sequential scan (in which case,
"1234" probably gets converted to INT8 anyhow to do comparisons).

Granted, I'm theorizing.  I should probably shut up and RTFS.

Anyhow, all my patch did was tack the "::int8" cast onto parameters that
were set by PreparedStatement.setLong().  We did this after finding that
EXPLAIN'ing this:

SELECT * FROM Foo where bar=1234::int8

didn't degrade to a sequential scan like the other SELECT statement (w/o
the cast).

> >Another hack I need to work on (or someone else can) is to squish in a
> >layer of filesystem hashing for large objects.  We tried to use large
> >objects and got destroyed.  40,000 rows and the server barely functioned.
> >I think this is because of 2 things:
> >
> >1) Filehandles not being closed.  This was an oversite I've seen covered
> >in the list archives somewhere.
>
> Ok, ensure you are closing the large objects within JDBC. If you are then
> this is a backend problem.
>
> One thing to try is to commit the transaction a bit more often (if you are
> running within a single transaction for all 40k objects). Committing the
> transaction will force the backend to close all open large objects on that
> connection.

We were using setBytes(), as we were trying to minimize porting work from
the previous database we were using.  And we were comitting after every
transaction.  We switched to Base64 encoding and storing strings, so we're
in better shape now.

I'm going to write some more test code in my evenings and see if I can get
current PostgreSQL to suck up filehandles.  I'll post again if I can put
together a coherent bug report or patch.

> >2) The fact that all objects are stored in a the single data directory.
> >Once you get up to a good number of objects, directory scans really take a
> >long, long time.  This slows down any subsequent openings of large
> >objects.  Is someone working on this problem?  Or have a patch already?
>
> Again not JDBC. Forwarding to the hackers list on this one. The naming
> conventions were changed a lot in 7.1, and it was for more flexability.

Right, cool.  I'll check out the new codebase.  Thanks.

This is *so* the curse of open source: now I'm going to be using up my
personal time to look for bugs I find at work.  Oh well, I didn't need
to sleep anyhow. ;-)

--
Kyle.
   "I hate every ape I see, from chimpan-A to chimpan-Z" -- Troy McClure

Re: Large Object problems (was Re: JDBC int8 hack)

From
Kyle VanderBeek
Date:
On Tue, Apr 10, 2001 at 02:24:24PM +0100, Peter Mount wrote:
> At 18:30 09/04/01 -0700, Kyle VanderBeek wrote:
> >This is a new feature?  Using indecies is "new"?  I guess I really beg to
> >differ.  Seems like a bugfix to me (in the "workaround" category).
>
> Yes they are. INT8 is not a feature/type yet supported by the driver, hence
> it's "new".
>
> Infact the jdbc driver supports no array's at this time (as PostgreSQL &
> SQL3 arrays are different beasts).
>
> If it's worked in the past, then that was sheer luck.

Alright man, you've got me confused.  Are you saying that despite the
existance of INT8 as a column type, and PreparedStatement.setLong(), that
these ought not be used?  If so, there is a really big warning missing
from the documentation!

I guess I'm asking this: I've got an enterprise database runnign 7.0.3
ready to go using INT8 primary keys and being accessed through my
re-touched JDBC driver.  Am I screwed?  Is it going to break?  If so, I
need to fix this all very, very fast.

--
Kyle.
   "I hate every ape I see, from chimpan-A to chimpan-Z" -- Troy McClure

Re: Large Object problems (was Re: JDBC int8 hack)

From
Thomas Lockhart
Date:
> > >This is a new feature?  Using indecies is "new"?  I guess I really beg to
> > >differ.  Seems like a bugfix to me (in the "workaround" category).
> > Yes they are. INT8 is not a feature/type yet supported by the driver, hence
> > it's "new".
> > Infact the jdbc driver supports no array's at this time (as PostgreSQL &
> > SQL3 arrays are different beasts).
> > If it's worked in the past, then that was sheer luck.
> Alright man, you've got me confused.  Are you saying that despite the
> existance of INT8 as a column type, and PreparedStatement.setLong(), that
> these ought not be used?  If so, there is a really big warning missing
> from the documentation!

Ah, it just dawned on me what might be happening: Peter, I'm guessing
that you are thinking of "INT48" or some such, the pseudo-integer array
type. Kyle is referring to the "int8" 8 byte integer type.

> I guess I'm asking this: I've got an enterprise database runnign 7.0.3
> ready to go using INT8 primary keys and being accessed through my
> re-touched JDBC driver.  Am I screwed?  Is it going to break?  If so, I
> need to fix this all very, very fast.

btw, it might be better to use a syntax like

  ... where col = '1234';

or

 ... where col = int8 '1234';

If the former works, then that is a bit more generic that slapping a
"::int8" onto the constant field.

I'd imagine that this could also be coded into the app; if so that may
be where it belongs since then the driver does not have to massage the
queries as much and it will be easier for the *driver* to stay
compatible with applications.

                   - Thomas

Re: [PATCHES] Re: Large Object problems (was Re: JDBC int8 hack)

From
Kyle VanderBeek
Date:
On Wed, Apr 11, 2001 at 02:57:16AM +0000, Thomas Lockhart wrote:
> > Alright man, you've got me confused.  Are you saying that despite the
> > existance of INT8 as a column type, and PreparedStatement.setLong(), that
> > these ought not be used?  If so, there is a really big warning missing
> > from the documentation!
>
> Ah, it just dawned on me what might be happening: Peter, I'm guessing
> that you are thinking of "INT48" or some such, the pseudo-integer array
> type. Kyle is referring to the "int8" 8 byte integer type.

Yes!

> > I guess I'm asking this: I've got an enterprise database runnign 7.0.3
> > ready to go using INT8 primary keys and being accessed through my
> > re-touched JDBC driver.  Am I screwed?  Is it going to break?  If so, I
> > need to fix this all very, very fast.
>
> btw, it might be better to use a syntax like
>
>   ... where col = '1234';
>
> or
>
>  ... where col = int8 '1234';
>
> If the former works, then that is a bit more generic that slapping a
> "::int8" onto the constant field.

It seems like a wash to me; either way gets the desired result.  Tacking
on ::int8 was the quickest.  It also seems neater than this:

   set(parameterIndex, ("int8 '" + new Long(x)).toString() + "'");

in PreparedStatement.setLong().

> I'd imagine that this could also be coded into the app; if so that may
> be where it belongs since then the driver does not have to massage the
> queries as much and it will be easier for the *driver* to stay
> compatible with applications.

This seems to be the wrong idea to me.  The idea is that JDBC allows you
to be a little bit "backend agnostic".  It'd be pretty disappointing if
this wasn't true for even the base types.  Application programmers should
just call setLong() they're dealing with an 8-byte (Long or long) integer.
It'd be a shame to have a PostgreSQL-specific call to setString("int8 '" +
x.toString() + "'") littering your code.  That seems to fly in the face of
everything that JDBC/DBI/ODBC (etc) are about.

--
Kyle.
   "I hate every ape I see, from chimpan-A to chimpan-Z" -- Troy McClure

Re: Large Object problems (was Re: JDBC int8 hack)

From
Peter T Mount
Date:
Quoting Kyle VanderBeek <kylev@yaga.com>:

> On Tue, Apr 10, 2001 at 02:24:24PM +0100, Peter Mount wrote:
> > At 18:30 09/04/01 -0700, Kyle VanderBeek wrote:
> > >This is a new feature?  Using indecies is "new"?  I guess I really
> beg to
> > >differ.  Seems like a bugfix to me (in the "workaround" category).
> >
> > Yes they are. INT8 is not a feature/type yet supported by the driver,
> hence
> > it's "new".
> >
> > Infact the jdbc driver supports no array's at this time (as PostgreSQL
> &
> > SQL3 arrays are different beasts).
> >
> > If it's worked in the past, then that was sheer luck.
>
> Alright man, you've got me confused.  Are you saying that despite the
> existance of INT8 as a column type, and PreparedStatement.setLong(),
> that
> these ought not be used?  If so, there is a really big warning missing
> from the documentation!

Erm, int8 isn't long, but an array of 8 int's (unless it's changed).

> I guess I'm asking this: I've got an enterprise database runnign 7.0.3
> ready to go using INT8 primary keys and being accessed through my
> re-touched JDBC driver.  Am I screwed?  Is it going to break?  If so, I
> need to fix this all very, very fast.
>
> --
> Kyle.
>    "I hate every ape I see, from chimpan-A to chimpan-Z" -- Troy
> McClure
>



--
Peter Mount peter@retep.org.uk
PostgreSQL JDBC Driver: http://www.retep.org.uk/postgres/
RetepPDF PDF library for Java: http://www.retep.org.uk/pdf/

Re: Large Object problems (was Re: JDBC int8 hack)

From
Thomas Lockhart
Date:
> Erm, int8 isn't long, but an array of 8 int's (unless it's changed).

int8 is a 64-bit integer. There used to be a type (maybe called int48
??) which was 8 4-byte integers. afaicr that is now called oidvector
(and there is an int2vector also). The name changes for these latter
types were fairly recent.

Kyle is asking about the 64-bit integer type called int8 in the catalog
and int64 in the backend source code.

                       - Thomas

Re: Large Object problems (was Re: JDBC int8 hack)

From
Peter T Mount
Date:
Quoting Thomas Lockhart <lockhart@alumni.caltech.edu>:

> > > >This is a new feature?  Using indecies is "new"?  I guess I really
> beg to
> > > >differ.  Seems like a bugfix to me (in the "workaround" category).
> > > Yes they are. INT8 is not a feature/type yet supported by the
> driver, hence
> > > it's "new".
> > > Infact the jdbc driver supports no array's at this time (as
> PostgreSQL &
> > > SQL3 arrays are different beasts).
> > > If it's worked in the past, then that was sheer luck.
> > Alright man, you've got me confused.  Are you saying that despite the
> > existance of INT8 as a column type, and PreparedStatement.setLong(),
> that
> > these ought not be used?  If so, there is a really big warning
> missing
> > from the documentation!
>
> Ah, it just dawned on me what might be happening: Peter, I'm guessing
> that you are thinking of "INT48" or some such, the pseudo-integer array
> type. Kyle is referring to the "int8" 8 byte integer type.

Ah, that would explain it. However int8 (as in 8 byte int) has not been
implemented AFAIK (which is why I've said it's "new"). Until now, I've taken
int8 to be the one that used to be used (probably still is) in system tables
etc.

> > I guess I'm asking this: I've got an enterprise database runnign
> 7.0.3
> > ready to go using INT8 primary keys and being accessed through my
> > re-touched JDBC driver.  Am I screwed?  Is it going to break?  If so,
> I
> > need to fix this all very, very fast.
>
> btw, it might be better to use a syntax like
>
>   ... where col = '1234';
>
> or
>
>  ... where col = int8 '1234';
>
> If the former works, then that is a bit more generic that slapping a
> "::int8" onto the constant field.
>
> I'd imagine that this could also be coded into the app; if so that may
> be where it belongs since then the driver does not have to massage the
> queries as much and it will be easier for the *driver* to stay
> compatible with applications.

I agree.

Peter

--
Peter Mount peter@retep.org.uk
PostgreSQL JDBC Driver: http://www.retep.org.uk/postgres/
RetepPDF PDF library for Java: http://www.retep.org.uk/pdf/

Re: Re: Large Object problems (was Re: JDBC int8 hack)

From
Tom Lane
Date:
Peter T Mount <peter@retep.org.uk> writes:
>> Ah, it just dawned on me what might be happening: Peter, I'm guessing
>> that you are thinking of "INT48" or some such, the pseudo-integer array
>> type. Kyle is referring to the "int8" 8 byte integer type.

> Ah, that would explain it. However int8 (as in 8 byte int) has not been
> implemented AFAIK (which is why I've said it's "new"). Until now, I've taken
> int8 to be the one that used to be used (probably still is) in system tables
> etc.

Say what?  "int8" has been a 64-bit-integer type since release 6.4.
I think it existed in contrib even before that, but certainly that is
what "int8" has meant for the last three or so years.

            regards, tom lane

Re: Large Object problems (was Re: JDBC int8 hack)

From
Kyle VanderBeek
Date:
On Tue, Apr 17, 2001 at 09:11:54AM -0400, Peter T Mount wrote:
> Erm, int8 isn't long, but an array of 8 int's (unless it's changed).

http://postgresql.readysetnet.com/users-lounge/docs/7.0/user/datatype.htm#AEN942

It is very much an 8-byte integer, the correlary to Java's Long/long.

--
Kyle.
   "I hate every ape I see, from chimpan-A to chimpan-Z" -- Troy McClure