Thread: Performance/Issues with CMP and JBoss
Hi,
We currently have a JBoss web application that persists a byte array it receives (using the EJB persistence API) and then tries to read it back from the database for further parsing. The application works really well with the default Hypersonic datasource, but it does not work correctly when using Postgres as the datasource.
The problem seems to be that the data is persisted OK (or at least it appears to be there when I look with pgAdmin), but most of the time (roughly 90% of attempts) it cannot be read back out of the database. The behaviour of the application suggests it is trying to read the data back (using EntityManager.find()) before it has really been saved, and thus fails to find it. Do I have to tweak some settings in the postgresql.conf file? I have tried turning off fsync (which I do not want to do, for reliability reasons) and it performed far better. Can anyone advise me on the changes I need to make to speed up the inserting of data? I know that turning autocommit off is supposed to increase performance.
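Stripped right down, the code does something like the following (entity, interface and bean names are simplified stand-ins for the real ones):

// StoredData.java -- the persisted entity (simplified)
import java.io.Serializable;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Lob;

@Entity
public class StoredData implements Serializable {
    @Id @GeneratedValue
    private Long id;

    @Lob
    private byte[] payload;               // the received byte array

    public Long getId() { return id; }
    public byte[] getPayload() { return payload; }
    public void setPayload(byte[] payload) { this.payload = payload; }
}

// StoreLocal.java -- local business interface
import javax.ejb.Local;

@Local
public interface StoreLocal {
    Long store(byte[] bytes);
    StoredData lookup(Long id);
}

// StoreBean.java -- session bean that does the persist and the later find
import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class StoreBean implements StoreLocal {

    @PersistenceContext
    private EntityManager em;

    public Long store(byte[] bytes) {
        StoredData d = new StoredData();
        d.setPayload(bytes);
        em.persist(d);
        return d.getId();
    }

    public StoredData lookup(Long id) {
        // this is the call that comes back empty most of the time on Postgres
        return em.find(StoredData.class, id);
    }
}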
Thanks in advance,
Andy
"Andy Dale" <andy.dale@gmail.com> writes: > The current problem we seem to have is that the data is persisted ok > (or at least it seems to be in there with pgadmin), but cannot be > read back out of the database all the time (in fact for about 90% of > the time), the current behaviour of the application suggests it is > trying to read it back out of the database (using > EntityManager.find()) before it has really been saved, and thus > fails to find the data. Do i have to tweak some settings in the > postgres.conf file ? i have tried turning off fsync (i do not want > to do this, for reliability reasons) and it performed far better. > Can anyone advise me on the changes i need to make to speed up the > inserting of data, i know that turning autocommit off is supposed to > increase performance. This is almost certainly a problem with your persistence layer rather than with Postgres. If you can see the data with PGAdmin then it's in the database. It may be that the transaction that saves the object is not committing quickly, and so other connections don't see the object until the commit happens. But that's not the fault of Postgres. -Doug
Hi,
I performed some tests earlier today, and I think the problem lies with Postgres and its poor performance when used with container managed persistence. I am convinced it is an issue with Postgres because the application works really well with MySQL and Hypersonic, but not with Postgres. I have been reading around on the internet and it seems that Postgres does not work too well with container managed persistence. Surely all I have to do is change some configuration settings, or can Postgres simply not be used with container managed persistence?
Thanks,
Andy
On 8/1/06, Douglas McNaught <doug@mcnaught.org> wrote:
"Andy Dale" <andy.dale@gmail.com> writes:
> The problem seems to be that the data is persisted OK (or at least it
> appears to be there when I look with pgAdmin), but most of the time
> (roughly 90% of attempts) it cannot be read back out of the database.
> The behaviour of the application suggests it is trying to read the
> data back (using EntityManager.find()) before it has really been
> saved, and thus fails to find it. Do I have to tweak some settings in
> the postgresql.conf file? I have tried turning off fsync (which I do
> not want to do, for reliability reasons) and it performed far better.
> Can anyone advise me on the changes I need to make to speed up the
> inserting of data? I know that turning autocommit off is supposed to
> increase performance.
This is almost certainly a problem with your persistence layer rather
than with Postgres. If you can see the data with PGAdmin then it's in
the database. It may be that the transaction that saves the object is
not committing quickly, and so other connections don't see the object
until the commit happens. But that's not the fault of Postgres.
-Doug
Andy Dale wrote:
> Hi,
>
> I performed some tests earlier today, and I think the problem lies
> with Postgres and its poor performance when used with container
> managed persistence.

Is your problem performance or an error? It sounded like you were
getting errors in your first post.

> I am convinced it is an issue with Postgres because the application
> works really well with MySQL and Hypersonic, but not with Postgres.

*What* works really well? Can you tell us what query/queries are
giving you problems?

> I have been reading around on the internet and it seems that Postgres
> does not work too well with container managed persistence. Surely all
> I have to do is change some configuration settings, or can Postgres
> simply not be used with container managed persistence?

If you generate valid SQL then PostgreSQL can certainly process the
queries. As to whether tuning will help, nobody can say because you
haven't supplied details of:
1. Hardware
2. Operating System
3. Queries giving problems
4. Concurrency details
5. Current configuration settings
6. System activity (is CPU/RAM/IO maxed?)
Without at least *some* of these facts nobody can say anything useful.

--
  Richard Huxton
  Archonet Ltd
Andy Dale wrote:
> Hi,
>
> I will explain in more detail what the test (extremely simple) program
> is actually doing. A session bean receives some data (roughly 3K) and
> then persists this data to the database with EntityManager.persist()
> (using the EJB 3.0 Persistence API that comes with JBoss). Once the
> persist method returns, a message is sent to a message driven bean,
> which tries to get the previously persisted data from the database
> using the EntityManager.find() method. This is where we run into
> problems with Postgres: it only seems to find the persisted object 5%
> of the time. I don't know whether you would class it as a performance
> problem or an error, but seeing as it works in other databases I am
> more inclined to classify this as an error.

Well that's easy to decide.
1. Did the transaction that stored the object complete successfully?
2. Should the results of that transaction be visible from the reading
   transaction?
If Yes & Yes, but you can't find the object, there's an error.

> When I say works really well, I mean it in the sense that the query
> the EntityManager produces for the find operation is something as
> simple as "select <columns> from table where <primary key> = ?". This
> is causing problems for Postgres. The query is in the form of a
> prepared statement, so could this be causing any problems?

Prepared queries mean you can't do certain optimisations, but for a
single-table fetch on columns with a unique index I'd expect an index
to be used (assuming analyse has been run recently). A quick way to
check is sketched at the end of this message.

> As far as hardware goes my testing machine is:
>
> P4 3.0 GHz
> 1GB RAM
> 20GB of HD (IDE)
>
> But we intend to use an HP ProLiant server with the following spec:
>
> Intel Xeon 3.3 GHz
> 2 GB RAM
> 146GB Ultra SCSI 320
>
> The operating system being run on both is Fedora Core 5.
>
> The server/machine also needs to run other programs and processes, so
> we don't want the database to hog too much of the resources, about
> 10-20% RAM (and how to configure it) and CPU. The current config as
> defined in the postgresql.conf file is as follows:

Squeezing PostgreSQL and cache-space for its data into 256MB is going
to depend on how large your DB is. Oh, and if you have a lot of
updates then disk will probably be the limiting factor.

> # -----------------------------
> # PostgreSQL configuration file
> # -----------------------------
> #
> # This file consists of lines of the form:
> #

We don't really need to see all the comment lines.

> #---------------------------------------------------------------------------
> # RESOURCE USAGE (except WAL)
> #---------------------------------------------------------------------------
>
> # - Memory -
>
> shared_buffers = 1000            # min 16 or max_connections*2, 8KB each
> #temp_buffers = 1000             # min 100, 8KB each
> #max_prepared_transactions = 5   # can be 0 or more
> # note: increasing max_prepared_transactions costs ~600 bytes of shared memory
> # per transaction slot, plus lock space (see max_locks_per_transaction).
> #work_mem = 1024                 # min 64, size in KB
> #maintenance_work_mem = 16384    # min 1024, size in KB
> #max_stack_depth = 2048          # min 100, size in KB

OK, so you haven't actually done any configuration. PG *will* crawl
with the default settings; it's set up so you can install it on your
5-year-old laptop without keeling over. I recommend you read the short
article at the following URL and start from there:

http://www.powerpostgresql.com/PerfList

Oh, and if you don't know what vacuum, analyse and the autovacuum tool
are, you'll want to read the relevant parts of the manual.

--
  Richard Huxton
  Archonet Ltd
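The EXPLAIN check mentioned above, roughly (untested sketch; the
connection details, table and column names are placeholders for
whatever the container actually generates):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ExplainCheck {
    public static void main(String[] args) throws Exception {
        Class.forName("org.postgresql.Driver");
        Connection c = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "test", "test");
        Statement s = c.createStatement();
        // Substitute the exact query your persistence layer issues.
        ResultSet rs = s.executeQuery(
                "EXPLAIN SELECT * FROM storeddata WHERE id = 42");
        while (rs.next()) {
            System.out.println(rs.getString(1));   // expect an Index Scan line
        }
        rs.close();
        s.close();
        c.close();
    }
}

If the plan shows a Seq Scan rather than an Index Scan on the primary
key, run ANALYZE and check that the primary key index actually exists.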
Diagnosing JBossCMP issues is not for the faint of heart, in that one
of its main raisons d'etre is to hide SQL knowledge away from the
casual coder. Add into the mix the concurrency issues which naturally
occur since an EJB container is multithreaded, plus the overly
complicated JTA stuff, and you've got a real mess to diagnose when
things don't work just so.

When we ran CMP, we'd also run postgres in debugging mode so as to have
it emit queries onto stdout, so that we could watch exactly what the
CMP was doing. Try starting up postgres along the lines of:

/usr/local/pgsql/bin/postmaster -d 2 -i -D /usr/local/pgsql/data 2>&1 | grep LOG

Your mileage may vary, as would your PGDATA dir etc. Read the docs on
the postmaster. But the end result is the ability to watch each query
fly by on your development machine, letting you see the order in which
updates, inserts, selects, and commits happen from the postgres
backend's perspective.

I suspect the original poster has code issues being tickled by Java
threading issues interacting poorly with the default transaction
isolation level of READ COMMITTED: the inserts are being done in one
thread / JTA transaction, while the read is being done in another and
is losing the race, because the inserting thread has not committed yet.
Running the backend in debugging mode should let you see the select
happening _before_ the first thread has committed its transaction.
Postgres is doing exactly what it is being told: if the inserting
transaction has not yet committed, and the reading transaction's
isolation level is READ COMMITTED, then postgres _will_not_ return
anything to the reading connection / thread which has not yet been
committed. One way to keep the notification from outrunning the commit
is sketched below.

Good luck with reading and following all of the EJB and CMP
specifications, the JBossCMP documentation, the JTA spec, and then
swallowing all of postgres [ or any other SQL backend ]. If you don't
have all of 'em fully understood yet, you will have to one day if you
continue with all that fat tech which was supposed to make things easy
for you. CMP is a very leaky, overcomplicated abstraction.

----
James Robinson
Socialserve.com
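If the session bean is sending its JMS notification on a non-transacted
connection, the message can reach the message driven bean before the
insert commits. One way to avoid that is to let the send ride in the
same container-managed transaction, so the container does not deliver
the message until the persist has committed. A rough, untested sketch;
the entity and interface are the hypothetical StoredData / StoreLocal
from earlier in the thread, "queue/parseQueue" is a made-up queue name,
and java:/JmsXA is assumed to be the usual JBoss pooled XA connection
factory binding:

import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.Queue;
import javax.jms.Session;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class StoreBean implements StoreLocal {

    @PersistenceContext
    private EntityManager em;

    @Resource(mappedName = "java:/JmsXA")        // JCA-managed, XA-aware factory
    private ConnectionFactory factory;

    @Resource(mappedName = "queue/parseQueue")   // made-up queue name
    private Queue queue;

    public Long store(byte[] bytes) {
        StoredData d = new StoredData();
        d.setPayload(bytes);
        em.persist(d);                            // part of the CMT transaction

        try {
            Connection c = factory.createConnection();
            try {
                // With a JCA-managed factory the session is enlisted in the
                // same JTA transaction, so the message is only delivered
                // after that transaction commits.
                Session s = c.createSession(true, Session.SESSION_TRANSACTED);
                s.createProducer(queue).send(s.createObjectMessage(d.getId()));
            } finally {
                c.close();
            }
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
        return d.getId();
    }

    public StoredData lookup(Long id) {
        return em.find(StoredData.class, id);
    }
}

The message driven bean then does its em.find() on the id it pulls out
of the ObjectMessage, and by the time the container hands it the
message the inserting transaction has already committed.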