Re: [ADMIN] performance issue using DBI - Mailing list pgsql-general

From Giorgio Ponza
Subject Re: [ADMIN] performance issue using DBI
Msg-id 0f6e01c20d47$9604eee0$c801a8c0@opla.it
In response to performance issue using DBI  ("Nicolas Nolst" <nnolst@hotmail.com>)
List pgsql-general
I'm not a DB expert, but I think a transaction with more than 20,000 operations can't be quick anyway.
I think you have to work another way: maybe read one line at a time, do the operations, commit, and delete the line from the file.
To keep track of all those operations in one transaction, I think Postgres must swap more than a 386 with 4 MB of RAM!
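For what it's worth, a rough sketch of that approach with DBI, committing every N lines instead of once at the end (the connection details, file name and process_line are placeholders, not from your post, and the delete-from-the-file step is left out):

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Placeholder DSN and credentials, not from the original post.
my $dbh = DBI->connect('dbi:Pg:dbname=logs', 'user', 'password',
                       { AutoCommit => 0, RaiseError => 1 });

open my $fh, '<', 'daily.log' or die "cannot open log file: $!";
my $count = 0;
while (my $line = <$fh>) {
    process_line($dbh, $line);            # the per-line SELECT/INSERT work
    $dbh->commit if ++$count % 100 == 0;  # commit every 100 lines, not once per 20,000
}
$dbh->commit;      # flush the last partial batch
$dbh->disconnect;

sub process_line { }  # stub standing in for the logic from the original post below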
Bye
Giorgio
----- Original Message -----
Sent: Thursday, June 06, 2002 11:40 AM
Subject: [ADMIN] performance issue using DBI

Hi all,

I have developed a Perl script to populate a database with two tables: sessions
and actions.

The table actions contains the following columns: session_id, url, timestamp.
The column session_id references the table sessions.

The table sessions contains the following columns: session_id, remote_ip,
phone_type, phone_number. The column session_id is serial.

Rows in the table actions that belong to the same session share the same session_id.

There are therefore more rows in the table actions than in the table sessions.


To fill the two tables, I first need to know whether a session already exists for a
given phone_type, remote_ip and phone_number:

SELECT session_id FROM sessions WHERE (phone_number = ?) AND (remote_ip = ?)
AND (phone_type = ?) ORDER BY session_id;
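From DBI that lookup would typically be prepared once and re-executed for every log line; a minimal sketch, assuming a connected handle $dbh and the three values already parsed from the current line:

# Prepared once outside the loop, executed per log line.
my $find_session = $dbh->prepare(q{
    SELECT session_id FROM sessions
    WHERE (phone_number = ?) AND (remote_ip = ?) AND (phone_type = ?)
    ORDER BY session_id
});

$find_session->execute($phone_number, $remote_ip, $phone_type);
my ($session_id) = $find_session->fetchrow_array;  # undef when no session matches
$find_session->finish;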

I also need to apply a criterion to decide whether or not I have to add a new entry
in the table sessions:

SELECT (max(timestamp) + ?) < ? FROM actions WHERE (session_id = ?);
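If I read it right, this is true when the session's most recent action is older than the current log timestamp minus the timeout, i.e. the session has expired. Executing it from DBI might look like this (variable names are assumed, and DBD::Pg may return the boolean as 1/0 or as 't'/'f' depending on version):

my $check_expired = $dbh->prepare(
    'SELECT (max(timestamp) + ?) < ? FROM actions WHERE (session_id = ?)'
);
$check_expired->execute($timeout, $current_ts, $session_id);
my ($val) = $check_expired->fetchrow_array;
my $expired = defined $val && $val ne 'f' && $val;  # handles both boolean styles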



If the session already exists, I add a row in the table actions with an INSERT.

If the session doesn't exist, or if the criterion is true, I add a row in the
table sessions with an INSERT and then add a row in the table actions with an INSERT (I use nextval and currval).
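That two-INSERT path might look like the following; sessions_session_id_seq is the default sequence name for a serial column, but the real name is an assumption here:

# The serial default on session_id calls nextval; currval then returns
# that same value within this connection, so the action row can reuse it.
$dbh->do(q{
    INSERT INTO sessions (remote_ip, phone_type, phone_number)
    VALUES (?, ?, ?)
}, undef, $remote_ip, $phone_type, $phone_number);

$dbh->do(q{
    INSERT INTO actions (session_id, url, timestamp)
    VALUES (currval('sessions_session_id_seq'), ?, ?)
}, undef, $url, $timestamp);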

I have put indexes on sessions(session_id), sessions(msisdn),
actions(session_id) and actions(timestamp). I process one log file of about 20,000 lines every day. All the lines are processed in one transaction (AutoCommit set to 0).
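For reference, the setup described would correspond to DDL along these lines (the index names are guesses, and msisdn is presumably the phone_number column):

# One-time setup, run once rather than per load.
$dbh->do('CREATE INDEX sessions_session_id_idx ON sessions (session_id)');
$dbh->do('CREATE INDEX sessions_msisdn_idx ON sessions (msisdn)');
$dbh->do('CREATE INDEX actions_session_id_idx ON actions (session_id)');
$dbh->do('CREATE INDEX actions_timestamp_idx ON actions (timestamp)');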

My problem is that populating my database gets slower as the data grows, and
performance eventually falls dramatically. I thought my indexes would improve this,
but the problem persists.

Could you please give me some clues that could solve this issue?

Thanks.



Nicolas Nolst


