Thread: Postgres (selection of thesis topic)

Postgres (selection of thesis topic)

From
"Harpreet Dhaliwal"
Date:
Hi,
I'm fairly new to PostgreSQL. The project I'm currently working on parses emails, stores the parsed components in a PostgreSQL database, and fires triggers
on certain inserts. Each trigger opens a socket connection to a Unix tools server, which runs tools such as whois and traceroute. The Unix tools server then opens an ODBC connection back to the same
PostgreSQL database and stores the results fetched from running those tools.

In this regard, I have to start working on a thesis topic related to the PostgreSQL database that we are using in the project. It can be in conjunction with email parsing or the Unix tools, but the theme of the thesis should revolve around the PostgreSQL database.

I have done a lot of homework on this and could think of something like "bulk data storage in email parsing and how vacuuming it would increase performance", because I think this vacuum concept does not exist in other RDBMSs. This is just a minor topic, but I was thinking of something along these lines.

I have no clue what other options or topics I have to start writing my thesis on.
Any kind of help would be highly appreciated in this regard.

Thanks,
~Harpreet

Re: Postgres (selection of thesis topic)

From
Richard Huxton
Date:
Harpreet Dhaliwal wrote:
> In this regard, I have to start working on some thesis topic related to the
> postgres database that we are using in the project. It can be in conjunction
> with email parsing on unix tools but the theme of the thesis topic should
> revolve around postgres database.

You probably want to talk to the people behind this:
   http://www.archiveopteryx.org/

--
   Richard Huxton
   Archonet Ltd

Re: Postgres (selection of thesis topic)

From
"Alexander Staubo"
Date:
On 5/2/07, Harpreet Dhaliwal <harpreet.dhaliwal01@gmail.com> wrote:
> I'm kind of new to postgresql and the project that I'm working on currently
> deals with parsing emails, storing parsed components in postgresql DB and
> fire triggers on certain inserts that opens socket connection with a unix
> tools server,

Are you sure it is a good idea to do this processing synchronously?
What happens if there is a network problem? It sounds like an
inefficient and inflexible design.
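
One way to picture the asynchronous alternative is a trigger that only enqueues a job row, with a separate worker doing the network calls later. The following is a minimal sketch under stated assumptions, not the poster's design: SQLite (in memory) stands in for PostgreSQL so the code runs without a server, and the names parsed_email, lookup_job, run_tool, and process_pending are hypothetical.

```python
import sqlite3

# SQLite stands in for PostgreSQL here (an assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parsed_email (id INTEGER PRIMARY KEY, sender_host TEXT NOT NULL);
CREATE TABLE lookup_job (
    id INTEGER PRIMARY KEY,
    email_id INTEGER NOT NULL REFERENCES parsed_email(id),
    status TEXT NOT NULL DEFAULT 'pending',
    result TEXT
);
-- The trigger only records that work is needed and opens no sockets.
CREATE TRIGGER enqueue_lookup AFTER INSERT ON parsed_email
BEGIN
    INSERT INTO lookup_job (email_id) VALUES (NEW.id);
END;
""")


def run_tool(host):
    """Hypothetical placeholder for the whois/traceroute call."""
    if host == "unreachable.example":
        raise OSError("network problem")
    return f"whois output for {host}"


def process_pending(conn):
    """Worker pass, run outside the inserting transaction."""
    rows = conn.execute(
        "SELECT j.id, e.sender_host FROM lookup_job j "
        "JOIN parsed_email e ON e.id = j.email_id WHERE j.status = 'pending'"
    ).fetchall()
    for job_id, host in rows:
        try:
            result = run_tool(host)
        except OSError:
            continue  # leave the job pending so a later pass can retry it
        conn.execute(
            "UPDATE lookup_job SET status = 'done', result = ? WHERE id = ?",
            (result, job_id),
        )
    conn.commit()


conn.execute("INSERT INTO parsed_email (sender_host) VALUES ('mail.example')")
conn.execute("INSERT INTO parsed_email (sender_host) VALUES ('unreachable.example')")
conn.commit()
process_pending(conn)
statuses = [r[0] for r in conn.execute("SELECT status FROM lookup_job ORDER BY id")]
print(statuses)
```

With this shape, a network failure leaves the insert committed and the job pending rather than blocking or aborting the transaction that stored the email. On PostgreSQL itself, the worker could plausibly be woken with LISTEN/NOTIFY instead of polling.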

> I have done alot of homework on this and could think of something like "bulk
> of data storage in email parsing and how vacuuming it would increase the
> performance" because i think this vacuum DB concept is not there in other
> RDBMS.

SQLite also requires vacuuming, as do other databases based on
MVCC-like designs, although some (e.g., Oracle with its redo logs,
iirc) do their housekeeping behind the scenes.
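
The SQLite case is easy to observe with Python's standard sqlite3 module. This is a small sketch, assuming default rollback-journal mode with auto_vacuum explicitly turned off; the table name, row count, and row size are arbitrary choices for illustration.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mail.db")
conn = sqlite3.connect(path)
conn.execute("PRAGMA auto_vacuum = NONE")  # make the default explicit
conn.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO message (body) VALUES (?)",
    (("x" * 1000,) for _ in range(2000)),
)
conn.commit()
size_full = os.path.getsize(path)

# Deleting rows frees pages inside the file but does not return them to the OS.
conn.execute("DELETE FROM message")
conn.commit()
size_after_delete = os.path.getsize(path)

# VACUUM rebuilds the database file and drops the free pages.
conn.execute("VACUUM")
size_after_vacuum = os.path.getsize(path)
conn.close()

print(size_full, size_after_delete, size_after_vacuum)
```

The file should be noticeably smaller after the VACUUM than after the DELETE alone, which is the same reclaim-on-demand behaviour the thread is discussing for PostgreSQL.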

Alexander.

Re: Postgres (selection of thesis topic)

From
Scott Marlowe
Date:
On Wed, 2007-05-02 at 08:00, Alexander Staubo wrote:
> On 5/2/07, Harpreet Dhaliwal <harpreet.dhaliwal01@gmail.com> wrote:
> > I'm kind of new to postgresql and the project that I'm working on currently
> > deals with parsing emails, storing parsed components in postgresql DB and
> > fire triggers
> > on certain inserts that opens socket connection with a unix tools server,
>
> Are you sure it is a good idea to do this processing synchronously?
> What happens if there is a network problem? It sounds like an
> inefficient and inflexible design.
>
> > I have done alot of homework on this and could think of something like "bulk
> > of data storage in email parsing and how vacuuming it would increase the
> > performance" because i think this vacuum DB concept is not there in other
> > RDBMS.
>
> SQLite also requires vacuuming, as does other databases based on
> MVCC-like designs, although some (eg., Oracle with its redo logs,
> iirc) do their housekeeping behind the scenes.

We're running Oracle 9 here, and it's even worse than vacuuming.  Once a
table grows, it stays grown until you rebuild it (you use the move
command, you just don't move it), and if it has filled up its tablespace,
you have to extend it to get room to do that.  On top of that, you can't
move a partitioned table.

I'd say Oracle9 is about 10 times worse than PostgreSQL (any version)
for the amount of manual maintenance it takes to keep it happy.

Re: Postgres (selection of thesis topic)

From
"Martin Gainty"
Date:
Good Morning Scott-

The following URL describes the Move Partition directive for partitioned
tables:
http://www.csee.umbc.edu/help/oracle8/server.815/a67772/partiti.htm
You will then need to rebuild the indices to point to the new partition.

Is there some way of automatically rebuilding the indices when moving
partitioned tables under Postgres?

Thanks,
Martin

----- Original Message -----
From: "Scott Marlowe" <smarlowe@g2switchworks.com>
To: "Alexander Staubo" <alex@purefiction.net>
Cc: "Harpreet Dhaliwal" <harpreet.dhaliwal01@gmail.com>; "pgsql general"
<pgsql-general@postgresql.org>
Sent: Wednesday, May 02, 2007 11:28 AM
Subject: Re: [GENERAL] Postgres (selection of thesis topic)




Re: Postgres (selection of thesis topic)

From
Scott Marlowe
Date:
On Wed, 2007-05-02 at 10:52, Martin Gainty wrote:
> Good Morning Scott-
>
> The following URL contains the directive to Move Partition in a Partitioned
> Tables
> http://www.csee.umbc.edu/help/oracle8/server.815/a67772/partiti.htm
> you will then need to rebuild the indices to point to the new partition

Yeah, I've seen that, and used it even.

> Is there some manner of automatically rebuilding the indices when moving
> partition tables under Postgres?

You don't generally need to move partition tables or rebuild indexes in
PostgreSQL; that was my main point.  If you need to reclaim space because
you forgot to vacuum regularly, then you can do a vacuumdb -fz and a
reindexdb...  And no mucked-up indexes like you get when you move a
table in Oracle and forget to rebuild its indexes.  Why automatic index
rebuilding wasn't a part of Oracle 9 I'll never know.
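
For reference, the two commands named above can be spelled out with their long options. This is only a sketch that builds and prints the command lines without running them; the database name maildb is a hypothetical placeholder, and --full and --analyze are the long forms of -f and -z.

```python
import shlex


def maintenance_commands(dbname):
    """Build (but do not run) the space-reclaiming commands for one database."""
    return [
        ["vacuumdb", "--full", "--analyze", "--dbname", dbname],  # same as -fz
        ["reindexdb", "--dbname", dbname],
    ]


commands = maintenance_commands("maildb")
for argv in commands:
    print(shlex.join(argv))
```

A full vacuum takes an exclusive lock on each table while it works, so it is something to schedule rather than run casually; regular plain vacuuming is what avoids needing it.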

It just seems that all the things someone decided to write a script or
program for in PostgreSQL, and that got included in the pgsql/bin directory
or at least in contrib or on pgFoundry, are spread across various web pages
for Oracle.  Which seems almost backwards.  I'd kind of expect the open
source project to have things all over the web, mildly disorganized, and
the commercial project to have it all come with the package.

Keep in mind, I'm not slagging Oracle, really.  It's an impressive
database.  It just feels crufty, like there are tons of things you just
"have to know" to make it work.  More than I'd expected when I first
started using it.

Re: Postgres (selection of thesis topic)

From
"Alexander Staubo"
Date:
On 5/2/07, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> We're running Oracle 9 here, and it's even worse than vacuuming.  Once a
> table grows, it stays grown until you rebuild it (you use the move
> command, you just don't move it), and if it's filled up it's tablespace,

It's been a while since I touched the Beast, but does this unused
space significantly impact performance in the way it does with
PostgreSQL?

Alexander.