Re: Importing Large Amounts of Data - Mailing list pgsql-hackers

From: Gavin Sherry
Subject: Re: Importing Large Amounts of Data
Msg-id: Pine.LNX.4.21.0204161336500.5405-100000@linuxworld.com.au
In response to: Re: Importing Large Amounts of Data (Curt Sampson <cjs@cynic.net>)
Responses: Re: Importing Large Amounts of Data (Curt Sampson <cjs@cynic.net>)
List: pgsql-hackers
On Tue, 16 Apr 2002, Curt Sampson wrote:

[snip]

> What I'm thinking would be really cool would be to have an "offline"
> way of creating tables using a stand-alone program that would write
> the files at, one hopes, near disk speed. 

Personally, I think there is some merit in this. Postgres can be used
for large-scale data mining, an application which usually does not need
multi-versioning and concurrency but which can benefit from Postgres's
implementation of SQL, as well as from its backend extensibility.

I don't see any straightforward way of modifying the code to allow a fast
path directly to the relations on disk. However, it should be possible to
bypass locking, RI, MVCC, etc. with a bootstrap-like tool.
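
To make that concrete, here is a very rough sketch of what the write
side of such a tool might look like. The page layout and all the names
below are invented for illustration (a real tool would have to emit the
actual heap page format); the point is only that pages are built in
private memory and appended sequentially, with no locking, no WAL and
no shared buffer manager in the way:

    /*
     * Illustration only: pack tuples into fixed-size pages and append
     * them to the relation file.  The page layout here is made up for
     * the sketch, not the real heap page format.
     */
    #include <stdio.h>
    #include <string.h>

    #define PAGE_SIZE 8192

    typedef struct
    {
        char   data[PAGE_SIZE];
        size_t used;
    } Page;

    static void
    page_init(Page *p)
    {
        memset(p->data, 0, PAGE_SIZE);
        p->used = 0;
    }

    static int
    page_add(Page *p, const void *tuple, unsigned int len)
    {
        /* a 4-byte length word followed by the tuple body */
        if (p->used + sizeof(len) + len > PAGE_SIZE)
            return 0;               /* page full; caller must flush */
        memcpy(p->data + p->used, &len, sizeof(len));
        memcpy(p->data + p->used + sizeof(len), tuple, len);
        p->used += sizeof(len) + len;
        return 1;
    }

    static void
    page_flush(Page *p, FILE *rel)
    {
        /* no locking, no WAL: append the page and start a new one */
        fwrite(p->data, PAGE_SIZE, 1, rel);
        page_init(p);
    }

Since the tool would own the relation file exclusively, the kernel can
stream these writes out at something close to disk speed.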

Such a tool could only be used while the database was offline. It would
read data from files passed to it in some format, perhaps the one
generated by COPY.
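
COPY's text format would be easy enough to consume: one row per line,
tab-separated columns, \N for NULL, and special characters
backslash-escaped, so a raw tab never appears inside a field. A minimal
sketch of the line splitting (names invented; escape decoding omitted
for brevity, and the trailing newline is assumed to be stripped
already):

    #include <string.h>

    /*
     * Split one line of COPY text format into fields, in place.
     * Returns the number of columns found.
     * Usage: char *fields[64]; n = split_copy_line(line, fields, 64);
     */
    static int
    split_copy_line(char *line, char *fields[], int maxfields)
    {
        int   ncols = 0;
        char *cp = line;

        while (ncols < maxfields)
        {
            fields[ncols++] = cp;
            cp = strchr(cp, '\t'); /* raw tabs only delimit columns */
            if (cp == NULL)
                break;
            *cp++ = '\0';
        }
        return ncols;
    }

    /* A field consisting of backslash-N represents SQL NULL. */
    static int
    field_is_null(const char *field)
    {
        return strcmp(field, "\\N") == 0;
    }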

Given the very low parsing and 'planning' overhead, the real costs would
be WAL (without it, a failure in the bootstrapper could render the
database unusable) and the subsequent updating of the on-disk relations.
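
On the WAL point: one way to stop a failed bootstrap from rendering
anything unusable, without paying for WAL, might be to build each
relation file off to the side and only move it into place once it is
complete and safely on disk; a crash mid-load then just leaves a junk
temporary file behind. A sketch, with the function name and paths
invented for the example:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /*
     * Build the new relation in tmppath, then call this to install it.
     * rename() is atomic, so a crash at any point leaves either the
     * old state or the new one, never a half-written file in place.
     */
    static int
    install_relation(const char *tmppath, const char *relpath)
    {
        int fd = open(tmppath, O_RDONLY);

        if (fd < 0)
            return -1;
        if (fsync(fd) != 0)         /* force the data to disk first */
        {
            close(fd);
            return -1;
        }
        close(fd);
        return rename(tmppath, relpath);
    }

A real tool would also want to fsync the containing directory after the
rename, to make the new name itself durable.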

Comments?

Gavin


