Re: Importing Large Amounts of Data - Mailing list pgsql-hackers

From: Gavin Sherry
Subject: Re: Importing Large Amounts of Data
Msg-id: Pine.LNX.4.21.0204161336500.5405-100000@linuxworld.com.au
In response to: Re: Importing Large Amounts of Data (Curt Sampson <cjs@cynic.net>)
Responses: Re: Importing Large Amounts of Data (Curt Sampson <cjs@cynic.net>)
List: pgsql-hackers
On Tue, 16 Apr 2002, Curt Sampson wrote:

[snip]

> What I'm thinking would be really cool would be to have an "offline"
> way of creating tables using a stand-alone program that would write
> the files at, one hopes, near disk speed. 

Personally, I think there is some merit in this. Postgres can be used
for large-scale data mining, an application which usually does not need
multi-versioning and concurrency but which can benefit from Postgres's
implementation of SQL, as well as from its backend extensibility.

I don't see any straightforward way of modifying the code to allow a fast
path directly to the relations on disk. However, it should be possible to
bypass locking, RI, MVCC, etc. with a bootstrap-like tool.
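
To make that concrete, here is a very rough sketch of what the write
side of such a tool might look like. The page layout and all the names
below are invented for illustration (a real tool would have to emit the
actual heap page format); the point is only that pages are built in
private memory and appended sequentially, with no locking, no WAL and
no shared buffer manager in the way:

    /*
     * Illustration only: pack tuples into fixed-size pages and append
     * them to the relation file.  The page layout here is made up for
     * the sketch, not the real heap page format.
     */
    #include <stdio.h>
    #include <string.h>

    #define PAGE_SIZE 8192

    typedef struct
    {
        char   data[PAGE_SIZE];
        size_t used;
    } Page;

    static void
    page_init(Page *p)
    {
        memset(p->data, 0, PAGE_SIZE);
        p->used = 0;
    }

    static int
    page_add(Page *p, const void *tuple, unsigned int len)
    {
        /* a 4-byte length word followed by the tuple body */
        if (p->used + sizeof(len) + len > PAGE_SIZE)
            return 0;               /* page full; caller must flush */
        memcpy(p->data + p->used, &len, sizeof(len));
        memcpy(p->data + p->used + sizeof(len), tuple, len);
        p->used += sizeof(len) + len;
        return 1;
    }

    static void
    page_flush(Page *p, FILE *rel)
    {
        /* no locking, no WAL: append the page and start a new one */
        fwrite(p->data, PAGE_SIZE, 1, rel);
        page_init(p);
    }

Since the tool would own the relation file exclusively, the kernel can
stream these writes out at something close to disk speed.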

Such a tool could only be used while the database was offline. It would
read data from files passed to it in some format, perhaps the one
generated by COPY.
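
COPY's text format would be easy enough to consume: one row per line,
tab-separated columns, \N for NULL, and special characters
backslash-escaped, so a raw tab never appears inside a field. A minimal
sketch of the line splitting (names invented; escape decoding omitted
for brevity, and the trailing newline is assumed to be stripped
already):

    #include <string.h>

    /*
     * Split one line of COPY text format into fields, in place.
     * Returns the number of columns found.
     * Usage: char *fields[64]; n = split_copy_line(line, fields, 64);
     */
    static int
    split_copy_line(char *line, char *fields[], int maxfields)
    {
        int   ncols = 0;
        char *cp = line;

        while (ncols < maxfields)
        {
            fields[ncols++] = cp;
            cp = strchr(cp, '\t'); /* raw tabs only delimit columns */
            if (cp == NULL)
                break;
            *cp++ = '\0';
        }
        return ncols;
    }

    /* A field consisting of backslash-N represents SQL NULL. */
    static int
    field_is_null(const char *field)
    {
        return strcmp(field, "\\N") == 0;
    }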

Given the very low parsing and 'planning' overhead, the real costs would
be WAL (without it, a failure in the bootstrapper could render the
database unusable) and the subsequent updating of the on-disk relations.
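
On the WAL point: one way to stop a failed bootstrap from rendering
anything unusable, without paying for WAL, might be to build each
relation file off to the side and only move it into place once it is
complete and safely on disk; a crash mid-load then just leaves a junk
temporary file behind. A sketch, with the function name and paths
invented for the example:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /*
     * Build the new relation in tmppath, then call this to install it.
     * rename() is atomic, so a crash at any point leaves either the
     * old state or the new one, never a half-written file in place.
     */
    static int
    install_relation(const char *tmppath, const char *relpath)
    {
        int fd = open(tmppath, O_RDONLY);

        if (fd < 0)
            return -1;
        if (fsync(fd) != 0)         /* force the data to disk first */
        {
            close(fd);
            return -1;
        }
        close(fd);
        return rename(tmppath, relpath);
    }

A real tool would also want to fsync the containing directory after the
rename, to make the new name itself durable.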

Comments?

Gavin


