Home > mailing lists

Loading speed - Mailing list pgsql-novice

From	Cath Lawrence
Subject	Loading speed
Date	October 22, 2003 02:16:59
Msg-id	FC0064C3-0436-11D8-B7DB-00039390F614@anu.edu.au Whole thread Raw
Responses	Re: Loading speed Re: Loading speed
List	pgsql-novice

Tree view

Hi again,

Is there an FAQ or something I can read about optimising speed of
insertions and queries? I am doing some simple-minded stuff at the
moment with pgsql, and my insertions are very very slow.

A rough idea of the problem here for background:
A protein has a small number of chains ('A', 'B' etc), each of which
has N2 amino acids, each of which has N3 atoms.
Now I've got lots and lots of atoms. Hundreds of thousands most like,
maybe a million or two, I haven't counted.  How can I most efficiently
load these?

The atom table is a bit like this with some language changes:
CREATE TABLE atom (
        id                  INTEGER PRIMARY KEY DEFAULT
nextval('atom_id_seq'),
        protein_id          CHAR(4) REFERENCES protein,
        chain_id            CHAR(1) NOT NULL,
        amino_acid_id       INTEGER REFERENCES amino_acid,
        atom_type           TEXT,
        element             TEXT,
        x                   FLOAT,
        y                   FLOAT,
        z                   FLOAT
);

Now the REFERENCES bits build triggers, and in each case it refers to
the primary key of the table. This is always indexed, yes? When I
construct the table it says  CREATE TABLE / PRIMARY KEY will create
implicit index 'amino_acid_pkey' for table 'amino_acid' - this IS an
index on the column I named as the PRIMARY KEY, not on some oid, isn't
id?

With my simple-minded pygresql script, I have one insert statement per
atom, and of course it crawls very feebly. Perhaps there's an
insert_many method I haven't found yet (the docs suck, I will keep
looking).

Or perhaps I should write them out to a text file and use COPY?

Any advice?

cheers
Cath
Cath Lawrence,                       Cath.Lawrence@anu.edu.au
Senior Scientific Programmer,  Centre for Bioinformation Science,
John Curtin School of Medical Research (room 4088)
Australian National University,  Canberra ACT 0200
ph: (02) 61257959   mobile: 0421-902694   fax: (02) 61252595

pgsql-novice by date:

From: Josh Berkus
Date: 21 October 2003, 20:13:11
Subject: Re: Almost relational PostgreSQL (was: one-to-one)

From: Michael Glaesmann
Date: 22 October 2003, 06:27:14
Subject: Re: Almost relational PostgreSQL (was: one-to-one)

Loading speed - Mailing list pgsql-novice

Previous

Next