Re: charting performance measures with number or records in table - Mailing list pgsql-general

From Jim C. Nasby
Subject Re: charting performance measures with number or records in table
Msg-id 20060504202148.GY97354@pervasive.com
In response to charting performance measures with number or records in table  (SunWuKung <Balazs.Klein@axelero.hu>)
List pgsql-general
On Mon, May 01, 2006 at 04:46:33PM +0200, SunWuKung wrote:
> I had a discussion with my friend about whether to use an array or an
> attached table. I was in favor of the attached table, while he was
> concerned about the performance of the select/insert as the number
> of records in the attached table grew, and so favored using an array in
> the parent table.
>
> To persuade him I wanted to see how the time required to select or
> insert records increased as the number of rows in the table grew. I was
> less interested in the actual time, as it is very hardware dependent, and
> more interested in the trend. I tried this with the following table:
>
> CREATE TABLE "itemresponse" (
>   "testoccasionid" INTEGER NOT NULL,
>   "itemorder" INTEGER NOT NULL,
>   "placeholdertypeid" SMALLINT DEFAULT 1 NOT NULL,
>   "response_datatype" SMALLINT NOT NULL,
>   "response" TEXT,
>   CONSTRAINT "itemresponse_new_idx" PRIMARY KEY("testoccasionid",
> "itemorder", "placeholdertypeid")
> ) WITHOUT OIDS;
>
> SELECT * FROM itemresponse WHERE testoccasionid=1751
> --returns 20 records
>
> I tried this with 10^2, 10^3, 10^4, 10^5, 10^6, 10^7 records in the
> table.
> To my surprise neither the time for the select nor the time for the
> insert (1000 additional records) increased measurably.
> Can it be real or is it an artefact?

In this case the amount of work required for the read depends largely
on how many heap pages have to be read in, which in turn depends greatly
on the correlation of the index, that is, on how closely the physical
order of the rows in the heap follows the index order. Correlation is
mostly determined by how you add and update data. Of course, as the table
grows you're likely to need to pull in more pages, but even a very large
table with a very high correlation is unlikely to need to read many pages
for a lookup like this one.
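
As a rough sketch of how to check this on the table from the original post:
the planner's statistics view pg_stats reports a correlation value per
column (not per index), so the value for the leading index column,
testoccasionid, is the one to look at here. This assumes the table has
been analyzed recently; the numbers are estimates, not exact measurements.

```sql
-- Planner statistics are only populated or refreshed by ANALYZE.
ANALYZE itemresponse;

-- correlation near +1 or -1: heap order follows the column order, so the
-- rows for one testoccasionid tend to sit on a few adjacent heap pages.
-- correlation near 0: matching rows are scattered across the heap.
SELECT attname, n_distinct, correlation
FROM pg_stats
WHERE tablename = 'itemresponse'
  AND attname IN ('testoccasionid', 'itemorder');

-- Plan and actual run time for the lookup from the original post.
EXPLAIN ANALYZE
SELECT * FROM itemresponse WHERE testoccasionid = 1751;
```

Comparing the EXPLAIN ANALYZE timings at each table size, alongside the
correlation value, should show whether a flat trend comes from well-ordered
data or simply from everything being cached.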

There's also some additional overhead as index size increases, but
that's fairly limited in most cases.
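
One way to see how much the index itself grows, sketched under the
assumption that the server is recent enough to provide pg_relation_size()
(8.1 or later, as far as I know). A B-tree only gets deeper
logarithmically with the number of entries, which is why the extra cost
per lookup stays small even when the index gets much larger.

```sql
-- relpages and reltuples are estimates refreshed by VACUUM and ANALYZE;
-- relpages is counted in blocks (8 kB with the default build settings).
SELECT relname, relkind, relpages, reltuples
FROM pg_class
WHERE relname IN ('itemresponse', 'itemresponse_new_idx');

-- On-disk size in bytes of the table heap and of its primary key index.
SELECT pg_relation_size('itemresponse')         AS table_bytes,
       pg_relation_size('itemresponse_new_idx') AS index_bytes;
```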

> --------
> On a more general note, I think it would be useful to make a
> 'theoretical' graph to illustrate the behaviour of an index. Probably
> there is already one, but I didn't find it.
> Say there is a table:
>
> CREATE TABLE "test" (
> "id" INTEGER NOT NULL,
> CONSTRAINT id_idx PRIMARY KEY("id")
> ) WITHOUT OIDS;
>
> and there are 0, 10^1, 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9
> records in it
>
>  - Select id from test Where id=99 - time in whatever unit
>  - Insert Into test (id) Values (99) - time in whatever unit
>  - Select count(id) from test - time in whatever unit
>  - Table size - kb=?
>  - Index size - kb=?
>  - omit or add whatever makes/doesn't make sense here (e.g. memory
> required to do the select?, time to vacuum?)
>
> and the same thing without an index on the table. I think it would make
> a good addition to the manual.
>
> It's just a thought, let me know what you think.

Might be interesting info, but I don't know that the docs are the right
place for it.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
