Re: charting performance measures with number or records in table - Mailing list pgsql-general
From | Jim C. Nasby |
---|---|
Subject | Re: charting performance measures with number or records in table |
Date | |
Msg-id | 20060504202148.GY97354@pervasive.com Whole thread Raw |
In response to | charting performance measures with number or records in table (SunWuKung <Balazs.Klein@axelero.hu>) |
List | pgsql-general |
On Mon, May 01, 2006 at 04:46:33PM +0200, SunWuKung wrote: > We had a discussion with my friend about whether to use an array or an > attached table and I was in favor of the attached table while he was > concerned about the the performance of the select/insert as the number > or records in the attached table grew and so favored to use an array in > the parent table. > > To persuade him I wanted to see how the time required to select or > insert records increased as the number of rows in the table grew. I was > less interested in the actual time as it is very hardware dependent more > interested in the trend. I tried this with the following table: > > CREATE TABLE "itemresponse" ( > "testoccasionid" INTEGER NOT NULL, > "itemorder" INTEGER NOT NULL, > "placeholdertypeid" SMALLINT DEFAULT 1 NOT NULL, > "response_datatype" SMALLINT NOT NULL, > "response" TEXT, > CONSTRAINT "itemresponse_new_idx" PRIMARY KEY("testoccasionid", > "itemorder", "placeholdertypeid") > ) WITHOUT OIDS; > > SELECT * FROM itemresponse WHERE testoccasionid=1751 > --returns 20 records > > I tried this with 10^2, 10^3, 10^4, 10^5, 10^6, 10^7 records in the > table. > To my surprise neither the time for the select nor the time for the > insert (1000 additional records) increased measurably. > Can it be real or is it an artefact? In this case the amount of work required to read will be largely dependant on how many heap pages need to be read in, which will depend greatly on the correlation of the index. Correlation is mostly dependent on how you add and update data. Of course as the table size grows you're likely to need to pull in more pages, but even a very large table with a very high correlation is unlikely to need to read too many pages. There's also some additional overhead as index size increases, but that's fairly limited in most cases. > -------- > On a more general note I think it would be usefull to make a > 'theoretical' graph to illustrate the behaviour of an index. Probably > there is already one but I didn't find it. > Say there is a table: > > CREATE TABLE "test" ( > "id" INTEGER NOT NULL, > CONSTRAINT id_idx PRIMARY KEY("id") > ) WITHOUT OIDS; > > and there are 0, 10^1, 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9 > records in it > > - Select id from test Where id=99 - time in whatever unit > - Insert Into test (id) Values (99) - time in whatever unit > - Select count(id) from test - time in whatever unit > - Table size - kb=? > - Index size - kb=? > - omit or add whatever makes/doesn't make sence here (eg. memory > required to do the select?, time to vacuum?) > > and the same thing without an index on the table. I think it would make > a good addition to the manual. > > Its just a thought, let me know what you think. Might be interesting info, but I don't know that the docs are the right place for it. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
pgsql-general by date: