Thread: Performance with temporary table
Hi I have written a program that imputes(or rather corrects data) with in my database. Iam using a temporary table where in i put data from other partitoined table. I then query this table to get the desired data.But the thing is this temporary table has to be craeted for every record that i need to correct and there are thousands of such records that need to be corrected. So the program necessarily creates a temporary table evrytime it has to correct a record. However this table is dropeed after each record is corrected. The program works fine.....but it runs for a very long time....or it runs for days. Iam particularyly finding that it takes more time during this statement: NOTICE: theQuery in createtablevolumelaneshist CREATE TEMPORARY TABLE predictiontable(lane_id, measurement_start, speed,volume,occupancy) AS SELECT lane_id, measurement_start, speed,volume,occupancy FROM samantha.lane_data_I_495 WHERE lane_id IN (1317) AND measurement_start BETWEEN '2007-11-18 09:25:00' AND 2007-11-19 01:39:06' Iam not sure if i can use a cursor to replicate the functionality of the temp table. Is the performance bad because of the creation and deletion of the temp table? Thanks Samantha
samantha mahindrakar wrote: > Iam using a temporary table where in i put data from other partitoined > table. I then query this table to get the desired data.But the thing > is this temporary table has to be craeted for every record that i need > to correct and there are thousands of such records that need to be > corrected. > So the program necessarily creates a temporary table evrytime it has > to correct a record. However this table is dropeed after each record > is corrected. Why? -- Richard Huxton Archonet Ltd
samantha mahindrakar escribió: > So the program necessarily creates a temporary table evrytime it has > to correct a record. However this table is dropeed after each record > is corrected. Perhaps it would be better to truncate the temp table instead. > Iam not sure if i can use a cursor to replicate the functionality of > the temp table. Is the performance bad because of the creation and > deletion of the temp table? Yes -- if you create/drop thousands of temp tables (or create/drop the same temp table thousands of time), the resulting catalog bloat is likely to hinder performance. Perhaps autovacuum should be at work here (and if not you can solve the issue with manual vacuums to the system catalogs), but even then it is at best unnecessary. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Well instead of creating a temp table everytime i just created a permanant table and insert the data into it everytime and truncate it. I created indexes on this permanent table too. This did improve the performance to some extent. Does using permanant tables also bloat the catalog or hinder the performance? Thanks Samantha On 4/8/08, Alvaro Herrera <alvherre@commandprompt.com> wrote: > samantha mahindrakar escribió: > > > So the program necessarily creates a temporary table evrytime it has > > to correct a record. However this table is dropeed after each record > > is corrected. > > Perhaps it would be better to truncate the temp table instead. > > > Iam not sure if i can use a cursor to replicate the functionality of > > the temp table. Is the performance bad because of the creation and > > deletion of the temp table? > > Yes -- if you create/drop thousands of temp tables (or create/drop the > same temp table thousands of time), the resulting catalog bloat is > likely to hinder performance. Perhaps autovacuum should be at work here > (and if not you can solve the issue with manual vacuums to the system > catalogs), but even then it is at best unnecessary. > > -- > Alvaro Herrera http://www.CommandPrompt.com/ > The PostgreSQL Company - Command Prompt, Inc. >
samantha mahindrakar escribió: > Well instead of creating a temp table everytime i just created a > permanant table and insert the data into it everytime and truncate it. > I created indexes on this permanent table too. This did improve the > performance to some extent. > > Does using permanant tables also bloat the catalog or hinder the performance? In terms of catalog usage, permanent tables behave exactly the same as temp tables. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Apr 7, 8:27 pm, sam.mahindra...@gmail.com ("samantha mahindrakar") wrote: > Hi > I have written a program that imputes(or rather corrects data) with in > my database. > Iam using a temporary table where in i put data from other partitoined > table. I then query this table to get the desired data.But the thing > is this temporary table has to be craeted for every record that i need > to correct and there are thousands of such records that need to be > corrected. > So the program necessarily creates a temporary table evrytime it has > to correct a record. However this table is dropeed after each record > is corrected. > The program works fine.....but it runs for a very long time....or it > runs for days. > Iam particularyly finding that it takes more time during this statement: > > NOTICE: theQuery in createtablevolumelaneshist CREATE TEMPORARY TABLE > predictiontable(lane_id, measurement_start, speed,volume,occupancy) AS > SELECT lane_id, measurement_start, speed,volume,occupancy > FROM samantha.lane_data_I_495 WHERE > lane_id IN (1317) AND > measurement_start BETWEEN '2007-11-18 09:25:00' AND 2007-11-19 01:39:06' > > Iam not sure if i can use a cursor to replicate the functionality of > the temp table. Is the performance bad because of the creation and > deletion of the temp table? > > Thanks > Samantha > > -- > Sent via pgsql-performance mailing list (pgsql-performa...@postgresql.org) > To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-performance And why do you copy data from the partition tables? Did you try to manipulate data directly in the needed tables? Or you are aggregating some of the data there? How the partitioning is actually designed? Do you use table inheritance? -- Valentine
On Apr 8, 2008, at 2:43 PM, Alvaro Herrera wrote: > samantha mahindrakar escribió: >> Well instead of creating a temp table everytime i just created a >> permanant table and insert the data into it everytime and truncate >> it. >> I created indexes on this permanent table too. This did improve the >> performance to some extent. >> >> Does using permanant tables also bloat the catalog or hinder the >> performance? > > In terms of catalog usage, permanent tables behave exactly the same as > temp tables. True, but the point is that you're not bloating the catalogs with thousands of temp table entries. I agree with others though: it certainly doesn't sound like there's any reason to be using temp tables here at all. This sounds like a case of trying to apply procedural programming techniques to a database instead of using set theory (which generally doesn't work well). -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
Attachment
The partitions are used to separate the data according to months. I have run a query o find bad data from each such partition. The imputation algorithm that i use requires data from 10 previous weeks in order to impute the data. This historical data i store in a temporary table, the i query this data so that i can take a average of all the historical data. Before taking average some computations are performed. Since i need the historical data for every minute of data that i need to impute i have to store the data in some intermediate table. Hence the temporary table. Now i changed the code to use a permanent table that is truncated after one set of data is imputed. I hope this makes sense. Samantha On Wed, Apr 9, 2008 at 6:44 AM, valgog <valgog@gmail.com> wrote: > On Apr 7, 8:27 pm, sam.mahindra...@gmail.com ("samantha mahindrakar") > wrote: > > > > Hi > > I have written a program that imputes(or rather corrects data) with in > > my database. > > Iam using a temporary table where in i put data from other partitoined > > table. I then query this table to get the desired data.But the thing > > is this temporary table has to be craeted for every record that i need > > to correct and there are thousands of such records that need to be > > corrected. > > So the program necessarily creates a temporary table evrytime it has > > to correct a record. However this table is dropeed after each record > > is corrected. > > The program works fine.....but it runs for a very long time....or it > > runs for days. > > Iam particularyly finding that it takes more time during this statement: > > > > NOTICE: theQuery in createtablevolumelaneshist CREATE TEMPORARY TABLE > > predictiontable(lane_id, measurement_start, speed,volume,occupancy) AS > > SELECT lane_id, measurement_start, speed,volume,occupancy > > FROM samantha.lane_data_I_495 WHERE > > lane_id IN (1317) AND > > measurement_start BETWEEN '2007-11-18 09:25:00' AND 2007-11-19 01:39:06' > > > > Iam not sure if i can use a cursor to replicate the functionality of > > the temp table. Is the performance bad because of the creation and > > deletion of the temp table? > > > > Thanks > > Samantha > > > > -- > > Sent via pgsql-performance mailing list (pgsql-performa...@postgresql.org) > > To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-performance > > And why do you copy data from the partition tables? Did you try to > manipulate data directly in the needed tables? Or you are aggregating > some of the data there? How the partitioning is actually designed? Do > you use table inheritance? > > -- Valentine > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance >
Hi The reason for using the temporary table is that i need this data buffered somewhere so that i can use it for later computation. And the fact that for each imputation i need to have historical data from 10 previous weeks makes it necessary to create something that can hold the data. However once the computation is done for each record i wouldn't need that historical data for that record. I Would be moving on to the next record and find its own historical data. Is there any way i can avoid using temp table? Samantha On Wed, Apr 9, 2008 at 4:09 PM, Decibel! <decibel@decibel.org> wrote: > On Apr 8, 2008, at 2:43 PM, Alvaro Herrera wrote: > > > samantha mahindrakar escribió: > > > > > Well instead of creating a temp table everytime i just created a > > > permanant table and insert the data into it everytime and truncate it. > > > I created indexes on this permanent table too. This did improve the > > > performance to some extent. > > > > > > Does using permanant tables also bloat the catalog or hinder the > performance? > > > > > > > In terms of catalog usage, permanent tables behave exactly the same as > > temp tables. > > > > True, but the point is that you're not bloating the catalogs with thousands > of temp table entries. > > I agree with others though: it certainly doesn't sound like there's any > reason to be using temp tables here at all. This sounds like a case of > trying to apply procedural programming techniques to a database instead of > using set theory (which generally doesn't work well). > -- > Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org > Give your computer some brain candy! www.distributed.net Team #1828 > > >
On Apr 9, 2008, at 6:41 PM, samantha mahindrakar wrote: > Hi > The reason for using the temporary table is that i need this data > buffered somewhere so that i can use it for later computation. And the > fact that for each imputation i need to have historical data from 10 > previous weeks makes it necessary to create something that can hold > the data. However once the computation is done for each record i > wouldn't need that historical data for that record. I Would be moving > on to the next record and find its own historical data. > Is there any way i can avoid using temp table? What's wrong with the data in the paritions? Erik Jones DBA | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
We store traffic data in the partitioned tables. But the problem is that all this data is not correct. The data is corrupt, hence they need to be corrected. On Wed, Apr 9, 2008 at 10:31 PM, Erik Jones <erik@myemma.com> wrote: > > On Apr 9, 2008, at 6:41 PM, samantha mahindrakar wrote: > > > Hi > > The reason for using the temporary table is that i need this data > > buffered somewhere so that i can use it for later computation. And the > > fact that for each imputation i need to have historical data from 10 > > previous weeks makes it necessary to create something that can hold > > the data. However once the computation is done for each record i > > wouldn't need that historical data for that record. I Would be moving > > on to the next record and find its own historical data. > > Is there any way i can avoid using temp table? > > > > What's wrong with the data in the paritions? > > Erik Jones > > DBA | Emma(R) > erik@myemma.com > 800.595.4401 or 615.292.5888 > 615.292.0777 (fax) > > Emma helps organizations everywhere communicate & market in style. > Visit us online at http://www.myemma.com > > > >
I see, I am having practically the same problem... utilizing partitioning idea http://www.postgresql.org/docs/8.3/interactive/ddl-partitioning.html by table inheritance. I have prepared a post with some trigger and rule examples for you http://valgogtech.blogspot.com/2008/04/table-partitioning-automation-triggers.html . So I hope you will find it useful if you are not doing it already yourself :-). About the use of the temporary table, I would say, that you actually could try to add some special row status flag colum (I use "char" for such flags) to your partitioned tables to mark some rows as unused and then create some conditional indexes that consider this flag for your data operation... This would make it possible for you not to creating temporary tables I hope... With best regards, -- Valentine On Apr 10, 1:33 am, sam.mahindra...@gmail.com ("samantha mahindrakar") wrote: > The partitions are used to separate the data according to months. I > have run a query o find bad data from each such partition. The > imputation algorithm that i use requires data from 10 previous weeks > in order to impute the data. This historical data i store in a > temporary table, the i query this data so that i can take a average of > all the historical data. Before taking average some computations are > performed. Since i need the historical data for every minute of data > that i need to impute i have to store the data in some intermediate > table. Hence the temporary table. > Now i changed the code to use a permanent table that is truncated > after one set of data is imputed. > I hope this makes sense. > > Samantha > > > On Wed, Apr 9, 2008 at 6:44 AM, valgog <val...@gmail.com> wrote: > > On Apr 7, 8:27 pm, sam.mahindra...@gmail.com ("samantha mahindrakar") > > wrote: > > > > Hi > > > I have written a program that imputes(or rather corrects data) with in > > > my database. > > > Iam using a temporary table where in i put data from other partitoined > > > table. I then query this table to get the desired data.But the thing > > > is this temporary table has to be craeted for every record that i need > > > to correct and there are thousands of such records that need to be > > > corrected. > > > So the program necessarily creates a temporary table evrytime it has > > > to correct a record. However this table is dropeed after each record > > > is corrected. > > > The program works fine.....but it runs for a very long time....or it > > > runs for days. > > > Iam particularyly finding that it takes more time during this statement: > > > > NOTICE: theQuery in createtablevolumelaneshist CREATE TEMPORARY TABLE > > > predictiontable(lane_id, measurement_start, speed,volume,occupancy) AS > > > SELECT lane_id, measurement_start, speed,volume,occupancy > > > FROM samantha.lane_data_I_495 WHERE > > > lane_id IN (1317) AND > > > measurement_start BETWEEN '2007-11-18 09:25:00' AND 2007-11-19 01:39:06' > > > > Iam not sure if i can use a cursor to replicate the functionality of > > > the temp table. Is the performance bad because of the creation and > > > deletion of the temp table? > > > > Thanks > > > Samantha > > > > -- > > > Sent via pgsql-performance mailing list (pgsql-performa...@postgresql.org) > > > To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-performance > > > And why do you copy data from the partition tables? Did you try to > > manipulate data directly in the needed tables? Or you are aggregating > > some of the data there? How the partitioning is actually designed? Do > > you use table inheritance? > > > -- Valentine >