Thread: Postgres - performance problem
Hi all,

I have 1 table with:
- 417 columns
- 600,000 rows of data
- 34 indexes

When I run queries on this table, they take very long. For example:

update master_items set temp1 = '' where temp1 <> ''
-- Query returned successfully: 435214 rows affected, 1016137 ms execution time.

alter table master_items add "TYPE-DE" varchar default ''
-- Query returned successfully with no result in 1211019 ms.

update master_items set "feedback_to_de" = 'Yes'
-- Query returned successfully: 591268 rows affected, 1589335 ms execution time.

Can you help me find any way to increase performance?

Thanks all

--
View this message in context: http://postgresql.nabble.com/Posgres-performance-problem-tp5843364.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
I'm not an expert either, but your data model sounds very broken as well... I guess it's possible that each query would need all 417 columns but it seems unlikely...
If that were normalized into 'n' tables then each query would be returning a whole lot less data...
I've never heard of a database being stuffed into one table before...
What is your use case / requirement for one table?
Gary
On Wed, Mar 25, 2015 at 4:50 AM, John McKown <john.archie.mckown@gmail.com> wrote:
On Wed, Mar 25, 2015 at 4:19 AM, ginkgo36 <ginkgo56@gmail.com> wrote:
> Hi all,
> I have 1 table have:
> - 417 columns
> - 600.000 rows data
> - 34 indexs
>
> when i use query on this table, it so long. ex:
>
> update master_items set
> temp1 = '' where temp1 <> '' --Query returned successfully: 435214 rows
> affected, 1016137 ms execution time.
>
> alter table master_items add "TYPE-DE" varchar default ''
> -- Query returned successfully with no result in 1211019 ms.
>
> update master_items set "feedback_to_de" = 'Yes'
> --Query returned successfully: 591268 rows affected, 1589335 ms execution
> time.
>
> Can you help me find any way to increase performance?
>
> Thanks all
I am not any kind of a performance expert. But the first thing that I
would try is an EXPLAIN. If you're using the psql command line, I'd do
something like:
BEGIN; -- BEGIN TRANSACTION
EXPLAIN (ANALYZE, TIMING) UPDATE master_items SET temp1 = '' WHERE temp1 <> '';
ROLLBACK;
I'd put the EXPLAIN in a transaction that I roll back so that I
wouldn't actually update anything permanently. Also, doing a simple
ANALYZE on the table might help some. I'm not sure.
ANALYZE master_items;
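On the same theme, and assuming the statistics collector is running (it is by default), it might also be worth checking which of the 34 indexes are ever used for reads, since every one of them has to be maintained on every write. A rough sketch:

```sql
-- Indexes on master_items with few or no scans are read-useless
-- but still add cost to every UPDATE.
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'master_items'
ORDER BY idx_scan;
```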
--
If you sent twitter messages while exploring, are you on a textpedition?
He's about as useful as a wax frying pan.
10 to the 12th power microphones = 1 Megaphone
Maranatha! <><
John McKown
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
@Gary
I'm working on big data; because of the demands of the job, I
export/import/update data on this table every day.

"I guess it's possible that each query would need all 417 columns but it
seems unlikely..." --> Not all of them, but about 2/3 of the 417 columns :)

I need to gather the data into one table for consistency and for easy
export and import. It would be OK to split the data into smaller tables,
but then for every export/import/update I would have to run queries on a
lot of tables. And that way the data can become inconsistent if I forget to
update/export/import one or more tables. It is terrible.

@John
I will read up on it. Thanks a lot.

@All: If you have any suggestions, please let me know. Thanks for your help.
On 25/03/2015 14:30, ginkgo36 wrote:
> I need to gather the data into one table for consistency and for easy
> export and import. It would be OK to split the data into smaller tables,
> but then for every export/import/update I would have to run queries on a
> lot of tables.

You could create a view specifically for export, which would gather
together data from all the tables.

Ray.

--
Raymond O'Donnell :: Galway :: Ireland
rod@iol.ie
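A minimal sketch of that idea, assuming the wide table were split into hypothetical `item_core` and `item_de_attrs` tables sharing an `item_id` key (all names invented for illustration):

```sql
-- A view that reassembles the normalized tables for export,
-- so the export job still sees "one table".
CREATE VIEW master_items_export AS
SELECT c.item_id, c.name, d.feedback_to_de
FROM item_core c
JOIN item_de_attrs d USING (item_id);

-- The export then reads from the view, e.g.:
-- COPY (SELECT * FROM master_items_export) TO '/tmp/export.csv' CSV HEADER;
```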
On 03/25/2015 07:30 AM, ginkgo36 wrote:
> @Gary
> I'm working on big data, because of the demands of the job so I
> export/import/update data on this table every day.
>
> I guess it's possible that each query would need all 417 columns but it
> seems unlikely... --> Yes, not at all but 2/3 of 417 columns :)

There is also the matter of the 34 indexes, which can be thought of as 34
auxiliary tables that need to be kept in sync with the main table.

--
Adrian Klaver
adrian.klaver@aklaver.com
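For one-off bulk changes like the ones in the original post, a common workaround for that index-maintenance cost is to drop indexes that aren't needed during the update and rebuild them once afterwards. A sketch, assuming the rebuild can happen offline (the index name is illustrative):

```sql
-- Drop indexes not needed while the bulk update runs...
DROP INDEX IF EXISTS idx_master_items_temp1;

UPDATE master_items SET temp1 = '' WHERE temp1 <> '';

-- ...then rebuild them in one pass, instead of maintaining
-- them row by row during the update.
CREATE INDEX idx_master_items_temp1 ON master_items (temp1);
```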
"I need gather data into one table for consistency and easy for export and
import, it's ok if I split data to smaller tables, but when
export/import/update, i must excute query on alot of table. And this way
lead data to inconsistency if I forget update/export/import on 1 or more
table. It is terrible."
That statement is wrong on many levels:
- "Easy for import and export..."
  - Multiple tables and one table are identical for import/export purposes.
- "export/import/update, i must excute query on alot of table."
  - That's what SQL is for...
- "lead data to inconsistency if I forget update/export/import on 1 or more
  table. It is terrible."
  - You build your process once and test it... Additional runs of the
    process are 'free'...
And as someone else mentioned, the 34 indexes are additional tables anyway.
There is probably a way to optimize your current system... There often is,
no matter how horrible the implementation...
But I would start by normalizing that as much as possible and then running
performance tests against the normalized schema. There are lots of tools to
do that... but they probably aren't much help with your current schema.
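Purely as an illustration (every table and column name here is invented), splitting the wide table along usage lines might look like:

```sql
-- Hypothetical split of the 417-column table into narrower tables
-- linked by a shared key.
CREATE TABLE item (
    item_id bigint PRIMARY KEY,
    name    text NOT NULL
);

CREATE TABLE item_de_attrs (
    item_id        bigint PRIMARY KEY REFERENCES item (item_id),
    feedback_to_de text DEFAULT ''
);
```

An UPDATE then touches only the narrow table involved, so far fewer bytes, and far fewer indexes, get rewritten per statement.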
Gary
On 3/25/2015 2:19 AM, ginkgo36 wrote:
> update master_items set
> temp1 = '' where temp1 <> '' --Query returned successfully: 435214 rows
> affected, 1016137 ms execution time.

That query is modifying 435000 rows of your table, and if temp1 is an
indexed field, the index has to be updated 435000 times, too. Note that in
Postgres, an UPDATE translates into an INSERT and a DELETE.

> alter table master_items add "TYPE-DE" varchar default ''
> -- Query returned successfully with no result in 1211019 ms.

That is rewriting all 600000 rows, to add this new field with its default
empty string content.

> update master_items set "feedback_to_de" = 'Yes'
> --Query returned successfully: 591268 rows affected, 1589335 ms execution
> time.

That is modifying 591000 rows, essentially rewriting the whole table.

> Can you help me find any way to increase performance?

More/faster storage. Faster CPU. More RAM. Or, completely rethink how you
store this data and normalize it, as everyone else has said.

--
john, recycling bits in santa cruz
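Since every UPDATE writes a new row version, one knob that can sometimes soften repeated whole-table updates (a sketch, not a guaranteed fix) is lowering the table's fillfactor, so new row versions can land on the same page as the old ones (HOT updates) and skip index maintenance for unindexed columns:

```sql
-- Leave ~30% free space per page so updated tuples can stay
-- on the same page (HOT), reducing index churn.
ALTER TABLE master_items SET (fillfactor = 70);
-- Only affects newly written pages; a table rewrite (e.g. VACUUM FULL)
-- is needed before existing pages pick up the new setting.
```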
> update master_items set "feedback_to_de" = 'Yes'
> --Query returned successfully: 591268 rows affected, 1589335 ms execution
> time.

Here you had better use

UPDATE master_items SET feedback_to_de = 'Yes'
WHERE feedback_to_de <> 'Yes'

or

UPDATE master_items SET feedback_to_de = 'Yes'
WHERE feedback_to_de IS DISTINCT FROM 'Yes'

depending on whether the column is nullable.

regards,
Marc Mamin
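The same idea can be enforced for all clients with the built-in suppress_redundant_updates_trigger() function (available since PostgreSQL 8.4), which skips any UPDATE that would leave the row unchanged; a sketch:

```sql
-- Skip no-op UPDATEs so they stop generating dead tuples and
-- index churn. The "z" prefix makes this trigger fire last,
-- since triggers fire in alphabetical order.
CREATE TRIGGER z_suppress_redundant
BEFORE UPDATE ON master_items
FOR EACH ROW
EXECUTE PROCEDURE suppress_redundant_updates_trigger();
```

This trades a per-row comparison on every UPDATE for the avoided writes, so it pays off mainly when many updates are redundant, as in the examples above.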