Thread: Question on partitioning
Hello , All.
I am not sure if this is the right mailing list to place this question.
If it doesn't, please kindly redirect me to the right list.
I have a giant table with about 10,6 million records.
Queries on it are usually slow, and if I try to do something more elaborate like an INNER JOIN with itself it becomes unnacceptably slow.
I am looking for a way to improve performance.
One of the columns is of type date. Each "day" includes about a few tens of thousands records
And the older a date is the less likely I am to do queries on it.
The objective of the "self join" is to compare data from two different days, looking for diferences.
Ive read that one of the benefits of partitioning is to speed up queries by separating less used records.
My question is if partitioning can be a good way to make the queries faster (specially the self joins) or if it isn't worth trying because it doesn't help on my particular situation.
Please kindly advice me on this
Many thanks in advance for your kind help
Best,
Oliveiros
On Thu, Aug 21, 2008 at 8:25 AM, Oliveiros Cristina <oliveiros.cristina@marktest.pt> wrote: > Hello , All. > > I am not sure if this is the right mailing list to place this question. > If it doesn't, please kindly redirect me to the right list. > > I have a giant table with about 10,6 million records. 10.6 million rows isn't really gigantic. It's a good size, but unless you're hitting a lot of the records at once, you should be able to get pretty good performance if it's indexed properly. > Queries on it are usually slow, and if I try to do something more elaborate > like an INNER JOIN with itself it becomes unnacceptably slow. > > I am looking for a way to improve performance. > One of the columns is of type date. Each "day" includes about a few tens of > thousands records > And the older a date is the less likely I am to do queries on it. Well, first let's look at your indexes and if they're being used. Got an explain analyze output of your slow queries for us to look at?
On Thu, 2008-08-21 at 15:25 +0100, Oliveiros Cristina wrote: > Hello , All. > > I am not sure if this is the right mailing list to place this > question. > If it doesn't, please kindly redirect me to the right list. > > I have a giant table with about 10,6 million records. > > Queries on it are usually slow, and if I try to do something more > elaborate like an INNER JOIN with itself it becomes unnacceptably > slow. > > I am looking for a way to improve performance. > One of the columns is of type date. Each "day" includes about a few > tens of thousands records > And the older a date is the less likely I am to do queries on it. > > The objective of the "self join" is to compare data from two different > days, looking for diferences. > > Ive read that one of the benefits of partitioning is to speed up > queries by separating less used records. > > My question is if partitioning can be a good way to make the queries > faster (specially the self joins) or if it isn't worth trying because > it doesn't help on my particular situation. > > Please kindly advice me on this > > Many thanks in advance for your kind help > > Best, > Oliveiros I would expect partitioning to work. I've heard tell that fine grained partitioning coupled with check constraints can even eliminate the need for certain indexes. I do know that check constraints on the date will help you tremendously if you decide to partition. -Mark
Hi Christina, Quoting Oliveiros Cristina <oliveiros.cristina@marktest.pt>: > Hello , All. > > I am not sure if this is the right mailing list to place this question. > If it doesn't, please kindly redirect me to the right list. > > I have a giant table with about 10,6 million records. > > Queries on it are usually slow, and if I try to do something more elaborate > like an INNER JOIN with itself it becomes unnacceptably slow. > > I am looking for a way to improve performance. > One of the columns is of type date. Each "day" includes about a few tens of > thousands records > And the older a date is the less likely I am to do queries on it. > > The objective of the "self join" is to compare data from two different days, > looking for diferences. > > Ive read that one of the benefits of partitioning is to speed up queries by > separating less used records. > > My question is if partitioning can be a good way to make the queries faster > (specially the self joins) or if it isn't worth trying because it doesn't > help on my particular situation. > > Please kindly advice me on this > > Many thanks in advance for your kind help > > Best, > Oliveiros > I'm sure partitioning is a good choice to improve your performance. But, as Scott noticed, be sure first to have done all possible improvements on your table such as indexes, of course (only if they don't damage your insert performances). I use partitioning with check constraints on a timestamp field and it works well without any optimization of the conf files of postgresql. It's ok for massive inserts (csv records to copy up to 60000 times per second) without indexes (they cost a lot for insert statements), as to select many datas on the same tables. Best regards, Sylvain Caillet IT Performance Software Project Manager