Thread: Index creation takes more time?
We have tables which we archive and shorten every day. That is, the main table that has daily inserts and updates is kept small, and there is a parallel table with all the old data up to a year ago.

In the past we noticed that the bulk transfer from the main table to the archive table takes a very long time, so we decided to do this in three steps: (1) drop the indexes on the archive table, (2) insert a week's worth of data into the archive table, (3) recreate the indexes. This proved to take much less time than having each row update the indexes.

However, this week we finally upgraded from PG 8.3 to 9.1, and suddenly the archiving process takes a lot more time than it used to - 14:30 hours for the most important table, to be exact, spent only on index creation. The same work running on the same data in 8.3 on a much weaker PC took merely 4:30 hours.

There are 8 indexes on the archive table. The size of the main table is currently (after archive) 7,805,009 records. The size of the archive table is currently 177,328,412 records.

Has there been a major change in index creation that would cause 9.1 to do it this much slower? Should I go back to simply copying over the data, or is the whole concept breaking down?

TIA,
Herouth
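[For reference, the three-step pattern described above looks roughly like this in SQL. This is a sketch only: the main table name sms.billing is a guess inferred from the archive table name given later in the thread, and the one-week cutoff condition is illustrative.]

-- Step 1: drop the indexes on the archive table (one DROP per index; 8 in total here)
DROP INDEX sms.billinga_msisdn_sme_reference;

-- Step 2: bulk-insert a week's worth of rows into the archive table
-- (the main table is trimmed correspondingly; not shown)
INSERT INTO sms.billing__archive
SELECT * FROM sms.billing
WHERE time_arrived < now() - interval '7 days';

-- Step 3: recreate the indexes (one CREATE per index)
CREATE INDEX billinga_msisdn_sme_reference
    ON sms.billing__archive (msisdn, sme_reference);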
Hi,

Dne 09.09.2012 11:25, Herouth Maoz napsal:
> We have tables which we archive and shorten every day. That is - the
> main table that has daily inserts and updates is kept small, and there
> is a parallel table with all the old data up to a year ago.
>
> In the past we noticed that the bulk transfer from the main table to
> the archive table takes a very long time, so we decided to do this in
> three steps: (1) drop indexes on the archive table, (2) insert a
> week's worth of data into the archive table. (3) recreate the indexes.
> This proved to take much less time than having each row update the
> index.
>
> However, this week we finally upgraded from PG 8.3 to 9.1, and
> suddenly, the archiving process takes a lot more time than it used to
> - 14:30 hours for the most important table, to be exact, spent only on
> index creation.
>
> The same work running on the same data in 8.3 on a much weaker PC
> took merely 4:30 hours.
>
> There are 8 indexes on the archive table.
>
> The size of the main table is currently (after archive) 7,805,009
> records.
> The size of the archive table is currently 177,328,412 records.

What amount of data are we talking about? What data types? Show us the
table structure and the definition of the indexes. What hardware is this
running on? Try to collect some system stats (vmstat, iostat and sar are
your friends) when the indexes are being rebuilt.

> Has there been a major change in index creation that would cause 9.1
> to do it this much slower? Should I go back to simply copying over the
> data or is the whole concept breaking down?

Not really, it's usually expected to be either faster or the same as
with previous releases.

Now, when upgrading to 9.1, did you use the same configuration, or are
there differences (I mean in postgresql.conf)? I'm interested in the
maintenance_work_mem value in particular.

It might also be useful to look at partitioning in this case, i.e.
splitting the large table into smaller child tables (e.g. one per week)
and loading the data into these. But it depends on how you use the old
data - with some queries this may cause issues (worse performance).

Tomas
Herouth,

I don't know if you saw Tomas Vondra's follow-up, as it was only to the
list and not CC'd to you. Here's the archive link:

http://archives.postgresql.org/message-id/e87a2f7a91ce1fca7143bcadc4553a0b@fuzzy.cz

The short version: "More information required".
Yes, thank you, I did notice it, but I decided to wait a week until the next scheduled archive, to see if the problem persists, especially since the previous run took place with relatively low disk space because we had kept the old database files around. We have removed them during the week.
Unfortunately, the problem persists.
So here is the information I could glean.
First, the maintenance_work_mem setting has not been changed between the old and the new PostgreSQL. In fact, it is commented out, so I assume it's at the default of 16MB in both installations.
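[For reference, the effective value can be confirmed from psql on both machines. The session-level override shown here is a generic tuning step with an arbitrary 512MB value, not something mentioned in the thread.]

SHOW maintenance_work_mem;            -- prints the value currently in effect
SET maintenance_work_mem = '512MB';   -- session-only override before an index build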
The server that runs 9.1 is generally better and faster than the PC that runs 8.3 (it does every other operation - inserts, updates, selects - much faster than the PC). More specifically:
Server running 9.1:
3373252k of memory
Two hard disks, separate for system and database. The database disk is 15000RPM, 825G.
CPU: Xeon, 2.0GHz, 4 cores (or two CPUs with 2 cores, I'm not sure)
PC running 8.3:
3073344k of memory
One SATA hard disk (used for both system and database), 7200RPM, 915G.
CPU: Pentium dual-core 2.80GHz
On both machines, PostgreSQL is set up with shared_buffers of 1800M.
Now, the table itself:
Column | Type | Modifiers
-----------------------------+-----------------------------+-----------
service | smallint |
billing_priority | smallint |
account_number | integer |
msisdn | character varying(16) |
sme_reference | integer |
smsc_reference | numeric(21,0) |
gateway_id | smallint |
user_reference | numeric(21,0) |
user_time | timestamp without time zone |
time_arrived | timestamp without time zone |
time_submitted | timestamp without time zone |
time_final_state | timestamp without time zone |
status | integer |
time_notified | timestamp without time zone |
user_id | character varying(45) |
price | double precision |
sms_engine_id | character varying(15) |
smsc_session_id | character varying(64) |
external_billing_reference | character varying(128) |
multipart_reference | numeric(21,0) |
multipart_nr_segments | integer |
multipart_segment_nr | integer |
requested_target_network_id | character(1) |
actual_target_network_id | character(1) |
sm_type | character(2) |
There are no triggers, no foreign keys etc.
The index definitions:
CREATE INDEX billinga_user_id ON sms.billing__archive (user_id);
CREATE INDEX billinga_status ON sms.billing__archive (status);
CREATE INDEX billinga_time_arrived ON sms.billing__archive (time_arrived);
CREATE INDEX billinga_msisdn_sme_reference ON sms.billing__archive (msisdn, sme_reference);
CREATE INDEX billinga_account ON sms.billing__archive (account_number);
CREATE INDEX billinga_user_ref ON sms.billing__archive (user_reference);
CREATE INDEX billinga_smsc_ref ON sms.billing__archive (smsc_reference);
CREATE INDEX billinga_time_submitted ON sms.billing__archive (time_submitted);
Statistics collection:
For the sake of experimentation, I dropped and created the billinga_msisdn_sme_reference in both machines, timed it, and ran vmstat, iostat and sar in the background at intervals of 1 minute.
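[The commands were presumably along these lines, with \timing providing the measurement - a sketch, not quoted from the original message:]

\timing on
DROP INDEX sms.billinga_msisdn_sme_reference;
CREATE INDEX billinga_msisdn_sme_reference
    ON sms.billing__archive (msisdn, sme_reference);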
On the PC, the creation of the index took 40 minutes 35 seconds.
The server (9.1) has not finished yet. I set up stats to run for an hour, and I'm sending this hour's worth of stats.
I'm attaching the stats files in tarballs. I'm not sure what I'm supposed to look at.
Thanks for your time,
Herouth
-----Original Message-----
From: Craig Ringer [mailto:ringerc@ringerc.id.au]
Sent: Monday, 17/09/2012 06:56
To: Herouth Maoz
Cc: pgsql-general@postgresql.org; tv@fuzzy.cz
Subject: Re: [GENERAL] Index creation takes more time?
On Mon, Sep 17, 2012 at 4:07 AM, Herouth Maoz <herouth@unicell.co.il> wrote:
> Yes, thank you, I did notice it, but I decided to wait a week to the next
> archive schedule, to see if the problem persists, especially since the
> previous time ran with relatively low disk space because we kept the old
> database files around. We have removed them during the week.
>
> Unfortunately, the problem persists.

You changed hardware, pgsql versions, and kernel versions all at the same
time, so that could make it hard to pin down the cause of the change.
Especially if you don't have a test environment you can use for doing
experiments. If not, hopefully you can get away with doing some tests on
the production systems.

> First, the variable "maintenance_work_mem" has not been changed between
> the old and new postgresql. In fact, it is commented out, so I assume
> it's the default 16MB for both installations.
>
> Server running 9.1:
> 3373252k of memory
> Two hard disks, separate for system and database. The database disk is
> 15000RPM, 825G.
> CPU: Xeon, 2.0GHz, 4 cores (or two CPUs with 2 cores, I'm not sure)
>
> PC running 8.3:
> 3073344k of memory
> One SATA hard disk (used for both system and database), 7200RPM, 915G.
> CPU: Pentium dual-core 2.80GHz

What are the minor release numbers? Is it 8.3.9 and 9.1.5?

I'd run a very simple benchmark test on both machines, from psql:

\timing on
set work_mem = 16384;
select count(distinct foo) from (select random() as foo from generate_series(1,100000000)) asdf;

This is what I get, running both on the same hardware:

8.3.9  Time: 569041.153 ms
9.1.4  Time: 374607.288 ms

So the newer version is faster for me. If 9.1.4 is slower for you, then
the next thing I would do is install 8.3.9 on your new hardware where the
9.1.x version is running, on a different port, and see how it does there
on the same query.

> In both machines postgreSQL is set up with shared_buffers of 1800M.

Can you give us the full configuration?

http://wiki.postgresql.org/wiki/Server_Configuration

I wonder if they have different encoding/collations.

...

> CREATE INDEX billinga_msisdn_sme_reference ON
> sms.billing__archive(msisdn,sme_reference) ;

...

> For the sake of experimentation, I dropped and created the
> billinga_msisdn_sme_reference in both machines, timed it, and ran vmstat,
> iostat and sar in the background at intervals of 1 minute.

A table with that many columns is unwieldy as a test case. To make a
simpler one, what if you do:

create table foo as select msisdn,sme_reference from sms.billing__archive limit 1000000;

And then index that table (on each system)? Do you still see the
discrepancy, or has it gone away? If it has gone away, do the same thing
again but without the limit, and see if it is still gone.

> On the PC, the creation of the index took 40 minutes 35 seconds.
>
> The server (9.1) has not finished yet. I set up stats to run for an hour,
> and I'm sending this hour's worth of stats.
>
> I'm attaching the stats files in tarballs. I'm not sure what I'm supposed
> to look at.

In vmstat, you can see that on the newer hardware the process is purely
CPU limited: user time ("us") is nearly 25% - that is, 1 out of 4 CPUs -
the entire time, and IO wait time ("wa") is near zero.
For the older hardware, it is partially CPU limited ("us" is not near 50,
i.e. 1 out of 2 CPUs, but also not near zero) and partially IO limited.
The sar report shows about the same thing, just in a different way.

Cheers,

Jeff
I think you hit the nail right on the head when you asked:
> I wonder if they have different encoding/collations.
[headdesk] Of course. One of the requirements of the upgrade was to change the database encoding to Unicode, because previously it was in an 8-bit encoding and we couldn't handle international text, which has become an absolute necessity. So when I restored the database, I took care to create it in Unicode first:
Old database (PC, 8.3.17):
Name | reports
Owner | reports
Encoding | ISO_8859_8
(Locale is C)
New database (server, 9.1.5):
Name | reports
Owner | reports
Encoding | UTF8
Collate | he_IL.UTF-8
Ctype | he_IL.UTF-8
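[For reference, the same information can be read from the catalog. This works on the 9.1 server, since the datcollate/datctype columns were only added in 8.4; the database name matches the listing above.]

SELECT datname,
       pg_encoding_to_char(encoding) AS encoding,
       datcollate,
       datctype
FROM pg_database
WHERE datname = 'reports';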
Apparently, string comparison is heavily CPU bound... Now, it seems the server is inferior to the PC in CPU-bound tasks. It's no wonder - the PC has a better and faster CPU, and each PostgreSQL backend runs on a single core, so the server's 4 cores are not an advantage. So, running the test you asked for:
> \timing on
> set work_mem = 16384;
> select count(distinct foo) from (select random() as foo from
> generate_series(1,100000000)) asdf;
I get
PC: Time: 554994.343 ms
Server: Time: 660577.789 ms
That is slightly in favor of the PC, but it still doesn't show as much of a discrepancy as index creation does.
I must point out that the actual problem is not in comparison to this PC's hardware. The database originally ran on the server, and created the same indexes happily within reasonable time until the upgrade. The upgrade process involved shutting down PostgreSQL, moving all the database files and configuration over to the PC, and starting it there (running the PC under the old server's IP, so that all the clients work with the PC now as a production machine). Then we took the server, upgraded the system and PostgreSQL on it, created a dump from the PC, and restored it on the Server.
So the situation is that performance is 4 times worse on the same hardware, which chugged along happily when it still ran the old operating system and the old PostgreSQL. And the PC is still chugging away happily during the archive, albeit a bit more slowly (for I/O reasons - it is inferior to the server there). Anything disk-bound is done better on the server, while the PC has a slight CPU advantage.
So, I must, at this point, draw the conclusion that string comparison is a much, much heavier task in utf-8 than it is in an 8-bit encoding - or that the collation is the problem.
Running a different test, which involves string comparison, shows a bigger discrepancy:
select count(foo), foo
from (select cast(random() as varchar(14)) > cast(random() as varchar(14)) as foo
      from generate_series(1,100000000)) asdf
group by foo;
PC: Time: 308152.090 ms
Server: Time: 499631.553 ms
Finally, I created a test table, as you asked:
> create table foo as select msisdn,sme_reference from
> sms.billing__archive limit 1000000;
Then I created an index on the msisdn and sme_reference columns together. 99% of the data in the msisdn field consists of 11-digit phone numbers. Result:
PC: 5792.641 ms
Server: 23740.470 ms
Huge discrepancy there.
Next, I dropped the index, and created an index on the sme_reference column alone (which is an integer). The result:
PC: 2457.315 ms
Server: 3722.920 ms
Still a slight advantage for the PC, but not of the same order of magnitude as when strings were involved.
OK, if you agree with my conclusions, what should I do about it? I absolutely need this database to be able to support Unicode.
Thanks a lot for the help so far!
Herouth
"Herouth Maoz" <herouth@unicell.co.il> writes: > So, I must, at this point, draw the conclusion that string comparison is a much, much heavier task in utf-8 than it isin an 8-bit encoding - or that the collation is the problem. Going from "C" collation to anything else is generally a huge hit in terms of string comparison/sorting performance. Do you really need locale-aware sorting, or just enforcement of utf8 encoding? Because you can use "C" locale with any encoding. regards, tom lane
On Tue, Sep 18, 2012 at 1:13 AM, Herouth Maoz <herouth@unicell.co.il> wrote:
> I think you hit the nail right on the head when you asked:
>
>> I wonder if they have different encoding/collations.
>
> [headdesk] Of course. One of the requirements of the upgrade was to change
> the database encoding to unicode, because previously it was in an 8-bit
> encoding and we couldn't handle international text, which has become an
> absolute necessity. So when I restored the database, I took care to create
> it in unicode first:

> So, I must, at this point, draw the conclusion that string comparison is a
> much, much heavier task in utf-8 than it is in an 8-bit encoding - or that
> the collation is the problem.
>
> Running a different test, which involves string comparison, shows a bigger
> discrepancy:
>
> select count( foo ), foo from ( select cast(random() as varchar(14)) >
> cast(random() as varchar(14)) as foo
> from generate_series (1,100000000)) asdf
> group by foo;
>
> PC: Time: 308152.090 ms
> Server: Time: 499631.553 ms

I think the one below will show an even larger discrepancy. You are doing
2 casts for each comparison, so I think the cast overhead will dilute out
the comparison.

select count(distinct foo) from ( select cast(random() as varchar(14)) as foo
from generate_series (1,100000000)) asdf;

> Finally, I created a test table, as you asked:
>
>> create table foo as select msisdn,sme_reference from
>> sms.billing__archive limit 1000000;
>
> Then I created an index on the msisdn and sme_reference columns together.
> 99% of the data in the msisdn field consists of 11-digit phone numbers.
> Result:
>
> PC: 5792.641 ms
> Server: 23740.470 ms
>
> Huge discrepancy there.

Try:

create index ON foo (msisdn COLLATE "C", sme_reference);

This can only be done on the 9.1 server, as that feature is new to that
release. It should be much faster to create than the index with the
default collation.

(Or change the collation in the msisdn column definition, rather than
just in the index.)

This assumes you just need the index for equality, not for some precise
locale-specific ordering (which for phone numbers seems like a safe bet).

Cheers,

Jeff
On 18/09/2012, at 20:19, Jeff Janes wrote:

> I think the one below will show an even larger discrepancy. You are
> doing 2 casts for each comparison, so I think the cast overhead will
> dilute out the comparison.
>
> select count(distinct foo) from ( select cast(random() as varchar(14)) as foo
> from generate_series (1,100000000)) asdf;

Actually, it doesn't. I suspect that it doesn't actually do string
comparison per se. I don't know how "distinct" is implemented in
PostgreSQL, but if it were me, I'd implement it with a hash table, which
means that you calculate the hash of the string rather than compare it.
Even if it is done with actual comparison, I don't think it's a
collation-based comparison, but rather a byte-by-byte comparison.

>> Finally, I created a test table, as you asked:
>>
>>> create table foo as select msisdn,sme_reference from
>>> sms.billing__archive limit 1000000;
>>
>> Then I created an index on the msisdn and sme_reference columns
>> together. 99% of the data in the msisdn field consists of 11-digit
>> phone numbers. Result:
>>
>> PC: 5792.641 ms
>> Server: 23740.470 ms
>>
>> Huge discrepancy there.
>
> try:
> create index ON foo (msisdn COLLATE "C", sme_reference) ;
>
> This can only be done on 9.1 server, as that feature is new to that
> release. It should be much faster to create than the index with
> default collation.

Yes, this certainly reduced the index creation time to within a
reasonable margin. OK, now we have to decide whether to move the entire
database to the 'C' collation (which would require, I suppose, a dump and
restore), with the option of changing the collation for specific columns
that actually need it, or to just solve the current problem by changing
the index creation commands where relevant.

Thank you very much for your help with this issue; your input has been
invaluable.

Herouth
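[Summing up the two options discussed, as a sketch: both require 9.1, where per-index and per-column collations were introduced. The varchar(16) type is copied from the table definition earlier in the thread.]

-- Option 1: per-index collation, as Jeff suggested
CREATE INDEX billinga_msisdn_sme_reference
    ON sms.billing__archive (msisdn COLLATE "C", sme_reference);

-- Option 2: per-column collation, so every index on the column inherits it
-- (existing indexes on the column are rebuilt by this command)
ALTER TABLE sms.billing__archive
    ALTER COLUMN msisdn TYPE varchar(16) COLLATE "C";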