Thread: slow query performance
I'm having severe performance issues with a conceptually simple database. The database has one table, containing weather observations. The table currently has about 13.5 million rows, and is being updated constantly. (The database is running on a dual 550MHz PIII with 512MB RAM. I have PostgreSQL 7.1.3 on i686-pc-linux-gnu, compiled by GCC 2.96 on RedHat 7.2) On the whole, queries are of the form: SELECT ? FROM obs WHERE station = ? AND valid_time < ? AND valid_time > ? or: SELECT ? FROM obs WHERE station IN (?, ?, ...) AND valid_time < ? AND valid_time > ? Queries like these are taking around 4 to 5 minutes each, which seems excessively slow to me (or are my expectations far too optimistic?). For instance: SELECT station, air_temp FROM obs WHERE station = 'EGBB' AND valid_time > '28/8/03 00:00' AND valid_time < '28/10/03 00:00' takes 4 mins 32 secs. An EXPLAIN of the above query says: NOTICE: QUERY PLAN: Index Scan using obs_pkey on obs (cost=0.00..9.01 rows=1 width=20) A simple "SELECT count(*) from obs" query takes around that sort of time too. I have run "vacuumdb --analyze obs", to little effect. How can I speed this up? Where am I going wrong? Is there a problem with the table structure, or the indexes? Does the continual updating of the database (at the rate of somewhere between 1-3 entries per second) cause problems? The table and indexes are defined as follows: Table "obs" Attribute | Type | Modifier ----------------------------+--------------------------+---------- valid_time | timestamp with time zone | metar_air_temp | double precision | relative_humidity | double precision | pressure_change | double precision | ceiling | double precision | metar_dew_point | double precision | metar_gusts | double precision | wet_bulb_temperature | double precision | past_weather | text | visibility | double precision | metar_visibility | double precision | precipitation | double precision | station | character(10) | pressure_msl | double precision | metar_min_temperature_6hr | double precision | precipitation_period | double precision | metar_wet_bulb | double precision | saturation_mixing_ratio | double precision | metar_pressure | double precision | metar_sky_cover | text | dew_point | double precision | wind_direction | double precision | actual_time | timestamp with time zone | gust_speed | double precision | high_cloud_type | text | precipitation_24hr | double precision | metar_precipitation_24hr | double precision | pressure_tendency | text | metar_relative_humidity | double precision | low_cloud_type | text | metar_max_temperature_6hr | double precision | middle_cloud_type | text | air_temp | double precision | low_and_middle_cloud_cover | text | metar_wind_dir | double precision | metar_weather | text | snow_depth | double precision | metar_snow_depth | double precision | min_temp_12hr | double precision | present_weather | text | wind_speed | double precision | snow_cover | text | metar_wind_speed | double precision | metar_ceiling | double precision | max_temp_12hr | double precision | mixing_ratio | double precision | pressure_change_3hr | double precision | total_cloud | integer | max_temp_24hr | double precision | min_temp_24hr | double precision | snow_amount_6hr | double precision | Indices: obs_pkey, obs_station, obs_valid_time Index "obs_pkey" Attribute | Type ------------+-------------------------- valid_time | timestamp with time zone station | character(10) unique btree Index "obs_station" Attribute | Type -----------+--------------- station | character(10) btree Index "obs_valid_time" Attribute | Type ------------+-------------------------- valid_time | timestamp with time zone btree (I suspect the obs_valid_time index is redundant, because of the obs_pkey index - is that right?) I'd be grateful for any advice and any clues to help speed this up. Many thanks, Dave
Dave Weaver wrote: too optimistic?). > > For instance: > SELECT station, air_temp FROM obs > WHERE station = 'EGBB' > AND valid_time > '28/8/03 00:00' > AND valid_time < '28/10/03 00:00' > > takes 4 mins 32 secs. > > An EXPLAIN of the above query says: > NOTICE: QUERY PLAN: > > Index Scan using obs_pkey on obs (cost=0.00..9.01 rows=1 width=20) > > A simple "SELECT count(*) from obs" query takes around that sort of time > too. That is no test. Postgresql will always fetch the entire table. > > I have run "vacuumdb --analyze obs", to little effect. Check http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html for general tuning tips and do a vacuum full if table is updated frequently. Also reindex the indexes after vacuum. It will take quite some time though. See if that helps. Shridhar
Shridhar Daithankar wrote:
> Dave Weaver wrote:
> > A simple "SELECT count(*) from obs" query takes around that sort of time
> > too.
>
> That is no test. Postgresql will always fetch the entire table.
OK, I wasn't aware of that.
> > I have run "vacuumdb --analyze obs", to little effect.
>
> Check http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html for general
> tuning tips
Thanks for the link - I will read and digest.
> and do a vacuum full if table is updated frequently. Also reindex
> the indexes after vacuum. It will take quite some time though.
The version of Postgres that I'm running (7.1.3) doesn't seem to have a "vacuum full".
I have reindexed after a vacuum, with liitle-to-no effect.
Dave.
Dave Weaver wrote: > The version of Postgres that I'm running (7.1.3) doesn't seem to have a > "vacuum full". > > I have reindexed after a vacuum, with liitle-to-no effect. If possible you should upgrade. 7.1.3 is rather old. But that shouldn't mke that much difference ( I guess. Wasn't a postgresql user back then..:-) Shridhar
> > For instance: > > SELECT station, air_temp FROM obs > > WHERE station = 'EGBB' > > AND valid_time > '28/8/03 00:00' > > AND valid_time < '28/10/03 00:00' > > > > takes 4 mins 32 secs. How many rows should that return? [explain analyze will tell you that] and while that runs is your disk thrashing? vmstat's bi/bo columns will tell you. 7.1 is quite old, but I do understand the pain of migraing to 7.3 [or .4beta] with huge db's -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/
Jeff wrote: > Dave Weaver wrote: > > For instance: > > SELECT station, air_temp FROM obs > > WHERE station = 'EGBB' > > AND valid_time > '28/8/03 00:00' > > AND valid_time < '28/10/03 00:00' > > > > takes 4 mins 32 secs. > > How many rows should that return? > [explain analyze will tell you that] "explain analyze" doesn't seem to be part of this postgres version (or I misunderstood something). That particular query returned 24 rows. > and while that runs is your disk thrashing? vmstat's bi/bo columns will > tell you. The machine's over the other side of the building, so I can't physically see if the disk is thrashing. I'm not sure how to interpret the vmstat output; running "vmstat 1" shows me bi/bo both at zero (mostly) until I start the query. Then bi shoots up to around 2500 (bo remains around zero) until the query finishes. > 7.1 is quite old, but I do understand the pain of migraing to 7.3 [or > .4beta] with huge db's Is the upgrade likely to make a difference? I'm still none-the-wiser wether the problem I have is due to: 1 Postgres version 2 Database size 3 Table structure 4 Configuration issues 5 Slow hardware 6 All of the above 7 None of the above 8 Something else Thanks for the help, Dave.
On 30/10/2003 10:41 Shridhar Daithankar wrote: > Dave Weaver wrote: >> The version of Postgres that I'm running (7.1.3) doesn't seem to have a >> "vacuum full". >> >> I have reindexed after a vacuum, with liitle-to-no effect. > > If possible you should upgrade. 7.1.3 is rather old. But that shouldn't > mke that much difference ( I guess. Wasn't a postgresql user back > then..:-) > > Shridhar IIRC, vacuum on older versions of PG locked tables and was effectively what vacuum full is now. Looking at HISTORY, non-locking vacuum and vacuum full were introduced in 7.2. -- Paul Thomas +------------------------------+---------------------------------------------+ | Thomas Micro Systems Limited | Software Solutions for the Smaller Business | | Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk | +------------------------------+---------------------------------------------+
On Thu, 30 Oct 2003 13:49:46 -0000 "Dave Weaver" <davew@wsieurope.com> wrote: > Jeff wrote: > > Dave Weaver wrote: > > > For instance: > > > SELECT station, air_temp FROM obs > > > WHERE station = 'EGBB' > > > AND valid_time > '28/8/03 00:00' > > > AND valid_time < '28/10/03 00:00' > > > > > > takes 4 mins 32 secs. > > > > How many rows should that return? > > [explain analyze will tell you that] > > "explain analyze" doesn't seem to be part of this postgres version > (or I misunderstood something). > That particular query returned 24 rows. > You run explain analyze [insert query here] Post that output > > The machine's over the other side of the building, so I can't > physically see if the disk is thrashing. > I'm not sure how to interpret the vmstat output; running "vmstat 1" > shows me bi/bo both at zero (mostly) until I start the query. Then bi > shoots up to around 2500 (bo remains around zero) until the query > finishes. > The BI column means it is reading 2500 blocks / second. This is typically kB/sec (Linux defaults to 1kB block size on filesystems) That seems pretty low.. even for an older disk. We'll need the explain analyze output to help further. -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/
Jeff Wrote: > "Dave Weaver" <davew@wsieurope.com> wrote: > > > > "explain analyze" doesn't seem to be part of this postgres version (or > > I misunderstood something). > > That particular query returned 24 rows. > > > You run explain analyze [insert query here] > > Post that output The output is probably not what you were expecting! :-) obs=> explain analyze select * from obs where station = 'EGBB' obs-> and valid_time > '28/8/03 00:00' and valid_time < '28/10/03 00:00'; ERROR: parser: parse error at or near "analyze" obs=> \h explain Command: EXPLAIN Description: Shows statement execution plan Syntax: EXPLAIN [ VERBOSE ] query As I said, "explain analyze" doesn't seem to be part of this postgres version. Dave.
On Thu, 30 Oct 2003, Dave Weaver wrote: > Jeff wrote: > > Dave Weaver wrote: > > > For instance: > > > SELECT station, air_temp FROM obs > > > WHERE station = 'EGBB' > > > AND valid_time > '28/8/03 00:00' > > > AND valid_time < '28/10/03 00:00' > > > > > > takes 4 mins 32 secs. > > > > How many rows should that return? > > [explain analyze will tell you that] > > "explain analyze" doesn't seem to be part of this postgres version > (or I misunderstood something). > That particular query returned 24 rows. Back then it was just explain. explain analyze actually runs the query and tells you how long each thing too etc... i.e. it gives you the "I imagine I'll get this many rows back and it'll cost this much" part, then the cold hard facts of how many rows really came back, and how long it really too. Quite a nice improvement. > > and while that runs is your disk thrashing? vmstat's bi/bo columns will > > tell you. > > The machine's over the other side of the building, so I can't physically > see if the disk is thrashing. > I'm not sure how to interpret the vmstat output; running "vmstat 1" shows > me bi/bo both at zero (mostly) until I start the query. Then bi shoots up > to around 2500 (bo remains around zero) until the query finishes. Your disk is likely trashing. Can you set sort_mem on that old version of pgsql to something higher? set sort_mem = 32768; or something similar? > > 7.1 is quite old, but I do understand the pain of migraing to 7.3 [or > > .4beta] with huge db's > > Is the upgrade likely to make a difference? > I'm still none-the-wiser wether the problem I have is due to: > 1 Postgres version > 2 Database size > 3 Table structure > 4 Configuration issues > 5 Slow hardware > 6 All of the above > 7 None of the above > 8 Something else Yes, the upgrade is very likely to make a difference. The average performance gain for each version since then has been, in my experience, anywhere from a few percentage points faster to many times faster, depending on what you were trying to do. Why not download 7.4beta5 and see if you can get it to import the data from 7.1.3? It's close to going production, and in my opinion, 7.4beta5 is probably at least as stable as 7.1.3 considering the number of unfixed bugs likely to be hiding out there. My guess is that you'll find your workstation running 74beta5 with one IDE hard drive outrunning your server with 7.1.3 on it. Seriously. We're running 7.2.4 where I work, and the change from 7.1 to 7.2 was huge for us, especially the non-full vacuums.
On Thu, 30 Oct 2003, Dave Weaver wrote: > Jeff wrote: > > Dave Weaver wrote: > > > For instance: > > > SELECT station, air_temp FROM obs > > > WHERE station = 'EGBB' > > > AND valid_time > '28/8/03 00:00' > > > AND valid_time < '28/10/03 00:00' > > > > > > takes 4 mins 32 secs. > > > > How many rows should that return? > > [explain analyze will tell you that] > > "explain analyze" doesn't seem to be part of this postgres version > (or I misunderstood something). > That particular query returned 24 rows. > > > > and while that runs is your disk thrashing? vmstat's bi/bo columns will > > tell you. > > The machine's over the other side of the building, so I can't physically > see if the disk is thrashing. > I'm not sure how to interpret the vmstat output; running "vmstat 1" shows > me bi/bo both at zero (mostly) until I start the query. Then bi shoots up > to around 2500 (bo remains around zero) until the query finishes. > > > > 7.1 is quite old, but I do understand the pain of migraing to 7.3 [or > > .4beta] with huge db's > > Is the upgrade likely to make a difference? Well, it's likely to get you better help. Explain Analyze (added in 7.2 IIRC) gets us information on the real time spent in operations as well as the real number of rows. But, back to the question, what is the definition of the index it's using? If you don't have already have an index on (station,valid_time does making one help?
"Dave Weaver" <davew@wsieurope.com> writes: > Is the upgrade likely to make a difference? I'm not sure if it would help for this specific query, but in general each major PG release has useful performance improvements over the previous one. What I'm wondering about is an index-bloat problem (see the pgsql-performance archives for discussions). Do you do a lot of updates or deletes on this table, or is it just inserts? What is the physical size of the table and its index? The output of VACUUM VERBOSE for this table would be useful to show. regards, tom lane
Tom Lane wrote: > Do you do a lot of updates or deletes on this table, or is it just > inserts? Inserts and updates. No deletes. > What is the physical size of the table and its index? How do I find out this information? > The output of VACUUM VERBOSE for this table would be useful to show. obs=> vacuum verbose obs; NOTICE: --Relation obs-- NOTICE: Pages 276896: Changed 2776, reaped 67000, Empty 0, New 0;s Tup 13739326: Vac 78031, Keep/VTL 3141/3141, Crash 0, UnUsed 303993, MinLen 72, MaxLen 476; Re-using: Free/Avail. Space 16174372/14995020; EndEmpty/Avail. Pages 0/18004. CPU 26.11s/3.78u sec. NOTICE: Index obs_pkey: Pages 114870; Tuples 13739326: Deleted 37445. CPU 12.33s/39.86u sec. NOTICE: Index obs_valid_time: Pages 45713; Tuples 13739326: Deleted 37445. CPU 4.38s/37.65u sec. NOTICE: InvalidateSharedInvalid: cache state reset NOTICE: Index obs_station: Pages 53170; Tuples 13739326: Deleted 37445. CPU 6.46s/56.63u sec. NOTICE: Rel obs: Pages: 276896 --> 275200; Tuple(s) moved: 30899. CPU 33.94s/51.05u sec. NOTICE: Index obs_pkey: Pages 114962; Tuples 13739326: Deleted 30881. CPU 13.24s/19.80u sec. NOTICE: Index obs_valid_time: Pages 45819; Tuples 13739326: Deleted 30881. CPU 4.51s/17.42u sec. NOTICE: Index obs_station: Pages 53238; Tuples 13739326: Deleted 30881. CPU 5.78s/18.33u sec. NOTICE: --Relation pg_toast_503832058-- NOTICE: Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec. NOTICE: Index pg_toast_503832058_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec. VACUUM obs=> Cheers, Dave.
Stephan Szabo wrote: > But, back to the question, what is the definition of the index it's using? > If you don't have already have an index on (station,valid_time does > making one help? From my original post: Index "obs_pkey" Attribute | Type ------------+-------------------------- valid_time | timestamp with time zone station | character(10) unique btree Index "obs_station" Attribute | Type -----------+--------------- station | character(10) btree Index "obs_valid_time" Attribute | Type ------------+-------------------------- valid_time | timestamp with time zone btree (I suspect the obs_valid_time index is redundant, because of the obs_pkey index - is that right?) Cheers, Dave.
Dave Weaver wrote: > Tom Lane wrote: > >>Do you do a lot of updates or deletes on this table, or is it just >>inserts? > > > Inserts and updates. No deletes. Updates are insert/deletes under postgresql as it does not updates rows in place. >>What is the physical size of the table and its index? > How do I find out this information? cd $PGDATA;du -h This will give you size of each directory. Using utility oid2name in contrib module in sources, you can find out what object is stored in which file. Same is true for data files under it as well. > obs=> vacuum verbose obs; > NOTICE: --Relation obs-- > NOTICE: Pages 276896: Changed 2776, reaped 67000, Empty 0, New 0;s > Tup 13739326: Vac 78031, Keep/VTL 3141/3141, Crash 0, UnUsed 303993, > MinLen 72, MaxLen 476; Re-using: Free/Avail. Space 16174372/14995020; > EndEmpty/Avail. Pages 0/18004. CPU 26.11s/3.78u sec. > NOTICE: Index obs_pkey: Pages 114870; Tuples 13739326: Deleted 37445. CPU > 12.33s/39.86u sec. > NOTICE: Index obs_valid_time: Pages 45713; Tuples 13739326: Deleted 37445. > CPU 4.38s/37.65u sec. > NOTICE: InvalidateSharedInvalid: cache state reset > NOTICE: Index obs_station: Pages 53170; Tuples 13739326: Deleted 37445. CPU > 6.46s/56.63u sec. > NOTICE: Rel obs: Pages: 276896 --> 275200; Tuple(s) moved: 30899. CPU > 33.94s/51.05u sec. > NOTICE: Index obs_pkey: Pages 114962; Tuples 13739326: Deleted 30881. CPU > 13.24s/19.80u sec. > NOTICE: Index obs_valid_time: Pages 45819; Tuples 13739326: Deleted 30881. > CPU 4.51s/17.42u sec. > NOTICE: Index obs_station: Pages 53238; Tuples 13739326: Deleted 30881. CPU > 5.78s/18.33u sec. > NOTICE: --Relation pg_toast_503832058-- > NOTICE: Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, > Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. > Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec. > NOTICE: Index pg_toast_503832058_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u > sec. > VACUUM You reindexed? Seems like this is after you have already run a vacuum. So not much change is shown here. HTH Shridhar
Dave, Apologies if this has been suggested before, but maybe : - interchanging the key order for the "obs_pkey" index and - clustering the "obs" table on "station" might make these queries go a bit better? Alternatively if maintaining a cluster on station is infeasable, you could consider a collection of partial indexes on valid_time for each station: create index obs_valid_time _stat1 on obs(valid_time) where station = 'station 1'; (etc for each station)... regards Mark Dave Weaver wrote: >On the whole, queries are of the form: > > SELECT ? FROM obs WHERE station = ? > AND valid_time < ? AND valid_time > ? >or: > SELECT ? FROM obs WHERE station IN (?, ?, ...) > AND valid_time < ? AND valid_time > ? > >An EXPLAIN of the above query says: > NOTICE: QUERY PLAN: > > Index Scan using obs_pkey on obs (cost=0.00..9.01 rows=1 width=20) > > Index "obs_pkey" > Attribute | Type >------------+-------------------------- > valid_time | timestamp with time zone > station | character(10) >unique btree > > >
Mark wrote: > - interchanging the key order for the "obs_pkey" index and I will try this. > - clustering the "obs" table on "station" Sorry, I don't understand what you mean by this - can you explain? > create index obs_valid_time _stat1 on obs(valid_time) where station = 'station 1'; > Interesting. I didn't know you could do that (I have a lot to learn!). There are about 13500 stations, so that's a lot of indexes. Hang on - just tried it and I get a 'parse error at or near "where"', so I suspect that's yet another reason to upgrade... I think the one message that's come across loud and clear from this thread is "Upgrade!", so I guess that's what I'll be doing! Many thanks to all who have contributed to this thread so far. I really appreciate al hte help. Cheers, Dave.
"Dave Weaver" <davew@wsieurope.com> writes: >> The output of VACUUM VERBOSE for this table would be useful to show. > NOTICE: Index obs_pkey: Pages 114962; Tuples 13739326: Deleted 30881. CPU > 13.24s/19.80u sec. Lets see, 114962 pages at 8K apiece, divided by 13739326 entries, gives about 68 bytes per index entry. The entries themselves (contents a timestamp and a char(10)) take 8 + 4 + 10 bytes for data, plus 8 bytes for the index tuple header, plus 2 bytes wasted for alignment (assuming this is Intel hardware), plus a 4-byte line pointer; 36 bytes total. So you have an index loading factor of about 52%, which is noticeably less than the theoretical optimum of 70%, though not really bad yet. I think you are seeing some index bloat --- especially if you recently reindexed, meaning that the index hasn't had very long to grow. You could try keeping an eye on the size of obs_pkey over time and see if it grows faster than the table itself. If you are going to upgrade I'd counsel going to 7.4, which should solve or at least greatly reduce the problem of index bloat. See the pgsql-performance archives for more discussion. regards, tom lane
On 29 Oct 2003 09:40:43 GMT, I wrote: > > I'm having severe performance issues with a conceptually simple > database. Oops - forgot to mention, I'm running: PostgreSQL 7.1.3 on i686-pc-linux-gnu, compiled by GCC 2.96 on RedHat 7.2 -- Dave email: zen13097 (AT) zen [DOT] co {DOT} uk
Dave Weaver wrote: >>- clustering the "obs" table on "station" >> >> > >Sorry, I don't understand what you mean by this - can you explain? > > > Supposing obs_pkey is on (station, valid_time): cluster obs_pkey on obs will re-order the rows in obs based on the index obs_pkey. (This is clustering on 'station' and 'valid_time', to do just station you could use an index on just 'station'). The down side is that the row ordering is slowly lost as rows are updated, so periodic running of the cluster command is needed - this is a pain as it will take a while for 13 million row table. regards Mark
On Sat, Nov 01, 2003 at 10:18:14AM +1300, Mark Kirkwood wrote: > > Dave Weaver wrote: > > >>- clustering the "obs" table on "station" > > > >Sorry, I don't understand what you mean by this - can you explain? > Supposing obs_pkey is on (station, valid_time): > > cluster obs_pkey on obs Be aware that doing this on the 7.1.3 version you are running will drop the other indexes on the table, and some other metadata about it (grants, inheritance, foreign key relationships IIRC). -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "Coge la flor que hoy nace alegre, ufana. ¿Quién sabe si nacera otra mañana?"