Thread: 7.2.1 optimises very badly against 7.2
Samuel Liddicott http://www.ananova.com
Support Consultant
sam@ananova.com
On Fri, 28 Jun 2002, Sam Liddicott wrote:

> I have a 10GB database which serves up customised tv listings selections at
> www.ananova.com/tv (see "Personalise Listings" in the left column).
>
> We had it running very well under postgres 7.2 on a 4xPentium 700MHz with
> 8GB RAM.
>
> For a personalised selection from "start" to "end" of 7 channels [url1
> below], it takes a fraction of a second to do the query, and if I'm lucky
> enough to spot it on "top" it uses a low CPU percentage.
>
> Under 7.2.1 it takes 99% CPU for between 5-9 seconds.
> Rolling back to 7.2 is also very slow unless we vacuum analyse after rolling
> back, then it is very fast again.
> [We normally vacuum analyse every 24 hours.]
>
> We have max_connections=128, 4520 shared buffers, 91268 sort_mem, and we
> collect row-level and block-level stats.

I'm thinking that you need to run some queries with explain analyze to see
what the planner thinks it should do and how its estimates differ from
reality under both 7.2 and 7.2.1.

> I also have this anecdotal information:
> we installed 7.2.1 on a fresh box and then restored a pg_dump of the tv
> database and found it always very slow; stopping postgres and transferring
> the binary DB files and restarting was very fast, but degraded slowly over
> a few days.

I know nothing of the filesystem structure but that sounds odd, like there's
something introduced in 7.2.1 that's inherently slow. I assume there are also
data loads over those few days.

> So I imagine there is something about stats gathering and use changed with
> 7.2.1 (I hear it has a new optimiser).

Well, you'll be able to see if it's the stats that are causing this by doing
explains and comparing pg_stats for the 7.2 and the pg_restored 7.2.1.

You could also do the binary transfer from 7.2 to 7.2.1 again and check the
explains and pg_stats immediately afterwards and then again after it's slowed
down. This is probably most useful if there are _no_ data loads in the
meantime.

> The query we do is complex and really does need a good optimiser (why we
> don't use mysql), as it has to join programmes against broadcasts
> (restricted by time) to channels (restricted to interested channels). It
> has to be careful not to initially start with all broadcasts of an
> interested channel as well as not all broadcasts on the interested channels.
>
> [url1, 7 channels from midnight to about 6:00am the day after]
> http://www.ananova.com/tv_listings/tv_mainlisting.html?day=day1&start=Start&end=End&tvregion=n_14&i=1&p=0&h=&c=12.25.36.53.54.86.33&S=1

Interesting that your stuff normally completes so quickly. I worked on a
large TV listings site that had complex queries and ran on Oracle; its
queries mostly took ages to run. I didn't design the system, btw, although I
did have a hand in some lucky chap winning two or three short breaks one
week, when we tweaked a slow query and it turned out to be slightly more
complex than we thought.

--
Nigel J. Andrews
Director
Logictree Systems Limited, Computer Consultants
"Sam Liddicott" <sam.liddicott@ananova.com> writes: > Under 7.2.1 it takes 99% CPU for between 5-9 seconds. > Rolling back to 7.2 is also very slow unless we vacuum analyse after rolling > back, then it is very fast again. > [We normally vacuum analyse every 24 hours] AFAIK, the only change from 7.2 to 7.2.1 that would be likely to have anything to do with this was a rework of the code in ANALYZE that estimates the number of distinct values in a column. (See pghackers archives from around 18-Feb.) Although the revised version did better on some test cases, it sounds like it's doing worse for you. As Nigel commented, we can't do much without seeing EXPLAIN ANALYZE results. I'd also like to see the pg_stats entries for the table(s) used by the query, as produced by both 7.2 ANALYZE and 7.2.1 ANALYZE. regards, tom lane
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 28 June 2002 15:13
> To: Sam Liddicott
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 7.2.1 optimises very badly against 7.2
>
> As Nigel commented, we can't do much without seeing EXPLAIN ANALYZE results.
> I'd also like to see the pg_stats entries for the table(s) used by the
> query, as produced by both 7.2 ANALYZE and 7.2.1 ANALYZE.

Thanks for your comments (and everyone else's). I'll get these stats and
explain analyse output sent out on Monday and we can have a good look. Just
doing some swig-java work for my boss.

Sam
A bit late, but here is the explain analyse data for the problem where a
complex DB was able to produce very quick results with low load under
postgres 7.2 but takes many times longer under postgres 7.2.1.

I notice this part is missing from the 7.2.1 (slow) explain analyse:
"Index Scan using idx_broadcast_channelregionid on broadcast"

Here is the query:

SELECT distinct
  channelregion."id", channelregion."customtitle",
  channelregion."title" as channeltitle, channelregion."tag" as channeltag,
  channelregion."distributionid", channelregion."channelid",
  distribution."description",
  broadcast."id" as broadcastid, broadcast."groupid", broadcast."duration",
  broadcast."start" as stime, broadcast."stereo",
  broadcast."subtitles" as subtitle, broadcast."repeat",
  broadcast."blackandwhite" as bw, broadcast."premiere",
  broadcast."surround", broadcast."widescreen", broadcast."followon",
  episode."id" as episodeid, episode."title" as title,
  episode."seriestitle" as seriestitle, episode."categories",
  episode."episodename", episode."episodereference", episode."episodenumber",
  episode."episodecount", episode."detail0", episode."detail1",
  episode."detail2", episode."created" as filmyear
INTO TEMPORARY TABLE selection
FROM "channelregion" channelregion, "broadcast" broadcast,
     "distribution" distribution, "episode" episode
WHERE broadcast.channelregionid = channelregion.id
  AND channelregion."distributionid" = distribution."id"
  AND broadcast.episode = episode.id
  AND (((broadcast.start+broadcast.duration) > 1026120300)
       AND (broadcast.sourcekey <= 20020708)
       AND ((channelregion.id in (2,20,41,53,54,733,734,86,33))))
ORDER BY broadcast."start" ASC;

Here is explain analyse on a postgres 7.2.1 box:

Unique  (cost=99202.15..99607.55 rows=523 width=279) (actual time=7932.43..7936.36 rows=276 loops=1)
  ->  Sort  (cost=99202.15..99202.15 rows=5231 width=279) (actual time=7932.41..7932.73 rows=276 loops=1)
        ->  Nested Loop  (cost=78.02..98879.06 rows=5231 width=279) (actual time=2779.61..7926.74 rows=276 loops=1)
              ->  Hash Join  (cost=78.02..74013.87 rows=5231 width=119) (actual time=2778.98..7886.85 rows=276 loops=1)
                    ->  Seq Scan on broadcast  (cost=0.00..70871.32 rows=399885 width=35) (actual time=2768.80..7851.94 rows=8019 loops=1)
                    ->  Hash  (cost=78.00..78.00 rows=9 width=84) (actual time=9.56..9.56 rows=0 loops=1)
                          ->  Hash Join  (cost=1.09..78.00 rows=9 width=84) (actual time=1.73..9.53 rows=9 loops=1)
                                ->  Index Scan using channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey on channelregion  (cost=0.00..76.75 rows=9 width=60) (actual time=0.48..8.00 rows=9 loops=1)
                                ->  Hash  (cost=1.07..1.07 rows=7 width=24) (actual time=0.19..0.19 rows=0 loops=1)
                                      ->  Seq Scan on distribution  (cost=0.00..1.07 rows=7 width=24) (actual time=0.14..0.17 rows=7 loops=1)
              ->  Index Scan using episode_pkey on episode  (cost=0.00..4.74 rows=1 width=160) (actual time=0.09..0.11 rows=1 loops=276)
Total runtime: 8009.97 msec

Here is explain analyse on a 7.2 box:

Unique  (cost=13355.63..13416.75 rows=79 width=278) (actual time=525.79..529.63 rows=276 loops=1)
  ->  Sort  (cost=13355.63..13355.63 rows=789 width=278) (actual time=525.78..526.07 rows=276 loops=1)
        ->  Nested Loop  (cost=1.09..13317.68 rows=789 width=278) (actual time=5.32..520.46 rows=276 loops=1)
              ->  Nested Loop  (cost=1.09..9749.11 rows=789 width=119) (actual time=5.07..481.22 rows=276 loops=1)
                    ->  Hash Join  (cost=1.09..69.44 rows=9 width=84) (actual time=1.24..3.89 rows=9 loops=1)
                          ->  Index Scan using channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey on channelregion  (cost=0.00..68.20 rows=9 width=60) (actual time=0.08..2.17 rows=9 loops=1)
                          ->  Hash  (cost=1.07..1.07 rows=7 width=24) (actual time=0.10..0.10 rows=0 loops=1)
                                ->  Seq Scan on distribution  (cost=0.00..1.07 rows=7 width=24) (actual time=0.04..0.07 rows=7 loops=1)
                    ->  Index Scan using idx_broadcast_channelregionid on broadcast  (cost=0.00..1080.72 rows=88 width=35) (actual time=3.80..52.47 rows=31 loops=9)
              ->  Index Scan using episode_pkey on episode  (cost=0.00..4.51 rows=1 width=159) (actual time=0.09..0.11 rows=1 loops=276)
Total runtime: 598.78 msec
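[The difference that matters is the broadcast scan: the 7.2.1 seq scan node
expects 399885 broadcast rows to pass its filter, but only 8019 actually do,
which is what pushes it into hashing the whole table. A hedged way to look at
that estimate on its own, with the constants copied from the posted query, is:]

  EXPLAIN ANALYZE
  SELECT count(*)
  FROM broadcast
  WHERE (broadcast.start + broadcast.duration) > 1026120300
    AND broadcast.sourcekey <= 20020708;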
"Sam Liddicott" <sam.liddicott@ananova.com> writes: > O notice missing in the 7.2.1 (slow) explain analyse this part: > "Index Scan using idx_broadcast_channelregionid on broadcast" Indeed. What do 7.2 and 7.2.1 have in the pg_stats row for broadcast.channelregionid? What is the real distribution of that column? regards, tom lane
"Sam Liddicott" <sam.liddicott@ananova.com> writes: >> Indeed. What do 7.2 and 7.2.1 have in the pg_stats row for >> broadcast.channelregionid? What is the real distribution of that >> column? > 7.2 says: > [snip] Well, that's annoying: the new ANALYZE code is in fact more accurate than the old (it's estimating 429 distinct values, vs. formerly 4532, when the true figure is 636) but it's nonetheless making a worse final choice of plan. If you turn off enable_seqscan, what EXPLAIN results do you get from 7.2.1? How about if you leave enable_seqscan on, but reduce random_page_cost from the default 4.0 to perhaps 3.0? regards, tom lane
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: 08 July 2002 16:22 > To: Sam Liddicott > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] 7.2.1 optimises very badly against 7.2 > > > "Sam Liddicott" <sam.liddicott@ananova.com> writes: > > O notice missing in the 7.2.1 (slow) explain analyse this part: > > "Index Scan using idx_broadcast_channelregionid on broadcast" > > Indeed. What do 7.2 and 7.2.1 have in the pg_stats row for > broadcast.channelregionid? What is the real distribution of that > column? 7.2 says: tv=# select * from pg_stats where attname='channelregionid'; tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation -----------+-----------------+-----------+-----------+------------+--------- ----------------------+----------------------------------------------------- ----------------------------------+----------------------------------------- +------------- broadcast | channelregionid | 0 | 4 | 4532 | {54,81,2,22,1,4,645,76,53,23} | {0.0376667,0.0346667,0.0293333,0.029,0.0273333,0.024,0.0236667,0.019,0.01866 67,0.018} | {3,16,36,49,90,170,231,425,494,659,747} | -0.155299 (1 row) ................ channelregionid | count -----------------+------- 418 | 11698 588 | 8677 417 | 8331 138 | 7435 170 | 7336 219 | 6729 701 | 6585 184 | 6584 218 | 6537 109 | 6479 195 | 6367 734 | 6341 67 | 6235 33 | 5902 615 | 5900 707 | 5899 136 | 5896 227 | 5806 49 | 5754 414 | 5714 1 | 5710 122 | 5646 413 | 5629 48 | 5603 2 | 5593 415 | 5586 3 | 5581 34 | 5579 186 | 5565 13 | 5547 15 | 5546 11 | 5545 12 | 5545 18 | 5545 713 | 5545 9 | 5544 14 | 5544 4 | 5542 10 | 5542 17 | 5541 412 | 5541 16 | 5539 92 | 5493 39 | 5489 35 | 5393 612 | 5371 99 | 5346 678 | 5333 659 | 5304 45 | 5287 46 | 5287 85 | 5286 102 | 5269 705 | 5267 215 | 5252 190 | 5249 709 | 5247 47 | 5234 44 | 5221 36 | 5216 194 | 5210 38 | 5188 698 | 5187 661 | 5136 37 | 5134 40 | 5128 41 | 5114 663 | 5081 82 | 5068 42 | 5051 19 | 5036 81 | 5022 95 | 5019 141 | 4996 54 | 4984 52 | 4980 20 | 4979 25 | 4978 27 | 4978 24 | 4977 29 | 4977 31 | 4977 724 | 4977 22 | 4976 23 | 4976 26 | 4976 660 | 4976 30 | 4975 32 | 4975 420 | 4975 185 | 4966 43 | 4958 21 | 4933 149 | 4756 53 | 4692 480 | 4663 76 | 4652 91 | 4606 134 | 4577 89 | 4536 168 | 4507 700 | 4506 487 | 4499 200 | 4381 222 | 4379 617 | 4329 71 | 4250 613 | 4199 83 | 4198 128 | 4127 130 | 4076 188 | 4070 703 | 4060 197 | 4042 169 | 4025 706 | 4018 129 | 3972 66 | 3944 112 | 3851 704 | 3849 641 | 3809 232 | 3708 622 | 3696 133 | 3695 110 | 3649 221 | 3549 645 | 3484 183 | 3441 634 | 3404 738 | 3399 682 | 3376 123 | 3352 166 | 3346 90 | 3283 64 | 3258 84 | 3248 58 | 3237 631 | 3214 636 | 3180 635 | 3179 639 | 3176 640 | 3176 177 | 3144 710 | 3142 478 | 3124 203 | 3121 172 | 3066 733 | 3061 192 | 3051 214 | 3051 633 | 2962 632 | 2923 722 | 2909 171 | 2698 702 | 2695 107 | 2685 161 | 2658 485 | 2622 696 | 2598 638 | 2568 474 | 2559 275 | 2549 274 | 2546 451 | 2489 637 | 2486 619 | 2470 155 | 2406 433 | 2373 216 | 2334 431 | 2329 231 | 2279 241 | 2261 63 | 2253 605 | 2233 150 | 2227 114 | 2091 223 | 2048 606 | 2047 139 | 2036 73 | 2031 120 | 2020 668 | 2020 96 | 1984 68 | 1977 657 | 1976 365 | 1949 608 | 1940 368 | 1900 8 | 1888 187 | 1864 86 | 1830 70 | 1817 50 | 1813 175 | 1808 124 | 1806 69 | 1802 367 | 1801 119 | 1795 144 | 1791 178 | 1774 125 | 1753 174 | 1728 143 | 1724 74 | 1680 278 | 1666 422 | 1659 379 | 1644 369 | 1595 313 | 1594 535 | 1594 261 | 1553 154 | 1552 435 | 1523 
359 | 1505 308 | 1495 530 | 1494 534 | 1493 543 | 1490 542 | 1487 111 | 1481 461 | 1481 249 | 1476 620 | 1476 602 | 1475 614 | 1469 153 | 1463 228 | 1459 87 | 1457 536 | 1451 289 | 1434 601 | 1421 524 | 1418 525 | 1418 512 | 1383 513 | 1383 375 | 1373 560 | 1373 518 | 1372 245 | 1370 521 | 1369 495 | 1367 493 | 1366 494 | 1366 454 | 1364 561 | 1364 505 | 1363 56 | 1358 496 | 1358 544 | 1356 284 | 1353 77 | 1349 78 | 1348 79 | 1348 80 | 1348 545 | 1347 540 | 1342 541 | 1342 537 | 1337 358 | 1334 618 | 1310 556 | 1299 557 | 1299 349 | 1290 366 | 1282 712 | 1280 425 | 1279 528 | 1276 529 | 1276 572 | 1276 568 | 1272 508 | 1271 509 | 1271 514 | 1257 727 | 1257 515 | 1256 362 | 1255 396 | 1254 603 | 1254 342 | 1249 479 | 1248 486 | 1248 554 | 1247 564 | 1246 565 | 1246 394 | 1239 229 | 1237 582 | 1232 570 | 1230 571 | 1230 292 | 1227 321 | 1227 286 | 1222 287 | 1222 510 | 1222 511 | 1222 580 | 1220 266 | 1218 531 | 1217 716 | 1213 546 | 1211 547 | 1211 491 | 1210 492 | 1210 374 | 1203 472 | 1203 563 | 1203 462 | 1199 500 | 1194 584 | 1193 499 | 1188 562 | 1188 731 | 1185 742 | 1184 437 | 1182 555 | 1182 280 | 1172 720 | 1170 581 | 1168 717 | 1168 732 | 1162 432 | 1160 242 | 1156 548 | 1151 549 | 1151 579 | 1151 260 | 1150 567 | 1149 569 | 1149 578 | 1148 428 | 1147 532 | 1146 559 | 1145 438 | 1144 621 | 1143 371 | 1138 333 | 1137 522 | 1137 533 | 1135 523 | 1132 558 | 1132 538 | 1126 539 | 1126 489 | 1124 714 | 1121 427 | 1115 506 | 1114 735 | 1107 59 | 1104 517 | 1104 430 | 1101 255 | 1099 336 | 1087 516 | 1084 652 | 1077 383 | 1076 387 | 1072 285 | 1071 251 | 1069 699 | 1069 322 | 1067 552 | 1067 553 | 1067 309 | 1063 473 | 1061 550 | 1061 551 | 1061 497 | 1060 498 | 1060 697 | 1060 385 | 1056 470 | 1046 256 | 1044 282 | 1044 296 | 1044 299 | 1044 314 | 1044 456 | 1044 650 | 1044 381 | 1042 463 | 1040 477 | 1040 335 | 1036 402 | 1036 471 | 1034 55 | 1033 646 | 1033 360 | 1031 643 | 1031 653 | 1031 279 | 1026 320 | 1026 352 | 1023 331 | 1021 364 | 1020 356 | 1018 465 | 1016 574 | 1016 464 | 1014 673 | 1014 103 | 1013 234 | 1013 334 | 1013 380 | 1013 295 | 1008 332 | 1006 263 | 1004 482 | 1004 585 | 1004 353 | 1000 361 | 998 401 | 997 484 | 996 294 | 993 217 | 987 156 | 986 246 | 986 312 | 986 311 | 985 376 | 984 399 | 984 377 | 983 594 | 982 330 | 981 692 | 978 271 | 977 267 | 976 270 | 976 272 | 976 277 | 976 326 | 975 575 | 972 233 | 966 298 | 966 305 | 966 424 | 966 649 | 966 235 | 965 459 | 963 526 | 962 372 | 960 408 | 959 527 | 954 319 | 947 599 | 947 651 | 947 340 | 945 373 | 945 577 | 945 403 | 944 469 | 943 327 | 938 455 | 937 719 | 936 158 | 931 236 | 926 258 | 926 276 | 926 488 | 926 586 | 926 476 | 925 388 | 924 501 | 924 502 | 924 409 | 919 573 | 918 744 | 917 264 | 906 445 | 904 443 | 903 283 | 902 442 | 902 444 | 900 407 | 896 339 | 890 252 | 889 247 | 888 503 | 888 253 | 886 273 | 886 589 | 886 583 | 884 576 | 882 239 | 880 607 | 877 406 | 876 220 | 875 386 | 873 440 | 872 329 | 871 384 | 871 350 | 870 405 | 870 708 | 870 250 | 868 604 | 868 392 | 864 429 | 864 140 | 862 248 | 859 458 | 858 393 | 854 439 | 852 357 | 850 595 | 850 262 | 849 269 | 848 304 | 848 318 | 848 647 | 848 723 | 848 354 | 847 268 | 846 281 | 846 648 | 846 240 | 845 225 | 842 325 | 839 224 | 838 146 | 836 441 | 833 389 | 832 664 | 832 145 | 829 481 | 829 315 | 828 644 | 827 346 | 825 328 | 823 404 | 812 475 | 812 348 | 810 378 | 809 600 | 805 390 | 792 466 | 792 747 | 792 341 | 790 426 | 790 593 | 789 316 | 788 301 | 773 457 | 772 591 | 772 592 | 772 288 | 770 587 | 758 597 | 758 460 | 752 590 | 752 291 | 
745 436 | 739 254 | 731 683 | 731 244 | 730 715 | 721 324 | 715 721 | 708 297 | 690 654 | 690 310 | 673 338 | 672 382 | 641 237 | 632 693 | 632 448 | 629 355 | 600 370 | 600 259 | 592 118 | 588 238 | 572 351 | 494 395 | 456 290 | 454 265 | 453 300 | 434 655 | 424 656 | 417 303 | 414 323 | 410 446 | 399 179 | 366 293 | 363 658 | 360 363 | 351 230 | 338 756 | 337 642 | 336 116 | 332 691 | 306 307 | 296 317 | 296 306 | 276 257 | 215 159 | 168 181 | 163 180 | 160 737 | 144 60 | 138 62 | 138 93 | 138 100 | 138 101 | 138 104 | 138 151 | 138 204 | 138 205 | 138 206 | 138 207 | 138 208 | 138 209 | 138 210 | 138 211 | 138 212 | 138 213 | 138 410 | 138 411 | 138 616 | 138 121 | 137 749 | 136 182 | 135 337 | 135 596 | 135 450 | 134 189 | 131 449 | 119 447 | 114 751 | 90 142 | 81 745 | 67 743 | 66 711 | 56 391 | 42 694 | 24 137 | 19 695 | 13 736 | 11 (636 rows) =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= 7.2.1 says: tv=# select * from pg_stats where attname='channelregionid'; tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation -----------+-----------------+-----------+-----------+------------+--------- ---------------------+------------------------------------------------------ ---------------------------------+-----------------------------------------+ ------------- broadcast | channelregionid | 0 | 4 | 429 | {2,54,76,4,81,1,37,3,22,487} | {0.0326667,0.032,0.029,0.028,0.0273333,0.0253333,0.0253333,0.0233333,0.02266 67,0.021} | {8,24,38,64,91,183,250,428,567,659,756} | -0.19478 (1 row) .......... channelregionid | count -----------------+------- 418 | 11698 588 | 8677 417 | 8331 138 | 7435 170 | 7336 219 | 6729 701 | 6585 184 | 6584 218 | 6537 109 | 6479 195 | 6367 734 | 6341 67 | 6235 33 | 5902 615 | 5900 707 | 5899 136 | 5896 227 | 5806 49 | 5754 414 | 5714 1 | 5710 122 | 5646 413 | 5629 48 | 5603 2 | 5593 415 | 5586 3 | 5581 34 | 5579 186 | 5565 13 | 5547 15 | 5546 11 | 5545 12 | 5545 18 | 5545 713 | 5545 9 | 5544 14 | 5544 4 | 5542 10 | 5542 17 | 5541 412 | 5541 16 | 5539 92 | 5493 39 | 5489 35 | 5393 612 | 5371 99 | 5346 678 | 5333 659 | 5304 45 | 5287 46 | 5287 85 | 5286 102 | 5269 705 | 5267 215 | 5252 190 | 5249 709 | 5247 47 | 5234 44 | 5221 36 | 5216 194 | 5210 38 | 5188 698 | 5187 661 | 5136 37 | 5134 40 | 5128 41 | 5114 663 | 5081 82 | 5068 42 | 5051 19 | 5036 81 | 5022 95 | 5019 141 | 4996 54 | 4984 52 | 4980 20 | 4979 25 | 4978 27 | 4978 24 | 4977 29 | 4977 31 | 4977 724 | 4977 22 | 4976 23 | 4976 26 | 4976 660 | 4976 30 | 4975 32 | 4975 420 | 4975 185 | 4966 43 | 4958 21 | 4933 149 | 4756 53 | 4692 480 | 4663 76 | 4652 91 | 4606 134 | 4577 89 | 4536 168 | 4507 700 | 4506 487 | 4499 200 | 4381 222 | 4379 617 | 4329 71 | 4250 613 | 4199 83 | 4198 128 | 4127 130 | 4076 188 | 4070 703 | 4060 197 | 4042 169 | 4025 706 | 4018 129 | 3972 66 | 3944 112 | 3851 704 | 3849 641 | 3809 232 | 3708 622 | 3696 133 | 3695 110 | 3649 221 | 3549 645 | 3484 183 | 3441 634 | 3404 738 | 3399 682 | 3376 123 | 3352 166 | 3346 90 | 3283 64 | 3258 84 | 3248 58 | 3237 631 | 3214 636 | 3180 635 | 3179 639 | 3176 640 | 3176 177 | 3144 710 | 3142 478 | 3124 203 | 3121 172 | 3066 733 | 3061 192 | 3051 214 | 3051 633 | 2962 632 | 2923 722 | 2909 171 | 2698 702 | 2695 107 | 2685 161 | 2658 485 | 2622 696 | 2598 638 | 2568 474 | 2559 275 | 2549 274 | 2546 451 | 2489 637 | 2486 619 | 2470 155 | 2406 433 | 2373 216 | 2334 431 | 2329 231 | 2279 241 | 2261 63 | 2253 605 | 2233 150 | 2227 114 | 2091 223 | 2048 606 | 2047 139 | 2036 73 | 2031 120 | 2020 668 | 
2020 96 | 1984 68 | 1977 657 | 1976 365 | 1949 608 | 1940 368 | 1900 8 | 1888 187 | 1864 86 | 1830 70 | 1817 50 | 1813 175 | 1808 124 | 1806 69 | 1802 367 | 1801 119 | 1795 144 | 1791 178 | 1774 125 | 1753 174 | 1728 143 | 1724 74 | 1680 278 | 1666 422 | 1659 379 | 1644 369 | 1595 313 | 1594 535 | 1594 261 | 1553 154 | 1552 435 | 1523 359 | 1505 308 | 1495 530 | 1494 534 | 1493 543 | 1490 542 | 1487 111 | 1481 461 | 1481 249 | 1476 620 | 1476 602 | 1475 614 | 1469 153 | 1463 228 | 1459 87 | 1457 536 | 1451 289 | 1434 601 | 1421 524 | 1418 525 | 1418 512 | 1383 513 | 1383 375 | 1373 560 | 1373 518 | 1372 245 | 1370 521 | 1369 495 | 1367 493 | 1366 494 | 1366 454 | 1364 561 | 1364 505 | 1363 56 | 1358 496 | 1358 544 | 1356 284 | 1353 77 | 1349 78 | 1348 79 | 1348 80 | 1348 545 | 1347 540 | 1342 541 | 1342 537 | 1337 358 | 1334 618 | 1310 556 | 1299 557 | 1299 349 | 1290 366 | 1282 712 | 1280 425 | 1279 528 | 1276 529 | 1276 572 | 1276 568 | 1272 508 | 1271 509 | 1271 514 | 1257 727 | 1257 515 | 1256 362 | 1255 396 | 1254 603 | 1254 342 | 1249 479 | 1248 486 | 1248 554 | 1247 564 | 1246 565 | 1246 394 | 1239 229 | 1237 582 | 1232 570 | 1230 571 | 1230 292 | 1227 321 | 1227 286 | 1222 287 | 1222 510 | 1222 511 | 1222 580 | 1220 266 | 1218 531 | 1217 716 | 1213 546 | 1211 547 | 1211 491 | 1210 492 | 1210 374 | 1203 472 | 1203 563 | 1203 462 | 1199 500 | 1194 584 | 1193 499 | 1188 562 | 1188 731 | 1185 742 | 1184 437 | 1182 555 | 1182 280 | 1172 720 | 1170 581 | 1168 717 | 1168 732 | 1162 432 | 1160 242 | 1156 548 | 1151 549 | 1151 579 | 1151 260 | 1150 567 | 1149 569 | 1149 578 | 1148 428 | 1147 532 | 1146 559 | 1145 438 | 1144 621 | 1143 371 | 1138 333 | 1137 522 | 1137 533 | 1135 523 | 1132 558 | 1132 538 | 1126 539 | 1126 489 | 1124 714 | 1121 427 | 1115 506 | 1114 735 | 1107 59 | 1104 517 | 1104 430 | 1101 255 | 1099 336 | 1087 516 | 1084 652 | 1077 383 | 1076 387 | 1072 285 | 1071 251 | 1069 699 | 1069 322 | 1067 552 | 1067 553 | 1067 309 | 1063 473 | 1061 550 | 1061 551 | 1061 497 | 1060 498 | 1060 697 | 1060 385 | 1056 470 | 1046 256 | 1044 282 | 1044 296 | 1044 299 | 1044 314 | 1044 456 | 1044 650 | 1044 381 | 1042 463 | 1040 477 | 1040 335 | 1036 402 | 1036 471 | 1034 55 | 1033 646 | 1033 360 | 1031 643 | 1031 653 | 1031 279 | 1026 320 | 1026 352 | 1023 331 | 1021 364 | 1020 356 | 1018 465 | 1016 574 | 1016 464 | 1014 673 | 1014 103 | 1013 234 | 1013 334 | 1013 380 | 1013 295 | 1008 332 | 1006 263 | 1004 482 | 1004 585 | 1004 353 | 1000 361 | 998 401 | 997 484 | 996 294 | 993 217 | 987 156 | 986 246 | 986 312 | 986 311 | 985 376 | 984 399 | 984 377 | 983 594 | 982 330 | 981 692 | 978 271 | 977 267 | 976 270 | 976 272 | 976 277 | 976 326 | 975 575 | 972 233 | 966 298 | 966 305 | 966 424 | 966 649 | 966 235 | 965 459 | 963 526 | 962 372 | 960 408 | 959 527 | 954 319 | 947 599 | 947 651 | 947 340 | 945 373 | 945 577 | 945 403 | 944 469 | 943 327 | 938 455 | 937 719 | 936 158 | 931 236 | 926 258 | 926 276 | 926 488 | 926 586 | 926 476 | 925 388 | 924 501 | 924 502 | 924 409 | 919 573 | 918 744 | 917 264 | 906 445 | 904 443 | 903 283 | 902 442 | 902 444 | 900 407 | 896 339 | 890 252 | 889 247 | 888 503 | 888 253 | 886 273 | 886 589 | 886 583 | 884 576 | 882 239 | 880 607 | 877 406 | 876 220 | 875 386 | 873 440 | 872 329 | 871 384 | 871 350 | 870 405 | 870 708 | 870 250 | 868 604 | 868 392 | 864 429 | 864 140 | 862 248 | 859 458 | 858 393 | 854 439 | 852 357 | 850 595 | 850 262 | 849 269 | 848 304 | 848 318 | 848 647 | 848 723 | 848 354 | 847 268 | 846 281 | 846 648 | 846 240 | 845 225 | 842 
325 | 839 224 | 838 146 | 836 441 | 833 389 | 832 664 | 832 145 | 829 481 | 829 315 | 828 644 | 827 346 | 825 328 | 823 404 | 812 475 | 812 348 | 810 378 | 809 600 | 805 390 | 792 466 | 792 747 | 792 341 | 790 426 | 790 593 | 789 316 | 788 301 | 773 457 | 772 591 | 772 592 | 772 288 | 770 587 | 758 597 | 758 460 | 752 590 | 752 291 | 745 436 | 739 254 | 731 683 | 731 244 | 730 715 | 721 324 | 715 721 | 708 297 | 690 654 | 690 310 | 673 338 | 672 382 | 641 237 | 632 693 | 632 448 | 629 355 | 600 370 | 600 259 | 592 118 | 588 238 | 572 351 | 494 395 | 456 290 | 454 265 | 453 300 | 434 655 | 424 656 | 417 303 | 414 323 | 410 446 | 399 179 | 366 293 | 363 658 | 360 363 | 351 230 | 338 756 | 337 642 | 336 116 | 332 691 | 306 307 | 296 317 | 296 306 | 276 257 | 215 159 | 168 181 | 163 180 | 160 737 | 144 60 | 138 62 | 138 93 | 138 100 | 138 101 | 138 104 | 138 151 | 138 204 | 138 205 | 138 206 | 138 207 | 138 208 | 138 209 | 138 210 | 138 211 | 138 212 | 138 213 | 138 410 | 138 411 | 138 616 | 138 121 | 137 749 | 136 182 | 135 337 | 135 596 | 135 450 | 134 189 | 131 449 | 119 447 | 114 751 | 90 142 | 81 745 | 67 743 | 66 711 | 56 391 | 42 694 | 24 137 | 19 695 | 13 736 | 11 (636 rows)
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 08 July 2002 21:44
> To: Sam Liddicott
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 7.2.1 optimises very badly against 7.2
>
> If you turn off enable_seqscan, what EXPLAIN results do you get from
> 7.2.1?

Wow. Even better than 7.2, much better use of indexes.

Unique  (cost=116936.99..117348.73 rows=531 width=276) (actual time=567.37..571.80 rows=305 loops=1)
  ->  Sort  (cost=116936.99..116936.99 rows=5313 width=276) (actual time=567.36..567.70 rows=305 loops=1)
        ->  Nested Loop  (cost=3.51..116608.26 rows=5313 width=276) (actual time=6.56..561.30 rows=305 loops=1)
              ->  Nested Loop  (cost=3.51..92080.31 rows=5313 width=119) (actual time=6.00..519.18 rows=305 loops=1)
                    ->  Hash Join  (cost=3.51..61.71 rows=9 width=84) (actual time=1.45..7.61 rows=9 loops=1)
                          ->  Index Scan using channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey on channelregion  (cost=0.00..58.05 rows=9 width=60) (actual time=0.17..5.78 rows=9 loops=1)
                          ->  Hash  (cost=3.49..3.49 rows=7 width=24) (actual time=0.14..0.14 rows=0 loops=1)
                                ->  Index Scan using distribution_pkey on distribution  (cost=0.00..3.49 rows=7 width=24) (actual time=0.06..0.11 rows=7 loops=1)
                    ->  Index Scan using idx_broadcast_channelregionid on broadcast  (cost=0.00..10273.33 rows=872 width=35) (actual time=3.75..56.22 rows=34 loops=9)
              ->  Index Scan using episode_pkey on episode  (cost=0.00..4.60 rows=1 width=157) (actual time=0.09..0.10 rows=1 loops=305)
Total runtime: 668.73 msec

> How about if you leave enable_seqscan on, but reduce
> random_page_cost from the default 4.0 to perhaps 3.0?

The same.

Normally (in high concurrent use) I think even sequential scans degrade to
random page access because of disk head contention; but in this case the
seq_scan query was running on an otherwise idle machine, as the sole pgsql
client, so perhaps, as you hinted, the random_page_cost was too high?

But I feel that where indexes are used and a seq_scan *could* have been used,
the seq_scan is only slightly faster when the machine is idle (and the small
delay can perhaps be afforded), but where there is disk head contention a
seq_scan is deadly; thus I always prefer index scans, so I shall disable
seq_scan in the config file. Is my reasoning faulty? And is it a reasonable
solution, or can we expect that the query-plan-chooser ought always to do
better?

Thanks for your help on this, and I'm happy to keep running tests for you
until you are happy with your results as well as me happy with mine.
Unique  (cost=91354.47..91766.20 rows=531 width=276) (actual time=542.55..547.01 rows=305 loops=1)
  ->  Sort  (cost=91354.47..91354.47 rows=5313 width=276) (actual time=542.54..542.89 rows=305 loops=1)
        ->  Nested Loop  (cost=1.09..91025.74 rows=5313 width=276) (actual time=7.14..536.40 rows=305 loops=1)
              ->  Nested Loop  (cost=1.09..69315.01 rows=5313 width=119) (actual time=6.59..496.13 rows=305 loops=1)
                    ->  Hash Join  (cost=1.09..50.53 rows=9 width=84) (actual time=1.94..9.11 rows=9 loops=1)
                          ->  Index Scan using channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey, channelregion_pkey on channelregion  (cost=0.00..49.28 rows=9 width=60) (actual time=0.42..7.04 rows=9 loops=1)
                          ->  Hash  (cost=1.07..1.07 rows=7 width=24) (actual time=0.20..0.20 rows=0 loops=1)
                                ->  Seq Scan on distribution  (cost=0.00..1.07 rows=7 width=24) (actual time=0.13..0.16 rows=7 loops=1)
                    ->  Index Scan using idx_broadcast_channelregionid on broadcast  (cost=0.00..7730.28 rows=872 width=35) (actual time=3.66..53.50 rows=34 loops=9)
              ->  Index Scan using episode_pkey on episode  (cost=0.00..4.07 rows=1 width=157) (actual time=0.09..0.10 rows=1 loops=305)
Total runtime: 630.32 msec

Sam
"Sam Liddicott" <sam.liddicott@ananova.com> writes: > But I feel where indexes are used and seq_scan *could* have been used, > seq_scan is only slightly faster where the machine is idle (and the small > delay can perhaps be afforded), but where there there is disk head > contention seq_scan is deadly, thus I always prefer index scan, so I shall > disable seq_scan in the config file. Is my reasoning faulty? Quite. If we could get by with a rule as simplistic as "never use a seqscan if you can avoid it" then the planner could be a lot simpler. Your real gripe is that the planner is overestimating the costs of indexscans relative to seqscans; that would be more appropriately addressed by lowering random_page_cost a little than by getting out the sledgehammer. A more practical reason not to do that is that in situations where a seqscan is not avoidable, enable_seq_scan = OFF is likely to cause the planner to make bad choices, since the disable cost will swamp out all the actually-useful cost judgments. regards, tom lane
On Wed, Jul 10, 2002 at 10:00:38AM -0400, Tom Lane wrote:
> "Sam Liddicott" <sam.liddicott@ananova.com> writes:
> > But I feel that where indexes are used and a seq_scan *could* have been
> > used, the seq_scan is only slightly faster when the machine is idle (and
> > the small delay can perhaps be afforded), but where there is disk head
> > contention a seq_scan is deadly; thus I always prefer index scans, so I
> > shall disable seq_scan in the config file. Is my reasoning faulty?
>
> Quite. If we could get by with a rule as simplistic as "never use a seqscan
> if you can avoid it" then the planner could be a lot simpler. Your real
> gripe is that the planner is overestimating the costs of indexscans
> relative to seqscans; that would be more appropriately addressed by
> lowering random_page_cost a little than by getting out the sledgehammer.

I think there is a little problem with multiple seq scans in a single plan.
If your plan is only doing a single seq scan on a large table, then the cost
estimate is probably fine. But if the planner chooses to seq scan two large
tables in parallel, the actual disk transfers degenerate to random access,
but only if they are on the same disk.

Should postgres be worrying about this?
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 10 July 2002 15:01
> To: Sam Liddicott
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 7.2.1 optimises very badly against 7.2
>
> If we could get by with a rule as simplistic as "never use a seqscan
> if you can avoid it" then the planner could be a lot simpler. Your real
> gripe is that the planner is overestimating the costs of indexscans
> relative to seqscans; that would be more appropriately addressed by
> lowering random_page_cost a little than by getting out the sledgehammer.
>
> A more practical reason not to do that is that in situations where a
> seqscan is not avoidable, enable_seqscan = OFF is likely to cause the
> planner to make bad choices, since the disable cost will swamp out all
> the actually-useful cost judgments.

Right. Do you feel the random page cost of 3 is good to solve this? Is it
solely a tuning problem at my end, or do you want to do further tests to
derive a better default value?

Sam
"Sam Liddicott" <sam.liddicott@ananova.com> writes: > Do you feel the random page cost of 3 is good to solve this? For the moment, anyway. There have been a couple of rounds of pgsql-hackers discussion about whether to lower the default value of random_page_cost, but so far no one has done any experiments that would be needed to establish a good new value. (The current default of 4.0 is based on some old experiments I did. I'm quite willing to accept that those experiments might have been flawed, but not willing to replace the number without seeing better experiments...) regards, tom lane
Just curious: are the number of records per page and the number of keys per
page taken into consideration?

Tom Lane wrote:
>
> "Sam Liddicott" <sam.liddicott@ananova.com> writes:
> > Do you feel the random page cost of 3 is good to solve this?
>
> For the moment, anyway. There have been a couple of rounds of pgsql-hackers
> discussion about whether to lower the default value of random_page_cost,
> but so far no one has done any experiments that would be needed to
> establish a good new value. (The current default of 4.0 is based on some
> old experiments I did. I'm quite willing to accept that those experiments
> might have been flawed, but not willing to replace the number without
> seeing better experiments...)
>
> 			regards, tom lane
Jean-Luc Lachance <jllachan@nsd.ca> writes:
> Are the number of records per page and the number of keys per page taken
> into consideration?

Yes, indirectly. (The cost equations actually use total tuples and total
pages in each table/index, but that's isomorphic to using average tuple size
or average number of tuples per page or however you want to think about
it...)

			regards, tom lane
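[Those per-table inputs are visible directly in the system catalogs; a quick
way to see what the planner is working from, using the table and index names
from this thread, is:]

  SELECT relname, relkind, reltuples, relpages
  FROM pg_class
  WHERE relname IN ('broadcast', 'idx_broadcast_channelregionid', 'episode');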
tgl@sss.pgh.pa.us (Tom Lane) wrote in message:
> For the moment, anyway. There have been a couple of rounds of pgsql-hackers
> discussion about whether to lower the default value of random_page_cost,
> but so far no one has done any experiments that would be needed to
> establish a good new value. (The current default of 4.0 is based on some
> old experiments I did. I'm quite willing to accept that those experiments
> might have been flawed, but not willing to replace the number without
> seeing better experiments...)

I wonder if a simple script could be designed to test the various parameters
specified in postgresql.conf. The user could be asked to input 3 or 4 of the
most common database queries, and the script would determine the cost, run
those queries, and send back the actual time of completion for each variation
of the optimizer settings (e.g. run 1, enable_seqscan false; run 2,
random_page_cost 3; etc.).

This way administrators would have a way (albeit brute force) to fine-tune
their settings for their specific machine and data. Plus, they could upload
the results to the hackers list just like regression tests.

Just a thought.

-Tony
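[A manual version of the sweep Tony describes can already be done in plain
psql: re-run one representative query under EXPLAIN ANALYZE at each setting
and compare the reported "Total runtime" lines. The query below is just a
stand-in built from names used earlier in the thread.]

  SET random_page_cost = 4.0;
  EXPLAIN ANALYZE SELECT count(*) FROM broadcast
    WHERE channelregionid IN (2,20,41,53,54,733,734,86,33);

  SET random_page_cost = 3.0;
  EXPLAIN ANALYZE SELECT count(*) FROM broadcast
    WHERE channelregionid IN (2,20,41,53,54,733,734,86,33);

  SET random_page_cost = 2.0;
  EXPLAIN ANALYZE SELECT count(*) FROM broadcast
    WHERE channelregionid IN (2,20,41,53,54,733,734,86,33);

  RESET random_page_cost;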
On Fri, 12 Jul 2002, Tom Lane wrote:
> "Sam Liddicott" <sam.liddicott@ananova.com> writes:
> > Do you feel the random page cost of 3 is good to solve this?
>
> For the moment, anyway. There have been a couple of rounds of pgsql-hackers
> discussion about whether to lower the default value of random_page_cost,
> but so far no one has done any experiments that would be needed to
> establish a good new value. (The current default of 4.0 is based on some
> old experiments I did. I'm quite willing to accept that those experiments
> might have been flawed, but not willing to replace the number without
> seeing better experiments...)

When I first started using the 7.x series, the query planner often picked a
sequential scan that would take minutes to return when an index scan would
take seconds. A very low setting for random_page_cost would fix this (a
setting of 0.1 or something like that) but would also make the planner make
some bad choices where it should have picked a seq scan but didn't.

With 7.2 I've found that a random_page_cost of around 2 to 4 seems to work
very well. So, my point (and I have one!) is that previous experiences with
random_page_cost and older versions of postgresql don't necessarily apply to
postgresql 7.2.1.

Also, if you're having problems with the query planner and you're running a
version of postgresql from before 7.2, you should upgrade first, then worry
about what random_page_cost should be set to.
Tom Lane wrote (snipped):
> There have been a couple of rounds of pgsql-hackers discussion about
> whether to lower the default value of random_page_cost, but so far no one
> has done any experiments that would be needed to establish a good new
> value. (The current default of 4.0 is based on some old experiments I did.)

4.0 seems like a pretty reasonable default setting. I had the opportunity to
measure approximate random_page_cost for a number of configurations recently
whilst doing some file system benchmarking:

Rough Config                                   Random_page_cost
----------------------------------------------------------------
Sun E220 Solaris 8, SAN RAID 5                        6
Cyclone Linux, 3Ware 7850 4xATA-133 RAID 5           40
Cyclone Linux, 1xATA-133                             10

Given the wide variation obtainable, I can't see any reason to move away from
4.0.

regards
Mark

P.S.: I used these to do the measurements:
http://homepages.slingshot.co.nz/~markir/tar/benchtool/benchtools-0.9.1.tar.gz
and if anyone knows how to enable largefile support on Linux in a cleaner way
than I did, feel free to let me know!
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: 11 July 2002 00:37
> To: Tom Lane
> Cc: Sam Liddicott; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 7.2.1 optimises very badly against 7.2
>
> I think there is a little problem with multiple seq scans in a single plan.
> If your plan is only doing a single seq scan on a large table, then the
> cost estimate is probably fine. But if the planner chooses to seq scan two
> large tables in parallel, the actual disk transfers degenerate to random
> access, but only if they are on the same disk.
>
> Should postgres be worrying about this?

I think it should. The same applies if two different queries are running
together off the same disk, which is probably any DB with max_connections > 1.

Sam
On Mon, 15 Jul 2002, Sam Liddicott wrote:
> > From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> >
> > But if the planner chooses to seq scan two large tables in parallel, the
> > actual disk transfers degenerate to random access, but only if they are
> > on the same disk.
> >
> > Should postgres be worrying about this?
>
> I think it should. The same applies if two different queries are running
> together off the same disk, which is probably any DB with
> max_connections > 1.

Well, should it then worry about read-ahead? On most OSes, it doesn't
actually degenerate to 1-block random reads; it degenerates to something
along the lines of 8-block random reads.

Trying to optimize based on more than the very simplest and most common ideas
about physical layout opens up a huge can of worms when you don't actually
have any real control over, or knowledge of, that layout.

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light. --XTC
Curt Sampson <cjs@cynic.net> writes:
> Well, should it then worry about read-ahead? On most OSes, it doesn't
> actually degenerate to 1-block random reads; it degenerates to something
> along the lines of 8-block random reads.

Exactly. There is still an OS-provided benefit for sequential reading, even
if it's not as large as it might be in the absence of any other activity.
What this line of argument suggests is that random_page_cost should be
reduced in a heavily-loaded system ... but probably not to 1.0.

			regards, tom lane