Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
Date | |
Msg-id | 20200408155703.evmgiyjeffyhklui@development Whole thread Raw |
In response to | Re: [PATCH] Incremental sort (was: PoC: Partial sort) (James Coleman <jtc331@gmail.com>) |
Responses |
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
|
List | pgsql-hackers |
On Wed, Apr 08, 2020 at 11:13:26AM -0400, James Coleman wrote: >On Wed, Apr 8, 2020 at 11:02 AM Tomas Vondra ><tomas.vondra@2ndquadrant.com> wrote: >> >> On Wed, Apr 08, 2020 at 04:08:39PM +0200, Tomas Vondra wrote: >> >On Wed, Apr 08, 2020 at 09:54:42AM -0400, James Coleman wrote: >> >>On Wed, Apr 8, 2020 at 9:43 AM Tomas Vondra >> >><tomas.vondra@2ndquadrant.com> wrote: >> >>> >> >>>On Wed, Apr 08, 2020 at 12:51:05PM +0200, Tomas Vondra wrote: >> >>>>On Tue, Apr 07, 2020 at 11:54:23PM -0400, Tom Lane wrote: >> >>>>>hyrax is not too happy with this test: >> >>>>> >> >>>>>https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2020-04-07%2004%3A55%3A15 >> >>>>> >> >>>>>It's not too clear to me why CLOBBER_CACHE_ALWAYS would be breaking >> >>>>>EXPLAIN output, but it evidently is. >> >>>>> >> >>>> >> >>>>Thanks, I'll investigate. It's not clear to me either what might be >> >>>>causing this, but I guess something must have gone wrong in >> >>>>estimation/planning. >> >>>> >> >>> >> >>>OK, I know what's going on - it's a rather embarassing issue in the >> >>>regression test. There's no analyze on the test tables, so it uses >> >>>default estimates for number of groups etc. But with clobber cache the >> >>>test runs long enough for autoanalyze to kick in and collect stats, so >> >>>we generate better estimates which changes the plan. >> >>> >> >>>I'll get this fixed - explicit analyze and tweaking the data a bit >> >>>should do the trick. >> >> >> >>Looking at the tests that failed, I think we should consider just adding: >> >>set enable_sort = off; >> >>because several of those tests have very specific amounts of data to >> >>ensure we test the transition points around the different modes in the >> >>incremental sort node. >> >> >> > >> >Maybe, but I'd much rather tweak the data so that we test both the >> >costing and execution part. >> > >> >> I do think this does the trick by increasing the number of rows a bit >> (from 100 to 1000) to make the Sort more expensive than Incremental >> Sort, while still testing the transition points. >> >> James, can you verify it that's still true? > >Those changes all look good to me from a "testing correctness" POV. >Also I like that we now test multiple sort methods in the explain >output, like: "Sort Methods: top-N heapsort, quicksort". > OK, good. I'll push the fix. >I personally find the `i/100` notation harder to read than a case, but >that's just an opinion... > Yeah, but with 1000 rows we'd need a more complex CASE statement (I don't think simply having two groups - small+large would work). >Should we change `analyze` to `analyze t` to avoid unnecessarily >re-analyzing all other tables in the regression db? > Ah, definitely. That was a mistake. Thanks for noticing. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: