On Fri, Jan 26, 2007 at 09:55:39AM -0500, Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > Apparently there is a bug lurking somewhere in pgwin32_select(). Because
> > if I put a #undef select right before the select in pgstat.c, the
> > regression tests pass.
> > I guess the bug is shown because with row level stats we simply have
> > more data to process. And it appears only to happen on UDP sockets from
> > what I can tell.
>
> Hmm ... if this theory is correct, then statistics collection has
> never worked at all on Windows, at least not under more than the most
> marginal load; and hence neither has autovacuum.

We have had lots of reports of issues with the stats collector on
Windows. Some were definitely fixed by the patch by O&T, but I don't
think all of them were.

The thing is, since it didn't give any error messages at all, most users
wouldn't notice. Other than their tables getting bloated, in which case
they would do a manual vacuum and conclude autovacuum wasn't good
enough. Or something.

> Does that conclusion agree with reality? You'd think we'd have heard
> a whole lot of complaints about it, not just Jeremy's; and I don't
> remember it being a sore point. (But then again I just woke up.)
>
> What seems somewhat more likely is that we broke pgwin32_select
> recently, in which case we oughta find out why. Or else remove it
> entirely (does your patch make that possible?).

AFAIK, it only affects UDP sockets, and this patch takes
pgwin32_select out of the loop for all UDP stuff.

But if we get this in, pgwin32_select is only used in the postmaster
accept-new-connections loop (from what I can tell by a quick look), so
I'd definitely want to rewrite that one as well to use a better way than
select-emulation. Then it could go away completely.

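To illustrate what I mean by waiting on the UDP socket directly rather
than going through a select() emulation, here is a minimal POSIX sketch.
It is not the patch itself (which is Windows-specific) and it is not
taken from pgstat.c; the helper names wait_for_datagram() and
open_loopback_udp() are made up for this example, and using poll() is my
assumption about what a "direct" wait could look like on a non-Windows
platform.

```c
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper (not from the PostgreSQL tree): wait until a
 * datagram can be read from "sock" or until timeout_ms has elapsed.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
static int
wait_for_datagram(int sock, int timeout_ms)
{
    struct pollfd pfd;
    int         rc;

    pfd.fd = sock;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts the wait. */
    do
    {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;              /* poll() itself failed */
    if (rc == 0)
        return 0;               /* timed out, nothing to read */

    /* Anything other than POLLIN (e.g. POLLERR) is reported as an error. */
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/*
 * Hypothetical helper: create a UDP socket bound to an ephemeral port
 * on the loopback interface. On success the bound address is stored
 * in *addr and the socket is returned; on failure returns -1.
 */
static int
open_loopback_udp(struct sockaddr_in *addr)
{
    socklen_t   len = sizeof(*addr);
    int         sock = socket(AF_INET, SOCK_DGRAM, 0);

    if (sock < 0)
        return -1;

    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;         /* let the kernel pick a free port */

    if (bind(sock, (struct sockaddr *) addr, sizeof(*addr)) < 0 ||
        getsockname(sock, (struct sockaddr *) addr, &len) < 0)
    {
        close(sock);
        return -1;
    }
    return sock;
}
```

A caller would loop on wait_for_datagram() and only call recv() when it
returns 1. A timeout then shows up explicitly as 0 instead of being
silently lost, which is the kind of failure nobody noticed above.
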
> Keep in mind also that we have seen the stats-test failure on
> non-Windows machines, so we still need to explain that ...

Yeah. But it *could* be two different stats issues lurking. Perhaps the
issue we've seen on non-Windows can be fixed by the settings Alvaro had
me try (increasing autovacuum_vacuum_cost_delay or the delay in the
regression test).

//Magnus