Thread: Strange Windows problem, lock_timeout test request
Hi,

using the MinGW cross-compiled PostgreSQL binaries from Fedora 18, I get the following error for both the 32- and 64-bit compiled executables. listen_addresses = '*' and "trust" authentication were set for both. The firewall was disabled for the tests and the server logs "incomplete startup packet". The 64-bit version was compiled with my lock_timeout patch, the 32-bit one was without.

64-bit, connect to localhost:

C:\Users\Ákos\Desktop\PG93>bin\psql
psql: could not connect to server: Operation would block (0x00002733/10035)
        Is the server running on host "localhost" (::1) and accepting
        TCP/IP connections on port 5432?
could not connect to server: Operation would block (0x00002733/10035)
        Is the server running on host "localhost" (127.0.0.1) and accepting
        TCP/IP connections on port 5432?

64-bit, connect to own IP:

C:\Users\Ákos\Desktop\PG93>bin\psql -h 192.168.1.4
psql: could not connect to server: Operation would block (0x00002733/10035)
        Is the server running on host "192.168.1.4" and accepting
        TCP/IP connections on port 5432?

32-bit, connect to own IP:

C:\Users\Ákos\Desktop\PG93-32>bin\psql -h 192.168.1.4
psql: could not connect to server: Operation would block (0x00002733/10035)
        Is the server running on host "192.168.1.4" and accepting
        TCP/IP connections on port 5432?

Does it ring a bell for someone?

Unfortunately, I won't have time to do anything with my lock_timeout patch for about 3 weeks. Does anyone have a little spare time to test it on Windows? The patch is here, it still applies to HEAD without rejects or fuzz:
http://www.postgresql.org/message-id/506C0854.7090008@cybertec.at

Thanks in advance,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
On 01/18/2013 05:43 PM, Boszormenyi Zoltan wrote:
> Hi,
>
> using the MinGW cross-compiled PostgreSQL binaries from Fedora 18,
> I get the following error for both the 32- and 64-bit compiled executables.
> [...]
> Does it ring a bell for someone?
>
> Unfortunately, I won't have time to do anything with my lock_timeout patch
> for about 3 weeks. Does anyone have a little spare time to test it on Windows?
> The patch is here, it still applies to HEAD without rejects or fuzz:
> http://www.postgresql.org/message-id/506C0854.7090008@cybertec.at

Yes it rings a bell. See
<http://people.planetpostgresql.org/andrew/index.php?/archives/264-Cross-compiling-PostgreSQL-for-WIndows.html>

Cross-compiling is not really a supported platform. Why don't you just build natively? This is known to work, as shown by the buildfarm animals doing it successfully.

cheers

andrew
On Saturday, January 19, 2013 4:13 AM Boszormenyi Zoltan wrote:
> Hi,
>
> Unfortunately, I won't have time to do anything with my lock_timeout patch
> for about 3 weeks. Does anyone have a little spare time to test it on Windows?

I shall try to do it, probably next week.
Others are also welcome to test the patch.

With Regards,
Amit Kapila.
2013-01-19 01:13 keltezéssel, Andrew Dunstan írta:
>
> On 01/18/2013 05:43 PM, Boszormenyi Zoltan wrote:
>> Hi,
>>
>> using the MinGW cross-compiled PostgreSQL binaries from Fedora 18,
>> I get the following error for both the 32- and 64-bit compiled executables.
>> [...]
>> Does it ring a bell for someone?
>
> Yes it rings a bell. See
> <http://people.planetpostgresql.org/andrew/index.php?/archives/264-Cross-compiling-PostgreSQL-for-WIndows.html>

The strange thing is that I used a cross-compiled build during Fedora 16 prime time (using the mingw-w64 test repository) and early Fedora 17, using the mingw32-* and mingw64-* GCC packages that were pulled from the above mentioned repository. That was the last time I tested PG on Windows, and it worked. I could connect to it both locally and from my Linux PC.

> Cross-compiling is not really a supported platform. Why don't you just build natively?
> This is known to work, as shown by the buildfarm animals doing it successfully.

Because I don't have a mingw setup on Windows. (Sorry.)

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
2013-01-19 06:52 keltezéssel, Amit kapila írta: > On Saturday, January 19, 2013 4:13 AM Boszormenyi Zoltan wrote: >> Hi, > > >> Unfortunately, I won't have time to do anything with my lock_timeout patch >> for about 3 weeks. Does anyone have a little spare time to test it on Windows? > I shall try to do it, probably next week. > Others are also welcome to test the patch. Thanks very much in advance. -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
2013-01-19 01:13 keltezéssel, Andrew Dunstan írta:
>
> On 01/18/2013 05:43 PM, Boszormenyi Zoltan wrote:
>> [...]
>> Does it ring a bell for someone?
>
> Yes it rings a bell. See
> <http://people.planetpostgresql.org/andrew/index.php?/archives/264-Cross-compiling-PostgreSQL-for-WIndows.html>

I wanted to add a comment to this blog entry but it wasn't accepted. Here it is:

It doesn't work under Wine, see:
http://www.winehq.org/pipermail/wine-users/2013-January/107008.html

But pg_config works, so other PostgreSQL clients can also be built using the cross compiler.

> Cross-compiling is not really a supported platform. Why don't you just build natively?
> This is known to work, as shown by the buildfarm animals doing it successfully.
>
> cheers
>
> andrew

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
On 01/19/2013 02:36 AM, Boszormenyi Zoltan wrote:
>
>> Cross-compiling is not really a supported platform. Why don't you
>> just build natively? This is known to work, as shown by the buildfarm
>> animals doing it successfully.
>
> Because I don't have a mingw setup on Windows. (Sorry.)

A long time ago I had a lot of sympathy with this answer, but these days not so much. Getting a working mingw/msys environment sufficient to build a bare bones PostgreSQL from scratch is both cheap and fairly easy. The improvements that mingw has made in its install process, and the presence of cheap or free windows instances in the cloud, combine to make this pretty simple. But since it's still slightly involved, here is how I constructed one such this morning:

* Create an Amazon t1.micro spot instance of Windows_Server-2008-SP2-English-64Bit-Base-2012.12.12 (ami-554ac83c) (current price $0.006 / hour)
* get the credentials and log in using:
  o rdesktop -g 80% -u Administrator -p 'password' amazon-hostname
* turn off annoying IE security enhancements, and fire up IE
* go to <http://sourceforge.net/projects/mingw/files/Installer/mingw-get-inst/> and download the latest mingw-get-inst
* run this - make sure to select the Msys and the developer toolkit in addition to the C compiler
* navigate in explorer or a command window to \Mingw\Msys\1.0 and run msys.bat
* run this command to install some useful packages:
  o mingw-get install msys-wget msys-rxvt msys-unzip
* close that window
* open a normal command window and run the following to create an unprivileged user and open an msys window as that user:
  o net user pgrunner SomePw1234 /add
  o runas /user:pgrunner "cmd /c \mingw\msys\1.0\msys.bat --rxvt"
* in the rxvt window run:
  o wget http://ftp.postgresql.org/pub/snapshot/dev/postgresql-snapshot.tar.gz
  o tar -z -xf postgresql-snapshot.tar.gz
  o wget "http://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win64/Automated%20Builds/mingw-w64-bin_i686-mingw_20111220.zip/download"
  o mkdir /mingw64
  o cd /mingw64
  o unzip ~/mingw-w64-bin_i686-mingw_20111220.zip
  o cd ~/postgresql-9.3devel
  o export PATH=/mingw64/bin:$PATH
  o ./configure --host=x86_64-w64-mingw32 --without-zlib && make && make check
    + (make some coffee and do the crossword or read War and Peace - this can take a while)

Of course, for something faster you would pick a rather more expensive instance than t1.micro (say, m1.large, which usually has a spot price of $0.03 to $0.06 per hour), and if you're going to do this a lot you'll stash your stuff on an EBS volume that you can remount as needed, or zip it up and put it on S3, and then just pull it and unpack it in one hit from there. And there's barely enough room left on the root file system to do what's above anyway. But you can get the idea from this.

Note that we no longer need to look elsewhere for extra tools like flex and bison as we once did - the ones that come with modern Msys should work just fine.

If you want more build features (openssl, libxml, libxslt, perl, python etc) then there is extra work to do in getting hold of those. But then cross-compiling for those things might not be easy either.

Somebody more adept at automating Amazon than I am should be able to automate most or all of this.

Anyway, that should be just about enough info for just about any competent developer to get going, even if they have never touched Windows.

(No doubt one or two people will want to quibble with what I've done. That's fine - this is a description, not a prescription.)

cheers

andrew
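A quick sanity check before the configure step can confirm that the unpacked cross toolchain is actually on the PATH and runnable (a minimal sketch, not part of Andrew's recipe; the compiler name follows from the --host triplet used above):

    export PATH=/mingw64/bin:$PATH
    # the configure run above expects this driver to exist and execute
    x86_64-w64-mingw32-gcc --version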
On 01/19/2013 02:51 AM, Boszormenyi Zoltan wrote: >> Yes it rings a bell. See >> <http://people.planetpostgresql.org/andrew/index.php?/archives/264-Cross-compiling-PostgreSQL-for-WIndows.html> > > > I wanted to add a comment to this blog entry but it wasn't accepted. The blog is closed for comments. I have moved to a new blog, and this is just there for archive purposes. > Here it is: > > It doesn't work under Wine, see: > http://www.winehq.org/pipermail/wine-users/2013-January/107008.html > > But pg_config works so other PostgreSQL clients can also be built > using the cross compiler. > If you want to target Wine I think you're totally on your own. cheers andrew
2013-01-20 00:15 keltezéssel, Andrew Dunstan írta:
>
> On 01/19/2013 02:51 AM, Boszormenyi Zoltan wrote:
>>> Yes it rings a bell. See
>>> <http://people.planetpostgresql.org/andrew/index.php?/archives/264-Cross-compiling-PostgreSQL-for-WIndows.html>
>>
>> I wanted to add a comment to this blog entry but it wasn't accepted.
>
> The blog is closed for comments. I have moved to a new blog, and this
> is just there for archive purposes.
>
>> Here it is:
>>
>> It doesn't work under Wine, see:
>> http://www.winehq.org/pipermail/wine-users/2013-January/107008.html
>>
>> But pg_config works, so other PostgreSQL clients can also be built
>> using the cross compiler.
>
> If you want to target Wine I think you're totally on your own.

Yes, I know, it was only an attempt. The user having administrator privileges under Wine is a showstopper.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
2013-01-19 21:15 keltezéssel, Andrew Dunstan írta: > > On 01/19/2013 02:36 AM, Boszormenyi Zoltan wrote: >> >>> >>> Cross-compiling is not really a supported platform. Why don't you >>> just build natively? This is know to work as shown by the buildfarm >>> animals doing it successfully. >> >> Because I don't have a mingw setup on Windows. (Sorry.) >> > > A long time ago I had a lot of sympathy with this answer, but these > days not so much. I didn't ask for sympathy. :-) I am just on a totally different project until 9th February and I am also far away from my desk. So I can't even attempt to install Windows+mingw inside Qemu/KVM. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Sat, Jan 19, 2013 at 12:15 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > On 01/19/2013 02:36 AM, Boszormenyi Zoltan wrote: >> > > A long time ago I had a lot of sympathy with this answer, but these days not > so much. Getting a working mingw/msys environment sufficient to build a bare > bones PostgreSQL from scratch is both cheap and fairly easy. The > improvements that mingw has made in its install process, and the presence of > cheap or free windows instances in the cloud combine to make this pretty > simple. But since it's still slightly involved here is how I constructed > one such this morning: I've used this description, skipping the Amazon part and putting it directly on my Windows computer, and it worked. Except bin/pg_ctl does not work. It just silently exits without doing anything, so I have to use bin/postgres to start the database (which is what "make check" uses anyway, so not a problem if you just want make check). Is that just me or is that a known problem? I've seen some discussion from 2004, but didn't find a conclusion. What advantages does mingw have over MSVC? Is it just that it is cost-free, or is it easier to use mingw than MSVC for someone used to building on Linux? (mingw certainly does not seem to have the advantage of being fast!) Would you like to put this somewhere on wiki.postgresql.org, or would you mind if I do so? Thanks for the primer, Jeff
On 01/24/2013 01:44 PM, Jeff Janes wrote:
> On Sat, Jan 19, 2013 at 12:15 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> On 01/19/2013 02:36 AM, Boszormenyi Zoltan wrote:
>> [...]
> I've used this description, skipping the Amazon part and putting it
> directly on my Windows computer, and it worked.
>
> Except bin/pg_ctl does not work. It just silently exits without doing
> anything, so I have to use bin/postgres to start the database (which
> is what "make check" uses anyway, so not a problem if you just want
> make check). Is that just me or is that a known problem? I've seen
> some discussion from 2004, but didn't find a conclusion.

Did you copy libpq.dll from the lib directory to the bin directory? If not, try that and see if it fixes the problem.

>
> What advantages does mingw have over MSVC? Is it just that it is
> cost-free, or is it easier to use mingw than MSVC for someone used to
> building on Linux? (mingw certainly does not seem to have the
> advantage of being fast!)

See Craig's email today about problems with SDKs and availability of

>
> Would you like to put this somewhere on wiki.postgresql.org, or would
> you mind if I do so?
>
> Thanks for the primer,
>
> Jeff
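A minimal sketch of Andrew's suggested workaround from an msys shell (the /c/pgsql install prefix is a hypothetical example, not a path from this thread):

    # pg_ctl links against libpq (it uses PQping to probe the server);
    # if Windows can't find libpq.dll next to the executable, pg_ctl
    # can exit silently at startup.
    cp /c/pgsql/lib/libpq.dll /c/pgsql/bin/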
On 01/24/2013 02:41 PM, Andrew Dunstan wrote:
>
>> What advantages does mingw have over MSVC? Is it just that it is
>> cost-free, or is it easier to use mingw than MSVC for someone used to
>> building on Linux? (mingw certainly does not seem to have the
>> advantage of being fast!)
>
> See Craig's email today about problems with SDKs and availability of

Not sure what happened there. ... availability of free 64 bit MSVC compilers.

Also, some third party libraries are built with the Mingw compilers, and it's often best not to switch if you can help it.

>
>> Would you like to put this somewhere on wiki.postgresql.org, or would
>> you mind if I do so?

Please go for it.

cheers

andrew
On 01/25/2013 02:44 AM, Jeff Janes wrote:
>
> What advantages does mingw have over MSVC? Is it just that it is
> cost-free, or is it easier to use mingw than MSVC for someone used to
> building on Linux? (mingw certainly does not seem to have the
> advantage of being fast!)

As far as I'm concerned, the only advantages of MinGW are:

- Mostly compatible with the autotools/gmake toolchain via msys; and
- Not subject to Microsoft's whims regarding zero-cost access to toolchains

I don't particularly like autotools and think that using it on Windows is pretty ugly. We need msys anyway because our build process uses bison and flex, and at the present time there are no quality native Windows ports of these tools, but there's no fundamental reason they couldn't run without msys.

The greater appeal is that MinGW means that we're not totally dependent on Microsoft's whims when it comes to free access to the toolchain. They have a spotty history there. There was no free access to Visual Studio compilers until 2004, with the release of vctoolkit2003 (which has now vanished). They then added compilers to the Windows SDK starting in (IIRC) 6.0a, through to 7.1 (Windows 7 and .NET 4.5). Those compilers have been removed from the SDK as of the Windows 8 SDK.

The other way to get Microsoft compilers is from Visual Studio. This usually expensive product was released as a free-to-use Express version with VC Express 2005. It was a dog of a thing that required manual registry hacking and/or environment setup and the manual addition of an SDK to make it work. VC Express 2008 provided an integrated installer, but also removed some debugging facilities. VC Express 2010 removed more debugging features. Then with VC 2012 Microsoft announced that there would be no 2012 Express edition and that everybody should convert their apps to the new Metro system, for which a free toolchain would be available. They went back on that under community pressure and released VC Express 2012, which has many of the debugging features restored and works well - but who knows if it'll see any further updates.

All the Visual Studio Express editions require activation to use them, though this activation is free. Once you have an activation code it's valid forever, but there's no guarantee the servers will remain available to generate new codes, so you'd better keep them written down.

Microsoft don't maintain old SDKs and VC Express editions much, if at all, so installing and using older SDKs gets more complicated over time. Take VC Express 2010, until recently the newest edition available:

- Uninstall any Visual Studio 2010 redists that're installed, taking note of the versions
- Install VC Express 2010
- Install VC Express 2010 SP1
- Install VC Express 2010 SP1 Compiler Pack
- Download and reinstall any newer redists that you uninstalled, so your apps keep working.

... and it looks like having VC Express 2010 installed might partly break VC 2012; I'm still unsure of exactly what caused that.

We cannot fix any of these problems, nor can we update the toolsets to run on newer Windows versions. So if we support only Visual Studio, we're extremely dependent on the uncertain future of the free Microsoft toolchain. For that reason, even though it's not a great build environment and on Windows not a great compiler, MinGW support is important. If you notice issues with the MinGW builds, please report them so regression test coverage can be improved.
Hardly anyone uses them, but like the fire department, one day if you stop supporting them you really regret it.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 01/25/2013 04:08 AM, Andrew Dunstan wrote:
>
> On 01/24/2013 02:41 PM, Andrew Dunstan wrote:
>>
>>> What advantages does mingw have over MSVC? Is it just that it is
>>> cost-free, or is it easier to use mingw than MSVC for someone used to
>>> building on Linux? (mingw certainly does not seem to have the
>>> advantage of being fast!)
>>
>> See Craig's email today about problems with SDKs and availability of
>> free 64 bit MSVC compilers.
>
> Also, some third party libraries are built with the Mingw compilers,
> and it's often best not to switch if you can help it.

That's a trade-off; they're often not available in 64-bit, so you usually land up needing to rebuild them for 64-bit anyway, and the mingw 64 bit toolchain seems to be a bit broken.

As for 64-bit MS compilers: at the moment the MSVC 64-bit compilers are available for free as cross-compilers. The rather bizarre situation is that you have a 64-bit host OS running 32-bit compilers under WoW64, producing 64-bit binaries as output. This seems to be quite transparent to the build process when on a 64-bit host, as the host can execute the cross-compiled binaries directly, so it isn't like a normal cross-compile where you have to differentiate between host- and target-executables.

So, while no native 64-bit compilers are available for free as part of Visual Studio Express 2012 (11), it doesn't matter much. The issue is more that it's very much Microsoft's whim whether they release compilers at all and if so, which ones, when and how.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
> So, while no native 64-bit compilers are available for free as part of
> Visual Studio Express 2012 (11), it doesn't matter much. The issue is
> more that it's very much Microsoft's whim whether they release compilers
> at all and if so, which ones, when and how.

I think I have a pretty vanilla Visual Studio Express 2012 for Desktop, and:

C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64>.\cl
Microsoft (R) C/C++ Optimizing Compiler Version 17.00.51106.1 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

usage: cl [ option... ] filename... [ /link linkoption... ]

C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64>

Am I misunderstanding the discussion here? Isn't that the 64-bit tool suite? Does anyone care if the compiler is a 64 bit binary itself?
On 01/27/2013 12:04 AM, james wrote:
>> So, while no native 64-bit compilers are available for free as part of
>> Visual Studio Express 2012 (11), it doesn't matter much. The issue is
>> more that it's very much Microsoft's whim whether they release compilers
>> at all and if so, which ones, when and how.
>
> I think I have a pretty vanilla Visual Studio Express 2012 for Desktop, and:
>
> C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64>.\cl
> Microsoft (R) C/C++ Optimizing Compiler Version 17.00.51106.1 for x64
> Copyright (C) Microsoft Corporation. All rights reserved.
>
> usage: cl [ option... ] filename... [ /link linkoption... ]
>
> C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64>
>
> Am I misunderstanding the discussion here?

Yep, that's a cross-compiler, hence "x86_amd64". As I said, it doesn't actually matter very much on an x64 host, as the target arch binaries run fine on the host arch. The only reason you'd need the native 64-bit compiler is if you're compiling huge programs that cause the 32-bit to 64-bit cross-compiler to run out of memory. Not an issue for Pg.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Saturday, January 19, 2013 11:23 AM Amit kapila wrote:
> On Saturday, January 19, 2013 4:13 AM Boszormenyi Zoltan wrote:
>> Hi,
>>
>> Unfortunately, I won't have time to do anything with my lock_timeout patch
>> for about 3 weeks. Does anyone have a little spare time to test it on Windows?
>
> I shall try to do it, probably next week.
> Others are also welcome to test the patch.

I have carried out some Windows testing of the lock timeout patch. The extra tests which were carried out on the patch are attached to the mail.

Some observations on the patch:

1. The patch needs a rebase, as it is causing some rejections.
2. The regress check failed because the expected ".out" file is not updated properly.

Regards,
Hari babu.
2013-01-28 15:20 keltezéssel, Hari Babu írta:
> On Saturday, January 19, 2013 11:23 AM Amit kapila wrote:
>> On Saturday, January 19, 2013 4:13 AM Boszormenyi Zoltan wrote:
>>> Hi,
>>>
>>> Unfortunately, I won't have time to do anything with my lock_timeout patch
>>> for about 3 weeks. Does anyone have a little spare time to test it on Windows?
>> I shall try to do it, probably next week.
>> Others are also welcome to test the patch.
>
> I have carried out some Windows testing of the lock timeout patch.

Thanks very much. It means it didn't crash for you, and it waited the expected amount of time as well.

> The extra tests which were carried out on the patch are attached to the mail.

The patch itself contained regression tests to run by itself and compete with statement_timeout as well, although it waits only 2 seconds instead of 60 as in your test.

> Some observations on the patch:
>
> 1. The patch needs a rebase, as it is causing some rejections.

On a fresh GIT pull, it only caused a reject for the documentation parts of the patch. No other rejects and no fuzz; only line shift warnings were indicated by patch. Anyway, a refreshed one is attached.

> 2. The regress check failed because the expected ".out" file is not
> updated properly.

Which regress check failed? The .out file was updated in the patch for prepared_xacts.sql, where the regression tests for lock_timeout were added. Or do you mean the one for the sql file you sent?

Thanks,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
2013-01-30 15:29 keltezéssel, Zoltán Böszörményi írta:
> 2013-01-28 15:20 keltezéssel, Hari Babu írta:
>> [...]
>> 2. The regress check failed because the expected ".out" file is not
>> updated properly.
>
> Which regress check failed? The .out file was updated in the patch
> for prepared_xacts.sql, where the regression tests for lock_timeout
> were added. Or do you mean the one for the sql file you sent?

If the failed regression test is indeed the prepared_xacts.sql that is in my patch, can you attach the regression.diffs file after the failed "make check"? Thanks.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
On Wednesday, January 30, 2013 7:59 PM Zoltán Böszörményi wrote:
>> 2013-01-28 15:20 keltezéssel, Hari Babu írta:
>> 2. The regress check failed because the expected ".out" file is not
>> updated properly.
>
> Which regress check failed? The .out file was updated in the patch for
> prepared_xacts.sql, where the regression tests for lock_timeout were added.
> Or do you mean the one for the sql file you sent?

During the regress test, the expected file "prepared_xacts_1.out" is used for comparison with the result file. It is not updated by the patch, and because of this the regress check is failing.

Regards,
Hari babu.
2013-01-30 16:06 keltezéssel, Hari Babu írta:
> On Wednesday, January 30, 2013 7:59 PM Zoltán Böszörményi wrote:
>>> 2013-01-28 15:20 keltezéssel, Hari Babu írta:
>>> 2. The regress check failed because the expected ".out" file is not
>>> updated properly.
>> Which regress check failed? [...]
>
> During the regress test, the expected file "prepared_xacts_1.out" is used
> for comparison with the result file. It is not updated by the patch, and
> because of this the regress check is failing.

I see, so this is a Windows-only difference that needs a different expected file. Can you send the resulting prepared_xacts_1.out file so I can integrate its changes into my patch? That way it would be complete. Thanks in advance.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
2013-01-30 17:45 keltezéssel, Zoltán Böszörményi írta:
> 2013-01-30 16:06 keltezéssel, Hari Babu írta:
>> [...]
>> During the regress test, the expected file "prepared_xacts_1.out" is used
>> for comparison with the result file. It is not updated by the patch, and
>> because of this the regress check is failing.
>
> I see, so this is a Windows-only difference that needs a different
> expected file. Can you send the resulting prepared_xacts_1.out file
> so I can integrate its changes into my patch? That way it would be
> complete. Thanks in advance.

I have found a little time to look into this problem and found a way to make pg_regress use prepared_xacts_1.out. I had to change line 2193 in pg_regress.c from

    fputs("max_prepared_transactions = 2\n", pg_conf);

to

    fputs("max_prepared_transactions = 0\n", pg_conf);

The patch now passed "make check" in both cases.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Zoltán Böszörményi wrote:

> I have found a little time to look into this problem and
> found a way to make pg_regress use prepared_xacts_1.out.
> I had to change line 2193 in pg_regress.c from
>
>     fputs("max_prepared_transactions = 2\n", pg_conf);
>
> to
>
>     fputs("max_prepared_transactions = 0\n", pg_conf);
>
> The patch now passed "make check" in both cases.

That sounds a lot more difficult than just using "make installcheck" and configuring the running server with zero prepared xacts ...

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
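A minimal sketch of the approach Alvaro suggests (the data directory path is a placeholder):

    # in the running server's postgresql.conf, set:
    #   max_prepared_transactions = 0
    pg_ctl -D /path/to/datadir restart
    make installcheck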
2013-01-31 15:22 keltezéssel, Alvaro Herrera írta:
> Zoltán Böszörményi wrote:
>
>> I have found a little time to look into this problem and
>> found a way to make pg_regress use prepared_xacts_1.out.
>> [...]
>> The patch now passed "make check" in both cases.
>
> That sounds a lot more difficult than just using "make installcheck" and
> configuring the running server with zero prepared xacts ...

It didn't occur to me to use "make installcheck" for this one.

What is strange though is why prepared_xacts_1.out exists at all, since pg_regress.c / make check seems to set max_prepared_transactions on Windows, too.

Is there a way to specify such settings in REGRESS_OPTS?

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
On 01/31/2013 09:55 AM, Zoltán Böszörményi wrote:
> 2013-01-31 15:22 keltezéssel, Alvaro Herrera írta:
>> [...]
>> That sounds a lot more difficult than just using "make installcheck" and
>> configuring the running server with zero prepared xacts ...
>
> It didn't occur to me to use "make installcheck" for this one.
>
> What is strange though is why prepared_xacts_1.out exists
> at all, since pg_regress.c / make check seems to set
> max_prepared_transactions on Windows, too.
>
> Is there a way to specify such settings in REGRESS_OPTS?

You can use the temp_config option to specify a file with non-standard settings that will be appended to the installed postgresql.conf.

cheers

andrew
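For example, a sketch along these lines (it assumes pg_regress's --temp-config switch and a Makefile that passes EXTRA_REGRESS_OPTS through to pg_regress; both may vary by branch):

    echo "max_prepared_transactions = 0" > /tmp/zero_prep.conf
    make check EXTRA_REGRESS_OPTS="--temp-config=/tmp/zero_prep.conf"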
Zoltán Böszörményi <zb@cybertec.at> writes:
> 2013-01-31 15:22 keltezéssel, Alvaro Herrera írta:
>> That sounds a lot more difficult than just using "make installcheck" and
>> configuring the running server with zero prepared xacts ...

> It didn't occur to me to use "make installcheck" for this one.
> What is strange though is why prepared_xacts_1.out exists
> at all, since pg_regress.c / make check seems to set
> max_prepared_transactions on Windows, too.

Alvaro told you why: so that the tests wouldn't report failure in "make installcheck" against a stock-configuration server.

BTW, 99% of the time you can update alternative expected files by applying the same patch to them as you did to the tested version. At least that usually works for me, and it can be a lot easier than arranging to duplicate the environment the alternative file is meant for.

			regards, tom lane
2013-01-31 16:39 keltezéssel, Tom Lane írta:
> Zoltán Böszörményi <zb@cybertec.at> writes:
>> 2013-01-31 15:22 keltezéssel, Alvaro Herrera írta:
>>> That sounds a lot more difficult than just using "make installcheck" and
>>> configuring the running server with zero prepared xacts ...
>> It didn't occur to me to use "make installcheck" for this one.
>> What is strange though is why prepared_xacts_1.out exists
>> at all, since pg_regress.c / make check seems to set
>> max_prepared_transactions on Windows, too.
>
> Alvaro told you why: so that the tests wouldn't report failure in
> "make installcheck" against a stock-configuration server.
>
> BTW, 99% of the time you can update alternative expected files by
> applying the same patch to them as you did to the tested version.
> At least that usually works for me, and it can be a lot easier than
> arranging to duplicate the environment the alternative file is
> meant for.
>
> regards, tom lane

Thanks. A question though: how does "make check" or "make installcheck" choose between the *.out file and its different *_N.out incarnations? I couldn't find traces of prepared_xacts_1.out in any file saying "this is the one to be used in this-and-this configuration". Does the procedure check against all versions, with the least different one reported?

Thanks,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Zoltán Böszörményi <zb@cybertec.at> writes:
> Thanks. A question though: how does "make check" or "make installcheck"
> choose between the *.out file and its different *_N.out incarnations?
> I couldn't find traces of prepared_xacts_1.out in any file saying "this
> is the one to be used in this-and-this configuration". Does the procedure
> check against all versions, with the least different one reported?

Exactly. This is documented, see the "regression tests" chapter of the SGML manual.

			regards, tom lane
2013-01-31 19:38 keltezéssel, Tom Lane írta:
> Zoltán Böszörményi <zb@cybertec.at> writes:
>> Thanks. A question though: how does "make check" or "make installcheck"
>> choose between the *.out file and its different *_N.out incarnations?
>> I couldn't find traces of prepared_xacts_1.out in any file saying "this
>> is the one to be used in this-and-this configuration". Does the procedure
>> check against all versions, with the least different one reported?
>
> Exactly. This is documented, see the "regression tests" chapter of the
> SGML manual.
>
> regards, tom lane

Thanks.

I tested my patch with installcheck and installcheck-parallel using max_prepared_transactions=0 in the server, and it passed that way too.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
2013-01-31 22:12 keltezéssel, Zoltán Böszörményi írta:
> 2013-01-31 19:38 keltezéssel, Tom Lane írta:
>> Zoltán Böszörményi <zb@cybertec.at> writes:
>>> Thanks. A question though: how does "make check" or "make installcheck"
>>> choose between the *.out file and its different *_N.out incarnations?
>>> [...]
>> Exactly. This is documented, see the "regression tests" chapter of the
>> SGML manual.
>>
>> regards, tom lane
>
> Thanks.
>
> I tested my patch with installcheck and installcheck-parallel
> using max_prepared_transactions=0 in the server, and
> it passed that way too.

# ping -c 3 craig.ringer
PING craig.ringer (192.168.1.80) 56(84) bytes of data.
From craig.ringer (192.168.1.80) icmp_seq=1 Destination Host Unreachable
From craig.ringer (192.168.1.80) icmp_seq=2 Destination Host Unreachable
From craig.ringer (192.168.1.80) icmp_seq=3 Destination Host Unreachable

--- craig.ringer ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 1999ms

All details were fixed:
- rebased to recent GIT
- updated regression test for max_prepared_transactions=0
- tested on Windows (by Hari Babu, thanks!)

Why is this patch still in "Waiting on Author" state in the CF app?

Thanks in advance,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Boszormenyi Zoltan <zb@cybertec.at> writes: > Why is this patch still in "Waiting on Author" state in the CF app? If you're the author, you should change it back to "Needs Review" when you've responded to the review. regards, tom lane
2013-02-12 19:35 keltezéssel, Tom Lane írta: > Boszormenyi Zoltan <zb@cybertec.at> writes: >> Why is this patch still in "Waiting on Author" state in the CF app? > If you're the author, you should change it back to "Needs Review" when > you've responded to the review. > > regards, tom lane Maybe it's just me, but I can't see any facility (link or button) to do that. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Boszormenyi Zoltan <zb@cybertec.at> writes:
> 2013-02-12 19:35 keltezéssel, Tom Lane írta:
>> If you're the author, you should change it back to "Needs Review" when
>> you've responded to the review.

> Maybe it's just me, but I can't see any facility (link or button) to do that.

Go to the patch page, click "edit patch", adjust the popdown list for "patch status" ...

			regards, tom lane
2013-02-12 19:45 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb@cybertec.at> writes:
>> 2013-02-12 19:35 keltezéssel, Tom Lane írta:
>>> If you're the author, you should change it back to "Needs Review" when
>>> you've responded to the review.
>> Maybe it's just me, but I can't see any facility (link or button) to do that.
> Go to the patch page, click "edit patch", adjust the popdown list for
> "patch status" ...
>
> regards, tom lane

This is what I am saying: I am logged in, and there is no popdown list for the "Patch Status", only text...

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
On 2013-02-12 19:51:45 +0100, Boszormenyi Zoltan wrote:
> 2013-02-12 19:45 keltezéssel, Tom Lane írta:
>> Boszormenyi Zoltan <zb@cybertec.at> writes:
>>> 2013-02-12 19:35 keltezéssel, Tom Lane írta:
>>>> If you're the author, you should change it back to "Needs Review" when
>>>> you've responded to the review.
>>> Maybe it's just me, but I can't see any facility (link or button) to do that.
>> Go to the patch page, click "edit patch", adjust the popdown list for
>> "patch status" ...
> This is what I am saying: I am logged in, and there is no popdown list
> for the "Patch Status", only text...

That's strange. I changed it for now just to see if there's any problem; it worked fine for me.

Greetings,

Andres Freund

--
Andres Freund                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Boszormenyi Zoltan <zb@cybertec.at> writes:
> 2013-02-12 19:45 keltezéssel, Tom Lane írta:
>> Go to the patch page, click "edit patch", adjust the popdown list for
>> "patch status" ...

> This is what I am saying: I am logged in, and there is no popdown list
> for the "Patch Status", only text...

That sure sounds like the
https://commitfest.postgresql.org/action/patch_view?id=896
page, not the edit-patch page, which is
https://commitfest.postgresql.org/action/patch_form?id=896
There should be popdown lists for both topic and status ...

			regards, tom lane
On 02/12/2013 01:51 PM, Boszormenyi Zoltan wrote: > 2013-02-12 19:45 keltezéssel, Tom Lane írta: >> Boszormenyi Zoltan <zb@cybertec.at> writes: >>> 2013-02-12 19:35 keltezéssel, Tom Lane írta: >>>> If you're the author, you should change it back to "Needs Review" when >>>> you've responded to the review. >>> Maybe it's just me, but I can't see any facility (link or button) to >>> do that. >> Go to the patch page, click "edit patch", adjust the popdown list for >> "patch status" ... >> >> regards, tom lane > > This is what I am saying, I am logged in and there is no popdown list > for the "Patch Status", only text... > > I bet you didn't go to "edit patch" like Tom said to. cheers andrew
2013-02-12 19:58 keltezéssel, Andrew Dunstan írta: > > On 02/12/2013 01:51 PM, Boszormenyi Zoltan wrote: >> 2013-02-12 19:45 keltezéssel, Tom Lane írta: >>> Boszormenyi Zoltan <zb@cybertec.at> writes: >>>> 2013-02-12 19:35 keltezéssel, Tom Lane írta: >>>>> If you're the author, you should change it back to "Needs Review" when >>>>> you've responded to the review. >>>> Maybe it's just me, but I can't see any facility (link or button) to do that. >>> Go to the patch page, click "edit patch", adjust the popdown list for >>> "patch status" ... >>> >>> regards, tom lane >> >> This is what I am saying, I am logged in and there is no popdown list >> for the "Patch Status", only text... >> >> > > I bet you didn't go to "edit patch" like Tom said to. Yes, that was the problem. I missed that link. Thanks. Where can I get brown paper bags in bulk? :-D -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Zoltán,

* Zoltán Böszörményi (zb@cybertec.at) wrote:
> The patch now passed "make check" in both cases.

Is v29 the latest version of this patch..?

Looking through the patch, I've noticed a couple of things:

First off, it's not in context diff format, which is the PG standard for patch submission.

Next, the documentation has a few issues:

- "Heavy-weight" should really be defined in terms of specific lock types or modes. We don't use the 'heavyweight' term anywhere else in the documentation that I've found.

- I'd reword this:

    Abort any statement that tries to acquire a heavy-weight lock on rows,
    pages, tables, indices or other objects and the lock(s) has to wait
    more than the specified number of milliseconds.

  as:

    Abort any statement which waits longer than the specified number of
    milliseconds while attempting to acquire a lock on ...

- I don't particularly like "lock_timeout_option", for a couple of reasons. First, simply, the name is terrible, but beyond that, it strikes me that wanting to set both a 'per-lock timeout' and an 'overall waiting-for-locks timeout' at the same time would be a reasonable use-case. If we're going to have 2 GUCs and we're going to support each of those options, why not just let the user specify values for each?

- This is a bit disingenuous:

    If <literal>NOWAIT</> option is not specified and
    <varname>lock_timeout</varname> is set and the lock or statement
    (depending on <varname>lock_timeout_option</varname>) needs to wait
    more than the specified value in milliseconds, the command reports
    an error after timing out, rather than waiting indefinitely.

  The SELECT would simply continue to wait until the lock is available. That's a bit more specific than 'indefinitely'. Also, we might add a sentence about statement_timeout as well, if we're going to document what can happen if you don't use NOWAIT with your SELECT-FOR-UPDATE. Should we add documentation to the other commands that wait for locks?

- Looks like this was ended mid-thought...:

    + * Lock a semaphore (decrement count), blocking if count would be < 0
    + * until a timeout triggers. Returns true if

- Not a big fan of this:

    + * See notes in PGSemaphoreLock.

- I'm not thrilled with the new API for defining the timeouts. Particularly, I believe the more common convention for passing around arrays of structures is to have an empty entry at the end, which avoids having to remember to update the # of entries every time it gets changed. Of course, another option would be to use our existing linked list implementation and its helper macros, such as our foreach() construct.

- As I've mentioned in other places/times, comments should be about why we're doing something, not what we're doing - the code tells you that. As such, comments like these really aren't great:

    /* Assert request is sane */
    /* Now re-enable the timer, if necessary. */

- Do we really need TimestampTzPlusMicroseconds..?

In general, I like this feature, and a number of the things above are pretty small issues. The main questions, imv, are if we really need both 'options', and, if so, how they should work, and the API for defining timers.

Thanks,

Stephen
2013-02-23 02:55 keltezéssel, Stephen Frost írta:
> Zoltán,
>
> * Zoltán Böszörményi (zb@cybertec.at) wrote:
>> The patch now passed "make check" in both cases.
> Is v29 the latest version of this patch..?

Yes, it was until you came up with your review... ;-)

> Looking through the patch, I've noticed a couple of things:
>
> First off, it's not in context diff format, which is the PG standard for
> patch submission.

Since moving to GIT, this expectation is obsolete. All PG hackers became comfortable with the unified diff format, see references from the list. A lot of hackers post "git diff" patches, and git diff cannot produce a context diff, either.

> Next, the documentation has a few issues:
>
> - "Heavy-weight" should really be defined in terms of specific lock
>   types or modes. We don't use the 'heavyweight' term anywhere else in
>   the documentation that I've found.

That's not strictly true, but it's not widely used either:

$ find . -type f | xargs grep weight | grep heavy
./monitoring.sgml:       <entry>Probe that fires when a request for a heavyweight lock (lmgr lock)
./monitoring.sgml:       <entry>Probe that fires when a request for a heavyweight lock (lmgr lock)

> - I'd reword this:
>
>     Abort any statement that tries to acquire a heavy-weight lock on rows,
>     pages, tables, indices or other objects and the lock(s) has to wait
>     more than the specified number of milliseconds.
>
>   as:
>
>     Abort any statement which waits longer than the specified number of
>     milliseconds while attempting to acquire a lock on ...

OK.

> - I don't particularly like "lock_timeout_option", for a couple of
>   reasons. [...] If we're going to have 2 GUCs and we're going to
>   support each of those options, why not just let the user specify
>   values for each?

OK, suggest names for the 2 GUCs, please.

> - Looks like this was ended mid-thought...:
>
>     + * Lock a semaphore (decrement count), blocking if count would be < 0
>     + * until a timeout triggers. Returns true if

Right.

> - Not a big fan of this:
>
>     + * See notes in PGSemaphoreLock.

Why? Is copy&pasting the *long* comment (a different one in each *_sema.c) or omitting the comments better? Suggest a better comment, and it will be included.

> - I'm not thrilled with the new API for defining the timeouts.
>   Particularly, I believe the more common convention for passing around
>   arrays of structures is to have an empty entry at the end, which
>   avoids having to remember to update the # of entries every time it
>   gets changed. Of course, another option would be to use our existing
>   linked list implementation and its helper macros, such as our
>   foreach() construct.
A linked list might be better; actually, I like it.

> - As I've mentioned in other places/times, comments should be about why
>   we're doing something, not what we're doing - the code tells you that.
>   As such, comments like these really aren't great:
>     /* Assert request is sane */
>     /* Now re-enable the timer, if necessary. */
>
> - Do we really need TimestampTzPlusMicroseconds..?

Well, my code is the only user of this macro, but it's cleaner than explicitly doing

#ifdef HAVE_INT64_TIMESTAMP
    fin_time = timeout_start + delay_ms * (int64) 1000;
#else
    fin_time = timeout_start + delay_ms / 1000.0;
#endif

> In general, I like this feature, and a number of the things above are pretty
> small issues. The main questions, imv, are if we really need both
> 'options', and, if so, how they should work, and the API for defining
> timers.
>
> Thanks,
>
> Stephen

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Zoltán,

* Boszormenyi Zoltan (zb@cybertec.at) wrote:
> On 2013-02-23 02:55, Stephen Frost wrote:
>> First off, it's not in context diff format, which is the PG standard for patch submission.
>
> Since moving to GIT, this expectation is obsolete.

No, it isn't. Patches posted to the list should be in context diff format (and uncompressed unless it's too large) for easier reading. That avoids having to download it, apply it to a git tree and then create the diff ourselves. Indeed, the move to git has had no impact on this at all.

> All PG hackers became comfortable with the unified diff format, see references from the list. A lot of hackers post "git diff" patches which cannot produce context diff, either.

It's quite possible to create a context diff from git- indeed, it's documented on the http://wiki.postgresql.org/wiki/Working_with_Git portion of the wiki, specifically to help people understand how to take the changes they've made in their git forks and create context diffs to post to the mailing lists. The preference for context diffs is also still documented at http://wiki.postgresql.org/wiki/Submitting_a_Patch.

And, to be honest, I'm not sure how familiar you are with the history of PG in this area, but I've been through all of it from the pre-git CVS days, through the migration to git, to the exclusive use of git. I feel quite confident that the preference for context diffs hasn't changed.

> That's not strictly true, but it's not widely used either:
>
> $ find . -type f | xargs grep weight | grep heavy
> ./monitoring.sgml: <entry>Probe that fires when a request for a heavyweight lock (lmgr lock)
> ./monitoring.sgml: <entry>Probe that fires when a request for a heavyweight lock (lmgr lock)

Interesting. That didn't show up in the searches that I was doing through the web interface, though it does now. If we're going to use that term, we should define it in the lock documentation. If not, then we should avoid it everywhere. I'm fine with either.

>> - I don't particularly like "lock_timeout_option", for a couple of reasons. First is simply the name is terrible, but beyond that, it strikes me that wanting to set both a 'per-lock timeout' and an 'overall waiting-for-locks timeout' at the same time would be a reasonable use-case. If we're going to have 2 GUCs and we're going to support each of those options, why not just let the user specify values for each?
>
> OK, suggest names for the 2 GUCs, please.

lock_timeout_wait and lock_timeout_stmt_wait ?

Though I've been really wondering to myself as to if we need both of these options as well as statement_timeout. Perhaps you can explain the use case for each option and how they're distinct from each other? Indeed, the use-case that I'm envisioning is focused around "don't wait more than 'X' for the relation-level locks" which would allow you to distinguish queries that are, most likely anyway, making progress, from those which have been caught behind a lock. That would match your 'per-statement' lock timeout concept, iiuc, though I think it might be more simply implemented.

>> - Not a big fan of this:
>>
>> + * See notes in PGSemaphoreLock.
>
> Why? Is copy&pasting the *long* comment (a different one in each *_sema.c) better, or omitting the comments? Suggest a better comment, and it will be included.

How about, for starters, there's more than one PGSemaphoreLock? Second, as you'll note in posix_sema.c, it's useful to say more than just "look over there".
I'm not suggesting a mass copy/paste, but more than 4 words would be valuable.

>> - Do we really need TimestampTzPlusMicroseconds..?
>
> Well, my code is the only user of this macro, but it's cleaner than explicitly doing

To be honest, I was really wondering why TimestampTzPlusMilliseconds isn't sufficient, not suggesting that we litter the code with #ifdef's.

Thanks,

Stephen
On 02/23/2013 01:15 PM, Boszormenyi Zoltan wrote:
>> First off, it's not in context diff format, which is the PG standard for patch submission.
>
> Since moving to GIT, this expectation is obsolete. All PG hackers became comfortable with the unified diff format, see references from the list. A lot of hackers post "git diff" patches which cannot produce context diff, either.

I am not aware that project policy has changed in the slightest in this regard.

Every unified diff can be turned into a context diff by passing it through "filterdiff --format=context".

cheers

andrew
* Andrew Dunstan (andrew@dunslane.net) wrote:
> Every unified diff can be turned into a context diff by passing it through "filterdiff --format=context".

On that point, annoyingly, it's not accurate for every diff. In particular, when I tried it with the diff for 'v29' of this lock_timeout patch, it only returns the context diff for the first two files:

zcat 2-lock_timeout-v29.patch.gz | grep -- '--- postgresql' | wc -l
22

zcat 2-lock_timeout-v29.patch.gz | \
    filterdiff --format=context | \
    grep '\*\*\* postgresql' | wc -l
2

In the end, I did create a local git branch, commit the patch, then diff it back against master using my context-diff git helper to get something easier to read through. Rather annoying.

Thanks,

Stephen
On 2013-02-24 03:23, Stephen Frost wrote:
> Zoltán,
>
> * Boszormenyi Zoltan (zb@cybertec.at) wrote:
>> On 2013-02-23 02:55, Stephen Frost wrote:
>>> First off, it's not in context diff format, which is the PG standard for patch submission.
>> Since moving to GIT, this expectation is obsolete.
> No, it isn't. Patches posted to the list should be in context diff format (and uncompressed unless it's too large) for easier reading. That avoids having to download it, apply it to a git tree and then create the diff ourselves. Indeed, the move to git has had no impact on this at all.

I remembered this mail from The Master(tm):
http://www.postgresql.org/message-id/21555.1339866293@sss.pgh.pa.us

>> All PG hackers became comfortable with the unified diff format, see references from the list. A lot of hackers post "git diff" patches which cannot produce context diff, either.
> It's quite possible to create a context diff from git- indeed, it's documented on the http://wiki.postgresql.org/wiki/Working_with_Git portion of the wiki, specifically to help people understand how to take the changes they've made in their git forks and create context diffs to post to the mailing lists. The preference for context diffs is also still documented at http://wiki.postgresql.org/wiki/Submitting_a_Patch.
>
> And, to be honest, I'm not sure how familiar you are with the history of PG in this area, but I've been through all of it from the pre-git CVS days, through the migration to git, to the exclusive use of git.

I started with Postgres95, though only using the source tarballs. I started following CVS at around 2003 or so.

> I feel quite confident that the preference for context diffs hasn't changed.

With stats for February 2013:

Name                   ctx  ctx/git  unif/git  unif
Dimitry Fontaine         4        0         1     0
Ian Lawrence Barwick     0        1         0     0
Alvaro Herrera          10        0         5     0
Pavel Stehule            9        0         0     0
Heikki Linnakangas       0        0         7     0
Michael Paquier          0        0         6     0
Andres Freund            0        0         5     0
Alexander Law            0        0         1     0
Etsuro Fujita            0        0         4     0
Amit Kapila              4        0         0     0
Kevin Grittner           2        0         3     0
Gurjeet Singh            0        0         2     0
Erik Rijkers             0        0         0     1
Zoltan Boszormenyi       0        0         2     0
Tomas Vondra             0        0         2     0
Bruce Momjian            0        4         0     0
Fujii Masao              1        0         0     0
David Fetter             5        0         0     0
Jonathan Rogers          0        0         1     0
Andrew Dunstan           2        0         0     0
Manlio Perillo           0        0         1     0
Dean Rasheed             0        1         2     0
Jeff Janes               0        2         1     0
Phil Sorber              0        3         2     0
Mark Wong                0        1         0     0
Ivan Lezhnjov IV         0        0         1     0
Kohei Gagai              0        0         2     0
MauMau                   1        0         0     0
Tom Lane                 0        1         0     0
Sean Chittenden          0        0         2     0
Albe Laurenz             0        1         0     0
Daniel Farina            1        0         0     0
Simon Riggs              0        0         3     0
Peter Eisentraut         0        0         4     0

Plain context diffs: 39
Context diffs generated from git: 14
"git diff" patches: 57
Plain unified diff: 1

Some mails contained more than one patch; these were counted as-is. One patch is one patch.

So, more than half of the recently posted patches come directly from "git diff". The preference has changed. But indeed, plain "diff -u" is clearly in the minority. I can repost my patch from "git diff", too, to be in the majority. :-P

>> That's not strictly true, but it's not widely used either:
>>
>> $ find . -type f | xargs grep weight | grep heavy
>> ./monitoring.sgml: <entry>Probe that fires when a request for a heavyweight lock (lmgr lock)
>> ./monitoring.sgml: <entry>Probe that fires when a request for a heavyweight lock (lmgr lock)
> Interesting. That didn't show up in the searches that I was doing through the web interface, though it does now. If we're going to use that term, we should define it in the lock documentation.
> If not, then we should avoid it everywhere. I'm fine with either.

Me too; the documentation should be consistent. I will remove the "heavyweight" term.

>>> - I don't particularly like "lock_timeout_option", for a couple of reasons. First is simply the name is terrible, but beyond that, it strikes me that wanting to set both a 'per-lock timeout' and an 'overall waiting-for-locks timeout' at the same time would be a reasonable use-case. If we're going to have 2 GUCs and we're going to support each of those options, why not just let the user specify values for each?
>> OK, suggest names for the 2 GUCs, please.
> lock_timeout_wait and lock_timeout_stmt_wait ?

Hm. How about without the "_wait" suffix? Or lock_timeout vs statement_lock_timeout?

> Though I've been really wondering to myself as to if we need both of these options as well as statement_timeout. Perhaps you can explain the use case for each option and how they're distinct from each other?

statement_timeout is the legacy behaviour, and it should be kept. Its behaviour is well understood: the statement finishes under the specified time or it throws an error. The problem from the application point of view is that the error can happen while the tuples are being transferred to the client, so the recordset can be cut in half.

"lock_timeout_stmt" (or lock_timeout_option = per_statement) somewhat extends statement_timeout in that only the time spent waiting on locks is counted, so SELECT FOR UPDATE/etc. may throw an error, but if all rows are collected already, or a plain SELECT is run, the application gets them all. This seems to follow the Microsoft SQL Server semantics:
http://msdn.microsoft.com/en-us/library/ms189470.aspx

The per-lock lock_timeout was the result of a big Informix project ported to PostgreSQL; this follows Informix semantics:
http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqls.doc/sqls788.htm

> Indeed, the use-case that I'm envisioning is focused around "don't wait more than 'X' for the relation-level locks" which would allow you to distinguish queries that are, most likely anyway, making progress, from those which have been caught behind a lock. That would match your 'per-statement' lock timeout concept, iiuc, though I think it might be more simply implemented.
>
>>> - Not a big fan of this:
>>>
>>> + * See notes in PGSemaphoreLock.
>> Why? Is copy&pasting the *long* comment (a different one in each *_sema.c) better, or omitting the comments? Suggest a better comment, and it will be included.
> How about, for starters, there's more than one PGSemaphoreLock? Second, as you'll note in posix_sema.c, it's useful to say more than just "look over there". I'm not suggesting a mass copy/paste, but more than 4 words would be valuable.

OK.

>>> - Do we really need TimestampTzPlusMicroseconds..?
>> Well, my code is the only user of this macro, but it's cleaner than explicitly doing
> To be honest, I was really wondering why TimestampTzPlusMilliseconds isn't sufficient, not suggesting that we litter the code with #ifdef's.

Because Timestamp[Tz] has microsecond resolution, and the timeout GUCs are in milliseconds. Adding up time differences (and rounding or truncating them to milliseconds in the process) would make the per-statement lock timeout lose precision...
> Thanks,
>
> Stephen

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Heikki Linnakangas
On 24.02.2013 05:07, Andrew Dunstan wrote:
> On 02/23/2013 01:15 PM, Boszormenyi Zoltan wrote:
>>> First off, it's not in context diff format, which is the PG standard for patch submission.
>>
>> Since moving to GIT, this expectation is obsolete. All PG hackers became comfortable with the unified diff format, see references from the list. A lot of hackers post "git diff" patches which cannot produce context diff, either.
>
> I am not aware that project policy has changed in the slightest in this regard.

I can't speak for others, but I personally don't care whether a patch is posted in unified or context diff format. Not as a general rule, anyway; patches that modify a few lines here and there are generally more readable in unified format, as the old and new lines are lined up right next to each other:

@@ -329,7 +346,7 @@ foreign_expr_walker(Node *node, foreign_expr_cxt *context)
 static bool
 is_builtin(Oid oid)
 {
-	return (oid < FirstNormalObjectId);
+	return (oid < FirstBootstrapObjectId);
 }

Then again, more complicated patches that modify large chunks of code are often unreadable when the - and + lines are mixed up in the same chunk; those are more readable in context format.

So if you want to be kind to readers, look at the patch and choose the format depending on which one makes it look better. But there's no need to make a point of it when someone posts in the "wrong" format.

> Every unified diff can be turned into a context diff by passing it through "filterdiff --format=context".

Yep. And in emacs, there's the "diff-unified->context" command. I bet most editors have similar functionality these days.

- Heikki
Stephen,

On 2013-02-23 02:55, Stephen Frost wrote:
> Zoltán,
>
> * Zoltán Böszörményi (zb@cybertec.at) wrote:
>> The patch now passed "make check" in both cases.
> Is v29 the latest version of this patch..?

Attached is v30, I hope with everything fixed.

- List based enable/disable_multiple_timeouts()
- separate per-lock and per-statement lock_timeout variants
- modified comments and documentation

The patch is in "git diff" format.

Please review.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
* Boszormenyi Zoltan (zb@cybertec.at) wrote:
> On 2013-02-24 03:23, Stephen Frost wrote:
>> No, it isn't. Patches posted to the list should be in context diff format (and uncompressed unless it's too large) for easier reading. That avoids having to download it, apply it to a git tree and then create the diff ourselves. Indeed, the move to git has had no impact on this at all.
>
> I remembered this mail from The Master(tm):
> http://www.postgresql.org/message-id/21555.1339866293@sss.pgh.pa.us

Which matches exactly what I was saying- context diff provides easier reading for review purposes, which is presumably what you were looking to have happen when you posted this patch to the mailing list. And it's still the documented expectation and project policy, regardless of any individual's email. If we're going to change it, then the documentation, et al, should be updated as well.

For my 2c, I still prefer context diffs posted to the mailing lists, but would also like to see more people posting pull requests. That doesn't make it project policy though.

> So, more than half of the recently posted patches come directly from "git diff". The preference has changed.

No, more people have ignored the project policy than not- that doesn't say anything about the preference or what the policy is.

>> lock_timeout_wait and lock_timeout_stmt_wait ?
>
> Hm. How about without the "_wait" suffix? Or lock_timeout vs statement_lock_timeout?

I could live without the _wait suffix, but it occurs to me that none of these really indicate that statement_lock_timeout is an accumulated timeout. Perhaps it should say 'total lock wait timeout' or similar?

> statement_timeout is the legacy behaviour, and it should be kept. Its behaviour is well understood: the statement finishes under the specified time or it throws an error. The problem from the application point of view is that the error can happen while the tuples are being transferred to the client, so the recordset can be cut in half.
>
> "lock_timeout_stmt" (or lock_timeout_option = per_statement) somewhat extends statement_timeout in that only the time spent waiting on locks is counted, so SELECT FOR UPDATE/etc. may throw an error, but if all rows are collected already, or a plain SELECT is run, the application gets them all. This seems to follow the Microsoft SQL Server semantics:
> http://msdn.microsoft.com/en-us/library/ms189470.aspx

The documentation at that link appears to match what 'lock_timeout' would do. Note that it says 'a lock' here: "When a wait for a lock exceeds the time-out value, an error is returned." Or have you tested the actual behavior and seen it treat this value as an accumulated time across all lock waits for an entire statement?

> The per-lock lock_timeout was the result of a big Informix project ported to PostgreSQL; this follows Informix semantics:
> http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqls.doc/sqls788.htm

I agree that the Informix command looks to be per-lock, but based on my simple reading of both documentation links, I think they're actually the same behavior and neither is about an accumulated time.

> Because Timestamp[Tz] has microsecond resolution, and the timeout GUCs are in milliseconds. Adding up time differences (and rounding or truncating them to milliseconds in the process) would make the per-statement lock timeout lose precision...

Removing the accumulation-based option would fix that.. :D

Thanks,

Stephen
Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Stephen Frost
* Heikki Linnakangas (hlinnakangas@vmware.com) wrote:
> So if you want to be kind to readers, look at the patch and choose the format depending on which one makes it look better. But there's no need to make a point of it when someone posts in the "wrong" format.

To be more precise- my main complaint about this is that this patch is making changes to multi-line comments and to documentation, both of which get very annoying to try and read in unified diff format. Patches that don't do one or the other of those are likely incomplete anyway.

As another point, it's also the very first thing that we document in http://wiki.postgresql.org/wiki/Reviewing_a_Patch to check for.

>> Every unified diff can be turned into a context diff by passing it through "filterdiff --format=context".
>
> Yep. And in emacs, there's the "diff-unified->context" command. I bet most editors have similar functionality these days.

And it probably doesn't work for every patch either, just as filterdiff doesn't (see my other email).

Thanks,

Stephen
On 2013-02-24 15:03, Stephen Frost wrote:
> * Boszormenyi Zoltan (zb@cybertec.at) wrote:
>> On 2013-02-24 03:23, Stephen Frost wrote:
>>> No, it isn't. Patches posted to the list should be in context diff format (and uncompressed unless it's too large) for easier reading. That avoids having to download it, apply it to a git tree and then create the diff ourselves. Indeed, the move to git has had no impact on this at all.
>> I remembered this mail from The Master(tm):
>> http://www.postgresql.org/message-id/21555.1339866293@sss.pgh.pa.us
> Which matches exactly what I was saying- context diff provides easier reading for review purposes, which is presumably what you were looking to have happen when you posted this patch to the mailing list. And it's still the documented expectation and project policy, regardless of any individual's email. If we're going to change it, then the documentation, et al, should be updated as well.
>
> For my 2c, I still prefer context diffs posted to the mailing lists, but would also like to see more people posting pull requests. That doesn't make it project policy though.
>
>> So, more than half of the recently posted patches come directly from "git diff". The preference has changed.
> No, more people have ignored the project policy than not- that doesn't say anything about the preference or what the policy is.
>
>>> lock_timeout_wait and lock_timeout_stmt_wait ?
>> Hm. How about without the "_wait" suffix? Or lock_timeout vs statement_lock_timeout?
> I could live without the _wait suffix, but it occurs to me that none of these really indicate that statement_lock_timeout is an accumulated timeout. Perhaps it should say 'total lock wait timeout' or similar?
>
>> statement_timeout is the legacy behaviour, and it should be kept. Its behaviour is well understood: the statement finishes under the specified time or it throws an error. The problem from the application point of view is that the error can happen while the tuples are being transferred to the client, so the recordset can be cut in half.
>>
>> "lock_timeout_stmt" (or lock_timeout_option = per_statement) somewhat extends statement_timeout in that only the time spent waiting on locks is counted, so SELECT FOR UPDATE/etc. may throw an error, but if all rows are collected already, or a plain SELECT is run, the application gets them all. This seems to follow the Microsoft SQL Server semantics:
>> http://msdn.microsoft.com/en-us/library/ms189470.aspx
> The documentation at that link appears to match what 'lock_timeout' would do. Note that it says 'a lock' here: "When a wait for a lock exceeds the time-out value, an error is returned." Or have you tested the actual behavior and seen it treat this value as an accumulated time across all lock waits for an entire statement?

(Note to myself: don't read with a headache.)
I may have misread the MSDN link.

>> The per-lock lock_timeout was the result of a big Informix project ported to PostgreSQL; this follows Informix semantics:
>> http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqls.doc/sqls788.htm
> I agree that the Informix command looks to be per-lock, but based on my simple reading of both documentation links, I think they're actually the same behavior and neither is about an accumulated time.

You seem to be right.

>> Because Timestamp[Tz] has microsecond resolution, and the timeout GUCs are in milliseconds.
>> Adding up time differences (and rounding or truncating them to milliseconds in the process) would make the per-statement lock timeout lose precision...
> Removing the accumulation-based option would fix that.. :D

To call out the wrath of Tom? No, thanks. :-D
IIRC, he was the one who didn't like the per-lock behaviour the first time he looked at this patch in an earlier form back in 2010/2011, and he proposed this use case instead. If not him, then someone else. I got the idea from this list...

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Claudio Freire
On Sun, Feb 24, 2013 at 11:08 AM, Stephen Frost <sfrost@snowman.net> wrote:
> * Heikki Linnakangas (hlinnakangas@vmware.com) wrote:
>> So if you want to be kind to readers, look at the patch and choose the format depending on which one makes it look better. But there's no need to make a point of it when someone posts in the "wrong" format.
>
> To be more precise- my main complaint about this is that this patch is making changes to multi-line comments and to documentation, both of which get very annoying to try and read in unified diff format. Patches that don't do one or the other of those are likely incomplete anyway.
>
> As another point, it's also the very first thing that we document in http://wiki.postgresql.org/wiki/Reviewing_a_Patch to check for.

TBH, that wiki link seems to suggest that *having context* is the point of the requirement (to be able to merge with fuzz).

Both unified and context formats have context.
Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Andrew Dunstan
On 02/24/2013 12:39 PM, Claudio Freire wrote:
> On Sun, Feb 24, 2013 at 11:08 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> * Heikki Linnakangas (hlinnakangas@vmware.com) wrote:
>>> So if you want to be kind to readers, look at the patch and choose the format depending on which one makes it look better. But there's no need to make a point of it when someone posts in the "wrong" format.
>> To be more precise- my main complaint about this is that this patch is making changes to multi-line comments and to documentation, both of which get very annoying to try and read in unified diff format. Patches that don't do one or the other of those are likely incomplete anyway.
>>
>> As another point, it's also the very first thing that we document in http://wiki.postgresql.org/wiki/Reviewing_a_Patch to check for.
>
> TBH, that wiki link seems to suggest that *having context* is the point of the requirement (to be able to merge with fuzz).
>
> Both unified and context formats have context.

No, you're missing the point. Some people find reading context diffs much easier than reading unified diffs. That's why context diffs are the project's stated preference.

cheers

andrew
Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Peter Geoghegan
On 24 February 2013 08:44, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> I can't speak for others, but I personally don't care whether a patch is posted in unified or context diff format. Not as a general rule, anyway; patches that modify a few lines here and there are generally more readable in unified format, as the old and new lines are lined up right next to each other:

I don't care either. My personal preference is context diff format, but then that's what I usually see anyway. I don't use filterdiff or anything like that. I just have a strong habit of using feature branches extensively, even for patches that I'm reviewing, and my setup makes it easy to create one from a patch file. It's quite a rare occurrence for me to care enough about a patch to want to eyeball the code (and not just read the author's summary), and yet not care about it enough to make a feature branch for it.

I can see how other people's habits might differ from my own here, and that they might reasonably state a preference for unified, which is fine. I developed a preference for unified over time, having originally just used the format on the advice of the wiki.

--
Regards,
Peter Geoghegan
Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Stephen Frost
* Claudio Freire (klaussfreire@gmail.com) wrote:
>> As another point, it's also the very first thing that we document in http://wiki.postgresql.org/wiki/Reviewing_a_Patch to check for.
>
> TBH, that wiki link seems to suggest that *having context* is the point of the requirement (to be able to merge with fuzz).

The PG wiki link states "Is the patch in context diff format?" and provides a link to the wikipedia article about *that specific format*. There's absolutely zero confusion over what "context diff format" means.

Thanks,

Stephen
On 2013-02-24 16:14, Boszormenyi Zoltan wrote:
> On 2013-02-24 15:03, Stephen Frost wrote:
>> * Boszormenyi Zoltan (zb@cybertec.at) wrote:
>>> On 2013-02-24 03:23, Stephen Frost wrote:
>>>> No, it isn't. Patches posted to the list should be in context diff format (and uncompressed unless it's too large) for easier reading. That avoids having to download it, apply it to a git tree and then create the diff ourselves. Indeed, the move to git has had no impact on this at all.
>>> I remembered this mail from The Master(tm):
>>> http://www.postgresql.org/message-id/21555.1339866293@sss.pgh.pa.us
>> Which matches exactly what I was saying- context diff provides easier reading for review purposes, which is presumably what you were looking to have happen when you posted this patch to the mailing list. And it's still the documented expectation and project policy, regardless of any individual's email. If we're going to change it, then the documentation, et al, should be updated as well.
>>
>> For my 2c, I still prefer context diffs posted to the mailing lists, but would also like to see more people posting pull requests. That doesn't make it project policy though.
>>
>>> So, more than half of the recently posted patches come directly from "git diff". The preference has changed.
>> No, more people have ignored the project policy than not- that doesn't say anything about the preference or what the policy is.
>>
>>>> lock_timeout_wait and lock_timeout_stmt_wait ?
>>> Hm. How about without the "_wait" suffix? Or lock_timeout vs statement_lock_timeout?
>> I could live without the _wait suffix, but it occurs to me that none of these really indicate that statement_lock_timeout is an accumulated timeout. Perhaps it should say 'total lock wait timeout' or similar?
>>
>>> statement_timeout is the legacy behaviour, and it should be kept. Its behaviour is well understood: the statement finishes under the specified time or it throws an error. The problem from the application point of view is that the error can happen while the tuples are being transferred to the client, so the recordset can be cut in half.
>>>
>>> "lock_timeout_stmt" (or lock_timeout_option = per_statement) somewhat extends statement_timeout in that only the time spent waiting on locks is counted, so SELECT FOR UPDATE/etc. may throw an error, but if all rows are collected already, or a plain SELECT is run, the application gets them all. This seems to follow the Microsoft SQL Server semantics:
>>> http://msdn.microsoft.com/en-us/library/ms189470.aspx
>> The documentation at that link appears to match what 'lock_timeout' would do. Note that it says 'a lock' here: "When a wait for a lock exceeds the time-out value, an error is returned." Or have you tested the actual behavior and seen it treat this value as an accumulated time across all lock waits for an entire statement?
>
> (Note to myself: don't read with a headache.)
> I may have misread the MSDN link.
>
>>> The per-lock lock_timeout was the result of a big Informix project ported to PostgreSQL; this follows Informix semantics:
>>> http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqls.doc/sqls788.htm
>> I agree that the Informix command looks to be per-lock, but based on my simple reading of both documentation links, I think they're actually the same behavior and neither is about an accumulated time.
>
> You seem to be right.
>>> Because Timestamp[Tz] has microsecond resolution, and the timeout GUCs are in milliseconds. Adding up time differences (and rounding or truncating them to milliseconds in the process) would make the per-statement lock timeout lose precision...
>> Removing the accumulation-based option would fix that.. :D
>
> To call out the wrath of Tom? No, thanks. :-D
> IIRC, he was the one who didn't like the per-lock behaviour the first time he looked at this patch in an earlier form back in 2010/2011, and he proposed this use case instead. If not him, then someone else. I got the idea from this list...

Another question just popped up. Now that

bool enable_multiple_timeouts(List *timeouts);

exists, do we really need the singular versions?

Since the "timeout after N msec" case has per-lock and per-stmt versions, enable_timeout_after() gained a new argument to distinguish between the two, and every existing caller of this function happens to just pass "0" for it. The per-stmt variant is only ever used via enable_multiple_timeouts().

Wouldn't it be better to have a single

bool enable_timeouts(List *timeouts);

instead?

> Best regards,
> Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
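[For readers following along: a minimal sketch of what such a unified, list-based call could look like, using the standard foreach() machinery from nodes/pg_list.h. The TimeoutParams layout shown here is an assumption pieced together from this thread, not the actual patch; only enable_timeout_after(TimeoutId, int) matches the committed timeout.c API.]

#include "nodes/pg_list.h"
#include "utils/timeout.h"

/* assumed shape of one list entry, per the discussion above */
typedef struct TimeoutParams
{
	TimeoutId	id;			/* e.g. STATEMENT_TIMEOUT, or the patch's LOCK_TIMEOUT */
	int			delay_ms;	/* relative timeout in milliseconds */
} TimeoutParams;

/* sketch: arm every timeout in the list from one common starting point */
static bool
enable_timeouts(List *timeouts)
{
	ListCell   *lc;

	foreach(lc, timeouts)
	{
		TimeoutParams *t = (TimeoutParams *) lfirst(lc);

		enable_timeout_after(t->id, t->delay_ms);
	}
	return true;
}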
Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Robert Haas
On Sun, Feb 24, 2013 at 4:31 PM, Stephen Frost <sfrost@snowman.net> wrote:
> * Claudio Freire (klaussfreire@gmail.com) wrote:
>>> As another point, it's also the very first thing that we document in http://wiki.postgresql.org/wiki/Reviewing_a_Patch to check for.
>>
>> TBH, that wiki link seems to suggest that *having context* is the point of the requirement (to be able to merge with fuzz).
>
> The PG wiki link states "Is the patch in context diff format?" and provides a link to the wikipedia article about *that specific format*. There's absolutely zero confusion over what "context diff format" means.

True, but I'm with Heikki: it's a pedantic and unhelpful guideline. Everyone here who reviews patches regularly knows how to, and probably does, convert between those formats with regularity. Making patch submitters feel badly because they've used the "wrong" format does not advance the goals of the project.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Zoltan,

* Boszormenyi Zoltan (zb@cybertec.at) wrote:
> Another question just popped up. Now that
> bool enable_multiple_timeouts(List *timeouts);
> exists, do we really need the singular versions?
>
> Since the "timeout after N msec" case has per-lock and per-stmt versions, enable_timeout_after() gained a new argument to distinguish between the two, and every existing caller of this function happens to just pass "0" for it. The per-stmt variant is only ever used via enable_multiple_timeouts().

For my 2c, I didn't particularly care for changing enable_timeout_after() by adding an extra parameter that ended up being passed as ',0'.. Perhaps make it a wrapper instead of changing the definition, leaving the existing invocations alone?

> Wouldn't it be better to have a single
> bool enable_timeouts(List *timeouts);
> instead?

This might also work though, if everything is updated to use it and it's relatively clean. I realize for the aggregate case, you have to have it, but I really don't like the changes to have to reset the counter either.

Tom, can you comment on your thoughts around this notion of an aggregate time constraint for waiting on locks? As mentioned upthread, I like the idea of having an upper limit on waiting for relation-level locks, but once you're past that, I'm not sure that an aggregate waiting-on-locks timeout is any better than the overall statement-level timeout, and it seems somewhat messy, to me anyway.

Thanks,

Stephen
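[To illustrate the wrapper idea: keep the public signature untouched and route it through an extended variant. The _flags function name and body are hypothetical; only enable_timeout_after(TimeoutId, int) matches the existing timeout.c API.]

/* hypothetical extended form carrying the patch's new flags argument */
void
enable_timeout_after_flags(TimeoutId id, int delay_ms, int flags)
{
	/* body as in the patch */
}

/* existing callers keep calling the unchanged public API */
void
enable_timeout_after(TimeoutId id, int delay_ms)
{
	enable_timeout_after_flags(id, delay_ms, 0);
}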
Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Stephen Frost
* Robert Haas (robertmhaas@gmail.com) wrote:
> True, but I'm with Heikki: it's a pedantic and unhelpful guideline.

Then let's change it, drop the preference, and update the documentation. I'd certainly prefer that to getting shot for pointing out to patch submitters that they're not following our documented guidelines.

> Everyone here who reviews patches regularly knows how to, and probably does, convert between those formats with regularity. Making patch submitters feel badly because they've used the "wrong" format does not advance the goals of the project.

For my part, I'd rather put the onus on the submitter to submit a readable patch in the first place than ask the reviewer and anyone else interested in looking at the patch to fix it. That's even more true when you consider the archives and reading patches through the web interface (though downloading the original mail message has gotten better with the new archive code).

Thanks,

Stephen
Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Andres Freund
On 2013-02-25 09:11:27 -0500, Stephen Frost wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> True, but I'm with Heikki: it's a pedantic and unhelpful guideline.
>
> Then let's change it, drop the preference, and update the documentation.

+1

>> Everyone here who reviews patches regularly knows how to, and probably does, convert between those formats with regularity. Making patch submitters feel badly because they've used the "wrong" format does not advance the goals of the project.
>
> For my part, I'd rather put the onus on the submitter to submit a readable patch in the first place than ask the reviewer and anyone else interested in looking at the patch to fix it. That's even more true when you consider the archives and reading patches through the web interface (though downloading the original mail message has gotten better with the new archive code).

Well, the point is that you cannot satisfy enough people with either choice anyway. I for one feel much more comfortable sending patches in a format that I can actually read without thinking too much, which is the case for unified but definitely not for context. But it's different for others. I gave in and made my mailreader convert all patches to unified for reading; that way I don't care about other people's preferences, for one.

Greetings,

Andres Freund

--
Andres Freund                  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Tom Lane
Stephen Frost <sfrost@snowman.net> writes:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> True, but I'm with Heikki: it's a pedantic and unhelpful guideline.
>
> Then let's change it, drop the preference, and update the documentation.

I think we should drop the hard requirement for context format, and instead say that it must not be plain (context-free) diff, since that clearly *is* a hard requirement.

However, I liked the upthread suggestion (I think it was from Heikki) that we recommend that submitters actually take a moment to think about which format is more readable for their particular patch. Readability is important not only to help people who just give the patch a quick eyeball, but also to help the inevitable situations where hunks have to be applied by hand because the underlying code has changed. The less readable the patch, the more likely an error in doing that. (And I trust we've all learned by now that git isn't so good at merging that this isn't a problem.)

regards, tom lane
Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Stephen Frost
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>> * Robert Haas (robertmhaas@gmail.com) wrote:
>>> True, but I'm with Heikki: it's a pedantic and unhelpful guideline.
>>
>> Then let's change it, drop the preference, and update the documentation.
>
> I think we should drop the hard requirement for context format, and instead say that it must not be plain (context-free) diff, since that clearly *is* a hard requirement.

Alright, I'll start making those changes.

> However, I liked the upthread suggestion (I think it was from Heikki) that we recommend that submitters actually take a moment to think about which format is more readable for their particular patch. Readability is important not only to help people who just give the patch a quick eyeball, but also to help the inevitable situations where hunks have to be applied by hand because the underlying code has changed. The less readable the patch, the more likely an error in doing that. (And I trust we've all learned by now that git isn't so good at merging that this isn't a problem.)

I'll include that point in my changes, but I consider the chances of that actually happening with any regularity to be essentially zero. Reducing our requirement to a level where the default passes means that nearly everyone who hasn't already changed things to use a non-default automatically is going to just use the default.

Thanks,

Stephen
Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Boszormenyi Zoltan
On 2013-02-25 15:25, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>> * Robert Haas (robertmhaas@gmail.com) wrote:
>>> True, but I'm with Heikki: it's a pedantic and unhelpful guideline.
>> Then let's change it, drop the preference, and update the documentation.
> I think we should drop the hard requirement for context format, and instead say that it must not be plain (context-free) diff, since that clearly *is* a hard requirement.
>
> However, I liked the upthread suggestion (I think it was from Heikki) that we recommend that submitters actually take a moment to think about which format is more readable for their particular patch. Readability is important not only to help people who just give the patch a quick eyeball, but also to help the inevitable situations where hunks have to be applied by hand because the underlying code has changed. The less readable the patch, the more likely an error in doing that.

+1

I think readability is mostly the de-facto threshold anyway. Quite a few plain "git diff" patches were committed during this CF, and those were readable in that format, even biggies like "teach receivexlog to switch timelines", which (I just browsed it) had parts that rewrote code in a way that left the diff with pieces of different indentation intermixed. That can amount to being unreadable at times, but it also depends on the reader.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Re: unified vs context diffs (was Re: Strange Windows problem, lock_timeout test request)
From: Stephen Frost
* Stephen Frost (sfrost@snowman.net) wrote:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> I think we should drop the hard requirement for context format, and instead say that it must not be plain (context-free) diff, since that clearly *is* a hard requirement.
>
> Alright, I'll start making those changes.

I've updated the wiki pages accordingly, though there's now a number of translated pages which also need updating to reflect this change.

Thanks,

Stephen
Zoltan,

* Boszormenyi Zoltan (zb@cybertec.at) wrote:
> attached is v30, I hope with everything fixed.

Making progress, certainly.

Given the hack to the API of enable_timeout_after() and the need for timeout_reset_base_time(), I'm just going to voice my objection to the "accumulated wait time on locks" portion again. I still like the idea of a timeout for waiting on relation-level locks, as we acquire those all up-front and we'd be able to just set a timeout at the appropriate point and then release it when we're past acquiring the relation-level locks. Seems like that would be much cleaner.

On the other hand, if we're going to go down this route, I'm really starting to wonder if it should be the timeout system's responsibility to keep track of such accumulated time.

Other than that..

> - List based enable/disable_multiple_timeouts()

That looks good, like the use of foreach(), etc, but I don't like how you've set up delay_ms as a pointer..? Looks to be to allow you to initialize the TimeoutParams structs early in proc.c..? Is there another reason it needs to be a pointer that I'm missing? Why not build the TimeoutParams structures in the if() blocks that check if the GUCs are set..?

> - separate per-lock and per-statement lock_timeout variants
> - modified comments and documentation

Thanks.

Stephen
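[Building the entries in place, as suggested, might look roughly like this. LockTimeout is the patch's GUC and the TimeoutParams fields are assumed as in the earlier sketch, so treat this as a shape, not the patch itself.]

List	   *timeouts = NIL;

if (LockTimeout > 0)
{
	TimeoutParams *t = (TimeoutParams *) palloc(sizeof(TimeoutParams));

	t->id = LOCK_TIMEOUT;		/* TimeoutId added by the patch */
	t->delay_ms = LockTimeout;	/* plain value, no pointer indirection */
	timeouts = lappend(timeouts, t);
}

if (timeouts != NIL)
	enable_timeouts(timeouts);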
On 2013-02-27 04:06, Stephen Frost wrote:
> Zoltan,
>
> * Boszormenyi Zoltan (zb@cybertec.at) wrote:
>> attached is v30, I hope with everything fixed.
> Making progress, certainly.
>
> Given the hack to the API of enable_timeout_after() and the need for timeout_reset_base_time(), I'm just going to voice my objection to the "accumulated wait time on locks" portion again. I still like the idea of a timeout for waiting on relation-level locks, as we acquire those all up-front and we'd be able to just set a timeout at the appropriate point and then release it when we're past acquiring the relation-level locks. Seems like that would be much cleaner.

Yes, it would be cleaner. But this way, the total time the statement would wait for locks is N * lock_timeout - <some delta> in the worst case, when all the locks can be acquired just under the set timeout interval, N being the number of locks the statement wants to acquire. For some use cases, it's better to have a known limit on the total amount of wait time. But unlike statement_timeout, with lock_timeout_stmt the statement can still finish after this limit, as it does useful work besides waiting for locks.

> On the other hand, if we're going to go down this route, I'm really starting to wonder if it should be the timeout system's responsibility to keep track of such accumulated time.

Thinking about it a bit more, I start to agree. It's not likely that any new timeout sources will get added to proc.c that have anything to do with waiting across multiple locks. From this POV, the accumulated time can be tracked by proc.c itself. But this makes it necessary to reschedule the timer from the ProcSleep() loop, so it increases the number of setitimer() calls. With clever coding, though, the "it_interval" part of struct itimerval can be used to decrease the number of setitimer() calls, so it may balance out.

Another thought is that there is no need for an extra function to set the start time for the timeouts. It can be folded into enable_timeout_after(), enable_timeout_at() and enable_multiple_timeouts(), and that simplifies the API, too.

Since setitimer() has microsecond resolution, I wonder whether the timeout.c code shouldn't accept that, too, especially if we want to keep the per-statement accumulated version of the lock timeout. Then the time to wait that can be represented using int32 would be about 35.8 minutes at most; we would need to use int64 if the maximum number of milliseconds is to stay at INT_MAX, which I guess it should.

Comments?

> Other than that..
>
>> - List based enable/disable_multiple_timeouts()
> That looks good, like the use of foreach(), etc, but I don't like how you've set up delay_ms as a pointer..? Looks to be to allow you to initialize the TimeoutParams structs early in proc.c..? Is there another reason it needs to be a pointer that I'm missing? Why not build the TimeoutParams structures in the if() blocks that check if the GUCs are set..?

It's fixed in my tree, and I also added an Assert to enable_multiple_timeouts() to make sure all timeout sources use either TIMEOUT_AT or TIMEOUT_AFTER. Some more comments were also fixed. But I would like to send out the new patch after the above questions are settled.

I also want to thank you for your comments. It's good to have a fresh look at this code.

>> - separate per-lock and per-statement lock_timeout variants
>> - modified comments and documentation
> Thanks.
> Stephen

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
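[For what it's worth, the it_interval trick mentioned above is plain POSIX setitimer() usage: after the initial expiry the kernel re-arms the timer by itself, so the SIGALRM handler can re-check its deadline without another system call. A minimal sketch; the function and variable names are invented for illustration.]

#include <string.h>
#include <sys/time.h>

static void
arm_repeating_timer(int delay_ms)
{
	struct itimerval t;

	memset(&t, 0, sizeof(t));
	t.it_value.tv_sec = delay_ms / 1000;
	t.it_value.tv_usec = (delay_ms % 1000) * 1000;
	/* fire again with the same period; no further setitimer() needed */
	t.it_interval = t.it_value;
	setitimer(ITIMER_REAL, &t, NULL);
}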
* Boszormenyi Zoltan (zb@cybertec.at) wrote:
> But unlike statement_timeout, with lock_timeout_stmt the statement can still finish after this limit, as it does useful work besides waiting for locks.

It's still entirely possible to get 99% done and then hit that last tuple that you need a lock on and just tip over the lock_timeout_stmt limit due to prior waiting, ending up wasting a bunch of work, hence why I'm not entirely sure that this is that much better than statement_timeout.

> Thinking about it a bit more, I start to agree. It's not likely that any new timeout sources will get added to proc.c that have anything to do with waiting across multiple locks. From this POV, the accumulated time can be tracked by proc.c itself. But this makes it necessary to reschedule the timer from the ProcSleep() loop, so it increases the number of setitimer() calls. With clever coding, though, the "it_interval" part of struct itimerval can be used to decrease the number of setitimer() calls, so it may balance out.

We're not even going down this code path until we're already waiting on a lock from someone, right? Not sure that we need to stress out too much about calling setitimer().

> Another thought is that there is no need for an extra function to set the start time for the timeouts. It can be folded into enable_timeout_after(), enable_timeout_at() and enable_multiple_timeouts(), and that simplifies the API, too.

Right, back to how the API was originally, for the most part, no? :)

> Since setitimer() has microsecond resolution, I wonder whether the timeout.c code shouldn't accept that, too, especially if we want to keep the per-statement accumulated version of the lock timeout. Then the time to wait that can be represented using int32 would be about 35.8 minutes at most; we would need to use int64 if the maximum number of milliseconds is to stay at INT_MAX, which I guess it should.
>
> Comments?

Wouldn't that impact statement_timeout also..? I can definitely see someone wanting to set that at, say, an hour. If anything, I continue to feel that going the *other* way makes more sense- keep everything at milliseconds and just floor() down to the ms when we're doing accounting based on microsecond information. Sure, we might end up waiting a bit (a very small bit..) longer than we were supposed to, but I hardly see that as being a major complaint.

Thanks,

Stephen
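[Concretely, the flooring described here is just integer truncation from the microsecond clock back to millisecond bookkeeping; a trivial sketch with invented names (int64 is PostgreSQL's 64-bit typedef from c.h):]

static int
elapsed_ms_floor(int64 elapsed_us)
{
	/* truncate: we may end up waiting a hair longer than requested */
	return (int) (elapsed_us / 1000);
}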
Stephen Frost <sfrost@snowman.net> writes:
> Tom, can you comment on your thoughts around this notion of an aggregate time constraint for waiting on locks? As mentioned upthread, I like the idea of having an upper limit on waiting for relation-level locks, but once you're past that, I'm not sure that an aggregate waiting-on-locks timeout is any better than the overall statement-level timeout, and it seems somewhat messy, to me anyway.

I think anything that makes this patch simpler is a good idea. I don't like any of the accum_time stuff: it complicates the timeout API unreasonably and slows down existing use cases.

Some other thoughts:

* timeout_reset_base_time() seems like an ugly and unnecessary API wart. I don't like the timeout_start state variable at all; if you need several timeouts to be scheduled relative to the exact same starting point, can't you just do that in a single enable_multiple_timeouts() call? And I think the optional TMPARAM_ACCUM action is completely unacceptable, because it supposes that every user of a timeout, now and in the future, is okay with having their accumulated time reset at the same point. The whole point of having invented this timeout API is to serve timeout uses we don't currently foresee, so actions that affect every timeout seem pretty undesirable.

* I don't care for changing the API of enable_timeout_after when there is in fact not a single caller using the flags argument (probably because the only defined flag is too baroque to be useful). If there were a use case for the "accum" action it'd be better to have a separate API function for it, probably.

regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> I think anything that makes this patch simpler is a good idea. I don't like any of the accum_time stuff: it complicates the timeout API unreasonably and slows down existing use cases.

Right, I think we're on the same page there- I had just commented to Zoltan that tracking the accumulated time shouldn't be the timeout system's responsibility and that the timeout API really shouldn't need to be changed. I'm not convinced that the lock-time-accumulation-timeout capability is really all that valuable in the first place though, but perhaps I'm in the minority on that.

Thanks,

Stephen
On 2013-02-27 19:07, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>> Tom, can you comment on your thoughts around this notion of an aggregate time constraint for waiting on locks? As mentioned upthread, I like the idea of having an upper limit on waiting for relation-level locks, but once you're past that, I'm not sure that an aggregate waiting-on-locks timeout is any better than the overall statement-level timeout, and it seems somewhat messy, to me anyway.
> I think anything that makes this patch simpler is a good idea. I don't like any of the accum_time stuff: it complicates the timeout API unreasonably and slows down existing use cases.

Perfect. :-)

> Some other thoughts:
>
> * timeout_reset_base_time() seems like an ugly and unnecessary API wart. I don't like the timeout_start state variable at all; if you need several timeouts to be scheduled relative to the exact same starting point, can't you just do that in a single enable_multiple_timeouts() call? And I think the optional TMPARAM_ACCUM action is completely unacceptable,

If we get rid of the per-statement variant, there is no need for that either.

> because it supposes that every user of a timeout, now and in the future, is okay with having their accumulated time reset at the same point. The whole point of having invented this timeout API is to serve timeout uses we don't currently foresee, so actions that affect every timeout seem pretty undesirable.
>
> * I don't care for changing the API of enable_timeout_after when there is in fact not a single caller using the flags argument (probably because the only defined flag is too baroque to be useful). If there were a use case for the "accum" action it'd be better to have a separate API function for it, probably.

This way, enable_timeout_after() wouldn't have this extra argument either.

Thanks for your kind words.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Zoltan,

* Boszormenyi Zoltan (zb@cybertec.at) wrote:
> If we get rid of the per-statement variant, there is no need for that either.

For my 2c, I didn't see Tom's comments as saying that we shouldn't have that capability, just that the implementation was ugly. :)

That said, perhaps we should just drop it for now, get the lock_timeout piece solid, and then come back to the question about lock_timeout_stmt.

Thanks,

Stephen
On 2013-02-27 20:06, Stephen Frost wrote:
> Zoltan,
>
> * Boszormenyi Zoltan (zb@cybertec.at) wrote:
>> If we get rid of the per-statement variant, there is no need for that either.
> For my 2c, I didn't see Tom's comments as saying that we shouldn't have that capability, just that the implementation was ugly. :)

But I am happy to drop it. ;-)

> That said, perhaps we should just drop it for now, get the lock_timeout piece solid, and then come back to the question about lock_timeout_stmt.

OK, let's do it this way.

> Thanks,
>
> Stephen

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
On 2013-02-27 20:38, Boszormenyi Zoltan wrote:
> On 2013-02-27 20:06, Stephen Frost wrote:
>> Zoltan,
>>
>> * Boszormenyi Zoltan (zb@cybertec.at) wrote:
>>> If we get rid of the per-statement variant, there is no need for that either.
>> For my 2c, I didn't see Tom's comments as saying that we shouldn't have that capability, just that the implementation was ugly. :)
>
> But I am happy to drop it. ;-)
>
>> That said, perhaps we should just drop it for now, get the lock_timeout piece solid, and then come back to the question about lock_timeout_stmt.
>
> OK, let's do it this way.

Dropped the per-statement lock timeout for now. The patch is now obviously simpler and shorter. I renamed enable/disable_multiple_timeouts() to simply enable/disable_timeouts(), since the List * argument already implies more than one of them and you need to type less.

The comments and the documentation need another review, to make sure I left no traces of the per-statement variant. I can't see any, but I have stared at this patch for so long that I can't be sure anymore.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
On 02/27/2013 09:58 PM, Stephen Frost wrote:
> * Boszormenyi Zoltan (zb@cybertec.at) wrote:
>> But unlike statement_timeout,
>> with lock_timeout_stmt the statement can still finish after this limit
>> as it does useful work besides waiting for locks.
>
> It's still entirely possible to get 99% done and then hit that last
> tuple that you need a lock on and just tip over the lock_timeout_stmt
> limit due to prior waiting and ending up wasting a bunch of work, hence
> why I'm not entirely sure that this is that much better than
> statement_timeout.

There are questions about whether this is a good idea, and there's still discussion ongoing. It doesn't look like it's in a state where we can confidently say "let's include this for 9.3" to me, but I'd like other viewpoints.

Should we bump this to the next CF? It's clearly still a viable idea, just possibly not ready yet.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Craig, * Craig Ringer (craig@2ndquadrant.com) wrote: > There are questions about whether this is a good idea, and there's still > discussion ongoing. It doesn't look like it's in a state where we can > confidently say "let's include this for 9.3" to me, but I'd like other > viewpoints. For my part, I think the straight-up 'lock_timeout' piece, which is in the latest patch, is in pretty good shape. It's much less invasive and provides a capability which other RDBMSs have and is reasonably straightforward. > Should we bump this to the next CF? It's clearly still a viable idea, > just possibly not ready yet. Unless a committer steps up to take on the statement-level lock-wait timeout, it's not going to get into 9.3, imv. Waiting to add that (whatever it ends up being) until post-9.3 makes sense to me, but I'd hate to miss getting the simpler lock_timeout capability into 9.3. Thanks, Stephen
On Wed, Feb 27, 2013 at 8:58 AM, Stephen Frost <sfrost@snowman.net> wrote: > * Boszormenyi Zoltan (zb@cybertec.at) wrote: >> But unlike statement_timeout, >> with lock_timeout_stmt the statement can still finish after this limit >> as it does useful work besides waiting for locks. > > It's still entirely possible to get 99% done and then hit that last > tuple that you need a lock on and just tip over the lock_timeout_stmt > limit due to prior waiting and ending up wasting a bunch of work, hence > why I'm not entirely sure that this is that much better than > statement_timeout. I tend to agree that this should be based on the length of any individual lock wait, not the cumulative duration of lock waits. Otherwise, it seems like it'll be very hard to set this to a meaningful value. For example, if you set this to 1 minute, and that means the length of any single wait, then you basically know that it'll only kick in if there is some other, long-running transaction that's holding the lock. But if it means the cumulative length of all waits, it's not so clear, because now you might also have this kick in if you wait for 100ms on 600 different occasions. In other words, complex queries that lock or update many tuples may get killed even if they never wait very long at all for any single lock. That seems like it will be almost indistinguishable from random, unprincipled query cancellations. Now, you could try to work around that by varying the setting based on the complexity of the query you're about to run, but that seems like a pain in the neck, to put it mildly. And it might still not give you the behavior that you want. Personally, I'd think a big part of the appeal of this is to make sure that you don't hang waiting for a RELATION level lock for too long before giving up. And for that, scaling with the complexity of the query would be exactly the wrong thing to do, even if you could figure out some system for it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Feb 27, 2013 at 8:58 AM, Stephen Frost <sfrost@snowman.net> wrote: >> It's still entirely possible to get 99% done and then hit that last >> tuple that you need a lock on and just tip over the lock_timeout_stmt >> limit due to prior waiting and ending up wasting a bunch of work, hence >> why I'm not entirely sure that this is that much better than >> statement_timeout. > I tend to agree that this should be based on the length of any > individual lock wait, not the cumulative duration of lock waits. > Otherwise, it seems like it'll be very hard to set this to a > meaningful value. For example, if you set this to 1 minute, and that > means the length of any single wait, then you basically know that > it'll only kick in if there is some other, long-running transaction > that's holding the lock. But if it means the cumulative length of all > waits, it's not so clear, because now you might also have this kick in > if you wait for 100ms on 600 different occasions. In other words, > complex queries that lock or update many tuples may get killed even if > they never wait very long at all for any single lock. That seems like > it will be almost indistinguishable from random, unprincipled query > cancellations. Yeah. I'm also unconvinced that there's really much use-case territory here that statement_timeout doesn't cover well enough. To have a case that statement-level lock timeout covers and statement_timeout doesn't, you need to suppose that you know how long the query can realistically wait for all locks together, but *not* how long it's going to run in the absence of lock delays. That seems a bit far-fetched, particularly when thinking of row-level locks, whose cumulative timeout would presumably need to scale with the number of rows the query will visit. If statement-level lock timeouts were cheap to add, that would be one thing; but given that they're complicating the code materially, I think we need a more convincing argument for them. regards, tom lane
2013-03-06 19:53 keltezéssel, Tom Lane írta: > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, Feb 27, 2013 at 8:58 AM, Stephen Frost <sfrost@snowman.net> wrote: >>> It's still entirely possible to get 99% done and then hit that last >>> tuple that you need a lock on and just tip over the lock_timeout_stmt >>> limit due to prior waiting and ending up wasting a bunch of work, hence >>> why I'm not entirely sure that this is that much better than >>> statement_timeout. >> I tend to agree that this should be based on the length of any >> individual lock wait, not the cumulative duration of lock waits. >> Otherwise, it seems like it'll be very hard to set this to a >> meaningful value. For example, if you set this to 1 minute, and that >> means the length of any single wait, then you basically know that >> it'll only kick in if there is some other, long-running transaction >> that's holding the lock. But if it means the cumulative length of all >> waits, it's not so clear, because now you might also have this kick in >> if you wait for 100ms on 600 different occasions. In other words, >> complex queries that lock or update many tuples may get killed even if >> they never wait very long at all for any single lock. That seems like >> it will be almost indistinguishable from random, unprincipled query >> cancellations. > Yeah. I'm also unconvinced that there's really much use-case territory > here that statement_timeout doesn't cover well enough. To have a case > that statement-level lock timeout covers and statement_timeout doesn't, > you need to suppose that you know how long the query can realistically > wait for all locks together, but *not* how long it's going to run in the > absence of lock delays. That seems a bit far-fetched, particularly when > thinking of row-level locks, whose cumulative timeout would presumably > need to scale with the number of rows the query will visit. > > If statement-level lock timeouts were cheap to add, that would be one > thing; but given that they're complicating the code materially, I think > we need a more convincing argument for them. OK, so it's not wanted. Surprise, surprise, it was already dropped from the patch. Can you _please_ review the last patch and comment on it instead of its past state? Thanks and best regards, Zoltán Böszörményi > > regards, tom lane > > -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Boszormenyi Zoltan <zb@cybertec.at> writes:
> [ 2-lock_timeout-v33.patch ]

I looked at this patch a bit. I don't understand why you've chosen to alter the API of the enable_timeout variants to have a bool result that says "I didn't bother to process the timeout because it would have fired immediately". That is not an advantage for any call site that I can see: it just means that the caller needs an extra, very difficult to test code path to handle what would normally be handled by a timeout interrupt. Even if it were a good API choice, you've broken a number of existing call sites that the patch fails to touch (eg in postmaster.c and postgres.c). It's not acceptable to define a failure return from enable_timeout_after and then let callers assume that the failure can't happen. Please undo that.

Also, I'm not really enamored of the choice to use List* infrastructure for enable_timeouts(). That doesn't appear to be especially convenient for any caller, and what it does do is create memory-leak risks for most of them (if they forget to free the list cells, perhaps as a result of an error exit). I think a local-variable array of TimeoutParams[] structs would serve better for most use-cases.

Another thought is that the PGSemaphoreTimedLock implementations all share the same bug, which is that if the "condition" callback returns true immediately after we acquire the semaphore, they will all wrongly return false indicating that the semaphore wasn't acquired. (BTW, I do not like either the variable name "condition" or the typedef name "TimeoutCondition" for something that's actually a callback function pointer.)

In the same vein, this comment change:

     * NOTE: Think not to put any shared-state cleanup after the call to
     * ProcSleep, in either the normal or failure path.  The lock state must
!    * be fully set by the lock grantor, or by CheckDeadLock if we give up
!    * waiting for the lock.  This is necessary because of the possibility
!    * that a cancel/die interrupt will interrupt ProcSleep after someone else
!    * grants us the lock, but before we've noticed it.  Hence, after granting,
!    * the locktable state must fully reflect the fact that we own the lock;
!    * we can't do additional work on return.
     *
     * We can and do use a PG_TRY block to try to clean up after failure, but
     * this still has a major limitation: elog(FATAL) can occur while waiting
--- 1594,1606 ----
    /*
     * NOTE: Think not to put any shared-state cleanup after the call to
     * ProcSleep, in either the normal or failure path.  The lock state must
!    * be fully set by the lock grantor, or by CheckDeadLock or by ProcSleep
!    * itself in case a timeout is detected or if we give up waiting for the lock.
!    * This is necessary because of the possibility that a cancel/die interrupt
!    * will interrupt ProcSleep after someone else grants us the lock, but
!    * before we've noticed it.  Hence, after granting, the locktable state must
!    * fully reflect the fact that we own the lock; we can't do additional work
!    * on return.
     *

suggests that you've failed to grasp the fundamental race-condition problem here. ProcSleep can't do cleanup after the fact any more than its callers can, because otherwise it has a race condition with some other process deciding to grant it the lock at about the same time its timeout goes off. I think the changes in ProcSleep that alter the state-cleanup behavior are just plain wrong, and what you need to do is make that look more like the existing mechanisms that clean up when statement timeout occurs.

regards, tom lane
2013-03-15 18:53 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb@cybertec.at> writes:
>> [ 2-lock_timeout-v33.patch ]
> I looked at this patch a bit. I don't understand why you've chosen to
> alter the API of the enable_timeout variants to have a bool result that
> says "I didn't bother to process the timeout because it would have fired
> immediately".

I don't know either...

> That is not an advantage for any call site that I can
> see: it just means that the caller needs an extra, very difficult to
> test code path to handle what would normally be handled by a timeout
> interrupt. Even if it were a good API choice, you've broken a number of
> existing call sites that the patch fails to touch (eg in postmaster.c
> and postgres.c). It's not acceptable to define a failure return from
> enable_timeout_after and then let callers assume that the failure can't
> happen.
>
> Please undo that.

Done.

> Also, I'm not really enamored of the choice to use List* infrastructure
> for enable_timeouts(). That doesn't appear to be especially convenient
> for any caller, and what it does do is create memory-leak risks for most
> of them (if they forget to free the list cells, perhaps as a result of
> an error exit). I think a local-variable array of TimeoutParams[]
> structs would serve better for most use-cases.

Changed. However, the first member of the structure is "TimeoutId id" and a sensible end-of-array value can be -1. Some versions of GCC (maybe other compilers, too) complain if a constant is assigned to an enum which is outside of the declared values of the enum. So I added a "NO_TIMEOUT = -1" to enum TimeoutId. Comments?

> Another thought is that the PGSemaphoreTimedLock implementations all
> share the same bug, which is that if the "condition" callback returns
> true immediately after we acquire the semaphore, they will all wrongly
> return false indicating that the semaphore wasn't acquired.

You are right. Fixed.

> (BTW, I do not like either the variable name "condition" or the typedef name
> "TimeoutCondition" for something that's actually a callback function
> pointer.)

How about "TimeoutCallback" and "callback_fn"?

> In the same vein, this comment change:
> [ the ProcSleep comment diff quoted above ]
> suggests that you've failed to grasp the fundamental race-condition
> problem here. ProcSleep can't do cleanup after the fact any more than
> its callers can, because otherwise it has a race condition with some
> other process deciding to grant it the lock at about the same time its
> timeout goes off. I think the changes in ProcSleep that alter the
> state-cleanup behavior are just plain wrong, and what you need to do
> is make that look more like the existing mechanisms that clean up when
> statement timeout occurs.

This was the most enlightening comment up to now. It seems no one else understood the timer code but you... Thanks very much. I hope the way I did it is right. I factored out the core of StatementCancelHandler() into a common function that can be used by the lock_timeout interrupt as its timeout callback function. Now the code doesn't need PGSemaphoreTimedLock().

While I was thinking about how this thing works, I recognized that I also need to set the timeout indicator to false after checking it in ProcessInterrupts(). The reason it's needed is this scenario:

1. lock_timeout is set and the transaction throws its error
2. lock_timeout is unset before the next transaction
3. the user presses Ctrl-C during the next transaction

In this case, the second transaction would report the lock timeout error, since the indicator was still set. The other way would be to call disable_all_timeouts(false) unconditionally in ProcessInterrupts(), but setting only the indicator is faster and it doesn't interfere with unknown registered timeouts. Also, the reason it's disable_timeout_indicator() instead of a generic set_timeout_indicator() is that the generic one may be abused to produce false positives.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
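A sketch of the check-then-clear step in ProcessInterrupts() described above, assuming a get_timeout_indicator() to read the flag alongside the disable_timeout_indicator() mentioned in the mail; the committed API may differ:

if (get_timeout_indicator(LOCK_TIMEOUT))
{
    /* Reset now, so a later Ctrl-C is not misreported as a lock timeout. */
    disable_timeout_indicator(LOCK_TIMEOUT);
    ereport(ERROR,
            (errcode(ERRCODE_LOCK_NOT_AVAILABLE),
             errmsg("canceling statement due to lock timeout")));
}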
Boszormenyi Zoltan <zb@cybertec.at> writes: > 2013-03-15 18:53 keltezéssel, Tom Lane írta: >> Also, I'm not really enamored of the choice to use List* infrastructure >> for enable_timeouts(). > Changed. However, the first member of the structure is > "TimeoutId id" and a sensible end-of-array value can be -1. > Some versions of GCC (maybe other compilers, too) complain > if a constant is assigned to an enum which is outside of the > declared values of the enum. So I added a "NO_TIMEOUT = -1" > to enum TimeoutId. Comments? I was thinking more of having array pointer and count parameters, ie enable_timeouts(TimeoutParams *timeouts, int n). I guess we could do it with sentinels instead but not sure that's better. The sentinel approach might be all right if there was another reason to have an "invalid" value in the enum, but I'm not seeing one ATM. > I hope the way I did it is right. I factored out the core of > StatementCancelHandler() into a common function that can > be used by the lock_timeout interrupt as its timeout callback > function. Now the code doesn't need PGSemaphoreTimedLock(). Hm, not needing PGSemaphoreTimedLock at all is an improvement for sure. Don't have time right now to look closer though. regards, tom lane
2013-03-16 17:42 keltezéssel, Tom Lane írta: > Boszormenyi Zoltan <zb@cybertec.at> writes: >> 2013-03-15 18:53 keltezéssel, Tom Lane írta: >>> Also, I'm not really enamored of the choice to use List* infrastructure >>> for enable_timeouts(). >> Changed. However, the first member of the structure is >> "TimeoutId id" and a sensible end-of-array value can be -1. >> Some versions of GCC (maybe other compilers, too) complain >> if a constant is assigned to an enum which is outside of the >> declared values of the enum. So I added a "NO_TIMEOUT = -1" >> to enum TimeoutId. Comments? > I was thinking more of having array pointer and count parameters, ie > enable_timeouts(TimeoutParams *timeouts, int n) > I guess we could do it with sentinels instead but not sure that's > better. > > The sentinel approach might be all right if there was another reason > to have an "invalid" value in the enum, but I'm not seeing one ATM. Stephen Frost was against the array pointer/count variant, it was done that way earlier. Let me redo it again. :-) > >> I hope the way I did it is right. I factored out the core of >> StatementCancelHandler() into a common function that can >> be used by the lock_timeout interrupt as its timeout callback >> function. Now the code doesn't need PGSemaphoreTimedLock(). > Hm, not needing PGSemaphoreTimedLock at all is an improvement for > sure. Don't have time right now to look closer though. > > regards, tom lane > > -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
* Boszormenyi Zoltan (zb@cybertec.at) wrote: > Stephen Frost was against the array pointer/count variant, > it was done that way earlier. Let me redo it again. :-) I still don't particularly like the array approach, and see the array+count approach as worse (seems like a higher chance that the count will end up being wrong at some point than having an array termination identifier). I still like the List approach, as that builds on a structure we've already got and can take advantage of the existing infrastructure, but Tom's got a good point regarding the potential for memory leaks with that solution. I haven't had a chance to look, but I would have expected the Lists for these to be allocated in a per-statement context, which would address the memory leak issue. Perhaps that isn't possible though. I agree that the List construct doesn't particularly help the callers, though I do think it makes the enable_timeouts() function cleaner. Thanks, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Boszormenyi Zoltan (zb@cybertec.at) wrote: >> Stephen Frost was against the array pointer/count variant, >> it was done that way earlier. Let me redo it again. :-) > I still don't particularly like the array approach, and see the > array+count approach as worse (seems like a higher chance that the count > will end up being wrong at some point than having an array termination > identifier). I still like the List approach, as that builds on a > structure we've already got and can take advantage of the existing > infrastructure, but Tom's got a good point regarding the potential for > memory leaks with that solution. > I haven't had a chance to look, but I would have expected the Lists for > these to be allocated in a per-statement context, which would address > the memory leak issue. Perhaps that isn't possible though. I agree > that the List construct doesn't particularly help the callers, though I > do think it makes the enable_timeouts() function cleaner. I'm not sure about that. I think locks will typically get taken in query-lifespan or transaction-lifespan contexts, so that repeated leaks would be significant. Possibly my fear of leaks during error exits is unfounded, but I think forgetting to pfree the list cells in normal execution would be a problem. Anyway, the bottom line for me was that the List way didn't seem to be either convenient for the callers or especially efficient, because of the need to palloc list cells and then remember to pfree them again. Another way that we perhaps should consider is to follow the example of XLogInsert and use internally-threaded lists that are typically stored in local arrays in the callers. I've never thought that way was especially beautiful, but it does have the advantage of being an idiom that's already in use in other low-level code. On the whole though, I don't see anything wrong with pointer-and-count. I don't really believe that there's ever going to be a need to enable more than a couple of timeouts simultaneously, so I don't want an overly complicated data structure for it. regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > On the whole though, I don't see anything wrong with pointer-and-count. > I don't really believe that there's ever going to be a need to enable > more than a couple of timeouts simultaneously, so I don't want an overly > complicated data structure for it. Alright, fair enough. Zoltan, sorry for the back-and-forth and thanks for being persistent; I'd really like to see this capability added. Thanks again, Stephen
Tom Lane wrote: > Another way that we perhaps should consider is to follow the example of > XLogInsert and use internally-threaded lists that are typically stored > in local arrays in the callers. I've never thought that way was > especially beautiful, but it does have the advantage of being an idiom > that's already in use in other low-level code. FWIW you could use an slist from ilist.c. It means each node would need a "next" pointer, but there's no separately allocated list cell. There are many places that could use slist/dlist. For instance while reading the SET PERSISTENT patch I noticed it has head and tail pointers being passed all over the place, which looks rather ugly. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
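A sketch of what carrying the timeout requests on an slist could look like. ilist.h and its slist_* macros are real infrastructure; the TimeoutRequest struct and the function around it are hypothetical:

#include "lib/ilist.h"
#include "utils/timeout.h"

/* Hypothetical request node: the link is embedded, no separate list cell. */
typedef struct TimeoutRequest
{
    slist_node  node;
    TimeoutId   id;
    int         delay_ms;
} TimeoutRequest;

static void
enable_timeout_list(slist_head *requests)
{
    slist_iter  iter;

    /* Walk the internally-threaded list and arm each timeout. */
    slist_foreach(iter, requests)
    {
        TimeoutRequest *req = slist_container(TimeoutRequest, node, iter.cur);

        enable_timeout_after(req->id, req->delay_ms);
    }
}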
Boszormenyi Zoltan <zb@cybertec.at> writes: > [ 2-lock_timeout-v37.patch ] Applied after a fair amount of additional hacking. I was disappointed to find that the patch introduced a new race condition into timeout.c, or at least broke a safety factor that had been there. The argument why enable_timeout() could skip disabling the timer interrupt on entry, if num_active_timeouts is zero, doesn't work for enable_timeouts(): as soon as you've incremented num_active_timeouts for the first new timeout, the signal handler would mess around with the data structure if it were to fire. What I did about that was to modify disable_alarm() to forcibly disable the interrupt if we're adding multiple timeouts in enable_timeouts(), even if we think no interrupt is pending. This might be overly paranoid, but since all of this is new code for 9.3 and hasn't been through any beta testing, I felt it best to preserve that safety feature. We can revisit it later if it proves to be an issue. (It's conceivable for instance that we could avoid incrementing the stored value of num_active_timeouts until we're done adding all the new timeouts; but that seemed pretty messy.) For the current usage pattern I'm not too sure that it matters anyway: a savings is only possible if you have enabled lock_timeout but not statement_timeout, and I'm doubtful that that will be a common use-case. I also rearranged the handling of the LOCK_TIMEOUT interrupt some more, since I didn't see a need for it to be different from STATEMENT_TIMEOUT, and got rid of some non-C89 coding practices that didn't seem to me to be improving readability anyway. regards, tom lane
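A minimal sketch of the extra paranoia described above: forcibly cancelling any pending ITIMER_REAL interrupt before rearranging the timeout queue, even when none is believed pending. Illustrative only; the committed timeout.c may differ in detail:

#include <sys/time.h>

static void
disable_alarm(void)
{
    struct itimerval timeval;

    /* A zeroed itimerval cancels any pending ITIMER_REAL interrupt. */
    MemSet(&timeval, 0, sizeof(struct itimerval));
    if (setitimer(ITIMER_REAL, &timeval, NULL) != 0)
        elog(FATAL, "could not disable SIGALRM timer: %m");
}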
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> Another way that we perhaps should consider is to follow the example of
>> XLogInsert and use internally-threaded lists that are typically stored
>> in local arrays in the callers. I've never thought that way was
>> especially beautiful, but it does have the advantage of being an idiom
>> that's already in use in other low-level code.
> FWIW you could use an slist from ilist.c. It means each node would need
> a "next" pointer, but there's no separately allocated list cell.

Yeah, if the usage patterns were more complicated it'd be worth thinking about that. Right now there's nothing more complex than this:

*************** ResolveRecoveryConflictWithBufferPin(void)
*** 428,435 ****
       * Wake up at ltime, and check for deadlocks as well if we will be
       * waiting longer than deadlock_timeout
       */
!     enable_timeout_after(STANDBY_DEADLOCK_TIMEOUT, DeadlockTimeout);
!     enable_timeout_at(STANDBY_TIMEOUT, ltime);
  }

  /* Wait to be signaled by UnpinBuffer() */
--- 428,442 ----
       * Wake up at ltime, and check for deadlocks as well if we will be
       * waiting longer than deadlock_timeout
       */
!     EnableTimeoutParams timeouts[2];
!
!     timeouts[0].id = STANDBY_TIMEOUT;
!     timeouts[0].type = TMPARAM_AT;
!     timeouts[0].fin_time = ltime;
!     timeouts[1].id = STANDBY_DEADLOCK_TIMEOUT;
!     timeouts[1].type = TMPARAM_AFTER;
!     timeouts[1].delay_ms = DeadlockTimeout;
!     enable_timeouts(timeouts, 2);
  }

and you really can't improve that by complicating the data structure.

regards, tom lane
2013-03-17 04:48 keltezéssel, Tom Lane írta: > Boszormenyi Zoltan <zb@cybertec.at> writes: >> [ 2-lock_timeout-v37.patch ] > Applied after a fair amount of additional hacking. Thank you, thank you, thank you! :-) > I was disappointed to find that the patch introduced a new race > condition into timeout.c, or at least broke a safety factor that had > been there. The argument why enable_timeout() could skip disabling > the timer interrupt on entry, if num_active_timeouts is zero, doesn't > work for enable_timeouts(): as soon as you've incremented > num_active_timeouts for the first new timeout, the signal handler would > mess around with the data structure if it were to fire. What I did > about that was to modify disable_alarm() to forcibly disable the > interrupt if we're adding multiple timeouts in enable_timeouts(), even > if we think no interrupt is pending. This might be overly paranoid, > but since all of this is new code for 9.3 and hasn't been through any > beta testing, I felt it best to preserve that safety feature. We can > revisit it later if it proves to be an issue. (It's conceivable for > instance that we could avoid incrementing the stored value of > num_active_timeouts until we're done adding all the new timeouts; > but that seemed pretty messy.) Your reasoning seems to be correct. However, if we take it to the extreme, enable_timeout_at/enable_timeout_after/enable_timeouts should also disable the interrupt handler at the beginning of the function and re-enable it at the end with pqsignal(). An evil soul may send SIGALRM externally to the backend processes at the proper moment and create a race condition while enable_timeout*() is in progress and the itimer is about to trigger at the same time (but doesn't since it was disabled). > For the current usage pattern I'm not > too sure that it matters anyway: a savings is only possible if you > have enabled lock_timeout but not statement_timeout, and I'm doubtful > that that will be a common use-case. I am not too sure about it. With lock_timeout in place, code migrated from Informix, MSSQL, etc. can have the same semantic behaviour. > > I also rearranged the handling of the LOCK_TIMEOUT interrupt some more, > since I didn't see a need for it to be different from STATEMENT_TIMEOUT, > and got rid of some non-C89 coding practices that didn't seem to me to > be improving readability anyway. Thanks for that, too. I was too blind to see that choice, even after thinking about why the statement_timeout cannot be done from SIGALRM context and why the code needs to also send SIGINT to the process group. (To kill children, too, like scripts executed via system()...) Thanks again, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Boszormenyi Zoltan <zb@cybertec.at> writes: > 2013-03-17 04:48 keltezéssel, Tom Lane írta: >> [ worries about stray SIGALRM events ] > Your reasoning seems to be correct. However, if we take it to the > extreme, enable_timeout_at/enable_timeout_after/enable_timeouts > should also disable the interrupt handler at the beginning of the > function and re-enable it at the end with pqsignal(). An evil soul may > send SIGALRM externally to the backend processes at the proper > moment and create a race condition while enable_timeout*() is in > progress and the itimer is about to trigger at the same time (but > doesn't since it was disabled). Well, a malefactor with the ability to send signals to a backend process could also send SIGKILL, or any number of other signals that would mess things up. I'm not too concerned about that scenario. I *am* concerned about leftover timer events, both because this code isn't very well tested, and because third-party code loaded into the backend (think pl/perl for instance) could easily set up timer events we weren't expecting. It suddenly occurs to me though that there's more than one way to skin this cat. We could easily add another static flag variable called "sigalrm_allowed" or some such, and have the signal handler test that and immediately return without doing anything if it's off. Clearing and setting such a variable would be a lot cheaper than an extra setitimer call, as well as more robust since it would protect us from all sources of SIGALRM not just ITIMER_REAL. regards, tom lane
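A sketch of that cheap-flag idea, with illustrative names; arming and disarming the handler costs one store instead of a setitimer() system call, and it also swallows SIGALRMs raised by sources other than ITIMER_REAL:

#include <signal.h>

static volatile sig_atomic_t sigalrm_allowed = false;

static void
handle_sig_alarm(SIGNAL_ARGS)
{
    int         save_errno = errno;

    /* Ignore SIGALRM from any source while timeout.c has not armed us. */
    if (sigalrm_allowed)
    {
        /* ... service the nearest timeout and reschedule the timer ... */
    }

    errno = save_errno;
}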
2013-03-17 16:07 keltezéssel, Tom Lane írta: > Boszormenyi Zoltan <zb@cybertec.at> writes: >> 2013-03-17 04:48 keltezéssel, Tom Lane írta: >>> [ worries about stray SIGALRM events ] >> Your reasoning seems to be correct. However, if we take it to the >> extreme, enable_timeout_at/enable_timeout_after/enable_timeouts >> should also disable the interrupt handler at the beginning of the >> function and enabled at the end with pqsignal(). An evil soul may >> send SIGALRM externally to the backend processes at the proper >> moment and create a race condition while enable_timeout*() is in >> progress and the itimer is about to trigger at the same time (but >> doesn't since it was disabled). > Well, a malefactor with the ability to send signals to a backend > process could also send SIGKILL, or any number of other signals that > would mess things up. I'm not too concerned about that scenario. > I *am* concerned about leftover timer events, both because this code > isn't very well tested, and because third-party code loaded into the > backend (think pl/perl for instance) could easily set up timer events > we weren't expecting. I see. > It suddenly occurs to me though that there's more than one way to skin > this cat. We could easily add another static flag variable called > "sigalrm_allowed" or some such, and have the signal handler test that > and immediately return without doing anything if it's off. Clearing > and setting such a variable would be a lot cheaper than an extra > setitimer call, as well as more robust since it would protect us from > all sources of SIGALRM not just ITIMER_REAL. Something like the attached patch? Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Boszormenyi Zoltan <zb@cybertec.at> writes: > 2013-03-17 16:07 keltezéssel, Tom Lane írta: >> It suddenly occurs to me though that there's more than one way to skin >> this cat. We could easily add another static flag variable called >> "sigalrm_allowed" or some such, and have the signal handler test that >> and immediately return without doing anything if it's off. Clearing >> and setting such a variable would be a lot cheaper than an extra >> setitimer call, as well as more robust since it would protect us from >> all sources of SIGALRM not just ITIMER_REAL. > Something like the attached patch? Well, something like that, but it still needed some improvements. In particular I thought it best to leave the signal handler still releasing the procLatch unconditionally --- that behavior is independent of what this module is doing. Also you seem to have some odd ideas about what do-while will accomplish. AFAIK, in this usage it's purely a syntactic trick without much semantic content. It's the marking of the variable as "volatile" that counts for telling the compiler not to re-order operations. Updated and committed. regards, tom lane
2013-03-18 03:52 keltezéssel, Tom Lane írta: > Boszormenyi Zoltan <zb@cybertec.at> writes: >> 2013-03-17 16:07 keltezéssel, Tom Lane írta: >>> It suddenly occurs to me though that there's more than one way to skin >>> this cat. We could easily add another static flag variable called >>> "sigalrm_allowed" or some such, and have the signal handler test that >>> and immediately return without doing anything if it's off. Clearing >>> and setting such a variable would be a lot cheaper than an extra >>> setitimer call, as well as more robust since it would protect us from >>> all sources of SIGALRM not just ITIMER_REAL. >> Something like the attached patch? > Well, something like that, but it still needed some improvements. In > particular I thought it best to leave the signal handler still releasing > the procLatch unconditionally --- that behavior is independent of what > this module is doing. Also you seem to have some odd ideas about what > do-while will accomplish. AFAIK, in this usage it's purely a syntactic > trick without much semantic content. It's the marking of the variable > as "volatile" that counts for telling the compiler not to re-order > operations. The volatile marking shouldn't even be necessary there. The signal handler doesn't write to it, only the main code. I just put it there saying "why not?" to myself. IIRC, volatile is needed if both the signal handler and the main code change the same variable. I got the reordering idea from here: http://yarchive.net/comp/linux/compiler_barriers.html Thanks for committing, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
2013-03-18 06:22 keltezéssel, Boszormenyi Zoltan írta: > 2013-03-18 03:52 keltezéssel, Tom Lane írta: >> Boszormenyi Zoltan <zb@cybertec.at> writes: >>> 2013-03-17 16:07 keltezéssel, Tom Lane írta: >>>> It suddenly occurs to me though that there's more than one way to skin >>>> this cat. We could easily add another static flag variable called >>>> "sigalrm_allowed" or some such, and have the signal handler test that >>>> and immediately return without doing anything if it's off. Clearing >>>> and setting such a variable would be a lot cheaper than an extra >>>> setitimer call, as well as more robust since it would protect us from >>>> all sources of SIGALRM not just ITIMER_REAL. >>> Something like the attached patch? >> Well, something like that, but it still needed some improvements. In >> particular I thought it best to leave the signal handler still releasing >> the procLatch unconditionally --- that behavior is independent of what >> this module is doing. Also you seem to have some odd ideas about what >> do-while will accomplish. AFAIK, in this usage it's purely a syntactic >> trick without much semantic content. It's the marking of the variable >> as "volatile" that counts for telling the compiler not to re-order >> operations. > > The volatile marking shouldn't even be necessary there. > The signal handler doesn't write to it, only the main code. > I just put it there saying "why not?" to myself. > IIRC, volatile is needed if both the signal handler and the > main code change the same variable. Also, since the variable assignment doesn't depend on other code in the same function (or vice versa), the compiler is still free to reorder it. Volatile is about not caching the variable in a CPU register since it's "volatile"... > > I got the reordering idea from here: > http://yarchive.net/comp/linux/compiler_barriers.html > > Thanks for committing, > Zoltán Böszörményi > -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Boszormenyi Zoltan <zb@cybertec.at> writes: >> The volatile marking shouldn't even be necessary there. >> The signal handler doesn't write to it, only the main code. Well, (a) that's not the case in the patch as committed, and (b) even if it were true, the volatile marking is still *necessary*, because that's what tells the compiler it can't optimize away changes to the variable, say on the grounds of there being another store with no intervening fetches in the main-line code. Without that, a compiler that had aggressively inlined the smaller functions could well deduce that the disable_alarm() assignment was useless. > Also, since the variable assignment doesn't depend on other code > in the same function (or vice versa), the compiler is still free to > reorder it. > Volatile is about not caching the variable in a CPU register since > it's "volatile"... I don't believe you understand what volatile is about at all. Go read the C standard: it's about requiring objects' values to actually match the nominal program-specified values at sequence points. A more accurate heuristic is that volatile tells the compiler there may be other forces reading or writing the variable besides the ones it can see in the current function's source code, and so it can't drop or reorder accesses to the variable. regards, tom lane
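A compact example of the point being made, as a sketch rather than code from the thread. With a plain int the compiler may delete the first store as dead; with volatile both stores must be emitted, in order:

#include <signal.h>

static int plain_flag;                  /* non-volatile */
static volatile sig_atomic_t vol_flag;  /* volatile */

void
demo(void)
{
    plain_flag = 1;     /* may be optimized away: no intervening fetch */
    plain_flag = 2;

    vol_flag = 1;       /* must be emitted */
    vol_flag = 2;       /* must be emitted, after the store of 1, so a
                         * signal handler can observe either value */
}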
2013-03-18 06:47 keltezéssel, Tom Lane írta: > Boszormenyi Zoltan <zb@cybertec.at> writes: >>> The volatile marking shouldn't even be necessary there. >>> The signal handler doesn't write to it, only the main code. > Well, (a) that's not the case in the patch as committed, and (b) even if > it were true, the volatile marking is still *necessary*, because that's > what tells the compiler it can't optimize away changes to the variable, > say on the grounds of there being another store with no intervening > fetches in the main-line code. Without that, a compiler that had > aggressively inlined the smaller functions could well deduce that the > disable_alarm() assignment was useless. > >> Also, since the variable assignment doesn't depend on other code >> in the same function (or vice versa), the compiler is still free to >> reorder it. >> Volatile is about not caching the variable in a CPU register since >> it's "volatile"... > I don't believe you understand what volatile is about at all. Go read > the C standard: it's about requiring objects' values to actually match > the nominal program-specified values at sequence points. A more > accurate heuristic is that volatile tells the compiler there may be > other forces reading or writing the variable besides the ones it can see > in the current function's source code, and so it can't drop or reorder > accesses to the variable. > > regards, tom lane

After reading up on a lot of volatile and memory barriers, I am still not totally convinced. For the record, sig_atomic_t is a plain int without any special treatment from the compiler. It's atomic by nature on every 32-bit and 64-bit CPU. How about the attached patch over current GIT? In other words, why am I wrong with this idea? The problem that may arise if it's wrong is that transactions are left waiting for the lock when the interrupt triggers and the variable is still seen as false from the interrupt handler. But this doesn't happen. FWIW, this small patch seems to give a 0.7 percent performance boost for my settings:

shared_buffers = 256MB
work_mem = 8MB
effective_io_concurrency = 8
wal_level = hot_standby
checkpoint_segments = 64
random_page_cost = 1.8

Everything else is the default. I tested the patch on an 8-core AMD FX-8120 CPU with this pgbench script below. Basically, it's the default transaction prepended with "SET lock_timeout = 1;" I have used the attached quick-and-dirty patch to pgbench to ignore SQL errors coming from statements. "-s 100" was used to initialize the test database.
\set nbranches 1 * :scale
\set ntellers 10 * :scale
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
\setrandom delta -5000 5000
BEGIN;
SET lock_timeout = 1;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;

Results of "pgbench -c 32 -j 32 -t 10000 -e -f script.sql" are for the GIT version:

tps = 3366.844023 (including connections establishing)
tps = 3367.645454 (excluding connections establishing)
tps = 3379.784707 (including connections establishing)
tps = 3380.622317 (excluding connections establishing)
tps = 3385.198238 (including connections establishing)
tps = 3386.132433 (excluding connections establishing)

and with the barrier patch:

tps = 3412.799044 (including connections establishing)
tps = 3413.670832 (excluding connections establishing)
tps = 3389.796287 (including connections establishing)
tps = 3390.602187 (excluding connections establishing)
tps = 3405.924548 (including connections establishing)
tps = 3406.726997 (excluding connections establishing)

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
Boszormenyi Zoltan <zb@cybertec.at> writes: > How about the attached patch over current GIT? In other words, > why I am wrong with this idea? Because it's wrong. Removing "volatile" means that the compiler is permitted to optimize away stores (and fetches!) on the basis of their being unnecessary according to straight-line analysis of the code. Write barriers don't fix that, they only say that stores that the compiler chooses to issue at all have to be ordered a certain way. (There are also pretty serious questions as to whether pg_write_barrier can be trusted yet, but it doesn't really matter here. Removing volatile would break the code.) regards, tom lane
On Mon, Mar 18, 2013 at 10:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Boszormenyi Zoltan <zb@cybertec.at> writes: >> How about the attached patch over current GIT? In other words, >> why I am wrong with this idea? > > Because it's wrong. Removing "volatile" means that the compiler is > permitted to optimize away stores (and fetches!) on the basis of their > being unnecessary according to straight-line analysis of the code. > Write barriers don't fix that, they only say that stores that the > compiler chooses to issue at all have to be ordered a certain way. I don't think this is correct. The read and write barriers as implemented are designed to function as compiler barriers also, just as they do in the Linux kernel and every other piece of software I've found that implements anything remotely like this, with the lone exception of PostgreSQL. In PostgreSQL, spinlock acquisition and release are defined as CPU barriers but not a compiler barrier, and this necessitates extensive use of volatile all over the code base which would be unnecessary if we did this the way it's done in Linux and elsewhere. However, Zoltan's patch probably isn't right either. First, a write barrier isn't ever the correct solution when there's only one process involved - which is the case here, because the relevant variables are in backend-private memory. A compiler barrier - which is generally far cheaper - might be the right thing, though. However, the position of the barriers in his proposed patch is suspect. As implemented, he's proposing to force each change to alarm_enabled to be scheduled by the compiler before any other writes to memory are completed. But there's no guard against the compiler sliding the change backward, only forward. Now maybe that doesn't matter, if we're only concerned about the timing of updates of alarm_enabled relative to other updates of that same variable. But if there are multiple variables involved, then it matters. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Mar 18, 2013 at 10:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Because it's wrong. Removing "volatile" means that the compiler is >> permitted to optimize away stores (and fetches!) on the basis of their >> being unnecessary according to straight-line analysis of the code. >> Write barriers don't fix that, they only say that stores that the >> compiler chooses to issue at all have to be ordered a certain way. > I don't think this is correct. The read and write barriers as > implemented are designed to function as compiler barriers also, just > as they do in the Linux kernel and every other piece of software I've > found that implements anything remotely like this, with the lone > exception of PostgreSQL. In PostgreSQL, spinlock acquisition and > release are defined as CPU barriers but not a compiler barrier, and > this necessitates extensive use of volatile all over the code base > which would be unnecessary if we did this the way it's done in Linux > and elsewhere. I think you're just as mistaken as Zoltan. Barriers enforce ordering of operations, not whether an operation occurs at all. regards, tom lane
On Thu, Mar 21, 2013 at 8:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Mar 18, 2013 at 10:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Because it's wrong. Removing "volatile" means that the compiler is
>>> permitted to optimize away stores (and fetches!) on the basis of their
>>> being unnecessary according to straight-line analysis of the code.
>>> Write barriers don't fix that, they only say that stores that the
>>> compiler chooses to issue at all have to be ordered a certain way.
>
>> I don't think this is correct. The read and write barriers as
>> implemented are designed to function as compiler barriers also, just
>> as they do in the Linux kernel and every other piece of software I've
>> found that implements anything remotely like this, with the lone
>> exception of PostgreSQL. In PostgreSQL, spinlock acquisition and
>> release are defined as CPU barriers but not a compiler barrier, and
>> this necessitates extensive use of volatile all over the code base
>> which would be unnecessary if we did this the way it's done in Linux
>> and elsewhere.
>
> I think you're just as mistaken as Zoltan. Barriers enforce ordering
> of operations, not whether an operation occurs at all.

Surely not. Suppose the user does this:

    some_global_var = 1;
    some_function();
    some_global_var = 2;

I hope we can both agree that any compiler which thinks it doesn't need to store 1 in some_global_var is completely nuts, because some_function() might perform any arbitrary computation, including one that depends on some_global_var being 1 rather than whatever value it had before that. Of course, if a global optimizer can prove that some_function() can't do anything that can possibly care about some_global_var, then the store could be omitted, but not otherwise. Should the compiler omit the store and should there then be a bug in the program, we wouldn't say "oh, some_global_var ought to be declared volatile". We would say "that compiler is buggy".

Now, conversely, in this situation, it seems to me (and I think you'll agree) that the compiler could be forgiven for omitting one of the stores:

    some_global_var = 1;
    local_var = 42;
    some_global_var = 2;

If we declare some_global_var as volatile, it will force the compiler to perform both stores regardless of what the optimizer thinks, and moreover, the second store is required to happen after the first one, because the definition of volatile is that volatile references are globally sequenced with respect TO EACH OTHER. However, the store to local_var could be moved around with respect to the other two or omitted altogether unless local_var is also declared volatile.

Now, consider this:

    some_global_var = 1;
    pg_compiler_barrier();   /* or in Linux, barrier() */
    some_global_var = 2;

The compiler barrier is exactly equivalent to the unknown function in the first example, except that no function is actually called. The semantics are precisely that the compiler must assume that, at the point the barrier intervenes, an unknown operation will occur which may depend on the contents of any word in memory and which may modify any word in memory. Thus, the compiler may not postpone stores requested before the barrier until after the barrier on the grounds that the values will be overwritten after the barrier; and if any global variables have been loaded into registers before the barrier it must be assumed that, after the barrier, the registers may no longer match those global variables.

It is true that any barrier, including a compiler barrier, serves only to separate instructions. But I don't believe it follows in any way that the barrier therefore prohibits reordering operations but not omitting them altogether. Such a definition would have no practical utility. The compiler is not free to willy-nilly leave out things that the user asks it to do any more than it is free to willy-nilly reorder them. If it were, programming would be chaos, and everything would have to be volatile. What the compiler is free to do is to reorder and omit instructions where the programmer won't, in a single-thread execution context, be able to notice the difference. And a compiler barrier is an explicit notification that, in essence, the programmer will notice if things are not just the way he wrote them when that point in the code is reached.

To see the difference between this and volatile, consider the following:

    a = 1;
    b = 1;
    c = 1;
    a = 2;
    b = 2;
    c = 2;

Absent any special precautions, the compiler may well optimize the first set of stores away completely and perform the second set in any order it likes. If we make all the variables volatile, it will do all 6 stores in precisely the order specified, omitting nothing and reordering nothing. If we instead stick a compiler barrier just after the assignment "c = 1", then the compiler must store 1 in all three variables (in any order that it likes), and then store 2 in all three variables (in any order that it likes). The "barrier" essentially divides up the code into chunks and requires that those chunks be optimized independently by the compiler without knowledge of what earlier or later chunks are doing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
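For reference, one common GCC-style way to spell a pure compiler barrier is an empty asm with a "memory" clobber. It emits no instruction; it only prevents the optimizer from caching or reordering memory accesses across it. This is illustrative, not necessarily how PostgreSQL defines its barriers:

/* Empty asm with a "memory" clobber: a compiler-only fence. */
#define compiler_barrier()  __asm__ __volatile__("" : : : "memory")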
On Fri, Mar 22, 2013 at 2:29 PM, Robert Haas <robertmhaas@gmail.com> wrote: > The "barrier" essentially > divides up the code into chunks and requires that those chunks be > optimized independently by the compiler without knowledge of what > earlier or later chunks are doing While all this sounds sensible, I would love to see a gcc programmer or an llvm programmer actually comment on what they think volatile does and what they want to implement in the compiler. I'm a bit worried that we're making assumptions like "things happen in a specific order" that aren't really justified. In these days of superscalar execution and multi-level caches things may be weirder than we're imagining. -- greg