Thread: Cause of recent buildfarm failures on hamerkop
I looked into why buildfarm member hamerkop has been failing regression tests for the last two months. The symptoms appear consistent with the theory that the src/test/regress/data/tenk.data test file has been converted to have DOS style line endings (\r\n). The test cases that are failing involve reading data that should start 2030 bytes into that file, but if you assume that the preceding \n line endings actually take two bytes each, the data would be offset to match what we see in the actual results. Also, the data load test that is the primary use of that file wouldn't complain, because COPY is built to allow either \n or \r\n line endings. I assume this means that the git checkout was created with options that allowed conversion of text files to \r\n line endings. I'm not sure if we should just write this off as pilot error, or if we should try to make the regression tests proof against such things. If the latter, how exactly? regards, tom lane
On Fri, Sep 14, 2012 at 7:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I looked into why buildfarm member hamerkop has been failing regression > tests for the last two months. The symptoms appear consistent with the > theory that the src/test/regress/data/tenk.data test file has been > converted to have DOS style line endings (\r\n). The test cases that > are failing involve reading data that should start 2030 bytes into that > file, but if you assume that the preceding \n line endings actually take > two bytes each, the data would be offset to match what we see in the > actual results. Also, the data load test that is the primary use of > that file wouldn't complain, because COPY is built to allow either \n or > \r\n line endings. > > I assume this means that the git checkout was created with options that > allowed conversion of text files to \r\n line endings. > > I'm not sure if we should just write this off as pilot error, or if we > should try to make the regression tests proof against such things. If > the latter, how exactly? I don't think we need to make them proof against it. But it wouldn't hurt to have a check that threw a predictable error when it happens. E.g. a first step in the regression tests that just verifies what kind of line endings are in a file. Could maybe be as simple as checking the size of the file? -- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
On Fri, Sep 14, 2012 at 4:56 AM, Magnus Hagander <magnus@hagander.net> wrote: >> I assume this means that the git checkout was created with options that >> allowed conversion of text files to \r\n line endings. If we have "text files" that we need to be "binary equivilents" for the purpose of diffing, we should probably attribute them in git attributes to make sure they are not considered "text autocrlf'able". It could be as simple as adding: *.out -text *.data -text *.source -text into src/test/regress/.gitattributes >> I'm not sure if we should just write this off as pilot error, or if we >> should try to make the regression tests proof against such things. If >> the latter, how exactly? > > I don't think we need to make them proof against it. But it wouldn't > hurt to have a check that threw a predictable error when it happens. > E.g. a first step in the regression tests that just verifies what kind > of line endings are in a file. Could maybe be as simple as checking > the size of the file? This leads to making sure you keep your "verification list" in source, and up-to-date too... a. -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
Magnus Hagander <magnus@hagander.net> writes: > On Fri, Sep 14, 2012 at 7:02 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm not sure if we should just write this off as pilot error, or if we >> should try to make the regression tests proof against such things. If >> the latter, how exactly? > I don't think we need to make them proof against it. But it wouldn't > hurt to have a check that threw a predictable error when it happens. > E.g. a first step in the regression tests that just verifies what kind > of line endings are in a file. Could maybe be as simple as checking > the size of the file? Yeah, it would be easy to add a query that prints the large object's size to the currently-failing test. That seems like a good idea independently of whether we add a .gitattributes file as Aidan suggests. regards, tom lane
Aidan Van Dyk <aidan@highrise.ca> writes: > If we have "text files" that we need to be "binary equivilents" for > the purpose of diffing, we should probably attribute them in git > attributes to make sure they are not considered "text autocrlf'able". > It could be as simple as adding: > *.out -text > *.data -text > *.source -text > into src/test/regress/.gitattributes Oh, hold the phone. We don't need to do that. When I went and looked, I realized that there *already is* a variant file largeobject_1.source that's meant to provide matching output for DOS-ified source files. The reason that hamerkop has been failing since commit 3a0e4d36ebd7f477822d5bae41ba121a40d22ccc is that that commit changed the largeobject test and did not bother to update all the corresponding expected files. Boo hiss. I went ahead and put in the length query, in hopes that the problem will be a little more obvious next time this happens. regards, tom lane