Thread: Re: [HACKERS] large object regression tests

Re: [HACKERS] large object regression tests

From
Jeremy Drake
Date:
I put together a patch which adds a regression test for large objects,
hopefully attached to this message.  I would like some critique of it, to
see if I have gone about it the right way.  Also I would be happy to hear
any additional tests which should be added to it.

On Tue, 5 Sep 2006, Jeremy Drake wrote:

> I noticed when I was working on a patch quite a while back that there are
> no regression tests for large object support.  I know, large objects
> are not the most sexy part of the code-base, and I think they tend to be
> ignored/forgotten most of the time.  Which IMHO is all the more reason
> they should have some regression tests.  Otherwise, if someone managed to
> break them somehow, it is quite likely not to be noticed for quite some
> time.
>
> So in this vein, I have recently found myself with some free time, and a
> desire to contribute something, and decided this would be the perfect
> place to get my feet wet without stepping on any toes.
>
> I guess what I should ask is, would a patch to add a test for large
> objects to the regression suite be well received?  And, is there any
> advice for how to go about making these tests?
>
> I am considering, and I think that in order to get a real test of the
> large objects, I would need to load data into a large object which would
> be sufficient to be loaded into more than one block (large object blocks
> were 1 or 2K IIRC) so that the block boundary case could be tested.  Is
> there any precedent on where to grab such a large chunk of data from?  I
> was thinking about using an excerpt from a public domain text such as Moby
> Dick, but on second thought binary data may be better to test things with.
>
> My current efforts, and probably the preliminary portion of the final
> test, involves loading a small amount (less than one block) of text into a
> large object inline from a sql script and calling the various functions
> against it to verify that they do what they should.  In the course of
> doing so, I find that it is necessary to stash certain values across
> statements (large object ids, large object 'handles'), and so far I am
> using a temporary table to store these.  Is this reasonable, or is there a
> cleaner way to do that?
>
>
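For the block boundary case, a rough sketch of the arithmetic in Python may help when picking offsets to test. It assumes 2048-byte large object blocks (the default LOBLKSIZE, which is BLCKSZ/4 on a stock 8K-page build); the function name is made up for illustration and is not part of the patch.

```python
LOBLKSIZE = 2048  # assumption: default build, BLCKSZ (8192) / 4


def pages_touched(offset, length, blksize=LOBLKSIZE):
    """Return the block numbers that a read or write of `length` bytes
    starting at byte `offset` would touch."""
    if length <= 0:
        return []
    first = offset // blksize
    last = (offset + length - 1) // blksize
    return list(range(first, last + 1))


if __name__ == "__main__":
    # A 100-byte read starting 48 bytes before the end of block 0
    # spills into block 1, which is the case worth testing.
    print(pages_touched(2000, 100))
```

Any offset/length pair for which this returns more than one block number exercises the boundary; anything under 2048 bytes starting at offset 0 does not.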



--
One seldom sees a monument to a committee.

Attachment

Re: [HACKERS] large object regression tests

From
Tom Lane
Date:
Jeremy Drake <pgsql-patches@jdrake.com> writes:
> I put together a patch which adds a regression test for large objects,
> hopefully attached to this message.  I would like some critique of it, to
> see if I have gone about it the right way.  Also I would be happy to hear
> any additional tests which should be added to it.

I'd prefer it if we could arrange not to need any absolute paths
embedded into the test, because maintaining tests that require such is
a real PITA --- instead of just committing the actual test output, one
has to reverse-convert it to a ".source" file.

I suggest that instead of testing the server-side lo_import/lo_export
functions, perhaps you could test the psql equivalents and write and
read a file in psql's working directory.  I think we could do without
the Moby Dick extract too ...

            regards, tom lane

Re: [HACKERS] large object regression tests

From
Jeremy Drake
Date:
On Thu, 21 Sep 2006, Tom Lane wrote:

> Jeremy Drake <pgsql-patches@jdrake.com> writes:
> > I put together a patch which adds a regression test for large objects,
> > hopefully attached to this message.  I would like some critique of it, to
> > see if I have gone about it the right way.  Also I would be happy to hear
> > any additional tests which should be added to it.
>
> I'd prefer it if we could arrange not to need any absolute paths
> embedded into the test, because maintaining tests that require such is
> a real PITA --- instead of just committing the actual test output, one
> has to reverse-convert it to a ".source" file.

I just copied how the test for COPY worked, since I perceived a similarity
in what I needed to do (use external files to load data).

> I suggest that instead of testing the server-side lo_import/lo_export
> functions, perhaps you could test the psql equivalents and write and
> read a file in psql's working directory.

I did not see any precedent for that when I was looking around in the
existing tests for an example of how to do things.  I am not even sure
where the cwd of psql is, so I can put an input file there.  Could you
provide an example of how this might look, by telling me where to put a
file in the src/test/regress tree and the path to give to \lo_import?
Besides which, shouldn't both the server-side and psql versions be tested?
When I was looking at the copy tests, it looked like the server-side ones
were tested, and then the psql ones were tested by exporting and then
importing data which was originally loaded from the server-side method.
Am I correctly interpreting the precedent, or are you suggesting that the
precedent be changed?  I was trying to stay as close to the copy tests as
possible since the functionality is so similar (transferring data to/from
files in the filesystem, either via server-side functions which require
absolute paths or via psql \ commands (which I forgot about for the lo
funcs)).

> I think we could do without the Moby Dick extract too ...

I am open to suggestions.  I saw one suggestion that I use an image of an
elephant, but I suspect that was tongue-in-cheek.  I am not very fond of
the idea of generating repetitious data, as I think it would be more
difficult to determine whether or not the loseek/tell functions put me in
the right place in the middle of the file.  Perhaps if there was a way to
generate deterministic pseudo-random data, that would work (has to be
deterministic so the diffs of the output come out right).  Anyone have a
good example of seeding a random number generator and generating a bunch
of bytea which is deterministic cross-platform?
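One possible shape for such a generator, sketched in Python rather than SQL: a linear congruential generator that uses only 32-bit integer arithmetic, so the output should not depend on any platform's random().  The multiplier and increment are the commonly cited Numerical Recipes constants; the function name and the seed are hypothetical, chosen only for illustration.

```python
def lcg_bytes(seed, count):
    """Return `count` pseudo-random bytes that depend only on `seed`."""
    state = seed & 0xFFFFFFFF
    out = bytearray()
    for _ in range(count):
        # state = (a * state + c) mod 2^32, integer arithmetic only
        state = (1664525 * state + 1013904223) & 0xFFFFFFFF
        # the high-order byte is the least regular part of an LCG state
        out.append(state >> 24)
    return bytes(out)


if __name__ == "__main__":
    data = lcg_bytes(42, 5000)   # more than two 2K blocks of data
    print(len(data), data[:8].hex())
```

Because the same seed always yields the same bytes, the expected output file would stay stable, and the data is irregular enough that a seek landing at the wrong offset should show up in the diff.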

>
>             regards, tom lane
>

In the meantime, I will alter the test to also test the psql backslash
commands based on how the copy equivalents are tested, since I had
forgotten them and they need to be tested also.

--
Any sufficiently advanced technology is indistinguishable from a rigged
demo.

Re: [HACKERS] large object regression tests

From
Jeremy Drake
Date:
On Sun, 24 Sep 2006, Jeremy Drake wrote:

> On Thu, 21 Sep 2006, Tom Lane wrote:
>
> > I suggest that instead of testing the server-side lo_import/lo_export
> > functions, perhaps you could test the psql equivalents and write and
> > read a file in psql's working directory.
>
> I did not see any precedent for that when I was looking around in the
> existing tests for an example of how to do things.
<snip>
> When I was looking at the copy tests, it looked like the server-side ones
> were tested, and then the psql ones were tested by exporting and then
> importing data which was originally loaded from the server-side method.

I just went back and looked at the tests again.  The only time the psql
\copy command was used was in the (quite recent IIRC) copyselect test, and
then only via stdout (never referring to psql working directory, or to
files at all).  Did I misunderstand, and you are proposing a completely
new way of doing things in the regression tests?  I am not particularly
fond of the sed substitution stuff myself, but it seems to be the only
currently supported/used method in the regression tests...  I do think
that making the large object test and the copy test consistent would make
a lot of sense, since as I said before, the functionality of file access
is so similar...

--
We demand rigidly defined areas of doubt and uncertainty!
        -- Vroomfondel

Re: [HACKERS] large object regression tests

From
"Bort, Paul"
Date:
Jeremy Drake wrote:
>
> I am open to suggestions.  I saw one suggestion that I use an
> image of an elephant, but I suspect that was tongue-in-cheek.
>  I am not very fond of the idea of generating repetitious
> data, as I think it would be more difficult to determine
> whether or not the loseek/tell functions put me in the right
> place in the middle of the file.  Perhaps if there was a way
> to generate deterministic pseudo-random data, that would work
> (has to be deterministic so the diffs of the output come out
> right).  Anyone have a good example of seeding a random
> number generator and generating a bunch of bytea which is
> deterministic cross-platform?
>

How about just using a mathematical series, like Fibonacci? You can make
the file as big as you want from a trivial generator. If you store it as
space-separated ASCII ( 1 1 2 3 5 8 13 21 34 ... ), it should be
platform independent and you can compare any range of offsets that
suits.
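A minimal sketch of that kind of generator, in Python for illustration only (the function name and the size argument are hypothetical):

```python
def fibonacci_text(min_bytes):
    """Return space-separated Fibonacci numbers as ASCII bytes,
    at least `min_bytes` long."""
    parts = []
    length = 0
    a, b = 1, 1
    while length < min_bytes:
        term = str(a)
        # one separating space before every term except the first
        length += len(term) + (1 if parts else 0)
        parts.append(term)
        a, b = b, a + b
    return " ".join(parts).encode("ascii")


if __name__ == "__main__":
    data = fibonacci_text(5000)
    # Any range of offsets can be checked against the same generator.
    print(data[:20].decode("ascii"))
    print(data[12:14].decode("ascii"))
```

Since the content at any offset can be regenerated on demand, a test could seek to an arbitrary position and compare what it reads with the matching slice.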

Regards,
Paul Bort

Re: [HACKERS] large object regression tests

From
Jeremy Drake
Date:
On Mon, 25 Sep 2006, Jeremy Drake wrote:

>
> It looks like the large_obj.c output is missing much of the output
> settings handling which is in the PrintQueryStatus function in common.c,
> such as handling quiet mode, and html output.  I will try to dig around
> and try to put together a patch to make it respect the settings like other
> commands...

I put together a patch for psql's large_obj.c to make it respect the
output settings.  Is this reasonable?

--
For every complex problem, there is a solution that is simple, neat,
and wrong.
        -- H. L. Mencken

Attachment

Re: [HACKERS] large object regression tests

From
Jeremy Drake
Date:
On Sun, 24 Sep 2006, Jeremy Drake wrote:

> On Thu, 21 Sep 2006, Tom Lane wrote:
>
> > I think we could do without the Moby Dick extract too ...
>
> I am open to suggestions.  I saw one suggestion that I use an image of an
> elephant, but I suspect that was tongue-in-cheek.  I am not very fond of
> the idea of generating repetitious data, as I think it would be more
> difficult to determine whether or not the loseek/tell functions put me in
> the right place in the middle of the file.

I just had the idea that I could use one of the existing data files which
are used for testing COPY instead of the Moby Dick extract.  They are
already there, a few of them are pretty good sized, they have data in the
file which is not just simple repetition so it would be pretty obvious if
the seek function broke, and they are very unlikely to change.  I am
considering changing the test I put together to use tenk.data as the input
file tomorrow and send in what I have again, since I also am doing a test
of \lo_import (which also requires a patch to psql I sent in earlier to
fix the output of the \lo_* commands to respect the output settings).

--
When does summertime come to Minnesota, you ask?
Well, last year, I think it was a Tuesday.

Re: [HACKERS] large object regression tests

From
Bruce Momjian
Date:
Patch applied.  Thanks.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [HACKERS] large object regression tests

From
Bruce Momjian
Date:
Sorry, please disregard.  Was meant for a Japanese FAQ update.

---------------------------------------------------------------------------

Bruce Momjian wrote:
>
> Patch applied.  Thanks.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +