Thread: Generating a SQL Server population routine

Generating a SQL Server population routine

From

Martin_Hurst@dom.com

Date:

06 October 2003, 14:06:04

Has some one come up with a similar type script that could be used in a
Postgresql database?
The script below was created for a SQLServer database.
Thx,
-Martin

++++++++++++++++++++++++++++++++++++++
http://searchdatabase.techtarget.com/tip/1,289483,sid13_gci913717,00.html

In the early stages of application design DBA or a developer creates a data
model. Unfortunately many models work very well with a handful of rows but
fail miserably when the application grows by leaps and bounds. This is why
it is important to populate your data model with data and stress test it
prior to making it available for users. Test data doesn't have to be
perfect; indeed, you can duplicate the same record, or a few different
records, to test the performance of your queries.

This article offers a script for generating INSERT statements for every
table in your database. The script is fairly simple -- it relies on three
system tables: sysusers, sysobjects and syscolumns. It generates an INSERT
statement duplicating the top row in your table. However, it can be easily
altered to fit your needs.

SET NOCOUNT ON
DECLARE @table VARCHAR(200),
 @owner VARCHAR(100),
 @sql VARCHAR(2000),
 @sql1 VARCHAR(2000)

DECLARE @schema TABLE (
 table_name VARCHAR(200),
 column_name VARCHAR(200))

INSERT @schema
SELECT c.name + '.' + a.name, b.name FROM sysobjects a INNER JOIN
syscolumns b ON b.id = a.id
AND a.type = 'u'
AND a.name <> 'dtproperties'
INNER JOIN sysusers c ON c.uid = a.uid
ORDER BY a.name, b.colid

DECLARE table_cursor CURSOR FOR

SELECT DISTINCT table_name FROM @schema

OPEN table_cursor
FETCH NEXT FROM table_cursor INTO @table

WHILE @@FETCH_STATUS = 0
BEGIN

SELECT @sql1 = ''
SELECT @sql1 = @sql1 + ', '+ column_name FROM @schema
WHERE table_name = @table

SELECT @sql1 = SUBSTRING(@sql1, 3, LEN(@sql1)-2)


SELECT @sql = 'INSERT ' + @table + '( ' + @sql1 + ' ) ' + CHAR(10) + '
SELECT TOP 1 ' + @sql1 + ' FROM ' +
  @table

SELECT @sql
FETCH NEXT FROM table_cursor INTO @table
END
CLOSE table_cursor
DEALLOCATE table_cursor

In the pubs database, the output will be similar to the following:

INSERT dbo.authors( au_id, au_lname, au_fname, phone, address, city, state,
zip, contract )
 SELECT TOP 1 au_id, au_lname, au_fname, phone, address, city, state, zip,
contract FROM dbo.authors

INSERT dbo.discounts( discounttype, stor_id, lowqty, highqty, discount )
 SELECT TOP 1 discounttype, stor_id, lowqty, highqty, discount FROM
dbo.discounts

INSERT dbo.employee( emp_id, fname, minit, lname, job_id, job_lvl, pub_id,
hire_date )
 SELECT TOP 1 emp_id, fname, minit, lname, job_id, job_lvl, pub_id,
hire_date FROM dbo.employee

Re: Generating a SQL Server population routine

From

Mike Mascari

Date:

06 October 2003, 15:18:38

Martin_Hurst@dom.com wrote:

> Has some one come up with a similar type script that could be used in a
> Postgresql database?

> The script below was created for a SQLServer database.
> Thx,
> -Martin

I haven't. But I was wondering if a general purpose tuple-generating
function, which would be trivial to implement, might be worthwhile in
PostgreSQL or perhaps added to Joe Conway's tablefunc module.
Something like:

tuple_generator(integer)

which returns a set of numbers whose elements are the integer values
between 1 and the number supplied.

You could then create any number of pseudo-duplicates (can't violate
the candidate key, obviously) from a single-record table like so:

INSERT INTO employees (name, salary)
SELECT employees.name, employees.salary
FROM employees, tuple_generator(1000)
WHERE employees.employeeid = 1;

You could easily build a script to fill your database by querying
pg_class.relname and feeding the output to psql.

It would also be useful for handling sparse date and time data:

SELECT day_of_year,
(SELECT COALESCE(SUM(purchases.qty), 0)
 FROM purchases
 WHERE EXTRACT(doy FROM purchases.sale_date) = day_of_year)
FROM tuple_generator(366) AS day_of_year
ORDER BY day_of_year;

Mike Mascari
mascarm@mascari.com

Re: Possible bug on insert

From

"Rick Gigger"

Date:

06 October 2003, 15:45:32

I think I have come across a bug in postgres 7.3.x.  I noticed that when I
upgraded to 7.3 my code broke on certain sql statements.  They still don't
work on the lastest stable release of 7.3.

Here is an example to illustrate my problem:

-- create the table with:
CREATE TABLE bugtest ( a int2 );

-- then do the following inserts:
insert into bugtest (a) select 1 union select 1; -- this one succeeds
insert into bugtest (a) select 1 union select '1'; -- this one also succeeds
insert into bugtest (a) select '1' union select 1; -- this one also succeeds
insert into bugtest (a) select '1' union select '1'; -- this one fails

The all succeed except the last one.  It fails with the following error:

ERROR:  column "a" is of type smallint but expression is of type text
        You will need to rewrite or cast the expression

It seems to me that they should all succeed.  At least postgres seems to
have no problem converting '<intvalue>' to <intvalue> anywhere else that I
can find.  At least the last 3 inserts shoudl either all fail or all succeed
(IMHO).

Is this a bug?  Has anyone else reported it?  Is there a procedure I need to
follow to report it?  Has it been fixed in 7.4?

On a sidenote I am doing the weird select union thing as a way to insert
many records at once without having to execute multiple queries.  I first
started doing it on SQLServer and it was much, much faster than doing
separate inserts.  Is there a better way to do it in postgres?  I have
looked at the copy from command but I can't find any examples of how to use
it in php or how to specify the columns / column order that you are going to
use with php.  Also what characters need to be excaped if I do this
(obviously new lines and tabs).  And nulls are represented by \N.

Rick Gigger

Re: Possible bug on insert

From

Tom Lane

Date:

06 October 2003, 16:04:56

"Rick Gigger" <rick@alpinenetworking.com> writes:
> insert into bugtest (a) select '1' union select '1'; -- this one fails

> Is this a bug?

No.  It's unfortunate perhaps, but it's not a bug.  The UNION forces us
to make a decision about the output datatype of the UNION operation.
In your other cases the chosen datatype is integer, which can later be
cast to smallint for the insert, but in this case the chosen datatype is
text, which is not implicitly castable to smallint.

            regards, tom lane

Re: Possible bug on insert

From

"Rick Gigger"

Date:

06 October 2003, 16:56:23

What was it that changed in 7.3 that made this behavior change.  (it worked
in 7.2)

Thanks,

Rick Gigger

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Rick Gigger" <rick@alpinenetworking.com>
Cc: <pgsql-general@postgresql.org>
Sent: Monday, October 06, 2003 1:04 PM
Subject: Re: [GENERAL] Possible bug on insert


> "Rick Gigger" <rick@alpinenetworking.com> writes:
> > insert into bugtest (a) select '1' union select '1'; -- this one fails
>
> > Is this a bug?
>
> No.  It's unfortunate perhaps, but it's not a bug.  The UNION forces us
> to make a decision about the output datatype of the UNION operation.
> In your other cases the chosen datatype is integer, which can later be
> cast to smallint for the insert, but in this case the chosen datatype is
> text, which is not implicitly castable to smallint.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>

Re: Possible bug on insert

From

"Rick Gigger"

Date:

06 October 2003, 17:14:38

I guess then I will switch to use COPY "tablename (fieldlist)" FROM.  Will
this end up being faster anyway.

In case anyone was wondering here is a good example of how to do it in php
from:

http://us4.php.net/manual/en/function.pg-put-line.php

<?php
    $conn = pg_pconnect("dbname=foo");
    pg_query($conn, "create table bar (a int4, b char(16), d float8)");
    pg_query($conn, "copy bar from stdin");
    pg_put_line($conn, "3\thello world\t4.5\n");
    pg_put_line($conn, "4\tgoodbye world\t7.11\n");
    pg_put_line($conn, "\\.\n");
    pg_end_copy($conn);
?>

thanks,

Rick Gigger

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Rick Gigger" <rick@alpinenetworking.com>
Cc: <pgsql-general@postgresql.org>
Sent: Monday, October 06, 2003 1:04 PM
Subject: Re: [GENERAL] Possible bug on insert


> "Rick Gigger" <rick@alpinenetworking.com> writes:
> > insert into bugtest (a) select '1' union select '1'; -- this one fails
>
> > Is this a bug?
>
> No.  It's unfortunate perhaps, but it's not a bug.  The UNION forces us
> to make a decision about the output datatype of the UNION operation.
> In your other cases the chosen datatype is integer, which can later be
> cast to smallint for the insert, but in this case the chosen datatype is
> text, which is not implicitly castable to smallint.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 8: explain analyze is your friend
>

Re: Possible bug on insert

From

Tom Lane

Date:

06 October 2003, 17:26:52

"Rick Gigger" <rick@alpinenetworking.com> writes:
> What was it that changed in 7.3 that made this behavior change.  (it worked
> in 7.2)

7.2 allowed implicit casts from text to smallint, I think.

            regards, tom lane

Re: Possible bug on insert

From

"Rick Gigger"

Date:

06 October 2003, 17:48:00

Is there any possiblity of adding an option for compatibilityes sake (or
does one already exist) to revert to the old behavior.  This has currently
kept me from upgrading beyond 7.2.4 thus far in production as it will break
all of my apps.  I can slowly update them but many of them don't get changed
very often and I will have to undergo a testing cycle for each of them just
to maintain compatibility with postgres > 7.2.4.  This is not something I
really want to do.  I would much prefer to just upgrade and have my legasy
apps work without modification or testing.

Thanks,

Rick Gigger

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Rick Gigger" <rick@alpinenetworking.com>
Cc: <pgsql-general@postgresql.org>
Sent: Monday, October 06, 2003 2:26 PM
Subject: Re: [GENERAL] Possible bug on insert

> "Rick Gigger" <rick@alpinenetworking.com> writes:
> > What was it that changed in 7.3 that made this behavior change.  (it
worked
> > in 7.2)
>
> 7.2 allowed implicit casts from text to smallint, I think.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
>

Re: Possible bug on insert

From

"Rick Gigger"

Date:

06 October 2003, 18:41:50

Two questions:

1) how would I go about doing that
2) is there any change that doing that could break other things?

thanks,

Rick Gigger

----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Rick Gigger" <rick@alpinenetworking.com>
Sent: Monday, October 06, 2003 2:57 PM
Subject: Re: [GENERAL] Possible bug on insert


> > Is there any possiblity of adding an option for compatibilityes sake (or
> > does one already exist) to revert to the old behavior.
>
> If you're desperate you can mark that cast as implicit by changing its
> entry in pg_cast.  Not sure what side-effects you might see though.
>
> regards, tom lane
>

Re: Possible bug on insert

From

Vivek Khera

Date:

07 October 2003, 11:30:10

>>>>> "RG" == Rick Gigger <rick@alpinenetworking.com> writes:

RG> very often and I will have to undergo a testing cycle for each of them just
RG> to maintain compatibility with postgres > 7.2.4.  This is not something I
RG> really want to do.  I would much prefer to just upgrade and have my legasy
RG> apps work without modification or testing.

You can't have progress that way.  To be fair, it *is* a major version
number change, and if you don't test your apps across major version
changes of *anything* you don't need to be in this business.  No
offense ;-)

Anyhow, Bruce posted a patch to the 7.3 line to allow this behavior
back, as it was necessary for some apps.  However, don't count on it
for 7.4...

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Generating a SQL Server population routine

From

Harald Fuchs

Date:

08 October 2003, 06:53:49

In article <3F81B176.3060701@mascari.com>,
Mike Mascari <mascarm@mascari.com> writes:

> Martin_Hurst@dom.com wrote:
>> Has some one come up with a similar type script that could be used in a
>> Postgresql database?

>> The script below was created for a SQLServer database.
>> Thx,
>> -Martin

> I haven't. But I was wondering if a general purpose tuple-generating
> function, which would be trivial to implement, might be worthwhile in
> PostgreSQL or perhaps added to Joe Conway's tablefunc module.
> Something like:

> tuple_generator(integer)

> which returns a set of numbers whose elements are the integer values
> between 1 and the number supplied.

How about this?

  CREATE OR REPLACE FUNCTION enum (INT) RETURNS SETOF INT AS '
  DECLARE
    numvals ALIAS FOR $1;
  BEGIN
    FOR currval IN 0 .. numvals - 1 LOOP
      RETURN NEXT currval;
    END LOOP;
    RETURN;
  END;
  ' LANGUAGE 'plpgsql';

  CREATE OR REPLACE FUNCTION enum (INT, INT) RETURNS SETOF INT AS '
  DECLARE
    numvals ALIAS FOR $1;
    minval ALIAS FOR $2;
  BEGIN
    FOR currval IN 0 .. numvals - 1 LOOP
      RETURN NEXT minval + currval;
    END LOOP;
    RETURN;
  END;
  ' LANGUAGE 'plpgsql';

  CREATE OR REPLACE FUNCTION enum (INT, INT, INT) RETURNS SETOF INT AS '
  DECLARE
    numvals ALIAS FOR $1;
    minval ALIAS FOR $2;
    maxval ALIAS FOR $3;
  BEGIN
    FOR currval IN 0 .. numvals - 1 LOOP
      RETURN NEXT currval % (maxval - minval + 1) + minval;
    END LOOP;
    RETURN;
  END;
  ' LANGUAGE 'plpgsql';

Usage: SELECT * FROM enum (numvals [, minval [, maxval]])
Returns numvals consecutive numbers, beginning with 0 or minval
Wraps around to minval if maxval is reached

Re: Generating a SQL Server population routine

From

Date:

08 October 2003, 21:24:04

> In article <3F81B176.3060701@mascari.com>,
> Mike Mascari <mascarm@mascari.com> writes:
>
>> Martin_Hurst@dom.com wrote:
>>> Has some one come up with a similar type script that could be used
>>> in a Postgresql database?
>
>>> The script below was created for a SQLServer database.
>>> Thx,
>>> -Martin
>
>> I haven't. But I was wondering if a general purpose tuple-generating
>> function, which would be trivial to implement, might be worthwhile in
>> PostgreSQL or perhaps added to Joe Conway's tablefunc module.
>> Something like:
>
>> tuple_generator(integer)
>
>> which returns a set of numbers whose elements are the integer values
>> between 1 and the number supplied.
>
> How about this?
>
>   CREATE OR REPLACE FUNCTION enum (INT) RETURNS SETOF INT AS '
>   DECLARE
>     numvals ALIAS FOR $1;
>   BEGIN
>     FOR currval IN 0 .. numvals - 1 LOOP
>       RETURN NEXT currval;
>     END LOOP;
>     RETURN;
>   END;
>   ' LANGUAGE 'plpgsql';
>
>   CREATE OR REPLACE FUNCTION enum (INT, INT) RETURNS SETOF INT AS '
> DECLARE
>     numvals ALIAS FOR $1;
>     minval ALIAS FOR $2;
>   BEGIN
>     FOR currval IN 0 .. numvals - 1 LOOP
>       RETURN NEXT minval + currval;
>     END LOOP;
>     RETURN;
>   END;
>   ' LANGUAGE 'plpgsql';
>
>   CREATE OR REPLACE FUNCTION enum (INT, INT, INT) RETURNS SETOF INT AS
> ' DECLARE
>     numvals ALIAS FOR $1;
>     minval ALIAS FOR $2;
>     maxval ALIAS FOR $3;
>   BEGIN
>     FOR currval IN 0 .. numvals - 1 LOOP
>       RETURN NEXT currval % (maxval - minval + 1) + minval;
>     END LOOP;
>     RETURN;
>   END;
>   ' LANGUAGE 'plpgsql';
>
> Usage: SELECT * FROM enum (numvals [, minval [, maxval]])
> Returns numvals consecutive numbers, beginning with 0 or minval
> Wraps around to minval if maxval is reached
>

Or a little different, with the over-loaded functions relying on the
original:

CREATE OR REPLACE FUNCTION public.enum(int4) RETURNS  SETOF int4 AS '
  DECLARE
    numvals ALIAS FOR $1;
  BEGIN
    FOR currval IN 0 .. numvals - 1 LOOP
      RETURN NEXT currval;
    END LOOP;
    RETURN;
  END;
  '  LANGUAGE 'plpgsql' VOLATILE;


CREATE OR REPLACE FUNCTION public.enum(int4, int4) RETURNS  SETOF int4 AS '
  DECLARE
    numvals ALIAS FOR $1;
    minval ALIAS FOR $2;
    currval RECORD;
  BEGIN
    FOR currval IN SELECT minval + enum AS enum FROM enum(numvals) LOOP
      RETURN NEXT currval.enum;
    END LOOP;
    RETURN;
  END;
  '  LANGUAGE 'plpgsql' VOLATILE;

CREATE OR REPLACE FUNCTION public.enum(int4, int4, int4) RETURNS  SETOF
int4 AS ' DECLARE
    numvals ALIAS FOR $1;
    minval ALIAS FOR $2;
    maxval ALIAS FOR $3;
    currval RECORD;
/*
 From: Harald Fuchs
 Date: Wed, October 8, 2003 5:53
 To: pgsql-general@postgresql.org

 tuple_generator(integer)

 which returns a set of numbers whose elements are the integer values
 between 1 and the number supplied.

 Usage: SELECT * FROM enum (numvals [, minval [, maxval]])
 Returns numvals consecutive numbers, beginning with 0 or minval
 Wraps around to minval if maxval is reached
*/

  BEGIN
    FOR currval IN SELECT * FROM enum(numvals, minval) LOOP
      RETURN NEXT currval.enum % maxval;
    END LOOP;
    RETURN;
  END;
  '  LANGUAGE 'plpgsql' VOLATILE;


But, as interesting as these look, what would you actually use them for?


~Berend Tober

Re: Generating a SQL Server population routine

From

Harald Fuchs

Date:

09 October 2003, 05:52:50

In article <64591.66.212.203.144.1065659357.squirrel@$HOSTNAME>,
<btober@seaworthysys.com> writes:

> Or a little different, with the over-loaded functions relying on the
> original:

> CREATE OR REPLACE FUNCTION public.enum(int4) RETURNS  SETOF int4 AS '
>   DECLARE
>     numvals ALIAS FOR $1;
>   BEGIN
>     FOR currval IN 0 .. numvals - 1 LOOP
>       RETURN NEXT currval;
>     END LOOP;
>     RETURN;
>   END;
>   '  LANGUAGE 'plpgsql' VOLATILE;

Why VOLATILE?  Shouldn't that be IMMUTABLE?
(Sorry, but I'm a PostgreSQL newbie.)

> But, as interesting as these look, what would you actually use them for?

  SELECT extract (month FROM sdate) AS month,
         count (*) AS monthly_sales
  FROM sales
  GROUP BY month
  ORDER BY month;

gives you the monthly sales, but what if you would like a result row
also for months with nothing sold?

  SELECT enum, count (sdate) AS monthly_sales
  FROM enum (12, 1)
  LEFT JOIN sales ON enum = extract (month FROM sdate)
  GROUP BY enum
  ORDER BY enum;

This would do the trick.  Is there a more elegant solution?