Thread: Inserting Multiple Random Rows

Inserting Multiple Random Rows

From
Gary Chambers
Date:
All,

I have a table of user addresses that I'm trying to randomly populate
with data from a cities table.  Using the following query:

INSERT INTO useraddrs(userid, addrdesc, city, stprov)
SELECT u.userid, 'Home', c.cityname, c.stateabbr
FROM users u, cities c
WHERE u.userid NOT IN (SELECT userid FROM useraddrs)   AND cid=(SELECT cid FROM cities ORDER BY RANDOM() LIMIT 1);

I am able to achieve most of what I am trying to accomplish, but once
the random number is selected, it doesn't change.  What am I missing?
Thank you in advance.

-- Gary Chambers

/* Nothing fancy and nothing Microsoft! */


Re: Inserting Multiple Random Rows

From
Tom Lane
Date:
Gary Chambers <gwchamb@gmail.com> writes:
> INSERT INTO useraddrs(userid, addrdesc, city, stprov)
> SELECT u.userid, 'Home', c.cityname, c.stateabbr
> FROM users u, cities c
> WHERE u.userid NOT IN (SELECT userid FROM useraddrs)
>     AND cid=(SELECT cid FROM cities ORDER BY RANDOM() LIMIT 1);

> I am able to achieve most of what I am trying to accomplish, but once
> the random number is selected, it doesn't change.  What am I missing?

I believe the sub-SELECT will only get executed once, since it has no
dependency on the outer query.  What were you expecting to happen?
        regards, tom lane


Re: Inserting Multiple Random Rows

From
Justin Graf
Date:
That won't work because Order by does not recompute Random() once gets a 
number it  stops

you need to generate a series of random numbers then select each record 
one at a time out of cities table .  You will have to write a plsql 
function to do this As any join will cause the result to be ordered.

Here is example of pl-sql procedure that inserts records randomly into a 
table from a another table.  This is an  excerpt from the function.  
There is more logic that limits the random result set size and  rules 
not to repeat a given number.

--First create cursor of the source records    OPEN _questions SCROLL for  (Select  *    from questions
          where quest_expire > now()::date                                and trim( both '' from quest_type)  = 
 
_classexams.exam_quest_type                                and trim( both '' from quest_level) = 
_classexams.exam_level                            order by quest_id );

--need to limit the number range created by random so not to exceed the 
record count created by the Cursor
select count(quest_id) into _rcount from educate.questions                where quest_expire > now()::date
     and trim( both '' from quest_type)  = 
 
_classexams.exam_quest_type                    and trim( both '' from quest_level) = 
_classexams.exam_level ;

Generate a Random list of  of numbers
for _randlist IN (Select num from (                select round(random()*1000)::int as num from 
generate_series(1,100000)) rand                where num <= _rcount and num > 0 ) LOOP

FETCH ABSOLUTE _randlist.num from _questions into _quest ;

Next Insert into Into the destination  Table

end loop;

On 4/28/2010 12:22 PM, Gary Chambers wrote:
> All,
>
> I have a table of user addresses that I'm trying to randomly populate
> with data from a cities table.  Using the following query:
>
> INSERT INTO useraddrs(userid, addrdesc, city, stprov)
> SELECT u.userid, 'Home', c.cityname, c.stateabbr
> FROM users u, cities c
> WHERE u.userid NOT IN (SELECT userid FROM useraddrs)
>      AND cid=(SELECT cid FROM cities ORDER BY RANDOM() LIMIT 1);
>
> I am able to achieve most of what I am trying to accomplish, but once
> the random number is selected, it doesn't change.  What am I missing?
> Thank you in advance.
>
> -- Gary Chambers
>
> /* Nothing fancy and nothing Microsoft! */
>
>    


All legitimate Magwerks Corporation quotations are sent in a .PDF file attachment with a unique ID number generated by
ourproprietary quotation system. Quotations received via any other form of communication will not be honored.
 

CONFIDENTIALITY NOTICE: This e-mail, including attachments, may contain legally privileged, confidential or other
informationproprietary to Magwerks Corporation and is intended solely for the use of the individual to whom it
addresses.If the reader of this e-mail is not the intended recipient or authorized agent, the reader is hereby notified
thatany unauthorized viewing, dissemination, distribution or copying of this e-mail is strictly prohibited. If you have
receivedthis e-mail in error, please notify the sender by replying to this message and destroy all occurrences of this
e-mailimmediately.
 
Thank you.



Re: Inserting Multiple Random Rows

From
Gary Chambers
Date:
Tom,

> I believe the sub-SELECT will only get executed once, since it has no
> dependency on the outer query.

That seems to be the behavior its exhibiting.

> What were you expecting to happen?

The clouds parting, choirs of angels singing, and fireworks
celebrating the veil of my obtuseness being lifted, and my grasp and
command of SQL to be complete and infinite.  None of which appears
will ever happen...

-- Gary Chambers

/* Nothing fancy and nothing Microsoft! */


Re: Inserting Multiple Random Rows

From
Justin Graf
Date:
On 4/28/2010 1:48 PM, Gary Chambers wrote:<br /><blockquote
cite="mid:z2i302670f21004281048n3ed3d9aew9203d8cceec79f74@mail.gmail.com"type="cite"><pre wrap="">pen? </pre> <pre
wrap="">
The clouds parting, choirs of angels singing, and fireworks
celebrating the veil of my obtuseness being lifted, and my grasp and
command of SQL to be complete and infinite.  None of which appears
will ever happen...

-- Gary Chambers

/* Nothing fancy and nothing Microsoft! */
 </pre></blockquote> You now what <b><u>might </u></b>work <br /><br /> Insert into useraddrs(userid, addrdesc, city,
stprov)<br /> select u.userid, 'Home', c.cityname, c.stateabbr FROM users u, cities c<span class="moz-txt-citetags"><br
/></span>WHEREu.userid NOT IN (SELECT userid FROM useraddrs) <br /> and cid IN (select round(random()*10000)::int as
numfrom generate_series(1,100000)) rand ) <br /><br /> As you don't care if a record repeats<br /><br /> All legitimate
MagwerksCorporation quotations are sent in a .PDF file attachment with a unique ID number generated by our proprietary
quotationsystem. Quotations received via any other form of communication will not be honored. <br /><br />
CONFIDENTIALITYNOTICE: This e-mail, including attachments, may contain legally privileged, confidential or other
informationproprietary to Magwerks Corporation and is intended solely for the use of the individual to whom it
addresses.If the reader of this e-mail is not the intended recipient or authorized agent, the reader is hereby notified
thatany unauthorized viewing, dissemination, distribution or copying of this e-mail is strictly prohibited. If you have
receivedthis e-mail in error, please notify the sender by replying to this message and destroy all occurrences of this
e-mailimmediately. <br /> Thank you. <br /> 

Re: Inserting Multiple Random Rows

From
Gary Chambers
Date:
Justin,

Thanks for the reply!

> you need to generate a series of random numbers then select each record
> one at a time out of cities table .  You will have to write a plsql
> function to do this As any join will cause the result to be ordered.

After modifying my search terms at Google, I've discovered that I'm
not alone in this endeavor.  There are a couple of informative
articles, the most beneficial of which is written by depesz at
http://www.depesz.com/index.php/2007/09/16/my-thoughts-on-getting-random-row/
and the one that referred me to it at
http://blog.rhodiumtoad.org.uk/2009/03/08/selecting-random-rows-from-a-table/

> Here is example of pl-sql procedure that inserts records randomly into a
> table from a another table.

I could have been done with this in pretty short order if I had just
continued with writing a quick PL/pgSQL function to do it.  It's all
Quassnoi's fault that I abandoned my original plan and tried to solve
the problem with a query.  After all, I need to learn to think in sets
and get out of the procedural mindset, but I still haven't heard the
click! :)

Thanks for the suggestion in your follow-up e-mail.  I'll post the
solution upon discovering it.

-- Gary Chambers

/* Nothing fancy and nothing Microsoft! */


Re: Inserting Multiple Random Rows

From
Gary Chambers
Date:
Justin (et al),

> You now what might work

In the interest of efficiency, I abandoned the quest for the perfect
query and wrote my own function and used a modified version of
depesz's get_random_id() function:

CREATE OR REPLACE FUNCTION gen_fake_addresses() RETURNS INTEGER AS
$gen_fake_addresses$
DECLARE   v_uid BIGINT;   v_cid INTEGER;   v_cst RECORD;   v_count BIGINT := 0;

BEGIN   FOR v_uid IN   SELECT userid   FROM users   WHERE userid NOT IN (SELECT userid FROM useraddrs)   LOOP
SELECTINTO v_cid get_random_city();       SELECT INTO v_cst cityname, stateabbr FROM cities WHERE cid = v_cid;
INSERTINTO useraddrs(userid, addrdesc, city, stprov)       VALUES (v_uid, 'Home', v_cst.cityname, v_cst.stateabbr);
 v_count := v_count + 1;   END LOOP;   RETURN v_count;
 
END;
$gen_fake_addresses$ LANGUAGE plpgsql VOLATILE;

/* This is depesz's */
CREATE OR REPLACE FUNCTION get_random_city() RETURNS INT4 AS
$get_random_city$
DECLARE   id_range RECORD;   reply INT4;   try INT4 := 0;

BEGIN   SELECT min(cid), max(cid) - min(cid) + 1 AS range INTO id_range FROM cities;   WHILE (try < 10) LOOP       try
:=try + 1;       reply := FLOOR(RANDOM() * id_range.range) + id_range.min;       PERFORM cid FROM cities WHERE cid =
reply;      IF FOUND THEN           RETURN reply;       END IF;   END LOOP;   RAISE EXCEPTION 'No record found in %
tries',try;
 
END;
$get_random_city$ LANGUAGE plpgsql STABLE;

I'd like to thank Justin Graf for his time, effort, and assistance
with this problem.

-- Gary Chambers

/* Nothing fancy and nothing Microsoft! */