Thread: Re: Let's talk up 6.3

Re: Let's talk up 6.3

From
Bruce Momjian
Date:
>
> I have made no indices yet.  And these are the only two tables
> in the database (beside the system ones).

No indexes.  No wonder it takes so long.  Put an index on
mdc1_runs.runnum and mdc1_simu.version, and see how fast it is.  Did
Oracle manage to do this quickly without the indexes?

Having it crash is certainly not an acceptable outcome, but I am sure
indexes will fix the problem.

Now, the fact that it runs quickly as separate queries, even without the
indexes, but takes a long time with the indexes, I think is
understandable.  Think of a join of two tables.  You can do through each
quickly, but if you join two non-indexed fields, it will take quite some
time.  I think our subselect code is doing just that.  We designed it
that way to give good performance for the majority of subselects,
including correlated ones.


>
> bbrmdc=> explain verbose select distinct runtype from mdc1_runs where runnum in
> bbrmdc-> (select runnum from mdc1_simu where version = '4.3.7g');
> NOTICE:  QUERY PLAN:
>
>
> Unique  (cost=686.02 size=0 width=0)
>   ->  Sort  (cost=686.02 size=0 width=0)
>         ->  Seq Scan on mdc1_runs  (cost=686.02 size=1455 width=12)
>               SubPlan
>                 ->  Seq Scan on mdc1_simu  (cost=733.02 size=1 width=12)
>


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: Let's talk up 6.3

From
Paul Raines
Date:
bbrmdc=> select distinct runtype from mdc1_runs where runnum in
bbrmdc-> (select runnum from mdc1_simu where version = '3.1.0');

Ok, without indices, the subselect took 35 seconds.  I then
created the following two indices which seem to correspond
to the ones I have in Oracle:

bbrmdc=> create index mdc1_runs_pk on mdc1_runs using btree (
bbrmdc-> runnum char_ops );
CREATE
bbrmdc=> create index mdc1_simu_pk on mdc1_simu using btree (
bbrmdc-> runnum char_ops, version varchar_ops );
CREATE

The subselect still took 35 seconds.  I then created:

bbrmdc=> create index mdc1_simu_ver on mdc1_simu using btree (
bbrmdc-> version varchar_ops );
CREATE

Now the subselect takes < 3 seconds.  Should I have expected that
second index above to help at all?  Since all runnum's are
unique in this example, probably not.  Would a rule be that
if the first attribute of an index is unique, then additional
attributes are basically useless?

>> Having it crash is certainly not an acceptable outcome, but I am sure
>> indexes will fix the problem.
>>

Well, it didn't exactly crash.  I just gave up on it and killed it
myself after 15 minutes.  That was when I had about 5500 rows in
each table rather than the 2500 now.  BTW, is there anyway for a "user"
to stop a runaway postgres process?  I had to log in directly to the
server and kill it as either root or postgres.

>> Now, the fact that it runs quickly as separate queries, even without the
>> indexes, but takes a long time with the indexes, I think is
>> understandable.  Think of a join of two tables.  You can do through each
>> quickly, but if you join two non-indexed fields, it will take quite some
>> time.  I think our subselect code is doing just that.  We designed it
>> that way to give good performance for the majority of subselects,
>> including correlated ones.
>>

Is there a better way to do this subselect?  Is there a way to
save the results of one query and feed it into a second one easily
when doing interactive stuff on psql?  I know this can be done in
programming, though I worry the statement might get too long. I
was thinking of trying a function for this but they only seem to
return scalars, not suitable for a IN clause.

On another note, is there anyway to prevent a user from being able
to create tables in a database?  There only seems to be security
in making the connection in the first place and then there is
just security on existing tables.  I want to set up a "safe" user
id that has query access only on a database.

Thanks for all your help.

pr


--
_________________________________________________________________________
Paul Raines               raines@slac.stanford.edu           650-926-2369
Stanford Linear Accelerator   BABAR Group                   Software Team
http://www.slac.stanford.edu/~raines/index.html  <======== PGP public key




Re: Let's talk up 6.3

From
Bruce Momjian
Date:
>
> bbrmdc=> select distinct runtype from mdc1_runs where runnum in
> bbrmdc-> (select runnum from mdc1_simu where version = '3.1.0');
>
> Ok, without indices, the subselect took 35 seconds.  I then
> created the following two indices which seem to correspond
> to the ones I have in Oracle:
>
> bbrmdc=> create index mdc1_runs_pk on mdc1_runs using btree (
> bbrmdc-> runnum char_ops );
> CREATE


> bbrmdc=> create index mdc1_simu_pk on mdc1_simu using btree (
> bbrmdc-> runnum char_ops, version varchar_ops );
> CREATE

This index is useless.  If you are only restricting on the second field
of an index, and not the first, the index is useless, just like knowing
the second letter of a word is q doesn't help you look it up in a
dictionary.

>
> The subselect still took 35 seconds.  I then created:
>
> bbrmdc=> create index mdc1_simu_ver on mdc1_simu using btree (
> bbrmdc-> version varchar_ops );
> CREATE
>
> Now the subselect takes < 3 seconds.  Should I have expected that
> second index above to help at all?  Since all runnum's are
> unique in this example, probably not.  Would a rule be that
> if the first attribute of an index is unique, then additional
> attributes are basically useless?

See above.

> bbrmdc=> create index mdc1_simu_pk on mdc1_simu using btree (
> bbrmdc-> runnum char_ops, version varchar_ops );

> Well, it didn't exactly crash.  I just gave up on it and killed it
> myself after 15 minutes.  That was when I had about 5500 rows in
> each table rather than the 2500 now.  BTW, is there anyway for a "user"
> to stop a runaway postgres process?  I had to log in directly to the
> server and kill it as either root or postgres.

No, but on the TODO list.

> >> Now, the fact that it runs quickly as separate queries, even without the
> >> indexes, but takes a long time with the indexes, I think is
> >> understandable.  Think of a join of two tables.  You can do through each
> >> quickly, but if you join two non-indexed fields, it will take quite some
> >> time.  I think our subselect code is doing just that.  We designed it
> >> that way to give good performance for the majority of subselects,
> >> including correlated ones.
> >>
>
> Is there a better way to do this subselect?  Is there a way to
> save the results of one query and feed it into a second one easily
> when doing interactive stuff on psql?  I know this can be done in
> programming, though I worry the statement might get too long. I
> was thinking of trying a function for this but they only seem to
> return scalars, not suitable for a IN clause.

SELECT * INTO TABLE ... would work.  DELETE when done.


>
> On another note, is there anyway to prevent a user from being able
> to create tables in a database?  There only seems to be security
> in making the connection in the first place and then there is
> just security on existing tables.  I want to set up a "safe" user
> id that has query access only on a database.

No, again on the TODO list.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)