Thread: [NEWBIE] need help optimizing this query

[NEWBIE] need help optimizing this query

From
Dexter Tad-y
Date:
Greetings,
I need help in optimizing this query:

select a.id, b.transaction from test as a left join pg_locks as b on
a.xmax = b.transaction where b.transaction is null;

im using the query in obtaining records not locked by other
transactions. any help is appreciated.

cheers!

Dexter Tad-y


Re: [NEWBIE] need help optimizing this query

From
Bill Moran
Date:
Dexter Tad-y wrote:
> Greetings,
> I need help in optimizing this query:
>
> select a.id, b.transaction from test as a left join pg_locks as b on
> a.xmax = b.transaction where b.transaction is null;
>
> im using the query in obtaining records not locked by other
> transactions. any help is appreciated.

The query, in and of itself, is as optimized as it's going to get.

The real question is whether or not your database is optimized.

1) How often do you vacuum?
2) How often do you analyze? (I recenlty saw a 300% speedup on a query
    after running analyze!  I didn't realize just how important it was
    until then!)
3) Do you have indexes on a.xmax and b.transaction?  (I was wondering
    why a test database was running so slow (about 100x slower than usual)
    and I realized I had forgotten to create the indexes)
4) Have you tweaked postgres.conf apropriately?
5) If none of these helps, you should post the output of EXPLAIN on
    this query, which will give the people on the list enough details to
    give you more specific advice.

--
Bill Moran
Potential Technologies
http://www.potentialtech.com


Re: [NEWBIE] need help optimizing this query

From
Stephan Szabo
Date:
On Wed, 10 Mar 2004, Dexter Tad-y wrote:

> Greetings,
> I need help in optimizing this query:
>
> select a.id, b.transaction from test as a left join pg_locks as b on
> a.xmax = b.transaction where b.transaction is null;
>
> im using the query in obtaining records not locked by other
> transactions. any help is appreciated.

It's hard to say without knowing more about the size of a and explain
analyze output.  On my 7.4 machine, using NOT IN rather than the left join
gives about a 2x speed increase on a 400k row table.

Re: [NEWBIE] need help optimizing this query

From
Dexter Tad-y
Date:
On Wed, 2004-03-10 at 23:08, Stephan Szabo wrote:
> On Wed, 10 Mar 2004, Dexter Tad-y wrote:
>
> > Greetings,
> > I need help in optimizing this query:
> >
> > select a.id, b.transaction from test as a left join pg_locks as b on
> > a.xmax = b.transaction where b.transaction is null;
> >
> > im using the query in obtaining records not locked by other
> > transactions. any help is appreciated.
>
> It's hard to say without knowing more about the size of a and explain
> analyze output.  On my 7.4 machine, using NOT IN rather than the left join
> gives about a 2x speed increase on a 400k row table.


here's what comes up with explain:

1) using LEFT JOIN

csp=> explain select a.id, b.transaction from test as a left join
pg_locks as b  on a.xmax = b.transaction having transaction is null;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Hash Left Join  (cost=15.00..340.01 rows=1000 width=12)
   Hash Cond: ("outer".xmax = "inner"."transaction")
   Filter: ("inner"."transaction" IS NULL)
   ->  Seq Scan on test a  (cost=0.00..20.00 rows=1000 width=12)
   ->  Hash  (cost=12.50..12.50 rows=1000 width=4)
         ->  Function Scan on pg_lock_status l  (cost=0.00..12.50
rows=1000 width=4)
(6 rows)



2) using NOT IN

csp=> explain select * from test where id not in (select test.id from
test, pg_locks where pg_locks.transaction=test.xmax);
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=352.51..375.01 rows=500 width=32)
   Filter: (NOT (hashed subplan))
   SubPlan
     ->  Hash Join  (cost=15.00..340.01 rows=5001 width=8)
           Hash Cond: ("outer".xmax = "inner"."transaction")
           ->  Seq Scan on test  (cost=0.00..20.00 rows=1000 width=12)
           ->  Hash  (cost=12.50..12.50 rows=1000 width=4)
                 ->  Function Scan on pg_lock_status l
(cost=0.00..12.50 rows=1000 width=4)
(8 rows)


which of the two is faster? :D

cheers!


Dexter Tad-y



Re: [NEWBIE] need help optimizing this query

From
Dexter Tad-y
Date:
On Wed, 2004-03-10 at 22:42, Bill Moran wrote:
> Dexter Tad-y wrote:
> > Greetings,
> > I need help in optimizing this query:
> >
> > select a.id, b.transaction from test as a left join pg_locks as b on
> > a.xmax = b.transaction where b.transaction is null;
> >
> > im using the query in obtaining records not locked by other
> > transactions. any help is appreciated.
>
> The query, in and of itself, is as optimized as it's going to get.
>
> The real question is whether or not your database is optimized.
>
> 1) How often do you vacuum?
> 2) How often do you analyze? (I recenlty saw a 300% speedup on a query
>     after running analyze!  I didn't realize just how important it was
>     until then!)
> 3) Do you have indexes on a.xmax and b.transaction?  (I was wondering
>     why a test database was running so slow (about 100x slower than usual)
>     and I realized I had forgotten to create the indexes)
> 4) Have you tweaked postgres.conf apropriately?
> 5) If none of these helps, you should post the output of EXPLAIN on
>     this query, which will give the people on the list enough details to
>     give you more specific advice.

Hi,

1) and 2). Both I use occasionally.

3) I think you can't index xmax since its a reserved field. Same with
pg_locks.transaction view as it's built-in. As for the tables, i believe
they're indexed properly.

4) I think postgres.conf is tweak to match our requirements for kernel,
memory, etc.

5) EXPLAIN results posted. :D

Thanks!
Cheers!


Dexter Tad-y


Re: [NEWBIE] need help optimizing this query

From
Stephan Szabo
Date:
On Wed, 10 Mar 2004, Dexter Tad-y wrote:

> On Wed, 2004-03-10 at 23:08, Stephan Szabo wrote:
> > On Wed, 10 Mar 2004, Dexter Tad-y wrote:
> >
> > > Greetings,
> > > I need help in optimizing this query:
> > >
> > > select a.id, b.transaction from test as a left join pg_locks as b on
> > > a.xmax = b.transaction where b.transaction is null;
> > >
> > > im using the query in obtaining records not locked by other
> > > transactions. any help is appreciated.
> >
> > It's hard to say without knowing more about the size of a and explain
> > analyze output.  On my 7.4 machine, using NOT IN rather than the left join
> > gives about a 2x speed increase on a 400k row table.
>
>
> 2) using NOT IN
>
> csp=> explain select * from test where id not in (select test.id from
> test, pg_locks where pg_locks.transaction=test.xmax);

I think you'd want:
select * from text where xmax not in (select transaction from pg_locks);

Also, use explain analyze which will actually run the query and show you
the real time for the steps.

Re: [NEWBIE] need help optimizing this query

From
Dexter Tad-y
Date:
> explain select * from foo where xmax not in
> (select transaction from pg_locks where transaction is not null);


Thanks a lot! This query is what i've been looking for.



Cheers!

Dexter Tad-y



Re: [NEWBIE] need help optimizing this query

From
Tom Lane
Date:
Bill Moran <wmoran@potentialtech.com> writes:
> 3) Do you have indexes on a.xmax and b.transaction?

He can't index either (xmax is simply not indexable, and pg_locks is a view).

In a quick experiment I got reasonable-seeming join plans; the output of
pg_locks got hashed and then the system did a seqscan over the outer
table.  It's not possible to do any better than that with the problem
as given.  I assume the OP's problem is that the outer table is big and
he doesn't want to seqscan it.  The only way I can see is to add an
additional filter condition that can be indexed, so that not all the
rows in the outer table have to be checked for xmax.

BTW, in 7.4 you get equivalently good plans with the more transparent

explain select * from foo where xmax not in
(select transaction from pg_locks where transaction is not null);

The EXPLAIN output looks different, but it's still effectively a hash
join.

            regards, tom lane

I'm running PostgreSQL 7.4.1 and trying to use an application on Solaris
that relies on "ident sameuser".  In the postgres log I get:

  "Ident authentication is not supported on local connections on this
platform"

The only thread I could find on this seems to indicate a patch was put in
way back at 7.1.2.

Is there a way to get this to work on Solaris?

Wes


Re: Ident authentication is not supported on local connections

From
Bruce Momjian
Date:
wespvp@syntegra.com wrote:
> I'm running PostgreSQL 7.4.1 and trying to use an application on Solaris
> that relies on "ident sameuser".  In the postgres log I get:
>
>   "Ident authentication is not supported on local connections on this
> platform"
>
> The only thread I could find on this seems to indicate a patch was put in
> way back at 7.1.2.
>
> Is there a way to get this to work on Solaris?

It can be added if you tell us how you can find out who is the user on
the other end of a unix domain socket.  See src/backend/libpq/auth.h and
hba.c and look for mentions of CRED to see how other platforms do it.
It is usually some system call like get/setsockopt().

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Solaris ecpg program doesn't work - pulling my hair

From
Date:
I've been able to determine that I am actually connecting to the database
under Solaris if I specify the correct credentials.  And, database
operations such as searches do work.  Due to a test environment error, it
previously looked like they weren't.  However, the error code is *never*
set.  I display the entire sqlca data stucture after a successful and
unsuccessful operation and can see no differences:

sqlca = {
    sqlcaid  = "SQLCA   "
    sqlabc   = 140
    sqlcode  = 0
    sqlerrm  = {
        sqlerrml = 0
        sqlerrmc = ""
    }
    sqlerrp  = "NOT SET "
    sqlerrd  = (0, 0, 0, 0, 0, 0)
    sqlwarn  = ""
    sqlstate = "00000"
}


The following simple inelegant test case works properly on Linux but not
Solaris.  If valid credentials are specified, both will connect to the
database and return the record, but Solaris will not set the error code if
there is a problem.  sqlca.sqlcode is *always* 0.  I've compared the C code
generated on Linux to that on Solaris, and they are the same.  I know there
is a problem with NOT FOUND at 7.4.1, but that is not the issue here (I have
a patch for that).

main()
{
    char *system   = "host.dom.ain";
    char *target   = "mydb@host.dom.ain";
    char *username = "testuser";
    char *password = "test";

    EXEC SQL BEGIN DECLARE SECTION;
        VARCHAR user[60];
        VARCHAR pw[60];
        VARCHAR dbTarget[255];
        VARCHAR systemName[255];
        int systemNum;
    EXEC SQL END DECLARE SECTION;

    EXEC SQL WHENEVER SQLERROR GOTO connect_error;
    EXEC SQL WHENEVER NOT FOUND CONTINUE;

    strcpy (user.arr, username);
    user.len = strlen(user.arr);

    strcpy (pw.arr, password);
    user.len = strlen(pw.arr);

    strcpy (dbTarget.arr, target);
    user.len = strlen(dbTarget.arr);

    EXEC SQL CONNECT TO :dbTarget USER :user USING :pw;

    printf ("Connect error code: %d\n", sqlca.sqlcode);

    EXEC SQL WHENEVER SQLERROR GOTO select_error;
    EXEC SQL WHENEVER NOT FOUND GOTO select_error;

    strcpy (systemName.arr, system);
    systemName.len = strlen(systemName.arr);
    EXEC SQL SELECT system_key
        INTO :systemNum
        FROM systems
        WHERE system_name=:systemName;

    printf ("SELECT error code: %d\n", sqlca.sqlcode);
    printf ("systemNum = %d\n", systemNum);
    exit(0);

connect_error:
    printf ("Connect failure: %d\n", sqlca.sqlcode);
    exit (1);


select_error:
    printf ("Select failure: %d\n", sqlca.sqlcode);
    exit (1);
}