Thread: Re: [PHP] faster output from php and postgres

Re: [PHP] faster output from php and postgres

From

Chadwick Rolfs

Date:

27 May 2003, 12:40:53

So, I have the same problem, but I need all authors for each publication
to show up in it's own column.  I tried the full join query from a
suggestion off pgsql-sql, but it only returns ONE author id TWICE instead
of ALL authors at once.

I'll do some RTFMing of the joins.. and post any results I get

BUT, right now, looping over each publication with php isn't taking that
long.  I would like to know how to make this query, though!

Please let me know how to get a result like:
____________________________________________________
|All Authors|Title|Source|Year|Type|Length|Keywords|
----------------------------------------------------

If there is a way on the SQL side to do this, that is ;>

Here we go:

CREATE TABLE "author" (
        "auth_id" integer DEFAULT
nextval('"author_temp_auth_id_seq"'::text) NOT NULL,
        "first" text,
        "last" text,
        "auth_last_updated" timestamp with time zone,
        Constraint "author_temp_pkey" Primary Key ("auth_id")
);


CREATE UNIQUE INDEX auth_id_author_key ON author USING btree (auth_id);


CREATE UNIQUE INDEX auth_last_updated_author_key ON author USING btree
(auth_last_updated);



CREATE TABLE "publication" (
        "copyis" text,
        "pub_id" integer DEFAULT nextval('publication_pub_id_seq'::text)
NOT NULL,
        "title" text,
        "source" text,
        "year" text,
        "month" text,
        "length" text,
        "type" text,
        "keywords" text,
        "copyright" text,
        "abstract" text,
        "pdformat" text,
        "pub_last_updated" timestamp with time zone
);



CREATE UNIQUE INDEX publication_pub_id_key ON publication USING btree
(pub_id);


CREATE INDEX keywords_publication_key ON publication USING btree
(keywords);


CREATE UNIQUE INDEX pub_last_updated_publication_ke ON publication USING
btree (pub_last_updated);


CREATE UNIQUE INDEX pub_id_publication_key ON publication USING btree
(pub_id);



CREATE TABLE "pub_auth" (
        "pub_auth_id" integer DEFAULT
nextval('"pub_auth_temp_pub_auth_id_seq"'::text) NOT NULL,
        "pub_id" integer,
        "auth_id" integer,
        Constraint "pub_auth_temp_pkey" Primary Key ("pub_auth_id")
);



CREATE INDEX pub_id_pub_auth_key ON pub_auth USING btree (pub_id);


CREATE INDEX auth_id_pub_auth_key ON pub_auth USING btree (auth_id);


On Sat, 24 May 2003, Frank Bax wrote:

> Finding previous examples of complex joins in archives is not likely an
> easy thing to find.
>
> pg_dump -s -t author -t publication -t pub_auth [database] | grep -v ^--
>
> Change [database] to the name of your database - this command will dump out
> schema relative to your request.  Post the results to this list.  Then ask
> us the question "how do I write a SELECT that produces...[ you finish this
> sentence]".  Question probably more appropriate to the list you mentioned,
> but I expect there are people here who are just as capable of answering the
> question.  I've even seen examples where the process goes through several
> emails before SQL produces desired results exactly.
>
>  >How would a join make this easier?
>
> I have always found that one properly constructed complex query is always
> "cheaper" in runtime than numerous queries inside a foreach loop.  Your
> final query will likely include joining a table to itself (this can
> sometimes be a difficult concept to grasp).
>
> Frank
>
>
> At 11:50 AM 5/24/03, Chadwick Rolfs wrote:
>
>
> >I'm glad this came up, because I have the same type of problem.  Except,
> >I don't see how a join can work... of course, I'm not really schooled in
> >this stuff.
> >
> >I also have three tables: author, publication, and pub_auth.
> >
> >There are multiple authors for some publications, so it is necessary to
> >check each publication selected for ALL authors.  I'm doing this with a
> >foreach loop on the result of each publication key returned.
> >
> >How would a join make this easier?
> >
> >I'm browsing the pgsql-sql archives now, but that may take a week.  I'm
> >not sure what to search for...
> >


-Chadwick

Re: [PHP] faster output from php and postgres

From

Richard Huxton

Date:

27 May 2003, 14:19:06

On Tuesday 27 May 2003 5:34 pm, Chadwick Rolfs wrote:
> So, I have the same problem, but I need all authors for each publication
> to show up in it's own column.  I tried the full join query from a
> suggestion off pgsql-sql, but it only returns ONE author id TWICE instead
> of ALL authors at once.
>
> I'll do some RTFMing of the joins.. and post any results I get
>
> BUT, right now, looping over each publication with php isn't taking that
> long.  I would like to know how to make this query, though!
>
> Please let me know how to get a result like:
> ____________________________________________________
>
> |All Authors|Title|Source|Year|Type|Length|Keywords|

Well, if you search the archives for terms "text", "concat", "aggregate" you
should come up with one solution. This involves writing your own aggregate
function, like SUM() but for text. Don't worry, it's not difficult. The only
issue is that you won't be able to guarantee the order of authors in the
field.

There might be something on this in my PostgreSQL Notes on
http://techdocs.postgresql.org/ too - can't remember for sure.

--
  Richard Huxton

Re: [PHP] faster output from php and postgres

From

Chadwick Rolfs

Date:

27 May 2003, 14:33:38

On Tue, 27 May 2003, Richard Huxton wrote:

> On Tuesday 27 May 2003 5:34 pm, Chadwick Rolfs wrote:
> > So, I have the same problem, but I need all authors for each publication
> > to show up in it's own column.  I tried the full join query from a
> > suggestion off pgsql-sql, but it only returns ONE author id TWICE instead
> > of ALL authors at once.
> >
> > I'll do some RTFMing of the joins.. and post any results I get
> >
> > BUT, right now, looping over each publication with php isn't taking that
> > long.  I would like to know how to make this query, though!
> >
> > Please let me know how to get a result like:
> > ____________________________________________________
> >
> > |All Authors|Title|Source|Year|Type|Length|Keywords|
> 
> Well, if you search the archives for terms "text", "concat", "aggregate" you 
> should come up with one solution. This involves writing your own aggregate 
> function, like SUM() but for text. Don't worry, it's not difficult. The only 
> issue is that you won't be able to guarantee the order of authors in the 
> field.
> 
> There might be something on this in my PostgreSQL Notes on 
> http://techdocs.postgresql.org/ too - can't remember for sure.
> 
> -- 
>   Richard Huxton

A function is definitely in need.  But I already have PHP doing something
of this sort.  I realize that PostgreSQL would most likely do it faster.

What I meant by the result output above was that there would be _many_
results, but for each publication only ONE column for ALL authors in EACH
row. (*whew*)  So perhaps a function, but not an aggregate function.

I'll look into functions next, as it seems that there is no single SQL
statement that can do this... I can't see how.

-Chadwick

Re: [PHP] faster output from php and postgres

From

Rod Taylor

Date:

27 May 2003, 14:54:41

On Tue, 2003-05-27 at 14:19, Richard Huxton wrote:
> On Tuesday 27 May 2003 5:34 pm, Chadwick Rolfs wrote:
> > So, I have the same problem, but I need all authors for each publication
> > to show up in it's own column.  I tried the full join query from a
> > suggestion off pgsql-sql, but it only returns ONE author id TWICE instead
> > of ALL authors at once.
> >
> > I'll do some RTFMing of the joins.. and post any results I get
> >
> > BUT, right now, looping over each publication with php isn't taking that
> > long.  I would like to know how to make this query, though!
> >
> > Please let me know how to get a result like:
> > ____________________________________________________
> >
> > |All Authors|Title|Source|Year|Type|Length|Keywords|
>
> Well, if you search the archives for terms "text", "concat", "aggregate" you
> should come up with one solution. This involves writing your own aggregate
> function, like SUM() but for text. Don't worry, it's not difficult. The only
> issue is that you won't be able to guarantee the order of authors in the
> field.

If order is required:

SELECT custom_aggregate(author) as authors
  FROM (SELECT author FROM table ORDER BY author) AS tab;

The above should give you authors in alphabetical order if
custom_aggregate() was written to concatenate text.


--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

Attachment

signature.asc

Re: [PHP] faster output from php and postgres

From

Jean-Luc Lachance

Date:

27 May 2003, 15:35:19

KISS

why not use PHP to concatenate the authors while pub_id is the same???
If you insist on having each author in its own column,
put them at the end and concatenate with <TD>.

jll


Chadwick Rolfs wrote:
>
> So, I have the same problem, but I need all authors for each publication
> to show up in it's own column.  I tried the full join query from a
> suggestion off pgsql-sql, but it only returns ONE author id TWICE instead
> of ALL authors at once.
>
> I'll do some RTFMing of the joins.. and post any results I get
>
> BUT, right now, looping over each publication with php isn't taking that
> long.  I would like to know how to make this query, though!
>
> Please let me know how to get a result like:
> ____________________________________________________
> |All Authors|Title|Source|Year|Type|Length|Keywords|
> ----------------------------------------------------
>
> If there is a way on the SQL side to do this, that is ;>
>
> Here we go:
>
> CREATE TABLE "author" (
>         "auth_id" integer DEFAULT
> nextval('"author_temp_auth_id_seq"'::text) NOT NULL,
>         "first" text,
>         "last" text,
>         "auth_last_updated" timestamp with time zone,
>         Constraint "author_temp_pkey" Primary Key ("auth_id")
> );
>
> CREATE UNIQUE INDEX auth_id_author_key ON author USING btree (auth_id);
>
> CREATE UNIQUE INDEX auth_last_updated_author_key ON author USING btree
> (auth_last_updated);
>
> CREATE TABLE "publication" (
>         "copyis" text,
>         "pub_id" integer DEFAULT nextval('publication_pub_id_seq'::text)
> NOT NULL,
>         "title" text,
>         "source" text,
>         "year" text,
>         "month" text,
>         "length" text,
>         "type" text,
>         "keywords" text,
>         "copyright" text,
>         "abstract" text,
>         "pdformat" text,
>         "pub_last_updated" timestamp with time zone
> );
>
> CREATE UNIQUE INDEX publication_pub_id_key ON publication USING btree
> (pub_id);
>
> CREATE INDEX keywords_publication_key ON publication USING btree
> (keywords);
>
> CREATE UNIQUE INDEX pub_last_updated_publication_ke ON publication USING
> btree (pub_last_updated);
>
> CREATE UNIQUE INDEX pub_id_publication_key ON publication USING btree
> (pub_id);
>
> CREATE TABLE "pub_auth" (
>         "pub_auth_id" integer DEFAULT
> nextval('"pub_auth_temp_pub_auth_id_seq"'::text) NOT NULL,
>         "pub_id" integer,
>         "auth_id" integer,
>         Constraint "pub_auth_temp_pkey" Primary Key ("pub_auth_id")
> );
>
> CREATE INDEX pub_id_pub_auth_key ON pub_auth USING btree (pub_id);
>
> CREATE INDEX auth_id_pub_auth_key ON pub_auth USING btree (auth_id);
>
> On Sat, 24 May 2003, Frank Bax wrote:
>
> > Finding previous examples of complex joins in archives is not likely an
> > easy thing to find.
> >
> > pg_dump -s -t author -t publication -t pub_auth [database] | grep -v ^--
> >
> > Change [database] to the name of your database - this command will dump out
> > schema relative to your request.  Post the results to this list.  Then ask
> > us the question "how do I write a SELECT that produces...[ you finish this
> > sentence]".  Question probably more appropriate to the list you mentioned,
> > but I expect there are people here who are just as capable of answering the
> > question.  I've even seen examples where the process goes through several
> > emails before SQL produces desired results exactly.
> >
> >  >How would a join make this easier?
> >
> > I have always found that one properly constructed complex query is always
> > "cheaper" in runtime than numerous queries inside a foreach loop.  Your
> > final query will likely include joining a table to itself (this can
> > sometimes be a difficult concept to grasp).
> >
> > Frank
> >
> >
> > At 11:50 AM 5/24/03, Chadwick Rolfs wrote:
> >
> >
> > >I'm glad this came up, because I have the same type of problem.  Except,
> > >I don't see how a join can work... of course, I'm not really schooled in
> > >this stuff.
> > >
> > >I also have three tables: author, publication, and pub_auth.
> > >
> > >There are multiple authors for some publications, so it is necessary to
> > >check each publication selected for ALL authors.  I'm doing this with a
> > >foreach loop on the result of each publication key returned.
> > >
> > >How would a join make this easier?
> > >
> > >I'm browsing the pgsql-sql archives now, but that may take a week.  I'm
> > >not sure what to search for...
> > >
>
> -Chadwick
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Re: [PHP] faster output from php and postgres

From

Chadwick Rolfs

Date:

27 May 2003, 16:35:01

Yes, this may be better than foreach()ing through each publication
returned.

But a plpgsql function may suit these needs much better.

On Tue, 27 May 2003, Jean-Luc Lachance wrote:

> KISS
>
> why not use PHP to concatenate the authors while pub_id is the same???
> If you insist on having each author in its own column,
> put them at the end and concatenate with <TD>.
>
> jll
>
>
> Chadwick Rolfs wrote:
> >
> > So, I have the same problem, but I need all authors for each publication
> > to show up in it's own column.  I tried the full join query from a
> > suggestion off pgsql-sql, but it only returns ONE author id TWICE instead
> > of ALL authors at once.
> >
> > I'll do some RTFMing of the joins.. and post any results I get
> >
> > BUT, right now, looping over each publication with php isn't taking that
> > long.  I would like to know how to make this query, though!
> >
> > Please let me know how to get a result like:
> > ____________________________________________________
> > |All Authors|Title|Source|Year|Type|Length|Keywords|
> > ----------------------------------------------------
> >
> > If there is a way on the SQL side to do this, that is ;>
> >
> > Here we go:
> >
> > CREATE TABLE "author" (
> >         "auth_id" integer DEFAULT
> > nextval('"author_temp_auth_id_seq"'::text) NOT NULL,
> >         "first" text,
> >         "last" text,
> >         "auth_last_updated" timestamp with time zone,
> >         Constraint "author_temp_pkey" Primary Key ("auth_id")
> > );
> >
> > CREATE UNIQUE INDEX auth_id_author_key ON author USING btree (auth_id);
> >
> > CREATE UNIQUE INDEX auth_last_updated_author_key ON author USING btree
> > (auth_last_updated);
> >
> > CREATE TABLE "publication" (
> >         "copyis" text,
> >         "pub_id" integer DEFAULT nextval('publication_pub_id_seq'::text)
> > NOT NULL,
> >         "title" text,
> >         "source" text,
> >         "year" text,
> >         "month" text,
> >         "length" text,
> >         "type" text,
> >         "keywords" text,
> >         "copyright" text,
> >         "abstract" text,
> >         "pdformat" text,
> >         "pub_last_updated" timestamp with time zone
> > );
> >
> > CREATE UNIQUE INDEX publication_pub_id_key ON publication USING btree
> > (pub_id);
> >
> > CREATE INDEX keywords_publication_key ON publication USING btree
> > (keywords);
> >
> > CREATE UNIQUE INDEX pub_last_updated_publication_ke ON publication USING
> > btree (pub_last_updated);
> >
> > CREATE UNIQUE INDEX pub_id_publication_key ON publication USING btree
> > (pub_id);
> >
> > CREATE TABLE "pub_auth" (
> >         "pub_auth_id" integer DEFAULT
> > nextval('"pub_auth_temp_pub_auth_id_seq"'::text) NOT NULL,
> >         "pub_id" integer,
> >         "auth_id" integer,
> >         Constraint "pub_auth_temp_pkey" Primary Key ("pub_auth_id")
> > );
> >
> > CREATE INDEX pub_id_pub_auth_key ON pub_auth USING btree (pub_id);
> >
> > CREATE INDEX auth_id_pub_auth_key ON pub_auth USING btree (auth_id);
> >
> > On Sat, 24 May 2003, Frank Bax wrote:
> >
> > > Finding previous examples of complex joins in archives is not likely an
> > > easy thing to find.
> > >
> > > pg_dump -s -t author -t publication -t pub_auth [database] | grep -v ^--
> > >
> > > Change [database] to the name of your database - this command will dump out
> > > schema relative to your request.  Post the results to this list.  Then ask
> > > us the question "how do I write a SELECT that produces...[ you finish this
> > > sentence]".  Question probably more appropriate to the list you mentioned,
> > > but I expect there are people here who are just as capable of answering the
> > > question.  I've even seen examples where the process goes through several
> > > emails before SQL produces desired results exactly.
> > >
> > >  >How would a join make this easier?
> > >
> > > I have always found that one properly constructed complex query is always
> > > "cheaper" in runtime than numerous queries inside a foreach loop.  Your
> > > final query will likely include joining a table to itself (this can
> > > sometimes be a difficult concept to grasp).
> > >
> > > Frank
> > >
> > >
> > > At 11:50 AM 5/24/03, Chadwick Rolfs wrote:
> > >
> > >
> > > >I'm glad this came up, because I have the same type of problem.  Except,
> > > >I don't see how a join can work... of course, I'm not really schooled in
> > > >this stuff.
> > > >
> > > >I also have three tables: author, publication, and pub_auth.
> > > >
> > > >There are multiple authors for some publications, so it is necessary to
> > > >check each publication selected for ALL authors.  I'm doing this with a
> > > >foreach loop on the result of each publication key returned.
> > > >
> > > >How would a join make this easier?
> > > >
> > > >I'm browsing the pgsql-sql archives now, but that may take a week.  I'm
> > > >not sure what to search for...
> > > >
> >
> > -Chadwick
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 2: you can get off all lists at once with the unregister command
> >     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>

-Chadwick

Re: [PHP] faster output from php and postgres (one resolution)

From

Chadwick Rolfs

Date:

29 May 2003, 15:58:58

My solution to the problem stated below was to learn plpgsql, and write a
function that loops through the returned records, concatenating a string
together, and returning that string as the Author field.
This function was actually e-mailed to me by a collegue, and I did a bit
of doctoring (it still needs it) to get it functional (pun perhaps
intended :>):

----------------------------------------------------------------
create function show_auth(integer) returns varchar as '
declare
 a record;
 b alias for $1;
 s varchar;
 c varchar;
begin
 s := '''';
 c := '''';
 for a in
   select author.last
   from pub_auth, author
   where pub_auth.pub_id = b
   and author.auth_id = pub_auth.auth_id
   order by author.last
 loop
   s := s || c|| a.last;
   c := '', '';
 end loop;
 return s;
end;
' language 'plpgsql';
----------------------------------------------------------------

Although there were many ways to do this, as Jean-Luc points out below, I
believe in the long run this one is best.
This function will be expanded to actually handle the entire query that
was starting with:

select auth_id
  from author
  where
    lower(last) like lower('%".addslashes($search_term)."%')
  order by last, first

-OR-

select distinct pub_auth.pub_id, title, source, year, type, length,
copyright, abstract, pdformat, keywords
  from publication, pub_auth
  where
    keywords like '%".addslashes($search_term)."%'
    and
    publication.pub_id = pub_auth.pub_id
  order by year desc, title

In other words, two different functions.  I guess.

Hopefully this helps anyone else in the future!

On Tue, 27 May 2003, Jean-Luc Lachance wrote:

> KISS
>
> why not use PHP to concatenate the authors while pub_id is the same???
> If you insist on having each author in its own column,
> put them at the end and concatenate with <TD>.
>
> jll
>
>
> Chadwick Rolfs wrote:
> >
> > So, I have the same problem, but I need all authors for each publication
> > to show up in it's own column.  I tried the full join query from a
> > suggestion off pgsql-sql, but it only returns ONE author id TWICE instead
> > of ALL authors at once.
> >
> > I'll do some RTFMing of the joins.. and post any results I get
> >
> > BUT, right now, looping over each publication with php isn't taking that
> > long.  I would like to know how to make this query, though!
> >
> > Please let me know how to get a result like:
> > ____________________________________________________
> > |All Authors|Title|Source|Year|Type|Length|Keywords|
> > ----------------------------------------------------
> >
> > If there is a way on the SQL side to do this, that is ;>
> >
> > Here we go:
> >
> > CREATE TABLE "author" (
> >         "auth_id" integer DEFAULT
> > nextval('"author_temp_auth_id_seq"'::text) NOT NULL,
> >         "first" text,
> >         "last" text,
> >         "auth_last_updated" timestamp with time zone,
> >         Constraint "author_temp_pkey" Primary Key ("auth_id")
> > );
> >
> > CREATE UNIQUE INDEX auth_id_author_key ON author USING btree (auth_id);
> >
> > CREATE UNIQUE INDEX auth_last_updated_author_key ON author USING btree
> > (auth_last_updated);
> >
> > CREATE TABLE "publication" (
> >         "copyis" text,
> >         "pub_id" integer DEFAULT nextval('publication_pub_id_seq'::text)
> > NOT NULL,
> >         "title" text,
> >         "source" text,
> >         "year" text,
> >         "month" text,
> >         "length" text,
> >         "type" text,
> >         "keywords" text,
> >         "copyright" text,
> >         "abstract" text,
> >         "pdformat" text,
> >         "pub_last_updated" timestamp with time zone
> > );
> >
> > CREATE UNIQUE INDEX publication_pub_id_key ON publication USING btree
> > (pub_id);
> >
> > CREATE INDEX keywords_publication_key ON publication USING btree
> > (keywords);
> >
> > CREATE UNIQUE INDEX pub_last_updated_publication_ke ON publication USING
> > btree (pub_last_updated);
> >
> > CREATE UNIQUE INDEX pub_id_publication_key ON publication USING btree
> > (pub_id);
> >
> > CREATE TABLE "pub_auth" (
> >         "pub_auth_id" integer DEFAULT
> > nextval('"pub_auth_temp_pub_auth_id_seq"'::text) NOT NULL,
> >         "pub_id" integer,
> >         "auth_id" integer,
> >         Constraint "pub_auth_temp_pkey" Primary Key ("pub_auth_id")
> > );
> >
> > CREATE INDEX pub_id_pub_auth_key ON pub_auth USING btree (pub_id);
> >
> > CREATE INDEX auth_id_pub_auth_key ON pub_auth USING btree (auth_id);
> >
> > On Sat, 24 May 2003, Frank Bax wrote:
> >
> > > Finding previous examples of complex joins in archives is not likely an
> > > easy thing to find.
> > >
> > > pg_dump -s -t author -t publication -t pub_auth [database] | grep -v ^--
> > >
> > > Change [database] to the name of your database - this command will dump out
> > > schema relative to your request.  Post the results to this list.  Then ask
> > > us the question "how do I write a SELECT that produces...[ you finish this
> > > sentence]".  Question probably more appropriate to the list you mentioned,
> > > but I expect there are people here who are just as capable of answering the
> > > question.  I've even seen examples where the process goes through several
> > > emails before SQL produces desired results exactly.
> > >
> > >  >How would a join make this easier?
> > >
> > > I have always found that one properly constructed complex query is always
> > > "cheaper" in runtime than numerous queries inside a foreach loop.  Your
> > > final query will likely include joining a table to itself (this can
> > > sometimes be a difficult concept to grasp).
> > >
> > > Frank
> > >
> > >
> > > At 11:50 AM 5/24/03, Chadwick Rolfs wrote:
> > >
> > >
> > > >I'm glad this came up, because I have the same type of problem.  Except,
> > > >I don't see how a join can work... of course, I'm not really schooled in
> > > >this stuff.
> > > >
> > > >I also have three tables: author, publication, and pub_auth.
> > > >
> > > >There are multiple authors for some publications, so it is necessary to
> > > >check each publication selected for ALL authors.  I'm doing this with a
> > > >foreach loop on the result of each publication key returned.
> > > >
> > > >How would a join make this easier?
> > > >
> > > >I'm browsing the pgsql-sql archives now, but that may take a week.  I'm
> > > >not sure what to search for...
> > > >
> >
> > -Chadwick
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 2: you can get off all lists at once with the unregister command
> >     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>

-Chadwick