Thread: total newbie question: windows download (binary)

total newbie question: windows download (binary)

From

Mark Moser

Date:

02 August 2002, 09:34:08

The PostGresQL website/ftpsite is confusing:  are
there any Windows downloadable binaries of PostGresQL?


thanks!
-Mark

__________________________________________________
Do You Yahoo!?
Yahoo! Health - Feel better, live better
http://health.yahoo.com

Re: total newbie question: windows download (binary)

From

"Henshall, Stuart - WCP"

Date:

02 August 2002, 10:08:26

You can use cygwin to run PostgreSQL on windows (www.cygwin.com).
There is a prebuilt binary with the distribution.
however you will need to also install cygipc from:
http://www.neuro.gatech.edu/users/cwilson/cygutils/cygipc/index.html
when the ipc-daemon is running you should then be able to initdb.
The cygwin distribution comes with a readme for more info.
Also there is a pgsql-cygwin mailing list.
hth,
- Stuart

> -----Original Message-----
> From: Mark Moser [mailto:markmoser@yahoo.com]
> Sent: 31 July 2002 19:32
> To: pgsql-novice@postgresql.org
> Subject: [NOVICE] total newbie question: windows download (binary)
>
>
> The PostGresQL website/ftpsite is confusing: are
> there any Windows downloadable binaries of PostGresQL?
>
>
> thanks!
> -Mark

Aggregates and Indexes

From

"Adam Erickson"

Date:

02 August 2002, 14:20:34

Their has been some discussion regarding aggregate functions and indexes.
It's clear that aggregate functions will not take advantage of any indexes.
(ie. SELECT sum(number) FROM table WHERE type=BLAH will not take advantage
of 'type' being indexed.)

I do a lot of reporting based on such SQL statements.  Counting rows from
large datasets.  Since the PG gurus don't seem to think this is such a big
deal can someone enlighten me as to why?

Thanks,
Adam Erickson

Re: Aggregates and Indexes

From

Josh Berkus

Date:

02 August 2002, 16:20:08

Adam,

> I do a lot of reporting based on such SQL statements.  Counting rows from
> large datasets.  Since the PG gurus don't seem to think this is such a big
> deal can someone enlighten me as to why?

I am not a core developer, but I will take a crack at your question anyway
based on my personal knowledge.  I am sure that Tom, Bruce, or Stephan will
correct my mistaken assumptions.

Actually, the inability to index MAX, MIN, and COUNT is an annoyance I think
everyone would like to fix; it's just that the technical challenge is much
greater than the benefit for the core team.   If you know a programmer who
wants to tackle it, go ahead.

Presumably you've already read the many previous e-mails on why it is a
techincal challenge.

Now, even if that challenge were solved, indexing for aggregates would still
be of limited usefulness because:

Few (if any) RDBMSs can index for SUM, STDEV, or other "calcuation"
aggregates.  This is because the value of every single record must be incuded
and estimates are not possible, so the performance gain from using an index
is infinitessimal except for those RDBMSs with very slow file access times.

For Postgres custom aggregates, using a standard index is impossible, for
reasons I think are obvious.

That leaves MAX, MIN, and COUNT.    All of these aggregates should, in an
ideal world, be index-responsive for large data sets.  Once again, for small
data sets or subsets, indexes are not useful.   And there is a workaround for
Min and Max.

So what we'd be looking at is either developing a special parser routine for
MIN, MAX, and COUNT (and them only) just to index for those aggregates, or
coming up with a new type of index just for aggregates.   The first approach
is a bit of kludge that would require a lot of debugging; the second is
probably the best long-term solution, but would require a great deal of
innovation.


--
-Josh Berkus
 Aglio Database Solutions
 San Francisco

Re: Aggregates and Indexes

From

Tom Lane

Date:

02 August 2002, 16:44:21

Josh Berkus <josh@agliodbs.com> writes:
> For Postgres custom aggregates, using a standard index is impossible, for
> reasons I think are obvious.

> That leaves MAX, MIN, and COUNT.    All of these aggregates should, in an
> ideal world, be index-responsive for large data sets.

While it's fairly clear how one might use an index for MAX/MIN
(basically, make the optimizer transform it into a SELECT ... ORDER BY
... LIMIT 1 operation, which can then be done with an indexscan),
I really don't see how an index can help for COUNT.

The real problem with COUNT is that any attempt to maintain such a value
on-the-fly creates a single-point bottleneck for all insertions and
deletions on the table.  The perspective of most of the developers is
that that cost outweighs any possible savings from having an
instantaneous COUNT operation.

When you add in the issue that under MVCC there isn't a unique COUNT
that's the same for all transactions, it's just not worth thinking
about.  (And do I need to point out that with WHERE conditions,
GROUP BY, or a variable COUNT argument, all hope of such optimization
disappears anyway?  A global rowcount doesn't help in those cases.)

The MAX/MIN issue will probably be addressed someday, but since there
is a good workaround it's not very high on anyone's TODO queue.  We have
many more-pressing problems.

            regards, tom lane

Re: Aggregates and Indexes

From

Ron Johnson

Date:

02 August 2002, 20:27:37

On Fri, 2002-08-02 at 15:43, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > For Postgres custom aggregates, using a standard index is impossible, for
> > reasons I think are obvious.
>
> > That leaves MAX, MIN, and COUNT.    All of these aggregates should, in an
> > ideal world, be index-responsive for large data sets.
>
> While it's fairly clear how one might use an index for MAX/MIN
> (basically, make the optimizer transform it into a SELECT ... ORDER BY
> ... LIMIT 1 operation, which can then be done with an indexscan),
> I really don't see how an index can help for COUNT.
>
> The real problem with COUNT is that any attempt to maintain such a value
> on-the-fly creates a single-point bottleneck for all insertions and
> deletions on the table.  The perspective of most of the developers is
> that that cost outweighs any possible savings from having an
> instantaneous COUNT operation.

Blecch!  Who wants to keep current counts?????  You are right
that that is totally silly.  The closed-source database I use
(Rdb/VMS on Alpha VMS) takes great advantage of indexes when
processing COUNT, SUM, MIN & MAX.

For example:
CREATE INDEX ndx1 on table1 (fld1, fld2);
SELECT fld1, COUNT(*)
FROM table1
WHERE fld1 = 'bar'
GROUP BY fld1;

Since the index stores key-values and oids that point to the
data, Rdb looks at the "wedge" of ndx1 that matches "fld1 = 'bar'"
and counts up the oids that meet the criteria.

In more complicated queries, where there is only partial index
support for the query, yes, you will have to hit the live table,
but the index should minimize the number of pages in the table
that the must read.  Thus, the index still speeds up the query.

> When you add in the issue that under MVCC there isn't a unique COUNT
> that's the same for all transactions, it's just not worth thinking
> about.  (And do I need to point out that with WHERE conditions,
> GROUP BY, or a variable COUNT argument, all hope of such optimization
> disappears anyway?  A global rowcount doesn't help in those cases.)
>
> The MAX/MIN issue will probably be addressed someday, but since there
> is a good workaround it's not very high on anyone's TODO queue.  We have
> many more-pressing problems.

--
+-----------------------------------------------------------------+
| Ron Johnson, Jr.        Home: ron.l.johnson@cox.net             |
| Jefferson, LA  USA                                              |
|                                                                 |
| "The greatest dangers to liberty lurk in insidious encroachment |
|  by men of zeal, well-meaning, but without understanding."      |
|   Justice Louis Brandeis, dissenting, Olmstead v US (1928)      |
+-----------------------------------------------------------------+

Re: [SQL] Aggregates and Indexes

From

Bruce Momjian

Date:

06 August 2002, 12:49:22

Josh Berkus wrote:
>
> Adam,
>
> > I do a lot of reporting based on such SQL statements.  Counting rows from
> > large datasets.  Since the PG gurus don't seem to think this is such a big
> > deal can someone enlighten me as to why?
>
> I am not a core developer, but I will take a crack at your question anyway
> based on my personal knowledge.  I am sure that Tom, Bruce, or Stephan will
> correct my mistaken assumptions.

I have seen a few mentions in the past weeks about core vs. non-core
developers.  I should reiterate that the core group feels there is no
distinction between the opinions of the core people and the other major
developers.  Everyone gets just one vote.

The only reason for core is to deal with certain "sensitive" issues that
can't be dealt with in public forums, and making _big_ decisions like
should we release on Friday or Monday.  ;-)

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026