Thread: total newbie question: windows download (binary)
The PostgreSQL website/FTP site is confusing: are there any downloadable Windows binaries of PostgreSQL?

thanks!
-Mark
You can use Cygwin to run PostgreSQL on Windows (www.cygwin.com).
There is a prebuilt binary with the distribution.
However, you will also need to install cygipc from:
http://www.neuro.gatech.edu/users/cwilson/cygutils/cygipc/index.html
Once the ipc-daemon is running, you should be able to run initdb.
The cygwin distribution comes with a readme for more info.
Also there is a pgsql-cygwin mailing list.
hth,
- Stuart
> -----Original Message-----
> From: Mark Moser [mailto:markmoser@yahoo.com]
> Sent: 31 July 2002 19:32
> To: pgsql-novice@postgresql.org
> Subject: [NOVICE] total newbie question: windows download (binary)
>
>
> The PostGresQL website/ftpsite is confusing: are
> there any Windows downloadable binaries of PostGresQL?
>
>
> thanks!
> -Mark
There has been some discussion regarding aggregate functions and indexes. It's clear that aggregate functions will not take advantage of any indexes (i.e. SELECT sum(number) FROM table WHERE type=BLAH will not take advantage of 'type' being indexed).

I do a lot of reporting based on such SQL statements, counting rows from large datasets. Since the PG gurus don't seem to think this is such a big deal, can someone enlighten me as to why?

Thanks,
Adam Erickson
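For readers following along, the shape of the reporting query Adam describes can be sketched with any SQL engine. The illustration below uses Python's sqlite3 standard library with invented table and column names; it shows the pattern, not PostgreSQL's planner behavior:

```python
import sqlite3

# In-memory database standing in for a reporting table (names are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (type TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO events (type, number) VALUES (?, ?)",
    [("click", 3), ("click", 5), ("view", 2), ("view", 7), ("click", 1)],
)
# An index on 'type' can narrow the rows matched by the WHERE clause,
# but the sum itself still has to read every matching row's value.
conn.execute("CREATE INDEX events_type_idx ON events (type)")

total = conn.execute(
    "SELECT sum(number) FROM events WHERE type = 'click'"
).fetchone()[0]
print(total)  # 3 + 5 + 1 = 9
```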
Adam,

> I do a lot of reporting based on such SQL statements. Counting rows from
> large datasets. Since the PG gurus don't seem to think this is such a big
> deal can someone enlighten me as to why?

I am not a core developer, but I will take a crack at your question anyway based on my personal knowledge. I am sure that Tom, Bruce, or Stephan will correct my mistaken assumptions.

Actually, the inability to index MAX, MIN, and COUNT is an annoyance I think everyone would like to fix; it's just that the technical challenge is much greater than the benefit for the core team. If you know a programmer who wants to tackle it, go ahead. Presumably you've already read the many previous e-mails on why it is a technical challenge.

Now, even if that challenge were solved, indexing for aggregates would still be of limited usefulness, because:

Few (if any) RDBMSs can index for SUM, STDEV, or other "calculation" aggregates. This is because the value of every single record must be included and estimates are not possible, so the performance gain from using an index is infinitesimal except for those RDBMSs with very slow file access times.

For Postgres custom aggregates, using a standard index is impossible, for reasons I think are obvious.

That leaves MAX, MIN, and COUNT. All of these aggregates should, in an ideal world, be index-responsive for large data sets. Once again, for small data sets or subsets, indexes are not useful. And there is a workaround for MIN and MAX.

So what we'd be looking at is either developing a special parser routine for MIN, MAX, and COUNT (and them only) just to index for those aggregates, or coming up with a new type of index just for aggregates. The first approach is a bit of a kludge that would require a lot of debugging; the second is probably the best long-term solution, but would require a great deal of innovation.

--
-Josh Berkus
Aglio Database Solutions
San Francisco
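The MIN/MAX workaround Josh alludes to is rewriting the aggregate as an ORDER BY ... LIMIT 1 query, which can walk an index instead of scanning the whole table. A minimal sketch using Python's sqlite3 module (table name invented; the PostgreSQL of this era required writing the rewritten form by hand):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(v,) for v in (42, 7, 99, 13)])
conn.execute("CREATE INDEX t_x_idx ON t (x)")

# MAX(x) and MIN(x) rewritten as index-friendly ORDER BY ... LIMIT 1 queries:
mx = conn.execute("SELECT x FROM t ORDER BY x DESC LIMIT 1").fetchone()[0]
mn = conn.execute("SELECT x FROM t ORDER BY x ASC LIMIT 1").fetchone()[0]
print(mx, mn)  # 99 7
```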
Josh Berkus <josh@agliodbs.com> writes:
> For Postgres custom aggregates, using a standard index is impossible, for
> reasons I think are obvious.

> That leaves MAX, MIN, and COUNT. All of these aggregates should, in an
> ideal world, be index-responsive for large data sets.

While it's fairly clear how one might use an index for MAX/MIN (basically, make the optimizer transform it into a SELECT ... ORDER BY ... LIMIT 1 operation, which can then be done with an indexscan), I really don't see how an index can help for COUNT.

The real problem with COUNT is that any attempt to maintain such a value on-the-fly creates a single-point bottleneck for all insertions and deletions on the table. The perspective of most of the developers is that that cost outweighs any possible savings from having an instantaneous COUNT operation. When you add in the issue that under MVCC there isn't a unique COUNT that's the same for all transactions, it's just not worth thinking about. (And do I need to point out that with WHERE conditions, GROUP BY, or a variable COUNT argument, all hope of such optimization disappears anyway? A global rowcount doesn't help in those cases.)

The MAX/MIN issue will probably be addressed someday, but since there is a good workaround it's not very high on anyone's TODO queue. We have many more-pressing problems.

regards, tom lane
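The on-the-fly count maintenance Tom argues against can be sketched with triggers: every insert and delete then funnels through a single counter row, which is exactly the serialization bottleneck he describes. An illustration using Python's sqlite3 (names invented; this is a user-level sketch, not a PostgreSQL feature):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (x INTEGER);
CREATE TABLE t_rowcount (n INTEGER);
INSERT INTO t_rowcount VALUES (0);

-- Every insert and delete updates this one counter row: the
-- single-point bottleneck for concurrent writers.
CREATE TRIGGER t_ins AFTER INSERT ON t
BEGIN UPDATE t_rowcount SET n = n + 1; END;
CREATE TRIGGER t_del AFTER DELETE ON t
BEGIN UPDATE t_rowcount SET n = n - 1; END;
""")

conn.executemany("INSERT INTO t (x) VALUES (?)", [(i,) for i in range(5)])
conn.execute("DELETE FROM t WHERE x = 0")

n = conn.execute("SELECT n FROM t_rowcount").fetchone()[0]
print(n)  # 5 inserts - 1 delete = 4
```

Note this also illustrates Tom's MVCC point in reverse: a single stored counter cannot give each concurrent transaction the count appropriate to its own snapshot.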
On Fri, 2002-08-02 at 15:43, Tom Lane wrote:
> While it's fairly clear how one might use an index for MAX/MIN
> (basically, make the optimizer transform it into a SELECT ... ORDER BY
> ... LIMIT 1 operation, which can then be done with an indexscan),
> I really don't see how an index can help for COUNT.
>
> The real problem with COUNT is that any attempt to maintain such a value
> on-the-fly creates a single-point bottleneck for all insertions and
> deletions on the table. The perspective of most of the developers is
> that that cost outweighs any possible savings from having an
> instantaneous COUNT operation.

Blecch! Who wants to keep current counts????? You are right that that is totally silly.

The closed-source database I use (Rdb/VMS on Alpha VMS) takes great advantage of indexes when processing COUNT, SUM, MIN & MAX. For example:

CREATE INDEX ndx1 ON table1 (fld1, fld2);
SELECT fld1, COUNT(*) FROM table1 WHERE fld1 = 'bar' GROUP BY fld1;

Since the index stores key values and OIDs that point to the data, Rdb looks at the "wedge" of ndx1 that matches "fld1 = 'bar'" and counts up the OIDs that meet the criteria. In more complicated queries, where there is only partial index support for the query, yes, you will have to hit the live table, but the index should minimize the number of pages in the table that it must read. Thus, the index still speeds up the query.

> When you add in the issue that under MVCC there isn't a unique COUNT
> that's the same for all transactions, it's just not worth thinking
> about. (And do I need to point out that with WHERE conditions,
> GROUP BY, or a variable COUNT argument, all hope of such optimization
> disappears anyway? A global rowcount doesn't help in those cases.)
>
> The MAX/MIN issue will probably be addressed someday, but since there
> is a good workaround it's not very high on anyone's TODO queue. We have
> many more-pressing problems.

--
+-----------------------------------------------------------------+
| Ron Johnson, Jr.     Home: ron.l.johnson@cox.net                 |
| Jefferson, LA  USA                                               |
|                                                                  |
| "The greatest dangers to liberty lurk in insidious encroachment  |
|  by men of zeal, well-meaning, but without understanding."       |
|  Justice Louis Brandeis, dissenting, Olmstead v US (1928)        |
+-----------------------------------------------------------------+
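Ron's Rdb example can be imitated in any engine with covering-index scans: because ndx1 leads on fld1, the COUNT for a single fld1 value can be answered from the index "wedge" alone. The sketch below uses Python's sqlite3 to show the query shape (it illustrates the idea, not Rdb's actual implementation):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (fld1 TEXT, fld2 INTEGER)")
conn.executemany(
    "INSERT INTO table1 (fld1, fld2) VALUES (?, ?)",
    [("bar", 1), ("bar", 2), ("foo", 3), ("bar", 4), ("baz", 5)],
)
# The index leads on fld1, so all rows with fld1 = 'bar' sit in one
# contiguous "wedge" of index entries that can be counted directly.
conn.execute("CREATE INDEX ndx1 ON table1 (fld1, fld2)")

row = conn.execute(
    "SELECT fld1, COUNT(*) FROM table1 WHERE fld1 = 'bar' GROUP BY fld1"
).fetchone()
print(row)  # ('bar', 3)
```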
Josh Berkus wrote:
> I am not a core developer, but I will take a crack at your question anyway
> based on my personal knowledge. I am sure that Tom, Bruce, or Stephan will
> correct my mistaken assumptions.

I have seen a few mentions in the past weeks about core vs. non-core developers. I should reiterate that the core group feels there is no distinction between the opinions of the core people and the other major developers. Everyone gets just one vote. The only reason for core is to deal with certain "sensitive" issues that can't be dealt with in public forums, and making _big_ decisions like should we release on Friday or Monday. ;-)

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026