Thread: data-mining

data-mining

From
Antoine
Date:
Hello all, I was wondering if PostgreSQL would be able to handle
data-mining and data-warehousing concepts.  I have a project for class
in which I have to get at least two rules of data-mining working.  If
anyone has done this with PostgreSQL, please point me in the right
direction.  Would I need additional software, or can I just use the
features of PostgreSQL?  Also, could anyone point me in the right
direction to find a large data set to use with data-mining?

Thanks if anyone can help :-)
--
Antoine <asolomon15@nyc.rr.com>


Re: data-mining

From
Josh Berkus
Date:
Antoine,

> Hello all, I was wondering if PostgreSQL would be able to handle
> data-mining and data-warehousing concepts.  I have a project for class
> in which I have to get at least two rules of data-mining working.  If
> anyone has done this with PostgreSQL, please point me in the right
> direction.  Would I need additional software, or can I just use the
> features of PostgreSQL?  Also, could anyone point me in the right
> direction to find a large data set to use with data-mining?

Given that "data-mining" and "data-warehousing" are just branding names for
"big databases with lots of relational historical information", this should
be no problem on PostgreSQL.  Whether or not you need 3rd-party tools depends
on what kind of data mining you want to do.   Of particular interest should
be Joe Conway's PL/R project, which adds some sophisticated statistical and
graphical capabilities to PostgreSQL.
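
If "rules" here means association rules (an assumption; the question does not
say), plain SQL joins and aggregates can already compute their support and
confidence. The sketch below is only an illustration: it uses Python's built-in
sqlite3 as a stand-in so that it runs without a server, and the `basket` table
and its sample rows are hypothetical. The query sticks to standard SQL, so it
should carry over to PostgreSQL with little or no change.

```python
import sqlite3

# Toy market-basket data as (basket_id, item) pairs; entirely made up.
ROWS = [
    (1, "bread"), (1, "milk"),
    (2, "bread"), (2, "butter"), (2, "milk"),
    (3, "bread"), (3, "butter"),
    (4, "milk"),
]

# For every rule "lhs -> rhs":
#   support    = baskets containing both items / all baskets
#   confidence = baskets containing both items / baskets containing lhs
# Assumes each (basket_id, item) pair appears at most once.
RULES_SQL = """
SELECT a.item AS lhs,
       b.item AS rhs,
       COUNT(*) AS both_count,
       COUNT(*) * 1.0 / (SELECT COUNT(DISTINCT basket_id) FROM basket) AS support,
       COUNT(*) * 1.0 / (SELECT COUNT(*) FROM basket c WHERE c.item = a.item) AS confidence
FROM basket a
JOIN basket b ON a.basket_id = b.basket_id AND a.item <> b.item
GROUP BY a.item, b.item
ORDER BY lhs, rhs
"""


def mine_rules(rows):
    """Load the rows into an in-memory table and return all pairwise rules."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE basket (basket_id INTEGER, item TEXT)")
    conn.executemany("INSERT INTO basket VALUES (?, ?)", rows)
    result = conn.execute(RULES_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for lhs, rhs, n, support, confidence in mine_rules(ROWS):
        print(f"{lhs} -> {rhs}: support={support:.2f} confidence={confidence:.2f}")
```

Filtering the result on a minimum support and confidence would turn this into
a very small rule miner; anything more elaborate is where third-party tools
may start to pay off.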

For your data, someone on the PGSQL-PERFORMANCE list posted a link to the
TigerUSA database derived from the last US Census.   This should provide you
plenty of material.   Search the online list archives for the link, or google
for it.

--
Josh Berkus
Aglio Database Solutions
San Francisco


Re: data-mining

From
Antoine
Date:
On Tue, 2003-04-29 at 11:45, Josh Berkus wrote:
> Antoine,
>
> > Hello all, I was wondering if PostgreSQL would be able to handle
> > data-mining and data-warehousing concepts.  I have a project for class
> > in which I have to get at least two rules of data-mining working.  If
> > anyone has done this with PostgreSQL, please point me in the right
> > direction.  Would I need additional software, or can I just use the
> > features of PostgreSQL?  Also, could anyone point me in the right
> > direction to find a large data set to use with data-mining?
>
> Given that "data-mining" and "data-warehousing" are just branding names for
> "big databases with lots of relational historical information", this should
> be no problem on PostgreSQL.  Whether or not you need 3rd-party tools depends
> on what kind of data mining you want to do.   Of particular interest should
> be Joe Conway's PL/R project, which adds some sophisticated statistical and
> graphical capabilities to PostgreSQL.
>
> For your data, someone on the PGSQL-PERFORMANCE list posted a link to the
> TigerUSA database derived from the last US Census.   This should provide you
> plenty of material.   Search the online list archives for the link, or google
> for it.

All I have to do is get some sort of data-mining algorithm working and
submit it to my professor.  I took a look at the TigerUSA database and it
appears very large.  I am not sure how I would be able to get that into
the database.  It appears that this Census data comes in many zip files
with different file formats.
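
One way to get a handle on a download like this is to take stock of what the
archives contain before trying to load anything. The following is a minimal
sketch using only Python's standard library; the function name `inventory` and
the directory layout (a folder of `.zip` files) are assumptions, and nothing
here depends on the actual Census file formats.

```python
import zipfile
from collections import Counter
from pathlib import Path


def inventory(download_dir):
    """Count archive members by file extension across all .zip files in a directory."""
    counts = Counter()
    for archive in sorted(Path(download_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            for name in zf.namelist():
                if name.endswith("/"):
                    continue  # skip directory entries
                # Files without an extension are grouped under "(none)".
                counts[Path(name).suffix.lower() or "(none)"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical location of the downloaded archives.
    for suffix, n in inventory("downloads").most_common():
        print(f"{suffix}: {n} file(s)")
```

With a summary like that in hand, it may be enough to pick one file format and
a small subset of the files for a class project rather than loading everything.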
--
Antoine <asolomon15@nyc.rr.com>