Thread: data-mining
Hello all,

I was wondering if PostgreSQL would be able to handle data-mining and data-warehousing concepts. I have a project for class in which I have to get at least two rules of data-mining working. If anyone has done this with PostgreSQL, please point me in the right direction. Would I need additional software, or can I just use the features of PostgreSQL? Also, could anyone point me in the right direction so I can find a large data set to use with data-mining?

Thanks if anyone can help :-)

--
Antoine <asolomon15@nyc.rr.com>
Antoine,

> Hello all, I was wondering if PostgreSQL would be able to handle
> data-mining and data-warehousing concepts. I have a project for class
> in which I have to get at least two rules of data-mining working. If
> anyone has done this with PostgreSQL, please point me in the right
> direction. Would I need additional software or just use the features
> of PostgreSQL? Also could anyone point me in the right direction so I
> can find a large data set to use with data-mining?

Given that "data-mining" and "data-warehousing" are just branding names for "big databases with lots of relational historical information", this should be no problem on PostgreSQL. Whether or not you need third-party tools depends on what kind of data mining you want to do. Of particular interest should be Joe Conway's PL/R project, which adds some sophisticated statistical and graphical capabilities to PostgreSQL.

For your data, someone on the PGSQL-PERFORMANCE list posted a link to the TigerUSA database derived from the last US Census. This should provide you with plenty of material. Search the online list archives for the link, or google for it.

--
Josh Berkus
Aglio Database Solutions
San Francisco
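To illustrate the point that plain SQL may be enough for a class project, here is a minimal sketch that computes support and confidence for one association rule. It assumes that "rules of data-mining" means association rules, which the thread does not confirm. The `purchases` table, its columns, and the sample rows are hypothetical. The sketch uses Python's built-in `sqlite3` module so that it runs standalone; the queries are ordinary SQL and should carry over to PostgreSQL, where a driver such as psycopg2 uses `%s` placeholders instead of `?`.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical purchases table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (basket_id INTEGER, item TEXT)")
    rows = [
        (1, "bread"), (1, "milk"),
        (2, "bread"), (2, "butter"),
        (3, "bread"), (3, "milk"), (3, "butter"),
        (4, "milk"),
    ]
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    return conn


def rule_stats(conn, lhs, rhs):
    """Return (support, confidence) for the association rule lhs -> rhs."""
    # Number of distinct baskets overall.
    total = conn.execute(
        "SELECT COUNT(DISTINCT basket_id) FROM purchases"
    ).fetchone()[0]
    # Baskets that contain the left-hand item.
    lhs_count = conn.execute(
        "SELECT COUNT(DISTINCT basket_id) FROM purchases WHERE item = ?",
        (lhs,),
    ).fetchone()[0]
    # Baskets that contain both items, found with a self-join.
    both = conn.execute(
        "SELECT COUNT(DISTINCT a.basket_id) "
        "FROM purchases a JOIN purchases b ON a.basket_id = b.basket_id "
        "WHERE a.item = ? AND b.item = ?",
        (lhs, rhs),
    ).fetchone()[0]
    if total == 0 or lhs_count == 0:
        return 0.0, 0.0
    return both / total, both / lhs_count


if __name__ == "__main__":
    support, confidence = rule_stats(build_demo_db(), "bread", "milk")
    print(f"bread -> milk: support={support:.2f}, confidence={confidence:.2f}")
```

Support is the fraction of all baskets that contain both items; confidence is the fraction of baskets with the left-hand item that also contain the right-hand item. Anything beyond counts like these (clustering, regression and so on) is where a tool such as PL/R might come in.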
On Tue, 2003-04-29 at 11:45, Josh Berkus wrote:
> Antoine,
>
> > Hello all, I was wondering if PostgreSQL would be able to handle
> > data-mining and data-warehousing concepts. I have a project for class
> > in which I have to get at least two rules of data-mining working. If
> > anyone has done this with PostgreSQL, please point me in the right
> > direction. Would I need additional software or just use the features
> > of PostgreSQL? Also could anyone point me in the right direction so I
> > can find a large data set to use with data-mining?
>
> Given that "data-mining" and "data-warehousing" are just branding names for
> "big databases with lots of relational historical information", this should
> be no problem on PostgreSQL. Whether or not you need third-party tools depends
> on what kind of data mining you want to do. Of particular interest should
> be Joe Conway's PL/R project, which adds some sophisticated statistical and
> graphical capabilities to PostgreSQL.
>
> For your data, someone on the PGSQL-PERFORMANCE list posted a link to the
> TigerUSA database derived from the last US Census. This should provide you
> with plenty of material. Search the online list archives for the link, or
> google for it.

All I have to do is get some sort of data-mining algorithm working and submit it to my professor. I took a look at the TigerUSA database and it appears to be very large. I am not sure how I would get it into the database. It appears that this Census data comes in many zip files with different file formats.

--
Antoine <asolomon15@nyc.rr.com>
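One possible first step with a download made up of many zip files in different formats is to take an inventory before loading anything. The sketch below counts the archive members by file extension, so you can see which formats are present and decide which ones are worth importing. The function name and the `census_download` directory are hypothetical, and nothing here is specific to the Census files; it only relies on Python's standard `zipfile` module.

```python
import zipfile
from collections import Counter
from pathlib import Path


def inventory(zip_paths):
    """Count archive members by lowercase file extension across zip files."""
    counts = Counter()
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue  # skip directory entries, count only files
                counts[Path(info.filename).suffix.lower()] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical location of the downloaded archives.
    archives = sorted(Path("census_download").glob("*.zip"))
    for extension, count in inventory(archives).most_common():
        print(f"{extension or '(no extension)'}: {count} file(s)")
```

Once the formats are known, a small subset (one county or one file type, say) is likely enough for a class project and can be loaded with PostgreSQL's `COPY` command after conversion to a delimited text format.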