Re: SQL Loader? - Mailing list pgsql-general
From | Guy Fraser |
---|---|
Subject | Re: SQL Loader? |
Date | |
Msg-id | 3B9CF40A.1D84D909@incentre.net Whole thread Raw |
List | pgsql-general |
Joel Pang wrote: > > Hi Fraser, > > I've read your mail in postgresql web site that you've a procedure that will do bulk loading of records into database. I've been looking a utility just like the sql loader of Oracle for the Postgres database. I was wondering if youwill share your procedure will me? or give me some URL links that I can get reference on how to write my own sql loaderprocedure? > > Thanks in advance for your help. > > Cheers, > Joel Please not the script below uses some special program to modify radius accounting files for import as an ordered tab delimited file format. The script also shows a way to import into an intermediate table for seperating different uniqu records into seperate files then moving the duplicate entries to a table for data integrity. This also demonstrates a method of using tables determined from the data being imported. I wrote this software a long time ago and have spent little time patching or rewriting. Since the script was initially written, I now know some better ways of performing some of these tasks. I don't have time to rewrite the script and it has worked for over 3 years so it is low priority. Guy Fraser ======== begin radimport ======== #!/bin/bash ORIG=`pwd` WORKDIR='/usr/local/pgsql' cd $WORKDIR echo Start collecting files wget -P $WORKDIR/ -t 10 -T 30 -c \ ftp://username:password@host.domain/path/*.acct.gz # modified for security echo Finished collecting files echo Start decompressing files gunzip -v *.acct.gz echo Finished decompressing files for fname in `ls *.acct`; do { [ -f $fname ] && { date echo Start inputing $fname # parserad - convert radius detail file to tab delimited format. /usr/local/sbin/parserad -i $fname \ | /usr/local/pgsql/bin/psql -c "\ copy detail from stdin;" radius echo Finished inputing $fname echo Start compressing $fname gzip -v $fname echo Finished compressing $fname # # # Clean up detail # /usr/local/bin/psql -c "vacuum detail;" radius # # If more than one month determine Current and Previous, Month and Year. # # MINTIME=`/usr/local/pgsql/bin/psql -c "\ select date_trunc('month',min(\"Time-Stamp\")::abstime) from detail;\ " -A -t radius` MINMON=`echo $MINTIME | awk '{print $2}' -` MINYEAR=`echo $MINTIME | awk '{print $5}' -` MAXTIME=`/usr/local/pgsql/bin/psql -c "\ select date_trunc('month',max(\"Time-Stamp\")::abstime) from detail;\ " -A -t radius` MAXMON=`echo $MAXTIME | awk '{print $2}' -` MAXYEAR=`echo $MAXTIME | awk '{print $5}' -` [ "$MAXYEAR" = "" ] && (echo "Exiting: No Data in detail table." || exit 1) echo Moving $fname mv $fname.gz /mnt/sdb3/done/$MAXYEAR echo Start processing data from $fname # # Process records in detail file and create a unique record file called radius. # # echo Creating lookup table /usr/local/bin/psql -c "\ select min(oid) as recnum,max(\"Acct-Session-Time\"),\ \"Acct-Status-Type\",\"Acct-Session-Id\",\"NAS-IP-Address\",\"NAS-Port-Id\",\ \"User-Name\",\"Realm\",\"Framed-IP-Address\",\"Calling-Station-Id\" \ into radius \ from detail \ group by \"Acct-Session-Id\",\"NAS-IP-Address\",\"NAS-Port-Id\",\"User-Name\",\ \"Realm\",\"Framed-IP-Address\",\"Calling-Station-Id\",\"Acct-Status-Type\";\ " radius /usr/local/bin/psql -c "vacuum radius;" radius # # # Move stop records to stop table # echo Creating unique stop record table /usr/local/bin/psql -c "\ select recnum into radstop from radius where \"Acct-Status-Type\" = 'Stop';\ " radius echo Filtering stop records /usr/local/bin/psql -c "\ select a.* into stop from detail as a,radstop as b where a.oid = b.recnum;\ " radius /usr/local/bin/psql -c "select count(*) as \"Filtered\" from stop;" radius echo Cleaning stop records /usr/local/bin/psql -c "\ delete from detail where oid = radstop.recnum;\ " radius echo Cleaning temporary tables /usr/local/bin/psql -c "drop table radstop;" radius /usr/local/bin/psql -c "vacuum detail;" radius # # Move start and alive records to start table # echo Creating unique start record table /usr/local/bin/psql -c "\ select recnum into radstart from radius where \"Acct-Status-Type\" != 'Stop';\ " radius echo Filtering start records /usr/local/bin/psql -c "\ select a.* into start from detail as a,radstart as b where a.oid = b.recnum;\ " radius /usr/local/bin/psql -c "select count(*) as \"Filtered\" from start;" radius echo Cleaning start records /usr/local/bin/psql -c "\ delete from detail where oid = radstart.recnum;\ " radius echo Cleaning temporary tables /usr/local/bin/psql -c "drop table radstart;" radius /usr/local/bin/psql -c "drop table radius;" radius /usr/local/bin/psql -c "vacuum detail;" radius # # Move rest of records to dups table # echo Filtering duplicate records /usr/local/bin/psql -c "select * into dups from detail;" radius /usr/local/bin/psql -c "select count(*) as \"Filtered\" from dups;" radius echo Cleaning duplicate records /usr/local/bin/psql -c "delete from detail;" radius echo Cleaning temporary tables /usr/local/bin/psql -c "vacuum detail;" radius # # # Determine if more than one month. # [ "$MAXMON" != "$MINMON" ] && { echo Previous Month # # Starting Previous Month # # # # # Put Stop records in Previous stop # # echo Inserting stop records into ${MINMON}start /usr/local/bin/psql -c "\ insert into ${MINMON}stop \ select * \ from stop \ where \"Time-Stamp\" between \ date_trunc('month','$MINMON 1 $MINYEAR 00:00:00'::abstime) \ and date_trunc('month','$MINMON 1 $MINYEAR 00:00:00'::abstime \ + '32 days')::abstime - '1 second'; " radius # # Put Start and Alive records in Previous start # echo Inserting start records into ${MINMON}start /usr/local/bin/psql -c "\ insert into ${MINMON}start \ select * \ from start \ where \"Time-Stamp\" between \ date_trunc('month','$MINMON 1 $MINYEAR 00:00:00'::abstime) \ and date_trunc('month','$MINMON 1 $MINYEAR 00:00:00'::abstime \ + '32 days')::abstime - '1 second';\ " radius # # Put rest of data in Previous dups # echo Inserting duplicate records into ${MINMON}dups /usr/local/bin/psql -c "\ insert into ${MINMON}dups \ select * \ from dups \ where \"Time-Stamp\" between \ date_trunc('month','$MINMON 1 $MINYEAR 00:00:00'::abstime) \ and date_trunc('month','$MINMON 1 $MINYEAR 00:00:00'::abstime \ + '32 days')::abstime - '1 second';\ " radius # # # clean up after previous month. # echo Finished processing data from $fname for $MINMON $MINYEAR date echo Cleaning up after $fname /usr/local/bin/psql -c "vacuum ${MINMON}stop;" radius /usr/local/bin/psql -c "vacuum ${MINMON}start;" radius /usr/local/bin/psql -c "vacuum ${MINMON}dups;" radius date echo Making Daily and Monthly Tables for $MINMON /usr/local/pgsql/make-day-time $MINMON } echo Current Month # # Starting Current Month # # # # # Put Stop records in Current stop # echo Inserting stop records into ${MINMON}stop /usr/local/bin/psql -c "\ insert into ${MAXMON}stop \ select * \ from stop \ where \"Time-Stamp\" between \ date_trunc('month','$MAXMON 1 $MAXYEAR 00:00:00'::abstime) \ and date_trunc('month','$MAXMON 1 $MAXYEAR 00:00:00'::abstime \ + '32 days')::abstime - '1 second';\ " radius /usr/local/bin/psql -c "drop table stop;" radius # # Put Stop records in Current stop # echo Inserting start records into ${MINMON}start /usr/local/bin/psql -c "\ insert into ${MAXMON}start \ select * \ from start \ where \"Time-Stamp\" between \ date_trunc('month','$MAXMON 1 $MAXYEAR 00:00:00'::abstime) \ and date_trunc('month','$MAXMON 1 $MAXYEAR 00:00:00'::abstime \ + '32 days')::abstime - '1 second';\ " radius /usr/local/bin/psql -c "drop table start;" radius # # Put rest of data in Current dups # echo Inserting duplicate records into ${MINMON}dups /usr/local/bin/psql -c "\ insert into ${MAXMON}dups \ select * \ from dups \ where \"Time-Stamp\" between \ date_trunc('month','$MAXMON 1 $MAXYEAR 00:00:00'::abstime) \ and date_trunc('month','$MAXMON 1 $MAXYEAR 00:00:00'::abstime \ + '32 days')::abstime - '1 second';\ " radius /usr/local/bin/psql -c "drop table dups;" radius # # # # # # # clean up after current month. # echo Finished processing data from $fname for $MAXMON $MAXYEAR date echo Cleaning up after $fname /usr/local/bin/psql -c "vacuum ${MAXMON}stop;" radius /usr/local/bin/psql -c "vacuum ${MAXMON}start;" radius /usr/local/bin/psql -c "vacuum ${MAXMON}dups;" radius date echo Making Daily and Monthly Tables for $MAXMON /usr/local/pgsql/make-day-time $MAXMON } } date done echo "Finished, no more files to process ." cd $ORIG ======== end radimport ======== -- There is a fine line between genius and lunacy, fear not, walk the line with pride. Not all things will end up as you wanted, but you will certainly discover things the meek and timid will miss out on.
pgsql-general by date: