Thread: Need help with search-and-replace
Folks,

I need to strip certain columns out of my pgdump file. However, I can't figure out how to use any Unix-based tool to search-and-replace a specific value which includes a tab character (e.g. replace "{TAB}7 00:00:00" with "" to eliminate the column).

Right now, I'm copying the file to a Win32 machine and using MS Word for the search-and-replace, but I'm sure there's got to be a better way ... *without* learning VI or Emacs. Help?

-Josh

______AGLIO DATABASE SOLUTIONS___________________________
                                       Josh Berkus
  Complete information technology      josh@agliodbs.com
   and data management solutions       (415) 565-7293
  for law firms, small businesses        fax 621-2533
    and non-profit organizations.      San Francisco
On Sunday 06 May 2001 10:27, Josh Berkus wrote:
> Folks,
>
> I need to strip certain columns out of my pgdump file. However, I
> can't figure out how to use any Unix-based tool to search-and-replace a
> specific value which includes a tab character (e.g. replace "{TAB}7
> 00:00:00" with "" to eliminate the column).

In other words, you wish to remove one field from a tab-delimited file?

> Right now, I'm copying the file to a Win32 machine and using MS Word
> for the search-and-replace,

Oh no! MS is bound to screw it up somehow.

> but I'm sure there's got to be a better way
> ... *without* learning VI or Emacs. Help

man cut is your friend. info cut is another friend if you are on a GNU system.

cat the.file | cut -f1,3-   # assuming the field you wish to remove is the second one

cut uses the tab character as the delimiter by default.

single line file attached

--
Sincerely etc.,

NAME        Christopher Sawtell
CELL PHONE  021 257 4451
ICQ UIN     45863470
EMAIL       csawtell @ xtra . co . nz
CNOTES      ftp://ftp.funet.fi/pub/languages/C/tutorials/sawtell_C.tar.gz

-->> Please refrain from using HTML or WORD attachments in e-mails to me <<--
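A minimal sketch of the cut approach, assuming a three-column tab-delimited sample. The function name drop_second_field and the sample rows are made up for illustration. Note that cut drops the column by position on every line, whatever value it holds.

```shell
#!/bin/sh
# Hypothetical helper: drop the second tab-separated field from stdin.
# cut splits on TAB by default; -f1,3- keeps field 1 and fields 3 onward.
drop_second_field() {
    cut -f1,3-
}

# Made-up sample rows standing in for data lines from a dump file.
printf '1\t7 00:00:00\talpha\n2\t7 00:00:00\tbeta\n' | drop_second_field
```

Reading the file directly (cut -f1,3- the.file) works just as well and saves the extra cat.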
There are oh-so-many ways, as I am sure people will tell you. Regular expressions are the most wonderful things for such a task. I am comfortable with Tcl, so I would read the file into a Tcl variable and use 'regsub -all {\t7 00:00:00} $instring {} outstring'.

There are unbelievably simple, unbelievably fast ways to do this in one line from the shell using sed, but I don't speak sed. I suspect someone will hook you up with some basic sed.

Try this in Windows. Visual Basic can use regular expressions, but you have to instantiate a regular expression object, then execute one of its methods to do anything. Ugh.

Ian
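The kind of sed one-liner Ian alludes to might look like the sketch below. The only awkward part is getting a literal tab into the pattern; here it is produced with printf, so the command does not rely on a sed that understands "\t". The function name strip_value and the sample rows are illustrative assumptions.

```shell
#!/bin/sh
# Build a literal TAB portably; not every sed accepts "\t" in a pattern.
TAB=$(printf '\t')

# Hypothetical helper: delete every occurrence of <TAB>7 00:00:00 from stdin.
strip_value() {
    sed "s/${TAB}7 00:00:00//g"
}

# Only the first made-up row contains the value being removed.
printf '1\t7 00:00:00\talpha\n2\t8 00:00:00\tbeta\n' | strip_value
```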
>>>>> "Josh" == Josh Berkus <josh@agliodbs.com> writes:

    Josh> Folks, I need to strip certain columns out of my pgdump
    Josh> file. However, I can't figure out how to use any Unix-based
    Josh> tool to search-and-replace a specific value which includes a
    Josh> tab character (e.g. replace "{TAB}7 00:00:00" with "" to
    Josh> eliminate the column).

Unix lives by the shell pipe. Set "exit on error" to avoid data loss in case of "filesystem full", then use "tr" to translate the tab into a single character that is easier to type in a pattern (it must not otherwise occur in the file; "|" is used below), do the replace with "sed", translate back using "tr", move the result over the old file, done:

---------------------------------------------------------------------------
#!/bin/bash
# Usage: pass the dump file as the only argument.
set -e -x
cat "$1" | \
  tr '\t' '|' | \
  sed -e 's/|7 00:00:00//g' | \
  tr '|' '\t' | \
  cat > x.$$
# Replace the original with the filtered copy.
mv x.$$ "$1"
---------------------------------------------------------------------------

(please don't kill me for the two "cat" operators, they serve no purpose besides legibility).

so long,
Oliver
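One caveat with "exit on error": in a pipeline the shell only checks the exit status of the last command, so a failing tr or sed in the middle would go unnoticed. A sketch of a variant without the pipeline, assuming mktemp is available; the function name strip_in_place and the file name in the usage line are hypothetical, and the tab is handed to sed directly rather than via the tr round trip.

```shell
#!/bin/sh
# Hypothetical helper: remove <TAB>7 00:00:00 from the file named in $1.
# The original is replaced only if sed exits successfully.
strip_in_place() {
    tab=$(printf '\t')                      # literal TAB for the pattern
    tmp=$(mktemp "$1.XXXXXX") || return 1   # temp file beside the original
    if sed -e "s/${tab}7 00:00:00//g" "$1" > "$tmp"; then
        mv "$tmp" "$1"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage (file name is an assumption): strip_in_place dump.sql
```

Keeping the temporary file in the same directory as the original means the final mv is a rename on the same filesystem rather than a copy.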
I am sure someone already sent this reply and I missed it. Anyway, if I understand the original problem correctly, you want to find instances of "\t\t00:00:00" and "\t\t\t\t\t\t\t00:00:00", etc. and remove them. I hope this is generic enough so you can change it to fit your needs:

printf 'Start\tc\t00:00:00crap here.\n' | sed 's/\([^\t]*\)[\t]\+[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\(.*\)/\1\2/g'

This will find an instance of "nn:nn:nn" only when it is preceded by one or more tabs. The "\t" and "\+" escapes assume GNU sed. Perl is easier to read, so here is a perlish version:

printf 'Start\tc\t00:00:00crap here.\n' | perl -pe 's/([^\t]*)[\t]+\d\d:\d\d:\d\d(.*)/$1$2/g'

Troy
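For a sed without the GNU extensions, the same idea (one or more tabs followed by nn:nn:nn) can be spelled with a literal tab and "TAB TAB*" in place of "\+". This is only a sketch; the function name strip_times and the sample line are assumptions.

```shell
#!/bin/sh
# Literal TAB, built with printf so the pattern works in any POSIX sed.
TAB=$(printf '\t')

# Hypothetical helper: delete each run of one or more TABs that is
# immediately followed by a time of the form nn:nn:nn, time included.
strip_times() {
    sed "s/${TAB}${TAB}*[0-9][0-9]:[0-9][0-9]:[0-9][0-9]//g"
}

# Made-up sample line with two tabs in front of the time.
printf 'Start\t\t00:00:00crap here.\n' | strip_times
```

A time preceded by a space rather than a tab is left alone, since the pattern requires at least one tab.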