Thread: Need help with search-and-replace

Need help with search-and-replace

From
"Josh Berkus"
Date:
Folks,
I need to strip certain columns out of my pgdump file.  However, I
can't figure out how to use any Unix-based tool to search-and-replace a
specific value which includes a tab character (e.g. replace "{TAB}7
00:00:00" with "" to eliminate the column).
Right now, I'm copying the file to a Win32 machine and using MS Word
for the search-and-replace, but I'm sure there's got to be a better way
... *without* learning VI or Emacs.  Help?
                -Josh


______AGLIO DATABASE SOLUTIONS___________________________
                                       Josh Berkus
  Complete information technology      josh@agliodbs.com
   and data management solutions       (415) 565-7293
  for law firms, small businesses        fax 621-2533
    and non-profit organizations.      San Francisco
 


Re: Need help with search-and-replace

From
Christopher Sawtell
Date:
On Sunday 06 May 2001 10:27, Josh Berkus wrote:
> Folks,
>
>     I need to strip certain columns out of my pgdump file.  However, I
> can't figure out how to use any Unix-based tool to search-and-replace a
> specific value which includes a tab character (e.g. replace "{TAB}7
> 00:00:00" with "" to eliminate the column).

In other words you wish to remove one field from a tab delimited file?

>     Right now, I'm copying the file to a Win32 machine and using MS Word
> for the search-and-replace,

Oh no!  MS is bound to screw it up somehow.

> but I'm sure there's got to be a better way
> ... *without* learning VI or Emacs.  Help

man cut  is your friend.
info cut  is another friend if you are on a GNU system.

cut -f1,3- the.file   # assuming the field you wish to remove is the second one

cut uses the tab character as the delimiter by default.
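
To make the cut suggestion concrete, here is a minimal sketch that builds a small tab-separated sample and drops its second column. The sample rows and the helper name drop_second_field are made up for illustration; only cut -f and its default tab delimiter come from the message above.

```shell
#!/bin/sh
# Hypothetical helper: drop field 2 from tab-delimited input on stdin.
drop_second_field() {
    # cut splits on tabs by default; -f1,3- keeps field 1 and fields 3 onward.
    cut -f1,3-
}

# Made-up sample rows: id, interval, name.
printf '1\t7 00:00:00\talpha\n2\t7 00:00:00\tbeta\n' | drop_second_field
```

Note that this removes a column by position only; it never looks at the value stored in it.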

single line file attached

-- 
Sincerely etc.,
NAME       Christopher Sawtell
CELL PHONE 021 257 4451
ICQ UIN    45863470
EMAIL      csawtell @ xtra . co . nz
CNOTES     ftp://ftp.funet.fi/pub/languages/C/tutorials/sawtell_C.tar.gz
-->> Please refrain from using HTML or WORD attachments in e-mails to me <<--


Re: Need help with search-and-replace

From
Ian Harding
Date:
There are oh-so-many ways, as I am sure people will tell you.  Regular
expressions are the most wonderful things for such a task.  I am comfortable
with tcl, so I would read the file into a tcl variable and use 'regsub -all
{\t700:00:00} $instring {} outstring'.

There are unbelievably simple, unbelievably fast ways to do this in one line
from the shell using sed, but I don't speak sed.  I suspect someone will hook
you up with some basic sed.
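
One possible shape for such a sed one-liner is sketched below. It assumes the value to remove is a tab followed by "7 00:00:00", as in the original question. The helper name strip_interval and the sample row are invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: delete every "<TAB>7 00:00:00" from stdin.
strip_interval() {
    # printf '\t' yields a literal tab, so the expression does not
    # depend on sed itself understanding the \t escape.
    tab=$(printf '\t')
    sed "s/${tab}7 00:00:00//g"
}

# Made-up sample row: id, interval, name.
printf '1\t7 00:00:00\talpha\n' | strip_interval
```

Building the tab with printf keeps the command portable across sed implementations that differ in which backslash escapes they accept.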

Try this in Windows.  Visual Basic can use regular expressions, but you have to
instantiate a regular expression object, then execute one of its methods to do
anything.  Ugh.

Ian




OFFTOPIC: search and replace with unix tools

From
Oliver Seidel
Date:
>>>>> "Josh" == Josh Berkus <josh@agliodbs.com> writes:

    Josh> Folks, I need to strip certain columns out of my pgdump
    Josh> file.  However, I can't figure out how to use any Unix-based
    Josh> tool to search-and-replace a specific value which includes a
    Josh> tab character (e.g. replace "{TAB}7 00:00:00" with "" to
    Josh> eliminate the column).
 

Unix lives by the shell pipe.  Set "exit on error" to avoid data loss
in case of "filesystem full", then use "tr" to translate the tab
into a character that is easier to match, do the replace with "sed",
translate back using "tr", and move the result over the old file:

---------------------------------------------------------------------------
#!/bin/bash

set -e -x

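# '~' stands in for the placeholder character, which did not survive
# in this copy of the message; use any character that never occurs in
# the data.  The file to clean up is passed as the first argument.
cat "$1" | \
tr '\t' '~' | \
sed -e 's/~7 00:00:00//g' | \
tr '~' '\t' | \
cat > x.$$

mv x.$$ "$1"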
---------------------------------------------------------------------------

(please don't kill me for the two "cat" operators, they serve no
purpose besides legibility).
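
The same "write to a scratch file, then move it over the original" idea can be sketched without the tr round trip. This is only an illustration under some assumptions: mktemp is available (it is common but not universal), the helper name strip_in_place is invented, and the value to remove is the one from the original question.

```shell
#!/bin/sh
# Hypothetical helper: remove every "<TAB>7 00:00:00" from the named file,
# replacing the file only if the filter succeeded.
strip_in_place() {
    file=$1
    tab=$(printf '\t')
    # mktemp creates a uniquely named scratch file; keeping it next to the
    # target means the final mv stays on the same filesystem.
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if sed "s/${tab}7 00:00:00//g" "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        # Something went wrong (for example, a full disk): leave the
        # original untouched and discard the partial output.
        rm -f "$tmp"
        return 1
    fi
}

# Usage (dump.sql is a made-up file name):
#   strip_in_place dump.sql
```

The replaced file takes on the permissions of the scratch file, so they may need to be restored afterwards.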

so long,

Oliver


Re: Re: Need help with search-and-replace

From
"tjk@tksoft.com"
Date:
I am sure someone already sent this reply and I missed it.

Anyway, if I understand the original problem correctly, you want to
find instances of "\t\t00:00:00" and "\t\t\t\t\t\t\t00:00:00", etc. and 
remove them. 

I hope this is generic enough so you can change it to fit your needs:

printf 'Start c\t\t00:00:00crap here.\n' | sed "s/\([^\t]*\)[\t]\+[0-9][0-9]:[0-9][0-9]:[0-9][0-9]\(.*\)/\1\2/g"

(\t and \+ as written here need GNU sed.)


This will find an instance of "nn:nn:nn" only when preceded by one or more
tabs.

Perl is easier to read, so here is a perlish version:
printf 'Start c\t\t00:00:00crap here.\n' | perl -pe 's/([^\t]*)[\t]+\d\d:\d\d:\d\d(.*)/$1$2/g'
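
For sed implementations that do not accept \t or \+, the same pattern can be sketched with a literal tab and the "tab followed by tab*" spelling of "one or more tabs". The helper name strip_tabbed_times is invented for the example, and unlike the commands above this version removes every match on a line rather than only the first.

```shell
#!/bin/sh
# Hypothetical helper: delete any run of tabs followed by nn:nn:nn.
strip_tabbed_times() {
    tab=$(printf '\t')
    # "${tab}${tab}*" means "one tab, then zero or more tabs", which
    # avoids \+ since some sed implementations do not accept it.
    sed "s/${tab}${tab}*[0-9][0-9]:[0-9][0-9]:[0-9][0-9]//g"
}

printf 'Start c\t\t00:00:00crap here.\n' | strip_tabbed_times
```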


Troy

