Code Question: How can I extract and save text using Perl?

Why is no extracted data written to data2.txt? What is wrong with the code?

MyFile.txt

ex1,fx2,xx1
mm1,nn2,gg3
EX1,hh2,ff7

This is my desired output in data2.txt:

ex1,fx2,xx1
EX1,hh2,ff7

```
#! /DATA/PLUG/pvelasco/Softwares/PERLINUX/bin/perl -w

my $infile  ='My1.txt';
my $outfile ='data2.txt';

open IN,  '<', $infile  or die "Cant open $infile:$!";
open OUT, '>', $outfile or die "Cant open $outfile:$!";

while (<IN>) {
  if (m/EX$HF|ex$HF/) {
    print OUT $_, "\n";
    print $_;
  }
}

close IN;
close OUT;
```

From stackoverflow Shiel

When I run your code with the input file named My1.txt instead of MyFile.txt, I get the desired output, except with an empty line after each match. You can get rid of those by removing the `, "\n"` from the print statement, because each line read from the file already ends with a newline.
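To illustrate where the empty lines come from, here is a minimal sketch. It reads the sample data from an in-memory string instead of a file, and the helper name `extract` is made up for this example. The pattern `/EX|ex/` stands in for what the original regex effectively matches when `$HF` is empty.

```perl
use strict;
use warnings;

# Sample data from the question, held in memory instead of MyFile.txt
my $data = "ex1,fx2,xx1\nmm1,nn2,gg3\nEX1,hh2,ff7\n";

# Hypothetical helper: collect matching lines, each followed by $suffix
sub extract {
    my ($input, $suffix) = @_;
    open(my $in, '<', \$input) or die "Can't open in-memory file: $!";
    my $out = '';
    while (my $line = <$in>) {
        # $line still carries its own trailing "\n" at this point
        $out .= $line . $suffix if $line =~ /EX|ex/;
    }
    close $in;
    return $out;
}

my $doubled = extract($data, "\n");  # like: print OUT $_, "\n";
my $clean   = extract($data, '');    # like: print OUT $_;

print $doubled;
print "---\n";
print $clean;
```

The first call leaves a blank line after every match, while the second reproduces the desired data2.txt contents.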

Shiel : Oh sorry I forgot to edit My1.txt. It should be MyFile.txt.

From moritz
This regex makes no sense:
```
m/EX$HF|ex$HF/
```
Is $HF supposed to be a variable? What are you trying to match?

Also, the second line in every Perl script you write should be:
```
use strict;
```
It will make Perl catch such mistakes and tell you about them, rather than silently ignoring them.
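As a rough illustration of the difference, the sketch below compiles the question's match twice through string `eval`: once with strict turned off and once with it on. The variable names `$snippet`, `$loose` and `$strict_error` are made up for this example.

```perl
use warnings;

# The match from the question, kept as source text so it can be
# compiled under different settings with string eval.
my $snippet = q{ ('EX1,hh2,ff7' =~ m/EX$HF|ex$HF/) ? 'matched' : 'no match' };

# Without strict, the undeclared $HF is an empty package variable,
# so the pattern quietly turns into /EX|ex/.
my $loose = eval "no strict; no warnings; $snippet";
print "without strict: $loose\n";

# With strict, the undeclared variable is a compile-time error,
# which eval reports in $@.
my $strict_result = eval "use strict; $snippet";
my $strict_error  = $@;
print "with strict: $strict_error";
```

Under strict the code never runs; Perl stops at compile time and names `$HF` in the error message.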

Brad Gilbert : ... and the third should be `use warnings`.

raldi : He already has -w on the first line.

Brad Gilbert : Well why doesn't he just add -Mstrict to the first line?

From raldi
```
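while (<IN>) {
  # Keep lines that start with "EX" or "ex" followed by a digit
  if (m/^(EX|ex)\d/) {
    print OUT $_;   # the line already ends in "\n", so no extra newline
    print $_;       # echo the same line to the screen
  }
}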
```
Jouni K. Seppänen : Also, if you don't need the (debug?) output of all lines in the input file, you can reduce this to the one-liner `perl -ne 'print if /^(EX|ex)\d/'`

John Ferguson : perl golf has its place, but I'd rather people put readable code into production.

Brad Gilbert : This *is* simple enough to use a one-liner.
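For readers unfamiliar with the `-n` switch used in the one-liner: it wraps the given code in a `while (<>) { ... }` loop over the input. A minimal sketch of the same logic as a script follows. It reads from an in-memory string rather than standard input, and the helper name `lines_starting_with_ex` is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines the one-liner would print.
sub lines_starting_with_ex {
    my ($fh) = @_;
    my @kept;
    # perl -n supplies this loop around the one-liner's code,
    # reading from <> instead of an explicit handle
    while (<$fh>) {
        push @kept, $_ if /^(EX|ex)\d/;
    }
    return @kept;
}

# Sample data from the question, held in memory for the demonstration
my $sample = "ex1,fx2,xx1\nmm1,nn2,gg3\nEX1,hh2,ff7\n";
open(my $in, '<', \$sample) or die "Can't open in-memory file: $!";
my @kept = lines_starting_with_ex($in);
close $in;

print @kept;
```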

From benPearce
Bleh! "use strict;" "use warnings;". Lexical-filehandles. Three-args-open.

From Shlomi Fish

What are you trying to do? Keep all the lines that start with EX? Using a regexp is overkill; you're much better off just checking the first two letters. In Python:



```
from __future__ import with_statement  # only needed on Python 2.5


class converter(object):
    def __init__(self, inFile, outFile):
        self.inFile, self.outFile = inFile, outFile

    def main(self):
        with open(self.inFile, 'r') as infsock:
            with open(self.outFile, 'w') as outfsock:
                for line in infsock:
                    self.doReplace(line, outfsock)

    def doReplace(self, line, outsock):
        # Copy the line only if its first two characters are "ex" in any case
        if line[:2].upper() == "EX":
            outsock.write(line)


if __name__ == '__main__':
    import sys
    ZeConverter = converter(sys.argv[1], sys.argv[2])
    ZeConverter.main()
```

Swap out `doReplace` if you need a different replacement method.

From kanja

The filenames don't match.
```
use strict;
use warnings;

my $infile  = 'MyFile.txt';   # must match the name of the real input file
my $outfile = 'data2.txt';

open(my $inhandle, '<', $infile)   or die "Can't open $infile: $!";
open(my $outhandle, '>', $outfile) or die "Can't open $outfile: $!";

while (my $line = <$inhandle>) {

    # Assumes that ex, Ex, eX, EX are all valid first characters
    if ($line =~ m{^ex}i) {         # or   if (lc(substr $line, 0 => 2) eq 'ex') {
        print { $outhandle } $line;
        print $line;
    }
}

close $inhandle;
close $outhandle;
```
And yes, always always use strict;

You could also `chomp $line` and (if using Perl 5.10 or later, with `use feature 'say'` enabled) `say $line` instead of `print "$line\n"`.

raldi : What are the braces for in this line? `print { $outhandle } $line;`

draegtun : It helps avoid mistakes like `print $outhandle, $line;` (the comma means print won't recognise $outhandle as a filehandle). It's a recommendation from "Perl Best Practices" by Damian Conway.
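A small sketch of the mistake described in the comment above, using in-memory filehandles so nothing touches the disk. The buffer names `$file_buf` and `$screen_buf` and the handle `$screen` are made up for this example; `$screen` stands in for STDOUT.

```perl
use strict;
use warnings;

my $line = "EX1,hh2,ff7\n";

# Two in-memory handles: one plays the output file, one plays the screen
open(my $outhandle, '>', \my $file_buf)   or die "Can't open buffer: $!";
open(my $screen,    '>', \my $screen_buf) or die "Can't open buffer: $!";

# Make $screen the default handle for print, as STDOUT normally is
my $previous = select($screen);

print $outhandle, $line;      # comma: $outhandle is printed as data
print { $outhandle } $line;   # block: $line goes to the filehandle

select($previous);
close $outhandle;
close $screen;

print "file buffer:   $file_buf";
print "screen buffer: $screen_buf";
```

With the comma, the line never reaches the file; the default handle receives the stringified handle (something like `GLOB(0x...)`) followed by the line.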

Brad Gilbert : I didn't realize that would work.

From Berserk
Sorry if this seems like stating the bleeding obvious, but what's wrong with
```
grep -i ^ex < My1.txt > data2.txt
```
... or if you really want to do it in perl (and there's nothing wrong with that):
```
perl -ne '/^ex/i && print' < My1.txt > data2.txt
```
This assumes the purpose of the request is to find lines that start with EX, with case-insensitivity.

From RET

Saturday, February 12, 2011
