Perl, Pattern Matching each element ($line) in an array - arrays

I think I have a simple enough problem. I recently ran a script that extracted specific information from the string in each element of an array. I have written this before and it worked well, but the very simple version I am trying now will not print any data, only the same warning about an uninitialized value. I am getting really frustrated because my previous code works. I am clearly doing something stupid and would love some help!
#!/usr/bin/env perl
use strict;
use warnings;
my @histone;
my $line;
my $idea;
my $file = "demo_site.txt";
open(IN, "<$file") || die "\ncannot be opened\n";
@histone = <IN>;
print @histone;
foreach $line (@histone)
{
    $line =~ m/([a-zA-Z0-9]+)\t[0-9]+\t[0-9]+\t/;
    print $1."\n";
    print $2."\n";
    print $3."\n";
}
The infile "demo_site.txt" takes the format of a tab delimited .txt file:
chr9 1234 5678 . 200 . 14.0 -1
This file has multiple lines as above and I wish to extract the first three items of data so the output looks as follows.
chr9
1234
5678
Cheers!

You don't really need a regular expression since it's tab delimited.
foreach $line (@histone)
{
    my @line_data = split(/\t/, $line);
    print $line_data[0]."\n";
    print $line_data[1]."\n";
    print $line_data[2]."\n";
}
Edit:
If you want to assign the values to specific named variables, assign the list returned by split directly to them.
($varA, $varB, $varC .... ) = split(/\t/, $line);

The actual problem here is that you're trying to print the values of $1, $2 and $3, but you only have one set of capturing parentheses in your regex, so only $1 gets a value. $2 and $3 remain undefined, which is why you get the uninitialized value warning when you try to print them.
The solution is to add two more sets of capturing parentheses. I expect you want something like this:
$line=~ m/([a-zA-Z0-9]+)\t([0-9]+)\t([0-9]+)\t/;

Let's assume that file.txt contains your data (file.txt is the same as demo_site.txt):
chr9 1234 5678 . 200 . 14.0 -1
You can use a simple one-liner:
perl -ane '$" = "\n"; print "@F[0..2]\n"' file.txt 1>output.txt
One-liners in Perl are powerful, and you don't need to write a script for simple tasks ;)
Just open a terminal sometimes ;)
P.S.:
This is not a very good one-liner, I know, but it does what it must.

$line=~ m/([a-zA-Z0-9]+)\t[0-9]+\t[0-9]+\t/)
First of all, the parens are not balanced.
Second, I haven't checked this, but don't you need a set of parens for each capture?
Third, as misplacedme said, split() is definitely the way to go. ;)

If I may self-promote, you can use Tie::Array::CSV to give direct read-write access to the file as a Perl array of arrayrefs.
use strict;
use warnings;
use Tie::Array::CSV;
tie my @file, 'Tie::Array::CSV', 'demo_site.txt', sep_char => "\t";
print $file[0][0]; # first line before first tab
$file[2][1] = 10; # set the third line between the first and second tabs

Related

Why is Perl omitting the @ strings

#!/usr/bin/perl
$test = "@test=hello"; # (Actually I am reading this text from a file and assigning it to a variable)
print "$test";
After executing the above code, Perl prints the following.
Output:
=hello instead of @test=hello.
Can anyone please explain this? I believe Perl is treating @test as an empty array, but how can I avoid this when reading a file? Please clear up my misconception.
Thanks,
Perl interpolates variables in strings delimited by double quotes.
@test is treated as an array.
Unless you have created this explicitly, you should get an error when you try to do that. If you don't then you must have forgotten to use strict; and use warnings;!
Use single quoted strings if you don't want to interpolate variables.
#!/usr/bin/env perl
use strict;
use warnings;
use v5.10;
my $test = '@test=hello';
say $test;

Bash, variables and arrays

I crawled through lots of boards but didn't find the final solution for my problem.
I have got an array, named "array0", the name is stored to a variable called arrayname. Also I've got a logged IP address, let's say 127.0.0.1, also stored in a variable, called ip.
Now I want to assign the IP to index 3 in the array like that:
"$arrayname"[3]="$ip"
So, this didn't work. I tried lots of ways how to solve that but none worked.
Is anyone out there who can tell me the final solution for this case?
Update: The suggested ways of handling the problem are great! But I forgot to mention that the array I'm working with is sourced from another file (also written in bash). My goal now is to edit the array in the sourced file itself. Any more ideas for that?
Try
read -r "${arrayname}[3]" <<<"$ip"
You'll need to use the declare command and indirect parameter expansion, but it's a little tricky to use with array names. It helps if you think of the index as part of the variable name, instead of an operator applied to the array name, like in most languages.
array0=(1 2 3 4 5)
arrayname=array0
name=$arrayname[3]
declare "$name=$ip"
echo "${!name}"
And yet another way to do it, this time using the versatile printf.
printf -v "$arrayname[3]" %s "$ip"
demo
#!/bin/bash
array0=(a b c d e)
echo "${array0[@]}"
arrayname='array0'
ip='127.0.0.1'
printf -v "$arrayname[3]" %s "$ip"
echo "${array0[@]}"
output
a b c d e
a b c 127.0.0.1 e
See this:
# declare -a arrayname=(element1 element2 element3)
# echo ${arrayname[0]}
element1
# arrayname[4]="Yellow"
# echo ${arrayname[4]}
Yellow
# export ip="192.168.190.23"
# arrayname[5]=$ip
# echo ${arrayname[5]}
192.168.190.23
You don't have to use quotes.
After initializing the arrays, you can access the array elements using their indices as follows.
Access as:
${arrayname[3]}
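Another option, not mentioned in the answers above, is a nameref, sketched here under the assumption that bash 4.3 or later is available. The sketch also touches on the update about the sourced file: declare -p prints a definition that can be sourced again. The temporary file is a hypothetical stand-in for the sourced file, and writing it this way assumes that file holds nothing but the array.

```bash
# Requires bash 4.3+ for `declare -n` (namerefs).
array0=(a b c d e)
arrayname=array0
ip=127.0.0.1

# ref becomes an alias for whichever array $arrayname names.
declare -n ref="$arrayname"
ref[3]=$ip
echo "${array0[@]}"

# Persist the change: declare -p emits a re-sourceable definition.
# tmpfile is a hypothetical stand-in for the sourced file.
tmpfile=$(mktemp)
declare -p "$arrayname" > "$tmpfile"

# Drop the alias and the array, then reload the array from the file.
unset -n ref
unset array0
source "$tmpfile"
echo "${array0[3]}"
rm -f "$tmpfile"
```

If the sourced file contains other definitions as well, overwriting it like this would lose them, so a real script would need to rewrite only the array's line.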

Perl: Indexing function returning array syntax

I have a question about Perl, more out of curiosity than necessity. I have seen that there are many ways to do a lot of things in Perl, and a lot of the time the syntax seems unintuitive to me (I've seen a few one-liners doing some impressive stuff).
So.. I know the function split returns an array. My question is, how do I go about printing the first element of this array without saving it into a special variable? Something like $(split(" ",$_))[0] ... but one that works.
You're 99% there
$ perl -de0
Loading DB routines from perl5db.pl version 1.33
Editor support available.
Enter h or `h h' for help, or `man perldebug' for more help.
main::(-e:1): 0
DB<1> $a = "This is a test"
DB<2> $b = (split(" ",$a))[0]
DB<3> p $b
This
DB<4> p "'$b'"
'This'
This should do it:
print((split(" ", $_))[0]);
You need one set of parentheses to allow you to apply array indexing to the result of a function. The outer parentheses are needed to get around special parsing of print arguments.
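Try this out to print the first element of a whitespace-separated list. The \s+ regex matches one or more whitespace characters to split on. The -n switch loops over the input without also printing each whole line (as -p would), and -l adds the trailing newline.
echo "1 2 3 4" | perl -lne 'print +(split(/\s+/, $_))[0]'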
Also, see this related post.

Making a perl array unique

I am currently having a very simple problem with capturing the output from a backticked shell command. I apologize that the problem is rather simple.
I have some sorted array (@valid_runs) which I know contains consecutive, duplicate elements. I want to use backticks to echo this array to uniq. I want to capture the STDOUT in an array. I attempt to do so like this.
@unique_valids = `echo '@valid_runs' | uniq`;
print @unique_valids;
This print statement yields nothing. For that matter neither does this.
@unique_valids = `echo '@valid_runs'`;
print @unique_valids;
I know how to use uniq and echo. This seems rather odd to me. I would think this has more to do with perl arrays than proper use of those commands. I have searched around a bit elsewhere, so please don't pelt me with downvotes just because the solution might seem trivial. Thanks again for your time.
NOTE on solutions:
TLP's solution is the most straightforward as far as handling the uniq problem goes. I am being rather flexible, since all responses suggested not making a system call for this problem. If Perl's uniq function behaves the same as Unix's uniq, then the array ought to remain sorted.
John Corbett's solution works well if you don't care about a sorted result.
Using system calls for something that can easily be accomplished with perl code is not a good idea. The module List::MoreUtils has a uniq function that does what you require:
use List::MoreUtils qw(uniq);
my @unique = uniq @runs;
The subroutine inside the module itself is very simple, though, exactly like theglauber's answer:
sub uniq (@) {
    my %seen = ();
    grep { not $seen{$_}++ } @_;
}
You should just store the array into a hash, because hash keys are always unique. You can do that like this:
my %temp_hash = map { $_ => 1 } @valid_runs;
my @unique_valids = keys %temp_hash;
That's the Perl way of doing it anyway. There's no need to use backticks here (I try to avoid those as much as I can).
It's easy to do this in Perl. Here's a rather obscure but fun way to dedup an array:
@dedup = grep !$seen{$_}++, @orig_array;
Figure out what this is doing by checking the documentation for the perl function grep.
If you have to use uniq, you probably need to put each array element on a separate line.
join("\n", @your_array)
should achieve that.
#!/usr/bin/perl
use warnings;
@a = (1, 2, 3, 3, 4, 4, 5);
$cmd = "/usr/bin/uniq <<EOF\n";
$cmd .= $_."\n" foreach (@a);
$cmd .= "EOF\n";
$result = `$cmd`;
print "Cmd: $cmd\n";
print "Result is $result";
@u = split /\n/,$result;
print "After ", join(" ", @u), "\n";
This does what you ask, but theglauber's answer is still better Perl.

Using a variable name to create an array bash, unix

First I should perhaps explain what I want to do...
I have 'n' amounts of files with 'n' amount of lines. All I know is
that the line count will be even.
The user selects the files that they want. This is saved into an
array called ${selected_sets[@]}.
The program will print to screen a randomly selected 'odd numbered'
line from a randomly selected file.
Once the line has been printed, I don't want it printed again...
Most of it is fine, but I am having trouble creating arrays based on the contents of ${selected_sets[@]}... I think I have got my syntax all wrong :)
for i in ${selected_sets[@]}
do
    x=1
    linecount=$(cat $desired_path/$i | wc -l) #get line count of every set
    while [ $x -le $linecount ]
    do ${i}[${#${i}[@]}]=$x
        x=$(($x+2)) # only insert odd numbers up to max limit of linecount
    done
done
The problem is ${i}[${#${i}[@]}]=$x
I know that I can use array[${#array[@]}]=$x but I don't know how to use a variable name.
Any ideas would be most welcome (I am really stumped)!!!
In general, this type of question is solved with eval. If you want a variable named "foo" and have a variable bar="foo", you simply do:
eval $bar=5
Bash (or any sh) treats that as if you had typed
foo=5
So you may just need to write:
eval ${i}[\${#${i}[@]}]=$x
with suitable escapes. (A useful technique is to replace 'eval' with 'echo', run the script and examine the output and make sure it looks like what you want to be evaluated.)
You can create named variables using the declare command
declare -a name=${#${i}[@]}
I'm just not sure how you would then reference those variables, I don't have time to investigate that now.
Using an array:
declare -a myArray
for i in "${selected_sets[@]}"
do
    x=1
    linecount=$(cat "$desired_path/$i" | wc -l) #get line count of every set
    while [ $x -le $linecount ]
    do
        myArray[${#myArray[@]}]=$x # append $x to the end of myArray
        let x=x+2 #This is a bit simpler! (a step of 2 keeps only the odd numbers)
    done
done
Beware! I didn't test any of the above. HTH
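A different technique from the eval and declare suggestions above is to avoid dynamically named arrays altogether and use one associative array keyed by set name. This is only a sketch, assuming bash 4 or later. The set names setA and setB and their line counts are made up; in the real script the counts would come from wc -l on each selected file.

```bash
# Requires bash 4+ for associative arrays.
# Made-up set names and line counts standing in for the selected files.
declare -A linecounts=([setA]=6 [setB]=4)

# One entry per set, holding that set's odd line numbers as a
# space-separated string.
declare -A odd_lines
for set_name in "${!linecounts[@]}"; do
    max=${linecounts[$set_name]}
    for ((x = 1; x <= max; x += 2)); do
        odd_lines[$set_name]+="$x "
    done
done

# Unpack one set's numbers into a regular array when needed.
read -r -a setA_lines <<<"${odd_lines[setA]}"
echo "setA has ${#setA_lines[@]} odd lines: ${setA_lines[*]}"
```

Keeping everything in one associative array sidesteps the question of how to build a variable name at run time, at the cost of unpacking the string whenever individual numbers are needed.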
