When I want to assign input file to array, I am getting this error.
while (<>) {
my @tmp = split;
push my @arr,[@tmp];
print "@arr\n";
}
output: ARRAY(0x7f0b00)
ARRAY(0x7fb2f0)
If I change [ to ( then I am getting the required output.
while (<>) {
my @tmp = split;
push my @arr,(@tmp);
print "@arr\n";
}
output: hello, testing the perl
check the arrays.
What is the difference between (@tmp) and [@tmp]?
Normal parentheses () have no special function besides changing precedence. They are commonly used to confine a list, e.g. my @arr = (1,2,3). Square brackets return an array reference. In your case, you would be constructing a two-dimensional array (or you would, if your code were not broken).
Your code should perhaps be written like this. Note that you need to declare the array outside the loop block, otherwise it will not keep values from previous iterations. Note also that you do not need a @tmp array; you can simply put the split inside the push.
my @arr; # declare @arr outside the loop block
while (<>) {
push @arr, [ split ]; # stores array reference in @arr
}
for my $aref (@arr) {
print "@$aref"; # print your values
}
This array would have the structure:
$arr[0] = [ "hello,", "testing", "the", "perl" ];
$arr[1] = [ "check", "the", "arrays." ];
This is a good idea if, for example, you want to keep lines of input from being mixed up. Otherwise, all values end up at the same level of the array.
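To make the difference concrete, here is a minimal sketch that pushes the same words both ways. The two input strings are hypothetical stand-ins for lines read from <>, and the names @nested and @flat are only for illustration.

```perl
use strict;
use warnings;

# Two hypothetical input lines, standing in for what <> would read.
my @lines = ("hello, testing the perl", "check the arrays.");

my (@nested, @flat);
for my $line (@lines) {
    my @words = split ' ', $line;
    push @nested, [@words];   # brackets: one array reference per line
    push @flat,   (@words);   # parentheses only group: words are appended one by one
}

print scalar(@nested), "\n";   # 2 (one entry per line)
print scalar(@flat),   "\n";   # 7 (one entry per word)
print "@{ $nested[1] }\n";     # check the arrays.
```

With the nested form you can still tell which word came from which line; with the flat form that information is gone.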
I am trying to print a multidimensional array (matrix) in Perl passing reference to array to subroutine.
Here is my code:
sub print_matrix(\@) {
my $array = shift;
for my $i ( 0 .. $#{ $array } ) {
my $row = $array[$i];
for my $j ( 0 .. $#($row) ) {
print $array[$i][$j];
}
}
}
Borodin tells you what was wrong with your code.
Now consider this module: Data::Dumper (available on CPAN). You can use this module to print any data structure: arrayref of arrayrefs (what you called a matrix), hashref of hashrefs, arrayref of hashrefs, hashref of arrayrefs, or any other combination of these structures for as many dimensions as you want. Of course, if you have too many dimensions, the output can become confusing.
My point is, some time ago, I was asked in an interview how I would implement this module. I thought it was a very clever question. I had to think a little because I use the module often but never bothered to figure how it works. It is in fact very simple. Imagine in your subroutine you receive a reference but you don't actually know what kind of reference it is (scalarref, arrayref, hashref, etc.), how would you determine what it is? If you have multiple possibilities, what would you do to cover all of them? Have you thought of creating a recursive function?
So, to solve your problem quickly, if you just want to print your matrix for debugging purpose, use Data::Dumper. Otherwise, if you want to do something more complex and wish to cover multiple cases, try to create a recursive function.
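As a rough illustration of the recursive idea, the sketch below uses the built-in ref function to decide how to handle each value. The name dump_data and its indentation format are made up for this example; this is not how Data::Dumper itself is implemented, and it only covers array references, hash references and plain scalars.

```perl
use strict;
use warnings;

# dump_data is a hypothetical helper, not part of Data::Dumper.
sub dump_data {
    my ($data, $depth) = @_;
    $depth = 0 unless defined $depth;
    my $pad  = '  ' x $depth;
    my $type = ref $data;          # '' for a plain scalar

    if ($type eq 'ARRAY') {
        # Recurse into every element, one level deeper.
        return $pad . "[\n"
             . join('', map { dump_data($_, $depth + 1) } @$data)
             . $pad . "]\n";
    }
    if ($type eq 'HASH') {
        # Print each key, then recurse into its value.
        return $pad . "{\n"
             . join('', map { $pad . "  $_ =>\n" . dump_data($data->{$_}, $depth + 2) }
                        sort keys %$data)
             . $pad . "}\n";
    }
    # Base case: anything that is not an array or hash reference.
    return $pad . (defined $data ? $data : 'undef') . "\n";
}

my $text = dump_data([ [ 'a', 'b' ], { k => 'v' } ]);
print $text;
```

Other reference types (code, scalar, blessed objects) would need their own branches, and a real implementation would also have to guard against circular references.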
Here's a Data::Dumper example:
my $arrayref = [
[ qw/ a b c d / ],
[ qw/ e f g h / ],
[ qw/ i j k l / ],
];
use Data::Dumper;
print Dumper $arrayref;
And here's the result you will get:
$VAR1 = [
[
'a',
'b',
'c',
'd'
],
[
'e',
'f',
'g',
'h'
],
[
'i',
'j',
'k',
'l'
]
];
Each "row" of your matrix is printed as a list of elements, separated by commas (and newlines), inside a pair of brackets. Be careful: if you pass it an array, it will print the elements one by one, and you will lose the "dimensions". If you only have an array, you have to pass it as a reference like this:
print Dumper \@array;
I hope this helps.
Using plain print is OK when all you have are single letter entries in your matrix, but a module like Text::Table can make it much easier to produce tidy output. For example,
#!/usr/bin/env perl
use strict;
use warnings;
use Text::Table;
my @matrix = map {
[ map sprintf('%.2f', -500 + rand(1000)), 1 .. 5 ]
} 1 .. 5;
my $mat = Text::Table->new;
$mat->load(@matrix);
print $mat;
Output:
-7.73 -83.85 -351.18 21.06 320.40
174.83 238.29 91.16 361.43 213.04
446.43 -4.82 322.81 10.38 -436.62
-128.05 195.68 199.05 288.39 115.30
-251.19 -329.35 244.13 -428.25 454.64
You can print a two-dimensional Perl array very simply with something like this
use strict;
use warnings;
my @arr_2d = (
[ qw/ a b c d / ],
[ qw/ e f g h / ],
[ qw/ i j k l / ],
);
print_2d(\@arr_2d);
sub print_2d {
my ($matrix) = @_;
print "@$_\n" for @$matrix;
}
output
a b c d
e f g h
i j k l
Update
Here's a working version of your own code. You weren't using array references properly and had parentheses where there should have been braces. This version also prints a space after each element and a newline after each row.
sub print_matrix {
my $array = shift;
for my $i ( 0 .. $#{ $array } ) {
my $row = $array->[$i];
for my $j ( 0 .. $#{ $row } ) {
print $array->[$i][$j], ' ';
}
print "\n";
}
}
I have a set of CSV values like this:
device name, CPU value, frequency of CPU value, CPU in percentage
For example
router1,5,10,4
router1,5,1,5
router2,5,10,4
router2,5,2,5
router3,4,5,6
router3,7,6,5
I need to form a data structure like this:
array = {
router1 => [5,10,4],[5,1,5],
router2 => [5,10,4],[5,2,5],
router3 => [4,5,6],[7,6,5]
}
I need help in forming this data structure in Perl.
I have tried visualizing how to do this but am unable to do so. I would appreciate any help on this.
The end goal for me is to convert this into a JSON object.
This should get you started. It uses the DATA file handle so that I could embed the data in the program itself. I have used to_json from the JSON module to format the hash as JSON data. The statement $_ += 0 for @values converts the contents of @values from strings to numbers, to avoid quotation marks in the resultant JSON data.
use strict;
use warnings;
use JSON;
my %data;
while (<DATA>) {
chomp;
my ($device, @values) = split /,/;
$_ += 0 for @values;
push @{ $data{$device} }, \@values;
}
print to_json(\%data, { pretty => 1, canonical => 1 });
__DATA__
router1,5,10,4
router1,5,1,5
router2,5,10,4
router2,5,2,5
router3,4,5,6
router3,7,6,5
output
{
"router1" : [
[
5,
10,
4
],
[
5,
1,
5
]
],
"router2" : [
[
5,
10,
4
],
[
5,
2,
5
]
],
"router3" : [
[
4,
5,
6
],
[
7,
6,
5
]
]
}
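To see what the $_ += 0 step changes, here is a small sketch that encodes the same values before and after the conversion. It assumes the core JSON::PP module in place of JSON, so that it runs without installing anything; the variable names are only for illustration.

```perl
use strict;
use warnings;
use JSON::PP;   # core module, used here instead of JSON (an assumption of this sketch)

my @as_strings = split /,/, '5,10,4';   # split returns strings
my @as_numbers = @as_strings;
$_ += 0 for @as_numbers;                # same numification step as in the answer

my $json           = JSON::PP->new->canonical;
my $with_quotes    = $json->encode(\@as_strings);
my $without_quotes = $json->encode(\@as_numbers);

print "$with_quotes\n";      # ["5","10","4"]
print "$without_quotes\n";   # [5,10,4]
```

The encoder decides between a JSON string and a JSON number from how Perl last used the scalar, which is why adding zero is enough.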
Here is a simple solution which prints the desired structure.
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;
my %hash;
while (my $line = <DATA>) {
chomp $line;
my ($device, @cpu_values) = split(/,/, $line);
my $cpu_token = join(",", @cpu_values);
$hash{$device} .= '[' . $cpu_token . '], ';
}
my @devices = keys %hash;
print "array = { \n";
foreach (sort @devices) {
print "$_ => [$hash{$_}]\n";
}
print "}\n";
__DATA__
router1,5,10,4
router1,5,1,5
router2,5,10,4
router2,5,2,5
router3,4,5,6
router3,7,6,5
In Perl you need to use references in the way of anonymous arrays and hashes to make multidimensional arrays, arrays of arrays, hashes containing hashes and anywhere in between. perlreftut should cover how to accomplish what you are trying to do. Here is an example I wrote the other day that could help explain as well:
print "\nFun with multidimensional arrays\n";
my @myMultiArray = ([1,2,3],[1,2,3],[1,2,3]);
for my $a (@myMultiArray){
for my $b (@{$a}){
print "$b\n";
}
}
print "\nFun with multidimensional arrays containing hashes\nwhich contains an anonymous array\n";
my @myArrayFullOfHashes = (
{'this-key'=>'this-value','that-key'=>'that-value'},
{'this-array'=>[1,2,3], 'this-sub' => sub {return 'hi'}},
);
for my $a (@myArrayFullOfHashes){
for my $b (keys %{$a}){
if (ref $a->{$b} eq 'ARRAY'){
for my $c (@{$a->{$b}}){
print "$b.$c => $c\n";
}
} elsif ($a->{$b} =~ /^CODE/){
print "$b => ". $a->{$b}() . "\n";
} else {
print "$b => $a->{$b}\n";
}
}
}
Another Perl question. I am trying to loop through a 2D array. I am sure about the size of one dimension but unsure about the second. The code snippet:
foreach my $value (@surfaces[1])
{
my $sum = 0;
my $smallest = 9999;
my $limit_surface = 0;
for (my $i = 0; $i < 3; $i++)
{
$sum += $surfaces[$i][$counter];
if ($surfaces[$i][$counter] <= $smallest)
{
$smallest = $surfaces[$i][$counter];
$limit_surface = $subchannel_number[$i];
}
}
$counter++;
push(@avg_value,$sum/@rodsurface_number);
push(@limiting_schan,$limit_surface);
push(@limiting_value,$smallest);
}
It compiles, but the $value variable is failing to initialize.
Repeat after me:
Perl does not have multidimensional arrays
Perl does not have multidimensional arrays
Perl does not have multidimensional arrays
What Perl does have are arrays that contain references pointing to other arrays. You can emulate multidimensional arrays in Perl, but they are not true multidimensional arrays. For example:
my @array;
$array[0] = [ 1, 2, 3, 4, 5 ];
$array[1] = [ 1, 2, 3 ];
$array[2] = [ 1, 2 ];
I can talk about $array[0][1], and $array[2][1], but while $array[0][3] exists, $array[2][3] doesn't exist.
If you don't understand references, read the tutorial on references.
What you need to do is go through your array and then find out the size of each subarray and go through each of those. There's no guarantee that the reference contained in your primary array actually points to another array, or that your sub-array contains only scalar data.
You can use $# to find the last index of your array. For example, $#array is the index of the last item in @array, which is one less than the number of items. You can use ( 0..$#array ) to go through each item of your array, and this way, you have the index to play around with.
use strict;
use warnings;
my @array;
$array[0] = [ 1, 2, 3, 4, 5 ];
$array[1] = [ 1, 2, 3 ];
$array[2] = [ 1, 2, ];
#
# Here's my loop for the primary array.
#
for my $row ( 0..$#array ) {
printf "Row %3d: ", $row ;
#
# My assumption is that this is another array that contains nothing
# but scalar data...
#
my @columns = @{ $array[$row] }; # Dereferencing my array reference
for my $column ( @columns ) {
printf "%3d ", $column;
}
print "\n";
}
Note that I did my @columns = @{ $array[$row] }; to convert my reference back into an array. This is an extra step. I could have simply done the dereferencing in my for loop and saved a step.
This prints out:
Row   0:   1   2   3   4   5
Row   1:   1   2   3
Row   2:   1   2
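The shorter variant mentioned above, with the dereference done directly in the loop, might look like the sketch below. It also prints the row count next to $#array to show that the two differ by one. Collecting the rows in @lines before printing is just a choice made for this example.

```perl
use strict;
use warnings;

my @array = ( [ 1, 2, 3, 4, 5 ], [ 1, 2, 3 ], [ 1, 2 ] );

# scalar(@array) is the number of rows; $#array is the last index.
printf "%d rows, last index %d\n", scalar(@array), $#array;

my @lines;
for my $row ( 0 .. $#array ) {
    my $line = sprintf("Row %3d: ", $row);
    # Dereference inside the loop itself: no intermediate @columns copy.
    $line .= sprintf("%3d ", $_) for @{ $array[$row] };
    push @lines, $line;
}
print "$_\n" for @lines;
```

The output is the same as before, with each row built from the referenced array in place.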
I could put some safety checks in here. For example, I might want to verify the size of each row, and if one row doesn't match the other, complain:
my $row_size = $#{ $array[0] }; # last index of the first row
for my $row ( 1..$#array ) {
my @columns = @{ $array[$row] };
if ( $#columns != $row_size ) {
die qq(This is not a 2D array. Not all rows are equal);
}
}
You do not describe your data structure, nor explain exactly what you want to do with it. This limits the advice that we can give to just the general variety.
If you're trying to iterate over an array of arrays, I would advise you to do it based off of element instead of index.
For example, below I have a 4 by 5 matrix of integers. I would like to find the average of these values. One way to do this is to simply iterate over each row and then column, and add up the values:
use strict;
use warnings;
my @AoA = (
[11, 12, 13, 14, 15],
[21, 22, 23, 24, 25],
[31, 32, 33, 34, 35],
[41, 42, 43, 44, 45],
);
my $sum = 0;
my $count = 0;
for my $row (@AoA) {
for my $element (@$row) { # <-- dereference the array ref
$sum += $element;
$count++;
}
}
print "Average of Matrix is " . ($sum / $count) . "\n";
Outputs:
Average of Matrix is 28
For more information on complex data structures, check out: Perl Data Structures Cookbook
I've set up some dummy variables and changed a few things around. This compiles and produces the results I show below.
This might not answer your question, but should allow you to copy and paste the code, run it yourself, edit the input and see how the output compares to what you want.
use warnings;
use strict;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;
my @surfaces = ( ['1','2','3'],
['10','20','30'],
['100','200','400'],
);
my @subchannel_number = ( ['1','2','3'],
['10','20','30'],
['100','200','400'],
);
my @rodsurface_number = (1 .. 10);
my $counter = 0;
my (@avg_value, @limiting_schan, @limiting_value);
foreach my $value ($surfaces[1]){
my $sum = 0;
my $smallest = 9999;
my $limit_surface = 0;
for (my $i = 0; $i < 3; $i++) {
$sum += $surfaces[$i][$counter];
if ($surfaces[$i][$counter] <= $smallest){
$smallest = $surfaces[$i][$counter];
$limit_surface = $subchannel_number[$i];
}
}
$counter++;
push(@avg_value,$sum/@rodsurface_number);
push(@limiting_schan,$limit_surface);
push(@limiting_value,$smallest);
}
print Dumper (\@avg_value, \@limiting_schan, \@limiting_value);
$VAR1 = [
'11.1'
];
$VAR2 = [
[
'1',
'2',
'3'
]
];
$VAR3 = [
1
];
I'm writing a piece of code that extracts some numbers from an input file, which holds information for two conditions. The code therefore extracts two numbers for each line, and compares them against each other. The snippet below works fine, but I'm having trouble understanding which of the below approaches is 'correct', and why:
Input:
gi|63100484|gb|BC094950.1|_Xenopus_tropicalis_cDNA_clone_IMAGE:7022272 C1:XLOC_017431_0.110169:4.99086,_Change:5.5015,_p:0.00265,_q:0.847141 [95.08] C2:XLOC_020690_0.050681:9.12527,_Change:7.49228,_p:0.0196,_q:0.967194 [95.08]
gi|6572468|emb|AJ251750.1|_Xenopus_laevis_mRNA_for_frizzled_4_protein_(fz4_gene) C1:XLOC_027664_1.61212:4.37413,_Change:1.44003,_p:0.00515,_q:0.999592 [99.40] C2:XLOC_032999_2.94775:14.2322,_Change:2.27147,_p:5e-05,_q:0.0438548 [99.40]
gi|68533737|gb|BC098974.1|_Xenopus_laevis_RDC1_like_protein,_mRNA_(cDNA_clone_MGC:114801_IMAGE:4632706),_complete_cds C1:XLOC_036220_0.565861:6.52476,_Change:3.52741,_p:0.00015,_q:0.21728 [99.95] C2:XLOC_043165_0.157752:2.52129,_Change:3.99843,_p:0.02115,_q:0.99976 [99.95]
gi|70672087|gb|DQ096846.1|_Xenopus_laevis_degr03_mRNA,_complete_sequence C1:XLOC_031048_0.998437:4.20942,_Change:2.07588,_p:0.01365,_q:0.999592 [99.87] C2:XLOC_037051_1.1335:4.36819,_Change:1.94624,_p:0.01905,_q:0.9452 [99.87]
gi|70672102|gb|DQ096861.1|_Xenopus_laevis_rexp44_mRNA,_complete_sequence C1:XLOC_049520_12.3353:6.30193,_Change:-0.968926,_p:0.04935,_q:0.999592 [92.90] C2:XLOC_058958_13.0419:5.10275,_Change:-1.35381,_p:0.0373,_q:0.99976 [92.90]
gi|7110523|gb|AF231711.1|_Xenopus_laevis_7-transmembrane_receptor_frizzled-1_mRNA,_complete_cds C1:XLOC_038309_0.784476:2.37536,_Change:1.59835,_p:0.0079,_q:0.999592 [99.94] C2:XLOC_045678_0.692883:3.52599,_Change:2.34735,_p:0.00125,_q:0.341583 [99.94]
#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;
my @intersect = read_file('text.txt');
my (@q1, @q2, @change_q, @q_values, @q_value1, @q_value2);
foreach (@intersect) {
chomp;
@q_value1 = ($_ =~ /C1:.*?q:(\d+\.\d+)/);
@q_value2 = ($_ =~ /C2:.*?q:(\d+\.\d+)/);
push @q_values, "C1:@q_value1\tC2:@q_value2";
if (abs $q_value1[@_] < abs $q_value2[@_]) {
push @change_q, $q_value1[@_];
}
elsif (abs $q_value2[@_] < abs $q_value1[@_]) {
push @change_q, $q_value2[@_];
}
}
print Dumper (\@q_values);
print Dumper (\@change_q);
Output:
$VAR1 = [
'C1:0.847141 C2:0.967194',
'C1:0.999592 C2:0.0438548',
'C1:0.21728 C2:0.99976',
'C1:0.999592 C2:0.9452',
'C1:0.999592 C2:0.99976',
'C1:0.999592 C2:0.341583'
];
$VAR1 = [
'0.847141',
'0.0438548',
'0.21728',
'0.9452',
'0.999592',
'0.341583'
];
This works perfectly, outputting the smaller 'q-value' for each condition. However, replacing @_ with $#_ also works.
As does this approach:
foreach (@intersect) {
chomp;
@q_value1 = ($_ =~ /C1:.*?q:(\d+\.\d+)/);
@q_value2 = ($_ =~ /C2:.*?q:(\d+\.\d+)/);
push @q_values, "C1:@q_value1\tC2:@q_value2";
my $q_value1 = $q_value1[0] // $q_value1[1];
my $q_value2 = $q_value2[0] // $q_value2[1];
if (abs $q_value1 < abs $q_value2) {
push @change_q, $q_value1;
}
elsif (abs $q_value2 < abs $q_value1) {
push @change_q, $q_value2;
}
}
print Dumper (\@q_values);
print Dumper (\@change_q);
Output:
$VAR1 = [
'C1:0.847141 C2:0.967194',
'C1:0.999592 C2:0.0438548',
'C1:0.21728 C2:0.99976',
'C1:0.999592 C2:0.9452',
'C1:0.999592 C2:0.99976',
'C1:0.999592 C2:0.341583'
];
$VAR1 = [
'0.847141',
'0.0438548',
'0.21728',
'0.9452',
'0.999592',
'0.341583'
];
"This works perfectly" is putting it a bit strong. "It works coincidentally" would be a better description. You are using the @_ array, its highest index $#_ and the number zero, and getting the same result every time. What you are not realizing is that @_ is actually empty, because it is only used when passing arguments to subroutines. So when you say
$foo[#_]
You are really saying
$foo[0]
And when you are saying
$foo[$#_]
You are really saying
$foo[-1]
For extra fun, -1 is also a valid array element, meaning the last element in the array, so for an array of size 1 or 2, it probably seems to work fine.
This is because, in scalar context, an array such as @_ returns its size, which in this case is 0. $#_ returns -1 when @_ is empty, because there is no highest index.
So, to answer your question: because using @_ is wrong and only works by accident, using the fixed numbers 0 and 1 is the better solution.
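A short sketch of the mechanism, using an explicitly empty array @empty as a stand-in for the empty @_ (an assumption made so the example behaves the same wherever it is pasted); @foo and its contents are invented for illustration.

```perl
use strict;
use warnings;

my @empty = ();                  # stands in for @_ outside any subroutine
my @foo   = ('first', 'last');

my $size       = scalar @empty;  # 0: an array in scalar context gives its size
my $last_index = $#empty;        # -1: there is no highest index

print $foo[@empty],  "\n";       # subscript is 0, so this prints "first"
print $foo[$#empty], "\n";       # subscript is -1, so this prints "last"
```

With a one-element array both subscripts land on the same element, which is why the two versions in the question appear to behave identically.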