How to print a particular column from tabular data - arrays

I'm trying to print a column from my data by using an index value outside the foreach loop.
my @col;
foreach(<DATA>){
    @x = split(' ',$_);
    @xz = ($x[0],$x[1],$x[2]) ;
    #print "$x[0]\n"; This is working but it's not what I expect.
    push(@col,@xz);
}
print "$col[0]\n";
__DATA__
7 2 3
3 2 8
6 7 2
I expect the output to be
7 3 6
How can I do it?

Always use use strict; and use warnings;!!
You have a couple of issues:
push( @col, @xz );
In this case, you're losing the row structure of your @xz array, because push flattens it into @col. After the loop, you end up with a single array that looks like this:
@col = ( 7, 2, 3, 3, 2, 8, 6, 7, 2);
So, when you print:
print "$col[0]\n";
You get that zeroth element: 7.
We can preserve the structure of the data by using a reference:
#! /usr/bin/env perl
#
use strict;           # Lets you know when you misspell variable names
use warnings;         # Warns of issues (using undefined variables)
use feature qw(say);
use Data::Dumper;
my @columns;
for my $data ( <DATA> ) {
    my @data_list = split /\s+/, $data;
    push @columns, \@data_list;
}
say Dumper \@columns;
__DATA__
7 2 3
3 2 8
6 7 2
Here you see I've included Data::Dumper to print out the structure of @columns:
$VAR1 = [
[
'7',
'2',
'3'
],
[
'3',
'2',
'8'
],
[
'6',
'7',
'2'
]
];
As you can see, each entry in the @columns array is now a reference to another array. However, printing $columns[0] isn't going to give you what you want: it refers to the zeroth row (7, 2, 3), not to the zeroth element of each row (7, 3, 6).
To do that, we need a subroutine that goes through @columns and collects the zeroth entry of each of the arrays. Here I'm creating a subroutine called fetch_index that fetches the passed index from each of the passed array references:
#! /usr/bin/env perl
#
use strict;           # Lets you know when you misspell variable names
use warnings;         # Warns of issues (using undefined variables)
use feature qw(say);
use Data::Dumper;
my @columns;
for my $data ( <DATA> ) {
    my @data_list = split /\s+/, $data;   # \s+, not \s*, so multi-digit values stay intact
    push @columns, \@data_list;
}
say join ", ", fetch_index( 0, @columns );
sub fetch_index {
    my $entry = shift;    # Entry you want from all arrays
    my @array = @_;
    my @values;
    for my $array_ref ( @array ) {
        push @values, $array_ref->[$entry];
    }
    return @values;
}
__DATA__
7 2 3
3 2 8
6 7 2
The subroutine merely goes through each array reference stored in @array and fetches the $entry value from that array reference. I push those into my @values array and return that.

my @col;
while (<DATA>) {
    push @col, (split ' ')[0];
    # push @col, /(\S+)/; # split alternative
}
print "@col\n";
__DATA__
7 2 3
3 2 8
6 7 2
output
7 3 6

Once you've absorbed the information about anonymous arrays and references in the other excellent posts here, you can start to have fun. e.g. you can often get a one liner approach to work:
perl -nE 'say [split]->[1] ' col_data.txt
would loop through the data in col_data.txt (-n creates an implicit while (<>) {} loop), split the topic variable ($_) into an anonymous array for each row, and then print the second element, or "column".
You can use the autosplit command line option (-a) to split each row into an array called @F (mnemonic: "F" for "Field"). In later versions of perl (5.20 and up), -a implies the implicit while loop (-n):
perl -anE 'say $F[1] ' col_data.txt
would be the equivalent of the previous command - printing the second column:
output:
2
2
7
There is a famous and short perl workalike for cut that is a more featureful variation on this theme, and there is this Perl Monks thread.

perl -a -F' ' -ne 'print "$F[0]\n";' data.txt
Here $F[0] is the first field; change the index to pick a different column and you will get the expected output.

You were pretty close I think. This is what I did (edited to reflect comments from @Borodin):
use strict;
use warnings;
sub getColumn {
    my ($data, $col) = @_;
    my @output = map $_->[$col], @{$data};
    return @output;
}
my @data;
while (<DATA>){
    push(@data, [split(' ',$_)]);
}
print join(' ', getColumn(\@data, 0)), "\n";
print join(' ', getColumn(\@data, 1)), "\n";
print join(' ', getColumn(\@data, 2)), "\n";
__DATA__
7 2 3
3 2 8
6 7 2
That subroutine getColumn should let you retrieve any arbitrary column. When I ran it with your data I got this for output:
7 3 6
2 2 7
3 8 2

Related

Using an array to index into another array

I have two arrays, let's call them @a1 and @a2. What I'm trying to do is obtain elements from @a2 using the values in @a1 as indices. My current attempt doesn't work properly.
foreach (@a1) {
    print $a2[$_] . "at" . $_;
}
This only prints $_ but not $a2[$_].
I sense there is a trivial solution to this, but I just can't find it.
There is nothing wrong with the code you have. I have tested a small script and it works as expected. As I suggested in my comment, try using something like Data::Dumper to see what's in the arrays before the loop.
use strict;
use warnings;
use Data::Dumper;
my @a1 = (0..4);
my @a2 = ("a".."e");
print Dumper \@a1, \@a2;
foreach (@a1){
    print $a2[$_]." at ".$_."\n";
}
OUTPUT
$VAR1 = [
0,
1,
2,
3,
4
];
$VAR2 = [
'a',
'b',
'c',
'd',
'e'
];
a at 0
b at 1
c at 2
d at 3
e at 4
There's no reason your code shouldn't work, as long as the values in the first array are valid indices into the second array. But if all you really want is the indices and values of the second array, you could just do:
for my $i (0..$#a2) {
    print "$i: $a2[$i]","\n";
}
$#a2 is the index of the last element of the array.
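If the goal is simply to pull several elements out of @a2 at once, an array slice does the indexing in a single expression. A minimal sketch with made-up sample data; the helper name pick_by_index is hypothetical and only there for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return the elements of @$values found at the
# positions listed in @$indices.
sub pick_by_index {
    my ($indices, $values) = @_;
    return @{$values}[ @{$indices} ];    # array slice through a reference
}

my @a1 = (0, 2, 4);
my @a2 = ("a" .. "e");

my @picked = pick_by_index(\@a1, \@a2);
print "@picked\n";    # prints "a c e"
```

With plain arrays in scope, the same thing is just @a2[@a1]; an index past the end of @a2 yields undef, which is worth checking for if the indices come from outside.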

Reverse the order of key, value in an array conversion to a hash

Suppose I have an array of values, then keys (the reverse of what an assignment to a hash would expect):
use strict;
use warnings;
use Data::Dump;
my @arr = qw(1 one 2 two 3 three 4 four 1 uno 2 dos 3 tres 4 cuatro);
my %hash = @arr;
dd \%hash;
Prints
{ 1 => "uno", 2 => "dos", 3 => "tres", 4 => "cuatro" }
Obviously, the duplicate keys are eliminated when the hash is constructed.
How can I reverse the order of the pairs of values used to construct the hash?
I know that I can write a C style loop:
for(my $i=1; $i<=$#arr; $i=$i+2){
$hash{$arr[$i]}=$arr[$i-1];
}
dd \%hash;
# { cuatro => 4, dos => 2, four => 4, one => 1, three => 3, tres => 3, two => 2, uno => 1 }
But that seems a little clumsy. I am looking for something a little more idiomatic Perl.
In Python, I would just do dict(zip(arr[1::2], arr[0::2]))
Use reverse:
my %hash = reverse @arr;
A list of the built-in functions in Perl is in perldoc perlfunc.
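To see why this works: reverse flips the whole list, so every (value, key) pair becomes (key, value), and the pairs themselves come out in the opposite order. The order only matters when a key repeats, in which case the pair that came first in the original list is assigned last and wins. A small sketch with made-up data; invert_pairs is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a (value, key, value, key, ...) list into a hash.
sub invert_pairs {
    my %hash = reverse @_;   # reversed list reads (key, value, key, value, ...)
    return \%hash;
}

# "one" appears twice as a key; the earlier pair (1 one) ends up winning.
my $hash = invert_pairs(qw(1 one 2 two 3 one));
print join(", ", map { "$_ => $hash->{$_}" } sort keys %$hash), "\n";
# prints "one => 1, two => 2"
```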
TLP has the right answer, but the other way to avoid eliminating dup keys is to use hash of arrays. I am assuming that's the reason for you reversing the array in the first place.
use strict;
use warnings;
use Data::Dump;
my @arr = qw(1 one 2 two 3 three 4 four 1 uno 2 dos 3 tres 4 cuatro);
my %hash;
push @{ $hash{$arr[$_]} }, $arr[$_ + 1] for grep { not $_ % 2 } 0 .. $#arr;
dd \%hash;
Output:
{
1 => ["one", "uno"],
2 => ["two", "dos"],
3 => ["three", "tres"],
4 => ["four", "cuatro"],
}
As suggested by ikegami in the comments, you can take a look at the List::Pairwise module available on CPAN for a more readable solution:
use strict;
use warnings;
use Data::Dump;
use List::Pairwise qw( mapp );
my @arr = qw(1 one 2 two 3 three 4 four 1 uno 2 dos 3 tres 4 cuatro);
my %hash;
mapp { push @{ $hash{$a} }, $b } @arr;
dd \%hash;
TLP has the right answer if your array of value, keys are ready to go into the hash.
That said, if you want to process the key or value in any way before they go into the hash, I find this to be something I use:
while (my ($v, $k)=(shift @arr, shift @arr)) {
    last unless defined $k;
    # transform $k or $v in some way, like $k=~s/\s*$//; to strip trailing whitespace...
    $hash{$k}=$v;
}
(Note: this is destructive to the array @arr. If you want to use @arr for something else, make a copy of it first.)
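If @arr has to survive, the same per-pair processing can be done without shift. A sketch using pairmap from List::Util (it needs a reasonably recent List::Util); the helper name hash_from_value_key_list is hypothetical, and trimming trailing whitespace merely stands in for whatever transformation you need:

```perl
use strict;
use warnings;
use List::Util qw(pairmap);

# Hypothetical helper: build a key => value hash from a (value, key, ...) list
# without modifying the caller's array.
sub hash_from_value_key_list {
    my @list = @_;
    my %hash = pairmap {
        my ($v, $k) = ($a, $b);   # $a is the value, $b the key
        $k =~ s/\s+$//;           # example transformation: strip trailing whitespace
        ($k => $v);
    } @list;
    return \%hash;
}

my @arr  = ('1', 'one ', '2', 'two');
my $hash = hash_from_value_key_list(@arr);
print "$_ => $hash->{$_}\n" for sort keys %$hash;
print scalar(@arr), " elements still in \@arr\n";
```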

Perl array element manipulation

I've been trying and trying with this one, but it just doesn't seem to click.
If I have an array with let's say 6 numbers:
@a = (1,2,3,4,5,6)
How do I get every second element (2, 4, 6 in this case)?
How do I compute the difference of every two elements? The output here would be:
1 1 1 (because 2-1 = 1, 4-3 = 1, and so on)
Note: don't ever use $a or $b, they're special (sort uses them) ... it's generally better to give your variables a descriptive name, name it as to what's in there rather than what type of variable it is.
for ( my $index = 0; $index < scalar( @pairs ); $index += 2 ) {
    my $first = $pairs[ $index + 0 ];
    my $second = $pairs[ $index + 1 ];
    my $pair = $index / 2;
    my $difference = $second - $first;
    print "the difference of pair $pair is $difference\n";
}
I think you should post your earlier attempts. In my opinion, the best way to learn is to learn from your mistakes, not being presented a correct solution.
For this problem, I think I would use a C-style for-loop for the first part, simply because it is straightforward, and can easily be tweaked if some new requirement comes up.
The second problem can easily be solved using a regular Perl-style for-loop.
use strict;
use warnings; # always use these two pragmas
my @nums = 1..6;
my @idx;
for (my $n = 0; $n <= $#nums; $n += 2) { # loop from 0 to max index, step 2
    push @idx, $n; # store number in @idx
}
print "Indexes: @idx\n";
my @diff;
for my $n (0 .. $#nums - 1) { # loop from 0 to max index minus 1
    push @diff, $nums[$n + 1] - $nums[$n]; # store diff in @diff
}
print "Diff: @diff\n";
Output:
Indexes: 0 2 4
Diff: 1 1 1 1 1
Try this:
use strict;
use warnings;
my $index = 1;
my @a = (1,2,3,4,5,6);
for (@a) {
    if ($index % 2 == 0) {
        my $diff = $_ - $a[$index-2];
        print $diff;
    }
    $index++;
}
You likely want to use the new List::Util pair functions.
For your first question:
use List::Util 'pairvalues';
my @seconds = pairvalues @list; # yields (2, 4, 6)
For your second question:
use List::Util 'pairmap';
my @diffs = pairmap { $b-$a } @list; # yields (1, 1, 1)
You can use map:
my @a = 1 .. 6;
print join ' ', 'Every second:', map $a[ 1 + $_ * 2 ], 0 .. $#a / 2;
print "\n";
print join ' ', 'Differences:', map $a[ 1 + $_ * 2 ] - $a[ $_ * 2 ], 0 .. $#a / 2;
print "\n";
First: Don't use variables named a and b. $a and $b are special variables used in sorting. Just be a bit more descriptive with your variable names (even if it's merely @my_array) and you should be fine.
You can loop through your array any which way you like. However, I prefer a while loop to the three-part for, because the three-part for loop is a while loop in disguise and the indexing it seems to promise can be misleading.
#! /usr/bin/env perl
use warnings;
use strict;
use feature qw(say);
my @array = qw( 1 2 3 4 5 6 );
my $index = 1; # Remember Perl indexes start at zero!
while ( $index <= $#array ) {
    say "Item is $array[$index]";
    say "The difference is " . ($array[$index] - $array[$index-1]);
    $index += 2;
}
You said every second element. Indexes of arrays start at 0, so you want the odd number elements. Most of the answers use map which is a very nice little command, but does an awful lot in a single line which can make it confusing for a beginner. Plus, I don't think the Perldoc on it is very clear. There should be more simple examples.
The say is a newer version of print. However say always adds a \n at the end. You should always use strict; and use warnings;. These will catch about 90% of your programming bugs.
The qw( ... ) is a quick way to make an array. Each word becomes an array element. You don't need quotes or commas.
#!/usr/bin/perl
use strict;
use warnings;
my @ar = (1, 2, 3, 4, 5, 6);
# 1. How do I get every second index ( 2, 4, 6) in this case?
my @even = map { $_ & 1 ? $ar[$_] : () } 0 .. $#ar;
# 2. how do I compute the difference of every two elements?
my (@c, @diff) = @ar;
push @diff, -1 * (shift(@c) - shift(@c)) while @c;
use Data::Dumper;
print Dumper \@even;
print Dumper \@diff;
1;
__END__
$VAR1 = [
2,
4,
6
];
$VAR1 = [
1,
1,
1
];

How to create multiple multi-dimensional arrays from STDIN in perl?

So I'm getting input from STDIN like:
1 2 3
4 5 6
7 6 3
4 3 2
2 3 5
2 5 1
Blank lines separate the matrices, so the above input should create two multi-dimensional arrays...I know how to create one (code below), but how do I create multiple ones depending on how many blank lines the user inputs?
I won't know how many arrays the user wants to create so how can I dynamically create arrays depending on the blank lines in the user input?
my @arrayrefs;
while(<>)
{
    chomp;
    my @data = split(/\s+/,$_);
    push @arrayrefs, \@data;
}
for $ref (@arrayrefs){
    print "[@$ref] \n";
}
With your data, I'd say using paragraph mode for the input stream would be a good idea. That is basically setting the input record separator $/ to "\n\n", but in this case we will use "", which is a bit more magical in that it is flexible with extra blank lines.
use strict;
use warnings;
use Data::Dumper;
sub parse_data {
    my @matrix = map { [ split / / ] } split /\n/, shift;
    return \@matrix;
}
my @array;
$/ = "";
while (<>) {
    push @array, parse_data($_);
}
print Dumper \@array;
The map/split statement is not as complex as it looks. Reading from right to left:
shift an argument from the argument list @_
split that argument on newline
take each of those pieces (i.e. map over them), split it again on space, and put the result inside an anonymous array, using brackets [ ].
All done.
It won't win any Code Golf competition, but it does seem to work:
$ cat data
1 2 3
4 5 6
7 6 3
4 3 2
2 3 5
2 5 1
$ cat xx.pl
#!/usr/bin/env perl
use strict;
use warnings;
my @matrices;
my @matrix;
sub print_matrices()
{
    print "Matrix dump\n";
    foreach my $mref (@matrices)
    {
        foreach my $rref (@{$mref})
        {
            foreach my $num (@{$rref})
            {
                print " $num";
            }
            print "\n";
        }
        print "\n";
    }
}
while(<>)
{
    chomp;
    if ($_ eq "")
    {
        my(@result) = @matrix;
        push @matrices, \@result;
        @matrix = ();
    }
    else
    {
        my @row = split(/\s+/,$_);
        push @matrix, \@row;
    }
}
# In case the last line of the file is not a blank line
if (scalar(@matrix) != 0)
{
    my(@result) = @matrix;
    push @matrices, \@result;
    @matrix = ();
}
print_matrices();
$ perl xx.pl data
Matrix dump
1 2 3
4 5 6
7 6 3
4 3 2
2 3 5
2 5 1
$
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my @arrays = ([]);
while (<>) {
    if (my @array = /(\d+)/g) {
        # explicit dereference: push on a bare reference was removed in perl 5.24
        push @{ $arrays[-1] }, \@array;
    } else {
        push @arrays, [];
    }
}
$Data::Dumper::Indent = 0;
printf("%s\n", Dumper $arrays[0]);
printf("%s\n", Dumper $arrays[1]);
Output:
$VAR1 = [['1','2','3'],['4','5','6'],['7','6','3']];
$VAR1 = [['4','3','2'],['2','3','5'],['2','5','1']];

How can I create a two-dimensional array in Perl?

I am currently trying to pass a 32 by 48 matrix file to a multi-dimensional array in Perl. I am able to access all of the values, but I am having issues accessing a specific value.
Here is a link to the data set:
http://paste-it.net/public/x1d5301/
Here is what I have for code right now.
#!/usr/bin/perl
open FILE, "testset.txt" or die $!;
my @lines = <FILE>;
my $size = scalar @lines;
my @matrix = (1 .. 32);
my $i = 0;
my $j = 0;
my @micro;
foreach ($matrix)
{
    foreach ($lines)
    {
        push @{$micro[$matrix]}, $lines;
    }
}
It doesn't seem you understand that $matrix only indicates @matrix when it is immediately followed by an array indexer: [ $slot ]. Otherwise, $matrix is a completely different variable from @matrix (and both different from %matrix as well). See perldata.
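A short sketch of that distinction, with made-up values: the sigil says what kind of thing you get back, and only the bracket after the name says which variable is being indexed.

```perl
use strict;
use warnings;

my @matrix = (10, 20, 30);   # an array
my $matrix = "a scalar";     # an unrelated scalar that merely shares the name

print "$matrix[1]\n";        # element 1 of @matrix: prints 20
print "$matrix\n";           # the scalar: prints "a scalar"
print scalar(@matrix), "\n"; # number of elements in @matrix: prints 3
```

Under use strict, a loop such as foreach ($matrix) with no scalar $matrix declared is a compile-time error, which is one more reason to enable it.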
#!/usr/bin/perl
use English;
Don't! use English--that way!
This brings in $MATCH, $PREMATCH, and $POSTMATCH and incurs the dreaded $&, $`, $' penalty. You should wait until you're using an English variable and then just import that.
open FILE, "testset.txt" or die $!;
Two things: 1) use lexical file handles, and 2) use the three-argument open.
my @lines = <FILE>;
As long as I'm picking: Don't slurp big files. (Not the case here, but it's a good warning.)
my $size = scalar @lines;
my @matrix = (1 .. 32);
my $i = 0;
my $j = 0;
my @micro;
I see we're at the "PROFIT!!" stage here...
foreach ($matrix) {
You don't have a variable $matrix; you have a variable @matrix.
foreach ($lines) {
The same thing is true with $lines.
push @{ $micro[$matrix]}, $lines;
}
}
Rewrite:
use strict;
use warnings;
use English qw<$OS_ERROR>; # $!
open( my $input, '<', 'testset.txt' ) or die $OS_ERROR;
# I'm going to assume space-delimited, since you don't show
my @matrix;
# while ( defined( $_ = <$input> ))...
while ( <$input> ) {
    chomp; # strip off the record separator
    # Load each slot of @matrix with a reference to an array filled with
    # the line split by spaces.
    push @matrix, [ split ]; # split = split( ' ', $_ )
}
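Once the matrix is filled this way, a specific value is addressed by row and then column, both counted from zero. A minimal sketch, assuming a small inline string in place of testset.txt:

```perl
use strict;
use warnings;

# Stand-in for the file contents; the real data would come from testset.txt.
my $text = "1 2 3\n4 5 6\n7 8 9\n";

# One array reference per line, each holding that line's fields.
my @matrix = map { [ split ' ', $_ ] } split /\n/, $text;

print $matrix[1][2], "\n";     # row 1, column 2: prints 6
print "@{ $matrix[0] }\n";     # the whole first row: prints "1 2 3"
print scalar(@matrix), " rows, ", scalar( @{ $matrix[0] } ), " columns\n";
# prints "3 rows, 3 columns"
```

$matrix[1][2] is shorthand for $matrix[1]->[2]; the arrow between adjacent subscripts is optional.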
If you are going to be doing quite a bit of math, you might consider PDL (the Perl Data Language). You can easily set up your matrix and perform operations on it:
use 5.010;
use PDL;
use PDL::Matrix;
my @rows;
while( <DATA> ) {
    chomp;
    my @row = split /\s+/;
    push @rows, \@row;
}
my $a = PDL::Matrix->pdl( \@rows );
say "Start ", $a;
$a->index2d( 1, 2 ) .= 999;
say "(1,2) to 999 ", $a;
$a++;
say "Increment all ", $a;
__DATA__
1 2 3
4 5 6
7 8 9
2 3 4
The output shows the matrix evolution:
Start
[
[1 2 3]
[4 5 6]
[7 8 9]
[2 3 4]
]
(1,2) to 999
[
[ 1 2 3]
[ 4 5 999]
[ 7 8 9]
[ 2 3 4]
]
Increment all
[
[ 2 3 4]
[ 5 6 1000]
[ 8 9 10]
[ 3 4 5]
]
There's quite a bit of power to run arbitrary and complex operations on every member of the matrix just like I added 1 to every member. You completely skip the looping acrobatics.
Not only that, PDL does a lot of special stuff to make math really fast and to have a low memory footprint. Some of the stuff you want to do may already be implemented.
You probably need to chomp the values:
chomp( my @lines = <FILE> );
To clarify a tangential point to Axeman's answer:
See perldoc -f split:
A split on /\s+/ is like a split(' ') except that any leading whitespace produces a null first field. A split with no arguments really does a split(' ', $_) internally.
#!/usr/bin/perl
use YAML;
$_ = "\t1 2\n3\f4\r5\n";
print Dump { 'split' => [ split ] },
{ "split ' '" => [ split ' ' ] },
{ 'split /\s+/' => [ split /\s+/ ] }
;
Output:
---
split:
- 1
- 2
- 3
- 4
- 5
---
split ' ':
- 1
- 2
- 3
- 4
- 5
---
split /\s+/:
- ''
- 1
- 2
- 3
- 4
- 5
I see the question is pretty old, but as the author has just edited the question, perhaps this is still of interest. Also the link to the data is dead, but since other answers use space as the separator, I will too.
This answer demonstrates Tie::Array::CSV which allows random access to a CSV (or other file parsable with Text::CSV).
#!/usr/bin/env perl
use strict;
use warnings;
use Tie::Array::CSV;
## put DATA into temporary file
## if not using DATA, put file name in $file
use File::Temp ();
my $file = File::Temp->new();
print $file <DATA>;
##
tie my @data, 'Tie::Array::CSV', $file, {
text_csv => {
sep_char => " ",
},
};
print $data[1][2];
__DATA__
1 2 3 4 5
6 7 8 9 1
2 3 4 5 6

Resources