Related
I'm pretty new to Perl and this is my most complex project yet. Apologies if any parts of my explanation don't make sense or I miss something out - I'll be happy to provide further clarification. It's only one line of code that's causing me an issue.
The Aim:
I have a text file that contains a single column of data. It reads like this:
0
a,a,b,a
b,b,b,a
1
a,b,b,a
b,b,b,a
It continues like this with a number in ascending order up to 15, and the following two lines after each number are a combination of four a's or b's separated by commas. I have tied this file to an array #diplo so I can specify specific lines of it.
I also have got a file that contains two columns of data with headers that I have converted into a hash of arrays (with each of the two columns being an array). The name of the hash is $lookup and the array names are the names of the headings. The actual arrays only start from the first value in each column that isn't a heading. This file looks like this:
haplo frequency
"|5,a,b,a,a|" 0.202493719
"|2,b,b,b,a|" 0.161139191
"|3,b,b,b,a|" 0.132602458
This file contains all of the possible combinations of a or b at the four positions combined with all numbers 0-14 and their associated frequencies. In other words, it includes all possible combinations from "|0,a,a,a,a|" followed be "|1,a,a,a,a|" through to "|13,b,b,b,b|" and "|14,b,b,b,b|".
I want my Perl code to go through each of the combinations of letters in #diplo starting with a,a,b,a and record the frequency associated with the row of the haplo array containing each number from 0-14, e.g. first recording the frequency associated with "|0,a,a,b,a|" then "|1,a,a,b,a|" etc.
The output would hopefully look like this:
0 #this is the number in the #diplo file and they increase in order from 0 up to 15
0.011 0.0023 0.003 0.0532 0.163 0.3421 0.128 0.0972 0.0869 0.05514 0.0219 0.0172 0.00824 0.00886 0.00196 #these are the frequencies associated with x,a,a,b,a where x is any number from 0 to 14.
My code:
And here is the Perl code I created to hopefully sort this out (there is more to create the arrays and such which I can post if required, but I didn't want to post a load of code if it isn't necessary):
my $irow = 1; #this is the row/element number in #diplo
my $lrow = 0; #this is the row/element in $lookup{'haplo'}
my $copynumber = 0;
#print "$copynumber, $diplo[2]";
while ($irow < $diplolines - 1) {
while ($copynumber < 15) {
while ($lrow < $uplines - 1) {
if ("|$copynumber,$diplo[$irow]|" = $lookup{'haplo'}[$lrow]) { ##this is the only line that causes errors
if ($copynumber == 0) {
print "$diplo[$irow-1]\n";
#print "$lookup{'frequency'}[$lrow]\t";
}
print "$lookup{'frequency'}[$lrow]\t";
}
$lrow = $lrow + 1;
}
$lrow = 0;
$copynumber = $copynumber + 1;
}
$lrow = 0;
$copynumber = 0;
$irow = $irow + 1;
}
However, the line if ("|$copynumber,$diplo[$irow]|" = $lookup{'haplo'}[$lrow]) is causing an error Can't modify string in scalar assignment near "]) ".
I have tried adding in speech marks, rounded brackets and apostrophes around various elements in this line but I still get some sort of variant on this error. I'm not sure how to get around this error.
Apologies for the long question, any help would be appreciated.
EDIT: Thanks for the suggestions regarding eq, it gets rid of the error and I now know a bit more about Perl than I did. However, even though I don't get an error now, if I put anything inside the if loop for this line, e.g. printing a number, it doesn't get executed. If I put the same command within the while loop but outside of the if, it does get executed. I have strict and warnings on. Any ideas?
= is assignment, == is numerical comparison, eq is string comparison.
You can't modify a string:
$ perl -e 'use strict; use warnings; my $foo="def";
if ("abc$foo" = "abcdef") { print "match\n"; } '
Found = in conditional, should be == at -e line 1.
Can't modify string in scalar assignment at -e line 1, near ""abcdef") "
Execution of -e aborted due to compilation errors.
Nonnumerical strings act like zeroes in a numerical comparison:
$ perl -e 'use strict; use warnings; my $foo="def";
if ("abc$foo" == 0) { print "match\n"; } '
Argument "abcdef" isn't numeric in numeric eq (==) at -e line 1.
match
A string comparison is probably what you want:
$ perl -e 'use strict; use warnings; my $foo="def";
if ("abc$foo" eq "abcdef") { print "match\n"; } '
match
This is the problematic expression:
"|$copynumber,$diplo[$irow]|" = $lookup{'haplo'}[$lrow]
The equals sign (=) is an assignment operator. It assigns the value on its right-hand side to the variable on its left-hand side. Therefore, the left-hand operand needs to be a variable, not a string as you have here.
I don't think you want to do an assignment here at all. I think you're trying to check for equality. So don't use an assignment operator, use a comparison operator.
Perl has two equality comparison operators. == does a numeric comparison to see if its operands are equal and eq does a string comparison. Why does Perl need two operators? Well Perl converts automatically between strings and numbers so it can't possibly know what kind of comparison you want to do. So you need to tell it.
What's the difference between the two types of comparison? Well, consider this code.
$x = '0';
$y = '0.0';
Are $x and $y equal? Well it depends on the kind of comparison you do. If you compare them as numbers then, yes, they are the same value (zero is the same thing whether it's an integer or a real number). But if you compare them as strings, they are different (they're not the same length for a start).
So we now know the following
$x == $y # this is true as it's a numeric comparison
$x eq $y # this is false as it's a string comparison
So let's go back to your code:
"|$copynumber,$diplo[$irow]|" = $lookup{'haplo'}[$lrow]
I guess you started with == here.
"|$copynumber,$diplo[$irow]|" == $lookup{'haplo'}[$lrow]
But that's not right as |$copynumber,$diplo[$irow]| is clearly as string, not a number. And Perl will give you a warning if you try to do a numeric comparison using a value that doesn't look like a number.
So you changed it to = and that doesn't work either as you've now changed it to an assignment.
What you really need is a string comparison:
"|$copynumber,$diplo[$irow]|" eq $lookup{'haplo'}[$lrow]
How to get the length of an anonymous list?
perl -E 'say scalar ("a", "b");' # => b
I expected scalar to return the list in a scalar context - its length.
Why it returns the second (last) element?
It works for an array:
perl -E 'my #lst = ("a", "b"); say scalar #lst;' # => 2
You can use
my $n = () = f();
As applied to your case, it's
say scalar( () = ("a", "b") );
or
say 0+( () = ("a", "b") );
First, let's clear up a misconception.
You appear to believe that some operators evaluate to some kind of data structure called a list regardless of context, and that this list returns its length when coerced into scalar context.
All of that is incorrect.
An operator must evaluate to exactly one scalar in scalar context, and a sub must return exactly one scalar in scalar context. In list context, operators can evaluate to any number of scalars, and subs can return any number of scalars. So when we say an operator evaluates to a list, and when we say a sub returns a list, we aren't referring to some data structure; we are simply using "list" as a shorthand for "zero or more scalars".
Since there's no such thing as a list data structure, it can't be coerced into a scalar. Context isn't a coercion; context is something operators check to determine to what they evaluate in the first place. They literally let context determine their behaviour and what they return. It's up to each operator to decide what they return in scalar and list context, and there's a lot of variance.
As you've noted,
The #a operator in scalar context evaluates to a single scalar: the length of the array.
The comma operator in scalar context evaluates to a single scalar: the same value as its last operand.
The qw operator in scalar context evaluates to a single scalar: the last value it would normally return.
On to your question.
To determine to how many scalars an operator would evaluate when evaluated in list context, we need to evaluate the operator in list context. An operator always evaluates to a single scalar in scalar context, so your attempts to impose a scalar context are ill-founded (unless the operator happens to evaluate to the length of what it would have returned in list context, as is the case for #a, but not for many other operators).
The solution is to use
my $n = () = f();
The explanation is complicated.
One way
perl -wE'$len = () = qw(a b c); say $len' #--> 3
The = () = "operator" is a play on context. It forces list context on its right side and assigns the length of the list. See this post about list vs scalar assignments and this page for some thoughts on all this.
If this need be used in a list context then the LHS context can also be forced by scalar, like
say scalar( () = qw(a b c) );
Or by yet other ways (0+...), but scalar is in this case actually suitable, and clearest.
In your honest attempt scalar imposes the scalar context on its operand -- or here an expression, which is thus evaluated by the comma operator, whereby one after another term is discarded, until the last one which is returned.
You'd get to know about that with warnings on, as it would emit
Useless use of a constant ("a") in void context at -e line 1
Warnings can always be enabled in one-liners as well, with -w flag. I recommend that.
I'd like to also comment on the notion of a "list" in Perl, often misunderstood.
In programming text a "list" is merely a syntax device, that code can use; a number of scalars, perhaps submitted to a function, or assigned to an array variable, or so. It is often identified by parenthesis but those really only decide precedence and don't "make" anything nor give a "list" any sort of individuality, like a variable has; a list is just a grouping of scalars.
Internally that's how data is moved around; a "list" is a fleeting bunch of scalars on a stack, returned somewhere and gone.
A list is not -- not -- any kind of a data structure or a data type; that would be an array. See for instance a perlfaq4 item and this related page.
Perl references are hard. I'm not sending you to read prelref since it's not something that anyone can just read and start using.
Long story short, use this pattern to get an anonymous array size: 0+#{[ <...your array expression...> ]}
Example:
print 0+#{[ ("a", "b") ]};
I have to solve Sudoku puzzles in the format of a vector containing 9 vectors (of length 9 each). Seeing as vectors are linked lists in Prolog, I figured the search would go faster if I transformed the puzzles in a 2D array format first.
Example puzzle:
puzzle(P) :- P =
[[_,_,8,7,_,_,_,_,6],
[4,_,_,_,_,9,_,_,_],
[_,_,_,5,4,6,9,_,_],
[_,_,_,_,_,3,_,5,_],
[_,_,3,_,_,7,6,_,_],
[_,_,_,_,_,_,_,8,9],
[_,7,_,4,_,2,_,_,5],
[8,_,_,9,_,5,_,2,3],
[2,_,9,3,_,8,7,6,_]].
I'm using ECLiPSe CLP to implement a solver. The best I've come up with so far is to write a domain like this:
domain(P):-
dim(P,[9,9]),
P[1..9,1..9] :: 1..9.
and a converter for the puzzle (parameter P is the given puzzle and Sudoku is the new defined grid with the 2D array). But I'm having trouble linking the values from the given initial puzzle to my 2D array.
convertVectorsToArray(Sudoku,P):-
( for(I,1,9),
param(Sudoku,P)
do
( for(J,1,9),
param(Sudoku,P,I)
do
Sudoku[I,J] is P[I,J]
)
).
Before this, I tried using array_list (http://eclipseclp.org/doc/bips/kernel/termmanip/array_list-2.html), but I kept getting type errors. How I did it before:
convertVectorsToArray(Sudoku,P):-
( for(I,1,9),
param(Sudoku,P)
do
( for(J,1,9),
param(Sudoku,P,I)
do
A is Sudoku[I],
array_list(A,P[I])
)
).
When my Sudoku finally outputs the example puzzle P in the following format:
Sudoku = []([](_Var1, _Var2, 8, 7, ..., 6), [](4, ...), ...)
then I'll be happy.
update
I tried again with the array_list; it almost works with the following code:
convertVectorsToArray(Sudoku,P):-
( for(I,1,9),
param(Sudoku,P)
do
X is Sudoku[I],
Y is P[I],
write(I),nl,
write(X),nl,
write(Y),nl,
array_list(X, Y)
).
The writes are there to see how the vectors/arrays look like. For some reason, it stops at the second iteration (instead of 9 times) and outputs the rest of the example puzzle as a vector of vectors. Only the first vector gets assigned correctly.
update2
While I'm sure the answer given by jschimpf is correct, I also figured out my own implementation:
convertVectorsToArray(Sudoku,[],_).
convertVectorsToArray(Sudoku,[Y|Rest],Count):-
X is Sudoku[Count],
array_list(X, Y),
NewCount is Count + 1,
convertVectorsToArray(Sudoku,Rest,NewCount).
Thanks for the added explanation on why it didn't work before though!
The easiest solution is to avoid the conversion altogether by writing your puzzle specification directly as a 2-D array. An ECLiPSe "array" is simply a structure with the functor '[]'/N, so you can write:
puzzle(P) :- P = [](
[](_,_,8,7,_,_,_,_,6),
[](4,_,_,_,_,9,_,_,_),
[](_,_,_,5,4,6,9,_,_),
[](_,_,_,_,_,3,_,5,_),
[](_,_,3,_,_,7,6,_,_),
[](_,_,_,_,_,_,_,8,9),
[](_,7,_,4,_,2,_,_,5),
[](8,_,_,9,_,5,_,2,3),
[](2,_,9,3,_,8,7,6,_)).
You can then use this 2-D array directly as the container for your domain variables:
sudoku(P) :-
puzzle(P),
P[1..9,1..9] :: 1..9,
...
However, if you want to keep your list-of-lists puzzle specification, and convert that to an array-of-arrays format, you can use array_list/2. But since that only works for 1-D arrays, you have to convert the nesting levels individually:
listoflists_to_matrix(Xss, Xzz) :-
% list of lists to list of arrays
( foreach(Xs,Xss), foreach(Xz,Xzs) do
array_list(Xz, Xs)
),
% list of arrays to array of arrays
array_list(Xzz, Xzs).
As for the reason your own code didn't work: this is due to the subscript notation P[I]. This
requires P to be an array (you were using it on lists)
works only in contexts where an arithmetic expression is expected, e.g. the right hand side of is/2, in arithmetic constraints, etc.
I cam across the code below online where it's trying to add two array. Can anyone explain what it is calculating to get 14?
my #a = (1,2,5)+(8,9);
print "#a";
output: 14
Output is 14 as $a[0] is 14 => 5+9
+ operator imposes scalar context on both lists so last elements are taken and added,
# in scalar context $x is assigned with last element
my $x = (1,2,5);
print "\$x is $x\n";
outputs $x is 5
warnings pragma would also complain, giving you a hint that something fishy is going on,
Useless use of a constant (8) in void context
Starting with:
my #a = (1,2,5)+(8,9);
When using a list in a scalar context, the last element is returned. Consult What is the difference between a list and an array? for details.
Therefore the above two lists reduce to:
my #a = 5 + 9;
Which mathematically equals:
my #a = (14);
I'm a total Perl newb, but still cannot believe I cannot figure this out with all the info I've read through online, but, I've burned too much time and am suffering from block at this point. Hoping to learn something based on my real life example...
Ok, I think I have an array of arrays, created like this:
my #array1 = ();
my #array2 = ();
my $ctr1 = 0;
my $col;
[sql query]
while(($col)=$sth->fetchrow_array() ) {
$array1[$ctr1]=$col;
$ctr1++;
}
print STDERR "#array1";
##results in 10 rows, a mac address in each
##00:00:00:00:00:00 00:11:11:11:11:11 22:22:22:22:22:22 33:33:33:33:33:33 ...
Now I do another query. While looping through results, I am looking for those 10 mac addresses. When I find one, I add a row to array2 with the mac and the sequential number accumulated to the point, like this:
[sql query]
while(($col)=$sth->fetchrow_array() ) {
$ctr2++;
if( my ($matched) = grep $_ eq $col, #array1 ) {
push( #array2, ($col,$ctr2) );
}
}
print STDERR "#array2";
##results in 10 rows, a mac address and an integer in each
##00:00:00:00:00:00 2 00:11:11:11:11:11 24 22:22:22:22:22:22 69 33:33:33:33:33:33 82 ...
Now the easy part. I want to loop through array2, grabbing the mac address to use as part of a sql query. Therein lies the problem. I am so ignorant as to exactly what I am doing that even though I had it almost working, I can't get back to that point. Ignorance is definitely not bliss.
When I loop through array2, I am getting a host of errors, based on the different forms of the statement. The one I think is right is listed below along with the error message...
my $ctr3 = 0;
foreach $ctr3 (#array2) {
my $chkmac = $array2[$ctr3][0]; <--- gacks here with the error below - line 607
[SQL query]
[Thu May 30 14:05:09 2013] [error] Can't use string ("00:66:55:77:99:88") as an ARRAY ref while "strict refs" in use at /path/to/test.cgi line 607.\n
I believe the issue is that my array of arrays is not an array of arrays. If it were, it would work as coded, or so I think from the reading... That said, I cannot fathom what I am dealing with otherwise. This will be a head slapper I'm all but sure, but I am stumped.... Little help, please?
TIA
O
For an array of arrays you want to use an array reference, e.g.
push #array2, [$col, $ctr2];
When accessing an element within an array refernce, you'll want to use the -> operator. Also, when looping through an array, it's not necessary to index back into that same array. So the last part would look more like:
foreach $ctr3 (#array2) {
my $chkmac = $ctr3->[0];
....
When you do the foreach there, $ctrl3 won't have the index in it, it'll have the value. So you should just need to do $ctrl3->[0]. Note the -> which dereferences the array reference (#array2 is actually an array of array references).
EDIT: As AKHolland pointed out, #array2 actually isn't an array of array references, although that's what it should be. You also need to change:
push( #array2, ($col, $ctr2) );
To
push( #array2, [$col, $ctr2] );
This makes an array reference, rather than a list. A list in this context just collapses down into regular arguments to push, meaning you're pushing two separate strings into #array2.
You are correct that your array of arrays is not an array of arrays, since in Perl there is no such thing. So what do you have instead? There's two ways to see.
First, when you print #array2, you come up with a string composed of alternating MACs and counts, separated by spaces. Since the spaces sort-of-signify the division between array elements, we might surmise that what we've got is a single array of heterogeneous elements, such that element 0 is a MAC, element 1 is a count, element 2 is another MAC, and so on.
The other perspective is to look at how #array2 is constructed:
push( #array2, ($col,$ctr2) );
From the documentation for push, we find that push ARRAY LIST works by appending the elements of LIST to the end of ARRAY. This has the effect of flattening the list into the array such that its original identity as a list is lost. You can add all the parentheses you want, when Perl expects a list it flattens all of them away.
So how do you achieve the effect you want? The List-of-Lists documentation has a detailed treatment, but the short answer is that you make a list of array references. Array references are scalars and are therefore legal elements in an array. But they retain their identify as array references!
The anonymous array reference constructor is the square bracket []. In order to push an array reference containing the elements $col and $ctr2 onto the end of #array2, you simply do this:
push( #array2, [$col, $ctr2] );
The code you wrote for accessing a particular element of the array-reference-in-an-array now works. But since I've already written a bunch of paragraphs on the subject, let me finish by explaining what was wrong originally and how changing the push statements suddenly makes it work.
The expression $array2[$ctr3][0] is sometimes written as $array2[$ctr3]->[0] to clarify what it's actually doing. What it does is to take the value of $array2[$ctr3] and treat it as an array reference, taking its 0 element. If we take $ctr3 to be 0 (as it would be at the top of the loop) the value of $array2[$ctr3] is the first element, 00:00:00:00:00:00. When you then subsequently ask Perl to treat 00:00:00:00:00:00 as an array reference, Perl dies because it doesn't know how to treat a string as an array reference.
When instead the value of $array2[$ctr3] is an array reference because that is what you pushed onto #array2 when constructing it, Perl is able to do as you ask, dereferencing the array reference and looking at element 0 of the resulting array, whose value happens to be 00:00:00:00:00:00.