I have a powershell script where I group and store a collection of Conputernames in a hashtable. Everithing works fine so far.
$table = #{}
$pcs = "w7cl002","w7cl002","w7cl001","w7cl002","w7cl008", `
"w7lp001","w7lp001","w7cl008","w7cl004","w7lp001"
foreach ($pc in $pcs){
if ($table.Keys -notcontains $pc){
$Table.Add($pc,1)
}
else{
$occ = $table.get_item($pc) +1
$table.set_item($pc,$occ)
}
}
$table
This is what I want and what I get.
Name Value
---- -----
stflw7lp001 3
stflw7cl002 3
stflw7cl004 1
stflw7cl001 1
stflw7cl008 2
Initially, I wanted to do this by using a 2D-Array. But after 5 hours struggling and running my head against a wall, I gave up and realized it with a hash table.
I just would be interested in whether this is possible using a 2D array?
2D arrays (in the C# sense, not the Java sense) are icky in PowerShell. I tend to avoid them. There is no direct language support for them either, and you have to create them with New-Object.
As for your code, you can achieve the same with this:
$pcs = "w7cl002","w7cl002","w7cl001","w7cl002","w7cl008",
"w7lp001","w7lp001","w7cl008","w7cl004","w7lp001"
$table = #{}
$pcs | Group-Object -NoElement | ForEach-Object { $table[$_.Name] = $_.Count }
No need for awkward loops and code that looks like an unholy combination of C# and VBScript (honestly, none of the actual working lines in your code look like PowerShell, except for the first three and the last).
If you need arrays of arrays instead of a hashtable, you can do so as well:
$result = $pcs | Group-Object -NoElement | ForEach-Object { ,($_.Name, $_.Count) }
However, you really don't want to work this way. For one, you can often get confused whether you have an array or a scalar as an item. Some operations may automatically unroll the array again, etc. IMHO it's much better to work with objects when you can. In this case, just use the result from Group-Object directly. It has handy Name and Count properties that spell out exactly what they are, instead of [0] and [1] which are a bit more opaque.
If you need nicer property names, just project into new objects:
$pcs | group -n | % {
New-Object PSObject -Property #{
PC = $_.Name
Count = $_.Count
}
}
or
$pcs | group -n | select #{l='PC';e={$_.Name}},Count
Generally, when using PowerShell, think of your data as objects and your problem as operations on those objects. Then try to find how to make those operations happen with things the language already gives you. Manipulation of collection classes, or C-like code with lots of loops almost always looks out of place and wrong in PowerShell – usually for good reason, because the pipeline is often a better, easier, and shorter way of accomplishing the same. I can probably count the number of times I used a for loop in PowerShell over the past few years on both hands, while I wouldn't need both hands to count the number of times I used a foreach loop. Those are rare things to use.
In general, what I end up with in a situation like this is not a 2D array, but an array of tuples, or an array of custom objects. This is what import-csv gives you by default. For my work, this is a good representation of a relation, and many of the transforms I want are some combination of set ops and relational ops.
BTW, I'm only a beginner in Powershell.
Related
Sadly I don't know the exact word for what my problem is, but I try to explain it and I'd be happy if someone could tell me what I should be looking for :)
I have two arrays
Array 1 with Name, Surname, and Array 2 with Name, Surname aswell.
Of course, Surnames and Names are NOT unique, but the combination of them is unique.
So now I'd need to check if the Combination of Array 1 IS existing in Array 2, if not do something...
Now my problem is that i know -contains but I don't know how to use that on multiple hashes (Contains Surname or Name only is not useful, it has to be and)
i tried the following
if ( $oldList -notcontains $newPerson.Name -and $newPerson.Surname) {....}
But it neither worked nor I expected it to work or I did another mistake?!
Could anybody give me some advice? Thanks in advance
PS. It is not Surname/Name in my case, but I for understanding Surname/Name is easier!
Update
The Hashtables (or Arrays?!) look with an Write-Host like this:
#{Name=Peter; Surname=Fox}.... and so on
Update 2 - Solution
Hey Guys, just for every future reader who maybe doesn't find out by himself...
It's an Compare-Object $arr1 $arr2 - and it outputs every diffenence by either => odr =<
:) Therefore question is answered (by myself :P)
Compare-Object $arr1 $arr2
delivers the solution I was looking for. It shows differences by => and <= and therefore you can evaluate if it is containd in arr2 or not.
Converting arrays to JSON string in PowerShell couldn't be more simple:
#(1,2,3) | ConvertTo-Json
Produces:
[
1,
2,
3
]
However if the array is empty the result is an empty string:
#() | ConvertTo-Json
Results an empty string instead of [].
It works without pipelining
PS C:\> ConvertTo-Json #()
[
]
This is a use case for the unary comma operator, https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_operators?view=powershell-7.2
The pipeline is breaking the array apart. In the first example, the array is broken apart into its integer values as it passes through the pipeline (I'm not sure of what the actual mechanics of it being reassembled on the other side are: if it's the pipeline acting logically seeing three integers grouped together or if it's some interpretation being done by the receiving cmdlet itself). Regardless, using the unary operator ',' creates an array of one element. When used as either , #(1, 2, 3) or , #() it creates a container array which is broken apart by the pipeline still, but the sub-arrays are the objects being passed through and preserved as intended to be interpreted by the ConvertTo-Json cmdlet properly. Assuming your array is stored in a variable like $myArray, the following general code will work for all situations:
, $myArray | ConvertTo-Json
I am new to Perl and having some difficulty with arrays in Perl. Can somebody will explain to me as to why I am not able to print the value of an array in the script below.
$sum=();
$min = 999;
$LogEntry = '';
foreach $item (1, 2, 3, 4, 5)
{
$min = $item if $min > $item;
if ($LogEntry eq '') {
push(#sum,"1"); }
print "debugging the IF condition\n";
}
print "Array is: $sum\n";
print "Min = $min\n";
The output I get is:
debugging the IF condition
debugging the IF condition
debugging the IF condition
debugging the IF condition
debugging the IF condition
Array is:
Min = 1
Shouldn't I get Array is: 1 1 1 1 1 (5 times).
Can somebody please help?
Thanks.
You need two things:
use strict;
use warnings;
at which point the bug in your code ($sum instead of #sum) should become obvious...
$sum is not the same variable as #sum.
In this case you would benefit from starting your script with:
use strict;
use warnings;
Strict forces you to declare all variables, and warnings gives warnings..
In the meantime, change the first line to:
#sum = ();
and the second-to-last line to:
print "Array is: " . join (', ', #sum) . "\n";
See join.
As others have noted, you need to understand the way Perl uses sigils ($, #, %) to denote data structures and the access of the data in them.
You are using a scalar sigil ($), which will simply try to access a scalar variable named $sum, that has nothing to do with a completely distinct array variable named #sum - and you obviously want the latter.
What confuses you is likely the fact that, once the array variable #sum exists, you can access individual values in the array using $sum[0] syntax, but here the sigil+braces ($[]) act as a "unified" syntactic constract.
The first thing you need to do (after using strict and warnings) is to read the following documentation on sigils in Perl (aside from good Perl book):
https://stackoverflow.com/a/2732643/119280 - brian d. foy's excellent summary
The rest of the answers to the same question
This SO answer
The best summary I can give you on the syntax of accessing data structures in Perl is (quoting from my older comment)
the sigil represents the amount of data from the data structure that you are retrieving ($ of 1 element, # for a list of elements, % for entire hash)
whereas the brace style represent what your data structure is (square for array, curly for hash).
As a special case, when there are NO braces, the sigil will represent BOTH the amount of data, as well as what the data structure is.
Please note that in your specific case, it's the last bullet point that matters. Since you're referring to the array as a whole, you won't have braces, and therefore the sigil will represent the data structure type - since it's an array, you must use the # sigil.
You push the values into the array #sum, then finish up by printing the scalar $sum. #sum and $sum are two completely independent variables. If you print "#sum\n" instead, you should get the output "11111".
print "Array is: $sum\n";
will print a non-existent scalar variable called $sum, not the array #sum and not the first item of the array.
If you 'use strict' it will flag the user of un-initialized variables like this.
You should definitly add use strict; and use warnings; to your script. That would have complained about the print "Array is: $sum\n"; line (among others).
And you initialize an array with my #sum=(); not with my $sum=();
Like CFL_Jeff mentions, you can't just do a quick print. Instead, do something like:
print "Array is ".join(', ',#array);
Still would like to add some details to this picture. )
See, Perl is well-known as a Very High Level Language. And this is not just because you can replace (1,2,3,4,5) with (1..5) and get the same result.
And not because you may leave your variables without (explicitly) assigning some initial values to them: my #arr is as good as my #arr = (), and my $scal (instead of my $scal = 'some filler value') may actually save you an hour or two one day. Perl is usually (with use warnings, yes) good at spotting undefined values in unusual places - but not so lucky with 'filler values'...
The true point of VHLL is that, in my opinion, you can express a solution in Perl code just like in any human language available (and some may be even less suitable for that case).
Don't believe me? Ok, check your code - or rather your set of tasks, for example.
Need to find the lowest element in a array? Or a sum of all values in array? List::Util module is to your command:
use List::Util qw( min sum );
my #array_of_values = (1..10);
my $min_value = min( #array_of_values );
my $sum_of_values = sum( #array_of_values );
say "In the beginning was... #array_of_values";
say "With lowest point at $min_value";
say "Collected they give $sum_of_values";
Need to construct an array from another array, filtering out unneeded values? grep is here to save the day:
#filtered_array = grep { $filter_condition } #source_array;
See the pattern? Don't try to code your solution into some machine-codish mumbo-jumbo. ) Find a solution in your own language, then just find means to translate THAT solution into Perl code instead. It's easier than you thought. )
Disclaimer: I do understand that reinventing the wheel may be good for learning why wheels are so useful at first place. ) But I do see how often wheels are reimplemented - becoming uglier and slower in process - in production code, just because people got used to this mode of thinking.
If I have an array:
#int_array = (7,101,80,22,42);
How can I check if the integer value 80 is in the array without looping through every element?
You can't without looping. That's part of what it means to be an array. You can use an implicit loop using grep or smartmatch, but there's still a loop. If you want to avoid the loop, use a hash instead (or in addition).
# grep
if ( grep $_ == 80, #int_array ) ...
# smartmatch
use 5.010001;
if ( 80 ~~ #int_array ) ...
Before using smartmatch, note:
http://search.cpan.org/dist/perl-5.18.0/pod/perldelta.pod#The_smartmatch_family_of_features_are_now_experimental:
The smartmatch family of features are now experimental
Smart match, added in v5.10.0 and significantly revised in v5.10.1, has been a regular point of complaint. Although there are a number of ways in which it is useful, it has also proven problematic and confusing for both users and implementors of Perl. There have been a number of proposals on how to best address the problem. It is clear that smartmatch is almost certainly either going to change or go away in the future. Relying on its current behavior is not recommended.
Warnings will now be issued when the parser sees ~~, given, or when. To disable these warnings, you can add this line to the appropriate scope
CPAN solution: use List::MoreUtils
use List::MoreUtils qw{any};
print "found!\n" if any { $_ == 7 } (7,101,80,22,42);
If you need to do MANY MANY lookups in the same array, a more efficient way is to store the array in a hash once and look up in the hash:
#int_array{#int_array} = 1;
foreach my $lookup_value (#lookup_values) {
print "found $lookup_value\n" if exists $int_array{$lookup_value}
}
Why use this solution over the alternatives?
Can't use smart match in Perl before 5.10. According to this SO post by brian d foy]2, smart match is short circuiting, so it's as good as "any" solution for 5.10.
grep solution loops through the entire list even if the first element of 1,000,000 long list matches. any will short-circuit and quit the moment the first match is found, thus it is more efficient. Original poster explicitly said "without looping through every element"
If you need to do LOTs of lookups, the one-time sunk cost of hash creation makes the hash lookup method a LOT more efficient than any other. See this SO post for details
Yet another way to check for a number in an array:
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util 'first';
my #int_array = qw( 7 101 80 22 42 );
my $number_to_check = 80;
if ( first { $_ == $number_to_check } #int_array ) {
print "$number_to_check exists in ", join ', ', #int_array;
}
See List::Util.
if ( grep /^80$/, #int_array ) {
...
}
If you are using Perl 5.10 or later, you can use the smart match operator ~~:
my $found = (80 ~~ $in_array);
I need to see if there are duplicates in an array of strings, what's the most time-efficient way of doing it?
One of the things I love about Perl is it's ability to almost read like English. It just sort of makes sense.
use strict;
use warnings;
my #array = qw/yes no maybe true false false perhaps no/;
my %seen;
foreach my $string (#array) {
next unless $seen{$string}++;
print "'$string' is duplicated.\n";
}
Output
'false' is duplicated.
'no' is duplicated.
Turning the array into a hash is the fastest way [O(n)], though its memory inefficient. Using a for loop is a bit faster than grep, but I'm not sure why.
#!/usr/bin/perl
use strict;
use warnings;
my %count;
my %dups;
for(#array) {
$dups{$_}++ if $count{$_}++;
}
A memory efficient way is to sort the array in place and iterate through it looking for equal and adjacent entries.
# not exactly sort in place, but Perl does a decent job optimizing it
#array = sort #array;
my $last;
my %dups;
for my $entry (#array) {
$dups{$entry}++ if defined $last and $entry eq $last;
$last = $entry;
}
This is nlogn speed, because of the sort, but only needs to store the duplicates rather than a second copy of the data in %count. Worst case memory usage is still O(n) (when everything is duplicated) but if your array is large and there's not a lot of duplicates you'll win.
Theory aside, benchmarking shows the latter starts to lose on large arrays (like over a million) with a high percentage of duplicates.
If you need the uniquified array anyway, it is fastest to use the heavily-optimized library List::MoreUtils, and then compare the result to the original:
use strict;
use warnings;
use List::MoreUtils 'uniq';
my #array = qw(1 1 2 3 fibonacci!);
my #array_uniq = uniq #array;
print ((scalar(#array) == scalar(#array_uniq)) ? "no dupes" : "dupes") . " found!\n";
Or if the list is large and you want to bail as soon as a duplicate entry is found, use a hash:
my %uniq_elements;
foreach my $element (#array)
{
die "dupe found!" if $uniq_elements{$element}++;
}
Create a hash or a set or use a collections.Counter().
As you encounter each string/input check to see if there's an instance of that in the hash. If so, it's a duplicate (do whatever you want about those). Otherwise add a value (such as, oh, say, the numeral one) to the hash, using the string as the key.
Example (using Python collections.Counter):
#!python
import collections
counts = collections.Counter(mylist)
uniq = [i for i,c in counts.iteritems() if c==1]
dupes = [i for i, c in counts.iteritems() if c>1]
These Counters are built around dictionaries (Pythons name for hashed mapping collections).
This is time efficient because hash keys are indexed. In most cases the lookup and insertion time for keys is done in near constant time. (In fact Perl "hashes" are so-called because they are implemented using an algorithmic trick called "hashing" --- a sort of checksum chosen for its extremely low probability of collision when fed arbitrary inputs).
If you initialize values to integers, starting with 1, then you can increment each value as you find its key already in the hash. This is just about the most efficient general purpose means of counting strings.
Not a direct answer, but this will return an array without duplicates:
#!/usr/bin/perl
use strict;
use warnings;
my #arr = ('a','a','a','b','b','c');
my %count;
my #arr_no_dups = grep { !$count{$_}++ } #arr;
print #arr_no_dups, "\n";
Please don't ask about the most time efficient way to do something unless you have some specific requirements, such as "I have to dedupe a list of 100,000 integers in under a second." Otherwise, you're worrying about how long something takes for no reason.
similar to #Schwern's second solution, but checks for duplicates a little earlier from within the comparison function of sort:
use strict;
use warnings;
#_ = sort { print "dup = $a$/" if $a eq $b; $a cmp $b } #ARGV;
it won't be as fast as the hashing solutions, but it requires less memory and is pretty darn cute