How do I stringify an array without spaces between its elements?

$readFile = get-content $readInput
#create an empty array to be filled with bank account numbers
$fNameArray = @()
for($i = 0; $i -lt $readFile.length; $i++){
#assigns a random letter from the list to $letter.
#$letter = get-random -inputobject ("A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z") -count $readFile.length
$letter = $readFile[$i] | foreach-object{get-random -inputobject ("A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z") -count $readFile[$i].length}
$fnameArray += "$letter"
}
$fnameArray
The code reads in a file containing a list of names and randomizes the letters for data masking. The only problem I am running into is that the output looks like this:
L R Y E B
R O M I
U Q N G R
H K Y
M G A W Q
J G W Y D K T
X E Q
J Y P I G
It looks like it is output with spaces between the letters. How do I eliminate them?

The unary form of the -join operator joins (concatenates) all array elements without a separator.
> -join ('a', 'b', 'c')
abc
Therefore, simply use:
$fnameArray += -join $letter
By contrast, "$letter" stringifies the array using $OFS (the output field separator) as the separator, which defaults to a space, which explains your output.
Therefore, you could alternatively set $OFS to '' (the empty string) and use "$letter".
However, the -join approach is simpler and doesn't require you to create a local scope for / restore the previous $OFS value.
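For readers comparing languages, the same separator issue can be sketched in Python. This is only an illustration, not part of the PowerShell answer: `mask_name` and the `names` list are hypothetical, and `random.choices` samples with replacement, whereas `Get-Random -Count` does not repeat letters.

```python
import random
import string


def mask_name(name, rng=random):
    """Return random uppercase letters, one per character of `name`."""
    letters = rng.choices(string.ascii_uppercase, k=len(name))
    # Joining on the empty string plays the role of PowerShell's unary -join.
    # " ".join(letters) would mimic "$letter" with the default $OFS (a space).
    return "".join(letters)


# Hypothetical input; it stands in for the lines read from the file.
names = ["Alice", "Bob", "Carol"]
masked = [mask_name(n) for n in names]
print(masked)
```

Each masked value has the same length as its source name and contains no separators.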

Related

Perl remove same value back to back with splice

I am trying to remove a value that appears twice, back to back, in an array. This is my code:
@{$tmp_h->{'a'}} = qw/A B B C/;
print Dumper ($tmp_h);
my $j = 0;
foreach my $cur (@{$tmp_h->{'a'}}) {
if ($cur eq 'B') {
splice(@{$tmp_h->{'a'}}, $j, 1);
}
$j++;
}
print Dumper $tmp_h;
However, what I got is:
$VAR1 = {
'a' => [
'A',
'B',
'B',
'C'
]
};
$VAR1 = {
'a' => [
'A',
'B',
'C'
]
};
I am expecting both 'B' to be removed in this case. What could have gone wrong?
That code is removing from an array while iterating over it, pulling the carpet from underneath itself; is that necessary?
Instead, iterate and put elements on another array if the adjacent ones aren't equal. So iterate over the index, looking up an element and the next (or previous) one.†
I presume that B is just an example while in fact it can be any value, equal to its adjacent one.
It's interesting that regex can help too, with its simple way to find repeated patterns using backreferences:
my @ary = qw(a b b c d d e f f f g);
my $str_ary = join '', @ary;
$str_ary =~ s/(.)\g{-1}//g;
my @new_ary = split //, $str_ary;
say "@new_ary"; #--> a c e f g
This removes pairs of adjacent values, so if there is an odd number of equal adjacent values it leaves the odd one (f above). As a curiosity note that it can be written in one statement
my @new_ary = split //, join('', @ary) =~ s/(.)\g{-1}//gr;
The join-ed array, forming a string, is bound to the substitution operator where /r modifier is crucial, for allowing this and returning the changed string which is then split back into a list.
To change an array in place have it assign to itself.‡
But single-letter elements are likely only an example. With multiple characters in elements, we can't join them by the empty string because we wouldn't know how to split that back into an array; we have to join by something that can't be in any element, clearly a tricky proposition. A reasonable choice is a line feed, as one can expect to know whether or not elements are multiline strings:
my @ary = qw(aa no no way bah bah bah go);
my $str_ary = join "\n", @ary ;
$str_ary =~ s/([^\n]+)\n\g{-1}//g;
my @new = grep { $_ } split /\n/, $str_ary;
say "@new"; #--> aa way bah go
This would still have edge cases with interesting elements, like spaces and empty strings (but then any approach would).
† For example
use warnings;
use strict;
use feature 'say';
my @ary = qw(a b b c d d e f f f g);
my @new_ary;
my $i = 0;
while (++$i <= $#ary) {
if ($ary[$i] ne $ary[$i-1]) {
push @new_ary, $ary[$i-1]
}
else { ++$i }
}
push @new_ary, $ary[-1] if $ary[-1] ne $ary[-2];
say "@new_ary"; #--> a c e f g
‡ Done for the arrayref in the question
@{ $hr->{a} } = qw/A B B C/;
@{$hr->{a}} = split //, join('', @{$hr->{a}}) =~ s/(.)\g{-1}//gr;
say "@{$hr->{a}}"; #--> A C
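The pair-removal behavior described in this answer can also be sketched without joining into a string, which sidesteps the separator problem for multi-character or empty elements. The sketch below is in Python and uses a different technique from the answer's regex: it groups runs of equal neighbors and keeps one element from each odd-length run. The name `drop_adjacent_pairs` is hypothetical.

```python
from itertools import groupby


def drop_adjacent_pairs(items):
    """Remove adjacent equal pairs; a run of odd length leaves one element."""
    kept = []
    for value, run in groupby(items):
        # An even-length run consists only of pairs and disappears entirely.
        if len(list(run)) % 2 == 1:
            kept.append(value)
    return kept


print(drop_adjacent_pairs("a b b c d d e f f f g".split()))
# ['a', 'c', 'e', 'f', 'g']
print(drop_adjacent_pairs(["aa", "no", "no", "way", "bah", "bah", "bah", "go"]))
# ['aa', 'way', 'bah', 'go']
```

Like the single-pass regex, this does not re-examine elements that become adjacent only after a pair has been removed.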
The Perl documentation tells you in perlsyn under Foreach Loops:
If any part of LIST is an array, foreach will get very confused if you
add or remove elements within the loop body, for example with splice. So
don't do that.
You can iterate over the indices instead, but don't forget to not increment the index when removing a value:
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my $tmp_h = {a => [qw[ A B B C ]]};
print Dumper($tmp_h);
my $j = 0;
while ($j <= $#{ $tmp_h->{a} }) {
my $cur = $tmp_h->{a}[$j];
if ($cur eq 'B') {
splice @{ $tmp_h->{a} }, $j, 1;
} else {
++$j;
}
}
print Dumper($tmp_h);
Or start from the right so you don't have to worry:
my $j = $#{ $tmp_h->{a} };
while ($j >= 0) {
my $cur = $tmp_h->{a}[$j];
splice @{ $tmp_h->{a} }, $j, 1 if $cur eq 'B';
--$j;
}
But the most straightforward way is to use grep:
@{ $tmp_h->{a} } = grep $_ ne 'B', @{ $tmp_h->{a} };
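The skipped-element effect is not specific to Perl. As a rough cross-language illustration (Python, with hypothetical function names), deleting during a forward loop shifts the next element into the slot that was just visited, while filtering into a new list avoids the problem in the same way grep does:

```python
def remove_while_iterating(items, unwanted):
    """Buggy on purpose: deleting during a forward loop skips the next element."""
    items = list(items)
    for index, value in enumerate(items):
        if value == unwanted:
            del items[index]
    return items


def remove_by_filter(items, unwanted):
    """Build a new list of the elements to keep, much as grep does in Perl."""
    return [value for value in items if value != unwanted]


print(remove_while_iterating(["A", "B", "B", "C"], "B"))  # ['A', 'B', 'C']
print(remove_by_filter(["A", "B", "B", "C"], "B"))        # ['A', 'C']
```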

Two-dimensional Array of Same Character

I am experimenting on two-dimensional arrays. I want some sort of a matrix, all with the same character. I can define a blank multi-dimensional array with fixed elements, and supply it with a character using loops. However, I can also do @(something)*n to directly define an array already supplied with something.
From what I understood so far, this is how to do it:
> $arr = ,(,'E'*3)*3
These seem all right:
> $arr[1]
E
E
E
> $arr[1][2]
E
But when I try to replace a character somewhere, like $arr[1][2] = 'D', many characters are replaced:
> $arr
E
E
D
E
E
D
E
E
D
Is my array definition wrong? Added: Then, how to correctly define it 'quickly'?
Using the * operator on non-numeric values creates copies of the original value. However, if the item you're copying isn't of a primitive(-ish) type like String or Char, the result will not be a duplicate of that object, but a copy of the object reference. Since all instances will then be pointing to the same object, changing one will change all.
To create distinct instances you need to repeat the array instantiation in a loop, as PetSerAl showed in the comments:
$arr = 1..3 | ForEach-Object { ,(,'E' * 3) }
In this particular case you could also create a "template" array and clone it:
$a0 = ,'E' * 3
$arr = 1..3 | ForEach-Object { ,$a0.Clone() }
Note, however, that cloning an object will not clone nested object references, so the latter is not a viable approach in all scenarios.
Something like this won't work the way you intend (because the references of the nested hashtable objects are still pointing to the same actual hashtables after cloning the array object):
PS C:\> $a0 = ([PSCustomObject]@{'x'='E'}),([PSCustomObject]@{'x'='E'})
PS C:\> $arr = 1..2 | ForEach-Object { ,$a0.Clone() }
PS C:\> $arr
x
-
E
E
E
E
PS C:\> $arr[1][1].x = 'F'
PS C:\> $arr
x
-
E
F
E
F
But something like this will work:
PS C:\> $arr = 1..2 | ForEach-Object { ,(([PSCustomObject]@{'x'='E'}),([PSCustomObject]@{'x'='E'})) }
PS C:\> $arr
x
-
E
E
E
E
PS C:\> $arr[1][1].x = 'F'
PS C:\> $arr
x
-
E
E
E
F
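The same reference-sharing behavior exists in other languages, which may help build intuition. The following Python sketch is an analogy, not PowerShell: list repetition stands in for the `*` operator, `list(...)` for `.Clone()`, and all variable names are illustrative.

```python
import copy

# Repeating a list of lists copies references, not the inner lists.
shared = [["E"] * 3] * 3
shared[1][2] = "D"
print(shared)    # every row shows the change

# Building each row in its own iteration gives independent rows.
distinct = [["E"] * 3 for _ in range(3)]
distinct[1][2] = "D"
print(distinct)  # only row 1 changes

# A shallow copy duplicates the outer list but still shares nested objects;
# a deep copy duplicates the nested objects as well.
template = [{"x": "E"}, {"x": "E"}]
shallow = [list(template) for _ in range(2)]
deep = [copy.deepcopy(template) for _ in range(2)]
shallow[1][1]["x"] = "F"  # also visible through shallow[0][1] and template[1]
deep[1][1]["x"] = "G"     # visible only through deep[1][1]
print(shallow)
print(deep)
```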

Unique Combos from powershell array - No duplicate combos

I'm trying to figure out the best way to get unique combinations from a powershell array. For instance, my array might be
@(B,C,D,E)
I would be hoping for an output like this :
B
C
D
E
B,C
B,D
B,E
C,D
C,E
D,E
B,C,D
C,D,E
B,C,D,E
I do not want re-arranged combos. If combo C,D exists already then I do not want combo D,C. It's redundant for my purposes.
I looked into the functions here : Get all combinations of an array
But they aren't what I want. I've been working on figuring this out myself, but have spent quite a bit of time without success. I thought I'd ask the question here so that, if someone else already knows, I'm not wasting my time.
Thanks!
This is an adaptation from a solution for a C# class I took that asked this same question. For any set find all subsets, including the empty set.
function Get-Subsets ($a){
#uncomment following to ensure only unique inputs are parsed
#e.g. 'B','C','D','E','E' would become 'B','C','D','E'
#$a = $a | Select-Object -Unique
#create an array to store output
$l = @()
#for any set of length n the maximum number of subsets is 2^n
for ($i = 0; $i -lt [Math]::Pow(2,$a.Length); $i++)
{
#temporary array to hold output
[string[]]$out = New-Object string[] $a.length
#iterate through each element
for ($j = 0; $j -lt $a.Length; $j++)
{
#start at the end of the array take elements, work your way towards the front
if (($i -band (1 -shl ($a.Length - $j - 1))) -ne 0)
{
#store the subset in a temp array
$out[$j] = $a[$j]
}
}
#stick subset into an array
$l += -join $out
}
#group the subsets by length, iterate through them and sort
$l | Group-Object -Property Length | %{$_.Group | sort}
}
Use like so:
PS C:\> Get-Subsets @('b','c','d','e')
b
c
d
e
bc
bd
be
cd
ce
de
bcd
bce
bde
cde
bcde
Note that computational costs go up exponentially with the length of the input array.
Elements SecondstoComplete
15 46.3488228
14 13.4836299
13 3.6316713
12 1.2542701
11 0.4472637
10 0.1942997
9 0.0867832
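The bitmask idea behind Get-Subsets can be restated compactly. The Python sketch below is an illustrative translation under stated assumptions, not the answer's code: the function name `subsets` is hypothetical, it skips mask 0 (the empty set), and it returns lists rather than joined strings.

```python
def subsets(items):
    """Return every non-empty subset of `items`, ordered by size, then by content."""
    n = len(items)
    result = []
    # Each mask from 1 to 2**n - 1 selects one subset; bit (n - j - 1) controls items[j].
    for mask in range(1, 1 << n):
        subset = [items[j] for j in range(n) if mask & (1 << (n - j - 1))]
        result.append(subset)
    result.sort(key=lambda s: (len(s), s))
    return result


for s in subsets(["B", "C", "D", "E"]):
    print(",".join(s))
```

Because every element is either in or out of a subset, the loop runs 2^n - 1 times, which is the exponential growth reflected in the timings above.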
My tired attempt at this. I did manage to get it to produce the expected results, but how it does it is not as elegant. It uses recursion.
Function Get-Permutations{
Param(
$theInput
)
$theInput | ForEach-Object{
$element = $_
$sansElement = ($theInput | Where-Object{$_ -ne $element})
If($sansElement.Count -gt 1){
# Build a collection of permutations using the remaining elements that were not isolated in this pass.
# Use the single element since it is a valid permutation
$perms = ,$element
For($elementIndex = 0;$elementIndex -le ($sansElement.Count - 1);$elementIndex++){
$perms += ,@(,$element + $sansElement[0..$elementIndex] | sort-object)
}
# For loop does not send to output properly so that is the purpose of collecting the results of this pass in $perms
$perms
# If there are more than 2 elements in $sansElement then we need to be sure they are accounted for
If($sansElement.Count -gt 2){Get-Permutations $sansElement}
}
}
}
Get-Permutations B,C,D,E | %{$_ -join ","} | Sort-Object -Unique
I hope I can explain myself clearly....So each pass of the function will take an array. Each individual element of that array will be isolated from the rest of the array which is represented by the variables $element and $sansElement.
Using those variables, we build individual and progressively larger arrays composed of those elements. Here is an example using the array 1,2,3,4:
1
1,2
1,2,3
1,2,3,4
The above is done for each "number"
2
2,1
2,1,3
2,1,3,4
and so forth. If the returned array contains more than two elements (1,2 would be the same as 2,1 in your example, so we don't care about pairs beyond one match), we take that array and run it through the same function.
The real issue is that the logic here (I know this might be hard to swallow) creates several duplicates. I suppose you could create a hashtable instead, which I will explore, but that does not remove the logic flaw.
Regardless of me beating myself up, as long as you don't have thousands of elements the process will still produce results.
Get-Permutations returns an array of arrays. PowerShell would display that one element per line. You asked for comma-delimited output, which is where -join comes in. Sort-Object -Unique takes those sorted strings and discards the duplicates.
Sample Output
B
B,C
B,C,D
B,C,D,E
B,C,E #< Missing from your example output.
B,D
B,D,E #< Missing from your example output.
B,E
C
C,D
C,D,E
C,E
D
E

Extract information from lines and columns in PERL

I have a huge file with multiple lines and columns. Each line has many columns and many lines have the same name in the same position. E.g.
A C Z Y X
A C E J
B E K L M
What is the best way to find all lines that share the same items in a certain position? For instance, I would like to know that there are 2 A, 2 C, 1 D, etc., all ordered by column.
I am really new to Perl, and so I am struggling a lot to advance in this so any tips are appreciated.
I got to this point:
#!/usr/local/bin/perl -w
use strict;
my $path='My:\Path\To\My\File.txt';
my $columns;
my $line;
open (FILE,$path), print "Opened!\n" or die ("Error opening");
while (<FILE>)
{
@line=split('\t',$_);
}
close FILE;
The output of this can be another TSV, that examines the file only until the 5th column, ordered from top to bottom, like:
A 2
C 2
Z 1
Y 1
E 1
J 1
B 1
E 1
K 1
L 1
Note that the first items appear first and, when shared among lines, do not show again for subsequent lines.
Edit: as per the questions in the comments, I changed the dataset and output. Note that two E appear: one belonging to the third column, the other belonging to the second column.
Edit2: Alternatively, this could also be analyzed column by column, thus showing the results in the first column, then in the second, and so on, as long as they were clearly separated. Something like
"1st" "col"
A 2
B 1
"2nd" "col"
C 2
E 1
"3rd" "col"
Z 1
E 1
K 1
"4th" "col"
Y 1
J 1
L 1
I did not fully understand the formatting of your desired output, so the script below outputs all the data from the first column on the first row, and so on. This can easily be modified to the format you desire, but it is a quick starting point for how to accumulate the data first and then process it.
use strict;
use warnings;
use autodie;
my $path='My:\Path\To\My\File.txt';
open my $fh, '<', $path;
my @data;
# while (<$fh>) { Switch these lines when ready for real data
while (<DATA>) {
my @row = split ' ';
for my $col (0..$#row) {
$data[$col]{$row[$col]}++;
}
}
for my $coldata (@data) {
for my $letter (sort keys %$coldata) {
print "$letter $coldata->{$letter} ";
}
print "\n";
}
close $fh;
__DATA__
A C Z Y X
A C D J
B E K L M
Outputs
A 2 B 1
C 2 E 1
D 1 K 1 Z 1
J 1 L 1 Y 1
M 1 X 1
Perhaps the following will be helpful:
use strict;
use warnings;
my $path = 'My:\Path\To\My\File.txt';
my %hash;
open my $fh, '<', $path or die $!;
while (<$fh>) {
my @cols = split ' ', $_, 5;
$hash{$_}{ $cols[$_] || '' }++ for 0 .. 3;
}
close $fh;
for my $key ( sort { $a <=> $b } keys %hash ) {
print "Col ", $key + 1, "\n";
print "$_ $hash{$key}{$_}\n"
for sort { $hash{$key}->{$b} <=> $hash{$key}->{$a} } grep $_,
keys %{ $hash{$key} };
}
Output on your dataset:
Col 1
A 2
B 1
Col 2
C 2
E 1
Col 3
Z 1
K 1
E 1
Col 4
J 1
L 1
Y 1
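Both answers rest on the same data structure: one counter per column position. As a hedged cross-language sketch (Python; `count_by_column`, `max_columns`, and the inline `sample` are hypothetical stand-ins for the real file), it might look like this:

```python
from collections import Counter


def count_by_column(lines, max_columns=4):
    """Count how often each item appears in each of the first `max_columns` columns."""
    counts = [Counter() for _ in range(max_columns)]
    for line in lines:
        # split() with no argument splits on any run of whitespace, tabs included.
        for col, item in enumerate(line.split()[:max_columns]):
            counts[col][item] += 1
    return counts


# Sample rows taken from the question; a real run would iterate over the file.
sample = ["A C Z Y X", "A C E J", "B E K L M"]
for number, counter in enumerate(count_by_column(sample), start=1):
    print(f"Col {number}")
    for item, count in counter.most_common():
        print(f"{item} {count}")
```

Keeping one counter per column is what lets the two E values (second and third column) be counted separately.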

searching two array string for equal words

I am a beginner in Perl. I have two string arrays, array1 and array2. I want to check each element of the 2nd array against the 1st. If there is a match, I want to assign a relative value of one to that element of the 2nd array. The relative values are stored in an array. I tried it out, but it doesn't work, and it gives a warning like "Use of uninitialized value in string eq at pjt.pl line 52, line 3".
while($i <= (scalar @resultarray-1))
{
while ($j <= (scalar @inputsymbl-1))
{
if ($resultarray[$i] eq $inputsymbl[$j])
{
$rel[$j]=1;
$i=$i+1;
$j=0;
}
else
{
$j=$j+1;
}
}
if($j==(scalar @inputsymbl))
{
$i=$i+1;
$j=0;
}
}
Try this:
my $i = 0;
my $j = 0;
## walk each array element
foreach(@resultarray) {
my $result = $_;
foreach(@inputsymbl) {
my $symbl = $_;
if ($result eq $symbl) {
$rel[$j] = 1;
$i++;
} else {
$j++;
}
}
if ($j == (scalar @inputsymbl - 1)) {
$i++;
$j = 0;
}
}
Provide more information if you need detailed help.
From your question and code, it appears that you want to flag the indexes, by using a third array, of the two array's elements that are equal. By doing this, however, you're creating a sparse array. Also, if the two arrays don't have the same number of elements, a "Use of uninitialized value in string eq..." warning will eventually occur. Given these issues, consider using the smaller index of the two arrays (done using the ternary operator below) and pushing the indexes of the equal elements onto the third array:
use strict;
use warnings;
use Data::Dumper;
my @results;
my @arr1 = qw/A B C D E F G H I J/;
my @arr2 = qw/A D C H E K L H N J P Q R S T/;
# Equal: ^ ^ ^ ^ ^
# Index: 0 2 4 7 9
for my $i ( 0 .. ( $#arr1 <= $#arr2 ? $#arr1 : $#arr2 ) ) {
push @results, $i if $arr1[$i] eq $arr2[$i];
}
print Dumper \@results;
Output:
$VAR1 = [
0,
2,
4,
7,
9
];
Hope this helps!
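For comparison, here is a minimal Python sketch of two readings of the question. `matching_indices` mirrors the position-by-position comparison in the answer above, stopping at the shorter array. `membership_flags` is an assumption about what the asker may have meant instead (flag each element of the second array that occurs anywhere in the first). Both function names are hypothetical.

```python
def matching_indices(first, second):
    """Indexes at which both sequences hold equal elements (stops at the shorter one)."""
    return [i for i, (a, b) in enumerate(zip(first, second)) if a == b]


def membership_flags(candidates, known):
    """1 for each candidate found anywhere in `known`, else 0 (a dense result array)."""
    known_set = set(known)
    return [1 if item in known_set else 0 for item in candidates]


arr1 = "A B C D E F G H I J".split()
arr2 = "A D C H E K L H N J P Q R S T".split()
print(matching_indices(arr1, arr2))  # [0, 2, 4, 7, 9]
print(membership_flags(arr2, arr1))
```

Because `zip` stops at the shorter input, no index ever runs past the end of either array, which is what the uninitialized-value warning was signalling.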
