PowerShell Sum Array That Already Contains MBs - arrays

I have an array that contains MBs already in the values. This is how MS DPM returns data written to a tape. I would like to sum them together. Is there an easy one liner to accommodate for this?

MB is a recognized numeric suffix in PowerShell's native grammar, so you can parse and evaluate your size strings with Invoke-Expression:
PS ~> Invoke-Expression '2401279.56MB'
2517924115906.56
You'll want to do some basic input validation to make sure it's actually a numeric sequence, and remove the thousand separator:
$Tapes.DataWrittenDisplayString |ForEach-Object {
    # remove commas and whitespace
    $dataWritten = $_ -replace '[,\s]'
    # ensure it's actually a number in the expected format
    if($dataWritten -match '^\d+(?:\.\d+)?[kmgtp]b$'){
        # let PowerShell do the rest
        $dataWritten |Invoke-Expression
    }
}
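To get the single total the question asks for, the validated values can be accumulated as they are evaluated. The following is a minimal sketch rather than a definitive implementation; the `$sizes` sample strings are hypothetical stand-ins for `$Tapes.DataWrittenDisplayString`, and `$totalBytes` is a name introduced here for illustration.

```powershell
# Hypothetical sample values standing in for $Tapes.DataWrittenDisplayString
$sizes = '1,024.50 MB', '2 GB', 'n/a'

$totalBytes = 0
foreach ($size in $sizes) {
    # remove commas and whitespace
    $clean = $size -replace '[,\s]'
    # only evaluate strings that look like <number><unit>; 'n/a' is skipped
    if ($clean -match '^\d+(?:\.\d+)?[kmgtp]b$') {
        $totalBytes += Invoke-Expression $clean
    }
}

$totalBytes          # total in bytes
$totalBytes / 1GB    # the same total expressed in GB
```

Because the regex check runs before Invoke-Expression, strings that are not in the expected format are never evaluated.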

Related

Why does an array behave differently when directly assigned or retrieved from get-content

Here's something I don't understand.
When I define a variable:
$v = [byte]2, [byte]3
and check its type:
$v.getType().name
I get
Object[]
I then format $v:
'{0} {1}' -f $v
which prints
2 3
Now, if I get a file's first two bytes:
$f = (get-content 'xyz.txt' -encoding byte -readCount 2 -totalCount 2)
and check its type:
$f.getType().name
I get the same type as before: Object[].
However, unlike with $v, I cannot format $f:
'{0} {1}' -f $f
I get the error message "Error formatting a string: Index (zero based) must be greater than or equal to zero and less than the size of the argument list.", although the length of the array is 2:
$f.length
returns
2
I don't understand why this is and would appreciate an explanation.
The behavior should be considered a bug in the -f operator; it is present as of v7.1 and reported in GitHub issue #14355; it does not affect other operators with array operands, such as -split or -in.
The workaround is to cast $f to [array] or, if creating a copy of the array is acceptable, to use @($f):
'abc' > xyz.txt
$f = get-content 'xyz.txt' -encoding byte -readCount 2 -totalCount 2
'{0} {1}' -f ([array] $f)
Note: Using @(), the array-subexpression operator, as in @($f), is the even simpler option, as Mathias R. Jessen notes, but do note that @() clones (creates a shallow copy of) the array, whereas the [array] cast in this case does not.
The alternative is to apply the [array] cast as a type constraint (by placing it to the left of the $f = ... assignment):
'abc' > xyz.txt
[array] $f = (get-content 'xyz.txt' -encoding byte -readCount 2 -totalCount 2)
'{0} {1}' -f $f
Note:
In PowerShell [Core] v6+, you must use -AsByteStream in lieu of -Encoding Byte.
The problem can also be avoided if -ReadCount 2 is omitted, but note that that decreases the performance of the command, because the bytes are then emitted one by one; that is, with -ReadCount 2 -TotalCount 2 a single object is emitted that is a 2-byte array as a whole, whereas just -TotalCount 2 emits the individual bytes, one by one to the pipeline, in which case it is then the PowerShell engine itself that collects these bytes in an [object[]] array for the assignment.
Note that applying @() directly to the command - @(get-content ...) - would not work in this case, because @(), due to parameter combination -ReadCount 2 -TotalCount 2, receives a single output object that happens to be an array as a whole and therefore wraps that single object in another array. This results in a single-element array whose element is the original 2-element array of bytes; for more information about how @(...) works, see this answer.
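For PowerShell (Core) 6+, the type-constraint workaround might look as follows. This is a sketch under stated assumptions: the temp-file path and the `$formatted` variable are hypothetical, and the file is written with explicit bytes so that the result does not depend on the default text encoding.

```powershell
# Requires PowerShell (Core) 6+ because of -AsByteStream.
# The temp-file path is a hypothetical stand-in for xyz.txt.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'xyz-demo.bin'
[System.IO.File]::WriteAllBytes($path, [byte[]] (97, 98, 99))   # the bytes of 'abc'

# The [array] type constraint removes the invisible [psobject] wrapper
# from the single 2-byte array that Get-Content emits here.
[array] $f = Get-Content -LiteralPath $path -AsByteStream -ReadCount 2 -TotalCount 2
$formatted = '{0} {1}' -f $f
$formatted   # the first two byte values, separated by a space

Remove-Item -LiteralPath $path
```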
Background information:
The problem is an invisible [psobject] wrapper around each array returned by Get-Content -ReadCount (just one in this case), which unexpectedly causes the $f array passed to -f not to be recognized as such.
Note that PowerShell's other array-based operators, such as -in and -replace, are not affected.
The wrapper can be bypassed in two ways:
$f.psobject.BaseObject
casting to [array], as shown at the top.
Note:
Output objects produced by cmdlets - as opposed to output produced by PowerShell code - generally have invisible [psobject] wrappers; mostly, they are benign, because PowerShell usually just cares about the .NET object being wrapped, not about the wrapper, but on occasion problems arise, such as in this case - see GitHub issue #5579 for a discussion of the problem and other contexts in which it manifests.
In order to test if a given object has a [psobject] wrapper, use -is [psobject]; e.g.:
$var = 1
$var -is [psobject] # -> $false
$var = Write-Output 1
$var -is [psobject] # -> $true, due to use of a cmdlet.
# You can also test command output directly.
(Write-Output 1) -is [psobject] # -> $true

Why does using a comma concatenate strings in PowerShell?

I have encountered a PowerShell behavior I don't understand. A comma between strings concatenates them and inserts a space in between, e.g.:
PS H:\> [string]$result = "a","b"
PS H:\> $result # a string with 3 characters
a b
If the result is interpreted as an array, then using a comma separates the elements of the array, e.g.:
PS H:\> [array]$result = "a","b"
PS H:\> $result # an array with 2 elements
a
b
I have tried searching for an explanation of this behavior, but I don't really understand what to search for. In the documentation about the comma operator, I see that it is used to initialize arrays, but I have not found an explanation for the "string concatenation" (which, I suspect, may not be the correct term to use here).
Indeed: , is the array constructor operator.
Since your variable is a scalar - a single [string] instance - PowerShell implicitly stringifies your array (converts it to a string).
PowerShell stringifies arrays by joining the (stringified) elements of the array with a space character as the separator by default.
You may set a different separator via the $OFS preference variable, but that is rarely done in practice.[1]
You can observe this behavior in the context of string interpolation:
PS> "$( "a","b" )"
a b
[1] As zett42 points out, an easier way to specify a different separator is to use the -join operator; e.g.,
"a", "b" -join '#' yields 'a#b'.
By contrast, setting $OFS without resetting it afterwards can cause later commands that expect it to be at its default to malfunction; with -join you get explicit control.
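The three behaviors described above can be compared side by side. This is only an illustrative sketch; the variable names are hypothetical, and the child scope created with `& { ... }` is one possible way to keep an `$OFS` change from leaking, not the only one.

```powershell
$items = 'a', 'b'

# Implicit stringification joins the elements with a space by default.
[string] $default = $items

# -join makes the separator explicit and leaves $OFS untouched.
$joined = $items -join '#'

# If $OFS is used at all, setting it inside a child scope (& { ... })
# keeps the change from affecting later commands.
$viaOfs = & { $OFS = '#'; "$items" }

$default   # a b  (assuming $OFS is at its default)
$joined    # a#b
$viaOfs    # a#b
```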

Bash - fastest way to do whole string matching over array elements?

I have a bash array (called tenantlist_array below) populated with elements in the following format:
{3 characters}-{3-5 characters}{3-5 digits}-{2 chars}{1-2 digits}.
Example:
abc-hac101-bb0
xyz-b2blo97250-aa99
abc-b2b9912-xy00
fff-hac101-g3
Array elements are unique. Please notice the hyphen, it is part of every array element.
I need to check if the supplied string (used in the below example as a variable tenant) produces a full match with any array element - because array elements are unique, the first match is sufficient.
I am iterating over array elements using the simple code:
tenant="$1"
for k in "${tenantlist_array[@]}"; do
    result=$(grep -x -- "$tenant" <<<"$k")
    if [[ $result ]]; then
        break
    fi
done
Please note - I need a full string match - if, for example, the string I am searching for is hac101, it must not match any array element, even though it is a substring of an array element.
In other words, only the full string abc-hac101-bb0 must produce the match with the first element. Strings abc, abc-hac, b2b, 99, - must not produce the match. That's why -x parameter is with the grep call.
Now, the above code works, but I find it quite slow. I've run it with the array having 193 elements, and on an ordinary notebook it takes just over a minute to iterate over the array elements:
real 1m2.541s
user 0m0.500s
sys 0m24.063s
And with 385 elements in the array, the timings are as follows:
real 2m8.618s
user 0m0.906s
sys 0m48.094s
So my question - is there a faster way to do it?
Without running any loop you can do this using glob:
tenant="$1"
[[ $(printf '\3%s\3' "${tenantlist_array[@]}") == *$'\3'"$tenant"$'\3'* ]] &&
echo "ok" || echo "no"
In printf we place a control character \3 around each element and while comparing we make sure to place \3 before & after search key.
Thanks to @arco444, the solution is astonishingly simple:
tenant="$1"
for k in "${tenantlist_array[@]}"; do
    if [[ $k = "$tenant" ]]; then
        result="$k"
        break
    fi
done
And the speed difference for the 385-member array:
real 0m0.007s
user 0m0.000s
sys 0m0.000s
That is more than ten thousand times faster.
This shows how wasteful it is to call grep once per element, since every call starts a new process; avoid it where possible.
Here is an alternative that still uses grep, but calls it only once for the whole array, which is what grep is good at.
The code that formats the array could be removed entirely by appending a newline to each string when the array is first created.
This approach should also degrade much more slowly as the compared strings and the array grow.
tenant="$1"
formatted_array=""
for k in "${tenantlist_array[@]}"; do
    # one element per line, with no leading space, so that -x can match whole lines
    formatted_array+="$k"$'\n'
done
# -F: treat the tenant as a literal string; -x: match whole lines only
result=$(printf '%s' "$formatted_array" | grep -Fx -- "$tenant")

Powershell: enigmatic behavior of -like operator

We have an application that keeps a server database - a list of server names and other related information. Sometimes we need to export the information in the XML format to process it by a Powershell script. Server names in the XML file can be in simple ("ServerXX") or FQDN ("ServerXX.abc.com") formats. The script searches for a server name that is always in the simple format, and the search results should contain all simple and full server names that match the searched name.
The main search operator (slightly simplified) looks like this:
$FoundServer = ($ServerList | Where {$_.Name -like $ServerName+"*"})
$ServerList here is the array of strings (server names). Looks simple and works as expected. Usually.
The strange thing is, sometimes the script can't find some FQDNs. For example, if the FQDN in the file is "ServerXX.abc.com", and we're searching for "ServerXX", the FQDN is not found. At the same time search for other names works as expected. When debugging the script, it can be seen that the expression inside {} is literally "ServerXX.abc.com" -like "ServerXX*". It MUST be true. But the resulting search result is empty. And even more interesting thing is, if the search name is specified as "ServerXX.", "ServerXX.a" or with other letters from the FQDN, the script finds it. If the same server name is specified in the file without the domain name (in the simple form), the script finds it.
Well, and even more enigmatic thing is, we have two instances of the installed application, one for production, another one for testing. The test one contains a much smaller server database. If I add the "invisible" server name from the prod instance to the test one and export the database, the script finds this name without any problems.
If I replace -like with -match, the issue disappears. So it's not an issue of the XML file generator (it's another PS script that generates a PSCustomObject and exports it via Export-CliXml). It's also not an issue of some invisible or non-ANSI symbols in the server name. I also examined the content of the XML file manually. It's huge (several tens of megabytes) and complex, so it's pretty difficult to analyze but I didn't find any visible issue. The XML structure looks correct.
I don't understand that random behavior. Can it be related somehow to the XML file size? Memory lack in PS or something like that? We use Powershell v4.
Note that this answer is not a solution, because (as of this writing) there's not enough information to diagnose your problem; however, your use of the -like and -match operators deserves some scrutiny.
$_.Name -match $ServerName+"*" (more succinctly: $_.Name -match "$ServerName*") is not the same as $_.Name -like "$ServerName*":
-match uses regular expressions (regexes), which (also) match part of the input, unless explicitly formulated to match at the start (^) and/or the end ($) of the input.
-like uses wildcard expressions, which must match the input as a whole.
While regexes and wildcards are distantly related, their syntax - and capabilities - are different; regexes are far more powerful; in the case at hand (note that matching is case-insensitive by default):
... -like 'ServerXX*' matches a string that starts with ServerXX and is followed by zero or more arbitrary characters (*).
Inputs 'ServerXX', 'ServerXX.foo.bar' and 'ServerXXY' would all return $true.
... -match 'ServerXX*' matches a string that contains the substring ServerX (just one X!) anywhere in the input, optionally followed by more X characters, because the quantifier * (zero or more) applies to the preceding character or subexpression.
While inputs 'ServerXX' and 'ServerXX.foo.bar' would return $true, so would 'ServerX' and 'fooServerXX' - which is undesired in this case.
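The difference described above can be checked by applying both operators to the same list; with an array on the left-hand side, both operators return the matching elements. This is a small sketch with hypothetical variable names, using the sample inputs mentioned above.

```powershell
$names = 'ServerXX', 'ServerXX.foo.bar', 'ServerXXY', 'ServerX', 'fooServerXX'

# Wildcard: the pattern must match the input as a whole.
$likeHits = $names -like 'ServerXX*'

# Regex: 'ServerX' followed by zero or more 'X', anywhere in the input.
$matchHits = $names -match 'ServerXX*'

# Regex anchored at the start of the input.
$anchoredHits = $names -match '^ServerXX'

$likeHits.Count       # 3: ServerXX, ServerXX.foo.bar, ServerXXY
$matchHits.Count      # 5: every sample input matches
$anchoredHits.Count   # 3: the same three names as with -like
```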
If your inputs are FQDNs, use either of the following expressions, which are equivalent:
... -like 'ServerXX.*'
... -match '^ServerXX\.'
If the server name is supplied via variable, e.g. $ServerName, use "...", an expandable string, in the simplest case:
... -like "$ServerName.*"
... -match "^$ServerName\."
This is fine in the case of server names, as they're not permitted to contain characters that could mistakenly be interpreted as regex / wildcard metacharacters (characters with special meaning, such as *).
Generally, the safest approach is to explicitly escape a variable value to ensure its literal use, though note that needing to do so is much more likely in a regex than in a wildcard expression, because regexes have many more metacharacters:
... -like ('{0}.*' -f [System.Management.Automation.WildcardPattern]::Escape($ServerName))
... -match ('^{0}\.' -f [regex]::Escape($ServerName))
Using a single-quoted template string with -f, the format operator ({0} represents the 1st RHS operand), makes it obvious which parts are used literally, and which parts are spliced in as an escaped variable value.
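To see why escaping matters, consider a value that contains metacharacters. The name 'Server[1]' below is purely hypothetical (real server names would not look like this) and is chosen only because [ and ] have special meaning in both wildcard and regex patterns; the result variables are likewise introduced just for this sketch.

```powershell
# Hypothetical value containing wildcard / regex metacharacters
$ServerName = 'Server[1]'
$fqdn = 'Server[1].abc.com'

# Unescaped: [1] is interpreted as a character set, so the literal brackets don't match.
$unescapedLike = $fqdn -like "$ServerName.*"

# Escaped: the value is used literally.
$escapedLike = $fqdn -like ('{0}.*' -f [System.Management.Automation.WildcardPattern]::Escape($ServerName))
$escapedMatch = $fqdn -match ('^{0}\.' -f [regex]::Escape($ServerName))

$unescapedLike   # False
$escapedLike     # True
$escapedMatch    # True
```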

How to split up a string by every new line and put each line into a separate array entry in PowerShell

My question is very simple and I am very new to PowerShell but I'm just wondering if there is a easy way to split a string by each line and add the contents of each line into a separate array entry.
To complement Don Cruickshank's helpful answer:
"`n" expands to a LF character only (Unix-style, Unicode code point U+000A), but your input may use CRLF sequences ("`r`n", Windows-style, Unicode code points U+000D and U+000A) as newlines (line breaks).
While a given platform's native newline character [sequence] is reflected in [Environment]::NewLine, there's no guarantee that the newlines in a given input (file) use it.
Notably, script-embedded here-strings (e.g., @"<newline>...<newline>"@) use whatever newline format the enclosing script file was saved with - and when reading scripts for execution, PowerShell accepts both LF-only and CRLF files on all platforms it supports.
Therefore, the most robust form to split a string into lines by newlines is the following idiom, which takes advantage of the fact that the -split operator by default accepts a regular expression as the split criterion:
$str -split '\r?\n' # returns array of lines contained in $str
The above handles input with both LF-only (\n) and CRLF (\r\n) newlines correctly,
because \r?\n matches each \n (LF) optionally (?) preceded by an \r (CR).
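The difference shows up as soon as the input mixes both newline styles. A minimal sketch, assuming a hypothetical input string:

```powershell
# Hypothetical input that mixes CRLF and LF newlines
$str = "One`r`nTwo`nThree"

$robust = $str -split '\r?\n'   # splits on both CRLF and LF
$lfOnly = $str -split "`n"      # splits on LF only

$robust.Count        # 3
$robust[0].Length    # 3 ('One')
$lfOnly[0].Length    # 4 ('One' plus a trailing CR)
```

With the LF-only split, the stray CR stays attached to the first line, which is easy to miss because it is invisible in console output.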
The help system for PowerShell contains lots of useful information (type help to get started.)
Use the -split operator to split a string. PowerShell uses the backtick character (`) for escape codes and so it would look like this:
$str -split "`n"
For example we can define a string and call Length on that string to get the Length (in characters):
PS C:\> $str = "One`nTwo`nThree"
PS C:\> $str
One
Two
Three
PS C:\> $str.Length
13
Now we'll create an array where that string is split into lines and get the length of the resulting array. Note that in PowerShell, arrays are shown one item per line, so it appears just like the earlier result in this case!
PS C:\> $arr = $str -split "`n"
PS C:\> $arr
One
Two
Three
PS C:\> $arr.Length
3
PowerShell can be confusing at first. One trick is to convert data structures to JSON to see what's going on until you get used to it.
PS C:\> $str | ConvertTo-Json -Compress
"One\nTwo\nThree"
PS C:\> $arr | ConvertTo-Json -Compress
["One","Two","Three"]

Resources