Why does using a comma concatenate strings in PowerShell?

I have encountered a PowerShell behavior I don't understand. A comma between strings concatenates them and inserts a space in between, e.g.:
PS H:\> [string]$result = "a","b"
PS H:\> $result # a string with 3 characters
a b
If the result is interpreted as an array, then the comma separates the elements of the array, e.g.:
PS H:\> [array]$result = "a","b"
PS H:\> $result # an array with 2 elements
a
b
I have tried searching for an explanation of this behavior, but I don't really understand what to search for. In the documentation about the comma operator, I see that it is used to initialize arrays, but I have not found an explanation for the "string concatenation" (which, I suspect, may not be the correct term to use here).

Indeed: , is the array constructor operator.
Since your variable is a scalar - a single [string] instance - PowerShell implicitly stringifies your array (converts it to a string).
PowerShell stringifies arrays by joining the (stringified) elements of the array with a space character as the separator by default.
You may set a different separator via the $OFS preference variable, but that is rarely done in practice.[1]
You can observe this behavior in the context of string interpolation:
PS> "$( "a","b" )"
a b
[1] As zett42 points out, an easier way to specify a different separator is to use the -join operator; e.g.,
"a", "b" -join '#' yields 'a#b'.
By contrast, setting $OFS without resetting it afterwards can cause later commands that expect it to be at its default to malfunction; with -join you get explicit control.
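The three behaviors described above can be compared side by side. This is a minimal sketch; the variable names are made up for illustration, and the child scope (& { ... }) is just one way to keep a changed $OFS from leaking into later commands.

```powershell
# Implicit stringification: the array is joined with a space by default.
[string] $joined = 'a', 'b'

# $OFS changes the separator; setting it inside a child scope (& { ... })
# means the change is discarded when the script block exits.
$viaOfs = & { $OFS = '#'; [string] ('a', 'b') }

# -join states the separator explicitly and does not touch $OFS.
$viaJoin = 'a', 'b' -join '#'

$joined, $viaOfs, $viaJoin
```

The first value is 'a b'; the other two are both 'a#b'.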

Related

PowerShell Sum Array That Already Contains MBs

I have an array that contains MBs already in the values. This is how MS DPM returns data written to a tape. I would like to sum them together. Is there an easy one liner to accommodate for this?
MB is a recognized numeric suffix in PowerShell's native grammar, so you can parse and evaluate your size strings with Invoke-Expression:
PS ~> Invoke-Expression '2401279.56MB'
2517924115906.56
You'll want to do some basic input validation to make sure it's actually a numeric sequence, and remove the thousand separator:
$Tapes.DataWrittenDisplayString |ForEach-Object {
    # remove commas and whitespace
    $dataWritten = $_ -replace '[,\s]'
    # ensure it's actually a number in the expected format
    if($dataWritten -match '^\d+(?:\.\d+)?[kmgtp]b$'){
        # let PowerShell do the rest
        $dataWritten |Invoke-Expression
    }
}
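The loop above only emits the parsed numbers; to get the total the question asks for, the output can be piped to Measure-Object. The following is a sketch under assumptions: $sizes is a hypothetical stand-in for $Tapes.DataWrittenDisplayString, and its sample values are invented.

```powershell
# Hypothetical sample values; the last one is deliberately invalid.
$sizes = '1,024.00 MB', '2.50 GB', 'not a size'

$total = $sizes | ForEach-Object {
    # remove commas and whitespace
    $dataWritten = $_ -replace '[,\s]'
    # only evaluate strings that look like <number><unit>
    if ($dataWritten -match '^\d+(?:\.\d+)?[kmgtp]b$') {
        $dataWritten | Invoke-Expression
    }
} | Measure-Object -Sum | Select-Object -ExpandProperty Sum

$total          # total in bytes
$total / 1GB    # the same total expressed in GB
```

The invalid entry is skipped by the -match check, so only the two well-formed sizes contribute to the sum.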

How does -join ";" work?

This answer includes a PowerShell line of code that splits $env:path, applies a filter, and puts the result back together to store it in $env:path.
$path = ($path.Split(';') | Where-Object { $_ -ne 'ValueToRemove' }) -join ';'
I was reading this code, and then, suddenly, a wild -join ';' appears. How does this work? What is the concept behind this? I would expect (<expression>) to eventually become some object, but then this line reads <object> -join ';', so the join part would still be some independent token. How is it evaluated? I can see that it obviously does "the right thing", but how and why does it work? How exactly does PowerShell evaluate this line? Looks like very dark magic to me.
As stated in the docs, the -Join operator can be used in two main ways:
-Join <String[]>
<String[]> -Join <Delimiter>
Most of PowerShell's operators work this way (and look similar to method parameters). You have some values on the left side of an operator and execute an action (in this case joining) with the right-side token, in your case the semicolon ;.
Take a look at the help sections about_Join and about_Operators (by typing Get-Help <section_name>).
Also a nice example copied from the docs (without the need of splitting beforehand) is:
PS> $a = "WIND", "S P", "ERSHELL"
PS> $a -join "OW"
WINDOWS POWERSHELL
To add to Clijsters' answer, it's an operator that acts on a string array (String[]). The string following the operator specifies the delimiter to place between the elements of the array when joining them.
To break down the parts:
($path.Split(';') | # take the *string* $path and split it into an *array*.
# do this on the ; character
Where-Object { $_ -ne 'ValueToRemove' }) # exclude the array elements that are equal to
# "ValueToRemove"
-join ';' # Join the *array* back into a *string*
# Put ";" between each array element.
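The breakdown above can also be run one step at a time to inspect each intermediate value. This is a sketch with a made-up $path value; the intermediate variable names are for illustration only.

```powershell
# Hypothetical input; a real value would come from $env:Path.
$path = 'C:\Windows;ValueToRemove;C:\Tools'

$parts = $path.Split(';')                                  # string -> array of 3 strings
$kept  = $parts | Where-Object { $_ -ne 'ValueToRemove' }  # array of 2 strings
$path  = $kept -join ';'                                   # array -> string

# Unary form: no delimiter, the elements are simply concatenated.
$unary = -join ('a', 'b', 'c')

$path
$unary
```

After the last step $path is 'C:\Windows;C:\Tools', and the unary form yields 'abc'.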

How to split up a string by every new line and put each line into a separate array entry in PowerShell

My question is very simple and I am very new to PowerShell, but I'm just wondering if there is an easy way to split a string by each line and add the contents of each line into a separate array entry.
To complement Don Cruickshank's helpful answer:
"`n" expands to a LF character only (Unix-style, Unicode code point U+000A), but your input may use CRLF sequences ("`r`n", Windows-style, Unicode code points U+000D and U+000A) as newlines (line breaks).
While a given platform's native newline character [sequence] is reflected in [Environment]::NewLine, there's no guarantee that a given input (file)'s newlines use it.
Notably, script-embedded here-strings (e.g., @"<newline>...<newline>"@) use whatever newline format the enclosing script file was saved with - and when reading scripts for execution, PowerShell accepts both LF-only and CRLF files on all platforms it supports.
Therefore, the most robust form to split a string into lines by newlines is the following idiom, which takes advantage of the fact that the -split operator by default accepts a regular expression as the split criterion:
$str -split '\r?\n' # returns array of lines contained in $str
The above handles input with both LF-only (\n) and CRLF (\r\n) newlines correctly,
because \r?\n matches each \n (LF) optionally (?) preceded by an \r (CR).
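To make the difference concrete, here is a small sketch that splits a string with mixed newline styles both ways. The sample string and variable names are invented for illustration.

```powershell
# Hypothetical input mixing a CRLF and an LF-only newline.
$mixed = "One`r`nTwo`nThree"

# Robust: the regex treats CRLF and LF alike.
$lines = $mixed -split '\r?\n'

# Naive: splitting on LF only leaves a trailing CR on the first line.
$naive = $mixed -split "`n"

$lines.Count        # number of lines found by the robust split
$lines[0].Length    # length of 'One'
$naive[0].Length    # one character longer, because of the stray CR
```

Both splits return three elements, but the first element of $naive is four characters long ("One" plus the CR), while the first element of $lines is exactly three.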
The help system for PowerShell contains lots of useful information (type help to get started.)
Use the -split operator to split a string. PowerShell uses the backtick character (`) for escape codes and so it would look like this:
$str -split "`n"
For example we can define a string and call Length on that string to get the Length (in characters):
PS C:\> $str = "One`nTwo`nThree"
PS C:\> $str
One
Two
Three
PS C:\> $str.Length
13
Now we'll create an array where that string is split into lines and get the length of the resulting array. Note that in PowerShell, arrays are shown one item per line, so it appears just like the earlier result in this case!
PS C:\> $arr = $str -split "`n"
PS C:\> $arr
One
Two
Three
PS C:\> $arr.Length
3
PowerShell can be confusing at first. One trick is to convert data structures to JSON to see what's going on until you get used to it.
PS C:\> $str | ConvertTo-Json -Compress
"One\nTwo\nThree"
PS C:\> $arr | ConvertTo-Json -Compress
["One","Two","Three"]

How to fetch one item at a time from an array and print them? [duplicate]

I found some strange behavior in PowerShell surrounding arrays and double quotes. If I create and print the first element in an array, such as:
$test = @('testing')
echo $test[0]
Output:
testing
Everything works fine. But if I put double quotes around it:
echo "$test[0]"
Output:
testing[0]
Only the $test variable was evaluated and the array marker [0] was treated literally as a string. The easy fix is to just avoid interpolating array variables in double quotes, or assign them to another variable first. But is this behavior by design?
So when you are using interpolation, by default it interpolates just the next variable in toto. So when you do this:
"$test[0]"
It sees $test as the next variable and expands it on its own (the array is stringified as a whole), then treats the [0] that follows as literal text. The solution is to explicitly tell PowerShell where the bit to interpolate starts and where it stops:
"$($test[0])"
Note that this behavior is one of my main reasons for using formatted strings instead of relying on interpolation:
"{0}" -f $test[0]
EBGreen's helpful answer contains effective solutions, but only a cursory explanation of PowerShell's string expansion (string interpolation):
Only variables by themselves can be embedded directly inside double-quoted strings ("...") (by contrast, single-quoted strings ('...'), as in many other languages, are for literal contents).
This applies to both regular variables and variables referencing a specific namespace; e.g.:
"var contains: $var", "Path: $env:PATH"
If the first character after the variable name can be mistaken for part of the name - which notably includes : - use {...} around the variable name to disambiguate; e.g.:
"${var}", "${env:PATH}"
To use a $ as a literal, you must escape it with `, PowerShell's escape character; e.g.:
"Variable `$var"
Any character after the variable name - including [ and . - is treated as a literal part of the string, so in order to index into embedded variables ($var[0]) or to access a property ($var.Count), you need $(...), the subexpression operator (in fact, $(...) allows you to embed entire statements); e.g.:
"1st element: $($var[0])"
"Element count: $($var.Count)"
"Today's date: $((Get-Date -DisplayHint Date | Out-String).Trim())"
Stringification (to-string conversion) is applied to any variable value / evaluation result that isn't already a string:
Caveat: Where culture-specific formatting can be applied, PowerShell chooses the invariant culture, which largely coincides with the US-English date and number formatting; that is, dates and numbers will be represented in US-like format (e.g., month-first date format and . as the decimal mark).
In essence, the .ToString() method is called on any resulting non-string object or collection (strictly speaking, it is .psobject.ToString(), which overrides .ToString() in some cases, notably for arrays / collections and PS custom objects)
Note that this is not the same representation you get when you output a variable or expression directly, and many types have no meaningful default string representations - they just return their full type name.
However, you can embed $(... | Out-String) in order to explicitly apply PowerShell's default output formatting.
For a more comprehensive discussion of stringification, see this answer.
As stated, using -f, the string-formatting operator (<format-string> -f <arg>[, ...]) is an alternative to string interpolation that separates the literal parts of a string from the variable parts:
'1st element: {0}; count: {1:x}' -f $var[0], $var.Count
Note the use of '...' on the LHS, because the format string (the template) is itself a literal. Using '...' in this case is a good habit to form, both to signal the intent of using literal contents and for the ability to embed $ characters without escaping.
In addition to simple positional placeholders ({0} for the 1st argument, {1} for the 2nd, ...), you may optionally exercise more formatting control over the to-string conversion; in the example above, x requests a hex representation of the number.
For available formats, see the documentation of the .NET framework's String.Format method, which the -f operator is based on.
Pitfall: -f has high precedence, so be sure to enclose RHS expressions other than simple index or property access in (...); e.g., '{0:N2}' -f 1/3 won't work as intended, only '{0:N2}' -f (1/3) will.
Caveats: There are important differences between string interpolation and -f:
Unlike expansion inside "...", the -f operator is culture-sensitive:
Therefore, the following two seemingly equivalent statements do not
yield the same result:
PS> [cultureinfo]::CurrentCulture='fr'; $n=1.2; "expanded: $n"; '-f: {0}' -f $n
expanded: 1.2
-f: 1,2
Note how only the -f-formatted command respected the French (fr) decimal mark (,).
Again, see the previously linked answer for a comprehensive look at when PowerShell is and isn't culture-sensitive.
Unlike expansion inside "...", -f stringifies arrays as <type-name>[]:
PS> $arr = 1, 2, 3; "`$arr: $arr"; '$arr: {0}' -f (, $arr)
$arr: 1 2 3
$arr: System.Object[]
Note how "..." interpolation created a space-separated list of the stringification of all array elements, whereas -f-formatting only printed the array's type name.
(As discussed, $arr inside "..." is equivalent to:
(1, 2, 3).psobject.ToString() and it is the generally invisible helper type [psobject] that provides the friendly representation.)
Also note how (, ...) was used to wrap array $arr in a helper array that ensures that -f sees the expression as a single operand; by default, the array's elements would be treated as individual operands.
In such cases you have to do:
echo "$($test[0])"
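Another alternative is to use string formatting (the parentheses are needed so that -f is evaluated as an operator rather than passed to echo as an argument):
echo ("this is {0}" -f $test[0])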
Note that this will be the case when you are accessing properties in strings as well. Like "$a.Foo" - should be written as "$($a.Foo)"
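The three forms discussed in these answers can be compared directly. This is a minimal sketch; the array deliberately has a second, made-up element so that the whole-array expansion is visible.

```powershell
# Hypothetical two-element array.
$test = @('testing', 'second')

# $test is expanded as a whole (elements joined with a space); [0] stays literal.
$whole = "$test[0]"

# $(...) evaluates the index expression before it is embedded.
$subExpr = "$($test[0])"

# -f keeps the template literal and takes the value as an argument.
$formatted = '{0}' -f $test[0]

$whole, $subExpr, $formatted
```

The first string is 'testing second[0]', while the other two are both 'testing'.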

Perl: Indexing function returning array syntax

I have a question about Perl, more out of curiosity than necessity. I have seen that there are many ways to do a lot of things in Perl, and a lot of the time the syntax seems unintuitive to me (I've seen a few one-liners doing some impressive stuff).
So.. I know the function split returns an array. My question is, how do I go about printing the first element of this array without saving it into a special variable? Something like $(split(" ",$_))[0] ... but one that works.
You're 99% there
$ perl -de0
Loading DB routines from perl5db.pl version 1.33
Editor support available.
Enter h or `h h' for help, or `man perldebug' for more help.
main::(-e:1): 0
DB<1> $a = "This is a test"
DB<2> $b = (split(" ",$a))[0]
DB<3> p $b
This
DB<4> p "'$b'"
'This'
This should do it:
print ((split(" ", $_))[0]);
You need one set of parentheses to allow you to apply array indexing to the result of a function. The outer parentheses are needed to get around special parsing of print arguments.
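Try this out to print the first element of a whitespace-separated list. The \s+ regex matches one or more whitespace characters to split on. Note the -n switch rather than -p: with -p, Perl would also print the whole input line after the first element.
echo "1 2 3 4" | perl -ne 'print +(split(/\s+/, $_))[0]'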
Also, see this related post.

Resources