Perl print shows leading spaces for every element - arrays

I load a file into an array (one line per array element).
I process the array elements and save to a new file.
I want to print out the new file:
print ("Array: @myArray");
But - it shows them with leading spaces in every line.
Is there a simple way to print out the array without the leading spaces?

Yes -- use join:
my $delimiter = ''; # empty string
my $string = join($delimiter, @myArray);
print "Array: $string";

Matt Fenwick is correct. When your array is in double quotes, Perl will put the value of $" (which defaults to a space; see the perlvar manpage) between the elements. You can just put it outside the quotes:
print ('Array: ', @myArray);
If you want the elements separated by for example a comma, change the output field separator:
use English '-no_match_vars';
$OUTPUT_FIELD_SEPARATOR = ','; # or "\n" etc.
print ('Array: ', @myArray);
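To see the difference between interpolation and join side by side, here is a minimal sketch. The sample data is made up for illustration and stands in for lines read from a file.

```perl
use strict;
use warnings;

# Hypothetical sample data, as if each element were a line read from a file.
my @myArray = ("one\n", "two\n", "three\n");

# Interpolation inside double quotes inserts $" (a space by default)
# between elements, so every line after the first gains a leading space.
my $interpolated = "@myArray";

# join with an empty separator concatenates the elements unchanged.
my $joined = join('', @myArray);

# Changing $" with local affects interpolation only inside this block.
my $localized;
{
    local $" = '';
    $localized = "@myArray";
}

print "Interpolated:\n$interpolated";
print "Joined:\n$joined";
print "Localized:\n$localized";
```

Using local keeps the change to $" from leaking into the rest of the program.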

Related

Convert array to a string with newlines embedded in string

I need to convert a Perl array to a single string variable, containing all array items separated with newlines.
my $content = "";
@names=('A','C','C','D','E');
$content = $content . join($", @names) . "\n";
print $content;
I intend the output to be like:
A
C
C
D
E
But I get:
A C C D E
Why isn't the newline \n character being honoured?
Since it appears that you want a newline not just between each line, but after the last one too, you can use any of the following:
join("\n", @names) . "\n"
join("", map "$_\n", @names)
join("\n", @names, "")
These are equivalent except when the array is empty. In that situation, the first results in a newline, and the other two result in an empty string.
By the way,
$content = $content . EXPR;
can be written as
$content .= EXPR;
To join an array with newlines in between, use
join("\n", @array)
Your code uses the contents of the $" variable as the separator, which by default contains a space.
Do this instead:
$content = $content . join("\n", @names);
The $" variable holds the value that Perl uses between array elements when it interpolates an array in a double-quoted string. By default (and you haven't changed it), it's a space.
Perhaps you were thinking about the input or output record separators, $/ or $\.
But, you don't really want to play with those variables. If you want a newline, use a newline:
join "\n", @array;
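The empty-array difference between the three join variants mentioned earlier can be checked with a short sketch. The @variants, @full_results and @empty_results names are only for illustration.

```perl
use strict;
use warnings;

# Three ways to build "one element per line, newline after the last".
my @variants = (
    sub { join("\n", @_) . "\n" },
    sub { join("", map { $_ . "\n" } @_) },
    sub { join("\n", @_, "") },
);

my @names = ('A', 'C', 'C', 'D', 'E');
my @empty = ();

# Apply every variant to a filled array and to an empty one.
my @full_results  = map { $_->(@names) } @variants;
my @empty_results = map { $_->(@empty) } @variants;

for my $i (0 .. $#variants) {
    printf "variant %d: %d characters for \@names, %d for an empty array\n",
        $i + 1, length($full_results[$i]), length($empty_results[$i]);
}
```

Only the first variant produces a character (the lone newline) for the empty array.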

How to type selected elements of an array in double quotes

Here, I'm trying to print the 2nd, 3rd and 4th elements of an array in double quotes and I've only been able to do it in single quotes.
my @fruits = ("apples", "oranges", "guavas", "passionfruits", "grapes");
my $a = 1;
while ($a<4){
print " '$fruits[$a]' \n";
$a += 1;
}
But I can't do this in double quotes. When I swap the single and double quotes, it prints "$fruits[$a]"\n three times instead.
And when I change all quotes to double quotes, it gives an error which I understand why.
Please I really need help here.
And if I could get a way to print all three elements in double quotes without having to use a loop. Thanks!
To use " in a string delimited by ", escape it.
"foo: \"$bar\"\n"
You could also switch the delimiter (keeping in mind that "..." is short for qq"...").
qq{foo: "$bar"\n}
Always use
use strict;
use warnings;
even in the shortest Perl scripts.
If the code contains typos, Perl will usually report errors when these pragmas are enabled. Without them, Perl will happily do weird, wrong and pointless things silently.
Your example will not print the entire array. Instead you will get:
'oranges'
'guavas'
'passionfruits'
The first index of an array is 0 and therefore 'apples' is skipped because $a is initialized with 1. The loop is also exited due to reaching the value 4 before printing out 'grapes'.
In order to print the entire array you would do:
if you need to use the index value $i somewhere:
for my $i (0 .. $#fruits) {
print " $i: '$fruits[$i]' \n";
}
($#fruits is the last index of @fruits, equal to the size of the array minus 1. Since this array has 5 items, the index values range from 0 to 4.)
otherwise:
foreach my $current_fruit (@fruits) {
print " '$current_fruit' \n";
}
where $current_fruit is set to each item in the array @fruits in turn.
Quotes in Perl function as operators, and depending on which ones you use, they may or may not do various things with the included string.
In your examples, the double quotes will do interpolation on the string, substituting the value of $fruits[$a] and replacing the escape sequence \n. Therefore:
print " '$fruits[$a]' \n";
(for $a == 1) becomes:
'oranges'
Single quotes, in contrast, will not do interpolation.
Only single quotes themselves (and backslashes preceding single quotes) need to be escaped with a backslash. (Other backslashes can optionally be escaped.) All other character sequences will appear as they are, so the argument to print in:
print ' "$fruits[$a]" \n';
is considered entirely a literal string and thus
"$fruits[$a]" \n "$fruits[$a]" \n "$fruits[$a]" \n
is printed out.
To get the desired output, there are multiple ways to go about it:
The simplest way - but not easiest to read for complex strings - is to use double quotes and escape the included quotes:
print " \"$fruits[$a]\" \n";
You can use a generic notation for "...", which is qq{...}, where the delimiters are either a matching pair of brackets (any of (), [], {} or <>) or the same non-whitespace, non-word character at both ends, e.g.:
print qq{ "$fruits[$a]" \n};
or
print qq! "$fruits[$a]" \n!;
You can concatenate the string out of parts that you quote separately:
print ' "' . $fruits[$a] . '"' . " \n";
Which is easiest to read in code will depend on the complexity of the string and the variables contained: for really long strings with complex dereferences, indices and hash keys you might want to use option 3, whereas for short ones option 2 would be the best.
My entire edited code:
use strict;
use warnings;
my @fruits = ("apples", "oranges", "guavas", "passionfruits", "grapes");
for my $i (0 .. $#fruits) {
print " $i: \"$fruits[$i]\" \n";
print qq< $i: "$fruits[$i]" \n>;
print ' ' . $i . ': "' . $fruits[$i] . '"' . " \n";
}
foreach my $current_fruit (@fruits) {
print " \"$current_fruit\" \n";
print qq¤ "$current_fruit" \n¤;
print ' "' . $current_fruit . '"' . " \n";
}
You can learn more about the different quotes in Perl from the perldoc (or man on UNIX-like systems) page perlop and its section titled "Quote and Quote-like Operators".
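For the last part of the question (printing the 2nd, 3rd and 4th elements in double quotes without a loop), one possible approach is an array slice combined with map and join. This is only a sketch, and the $output variable is introduced here for illustration.

```perl
use strict;
use warnings;

my @fruits = ("apples", "oranges", "guavas", "passionfruits", "grapes");

# @fruits[1 .. 3] is an array slice holding the 2nd, 3rd and 4th elements.
# map wraps each one in double quotes; qq{} avoids escaping those quotes.
my $output = join('', map { qq{ "$_" \n} } @fruits[1 .. 3]);

print $output;
```

The slice replaces the manual index counter, so there is no off-by-one condition to maintain.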

How to use chomp

Below I have a list of data I am trying to manipulate. I want to split the columns and rejoin them in a different arrangement.
I would like to switch the last element of the array with the third one but I'm running into a problem.
Since the last element of the array contains a newline character at the end, when I switch it to be the third, it kicks everything a line down.
CODE
while (<>) {
my @flds = split /,/;
DO STUFF HERE;
ETC;
print join ",", @flds[ 0, 1, 3, 2 ]; # switches 3rd element with last
}
SAMPLE DATA
1,josh,Hello,Company_name
1,josh,Hello,Company_name
1,josh,Hello,Company_name
1,josh,Hello,Company_name
1,josh,Hello,Company_name
1,josh,Hello,Company_name
MY RESULTS - Kicked down a line.
1,josh,Company_name
,Hello1,josh,Company_name
,Hello1,josh,Company_name
,Hello1,josh,Company_name
,Hello1,josh,Company_name
,Hello1,josh,Company_name,Hello
DESIRED RESULTS
1,josh,Company_name,Hello
1,josh,Company_name,Hello
1,josh,Company_name,Hello
1,josh,Company_name,Hello
1,josh,Company_name,Hello
1,josh,Company_name,Hello
I know it has something to do with chomp but when I chomp the first or last element, all \n are removed.
When I use chomp on anything in between, nothing happens. Can anyone help?
chomp removes the trailing newline from the argument. Since none of your four fields should actually contain a newline, this is probably something you want to do for the purposes of data processing. You can remove the newline with chomp before you even split the line into fields, and then add a newline after each record with your final print statement:
while (<>) {
chomp; # Automatically operates on $_
my @flds = split /,/;
DO STUFF HERE;
ETC;
print join(",", @flds[0,1,3,2]) . "\n"; # switches 3rd element with last
}
while ( <> ) {
chomp;
my @flds = split /,/;
... rest of your stuff
}
In the while loop, as each line is processed, $_ is set to the contents of the line. chomp, by default, acts on $_ and removes a trailing newline. split also defaults to using $_, so that works fine.
Technically, what is happening is that the last element in @flds (e.g. $flds[3]) includes the trailing \n from the line.
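A self-contained way to try the chomp-then-split approach is sketched below. The @input array is a hypothetical stand-in for the lines that <> would return, and @output is used only so the result can be inspected.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the lines that <> would return.
my @input = ("1,josh,Hello,Company_name\n") x 3;

my @output;
for my $line (@input) {
    chomp(my $copy = $line);        # strip the newline before splitting
    my @flds = split /,/, $copy;
    # Swap the 3rd and 4th fields, then put the newline back at the end.
    push @output, join(",", @flds[0, 1, 3, 2]) . "\n";
}

print @output;
```

Because the newline is removed before the split, it can no longer travel with the last field when the fields are reordered.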
The chomp() function will remove (usually) any newline character from the end of a string. The reason we say usually is that it actually removes a trailing string that matches the current value of $/ (the input record separator), and $/ defaults to a newline.
Example 1. Chomping a string
Most often you will use chomp() when reading data from a file or from a user. When reading user input from the standard input stream (STDIN) for instance, you get a newline character with each line of data. chomp() is really useful in this case because you do not need to write a regular expression and you do not need to worry about it removing needed characters.
while (my $text = <STDIN>) {
chomp($text);
print "You entered '$text'\n";
last if ($text eq '');
}
Example usage and output of this program is:
a word
You entered 'a word'
some text
You entered 'some text'
You entered ''
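The dependence of chomp on $/ can be seen in a small sketch. The strings and variable names here are made up for illustration.

```perl
use strict;
use warnings;

my $line = "some text\n";
my $removed = chomp($line);   # chomp returns the number of characters removed
my $again   = chomp($line);   # no trailing newline is left, so this returns 0

my $record = "field1;field2;";
my $custom;
{
    local $/ = ';';           # inside this block chomp strips a trailing ';'
    chomp($record);
    $custom = $record;
}

print "removed=$removed again=$again line='$line' custom='$custom'\n";
```

Calling chomp twice is harmless because it only removes what matches $/ at the very end of the string.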

How to compare each element of an array using regex?

I am using the Lingua::EN::Tagger Perl module in order to tag parts of speech from a user's input. That portion of my code works perfectly. However, the problem is that I only want to keep the input that has the noun tags, which are "NN, NNS, NNP, NNPS", and store these words in a separate array @nounArray. The user will be inputting a question such as "what is your name?" Each element of the question will be tagged: What/WP is/is your/PN name/NN
my @UserInput = $readable_text;
my @nounArray;
foreach my $UserInput (@UserInput){
if ($UserInput =~ m/NN|NNS$|NNP$|NNPS$/){
$UserInput = @nounArray;
}
print @nounArray;
}
However, nothing occurs when I run the code. The goal is to have the nouns of the user's input be placed in a separate array after separating them from the original array. I do not want to print the array, but I did this in order to see if the code was working.
Since you want to iterate over the words in $readable_text you can split them first into array,
my $readable_text = "What/WP is/is your/PN name/NN";
my @UserInput = split ' ', $readable_text;
my @nounArray;
foreach my $UserInput (@UserInput) {
if ($UserInput =~ m/NN|NNS$|NNP$|NNPS$/) {
# print "$UserInput\n";
push @nounArray, $UserInput;
}
}
print @nounArray;
$ matches at the end of the string (or just before a trailing newline). Since the whole tagged sentence is a single string, most of the tags are not at the end of it, which would prevent them from matching.
But as you point out in your comment, it looks like you're trying to match word boundaries here, so just replace all $ in your expression with \b.
First, split your words by whitespace:
my @UserInput = split /\s+/, $UserInput;
Then grep for the nouns:
my @nouns = grep { m%/N% } @UserInput; # only noun tags include /N
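If you prefer to match only the four noun tags from the question rather than anything starting with /N, a stricter pattern can be sketched as follows. The sample text is hypothetical (it extends the question's example with two extra tagged words), and @tokens and @words are names chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical tagged text in the word/TAG form shown in the question.
my $readable_text = "What/WP is/is your/PN name/NN cats/NNS London/NNP";

my @tokens = split ' ', $readable_text;

# Keep tokens whose tag is exactly NN, NNS, NNP or NNPS.
my @nounArray = grep { m{/NNP?S?\z} } @tokens;

# Drop the tag to keep only the words themselves.
my @words = map { (split m{/}, $_)[0] } @nounArray;

print "@nounArray\n";
print "@words\n";
```

Anchoring the tag with \z after splitting means each token is tested on its own, so the position of the word in the sentence no longer matters.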

How to ignore any empty values in a perl grep?

I am using the following to count the number of occurrences of a pattern in a file:
my @lines = grep /$text/, <$fp>;
print ($#lines + 1);
But sometimes it prints one more than the actual value. I checked and it is because the last element of @lines is null, and that is also counted.
How can the last element of the grep result be empty sometimes? Also, how can this issue be resolved?
It really depends a lot on your pattern, but one thing you could do is join a couple of matches, the first one disqualifying any line that contains only space (or nothing). This example will reject any line that is either empty, newline only, or any amount of whitespace only.
my @lines = grep { not /^\s*$/ and /$text/ } <$fp>;
Keep in mind that if the contents of $text happen to include regexp metacharacters, they either need to be intended for their metacharacter purposes, or sanitized with quotemeta().
My theories are that you might have a line terminated in \n which is somehow matching your $text regexp, or your $text regexp contains metacharacters in it that are affecting the match without you being aware. Either way, the snippet I provided will at least force rejection of "blank lines", where blank could mean completely empty (unlikely), newline terminated but otherwise empty (probable), or whitespace containing (possible) lines that appear blank when printed.
A regular expression that matches the empty string will match undef. Perl will warn about doing so, but casts undef to '' before trying to match against it, at which point grep will quite happily promote the undef to its results. If you don't want to pick up the empty string (or anything that will be matched as though it were the empty string), you need to rewrite your regular expression to not match it.
To accurately see what is in lines, do:
use Data::Dumper;
$Data::Dumper::Useqq = 1;
print Dumper \@lines;
Ok, since no more information about the contents of $text (the regex) is forthcoming, I guess I'll toss out some general information.
Consider the following example:
use Data::Dumper;
my @array = (' ', 1, 2, 'a', '');
print Dumper [ grep /\s*/, @array ];
We get:
$VAR1 = [
' ',
1,
2,
'a',
''
];
All the values match. Why? Because \s* can match the empty string, and every string contains the empty string. To get what we want, we need \s or \s+. (There will be no practical difference between the two.)
You may have such a problem.
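The effects described in these answers (unintended metacharacters, patterns that match the empty string, and blank lines being counted) can be compared in one sketch. The @file_lines data and the value of $text are invented for illustration. grep in scalar context returns the number of matches, so no $#lines + 1 arithmetic is needed.

```perl
use strict;
use warnings;

# Hypothetical file contents; the final element is a blank line.
my @file_lines = ("a.b here\n", "axb there\n", "nothing\n", "\n");
my $text = 'a.b';

# Unescaped, '.' is a metacharacter and matches any character.
my $as_regex = grep { /$text/ } @file_lines;

# \Q...\E applies quotemeta, so the text is matched literally.
my $literal = grep { /\Q$text\E/ } @file_lines;

# A pattern that can match the empty string matches every line.
my $everything = grep { /\s*/ } @file_lines;

# Rejecting whitespace-only lines first keeps blank lines out of the count.
my $non_blank = grep { !/^\s*$/ && /\s*/ } @file_lines;

print "regex=$as_regex literal=$literal all=$everything non_blank=$non_blank\n";
```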

Resources