PowerShell Replace CRLF in certain scenarios - file

I'm new to PowerShell and want to replace CRLF in certain scenarios within a text file.
An example text file:
Begin 1 2 3
End 1 2 3
List asd asd
Begin 1 2 3
End 1 2 3
Begin 1 2 3
End 1 2 3
Sometest asd asd
Begin 1 2 3
Where a line isn't starting with Begin or End, I wish to append that line onto the previous one.
So the desired outcome would be:
Begin 1 2 3
End 1 2 3 List asd asd
Begin 1 2 3
End 1 2 3
Begin 1 2 3
End 1 2 3 Sometest asd asd
Begin 1 2 3
The file is tab-separated, so Begin and End are each followed by a TAB.
I tried the below, just to get rid of all the CRLF's, which doesn't work:
$content = Get-Content c:\test.txt
$content -replace "'r'n","" | Set-Content c:\test2.txt
I've read the MSDN on PowerShell and can replace text on different lines, just not over multiple lines like this :(
I'm at home testing on Windows 7, but this is for work and will be on Vista.

# read the file
$content = Get-Content file.txt
# create a new variable (array) to hold the new content
$newContent = @()
# loop over the file content
for($i=0; $i -lt $content.count; $i++)
{
    # if the current line doesn't begin with 'begin' or 'end',
    # append it to the last line of the new content variable
    # (a first line with nothing before it is kept as-is)
    if($content[$i] -notmatch '^(begin|end)' -and $newContent.Count -gt 0)
    {
        # append to the merged line, so several continuation lines in a row all survive
        $newContent[-1] = $newContent[-1]+' '+$content[$i]
    }
    else
    {
        $newContent += $content[$i]
    }
}
$newContent

What do you think about this one-liner?
gc "beginend.txt" | % {}{if(($_ -match "^End")-or($_ -match "^Begin")){write-host "`n$_ " -nonewline}else{write-host $_ -nonewline}}{"`n"}
Begin 1 2 3
End 1 2 3 List asd asd
Begin 1 2 3
End 1 2 3
Begin 1 2 3
End 1 2 3 Sometest asd asd
Begin 1 2 3

$data = gc "beginend.txt"
$start = ""
foreach($line in $data) {
if($line -match "^(Begin|End)") {
if($start -ne "") {
write-output $start
}
$start = $line
} else {
$start = $start + " " + $line
}
}
# Emit the last accumulated line. The loop only writes a line once the
# next Begin/End line arrives, so the final one is still pending here
# (together with any continuation lines appended to it).
if($start -ne "") {
write-output $start
}
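For comparison, the same merge logic can be sketched outside PowerShell. This is a minimal Python illustration, not part of the original answers; the function name merge_continuations and the starters parameter are made up for the example, and the sample uses spaces where the real file has tabs.

```python
import re

def merge_continuations(lines, starters=("Begin", "End")):
    """Append every line that does not start with a starter keyword to the previous line."""
    pattern = re.compile(r"^(?:%s)" % "|".join(map(re.escape, starters)))
    merged = []
    for line in lines:
        # A starter line (or the very first line) opens a new output line.
        if pattern.match(line) or not merged:
            merged.append(line)
        else:
            # Continuation line: glue it onto the last output line.
            merged[-1] += " " + line
    return merged

sample = [
    "Begin 1 2 3",
    "End 1 2 3",
    "List asd asd",
    "Begin 1 2 3",
    "End 1 2 3",
    "Begin 1 2 3",
    "End 1 2 3",
    "Sometest asd asd",
    "Begin 1 2 3",
]

for row in merge_continuations(sample):
    print(row)
```

Because each continuation is appended to the last merged line rather than to the previous input line, several continuation lines in a row are all kept.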


2-dimensional array in Powershell [duplicate]

To accomplish with powershell:
Original:
0 1 2 3
a b c d
# $ # %
Transposed:
0 a #
1 b $
2 c #
3 d %
How can a magic number (in practice, some regex) be used, where each original row has a variable number of columns, so that a transpose is initiated only when the keyword occurs?
Assuming that the original matrix is nx2 so something like:
a 1
b 2
c 3
d 4
a 5
d 6
a 7
b 8
c 9
The resulting matrix may very well be sparse, but each occurrence of a would signify a new column of output.
To convert rows to columns (transpose) with PowerShell:
(Note that there is nothing in the referral that shows how magic numbers should be used to accomplish this)
$Orginal = @'
0 1 2 3
a b c d
# $ # %
'@
$Transposed = [Collections.ObjectModel.Collection[Object]]::new()
$Lines = $Orginal -Split '\r?\n'
for ($y = 0; $y -lt $Lines.Count; $y++) {
    $Items = $lines[$y] -Split '\s+'
    for ($x = 0; $x -lt $Items.Count; $x++) {
        if ($x -ge $Transposed.Count) { $Transposed.Add((,@() * $Lines.Count)) }
        $Transposed[$x][$y] = $Items[$x]
    }
}
$Transposed |Foreach-Object { "$_" }
0 a #
1 b $
2 c #
3 d %
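The keyword-triggered variant asked about in the question (each occurrence of a starts a new output column) is not covered by the code above. One possible reading of it, sketched in Python under the assumption that the input is a list of key/value pairs and that missing cells are shown as "-", is the following; split_on_keyword is a hypothetical helper name.

```python
def split_on_keyword(pairs, keyword="a"):
    """Start a new column (a dict) each time `keyword` appears as the key."""
    columns = []
    for key, value in pairs:
        # The keyword, or the very first pair, opens a new column.
        if key == keyword or not columns:
            columns.append({})
        columns[-1][key] = value
    return columns

# The n x 2 example matrix from the question.
pairs = [("a", "1"), ("b", "2"), ("c", "3"), ("d", "4"),
         ("a", "5"), ("d", "6"),
         ("a", "7"), ("b", "8"), ("c", "9")]

columns = split_on_keyword(pairs)

# Print one row per key; the result is sparse, so fill the gaps with "-".
keys = sorted({key for column in columns for key in column})
for key in keys:
    print(key, *[column.get(key, "-") for column in columns])
```

Storing each column as a dictionary keeps the sparse case simple: a key that does not occur between two keyword lines is just absent and gets filled in only when printing.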

TCL list data to histogram

I'm doing some data analysis, and the output is a long list of numbers. Each line consists of 1 to n numbers, which may be duplicated:
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 4
I'd like to put these into a (time-series) histogram. I'm not an expert in tcl (yet?), and I have some ideas how to do this but I have not been successful yet. The puts statements are just so I can see what's happening.
while { [gets $infile line] != -1 } {
set m [llength $line]
puts "line length $m"
foreach item $line {
puts $item
incr nc($item)
puts "nc: $nc($item)"
}
}
This nc array I've created is giving me a size-based array. However, I'd like a per-line array (2D). Naively, it would be nc($item)($nlines). I initially tried labeling the array variable with the length, such as nc${item}($nlines), but I couldn't get that to work.
I appreciate any help.
Best
Mike
Although Tcl arrays are one-dimensional, you can construct key strings to fake multi-dimensionality:
set lineno -1
set fh [open infile r]
while {[gets $fh line] != -1} {
incr lineno
foreach item [split [string trim $line]] {
incr nc($lineno,$item)
}
}
close $fh
# `parray` is a handy command for inspecting arrays
parray nc
outputs
nc(0,1) = 20
nc(0,2) = 8
nc(0,3) = 2
nc(0,4) = 1
nc(1,1) = 2
nc(1,2) = 4
nc(1,4) = 3
nc(2,1) = 1
nc(2,2) = 1
nc(2,3) = 1
nc(2,4) = 1
Or use dictionaries:
set lineno -1
set nc {}
set fh [open infile r]
while {[gets $fh line] != -1} {
set thisLine {}
foreach item [split [string trim $line]] {
dict incr thisLine $item
}
dict set nc [incr lineno] $thisLine
}
close $fh
dict for {line data} $nc {
puts [list $line $data]
}
outputs
0 {1 20 2 8 3 2 4 1}
1 {1 2 2 4 4 3}
2 {1 1 2 1 3 1 4 1}
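The nested-dictionary idea (one histogram per input line) is not specific to Tcl. As a rough cross-check, here is a small Python sketch of the same structure; per_line_histogram is a made-up name and the two sample lines are invented for the illustration.

```python
from collections import Counter

def per_line_histogram(lines):
    """Return one Counter per non-empty line, mapping each item to its count."""
    return [Counter(line.split()) for line in lines if line.strip()]

# Invented sample data: two lines of whitespace-separated numbers.
lines = ["1 1 1 2 2 3", "2 2 4"]
hist = per_line_histogram(lines)

for lineno, counts in enumerate(hist):
    # Sort by item so the output order is stable.
    print(lineno, sorted(counts.items()))
```

As with the composite nc($lineno,$item) keys, an item that never occurs on a line simply has no entry (a Counter reports it as 0).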

How Can I read TCL file value by value

I have a file that I want to read in Tcl, treating each value as an array element. My file looks like this:
PUx(1) 1 2 3 4 5
PUx(2) 1 2 3 4 5
PUx(3) 1 2 3 4 5
PUx(4) 1 2 3 4 5
PUx(5) 1 2 3 4 5
So, I want to get, for example, the value of PUx(1) one by one and add it to the variable.
As Donal pointed out, it all depends on the actual format. But if the example data is representative, this might work for you:
set content {PUx(1) 1 2 3 4 5
PUx(2) 1 2 3 4 5
PUx(3) 1 2 3 4 5
PUx(4) 1 2 3 4 5
PUx(5) 1 2 3 4 5}
foreach line [split $content \n] {
set values [lassign $line varName]
set $varName $values
}
parray PUx
lassign assumes the line-wise data to represent a valid Tcl list. This might or might not be the case for you.
Update
You might want to re-organize your dataset, this would allow you to use a Tcl array idiom to access "rows" and "columns" of data in a straightforward manner, more or less:
set content {PUx(1,1) 1
PUx(1,2) 2
PUx(1,3) 3
PUx(1,4) 4
PUx(1,5) 5
PUx(2,1) 1
PUx(2,2) 2
PUx(2,3) 3
PUx(2,4) 4
PUx(2,5) 5}
foreach line [split $content \n] {
set values [lassign $line varName]
set $varName $values
}
parray PUx
# first column: *,1
foreach {k v} [array get PUx *,1] {
puts $v
}
# first row: 1,*
foreach {k v} [array get PUx 1,*] {
puts $v
}
Provided that your main concern is how to compute the sum over a list of elements, here are three options:
proc lsum1 {x} {
set r 0
foreach i $x {
incr r $i
}
return $r
}
proc lsum2 {x} {
expr [join $x " + "]
}
proc lsum3 {x} {
::tcl::mathop::+ {*}$x
}
set x {1 2 3 4 5}
lsum1 $x
lsum2 $x
lsum3 $x
lsum1 and lsum3 are preferable. lsum2 is the literal translation of what you describe as your "problem", at least in my reading. You may also want to check the Tcl wiki, which gives some background on the details of lsum3.
This can be easily integrated with reading your data, as shown in my first answer:
lsum1 $PUx(1)
lsum3 $PUx(1)

How to make a copy of a nested array in an array of array structure

I'm trying to make a copy of a nested array, and it appears that I continue to make a reference with my attempts.
To be more specific I am trying to have an array of arrays wherein each sub array builds upon the previous array. Here is my attempt:
#!/usr/bin/perl -w
use strict;
use warnings;
my @aoa=[(1)];
my $i = 2;
foreach (@aoa){
    my $temp = $_;#copy current array into $temp
    push $temp, $i++;
    push @aoa, $temp;
    last if $_->[-1] == 5;
}
#print contents of @aoa
foreach my $row (@aoa){
    foreach my $ele (@$row){
        print "$ele ";
    }
    print "\n";
}
My output is:
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
And I want/expect it to be:
1
1 2
1 2 3
1 2 3 4
1 2 3 4 5
I'm assuming my problems lies with how I am assigning $temp, please let me know if this is not the case. Any help is appreciated.
Create a new array with my, copy the contents of the array to be built upon, then add to it.
Keeping it as close as possible to your code
foreach (@aoa) {
    last if $_->[-1] == 5;
    my @temp = @$_; #copy current array into @temp
    push @temp, $i++;
    push @aoa, \@temp;
}
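The reference-versus-copy distinction is the same in other languages with reference semantics. A short Python sketch of the fix (copy first, then append) follows; build_prefixes is a hypothetical name, and the loop is written with while rather than by growing a list during iteration.

```python
def build_prefixes(limit):
    """Build [[1], [1, 2], ..., [1..limit]], copying the previous row each time."""
    rows = [[1]]
    while rows[-1][-1] < limit:
        # list(...) makes a new list, so the earlier row is not modified.
        new_row = list(rows[-1])
        new_row.append(new_row[-1] + 1)
        rows.append(new_row)
    return rows

rows = build_prefixes(5)
for row in rows:
    print(" ".join(str(n) for n in row))
```

Without the list(...) copy, every element of rows would be the same list object, which is the effect seen in the question's output.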

Fetching indices of a text file from another text file

The title may not be so descriptive. Let me explain:
I have a file (Say File 1) having some numbers [delimited by a space]. see here,
1 2 3 4 5
1 2 8 4 5 6 7
1 9 3 4 5 6 7 8
..... n lines (length of each line varies).
I have another file (Say File 2) having some numbers [delimited by a tab]. see here,
1 1 1 1 1 1 0 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 0 1 1 1 1 1
1 1 1 1 1 1 0 1 1 1 1 1
..... m lines (length of each line fixed).
For line 1 of File 2, I want the sum of the values at positions 1 2 3 4 5 (File 1, line 1),
then the sum of the values at the positions listed on File 1, line 2, and so on.
In other words, for each line of File 2, I want one sum per line of File 1, taken over the positions that line lists.
It will look like:
5 6 6 …n columns (File 1)
1 8 3
9 8 4
… m rows (File 2)
I did this by the following code:
open( FH1, "File1.txt" );
@index = <FH1>;
open( FH2, "File2.txt" );
@matrix = <FH2>;
open( OUTPUT, ">sum.txt" );
foreach $xx (@matrix) {
@k1 = split( /\t/, "$xx" );
foreach $yy (@index) {
@k2 = split( / /, "$yy" );
$ssum = 0;
foreach $zz (#k2) {
$zz1 = $zz - 1;
if ( $k1[$zz1] == 1 ) {
$ssum++;
}
}
printf OUTPUT"$ssum\t";
$ssum = 0;
}
print OUTPUT"\n";
}
close FH1;
close FH2;
close OUTPUT;
It works absolutely fine, except that the time required is enormous for large files (e.g. 1000 lines in File 1 x 25000 lines in File 2 takes 8 minutes).
My data may be four times larger than this example, and that is unacceptable for my users.
How can I accomplish this in much less time, or with a different approach?
Always include use strict; and use warnings; in every Perl script.
You can simplify your script by not processing the first file multiple times. Also, your coding style is very outdated. You could use some lessons from the Modern Perl book by chromatic.
The following is your script simplified to take advantage of more modern style and techniques. Note, that it currently loads the file data from inside the script instead of external sources:
use strict;
use warnings;
use autodie;
use List::Util qw(sum);
my @indexes = do {
    #open my $fh, '<', "File1.txt";
    open my $fh, '<', \ "1 2 3 4 5\n1 2 8 4 5 6 7\n1 9 3 4 5 6 7 8\n";
    map { [map {$_ - 1} split ' '] } <$fh>
};
#open my $infh, '<', "File2.txt";
my $infh = \*DATA;
#open my $outfh, '>', "sum.txt";
my $outfh = \*STDOUT;
while (<$infh>) {
    my @vals = split ' ';
    print $outfh join(' ', map {sum(@vals[@$_])} @indexes), "\n";
}
}
__DATA__
1 1 1 1 1 1 0 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 0 1 1 1 1 1
1 1 1 1 1 1 0 1 1 1 1 1
Outputs:
5 6 7
5 7 8
5 6 7
5 6 7
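The key idea, parsing the index file once into zero-based positions and then reusing it for every row of the matrix, can also be sketched in Python. This is only an illustration of the same approach with the data inlined; index_sums is a hypothetical name and no timing claim is made for it.

```python
def index_sums(index_lines, matrix_lines):
    """For each matrix row, sum the values at the positions listed on each index line."""
    # Parse the index file once, converting 1-based positions to 0-based.
    indexes = [[int(tok) - 1 for tok in line.split()] for line in index_lines]
    result = []
    for line in matrix_lines:
        vals = [int(tok) for tok in line.split()]
        result.append([sum(vals[i] for i in idx) for idx in indexes])
    return result

# Same sample data as in the Perl script above.
index_lines = ["1 2 3 4 5", "1 2 8 4 5 6 7", "1 9 3 4 5 6 7 8"]
matrix_lines = [
    "1 1 1 1 1 1 0 1 1 1 1 1",
    "1 1 1 1 1 1 1 1 1 1 1 1",
    "1 1 1 1 1 1 0 1 1 1 1 1",
    "1 1 1 1 1 1 0 1 1 1 1 1",
]

for sums in index_sums(index_lines, matrix_lines):
    print(" ".join(str(s) for s in sums))
```

Calling split() with no argument splits on any whitespace, so the same code handles both the space-delimited and the tab-delimited file.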
