So what I'm trying to do in my code is basically read in a spreadsheet that has this format
username, lastname, firstname, x1, x2, x3, x4
user1, dudette, mary, 7, 2, , 4
user2, dude, john, 6, 2, 4,
user3, dudest, rad,
user4, dudaa, pad, 3, 3, 5, 9
Basically, it has usernames, the names those usernames correspond to, and values for each x. What I want to do is read this in from a csv file, then find all of the blank fields and fill them in with 5s. My approach was to read in the whole array and then substitute all null fields with 0s. This is the code so far...
#!/bin/bash
while IFS=$'\t' read -r -a myarray
do
echo $myarray
done < something.csv
for e in ${myarray[@]}
do
echo 'Can you see me #1?'
if [[-z $e]]
echo 'Can you see me #2?'
sed 's//0'
fi
done
The code isn't really changing my csv file at all. EDITED NOTE: the data is all comma separated.
What I've figured out so far:
Okay, the 'Can you see me' lines and the echo myarray are test code. I wanted to see if the whole csv file was being read in via echo myarray (which, according to the output, seems to be the case). It doesn't seem, however, that the code is running through the for loop at all, which I can't understand.
Help is much appreciated! :)
The format of your .csv file is not comma separated, it's left aligned with a non-constant number of whitespace characters separating each field. This makes it difficult to be accurate when trying to find and replace empty columns which are followed by non-empty columns.
Here is a Bash only solution that would be entirely accurate if the fields were comma separated.
#!/bin/bash
n=5
while IFS=, read -r username lastname firstname x1 x2 x3 x4; do
! [[ $x1 ]] && x1=$n
! [[ $x2 ]] && x2=$n
! [[ $x3 ]] && x3=$n
! [[ $x4 ]] && x4=$n
echo $username,$lastname,$firstname,$x1,$x2,$x3,$x4
done < something.csv > newfile.csv && mv newfile.csv something.csv
Output:
username,lastname,firstname,x1,x2,x3,x4
user1,dudette,mary,7,2,5,4
user2,dude,john,6,2,4,5
user3,dudest,rad,5,5,5,5
user4,dudaa,pad,3,3,5,9
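The stray spaces around the commas in the sample would end up inside the fields, though. If the real file looks like that, one option (just a sketch) is to strip all whitespace first and run the loop above on the cleaned copy; that is safe here only because no field legitimately contains spaces:
sed 's/[[:space:]]//g' something.csv > cleaned.csv
A later answer below uses the same trick inline.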
I realize you asked for bash, but if you don't mind perl in lieu of bash, perl is a great tool for record-oriented files.
#!/usr/bin/perl
open (FILE, 'something.csv') or die "cannot open input: $!";
open (OUTFILE, '>outdata.txt') or die "cannot open output: $!";
while(<FILE>) {
chomp;
($username,$lastname,$firstname,$x1,$x2,$x3,$x4) = split("\t");
$x1 = 5 if $x1 eq "";
$x2 = 5 if $x2 eq "";
$x3 = 5 if $x3 eq "";
$x4 = 5 if $x4 eq "";
print OUTFILE "$username\t$lastname\t$firstname\t$x1\t$x2\t$x3\t$x4\n";
}
close (FILE);
close (OUTFILE);
exit;
This reads your infile, something.csv which is assumed to have tab-separated fields, and writes a new file outdata.txt with the re-written records.
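Given the edited note that the data is actually comma separated, the same script should only need the separator swapped; a hedged variant of the split and print lines (the -1 limit keeps trailing empty fields like user3's, and the substitution trims the stray spaces):
($username,$lastname,$firstname,$x1,$x2,$x3,$x4) = split(",", $_, -1); # -1 keeps trailing empty fields
s/^\s+|\s+$//g foreach ($username,$lastname,$firstname,$x1,$x2,$x3,$x4); # trim stray spaces
# ... the four "$xN = 5 if $xN eq ''" lines are unchanged ...
print OUTFILE "$username,$lastname,$firstname,$x1,$x2,$x3,$x4\n";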
I'm sure there's a better or more idiomatic solution, but this works:
#!/bin/bash
infile=bashcsv.csv # Input filename
declare -i i # Iteration variable
declare -i defval=5 # Default value for missing cells
declare -i n_cells=7 # Total number of cells per line
declare -i i_start=3 # Starting index for numeric cells
declare -a cells # Array variable for cells
# We'd usually save/restore the old value of IFS, but there's no need here:
IFS=','
# Convenience function to bail/bug out on error:
bail () {
echo "$@" >&2
exit 1
}
# Strip whitespace and replace empty cells with `$defval`:
sed -s 's/[[:space:]]//g' $infile | while read -a cells; do
# Skip empty/malformed lines:
if [ ${#cells[*]} -lt $i_start ]; then
continue
fi
# If there are fewer cells than $n_cells, pad to $n_cells
# with $defval; if there are more, bail:
if [ ${#cells[*]} -lt $n_cells ]; then
for ((i=${#cells[*]}; $i<$n_cells; i++)); do
cells[$i]=$defval
done
elif [ ${#cells[*]} -gt $n_cells ]; then
bail "Too many cells."
fi
# Replace empty cells with default value:
for ((i=$i_start; $i<$n_cells; i++)); do
if [ -z "${cells[$i]}" ]; then
cells[$i]=$defval
fi
done
# Print out whole line, interpolating commas back in:
echo "${cells[*]}"
done
Here's a gratuitous awk one-liner that gets the job done:
awk -F'[[:space:]]*,[[:space:]]*' 'BEGIN{OFS=","} /,/ {NF=7; for(i=4;i<=7;i++) if($i=="") $i=5; print}' infile.csv
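The field-separator regex absorbs the whitespace around each comma, /,/ skips lines with no commas at all, and assigning NF=7 makes GNU awk pad short records with empty fields before the loop replaces every empty x column with 5. Given the sample data above it should print:
username,lastname,firstname,x1,x2,x3,x4
user1,dudette,mary,7,2,5,4
user2,dude,john,6,2,4,5
user3,dudest,rad,5,5,5,5
user4,dudaa,pad,3,3,5,9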
Related
I have a text file with the following:
Paige
Buckley
Govan
Mayer
King
Harrison
Atkins
Reinhardt
Wilson
Vaughan
Sergovia
Tarrega
My goal is to create an array for each set of names, then iterate through the first array of values, move on to the second, and lastly the third. Each set is separated by a blank line in the text file. Help with code or logic is much appreciated!
So far I have the following. I am unsure of the logic moving forward when I reach a line break. My research also suggests that I can use readarray -d.
#!/bin/bash
my_array=()
while IFS= read -r line || [[ "$line" ]]; do
if [[ $line -eq "" ]];
.
.
.
arr+=("$line") # i know this adds the value to the array
done < "$1"
printf '%s\n' "${my_array[@]}"
desired output:
array1 = (Paige Buckley Govan Mayer King)
array2 = (Harrison Atkins Reinhardt Wilson)
array3 = (Vaughan Sergovia Tarrega)
#then loop through each array one after the other.
Bash has no array-of-arrays, so you have to represent it in another way.
You could leave the newlines and have an array of newline separated elements:
array=()
elem=""
while IFS= read -r line; do
if [[ "$line" != "" ]]; then
elem+="${elem:+$'\n'}$line" # accumulate lines in elem
else
array+=("$elem") # flush elem as array element
elem=""
fi
done
if [[ -n "$elem" ]]; then
array+=("$elem") # flush the last elem
fi
# iterate over array
for ((i=0;i<${#array[@]};++i)); do
# each array element is newline separated items
readarray -t elem <<<"${array[i]}"
printf 'array%d = (%s)\n' "$((i+1))" "${elem[*]}"
done
You could simplify the loop by joining each group with some unique character via sed, for example:
readarray -d '#' -t array < <(sed -z 's/\n\n/#/g' file)
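One caveat (this assumes GNU sed for -z, and that # never occurs in the data): the last group keeps its trailing newline, which shows up as an empty item when the group is later unpacked with readarray, so it may want a trim:
array[-1]=${array[-1]%$'\n'} # drop the trailing newline on the last group (bash 4.3+ for negative indices)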
But overall, this awk generates the same output:
awk -v RS= -v FS='\n' '{
printf "array%d = (", NR;
for (i=1;i<=NF;++i) printf "%s%s", $i, i==NF?"":" ";
printf ")\n"
}'
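Run against the same names file (with the filename appended to the command), it should print:
array1 = (Paige Buckley Govan Mayer King)
array2 = (Harrison Atkins Reinhardt Wilson)
array3 = (Vaughan Sergovia Tarrega)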
Using a nameref:
#!/usr/bin/env bash
declare -a array1 array2 array3
declare -n array=array$((n=1))
while IFS= read -r line; do
test "$line" = "" && declare -n array=array$((n=n+1)) || array+=("$line")
done < "$1"
declare -p array1 array2 array3
Called with:
bash test.sh data
# result
declare -a array1=([0]="Paige" [1]="Buckley" [2]="Govan" [3]="Mayer" [4]="King")
declare -a array2=([0]="Harrison" [1]="Atkins" [2]="Reinhardt" [3]="Wilson")
declare -a array3=([0]="Vaughan" [1]="Sergovia" [2]="Tarrega")
Assumptions:
blank lines are truly blank (ie, no need to worry about any white space on said lines)
could have consecutive blank lines
names could have embedded white space
the number of groups could vary and won't always be 3 (as with the sample data provided in the question)
OP is open to using a (simulated) 2-dimensional array as opposed to a (variable) number of 1-dimensional arrays
My data file:
$ cat names.dat
<<< leading blank lines
Paige
Buckley
Govan
Mayer
King Kong
<<< consecutive blank lines
Harrison
Atkins
Reinhardt
Wilson
Larry
Moe
Curly
Shemp
Vaughan
Sergovia
Tarrega
<<< trailing blank lines
One idea that uses a couple arrays:
array #1: associative array - the previously mentioned (simulated) 2-dimensional array with the index - [x,y] - where x is a unique identifier for a group of names and y is a unique identifier for a name within a group
array #2: 1-dimensional array to keep track of max(y) for each group x
Loading the arrays:
unset names max_y # make sure array names are not already in use
declare -A names # declare associative array
x=1 # init group counter
y=0 # init name counter
max_y=() # initialize the max(y) array
inc= # clear increment flag
while read -r name
do
if [[ "${name}" = '' ]] # if we found a blank line ...
then
[[ "${y}" -eq 0 ]] && # if this is a leading blank line then ...
continue # ignore and skip to the next line
inc=y # set flag to increment 'x'
else
[[ "${inc}" = 'y' ]] && # if increment flag is set ...
max_y[${x}]="${y}" && # make note of max(y) for this 'x'
((x++)) && # increment 'x' (group counter)
y=0 && # reset 'y'
inc= # clear increment flag
((y++)) # increment 'y' (name counter)
names[${x},${y}]="${name}" # save the name
fi
done < names.dat
max_y[${x}]="${y}" # make note of the last max(y) value
Contents of the array:
$ typeset -p names
declare -A names=([1,5]="King Kong" [1,4]="Mayer" [1,1]="Paige" [1,3]="Govan" [1,2]="Buckley" [3,4]="Shemp" [3,3]="Curly" [3,2]="Moe" [3,1]="Larry" [2,4]="Wilson" [2,2]="Atkins" [2,3]="Reinhardt" [2,1]="Harrison" [4,1]="Vaughan" [4,2]="Sergovia" [4,3]="Tarrega" )
$ for (( i=1; i<=${x}; i++ ))
do
for (( j=1; j<=${max_y[${i}]}; j++ ))
do
echo "names[${i},${j}] : ${names[${i},${j}]}"
done
echo ""
done
names[1,1] : Paige
names[1,2] : Buckley
names[1,3] : Govan
names[1,4] : Mayer
names[1,5] : King Kong
names[2,1] : Harrison
names[2,2] : Atkins
names[2,3] : Reinhardt
names[2,4] : Wilson
names[3,1] : Larry
names[3,2] : Moe
names[3,3] : Curly
names[3,4] : Shemp
names[4,1] : Vaughan
names[4,2] : Sergovia
names[4,3] : Tarrega
I'd like to either process one row of a csv file or the whole file.
The variables are set by the header row, which may be in any order.
There may be up to 12 columns, but only 3 or 4 variables are needed.
The source files might be in either format, and all I want from both is lastname and country. I know of many different ways and tools to do it if the columns were fixed and always in the same order. But they're not.
examplesource.csv:
firstname,lastname,country
Linus,Torvalds,Finland
Linus,van Pelt,USA
examplesource2.csv:
lastname,age,country
Torvalds,66,Finland
van Pelt,7,USA
I have cobbled together something from various Stackoverflow postings which looks a bit voodoo but seems fairly robust. I say "voodoo" because shellcheck complains that, for example, "firstname is referenced but not assigned". And yet it prints it.
#!/bin/bash
#set the field separator to newline
IFS=$'\n'
#split/transpose the first-line column titles to rows
COLUMNAMES=$(head -n1 examplesource.csv | tr ',' '\n')
#set an array and read the columns into it
columns=()
for line in $COLUMNAMES; do
columns+=("$line")
done
#reset the field separator
IFS=","
#using -p here to debug in output
declare -ap columns
#read from line 2 onwards
sed 1d examplesource.csv | while read "${columns[@]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
In the case of looping through everything, it works perfectly for my needs and I can process within the "while read" loop. But to make it cleaner, I'd rather pass the current element(?) to an external function to process (not just echo).
And if I only wanted the array (current row) belonging to "Torvalds", I cannot find how to access that or even get its current index, eg: "if $wantedname && $lastname == $wantedname then call function with currentrow only otherwise loop all rows and call function".
I know there aren't multidimensional associative arrays in bash from reading
Multidimensional associative arrays in Bash and I've tried to understand arrays from
https://opensource.com/article/18/5/you-dont-know-bash-intro-bash-arrays
Is it clear what I'm trying to achieve in a bash-only manner and does the question make sense?
Many thanks.
Let's shorten your function. Don't read the source twice (first with head, then with sed). You can do that once. Also the whole array reading can be shortened to just IFS=',' COLUMNAMES=($(head -n1 source.csv)). Here's a shorter version:
#!/bin/bash
cat examplesource.csv |
{
IFS=',' read -r -a columnnames
while IFS=',' read -r "${columnnames[@]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
}
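One caveat with the cat | { ... } form: the pipe runs the group in a subshell, so anything you set inside it is gone afterwards. If you need the variables later, a redirection avoids both the cat and the subshell:
#!/bin/bash
{
IFS=',' read -r -a columnnames
while IFS=',' read -r "${columnnames[@]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
} < examplesource.csv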
If you want to parse both files at the same time, i.e. join them, nothing simpler ;). First, let's number lines in the first file using nl -w1 -s,. Then we use join to join the files on the name of the people. Remember that join input needs to be sorted on the proper fields. Then we sort the output with sort using the number from the first file. After that we can read all the data just like that:
# join the files, using `,` as the separator
# on the 3rd field from the first file and the first field from the second file
# the output should be first the fields from the first file, then the second file
# the country (field 1.4) is duplicated in 2.3, so just omitting it.
join -t, -13 -21 -o 1.1,1.2,1.3,2.2,2.3 <(
# number the lines in the first file
<examplesource.csv nl -w1 -s, |
# there is one field more, sort using the 3rd field
sort -t, -k3
) <(
# sort the second file using the first field
<examplesource2.csv sort -t, -k1
) |
# sort the output using the numbers from the first file
sort -t, -k1 -n |
# well, remove the numbers
cut -d, -f2- |
# just a normal read follows
{
# read the headers
IFS=, read -r -a names
while IFS=, read -r "${names[@]}"; do
# finally our output!
echo "${firstname} ${lastname} is from ${country} and is so many ${age} years old!"
done
}
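With the two example files above, this should print:
Linus Torvalds is from Finland and is so many 66 years old!
Linus van Pelt is from USA and is so many 7 years old!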
Tested on tutorialspoint.
GNU Awk has multidimensional arrays. It also has array sorting mechanisms, which I have not used here. Please comment if you are interested in pursuing this solution further. The following depends on consistent key names and line numbers across input files, but can handle an arbitrary number of fields and input files.
$ gawk -V |gawk NR==1
GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2)
$ gawk -F, '
FNR == 1 {for(f=1;f<=NF;f++) Key[f]=$f}
FNR != 1 {for(f=1;f<=NF;f++) People[FNR][Key[f]]=$f}
END {
for(Person in People) {
for(attribute in People[Person])
output = output FS People[Person][attribute]
print substr(output,2)
output=""
}
}
' file*
66,Finland,Linus,Torvalds
7,USA,Linus,van Pelt
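Since for (x in array) scans in unspecified order (hence the shuffled columns above), the sorting mechanism mentioned earlier is PROCINFO["sorted_in"]; a sketch of the same END block with deterministic order:
END {
PROCINFO["sorted_in"] = "@ind_num_asc"     # records in input-line order
for(Person in People) {
PROCINFO["sorted_in"] = "@ind_str_asc" # attributes alphabetically
for(attribute in People[Person])
output = output FS People[Person][attribute]
print substr(output,2)
output=""
}
}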
A bash solution takes a bit more work than an awk solution, but if this is an exercise in what bash provides, it has all you need: determine the column holding the last name from the first line of input, then output the lastname field from the remaining lines.
An easy approach is simply to read each line into a normal array and then loop over the elements of the first line to locate the column "lastname" appears in, saving that column number in a variable. You can then read each of the remaining lines the same way and output the lastname field by outputting the element at the saved column.
A short example would be:
#!/bin/bash
col=0 ## column count for lastname
cnt=0 ## line count
while IFS=',' read -a arr; do ## read each line into array
if [ "$cnt" -eq '0' ]; then ## test if line-count is zero
for ((i = 0; i < "${#arr[@]}"; i++)); do ## loop for lastname
[ "${arr[i]}" = 'lastname' ] && ## test for lastname
{ col=$i; break; } ## if found, save index in col, break loop
done
fi
[ "$cnt" -gt '0' ] && ## if not headder row
echo "line $cnt lastname: ${arr[col]}" ## output lastname variable
((cnt++)) ## increment linecount
done < "$1"
Example Use/Output
Using your two data files, the output would be:
$ bash readcsv.sh ex1.csv
line 1 lastname: Torvalds
line 2 lastname: van Pelt
$ bash readcsv.sh ex2.csv
line 1 lastname: Torvalds
line 2 lastname: van Pelt
A similar implementation using awk would be:
awk -F, -v col=1 '
NR == 1 {
for (i = 1; i <= NF; i++)
if ($i == "lastname") { col = i; break }
}
NR > 1 {
print "lastname:", $col
}
' ex1.csv
Example Use/Output
$ awk -F, -v col=1 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "lastname") { col = i; break } } NR > 1 { print "lastname:", $col }' ex1.csv
lastname: Torvalds
lastname: van Pelt
(output is the same for either file)
Thank you all. I've taken a couple of bits from two answers: I used the answer from David to find the number of the row, then I used the elegantly simple solution from Kamil to loop through what I need.
The result is exactly what I wanted. Thank you all.
$ readexample.sh examplesource.csv "Torvalds"
Everyone
Linus Torvalds is from Finland
Linus van Pelt is from USA
now just Torvalds
Linus Torvalds is from Finland
And this is the code - now that you know what I want it to do, if anyone can see any dangers or improvements, please let me know as I'm always learning. Thanks.
#!/bin/bash
FILENAME="$1"
WANTED="$2"
printDetails() {
SINGLEROW="$1"
[[ ! -z "$SINGLEROW" ]] && opt=("--expression" "1p" "--expression" "${SINGLEROW}p") || opt=("--expression" "1p" "--expression" "2,199p")
sed -n "${opt[@]}" "$FILENAME" |
{
IFS=',' read -r -a columnnames
while IFS=',' read -r "${columnnames[@]}"; do
echo "${firstname} ${lastname} is from ${country}"
done
}
}
findRow() {
col=0 ## column count for lastname
cnt=0 ## line count
while IFS=',' read -a arr; do ## read each line into array
if [ "$cnt" -eq '0' ]; then ## test if line-count is zero
for ((i = 0; i < "${#arr[@]}"; i++)); do ## loop for lastname
[ "${arr[i]}" = 'lastname' ] && ## test for lastname
{
col=$i
break
} ## if found, save index in col, break loop
done
fi
[ "$cnt" -gt '0' ] && ## if not headder row
if [ "${arr[col]}" == "$1" ]; then
echo "$cnt" ## output lastname variable
fi
((cnt++)) ## increment linecount
done <"$FILENAME"
}
echo "Everyone"
printDetails
if [ ! -z "${WANTED}" ]; then
echo -e "\nnow just ${WANTED}"
row=$(findRow "${WANTED}")
printDetails "$((row + 1))"
fi
I have a grep output and I'm trying to make an associative array from the output that I get.
Here is my grep output:
"HardwareSerialNumber": "123456789101",
"DeviceId": "devid1234",
"HardwareSerialNumber": "111213141516",
"DeviceId": "devid5678",
I want to use that output to define an associative array, like this:
array[123456789101]=devid1234
array[111213141516]=devid5678
Is that possible? I'm new at making arrays. I hope someone can help me with my problem.
Either pipe your grep output to a helper script with a while loop containing a simple "0/1" toggle that reads two lines at a time, taking the last field of each to fill your array, e.g.
#!/bin/bash
declare -A array
declare -i n=0
arridx=
while read -r label value; do # read 2 fields
if [ "$n" -eq 0 ]
then
arridx="${value:1}" # strip 1st and lst 2 chars
arridx="${arridx:0:(-2)}" # save in arridx (array index)
((n++)) # increment toggle
else
arrval="${value:1}" # strip 1st and lst 2 chars
arrval="${arrval:0:(-2)}" # save in arrval (array value)
array[$arridx]="$arrval" # assign to associative array
n=0 # zero toggle
fi
done
for i in "${!array[@]}"; do # output array
echo "array[$i] ${array[$i]}"
done
Or you can use process substitution containing the grep command within the script to do the same thing, e.g.
done < <( your grep command )
You can also add a check under the else clause, if [[ $label =~ DeviceId ]], to validate you are on the right line and catch any variation in the grep output content; see the sketch below.
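A hedged sketch of that guard; only the else branch changes, and the regex is just one hypothetical way to match the label:
else
if [[ $label =~ DeviceId ]]; then # only store the pair on a DeviceId line
arrval="${value:1}" # strip 1st and last 2 chars
arrval="${arrval:0:(-2)}"
array[$arridx]="$arrval" # assign to associative array
fi
n=0 # zero toggle either way
fi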
Example Input
$ cat dat/grepout.txt
"HardwareSerialNumber": "123456789101",
"DeviceId": "devid1234",
"HardwareSerialNumber": "111213141516",
"DeviceId": "devid5678",
Example Use/Output
$ cat dat/grepout.txt | bash parsegrep2array.sh
array[123456789101] devid1234
array[111213141516] devid5678
Parsing out the values is easy, and once you have them you can certainly use those values to build up an array. The trickiest part comes from the fact that you need to combine input from separate lines. Here is one approach; note that this script is verbose on purpose, to show what's going on; once you see what's happening, you can eliminate most of the output:
so.input
"HardwareSerialNumber": "123456789101",
"DeviceId": "devid1234",
"HardwareSerialNumber": "111213141516",
"DeviceId": "devid5678",
so.sh
#!/bin/bash
declare -a hardwareInfo
# read from the file named in $1, or stdin if no argument was given
exec < "${1:-/dev/stdin}"
while true; do
# read in two lines of input
# if either line is the last one, we don't have enough input to proceed
read lineA
# if EOF or empty line, exit
if [[ "$lineA" == "" ]]; then break; fi
read lineB
# if EOF or empty line, exit
if [[ "$lineB" == "" ]]; then break; fi
echo "$lineA"
echo "$lineB"
hwsn=$lineA
hwsn=${hwsn//HardwareSerialNumber/}
hwsn=${hwsn//\"/}
hwsn=${hwsn//:/}
hwsn=${hwsn//,/}
echo $hwsn
# some checking could be done here to test that the value is numeric
devid=$lineB
devid=${devid//DeviceId/}
devid=${devid//\"/}
devid=${devid//:/}
devid=${devid//,/}
echo $devid
# some checking could be done here to make sure the value is valid
# populate the array
hardwareInfo[$hwsn]=$devid
done
# spacer, for readability of the output
echo
# display the array; in your script, you would do something different and useful
for key in "${!hardwareInfo[@]}"; do echo $key --- ${hardwareInfo[$key]}; done
cat so.input | ./so.sh
"HardwareSerialNumber": "123456789101",
"DeviceId": "devid1234",
123456789101
devid1234
"HardwareSerialNumber": "111213141516",
"DeviceId": "devid5678",
111213141516
devid5678
111213141516 --- devid5678
123456789101 --- devid1234
I created the input file so.input just for convenience. You would probably pipe your grep output into the bash script, like so:
grep-command | ./so.sh
EDIT #1: There are lots of choices for parsing out the key and value from the strings fed in by grep; the answer from @David C. Rankin shows another way. The best way depends on what you can rely on about the content and structure of the grep output.
There are also several choices for reading two separate lines that are related to each other; David's "toggle" approach is also good, and commonly used; I considered it myself, before going with "read two lines and stop if either is blank".
EDIT #2: I see declare -A in David's answer and in examples on the web; I used declare -a because that's what my version of bash wants (I'm using a Mac). So, just be aware that there can be differences.
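For reference, associative arrays (declare -A) arrived in bash 4.0, and the stock macOS /bin/bash is still 3.2, which is the likely source of that difference. A quick probe, as a sketch:
if declare -A _aa_probe 2>/dev/null; then # -A fails on bash < 4.0
echo "associative arrays available ($BASH_VERSION)"
else
echo "no associative arrays ($BASH_VERSION); indexed arrays only"
fi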
So for some reason my script isn't adding the extra data to the arrays inside the while loops. The data is there and correctly matching, but not appending to the arrays.
The input file's contents are:
name:passwordhash
while read -r lines;
do
HASH_COMMON=$(echo -n "$lines" | sha256sum | awk '{print $1}');
while read -r line ;
do
# change seperator to colon
IFS=: read -r NAME PASSWORD <<< "$line"
# compare hashed password against hashed password list
if [ "$HASH_COMMON" == "$PASSWORD" ] ;
then
CRACKED_RESULT+=("$lines")
HASH_RESULT+=("$HASH_COMMON")
NAME_RESULT+=("$NAME")
PASSWORD_RESULT+=("$PASSWORD")
fi
done < "$PASSED_FILE"
done < "$COMMON_PW"
The output, if I print the arrays outside the loop with a for loop, is just a single pass of data:
for i in "${NAME_RESULT[#]}"
do
echo "$i"
echo ""
done
Then I am comparing the arrays later with a for loop as well:
for isCracked in "${CRACKED_RESULT[@]}";
do
for checkCrack in "${PASSWORD_RESULT[@]}";
do
one="${HASH_RESULT[isCracked]}"
two="${PASSWORD_RESULT[checkCracked]}"
if [ "$one" == "$two" ];
then
RESULT_ARRAY+=("USERNAME: ${NAME_RESULT[checkCracked]} PASSWORD: ${CRACKED_RESULT[checkCracked]}")
fi
checkCrack=+1
done
isCracked=+1
done
Posted my code below, wondering if I can search one array for a match... or if there's a way I can search a unix file inside of an argument.
#!/bin/bash
# store words in file
cat $1 | ispell -l > file
# move words in file into array
array=($(< file))
# remove temp file
rm file
# move already checked words into array
checked=($(< .spelled))
# print out words & ask for corrections
for ((i=0; i<${#array[@]}; i++ ))
do
if [[ ! ${array[i]} = ${checked[@]} ]]; then
read -p "'${array[i]}' is mispelled. Press Enter to keep
this spelling, or type a correction here: " input
if [[ ! $input = "" ]]; then
correction[i]=$input
else
echo ${array[i]} >> .spelled
fi
fi
done
echo "MISPELLED: CORRECTIONS:"
for ((i=0; i<${#correction[@]}; i++ ))
do
echo ${array[i]} ${correction[i]}
done
Otherwise, I would need to write a for loop to check each array index, and then somehow make a decision statement on whether to go through the loop and print/take input.
The usual shell incantation to do this is:
cat "$1" | ispell -l | while read -r ln
do
# read the correction from the terminal, not from the pipe feeding the loop
read -p "$ln is misspelled. Enter correction: " corrected </dev/tty
if [ -n "$corrected" ] ; then
ln=$corrected
fi
echo "$ln"
done >correctedwords.txt
The while;do;done is kind of like a function and you can pipe data into and out of it.
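As a tiny illustration of that point, both ends of the loop can be piped, e.g.:
printf '%s\n' carrot apple banana | while read -r item; do
echo "fruit: $item"
done | sort
which prints the three "fruit: ..." lines in sorted order.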
P.S. I didn't test the above code so there may be syntax errors