First line of every file in a new file

How can I get the first line of EVERY file in a directory and save them all in a new file?
#!/bin/bash
rm FIRSTLINE
for file in "$(find $1 -type f)";
do
head -1 $file >> FIRSTLINE
done
cat FIRSTLINE
This is my bash script, but when I run it and open the file FIRSTLINE,
I see this:
==> 'path of the file' <==
'first line' of the file
and the same for every file found under my argument.
Does anybody have a solution?

find . -type f -exec head -1 \{\} \; > YOURFILE
might work for you.

The problem is that you've quoted the output of find, so it gets treated as a single string and the for loop runs only once, with a single argument containing all the files. That means you run head -1 file1 file2 file3 file4 ... and when given multiple files, head prints the ==> file1 <== headers.
To fix it, remove the double quotes around the command substitution, which ensures the for loop runs once for each file, as intended. Also, the semicolon after the command substitution is unnecessary.
#!/bin/bash
rm FIRSTLINE
for file in $(find $1 -type f)
do
head -1 $file >> FIRSTLINE
done
cat FIRSTLINE
This has some style issues, though: do you really need to write to a file and then cat the file to stdout? You could just print the output directly:
#!/bin/bash
for file in $(find $1 -type f)
do
head -1 $file
done
Personally I'd write it like this:
find "$1" -type f | xargs -L1 head -1
or if you need the output in the file and printed to stdout:
find "$1" -type f | xargs -L1 head -1 | tee FIRSTLINE
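Note that both the unquoted $(find ...) and plain xargs split on whitespace, so names with spaces break the loops above. A null-delimited variant (a sketch, assuming GNU find's -print0 and xargs -0) is safer:
# -print0/-0 keep names with spaces or newlines intact;
# -n 1 runs head once per file, so no ==> file <== headers appear.
find "$1" -type f -print0 | xargs -0 -n 1 head -1 | tee FIRSTLINE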

for file in $(find $1 -type f); do echo '';
echo $file;
head -n 4 $file;
done

For gzip files, for instance:
for file in *.gz; do gzcat "$file" | head -n 1; done > toto.txt

Related

Script terminates prematurely after do loop. Last "echo" is not executed.

I am looking to execute the script below. The issue I am encountering is that it will not execute anything after the do loop. It doesn't matter what I have after the loop, it never executes, so I am definitely missing something somewhere.
Also, any suggestions on a more efficient way of writing this script? I am very new to the scripting environment and very open to better ways of going about things.
#!/bin/bash
# mcidas environment
PATH=$HOME/bin:
PATH=$PATH:/usr/sww/bin:/usr/local/bin:$HOME/mcidas/bin
PATH=$PATH:/home/mcidas/bin:/bin:/usr/bin:/etc:/usr/ucb
PATH=$PATH:/usr/bin/X11:/common/tool/bin:.
export PATH
MCPATH=$HOME/mcidas/data
MCPATH=$MCPATH:/home/mcidas/data
export MCPATH
#variables
basedir1="ftp://ladsweb.nascom.nasa.gov/allData/6/MOD02QKM" #TERRA
basedir2="ftp://ladsweb.nascom.nasa.gov/allData/6/MYD02QKM" #AQUA
day=`date +%j`
day1=`date +"%j" -d "-1 day"`
hour=`date -u +"%H"`
min=`date -u +"%m"`
year=`date -u +"%Y"`
segment=1
count=$(ls /satellite/modis_processed/ | grep -v ^d | wc -l)
count_max=25
files=(/satellite/modis_processed/*)
if [ $hour -ge "17" ]; then
workinghour="16"
echo "Searching for hour $workinghour"
url="${basedir2}/${year}/${day1}/MYD02QKM.A${year}${day1}.${workinghour}*.006.${year}*"
wget -r -nd --no-parent -nc -e robots=off -R 'index.*' -P /satellitemodis/ $url
#find /satellite/modis/ -type f -mmin -30 -exec cp "{}" /satellite/modis_processed/ \;
for files in /satellite/modis_processed/*
do
echo "The number used for the data file is ${count}"
echo "The number used for the image file is ${segment}"
export segment
export count
#Run McIDAS
mcenv <<- 'EOF'
imgcopy.k MODISD.${count} MODISI.${segment} BAND=1 SIZE=SAME
imgremap.k MODISD.${segment} MODISI.${segment} BAND=1 SIZE=ALL PRO=MERC
imgcha.k MODISI.${segment} CTYPE=BRIT
exit
EOF
segment=`expr ${segment} + 1`
count=`expr ${count} - 1`
#Reset Counter if equal or greater than 25
if [[ $segment -ge $count_max ]]; then
segment=1
fi
find /satellite/awips -type f -name "AREA62*" -exec mv "{}" /awips2/edex/data/manual/ \;
done;
echo "We have exported ${segment} converted modis files to EDEX."
fi
You have a here-document in that script. That here-document is not properly terminated. The end marker, EOF, needs to be in the first column, not indented at all.
If you indent it, it has to be with tabs, and the start of the here-document should be <<-'EOF'.
The effect of the wrongly indented EOF marker is that the rest of the script is read as the contents of the here-document.
As Charles Duffy points out, ShellCheck is your friend.
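As a minimal illustration (a hypothetical script, not from the question), this here-document is terminated correctly because its end marker starts in the first column:
#!/bin/bash
cat << 'EOF'
body of the here-document
EOF
echo "this line still runs"   # EOF was recognized, so the script continues

# With <<- the marker and body may be indented, but only with tabs
# (the indentation below must be real tab characters, not spaces):
if true; then
	cat <<- 'EOF'
	tab-indented body
	EOF
fi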

rename specific pattern of files in bash

I have the following files and directories:
/tmp/jj/
/tmp/jj/ese
/tmp/jj/ese/2010
/tmp/jj/ese/2010/test.db
/tmp/jj/dfhdh
/tmp/jj/dfhdh/2010
/tmp/jj/dfhdh/2010/rfdf.db
/tmp/jj/ddfxcg
/tmp/jj/ddfxcg/2010
/tmp/jj/ddfxcg/2010/df.db
/tmp/jj/ddfnghmnhm
/tmp/jj/ddfnghmnhm/2010
/tmp/jj/ddfnghmnhm/2010/sdfs.db
I want to rename each 2010 directory to its parent directory's name, then tar all the .db files...
What I tried is:
#!/bin/bash
if [ $# -ne 1 ]; then
echo "Usage: `basename $0` <absolute-path>"
exit 1
fi
if [ "$(id -u)" != "0" ]; then
echo "This script must be run as root" 1>&2
exit 1
fi
rm /tmp/test
find $1 >> /tmp/test
for line in $(cat /tmp/test)
do
arr=$( (echo $line | awk -F"/" '{for (i = 1; i < NF; i++) if ($i == "2010") print $(i-1)}') )
for index in "${arr[@]}"
do
echo $index #HOW TO WRITE MV COMMAND RATHER THAN ECHO COMMAND?
done
done
1) The result is:
ese
dfhdh
ddfxcg
ddfnghmnhm
But it should be:
ese
dfhdh
ddfxcg
ddfnghmnhm
2) How can I rename all 2010 directories to their parent directory?
I mean how to do (I want to do it in loop because of larg numbers of dirs):
mv /tmp/jj/ese/2010 /tmp/jj/ese/ese
mv /tmp/jj/dfhdh/2010 /tmp/jj/dfhdh/dfhdh
mv /tmp/jj/ddfxcg/2010 /tmp/jj/ddfxcg/ddfxcg
mv /tmp/jj/ddfnghmnhm/2010 /tmp/jj/ddfnghmnhm/ddfnghmnhm
You could instead use find to determine whether a directory contains a subdirectory named 2010 and perform the mv:
find /tmp -type d -exec sh -c '[ -d "$1"/2010 ] && mv "$1"/2010 "$1"/"$(basename "$1")"' -- {} \;
(Passing the directory to sh as a positional parameter, rather than embedding {} inside the sh -c script, avoids problems with special characters in directory names.)
I'm not sure if you have any other question here but this would do what you've listed at the end of the question, i.e. it would:
mv /tmp/jj/ese/2010 /tmp/jj/ese/ese
and so on...
It can also be done with grep -P on the saved find output (each name appears twice because both the 2010 directory itself and the .db file beneath it match the lookahead):
grep -oP '[^/]+(?=/2010)' file
ese
ese
dfhdh
dfhdh
ddfxcg
ddfxcg
ddfnghmnhm
ddfnghmnhm
This should be close:
find "$1" -type d -name 2010 -print |
while IFS= read -r dir
do
parentPath=$(dirname "$dir")
parentDir=$(basename "$parentPath")
echo mv "$dir" "$parentPath/$parentDir"
done
Remove the echo after testing. If your dir names can contain newlines, then look into the -print0 option for find and reading with IFS= read -r -d ''.
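A null-delimited variant of the same loop (a sketch; relies on find's -print0 and bash's read -d '') would be:
# NUL separators survive any character a file name can contain.
find "$1" -type d -name 2010 -print0 |
while IFS= read -r -d '' dir
do
    parentPath=$(dirname "$dir")
    parentDir=$(basename "$parentPath")
    echo mv "$dir" "$parentPath/$parentDir"
done
Note that the while loop runs in a subshell here, which is fine for running mv but would matter if the loop needed to set variables for later use.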
First, only iterate through the dirs you're interested in, and avoid temporary files:
for d in $(find "$1" -type d -name '2010') ; do
Then you can use basename and dirname to extract parts of that directory name and reconstruct the desired one. Something like:
b="$(dirname $d)"
p="$(basename $b)"
echo mv "$d" "$b/$p"
You could use shell string replace operations instead of basename/dirname.
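For example, a sketch of that parameter-expansion variant, using the same loop variable d as above:
b="${d%/*}"     # like dirname: drop the last /-separated component
p="${b##*/}"    # like basename: keep only the last component
echo mv "$d" "$b/$p"
This saves two command substitutions per directory, which adds up over a large tree.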

Populating Arrays With Nested Loops in Bash

I'm curious as to why the following:
array1=(file1 file2 file3)
array2=()
for i in ${array1[@]}
do
find . -name $i -type f -print0 2>/dev/null | \
while read -d '' -r file
do
array2+=( $file )
done
done
fails to populate array2, assuming the files file1, file2, and file3 exist in subdirectories below the directory where the search starts. I would appreciate it if someone could point out where I misstepped here.
Try this:
array1=(file1 file2 file3)
array2=()
for i in "${array1[#]}"
do
while read -d '' -r file
do
array2+=( "$file" )
done < <(find . -name "$i" -type f -print0)
done
Because of the pipe, the while loop runs in a subshell, and the values added to array2 are lost when that subshell exits. Feeding the loop from a process substitution, as above, keeps it in the current shell.
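A minimal demonstration of the difference (the printf input is just a stand-in for find's output):
count=0
printf 'a\nb\n' | while read -r line; do
    count=$((count + 1))
done
echo "$count"    # prints 0: the loop ran in a subshell

count=0
while read -r line; do
    count=$((count + 1))
done < <(printf 'a\nb\n')
echo "$count"    # prints 2: the loop ran in the current shell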
If you are using bash 4, you can avoid using find:
shopt -s globstar
array1=(file1 file2 file3)
array2=()
for i in "${array1[#]}"
do
for f in **/"$i"; do
[[ -f "$f" ]] && array2+=( "$f" )
done
done

How do I find the largest file, by size, then copy it to another directory

How do I search for files in a directory, sort them by size, and then copy the largest file into another directory?
I have seen bits and pieces, but have yet to solve it.
I have tried the code below; however, it does not work.
find sourceDirectory -type f -exec ls -s {} \; | sort -n -r | head -1 | cp {} targetdirectory
The curly brace notation ({}) is used in the arguments of find's -exec option; it has no meaning to cp in this context. You need to split this up into two separate steps: 1) find the file, and 2) copy the file.
If you are using GNU find I would suggest something like this:
read size filepath < <(find . -type f -printf '%k %p\n' | sort -nr)
cp "$filepath" target/path/
Because read assigns everything after the first field to its last variable, file names containing spaces survive intact in filepath. Here is an alternative that avoids the temporary variables:
cp "$(find . -type f -printf '%k %p\n' | sort -nr | head -n1 | cut -d' ' -f2-)" target/path/
You can replace -printf '%k %p\n' with -exec ls -s {} \; but -printf is much more efficient.
Note that special precautions may be needed if the file names contain whitespace or other unusual characters.
You were almost there; you just need the extra support of awk and xargs.
I would prefer using du in place of ls -s although they both work fine in this case.
find <sourceDirectory> -type f -exec du {} \; | sort -nr | head -1 | awk '{print $2}' | xargs -I file cp file <targetdirectory>

BASH: Sort array of files with crazy names

Problem:
Need to sort the array before operating on it with a function.
First, the array is loaded with files:
unset a i
counter=1
while IFS= read -r -d $'\0' file; do
a[i++]="$file"
done < <(find $DIR -type f -print0)
Next, each member of the array is sent to the function:
for f in "${a[#]}"
do
func_hash "$f"
[ $(expr $counter % 20) -eq 0 ] && printf "="
counter=$((counter + 1))
done
Somehow a sort needs to be worked into the above for loop. I have looked
through the SO posts on sorting arrays, but my crazy file names
cause issues when I try to tack on a sort.
Ideas?
Thanks!
Bubnoff
UPDATE: Here's code with sort:
while IFS= read -r -d $'\0' file; do
func_hash "$file"
[ $(expr $counter % 20) -eq 0 ] && printf "="
counter=$((counter + 1))
done < <(find $DIR -type f -print0 | sort -z +1 -1)
It's sorting by full path rather than file name. Any ideas on how to
sort by file name given that the path is needed for the function?
UPDATE 2: Decided to compromise.
My main goal was to avoid temporary files when sorting. GNU sort can write back to the original
file with its -o option, so now I can:
sort -o $OUT -t',' -k 1 $OUT
Does anyone have a more 'elegant' solution (whatever that means)?
SOLVED See jw013's answer below. Thanks man!
EDIT
while IFS= read -r -d/ && read -r -d '' file; do
a[i++]="$file"
done < <(find "$DIR" -type f -printf '%f/%p\0' | sort -z -t/ -k1 )
Rationale:
I make the assumption that / is never a legal character within a file name (which seems reasonable on most *nix filesystems since it is the path separator).
The -printf is used to print the file name without leading directories, then the full file name with its path, separated by a /. For /tmp/jj/ese/2010/test.db, for example, it emits test.db//tmp/jj/ese/2010/test.db followed by a NUL. The sort then operates on the first /-separated field, which is the file name without the path.
The read is modified to first use / as a delimiter to throw away the pathless file name.
side note
Any POSIX shell should support the modulo operator as part of its arithmetic expansion. You can replace the line with the call to the external command expr in the second loop with
[ $(( counter % 20 )) -eq 0 ] ...
