unit-test zsh extendedglob functionality - c

How can I, in a C program, perform a glob using the globbing functions provided by zsh (the Z shell)?
I have created a README of my explorations so far. It is for use in an open source library.
https://bitbucket.org/sentimental/zsh_source_experimentation/src/master/README
I copy it here:
Start
Let's get the sources
apt-get source zsh
apt-get source zsh-dev
I've discovered by using ldd that zsh does not produce any library files::
#ldd /bin/zsh4
linux-gate.so.1 => (0xb7775000)
libcap.so.2 => /lib/i386-linux-gnu/libcap.so.2 (0xb7751000)
libdl.so.2 => /lib/i386-linux-gnu/libdl.so.2 (0xb774c000)
libtinfo.so.5 => /lib/i386-linux-gnu/libtinfo.so.5 (0xb772c000)
libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xb7700000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb755b000)
I think I will have to use the source files directly.
Let's locate the files containing references to extended globbing
(I'm using zsh as my shell).::
grep -ir EXTENDEDGLOB . | egrep "\.(c|h):" | cut -d: -f1 | sort -u
./README
./zsh-4.3.17/Etc/ChangeLog-3.0
./zsh-4.3.17/Src/glob.c
./zsh-4.3.17/Src/Modules/zutil.c
./zsh-4.3.17/Src/options.c
./zsh-4.3.17/Src/pattern.c
./zsh-4.3.17/Src/utils.c
./zsh-4.3.17/Src/Zle/complist.c
./zsh-4.3.17/Src/Zle/zle_tricky.c
./zsh-4.3.17/Src/zsh.h
Let's consider a couple of those files
zsh.h
In here EXTENDEDGLOB is defined as part of an anonymous enum
There are publications
here
http://publications.gbdirect.co.uk/c_book/chapter6/enums.html
and here
http://bytes.com/topic/c/answers/63891-enum-within-function-standard
detailing the use of enum in c
An example of its use is probably the method arguments for the function
................
static int
bin_zregexparse(char *nam, char **args, Options ops, UNUSED(int func))
Found in the file
.................
./zsh-4.3.17/Src/Modules/zutil.c
Let's see what's calling that function. Hmm, only one hit: the only reference to it is in that same file.
grep -r bin_zregexparse . | egrep "\.(c|h):" | cut -d: -f1 | sort -u
./zsh-4.3.17/Src/Modules/zutil.c
Hmm.... How does it work if nothing calls this function?
OK, let's see if there is some conditional configuration that sets up or aliases this code somehow.
grep -i regex ./**/conf*
./zsh-4.3.17/config.h.in :/* Define to 1 if you have the `regexec' function. */
./zsh-4.3.17/config.h.in :#undef HAVE_REGEXEC
./zsh-4.3.17/config.h.in :/* Define to 1 if you have the `regexec' function. */
./zsh-4.3.17/config.h.in :#undef HAVE_REGEXEC
./zsh-4.3.17/configure : regcomp regexec regerror regfree \
./zsh-4.3.17/configure : regcomp regexec regerror regfree \
./zsh-4.3.17/configure.ac : regcomp regexec regerror regfree \
Lets investigate these files.
config.h.in
Doesn't seem to exist, perhaps it is generated?
There seems to be a block in
configure
for ac_func in strftime strptime mktime timelocal \
difftime gettimeofday clock_gettime \
select poll \
readlink faccessx fchdir ftruncate \
etc etc etc ..
htons ntohs \
regcomp regexec regerror regfree \
gdbm_open getxattr \
realpath canonicalize_file_name \
symlink getcwd
do :
as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
if eval test \"x\$"$as_ac_var"\" = x"yes"; then :
cat >>confdefs.h <<_ACEOF
#define `$as_echo "HAVE_$ac_func" | $as_tr_cpp` 1
_ACEOF

fi
done
I've no idea what that is doing. TODO: Investigate!
In
./zsh-4.3.17/configure.ac
dnl ---------------
dnl CHECK FUNCTIONS
dnl ---------------

dnl need to integrate this function
dnl AC_FUNC_STRFTIME

AC_CHECK_FUNCS(strftime strptime mktime timelocal \
difftime gettimeofday clock_gettime \
select poll \
etc etc etc
regcomp regexec regerror regfree \
gdbm_open getxattr \
realpath canonicalize_file_name \
symlink getcwd)
AC_FUNC_STRCOLL
I'm not really sure at this stage how to proceed. Suppose I want to unit-test that function: how might I do so?

The function bin_zregexparse is the implementation of the zregexparse builtin provided by the zsh/zutil module. It is used below its definition in zutil.c:
static struct builtin bintab[] = {
…
BUILTIN("zregexparse", 0, bin_zregexparse, 3, -1, 0, "c", NULL),
…
};
zregexparse is intended to be used in the implementation of _regex_arguments. This isn't the most promising entry point.
If you want to implement zsh's globbing features, you'll have to pull in almost all of the zsh code into your program, since glob patterns can contain arbitrary embedded code. You can exclude the line editor during build, but that's about it.
I would recommend using a separate zsh binary and feeding it requests through a pair of pipes.
setopt extended_glob null_glob
print -Nr -- **/*(.Om+my_predicate) ''

Related

Remove some arguments from argument string in zsh

I'm trying to remove part of an arguments string using zsh parameter expansion (no external tools like sed please). Here's what for:
The RUBYOPT environment variable contains arguments which are applied whenever the ruby interpreter is used, just as if they were given along with the ruby command. One argument controls the warning verbosity; possible settings are, for instance, -W0 or -W:no-deprecated. My goal is to remove all -W... arguments from RUBYOPT, say:
-W0 -X -> -X
-W:no-deprecated -X -W1 -> -X
My current approach is to split the string to an array and then make a substitution on every member of the array. This works on two lines of code, but I can't make it work on a single line of code:
% RUBYOPT="-W:no-deprecated -X -W1"
% parts=(${(@s: :)RUBYOPT})
% echo ${parts/-W*}
-X
% echo ${(${(@s: :)RUBYOPT})/-W*}
zsh: error in flags
What am I doing wrong here... or is there a different, more elegant way to achieve this?
Thanks for your hints!
${( introduces parameter expansion flags (for example: ${(s: :)...}).
In ${(${(@s: :)RUBYOPT})/-W*}, zsh therefore tries to read the text after the outer ${( as flags rather than as a nested expansion, and reports "zsh: error in flags". Dropping the outer parentheses turns it into a nested substitution, which works:
% RUBYOPT="-W:no-deprecated -X -W1"
% print -- ${${(s: :)RUBYOPT}/-W*}
# -X
Update from rowboat's comments: because the pattern is not anchored, this is inappropriate for some flags such as -abc-Whoops or -foo-Whoo:
% RUBYOPT="-W:no-deprecated -X -W1 -foo-Whoo"
% parts=(${(s: :)RUBYOPT})
% print -- ${parts/-W*}
# -X -foo
# Note: -foo would be unexpected
% print -- ${${(s: :)RUBYOPT}/-W*}
# -X -foo
# Note: -foo would be unexpected
The (#s) globbing flag (together with the EXTENDED_GLOB shell option) solves this:
% RUBYOPT="-W:no-deprecated -X -W1 -foo-Whoo"
% parts=(${(s: :)RUBYOPT})
% setopt extendedglob
# To use `(#s)` flag which is like regex's `^`
% print -- ${parts/(#s)-W*}
# -X -foo-Whoo
% print -- ${${(s: :)RUBYOPT}/(#s)-W*}
# -X -foo-Whoo
Globbing Flags
There are various flags which affect any text to their right up to the end of the enclosing group or to the end of the pattern; they require the EXTENDED_GLOB option. All take the form (#X) where X may have one of the following forms:
...
s, e
Unlike the other flags, these have only a local effect, and each must appear on its own: (#s) and (#e) are the only valid forms. The (#s) flag succeeds only at the start of the test string, and the (#e) flag succeeds only at the end of the test string; they correspond to ^ and $ in standard regular expressions.
...
--- zshexpn(1), Expansion, Globbing Flags
Or the ${name:#pattern} syntax described below would work, too.
end update from rowboat's comments
Another option is to use the typeset -T feature, which lets you manipulate the scalar value with array operators.
RUBYOPT="-W:no-deprecated -X -W1"
typeset -xT RUBYOPT rubyopt ' '
rubyopt=(${rubyopt:#-W*})
print -l -- "$RUBYOPT"
# -X
typeset
...
-T [ SCALAR[=VALUE] ARRAY[=(VALUE ...)] [ SEP ] ]
...
the -T option requires zero, two, or three arguments to be present. With no arguments, the list of parameters created in this fashion is shown. With two or three arguments, the first two are the name of a scalar and of an array parameter (in that order) that will be tied together in the manner of $PATH and $path. The optional third argument is a single-character separator which will be used to join the elements of the array to form the scalar; if absent, a colon is used, as with $PATH. Only the first character of the separator is significant; any remaining characters are ignored. Multibyte characters are not yet supported.
...
Both the scalar and the array may be manipulated as normal. If one is unset, the other will automatically be unset too.
...
--- zshbuiltin(1), Shell Builtin Commands, typeset
Then rubyopt=(${rubyopt:#-W*}) filters the array elements:
${name:#pattern}
If the pattern matches the value of name, then substitute the empty string; otherwise, just substitute the value of name. If name is an array the matching array elements are removed (use the (M) flag to remove the non-matched elements).
--- zshexpn(1), Parameter Expansion , ${name:#pattern}
Note: It is possible to omit "@" from the flags here because the empty values are not needed in this case.
RUBYOPT="-W:no-deprecated -X -W1"
parts=(${(s: :)RUBYOPT})
print -- ${parts/-W*}
# -X
print -- ${${(s: :)RUBYOPT}/-W*}
# -X
Parameter Expansion Flags
...
@
In double quotes, array elements are put into separate words. E.g., "${(@)foo}" is equivalent to "${foo[@]}" and "${(@)foo[1,2]}" is the same as "$foo[1]" "$foo[2]". This is distinct from field splitting by the f, s or z flags, which still applies within each array element.
--- zshexpn(1), Parameter Expansion Flags, @
If the empty values must be kept, the ${name:#pattern} syntax still works:
RUBYOPT="-W:no-deprecated  -X -W1"
parts=("${(@s: :)RUBYOPT}")
# parts=("-W:no-deprecated" "" "-X" "-W1")
# Note that the empty value is retained
print -rC1 -- "${(@qqq)parts:#-W*}"
# ""
# "-X"
print -rC1 -- "${(@qqq)${(@s: :)RUBYOPT}:#-W*}"
# ""
# "-X"

Regular expression for GCC Pre Processor Line markers

Is there a Bash or Python regular expression that matches the C preprocessor line markers shown below?
The C preprocessor emits line markers like this:
# 74 "a/b/some_file.c" 3 4
First comes the symbol - #
space
then line number - 74
space
then - "file path"
space
then Zero or more integers separated by space - 3 4
More info on:
https://gcc.gnu.org/onlinedocs/gcc-6.4.0/cpp/Preprocessor-Output.html
I dunno where you'd want Bash to match such lines, and Python undoubtedly could do it with its regexes. I have a sed script which I use to convert #line directives into comments (which can sometimes make debugging preprocessed code easier). It looks like this:
#!/bin/sh
#
# @(#)$Id: linecomments.sh,v 1.3 2018/01/05 05:12:10 jleffler Exp $
#
# Convert #line directives into comments
# Deals with four forms of the #line directive:
# # line 99 "file"
# # line 99
# # 99 "file"
# # 99
exec sed \
-e 's%^[[:space:]]*#[[:space:]]*line[[:space:]][0-9][0-9]*[[:space:]].*%/*&*/%' \
-e 's%^[[:space:]]*#[[:space:]]*[0-9][0-9]*[[:space:]].*%/*&*/%' \
-e 's%^[[:space:]]*#[[:space:]]*line[[:space:]][0-9][0-9]*$%/*&*/%' \
-e 's%^[[:space:]]*#[[:space:]]*[0-9][0-9]*$%/*&*/%' \
"$@"
I use % to delimit the regular expressions since I need /* and */ in the replacement text. The regexes match 'trailing junk' such as the extra numbers emitted by GCC.
The edit in 2018 replaced fragments that looked like [  ] with [[:space:]] — the old code (dated 2001) used a blank and a tab in each case. This is clearer; you can copy'n'paste without having to worry about where there are tabs.
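Since the question asks for a Bash regular expression, here is a minimal sketch using bash's [[ ... =~ ... ]] operator. The function name parse_linemarker and the variable linemarker_re are made up for this illustration, and the pattern assumes the quoted file name contains no escaped double quotes.

```shell
#!/usr/bin/env bash
# Sketch: match a GCC line marker such as:  # 74 "a/b/some_file.c" 3 4
# parse_linemarker is a hypothetical helper name, not part of any tool.
# Assumption: the quoted file name contains no escaped double quotes.
linemarker_re='^# ([0-9]+) "([^"]*)"(( [0-9]+)*)$'

parse_linemarker() {
    local line=$1
    if [[ $line =~ $linemarker_re ]]; then
        # BASH_REMATCH[1] = line number, [2] = file name,
        # [3] = the flags with a leading space (empty if there are none).
        printf '%s|%s|%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]# }"
        return 0
    fi
    return 1
}

parse_linemarker '# 74 "a/b/some_file.c" 3 4'
```

The function prints the three parts separated by | and returns non-zero for lines that are not line markers, so it can be used as a filter in a while read loop.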

call program with arguments from an array containing items from another array wrapped in double quotes

(This is a more specific version of the problem discussed in bash - expand arguments from array containing double quotes.)
I want bash to call cmake with arguments from an array with double quotes which itself contain items from another array. Here is an example for clarification:
cxx_flags=()
cxx_flags+=(-fdiagnostics-color)
cxx_flags+=(-O3)
cmake_arguments=()
cmake_arguments+=(-DCMAKE_BUILD_TYPE=Release)
cmake_arguments+=("-DCMAKE_CXX_FLAGS=\"${cxx_flags[@]}\"")
The arguments shall be printed pretty like this:
$ echo "CMake arguments: ${cmake_arguments[@]}"
CMake arguments: -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-fdiagnostics-color -O3"
Problem
And finally cmake should be called (this does not work!):
cmake .. "${cmake_arguments[@]}"
It expands to (as set -x produces):
cmake .. -DCMAKE_BUILD_TYPE=Release '-DCMAKE_CXX_FLAGS="-fdiagnostics-color' '-O3"'
Workaround
echo "cmake .. ${cmake_arguments[@]}" | source /dev/stdin
Expands to:
cmake .. -DCMAKE_BUILD_TYPE=Release '-DCMAKE_CXX_FLAGS=-fdiagnostics-color -O3'
That's okay but it seems like a hack. Is there a better solution?
Update
If you want to iterate over the array you should use one more variable (as randomir and Jeff Breadner suggested):
cxx_flags=()
cxx_flags+=(-fdiagnostics-color)
cxx_flags+=(-O3)
cxx_flags_string="${cxx_flags[@]}"
cmake_arguments=()
cmake_arguments+=(-DCMAKE_BUILD_TYPE=Release)
cmake_arguments+=("-DCMAKE_CXX_FLAGS=\"$cxx_flags_string\"")
The core problem remains (and the workaround still works) but you could iterate over cmake_arguments and see two items (as intended) instead of three (-DCMAKE_BUILD_TYPE=Release, -DCMAKE_CXX_FLAGS="-fdiagnostics-color and -O3"):
echo "cmake .. \\"
size=${#cmake_arguments[@]}
for ((i = 0; i < $size; ++i)); do
if [[ $(($i + 1)) -eq $size ]]; then
echo " ${cmake_arguments[$i]}"
else
echo " ${cmake_arguments[$i]} \\"
fi
done
Prints:
cmake .. \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_FLAGS="-fdiagnostics-color -O3"
It seems that there's another layer of parsing that has to happen before cmake is happy; the | source /dev/stdin handles this, but you could also just move your CXX flags through an additional variable:
#!/bin/bash -x
cxx_flags=()
cxx_flags+=(-fdiagnostics-color)
cxx_flags+=(-O3)
CXX_FLAGS="${cxx_flags[@]}"
cmake_arguments=()
cmake_arguments+=(-DCMAKE_BUILD_TYPE=Release)
cmake_arguments+=("'-DCMAKE_CXX_FLAGS=${CXX_FLAGS}'")
CMAKE_ARGUMENTS="${cmake_arguments[@]}"
echo "CMake arguments: ${CMAKE_ARGUMENTS}"
returns:
+ cxx_flags=()
+ cxx_flags+=(-fdiagnostics-color)
+ cxx_flags+=(-O3)
+ CXX_FLAGS='-fdiagnostics-color -O3'
+ cmake_arguments=()
+ cmake_arguments+=(-DCMAKE_BUILD_TYPE=Release)
+ cmake_arguments+=("'-DCMAKE_CXX_FLAGS=${CXX_FLAGS}'")
+ CMAKE_ARGUMENTS='-DCMAKE_BUILD_TYPE=Release '\''-DCMAKE_CXX_FLAGS=-fdiagnostics-color -O3'\'''
+ echo 'CMake arguments: -DCMAKE_BUILD_TYPE=Release '\''-DCMAKE_CXX_FLAGS=-fdiagnostics-color -O3'\'''
CMake arguments: -DCMAKE_BUILD_TYPE=Release '-DCMAKE_CXX_FLAGS=-fdiagnostics-color -O3'
There is probably a cleaner solution still, but this is better than the | source /dev/stdin thing, I think.
You basically want cxx_flags array expanded into a single word.
This:
cxx_flags=()
cxx_flags+=(-fdiagnostics-color)
cxx_flags+=(-O3)
flags="${cxx_flags[@]}"
cmake_arguments=()
cmake_arguments+=(-DCMAKE_BUILD_TYPE=Release)
cmake_arguments+=(-DCMAKE_CXX_FLAGS="$flags")
will produce the output you want:
$ set -x
$ echo "${cmake_arguments[@]}"
+ echo -DCMAKE_BUILD_TYPE=Release '-DCMAKE_CXX_FLAGS=-fdiagnostics-color -O3'
So, to summarize, running:
cmake .. "${cmake_arguments[@]}"
with array expansion quoted, ensures each array element (cmake argument) is expanded as only one word (if it contains spaces, the shell won't print quotes around it, but the command executed will receive the whole string as a single argument). You can verify that with set -x.
If you need to print the complete command with arguments in a way that can be reused by copy/pasting, you can consider using printf with %q format specifier, which will quote the argument in a way that can be reused as shell input:
$ printf "cmake .. "; printf "%q " "${cmake_arguments[@]}"; printf "\n"
cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS=-fdiagnostics-color\ -O3
Note the backslash which escapes the space.
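As a possible shortcut, the intermediate variable can be avoided: inside double quotes, ${cxx_flags[*]} joins all elements into one word, using the first character of IFS (a space by default). The sketch below uses a stand-in function instead of cmake so that the argument boundaries are visible; show_args is a name invented for this illustration.

```shell
#!/usr/bin/env bash
# show_args is a hypothetical stand-in for cmake: it prints each
# argument it receives on its own line, wrapped in angle brackets.
show_args() {
    local arg
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}

cxx_flags=(-fdiagnostics-color -O3)

cmake_arguments=(-DCMAKE_BUILD_TYPE=Release)
# "${cxx_flags[*]}" expands to ONE word, joined with the first character
# of IFS (a space by default); "${cxx_flags[@]}" would expand to one
# word per element and split the -D option in two.
cmake_arguments+=("-DCMAKE_CXX_FLAGS=${cxx_flags[*]}")

# Quoted [@] keeps every array element as exactly one argument.
show_args .. "${cmake_arguments[@]}"
```

Replacing show_args with cmake in the last line gives the real invocation; no literal quote characters need to be stored in the array.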

Perl reading file, printing unique value from a column

I am new to Perl, and I'd like to achieve the following with Perl.
I have a file which contain the following data:
/dev/hda1 /boot ext3 rw 0 0
/dev/hda1 /boot ext3 rw 0 0
I'd like to extract the second field from the file and print unique values only. My desired output for this example is, the program should print :
ext3
Also, if I have several different filesystems, it should print them on the same line.
I have tried many pieces of code but I am stuck.
Thank you
If you prefer awk:
$ cat file
/dev/hda1 /boot ext3 rw 0 0
/dev/hda1 /boot ext3 rw 0 0
$ awk '!seen[$3]++{print $3}' file
ext3
OR , using cut:
$ cut -d" " -f3 file | sort | uniq # or use just sort -u if your version supports it
ext3
Here is perl solution:
$ perl -lane 'print $F[2] unless $seen{$F[2]}++' file
ext3
Here is the perl command line options explanation (from perl -h):
l: enable line ending processing, specifies line terminator
a: autosplit mode with -n or -p (splits $_ into @F)
n: assume "while (<>) { ... }" loop around program
e: one line of program (several -e's allowed, omit programfile)
For a fuller explanation of these options, please refer to: https://blogs.oracle.com/ksplice/entry/the_top_10_tricks_of
#!/usr/bin/perl
my %hash ;
while (<>) {
if (/\s*[^\s]+\s+[^\s]+\s+([^\s]+)\s+.*/) {
$hash{$1}=1;
}
}
print join("\n",keys(%hash))."\n";
Usage:
./<prog-name>.pl file1 file2 ....
perl -anE '$s{$F[2]}++ }{say for keys %s' file
or
perl -anE '$s{$_}++ or say for $F[2]' file
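The question also asks for several distinct filesystem types to be printed on one line, which the commands above do not do. Here is a minimal sketch in shell with awk, assuming the type is the third whitespace-separated column as in the sample data; the function name unique_fstypes and the extra ext4 line are made up for this illustration.

```shell
#!/usr/bin/env bash
# unique_fstypes is a hypothetical helper: print each distinct value of
# column 3 once, in order of first appearance, all on a single line.
unique_fstypes() {
    awk '!seen[$3]++ { printf "%s%s", sep, $3; sep = " " }
         END         { if (sep != "") print "" }' "$@"
}

# Example input modelled on the question's data (the ext4 line is invented).
sample=$(mktemp)
printf '%s\n' \
    '/dev/hda1 /boot ext3 rw 0 0' \
    '/dev/hda1 /boot ext3 rw 0 0' \
    '/dev/hdb1 /data ext4 rw 0 0' > "$sample"

unique_fstypes "$sample"
rm -f "$sample"
```

The separator is only emitted before the second and later values, so the line has no trailing space.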

How to find the largest file in a directory and its subdirectories?

We're just starting a UNIX class and are learning a variety of Bash commands. Our assignment involves performing various commands on a directory that has a number of folders under it as well.
I know how to list and count all the regular files from the root folder using:
find . -type f | wc -l
But I'd like to know where to go from there in order to find the largest file in the whole directory. I've seen some things regarding a du command, but we haven't learned that, so in the repertoire of things we've learned I assume we need to somehow connect it to the ls -t command.
And pardon me if my 'lingo' isn't correct, I'm still getting used to it!
Quote from this link-
If you want to find and print the top 10 largest files names (not
directories) in a particular directory and its sub directories
$ find . -type f -printf '%s %p\n'|sort -nr|head
To restrict the search to the present directory use "-maxdepth 1" with
find.
$ find . -maxdepth 1 -printf '%s %p\n'|sort -nr|head
And to print the top 10 largest "files and directories":
$ du -a . | sort -nr | head
** Use "head -n X" instead of the only "head" above to print the top X largest files (in all the above examples)
To find the top 25 files in the current directory and its subdirectories:
find . -type f -exec ls -al {} \; | sort -nr -k5 | head -n 25
This will output the top 25 files by sorting based on the size of the files via the "sort -nr -k5" piped command.
Same but with human-readable file sizes:
find . -type f -exec ls -alh {} \; | sort -hr -k5 | head -n 25
find . -type f | xargs ls -lS | head -n 1
outputs
-rw-r--r-- 1 nneonneo staff 9274991 Apr 11 02:29 ./devel/misc/test.out
If you just want the filename:
find . -type f | xargs ls -1S | head -n 1
This avoids using awk and allows you to use whatever flags you want in ls.
Caveat. Because xargs tries to avoid building overlong command lines, this might fail if you run it on a directory with a lot of files because ls ends up executing more than once. It's not an insurmountable problem (you can collect the head -n 1 output from each ls invocation, and run ls -S again, looping until you have a single file), but it does mar this approach somewhat.
There is no single simple command to find the largest files/directories on a Linux/UNIX/BSD filesystem. However, by combining the following three commands (using pipes) you can easily list the largest files:
# du -a /var | sort -n -r | head -n 10
If you want more human readable output try:
$ cd /path/to/some/var
$ du -hsx * | sort -rh | head -10
Where,
var is the directory you want to search
du command -h option : display sizes in human readable format (e.g., 1K, 234M, 2G).
du command -s option : show only a total for each argument (summary).
du command -x option : skip directories on different file systems.
sort command -r option : reverse the result of comparisons.
sort command -h option : compare human readable numbers. This is GNU sort specific option only.
head command -10 OR -n 10 option : show the first 10 lines.
This lists files recursively if they're regular files, sorts them numerically in descending order by the 7th field (which is the size in my find output; check yours), and shows just the first file.
find . -type f -ls | sort -k7,7nr | head -n 1
The first option to find is the start path for the recursive search. A -type of f searches for normal files. Note that if you try to parse this as a filename, you may fail if the filename contains spaces, newlines or other special characters. The options to sort also vary by operating system. I'm using FreeBSD.
A "better" but more complex and heavier solution would be to have find traverse the directories, but perhaps use stat to get the details about the file, then perhaps use awk to find the largest size. Note that the output of stat also depends on your operating system.
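The find, stat and awk combination described above could be sketched as follows. This assumes GNU stat (its -c option; BSD stat uses -f with different format letters) and file names without embedded newlines; largest_file is a name chosen for this example.

```shell
#!/usr/bin/env bash
# largest_file is a hypothetical helper: print "<size in bytes> <path>"
# for the largest regular file under the given directory (default: .).
# Assumes GNU stat; BSD/macOS stat needs -f '%z %N' instead of -c '%s %n'.
# Assumes no file name contains a newline.
largest_file() {
    local dir=${1:-.}
    # -exec ... {} + batches paths into as few stat calls as possible;
    # awk tracks the maximum across all batches, so the batching
    # cannot change the result.
    find "$dir" -type f -exec stat -c '%s %n' {} + |
        awk '$1 + 0 >= max { max = $1 + 0; line = $0 }
             END { if (line != "") print line }'
}
```

For example, largest_file /var/log prints one line with the byte count followed by the path. Because awk keeps the whole input line, paths containing spaces are printed intact.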
This will find the largest entry (file or folder) directly inside the given directory:
ls -S /path/to/folder | head -1
To find the largest file in all sub-directories:
find /path/to/folder -type f -exec ls -s {} \; | sort -nr | awk 'NR==1 { $1=""; sub(/^ /, ""); print }'
On Solaris I use:
find . -type f -ls|sort -nr -k7|awk 'NR==1{print $7,$11}' #formatted
or
find . -type f -ls | sort -nrk7 | head -1 #unformatted
because anything else posted here didn't work.
This will find the largest file in $PWD and subdirectories.
Try the following one-liner (display top-20 biggest files):
ls -1Rs | sed -e "s/^ *//" | grep "^[0-9]" | sort -nr | head -n20
or (human readable sizes):
ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n20
This works under Linux/BSD/OSX, unlike some other answers, because find's -printf option doesn't exist on OSX/BSD and stat takes different parameters depending on the OS. However, for the second command to work properly on OSX/BSD (where sort doesn't have -h), install sort from coreutils, or remove -h from ls and use sort -nr instead.
So these aliases are useful to have in your rc files:
alias big='du -ah . | sort -rh | head -20'
alias big-files='ls -1Rhs | sed -e "s/^ *//" | grep "^[0-9]" | sort -hr | head -n20'
Try following command :
find /your/path -printf "%k %p\n" | sort -g -k 1,1 | awk '{if($1 > 500000) print $1/1024 "MB" " " $2 }' |tail -n 1
This prints the name and size of the largest file, provided it is larger than about 500 MB. If you remove the if($1 > 500000) condition, it prints the largest file in the directory regardless of size.
du -aS /PATH/TO/folder | sort -rn | head -2 | tail -1
or
du -aS /PATH/TO/folder | sort -rn | awk 'NR==2'
To list the largest file in a folder:
ls -sh /pathFolder | sort -rh | head -n 1
In ls -sh, the s option prints each file's size and the h option makes that size human-readable.
You could also use ls -shS /pathFolder | head -n 1. The capital S makes ls sort from the largest file to the smallest, but the first line of output is the total for all files in that folder. So if you want just the single largest file, use head -n 2 and look at the second line, or use the first example with ls, sort and head.
This command works for me,
find /path/to/dir -type f -exec du -h '{}' + | sort -hr | head -10
Lists Top 10 files ordered by size in human-readable mode.
This script simplifies finding largest files for further action.
I keep it in my ~/bin directory, and put ~/bin in my $PATH.
#!/usr/bin/env bash
# scriptname: above
# author: Jonathan D. Lettvin, 201401220235
# This finds files of size >= $1 (format ${count}[K|M|G|T], default 10G)
# using a reliable version-independent bash hash to relax find's -size syntax.
# Specifying size using 'T' for Terabytes is supported.
# Output size has units (K|M|G|T) in the left hand output column.
# Example:
# ubuntu12.04$ above 1T
# 128T /proc/core
# http://stackoverflow.com/questions/1494178/how-to-define-hash-tables-in-bash
# Inspiration for hasch: thanks Adam Katz, Oct 18 2012 00:39
function hasch() { local hasch=`echo "$1" | cksum`; echo "${hasch//[!0-9]}"; }
function usage() { echo "Usage: $0 [{count}{k|K|m|M|g|G|t|T}]"; exit 1; }
function arg1() {
# Translate single arg (if present) into format usable by find.
count=10; units=G; # Default find -size argument to 10G.
size=${count}${units}
if [ -n "$1" ]; then
for P in TT tT GG gG MM mM Kk kk; do xlat[`hasch ${P:0:1}`]="${P:1:1}"; done
units=${xlat[`hasch ${1:(-1)}`]}; count=${1:0:(-1)}
test -n "$units" || usage
test -z "$(echo "$count" | sed 's/[0-9]//g')" || usage
if [ "$units" == "T" ]; then units="G"; let count=$count*1024; fi
size=${count}${units}
fi
}
function main() {
sudo \
find / -type f -size +$size -exec ls -lh {} \; 2>/dev/null | \
awk '{ N=$5; fn=$9; for(i=10;i<=NF;i++){fn=fn" "$i};print N " " fn }'
}
arg1 $1
main $size
Here is a much simpler way to do it:
ls -l | tr -s " " " " | cut -d " " -f 5,9 | sort -n -r | head -n 1
And you'll get this: 8445 examples.desktop
Linux solution: for example, to list all files/folders under the root (/) directory by size, in descending order:
sudo du -xm / | sort -rn | more
ls -alR|awk '{ if ($5 > max) {max=$5;ff=$9}} END {print max "\t" ff;}'
Run the one-liner below with your required path; here I am running it on /var/log/:
(sudo du -a /var/log/ |sort -nr|head -n20 |awk '{print $NF}'|while read l ;do du -csh $l|grep -vi total;done ) 2> /dev/null

Resources