Getting absolute path of a file - c

How can I convert a relative path to an absolute path in C on Unix?
Is there a convenient system function for this?
On Windows there is a GetFullPathName function that does the job, but I didn't find anything similar on Unix...

Use realpath().
The realpath() function shall derive, from the pathname pointed to by file_name, an absolute pathname that names the same file, whose resolution does not involve '.', '..', or symbolic links. The generated pathname shall be stored as a null-terminated string, up to a maximum of {PATH_MAX} bytes, in the buffer pointed to by resolved_name.
If resolved_name is a null pointer, the behavior of realpath() is implementation-defined.
The following example generates an absolute pathname for the file identified by the symlinkpath argument. The generated pathname is stored in the actualpath array.
#include <limits.h> /* PATH_MAX */
#include <stdlib.h>
...
char *symlinkpath = "/tmp/symlink/file";
char actualpath [PATH_MAX+1];
char *ptr;
ptr = realpath(symlinkpath, actualpath);

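Try realpath(), declared in stdlib.h. Passing NULL as the second argument makes it allocate the result buffer (specified since POSIX.1-2008); you must free() it afterwards. It returns NULL if the path cannot be resolved, for example when the file does not exist.
#include <stdio.h>
#include <stdlib.h>
...
char filename[] = "../../../../data/000000.jpg";
char *path = realpath(filename, NULL); /* malloc()ed on success */
if (path == NULL) {
    printf("cannot find file with name[%s]\n", filename);
} else {
    printf("path[%s]\n", path);
    free(path);
}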

There is also a small path library cwalk which works cross-platform. It has cwk_path_get_absolute to do that:
#include <cwalk.h>
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
char buffer[FILENAME_MAX];
cwk_path_get_absolute("/hello/there", "./world", buffer, sizeof(buffer));
printf("The absolute path is: %s", buffer);
return EXIT_SUCCESS;
}
Outputs:
The absolute path is: /hello/there/world

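Also try getcwd(), which returns the current working directory; prepend it to the relative path. The snippet below is C++ (in C, use printf instead of std::cout):
#include <limits.h>
#include <unistd.h>
#include <iostream>
...
char cwd[PATH_MAX];
// __FILE__ is the source path as given to the compiler; it serves here only as a sample relative path.
if (getcwd(cwd, sizeof(cwd)) != NULL)
    std::cout << "Absolute path: " << cwd << "/" << __FILE__ << std::endl;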
Result:
Absolute path: /media/setivolkylany/WorkDisk/Programming/Sources/MichailFlenov/main.cpp
Testing environment:
setivolkylany@localhost$/ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 8.6 (jessie)
Release: 8.6
Codename: jessie
setivolkylany@localhost$/ uname -a
Linux localhost 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 GNU/Linux
setivolkylany@localhost$/ g++ --version
g++ (Debian 4.9.2-10) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Related

Why doesn't the execve command in C on macOS allow the 'which' command to work?

Why does the execve command in C on macOS not allow the 'which' command to work? It works on non-Mac devices.
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
int main()
{
int fd;
char cmd[] = "/bin/cat";
char cmd1[] = "/usr/bin/which";
char *s[]={"which","ls",NULL};
if (execve(cmd1, s, NULL) == -1)
perror("oops ur wrong!!");
}
Expected output
 clang-7 -pthread -lm -o main main.c
 ./main
/bin/ls

but on a Mac, it returns nothing.
macOS
The code works. It doesn't work well, but it does work.
Given the null PATH in the environment (because you've used execve() and provided NULL as the environment), /usr/bin/which can't find ls — it has nowhere to look for it because PATH is not set.
On my machine (a MacBook Pro running macOS Big Sur 11.7.1 — it's a work machine and the company IT is behind the times), /usr/bin/which is a universal binary with two architectures. If I run /usr/bin/which ozymandias on the command line, there is no output (I don't have a command ozymandias anywhere), but the exit status is 1 (failure). That's an odd implementation — not reporting an error — but it works within its limits.
You can see this effect with:
$ (unset PATH; /usr/bin/which ls)
$ echo $?
1
$
If you use execv() instead of execve() and remove the , NULL from the argument list, the output is /bin/ls and the exit status is 0.
Linux
Just for comparison, on a RHEL 7.4 machine, I get different results:
$ which -a which
which='alias | /usr/bin/which --tty-only --read-alias --show-dot --show-tilde'
/usr/bin/alias
/usr/bin/which
/usr/bin/which
$ file /usr/bin/which
/usr/bin/which: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=317ba624d2914607bf9246993446803a977fbc18, stripped
$ /usr/bin/which which
/usr/bin/which
$ (unset PATH; /usr/bin/which which)
/usr/bin/which: no which in ((null))
$ /usr/bin/which ozymandias
/usr/bin/which: no ozymandias in (/work2/jleffler/bin:/u/jleffler/bin:/usr/perl/v5.34.0/bin:/usr/gcc/v12.2.0/bin:/usr/local/bin:/usr/bin:/usr/sbin)
$ /usr/bin/which --help
Usage: /usr/bin/which [options] [--] COMMAND [...]
Write the full path of COMMAND(s) to standard output.
--version, -[vV] Print version and exit successfully.
--help, Print this help and exit successfully.
--skip-dot Skip directories in PATH that start with a dot.
--skip-tilde Skip directories in PATH that start with a tilde.
--show-dot Don't expand a dot to current directory in output.
--show-tilde Output a tilde for HOME directory for non-root.
--tty-only Stop processing options on the right if not on tty.
--all, -a Print all matches in PATH, not just the first
--read-alias, -i Read list of aliases from stdin.
--skip-alias Ignore option --read-alias; don't read stdin.
--read-functions Read shell functions from stdin.
--skip-functions Ignore option --read-functions; don't read stdin.
Recommended use is to write the output of (alias; declare -f) to standard
input, so that which can show aliases and shell functions. See which(1) for
examples.
If the options --read-alias and/or --read-functions are specified then the
output can be a full alias or function definition, optionally followed by
the full path of each command used inside of those.
Report bugs to <which-bugs@gnu.org>.
$
PATH sanitized — radically shortened.
The which command reports an error when it can't find the command. It is a standalone executable on this Linux machine, and the which alias feeds it the aliases so it can report on them. The -a option reports on all the things that could be known as which (the second which in which -a which).
I found that passing envp (the environment argument of main) to execve() made it work:
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
int main(int argc, char *argv[], char *envp[])
{
int fd;
char cmd1[] = "/usr/bin/which";
char *s[] = {"which", "ls", NULL};
if (execve(cmd1, s, envp) == -1)
perror("oops ur wrong!!");
}
thanks anyways

Why does pclose return prematurely?

UPDATE 1: This question has been updated to eliminate the multithreading, simplifying its scope. The original problem popened in the main thread, and pclosed the child process in a different thread. The problem being asked about is reproducible much more simply, by doing the popen and pclose in the same (main) thread.
Update 2: With help from responders at How to check libc version?, I think I've identified that the libc being used is uClibc 0.9.30.
The following code popens a script in the main thread, waits a little bit, then pcloses the child process in the same main thread. This program is cross-compiled for several cross-targets.
The executable's code:
// mybin.c
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>
#include <errno.h>
#include <unistd.h>
static FILE* logFile = NULL;
static void logInit( const char* fmt );
static void log_( const char* file, int line, const char* fmt, ... );
static void logCleanup();
#define log(fmt, ...) log_( __FILE__, __LINE__, fmt, ##__VA_ARGS__ )
int main( int argc, char* argv[] )
{
logInit( "./mybin.log" );
{
bool success = false;
FILE* f;
if ( ! (f = popen( "./myscript", "r" )) )
{
log( "popen error: %d (%s)", errno, strerror( errno ) );
goto end;
}
log( "before sleep" );
sleep( 1 );
log( "after sleep" );
pclose( f );
log( "after pclose" );
success = true;
}
end:
log( "At end" );
logCleanup();
return 0;
}
/** Initializes logging */
static void logInit( const char* file )
{
logFile = fopen( file, "a" );
}
/** Logs timestamp-prefixed, newline-suffixed printf-style text */
static void log_( const char* file, int line, const char* fmt, ... )
{
//static FILE* logOut = logFile ? logFile : stdout;
FILE* logOut = logFile ? logFile : stdout;
time_t t = time( NULL );
char fmtTime[16] = { '\0' };
struct tm stm = *(localtime( &t ));
char logStr[1024] = { '\0' };
va_list args;
va_start( args, fmt );
vsnprintf( logStr, sizeof logStr, fmt, args );
va_end( args );
strftime( fmtTime, sizeof fmtTime, "%Y%m%d_%H%M%S", &stm );
fprintf( logOut, "%s %s#%d %s\n", fmtTime, file, line, logStr );
}
/** Cleans up after logInit() */
static void logCleanup()
{
if ( logFile ) { fclose( logFile ); }
logFile = NULL;
}
The script:
#! /bin/bash
# mybin
rm -f ./myscript.log
for i in {1..10}; do echo "$(date +"%Y%m%d_%H%M%S") script is running" >> ./myscript.log; sleep 1; done
The expected behavior is that the compiled executable spawns execution of the script in a child process, waits for its completion, then exits. This is met on many cross-targets including x86, x64, and ARM. Below is an example architecture on which the expected behavior is met, compilation, and corresponding logs:
$ uname -a
Linux linuxbox 5.4.8-200.fc31.x86_64 #1 SMP Mon Jan 6 16:44:18 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Compilation:
$ gcc --version && gcc -g ./mybin.c -lpthread -o mybin
gcc (GCC) 9.2.1 20190827 (Red Hat 9.2.1-1)
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$
mybin.log:
20200705_200950 ./mybin.c#33 before sleep
20200705_200951 ./mybin.c#35 after sleep
20200705_201000 ./mybin.c#37 after pclose
20200705_201000 ./mybin.c#44 At end
myscript.log:
20200705_200950 script is running
20200705_200951 script is running
20200705_200952 script is running
20200705_200953 script is running
20200705_200954 script is running
20200705_200955 script is running
20200705_200956 script is running
20200705_200957 script is running
20200705_200958 script is running
20200705_200959 script is running
However, on one target, an odd thing occurs: pclose returns early: after the script has started running, but well before it has completed running -- why? Below is the problem architecture on which the unexpected behavior is observed, cross-compiler flags, and corresponding logs:
$ uname -a
Linux hostname 2.6.33-arm1 #2 Wed Jul 1 23:05:25 UTC 2020 armv7ml GNU/Linux
Cross-compilation:
$ /path/to/toolchains/ARM-cortex-m3-4.4/bin/arm-uclinuxeabi-gcc --version
arm-uclinuxeabi-gcc (Sourcery G++ Lite 2010q1-189) 4.4.1
Copyright (C) 2009 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ /path/to/toolchains/ARM-cortex-m3-4.4/bin/arm-uclinuxeabi-gcc -O2 -Wall -fno-strict-aliasing -Os -D__uClinux__ -fno-strict-aliasing -mcpu=cortex-m3 -mthumb -g -ffunction-sections -fdata-sections -I/path/to/toolchains/ARM-cortex-m3-4.4/usr/include/ -Wl,--gc-sections -Wl,-elf2flt=-s -Wl,-elf2flt=8192 -I/path/to/toolchains/ARM-cortex-m3-4.4/sysroot/usr/include -I/path/to/libs/ARM-cortex-m3-4.4/usr/include/ -L/path/to/toolchains/ARM-cortex-m3-4.4/sysroot/usr/lib -lrt -L/path/to/libs/ARM-cortex-m3-4.4/usr/lib -L/path/to/libs/ARM-cortex-m3-4.4/lib -o mybin ./mybin.c -lrt -lpthread
$
mybin.log:
20200705_235632 ./mybin.c#33 before sleep
20200705_235633 ./mybin.c#35 after sleep
20200705_235633 ./mybin.c#37 after pclose
20200705_235633 ./mybin.c#44 At end
myscript.log:
20200705_235632 script is running
The gist of my question is: why does pclose return prematurely, and why only on this one cross-target?
Comments and research have me circling the notion that this is a bug in the variant/version of libc - it'd be great if someone knowledgeable on the subject could help confirm if that is the case.
Not a dup of pclose() prematurely returning in a multi-threaded environment (Solaris 11)

In an arm32 image-based container, readdir returns EOVERFLOW when directory is empty

When I call readdir in a C program inside an arm32-based container running on an x64-based Ubuntu 19.10 host, the call fails with errno set to EOVERFLOW for empty directories (e.g., /mnt, /media) instead of returning NULL with errno left at 0 to signal the end of the directory.
Have others observed this issue? Is this a configuration issue? If so, how can it be fixed?
Versions:
Guest: debian:buster-backports@sha256:8f27850df2144df1598b5c76b213616ecaab08e804a6d84ddace1455d8cbd9f0
Host: Ubuntu 19.10, amd64, Docker version: 19.03.6-0ubuntu1~19.10.1
Qemu version: 1:4.0+dfsg-0ubuntu9.6
Repro steps:
1. Build an image named crystal-for-buster-armhf:v1 based on Debian Buster for arm32 using the Dockerfile and build.sh script available here.
2. Start a container based on this image.
3. Compile and build the below program.
4. Execute the resulting executable with a directory name as a command line argument.
#define _POSIX_SOURCE
#include <dirent.h>
#include <errno.h>
#include <sys/types.h>
#undef _POSIX_SOURCE
#include <stdio.h>
int main(int argc, char* argv[]) {
DIR *dir;
struct dirent *entry;
if ((dir = opendir(argv[1])) == NULL)
perror("opendir() error");
else {
puts("contents:");
while (1) {
errno = 0;
entry = readdir(dir);
if (entry == NULL) {
printf("Errno: %d EOVERFLOW: %d\n", errno, EOVERFLOW);
break;
}
printf(" %s\n", entry->d_name);
}
closedir(dir);
}
}
If you're using glibc (most Linux-based systems) on a 32-bit target such as arm32, you need to compile with -D_FILE_OFFSET_BITS=64. The default there is still a 32-bit off_t, and with it a 32-bit ino_t, and in such a configuration readdir, stat, etc. will fail with EOVERFLOW if the inode number does not fit in 32 bits. Many modern filesystems always have inode numbers that don't fit in 32 bits.

Why is © (the copyright symbol) replaced with (C) when using wprintf?

When I try to print the copyright symbol © with printf or write, it works just fine:
#include <stdio.h>
int main(void)
{
printf("©\n");
}
#include <unistd.h>
int main(void)
{
write(1, "©\n", 3);
}
Output:
©
But when I try to print it with wprintf, I get (C):
#include <stdio.h>
#include <wchar.h>
int main(void)
{
wprintf(L"©\n");
}
Output:
(C)
It's fixed when I add a call to setlocale, though:
#include <stdio.h>
#include <wchar.h>
#include <locale.h>
int main(void)
{
setlocale(LC_ALL, "");
wprintf(L"©\n");
}
Output:
©
Why is the original behavior present and why is it fixed when I call setlocale? Additionally, where does this conversion take place? And how can I make the behavior after setlocale the default?
compilation command:
gcc test.c
locale:
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
echo $LC_CTYPE:
uname -a:
Linux penguin 4.19.79-07511-ge32b3719f26b #1 SMP PREEMPT Mon Nov 18 17:41:41 PST 2019 x86_64 GNU/Linux
file test.c (same on all of the examples):
test.c: C source, UTF-8 Unicode text
gcc --version:
gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
/lib/x86_64-linux-gnu/libc-2.24.so (glibc version):
GNU C Library (Debian GLIBC 2.24-11+deb9u4) stable release version 2.24, by Roland McGrath et al.
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 6.3.0 20170516.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<http://www.debian.org/Bugs/>.
cat /etc/debian_version:
9.12
The locale of the calling process is not automatically inherited by the new process; only the environment variables are passed along.
When the program first starts up, it is in the C locale. The man page for setlocale(3) says the following:
On startup of the main program, the portable "C" locale is selected
as default. A program may be made portable to all locales by calling:
setlocale(LC_ALL, "");
...
The locale "C" or "POSIX" is a portable locale; its LC_CTYPE part corresponds to the 7-bit ASCII character set.
So any multibyte / non-ASCII character is converted into one or more ASCII characters as the output shows.
The locale can be set as follows:
setlocale(LC_ALL, "");
The LC_ALL category selects all locale categories at once. An empty string for the locale means to set the locale according to the relevant environment variables. Once this is done, you should see the characters for your shell's locale.
#include <stdio.h>
#include <wchar.h>
#include <locale.h>
int main(void)
{
    char before[256];
    /* Copy the name: a later setlocale() call may overwrite the returned string. */
    snprintf(before, sizeof before, "%s", setlocale(LC_ALL, NULL));
    setlocale(LC_ALL, "");
    char *after = setlocale(LC_ALL, NULL);
    wprintf(L"before locale: %s\n", before);
    wprintf(L"after locale: %s\n", after);
    wprintf(L"©\n");
    wprintf(L"\u00A9\n");
    return 0;
}
Output:
before locale: C
after locale: en_US.utf8
©
©

Getting directory of binary in C

How do I get the absolute path to the directory of the currently executing command in C? I'm looking for something similar to the command dirname "$(readlink -f "$0")" in a shell script. For instance, if the C binary is /home/august/foo/bar and it's executed as foo/bar I want to get the result /home/august/foo.
Maybe try POSIX realpath() with argv[0]; something like the following (works on my machine):
#include <limits.h> /* PATH_MAX */
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
char buf[PATH_MAX];
char *res = realpath(argv[0], buf);
(void)argc; /* make compiler happy */
if (res) {
printf("Binary is at %s.\n", buf);
} else {
perror("realpath");
exit(EXIT_FAILURE);
}
return 0;
}
One alternative to argv[0] and realpath(3) on Linux is to use /proc/self/exe, which is a symbolic link pointing to the executable. You can use readlink(2) to get the pathname from it. See proc(5) for more information.
argv[0] is allowed to be NULL by the way (though this usually wouldn't happen in practice). It is also not guaranteed to contain the path used to run the command, though it will when starting programs from the shell.
I have come to the conclusion that there is no portable way for a compiled executable to get the path to its directory. The obvious alternative is to pass an environment variable to the executable telling it where it is located.

Resources