Alternative to dir command to query a directory in C - c

I am trying to create a program where the user can add different path regex-s so that a specific set of operations on the files that match the regex.
I tried using opendir() of the dirent.h header file but soon realized that it does not use the concept of regex.
The dir command I am trying to emulate is
dir [regex] /b
I need the output in a (char) buffer**. Piping the output could be a solution but I am looking for a more efficient way to do it.
Is there any predefined function in the standard (C90) library or will we have to create our own implementation?

C does not know about directories. They are operating system specific, usually provided by your OS kernel (look however inside GNU Hurd as an exception, and into unikernels). Read the C11 standard n1570 and forget, in 2020, about the obsolete C89 standard and TurboC. Consider trying some Linux distribution (such as Ubuntu or Debian or others). Most of them provide GCC or Clang (or the non-optimizing TinyCC compiler) and are very developer-friendly. My recommendation: use GCC as gcc -Wall -Wextra -g. Choose a good enough built automation tool (maybe GNU make) with an appropriate source-code editor (such as GNU emacs or vim or geany or others). Learn how to debug small programs and use the GDB debugger and the git version control tool.
POSIX does know about directories (it is an API specification written in English, also defining regex(3)). See here, and read the Linux man pages. And also the WinAPI.
On Linux, see mkdir(2), chdir(2), readdir(3), getcwd(3), unlink(2), stat(2), open(2), nftw(3), path_resolution(7) etc etc; you could want to study the source code of a Linux kernel and of some common C library for it, such as GNU glibc or musl-libc. Budget for that several months full time of your efforts. They are open source, so with some conditions you are allowed to study, improve and reuse their source code. See also http://linuxfromscratch.org/
Notice also popen. You probably don't want to use it and would prefer using more primitive system calls (see syscalls(2) for their list on Linux). You could use a library like Glib (from GTK).
Remember that C programs (of the freestanding kind) could run on the bare metal (e.g. Arduino). In those cases, speaking of directories does not make any sense. See also osdev.org for more, and observe that the Linux kernel is written in C (with a tiny amount of assembler code).
GrassHopper was an OS written mostly in C without any files or directories. See also old discussions archived on tunes.org and tccboot.

Use the function findfirst to start querying a directory and then findnext to iterate. The functions find all files in one directory matching a given pattern, so you possibly need to append \*.* to your directory name to list all files in that directory.
Refer to the Turbo C documentation for details.

Related

How do I read the source code for a C library in CLion

OS: Deepin 20 (base on Debian 10)
CLion: 2020.1.2
GCC: gcc (Uos 8.3.0.3-3+rebuild) 8.3.0
Make: 4.2.1 x86_64-pc-linux-gnu
Cmake: 3.18.1
I am a newcomer who just started learning C language. When I was writing C code using CLion, I could access it by Ctrl + mouse click .
I'm calling the method inside the header function. For example, if I use printf , I can access the stdio.h file, which can be seen at line 332 extern int printf (const char * ___, RESTRICT, format,...) ; .
But if I want to see the details of this method
I can't see it. According to Navigate in the code structure
Use Ctrl+Alt+Home to switch. But the IDE prompts No related file .
How can I get the source code to call a method? I want to learn from the good experiences of others by looking at their implementation logic in their libraries
Thank you for your review. I would really appreciate it if you could help me.
Even if most of GNU/Linux software is open source, it is not installed (in source code form) by default on your computer.
Regarding C programming, see Modern C (and the C11 standard n1570) and read the documentation of your C compiler (perhaps GCC or Clang, or simpler ones like nwcc or tinycc), your linker (probably binutils), your build automation tool (e.g. GNU make or ninja or cmake). Enable all warnings and DWARF debug info, so if using GCC compile with at least gcc -Wall -Wextra -g; then improve your C code to get no warnings. Once you have debugged your C source code (using GDB and perhaps valgrind), add optimization flags such as -O2. Order of arguments to gcc matters!
Consider, for some tasks, generating some of your C code (perhaps some #include-d header file) with tools like GNU bison, ANTLR, SWIG, RPCGEN, AWK, GUILE, GPP, GNU m4, GNU autoconf - or your own program or script.
I want to learn from the good experiences of others by looking at their implementation logic in their libraries
You need to fetch the source code from elsewhere.
For examples, see GNU libc or musl-libc, and the Linux kernel (and others: GTK, PostGreSQL, sqlite, GUILE, etc.... including many open source programs mentionned in this answer) and look also on websites like github, gitlab, sourceforge
Read also Advanced Linux Programming and syscalls(2). See also http://linuxfromscratch.org/
In 2020, a recent GCC compiler happens to handle specially calls to printf when asked to optimize. See the softwareheritage and Frama-C projects.
In some cases, consider accepting plugins in your program with dlopen(3) and dlsym(3) (see also elf(5) and How to Write Shared Libraries). You might even generate some code at runtime with libraries like libgccjit (or generate C code at runtime, then compile it as a plugin, and load it; such an approach is called metaprogramming and is related to partial evaluation; see also the blog of the late J.Pitrat for more insights).
Of course, you need tools to navigate in source code. Consider using GNU emacs combined with GNU grep for that, or some other source navigator. For large programs of millions of source code lines, consider writing your own GCC plugin to understand them.
Use also tools like strace(1) and GDB to understand the dynamic behavior of programs.
Expect several months of full time work to explore all this.
You could be interested by ACM conference papers also.
For your own source code, consider using some version control tool such as git. Of course read its documentation. And use LibreOffice, Lout or LaTeX, MarkDown (perhaps combined with inkscape or diagrams for figures) to write the documentation of your software.
In some cases, you might consider generating parts of the documentation from parts of your source code (e.g. using literate programming techniques like nuweb or documentation generators like doxygen).

Extract function source code from existing open source C library

I need to extract source code for a function from the existing C library (the library is open source). The problem is that functions are created using macros in header files, and when I write a test project and link the library to it the debugger points me to that header file on 'go to definition' action. I have the source code of the library and I guess i need to build it together with my test code (maybe this is not correct, I am not sure). Any advice on how to proceed, what to use? Thank you.
I need to extract source code for a function from the existing C library (the library is open source).
Several C compilers are themselves open source. Both GCC and Clang are (and so is tinycc). So you legally could improve them (but that could take months of work).
In addition, recent GCC versions (e.g. in july 2020, GCC 10) accept plugins. Your GCC plugin could work on some internal GCC representations (e.g. GIMPLE, GENERIC) so will know about functions (even obtained by preprocessor expansion).
You could also consider using some open source static program analyzers, such as Frama-C or Clang static analyzer.
PS. Take into account open source license issues (legal ones). I am not a lawyer (and you might need to ask one, if you mix various software of different open source licenses).

relationship of c compiler and c standard library

I have been doing a lot reading lately about how glibc functions wrap system calls in linux. I am wondering however about the relationship between glibc and the GNU C Compiler.
Lets say for example I wanted to write my own C Standard implementation and write a new library called "newglibc" and I change things just slightly. Like for example I take more checks and actions before and after the system calls. Would I have to write a new compiler? Or would I be able to use the same GNU gcc compiler?
If the compiler is completely separate from the library, then would someone be able to, THEORETICALLY, use the gcc on windows system if they could turn it into a .exe and provide the standard C library that windows provides?
Thank you
The Linux kernel, the GNU C Library ("glibc"), and the GNU Compiler Collection (gcc) are three separate development projects. They are often used all together, but they don't have to be. The ones with "GNU" in their name are offically part of the GNU Project; Linux isn't.
The C standard does not make a distinction between the "compiler" and the "library"; it's all one "implementation" to the committee. It is largely a historical accident that GCC is a separate development project from glibc—but a motivated one: back in the day, each commercial Unix variant shipped with its own C library and compiler, and they were terrible, 90% bugs by volume was typical. GNU got its start providing a less terrible replacement for the compiler (and the shell utilities, which were also terrible).
Replacing the compiler on a traditional commercial Unix is a lot easier than replacing the C library, because the C library isn't just the functions defined in clause 7 of the C standard; as you have noticed, it also provides the lowest-level interface to the kernel, and often that wasn't very well documented. glibc did at one time at least sort-of support a bunch of these Unixes, but nowadays it can only be used with Linux and an experimental kernel called the Hurd. By contrast, GCC supports dozens of different CPUs and kernels, and Linux supports dozens of different CPUs.
If you write your own C library and/or kernel, it is relatively easy to write a "back end" so that GCC can generate code for them as a cross-compiler, and somewhat more difficult to port GCC to run in that environment. You may also need to write a back end for the assembler and linker, which are yet a fourth project ("GNU Binutils"). Porting glibc to a new CPU running Linux is a large but straightforward task; porting glibc to a new operating system is hard, especially if that OS is not Unix-ish. (Windows is decidedly not Unix-ish, so much so that when Microsoft wanted to make it easier to run programs written for Unix under Windows, the path of least resistance was to bolt an in-house clone of the Linux kernel onto the side of the NT kernel. I am not making this up.)
If you write your own C compiler, you will have to make it conform to the expectations of the library and kernel that it is generating code for. A lot of that is documented in the "ABI" specification for the environment you're working in, but not all, unfortunately.
If that doesn't clarify, please let us know what is still unclear.

histogram function in ansi C program: GSL and/or others?

If I just want to use the gsl_histogram.h library from Gnu Scientific Library (GSL), can I copy it from an existing machine (Mac OS Snow Leopard) that has GSL installed to a different machine (Linux CentOS 5.7) that doesn't have GSL installed, and just use an #include <gls_histogram.h> statement in my c program? Would this work?
Or, do I have to go through the full install of GSL on the Linux box, even though I only need this one library?
Just copying a header gsl_histogram.h is not enough. Header states merely the interface that is exposed by this library. You would need to copy also binaries like *.so and *.a files, but it's hard to tell which ones to copy. So I think the you'd better just install it on your machine. It's pretty easy, just use this tutorial to find and install GSL package.
So there are surely a lot of libraries out there. However the particular one is Gnuplot. Using it you even do not need to compile the code, however you do need to read a bit of documentation. But luckily there is already a question about how to draw a histogram with Gnuplot on Stackoverflow: Histogram using gnuplot? It worth noting that Gnuplot is actually very powerful tool, so invested time into reading its documentation will certainly pay off.
You cannot copy libraries from OS and expect them to work unchanged.
OS X uses the Mach-O object file format while modern Linux systems use the ELF object file format. The usual ld.so(8) linker/loader will not know how to load the Mach-O format object files for your executable to execute. So you would need the Apple-provided ld.so(8) -- or whatever they call their loader. (It's been a while.)
Furthermore, the object files from OS X will be linked against the Apple-supplied libc, and require the corresponding symbols from the Apple-supplied library. You would also need to provide the Apple-provided libc on the Linux system. This C library would try to make system calls using the OS X system call numbers and calling conventions. I guarantee the system call numbers have changed and almost certainly calling conventions are different.
While the Linux kernel's binfmt_misc generic object loader can be used to teach the kernel how to load different object file formats, and the kernel's personality(2) system call can be used to select between different calling conventions, system call numbers, and so on, the amount of work required to make this work is nothing short of immense: the WINE Project has been working on exactly this issue (but with the Windows format COFF and supporting libraries) since 1993.
It would be easier to run:
apt-get install libgs0-dev
or whatever the equivalent is on your distribution of choice. If your distribution does not make it easily available, it would still be easier to compile and install the library by hand rather than try to make the OS X version work.

Why in Linux compiler we have to give additional arguments while compiling and running C programs?

I have implemented semaphores in Linux last year. But for that I have to use -lpthread.
Now while implementing log10() function in C, I surfed the INTERNET and I saw that I have to use -lm.
I want to know why these kind of command line arguments are necessary in Linux.And Does this rule is compiler oriented?
(In windows Turboc compiler, I never used these kind of arguments.)
You are instructing the compiler to look for certain libraries and use them to try and produce a final object file.
When you were doing your threading code, you used threading primitives. These threading primitives are implemented in a library called pthread, -lpthread tells the linker to use the library pthread, without providing this switch the compiler will not be able to produce a valid object file as it is missing threading code implementation.
On the file system the libraries can be found in /usr/lib and lib (among others) when you look in these directories you will see files start with the lib prefix. for example libpthreadxxxxxx. You will have to do your own research to figure out what the xxxx means.
The development cycle using unix style tools is very granular on the surface, when you use heavyweight IDE's (read: visual studiio for C++), the IDE implicetly links against loads of standard libraries, so often you do not need to supply the name of the libraries you will commonly use. However, when you start doing more advanced programming you will probably have to install and configure your IDE to use external code libraries. If you were to use threading primitives in visual studio, you most likely will not have to provide the compiler with information on where to look for threading primitives, Microsoft considers this a common library and every new project will implicitly link against it.
A little discussion on GCC
GCC is a very diverse compiler producing code for various different usage scenarios. As such they try to be neutral and do not make assumptions. For example pthread is a particular threading primitives implementation. However, even through now on Linux at least it is the defacto standard, it is not the only one. Other Unix implementation have had different implementation. When such choices exist it is not fair for the compiler developers to implicitly link against libraries. They do however implicitly link against standard libraries; for example G++ is just a wrapper command to the internal compiler code, it is a C++ front-end so it implicitly links against an implementation of the C++ standard library. Similarly the C front end links against a the standard C library.
People often do not want to use certain standard library implementation, and instead they might want to use another implementation, in such cases you have to explicetly inform the compiler to use an implementation that you provide. Such use cases are very granular and are surface level issues with G++. In visual studio, you would have to tinker a lot to make such changes generally, since it is not an anticipated use-case anymore.
wikipedia will provide you with more information.
Edit: I'll fix the spelling and Grammatical issues later :D
The option -l indicates to gcc what libraries must be used for linking. -lpthread stands for "use the pthread library", and -lm stands for "use the m library" which is the math library. These commands are relative to gcc, not linux.
Because by default, gcc only links the C library (libc), which contains the well-known functions printf, scanf, and many more.
log10 exists in a different library called libm, and thus you need to explictly tell gcc to link that library, with -lm. The same logic applies for -lpthread.
This is purely a backwards, harmful practice. Separating parts of the standard library into separate .so files does nothing but increase load time and memory usage. Good luck getting anyone to change it though... Just accept that you have to do it (and that POSIX specifically allows, but does not require, that an implementation require -lm for using the math functions and -lpthread for using threads, etc.) and move on to more important things.
Or, go pester Drepper about it on the glibc bug tracker/mailing list. He won't change his mind, but if you enjoy flamewars you can get some kicks...

Resources