How can I search for a file using C?

I've spent the last few days looking for a way to search for files matching a pattern (*-stack.txt, for example) and have had a very difficult time finding one. Does anyone know of a way to do this? I have searched on Google as well, but could not find anything of use. This would just search a Linux directory for files that match a certain pattern.
(an example of a directory plus output)
/dev/shm/123-stack.txt abc-stack.txt overflow-stack.txt
Searching for *-stack.txt would return all of the above files.

Your best bet is probably glob(3). It does almost exactly what you want. From what you've said, a sketch of the proper code is:
/* needs <glob.h>, <limits.h> (PATH_MAX) and <stdio.h> (snprintf) */
char glob_pattern[PATH_MAX];
glob_t glob_result;
snprintf(glob_pattern, PATH_MAX, "%s/%s", directory, file_pattern);
if (glob(glob_pattern, 0, NULL, &glob_result) == 0) {
    for (size_t i = 0; i < glob_result.gl_pathc; ++i) {
        char *path = glob_result.gl_pathv[i];
        /* process path */
    }
}
globfree(&glob_result); /* free the path list that glob() allocated */
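For completeness, here is one way the sketch could be wrapped into a function with error handling. The name print_matches and its return convention are assumptions made for illustration; only glob(), globfree() and snprintf() come from the standard library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <glob.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: prints every path in `directory` that matches
 * `file_pattern`. Returns the number of matches, or -1 on error. */
int print_matches(const char *directory, const char *file_pattern)
{
    assert(directory != NULL && file_pattern != NULL);

    char glob_pattern[PATH_MAX];
    int n = snprintf(glob_pattern, sizeof glob_pattern, "%s/%s",
                     directory, file_pattern);
    if (n < 0 || (size_t)n >= sizeof glob_pattern)
        return -1;                       /* pattern would be truncated */

    glob_t glob_result;
    int rc = glob(glob_pattern, 0, NULL, &glob_result);
    int count = -1;
    if (rc == 0) {
        for (size_t i = 0; i < glob_result.gl_pathc; ++i)
            printf("%s\n", glob_result.gl_pathv[i]);
        count = (int)glob_result.gl_pathc;
    } else if (rc == GLOB_NOMATCH) {
        count = 0;                       /* no match is not an error here */
    }
    globfree(&glob_result);              /* release whatever glob() allocated */
    return count;
}
```

Calling print_matches("/dev/shm", "*-stack.txt") would then list the three files from the question, assuming they exist.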

I think you should use the opendir function (together with readdir), as described in this question.
But it's going to be a lot more work on top of that, which is why higher-level languages provide better interfaces.
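The extra work mostly amounts to matching each directory entry yourself. A minimal sketch, assuming POSIX fnmatch() is acceptable for the wildcard matching (the helper name list_matching is made up for this example):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fnmatch.h>
#include <stdio.h>

/* Hypothetical helper: prints the entries of `directory` whose names match
 * the shell-style `pattern`. Returns the match count, or -1 if the
 * directory cannot be opened. */
int list_matching(const char *directory, const char *pattern)
{
    assert(directory != NULL && pattern != NULL);

    DIR *dir = opendir(directory);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* FNM_PERIOD: a leading '.' must be matched explicitly, as in the shell */
        if (fnmatch(pattern, entry->d_name, FNM_PERIOD) == 0) {
            printf("%s/%s\n", directory, entry->d_name);
            ++count;
        }
    }
    closedir(dir);
    return count;
}
```

Unlike glob(), readdir() returns entries in no particular order, so sort the results yourself if the order matters.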

Related

Structure information for pcre

I have the following function to compile a pcre regex:
/**
 * common options: PCRE_DOTALL, PCRE_EXTENDED, PCRE_CASELESS, PCRE_MULTILINE
 * full options located at: https://man7.org/linux/man-pages/man3/pcre_compile.3.html
 */
pcre* pcre_compile_pattern(const char* pattern, int options)
{
    const char *pcre_error;
    int error_offset;
    pcre *re_compiled = pcre_compile(pattern, options, &pcre_error, &error_offset, NULL);
    if (re_compiled == NULL) {
        fprintf(stderr, "ERROR: '%s' occurs at pattern position %d\n", pcre_error, error_offset);
    }
    return re_compiled;
}
Is there a place where the pcre struct is described? For example, I'm looking to see if it contains the pattern (as a normal string) inside it or whether I have to keep the pattern separately. I've seen a lot of references in the man pages to pcre* but I haven't really been able to get more details on that struct.
In searching github here was one place I was able to find it, which seems like it might be what I'm using: https://github.com/luvit/pcre/blob/e2a236a5737b58d43bf324208406a60fe0dd95f4/pcre_internal.h#L2317. Everything is private though so you cannot access part of the struct, for example to read/print it directly.
Is there a place where the pcre struct is described?
The include file defining the interface is pcre.h for version 1 or pcre2.h for version 2.
Much in the same way that we don't need to know how stdio's FILE struct is designed, we don't need to know how pcre is defined: it is an opaque type that you only pass back to the library's functions. We also will not need the pattern after we have received a pcre struct, so keep your own copy only if your code wants it, for example for logging.
Shawn, in comments, pointed out the importance of using pcre2 for new code. The website notes this too: pcre is end of life, with 8.45 as the last version; use pcre2 for new projects.
The primary change for pcre2 is more aggressive pattern validation.
A demonstration of pcre2 is available here.

Proper methods to Copy files/folders programmatically in C using POSIX functions

These terms may not be 100% accurate, but I'm using the GCC compiler and POSIX library. I have C code compiled with the SQLite amalgamation file to a single executable.
In the user interface that exchanges JSON messages with the C program, I'd like to make it possible for users to copy the SQLite database files they create through the C program, and copy a full directory/folder.
Thus far, I've been able to rename and move files and folders programmatically.
I've read many questions and answers here, at Microsoft's C runtime library, and other places but I must be missing the fundamental points. I'm using regular old C, not C++ or C#.
My question is: are there POSIX functions similar to rename(), _mkdir(), rmdir(), remove(), _stat(), that allow for programmatic copying of files and folders in Windows and Linux?
If not, can one just make a new folder and/or file and fread/fwrite the bytes from the original file to the new file?
I am primarily concerned with copying SQLite database files, although I wouldn't mind knowing the answer in general also.
Is this answer an adequate method?
Is the system() function a poor method? It seems to work quite well. However, it took a while to figure out how to stop the messages, such as "copied 2 files", from being sent to stdout and shutting down the requesting application, since they are not well-formed JSON. This answer explains how and has a link to Microsoft's "Using command redirection operators". A /q in xcopy may or may not be necessary also, but certainly didn't do the job alone.
Thank you very much for any direction you may be able to provide.
The question that someone suggested as an answer and placed the little submission box on this question is one that I had already linked to in my question. I don't mean to be rude but, if it had answered my question, I would not have written this one. Thank you whoever you are for taking the time to respond, I appreciate it.
I don't see how that would be a better option than using system() because with the right parameters all the sub-directories and files of a single parent folder can be copied in one statement without having to iterate through all of them manually. Is there any reason why it would not be better to use system() apart from the fact that code will need to be different for each OS?
Handling errors is a bit different because system() doesn't set errno but returns an exit code; however, the errors can be redirected from stderr to a file and pulled from there when necessary.
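On POSIX systems the value returned by system() is a wait status rather than the plain exit code, so it has to be decoded first. The sketch below assumes a POSIX shell and uses a hypothetical helper name, run_command; it does not apply as written to the Windows C runtime, which has no <sys/wait.h>.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: runs `command` through the shell and returns its
 * exit code (0-255), or -1 if the shell could not be started or the
 * command was terminated by a signal. */
int run_command(const char *command)
{
    assert(command != NULL);

    int status = system(command);
    if (status == -1)
        return -1;                 /* could not create the child process */
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;                     /* killed by a signal */
}
```

A call such as run_command("cp -R src dst 2>errors.log") would keep the copy tool's messages out of the program's own output; the paths and the log file name here are placeholders.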
rename(): posix
_mkdir(): not posix. You want mkdir which is. mkdir takes two arguments, the second of which should usually be 0777 (the process umask then removes the permission bits that should not be set).
rmdir(): posix
remove(): posix
_stat(): not posix, you want stat() which is.
_stat and _mkdir are called as such on the Windows C library because they're not quite compatible with the modern Unix calls. _mkdir is missing an argument, and _stat looks like a very old version of the Unix call. With plain _stat you'll have trouble on Windows with files larger than 2GB.
You could do:
#ifdef _WIN32
int mkdir(const char *path, int mode) { return _mkdir(path); } /* In the original C we could have #defined this but that doesn't work anymore */
#define stat _stat64
#endif
but if you do so, test it like crazy.
In the end, you're going to be copying stuff with stdio; this loop works. (beware the linked answer; it has bugs that'll bite ya.)
#include <stdio.h>  /* fopen, fread, fwrite, fclose, ferror */
#include <stdlib.h> /* malloc, free */

int copyfile(const char *src, const char *dst)
{
    const int bufsz = 65536;
    char *buf = malloc(bufsz);
    if (!buf) return -1; /* like mkdir, rmdir, return 0 for success, -1 for failure */
    FILE *hin = fopen(src, "rb");
    if (!hin) { free(buf); return -1; }
    FILE *hout = fopen(dst, "wb");
    if (!hout) { free(buf); fclose(hin); return -1; }
    size_t buflen;
    while ((buflen = fread(buf, 1, bufsz, hin)) > 0) {
        if (buflen != fwrite(buf, 1, buflen, hout)) {
            fclose(hout);
            fclose(hin);
            free(buf);
            return -1; /* IO error writing data */
        }
    }
    free(buf);
    int r = ferror(hin) ? -1 : 0; /* check if fread had indicated IO error on input */
    fclose(hin);
    return r | (fclose(hout) ? -1 : 0); /* final case: check if IO error flushing buffer -- don't omit this it really can happen; calling `fflush()` won't help. */
}
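For the folder half of the question, one possible approach on POSIX systems is to walk the source tree with nftw() and recreate it entry by entry. This is only a sketch under stated assumptions: copy_tree, copy_entry and copy_bytes are hypothetical names, src must not end in a slash, symlinks and special files are skipped, and nftw() is not part of the Windows C runtime.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <ftw.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* nftw() callbacks receive no user data, so the roots live in globals. */
static const char *src_root, *dst_root;

/* Same stdio loop as copyfile above, with a stack buffer. */
static int copy_bytes(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");
    if (!in) return -1;
    FILE *out = fopen(dst, "wb");
    if (!out) { fclose(in); return -1; }
    char buf[65536];
    size_t n;
    int rc = 0;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
        if (fwrite(buf, 1, n, out) != n) { rc = -1; break; }
    if (ferror(in)) rc = -1;
    fclose(in);
    if (fclose(out) != 0) rc = -1;   /* flushing the last buffer can fail */
    return rc;
}

/* Called by nftw() for every entry; directories arrive before their contents. */
static int copy_entry(const char *path, const struct stat *sb, int type,
                      struct FTW *ftw)
{
    (void)sb; (void)ftw;
    char target[PATH_MAX];
    int n = snprintf(target, sizeof target, "%s%s",
                     dst_root, path + strlen(src_root));
    if (n < 0 || (size_t)n >= sizeof target) return -1;
    if (type == FTW_D)               /* tolerate an already existing directory */
        return (mkdir(target, 0777) != 0 && errno != EEXIST) ? -1 : 0;
    if (type == FTW_F)
        return copy_bytes(path, target);
    return 0;                        /* skip symlinks and special files */
}

/* Returns 0 on success, -1 on failure (a non-zero callback result stops the walk). */
int copy_tree(const char *src, const char *dst)
{
    assert(src != NULL && dst != NULL);
    src_root = src;
    dst_root = dst;
    return nftw(src, copy_entry, 16, FTW_PHYS) == 0 ? 0 : -1;
}
```

Because of the global state, this sketch is not safe to call from several threads at once.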

How to compare two (absolute) paths (given as char* ) in C and check if they are the same?

Given two paths as char*, I can't determine if the two paths are pointing to the same file.
How can I implement a platform-independent utility in C that checks whether two paths point to the same file?
Using strcmp will not work because on Windows paths can contain \ or /
Using st_ino will not help because it does not work on Windows
char *fileName = du->getFileName();
char *oldFileName = m_duPtr->getFileName();
bool isSameFile = pathCompare(fileName, oldFileName) == 0;//(strcmp(fileName, oldFileName) == 0);
if (isSameFile){
    stat(fileName, &pBuf);
    stat(oldFileName, &pBuf2);
    if (pBuf.st_ino == pBuf2.st_ino){
        bRet = true;
    }
}
You can't. Hard links also exist on Windows and the C standard library has no methods for operating on them.
Plausible solutions to the larger problem: link against cygwin1.dll and use the st_ino method. You omitted st_dev from your sample code and need to put it back, because inode numbers are only unique within a single device.
While there is an actual way to accomplish this on Windows, it involves ntdll methods and I had to read Cygwin's code to find out how to do it.
The methods are NtGetFileInformationByHandle and NtFsGetVolumeInformationByHandle. There are documented kernel32 calls that claim to do the same thing. See the cygwin source code for why they don't work right (buggy fs drivers).

What is the entry point for git?

I was browsing through the git source code, and I was wondering where the entry point file is? I have gone through a couple files, that I thought would be it but could not find a main function.
I could be wrong, but I believe the entrypoint is main() in common-main.c.
int main(int argc, const char **argv)
{
    /*
     * Always open file descriptors 0/1/2 to avoid clobbering files
     * in die(). It also avoids messing up when the pipes are dup'ed
     * onto stdin/stdout/stderr in the child processes we spawn.
     */
    sanitize_stdfds();
    git_setup_gettext();
    git_extract_argv0_path(argv[0]);
    restore_sigpipe_to_default();
    return cmd_main(argc, argv);
}
At the end you can see it returns cmd_main(argc, argv). There are a number of definitions of cmd_main(), but I believe the one called here is the one defined in git.c, which is a bit long to post here in its entirety, but is excerpted below:
int cmd_main(int argc, const char **argv)
{
    const char *cmd;
    cmd = argv[0];
    if (!cmd)
        cmd = "git-help";
    else {
        const char *slash = find_last_dir_sep(cmd);
        if (slash)
            cmd = slash + 1;
    }
    /*
     * "git-xxxx" is the same as "git xxxx", but we obviously:
     *
     * - cannot take flags in between the "git" and the "xxxx".
     * - cannot execute it externally (since it would just do
     * the same thing over again)
     *
     * So we just directly call the builtin handler, and die if
     * that one cannot handle it.
     */
    if (skip_prefix(cmd, "git-", &cmd)) {
        argv[0] = cmd;
        handle_builtin(argc, argv);
        die("cannot handle %s as a builtin", cmd);
    }
handle_builtin() is also defined in git.c.
Perhaps it's best to address the misunderstanding. Git is a way of collecting, recording, and archiving changes to a project directory. This is the purpose of a Version Control System, and git is perhaps one of the more recognizable ones.
Sometimes they also provide build automation, but often the best tools focus on the fewest responsibilities. In the case of git, it mostly focuses on commits to a repository in order to preserve different states of the directory it is initialized to. It doesn't build the program, so the entry points are unaffected.
For C projects, the entry point will always be the same one defined by the compiler. Generally this is a function called main, but there are ways to redefine or hide this entry point. Arduino, for example, uses setup as the entry point and then calls loop.
The comment left by @larks is an easy way to find the entry point when you're not sure. A simple search from a git repo's root directory can hunt for the word main in every .c file there:
grep main *.c
The Windows equivalent is FINDSTR, but recent updates to Windows 10 have greatly improved compatibility with Bash commands. grep is usable in the version I'm running. So is ls, though I'm not sure whether it has been there all along.
Some git projects include multiple languages, and many languages related to C (and predecessors) use the same entry point name. Looking only in file extensions of .c is a good way to find the entry point of the C components, assuming the code is of high enough quality that you'd want to run it in the first place.
There are definitely ways to interfere with how well the extension filters out other languages, but their use implies very haphazard coding practice.

How can I search for files in a directory using a wildcard

I am currently trying to find a way to search for files in a specific directory (/dev/shm in this instance, no wildcard needed for this part) that fit a pattern that includes a wildcard. Let's say, for instance, I have a directory that has
stack_review.txt stack_overflow.txt stack_servers.txt
in it, and I wanted to return all results that fit the pattern stack_*.txt. How would I go about doing this? I've tried a few examples using readdir but unfortunately have not found anything that works correctly for this implementation yet, so I would really appreciate any help that I can get with this problem. Thanks!
You're looking for glob(). From http://linux.die.net/man/3/glob:
One example of use is the following code, which simulates typing ls -l *.c ../*.c in the shell:
glob_t globbuf;
globbuf.gl_offs = 2;
glob("*.c", GLOB_DOOFFS, NULL, &globbuf);
glob("../*.c", GLOB_DOOFFS | GLOB_APPEND, NULL, &globbuf);
globbuf.gl_pathv[0] = "ls";
globbuf.gl_pathv[1] = "-l";
execvp("ls", &globbuf.gl_pathv[0]);
Is it possible for elements to be nested in folders? If so your method must be recursive.
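If the files can be nested, one option on POSIX systems is to let nftw() do the recursion and test each file name with fnmatch(). This is a sketch rather than a definitive implementation: find_recursive and visit are hypothetical names, and the pattern is applied to the file name only, not to the whole path.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fnmatch.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* nftw() callbacks receive no user data, so the search state is global. */
static const char *search_pattern;
static int match_count;

/* Called by nftw() for every entry below the root. */
static int visit(const char *path, const struct stat *sb, int typeflag,
                 struct FTW *ftwbuf)
{
    (void)sb;
    /* ftwbuf->base is the offset of the file name within path */
    if (typeflag == FTW_F &&
        fnmatch(search_pattern, path + ftwbuf->base, 0) == 0) {
        printf("%s\n", path);
        ++match_count;
    }
    return 0;                      /* 0 tells nftw() to keep walking */
}

/* Hypothetical helper: prints all regular files under `root` whose names
 * match `pattern`. Returns the match count, or -1 if the walk fails. */
int find_recursive(const char *root, const char *pattern)
{
    assert(root != NULL && pattern != NULL);

    search_pattern = pattern;
    match_count = 0;
    if (nftw(root, visit, 16, FTW_PHYS) != 0)
        return -1;
    return match_count;
}
```

For the question's example, find_recursive("/dev/shm", "stack_*.txt") would also report matching files in subdirectories of /dev/shm.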
