What does git_sysdir_find_in_dirlist() in libgit2 do? - c

I'm working on porting some logic from libgit2 into Go, but not a 1:1 port as Go works differently. I think this function scans a directory tree, but I'm not sure.
static int git_sysdir_find_in_dirlist(
    git_buf *path,
    const char *name,
    git_sysdir_t which,
    const char *label)
{
    // allocations
    size_t len;
    const char *scan, *next = NULL;
    const git_buf *syspath;

    // check the path to make sure it exists?
    GIT_ERROR_CHECK_ERROR(git_sysdir_get(&syspath, which));
    if (!syspath || !git_buf_len(syspath))
        goto done;

    // this is the part I don't understand
    for (scan = git_buf_cstr(syspath); scan; scan = next) {
        /* find unescaped separator or end of string */
        for (next = scan; *next; ++next) {
            if (*next == GIT_PATH_LIST_SEPARATOR &&
                (next <= scan || next[-1] != '\\'))
                break;
        }

        len = (size_t)(next - scan);
        next = (*next ? next + 1 : NULL);
        if (!len)
            continue;

        GIT_ERROR_CHECK_ERROR(git_buf_set(path, scan, len));
        if (name)
            GIT_ERROR_CHECK_ERROR(git_buf_joinpath(path, path->ptr, name));

        if (git_path_exists(path->ptr))
            return 0;
    }

done:
    git_buf_dispose(path);
    git_error_set(GIT_ERROR_OS, "the %s file '%s' doesn't exist", label, name);
    return GIT_ENOTFOUND;
}
It's the for loop that's confusing me. for (scan = git_buf_cstr(syspath); scan; scan = next) { ... } looks like it's iterating/scanning syspath, and then I get totally lost at the inner for (next = scan; *next; ++next) { ... }.
What does this function specifically do?

This is not looking at a directory tree but rather at a delimited string containing a list of directories. For instance (though this clearly isn't aimed at this particular case), as the top-level documentation says:
GIT_ALTERNATE_OBJECT_DIRECTORIES
Due to the immutable nature of Git objects, old objects can be
archived into shared, read-only directories. This variable
specifies a ":" separated (on Windows ";" separated) list of Git
object directories which can be used to search for Git objects. New
objects will not be written to these directories.
Entries that begin with " (double-quote) will be interpreted as
C-style quoted paths, removing leading and trailing double-quotes
and respecting backslash escapes. E.g., the value
"path-with-\"-and-:-in-it":vanilla-path has two paths:
path-with-"-and-:-in-it and vanilla-path.
The function is obviously scanning a character-separated path list (whatever that character is—probably colon or semicolon as above), checking for backslash prefixes, so that you can write /a:C\:/a to allow the thing to look into either /a or C:/a.
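The scanning step can be pulled out into a small standalone sketch. This is not libgit2 code: LIST_SEP, next_entry and count_entries are hypothetical names, and LIST_SEP is assumed to be ':' as a stand-in for GIT_PATH_LIST_SEPARATOR. The inner loop is the same one as in the question.

```c
#include <stddef.h>

/* Hypothetical stand-in for GIT_PATH_LIST_SEPARATOR (assumed ':' here). */
#define LIST_SEP ':'

/*
 * `scan` points at the start of one entry. On return, *len holds that
 * entry's length and the result points at the start of the next entry,
 * or is NULL when the list is exhausted. A separator that directly
 * follows a backslash does not end the entry.
 */
static const char *next_entry(const char *scan, size_t *len)
{
    const char *next;

    for (next = scan; *next; ++next) {
        if (*next == LIST_SEP && (next <= scan || next[-1] != '\\'))
            break;
    }
    *len = (size_t)(next - scan);
    return *next ? next + 1 : NULL;
}

/* Count the non-empty entries, mirroring the outer loop in the question. */
static size_t count_entries(const char *list)
{
    size_t len, count = 0;
    const char *scan, *next;

    for (scan = list; scan; scan = next) {
        next = next_entry(scan, &len);
        if (len)
            count++;
    }
    return count;
}
```

With the list /a:C\:/a this yields two entries, /a and C\:/a. Note that the backslash is only skipped over while scanning; it is still part of the entry that comes out.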

This function is tasked with locating the file name inside a "configuration level" (i.e. ~/.git/, /etc/git; see git_sysdir_t for a list of known locations). As those levels are stored as a bunch of static ("read-only") C strings separated by GIT_PATH_LIST_SEPARATOR (":" or, on Windows, ";"), and we can't modify them at runtime, we have to jump through hoops to do what amounts to a foreach-string loop.

Related

`execvp()` seems to not be completing the path search

TL;DR -- what sorts of things might cause an execvp call to not fully function/ search the path properly?
I'm on the tail end of building a rudimentary shell with some quality-of-life features that I've added over time, e.g. history, aliases, and completions. I built those features on top of a functional shell that had a working $PATH search for execution, e.g. typing in "ls -la" produced the desired behavior. As you might imagine, I accomplished this just using execvp. (This is written in C, if it's not already clear.)
I have not changed any of my tokenizing logic and have ensured that the file name is correct; in particular, execvp was producing the desired behavior before I had added these features to my REPL. echo "hello" still produces a tokenized char **xyz and the first token is indeed echo, null-terminated, and so on. That is, my call still looks like, with variables filled-in, ... execvp("echo", argv); after which I call perror, which should only trigger when something has gone awry. Each time I just run the above command, though, since I've added in these features, it returns a failure with the no such file or directory --- before I added these features in though, the behavior was as desired. I'll note, though, that running /bin/echo "hello" runs as expected. Examples are WLOG.
I'm not sure where I should even start looking for errors, and my Google-fu has been mostly fruitless: any suggestions?
I'm initially going to omit code because it totals to several hundred lines and a MWE would not be particularly minimal in addition to my desires to keep this general rather than very particular to my code, though I'm not sure what's causing this. My repository is public and up-to-date, and I'm happy to post any code here.
EDIT:
I knew I wasn't explicitly editing the PATH variable, etc., but this block of code was the problem:
// Grab $PATH from env
char *pathvar = getenv("PATH");
if (pathvar) {
char *path;
int i;
// tokenize on colon to get paths
// then use that immediately to
// scandir, and add everything in
// there to the completions system
path = strtok(pathvar, ":");
while (path) {
struct dirent **fListTemp;
int num_files = scandir(path, &fListTemp, NULL, alphasort);
// only adding the names that are completely composed of
// lower case letters; completions are done using a naive
// Trie Node structure that only supports lowercase letters
// for now... e.g. g++ does not work, and the '+' leads to
// a seg-fault. Same holds for . and ..
for (i = 0; i < num_files; i++) {
char *curr = fListTemp[i]->d_name;
if (strcmp(curr, ".")==0 || strcmp(curr, "..")==0){
continue;
} else if (notalpha(curr)) {
continue;
} else {
str_tolower(curr);
tn_insert(completions, curr);
}
}
for (i = 0; i < num_files; i++) {
free(fListTemp[i]);
}
free (fListTemp);
path = strtok(NULL, ":");
}
} else {
fprintf(stderr, "{wsh # init} -- $PATH variable could not be found?");
}
Note that
The getenv() function returns a pointer to the value in the
environment, or NULL if there is no match.
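so my original code was indeed tampering with the PATH variable: strtok() overwrites each delimiter it finds with a null byte, and because getenv() returns a pointer to the value in the environment itself, the first call cut PATH down to its first entry. After that, execvp only searched that one directory. The solution I came up with quickly was just to create a copy of that string and use that to parse through the PATH instead: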
// Grab $PATH from env
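char *pathvar = getenv("PATH");
// strtok() writes into its argument, so tokenize a private copy.
// strdup() is POSIX (<string.h>); free(pathvar_cpy) once the loop below is done.
char *pathvar_cpy = pathvar ? strdup(pathvar) : NULL;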
if (pathvar_cpy) {
char *path;
int i;
path = strtok(pathvar_cpy, ":");
while (path) {
// Scan directory
struct dirent **fListTemp;
int num_files = scandir(path, &fListTemp, NULL, alphasort);
for (i = 0; i < num_files; i++) {
char *curr = fListTemp[i]->d_name;
if (strcmp(curr, ".")==0 || strcmp(curr, "..")==0){
continue;
} else if (notalpha(curr)) {
continue;
} else {
str_tolower(curr);
tn_insert(completions, curr);
}
}
for (i = 0; i < num_files; i++) {
free(fListTemp[i]);
}
free (fListTemp);
path = strtok(NULL, ":");
}
} else {
fprintf(stderr, "{wsh # init} -- $PATH variable could not be found?");
}
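An alternative to copying the string is to walk it without writing to it at all. The following is a minimal sketch, assuming a ':' separator; for_each_path_entry, visit and ctx are hypothetical names and not part of the shell above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Call `visit` once for every non-empty entry of a ':'-separated list
 * without modifying `list`. Returns the number of entries seen.
 * Entries are passed as pointer + length because they are not
 * NUL-terminated inside the original string.
 */
static size_t for_each_path_entry(const char *list,
                                  void (*visit)(const char *entry, size_t len, void *ctx),
                                  void *ctx)
{
    size_t count = 0;

    while (*list) {
        size_t len = strcspn(list, ":");  /* length up to the next ':' or the end */

        if (len) {
            if (visit)
                visit(list, len, ctx);
            count++;
        }
        list += len;
        if (*list == ':')
            list++;                       /* step over the separator */
    }
    return count;
}
```

Since each entry is not NUL-terminated, a callback that wants to hand it to scandir() would still have to copy that one entry into its own buffer first, but the environment string itself stays intact.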

Finding tab and/or space separated words in a string (dependencies in a target in a makefile)

I'd like to write a program in C, which takes a line in a makefile, and process it in a few ways:
If the line contains a colon, we assume that it is a target line, and that there exists a target name before the colon.
On target lines, we store the target in a variable targetName as a null-terminated string.
On target lines, we then look for any dependencies in the same line as the target, after the colon. Zero or more whitespace characters (spaces and tabs) may appear after the colon, between dependencies or at the end of the line, but we are only interested in strings which either:
a. Identify existing files on disk,
b. Identify other targets, or
c. Identify a web-based URL.
A line to be processed from a makefile could look something like this:
calcmarks.o : calcmarks.c calcmarks.h globals.o
Where calcmarks.o is the name of the target, and calcmarks.c, calcmarks.h and globals.o are all names of dependencies, with the first 2 being files on disk and the final dependency being another target. We would store calcmarks.o in the variable targetName, and then select a suitable data structure to store the other dependencies. I would like to store the names of the dependencies in a way that makes it easy for me to later check the modification date of any dependency and compare it with the modification date of the target to see if the dependency has been modified more recently than the target.
Here is the code which I've been able to come up with so far. Essentially this function takes the file pointer and the target to be rebuilt, and will eventually rebuild it.
// filePtr is the text file pointer after going through processVariables
// Target is the name of the target isolated from the command line interface (requested by user)
void processTarget (FILE* filePtr , char target[]) {
// Holds line by line
char buffer[1000];
// Holds name of target with no whitespace
char targetName[1000];
// Boolean to check if the target was indeed found
bool targetFound = false;
while (fgets(buffer , sizeof(buffer) , filePtr) != NULL) {
int i;
for (i = 0; buffer[i] != '\0'; i++) {
// Copying a potential target name until we find a colon
if (buffer[i] == ':') {
// Don't copy the colon, so terminate the string
targetName[i] = '\0';
targetFound = true;
break;
}
targetName[i] = buffer[i];
}
// If the previous loop encountered a target name via the existence of a colon
if (targetFound) {
// If the target name is the same as the required target
// (strcmp from <string.h> compares the contents; == would compare addresses)
if (strcmp(target, targetName) == 0) {
// Test for 0 dependencies
if (buffer[i + 1] == '\0') continue; // End of string
int j;
for (j = i + 1; buffer[j] != '\0'; j++) {
// Processing the line after the colon
}
}
}
}
}
Also note that the number of dependencies is variable; thus if I were to initialise a construct to store their names, I wouldn't know the size parameter.
How do I search for dependencies after the colon, ignoring whitespace, and storing the names of dependencies in a data structure?
EDIT: A clarification thanks to @JonathanLeffler: The dependencies which this project has to recognise will not be much harder than those shown in the line provided above, with the exception of web-based URLs.
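One possible way to collect the dependencies is to split the text after the first colon on whitespace and append each word to a growable array. This is only a sketch under that assumption; struct dep_list, dep_list_add, parse_dependencies and dep_list_free are hypothetical names, and nothing here checks whether a word is a file, a target or a URL.

```c
#include <stdlib.h>
#include <string.h>

/* Growable list of dependency names (hypothetical structure). */
struct dep_list {
    char **names;
    size_t count;
    size_t capacity;
};

/* Append a copy of the first `len` bytes of `name`; 0 on success, -1 on failure. */
static int dep_list_add(struct dep_list *deps, const char *name, size_t len)
{
    char *copy;

    if (deps->count == deps->capacity) {
        size_t cap = deps->capacity ? deps->capacity * 2 : 4;
        char **grown = realloc(deps->names, cap * sizeof *grown);
        if (!grown)
            return -1;
        deps->names = grown;
        deps->capacity = cap;
    }
    copy = malloc(len + 1);
    if (!copy)
        return -1;
    memcpy(copy, name, len);
    copy[len] = '\0';
    deps->names[deps->count++] = copy;
    return 0;
}

/* Split the text that follows the colon on spaces, tabs and line endings. */
static int parse_dependencies(const char *after_colon, struct dep_list *deps)
{
    const char *ws = " \t\r\n";
    const char *p = after_colon;

    while (*p) {
        size_t len;

        p += strspn(p, ws);    /* skip leading whitespace */
        len = strcspn(p, ws);  /* length of the next word */
        if (len && dep_list_add(deps, p, len) != 0)
            return -1;
        p += len;
    }
    return 0;
}

/* Release everything owned by the list. */
static void dep_list_free(struct dep_list *deps)
{
    size_t i;

    for (i = 0; i < deps->count; i++)
        free(deps->names[i]);
    free(deps->names);
    deps->names = NULL;
    deps->count = deps->capacity = 0;
}
```

Starting from a zero-initialised struct dep_list, the array doubles as needed, so the number of dependencies does not have to be known in advance. Because a URL dependency contains a colon itself, only the first colon on the line should be treated as the end of the target name.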

Find and replace a line in a file

My goal is to search the file line by line until it finds a variable declaration in the format of varName = varValue. Count up the bytes to the beginning of that line, then replace that line with the same varName but a new value.
This is a very simple configuration file handler, I'm writing it from scratch to avoid any dependencies. The reason I'm doing it this way and not just dumping a string[string] associative array is because I want to preserve comments. I also wish to refrain from reading the entire file into memory, as it has the potential to get large.
This is the code I have written, but nothing happens and the file remains unchanged when using setVariable.
import std.stdio: File;
import std.string: indexOf, strip, stripRight, split, startsWith;
import std.range: enumerate;
ptrdiff_t getVarPosition(File configFile, const string varName) {
size_t countedBytes = 0;
foreach (line, text; configFile.byLine().enumerate(1)) {
if (text.strip().startsWith(varName))
return countedBytes;
countedBytes += text.length;
}
return -1;
}
void setVariable(File configFile, const string varName, const string varValue) {
ptrdiff_t varPosition = getVarPosition(configFile, varName);
if (varPosition == -1)
return; // For now, just return. This variable doesn't exist.
// Will handle this later, it needs to append to the bottom of the file.
configFile.seek(varPosition);
configFile.write(varName ~ " = " ~ varValue);
}
There are a few parts of your code missing, which makes diagnosis hard. The most important question may be 'How do you open your config file?'. This code does what I expect it to:
unittest {
auto f = File("foo.txt", "r+");
setVariable(f, "var3", "foo");
f.flush();
}
That is, it finds the line starting with "var3", and replaces part of the file with the new value. However, your getVarPosition function doesn't count newlines, so the offset is wrong. Also, consider what happens when the new varValue is a different length from the old value. If you have "var = hello world", and call setVariable(f, "var", "bye"), you'll end up with "var = byelo world". If it's longer than the existing value, it will overwrite the next variable(s).
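Both effects can be reproduced on an in-memory buffer. The sketch below is in C rather than D and only illustrates the arithmetic; line_offset and overwrite_at are hypothetical helpers, lines are assumed to end in a single '\n', and leading whitespace is not stripped the way the D code does.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the byte offset of the first line in `text` that starts with
 * `name`, or -1 if there is none. Each line's '\n' is counted.
 */
static long line_offset(const char *text, const char *name)
{
    size_t name_len = strlen(name);
    const char *line = text;

    while (*line) {
        const char *end = strchr(line, '\n');

        if (strncmp(line, name, name_len) == 0)
            return (long)(line - text);
        if (!end)
            break;
        line = end + 1;  /* the +1 is the newline the original count left out */
    }
    return -1;
}

/* Overwrite bytes in place, the way a seek followed by a write does. */
static void overwrite_at(char *text, long offset, const char *replacement)
{
    memcpy(text + offset, replacement, strlen(replacement));
}
```

For the buffer "a = 1\nvar = hello world\n", the line starting with var sits at offset 6, not 5. Overwriting it there with "var = bye" leaves "var = byelo world", because the tail of the old value is never removed.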

Recursive function : abort-condition

We need to create a binary tree which contains the content of text files. The pointers selection_a and selection_b each point to another text file in the directory.
The structure of the textfiles is following:
line: Title
line: OptionA
line: OptionB
line: Text.
The first file is given as parameter while starting the program. All files should be saved at the beginning of the program. Then the text of the first file shows, and the user can input A or B to continue. Based on the selection, the text of File Option A/B is shown and the user can decide again.
The last file of a tree contains no Options: lines 2 and 3 are "-\n".
The problem is, this code only reads all the option A files of the first tree. It doesn't read in any B-Options. In the end, the program shows a memory access error.
I think the problem is that the readingRows function has no abort condition.
current->selection_a = readingRows(input_selection_a);
current->selection_b = readingRows(input_selection_b);
I know the code may be kind of chaotic, but we are beginners in programming. Hope anybody can help us to write an abort-condition.
The function should be aborted if the content of option A (line 3) is "-\n".
Here is the whole function:
struct story_file* readingRows(FILE *current_file)
{
char *buffer = fileSize(current_file);
char *delimiter = "\n";
char *lines = strtok(buffer, delimiter);
int line_counter = 0;
struct story_file *current = malloc(sizeof(struct story_file));
while(lines != NULL)
{
if(line_counter == 0)
{
current->title = lines;
}
else if(line_counter == 1)
{
char *filename_chapter_a = lines;
FILE *input_selection_a = fopen(filename_chapter_a, "r");
if(input_selection_a)
{
current->selection_a = readingRows(input_selection_a);
}
fclose(input_selection_a);
}
else if(line_counter == 2)
{
char *filename_chapter_b = lines;
FILE *input_selection_b = fopen(filename_chapter_b, "r");
if(input_selection_b)
{
current->selection_b = readingRows(input_selection_b);
}
fclose(input_selection_b);
}
else if (line_counter >= 3)
{
current->text = lines;
}
lines = strtok(NULL, delimiter);
line_counter++;
}
return current;
}
There are two items that define a terminating recursive function:
One or more base cases
Recursive calls that move toward a base case
Your code has one base case: while (lines!=NULL) {} return current;, it breaks the while loop when lines is NULL and returns current. In other words, within any particular call to your function, it only terminates when it reaches the end of a file.
Your code moves toward that base case as long as your files do not refer to each other in a loop. We know this because you always read a line, take an action according to your if-else block, and then read the next line. So you always move toward the end of each file you read.
But as you note, the issue is that you don't have a case to handle "no Options", that is, when lines 2 or 3 are "-\n". So right now, even though you move through files, you are always opening files in line 2. Unless a file is malformed and does not contain a line 2, your recursive call tree never ends. So you just need to add another base case that checks whether lines is that marker (strtok has already stripped the trailing "\n", so the token is just "-"), and if it is, skip the recursive call. This will end that branch of your recursive tree.
Inside of your while loop, you will need code along the lines of:
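if `line_counter` is `1` or `2` (the second and third line of the file)
if `lines` is your terminating marker "-" (strtok has already removed the "\n")
set the matching selection pointer to NULL and do not recurse
else
`fopen` and make the recursive call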
In the parent function that made the recursive call, it will move to the next line and continue as expected.
P.S. Make sure you use free for each malloc you do.
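The shape of that extra base case can be tried out without touching the file system. The following is a sketch, not your program: struct chapter, find_chapter and count_chapters are hypothetical, and an in-memory table stands in for the text files.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * True when an option line marks the end of a branch. strtok() with "\n"
 * as the delimiter has already removed the newline, so the file line
 * "-\n" arrives as the token "-". A trailing '\r' is tolerated.
 */
static bool is_terminal_option(const char *line)
{
    return strcmp(line, "-") == 0 || strcmp(line, "-\r") == 0;
}

/* In-memory stand-in for one story file (hypothetical). */
struct chapter {
    const char *name;
    const char *option_a;
    const char *option_b;
};

static const struct chapter *find_chapter(const struct chapter *all, size_t n,
                                          const char *name)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(all[i].name, name) == 0)
            return &all[i];
    return NULL;
}

/* Count the chapters reachable from `name`, following both options. */
static int count_chapters(const struct chapter *all, size_t n, const char *name)
{
    const struct chapter *c;

    if (is_terminal_option(name))
        return 0;               /* base case: no further file to read */
    c = find_chapter(all, n, name);
    if (!c)
        return 0;               /* unknown file: stop, like a failed fopen */
    return 1 + count_chapters(all, n, c->option_a)
             + count_chapters(all, n, c->option_b);
}
```

The check happens before anything is opened, so a "-" entry ends that branch, while the caller still goes on to handle option B and the text.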

Recursive CreateDirectory

I found many examples of creating a directory recursively, but not the one I was looking for.
here is the spec
Given input
\\server\share\aa\bb\cc
c:\aa\bb\cc
USING helper API
CreateDirectory (char * path)
returns true, if successful
else
FALSE
Condition: There should not be any parsing to distinguish if the path is Local or Server share.
Write a routine in C, or C++
I think it's much easier. Here is a version that works on every Windows version:
std::string::size_type pos = 0; // an unsigned int can never equal npos where size_t is 64 bits wide
do
{
pos = path.find_first_of("\\/", pos + 1);
CreateDirectory(path.substr(0, pos).c_str(), NULL);
} while (pos != std::string::npos);
Unicode:
pos = path.find_first_of(L"\\/", pos + 1);
Regards,
This might be exactly what you want.
It doesn't try to do any parsing to distinguish if the path is Local or Server share.
bool TryCreateDirectory(char *path){
char *p;
bool b;
if(
!(b=CreateDirectory(path))
&&
!(b=NULL==(p=strrchr(path, '\\')))
){
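size_t i=p-path; // length of the parent path, computed before malloc() needs it
p=(char *)malloc(1+i);
if(!p) return false;
memcpy(p, path, i); // needs <string.h>
p[i]='\0';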
b=TryCreateDirectory(p);
free(p);
b=b?CreateDirectory(path):false;
}
return b;
}
The algorithm is quite simple: whenever creating the current directory fails, pass the path of the next-higher directory to the function recursively, until one call succeeds or there is no higher level left. When the inner call returns successfully, create the current directory. The method does not parse the path itself to determine whether it is local or on a server; that is left to CreateDirectory.
In WinAPI, CreateDirectory will never allow you to create "c:" or "\", so when the path reaches that level the method soon ends up calling itself with path="", which fails too. This is the reason why Microsoft defines the file-sharing naming rules like this: for compatibility with DOS path rules and to simplify the coding effort.
Totally hackish and insecure and nothing you'd ever actually want to do in production code, but...
Warning: here be code that was typed in a browser:
int createDirectory(const char * path) {
char * buffer = malloc((strlen(path) + 10) * sizeof(char));
sprintf(buffer, "mkdir -p %s", path);
int result = system(buffer);
free(buffer);
return result;
}
How about using MakeSureDirectoryPathExists() ?
Just walk through each directory level in the path starting from the root, attempting to create the next level.
If any of the CreateDirectory calls fail then you can exit early, you're successful if you get to the end of the path without a failure.
This is assuming that calling CreateDirectory on a path that already exists has no ill effects.
The requirement of not parsing the pathname for server names is interesting, as it seems to concede that parsing for / is required.
Perhaps the idea is to avoid building in hackish expressions for potentially complex syntax for hosts and mount points, which can have on some systems elaborate credentials encoded.
If it's homework, I may be giving away the algorithm you are supposed to think up, but it occurs to me that one way to meet those requirements is to start trying by attempting to mkdir the full pathname. If it fails, trim off the last directory and try again, if that fails, trim off another and try again... Eventually you should reach a root directory without needing to understand the server syntax, and then you will need to start adding pathname components back and making the subdirs one by one.
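The trim-and-retry idea can be sketched in portable C by passing the helper in as a function pointer. None of this is Windows API code: create_dir_fn stands in for the CreateDirectory helper from the question, create_path is a hypothetical name, and fake_create_dir is an in-memory fake that exists only so the sketch can be exercised.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the helper API from the question: true on success. */
typedef bool (*create_dir_fn)(const char *path);

/* Try the full path; on failure create the parent first, then retry. */
static bool create_path(const char *path, create_dir_fn create_dir)
{
    size_t len = strlen(path);
    char *parent;
    bool ok;

    if (create_dir(path))
        return true;

    /* Trim the last component; give up when nothing is left to trim. */
    while (len > 0 && path[len - 1] != '\\' && path[len - 1] != '/')
        len--;
    if (len <= 1)
        return false;
    len--;                                   /* drop the separator itself */

    parent = malloc(len + 1);
    if (!parent)
        return false;
    memcpy(parent, path, len);
    parent[len] = '\0';

    ok = create_path(parent, create_dir) && create_dir(path);
    free(parent);
    return ok;
}

/* In-memory fake: a directory can be created when its parent is the
 * drive root "c:" or was created earlier. */
static char created[8][64];
static size_t created_count;

static bool fake_exists(const char *path, size_t len)
{
    size_t i;

    if (len == 2 && strncmp(path, "c:", 2) == 0)
        return true;
    for (i = 0; i < created_count; i++)
        if (strlen(created[i]) == len && strncmp(created[i], path, len) == 0)
            return true;
    return false;
}

static bool fake_create_dir(const char *path)
{
    const char *sep = strrchr(path, '\\');
    size_t len = strlen(path);

    if (!sep || len >= sizeof created[0] || created_count == 8)
        return false;
    if (fake_exists(path, len) || !fake_exists(path, (size_t)(sep - path)))
        return false;
    memcpy(created[created_count++], path, len + 1);
    return true;
}
```

The function never looks at what the leading part of the path means; it only cuts at the last separator. One limitation of this sketch is that it reports failure when the full path already exists, since the helper gives no way to tell "already there" apart from other errors.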
std::pair<bool, unsigned long> CreateDirectory(std::basic_string<_TCHAR> path)
{
_ASSERT(!path.empty());
typedef std::basic_string<_TCHAR> tstring;
tstring::size_type pos = 0;
while ((pos = path.find_first_of(_T("\\/"), pos + 1)) != tstring::npos)
{
::CreateDirectory(path.substr(0, pos + 1).c_str(), nullptr);
}
if ((pos = path.find_first_of(_T("\\/"), path.length() - 1)) == tstring::npos)
{
path.append(_T("\\"));
}
::CreateDirectory(path.c_str(), nullptr);
return std::make_pair(
::GetFileAttributes(path.c_str()) != INVALID_FILE_ATTRIBUTES,
::GetLastError()
);
}
void createFolders(const std::string &s, char delim) {
std::stringstream ss(s);
std::string item;
std::string combinedName; // grows as needed; a fixed char[50] could overflow
while (std::getline(ss, item, delim)) {
combinedName += item; // sprintf() with its destination as a source is undefined behaviour
combinedName += delim;
cout<<combinedName<<endl;
struct stat st = {0};
if (stat(combinedName.c_str(),&st)==-1)
{
#if REDHAT
mkdir(combinedName.c_str(),0777);
#else
CreateDirectory(combinedName.c_str(),NULL);
#endif
}
}
}
