Find and replace a line in a file

My goal is to search the file line by line until I find a variable declaration of the form varName = varValue, count the bytes up to the beginning of that line, and then replace that line with the same varName and a new value.
This is a very simple configuration file handler; I'm writing it from scratch to avoid any dependencies. The reason I'm doing it this way and not just dumping a string[string] associative array is that I want to preserve comments. I also wish to refrain from reading the entire file into memory, as it has the potential to get large.
This is the code I have written, but nothing happens and the file remains unchanged when using setVariable.
import std.stdio: File;
import std.string: indexOf, strip, stripRight, split, startsWith;
import std.range: enumerate;
ptrdiff_t getVarPosition(File configFile, const string varName) {
    size_t countedBytes = 0;
    foreach (line, text; configFile.byLine().enumerate(1)) {
        if (text.strip().startsWith(varName))
            return countedBytes;
        countedBytes += text.length;
    }
    return -1;
}
void setVariable(File configFile, const string varName, const string varValue) {
    ptrdiff_t varPosition = getVarPosition(configFile, varName);
    if (varPosition == -1)
        return; // For now, just return. This variable doesn't exist.
                // Will handle this later, it needs to append to the bottom of the file.
    configFile.seek(varPosition);
    configFile.write(varName ~ " = " ~ varValue);
}

There's a few parts of your code missing, which makes diagnosis hard. The most important question may be 'How do you open your config file?'. This code does what I expect it to:
unittest {
    auto f = File("foo.txt", "r+");
    setVariable(f, "var3", "foo");
    f.flush();
}
That is, it finds the line starting with "var3", and replaces part of the file with the new value. However, your getVarPosition function doesn't count newlines, so the offset is wrong. Also, consider what happens when the new varValue is a different length from the old value. If you have "var = hello world", and call setVariable(f, "var", "bye"), you'll end up with "var = byelo world". If it's longer than the existing value, it will overwrite the next variable(s).
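One way to sidestep both the offset problem and the length mismatch is to stream the file into a new file and swap in the replacement line while copying. The sketch below shows that idea in C rather than D; it is not the asker's approach but a swapped-in technique, and the function name rewrite_variable and the 1024-byte line limit are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copies `in` to `out` line by line. The first line that starts with `name`
   (after optional leading blanks) followed by optional blanks and '=' is
   replaced by "name = value". Comments and all other lines are copied as-is.
   Returns 1 if a line was replaced, 0 otherwise.
   Assumption: no line is longer than the buffer; longer lines would be
   read in pieces and a piece could be matched by mistake. */
int rewrite_variable(FILE *in, FILE *out, const char *name, const char *value)
{
    char line[1024];
    size_t name_len;
    int replaced = 0;

    assert(in && out && name && value);
    name_len = strlen(name);

    while (fgets(line, sizeof line, in)) {
        const char *p = line + strspn(line, " \t");
        if (!replaced && strncmp(p, name, name_len) == 0) {
            const char *after = p + name_len;
            after += strspn(after, " \t");
            if (*after == '=') {
                /* Write the whole new line, so old and new values may differ in length. */
                fprintf(out, "%s = %s\n", name, value);
                replaced = 1;
                continue;
            }
        }
        fputs(line, out);
    }
    return replaced;
}
```

Only one line is held in memory at a time. After a successful copy, the new file would be renamed over the old one.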

What does git_sysdir_find_in_dirlist() in libgit2 do?

I'm working on porting some logic from libgit2 into Go, but not a 1:1 port as Go works differently. I think this function scans a directory tree, but I'm not sure.
static int git_sysdir_find_in_dirlist(
    git_buf *path,
    const char *name,
    git_sysdir_t which,
    const char *label)
{
    // allocations
    size_t len;
    const char *scan, *next = NULL;
    const git_buf *syspath;
    // check the path to make sure it exists?
    GIT_ERROR_CHECK_ERROR(git_sysdir_get(&syspath, which));
    if (!syspath || !git_buf_len(syspath))
        goto done;
    // this is the part I don't understand
    for (scan = git_buf_cstr(syspath); scan; scan = next) {
        /* find unescaped separator or end of string */
        for (next = scan; *next; ++next) {
            if (*next == GIT_PATH_LIST_SEPARATOR &&
                (next <= scan || next[-1] != '\\'))
                break;
        }
        len = (size_t)(next - scan);
        next = (*next ? next + 1 : NULL);
        if (!len)
            continue;
        GIT_ERROR_CHECK_ERROR(git_buf_set(path, scan, len));
        if (name)
            GIT_ERROR_CHECK_ERROR(git_buf_joinpath(path, path->ptr, name));
        if (git_path_exists(path->ptr))
            return 0;
    }
done:
    git_buf_dispose(path);
    git_error_set(GIT_ERROR_OS, "the %s file '%s' doesn't exist", label, name);
    return GIT_ENOTFOUND;
}
It's the for loop that's confusing me. for (scan = git_buf_cstr(syspath); scan; scan = next) { ... } looks like it's iterating/scanning syspath, and then I get totally lost at for (next = scan; *next; ++next) { ... }.
What does this function specifically do?
This is not looking at a directory tree but rather at a delimited string containing a list of directories. For instance (though this clearly isn't aimed at this particular case), as the top level documentation says:
GIT_ALTERNATE_OBJECT_DIRECTORIES
    Due to the immutable nature of Git objects, old objects can be
    archived into shared, read-only directories. This variable
    specifies a ":" separated (on Windows ";" separated) list of Git
    object directories which can be used to search for Git objects. New
    objects will not be written to these directories.
    Entries that begin with " (double-quote) will be interpreted as
    C-style quoted paths, removing leading and trailing double-quotes
    and respecting backslash escapes. E.g., the value
    "path-with-\"-and-:-in-it":vanilla-path has two paths:
    path-with-"-and-:-in-it and vanilla-path.
The function is obviously scanning a character-separated path list (whatever that character is—probably colon or semicolon as above), checking for backslash prefixes, so that you can write /a:C\:/a to allow the thing to look into either /a or C:/a.
This function is tasked with locating the file name inside a "configuration level" (i.e. ~/.git/, /etc/git; see git_sysdir_t for a list of known locations). As each of those levels is stored as a static ("readonly") C string in which the directories are joined by the path-list separator (GIT_PATH_LIST_SEPARATOR in the code above), and we can't modify that at runtime, we have to jump through hoops to do what amounts to a foreach-string loop.
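The inner loop of the function can be isolated as a small standalone helper. The sketch below is not libgit2 code: the name split_entry and the explicit sep parameter (standing in for GIT_PATH_LIST_SEPARATOR) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the length of the list entry that starts at `scan`.
   *next is set to the start of the following entry, or to NULL when the
   end of the string was reached. A separator preceded by a backslash is
   treated as part of the entry; the backslash itself is not removed. */
size_t split_entry(const char *scan, char sep, const char **next)
{
    const char *p;

    assert(scan && next);
    /* find unescaped separator or end of string */
    for (p = scan; *p; ++p) {
        if (*p == sep && (p == scan || p[-1] != '\\'))
            break;
    }
    *next = *p ? p + 1 : NULL;
    return (size_t)(p - scan);
}
```

Calling it in a loop of the form for (scan = list; scan; scan = next) visits each entry in turn. For the list /a:C\:/a with ':' as separator, the entries are /a and C\:/a. Empty entries have length 0, which the original function skips with continue.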

Recursive function : abort-condition

We need to create a binary tree which contains the content of text files. The pointers selection_a and selection_b each point to another text file in the directory.
The structure of the text files is as follows:
line: Title
line: OptionA
line: OptionB
line: Text.
The first file is given as parameter while starting the program. All files should be saved at the beginning of the program. Then the text of the first file shows, and the user can input A or B to continue. Based on the selection, the text of File Option A/B is shown and the user can decide again.
The last file of a tree contains no Options: lines 2 and 3 are "-\n".
The problem is, this code only reads all the option A files of the first tree. It doesn't read in any B-Options. In the end, the program shows a memory access error.
I think the problem is that the readingRows function has no abort condition.
current->selection_a = readingRows(input_selection_a);
current->selection_b = readingRows(input_selection_b);
I know the code may be kind of chaotic, but we are beginners in programming. Hope anybody can help us to write an abort-condition.
The function should be aborted if the content of option A (line 3) is "-\n".
Here is the whole function:
struct story_file* readingRows(FILE *current_file)
{
    char *buffer = fileSize(current_file);
    char *delimiter = "\n";
    char *lines = strtok(buffer, delimiter);
    int line_counter = 0;
    struct story_file *current = malloc(sizeof(struct story_file));
    while(lines != NULL)
    {
        if(line_counter == 0)
        {
            current->title = lines;
        }
        else if(line_counter == 1)
        {
            char *filename_chapter_a = lines;
            FILE *input_selection_a = fopen(filename_chapter_a, "r");
            if(input_selection_a)
            {
                current->selection_a = readingRows(input_selection_a);
            }
            fclose(input_selection_a);
        }
        else if(line_counter == 2)
        {
            char *filename_chapter_b = lines;
            FILE *input_selection_b = fopen(filename_chapter_b, "r");
            if(input_selection_b)
            {
                current->selection_b = readingRows(input_selection_b);
            }
            fclose(input_selection_b);
        }
        else if (line_counter >= 3)
        {
            current->text = lines;
        }
        lines = strtok(NULL, delimiter);
        line_counter++;
    }
    return current;
}
There are two items that define a terminating recursive function:
One or more base cases
Recursive calls that move toward a base case
Your code has one base case: while (lines!=NULL) {} return current;, it breaks the while loop when lines is NULL and returns current. In other words, within any particular call to your function, it only terminates when it reaches the end of a file.
Your code moves toward that base case as long as your files do not refer to each other in a loop. We know this because you always read a line, take an action according to your if-else block, and then read the next line. So you always move toward the end of each file you read.
But as you note, the issue is that you don't have a case to handle "no Options", that is, when lines 2 and 3 are "-\n". So right now, even though you move through files, you always try to open a file on line 2, even when that line is only the "-" marker. So you just need to add another base case that checks whether lines is "-" (strtok has already removed the "\n" delimiter), and if it is, skips the recursive call. This will end that branch of your recursive tree.
Inside of your while loop, you will need code along the lines of:
if `line_counter` is `1` or `2` (lines 2 and 3 of the file)
    if `lines` is your terminating sequence "-"
        set that selection pointer to NULL and skip the recursive call
    else
        `fopen` and make the recursive call
In the parent function that made the recursive call, the loop should then move to the next line and continue. Be aware that strtok keeps its position in hidden static state, so the nested strtok calls made by the recursive call overwrite the parent's position; that would explain why only the option A files are read. strtok_r (POSIX) or splitting the buffer by hand avoids this. Also, fclose is called even when fopen returned NULL, which is undefined behavior and a likely cause of the memory access error.
P.S. Make sure you use free for each malloc you do.
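The extra base case can be sketched in isolation. To keep the example self-contained it reads from a hypothetical in-memory table of pages instead of real files, so struct page, struct node, build_tree and free_tree are all assumed names, not the asker's types.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one text file: name plus its four lines. */
struct page { const char *name, *title, *option_a, *option_b, *text; };
struct node { const struct page *page; struct node *a, *b; };

static const struct page *find_page(const struct page *pages, size_t n,
                                    const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(pages[i].name, name) == 0)
            return &pages[i];
    return NULL;
}

/* Builds the tree rooted at `name`. Returns NULL for the "-" marker,
   for an unknown name, or when malloc fails. */
struct node *build_tree(const struct page *pages, size_t n, const char *name)
{
    const struct page *p;
    struct node *node;

    assert(pages && name);
    /* Base case: "-" means "no option", so this branch ends here. */
    if (strcmp(name, "-") == 0)
        return NULL;
    p = find_page(pages, n, name);
    if (!p)
        return NULL;
    node = malloc(sizeof *node);
    if (!node)
        return NULL;
    node->page = p;
    node->a = build_tree(pages, n, p->option_a);
    node->b = build_tree(pages, n, p->option_b);
    return node;
}

/* One free per malloc, children first. */
void free_tree(struct node *node)
{
    if (!node)
        return;
    free_tree(node->a);
    free_tree(node->b);
    free(node);
}
```

Both selection pointers are always assigned, so a leaf ends up with two NULL children. Pages that refer to each other in a cycle would still recurse without end; this sketch does not guard against that.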

Where am I going wrong in getting this function to do what I would like?

I have written the following function in my C program. The program loads a text file (Les Miserables Vol. I) as well as another text file containing the names of 20 of the characters. The purpose of this function is to scan the entire file, line by line, and count the number of times any of the 20 names appears.
NumOfNames = 20.
Names is an array of the 20 names stored from Names[1] - Names[20].
MaxName is a global integer variable which I would like to store the total number of name appearances throughout the file (It should be in the hundreds or even thousands).
EDIT: After the function is executed, the value of MaxName is 4. I am completely lost as to where I have made a mistake, but it appears that I have made several mistakes throughout the function. One seems to be that it only executes the first iteration of the for loop, i.e. it only searches for Names[1]; however, the first name appears 196 times in the file, so it still isn't working correctly even for just the first name.
void MaxNameAppearances()
{
    char LineOfText[85];
    char *TempName;
    FILE *fpn = fopen(LesMisFilePath, "r+");
    for(i = 1; i<=NumOfNames; i++)
    {
        while(fgets(LineOfText, sizeof(LineOfText), fpn))
        {
            TempName = strstr(LineOfText, Names[i]);
            if(TempName != NULL)
            {
                MaxName++;
            }
        }
    }
    fclose(fpn);
}
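One problem with the code is that the inner while loop reads the file to the end during the first iteration of i; on every later iteration fgets returns NULL immediately, so only Names[1] is ever searched for. Try re-ordering the loops like this:
while(fgets(LineOfText, sizeof(LineOfText), fpn))
{
    for(i = 1; i<=NumOfNames; i++)
    {
        TempName = strstr(LineOfText, Names[i]);
        if(TempName != NULL)
        {
            MaxName++;
        }
    }
}
This reads a line, checks for occurrences of all names in that line, and then goes on to the next line.
If you do it your way, you will already be at the end of the file once i == 1 is done. Note also that strstr reports only the first match in a line, so a name that appears several times in one line is still counted once.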
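A minimal sketch of the re-ordered loops as a self-contained function follows. Unlike the code above, it also counts repeated matches within one line and uses a zero-based array of names; the function names, the names parameter and the 256-byte buffer are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Counts non-overlapping occurrences of `name` in `line`. */
size_t count_occurrences(const char *line, const char *name)
{
    size_t count = 0;
    size_t len = strlen(name);

    if (len == 0)
        return 0;
    for (const char *p = strstr(line, name); p; p = strstr(p + len, name))
        count++;
    return count;
}

/* Reads `fp` once, line by line, and sums the occurrences of all names.
   A name that straddles the end of the buffer on a very long line
   would be missed in this sketch. */
size_t count_names(FILE *fp, const char *const names[], size_t n)
{
    char line[256];
    size_t total = 0;

    assert(fp && names);
    while (fgets(line, sizeof line, fp))
        for (size_t i = 0; i < n; i++)
            total += count_occurrences(line, names[i]);
    return total;
}
```

Returning the total instead of updating a global makes the function easy to check on a small input before running it on the full text.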

How to get a specific word from a string and eliminate others in C

Hi, the following is the path of a file (stored as a string):
C:/db/OOSA/LIBRARIES/OOSA00/MS/filename.jpg
Now I want only the file name from it, e.g. "filename"; the rest should be filtered out or removed.
How can I do that in C?
I want to apply that file name to some other things, but without the .jpg extension and the path "C:/db/OOSA/LIBRARIES/OOSA00/MS/".
Below is the code:
static mgbool gbeadNApply (mgrec* db, mgrec* parent, mgrec* rec, void* gBead)
{
    toolrec* toolRec = (toolrec*)gBead;
    if (mgGetCode(rec) == fltXref);
    {
        char *xName;
        parent = mgGetParent(rec);
        mgGetAttList(rec,fltXrefFilename,&xName,MG_NULL);
        mgSetName(parent,xName);
    }
    return MG_TRUE;
}
Here xName first receives the file name including its path, and mgSetName is then called with xName, so the name it assigns is something like C:/db/OOSA/LIBRARIES/OOSA00/MS/filename.jpg. What I want is for only the file-name part to be passed to mgSetName, so I need to filter xName accordingly.
This will be very vague, but you can probably figure out how to write a function to do this:
Write a function to find the "."
Write a function that returns, in a new string, everything before the "."
Write a function that finds the last "/" (or backslash, "\")
Write a function that removes everything before and including the "/"
You will use for loops:
/* Returns the index of the first '.' in string, or -1 if there is none. */
int find_period(const char *string)
{
    if(!string) return -1;
    int n;
    for(n = 0; n < strlen(string); n++){
        if(string[n] == '.') return n;
    }
    return -1;
}
Then you probably get the general idea.
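The four steps can also be combined with strrchr from the standard library, which finds the last occurrence of a character. This is a minimal sketch; the function name file_stem and the caller-supplied output buffer are assumptions, not part of the code above.

```c
#include <assert.h>
#include <string.h>

/* Copies the file name without directory and extension into `out`.
   Returns 0 on success, -1 if `out` is too small.
   Both '/' and '\\' are accepted as directory separators. */
int file_stem(const char *path, char *out, size_t out_size)
{
    const char *slash, *bslash, *start, *dot;
    size_t len;

    assert(path && out && out_size > 0);
    slash = strrchr(path, '/');
    bslash = strrchr(path, '\\');
    if (bslash && (!slash || bslash > slash))
        slash = bslash;                      /* use whichever comes last */
    start = slash ? slash + 1 : path;        /* first char after the path */
    dot = strrchr(start, '.');               /* last '.' in the file name */
    len = (dot && dot != start) ? (size_t)(dot - start) : strlen(start);
    if (len >= out_size)
        return -1;
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

With the path from the question, out receives "filename", which could then be passed on in place of xName. A name that starts with a dot and has no other dot is kept whole.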

Bus Error on void function return

I'm learning to use libcurl in C. To start, I'm using a randomized list of accession names to search for protein sequence files that may be found hosted here. These follow a set format: the first line is of variable length (and contains no information I'm trying to query), followed by a series of capitalized letters with a newline every sixty (60) characters (this is what I want to pull down, but reformatted to eighty (80) characters per line).
I have the call itself in a single function:
//finds and saves the fastas for each protein (assuming one exists)
void pullFasta (proteinEntry *entry, char matchType, FILE *outFile) {
    //Local variables
    URL_FILE *handle;
    char buffer[2] = "", url[32] = "http://www.uniprot.org/uniprot/", sequence[2] = "";
    //Build full URL
    /*printf ("u:%s\nt:%s\n", url, entry->title); /*This line was used for debugging.*/
    strcat (url, entry->title);
    strcat (url, ".fasta");
    //Open URL
    /*printf ("u:%s\n", url); /*This line was used for debugging.*/
    handle = url_fopen (url, "r");
    //If there is data there
    if (handle != NULL) {
        //Skip the first line as it's got useless info
        do {
            url_fread(buffer, 1, 1, handle);
        } while (buffer[0] != '\n');
        //Grab the fasta data, skipping newline characters
        while (!url_feof (handle)) {
            url_fread(buffer, 1, 1, handle);
            if (buffer[0] != '\n') {
                strcat (sequence, buffer);
            }
        }
        //Print it
        printFastaEntry (entry->title, sequence, matchType, outFile);
    }
    url_fclose (handle);
    return;
}
With proteinEntry being defined as:
//Entry for fasta formatable data
typedef struct proteinEntry {
    char title[7];
    struct proteinEntry *next;
} proteinEntry;
The url_fopen, url_fclose, url_feof, url_fread, and URL_FILE code is found here; they mimic the file functions for which they are named.
As you can see I've been doing some debugging with the URL generator (uniprot URLs follow the same format for different proteins), I got it working properly and can pull down the data from the site and save it to file in the proper format that I want. I set the read buffer to 1 because I wanted to get a program that was very simplistic but functional (if inelegant) before I start playing with things, so I would have a base to return to as I learned.
I've tested the url_<function> calls and they are giving no errors. So I added incremental printf calls after each line to identify exactly where the bus error is occurring and it is happening at return;.
My understanding of bus errors is that it's a memory access issue wherein I'm trying to get at memory that my program doesn't have control over. My confusion comes from the fact that this is happening at the return of a void function. There's nothing being read, written, or passed to trigger the memory error (as far as I understand it, at least).
Can anyone point me in the right direction to fix my mistake please?
EDIT: As @BLUEPIXY pointed out I had a potential url_fclose (NULL). As @deltheil pointed out I had sequence as a static array. This also made me notice I'm repeating my bad memory allocation for url, so I updated it and it now works. Thanks for your help!
If we look at e.g. http://www.uniprot.org/uniprot/Q6GZX1.fasta and skip the first line (as you do) we have:
MNAKYDTDQGVGRMLFLGTIGLAVVVGGLMAYGYYYDGKTPSSGTSFHTASPSFSSRYRY
Which is a 60-character string.
When you try to read this sequence with:
//Grab the fasta data, skipping newline characters
while (!url_feof (handle)) {
    url_fread(buffer, 1, 1, handle);
    if (buffer[0] != '\n') {
        strcat (sequence, buffer);
    }
}
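The problem is that sequence is not expandable and not large enough: it is a fixed-length array of size 2, so it can hold one character plus the terminating '\0'. From the second strcat onward you write past the end of the array and overwrite other data on the stack, which is most likely why the crash only shows up when the function returns. The same applies to url, whose 32 bytes are already completely filled by the initial string.
So make sure to choose a large enough size to hold any sequence, or implement the ability to expand it on-the-fly.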
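Expanding on the fly can be done with a small growable buffer that doubles its capacity through realloc whenever it runs out of room. The sketch below is one possible shape for it; struct strbuf and strbuf_push are hypothetical names, not part of libcurl or of the code above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Growable, always NUL-terminated character buffer.
   Start with all fields zero: struct strbuf b = { NULL, 0, 0 }; */
struct strbuf {
    char  *data;
    size_t len;   /* number of characters stored, without the '\0' */
    size_t cap;   /* number of bytes allocated */
};

/* Appends one character. Returns 0 on success, -1 if realloc fails
   (in which case the buffer is left unchanged). */
int strbuf_push(struct strbuf *b, char c)
{
    assert(b);
    if (b->len + 1 >= b->cap) {              /* need room for c and '\0' */
        size_t new_cap = b->cap ? b->cap * 2 : 64;
        char *p = realloc(b->data, new_cap);
        if (!p)
            return -1;
        b->data = p;
        b->cap = new_cap;
    }
    b->data[b->len++] = c;
    b->data[b->len] = '\0';
    return 0;
}
```

In the read loop, strbuf_push(&seq, buffer[0]) would take the place of strcat (sequence, buffer), and free(seq.data) releases the memory once the entry has been printed.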
