The basic gist is, I'm reading words from a text file, storing them as a string, running a function, and then looping over this multiple times, rewriting that string with every new line read. After this loop is done, I need to deal with a different string. The problem is, the second string's bytes, even though I've memset them to 0 at declaration, are getting overwritten by the extra letters in words longer than the space I've allocated to the first string:
char* currDictWord = malloc(9 * sizeof(char));
char* currBrutWord = malloc(9 * sizeof(char));
memset(currBrutWord, 0, 9);
memset(currDictWord, 0, 9);
...
while (stuff) {
fscanf(dictionary, "%s", currDictWord);
}
...
printf("word: %s\n", currBrutWord);
currBrutWord will not be empty anymore. The two ways I've dealt with this are either making sure currDictWord is longer than the longest word in the dictionary file (kind of a hacky solution), or doing a new memset on currBrutWord after the loop. Is there no way to tell C to stop writing stuff into memory I've specifically allocated for a different variable?
Yes: stop using fscanf (and preferably the whole scanf family) and use fgets instead. It lets you pass the maximum number of bytes to read into the variable.
EDIT: (in response to the comment)
fgets stops reading once count - 1 bytes have been read or a newline has been found; the newline, if found, is stored in the string. So after calling fgets, check whether there is a newline at the end of the string (and remove it if necessary). If there is no newline in the string, call fgetc on the file until you've found one, like this:
if (fgets(currDictWord, 9, dictionary) != NULL) {
    size_t len = strlen(currDictWord);
    if (len == 0 || currDictWord[len - 1] != '\n') {
        int c;
        while ((c = fgetc(dictionary)) != '\n' && c != EOF)
            ; /* no body necessary */
        /* the stream pointer is now at the beginning of the next line */
    }
}
The problems are improper string assignment and not validating data read from a file.
currBrutWord is overrun because too many chars were written into currDictWord. The same would have happened had you done:
strcpy(currDictWord, "123456789"); // Bad, as this copies 9+1 chars into currDictWord
When using fscanf(), one could limit the data read via:
fscanf(dictionary, "%8s", currDictWord);
This prevents fscanf() from putting too much data into currDictWord. That part is good, but you still have unexpected data coming from the file. You need to challenge any data from the outside world.
char bigbuf[100]; /* assumed to be longer than any valid line */
if (NULL == fgets(bigbuf, sizeof bigbuf, dictionary)) {
    /* handle EOF or I/O error */
}
// now parse and validate bigbuf using various tools: strtok(), sscanf(), etc.
int n;
if ((sscanf(bigbuf, "%8s%n", currDictWord, &n) < 1) || (bigbuf[n] != '\n')) {
    /* handle error */
}
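As a minimal sketch, the read-then-validate idea above could be wrapped in a function. The name read_dict_word, the return codes, and the 100-byte line buffer are assumptions made for illustration; the 8-character limit matches the 9-byte buffers in the question.

```c
#include <stdio.h>

#define WORD_MAX 8   /* longest word that fits in a 9-byte buffer */

/* Hypothetical helper: reads one line and extracts a single word.
 * Returns 1 on success, 0 on EOF or I/O error, and -1 if the line is
 * invalid (empty, word too long, or extra text after the word). */
int read_dict_word(FILE *dictionary, char word[WORD_MAX + 1])
{
    char bigbuf[100];   /* assumed to be longer than any valid line */
    int n = 0;

    if (fgets(bigbuf, sizeof bigbuf, dictionary) == NULL)
        return 0;

    /* %8s stores at most 8 characters plus the terminator;
     * %n records how many characters of bigbuf were consumed. */
    if (sscanf(bigbuf, "%8s%n", word, &n) < 1)
        return -1;

    /* Anything other than end of line after the word is bad input. */
    if (bigbuf[n] != '\n' && bigbuf[n] != '\0')
        return -1;

    return 1;
}
```

A caller would loop while the function returns nonzero and skip or report the lines that yield -1. Because the width is part of the format string, the word can never spill past the 9 bytes allocated for it.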
So I have a wall of text in a file and I need to recognize the words that are between $ signs, replace them with numbers, and then print the modified text to another file along with what the numbers correspond to.
Also lines are not defined and columns should be max 80 characters.
Ex:
I $like$ cats.
I [1] cats.
[1] --> like
That's what I did:
#include <stdio.h>
#include <stdlib.h>
#define N 80
#define MAX 9999
int main()
{
FILE *fp;
int i=0,count=0;
char matr[MAX][N];
if((fp = fopen("text.txt","r")) == NULL){
printf("Error.");
exit(EXIT_FAILURE);
}
while((fscanf(fp,"%s",matr[i])) != EOF){
printf("%s ",matr[i]);
if(matr[i] == '\0')
printf("\n");
//I was thinking maybe to find two $ but Idk how to replace the entire word
/*
if(matr[i] == '$')
count++;
if(count == 2){
...code...
}
*/
i++;
}
fclose(fp);
return 0;
}
My problem is that fscanf doesn't recognize '\0' so it doesn't go in the next line when I print the array..also I don't know how to replace $word$ with a number.
Not only will fscanf("%s") read one whitespace-delimited string at a time, it will also eat all whitespace between those strings, including line terminators. If you want to reproduce the input whitespace in the output, as your example suggests you do, then you need a different approach.
Also lines are not defined and columns should be max 80 characters.
I take that to mean the number of lines is not known in advance, and that it is acceptable to assume that no line will contain more than 80 characters (not counting any line terminator).
When you say
My problem is that fscanf doesn't recognize '\0' so it doesn't go in the next line when I print the array
I suppose you're talking about this code:
char matr[MAX][N];
/* ... */
if(matr[i] == '\0')
Given that declaration for matr, the given condition will always evaluate to false, regardless of any other consideration. fscanf() does not factor in at all. The type of matr[i] is char[N], an array of N elements of type char. That evaluates to a pointer to the first element of the array, which pointer will never be NULL. It looks like you're trying to determine when to write a newline, but nothing remotely resembling this approach can do that.
I suggest you start by taking @Barmar's advice to read line-by-line via fgets(). That might look like so:
char line[N+2]; /* N + 2 leaves space for both newline and string terminator */
if (fgets(line, sizeof(line), fp) != NULL) {
/* one line read; handle it ... */
} else {
/* handle end-of-file or I/O error */
}
Then for each line you read, parse out the "$word$" tokens by whatever means you like, and output the needed results (everything but the $-delimited tokens verbatim; the bracket substitution number for each token). Of course, you'll need to memorialize the substitution tokens for later output. Remember to make copies of those, as the buffer will be overwritten on each read (if done as I suggest above).
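One possible shape for that per-line step is sketched below. The names substitute_line, print_legend, tokens and MAX_TOKENS are hypothetical, and the sketch assumes a $word$ token never spans two lines.

```c
#include <stdio.h>
#include <string.h>

#define N 80            /* maximum line length, as in the question */
#define MAX_TOKENS 100  /* assumed upper bound on substitutions */

static char tokens[MAX_TOKENS][N + 1];  /* copies of the $-delimited words */
static int token_count = 0;

/* Writes `line` to `out`, replacing each $word$ with [k] and saving a copy
 * of the word. An unmatched '$' is copied through unchanged. */
void substitute_line(const char *line, FILE *out)
{
    const char *p = line;

    while (*p != '\0') {
        const char *end = NULL;
        if (*p == '$')
            end = strchr(p + 1, '$');   /* look for the closing '$' */

        size_t len = end ? (size_t)(end - p - 1) : 0;
        if (end != NULL && len <= N && token_count < MAX_TOKENS) {
            memcpy(tokens[token_count], p + 1, len);
            tokens[token_count][len] = '\0';
            token_count++;
            fprintf(out, "[%d]", token_count);
            p = end + 1;                /* skip past the closing '$' */
        } else {
            fputc(*p, out);             /* ordinary character: copy verbatim */
            p++;
        }
    }
}

/* Prints the legend, one "[k] --> word" entry per line. */
void print_legend(FILE *out)
{
    for (int i = 0; i < token_count; i++)
        fprintf(out, "[%d] --> %s\n", i + 1, tokens[i]);
}
```

Call substitute_line() on every line that fgets() returns, then print_legend() once at the end. Copying each token into its own array slot is what keeps it valid after the line buffer is reused.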
fscanf() does recognize '\0', under select circumstances, but that is not the issue here.
Code needs to detect '\n'. fscanf(fp,"%s"... will not do that. The first thing "%s" directs is to consume (and not save) any leading white-space including '\n'. Read a line of text with fgets().
Simply read one line at a time. Then march down the buffer looking for words.
The following uses "%n" to track how far into the buffer scanning has progressed.
// more room for \n \0
#define BUF_SIZE (N + 1 + 1)
char buffer[BUF_SIZE];
while (fgets(buffer, sizeof buffer, stdin) != NULL) {
char *p = buffer;
char word[sizeof buffer];
int n;
while (sscanf(p, "%s%n", word, &n) == 1) {
// do something with word
if (strcmp(word, "$zero$") == 0) fputs("0", stdout);
else if (strcmp(word, "$one$") == 0) fputs("1", stdout);
else fputs(word, stdout);
fputc(' ', stdout);
p += n;
}
fputc('\n', stdout);
}
Use fread() to read the file contents into a char[] buffer. Then iterate through this buffer, and whenever you find a $, use strncmp to determine which value to replace it with (keep in mind that there is a second $ at the end of the word). To replace $word$ with a number, you need to either shrink or extend the buffer at the position of the word, depending on the length of the number's ASCII representation (memmove is normally the right tool for this). Then you can write the number into the gap left by resizing the buffer, overwriting $word$ as well.
Then write the buffer to the file, overwriting all its previous contents.
I'm looking to copy the FIRST line from a LONG string P into a buffer
I have no idea how to make it.
while (*pros_id != '/n'){
*pros_id_line=*pros_id;
pros_id++;
pros_id_line++;
}
And tried
fgets(pros_id_line, sizeof(pros_id_line), pros_id);
Both are not working. Can I get some help please?
Note, as Adriano Repetti pointed out in a comment and an answer, that the newline character is '\n' and not '/n'.
Your initial code can be fixed up to work, provided that the destination buffer is big enough:
while (*pros_id != '\n' && *pros_id != '\0')
*pros_id_line++ = *pros_id++;
*pros_id_line = '\0';
This code does not include the newline in the copied buffer; it is easy enough to add it if you need it.
One advantage of this code is that it makes a single pass through the data up to the newline (or end of string). An alternative makes two passes through the data, one to find the newline and another to copy to the newline:
char *end = strchr(pros_id, '\n');
if (end != NULL)
{
memmove(pros_id_line, pros_id, end - pros_id);
pros_id_line[end - pros_id] = '\0';
}
This ensures that the string is null-terminated; again, it omits the newline, and assumes there is enough space in the pros_id_line buffer for the data. You have to decide what is the correct behaviour when there is no newline in the buffer. It might be sufficient to copy the buffer without the newline into the target area, or you might prefer to report a problem.
You can use strncpy() instead of memmove() but it has a more complex loop condition than memmove() — it has to check for a null byte as well as the count, whereas memmove() only has to check the count. You can use memcpy() instead of memmove() if you're sure there's no overlap between source and target, but memmove() always works and memcpy() sometimes doesn't (though only when the source and target areas overlap), and I prefer reliability over possible misbehaviour.
Note that setting a buffer to zero before copying a string to it is a waste of energy. The parts that you're about to overwrite with data didn't need to be zeroed. The parts that you aren't going to overwrite with data didn't need to be zeroed either. You should know exactly which byte needs to be zeroed, so why waste the time on zeroing anything except the one byte that needs to be zeroed?
(One exception to this is if you are dealing with sensitive data and are concerned that some function that your code will call may deliberately read beyond the end of the string and come across parts of a password or other sensitive data. Then it may be appropriate to wipe the memory before writing new data to it. On the whole, though, most people aren't writing such code.)
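Both versions above assume the destination is large enough. A minimal sketch of a variant that checks the size instead is shown here; the name copy_first_line and its return convention are assumptions for illustration, not part of the original code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the first line of src (without the newline)
 * into dst, which holds dstsize bytes. Returns 0 on success, or -1 if the
 * line does not fit, in which case dst is left as an empty string. */
int copy_first_line(char *dst, size_t dstsize, const char *src)
{
    size_t len = strcspn(src, "\n");   /* length up to '\n' or end of string */

    if (dstsize == 0)
        return -1;
    if (len >= dstsize) {
        dst[0] = '\0';
        return -1;
    }
    memmove(dst, src, len);   /* memmove also tolerates overlapping buffers */
    dst[len] = '\0';          /* the only byte that needs to be zeroed */
    return 0;
}
```

Here a source with no newline is treated as a single line and copied whole; reporting that case as an error instead would be an equally reasonable choice.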
Newline is \n, not /n. Anyway, I'd use strchr for this:
char* endOfFirstLine = strchr(inputString, '\n');
if (endOfFirstLine != NULL)
{
strncpy(yourBuffer, inputString,
endOfFirstLine - inputString);
}
else // Input is one single line
{
strcpy(yourBuffer, inputString);
}
With inputString as your char* multiline string and yourBuffer (assuming it's big enough to contain all data from inputString and it has been zeroed, since strncpy does not add a terminator here) as your required output (first line of inputString).
If you're going to be doing a lot of reading from long text buffers, you could try using a memory stream, if your system supports them: https://www.gnu.org/software/libc/manual/html_node/String-Streams.html
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
static char buffer[] = "foo\nbar";
int
main()
{
char arr[100];
FILE *stream;
stream = fmemopen(buffer, strlen(buffer), "r");
fgets(arr, sizeof arr, stream);
printf("First line: %s\n", arr);
fgets(arr, sizeof arr, stream);
printf("Second line: %s\n", arr);
fclose (stream);
return 0;
}
POSIX 2008 (e.g. most Linux systems) has getline(3) which heap-allocates a buffer for a line.
So you could code
FILE *fil = fopen("something.txt", "r");
if (!fil) { perror("fopen"); exit(EXIT_FAILURE); }
char *linebuf = NULL;
size_t linesiz = 0;
if (getline(&linebuf, &linesiz, fil) >= 0) { /* getline returns -1 on failure or EOF */
    do_something_with(linebuf);
}
else { perror("getline"); exit(EXIT_FAILURE); }
free(linebuf); /* the buffer was heap-allocated by getline */
If you want to read an editable line from stdin in a terminal consider GNU readline.
If you are restricted to pure C99 code you have to do the heap allocation yourself (malloc or calloc or perhaps -with care- realloc)
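For the pure C99 case, a rough sketch of doing that allocation by hand could look like the following. The function name read_line_c99 and the initial capacity of 64 are arbitrary choices for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd line without the newline, or
 * NULL at EOF with nothing read or on allocation failure.
 * The caller must free() the result. */
char *read_line_c99(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;

    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {               /* keep room for the terminator */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {              /* realloc failed: buf still valid */
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }

    if (c == EOF && len == 0) {             /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Assigning the result of realloc() to a temporary is the "with care" part: overwriting buf directly would leak the old block if the call failed.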
If you just want to copy the first line of some existing buffer char*bigbuf; which is non-NULL, valid, and zero-byte terminated:
char *line = NULL;
char *eol = strchr(bigbuf, '\n');
if (!eol) { // bigbuf is a single line so duplicate it
    line = strdup(bigbuf);
    if (!line) { perror("strdup"); exit(EXIT_FAILURE); }
} else {
    size_t linesize = eol - bigbuf;
    line = malloc(linesize + 1);
    if (!line) { perror("malloc"); exit(EXIT_FAILURE); }
    memcpy(line, bigbuf, linesize);
    line[linesize] = '\0';
}
I'm trying to write a function that removes the rest of a line in C. I'm passing in a char array and a file pointer (which the char array was read from). The array is only supposed to have 80 chars in it, and if there isn't a newline in the array, read (and discard) characters in the file until you reach it (newline). Here's what I have so far, but it doesn't seem to be working, and I'm not sure what I'm doing wrong. Any help would be greatly appreciated! Here's the given information about what the function should do:
discardRest - if the fgets didn't read a newline then an entire line hasn't been read. This function takes as input the most recently read line and the pointer to the file being read. discardRest looks for the newline character in the input line. If the newline character is not in the line, the function reads (and discards) characters from the file until the newline is read. This will cause the file pointer to be positioned at the beginning of the next line in the input file.
And here's the code:
void discardRest(char line[], FILE* file)
{
bool newlineFound = FALSE;
int i;
for(i = 0; i < sizeof(line); i++)
{
if(line[i] == '\n') newlineFound = TRUE;
}
if(!newlineFound)
{
int c = getc(file);
while(c != '\n')
{
c = getc(file);
}
}
}
Your approach is more complicated than it needs to be. Also, sizeof always gives the size of its operand, and here the operand line is a pointer, not the array it points to as you assume.
fgets has the following contract:
If it returns NULL: some kind of error occurred (or end-of-file was reached with nothing read); do not use the buffer, as its contents might be indeterminate.
Otherwise: the buffer contains a 0-terminated string, with the last non-0 character being the retained '\n' if the buffer and the file were both large enough.
Thus, this should work:
Use strlen() to get the length of the string in the buffer.
Determine if a whole line was read (length && line[length-1] == '\n').
If it was, remove the newline character and return.
If it was not, discard the rest of the line like you tried.
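The steps above can be sketched as follows. This is one possible implementation, not the only one; it also stops at EOF so that a last line without a newline cannot cause an endless loop.

```c
#include <stdio.h>
#include <string.h>

/* A sketch of discardRest. `line` must hold the string most recently
 * returned by fgets() on `file`. */
void discardRest(char line[], FILE *file)
{
    size_t length = strlen(line);

    if (length > 0 && line[length - 1] == '\n') {
        line[length - 1] = '\0';    /* whole line was read: drop the newline */
        return;
    }

    /* Partial line: skip ahead to the newline, stopping at EOF as well. */
    int c;
    while ((c = getc(file)) != '\n' && c != EOF)
        ;
}
```

Note that c is an int, not a char, so that EOF can be told apart from every valid character.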
While I could use strings, I would like to understand why this small example I'm working on behaves in this way, and how can I fix it ?
int ReadInput() {
char buffer [5];
printf("Number: ");
fgets(buffer,5,stdin);
return atoi(buffer);
}
void RunClient() {
int number;
int i = 5;
while (i != 0) {
number = ReadInput();
printf("Number is: %d\n",number);
i--;
}
}
This should, in theory or at least in my head, let me read 5 numbers from input (albeit overwriting them).
However this is not the case, it reads 0, no matter what.
I understand printf puts a \0 null terminator ... but I still think I should be able to either read the first number, not just have it by default 0. And I don't understand why the rest of the numbers are OK (not all 0).
CLARIFICATION: I can only read 4/5 numbers, first is always 0.
EDIT:
I've tested and it seems that this was causing the problem:
main.cpp
scanf("%s",&cmd);
if (strcmp(cmd, "client") == 0 || strcmp(cmd, "Client") == 0)
RunClient();
somehow.
EDIT:
Here is the code if someone wishes to compile. I still don't know how to fix
http://pastebin.com/8t8j63vj
FINAL EDIT:
Could not get rid of the error. Decided to simply add the following. In ReadInput():
int ReadInput(BOOL check) {
...
if (check)
printf ("Number: ");
...
In RunClient():
void RunClient() {
...
ReadInput(FALSE); // a pseudo - buffer flush. Not really but I ignore
while (...) { // line with garbage data
number = ReadInput(TRUE);
...
}
And call it a day.
fgets reads the input as well as the newline character. So when you input a number, it's like: 123\n.
atoi doesn't report errors when the conversion fails.
Remove the newline character from the buffer:
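char buffer[5];
/* ... after fgets(buffer, sizeof buffer, stdin) has succeeded ... */
size_t length = strlen(buffer);
if (length > 0 && buffer[length - 1] == '\n')
    buffer[length - 1] = '\0';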
Then use strtol to convert the string into a number; it provides better error detection when the conversion fails.
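A minimal sketch of that conversion is given below. The helper name parse_int and its return convention are assumptions made for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts a line as returned by fgets() to an int.
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_int(char *buffer, int *out)
{
    buffer[strcspn(buffer, "\n")] = '\0';   /* strip the newline, if any */

    char *end;
    errno = 0;
    long value = strtol(buffer, &end, 10);

    if (end == buffer || *end != '\0')      /* no digits, or trailing junk */
        return 0;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                           /* out of range for int */

    *out = (int)value;
    return 1;
}
```

Unlike atoi, which silently returns 0 for an empty line, this reports a line that contains only a newline as a failure, so the caller can prompt again.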
char * fgets ( char * str, int num, FILE * stream );
Get string from stream.
Reads characters from stream and stores them as a C string into str until (num-1) characters have been read or either a newline or the end-of-file is reached, whichever happens first.
A newline character makes fgets stop reading, but it is considered a valid character by the function and included in the string copied to str. (This means that you carry \n)
A terminating null character is automatically appended after the characters copied to str.
Notice that fgets is quite different from gets: not only fgets accepts a stream argument, but also allows to specify the maximum size of str and includes in the string any ending newline character.
PS: Try using a larger buffer.
I'm using C and I want to read from a binary file.
I know that it contains strings in the following way: the length of a string, the string itself, the length of the next string, that string itself, and so on...
I want to count the number of times the string Str appears in the binary file.
So I want to do something like this:
int N;
while (!feof(file)){
if (fread(&N, sizeof(int), 1, file)==1)
...
Now I need to get the string itself. I know it's length. Should I do a 'for'
loop and get with fgetc char by char? I know I'm not allowed to use fscanf since
it's not a text file, but can I use fgetc? And would I get what I'm expecting for
my string? (To use dynamic allocation for char* for it with the size of the length
and use strcpy to add it to the current string?)
You could allocate some memory with malloc then fread into that buffer:
char *str;
/* ... */
if (fread(&N, sizeof(int), 1, file)==1)
{
/* check that N > 0 */
str = malloc(N+1);
if (fread(str, sizeof(char), N, file) == N)
{
str[N] = '\0'; /* terminate str */
printf("Read %d chars: %s\n", N, str);
}
free(str);
}
You should probably loop on:
while (fread(&N, sizeof(int), 1, file) == 1)
{
// Check N for sanity
char *buffer = malloc(N+1);
// Check malloc succeeded
if (fread(buffer, N, 1, file) != 1)
...process error...
buffer[N] = '\0'; // Null terminate for sanity's sake
...store buffer (the pointer) for later processing so you aren't leaking...
...or free it if you won't need it later...
}
You could use getc() or fgetc() in a loop; that would work. However, the direct fread() is much simpler (and is coded as if it uses getc() in a loop).
You might want to do some sanity checking on N before blindly using it with malloc(). In particular, negative values are likely to lead to much unhappiness.
The file format as written is tied to one class of machine — either big-endian or little-endian, and with the fixed size of int (probably 32-bits). Writing more portable data is slightly fiddlier, but eminently doable — but probably not relevant to you just yet.
Using feof() is seldom the correct way to test for whether to continue with a loop. Indeed, there is not often a need to use feof() in code. When it is used, it is because an I/O operation 'failed' and you need to disambiguate between 'it was not an error — just EOF' and 'there was some sort of error on the device'.
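Pulling these points together, a sketch of the counting loop might look like this. The function name count_string and the MAX_LEN limit are assumptions, as is the file layout: each record is a native int length followed by that many bytes with no terminator.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LEN 4096   /* assumed sanity limit on a stored string's length */

/* Counts how often `str` occurs in a file of (int length, bytes) records.
 * Returns the count, or -1 on a malformed record, read error, or
 * allocation failure. */
long count_string(FILE *file, const char *str)
{
    size_t want = strlen(str);
    long count = 0;
    int n;

    /* Loop on the read itself rather than on feof(). */
    while (fread(&n, sizeof n, 1, file) == 1) {
        if (n < 0 || n > MAX_LEN)
            return -1;                      /* length fails the sanity check */

        char *buffer = malloc((size_t)n + 1);
        if (buffer == NULL)
            return -1;

        if (n > 0 && fread(buffer, 1, (size_t)n, file) != (size_t)n) {
            free(buffer);
            return -1;                      /* truncated record */
        }
        if ((size_t)n == want && memcmp(buffer, str, want) == 0)
            count++;
        free(buffer);
    }
    return ferror(file) ? -1 : count;
}
```

The ferror() check at the end is where the EOF-versus-error distinction is made: the loop stops either way, and only then does the code ask why.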