I have a char* line on which I used while (fgets(line, line_size, fNames) != NULL). The problem is that I also get a newline character, which I want to strip.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
int main() {
int i;
char fileName[30];
FILE *fNames, *fCurrent;
char *line = NULL, command[100];
char letters3[3];
size_t len = 0;
//size_t read;
const size_t line_size = 300;
line = malloc(line_size);
if (access("fileNames.lst", F_OK) == -1)
system("crunch 3 3 abcd -o fileNames.lst");
else
printf("fileNames.lst already exists.\n");
fNames = fopen("./fileNames.lst","r");
while (fgets(line, line_size, fNames) != NULL) {
printf("Making File: %s.lst\n", line);
strcpy(command, "crunch 8 8 -t ");
strcpy(command, line);
strcpy(command, strcat(command," -o"));
puts(command);
strcpy(line, strcat(line, ".lst"));
fCurrent = fopen(line, "w");
//system(command);
fclose(fCurrent);
//system("read -r -p \"Press space to continue...\" key");
}
return 0;
}
There are problems in your code:
You do not check the return value of fopen(), nor that of malloc().
strcpy(command, strcat(command," -o")); and strcpy(line, strcat(line, ".lst")); invoke undefined behavior as you call strcpy on overlapping strings.
strcpy(command, line); overwrites the string you just copied to command with strcpy(command, "crunch 8 8 -t ");.
You do not check the lengths before copying or concatenating strings into line and command. You should use snprintf() for both a safer and simpler method.
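As a rough sketch of the snprintf() suggestion, the two strings could be built as shown below. The helper names build_command and build_filename are made up for illustration, and the command layout ("crunch 8 8 -t <pattern> -o") is inferred from the question's code rather than confirmed by it.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes "crunch 8 8 -t <pattern> -o" into dst.
   Returns 0 on success, -1 if the result did not fit or formatting failed. */
int build_command(char *dst, size_t size, const char *pattern)
{
    int n = snprintf(dst, size, "crunch 8 8 -t %s -o", pattern);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: writes "<name>.lst" into dst, same return convention. */
int build_filename(char *dst, size_t size, const char *name)
{
    int n = snprintf(dst, size, "%s.lst", name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

snprintf() never writes more than size bytes and returns the length the full string would have had, so comparing the return value with size detects truncation without any separate length check. Neither helper copies between overlapping strings, which avoids the undefined behavior noted above.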
To get rid of the trailing linefeed in line left by fgets(), you can either write:
line[strcspn(line, "\n")] = '\0';
or you can write this:
char *p = strchr(line, '\n');
if (p != NULL)
*p = '\0';
or even this one that removes the linefeed at the end of the string:
size_t len = strlen(line);
if (len > 0 && line[len - 1] == '\n')
line[--len] = '\0';
With both of the latter methods, you can track whether there was indeed a linefeed or not. The absence of a linefeed at the end of the line can mean one of several possibilities:
The line was truncated because it does not fit in the buffer provided. You should be careful to handle this condition because the string read does not correspond to the actual file contents. It could happen in your case if a filename is longer than 298 bytes, which is possible on many modern file systems.
The end of file was reached but no linefeed is present in the file at the end of the last line. This is probably not an error, but could indicate that the input file was truncated somehow.
The input file contains a '\0' byte, which causes early termination of the line read by fgets(). This would not be allowed as part of a filename and is quite unlikely to occur in text files, unless the file was encoded as ucs2 or UTF-16.
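The cases above could be told apart roughly as follows. This is only a sketch: chomp and the line_status enum are hypothetical names, and one edge case is knowingly misreported (a final line without a linefeed that exactly fills the buffer is classified as truncated, because fgets() stops before it notices the end of file).

```c
#include <stdio.h>
#include <string.h>

enum line_status { LINE_COMPLETE, LINE_TRUNCATED, LINE_AT_EOF };

/* Strips a trailing '\n' from line (as filled in by fgets() from fp) and
   reports why the linefeed was missing, if it was.
   Hypothetical helper, not part of the original answer. */
enum line_status chomp(char *line, FILE *fp)
{
    size_t len = strlen(line);
    if (len > 0 && line[len - 1] == '\n') {
        line[len - 1] = '\0';
        return LINE_COMPLETE;
    }
    /* No linefeed: either fgets() reached the end of the file, or the
       buffer was too small (an embedded '\0' byte also ends up here). */
    return feof(fp) ? LINE_AT_EOF : LINE_TRUNCATED;
}
```

A caller would invoke chomp(line, fNames) right after each successful fgets() and, for example, skip or report the entry when the result is LINE_TRUNCATED.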
I think you are searching for this one. It is very easy to use and it does the job.
line[strcspn(line, "\n")] = '\0';
Here is my task. Below it is most of the code, already written, followed by my specific question.
Write a program that reads strings and writes them to a file. The string must be dynamically allocated and the string can be of arbitrary length. When the string has been read it is written to the file. The length of the string must be written first then a colon (‘:’) and then the string. The program stops when user enters a single dot (‘.’) on the line.
For example:
User enters: This is a test
Program writes to file: 14:This is a test
Hint: fgets() writes a line feed at the end of the string if it fits in the string. Start with a small length, for example 16 characters, if you don’t see a line feed at the end then realloc the string to add more space and keep on adding new data to the string until you see a line feed at the end. Then you know that you have read the whole line. Then remove any ‘\r’ or ‘\n’ from the string and write the string length and the string to the file. Free the string before asking for a new string.
MY CODE:
#pragma warning(disable: 4996)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_NAME_SZ 256
int main()
{
char key[] = ".\n";
char* text;
text = (char*)malloc(MAX_NAME_SZ);
if (text == NULL)
{
perror("problem with allocating memory with malloc for *text");
return 1;
}
FILE* fp;
fp = fopen("EX13.txt", "w");
if (fp == NULL)
{
perror("EX13.txt not opened.\n");
return 1;
}
printf("Enter text or '.' to exit: ");
while (fgets(text, MAX_NAME_SZ, stdin) && strcmp(key, text))
{
fprintf(fp, "%ld: %s", strlen(text) - 1, text);
printf("Enter text or '.' to exit: ");
}
free((void*)text);
fclose(fp);
puts("Exit program");
return 0;
}
SPECIFIC QUESTION:
How can I make the program accept arbitrarily long lines, so that there is no limit at all on line length? Thanks
You could declare a pointer to char, read the input char by char, and keep reallocating the buffer until you get to the '\n':
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char key[] = "."; // no '\n' here since fgets() is not used
    char *text = NULL;
    FILE *fp;
    fp = fopen("EX13.txt", "w");
    if (fp == NULL)
    {
        perror("EX13.txt not opened");
        return 1;
    }
    printf("Enter text or '.' to exit: ");
    int cont = 0;
    while (1) // read all chars
    {
        // make room for one more char; realloc(NULL, n) behaves like malloc(n)
        char *tmp = realloc(text, (size_t)cont + 1);
        if (tmp == NULL) // out of memory: text is still valid and is freed below
            break;
        text = tmp;
        if (scanf("%c", &text[cont]) != 1) // read a single char; stop at end of input
            break;
        if (text[cont] == '\n') // see if it is the end of line
        {
            text[cont] = 0; // the end of the line is the end of the string
            if (!strcmp(key, text)) // if the string is just a dot, end the loop
                break;
            fprintf(fp, "%d: %s\n", cont, text);
            printf("Enter text or '.' to exit: ");
            cont = 0; // restart the counter for the next input
            free(text); // freed after each line; you could keep the buffer and only grow it for longer lines
            text = NULL;
        }
        else // not the end of the line yet, so count this char
            cont++;
    }
    free(text);
    fclose(fp);
    puts("Exit program");
    return 0;
}
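The hint in the assignment (start with a small buffer, call fgets(), and realloc until a line feed shows up) could be sketched roughly like this. read_line is a hypothetical name; the initial size of 16 comes from the hint, while doubling the capacity is just one possible growth policy.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads one line of any length from fp using fgets() and realloc().
   Returns a malloc'ed string without the line ending (the caller frees it),
   or NULL at end of input or if memory runs out.
   read_line is a hypothetical name chosen for this sketch. */
char *read_line(FILE *fp)
{
    size_t cap = 16;   /* small initial size, as the hint suggests */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    /* fgets() writes at buf + len, so each call continues the same line */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n')
            break;     /* the whole line has been read */
        /* no line feed yet: double the capacity and keep reading */
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }
    if (len == 0) {    /* nothing was read: end of input */
        free(buf);
        return NULL;
    }
    buf[strcspn(buf, "\r\n")] = '\0';   /* cut at the first '\r' or '\n' */
    return buf;
}
```

Each returned string can then be written with something like fprintf(fp, "%zu:%s\n", strlen(s), s) and freed before the next line is requested.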
Suggest using getline()
This seems to be a classroom assignment, so I will not be writing the code for you.
Note: for the getline() function to be visible on Linux, you will need a statement similar to one of the following at the beginning of your code, before any #include:
#define _GNU_SOURCE
or
#define _POSIX_C_SOURCE 200809L
getline(3)
NAME
getdelim, getline -- get a line from a stream
LIBRARY
Standard C Library (libc, -lc)
SYNOPSIS
#include <stdio.h>
ssize_t
getdelim(char ** restrict linep, size_t * restrict linecapp,
int delimiter, FILE * restrict stream);
ssize_t
getline(char ** restrict linep, size_t * restrict linecapp,
FILE * restrict stream);
DESCRIPTION
The getdelim() function reads a line from stream, delimited by the
character delimiter. The getline() function is equivalent to getdelim()
with the newline character as the delimiter. The delimiter character is
included as part of the line, unless the end of the file is reached.
The caller may provide a pointer to a malloced buffer for the line in
*linep, and the capacity of that buffer in *linecapp. These functions
expand the buffer as needed, as if via realloc(). If linep points to a
NULL pointer, a new buffer will be allocated. In either case, *linep and
*linecapp will be updated accordingly.
RETURN VALUES
The getdelim() and getline() functions return the number of characters
written, excluding the terminating NUL character. The value -1 is
returned if an error occurs, or if end-of-file is reached.
EXAMPLES
The following code fragment reads lines from a file and writes them to
standard output. The fwrite() function is used in case the line contains
embedded NUL characters.
char *line = NULL;
size_t linecap = 0;
ssize_t linelen;
while ((linelen = getline(&line, &linecap, fp)) > 0)
fwrite(line, linelen, 1, stdout);
ERRORS
These functions may fail if:
[EINVAL] Either linep or linecapp is NULL.
[EOVERFLOW] No delimiter was found in the first SSIZE_MAX characters.
These functions may also fail due to any of the errors specified for
fgets() and malloc().
Note: when the code is through with the line, pass it to free() to avoid a memory leak.
Note: to remove any trailing '\n' you can use:
line[ strcspn( line, "\n" ) ] = '\0';
Note: after removing any trailing '\n' you can get the length of the line in bytes with:
size_t length = strlen( line );
Then print that length and the line (followed by a newline, since the original one was removed) using:
printf( "%zu:%s\n", length, line );
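For readers who want to see the notes above combined, here is one possible sketch. copy_with_lengths is a hypothetical name, and stopping at a line that holds a single dot is taken from the assignment text, not from the man page.

```c
#define _POSIX_C_SOURCE 200809L   /* makes getline() visible on Linux */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copies lines from in to out as "<length>:<text>" until end of input or a
   line consisting of a single dot. Returns the number of lines written.
   copy_with_lengths is a hypothetical name used for this sketch. */
long copy_with_lengths(FILE *in, FILE *out)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;
    long count = 0;

    while (getline(&line, &cap, in) != -1) {
        line[strcspn(line, "\r\n")] = '\0';   /* drop the line ending */
        if (strcmp(line, ".") == 0)
            break;
        fprintf(out, "%zu:%s\n", strlen(line), line);
        count++;
    }
    free(line);          /* one free() covers every reallocation */
    return count;
}
```

Called as copy_with_lengths(stdin, fp), this would turn the input line "This is a test" into "14:This is a test" in the file.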
What is the most accurate way to read strings from the keyboard in C when the string contains spaces between words? When I use scanf for that purpose, it doesn't read a string with spaces. The second option is to use gets, but it is supposed to be harmful (I also want to know why). Another thing is that I don't want to use any file handling concept like fgets.
Here are two ways to read strings containing spaces without using gets or fgets.
You can use getline (POSIX 2008; it may not exist on your system), which conveniently allocates a buffer large enough to hold the whole line.
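char *line = NULL;
size_t bufsize = 0;
ssize_t n_read; // number of characters read including delimiter, or -1 at end of input or on error
// n_read must be signed: with size_t, the -1 would wrap to a huge value and the loop would never end
while ((n_read = getline(&line, &bufsize, stdin)) > 1) { // "> 1" also stops on an empty line
    // do something with line
}
free(line);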
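If you absolutely want scanf, this example reads to the end of the line unless the line has more than 1023 chars (the buffer size minus 1 for the terminating null character). In the latter case the line is truncated and you'll get the remaining chars in the next scanf invocation. Note that the trailing \n in the format matches any amount of whitespace, so blank lines and leading whitespace on the following line are skipped.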
char line[1024];
while (scanf("%1023[^\n]\n", line) == 1) {
// do something with line
}
I should also point out that when you read strings from the keyboard with scanf for example, you are actually reading from a file with file pointer stdin. So you can't really avoid "any file handling concept"
@user3623265,
Here is a sample program that uses fgets to read a string from standard input.
Please consult the C documentation on how fgets can be used to read strings from the keyboard and on the purpose of stdin.
#include <stdio.h>
#include <string.h>
int main(void)
{
    char str[80];
    printf("Enter a string: ");
    if (fgets(str, sizeof(str), stdin) == NULL)
        return 1; // end of input or read error
    str[strcspn(str, "\n")] = '\0'; // remove the trailing newline, if any (safe for an empty string)
    printf("This is your string: %s\n", str);
    return 0;
}
There is a third option: you can read the raw data from stdin with the read() call:
#include <stdio.h>
#include <unistd.h>
int main(void) {
    char buf[1024];
    ssize_t n_bytes_read;
    n_bytes_read = read(STDIN_FILENO, buf, sizeof(buf) - 1);
    if (n_bytes_read < 0) {
        perror("read"); // error occurred; do not index buf with a negative value
        return 1;
    }
    buf[n_bytes_read] = '\0'; // terminate string
    printf("\'%s\'", buf);
    return 0;
}
Please note that all input is copied raw to buf, including the trailing newline. That is, if you enter Hello World you will get
'Hello World
'
as output.
If you insist on not having a FILE * in scope, use getchar().
char buff[1024];
int ch;
int i = 0;
/* also stop at EOF, otherwise the loop never ends when input runs out */
while( (ch = getchar()) != '\n' && ch != EOF )
    if(i < 1023)
        buff[i++] = ch;
buff[i] = 0;
/* now move string into a smaller buffer */
Generally however it's accepted that stdout and stdin and FILE * are available. Your requirement is a bit odd and, since you are obviously not an advanced C programmer who has an unusual need to suppress the FILE * symbol, I suspect your understanding of C IO is shaky.
I am reading one line from a file which contains on the first line the word "hello". And then I am comparing it with "hello" using strcasecmp, however it is telling me it is still different
char *line = NULL;
size_t len = 100;
printf("%s", argv[1]);
FILE * fp = fopen(argv[1], "r");
if (fp == NULL) {
printf("empty\n");
exit(0);
}
getline(&line, &len, fp);
if (strcasecmp(line, "hello") == 0) {
printf("same");
}
strcasecmp will only return 0 if the strings are the same (except for the case), not if the first string starts with the second string.
And getline reads the newline character at the end of the line, so if the first line of the file is "hello", the string you get in "line" will be "hello\n".
This is from the man page for getline():
getline()
reads an entire line from stream, storing the address of the
buffer containing the text into *lineptr.
The buffer is null-terminated
and includes the newline character, if one was found.
Notice that part about including the newline character.
So, either limit the length of the comparison or better, trim the new line char, using something similar to:
char * newline = NULL;
if( NULL != (newline = strchr( line, '\n' ) ) )
{ // then newline found
*newline = '\0';
}
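Put together, trimming and comparing might look like the sketch below. line_matches is a hypothetical helper name, and it also cuts a '\r' in case the file has DOS line endings, which is an assumption beyond the question.

```c
#include <string.h>
#include <strings.h>   /* strcasecmp() (POSIX) */

/* Returns 1 if line equals word ignoring case, after removing a trailing
   line ending from line; 0 otherwise. Modifies line in place.
   line_matches is a hypothetical helper name. */
int line_matches(char *line, const char *word)
{
    line[strcspn(line, "\r\n")] = '\0';   /* no-op if there is no line ending */
    return strcasecmp(line, word) == 0;
}
```

The other option mentioned above, limiting the length of the comparison, would use strncasecmp(line, "hello", 5) instead; that also accepts lines that merely start with "hello", such as "hello world".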
I have two .csv files and I need to read each file completely, but it has to be done field by field. CSV files hold data separated by commas, so I can't use fgets.
I need to read all the data, but I don't know how to move on to the next line.
Here is what I've done so far:
int main()
{
FILE *arq_file;
arq_file = fopen("file.csv", "r");
if(arq_file == NULL){
printf("Not possible to read the file.");
exit(0);
}
while( !feof(arq_file) ){
fscanf(arq_file, "%i %lf", &myStruct[i+1].Field1, &myStruct[i+1].Field2);
}
fclose(arq_file);
return 0;
}
It gets stuck in an infinite loop because it never moves on to the next line.
How can I get to the line below the one I just read?
Update: File 01 Example
1,Alan,123,
2,Alan Harper,321
3,Jose Rendeks,32132
4,Maria da graça,822282
5,Charlie Harper,9999999999
File 02 Example
1,320,123
2,444,321
3,250,123,321
3,3,250,373,451
2,126,621
1,120,320
2,453,1230
3,12345,0432,1830
I think an example is better than giving you hints. This one combines fgets() and strtok(). Other functions, such as strchr(), could also work, but this way is easier, and I only wanted to point you in the right direction.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int
main(void)
{
    FILE *file;
    char buffer[256];
    char *pointer;
    size_t line;
    file = fopen("data.dat", "r");
    if (file == NULL)
    {
        perror("fopen()");
        return -1;
    }
    line = 0;
    while ((pointer = fgets(buffer, sizeof(buffer), file)) != NULL)
    {
        size_t field;
        char *token;
        field = 0;
        /* '\n' is a delimiter too, so the last field does not keep the line ending */
        while ((token = strtok(pointer, ",\n")) != NULL)
        {
            printf("line %zu, field %zu -> %s\n", line, field, token);
            field += 1;
            pointer = NULL; /* continue with the same line on the next strtok() call */
        }
        line += 1;
    }
    fclose(file);
    return 0;
}
I think it's very clear how the code works and I hope you can understand.
If the same code has to handle both data files, then you're stuck with reading the fields into a string, and subsequently converting the string into a number.
It is not clear from your description whether you need to do something special at the end of line or not — but because only one of the data lines ends with a comma, you do have to allow for fields to be separated by a comma or a newline.
Frankly, you'd probably do OK with using getchar() or equivalent; it is simple.
char buffer[4096];
char *bufend = buffer + sizeof(buffer) - 1;
char *curfld = buffer;
int c;
while ((c = getc(arq_file)) != EOF)
{
if (curfld == bufend)
…process overlong field…
else if (c == ',' || c == '\n')
{
*curfld = '\0';
process(buffer);
curfld = buffer;
}
else
*curfld++ = c;
}
if (c == EOF && curfld != buffer)
{
*curfld = '\0';
process(buffer);
}
However, if you want to go with higher level functions, then you do want to use fgets() to read lines (unless you need to worry about deviant line endings, such as DOS vs Unix vs old-style Mac (CR-only) line endings). Or use POSIX getline() to read arbitrarily long lines. Then split the lines using strtok_r() or equivalent.
char *buffer = 0;
size_t buflen = 0;
while (getline(&buffer, &buflen, arq_file) != -1)
{
char *posn = buffer;
char *epos;
char *token;
while ((token = strtok_r(posn, ",\n", &epos)) != 0)
{
process(token);
posn = 0;
}
/* Do anything special for end of line */
}
free(buffer);
If you think you must use scanf(), then you need to use something like:
char buffer[4096];
char c;
while (fscanf(arq_file, "%4095[^,\n]%c", buffer, &c) == 2)
process(buffer);
The %4095[^,\n] scan set reads up to 4095 characters that are neither comma nor newline into buffer, and then reads the next character (which must, therefore, either be comma or newline — or conceivably EOF, but that causes problems) into c. If the last character in the file is neither comma nor newline, then you will skip the last field.
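The process() function is left undefined in the fragments above. One hypothetical building block for it, covering the "convert the string into a number" step, might look like the following sketch; field_to_long is an invented name, and treating every field as a base-10 integer is an assumption about the data.

```c
#include <errno.h>
#include <stdlib.h>

/* Converts one CSV field to a long. Returns 1 and stores the value in *out
   if the whole field is a valid number, 0 otherwise (an empty field, text
   such as a name, or a value out of range for long). Hypothetical helper. */
int field_to_long(const char *field, long *out)
{
    char *end;
    errno = 0;
    long value = strtol(field, &end, 10);
    if (end == field || *end != '\0' || errno == ERANGE)
        return 0;
    *out = value;
    return 1;
}
```

Base 10 is passed explicitly so that a field like 0432 is read as 432 and not as an octal number. A value such as 9999999999 from the first sample file is rejected on platforms where long is 32 bits wide, so a wider type with strtoll() may be needed there.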
I'm trying to parse in a text file, and add each distinct word into a hashtable, with the words as keys, and their frequencies as values. The problem is proving to be the reading part: the file is a very large file of "normal" text, in that it has punctuation and special characters. I want to treat all non-alphabetical chars read in as word-boundaries. I have something basic going with this:
char buffer[128];
while(fscanf(fp, "%127[A-Za-z]%*c", buffer) == 1) {
printf("%s\n", buffer);
memset(buffer, 0, 128);
}
However, that chokes whenever it actually hits a non-alphabetical char preceded by whitespace (e.g., "the,cat was (brown)" would be read in as "the cat was"). I know what the issue is with that code, but I'm not sure how to get around it. Would I be better off just reading in an entire line and doing the parsing manually? I'm trying scanf because I felt that this was a pretty good candidate for the mini-regex thing that you can do with the format string.
Suggest use of isalpha(), fgetc() and a simple state-machine.
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
int AdamRead(FILE *inf, char *dest, size_t n) {
int ch;
do {
ch = fgetc(inf);
if (ch == EOF) return EOF;
} while (!isalpha(ch));
assert(n > 1);
n--; // save room for \0
while (n-- > 0) {
*dest++ = ch;
ch = fgetc(inf);
if (!isalpha(ch)) break;
}
ungetc(ch, inf); // Add this if something else may need to parse `inf`.
*dest = '\0';
return 1;
}
char buffer[128];
while(AdamRead(fp, buffer, sizeof buffer) == 1) {
printf("%s\n", buffer);
}
Note: If you want to go the "%127[A-Za-z]%*[^A-Za-z]" route, code may need to start with a one-time fscanf(fp, "%*[^A-Za-z]"); to deal with leading non-letters.
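A sketch of that fscanf() route is shown below. Instead of a one-time skip, it skips non-letters before every word, which handles leading non-letters the same way. next_word is a hypothetical name, and the A-Z range syntax inside a scan set is implementation-defined in standard C, although common libraries support it.

```c
#include <stdio.h>

/* Reads the next run of letters from fp into word, which must have room for
   at least 128 bytes. Returns 1 if a word was read, 0 at end of input.
   next_word is a hypothetical name; the 127-character limit matches the
   question's buffer size. */
int next_word(FILE *fp, char *word)
{
    /* skip any non-letters before the word; matching nothing is fine,
       only end of input stops us here */
    if (fscanf(fp, "%*[^A-Za-z]") == EOF)
        return 0;
    return fscanf(fp, "%127[A-Za-z]", word) == 1;
}
```

With the input "the,cat was (brown)" from the question, repeated calls yield the, cat, was and brown. A run of more than 127 letters is split into several words.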
There's another way apart from the one mentioned in the comment. I don't know if it's better though. You can read lines from the file using fgets and then tokenize each line using the POSIX function strtok_r. Here, the r means the function is reentrant, which makes it thread-safe. However, you must know the maximum length a line can have in the file.
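#include <stdio.h>
#include <string.h>
#define MAX_LEN 100
// in main
char line[MAX_LEN];
char *str; // first argument for strtok_r: line on the first call, then NULL
char *token;
const char *delim = "!@#$%^&*"; // all special characters
char *saveptr; // for strtok_r
FILE *fp = fopen("myfile.txt", "r");
if(fp == NULL) {
    perror("myfile.txt");
    return 1;
}
while(fgets(line, MAX_LEN, fp) != NULL) {
    // line is an array and cannot be assigned NULL, so iterate with a pointer
    for(str = line; ; str = NULL) {
        token = strtok_r(str, delim, &saveptr);
        if(token == NULL)
            break;
        else {
            // token is a string.
            // process it
        }
    }
}
fclose(fp);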
strtok_r modifies its first argument, line, so you should keep a copy of it if it is needed for other purposes.