I have a configuration file that is supposed to have \r\n line ending style, and I want to include the code in my program to check and correct the format.
Existing code:
int convert_line_endings(FILE *fp)
{
int c = 0, lastc = 0, cnt = 0; /* int, so that EOF can be told apart from real characters */
while((c = fgetc(fp)) != EOF)
{
if((c == '\n') && (lastc != '\r'))
{
cnt++;
//somehow "insert" a '\r' in here, after the previous char and before the '\n'
}
lastc = c;
}
return cnt;
}
And in C programming, you can't "insert" a char (or can you?!), just overwrite one or the other. Any suggestions?
No, you can't insert or delete bytes in the middle of a file in place. What you can do is copy the contents to a new temporary file, writing a \r before every \n that is not already preceded by one, and then replace the original file with the temporary one.
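The temporary-file approach could be sketched roughly as follows. The function name copy_with_crlf is made up for illustration, and the sketch assumes both streams were opened in binary mode so the C library does not translate line endings itself:

```c
#include <stdio.h>

/* Hypothetical helper: copies `in` to `out`, writing a '\r' before every
 * '\n' that is not already preceded by one.
 * Returns the number of line endings fixed, or -1 on a write error. */
int copy_with_crlf(FILE *in, FILE *out)
{
    int c, lastc = 0, cnt = 0;

    while ((c = fgetc(in)) != EOF) {
        if (c == '\n' && lastc != '\r') {
            if (fputc('\r', out) == EOF)
                return -1;
            cnt++;
        }
        if (fputc(c, out) == EOF)
            return -1;
        lastc = c;
    }
    return cnt;
}
```

After a successful copy, the caller would close both files and move the temporary file over the original, for example with rename(). Whether rename() replaces an existing file is implementation-defined in ISO C (POSIX systems do replace it), so calling remove() on the original first may be needed elsewhere.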
My assignment is to redirect a text file into my program and do all sorts of operations on it. Everything is working except for one little problem.
The main function that reads input is getline1():
char* getline1(){
char *LinePtr = (char*)malloc(sizeof(char*)*LINE);
int i = 0;
for ( ; (*(LinePtr+i) = getc(stdin)) != '\n' ; i++){}
*(LinePtr+i) = '\0';
return LinePtr;
}
It returns a pointer to a char array holding a single line.
We know that lines are separated by the '\n' character.
A previous problem I had was when I wrote the loop in getline1() like this:
for (int i = 0 ; Line[i] != '\n' ; i++){
Line[i] = getc(stdin);
}
Logically it looks valid, but getc() reads from a stream, and the answers I found online say this version will not work. I didn't quite understand why.
Anyway, the big issue is that I need to know how many lines there are in the text so I can stop reading values, or to find out from getline1() that there is no next line left and I'm done.
Things to take into account:
1. Only <stdio.h> and <stdlib.h> may be used.
2. I'm using Ubuntu Linux and the gcc compiler.
3. The redirection goes like ./Run<input.txt
I also understand that stdin is a file pointer, but I didn't find a way that this can help me.
Thank you ,
Denis
You should check for EOF in addition to the newline character. You should also check that your index stays below LINE - 1, both to avoid overflowing the buffer and to leave room for the null terminator.
#include <stdio.h>
#include <stdlib.h>
#define LINE 100
char *my_getline(void)
{
size_t i = 0;
char *str = NULL;
int c = 0;
if ((str = malloc(LINE)) == NULL)
{
fprintf(stderr,"Malloc failed");
exit(EXIT_FAILURE);
}
while (i+1 < LINE && (c = getchar()) != EOF && c != '\n') /* Saving space for \0 */
{
str[i++] = c;
}
str[i] = '\0';
return str;
}
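One way to let the caller know that there is no next line is to return NULL once the stream is exhausted. The sketch below is only a variation on the function above: read_line and count_remaining_lines are hypothetical names, the stream is passed as a parameter (pass stdin for redirected input), and a failed malloc is reported the same way as end of input, which a real program might want to distinguish.

```c
#include <stdio.h>
#include <stdlib.h>

#define LINE 100

/* Hypothetical variant of my_getline: returns a malloc'd line without its
 * '\n', or NULL when nothing is left to read. Lines longer than LINE - 1
 * characters are returned in several pieces. The caller frees the result. */
char *read_line(FILE *fp)
{
    size_t i = 0;
    int c = EOF;
    char *str = malloc(LINE);

    if (str == NULL)
        return NULL;
    while (i + 1 < LINE && (c = getc(fp)) != EOF && c != '\n')
        str[i++] = (char)c;
    if (c == EOF && i == 0) {   /* end of input and nothing was read */
        free(str);
        return NULL;
    }
    str[i] = '\0';
    return str;
}

/* Reads lines until read_line reports the end of input and counts them. */
int count_remaining_lines(FILE *fp)
{
    int count = 0;
    char *line;

    while ((line = read_line(fp)) != NULL) {
        count++;
        free(line);
    }
    return count;
}
```

With this shape the program does not need to know the number of lines in advance: it simply loops until read_line(stdin) returns NULL.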
Thanks, everybody. I just made another function to count the lines; this was the only lazy option available :)
static void linecounter(void){
FILE *fileptr;
int count = 0;
int chr; /* int, not char, so that EOF can be told apart from real characters */
fileptr = fopen("input.txt", "r");
if (fileptr == NULL){return;}
chr = getc(fileptr);
while (chr != EOF){
if (chr == '\n'){count = count + 1;}
chr = getc(fileptr);}
fclose(fileptr);
count_lines = count;} /* count_lines is presumably a global declared elsewhere */
I have a CSV data file that has the following data:
H1,H2,H3
a,"b
c
d",e
When I open through Excel as CSV file, it is able to show the sheet with column headings as H1, H2, H3 and column values as: a for H1,
multi line value as
b
c
d
for H2
and e for H3
I need to parse this file using a C program and have the values picked up like this.
But, my following code snippet will not work, as I have multi line values for a column:
char buff[200];
char tokens[10][30];
fgets(buff, 200, stdin);
char *ptok = buff; // for iterating
char *pch;
int i = 0;
while ((pch = strchr(ptok, ',')) != NULL) {
*pch = 0;
strcpy(tokens[i++], ptok);
ptok = pch+1;
}
strcpy(tokens[i++], ptok);
How to modify this code snippet to accommodate multi-line values of columns?
Please don't get bothered by the hard-coded values for the string buffers, this is the test code as POC.
Instead of any 3rd party library, I would like to do it the hard way from first principle.
Please help.
The main complication in parsing "well-formed" CSV in C is precisely the handling of variable-length strings and arrays which you are avoiding by using fixed-length strings and arrays. (The other complication is handling not well-formed CSV.)
Without those complications, the parsing is really quite simple:
(untested)
/* Appends a non-quoted field to s and returns the delimiter */
int readSimpleField(struct String* s) {
for (;;) {
int ch = getchar();
if (ch == ',' || ch == '\n' || ch == EOF) return ch;
stringAppend(s, ch);
}
}
/* Appends a quoted field to s and returns the delimiter.
* Assumes the open quote has already been read.
* If the field is not terminated, returns ERROR, which
* should be a value different from any character or EOF.
* The delimiter returned is the character after the closing quote
* (or EOF), which may not be a valid delimiter. Caller should check.
*/
int readQuotedField(struct String* s) {
for (;;) {
int ch = getchar();
if (ch == EOF) return ERROR;
if (ch == '"') {
/* Either a doubled quote (a literal ") or the closing quote */
ch = getchar();
if (ch != '"') return ch;
}
stringAppend(s, ch);
}
}
/* Reads a single field into s and returns the following delimiter,
* which might be invalid.
*/
int readField(struct String* s) {
stringClear(s);
int ch = getchar();
if (ch == '"') return readQuotedField(s);
if (ch == ',' || ch == '\n' || ch == EOF) return ch; /* empty field */
stringAppend(s, ch);
return readSimpleField(s);
}
/* Reads a single row into row and returns the following delimiter,
* which might be invalid.
*/
int readRow(struct Row* row) {
struct String field = {0};
rowClear(row);
/* Make sure there is at least one field */
int ch = getchar();
if (ch != '\n' && ch != EOF) {
ungetc(ch, stdin);
do {
ch = readField(&field);
rowAppend(row, &field);
} while (ch == ',');
}
return ch;
}
/* Reads an entire CSV file into table.
* Returns true if the parse was successful.
* If an error is encountered, returns false. If the end-of-file
* indicator is set, the error was an unterminated quoted field;
* otherwise, the next character read will be the one which
* triggered the error.
*/
bool readCSV(struct Table* table) {
tableClear(table);
struct Row row = {0};
/* Make sure there is at least one row */
int ch = getchar();
if (ch != EOF) {
ungetc(ch, stdin);
do {
ch = readRow(&row);
tableAppend(table, &row);
} while (ch == '\n');
}
return ch == EOF;
}
The above is "from first principles" -- it does not even use standard C library string functions. But it takes some effort to understand and verify. Personally, I would use (f)lex and maybe even yacc/bison (although it's a bit of overkill) to simplify the code and make the expected syntax more obvious. But handling variable-length structures in C will still need to be the first step.
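The code above uses struct String, stringAppend and stringClear without defining them. A minimal sketch of such a growable string might look like the following; the field names, the growth policy and the int return value are all assumptions made for illustration, not something the answer specifies.

```c
#include <stdlib.h>

/* Hypothetical layout. A zero-initialised struct ({0}) is a valid empty string. */
struct String {
    char  *data;   /* NUL-terminated once anything has been appended */
    size_t len;    /* number of characters, excluding the terminator */
    size_t cap;    /* allocated size of data in bytes */
};

/* Empties the string but keeps its allocation for reuse. */
void stringClear(struct String *s)
{
    s->len = 0;
    if (s->data != NULL)
        s->data[0] = '\0';
}

/* Appends one character, doubling the buffer when it is full.
 * Returns 1 on success and 0 if memory could not be allocated. */
int stringAppend(struct String *s, int ch)
{
    if (s->len + 2 > s->cap) {             /* room for ch and the '\0' */
        size_t ncap = s->cap ? s->cap * 2 : 16;
        char *p = realloc(s->data, ncap);
        if (p == NULL)
            return 0;
        s->data = p;
        s->cap = ncap;
    }
    s->data[s->len++] = (char)ch;
    s->data[s->len] = '\0';
    return 1;
}
```

A Row could be built the same way as a growable array of strings, and a Table as a growable array of rows. Since readRow reuses one field buffer, rowAppend would have to copy the field's contents rather than keep the pointer.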
while((c = getc(file)) != -1)
{
if (c == ';')
{
//here I want to skip the line that starts with ;
//I don't want to read any more characters on this line
}
else
{
do
{
//Here I do my stuff
}while (c != -1 && c != '\n');//until end of file
}
}
Can I completely skip a line using getc if first character of line is a semicolon?
Your code contains a couple of references to -1. I suspect that you're assuming that EOF is -1. That's a common value, but it is simply required to be a negative value — any negative value that will fit in an int. Do not get into bad habits at the start of your career. Write EOF where you are checking for EOF (and don't write EOF where you are checking for -1).
int c;
while ((c = getc(file)) != EOF)
{
if (c == ';')
{
// Gobble the rest of the line, or up until EOF
while ((c = getc(file)) != EOF && c != '\n')
;
}
else
{
do
{
//Here I do my stuff
…
} while ((c = getc(file)) != EOF && c != '\n');
}
}
Note that getc() returns an int so c is declared as an int.
Let's assume that by "line" you mean a string of characters until you hit a designated end-of-line character (here assumed as \n, different systems use different characters or character sequences like \r\n). Then whether the current character c is in a semicolon-started line or not becomes a state information which you need to maintain across different iterations of the while-loop. For example:
bool is_new_line = true;
bool starts_with_semicolon = false;
int c;
while ((c = getc(file)) != EOF) {
if (is_new_line) {
starts_with_semicolon = c == ';';
}
if (!starts_with_semicolon) {
// Process the character.
}
// If c is '\n', then next letter starts a new line.
is_new_line = c == '\n';
}
The code is just to illustrate the principle -- it's not tested or anything.
I'm trying to read the first character of each line of a file: whenever it's equal to '(' I should skip that line, otherwise I should get the first character from that line. I'm on a Mac, so I can make use of fgetln.
FILE *file = fopen("test.txt", "r");
char c;
while(fscanf(file, "%s", &c) != EOF) {
if (c != '(')
printf("%c", c);
}
That's my current code. I don't know how to skip lines, although I've tried reading the whole line and checking only the first char to solve the skipping problem. However, this is not working: I'm getting strange characters in my console instead of the ones inside test.txt. How should I do that?
The problem with the %s format specifier of fscanf is that it splits on any whitespace, not only on end-of-line characters. Moreover, reading into a single-character buffer will nearly always produce undefined behavior, because %s stores at least one character plus a terminating null.
There are several ways to solve this problem, using different APIs:
Replace %s with %200[^\n] and pass a 201-character buffer instead of c,
Use fgets with a properly sized buffer and pick the initial character, or
Use a character-based API and set a "take next" flag each time you see a '\n' character.
Here is how you can implement the third approach:
bool takeNext = true;
int ch;
while ((ch = fgetc(file)) != EOF) {
if (takeNext && ch != '(') {
printf("%c", ch);
}
takeNext = (ch == '\n');
}
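For comparison, the first approach (the %200[^\n] scanset) could be sketched as below. The function name print_line_starts_scanf and the stream parameters are made up for illustration. Note that %[^\n] fails to match on an empty line, so unlike the flag-based loop above this sketch prints nothing for empty lines.

```c
#include <stdio.h>

/* Hypothetical helper: writes to `out` the first character of every
 * non-empty line of `in` that does not start with '('. */
void print_line_starts_scanf(FILE *in, FILE *out)
{
    char line[201];

    for (;;) {
        int c;
        int n = fscanf(in, "%200[^\n]", line);

        if (n == EOF)
            break;                     /* nothing left to read */
        if (n == 1 && line[0] != '(')
            fputc(line[0], out);
        /* Discard whatever remains of the line, including its '\n'.
         * n == 0 means the line was empty and only the '\n' is pending. */
        while ((c = getc(in)) != EOF && c != '\n')
            ;
        if (c == EOF)
            break;
    }
}
```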
Here is a slightly longer character-based approach, which conditions on whether the first character in a line is ( or not.
If it is (, then we consume everything up to and including the next newline without outputting.
If it is not, then we do the same thing but we output the characters as we read them.
#include <stdio.h>
int main(void){
FILE *file = fopen("test.txt", "r");
int c;
if (file == NULL)
return 1;
while((c = getc(file)) != EOF) {
if (c == '(') {
// Skip until the next newline
do {
c = getc(file);
} while (c != EOF && c != '\n');
}
else {
// Echo the rest of the line, including its newline
while (c != EOF && c != '\n') {
putchar(c);
c = getc(file);
}
if (c == '\n')
putchar(c);
}
}
fclose(file);
return 0;
}
Change c to a string, because the %s conversion of fscanf reads a string. Then check whether the first character of c is (.
If it is not, print what was read; otherwise skip it. Note that %s reads one whitespace-delimited word at a time rather than a whole line.
FILE *file = fopen("test.txt", "r");
char c[100];
while(fscanf(file, "%99s", c) == 1) {
if (c[0] != '(')
printf("%s", c);
}
Use fgets to read whole lines. It is also safer than fscanf as it limits the reading to the buffer size.
To check if the first char is '(' you can refer to it directly:
if (buf[0]=='(')
or
if (*buf=='(')
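Put together, an fgets-based loop might look like the sketch below. The function name print_line_starts and the 256-byte buffer are arbitrary choices for illustration. The at_line_start flag handles lines longer than the buffer, which fgets returns in several pieces.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes to `out` the first character of every line
 * of `in` that does not start with '('. */
void print_line_starts(FILE *in, FILE *out)
{
    char buf[256];
    int at_line_start = 1;

    while (fgets(buf, sizeof buf, in) != NULL) {
        if (at_line_start && buf[0] != '(')
            fputc(buf[0], out);
        /* If no '\n' was stored, the line continues in the next chunk. */
        at_line_start = (strchr(buf, '\n') != NULL);
    }
}
```

Calling print_line_starts(file, stdout) on the file opened in the question would print the wanted characters to the console.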
I am trying to get fgetc to read through a file and skip from a certain indicator until a new line. This seems like a simple question, but I can't find any documentation on it.
Here is an example of my question:
read this in ; skip from semicolon on to new line
My best guess at a solution would be to read in the entire file, and for each line use strtok to skip from ; to the end of the line. Obviously this is horribly inefficient. Any ideas?
*I need to use fgetc or something like fgetc that will parse the file character by character
The easiest thing to do is read the entire line in, then truncate it if there is a ;.
char buffer[1024], * p ;
if ( fgets(buffer, sizeof(buffer), fin) )
{
if (( p= strchr( buffer, ';' ))) { *p = '\0' ; } // chop off ; and anything after
for ( p= buffer ; ( * p ) ; ++ p )
{
char c= * p ;
// do what you want with each character c here.
}
}
When you do the read, buffer will initially contain:
"read this in ; skip from semicolon on to new line\n\0"
After you find the ; in the line and stick a '\0' there, the buffer looks like:
"read this in \0 skip from semicolon on to new line\n\0"
So the for loop starts at r and stops at the first \0.
//fgets-like function that stores characters up to, but not including, the given delimiter.
//The rest of the line after the delimiter is still consumed from the stream and discarded.
//s : buffer, n : buffer size
char *fgets_delim(char *s, int n, FILE *fp, char delimiter){
int i, ch=fgetc(fp);
if(EOF==ch)return NULL;
for(i=0;i<n-1;++i, ch=fgetc(fp)){
s[i] = ch;
if(ch == '\n'){
s[i+1]='\0';
break;
}
if(ch == EOF){
s[i]='\0';
break;
}
if(ch == delimiter){
s[i]='\0';//s[i]='\n';s[i+1]='\0'
while('\n'!=(ch = fgetc(fp)) && EOF !=ch);//skip
break;
}
}
if(i==n-1){
s[i] = '\0';
if(ch != EOF) ungetc(ch, fp);//buffer full: push back the character that was read ahead
}
return s;
}
Given a requirement to use fgetc(), then you are probably supposed to echo everything up to the first semicolon on the line, and suppress everything from the semicolon to the end of the line. I note in passing that getc() is functionally equivalent to fgetc() and since this code is about to read from standard input and write to standard output, it would be reasonable to use getchar() and putchar(). But rules are rules...
#include <stdio.h>
#include <stdbool.h>
int main(void)
{
int c;
bool read_semicolon = false;
while ((c = fgetc(stdin)) != EOF)
{
if (c == '\n')
{
putchar(c);
read_semicolon = false;
}
else if (c == ';')
read_semicolon = true;
else if (read_semicolon == false)
putchar(c);
/* else suppressed because read_semicolon is true */
}
return 0;
}
If you don't have C99 and <stdbool.h>, you can use int, 0 and 1 in place of bool, false and true respectively. You can use else if (!read_semicolon) if you prefer.