How to copy text until newline character? - c

I have a char array list that contains text from a text file, for example:
this is the first line
this is the second line
I want to have the first line copied to another char array without \n (and/or \r).
I do not know the size of the first line exactly but I do know it is less than 100 bytes.
Snippet of my code:
unsigned char *line;
line = (u_char *)calloc(100, sizeof(char));
//read txt file to list
while(list[0] != '\n'){
line[0] = list[0];
list++;
line++;
}
Unfortunately, line is empty. Note that I know for sure list isn't empty and contains the text shown above.
Any suggestions on this code, or another solution? The file is opened using open() and not fopen(), so I have to loop through my list array.

You can do it like this:
for ( int i = 0; list[i] && list[i] != '\n'; ++i ) {
line[i] = list[i];
}

You also could use strcspn() from the standard library string.h:
Declaration:
size_t strcspn(const char *str1, const char *str2);
Finds the first sequence of characters in the string str1 that does
not contain any character specified in str2.
Returns the length of this first sequence of characters found that do
not match with str2.
Source
Your program would then become
unsigned char *line;
size_t firstlineLength;
//read txt file to list
/* count the characters up to the first line break (\r or \n) */
firstlineLength = strcspn((const char *)list, "\r\n");
/* allocate just the memory you need, +1 for the terminating zero */
line = calloc(firstlineLength + 1, sizeof(char));
/* calloc zero-fills the buffer, so the copy stays null-terminated */
memcpy(line, list, firstlineLength);
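The strcspn() approach can be wrapped in a small helper. This is only a minimal sketch: copy_first_line is a hypothetical name, and it assumes the text is null-terminated, which a buffer filled with read() is not unless you add the '\0' yourself.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of the first line of
 * text, without the trailing "\n" or "\r\n". Assumes text is null-terminated.
 * Returns NULL if the allocation fails; the caller frees the result. */
char *copy_first_line(const char *text)
{
    size_t len = strcspn(text, "\r\n");  /* characters before the first CR or LF */
    char *line = malloc(len + 1);        /* +1 for the terminating '\0' */
    if (line == NULL)
        return NULL;
    memcpy(line, text, len);
    line[len] = '\0';
    return line;
}
```

Calling copy_first_line(list) would then give a string holding only the first line, sized to fit instead of a fixed 100 bytes.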

Related

Scanning data from text file, that doesn't have spacing between each item of data

I have encountered a problem with my homework. I need to scan some data from a text file, to a struct.
The text file looks like this.
012345678;danny;cohen;22;M;danny1993;123;1,2,4,8;Nice person
223325222;or;dan;25;M;ordan10;1234;3,5,6,7;Singer and dancer
203484758;shani;israel;25;F;shaninush;12345;4,5,6,7;Happy and cool girl
349950234;nadav;cohen;50;M;nd50;nadav;3,6,7,8;Engineer very smart
345656974;oshrit;hasson;30;F;osh321;111;3,4,5,7;Layer and a painter
Each item of data to its matching variable.
id = 012345678
first_name = danny
etc...
Now I can't use fscanf because there is no spacing, and fgets reads the whole line.
I found a solution using %[^;]s, but then I would need to write one block of code and copy and paste it 9 times, once for each item of data.
Is there any other option, without changing the text file, that is similar to the code I would write with fscanf if there were spacing between the items of data?
************* UPDATE **************
Hey, first of all, thanks everyone for the help, I really appreciate it.
I didn't understand all your answers, but here is something I did use.
Here's my code :
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct
{
char *idP, *firstNameP, *lastNameP;
int age;
char gender, *userNameP, *passwordP, hobbies, *descriptionP;
}user;
void main() {
FILE *fileP;
user temp;
char test[99];
temp.idP = (char *)malloc(99);
temp.firstNameP = (char *)malloc(99);
temp.lastNameP = (char *)malloc(99);
temp.age = (int )malloc(4);
temp.gender = (char )malloc(sizeof(char));
temp.userNameP = (char *)malloc(99);
fileP = fopen("input.txt", "r");
fscanf(fileP, "%9[^;];%99[^;];%99[^;];%d;%c", temp.idP,temp.firstNameP,temp.lastNameP,&temp.age, temp.gender);
printf("%s\n%s\n%s\n%d\n%c", temp.idP, temp.firstNameP, temp.lastNameP, temp.age, temp.gender);
fgets(test, 60, fileP); // Just testing where it stop scanning
printf("\n\n%s", test);
fclose(fileP);
getchar();
}
It all works well until I scan the int variable, right after that it doesn't scan anything, and I get an error.
Thanks a lot.
As discussed in the comments, fscanf is probably the shortest option (although fgets followed by strtok, and manual parsing are viable options).
You need to use the %[^;] specifier for the string fields (meaning: a string of characters other than ;), with the fields separated by ; to consume the actual semicolons (which we specifically requested not to be consumed as part of the string field). The last field should be %[^\n] to consume up to the newline, since the input doesn't have a terminating semicolon.
You should also (always) limit the length of each string field read with a scanf family function to one less than the available space (the terminating NUL byte is the +1). So, for example, if the first field is at most 9 characters long, you would need char field1[10] and the format would be %9[^;].
It is usually a good idea to put a single space in the beginning of the format string to consume any whitespace (such as the previous newline).
And, of course you should check the return value of fscanf, e.g., if you have 9 fields as per the example, it should return 9.
So, the end result would be something like:
if (fscanf(file, " %9[^;];%99[^;];%99[^;];%d;%c;%99[^;];%99[^;];%99[^;];%99[^\n]", // 7th field (password) read as a string: the sample has "nadav"
s.field1, s.field2, s.field3, &s.field4, …, s.field9) != 9) {
// error
break;
}
(Alternatively, the field with numbers separated by commas could be read as four separate fields as %d,%d,%d,%d, in which case the count would go up to 12.)
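The format described above can be tried on a single line with sscanf, for example after reading the line with fgets. This is a sketch under assumptions: the struct layout, the field sizes, and the name parse_user are made up for illustration, and the password is read as a string because one sample line uses nadav as the password.

```c
#include <stdio.h>

/* Hypothetical record layout; the field sizes are assumptions. */
struct user {
    char id[10];
    char first_name[100];
    char last_name[100];
    int  age;
    char gender;
    char user_name[100];
    char password[100];
    char hobbies[100];
    char description[100];
};

/* Parses one semicolon-separated line into *u.
 * Returns 1 if all 9 fields were read, 0 otherwise. */
int parse_user(const char *line, struct user *u)
{
    int n = sscanf(line,
                   " %9[^;];%99[^;];%99[^;];%d;%c;%99[^;];%99[^;];%99[^;];%99[^\n]",
                   u->id, u->first_name, u->last_name,
                   &u->age, &u->gender,          /* scalars are passed by address */
                   u->user_name, u->password, u->hobbies, u->description);
    return n == 9;
}
```

Note that the int and char members are ordinary struct fields passed with &, so they need no malloc; only the arrays are passed by name.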
Here is a simple tokenizer. As I see it, you have more than one delimiter here (; and ,).
str - string to be tokenized
del - string containing delimiters (in your case ";," or ";" only)
allowempty - if true allows empty tokens if there are two or more consecutive delimiters
return value is a NULL terminated table of pointers to the tokens.
char **mystrtok(const char *str, const char *del, int allowempty)
{
char **result = NULL;
const char *end = str;
size_t size = 0;
int extrachar;
while(*end)
{
if((extrachar = !!strchr(del, *end)) || !*(end + 1))
{
/* add temp variable and malloc / realloc checks */
/* free allocated memory on error */
if(allowempty || !extrachar || end != str) /* a final non-delimiter character always belongs to a token */
{
extrachar = !extrachar * !*(end + 1);
result = realloc(result, (++size + 1) * sizeof(*result));
result[size] = NULL;
result[size -1] = malloc(end - str + 1 + extrachar);
strncpy(result[size -1], str, end - str + extrachar);
result[size -1][end - str + extrachar] = 0;
}
str = end + 1;
}
end++;
}
return result;
}
To free the memory allocated by the tokenizer:
void myfree(char **ptr)
{
char **savedptr = ptr;
while(*ptr)
{
free(*ptr++);
}
free(savedptr);
}
The function is simple, but you can use any separators and any number of separators.

In C turning a char into a string

I am reading from a file using fgetc and doing that makes it so that I have a char. However, I want to convert this char to a string such that I can use the strtok function upon it. How would I go about doing this?
int xp;
while(1) {
xp = fgetc(filename);
char xpchar = xp;
//convert xpchar into a string
}
Simply create an array with two items, your character and the null terminator:
char str[] = {ch, '\0'};
Or if you will, use a compound literal to do the same:
(char[]){ch, '\0'}
Compound literals can be used to convert your character directly, inside an expression:
printf("%s", (char[]){ch, '\0'} );
I suppose you are going to read more than one character from the file, so look at the following example:
#define STR_SIZE 10
// STR_SIZE defines the maximum number of characters to be read from file
int xp;
char str[STR_SIZE + 1] = { 0 }; // the whole array of char is filled with 0
// +1 in the array size ensures that at least one '\0' char
// remains in the array to terminate the string
int strCnt = 0; // counter of characters stored in the array
while (1) {
xp = fgetc(f);
if (xp == EOF) // end of file or read error
break; // stop reading from file
char xpchar = xp;
//convert xpchar into a string
str[strCnt] = xpchar; // store character at the next free position of the array
strCnt++;
if (strCnt >= STR_SIZE) // if the array is filled
break; // stop reading from file
}
Also, filename is a strange name for your file-pointer variable (filename is a good name for a string variable that stores the name of a file, but fgetc and getc need a FILE *), so check that your program has something like:
FILE * f = fopen(filename, "r");
or consider renaming filename.

How to read in the last word in a text file and into another text file in C?

So I'm supposed to write a block of code that opens a file called "words" and writes the last word in the file to a file called "lastword". This is what I have so far:
FILE *f;
FILE *fp;
char string1[100];
f = fopen("words","w");
fp=fopen("lastword", "w");
fscanf(f,
fclose(fp)
fclose(f);
The problem here is that I don't know how to read in the last word of the text file. How would I know which word is the last word?
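This is similar to what the tail tool does: you seek to a certain offset from the end of the file and read the block there, then search backwards. Once you meet whitespace or a newline, the word that starts right after it is the last word, and you can print it from there. The basic code looks like this:
char string[1024 + 1];
char *last;
f = fopen("words","rb"); // binary mode, so the offset is counted in bytes
if (fseek(f, -1024L, SEEK_END) != 0) // the file is shorter than 1024 bytes,
rewind(f); // so read it from the beginning
size_t nread = fread(string, 1, sizeof string - 1, f);
string[nread] = '\0';
while (nread > 0 && isspace((unsigned char)string[nread - 1]))
string[--nread] = '\0'; // drop trailing whitespace such as the final newline
last = string; // if no whitespace is found, the whole block is the word
for (size_t i = nread; i > 0; i--) {
if (isspace((unsigned char)string[i - 1])) {
last = &string[i]; // the last word starts right after this whitespace
break;
}
}
fprintf(fp, "%s", last);
If no word boundary is found in the first block, you continue by reading the second-to-last block and searching it, then the third, and so on until you find it, then print all the characters after that position.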
There are plenty of ways to do this.
Easy way
One easy approach would be to loop on reading words:
f = fopen("words.txt","r"); // attention !! open in "r" mode !!
...
int rc;
do {
rc=fscanf(f, "%99s", string1); // attempt to read
} while (rc==1 && !feof(f)); // while it's successful.
... // here string1 contains the last successful string read
However, this treats any sequence of characters delimited by whitespace as a word. Note the use of the width field in the scanf() format to make sure that there will be no buffer overflow.
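The loop above can be wrapped in a function that also reports whether any word was found at all. A minimal sketch, assuming a hypothetical name last_word and the same 100-byte buffer as in the question:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the last whitespace-separated word of the
 * stream into word (at most 99 characters plus '\0').
 * Returns 1 if a word was found, 0 if the stream contains none. */
int last_word(FILE *in, char word[100])
{
    char buf[100];
    int found = 0;

    while (fscanf(in, "%99s", buf) == 1) {  /* stops at EOF or a read error */
        strcpy(word, buf);                  /* remember the most recent word */
        found = 1;
    }
    return found;
}
```

With f opened for reading and fp for writing, the result could be written out with fprintf(fp, "%s", word) when last_word(f, word) returns 1.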
More exact way
Building on previous attempt, if you want a stricter definition of words, you can just replace the call to scanf() with a function of your own:
rc=read_word(f, string1, 100);
The function would be something like:
int read_word(FILE *fp, char *s, int szmax) {
int started=0, c;
while ((c=fgetc(fp))!=EOF && szmax>1) {
if (isalpha(c)) { // copy only alphabetic chars to string
started=1;
*s++=c;
szmax--;
}
else if (started) // first char after the alphabetics
break; // will end the word.
}
if (started)
*s=0; // if we have found a word, we end it.
return started;
}

A peculiar bug while reading and printing strings of *char in C

I've just encountered something really odd. My string of char (let's call it word) turns out to have additional letters when I print it. The concatenated letter varies depending on:
the length of the proper prefix word.
the number of spaces after the word.
I'm parsing the word from a line which is just one line from the standard input. I'm using a function readWord to get the word out of the line:
void readWord(char **linePointer, char **wordPointer){
char *line = *linePointer;
char *word = *wordPointer;
while (!isEndOfLine(line) && isLowerCaseLetter(*line)){
*word = *line;
word++;
line++;
}
word++;
*word = '\0';
printf("The retrieved word is: %s.\n", *wordPointer)
*linePointer = line;
}
My inputs/outputs look like this (please note that I call the readWord function AFTER taking care of insert and the whitespace between):
// INPUT 1 :
insert foo
insert ba // several spaces after 'ba'
// OUTPUT 2:
The retrieved word is foo.
The retrieved word is bas.
// INPUT 1 :
insert foo
insert bar // several spaces after 'bar'
// OUTPUT 2:
The retrieved word is foo.
The retrieved word is bare.
I was thinking whether I allocate the *word properly and I guess I do:
root.word = (char *)malloc(sizeof(char *)); //root is my structure
Moreover, it is unlikely connected to some errors of reassigning the word string because it is completely clear at the beginning of the readWord() function.
Thank you for any help. It is indeed a challenging bug for me and I don't know what else I can do.
UPDATE
It turns out that I actually have some problems with allocating/reassigning, since:
//INPUT
insert foo//no spaces
insert bar //spaces here
//OUTPUT
word variable before calling readWord function: ' '.
The retrieved word is foo.
word variable before calling readWord function: 'insert foo
'.
The retrieved word is bare.
Never trust your input, so check for spaces to go to the beginning of the word.
You increment word one time too many, as @rpattiso notes.
I have doubts about your memory allocation (you don't show us all your code):
root.word = (char *)malloc(sizeof(char *)); allocates room for a pointer to a char, but does not allocate the room for the characters themselves. readWord can do that.
The following adapted version should work (updated):
void readWord(char **linePointer, char **wordPointer){
char *line = *linePointer;
int i;
while (!isEndOfLine(line) && !isLowerCaseLetter(*line)) line++; // go to begin of word
*linePointer= line;
while (!isEndOfLine(line) && isLowerCaseLetter(*line)) line++; // go to end of word
i= line - *linePointer; // allocate room for word and copy it
*wordPointer= malloc((i+1) * sizeof(char));
strncpy(*wordPointer, *linePointer, i);
(*wordPointer)[i]= '\0';
printf("The retrieved word is: %s.\n", *wordPointer);
*linePointer = line;
}

Array of strings being overwritten

I have a program that is trying to take a text file that consists of the following and feed it to my other program.
Bruce, Wayne
Bruce, Banner
Princess, Diana
Austin, Powers
This is my C code. It is trying to get the number of lines in the file, parse the comma-separated keys and values, and put them all in a list of strings. Lastly, it is trying to iterate through the list of strings and print them out. The output of this is just Austin Powers over and over again. I'm not sure if the problem is how I'm appending the strings to the list or how I'm reading them off.
#include<stdio.h>
#include <stdlib.h>
int main(){
char* fileName = "Example.txt";
FILE *fp = fopen(fileName, "r");
char line[512];
char * keyname = (char*)(malloc(sizeof(char)*80));
char * val = (char*)(malloc(sizeof(char)*80));
int i = 0;
int ch, lines;
while(!feof(fp)){
ch = fgetc(fp);
if(ch == '\n'){ //counts how many lines there are
lines++;
}
}
rewind(fp);
char* targets[lines*2];
while (fgets(line, sizeof(line), fp)){
strtok(line,"\n");
sscanf(line, "%[^','], %[^',']%s\n", keyname, val);
targets[i] = keyname;
targets[i+1] = val;
i+=2;
}
int q = 0;
while (q!=i){
printf("%s\n", targets[q]);
q++;
}
return 0;
}
The problem is with the two lines:
targets[i] = keyname;
targets[i+1] = val;
These do not make copies of the string - they only copy the address of whatever memory they point to. So, at the end of the while loop, each pair of target elements point to the same two blocks.
To make copies of the string, you'll either have to use strdup (if provided), or implement it yourself with strlen, malloc, and strcpy.
Also, as @mch mentioned, you never initialize lines, so while it may be zero, it may also be any garbage value (which can cause char* targets[lines*2]; to fail).
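If strdup is not available, the copy can be written with strlen, malloc, and strcpy as described above. A minimal sketch; copy_string is a hypothetical name:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup: returns a heap copy of s,
 * or NULL if the allocation fails. The caller frees the result. */
char *copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);  /* +1 for the terminating '\0' */
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}
```

In the loop this would become targets[i] = copy_string(keyname); and targets[i+1] = copy_string(val);, so each element owns its own memory and is not overwritten by the next sscanf call.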
First, open the file. Then, in the while loop, check for \n or EOF to end the loop. In the loop, if you get anything other than an alphanumeric character, split off the token and store it in the string array. Increment the count when you encounter \n or EOF. It is better to use do{}while(ch!=EOF);
