I'm really new to C, and currently I'm trying to read from a file which contains a list of names and import them into an array. The current array is of type char[][] since it will hold more information than just the name, but essentially I want team[0][0] to be the first name I read in, team[1][0] to be the second, etc. I'm pretty sure the actual importing of the names is correct, but I'm having problems storing them in the array.
FILE *teamfile;
teamfile = fopen(file, "r");
char line[MAXLENGTH+1];
int i = 0;
while( fgets(line, sizeof line, teamfile) != NULL )
{
trim_line(line);
strcpy(&team[i][NAME],line);
i++;
}
fclose(teamfile);
Which is called from the main function as teams = teamlist(argv[1], team);
But when I try to refer to the array from elsewhere in my program, e.g. printf(&team[0][0]), it outputs what seems to be all the names in one block...
What am I doing wrong?
edit:
static void trim_line(char line[])
{
int i = 0;
// LOOP UNTIL WE REACH THE END OF line
while(line[i] != '\0')
{
// CHECK FOR CARRIAGE-RETURN OR NEWLINE
if( line[i] == '\r' || line[i] == '\n' )
{
line[i] = '\0'; // overwrite with nul-byte
break; // leave the loop early
}
i = i+1; // iterate through character array
}
}
thanks for the help so far! :D
If team is declared as char team[NUM_OF_TEAMS][LENGTH_OF_NAME],
then the copy should always be strcpy(team[i], line); so that each name starts at the beginning of its own row.
Hint: in C it is a char array, not a "string object".
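A minimal sketch of that row-per-name layout, assuming team really is a plain two-dimensional char array. The function name read_team_names and the two sizes are hypothetical, chosen only for illustration:

```c
#include <stdio.h>
#include <string.h>

#define NUM_OF_TEAMS 32    /* hypothetical capacity */
#define LENGTH_OF_NAME 64  /* hypothetical row width, including the '\0' */

/* Reads one name per line from fp into team[0], team[1], ...
 * Returns the number of names stored. */
int read_team_names(FILE *fp, char team[][LENGTH_OF_NAME], int max_teams)
{
    char line[LENGTH_OF_NAME];
    int count = 0;

    while (count < max_teams && fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';  /* strip the line ending */
        strcpy(team[count], line);           /* team[count] decays to char * */
        count++;
    }
    return count;
}
```

Because line is no wider than a row, strcpy cannot overflow team[count]. A name longer than LENGTH_OF_NAME - 1 characters would be split across two rows, so a real program may want to detect that case.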
Related
So i have a series of comments in my markup file:
# comment1
# comment2
I want to read these into an array to be added to a comment array in my struct. I do not know the amount of comment lines in advance
I declare the comment array in my struct as follows:
char *comments; //comment array
Then I am starting to read the comments in but what i've got wasn't working:
int c;
//check for comments
c = getc(fd);
while(c == '#') {
while(getc(fd) != '\n') ;
c = getc(fd);
}
ungetc(c, fd);
//end comments?
Am I even close?
Thanks
First,
char *comments; //comment array
is a single comment, not an array of comments.
You need to use realloc to create an array of strings:
char **comments = NULL;
int count = 10; // initial number of slots
comments = realloc(comments, count * sizeof *comments);
When you have more than count comments:
count *= 2;
comments = realloc(comments, count * sizeof *comments); // classic doubling strategy
To put a string into the array (assuming comment is a char * holding one comment):
comments[i] = strdup(comment);
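The doubling strategy can be sketched as a small helper, under a few assumptions: the struct comment_list type and both function names are hypothetical, and the string copy uses malloc plus memcpy as a portable stand-in for strdup, which is POSIX rather than ISO C before C23.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for a growable list of comment strings. */
struct comment_list {
    char **items;     /* array of pointers to heap-allocated strings */
    size_t count;     /* number of strings stored */
    size_t capacity;  /* number of pointer slots allocated */
};

/* Appends a copy of text. Returns 0 on success, -1 if allocation fails. */
int comment_list_add(struct comment_list *list, const char *text)
{
    if (list->count == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 10;
        /* realloc takes a size in bytes, so multiply by the element size. */
        char **grown = realloc(list->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;               /* the old array is still valid */
        list->items = grown;
        list->capacity = new_cap;
    }

    size_t len = strlen(text) + 1;
    char *copy = malloc(len);        /* stand-in for strdup() */
    if (copy == NULL)
        return -1;
    memcpy(copy, text, len);
    list->items[list->count++] = copy;
    return 0;
}

/* Frees every string and the pointer array itself. */
void comment_list_free(struct comment_list *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = list->capacity = 0;
}
```

Assigning the result of realloc to a temporary first means the original array is not leaked if the call fails. Start from a zero-initialised struct (struct comment_list list = {0};) and call comment_list_free when done.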
You can use fgets() from <stdio.h> to read one line at a time.
int num_comments = 0;
char comment_tmp[82];
char comment_arr[150][82];
/* keep reading while there is room, a line was read, and it is a comment */
while(num_comments < 150 &&
      fgets(comment_tmp, sizeof comment_tmp, file_pointer) != NULL &&
      comment_tmp[0] == '#'){
    strcpy(comment_arr[num_comments], comment_tmp);
    num_comments++;
}
This has the limitation of only being able to store 150 comments. This can be overcome by 1) setting a higher number there, 2) using dynamic memory allocation (think malloc/free), or 3) organizing your comments into a more flexible data structure like a linked list.
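When you see that a line is a comment, store its text in the comment variable, then go to the next line and repeat. So the code:
int c = getc(fd);
while(c == '#') {
    /* copy the rest of the line; comment must point to a large enough buffer */
    while((c = getc(fd)) != '\n' && c != EOF) {
        *comment = c;
        ++comment;
    }
    *comment = '\0'; /* terminate the string */
    c = getc(fd);    /* first character of the next line */
}
ungetc(c, fd); /* put back the character that was not a '#' */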
or use fscanf, which is easier (this assumes comment has room for 256 chars; plain %s would stop at the first space):
fscanf(fd, "#%255[^\n]\n", comment); /* fd is the file */
Note that comment here is a single string, not an array of strings.
For an array of strings it would be:
#define NUM_COMMENTS 100
#define COMMENT_LEN 256
char comment[NUM_COMMENTS][COMMENT_LEN];
int i = 0;
/* stop when the array is full or the next line is not a comment */
while(i < NUM_COMMENTS && fscanf(fd, "#%255[^\n]\n", comment[i]) == 1) {
    ++i;
}
I have the following text file:
13.69 (s, 1H), 11.09 (s, 1H).
So far I can quite happily use either fgets or fgetc to pass all text to a buffer as follows:
char* data;
data = malloc(sizeof(char) * 100);
int c;
int n = 0;
FILE* inptr = NULL;
inptr = fopen("NMR", "r");
if(NULL == fopen("NMR", "r"))
{
printf("Error: could not open file\n");
return 1;
}
for (c = fgetc(inptr); c != EOF && c != '\n'; c = fgetc(inptr))
{
data[n++] = c;
}
for (int i = 0, n = 100; i < n; i++)
{
printf ("%c", data[i]);
}
printf("\n");
and then print the buffer to the screen afterwards. However, I am only looking to pass part of the textfile to the buffer, namely:
13.69 (s, 1H),
So this means I want fgetc to stop after ','. However, this means that the text will stop at 13.69 (s, and not 13.69 (s, 1H),
Is there a way around this? I have also experimented with fgets and then using strstr as follows:
char needle[4] = ")";
char* ret;
ret = strstr(data, needle);
printf("The substring is: %s\n", ret);
However, the output from this is:
), 11.09 (s, 1H)
thus giving me the rest of the string which I do not want. It's an interesting one and if anyone has any tips it would be much appreciated!
If you know that the closing parenthesis is the last character you want, you can use that as your stopping point in the fgetc() loop:
char data[100]; //No need to dynamically allocate if we know the size at compile time
int c;
int n = 0;
FILE* inptr = NULL;
inptr = fopen("NMR", "r");
if(inptr == NULL) //We want to check the value of the file we just opened
{ //and plan to use
printf("Error: could not open file\n");
return 1;
}
//We'll keep the original value guards (EOF and '\n') below and add two more
//to make sure we break from the loop
//We use n<98 below to make sure we can always create a null-terminated string,
//If we used 99, the 100th character might be a ')', then we have no room for a
//terminating null-char
for (c = fgetc(inptr); c != ')' && n < 98 && c != EOF && c != '\n'; c = fgetc(inptr))
{
data[n++] = c;
}
if(c != ')') //We hit EOF, \n, or ran out of space in data[]
{
printf("Error: no matching sequence found\n");
return 2;
}
data[n]=')'; //Could also write data[n]=c here, since we know it's a ')'
data[n+1]='\0'; //Add the terminating null character
printf("%s\n",data); //Since it's a properly formatted string, we can use %s
(Note that this example handles null input characters differently from yours. If you expect null characters in the input stream (the NMR file), change the printf("%s",...) line back to the for loop you originally had.)
Well, with only one example of the format you are trying to parse, it's not possible to give a definitive answer. However, if your input always looks like this, I would simply keep a counter and break after the second comma.
int comma = 0;
for (c = fgetc(inptr); c != EOF && c != '\n' && n < 99; c = fgetc(inptr))
{
    data[n++] = c;
    if (c == ',' && ++comma == 2) // stop right after the second comma
        break;
}
data[n] = '\0'; // terminate the string
In case the characters inside the parentheses can be more complex, I would simply maintain a boolean state to know whether I am inside or outside a pair of parentheses, and break when I read a comma outside of it.
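A minimal sketch of that idea, with two assumptions: the function name read_until_outer_comma is hypothetical, and a depth counter stands in for the boolean so that nested parentheses are also handled.

```c
#include <stdio.h>

/* Copies characters from fp into out until a comma that is not enclosed in
 * parentheses (the comma is kept), a newline, EOF, or a full buffer.
 * Returns the number of characters stored, excluding the '\0'. */
size_t read_until_outer_comma(FILE *fp, char *out, size_t out_size)
{
    size_t n = 0;
    int depth = 0;  /* greater than 0 while inside parentheses */
    int c;

    if (out_size == 0)
        return 0;

    while (n < out_size - 1 && (c = fgetc(fp)) != EOF && c != '\n') {
        out[n++] = (char)c;
        if (c == '(')
            depth++;
        else if (c == ')' && depth > 0)
            depth--;
        else if (c == ',' && depth == 0)
            break;  /* comma outside parentheses: end of this field */
    }
    out[n] = '\0';
    return n;
}
```

Called on the sample line 13.69 (s, 1H), 11.09 (s, 1H). the first call stores 13.69 (s, 1H), because the comma after s is seen while depth is 1 and is therefore ignored as a separator.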
Simply read using fgets and store the desired string in a char * using sscanf:
char *new_data;
new_data=malloc(100); // allocate memory
...
fgets(data,100,inptr); // read from file, but check its return value
sscanf(data,"%98[^)]",new_data); // store the string up to ')' from data in new_data, leaving room for ")" and '\0'
strcat(new_data,")"); // concatenating new_data and ")"
printf("%s",new_data); // print new_data
...
free(new_data); // remember to free memory
You should also check the return value of malloc (not done in my example) and close the file you opened.
I am given a file of DNA sequences and asked to compare all of the sequences with each other and delete the sequences that are not unique. The file I am working with is in FASTA format, so the odd lines are the headers and the even lines are the sequences that I want to compare. So I am trying to store the even lines in one array and the odd lines in another. I am very new to C so I'm not sure where to begin. I figured out how to store the whole file in one array like this:
int main(){
    int total_seq = 50;
    int i = 0; // index of the next free row in line
    char seq[100];
    char line[total_seq][100];
    FILE *dna_file;
    dna_file = fopen("inabc.fasta", "r");
    if (dna_file==NULL){
        printf("Error");
        return 1;
    }
    while(i < total_seq && fgets(seq, sizeof seq, dna_file)){
        strcpy(line[i], seq);
        printf("%s", seq);
        i++;
    }
    fclose(dna_file);
    return 0;
}
I was thinking I would have to incorporate some sort of code that looked like this:
for (i = 0; i < rows; i++){
if (i % 2 == 0) header[i/2] = getline();
else seq[i/2] = getline();
but I'm not sure how to implement it.
Any help would be greatly appreciated!
To separate the even lines of a file from the odd lines,
read each char and swap outputs whenever '\n' is encountered. The function below writes to two output files rather than to arrays:
void Split(FILE *even, FILE* odd, FILE *source) {
int evenflag = 1;
int ch;
while ((ch = fgetc(source)) != EOF) {
if (evenflag) {
fputc(ch, even);
} else {
fputc(ch, odd);
}
if (ch == '\n') {
evenflag = !evenflag;
}
}
}
It is not clear if this post also requires code to do the unique filtering step.
Could you please give me an example of the data in the file?
Am I right in thinking it'd be something like:
Header
Sequence
Header
Sequence
And so on
Perhaps you could do something like this:
int main(){
int total_seq = 50;
char seq[100];
char line[total_seq][100];
FILE *dna_file;
dna_file = fopen("inabc.fasta", "r");
if (dna_file==NULL){
printf("Error");
}
// Put this in an else statement
int counter = 1;
while(fgets(seq, sizeof seq, dna_file)){
// If counter is odd
// Place next line read in headers array
// If counter is even
// Place next line read in sequence array
// Increment counter
}
// Now you have all the sequences & headers. Remove any duplicates
// Foreach number of elements in 'sequence' array - referenced by, e.g. 'j' where 'j' starts at 0
// Foreach number of elements in 'sequence' array - referenced by 'k' - Where 'k' Starts at 'j + 1'
// IF (sequence[j] != '~') So if its not our chosen escape character
// IF (sequence[j] == sequence[k]) (I think you'd have to use strcmp for this?)
// SET sequence[k] = '~';
// SET header[k] = '~';
// END IF
// END IF
// END FOR
// END FOR
// You'd then need an algorithm to run through the arrays. If a '~' is found, move the following non-tilde sequence down to its position, and so on.
// EDIT: In fact, it would probably be easier, when writing back to the file, to just skip any entry where sequence[x] == '~' (where 'x' iterates through all).
// Finally write back to file
fclose(dna_file);
return 0;
}
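The outline above can be sketched in C as follows. This is only an illustration under several assumptions: every record is exactly one header line followed by one sequence line, each line is shorter than MAX_LINE, "not unique" means that later copies of a sequence are dropped while the first is kept, and the name read_unique_records is hypothetical. Instead of marking duplicates with '~', it simply never stores them.

```c
#include <stdio.h>
#include <string.h>

#define MAX_RECORDS 50   /* hypothetical limits, matching the question's sizes */
#define MAX_LINE    100

/* Reads header/sequence line pairs from fp, keeping only the first record
 * for each distinct sequence. Returns the number of records kept. */
int read_unique_records(FILE *fp,
                        char header[][MAX_LINE],
                        char sequence[][MAX_LINE],
                        int max_records)
{
    char head[MAX_LINE], seq[MAX_LINE];
    int count = 0;

    while (count < max_records &&
           fgets(head, sizeof head, fp) != NULL &&
           fgets(seq, sizeof seq, fp) != NULL) {
        head[strcspn(head, "\r\n")] = '\0';
        seq[strcspn(seq, "\r\n")] = '\0';

        int duplicate = 0;
        for (int j = 0; j < count; j++) {
            /* strcmp compares contents; == would compare addresses */
            if (strcmp(sequence[j], seq) == 0) {
                duplicate = 1;
                break;
            }
        }
        if (!duplicate) {
            strcpy(header[count], head);
            strcpy(sequence[count], seq);
            count++;
        }
    }
    return count;
}
```

Writing the result back is then a loop over the count kept records, printing header[i] and sequence[i] on consecutive lines. The comparison is quadratic in the number of records, which is fine for 50 sequences but would need a hash table or sorting for large files.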
First: write a function that counts the number of newline (\n) characters in the file.
Then write a function that searches for the n-th newline
Last, write a function to go through and read from one '\n' to the next.
Alternately, you could just go online and read about string parsing.
I'm working on a program for school right now in c and I'm having trouble reading text from a file. I've only ever worked in Java before so I'm not completely familiar with c yet and this has got me thoroughly stumped even though I'm sure it's pretty simple.
Here's an example of how the text can be formatted in the file we have to read:
boo22$Book5555bOoKiNg#bOo#TeX123tEXT(JOHN)
I have to take in each word and store it in a data structure, and a word is only alpha characters, so no numbers or special characters. I already have the data structure working properly so I just need to get each word into a char array and then add it to my structure. It has to keep reading each char until it gets to a non-alpha char value. I've tried looking into the different ways to scan in from a file and I'm not sure what would be best for my scenario.
Here's the code I have right now for my input:
char str[MAX_WORD_SIZE];
char c;
int index = 0;
while (fscanf(dictionaryInputFile, "%c", c) != EOF) //while not at end of file
{
if (isalpha(c)) //if current character is a letter
{
tolower(c); //ignores case in word
str[index] = c; //add char to string
index++;
}
else if (str[0] != '\0') //If a word
{
str[index] = '\0'; //Make sure no left over characters in String
dictionaryRoot = insertNode(str, dictionaryRoot); //insert word to dictionary
index = 0; //reset index
str[index] = '\0'; //Set first character to null since word has been added
}
}
My thinking was that if it doesn't hit that first if statement then I have to check if str is a word or not, that's why it checks if the 0 index of str is null or not. I'm guessing the else if statement I have is not right though, but I can't figure out a way to end the current word I'm building and then reset str to null when it's added to my data structure. Right now when I run this I get a segmentation fault if I pass the txt file as an argument.
I'd just like to know if I'm on the right track and if not maybe some help on how I should be reading this data.
This is my first time posting here so I hope I included everything you'll need to help me, if not just let me know and I'd be happy to add more information.
Biggest problem: Incorrect use of fscanf(). @BLUEPIXY
// while (fscanf(dictionaryInputFile, "%c", c) != EOF)
while (fscanf(dictionaryInputFile, "%c", &c) != EOF)
No protection against overflow.
// str[index] = c; //add char to string
if (index >= MAX_WORD_SIZE - 1) Handle_TooManySomehow();
Not sure why testing against '\0' when '\0' is also a non-alpha.
Pedantically, isalpha() is problematic when a signed char with a negative value is passed. Better to pass the unsigned char value, is...((unsigned char) c), when code knows it is not EOF. Alternatively, save the input using int ch = fgetc(stream) and use is...(ch).
Minor: Better to use size_t for array indexes than int, but be careful as size_t is unsigned. size_t is important should the array become large, unlike in this case.
Also, when EOF is received, any data in str is ignored, even if it contained a word. @BLUEPIXY
For the most part, OP is on the right track.
What follows is an untested sample approach to illustrate not overflowing the buffer.
Test for a full buffer, then read in a char if needed. If a non-alpha is found, add the word to the dictionary if a non-zero length word was accumulated.
char str[MAX_WORD_SIZE];
int ch;
size_t index = 0;
for (;;) {
if ((index >= sizeof str - 1) ||
((ch = fgetc(dictionaryInputFile)) == EOF) ||
(!isalpha(ch))) {
if (index > 0) {
str[index] = '\0';
dictionaryRoot = insertNode(str, dictionaryRoot);
index = 0;
}
if (ch == EOF) break;
}
else {
str[index++] = tolower(ch);
}
}
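The point above about passing an unsigned char value matters mostly when the characters come from a char array rather than from fgetc(). A minimal sketch, where the function name lowercase_letters is hypothetical:

```c
#include <ctype.h>
#include <stddef.h>

/* Lowercases the letters of s in place and returns how many letters it saw.
 * Each char is converted to unsigned char before it is passed to the
 * <ctype.h> functions, whose behaviour is undefined for negative values
 * other than EOF. */
size_t lowercase_letters(char *s)
{
    size_t letters = 0;

    for (; *s != '\0'; s++) {
        unsigned char uc = (unsigned char)*s;
        if (isalpha(uc)) {
            *s = (char)tolower(uc);  /* tolower returns the result; it does not modify its argument */
            letters++;
        }
    }
    return letters;
}
```

Note that the result of tolower has to be stored: calling tolower(c) on its own line, as in the question, leaves c unchanged.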
I have an input file I need to extract words from. The words can only contain letters and numbers so anything else will be treated as a delimiter. I tried fscanf,fgets+sscanf and strtok but nothing seems to work.
while(!feof(file))
{
fscanf(file,"%s",string);
printf("%s\n",string);
}
Above one clearly doesn't work because it doesn't use any delimiters so I replaced the line with this:
fscanf(file,"%[A-z]",string);
It reads the first word fine but the file pointer keeps rewinding so it reads the first word over and over.
So I used fgets to read the first line and use sscanf:
sscanf(line,"%[A-z]%n,word,len);
line+=len;
This one doesn't work either because whatever I try I can't move the pointer to the right place. I tried strtok but I can't find how to set the delimiters:
while(p != NULL) {
printf("%s\n", p);
p = strtok(NULL, " ");
This one obviously takes the blank character as a delimiter, but I have literally hundreds of delimiters.
Am I missing something here? Extracting words from a file seemed a simple concept at first, but nothing I try really works.
Consider building a minimal lexer. When in state word it would remain in it as long as it sees letters and numbers. It would switch to state delimiter when encountering something else. Then it could do an exact opposite in the state delimiter.
Here's an example of a simple state machine which might be helpful. For the sake of brevity it works only with digits. echo "2341,452(42 555" | ./main will print each number in a separate line. It's not a lexer but the idea of switching between states is quite similar.
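#include <stdio.h>
#include <string.h>
int main(void) {
    enum { WORD = 1, DELIM = 2, BUFLEN = 1024 };
    int state = DELIM, ptr = 0, c; // start outside a word
    char buffer[BUFLEN];
    const char *digits = "1234567890";
    while ((c = getchar()) != EOF) {
        if (c != '\0' && strchr(digits, c)) {
            if (WORD == state) {
                if (ptr < BUFLEN - 1) // drop digits that do not fit
                    buffer[ptr++] = c;
            } else {
                buffer[0] = c; // first digit of a new number
                ptr = 1;
            }
            state = WORD;
        } else {
            if (WORD == state) { // a delimiter ends the current number
                buffer[ptr] = '\0';
                printf("%s\n", buffer);
            }
            state = DELIM;
        }
    }
    if (WORD == state) { // input ended in the middle of a number
        buffer[ptr] = '\0';
        printf("%s\n", buffer);
    }
    return 0;
}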
If the number of states increases you can consider replacing if statements checking the current state with switch blocks. The performance can be increased by replacing getchar with reading a whole block of the input to a temporary buffer and iterating through it.
If you have to deal with a more complex input file format, you can use a lexical analyser generator such as flex. It can do the job of defining state transitions and other parts of lexer generation for you.
Several points:
First of all, do not use feof(file) as your loop condition; feof won't return true until after you attempt to read past the end of the file, so your loop will execute once too often.
Second, you mentioned this:
fscanf(file,"%[A-z]",string);
It reads the first word fine but the file pointer keeps rewinding so it reads the first word over and over.
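That's not quite what's happening; if the next character in the stream doesn't match the format specifier, scanf returns without having read anything, and string is unmodified. The offending character stays in the stream, so every later call fails the same way and you keep printing the old contents of string. Note also that in ASCII the range A-z covers more than letters: it includes [, \, ], ^, _ and `.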
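The first point can be illustrated with a loop that tests the result of the read itself instead of feof(). This is a minimal sketch using the whitespace-delimited %s from the question; the function name count_tokens and the 256-byte buffer are assumptions made for the example.

```c
#include <stdio.h>

/* Counts whitespace-separated tokens in file.
 * fscanf returns the number of items it converted, so the loop ends as soon
 * as a read fails, with no extra iteration at end of file. */
size_t count_tokens(FILE *file)
{
    char word[256];
    size_t count = 0;

    while (fscanf(file, "%255s", word) == 1) {
        count++;  /* word holds a freshly read token here */
    }
    return count;
}
```

With while(!feof(file)) around the same fscanf call, the body would typically run one extra time after the last token, operating on stale data in word.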
Here's a simple, if inelegant, method: it reads one character at a time from the input file, checks to see if it's either an alpha or a digit, and if it is, adds it to a string.
#include <stdio.h>
#include <ctype.h>
int get_next_word(FILE *file, char *word, size_t wordSize)
{
    size_t i = 0;
    int c;
    /**
     * Skip over any non-alphanumeric characters
     */
    while ((c = fgetc(file)) != EOF && !isalnum(c))
        ; // empty loop
    /**
     * Store characters up to the next non-alphanumeric character,
     * EOF, or a full buffer
     */
    while (c != EOF && isalnum(c))
    {
        if (i == wordSize - 1)
        {
            ungetc(c, file); // no room left: keep this character for the next call
            break;
        }
        word[i++] = c;
        c = fgetc(file);
    }
    word[i] = 0;
    return i > 0; // true if a word was read, even when it ended at EOF
}
int main(void)
{
char word[SIZE]; // where SIZE is large enough to handle expected inputs
FILE *file;
...
while (get_next_word(file, word, sizeof word))
// do something with word
...
}
I would use:
FILE *file;
char string[200];
while(fscanf(file, "%*[^A-Za-z]"), fscanf(file, "%199[a-zA-Z]", string) > 0) {
/* do something with string... */
}
This skips over non-letters and then reads a string of up to 199 letters. The only oddness is that if you have any 'words' that are longer than 199 letters they'll be split up into multiple words, but you need the limit to avoid a buffer overflow...
What are your delimiters? The second argument to strtok should be a string containing your delimiters, and the first should be a pointer to your string the first time round then NULL afterwards:
char * p = strtok(line, ","); // assuming a , delimiter
while(p)
{
    printf("%s\n", p); // print the token before asking for the next one
    p = strtok(NULL, ",");
}
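Since the second argument is a set of delimiter characters, several delimiters can be handled in one pass. A minimal sketch, where the function name split_words and the particular delimiter set in the usage note are assumptions:

```c
#include <string.h>

/* Splits line in place on any character in delims and stores pointers to
 * the tokens in tokens[]. Returns the number of tokens found, at most
 * max_tokens. strtok() overwrites delimiters in line with '\0' and keeps
 * internal state, so it is not reentrant. */
size_t split_words(char *line, const char *delims,
                   char *tokens[], size_t max_tokens)
{
    size_t count = 0;

    for (char *p = strtok(line, delims);
         p != NULL && count < max_tokens;
         p = strtok(NULL, delims)) {
        tokens[count++] = p;
    }
    return count;
}
```

For example, splitting "boo22$Book,(JOHN) x" with the delimiter string " ,$()" yields the four tokens boo22, Book, JOHN and x. When everything that is not a letter or digit counts as a delimiter, listing them all becomes impractical, and a character-by-character scan with isalnum() is usually the simpler choice.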