C parsing a string divided - c

I hate to be that guy asking easy questions, but I am a bit rusty in my C and something is eluding me here. I am trying to read a file with the following sample text:
23# 1110.00:1000.00,120.00:1110.00,1190.00:900.00,-52.98,-53.21
I want to split the elements separated by the hash sign (#) and the commas into several strings; however, I am not getting any output at all in my console.
#include <stdio.h> /* required for file operations */
#include <conio.h> /* for clrscr */
#include <dos.h> /* for delay */
FILE *fr; /* declare the file pointer */
#include <stdio.h>
int main(void)
{
char output[200];
const char filename[] = "file.txt";
FILE *file = fopen(filename, "r");
if ( file )
{
char line [ BUFSIZ ];
while ( fgets(line, sizeof line, file) )
{ printf(" %s \n", line);
char * i[80],pt1[80], pt2[80], pt3[80], tp1[80], tp2[80];
if ( sscanf(line, "%s# %s,%s,%s,%s,%s",
&i, &pt1, &pt2, &pt3, &tp1, &tp2) == 6 )
{
snprintf(output, sizeof output,
"Leitura:=%s,PT1=%s,PT2=%s,PT3=%s,TP1=%s,TP2=%s,",
i, pt1, pt2, pt3, tp1, tp2);
puts(output);
}
}
}
else
{
perror(filename);
}
return 0;
}

The problem is that sscanf() does not treat # or , as delimiters for %s; only whitespace ends a %s conversion. With the sample line, the first "%s" reads "23#" (hash included), so the literal # in the format string then fails to match and sscanf() returns 1, not 6. An unbounded "%s" can also overrun its buffer when a field is longer than expected.
A solution is to use scan sets:
if (sscanf(line,
           "%79[^#]# %79[^,],%79[^,],%79[^,],%79[^,],%79s",
           i,
           pt1,
           pt2,
           pt3,
           tp1,
           tp2) == 6 )
Additionally, the declaration of i is not correct:
char * i[80],pt1[80], pt2[80], pt3[80], tp1[80], tp2[80];
as it makes i an array of 80 char*. Change to:
char i[80],pt1[80], pt2[80], pt3[80], tp1[80], tp2[80];
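To see the scan-set format in isolation, here is a minimal sketch that wraps it in a helper. The struct reading type and the parse_reading name are made up for illustration; only the format string comes from the answer above, with the last field also bounded to 79 characters.

```c
#include <stdio.h>

/* Hypothetical record type: one 80-byte buffer per field. */
struct reading {
    char id[80];
    char pt1[80], pt2[80], pt3[80];
    char tp1[80], tp2[80];
};

/* Parses a line shaped like "23# a,b,c,d,e".
   Returns the number of fields converted (6 when the line matches). */
int parse_reading(const char *line, struct reading *r)
{
    return sscanf(line,
                  "%79[^#]# %79[^,],%79[^,],%79[^,],%79[^,],%79s",
                  r->id, r->pt1, r->pt2, r->pt3, r->tp1, r->tp2);
}
```

Every conversion carries a width of 79, so a field can never write past its 80-byte buffer; a caller would compare the return value against 6 before using any of the fields.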

Related

Copy lines from file to char *array[]?

Hi, I need a little bit of help here. I have a file with 5 lines and I want to put these lines into an array of type char *lines[5]; but I can't figure out why the following isn't working.
#include <stdio.h>
#include <string.h>
int main(void) {
FILE *fp = fopen("name.txt", "r");
char *str;
char *list[5];
int i = 0;
while (fgets(str, 100, fp) != NULL) // read line of text
{
printf("%s", str);
strcpy(list[i], str);
i++;
}
}
As the commenters stated, you need to create an array (which is nothing more than a region of memory) of sufficient size to store your strings. In your code, str and the elements of list are uninitialized pointers, so fgets and strcpy write to memory you do not own. One approach is the following; note the comments:
#include <stdio.h>
#include <string.h>
int main(void) {
    FILE *fp = fopen("name.txt", "r");
    if (fp == NULL) { // stop early if the file cannot be opened
        perror("name.txt");
        return 1;
    }
    char list[5][100]; //make sure you allocate enough space for your message
    // for loop is more elegant than while loop in this case,
    // as you have an index which increases anyway.
    // also, you can make sure that files with more than 5 lines
    // do not break your program.
    for (int i = 0; i < 5; ++i)
    {
        if (fgets(list[i], sizeof list[i], fp) == NULL) {
            break;
        }
        //list[i] is already a string, you don't need an extra copy
        printf("%s", list[i]);
    }
    fclose(fp);
    return 0;
}
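If you want to reuse this, the same idea can be sketched as a helper that fills a fixed-size table and reports how many lines it stored. The names read_lines, MAX_LINES and MAX_LEN are assumptions for illustration, and the newline stripping is an addition that the answer above does not perform.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINES 5    /* assumed number of rows in the table */
#define MAX_LEN   100  /* assumed row size, including the terminator */

/* Reads up to MAX_LINES lines from fp into list and strips the
   trailing newline from each. Returns the number of lines stored. */
int read_lines(FILE *fp, char list[MAX_LINES][MAX_LEN])
{
    int count = 0;
    while (count < MAX_LINES && fgets(list[count], MAX_LEN, fp) != NULL) {
        list[count][strcspn(list[count], "\n")] = '\0';
        count++;
    }
    return count;
}
```

A line longer than MAX_LEN - 1 characters is split across consecutive rows, because fgets stops when the row is full.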

Read a file specified as an argument and return its lines

I have an exercise in which I have to read a file containing strings and return its content using one or more arrays. (The second part of the exercise asks for these lines to be reversed; I'm having problems with the input, which is what I need help with.)
So far, I have this:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define LENGTH 1024
int main(int argc, char *argv[]){
char* input[LENGTH];
if(argc==2){
FILE *fp = fopen(argv[1], "rt");
if(fp!=NULL){
int i=0;
while(fgets(input, sizeof(input), fp)!=NULL){
input[i] = (char*)malloc(sizeof(char) * (LENGTH));
fgets(input, sizeof(input), fp);
i++;
}
printf("%s", *input);
free(input);
}
else{
printf("File opening unsuccessful!");
}
}
else{
printf("Enter an argument.");
}
return 0;
}
I also have to check whether memory allocation has failed. In its current form, this program prints nothing when run from the command line.
EDIT: I think it's important to mention that I get a number of warnings:
passing argument 1 of 'fgets' from incompatible pointer type [-Wincompatible-pointer-types]|
attempt to free a non-heap object 'input' [-Wfree-nonheap-object]|
EDIT 2:
Example of input:
These
are
strings
... and the expected output:
esehT
era
sgnirts
In the exercise, it's specified that the maximum length of a line is 1024 characters.
You probably want something like this; the comments are in the code.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define LENGTH 1024
int main(int argc, char* argv[]) {
    if (argc == 2) {
        FILE* fp = fopen(argv[1], "rt");
        if (fp != NULL) {
            char** lines = NULL; // pointer to pointers to lines read
            int nboflines = 0; // total number of lines read
            char input[LENGTH]; // temporary input buffer
            while (fgets(input, sizeof(input), fp) != NULL) {
                char* newline = malloc(strlen(input) + 1); // allocate memory for line (+1 for null terminator)
                strcpy(newline, input); // copy line just read
                newline[strcspn(newline, "\n")] = 0; // remove \n if any
                nboflines++; // one more line
                lines = realloc(lines, nboflines * sizeof(char*)); // reallocate memory for one more line
                lines[nboflines - 1] = newline; // store the pointer to the line
            }
            fclose(fp);
            for (int i = 0; i < nboflines; i++) // print the lines we've read
            {
                printf("%s\n", lines[i]);
            }
        }
        else {
            printf("File opening unsuccessful!");
        }
    }
    else {
        printf("Enter an argument.");
    }
    return 0;
}
Explanation about removing the \n left by fgets: Removing trailing newline character from fgets() input
Disclaimers:
there is no error checking for the memory allocation functions
memory is not freed. This is left as an exercise.
the way realloc is used here is not very efficient.
you still need to write the code that reverses each line and displays it.
You probably should decompose this into different functions:
a function that reads the file and returns the pointer to the lines and the number of lines read,
a function that displays the lines read
a function that reverses one line (to be written)
a function that reverses all lines (to be written)
This is left as an exercise.
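As a hint for the first three disclaimers, here is one possible sketch of a line list that checks every allocation, grows by doubling instead of reallocating once per line, and can be freed in one call. The struct line_list type and both function names are hypothetical and not part of the code above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of heap-allocated lines. */
struct line_list {
    char **items;     /* array of pointers to the stored lines */
    size_t count;     /* number of lines stored */
    size_t capacity;  /* number of slots allocated in items */
};

/* Appends a copy of text. Returns 0 on success, or -1 if an
   allocation fails (no line is added in that case). */
int line_list_append(struct line_list *list, const char *text)
{
    if (list->count == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 8;
        char **tmp = realloc(list->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;  /* the old block is still valid and still owned */
        list->items = tmp;
        list->capacity = new_cap;
    }
    char *copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, text);
    list->items[list->count++] = copy;
    return 0;
}

/* Frees every stored line and the pointer array itself. */
void line_list_free(struct line_list *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = list->capacity = 0;
}
```

Assigning the result of realloc to a temporary pointer first means the original array is not lost when the call fails. A list is started with struct line_list list = {0}; and released with line_list_free(&list).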

How to print a substring in C

simple C question here!
So I am trying to parse through a string, let's say: 1234567W
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]) {
//pointer to open file
FILE *op;
//open file of first parameter and read it "r"
op = fopen("TestCases.txt", "r");
//make an array of 1000
char x[1000];
char y[1000];
//declare variable nums as integer
int nums;
//if file is not found then exit and give error
if (!op) {
perror("Failed to open file!\n");
exit(1);
}
else {
while (fgets(x, sizeof(x), op)) {
//pounter to get the first coordinate to W
char *p = strtok(x, "W");
//print the first 3 digits of the string
printf("%.4sd\n", p);
}
}
return 0;
My output so far shows: "123d" because of the "%.4sd" in the printf function.
I now need to get the next two numbers, "45". Is there a regex expression I can use that will allow me to get the next two digits of a string?
I am new to C, so I was thinking more like "%(ignore the first 4 characters)(print next 2 digits)(ignore the last two digits)"
Please let me know.
Thanks all.
printf("Next two: %.2s\n", p + 4); should work.
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
int main(int argc, char *argv[]) {
    //pointer to open file
    FILE *op;
    //open the input file for reading ("r")
    op = fopen("TestCases.txt", "r");
    //line buffer of 1000 chars
    char x[1000];
    //if file is not found then exit and give error
    if (!op) {
        perror("Failed to open file");
        exit(1);
    }
    else {
        while (fgets(x, sizeof(x), op)) {
            //pointer to the part of the line before 'W'
            char *p = strtok(x, "W");
            //skip lines with no token or fewer than 4 characters before 'W'
            if (p == NULL || strlen(p) < 4) continue;
            //print the first 4 characters, followed by a literal 'd'
            printf("%.4sd\n", p);
            printf("Next two: %.2s\n", p + 4);
        }
        fclose(op);
    }
    return 0;
}
Side note: I added a missing stdio.h include. Please turn on compiler warnings, since this error would've been caught by them.
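If you need the substring as its own string rather than just printed, a small helper along these lines could work. copy_substring is a made-up name, and the clamping rules are one possible choice: it never reads past the end of the source and never writes past the destination.

```c
#include <string.h>

/* Copies up to len characters of src, starting at offset, into dst.
   The result is always null-terminated when dstsize > 0.
   Returns the number of characters copied. */
size_t copy_substring(char *dst, size_t dstsize,
                      const char *src, size_t offset, size_t len)
{
    size_t srclen = strlen(src);
    if (dstsize == 0)
        return 0;
    if (offset > srclen)          /* start beyond the end: empty result */
        offset = srclen;
    if (len > srclen - offset)    /* do not read past the terminator */
        len = srclen - offset;
    if (len > dstsize - 1)        /* leave room for the terminator */
        len = dstsize - 1;
    memcpy(dst, src + offset, len);
    dst[len] = '\0';
    return len;
}
```

Unlike a bare p + 4, this stays inside the string when the line is shorter than expected. For printing only, printf("%.*s", n, p + k) takes the length n as an int argument at run time.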

Get the length of each line in file with C and write in output file

I am a biology student and I am trying to learn perl, python and C and also use the scripts in my work. So, I have a file as follows:
>sequence1
ATCGATCGATCG
>sequence2
AAAATTTT
>sequence3
CCCCGGGG
The output should look like this: the name of each sequence followed by the number of characters in its line, with the total number of sequences printed at the end of the file.
sequence1 12
sequence2 8
sequence3 8
Total number of sequences = 3
I could make the perl and python scripts work, this is the python script as an example:
#!/usr/bin/python
import sys
my_file = open(sys.argv[1]) #open the file
my_output = open(sys.argv[2], "w") #open output file
total_sequence_counts = 0
for line in my_file:
    if line.startswith(">"):
        sequence_name = line.rstrip('\n').replace(">","")
        total_sequence_counts += 1
        continue
    dna_length = len(line.rstrip('\n'))
    my_output.write(sequence_name + " " + str(dna_length) + '\n')
my_output.write("Total number of sequences = " + str(total_sequence_counts) + '\n')
Now, I want to write the same script in C, this is what I have achieved so far:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
input = FILE *fopen(const char *filename, "r");
output = FILE *fopen(const char *filename, "w");
double total_sequence_counts = 0;
char sequence_name[];
char line [4095]; // set a temporary line length
char buffer = (char *) malloc (sizeof(line) +1); // allocate some memory
while (fgets(line, sizeof(line), filename) != NULL) { // read until new line character is not found in line
buffer = realloc(*buffer, strlen(line) + strlen(buffer) + 1); // realloc buffer to adjust buffer size
if (buffer == NULL) { // print error message if memory allocation fails
printf("\n Memory error");
return 0;
}
if (line[0] == ">") {
sequence_name = strcpy(sequence_name, &line[1]);
total_sequence_counts += 1
}
else {
double length = strlen(line);
fprintf(output, "%s \t %ld", sequence_name, length);
}
fprintf(output, "%s \t %ld", "Total number of sequences = ", total_sequence_counts);
}
int fclose(FILE *input); // when you are done working with a file, you should close it using this function.
return 0;
int fclose(FILE *output);
return 0;
}
But this code is, of course, full of mistakes. My problem is that, despite studying a lot, I still can't properly understand and use memory allocation and pointers, so I know I have mistakes especially in that part. It would be great if you could comment on my code and show how it can be turned into a program that actually works. By the way, in my actual data the length of each line is not known in advance, so I need to use malloc and realloc for that purpose.
For a simple program like this, where you look at short lines one at a time, you shouldn't worry about dynamic memory allocation. It is probably good enough to use local buffers of a reasonable size.
Another thing is that C isn't particularly suited for quick-and-dirty string processing. For example, there isn't a strstrip function in the standard library. You usually end up implementing such behaviour yourself.
An example implementation looks like this:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define MAXLEN 80 /* Maximum line length, including null terminator */
int main(int argc, char *argv[])
{
    FILE *in;
    FILE *out;
    char line[MAXLEN]; /* Current line buffer */
    char ref[MAXLEN] = ""; /* Sequence reference buffer */
    int nseq = 0; /* Sequence counter */
    if (argc != 3) {
        fprintf(stderr, "Usage: %s infile outfile\n", argv[0]);
        exit(1);
    }
    in = fopen(argv[1], "r");
    if (in == NULL) {
        fprintf(stderr, "Couldn't open %s.\n", argv[1]);
        exit(1);
    }
    out = fopen(argv[2], "w");
    if (out == NULL) {
        fprintf(stderr, "Couldn't open %s for writing.\n", argv[2]);
        exit(1);
    }
    while (fgets(line, sizeof(line), in)) {
        int len = strlen(line);
        /* Strip whitespace from end */
        while (len > 0 && isspace((unsigned char)line[len - 1])) len--;
        line[len] = '\0';
        if (line[0] == '>') {
            /* First char is '>': copy from second char in line */
            strcpy(ref, line + 1);
        } else {
            /* Other lines are sequences */
            fprintf(out, "%s: %d\n", ref, len);
            nseq++;
        }
    }
    fprintf(out, "Total number of sequences. %d\n", nseq);
    fclose(in);
    fclose(out);
    return 0;
}
A lot of code is about enforcing arguments and opening and closing files. (You could cut out a lot of code if you used stdin and stdout with file redirections.)
The core is the big while loop. Things to note:
fgets returns NULL on error or when the end of file is reached.
The first lines determine the length of the line and then remove white-space from the end.
It is not enough to decrement length, at the end the stripped string must be terminated with the null character '\0'
When you check the first character in the line, you should check against a char, not a string. In C, single and double quotes are not interchangeable. ">" is a string literal of two characters, '>' and the terminating '\0'.
When dealing with countable entities like chars in a string, use integer types, not floating-point numbers. (I've used (signed) int here, but because there can't be a negative number of chars in a line, it might have been better to have used an unsigned type.)
The notation line + 1 is equivalent to &line[1].
The code I've shown doesn't check that there is always one reference per sequence. I'll leave this as an exercise to the reader.
For a beginner, this can be quite a lot to keep track of. For small text-processing tasks like yours, Python and Perl are definitely better suited.
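Since the standard library has no strip function, the stripping step from the loop can be pulled out into a helper of its own. This is only a sketch, and the name rstrip is borrowed from Python rather than from any C library.

```c
#include <ctype.h>
#include <string.h>

/* Removes trailing whitespace from s in place and returns the new length. */
size_t rstrip(char *s)
{
    size_t len = strlen(s);
    /* The cast keeps isspace() well-defined for bytes above 127. */
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        len--;
    s[len] = '\0';
    return len;
}
```

Because it returns the stripped length, the sequence branch could print rstrip(line) directly instead of calling strlen a second time.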
Edit: The solution above won't work for long sequences; it is restricted to MAXLEN characters. But you don't need dynamic allocation if you only need the length, not the contents of the sequences.
Here's an updated version that doesn't read lines, but reads characters instead. In '>' context, it stores the reference. Otherwise it just keeps a count:
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h> /* for isspace() */
#define MAXLEN 80 /* Maximum reference length, including null terminator */
int main(int argc, char *argv[])
{
    FILE *in;
    FILE *out;
    int nseq = 0; /* Sequence counter */
    char ref[MAXLEN] = ""; /* Reference name */
    in = fopen(argv[1], "r");
    out = fopen(argv[2], "w");
    /* Snip: Argument and file checking as above */
    while (1) {
        int c = fgetc(in);
        if (c == EOF) break;
        if (c == '>') {
            /* Reference line: store the name, dropping what doesn't fit */
            int n = 0;
            c = fgetc(in);
            while (c != EOF && c != '\n') {
                if (n < (int)sizeof(ref) - 1) ref[n++] = c;
                c = fgetc(in);
            }
            ref[n] = '\0';
        } else {
            /* Sequence line: count characters without storing them */
            int len = 0;
            int n = 0;
            while (c != EOF && c != '\n') {
                n++;
                if (!isspace(c)) len = n;
                c = fgetc(in);
            }
            fprintf(out, "%s: %d\n", ref, len);
            nseq++;
        }
    }
    fprintf(out, "Total number of sequences. %d\n", nseq);
    fclose(in);
    fclose(out);
    return 0;
}
Notes:
fgetc reads a single byte from a file and returns this byte or EOF when the file has ended. In this implementation, that's the only reading function used.
Storing a reference string is implemented via fgetc here too. You could probably use fgets after skipping the initial angle bracket, too.
The counting just reads bytes without storing them. n is the total count, len is the count up to the last non-space. (Your lines probably consist only of ACGT without any trailing space, so you could skip the test for space and use n instead of len.)
Another approach uses POSIX getline(), which allocates and grows the line buffer itself:
#define _POSIX_C_SOURCE 200809L /* for getline() and strdup() */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[]){
    if (argc != 3) return 1;
    FILE *my_file = fopen(argv[1], "r");
    FILE *my_output = fopen(argv[2], "w");
    if (my_file == NULL || my_output == NULL) return 1;
    int total_sequence_counts = 0;
    char *sequence_name = NULL;
    char *line = NULL;
    size_t size = 0;
    while(-1 != getline(&line, &size, my_file)){
        line[strcspn(line, "\n")] = '\0'; /* drop the trailing newline */
        if(line[0] == '>'){
            free(sequence_name); /* release the previous name, if any */
            sequence_name = strdup(line + 1);
            total_sequence_counts += 1;
            continue;
        }
        fprintf(my_output, "%s %d\n", sequence_name ? sequence_name : "", (int)strlen(line));
    }
    fprintf(my_output, "Total number of sequences = %d\n", total_sequence_counts);
    fclose(my_file);
    fclose(my_output);
    free(sequence_name);
    free(line);
    return (0);
}

Issues with structs in C

I have an array in a struct. I'm reading from a file into a string. I use strtok to get the first few characters, and I want to pass the rest of the line into the struct, to eventually be passed into a thread. I'm getting the following error:
incompatible types when assigning to type char[1024] from type char *
The error refers to the line indicated below with the comments. It probably has to do with how I'm trying to copy character arrays, but I'm not sure of a better way.
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <linux/input.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
typedef struct
{
int period; //stores the total period of the thread
int priority; // stores the priority
char pline[1024]; // stores entire line of text to be sorted in function.
}PeriodicThreadContents;
int main(int argc, char* argv[])
{
//opening file, and testing for success
//file must be in test folder
FILE *fp;
fp = fopen("../test/Input.txt", "r");
if (fp == NULL)
{
fprintf(stderr, "Can't open input file in.list!\n");
exit(1);
}
char line[1024];
fgets(line, sizeof(line), fp);
//getting first line of text, containing
char *task_count_read = strtok(line," /n");
char *duration_read = strtok(NULL, " /n");
//converting char's to integers
int task_count = atoi(task_count_read);
int i = 0;
PeriodicThreadContents pcontents;
printf("started threads \n");
while (i < task_count)
{
fgets(line, sizeof (line), fp);
strtok(line," ");
if (line[0] == 'P')
{
char *period_read = strtok(NULL, " ");
pcontents.period = atoi(period_read);
printf("%d",pcontents.period);
printf("\n");
char *priority_read = strtok(NULL, " ");
pcontents.priority = atoi(priority_read);
printf("%d",pcontents.priority);
printf("\n");
printf("\n%s",line);
memcpy(&(pcontents.pline[0]),&line,1024);
printf("%s",pcontents.pline);
}
}
return 0;
}
C cannot handle strings as other languages do. C doesn't have string assignments or comparisons without using auxiliary functions.
In order to copy a string in a buffer you can use:
strcpy(pcontents.pline, line);
Or even (to guarantee that the copy never exceeds 1024 bytes and is always null-terminated):
memcpy(pcontents.pline, line, 1024);
pcontents.pline[1023] = '\0';
For other string operations check: http://www.gnu.org/software/libc/manual/html_node/String-and-Array-Utilities.html#String-and-Array-Utilities
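You need to copy the chars from the buffer into pcontents.pline (assuming pcontents is a PeriodicThreadContents). Note that strtok returns NULL when no token is left, so real code should check the result before passing it to strcpy.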
strcpy(pcontents.pline, strtok(NULL, " "));
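The error in the question appears because arrays cannot be assigned in C; the characters have to be copied. A bounded copy can be sketched with snprintf, which truncates and always terminates the destination. The helper name set_pline is hypothetical; the struct is the one from the question.

```c
#include <stdio.h>

typedef struct
{
    int period;        /* total period of the thread */
    int priority;      /* thread priority */
    char pline[1024];  /* line of text handed to the thread */
} PeriodicThreadContents;

/* Copies src into dst->pline, truncating if it does not fit.
   Returns 1 if the whole string was copied, 0 if it was truncated. */
int set_pline(PeriodicThreadContents *dst, const char *src)
{
    int n = snprintf(dst->pline, sizeof dst->pline, "%s", src);
    return n >= 0 && (size_t)n < sizeof dst->pline;
}
```

snprintf returns the length the full string would have needed, so comparing that value with the buffer size is enough to detect truncation.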
