How to read multiple words in a line in C?

I want the user to be able to type
start < read_from_old_file.c > write_to_new_file.c
//or whatever type of file
So once the user types the command "start" followed by a "<", this will indicate reading a file, whereas ">" will indicate writing to a new file.
Problem is, I know you can use
scanf("%s", buff);
But this would read one word and not go onto the next.

You can use:
scanf("%[^\n]%*c",buff);
If you wish to use scanf() for this.
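%*c is to get rid of the newline due to hitting enter. Note that %[^\n] has no field width here, so a long line can overflow buff; give it one, e.g. %99[^\n] for a 100-byte buffer.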

In general, using the scanf family of functions is a bad idea. Instead, you should use fgets or fread to read in data, then perform your own processing on it. fgets will read a line at a time, whereas fread is for when you don't care about lines and just want n bytes of data (good for, say, implementing cat).
Most programs will want to read a line at a time. You'll need to allocate your own buffer to pass to fgets.
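As a rough sketch of the fgets-then-parse approach for the command in the question: the struct, the function names and the 256-byte field size below are made up for illustration, and it assumes "<" and ">" are separated from the file names by spaces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FIELD_LEN 256  /* hypothetical limit for each name */

/* Hypothetical container for the parsed pieces of one command line. */
struct command {
    char name[FIELD_LEN];    /* e.g. "start" */
    char input[FIELD_LEN];   /* file after "<", or "" if absent */
    char output[FIELD_LEN];  /* file after ">", or "" if absent */
};

/* Copy src into a fixed-size field, truncating if necessary. */
static void copy_field(char *dst, const char *src)
{
    strncpy(dst, src, FIELD_LEN - 1);
    dst[FIELD_LEN - 1] = '\0';
}

/* Split a line such as "start < in.c > out.c" into its parts.
 * The line is modified in place by strtok(). Returns 0 on success,
 * -1 if the line is empty or a redirection has no file name. */
int parse_command(char *line, struct command *cmd)
{
    const char *delims = " \t\r\n";
    char *tok;

    assert(line != NULL && cmd != NULL);
    cmd->name[0] = cmd->input[0] = cmd->output[0] = '\0';

    tok = strtok(line, delims);
    if (tok == NULL)
        return -1;
    copy_field(cmd->name, tok);

    while ((tok = strtok(NULL, delims)) != NULL) {
        if (strcmp(tok, "<") == 0 || strcmp(tok, ">") == 0) {
            char *dest = (tok[0] == '<') ? cmd->input : cmd->output;
            char *file = strtok(NULL, delims);
            if (file == NULL)
                return -1;
            copy_field(dest, file);
        }
        /* any other token is ignored in this sketch */
    }
    return 0;
}

/* Read one whole line with fgets() and parse it.
 * Returns 0 on success, -1 on end of input or a parse error. */
int read_command(FILE *in, struct command *cmd)
{
    char line[1024];

    if (fgets(line, sizeof line, in) == NULL)
        return -1;
    return parse_command(line, cmd);
}
```

Calling read_command(stdin, &cmd) would then fill cmd.input and cmd.output, which can be handed to fopen().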

Related

How to move back to the starting of previous line after reading a line using fgets()?

For example if the file contains:
12345
-3445654
1245646
I want to read the first line into a string using fgets(). Then I want to read the second line to check if there is a '-' in the first spot. If there is one, I will read the second line and strcat it to the first line.
Then I want to read the third line using fgets() again. This time, when there is no '-', I just want to make the file go back to the beginning of the third line so that the next time I call fgets() it will read the same third line again.
Is there a way I can do this?
Use fgetc to read the first character on the next line, and if it's not a '-' use ungetc to put it back.
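A minimal sketch of that fgetc/ungetc peek, assuming the joined line should not keep the newline between its parts. The function names next_char_is and read_joined are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if the next unread character of fp is `want`, 0 otherwise
 * (including at end of file). The character is pushed back with
 * ungetc(), so the next read sees it again. */
int next_char_is(FILE *fp, int want)
{
    int c;

    assert(fp != NULL);
    c = fgetc(fp);
    if (c == EOF)
        return 0;
    ungetc(c, fp);
    return c == want;
}

/* Read one line into out, then append every following line that
 * starts with '-'. Returns 1 if something was read, 0 at end of file. */
int read_joined(FILE *fp, char *out, size_t outsize)
{
    assert(out != NULL && outsize > 1);
    if (fgets(out, (int)outsize, fp) == NULL)
        return 0;

    while (next_char_is(fp, '-')) {
        size_t len = strlen(out);

        if (len > 0 && out[len - 1] == '\n')
            out[--len] = '\0';        /* drop the newline before joining */
        if (outsize - len < 2)
            break;                    /* no room left for more text */
        if (fgets(out + len, (int)(outsize - len), fp) == NULL)
            break;
    }
    return 1;
}
```

Only one character of pushback is guaranteed by the standard, which is all this needs.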
Generally you would just keep the part of the file just read, in the memory until you are sure you don't need it anymore.
Or you could read the entire file into a buffer and then jump around it using pointer as much as you like.
Or if you really must, you can move the current stream position with fseek, and then re-read the parts you need.
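For the fseek route, something along these lines might work. peek_line is a name invented here, and this only makes sense on a seekable stream such as a regular file, not a pipe or a terminal.

```c
#include <assert.h>
#include <stdio.h>

/* Read the next line into buf without consuming it: remember the
 * position with ftell(), read the line, then fseek() back.
 * Returns 1 if a line was read, 0 at end of file or on error. */
int peek_line(FILE *fp, char *buf, int size)
{
    long pos;

    assert(fp != NULL && buf != NULL && size > 0);
    pos = ftell(fp);
    if (pos == -1L)
        return 0;                 /* stream is not seekable */
    if (fgets(buf, size, fp) == NULL)
        return 0;
    if (fseek(fp, pos, SEEK_SET) != 0)
        return 0;
    return 1;
}
```

After peek_line() the next fgets() on the same stream returns the same line again.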
Whenever I want to
1. read a line
2. read a second line
3. maybe do something involving both the first line and the second line
what I usually do is declare a second line-holding variable
char prevline[whateversize];
and then, somewhere between step 1 and step 2
strcpy(prevline, line);
(Naturally you have to be sure that the line and prevline variables are consistently allocated, e.g. as arrays of the same size, so that overflow isn't a problem.)
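To make the prevline idea concrete, here is a small sketch for a made-up task: counting lines that are identical to the line just before them. The function name and the 256-byte size are assumptions for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LINE_LEN 256  /* hypothetical maximum line length */

/* Count how many lines are identical to the line immediately
 * before them, keeping the previous line in a second buffer. */
int count_repeated_lines(FILE *fp)
{
    char line[LINE_LEN];
    char prevline[LINE_LEN] = "";  /* same size as line, so strcpy fits */
    int have_prev = 0;
    int repeats = 0;

    assert(fp != NULL);
    while (fgets(line, sizeof line, fp) != NULL) {
        if (have_prev && strcmp(prevline, line) == 0)
            repeats++;
        strcpy(prevline, line);    /* current line becomes the previous one */
        have_prev = 1;
    }
    return repeats;
}
```

The same shape works for the '-' case: look at line, decide what to do with prevline, then copy line over it.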

Reading parts of a file after a specific tag is found in C using fgets

I would like some suggestion on how to read an 'XML' like file in such a way that the program would only read/store elements observed in a node that meets some requirements. I was thinking about using two fgets in the following way:
while (fgets(file_buffer,line_buffer,fp) != NULL)
{
if ((p_str = strstr(file_buffer, "<element of interest opening")) != NULL)
{
//new fgets that starts at fp and runs only until the end of the node
{
//read and process
}
}
}
Does this make sense or are there smarter ways of doing this?
Secondly (in my idea), will I have to define a new FILE* (like fr) and set fr to fp at the start of the second fgets, or can I somehow abuse the original file pointer for that?
Use an XML parser such as libxml2: http://xmlsoft.org/xml.html
Your approach isn't bad for the job.
You could read the whole line from the file, then process it using sscanf, strstr or whatever functions you like. This will save you time and unnecessary file I/O overhead.
As for your second idea, you can use fseek() (see man fseek) or rewind() (see man rewind) on the same file pointer fp. You do not need an extra file pointer.
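Here is one way it could look with a single FILE* and no seeking. Instead of a nested fgets loop, this sketch swaps in a simple inside/outside flag. It assumes the opening and closing tags sit on their own lines and that lines fit in 1024 bytes; copy_node_lines is a made-up name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy to `out` every line found between a line containing open_tag
 * and the next line containing close_tag. The tag lines themselves
 * are not copied. Returns the number of lines copied. */
int copy_node_lines(FILE *fp, const char *open_tag,
                    const char *close_tag, FILE *out)
{
    char line[1024];
    int inside = 0;   /* 1 while we are between the two tags */
    int copied = 0;

    assert(fp != NULL && out != NULL);
    assert(open_tag != NULL && close_tag != NULL);

    while (fgets(line, sizeof line, fp) != NULL) {
        if (!inside) {
            if (strstr(line, open_tag) != NULL)
                inside = 1;
        } else if (strstr(line, close_tag) != NULL) {
            inside = 0;
        } else {
            fputs(line, out);
            copied++;
        }
    }
    return copied;
}
```

Replacing the fputs() call with your own processing gives the "read and process" part of the question.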
EDIT:
If you could change the tag format to adhere to XML structure, you would be able to use libxml2 and similar libraries properly.
If that's not possible, then you have to write your own parser.
A few pointers:
1. First extract data from the file into a buffer. The size of the buffer, and whether it is dynamically or statically allocated, will depend on your specs.
2. Check whether the first non-whitespace character in the buffer is <, or whatever character your tags usually begin with. If not, you can just show an error and exit.
3. Now follows the tag name, until the first whitespace, or the / or the > character. Store it. Process the =, strings and so on as you wish.
4. If the next non-whitespace character is /, check that it is followed by > (or a similar pattern in your specs that marks the end of a tag). If so, you've finished parsing and can return your result. Otherwise, you've got a malformed tag and should exit with an error.
5. If the character is >, then you've found the end of the begin tag. Now follows the content.
6. Otherwise what follows is an argument. Parse that, store the result, continue at step 4.
7. Read the content until you find a < character.
8. If that character is followed by /, it's the end tag. Check that it is followed by the tag name and >. If yes, return the result; else, throw an error.
9. If you get here, you've found the beginning of a nested XML element. Parse that with this algorithm and then continue at 4 again.
Although it's quite a basic idea, I hope it helps you get started.
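The steps above can be sketched roughly as follows. This is a simplification rather than a real XML parser: it works on a string already loaded into memory, it skips arguments instead of parsing them (so a / inside an argument value would confuse it), it only counts elements, and after a nested element it goes back to reading content. parse_element and TAG_MAX are made-up names.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define TAG_MAX 64  /* hypothetical limit on tag name length */

static void skip_space(const char **p)
{
    while (isspace((unsigned char)**p))
        (*p)++;
}

/* Parse one element starting at *p. On success, advance *p past the
 * element, add one to *count for every element seen, and return 0.
 * Return -1 if the input is malformed. */
int parse_element(const char **p, int *count)
{
    char name[TAG_MAX];
    size_t n = 0;

    assert(p != NULL && *p != NULL && count != NULL);
    skip_space(p);
    if (**p != '<')                        /* step 2 */
        return -1;
    (*p)++;

    /* step 3: tag name up to whitespace, '/' or '>' */
    while (**p && !isspace((unsigned char)**p) && **p != '/' && **p != '>') {
        if (n + 1 >= TAG_MAX)
            return -1;
        name[n++] = *(*p)++;
    }
    name[n] = '\0';
    if (n == 0)
        return -1;

    /* step 6, simplified: skip arguments without parsing them */
    while (**p && **p != '/' && **p != '>')
        (*p)++;

    if (**p == '/') {                      /* step 4: "<name/>" */
        if ((*p)[1] != '>')
            return -1;
        *p += 2;
        (*count)++;
        return 0;
    }
    if (**p != '>')                        /* ran out of input */
        return -1;
    (*p)++;                                /* step 5 */

    for (;;) {
        while (**p && **p != '<')          /* step 7: content */
            (*p)++;
        if (**p == '\0')
            return -1;                     /* missing end tag */
        if ((*p)[1] == '/') {              /* step 8: end tag */
            if (strncmp(*p + 2, name, n) != 0 || (*p)[2 + n] != '>')
                return -1;
            *p += n + 3;
            (*count)++;
            return 0;
        }
        if (parse_element(p, count) != 0)  /* step 9: nested element */
            return -1;
    }
}
```

A real version would store names, arguments and content in some tree structure instead of just counting.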
EDIT:
If you still want to reference the file as a pointer consider using mmap().
If you add mmap with a bit of shared memory IPC and adequate memory locking stuff, you could write a parallel processing program, that will process most of your files faster.

How to read 1000 or more columns of data from a file using C/C++?

I have a data file with 10000 rows and 1000 columns. I want to store an entire line in an array, or each column in a variable.
C has the standard function fscanf, but with it I would need to write the format specifier 1000 times:
fscanf(pFile, "%f,%f,%f,%f,%f,%f,......", &a[0], &a[1],...,&a[999]);
Writing it out like this is impractical,
but I have no idea how else to implement it in C.
Any suggestions or solutions?
Also, how can I read or extract only some of the columns?
Read the file line by line using fgets() into a suitably large buffer. Don't be afraid to use a buffer of 32 KB or something, just to be very sure all the fields fit.
Then parse the line in a loop, perhaps using strtok() or just plain old strtod(). Note that the latter returns a pointer to the first character that was not considered a number; this is where your parsing will continue for the next number. Perhaps you need to add an inner loop to "eat" whitespace (or whatever separators you have).
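A minimal sketch of the strtod() loop, assuming the fields are separated by commas with optional spaces or tabs. parse_row is a made-up name, and empty fields (two commas in a row) are silently skipped here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse up to `max` numbers from one line into out[].
 * Returns how many numbers were stored. */
size_t parse_row(const char *line, double *out, size_t max)
{
    const char *p = line;
    size_t n = 0;

    assert(line != NULL && out != NULL);
    while (n < max) {
        char *end;
        double v = strtod(p, &end);

        if (end == p)          /* nothing numeric here: stop */
            break;
        out[n++] = v;
        p = end;               /* continue where strtod() stopped */

        /* inner loop that "eats" the separators */
        while (*p == ',' || *p == ' ' || *p == '\t')
            p++;
    }
    return n;
}
```

With a buffer filled by fgets(), parse_row(buf, row, 1000) fills row[], and extracting a single column is then just row[k] for whichever index you need.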
You could read the file line by line, and then extract the numbers in a loop.

Is there a way to know when fscanf reads a whitespace or a new line?

I want to know if there is a way to know when fscanf reads a whitespace or a new line.
Example:
formatting asking words italic
links returns
Since fscanf with %s reads a string until it meets a newline or other whitespace, it'll read formatting and the space after it and before a. The thing is, is there a way to know that it read a space? And once it moves on to the second line, is there a way to know that it read a carriage return?
You can instruct fscanf to read whitespace into your variable instead of reading and discarding it. Use a scanset conversion, something like %[ \n\r\t], but you need to include more characters in that set. Depending on the locale and some features of the runtime character set, you might want to write a separate function to compute the appropriate format string once before using it.
If you need to distinguish \n from other kinds of whitespace, you have your variable containing the whitespace that you just finished reading. You might want to count all of the \n characters in it, depending on your needs.
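Something like this could capture the whitespace after each word. The function names and the 64-byte buffers are assumptions for the example, and the scanset only lists space, tab, \r and \n.

```c
#include <assert.h>
#include <stdio.h>

#define TOK_LEN 64  /* buffer size; must match the 63 in the formats */

/* Read the next word into `word` and the run of whitespace that
 * follows it into `ws`. Returns 1 if a word was read, 0 otherwise. */
int read_word_and_space(FILE *fp, char *word, char *ws)
{
    assert(fp != NULL && word != NULL && ws != NULL);
    ws[0] = '\0';
    if (fscanf(fp, "%63s", word) != 1)
        return 0;
    /* A scanset fails when no character matches, e.g. at end of
     * file; in that case report an empty whitespace string. */
    if (fscanf(fp, "%63[ \t\r\n]", ws) != 1)
        ws[0] = '\0';
    return 1;
}

/* Count the '\n' characters in a captured whitespace string. */
int count_newlines(const char *ws)
{
    int n = 0;

    assert(ws != NULL);
    for (; *ws != '\0'; ws++)
        if (*ws == '\n')
            n++;
    return n;
}
```

If count_newlines(ws) is greater than zero after a call, the word just read was the last one on its line.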

good way to read text file in C

I need to read a text file which may contain long lines of text. I am thinking of the best way to do this. Considering efficiency, even though I am doing this in C++, I would still choose C library functions to do the IO.
Because I don't know how long a line is, potentially really really long, I don't want to allocate a large array and then use fgets to read a line. On the other hand, I do need to know where each line ends. One use case of such is to count the words/chars in each line. I could allocate a small array and use fgets to read, and then determine whether there is \r, \n, or \r\n appearing in the line to tell whether a full line has been read. But this involves a lot of strstr calls (for \r\n, or there are better ways? for example from the return value of fgets?). I could also do fgetc to read each individual char one at a time. But does this function have buffering?
Please suggest or compare these and other ways of doing this task.
The correct way to do I/O depends on what you're going to do with the data. If you're counting words, line-based input doesn't make much sense. A more natural approach is to use fgetc and deal with a character at a time and let stdio worry about the buffering. Only if you need the whole line in memory at the same time to process it should you actually allocate a buffer big enough to contain it all.
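A sketch of the fgetc approach for the per-line counting use case, with no line buffer at all. count_line is a made-up name, and a \r is simply skipped so that \r\n endings are tolerated.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Count the characters and words on the next line of fp, reading one
 * character at a time. The line terminator is not counted.
 * Returns 1 if a line (possibly empty) was read, 0 at end of file. */
int count_line(FILE *fp, long *chars, long *words)
{
    int c;
    int in_word = 0;
    int seen = 0;      /* did we read anything before EOF? */

    assert(fp != NULL && chars != NULL && words != NULL);
    *chars = 0;
    *words = 0;

    while ((c = fgetc(fp)) != EOF) {
        seen = 1;
        if (c == '\n')
            break;                 /* end of this line */
        if (c == '\r')
            continue;              /* part of a \r\n ending */
        (*chars)++;
        if (isspace((unsigned char)c)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;           /* first character of a new word */
            (*words)++;
        }
    }
    return seen;
}
```

fgetc reads from the buffer that stdio keeps for the stream, so this does not mean one system call per character.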
