I'm building an assembly compiler in C, and I need to print only the lines that contain code (alphanumeric characters).
However, my program doesn't recognize a string read by fgets() as empty, since it sometimes contains whitespace characters.
How do I write a condition that prints only lines containing alphanumeric characters?
My code looks like this:
while(fgets(Line,256,Inputfile)!=NULL)
{
i=0;
while(Line[i]!='\n')
{
Instruction[i]=Line[i];
i++;
}
printf("%s \n",Instruction);
}
Thanks,
You have to trim the result of fgets(). You can refer to this answer to see an example of how to trim an array of characters in C.
I hope this can help you.
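A minimal sketch of the trimming idea, assuming the goal is only to detect lines that are empty once surrounding whitespace is removed. The name trim_in_place is made up for illustration and is not taken from the linked answer:

```c
#include <ctype.h>
#include <string.h>

/* Removes leading and trailing whitespace in place.
   Returns a pointer into the same buffer (hypothetical helper). */
char *trim_in_place(char *s)
{
    char *end;

    /* skip leading whitespace */
    while (isspace((unsigned char)*s))
        s++;
    if (*s == '\0')
        return s;               /* the line held only whitespace */

    /* walk back from the last character over trailing whitespace */
    end = s + strlen(s) - 1;
    while (end > s && isspace((unsigned char)*end))
        end--;
    end[1] = '\0';
    return s;
}
```

After trimming, a line can be skipped when the first character of the result is '\0'.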
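Do I understand you right? You want to ignore lines that contain only whitespace?
while(fgets(Line,256,Inputfile)!=NULL)
{
    i=0;
    int flag = 0;
    /* stop at the newline, or at the terminator if the line has none */
    while(Line[i]!='\n' && Line[i]!='\0')
    {
        if(Line[i] != ' ' && Line[i] != '\t'){
            flag = 1;
        }
        Instruction[i]=Line[i];
        i++;
    }
    Instruction[i]='\0'; /* terminate the copy before printing it */
    if(flag == 1){
        printf("%s \n",Instruction);
    }
}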
Add a function isLineToIgnore() in which you check whether the line contains any alphanumeric characters or not.
#include <ctype.h>
int isLineToIgnore(char* line)
{
    for (char* cp = line; *cp != '\0'; ++cp )
    {
        // The cast keeps isalnum() well defined for bytes outside ASCII.
        if ( isalnum((unsigned char)*cp) )
        {
            // Don't ignore the line if it has at least one
            // alphanumeric character
            return 0;
        }
    }
    // The line has no alphanumeric characters.
    // Ignore it.
    return 1;
}
and then call the function.
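For illustration, the call could sit in the reading loop like this. This is a sketch under assumptions: print_code_lines and its FILE* parameters are invented names, and lines are assumed to fit in 255 characters (longer lines would be checked in pieces).

```c
#include <ctype.h>
#include <stdio.h>

/* Same check as above: 1 if the line has no alphanumeric character. */
static int isLineToIgnore(const char *line)
{
    for (const char *cp = line; *cp != '\0'; ++cp) {
        if (isalnum((unsigned char)*cp))
            return 0;
    }
    return 1;
}

/* Copies only the "code" lines from in to out and returns how many
   were kept. The name and parameters are illustrative. */
size_t print_code_lines(FILE *in, FILE *out)
{
    char line[256];
    size_t kept = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (isLineToIgnore(line))
            continue;           /* blank or whitespace-only line */
        fputs(line, out);       /* fgets kept the '\n', so none is added */
        kept++;
    }
    return kept;
}
```

Calling print_code_lines(Inputfile, stdout) would then replace the original loop.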
I have a text file which includes thousands of strings,
each separated from the next by a space " ".
How can I count how many strings there are?
You don't need the strtok() as you only need to count the number of space characters.
while (fgets(line, sizeof line, myfile) != NULL) {
for (size_t i = 0; line[i]; i++) {
if (line[i] == ' ') totalStrings++;
}
}
If you want to consider any whitespace character then you can use isspace() function.
You can read character by character as well without using an array:
int ch;
while ((ch=fgetc(myfile)) != EOF) {
if (ch == ' ') totalStrings++;
}
But I don't see why you want to avoid using an array as it would probably be more efficient (reading more chars at a time rather than reading one byte at a time).
The fgets() function will read an entire line from the file (you need to know the maximum possible size of that line). Then, you can use strtok() from <string.h> to parse the string and count the words.
Using fgetc(), you can count the spaces.
Note that spaces at the beginning of the file will be counted as well. That happens to compensate for the first string, which has no space before it. If the file does not start with a space, the count will be off by one because the first string is never counted.
To solve that, we first check the first character and increment the string counter if it is an alphabetic character.
int str_count = 0;
int c;
// first char
if( isalpha(c = fgetc(myfile)) )
str_count++;
else
ungetc(c, myfile);
Then, we loop through the rest of the contents.
Checking whether an alphabetic character follows a space verifies that another string starts after it. Otherwise a space at the end of a line would be counted too, giving an inaccurate result.
do
{
c = fgetc(myfile);
if( c == EOF )
break;
if(isspace(c)) {
if( isalpha(c = fgetc(myfile)) ) {
str_count++;
ungetc(c, myfile);
} else if(c == '\n') { // for multiple newlines
str_count++;
}
}
} while(1);
Tested on a Lorem Ipsum generator of 1500 words:
http://pastebin.com/w6EiSHbx
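As an alternative to looking ahead with ungetc(), a common way to count whitespace-separated strings is to track whether the reader is currently inside a word. The sketch below assumes that any run of non-whitespace characters counts as one string; count_words is an illustrative name, not code from the answers above.

```c
#include <ctype.h>
#include <stdio.h>

/* Counts runs of non-whitespace characters in the stream. */
size_t count_words(FILE *fp)
{
    size_t words = 0;
    int in_word = 0;    /* 1 while the previous character was part of a word */
    int ch;

    while ((ch = fgetc(fp)) != EOF) {
        if (isspace(ch)) {
            in_word = 0;            /* a separator ends the current word */
        } else if (!in_word) {
            in_word = 1;            /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

Leading spaces, trailing spaces, and repeated newlines need no special cases here, because only the transition from whitespace to non-whitespace increments the counter.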
I have a C program that reads from a txt file. The text file contains a list of words, one word per line. What I want to do is get the words from the text file and print them as one sentence/paragraph, however, when I try to print them, they still print one word per line.
I am storing the words in a 2D char array as I read them from the file. What I think is happening is that the array copies the newline char from the txt file. Is that correct? If so, how do I add the word to the array without the newline char?
while(fgets(line,20,lineRead)!=NULL)
{
for(j = 0; j < 20;j++)
{
message[k][j]= line[j];
}
printf("%s", message[k]);
}
I tried a few while loops with no success:
while(line[j] != ' ')
while(line[j] != NULL)
while(line[j] != EOF)
while(line[j] != ' \')
I'm learning C, so please be specific about my error. I want to understand what I'm doing wrong, not just get an answer.
Thank you!
You could simply change your for loop to be:
for(j = 0; j < 20 && line[j] != '\n';j++)
{
message[k][j]= line[j];
}
if(j < 20)
message[k][j] = '\0';
The fgets function includes the newline character \n in the buffer it fills. Just add a conditional statement inside your loop to copy all the characters except \n and \r. Something like:
if ( line[j] != '\n' && line[j] != '\r' ) {
/* Copy the character in your new buffer */
}
The newline characters are part of the string, so you need to remove them:
#include <string.h>
while(fgets(line,20,lineRead)!=NULL)
{
    char* newline_pos = strpbrk (line, "\n\r"); // get first occurrence of newline or carriage return
    if (newline_pos)
        *newline_pos = '\0';
    printf("%s ", line);
}
You should do:
while(fgets(line,20,lineRead)!=NULL)
{
    strcpy(message[k], line);
    size_t len = strlen(message[k]);
    if(len > 0 && message[k][len-1] == '\n')
        message[k][len-1] = '\0';
    printf("%s", message[k]);
}
What this does is that it copies line into message[k] and then removes the last character if it is a newline character. We are checking the last character because the documentation says "A newline character makes fgets stop reading, but it is considered a valid character by the function and included in the string copied to str." So if newline is present it will be the last character.
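A compact variant of the same idea uses strcspn() from <string.h>, which returns the length of the initial part of the string that contains none of the listed characters. This is a sketch, and strip_newline is an illustrative name:

```c
#include <string.h>

/* Cuts the string at the first '\r' or '\n'. If neither is present,
   strcspn returns the string length and the existing terminator is
   simply rewritten, so the string is left unchanged. */
void strip_newline(char *s)
{
    s[strcspn(s, "\r\n")] = '\0';
}
```

Calling strip_newline(line) right after a successful fgets() leaves the word without its line ending, whether the file uses \n or \r\n.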
I have a program in which I wanted to remove the spaces from a string. I wanted to find an elegant way to do so, and I found the following code in a forum (I've changed it a little to make it more readable):
char* line_remove_spaces (char* line)
{
char *non_spaced = line;
int i;
int j = 0;
for (i = 0; i <= strlen(line); i++)
{
if ( line[i] != ' ' )
{
non_spaced[j] = line[i];
j++;
}
}
return non_spaced;
}
As you can see, the function takes a string and, using the same allocated memory space, selects only the non-spaced characters. It works!
Anyway, according to Wikipedia, a string in C is a "Null-terminated string". I always thought this way and everything was good. But the problem is: we put no "null-character" in the end of the non_spaced string. And somehow the compiler knows that it ends at the last character changed by the "non_spaced" string. How does it know?
This does not happen by magic. You have in your code:
for (i = 0; i <= strlen(line); i++)
              ^^
The loop index i runs up to and including strlen(line). At that index the character array holds the nul character, and it gets copied as well. As a result, the output has a nul character at the desired index.
If you had
for (i = 0; i < strlen(line); i++)
              ^
then you would have to put the nul character in manually:
for (i = 0; i < strlen(line); i++)
{
if ( line[i] != ' ' )
{
non_spaced[j] = line[i];
j++;
}
}
// put nul character
line[j] = 0;
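To see the difference concretely, here is a small sketch of both loop bounds side by side. The function names are invented for the comparison, and the length is measured once up front so that the in-place edits cannot affect it.

```c
#include <string.h>

/* Copies non-space characters to the front and stops before the terminator. */
void squeeze_without_nul(char *line)
{
    size_t len = strlen(line);  /* measured once, before the buffer changes */
    size_t j = 0;

    for (size_t i = 0; i < len; i++)
        if (line[i] != ' ')
            line[j++] = line[i];
    /* no terminator written: old characters after index j stay visible */
}

/* Same loop, but i also reaches len, so the '\0' is copied too. */
void squeeze_with_nul(char *line)
{
    size_t len = strlen(line);
    size_t j = 0;

    for (size_t i = 0; i <= len; i++)
        if (line[i] != ' ')
            line[j++] = line[i];
}
```

For the input "a b c", the first version leaves "abc c" in the buffer (the tail of the old string is still there), while the second leaves "abc".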
Others have answered your question already, but here is a faster, and perhaps clearer version of the same code:
void line_remove_spaces (char* line)
{
char* non_spaced = line;
while(*line != '\0')
{
if(*line != ' ')
{
*non_spaced = *line;
non_spaced++;
}
line++;
}
*non_spaced = '\0';
}
The loop uses <= strlen so you will copy the null terminator as well (which is at i == strlen(line)).
You could try it. Debug it while it is processing a string containing only one space: " ". Watch carefully what happens to the index i.
How do you know that it "knows"? The most likely scenario is that you're simply having luck with your undefined behavior, and that there is a '\0'-character after the valid bytes of line end.
It's also highly likely that you're not seeing spaces at the end, which might be printed before hitting the stray "lucky '\0'".
A few other points:
There's no need to write this using indexing.
It's not very efficient to call strlen() on each loop iteration.
You might want to use isspace() to remove more whitespace characters.
Here's how I would write it, using isspace() and pointers:
char * remove_spaces(char *str)
{
char *ret = str, *put = str;
for(; *str != '\0'; str++)
{
if(!isspace((unsigned char) *str))
*put++ = *str;
}
*put = '\0';
return ret;
}
Note that this does terminate the space-less version of the string, so the returned pointer is guaranteed to point at a valid string.
The string parameter of your function is null-terminated, right?
And in the loop, the null character of the original string also gets copied into the returned space-free string. So that string is null-terminated as well!
For your compiler, the null character is just another byte that doesn't get any special treatment, but string APIs use it as a handy marker to detect the end of a string.
If you use i <= strlen(line), the loop also visits index strlen(line), which is where the '\0' sits, so the terminator is copied and your program works. You can step through it in a debugger to confirm.
I am trying to make a program which needs to scan in more than one word, and I do not know how to do this when the length is unspecified.
My first port of call was scanf, but it only scans in one word (I know you can do scanf("%d %s",temp,temporary);, but I do not know how many words there will be), so I looked around and found fgets. One issue with this is that I cannot find how to make it move on to the next piece of code, e.g.
scanf("%99s",temp);
printf("\n%s",temp);
if (strcmp(temp,"edit") == 0) {
editloader();
}
would run editloader(), while:
fgets(temp,99,stdin);
while(fgets(temporary,sizeof(temporary),stdin))
{
sprintf(temp,"%s\n%s",temp,temporary);
}
if (strcmp(temp,"Hi There")==0) {
editloader();
}
will not move onto the strcmp() code, and will stick on the original loop. What should I do instead?
I would read one word per loop iteration with scanf() and then append it to the "main" string (with strcat() rather than strcpy(), since strcpy() would overwrite what is already there).
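One way to sketch the word-by-word approach, assuming input ends at end-of-file and no word is longer than 99 characters. join_words and its parameters are illustrative names, and fscanf() is used instead of scanf() only so that the same code works on any stream, including stdin.

```c
#include <stdio.h>
#include <string.h>

/* Reads whitespace-separated words from in and joins them in out with
   single spaces. Stops at end of input or when out is full.
   Returns the number of words stored. */
size_t join_words(FILE *in, char *out, size_t out_size)
{
    char word[100];
    size_t used = 0, count = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';

    while (fscanf(in, "%99s", word) == 1) {
        size_t len = strlen(word);
        size_t sep = (used > 0) ? 1 : 0;    /* space before all but the first */

        if (used + sep + len + 1 > out_size)
            break;                          /* no room left for this word */
        if (sep)
            out[used++] = ' ';
        memcpy(out + used, word, len + 1);  /* copies the terminator too */
        used += len;
        count++;
    }
    return count;
}
```

With the input "Hi" and "There" on separate lines, join_words(stdin, temp, sizeof temp) would leave "Hi There" in temp, which can then be compared with strcmp().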
Maybe you can use a getline function. I have used it in VC++, but if it exists in the standard C library too then you are good to go.
Check here: http://www.daniweb.com/software-development/c/threads/253585
http://www.cplusplus.com/reference/iostream/istream/getline/
Hope you find what you are looking for.
I use this to read from stdin and get the same format that you would get by passing as arguments... so that you can have spaces in words and quoted words within a string. If you want to read from a specific file, just fopen it and change the fgets line.
#include <stdio.h>
#include <stdlib.h> /* malloc */
#include <string.h> /* strlen */
void getargcargvfromstdin(){
char s[255], **av = (char **)malloc(255 * sizeof(char *));
unsigned char i, pos, ac;
for(i = 0; i < 255; i++)
av[i] = (char *)malloc(255 * sizeof(char));
enum quotes_t{QUOTED=0,UNQUOTED}quotes=UNQUOTED;
while (fgets(s,255,stdin)){
i=0;pos=0;ac=0;
while (i<strlen(s)) {
/* '!'=33, 'ÿ'=-1, '¡'=-95 outside of these are non-printables */
if ( quotes && ((s[i] < 33) && (s[i] > -1) || (s[i] < -95))){
av[ac][pos] = '\0';
if (av[ac][0] != '\0') ac++;
pos = 0;
}else{
if (s[i]=='"'){ /* support quoted strings */
if (pos==0){
quotes=QUOTED;
}else{ /* support \" within strings */
if (s[i-1]=='\\'){
av[ac][pos-1] = '"';
}else{ /* end of quoted string */
quotes=UNQUOTED;
}
}
}else{ /* printable ascii characters */
av[ac][pos] = s[i];
pos++;
}
}
i++;
}
//your code here ac is the number of words and av is the array of words
}
}
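If the input exceeds the buffer size, you simply can't read it in one call.
You will have to read it in multiple passes.
The maximum size you can scan with scanf() is determined by the buffer you pass in, and that buffer must be real storage rather than an uninitialized pointer:
char name[100];
scanf("%99s",name);
Read this: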
http://sekrit.de/webdocs/c/beginners-guide-away-from-scanf.html
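The "multiple passes" idea can be sketched with fgets() and a buffer that grows as needed. This is one possible approach under stated assumptions, not the method from the linked guide: read_line is an invented name, the starting capacity of 64 is arbitrary, and the caller is expected to free() the result.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads one line of any length from in. Returns a malloc'ed string
   without the trailing '\n', or NULL at end of input or on
   allocation failure. */
char *read_line(FILE *in)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), in) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';        /* complete line: drop the newline */
            return buf;
        }
        if (cap - len < 2) {            /* buffer full: double it and go on */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }
    if (len == 0) {                     /* nothing was read at all */
        free(buf);
        return NULL;
    }
    return buf;                         /* last line had no newline */
}
```

Each returned line can then be split into words with strtok() or sscanf(), without knowing in advance how long the input is.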
Consider, this message:
N,8545,01/02/2011 09:15:01.815,"RASTA OPTSTK 24FEB2011 1,150.00 CE",S,8.80,250,0.00,0
This is just a sample. The idea is that this is one of the rows in a csv file. Now, if I split it at the commas, there will be a problem with the 1,150.00 figure.
The string inside the double quotes is of variable length, but it can be treated as one "element" (if I may use the term).
The other elements are the ones separated by ,
How do I parse it? (other than Ragel parsing engine)
Soham
Break the string into fields separated by commas provided that the commas are not embedded in quoted strings.
A quick way to do this is to use a state machine.
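boolean inQuote = false;
StringBuffer buffer = new StringBuffer();
int c; // "char" is a reserved word in Java, so the variable needs another name
// readchar() is to be implemented however you read a char; it is assumed to return -1 at end of input
while ((c = readchar()) != -1) {
    switch (c) {
    case ',':
        if (!inQuote) {
            // store the field in our parsedLine object for later processing.
            parsedLine.addField(buffer.toString());
            buffer.setLength(0);
        } else {
            // a comma inside quotes is part of the field
            buffer.append((char) c);
        }
        break;
    case '"':
        inQuote = !inQuote;
        // fall through to next target is deliberate.
    default:
        buffer.append((char) c);
    }
}
// the last field has no trailing comma, so store it here
parsedLine.addField(buffer.toString());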
Note that while this provides an example, there is a bit more to CSV files which would have to be accounted for (like embedded quotes within quotes, or whether it is appropriate to strip outer quotes in your example).
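Since the question is about C, here is a rough C sketch of the same state-machine idea. It rests on assumptions that go beyond the answer above: outer quotes are stripped, a doubled quote ("") inside a quoted field stands for one literal quote, and the line is split in place. split_csv_line and its parameters are illustrative names.

```c
#include <stddef.h>

/* Splits one CSV line in place. Commas inside double quotes do not
   split. Stores pointers to the fields in fields[] and returns how
   many were found (at most max_fields). */
size_t split_csv_line(char *line, char **fields, size_t max_fields)
{
    size_t n = 0;
    int in_quote = 0;
    char *read = line, *write = line;   /* write never passes read */

    if (max_fields == 0)
        return 0;
    fields[n++] = write;

    for (; *read != '\0' && *read != '\n'; read++) {
        if (*read == '"') {
            if (in_quote && read[1] == '"') {
                *write++ = '"';         /* "" inside quotes is one quote */
                read++;
            } else {
                in_quote = !in_quote;   /* opening or closing quote */
            }
        } else if (*read == ',' && !in_quote) {
            *write++ = '\0';            /* end of the current field */
            if (n == max_fields)
                return n;               /* no slot left for more fields */
            fields[n++] = write;
        } else {
            *write++ = *read;
        }
    }
    *write = '\0';
    return n;
}
```

On the sample row from the question this yields nine fields, with the fourth being RASTA OPTSTK 24FEB2011 1,150.00 CE, comma included.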
A quick and dirty solution if you don't want to add external libraries would be converting the double quotes to \0 (the end of string marker), then parsing the three strings separately using sscanf. Ugly but should work.
Assuming the input is well-formed (otherwise you'll have to add error handling):
for (i=0; str[i]; i++)
    if (str[i] == '"') str[i] = 0;
int consumed = 0;
// sscanf returns the number of converted items, not characters, so %n is used to learn how far it read
sscanf(str, "%c,%d,%d/%d/%d %d:%d:%d.%d,%n", &var1, &var2, ..., &var9, &consumed);
var10 = str + consumed + 1; // skip the opening quote, which is now a \0
sscanf(var10 + strlen(var10) + 1, ",%c,%f,%d,%f,%d", &var11, &var12, ..., &var15);
You will obviously have to make a copy of var10 if you want to free str immediately.
This is a function to get the next single CSV field from an input file supplied as a FILE *. It expects the file to be opened in text mode, and supports quoted fields with embedded quotes and newlines. Fields longer than the size of the supplied buffer are truncated.
int get_csv_field(FILE *f, char *buf, size_t size)
{
char *p = buf;
int c;
enum { QS_UNQUOTED, QS_QUOTED, QS_GOTQUOTE } quotestate = QS_UNQUOTED;
if (size < 1)
return EOF;
while ((c = getc(f)) != EOF)
{
if ((c == '\n' || c == ',') && quotestate != QS_QUOTED)
break;
if (c == '"')
{
if (quotestate == QS_UNQUOTED)
{
quotestate = QS_QUOTED;
continue;
}
if (quotestate == QS_QUOTED)
{
quotestate = QS_GOTQUOTE;
continue;
}
if (quotestate == QS_GOTQUOTE)
{
quotestate = QS_QUOTED;
}
}
if (quotestate == QS_GOTQUOTE)
{
quotestate = QS_UNQUOTED;
}
if (size > 1)
{
*p++ = c;
size--;
}
}
*p = '\0';
return c;
}
How about libcsv from our very own Robert Gamble?