So I have a wall of text in a file and I need to recognize the words that sit between $ signs, replace them with numbers, then print the modified text to another file along with what each number corresponds to.
Also lines are not defined and columns should be max 80 characters.
Ex:
I $like$ cats.
I [1] cats.
[1] --> like
That's what I did:
#include <stdio.h>
#include <stdlib.h>
#define N 80
#define MAX 9999
int main()
{
FILE *fp;
int i=0,count=0;
char matr[MAX][N];
if((fp = fopen("text.txt","r")) == NULL){
printf("Error.");
exit(EXIT_FAILURE);
}
while((fscanf(fp,"%s",matr[i])) != EOF){
printf("%s ",matr[i]);
if(matr[i] == '\0')
printf("\n");
//I was thinking maybe to find two $ but Idk how to replace the entire word
/*
if(matr[i] == '$')
count++;
if(count == 2){
...code...
}
*/
i++;
}
fclose(fp);
return 0;
}
My problem is that fscanf doesn't recognize '\0' so it doesn't go in the next line when I print the array..also I don't know how to replace $word$ with a number.
Not only will fscanf("%s") read one whitespace-delimited string at a time, it will also eat all whitespace between those strings, including line terminators. If you want to reproduce the input whitespace in the output, as your example suggests you do, then you need a different approach.
Also lines are not defined and columns should be max 80 characters.
I take that to mean the number of lines is not known in advance, and that it is acceptable to assume that no line will contain more than 80 characters (not counting any line terminator).
When you say
My problem is that fscanf doesn't recognize '\0' so it doesn't go in the next line when I print the array
I suppose you're talking about this code:
char matr[MAX][N];
/* ... */
if(matr[i] == '\0')
Given that declaration for matr, the given condition will always evaluate to false, regardless of any other consideration. fscanf() does not factor in at all. The type of matr[i] is char[N], an array of N elements of type char. That evaluates to a pointer to the first element of the array, which pointer will never be NULL. It looks like you're trying to determine when to write a newline, but nothing remotely resembling this approach can do that.
I suggest you start by taking @Barmar's advice to read line-by-line via fgets(). That might look like so:
char line[N+2]; /* N + 2 leaves space for both newline and string terminator */
if (fgets(line, sizeof(line), fp) != NULL) {
/* one line read; handle it ... */
} else {
/* handle end-of-file or I/O error */
}
Then for each line you read, parse out the "$word$" tokens by whatever means you like, and output the needed results (everything but the $-delimited tokens verbatim; the bracket substitution number for each token). Of course, you'll need to memorialize the substitution tokens for later output. Remember to make copies of those, as the buffer will be overwritten on each read (if done as I suggest above).
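To make the per-line step concrete, here is a minimal sketch of one way to do it, not the only one. The function name substitute_line, the constants MAX_TOKENS and TOKEN_LEN, and the choice to copy an unmatched $ through unchanged are all my own assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

#define MAX_TOKENS 100   /* hypothetical cap on the number of substitutions */
#define TOKEN_LEN  81    /* a token cannot exceed one 80-character line */

/* Copies `line` to `out`, replacing each $word$ with [k] and saving a copy
 * of word in tokens[k-1]. *count is the number of tokens saved so far.
 * Returns 0 on success, -1 if `out` or `tokens` would overflow. */
int substitute_line(const char *line, char *out, size_t outsz,
                    char tokens[][TOKEN_LEN], size_t *count)
{
    size_t o = 0;

    if (outsz == 0)
        return -1;
    while (*line != '\0') {
        const char *end = NULL;

        if (*line == '$')
            end = strchr(line + 1, '$');      /* look for the closing $ */
        if (end != NULL) {
            size_t wlen = (size_t)(end - (line + 1));
            int n;

            if (*count >= MAX_TOKENS || wlen >= TOKEN_LEN)
                return -1;
            /* Copy the word: the line buffer is reused on the next read. */
            memcpy(tokens[*count], line + 1, wlen);
            tokens[*count][wlen] = '\0';
            ++*count;
            n = snprintf(out + o, outsz - o, "[%zu]", *count);
            if (n < 0 || (size_t)n >= outsz - o)
                return -1;
            o += (size_t)n;
            line = end + 1;
        } else {
            if (o + 1 >= outsz)
                return -1;
            out[o++] = *line++;               /* ordinary character, verbatim */
        }
    }
    out[o] = '\0';
    return 0;
}
```

Call it once per line returned by fgets(), write out to the output file, and after the last line print "[k] --> tokens[k-1]" for each saved token.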
fscanf() does recognize '\0', under select circumstances, but that is not the issue here.
Code needs to detect '\n'. fscanf(fp,"%s"... will not do that. The first thing "%s" directs is to consume (and not save) any leading white-space including '\n'. Read a line of text with fgets().
Simply read 1 line at a time. Then march down the buffer looking for words.
Following uses "%n" to track how far in the buffer scanning stopped.
// more room for \n \0
#define BUF_SIZE (N + 1 + 1)
char buffer[BUF_SIZE];
while (fgets(buffer, sizeof buffer, stdin) != NULL) {
char *p = buffer;
char word[sizeof buffer];
int n;
while (sscanf(p, "%s%n", word, &n) == 1) {
// do something with word
if (strcmp(word, "$zero$") == 0) fputs("0", stdout);
else if (strcmp(word, "$one$") == 0) fputs("1", stdout);
else fputs(word, stdout);
fputc(' ', stdout);
p += n;
}
fputc('\n', stdout);
}
Use fread() to read the file contents into a char[] buffer. Then iterate through this buffer, and whenever you find a $, perform a strncmp to decide which value to replace it with (keep in mind that there is a second $ at the end of the word). To replace $word$ with a number you need to either shrink or extend the buffer at the position of the word, depending on the length of the number in ASCII form (memmove is normally the tool for this). Then you can write the number into the gap that results from resizing the buffer, overwriting the $word$ as well.
Then write the buffer to the file, overwriting all its previous contents.
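The shrink-or-extend step is the fiddly part, so here is a rough sketch of it. The helper name replace_span and its parameters are hypothetical. It assumes the buffer holds a null-terminated text of length *len and has cap bytes in total.

```c
#include <string.h>

/* Replaces buf[start .. start+old_len) with `repl`, shifting the tail with
 * memmove. *len is the text length (terminator excluded), cap the capacity
 * of buf. Returns 0 on success, -1 if the range is bad or it will not fit. */
int replace_span(char *buf, size_t *len, size_t cap,
                 size_t start, size_t old_len, const char *repl)
{
    size_t new_len = strlen(repl);
    size_t tail, result_len;

    if (start > *len || old_len > *len - start)
        return -1;
    tail = *len - (start + old_len);
    result_len = *len - old_len + new_len;
    if (result_len + 1 > cap)
        return -1;
    /* The regions may overlap, so memmove (not memcpy) is required here.
     * tail + 1 moves the terminating '\0' along with the text. */
    memmove(buf + start + new_len, buf + start + old_len, tail + 1);
    memcpy(buf + start, repl, new_len);
    *len = result_len;
    return 0;
}
```

For "I $like$ cats." the span starts at index 2 and is 6 bytes long, so replace_span(buf, &len, sizeof buf, 2, 6, "[1]") leaves "I [1] cats." in the buffer.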
I have these in my file:
JOS BUTTLER
JASON ROY
DAWID MALAN
JONNY BAISTROW
BEN STOKES
each on its own line. I want to read them in my program and print them exactly the way they are in the file. My expected output is:
JOS BUTTLER
JASON ROY
DAWID MALAN
JONNY BAISTROW
BEN STOKES
How would I do it using fscanf() and printf()? Also, how can I change the delimiter of fscanf() to \n?
I have tried something like this:
char n[5][30];
printf("Name of 5 cricketers read from the file:\n");
for(i=0;i<5;i++)
{
fscanf(fp,"%[^\n]s",&n[i]);
printf("%s ",n[i]);
}
fclose(fp);
}
But it works only for the first string and other string could not be displayed. There were garbage values.
Protect against buffer overflows by limiting the length of the input: use "%29[^\n]" instead of "%[^\n]s" (you don't need the s specifier).
Consume the trailing new line (your wildcard ^\n reads until a new line is found) using %*c, * means that a char will be read but won't be assigned:
fscanf(fp, "%29[^\n]%*c", n[i]);
or better yet (as pointed out by @WeatherVane), add a space before %, this will consume any blank space including tabs, spaces and new lines that may be left in the buffer from the previous read:
fscanf(fp, " %29[^\n]", n[i]);
Notice that you don't need an ampersand in &n[i], fscanf wants a pointer but n[i] is already (decays into) a pointer when passed as an argument.
Finally, as pointed out by @paddy, fgets does all that for you and is a safer function, always prefer fgets.
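Putting the points above together, a small sketch of the fscanf() variant could look like this. The function name read_names and the NAME_LEN constant are my own for illustration. Note that a line longer than 29 characters would be split across two entries, which is one more reason to prefer fgets.

```c
#include <stdio.h>

#define NAME_LEN 30

/* Reads up to max_names lines from fp into names.
 * Returns how many names were actually read. */
int read_names(FILE *fp, char names[][NAME_LEN], int max_names)
{
    int count = 0;

    /* The leading space skips the newline left over from the previous read;
     * the width 29 leaves room for the terminating '\0'. */
    while (count < max_names && fscanf(fp, " %29[^\n]", names[count]) == 1)
        count++;
    return count;
}
```

Checking the return value of fscanf() is what stops the loop at end of file instead of printing garbage.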
How to get string terminated with new line in file handling using fscanf
Use fgets() to read a line of input into a string. It also reads and saves the line's '\n'.
Buffer size 30 may be too small, consider 60.
To detect if the line is too long and lop off the '\n', read into a buffer at least 2 characters larger than the longest acceptable name.
// char n[5][30];
#define NAME_N 5
#define NAME_SIZE 30
char n[NAME_N][NAME_SIZE];
char buf[NAME_SIZE + 2];
int i = 0;
while (i < NAME_N && fgets(buf, sizeof buf, fp) != NULL) {
size_t len = strlen(buf);
// Lop off potential trailing '\n'
if (len > 0 && buf[len - 1] == '\n') {
buf[--len] = '\0';
}
// Handle unusual name length
if (len >= NAME_SIZE || len == 0) {
fprintf(stderr, "Unacceptable name <%s>.\n", buf);
exit(EXIT_FAILURE);
}
// Success
strcpy(n[i], buf);
printf("%s\n", n[i]);
i++;
}
I use fgets to read a line from stdin and save it in a char array. I would like to get the last letter of the line I wrote, which should be in the array just before the \n and \0.
For example if i have a char line[10] and write on the terminal 1stLine, is there a fast way to get the letter e rather than just cycling to it?
I saw this post How do I print the last element of an array in c but I think it doesn't work for me, even if I just create the array without filling it with fgets , sizeof line is already 10 because the array already has something in it
I know it's not java and I can't just .giveMeLastItem(), but I wonder if there is a smarter way than to cycle until the char before the \n to get the last letter I wrote
code is something like
char command[6];
fgets(command,6,stdin);
If you know the sentinel value, ex: \0 (or \n ,or any value for that matter), and you want the value of the element immediately preceding to that, you can
use strchr() to find out the position of the sentinel and
get the address of retPtr-1 and dereference to get the value you want.
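As a sketch of those two steps (the helper name char_before is made up for this example), with the edge cases guarded so that the pointer is never moved before the start of the array:

```c
#include <stddef.h>
#include <string.h>

/* Returns the character just before the first `sentinel` in s, or '\0'
 * if the sentinel is absent or is the very first character. */
char char_before(const char *s, int sentinel)
{
    const char *p = strchr(s, sentinel);

    if (p == NULL || p == s)
        return '\0';
    return p[-1];
}
```

With the line "1stLine\n" as stored by fgets(), char_before(line, '\n') gives 'e'. Passing '\0' as the sentinel also works, because strchr() treats the terminator as part of the string.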
There are many different ways to inspect the line read by fgets():
first you should check the return value of fgets(): a return value of NULL means either the end of file was reached or some sort of error occurred and the contents of the target array is undefined. It is also advisable to use a longer array.
char command[80];
if (fgets(command, sizeof command, stdin) == NULL) {
// end of file or read error
return -1;
}
you can count the number of characters with len = strlen(command) and if this length is not zero(*), command[len - 1] is the last character read from the file, which should be a '\n' if the whole line fit in the array (fewer than 5 bytes of text for the original 6-byte array). Stripping the newline requires a test:
size_t len = strlen(command);
if (len > 0 && command[len - 1] == '\n')
command[--len] = '\0';
you can use strchr() to locate the newline, if present, with char *p = strchr(command, '\n'); If a newline is present, you can strip it this way:
char *p = strchr(command, '\n');
if (p != NULL)
*p = '\0';
you can also count the number of characters not in the set "\n" with pos = strcspn(command, "\n"). pos will be the index of the newline or of the null terminator. Hence you can strip the trailing newline with:
command[strcspn(command, "\n")] = '\0'; // strip the newline if any
you can also write a simple loop:
char *p = command;
while (*p && *p != '\n')
p++;
*p = '\0'; // strip the newline if any
(*) strlen(command) can return 0 if the file contains an embedded null character at the beginning of a line. The null byte is treated like an ordinary character by fgets(), which continues reading bytes into the array until either size - 1 bytes have been read or a newline has been read.
Once you have only the array, there is no other way to do this. You could use strlen(line) and then get the last character's position from that length, but this basically does exactly the same thing (it loops over the array).
char lastChar = line[strlen(line)-1];
This has time-complexity of O(n), where n is the input length.
You can change the input method to read char by char, counting the length or storing the last character as you go. Every O(1) lookup like this still costs O(n) time up front (n times O(1), once for every character you read). But unless you really have to optimize for speed (and you don't when you work with user input), you should just loop over the array with a function like strlen(line) (and store the result if you use it multiple times).
EDIT:
The strchr() function Sourav Ghosh mentioned, does exactly the same, but you can/must specify the termination character.
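For completeness, the char-by-char variant mentioned above might be sketched like this. The function name last_char_of_line is hypothetical, and the sketch cannot tell an empty line from end of input since both return EOF.

```c
#include <stdio.h>

/* Consumes one line from fp and returns its last character before the
 * newline, or EOF if the line is empty or the stream is exhausted. */
int last_char_of_line(FILE *fp)
{
    int c;
    int last = EOF;

    /* Remember the most recent character until the line ends. */
    while ((c = getc(fp)) != EOF && c != '\n')
        last = c;
    return last;
}
```

This needs no array at all, so there is no line length limit, but the work is still proportional to the length of the line.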
A straightforward approach can look the following way
char last_letter = command[ strcspn( command, "\n" ) - 1 ];
provided that the string is not empty or contains just the new line character '\n'.
Here is a demonstrative program.
#include <stdio.h>
#include <string.h>
int main(void)
{
enum { N = 10 };
char command[N];
while ( fgets( command, N, stdin ) && command[0] != '\n' )
{
char last_letter = command[ strcspn( command, "\n" ) - 1 ];
printf( "%c ", last_letter );
}
putchar( '\n' );
return 0;
}
If you enter the following sequence of strings
Is
there
a
quick
way
to
get
the
last
element
that
was
put
in
an
array?
then the output will be
s e a k y o t e t t t s t n n ?
The fastest way is to keep an array of references like this:
long ref[]
and ref[x] to contain the file offset of the last character of the xth line. Having this reference saved at the beginning of the file you will do something like:
fseek(n*sizeof(long))
long ref = read_long()
fseek(ref)
read_char()
I think this is the fastest way to read the last character at the end of the nth line.
I did a quick test of the three mentioned methods of reading a line from a stream and measuring its length. I read /usr/share/dict/words 100 times and measured with clock()/1000:
fgets + strlen = 420
getc = 510
fscanf with " %100[^\n]%n" = 940
This makes sense as fgets and strlen just do 2 calls, getc does a call per character, and fscanf may do one call but has a lot of machinery to set up for processing complex formats, so a lot more overhead. Note the added space in the fscanf format to skip the newline left from the previous line.
Besides the other good examples, another way is to use fscanf()/scanf() with the %n format specifier, which stores into an argument the number of characters read so far, right after the string has been read.
Then you subtract one from this number and use the result as an index into command:
char command[6];
int n = 0;
if (fscanf(stdin, "%5[^\n]" "%n", command, &n) != 1)
{
fputs("Error at input!", stderr);
// error routine.
}
getchar();
if (n != 0)
{
char last_letter = command[n-1];
}
#include <stdio.h>
int main (void)
{
char command[6];
int n = 0;
if (fscanf(stdin, "%5[^\n]" "%n", command, &n) != 1)
{
fputs("Error at input!", stderr);
// error routine.
}
getchar();
if (n != 0)
{
char last_letter = command[n-1];
putchar(last_letter);
}
return 0;
}
Execution:
./a.out
hello
o
So far I have been using if statements to check the size of the user-inputted strings. However, they don't seem to be very useful: no matter the size of the input, the while loop ends and it returns the input to the main function, which then just outputs it.
I don't want the user to enter anything greater than 10, but when they do, the additional characters just overflow and are outputted on a newline. The whole point of these if statements is to stop that from happening, but I haven't been having much luck.
#include <stdio.h>
#include <string.h>
#define SIZE 10
char *readLine(char *buf, size_t sz) {
int true = 1;
while(true == 1) {
printf("> ");
fgets(buf, sz, stdin);
buf[strcspn(buf, "\n")] = 0;
if(strlen(buf) < 2 || strlen(buf) > sz) {
printf("Invalid string size\n");
continue;
}
if(strlen(buf) > 2 && strlen(buf) < sz) {
true = 0;
}
}
return buf;
}
int main(int argc, char **argv) {
char buffer[SIZE];
while(1) {
char *input = readLine(buffer, SIZE);
printf("%s\n", input);
}
}
Any help towards preventing buffer overflow would be much appreciated.
When the user enters in a string longer than sz, your program processes the first sz characters, but then when it gets back to the fgets call again, stdin already has input (the rest of the characters from the user's first input). Your program then grabs another up to sz characters to process and so on.
The call to strcspn is also deceiving, because if the "\n" is not among the characters you grabbed then it'll just return the length of the string (sz-1 for a full buffer), even though there's no newline.
After you've taken input from stdin, you can do a check to see if the last character is a '\n' character. If it's not, it means that the input goes past your allowed size and the rest of stdin needs to be flushed. One way to do that is below. To be clear, you'd do this only when there's been more characters than allowed entered in, or it could cause an infinite loop.
int c; /* int, not char, so that EOF can be told apart from real characters */
while((c = getchar()) != '\n' && c != EOF)
{}
However, to avoid restructuring your code too much, you need to know whether your buffer contains the newline before you overwrite it with 0. If it exists it will be the last character, so you can use the following to check (assuming the string is not empty).
int containsNewline = buf[strlen(buf)-1] == '\n';
Also be careful with your size checks, you currently don't handle the case for a strlen of 2 or sz. I would also never use identifier names like "true", which would be a possible value for a bool variable. It makes things very confusing.
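Pulling these pieces together, one possible shape for the reading step is sketched below. The function name read_bounded_line and its return convention are assumptions of mine, not something from your code. It reads from any FILE * so that it can be tried on a file as well as on stdin.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line into buf (capacity sz) with the newline stripped.
 * Returns 1 if the whole line fit, 0 if it was too long (the excess is
 * discarded so the next call starts on a fresh line), -1 on EOF or error. */
int read_bounded_line(FILE *in, char *buf, size_t sz)
{
    size_t len;
    int c;
    int extra = 0;

    if (sz == 0 || fgets(buf, (int)sz, in) == NULL)
        return -1;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }
    /* No newline was stored: drain the rest of the line, noting whether
     * anything other than the newline itself was left over. */
    while ((c = getc(in)) != '\n' && c != EOF)
        extra = 1;
    return extra ? 0 : 1;
}
```

In readLine you would loop while this returns 0, printing "Invalid string size" each time, and stop the program when it returns -1.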
If the input line is longer than 9 chars, your fgets() call reads only the first 9 (sz - 1) into buf and appends the terminating '\0'. Because these chars don't contain the trailing \n, strcspn(buf, "\n") returns 9, so setting buf[9] to 0 only overwrites the terminator and stays within the bounds of buf[] (max index is 9). The rest of the line stays in the input stream and is picked up by the next fgets() call, which is why the extra characters appear on a new line.
Additionally, never use true or false as the name of a variable, it makes the code confusing. Use something like 'ok' instead.
Finally: please clarify what output is expected when the input contains a string longer than the limit. Should it be truncated?
I'm fairly new to C and not sure how I would do this. I've found similar questions, but nothing exactly like I want.
What I want to do is read a raw txt file "sentence by sentence" with the end of a sentence being considered a period (.) or a newline (\n). With no assumed maximum lengths for any data structures.
My first thought was getline(), but the version of C I'm required to use does not seem to have such a function. So I've tried to use fgets() and then parse the data onto a sscanf() with a scanset. sscanf(charLine, "%[^.]s", sentence);
The problem with this, is that if there is more than one period (.) it will stop at the first and not start again at that period (.) to collect the others.
I feel like I'm on the right track but just don't know how to expand on this.
while(fgets (charLine, size, readFile) == NULL)
{
sscanf(charLine, "%[^.]s", sentence);
// something here...
}
You can write a function that reads the stream until a . or a newline is found. David C. Rankin pointed out that scanning for just a . might be too restrictive, since the embedded periods in www.google.com would act as sentence breaks. One can instead stop on a . only if it is followed by white space:
#include <ctype.h>
#include <stdio.h>
/* alternative to fgets to stop at `.` and newline */
char *fgetsentence(char *dest, size_t size, FILE *fp) {
size_t i = 0;
while (i + 2 < size) {
int c = getc(fp);
if (c == EOF)
break;
dest[i++] = (char)c;
if (c == '\n')
break;
if (c == '.') {
int d = getc(fp);
if (d == EOF)
break;
if (isspace(d)) {
dest[i++] = (char)d;
break;
}
ungetc(d, fp);
}
}
if (i == 0)
return NULL;
dest[i] = '\0';
return dest;
}
If you want to handle arbitrary long sentences, you would take pointers to dest and size and reallocate the array if required.
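A rough sketch of that growable variant follows. The name fgetsentence_dyn, the initial capacity of 64 and the doubling strategy are my own choices for illustration. The caller owns the buffer and must free() it, including when NULL is returned.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads one sentence into *dest, growing it with realloc as needed.
 * *dest must be NULL or a malloc'ed block of *size bytes.
 * Returns *dest, or NULL at end of file or on allocation failure. */
char *fgetsentence_dyn(char **dest, size_t *size, FILE *fp)
{
    size_t i = 0;

    for (;;) {
        /* Keep room for up to two characters plus the terminator. */
        while (i + 3 > *size) {
            size_t new_size = *size ? *size * 2 : 64;
            char *tmp = realloc(*dest, new_size);
            if (tmp == NULL)
                return NULL;
            *dest = tmp;
            *size = new_size;
        }
        int c = getc(fp);
        if (c == EOF)
            break;
        (*dest)[i++] = (char)c;
        if (c == '\n')
            break;
        if (c == '.') {
            int d = getc(fp);
            if (d == EOF)
                break;
            if (isspace(d)) {          /* period followed by white space */
                (*dest)[i++] = (char)d;
                break;
            }
            ungetc(d, fp);             /* embedded period, keep reading */
        }
    }
    if (i == 0)
        return NULL;
    (*dest)[i] = '\0';
    return *dest;
}
```

Start with char *buf = NULL; size_t cap = 0; and call fgetsentence_dyn(&buf, &cap, fp) in a loop until it returns NULL; the same buffer is reused and only grows.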
Note that it would be very impractical to use fscanf(fp, "%[^.\n]", dest) because it is not possible to pass the maximum number of bytes to store into dest as an evaluated argument and one would need to special case empty lines and sentences.
Note too that stopping on ., even with the above restriction that it must be followed by white space still causes false positives: sentences can contain embedded periods followed by white space that are not the end of the sentence. Example: Thanks to David C. Rankin for his comments on my answer.
So I'm trying to make it so that you can write text into a file until you make a newline or type -1. My problem is that when you write, it just keeps going until it crashes and gives the error "Stack around the variable "inputChoice" was corrupted".
I believe the problem is that the program doesn't stop accepting stdin when you want to stop typing (-1, newline) and that causes the error. I've tried with a simple scanf and it works, but you can only write a word. No spaces and it doesn't support multiple lines either. That's why I have to use fgets
Judging from your comments, I assume that there are some basic concepts in C
that you haven't fully understood, yet.
C-Strings
A C-String is a sequence of bytes. This sequence must end with the value 0.
Every value in the sequence represents a character based on the
ASCII encoding, for example the
character 'a' is 97, 'b' is 98, etc. The character '\0' has
the value 0 and it's the character that determines the end of the string.
That's why you hear a lot that C-Strings are '\0'-terminated.
In C you use an array of chars (char string[], char string[SOME VALUE]) to
save a string. For a string of length n, you need an array of dimension n+1, because
you also need one space for the terminating '\0' character.
When dealing with strings, you always have to think about the proper type,
whether you are using an array or a pointer. A pointer
to char doesn't necessarily mean that you are dealing with a C-String!
Why am I telling you this? Because of:
char inputChoice = 0;
printf("Do you wish to save the Input? (Y/N)\n");
scanf("%s", &inputChoice);
I haven't changed much, got very demotivated after trying for a while.
I changed the %s to an %c at scanf(" %c", &inputChoice) and that
seems to have stopped the program from crashing.
which shows that you haven't understood the difference between %s and %c.
The %c conversion specifier character tells scanf that it must match a single character and it expects a pointer to char.
man scanf
c
Matches a sequence of characters whose length is specified by the maximum field
width (default 1); the next pointer must be a
pointer to char, and there must be enough room for all the characters
(no terminating null byte is added). The usual skip of
leading white space is suppressed. To skip white space first, use an explicit space in the format.
Forget the bit about the length, it's not important right now.
The important part is the type of the pointer and the missing terminator. For
the format scanf("%c", the function expects a pointer to char and it's not
going to write the terminating '\0' character, so the result won't be a
C-String. If you want to read one letter and one letter only:
char c;
scanf("%c", &c);
// also possible, but only the first char
// will have a defined value
char c[10];
scanf("%c", c);
The first one is easy to understand. The second one is more interesting: Here
you have an array of char of dimension 10 (i.e it holds 10 chars). scanf
will match a single letter and write it on c[0]. However the result won't be
a C-String, you cannot pass it to puts nor to other functions that expect
C-Strings (like strcpy).
The %s conversion specifier character tells scanf that it must match a sequence of non-white-space characters
man scanf
s
Matches a sequence of non-white-space characters; the next pointer must be a
pointer to the initial element of a character array that is long enough to
hold the input sequence and the terminating null byte ('\0'), which is added
automatically.
Here the result will be that a C-String is saved. You also have to have enough
space to save the string:
char string[10];
scanf("%s", string);
If the string matches 9 or fewer characters, everything will be fine, because
a string of length 9 requires 10 spaces (never forget the terminating
'\0'). If the string matches more than 9 characters, you won't have enough
space in the buffer and a buffer overflow (accessing beyond the size) occurs.
This is undefined behaviour and anything can happen: your program might
crash, your program might not crash but overwrite another variable and thus
screw up the flow of your program, it could even kill a kitten somewhere. Do
you really want to kill kittens?
So, do you see why your code is wrong?
char inputChoice = 0;
scanf("%s", &inputChoice);
inputChoice is a char variable, it can only hold 1 value.
&inputChoice gives you the address of the inputChoice variable, but the
char after that is out of bound, if you read/write it, you will have an
overflow, thus you kill a kitten. Even if you enter only 1 character, it will
write at least 2 bytes, and because it only has space for one character, a kitten will die.
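To see the two specifiers side by side without hurting any kittens, here is a small sketch. The helper names parse_yes_no and parse_word are invented for this example, and sscanf() is used on a string only so that the behaviour is easy to try out. The same formats apply to scanf().

```c
#include <stdio.h>

/* Reads the first non-blank character of input as a yes/no answer.
 * Returns 1 for 'Y' or 'y', 0 for anything else, -1 if nothing was read. */
int parse_yes_no(const char *input)
{
    char choice;   /* one char is enough: %c writes no terminating '\0' */

    if (sscanf(input, " %c", &choice) != 1)
        return -1;
    return choice == 'Y' || choice == 'y';
}

/* Reads one word into word (capacity 10). The width 9 keeps one byte
 * free for the '\0' that %s always appends. Returns 1 on success. */
int parse_word(const char *input, char word[10])
{
    return sscanf(input, "%9s", word) == 1;
}
```

With the width in place, a longer word is cut after 9 characters instead of overflowing the array.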
So, let's talk about your code.
From the perspective of a user: Why would I want to enter lines of text, possibly a lot of lines of text
and then answer "No, I don't want to save the lines". It doesn't make sense to
me.
In my opinion you should first ask the user whether he/she wants to save the
input first, and then ask for the input. If the user doesn't want to save
anything, then there is no point in asking the user to enter anything at
all. But that's just my opinion.
If you really want to stick to your plan, then you have to save every line and
when the user ends entering data, you ask and you save the file.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BUFFERLEN 1024
void printFile () {
int i;
char openFile[BUFFERLEN];
FILE *file;
printf("What file do you wish to write in?\n");
scanf("%s", openFile);
getchar();
file = fopen(openFile, "w");
if (file == NULL) {
printf("Could not open file.\n");
return;
}
// we save here all lines to be saved
char **lines = NULL;
int num_of_lines = 0;
char buffer[BUFFERLEN];
printf("Enter an empty line or -1 to end input\n");
// for simplicity, we assume that no line will be
// larger than BUFFERLEN - 1 chars
while(fgets(buffer, sizeof buffer, stdin))
{
// we should check if the last character is \n,
// if not, buffer was not large enough for the line
// or the stream closed. For simplicity, I will ignore
// these cases
size_t len = strlen(buffer);
if(len > 0 && buffer[len - 1] == '\n')
buffer[len - 1] = '\0';
if(strcmp(buffer, "") == 0 || strcmp(buffer, "-1") == 0)
break; // either an empty line or user entered "-1"
char *line = strdup(buffer);
if(line == NULL)
break; // if no more memory
// grow the array of line pointers by one slot
char **tmp = realloc(lines, (num_of_lines+1) * sizeof *tmp);
if(tmp == NULL)
{
free(line);
break; // same reason as for strdup failing
}
lines = tmp;
lines[num_of_lines++] = line; // save the line and increase num_of_lines
}
char inputChoice = 0;
printf("Do you wish to save the Input? (Y/N)\n");
scanf("%c", &inputChoice);
getchar();
if (inputChoice == 'Y' || inputChoice == 'y') {
for(i = 0; i < num_of_lines; ++i)
fprintf(file, "%s\n", lines[i]); // writing every line
printf("Your file has been saved\n");
printf("Please press any key to continue");
getchar();
}
// closing FILE buffer
fclose(file);
// free memory
if(num_of_lines)
{
for(i = 0; i < num_of_lines; ++i)
free(lines[i]);
free(lines);
}
}
int main(void)
{
printFile();
return 0;
}
Remarks on the code
I used the same code as yours as the base for mine, so that you can spot the
differences much quicker.
I use the macro BUFFERLEN for declaring the length of the buffers. That's
my style.
Look at the fgets line:
fgets(buffer, sizeof buffer, stdin)
I use here sizeof buffer instead of 1024 or BUFFERLEN. Again, that's my
style, but I think doing this is better, because even if you change the size
of the buffer by changing the macro, or by using another explicit size, sizeof buffer
will always return the correct size. Be aware that this only works when
buffer is an array.
The function strdup returns a pointer to a new string that
duplicates the argument. It's used to create a new copy of a string. When
using this function, don't forget that you have to free the memory using
free(). strdup is not part of the standard library, it conforms
to SVr4, 4.3BSD, POSIX.1-2001. If you use Windows (I don't use Windows,
I'm not familiar with the Windows ecosystem), this function might not be
present. In that case you can write your own:
char *strdup(const char *s)
{
char *str = malloc(strlen(s) + 1);
if(str == NULL)
return NULL;
strcpy(str, s);
return str;
}