Can't determine value of character at the end of a string - c

I'm new to C programming. I am trying to make a program that takes some simple input. However I found that on comparison of my input string to what the user "meant" to input, there is an additional character at the end. I thought this might be a '\0' or a '\r' but that seems not to be the case. This is my snippet of code:
char* getUserInput(char* command, char $MYPATH[])
{
printf("myshell$ ");
fgets(command, 200, stdin);
printf("%u\n", (unsigned)strlen(command));
if ((command[(unsigned)strlen(command) - 1] == '\0') || (command[(unsigned)strlen(command) - 1] == '\r'))
{
printf("bye\n");
}
return command;
}
The code shows that when entering, say "exit" that 5 characters are entered. However I can't seem to figure out the identity of this last one. "Bye" never prints. Does anyone know what this mystery character could be?

The magical 5th element most probably is a newline character: \n
From man fgets() (emphasis by me):
fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A '\0' is stored after the last character in the buffer.
To prove this print out each character read by doing so:
char* getUserInput(char* command, char $MYPATH[])
{
printf("myshell$ ");
fgets(command, 200, stdin);
printf("%u\n", (unsigned)strlen(command));
{
size_t i = 0, len = strlen(command);
for (;i < len; ++i)
{
fprintf(stderr, "command[%zu]='%c' (%hhd or 0x%hhx)\n", i, command[i], command[i], command[i]);
}
}
...

Assumptions:
Array indexes in C start at 0.
strlen returns the length of the string, not counting the null terminator.
So the string "exit" occupies 5 elements in the array: e, x, i, t, \0. strlen returns 4, and because you subtract 1 from it, you are checking the last character of the string instead of the null terminator.
To check the null terminator, use command[strlen(command)]. This always gives you \0, so there is no point in testing it.
If you want to compare strings, use the strcmp function.
UPDATE: the issue with your program is that fgets stores the \n character at the end of the string:
A newline character makes fgets stop reading, but it is considered a
valid character by the function and included in the string copied to
str.

The reason you don't see the last char is that strlen() does not count the '\0' in the string's length, so testing command[strlen(command) - 1] for '\0' won't succeed.
For instance, with const char* a = "abc"; strlen(a) is 3. If you want to test the terminator, you need to access it with command[strlen(command)].
The reason strlen returns 5 for "exit" is that fgets stores the '\n' character at the end of the input. You can test for it with command[strlen(command) - 1] == '\n'.
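Putting the answers above together, here is a minimal sketch of stripping the trailing newline before comparing with strcmp. The names chomp and is_exit_command are hypothetical and not part of the original code. Also removing a '\r' is an assumption, meant for input that arrives with Windows line endings.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: removes one trailing '\n' (and a '\r' just before it,
   if present) from s and returns the new length. */
size_t chomp(char *s)
{
    size_t len = strlen(s);

    if (len > 0 && s[len - 1] == '\n')
        s[--len] = '\0';
    if (len > 0 && s[len - 1] == '\r')
        s[--len] = '\0';
    return len;
}

/* Returns 1 if the line, as delivered by fgets(), is the "exit" command.
   Note that the line is modified in place. */
int is_exit_command(char *line)
{
    chomp(line);
    return strcmp(line, "exit") == 0;
}
```

In getUserInput this would be called right after a successful fgets(command, 200, stdin), for example if (is_exit_command(command)) printf("bye\n");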

Related

Is there a quick way to get the last element that was put in an array?

I use fgets to read a line from stdin and save it in a char array. I would like to get the last letter of the line I wrote, which should be in the array before \n and \0.
For example, if I have a char line[10] and write 1stLine on the terminal, is there a fast way to get the letter e rather than just cycling to it?
I saw this post How do I print the last element of an array in c but I think it doesn't work for me, even if I just create the array without filling it with fgets , sizeof line is already 10 because the array already has something in it
I know it's not java and I can't just .giveMeLastItem(), but I wonder if there is a smarter way than to cycle until the char before the \n to get the last letter I wrote
code is something like
char command[6];
fgets(command,6,stdin);
If you know the sentinel value, e.g. \0 (or \n, or any value for that matter), and you want the value of the element immediately preceding it, you can
use strchr() to find the position of the sentinel and
take the address retPtr - 1 and dereference it to get the value you want.
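The two steps above can be sketched as follows. The function name last_before_newline is hypothetical. Falling back to the null terminator when no '\n' is present, and returning '\0' when there is no preceding character, are assumptions about what the caller wants.

```c
#include <string.h>

/* Hypothetical helper: returns the character just before the first '\n' in
   line, or just before the null terminator if there is no newline.
   Returns '\0' if there is no such character (empty line or empty string). */
char last_before_newline(const char *line)
{
    const char *p = strchr(line, '\n');   /* position of the sentinel */

    if (p == NULL)
        p = line + strlen(line);          /* no newline: use the terminator */
    return (p == line) ? '\0' : p[-1];    /* guard against reading line[-1] */
}
```

The guard matters because dereferencing retPtr - 1 when the sentinel is the very first element would read before the start of the array.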
There are many different ways to inspect the line read by fgets():
first you should check the return value of fgets(): a return value of NULL means either the end of file was reached or some sort of error occurred and the contents of the target array is undefined. It is also advisable to use a longer array.
char command[80];
if (fgets(command, sizeof command, stdin) == NULL) {
// end of file or read error
return -1;
}
you can count the number of characters with len = strlen(command), and if this length is not zero(*), command[len - 1] is the last character read from the file, which should be a '\n' if the whole line, newline included, fit in the array (with the 6-byte array from the question, that means a line of fewer than 5 bytes). Stripping the newline requires a test:
size_t len = strlen(command);
if (len > 0 && command[len - 1] == '\n')
command[--len] = '\0';
you can use strchr() to locate the newline, if present, with char *p = strchr(command, '\n'); If a newline is present, you can strip it this way:
char *p = strchr(command, '\n');
if (p != NULL)
*p = '\0';
you can also count the number of characters not in the set "\n" with pos = strcspn(command, "\n"). pos will be the index of the newline or of the null terminator. Hence you can strip the trailing newline with:
command[strcspn(command, "\n")] = '\0'; // strip the newline if any
you can also write a simple loop:
char *p = command;
while (*p && *p != '\n')
p++;
*p = '\0'; // strip the newline if any
(*) strlen(command) can return 0 if the file contains an embedded null character at the beginning of a line. The null byte is treated like an ordinary character by fgets(), which continues reading bytes into the array until either size - 1 bytes have been read or a newline has been read.
Once you have only the array, there is no other way to do this. You could use strlen(line) and then compute the last character's position from that length, but this does basically the same thing (it loops over the array).
char lastChar = line[strlen(line)-1];
This has time-complexity of O(n), where n is the input length.
You can change the input method to read char by char and count the length or store the last input. Every O(1) lookup like this still costs O(n) time beforehand (n times O(1), once for every character you read). But unless you really have to optimize for speed (and you don't when you work with user input), you should just loop over the array with a function like strlen(line) (and store the result if you use it multiple times).
EDIT:
The strchr() function Sourav Ghosh mentioned, does exactly the same, but you can/must specify the termination character.
A straightforward approach can look the following way
char last_letter = command[ strcspn( command, "\n" ) - 1 ];
provided that the string is neither empty nor consists of just the newline character '\n'.
Here is a demonstrative program.
#include <stdio.h>
#include <string.h>
int main(void)
{
enum { N = 10 };
char command[N];
while ( fgets( command, N, stdin ) && command[0] != '\n' )
{
char last_letter = command[ strcspn( command, "\n" ) - 1 ];
printf( "%c ", last_letter );
}
putchar( '\n' );
return 0;
}
If you enter the following sequence of strings
Is
there
a
quick
way
to
get
the
last
element
that
was
put
in
an
array?
then the output will be
s e a k y o t e t t t s t n n ?
The fastest way is to keep an array of references like this:
long ref[]
where ref[x] contains the file offset of the last character of the xth line. With this reference table saved at the beginning of the file, you would do something like:
fseek(n*sizeof(long))
long ref = read_long()
fseek(ref)
read_char()
I think this is the fastest way to read the last character at the end of the nth line.
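The pseudocode above can be sketched in C roughly as follows. This is an illustration under assumptions, not the answerer's implementation: the offset table is kept in memory rather than stored at the beginning of the file, the helper names are hypothetical, and the stream is assumed to be opened in binary mode so that counted bytes match fseek offsets.

```c
#include <stdio.h>

/* Hypothetical helper: scans fp from the start and records, for each line,
   the offset of the last character before its '\n' (or before end of file).
   Empty lines get -1. Returns the number of lines recorded (at most max). */
size_t index_last_chars(FILE *fp, long *ref, size_t max)
{
    size_t n = 0;
    long pos = 0, line_start = 0;
    int c;

    rewind(fp);
    while (n < max && (c = getc(fp)) != EOF) {
        if (c == '\n') {
            ref[n++] = (pos > line_start) ? pos - 1 : -1;
            line_start = pos + 1;
        }
        pos++;
    }
    if (n < max && pos > line_start)      /* final line without '\n' */
        ref[n++] = pos - 1;
    return n;
}

/* Returns the last character of line n (0-based), or EOF if that line does
   not exist, is empty, or cannot be read. */
int last_char_of_line(FILE *fp, const long *ref, size_t count, size_t n)
{
    if (n >= count || ref[n] < 0)
        return EOF;
    if (fseek(fp, ref[n], SEEK_SET) != 0)
        return EOF;
    return getc(fp);
}
```

Building the table still costs one pass over the file, so this only pays off when the same file is queried many times.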
I did a quick test of the three mentioned methods of reading a line from a stream and measuring its length. I read /usr/share/dict/words 100 times and measured with clock()/1000:
fgets + strlen = 420
getc = 510
fscanf with " %100[^\n]%n" = 940
This makes sense as fgets and strlen just do 2 calls, getc does a call per character, and fscanf may do one call but has a lot of machinery to set up for processing complex formats, so a lot more overhead. Note the added space in the fscanf format to skip the newline left from the previous line.
Besides the other good examples, another way is to use fscanf()/scanf() with the %n conversion specifier, which stores the number of characters read so far into an argument once the string has been read.
Then you subtract one from this number and use it as an index into command:
char command[6];
int n = 0;
if (fscanf(stdin, "%5[^\n]" "%n", command, &n) != 1)
{
fputs("Error at input!", stderr);
// error routine.
}
getchar();
if (n != 0)
{
char last_letter = command[n-1];
}
#include <stdio.h>
int main (void)
{
char command[6];
int n = 0;
if (fscanf(stdin, "%5[^\n]" "%n", command, &n) != 1)
{
fputs("Error at input!", stderr);
// error routine.
}
getchar();
if (n != 0)
{
char last_letter = command[n-1];
putchar(last_letter);
}
return 0;
}
Execution:
./a.out
hello
o

Reading input with fgets?

I am trying to get the input of one character using fgets(). To my knowledge fgets will append the \n to the end of the input unless there is no room.
char test[1];
fgets(test,1,stdin);
readRestOfLine();
while (strcmp(test,"z") != 0){
......
......
}
Anyway the loop is never run even when z is entered. Why is this?
man fgets
char *fgets(char *s, int size, FILE *stream);
fgets() reads in at most one less than size characters from stream and
stores them into the buffer pointed to by s...
A terminating null byte ('\0') is stored after the last character in
the buffer.
In your case of size 1 this means fgets() reads in zero, i. e. no, characters and stores the terminating '\0' in test[0].
strcmp operates on strings, as follows from its name, so test has to be a \0-terminated string. test must have room for the \0.
As you stated correctly, fgets stores a '\n' at the end of the string. So if you input just "z", the resulting string will be "z\n", which is not equal to "z".
Furthermore, the buffer for fgets is only one character long in your program, but it must be at least long enough for the longest string you intend to enter, plus the '\n' and the terminating '\0'.
Try this:
char test[100]; // space for 98 characters + \n + terminating zero
fgets(test, 100, stdin);
readRestOfLine();
while (strcmp(test,"z\n") != 0){
......
......
}
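As an alternative to comparing against "z\n", a minimal sketch that reads a whole line and extracts a single command character could look like this. The helper name read_command_char and its return conventions are hypothetical, chosen only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from fp and returns its single command
   character, 0 if the line is not exactly one character long, or EOF when
   fgets() reports end of input or an error. */
int read_command_char(FILE *fp)
{
    char line[100];                        /* room for text + '\n' + '\0' */

    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;
    line[strcspn(line, "\n")] = '\0';      /* drop the newline, if any */
    return (strlen(line) == 1) ? (unsigned char)line[0] : 0;
}
```

The loop from the question could then be written as while (read_command_char(stdin) != 'z') { ... }, with an extra check for EOF so the loop cannot spin forever once input ends.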

Strange behavior of sscanf

I found some strange thing. Here is example of code:
...
char *start = strchr(value, '(');
if(start)
{
char buf[LEN];
memset(buf, 0, LEN);
int num = sscanf(start, "(%s)", buf);
if(num)
{
buf[strlen(buf) - 1] = '\0';
sprintf(value, "%s", buf);
}
...
If value is "(xxx)", for example, then value will be "xxx" after these actions.
But if value is "([34]xx{4,7}| 1234567890)" then value will be "[34]xx{4,7}".
Can anyone explain it?
P.S. it's ARM platform.
int num = sscanf(start, "(%s)", buf);
Here, sscanf returns when it encounters a whitespace in the buffer pointed to by start. You have a space in your input string:
"([34]xx{4,7}| 1234567890)"
^ space here
scanf returns the number of input items successfully matched and assigned. Here, it will return 1 and the value of num is 1. Next, you overwrite the last character in buf by this statement in your if block.
buf[strlen(buf) - 1] = '\0';
That explains your program's output. Now, a few things about your code:
You don't need to do memset(buf, 0, LEN);. Simply do char buf[LEN] = {0}; This fills the array with the null byte.
sscanf doesn't check for the array bound of the buffer buf into which you are writing the string which sscanf is reading from start. If the size of buf is not enough, sscanf will try to write in the memory beyond the buffer buf. This will lead to undefined behaviour and even program crash because of illegal memory access. You should give field width in the format string of sscanf to guard against the buffer overrun.
#define STRINGIFY(s) #s // preprocessor command # stringifies the token s
#define XSTRINGIFY(s) STRINGIFY(s)
#define LEN 10 // max buffer length without the null byte
// inside a function
char buf[LEN + 1]; // +1 for the null byte
const char *format = "(%" XSTRINGIFY(LEN) "s)"; // "(%10s)"
int num = sscanf(start, format, buf);
The 10 in the format string "(%10s)" means that at most 10 characters are stored in the buffer pointed to by buf and then a null byte \0 is added automatically in the end. Hence you don't need the following in the if block:
buf[strlen(buf) - 1] = '\0'; // overwrites the last char before null byte in buf.
Doing this, in fact, overwrites the last character in buf because strlen doesn't count the null byte.
When sscanf is used with %s, the conversion stops at the first whitespace character. That is why you are getting "[34]xx{4,7}" instead of the output you expected.
The format string consists of a sequence of directives which describe how to process the sequence of input characters. If processing of a directive fails, no further input is read, and scanf() returns. A "failure" can be either of the following: input failure, meaning that input characters were unavailable, or matching failure, meaning that the input was inappropriate (see below).
In your case, sscanf matches the starting (, and then parses the next token, %s which consumes data up to the first whitespace character. sscanf then fails to match a ), which means that the parsing stops. One token was successfully read and assigned, so the return value is 1.
Note that when using scanf, you cannot detect matching failures that occur after the last token that is assigned.
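One way around both problems is a scanset that stops at ')' instead of at whitespace, followed by %n to find out whether the closing parenthesis was actually matched. This is a sketch under assumptions: the helper name extract_parenthesized and the capacity of 32 are hypothetical, and the width 31 in the format must be kept in sync with LEN by hand.

```c
#include <stdio.h>

#define LEN 32  /* illustrative capacity, including the null byte */

/* Hypothetical helper: copies the text between '(' and ')' at the start of
   `start` into buf. Returns 1 on success, 0 if the input does not have the
   form "(...)" with 1 to 31 characters that are not ')' in between. */
int extract_parenthesized(const char *start, char buf[LEN])
{
    int end = 0;

    /* %31[^)] stores up to 31 characters that are not ')'.
       %n is only reached if the literal ')' after it matched, so end
       stays 0 when the closing parenthesis is missing. */
    if (sscanf(start, "(%31[^)])%n", buf, &end) != 1)
        return 0;
    return end > 0;
}
```

With this format there is no trailing ')' left in buf, so the buf[strlen(buf) - 1] = '\0' step from the question is not needed.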

I don't understand the behavior of fgets in this example

While I could use strings, I would like to understand why this small example I'm working on behaves in this way, and how can I fix it ?
int ReadInput() {
char buffer [5];
printf("Number: ");
fgets(buffer,5,stdin);
return atoi(buffer);
}
void RunClient() {
int number;
int i = 5;
while (i != 0) {
number = ReadInput();
printf("Number is: %d\n",number);
i--;
}
}
This should, in theory or at least in my head, let me read 5 numbers from input (albeit overwriting them).
However this is not the case, it reads 0, no matter what.
I understand printf puts a \0 null terminator ... but I still think I should be able to either read the first number, not just have it by default 0. And I don't understand why the rest of the numbers are OK (not all 0).
CLARIFICATION: I can only read 4/5 numbers, first is always 0.
EDIT:
I've tested and it seems that this was causing the problem:
main.cpp
scanf("%s",&cmd);
if (strcmp(cmd, "client") == 0 || strcmp(cmd, "Client") == 0)
RunClient();
somehow.
EDIT:
Here is the code if someone wishes to compile. I still don't know how to fix
http://pastebin.com/8t8j63vj
FINAL EDIT:
Could not get rid of the error. Decided to simply add
# ReadInput()
int ReadInput(BOOL check) {
...
if (check)
printf ("Number: ");
...
# RunClient()
void RunClient() {
...
ReadInput(FALSE); // a pseudo - buffer flush. Not really but I ignore
while (...) { // line with garbage data
number = ReadInput(TRUE);
...
}
And call it a day.
fgets reads the input as well as the newline character. So when you input a number, it's like: 123\n.
atoi doesn't report errors when the conversion fails.
Remove the newline character from the buffer:
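char buffer[5];
size_t length = strlen(buffer);
if (length > 0 && buffer[length - 1] == '\n') // only strip a newline that is actually there
buffer[length - 1] = 0;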
Then use strtol to convert the string into number which provides better error detection when the conversion fails.
char * fgets ( char * str, int num, FILE * stream );
Get string from stream.
Reads characters from stream and stores them as a C string into str until (num-1) characters have been read or either a newline or the end-of-file is reached, whichever happens first.
A newline character makes fgets stop reading, but it is considered a valid character by the function and included in the string copied to str. (This means that you carry \n)
A terminating null character is automatically appended after the characters copied to str.
Notice that fgets is quite different from gets: not only fgets accepts a stream argument, but also allows to specify the maximum size of str and includes in the string any ending newline character.
P.S.: Try to have a larger buffer.
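The advice above (strip the newline, use a larger buffer, convert with strtol) can be combined into a minimal sketch. The helper name read_int, the buffer size of 32, and the choice to reject trailing junk are assumptions made for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line from fp and converts it to an int.
   Returns 1 and stores the value in *out on success; returns 0 on end of
   input, a read error, or text that is not a valid int. */
int read_int(FILE *fp, int *out)
{
    char buffer[32];
    char *end;
    long value;

    if (fgets(buffer, sizeof buffer, fp) == NULL)
        return 0;
    buffer[strcspn(buffer, "\n")] = '\0';  /* drop the newline, if any */

    errno = 0;
    value = strtol(buffer, &end, 10);
    if (end == buffer || *end != '\0')     /* no digits, or trailing junk */
        return 0;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;
    *out = (int)value;
    return 1;
}
```

Unlike atoi, this lets the caller tell an input of 0 apart from a failed conversion. A line longer than the buffer is still left partly unread in the stream, so real code may want to discard the remainder.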

Does fgets() always terminate the char buffer with \0?

Does fgets() always terminate the char buffer with \0 even if EOF is already reached? It looks like it does (it certainly does in the implementation presented in the ANSI K&R book), but I thought I would ask to be sure.
I guess this question applies to other similar functions such as gets().
EDIT: I know that \0 is appended during "normal" circumstances, my question is targeted at EOF or error conditions. For example:
FILE *fp;
char b[128];
/* ... */
if (feof(fp)) {
/* is \0 appended after EACH of these calls? */
fgets(b, 128, fp);
fgets(b, 128, fp);
fgets(b, 128, fp);
}
When it succeeds, fgets always adds a '\0' to the read buffer, which is why it reads at most size - 1 characters from the stream (size being the second parameter).
Never use gets as you can never guarantee that it won't overflow any buffer that you give it, so while it technically does always terminate the read string this doesn't actually help.
Never use gets!!
7.19.7.2 The fgets function
Synopsis
1 #include <stdio.h>
char *fgets(char * restrict s, int n,
FILE * restrict stream);
Description
2 The fgets function reads at most one less than the number of characters
specified by n from the stream pointed to by stream into the array pointed
to by s. No additional characters are read after a new-line character
(which is retained) or after end-of-file. A null character is written
immediately after the last character read into the array.
Returns
3 The fgets function returns s if successful. If end-of-file is encountered
and no characters have been read into the array, the contents of the array
remain unchanged and a null pointer is returned. If a read error occurs
during the operation, the array contents are indeterminate and a null
pointer is returned.
So, yes, when fgets() does not return NULL the destination array always has a null character.
If fgets() returns NULL, the destination array may have been changed and may not have a null character. Never rely on the array after getting NULL from fgets().
Edit example added
$ cat fgets_error.c
#include <stdio.h>
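void print_buf(const char *buf, size_t len) {
size_t k;
/* cast so that %02X receives an unsigned value */
printf("%02X", (unsigned)(unsigned char)buf[0]);
for (k=1; k<len; k++) printf(" %02X", (unsigned)(unsigned char)buf[k]);
}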
int main(void) {
char buf[3] = {1, 1, 1};
char *r;
printf("Enter CTRL+D: ");
fflush(stdout);
r = fgets(buf, sizeof buf, stdin);
printf("\nfgets returned %p, buf has [", (void*)r);
print_buf(buf, sizeof buf);
printf("]\n");
return 0;
}
$ ./a.out
Enter CTRL+D:
fgets returned (nil), buf has [01 01 01]
$
See? no NUL in buf :)
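In practice this means the buffer should only be inspected after fgets() has returned non-NULL. A minimal sketch of such a loop follows. The helper name count_lines is hypothetical, and counting a final line without a newline as a line is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the lines in fp. The buffer is only examined
   after fgets() has returned non-NULL, i.e. when it is known to hold a
   null-terminated string. Returns -1 if a read error occurred. */
long count_lines(FILE *fp)
{
    char buf[128];
    long lines = 0;
    int pending = 0;        /* 1 while inside a line whose '\n' is not seen yet */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);          /* safe: buf is terminated here */

        if (len > 0 && buf[len - 1] == '\n') {
            lines++;
            pending = 0;
        } else {
            pending = 1;                   /* long line or missing final '\n' */
        }
    }
    if (ferror(fp))                        /* NULL came from an error, not EOF */
        return -1;
    return lines + pending;
}
```

After the loop, buf is deliberately not touched again, because its contents are not guaranteed once fgets() has returned NULL.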
man fgets:
fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a new‐line is read, it is stored into the buffer. A '\0' is stored after the last character in the buffer.
If you opened the file in binary mode ("rb") and want to read text line by line with fgets, you can use the following code to protect your software from losing text if the text mistakenly contains a '\0' byte.
But in the end, as the others mentioned, you should normally not use fgets if the stream contains '\0'.
long filepos=ftell(stream); /* ftell returns long */
fgets(buffer, buffersize, stream);
long len=(long)strlen(buffer);
/* now check for > len+1 since no problem if the
last byte is 0 */
if(ftell(stream)-filepos > len+1)
{
if(!len) filepos++;
if(!fseek(stream, filepos, SEEK_SET) && len)
{
fread(buffer, 1, len, stream);
buffer[len]='\0';
}
}
Yes it does. From CPlusPlus.com
Reads characters from stream and stores them as a C string into str until (num-1) characters have been read or either a newline or a the End-of-File is reached, whichever comes first.
A newline character makes fgets stop reading, but it is considered a valid character and therefore it is included in the string copied to str.
A null character is automatically appended in str after the characters read to signal the end of the C string.

Resources