I've run into a strange problem with the C read() function on Linux.
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
int main(int argc, char** argv){
    int fd=open("a.c",O_RDONLY);
    if(fd==-1){
        fprintf(stderr,"%s\n",strerror(errno));
    }
    char buf[10];
    if(read(fd,buf,9)==-1){
        fprintf(stderr,"%s\n",strerror(errno));
    }else{
        printf("%s\n",buf);
    }
}
I thought buf would be initialized to zero, so that after the first 9 characters are read into it the last element is '\0' and it behaves like a string. But the result is odd. Below are the a.c file and the output of this program:
a.c
1234567890abcd
result
1234567893øþzôo`
It seems the string runs past the end of the buffer. I can't figure out what happened. Can anyone help me?
Thanks.
When you print a character array that has no terminating '\0', printf keeps printing characters until it finds a '\0' somewhere in memory. In this case it looks like the first '\0' only appears after the bytes '3øþzôo`' that follow the nine characters you read. Note that printf does not know the size of the buf array, so it will also print whatever bytes happen to lie after the end of buf.
As you suggested, it is better either to set the entire buffer to 0 or to add the '\0' explicitly at the end, like this:
buf[9] = '\0';
You said "i think the buf should be initialize to zero". The compiler does not do this automatically for you, so you will need to do it yourself if that is what you want:
char buf[10];
memset(buf, 0, sizeof(buf));
Before the buffer is initialized, you have no guarantees on what its contents will be.
ISTM your buffer is not zero-terminated, since you only read 9 characters. Change the last part of your code:
    if(read(fd,buf,9)==-1){
        fprintf(stderr,"%s\n",strerror(errno));
    }else{
        /* add this */
        buf[9] = '\0';
        printf("%s\n",buf);
    }
}
What happens if you add that?
You should initialize buf to all 0.
Related
I use VS Code as my editor. The header <strings.h> declares a function called bzero, and when I hover over it VS Code says that bzero will "Set N bytes of S to 0". But I don't think it works like that. I created an array of 11 chars called s and put Hello World in it.
Then I used bzero to set the first 4 bytes of s to 0, but from the output it seems to have cleared the whole buffer.
#include <strings.h>
#include <stdio.h>
int main(int argc, char const *argv[])
{
    char s[11] = "Hello World";
    bzero(s, 4);
    puts(s);
    return 0;
}
$ cc main.c -o main && ./main
# empty
$
bzero does exactly what it says. The issue you’re facing is due to a misunderstanding of what a string is in C.
Briefly, a C string is a zero-terminated buffer of chars. That is, C treats an array of chars as a string by considering all chars until it finds the first one whose value is 0.
puts (and printf etc.) uses this definition of “string”.
As a consequence, setting even just the first char in the array to 0 results in an empty string, regardless of what comes after.
(Note also that bzero is a legacy function and its use is discouraged; use memset instead.)
Zero is a string-terminating character. Therefore if you set the first byte of your string to zero, puts will believe that the string is empty.
puts() expects a pointer to a string. Strings in C are sequences of characters terminated by a null character ('\0'). The null character is represented by the value zero.
Therefore, puts() stops at the first zero and prints an empty string.
Print the whole buffer to see the effect of bzero().
#include <strings.h>
#include <stdio.h>
int main(int argc, char const *argv[])
{
    char s[11] = "Hello World";
    bzero(s, 4);
    for (int i = 0; i < 11; i++) printf("%d ", s[i]); // print elements of the buffer
    puts(s);
    return 0;
}
Output:
0 0 0 0 111 32 87 111 114 108 100
My program is supposed to read from stdin and hand over the input to the system - in case the input does not equal "exit".
That works perfectly unless the second input is longer than the first.
For example if the first input is "hello" and the second is "hellohello", the input gets split up into "hello" and "ello".
I guess the problem is that the buffer s is not properly cleared while looping. Therefore I used memset() but unfortunately I did not get the results I was looking for.
Can anyone see the mistake?
Thanks a lot!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX 1024
int main(){
    char *s = (char *)malloc(MAX);
    char *temp = NULL;
    while(fgets(s, (sizeof(s)-1), stdin)!= NULL){
        temp = s+(strlen(s)-1);
        *temp = '\0';
        if (strcmp(s,"exit")==0){
            break;
        } else {
            system(s);
        }
        memset(s, 0, MAX);
    }
    free(s);
    return 0;
}
The incorrect part is (sizeof(s)-1). It does not give the size of the allocated buffer; it gives the size of a char * (typically 4 or 8 bytes). The size of your buffer is MAX. The memset() has nothing to do with the problem, so you can remove it. You also do not need the -1: fgets() always stores a terminating null byte at the end of the string, even when the buffer fills up.
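Also, be careful with these two lines
temp = s+(strlen(s)-1);
*temp = '\0';
They overwrite the last character unconditionally. That removes the newline, which you do need to drop before comparing against "exit", but it cuts off a real character whenever the line has no trailing newline (for example, when the input is longer than the buffer or ends without one). Keep in mind that
"fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A terminating null byte ('\0') is stored after the last character in the buffer."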
(from "man fgets", google for it)
I was trying to figure out how a string of known size can be filled with single characters, so I wrote this simple code as a step towards a bigger problem I have (dynamically filling a string of unknown size). When I compiled and ran it, the output contained a heart symbol, and I don't know where it comes from.
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int i;
    char str[3];
    for(i=0;i<=2;i++){
        str[i]=getc(stdin);
    }
    puts(str);
    return 0;
}
Thank you.
The C strings are sequences of chars terminated by the null character (i.e. the character with code 0). It can be expressed as '\0', '\x0' or simply 0.
Your code fills str with three chars but fails to produce the null terminator. Accordingly, puts() prints whatever characters it finds in memory until it reaches the first null character.
Your code has undefined behaviour. Anything can happen, and whatever does happen is not the fault of puts() or the compiler.
In order to fix it you have to make sure the string ends with the null terminating character:
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int i;
    // Make room for 3 useful chars and the null terminator
    char str[4];
    // Read three chars
    for(i = 0; i < 3; i ++) {
        str[i] = getc(stdin);
    }
    // Add the null terminator for strings
    str[3] = 0;
    puts(str);
    return 0;
}
Update
As @JeremyP notes in a comment, if the input (stdin) ends before the code has read 3 characters, getc() returns EOF. EOF is not a character but a negative int value; stored into a char it shows up as a strange non-printable character that makes you wonder where it came from.
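The correct way to write this code is to check the value returned by getc() for EOF before storing it. Calling feof() before the read does not help, because feof() only becomes true after a read has already hit the end of the file:
#include <stdio.h>
#include <stdlib.h>
int main()
{
    int i;
    int c;
    // Make room for 3 useful chars and the null terminator
    char str[4];
    // Read at most three chars; stop early if getc() reports EOF or an error
    for(i = 0; i < 3 && (c = getc(stdin)) != EOF; i ++) {
        str[i] = (char)c;
    }
    // Add the null terminator right after the last char actually read
    str[i] = 0;
    puts(str);
    return 0;
}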
Strings in C need to be null-terminated, so it could be that you forgot to add a '\0' character to the end of str. The heart symbol probably shows up because puts() keeps reading the next character in memory until it reaches a null terminator, '\0'. Since str doesn't contain one, it just continues reading past the array and, I'd guess, happens to find the byte that displays as a heart. Hope this helps.
I'm doing some exercises regarding buffer overflows and I am currently stumped as how to proceed further with one of them. This is the program code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/syscall.h>
void reverb(char *msg, unsigned int len) {
    unsigned char length = (unsigned char) len;
    char buffer[250] = "Printed: ";
    strcat(buffer + 9, msg);
    if ((length > 75) || (length < 15)) {
        fprintf(stderr, "Error: invalid string length");
        exit(1);
    }
    else {
        fprintf(stdout, "%s\n", buffer);
    }
}
int main(int argc, char **argv) {
    //argument check
    if (argc != 2) {
        fprintf(stderr, "Invalid arguments!\n");
        return 1;
    }
    reverb(argv[1], strlen(argv[1]));
    return 0;
}
So basically as obvious as it is, this program should just re-print the argument you gave to it. I obviously have to exploit one of the functions used, and I suspect the main culprit here is strcat. However, I'm faced with the issue of the length variable when I want to get my stack smashing done.
To be able to cause a segfault and successfully find a point for the overflow to happen, I need to pass an argument with a length of around 255+ (not sure on the current number right now but it's somewhere around that), which is not doable with my 75 char limit. Using gdb and setting a break point right after strcat, I am able to find the buffer's location in the memory (kinda easy, just the area filled with 0x41 since I spammed it with A's). length's location was kinda trickier, but here's the issue - it's located BEFORE the buffer, meaning I couldn't even overwrite it if I wanted. But, I somehow still need to overwrite it to get into the else branch, I think. And I've been stuck at that point, not seeing a way to proceed properly.
If the string is long, say 256 chars, the length variable will be wrong: the cast to unsigned char keeps only the low 8 bits, so 256 becomes 0. Use the suggestions in the comments to catch it.
The crash: strcat() will then copy all 256 chars onto the end of the string in buffer, which can only hold 250 bytes.
Move the strcat() into the else.
Additional notes:
I ran this in my debugger (VS2015)
When I passed in the 256-byte string and overran “buffer”, the variable “length” got clobbered, and its value changed from 0 to 69 (some character in my string). As a result the length check passed and the buffer was printed.
I am having problems with the printf function in the CS50 IDE. When I am using printf to print out a string (salt in this code), extra characters are being output that were not present in the original argument (argv).
Posted below is my code. Any help would be appreciated. Thank you.
#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>
int main(int argc, string argv[])
{
    // ensuring that only 1 command-line argument is inputted
    if (argc != 2)
    {
        return 1;
    }
    char salt[2];
    for (int i = 0; i < 2; i++)
    {
        char c = argv[1][i];
        salt[i] = c;
    }
    printf("the first 2 characters of the argument is %s\n", salt);
}
You are missing a string terminator in salt.
Somehow the computer needs to know where your string ends in memory. It does so by reading until it encounters a NUL byte, which is a byte with value zero.
Your array salt has exactly 2 bytes of space, and after them, random garbage exists which just happens to be next in memory after your array. Since you don't have a string terminator, the computer will read this garbage as well until it encounters a NUL byte.
All you need to do is include such a byte in your array, like so:
char salt[3] = {0};
This makes salt one byte longer, and {0} is shorthand for {0, 0, 0}, which initializes the contents of the array to all zeroes. (Alternatively, you could use char salt[3]; and later manually set the last byte to zero using salt[2] = 0;.)
In your case, salt is one element short of being a string: unless argv[1] is only one character long, salt does not contain a null terminator.
You need to allocate space to hold the null-terminator and actually put one there to be able to use salt as string, as expected for the argument to %s conversion specifier in case of printf().
Otherwise, the string related functions and operations, which essentially rely on the fact that there will be a null terminator to mark the end of the char array (i.e., mark the end of valid memory that can be accessed), will try to access past the valid memory which causes undefined behavior. Once you hit UB, nothing is guaranteed.
So, considering the fact that you want to use
"....the first 2 characters of the argument....."
you need to make salt a 3-element char array, and make sure that salt[2] contains a null-terminator, like '\0'.