I know it's very dumb but I really don't get what the heck is happening here.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int getinput()
{
char buf[10];
int rv = read(0, buf, 1000);
printf("\nNumber of bytes read are %d\n", rv);
return 0;
}
int main()
{
getinput();
return 0;
}
I can't understand how this read() function is working.
read(0, buf, 1000)
Also, buf is only 10 bytes long, so why does it accept 23 bytes?
Array-pointer equivalence
In C, an array such as buf in your example is not itself a pointer, but in most expressions it is converted ("decays") to a pointer to the address of its first byte.
You can print the value of this pointer:
#include <stdio.h>
int main(void) {
char buf[10];
printf("Address of the first byte of buf: %p\n", buf);
return 0;
}
Output:
Address of the first byte of buf: 0x7ffd3699bfb6
Pointer arithmetic
When you write something into this buffer with an instruction like
buf[3] = 'Z';
It is in fact translated to
*(buf+3) = 'Z';
It means "add 3 to the value of the pointer buf and store the character 'Z' at the resulting address".
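To make the equivalence concrete, here is a minimal sketch. The helper names store_at and address_of are made up for this illustration, and the caller is assumed to pass an index that lies inside the array.

```c
#include <stddef.h>

/* Hypothetical helper: store c at index i using explicit pointer
   arithmetic. The caller must guarantee that i is inside the array. */
void store_at(char *base, size_t i, char c)
{
    *(base + i) = c;   /* the same operation as base[i] = c */
}

/* Hypothetical helper: address of element i, computed as base + i. */
char *address_of(char *base, size_t i)
{
    return base + i;   /* the same address as &base[i] */
}
```

Calling store_at(buf, 3, 'Z') on the 10-byte buf has the same effect as buf[3] = 'Z'.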
Nothing in the language stops you from storing the character 'Z' at an arbitrary address, before or after the memory that belongs to buf: the compiler performs no bounds check. Formally, such a write is undefined behavior. In practice, if the address you hit happens to be the address of another variable, the write usually does not produce a segmentation fault (the address is valid); it silently corrupts that variable instead.
In C, you can even write the character 'Z' at the address 123456 if you like:
int main(void) {
char *address = (char *)123456;
*address = 'Z';
return 0;
}
The fact that your buffer is 10 bytes long does not change that. You cannot "fix" this because writing anything at any memory location is a fundamental feature of the C programming language. The only "fix" I can think of would be to use another language.
File descriptors opened at program startup
In your example, you pass the value 0 as the first argument of the function read(). This value is the file descriptor of the standard input. It is opened automatically at program startup (normally you get a file descriptor as the result of a call to the function open()). So, if read() reports 23 bytes, it means that you typed in 23 characters on your keyboard during the program execution (for instance, 22 letters and 1 newline character).
It would be better to use the macro name of the standard input file descriptor, STDIN_FILENO, and to request no more bytes than buf can hold:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
int getinput()
{
char buf[10];
int rv = read(STDIN_FILENO, buf, 10);
printf("\nNumber of bytes read are %d\n", rv);
return 0;
}
int main()
{
getinput();
return 0;
}
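If you also want to treat the bytes as a C string afterwards, one possible approach is to leave room for a terminator. This is only a sketch: read_bounded is a hypothetical helper name, and it assumes a POSIX system where read() is declared in unistd.h.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read at most cap - 1 bytes from fd into buf and
   null-terminate the result. Returns the number of bytes read, or -1
   on error (or if cap is 0). */
ssize_t read_bounded(int fd, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;
    ssize_t n = read(fd, buf, cap - 1);  /* never ask for more than fits */
    if (n < 0)
        return -1;
    buf[n] = '\0';                       /* n <= cap - 1, so this is in bounds */
    return n;
}
```

With a 10-byte buffer and 23 bytes waiting on the descriptor, a single call stores 9 bytes plus the terminator; the rest stays queued for the next call.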
Your sample is a perfect example of a buffer overflow.
read(0, buf, 1000) will most probably corrupt memory (the stack, in your case).
read() takes the start address of buf and writes all 23 bytes there. Since buf holds only 10, whatever lies in the 13 bytes after it is overwritten, which can lead to very unwanted behavior (maybe even a crash of your application).
C leaves the responsibility for handling memory correctly to the programmer, so there is no bounds checking.
You call read() with 3 arguments:
The file descriptor, in your case 0 (standard input).
The pointer to the array of bytes to fill with the bytes read, in your case buf.
The maximum number of bytes to read, in your case 1000.
Apparently only 23 bytes were available, which is less than or equal to 1000, so read() returns this value.
Note: Before returning, it happily wrote all 23 bytes into the array. Since your buffer has a capacity of just 10 bytes, the memory after it gets overwritten. This is called a "buffer overflow" and is a common error, abused for evil attacks, or possibly leading to crashes or malfunction (Thanks, ikegami!).
To fix this error, I recommend changing the read to:
read(0, buf, sizeof buf);
This way you are always giving the right size to read(). (This works only if buf is declared as an array; if buf were a pointer, sizeof would yield the size of the pointer.)
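The array caveat can be seen in a small sketch. Both function names here are invented for the illustration.

```c
#include <stddef.h>

/* Inside this function buf is a real array, so sizeof yields the size
   of the whole array. */
size_t size_seen_as_array(void)
{
    char buf[10];
    return sizeof buf;   /* 10 */
}

/* A parameter of type char * is only a pointer, so sizeof yields the
   size of the pointer, no matter how large the caller's array is. */
size_t size_seen_as_pointer(char *buf)
{
    return sizeof buf;   /* sizeof(char *), commonly 4 or 8 */
}
```

So when a buffer is passed to another function, its size has to be passed along as a separate argument.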
Related
I'm making a webserver in C, and I want to allocate just one chunk of memory for everything (strings and arrays).
My allocation strategy starts with this, where bp is the buffer pointer used for searches:
char *bp, *buf=malloc(1048576); // allocate 1MB
The first 64KB will be the maximum space for the full, unprocessed HTTP request (because I'm not dealing with uploaded file requests). The remainder of the 1MB that's allocated will contain each header, which will hopefully be easily accessible.
Now if I programmed the extraction code this way, I'd have no problem:
char *httpreq=buf+65536;
int linesize=8192; //size of each line
int httprn=0; // Http request header number. increments for each header found.
char *crlf;
while((crlf=strstr(bp,"\r\n"))){ //loop until no more enters are found
memmove(httpreq+(httprn*linesize),bp,crlf-bp);
bp+=2; //move pointer to skip CRLF.
httprn++;
}
But I'd rather program the code this way:
int linesize=8192; //size of each line
char *httpreq[linesize]=buf+65536;
int httprn=0;
while((crlf=strstr(bp,"\r\n"))){
memmove(httpreq[httprn++],bp+=2,crlf-bp); //skip CRLF
}
However, the C compiler tells me that I have an invalid initializer, and it's referring to this particular line:
char *httpreq[linesize]=buf+65536;
Is there any way I can use this kind of syntax:
httpreq[n]
instead of this:
httpreq+(linesize*n)
to read the HTTP header n without having to use local static memory?
This:
char httpreq[n][n];
would use static memory, but I'd rather use extended memory for string allocation.
Any ideas?
Yes, but you need to properly construct the pointers. Here is an example of what you want to achieve:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define LINESIZE 8192
int main() {
char *buf = (char *) malloc(1048576);
char (*ht)[LINESIZE];
ht = (char (*)[])( buf + 65536);
printf("bf %p\n", buf);
printf("ht[0] %p\n", ht[0] );
printf("ht[1] %p\n", ht[1] );
sprintf(ht[0],"%s\n", "This is the first line");
sprintf(ht[1],"%s\n", "This is the second line");
printf("%s", ht[0]);
printf("%s", ht[1]);
}
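So, the declaration char (*ht)[LINESIZE] tells the compiler that ht is a pointer to an array of LINESIZE chars. Indexing it as ht[n] therefore moves n * LINESIZE bytes from the start and gives you line n.
The (char (*)[])(buf + 65536) cast converts the computed offset address to the type of ht.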
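Applied to the header-splitting loop from the question, the same idea might look like the sketch below. The names line_t, lines_at and split_crlf are hypothetical, and the policy (stop at the empty line that ends the headers, truncate over-long lines) is an assumption made for the example, not something the question specifies.

```c
#include <stddef.h>
#include <string.h>

#define LINE_SIZE 8192

typedef char line_t[LINE_SIZE];   /* one fixed-size slot per header line */

/* View the memory starting at block + offset as an array of line slots. */
line_t *lines_at(char *block, size_t offset)
{
    return (line_t *)(block + offset);
}

/* Copy each CRLF-terminated line of req into lines[0], lines[1], ...
   Stops at the empty line that ends the headers, after max_lines
   lines, or when no further CRLF is found. Lines longer than
   LINE_SIZE - 1 characters are truncated. Returns the number of
   lines stored. */
size_t split_crlf(const char *req, line_t *lines, size_t max_lines)
{
    size_t count = 0;
    const char *end;

    while (count < max_lines && (end = strstr(req, "\r\n")) != NULL) {
        size_t len = (size_t)(end - req);
        if (len == 0)
            break;                    /* blank line: end of headers */
        if (len > LINE_SIZE - 1)
            len = LINE_SIZE - 1;      /* keep room for the terminator */
        memcpy(lines[count], req, len);
        lines[count][len] = '\0';
        count++;
        req = end + 2;                /* skip the CRLF itself */
    }
    return count;
}
```

After line_t *httpreq = lines_at(buf, 65536), header n is simply httpreq[n], and nothing beyond the single malloc'd block is used.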
This subprogram takes three user inputs: a text string, a path to a file, and a 1 digit flag. It loads the file into a buffer, then appends both the flag and the file buffer, in that order, to a char array that serves as a payload. It returns the payload and the original user string.
I received a bug where some of my string operations on the file buffer, flag, and payload appeared to corrupt the memory that the user_string was located in. I fixed the bug by swapping strcat(flag, buffer) to strcpy(payload, flag), (which is what I intended to write originally), but I'm still perplexed as to what caused this bug.
My guess from reading the documentation (https://www.gnu.org/software/libc/manual/html_node/Concatenating-Strings.html) is that strcat extends the to string strlen(to) bytes into unprotected memory, and that the file contents loaded into the buffer were then copied over it in a buffer overflow.
My questions are:
Is my guess correct?
Is there a way to reliably prevent this from occurring? Catching this sort of thing with an if(){} check is kind of unreliable, as it doesn't consistently return something obviously wrong; you expect a string of length filelength+1 and get a string of filelength+1.
bonus/unrelated: is there any computational cost/drawbacks/effects with calling a variable without operating on it?
/*
user inputs:
argv[0] = tendigitaa/four
argv[1] = ~/Desktop/helloworld.txt
argv[2] = 1
helloworld.txt is a text file containing (no quotes) : "Hello World"
*/
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
int main (int argc, char **argv) {
char user_string[100] = "0";
char file_path[100] = "0";
char flag[1] = "0";
strcpy(user_string, argv[1]);
strcpy(file_path, argv[2]);
strcpy(flag, argv[3]);
/*
at this point printfs of the three declared variables return the same as the user inputs.
======
======
a bunch of other stuff happens...
======
======
and then this point printfs of the three declared variables return the same as the user inputs.
*/
FILE *file;
char * buffer = 0;
long filelength;
file = fopen(file_path, "r");
if (file) {
fseek(file, 0, SEEK_END);
filelength = ftell(file);
fseek(file, 0, SEEK_SET);
buffer = malloc(filelength);
printf("stringcheck1: %s \n", user_string);
if (buffer) {
fread(buffer, 1, filelength, file);
}
}
long payloadlen = filelength + 1;
char payload[payloadlen];
printf("stringcheck2: %s \n", user_string);
strcpy(payload, flag);
printf("stringcheck3: %s \n", user_string);
strcat(flag, buffer);
printf("stringcheck4: %s \n", user_string); //bug here
free(buffer);
printf("stringcheck5: %s \n", user_string);
payload; user_string; //bonus question: does this line have any effect on the program or computational cost?
return 0;
}
/*
printf output:
stringcheck1: tendigitaa/four
stringcheck2: tendigitaa/four
stringcheck3: tendigitaa/four
stringcheck4: lo World
stringcheck5: lo World
*/
note: taking this section out of the main program caused stringcheck 4 to segfault instead of returning "lo World". The behavior was otherwise equivalent.
strcat does exactly what documentation says:
char *strcat(char *restrict s1, const char *restrict s2); The
strcat() function shall append a copy of the string pointed to by s2
(including the terminating null byte) to the end of the string pointed
to by s1. The initial byte of s2 overwrites the null byte at the end
of s1. If copying takes place between objects that overlap, the
behavior is undefined.
s1 has to have enough memory allocated to accommodate both strings plus the terminating null byte.
The linked article is about writing your own string concatenation functions. How to write such a function depends on the application, which is stated there. There are many ways.
In your program the destination char array is not big enough, so the result is undefined behavior. It is not even big enough to hold a one-character string.
I strongly advise learning some C string basics.
If you want a safer strcat, you can write your own, for example:
char *mystrcat(const char *str1, const char *str2)
{
char *dest = NULL;
size_t str1_length, str2_length;
if(str1 && str2)
{
dest = malloc((str1_length = strlen(str1)) + (str2_length = strlen(str2)) + 1);
if(dest)
{
memcpy(dest, str1, str1_length);
memcpy(dest + str1_length, str2, str2_length + 1); /* + 1 also copies the terminating null byte */
}
}
return dest;
}
But safety always comes at a price: the code is longer and less efficient. The C language was designed to be as efficient as possible, sacrificing safety and introducing the idea of undefined behavior.
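If the destination is a fixed-size array rather than freshly allocated memory, another option is to check the remaining space before copying anything. The following is only a sketch; concat_checked is a made-up name and the return convention (0 or -1) is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the string in dst only if the
   result, including its terminator, fits in dst_size bytes.
   Returns 0 on success. Returns -1, leaving dst unchanged, if the
   result would not fit or if dst holds no terminator within dst_size. */
int concat_checked(char *dst, size_t dst_size, const char *src)
{
    const char *nul = memchr(dst, '\0', dst_size);
    if (nul == NULL)
        return -1;                      /* dst is not a valid string */

    size_t dst_len = (size_t)(nul - dst);
    size_t src_len = strlen(src);
    if (src_len >= dst_size - dst_len)
        return -1;                      /* no room for src plus terminator */

    memcpy(dst + dst_len, src, src_len + 1);
    return 0;
}
```

With the one-byte flag array from the question, the call fails cleanly instead of writing past the array, because no terminator is found inside it.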
You can't store a non-empty string in a 1-character array. A string needs room for the string contents and a null terminator.
So when you declare
char flag[1] = "1";
you've only allocated one byte, which contains the character 1. There's no null terminator.
Using this with any string functions will result in undefined behavior, because they look for the null terminator to find the end of the string.
strcat(flag, buffer) will search for the null terminator, which will be outside the array, and then append buffer after that. So this clearly causes a buffer overflow when writing.
strcpy(payload, flag) is also wrong. It will look for a null terminator after the flag bytes to know when to stop copying to payload, so it will copy more than just flag (unless there happens to be a null byte after it).
You can resolve the strcpy() problem by increasing the size:
char flag[2] = "1";
You can also leave the size empty; the compiler will then make the array large enough to hold the string that initializes it, including the null byte:
char flag[] = "1";
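For the payload itself, a length-based approach avoids the string functions entirely, which matters here because fread() does not null-terminate what it reads. This is a sketch under assumptions: build_payload is a hypothetical helper, and the flag is taken to be a single character.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a new heap string consisting of the
   flag character followed by the first len bytes of data. The data
   need not be null-terminated. The caller must free() the result.
   Returns NULL if the allocation fails. */
char *build_payload(char flag, const char *data, size_t len)
{
    char *payload = malloc(len + 2);   /* flag + data + terminator */
    if (payload == NULL)
        return NULL;
    payload[0] = flag;
    memcpy(payload + 1, data, len);
    payload[len + 1] = '\0';
    return payload;
}
```

In the question's program this would be called as build_payload(argv[3][0], buffer, filelength), using the length returned by ftell rather than searching for a terminator.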
The line that causes the problem is strcat(flag, buffer): it tries to cram buffer into flag, which is only one character long, and no extra space has been allocated to fit buffer.
If you want to append buffer to flag, flag must be large enough to hold both strings plus the null terminator. realloc() can only resize memory obtained from malloc(), so flag would have to be allocated dynamically rather than declared as an array.
Also the only thing you ever print is user_string. I'm not sure if you're trying to print the other string you're working with.
I want to store a single char into a char array pointer, and that action is in a while loop, adding a new char every time. I strictly want the text to go into a variable and not be printed, because I am going to compare the text. Here's my code:
#include <stdio.h>
#include <string.h>
int main()
{
char c;
char *string;
while((c=getchar())!= EOF) //gets the next char in stdin and checks if stdin is not EOF.
{
char temp[2]; // I was trying to convert c, a char, to temp, a const char, so that I can use strcat to concatenate them to string, but printf returns nothing.
temp[0]=c; //assigns temp
temp[1]='\0'; //null end point
strcat(string,temp); //concatenates the strings
}
printf(string); //prints out the string.
return 0;
}
I am using GCC on Debian (POSIX/UNIX operating system) and want to have Windows compatibility.
EDIT:
I noticed some miscommunication about what I actually intend to do, so I will explain: I want to create a system where I can input an unlimited number of characters, have that input stored in a variable, and have it read back to me from that variable. To get around using realloc and malloc, I made it read the next available char until EOF. Keep in mind that I am a beginner in C (though most of you have probably guessed that already) and haven't had a lot of experience with memory management.
If you want an unlimited amount of character input, you'll need to actively manage the size of your buffer, which is not as hard as it sounds.
First use malloc to allocate, say, 1000 bytes.
Read until this runs out.
Use realloc to grow the buffer to 2000 bytes.
Read until this runs out, and so on.
Like this:
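#include <stdio.h>
#include <stdlib.h>
int main(void){
    size_t buf_size=1000;
    char *buf=malloc(buf_size);
    int c; //int, not char, so that EOF can be told apart from real input
    size_t n=0;
    if(!buf)
        return 1;
    while((c=getchar())!= EOF)
    {
        buf[n++] = (char)c;
        if(n>=buf_size-1) //keep one byte free for the trailing 0
        {
            buf_size+=1000;
            char *tmp=realloc(buf, buf_size);
            if(!tmp) //realloc failed; the old block is still valid
            {
                free(buf);
                return 1;
            }
            buf=tmp;
        }
    }
    buf[n] = '\0'; //add trailing 0 at the end, to make it a proper string
    //do stuff with buf;
    free(buf);
    return 0;
}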
You won't get around using malloc-oids if you want unlimited input.
You have undefined behavior.
You never set string to point anywhere, so you can't dereference that pointer.
You need something like:
char buf[1024] = "", *string = buf;
that initializes string to point to valid memory where you can write, and also sets that memory to an empty string so you can use strcat().
Note that looping strcat() like this is very inefficient, since it needs to find the end of the destination string on each call. It's better to just use pointers.
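A sketch of the pointer-based approach, for a fixed-size buffer: read_into is a hypothetical helper name, and it takes a FILE * (stdin in the question's case) so that it can also be tried on other streams.

```c
#include <stdio.h>

/* Hypothetical helper: copy characters from in into buf through a
   moving write pointer. Stops at EOF or once cap - 1 characters are
   stored, then null-terminates. Returns the number of characters
   stored. */
size_t read_into(FILE *in, char *buf, size_t cap)
{
    char *p = buf;   /* always points at the next free byte */
    int c;

    if (cap == 0)
        return 0;
    /* Check the space first, so no character is consumed and dropped. */
    while ((size_t)(p - buf) < cap - 1 && (c = getc(in)) != EOF)
        *p++ = (char)c;
    *p = '\0';
    return (size_t)(p - buf);
}
```

Each character is written in constant time, whereas strcat() has to walk the whole string to find its end on every call.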
char *string;
You've declared an uninitialised variable with this statement. With some compilers, in debug this may be initialised to 0. In other compilers and a release build, you have no idea what this is pointing to in memory. You may find that when you build and run in release, your program will crash, but appears to be ok in debug. The actual behaviour is undefined.
You need to either create a variable on the stack by doing something like this
char string[100]; // assuming you're not going to receive more than 99 characters (100 including the NULL terminator)
Or, on the heap:
char *string = malloc(100);
In which case you'll need to free the character array when you're finished with it.
Assuming you don't know how many characters the user will type, I suggest you keep track in your loop, to ensure you don't try to concatenate beyond the memory you've allocated.
Alternatively, you could limit the number of characters that a user may enter.
const int MAX_CHARS = 100;
char string[MAX_CHARS + 1]; // +1 for Null terminator
int numChars = 0;
while((numChars < MAX_CHARS) && ((c=getchar())!= EOF))
{
...
++numChars;
}
As I wrote in comments, you cannot avoid malloc() / calloc() and probably realloc() for a problem such as you have described, where your program does not know until run time how much memory it will need, and must not have any predetermined limit. In addition to the memory management issues on which most of the discussion and answers have focused, however, your code has some additional issues, including:
getchar() returns type int, and to correctly handle all possible inputs you must not convert that int to char before testing against EOF. In fact, for maximum portability you need to take considerable care in converting to char, for if default char is signed, or if its representation has certain other allowed (but rare) properties, then the value returned by getchar() may exceed its maximum value, in which case the result of a direct conversion is implementation-defined (or an implementation-defined signal is raised). (In truth, though, this issue is often ignored, usually to no ill effect in practice.)
Never pass a user-provided string to printf() as the format string. It will not do what you want for some inputs, and it can be exploited as a security vulnerability. If you want to just print a string verbatim then fputs(string, stdout) is a better choice, but you can also safely do printf("%s", string).
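The format-string point can be wrapped in a tiny helper. This is a sketch; print_verbatim is a made-up name, and the 0 / -1 return convention is an assumption.

```c
#include <stdio.h>

/* Hypothetical helper: write text to out exactly as it is. Any '%'
   in text is ordinary data here, because text is never interpreted
   as a format string. Returns 0 on success, -1 on a write error. */
int print_verbatim(FILE *out, const char *text)
{
    return fputs(text, out) == EOF ? -1 : 0;
}
```

Input such as "100%s done" comes out unchanged, whereas printf(text) would try to fetch a string argument that was never passed.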
Here's a way to approach your problem that addresses all of these issues:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#define INITIAL_BUFFER_SIZE 1024
int main()
{
char *string = malloc(INITIAL_BUFFER_SIZE);
size_t cap = INITIAL_BUFFER_SIZE;
size_t next = 0;
int c;
if (!string) {
// allocation error
return 1;
}
while ((c = getchar()) != EOF) {
if (next + 1 >= cap) {
/* insufficient space for another character plus a terminator */
cap *= 2;
string = realloc(string, cap);
if (!string) {
/* memory reallocation failure */
/* memory was leaked, but it's ok because we're about to exit */
return 1;
}
}
#if (CHAR_MAX != UCHAR_MAX)
/* char is signed; ensure defined behavior for the upcoming conversion */
if (c > CHAR_MAX) {
c -= UCHAR_MAX + 1; /* on two's complement, maps CHAR_MAX+1..UCHAR_MAX onto CHAR_MIN..-1 */
#if ((CHAR_MAX != (UCHAR_MAX >> 1)) || (CHAR_MAX == (-1 * CHAR_MIN)))
/* char's representation has more padding bits than unsigned
char's, or it is represented as sign/magnitude or ones' complement */
if (c < CHAR_MIN) {
/* not representable as a char */
return 1;
}
#endif
}
#endif
string[next++] = (char) c;
}
string[next] = '\0';
fputs(string, stdout);
return 0;
}
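The growth logic can also be packaged as a reusable function that does not leak when realloc() fails. This is a sketch, not a definitive implementation: read_all is a hypothetical name, the starting capacity of 16 is arbitrary, and the careful char conversion discussed above is left out for brevity.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read everything from in into a heap string,
   doubling the capacity whenever it runs out. If len_out is not
   NULL it receives the string length. The caller must free() the
   result. Returns NULL, with nothing leaked, if allocation fails. */
char *read_all(FILE *in, size_t *len_out)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = getc(in)) != EOF) {
        if (len + 1 >= cap) {           /* need room for c and a terminator */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);              /* the old block is still ours */
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    buf[len] = '\0';
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

Assigning the result of realloc() to a temporary first is what keeps the original pointer available for free() on failure.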
I'm working on detecting and preventing BOF attacks and I'd like to know, how can I overflow a global struct?
My code:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
struct{
char name[20];
char description[10];
} test;
int main(int argc, char **argv){
if(argc != 2)
exit(-1);
*(*(argv+1)+20) = '\x00'; //terminate string after 20 characters
strcpy(test.name, argv[1]); //no BOF here... stopped at 20
printf("%s\n", test.name);
char *desc;
desc = malloc(10);
if(!desc){
printf("Error allocating memory\n");
exit(-1);
}
scanf("%s", desc); //no bounds checking - this is where I BOF
strcpy(test.description, desc); //copy over 10 characters into 10 char buffer
printf("%s\n", test.description); //this prints out whatever I type in
//even thousands of characters, despite it having a buffer of 10 chars
}
You overflow a global buffer the same way you do any other buffer type; you store more data in it than there are bytes allocated for it. Perhaps the question is "and what damage does that do?", and the answer is the usual: it depends.
Basically, when you overflow a specific global buffer, you write over some other global variables, and what happens next depends on whether the other variable is referenced again, and what it is supposed to hold. It won't, typically, have function return addresses and the like, so it can be more difficult to exploit.
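To see which variable gets hit first, it can help to look at the layout. The sketch below mirrors the struct from the question under the hypothetical tag name record; the two helper functions are invented for the illustration.

```c
#include <stddef.h>
#include <string.h>

/* Same layout as the struct in the question; the tag is hypothetical. */
struct record {
    char name[20];
    char description[10];
};

/* Number of bytes that can be written starting at name before the
   write reaches description. */
size_t room_before_description(void)
{
    return offsetof(struct record, description)
         - offsetof(struct record, name);
}

/* Number of bytes of s, counting its terminator, that would land
   past the end of name if s were copied there with strcpy(). */
size_t spill_past_name(const char *s)
{
    size_t need = strlen(s) + 1;   /* characters plus terminator */
    size_t room = sizeof(((struct record *)0)->name);
    return need > room ? need - room : 0;
}
```

Note that a string of exactly 20 characters already spills by one byte, since its terminator needs a 21st byte.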
char *desc = malloc(10);
scanf("%s", desc); //no bounds checking - this is where I BOF
strcpy(test.description, desc); //copy over 10 characters into 10 char buffer
One thing you will need to address during testing on modern Linux systems is the calls to scanf and strcpy. Modern systems use FORTIFY_SOURCE, which tries to remediate some classes of buffer overflows.
FORTIFY_SOURCE uses "safer" variants of high risk functions like memcpy and strcpy. The compiler uses the safer variants when it can deduce the destination buffer size. If the copy would exceed the destination buffer size, then the program calls abort().
To disable FORTIFY_SOURCE for your testing, you should compile the program with -U_FORTIFY_SOURCE or -D_FORTIFY_SOURCE=0.
I'm fairly competent in a few scripting languages, but I'm finally forcing myself to learn raw C. I'm just playing around with some basic stuff (I/O right now). How can I allocate heap memory, store a string in the allocated memory, and then spit it back out? This is what I have right now; how can I make it work correctly?
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
char *toParseStr = (char*)malloc(10);
scanf("Enter a string",&toParseStr);
printf("%s",toParseStr);
return 0;
}
Currently I'm getting weird output like '8'\'.
char *toParseStr = (char*)malloc(10);
printf("Enter string here: ");
scanf("%s",toParseStr);
printf("%s",toParseStr);
free(toParseStr);
Firstly, the string passed to scanf specifies the input it's going to receive. In order to display a string before accepting keyboard input, use printf as shown.
Secondly, you don't need to take the address of toParseStr, since it already points to the 10-byte character array you allocated with malloc. If you were calling a function which would make it point to another memory location, then &toParseStr would be required.
For example, suppose you wanted to write a function to allocate memory. Then you'd need &toParseStr since you're changing the contents of the pointer variable (which is an address in memory --- you can see for yourself by printing its contents).
void AllocateString(char ** ptr_string, const int n)
{
*ptr_string = (char*)malloc(sizeof(char) * n);
}
As you can see, it accepts char ** ptr_string, which reads as a pointer to a pointer. It stores the memory location of the pointer variable that, after the malloc operation, will hold the address of the first byte of an allocated block of n bytes (before the call, that variable holds some garbage address since it is uninitialized).
int main(int argc, char *argv[])
{
char *toParseStr;
const int n = 10;
printf("Garbage: %p\n",toParseStr);
AllocateString(&toParseStr,n);
printf("Address of the first element of a contiguous array of %d bytes: %p\n",n,toParseStr);
printf("Enter string here: ");
scanf("%s",toParseStr);
printf("%s\n",toParseStr);
free(toParseStr);
return 0;
}
Thirdly, it is recommended to free memory you allocate. Even though this is your whole program, and this memory will be deallocated when the program quits, it's still good practice.
You need to give scanf a conversion format so it knows you want to read a string -- right now, you're just displaying whatever garbage happened to be in the memory you allocated. Rather than try to describe all the problems, here's some code that should at least be close to working:
char *toParseStr = malloc(10);
printf("Enter a string: ");
scanf("%9s", toParseStr);
printf("\n%s\n", toParseStr);
/* Edit, added: */
free(toParseStr);
return 0;
Edit: In this case, freeing the string doesn't make any real difference, but as others have pointed out, it is a good habit to cultivate nonetheless.
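The field width is what keeps the read inside the buffer. As a sketch (read_word and WORD_CAP are hypothetical names; a FILE * parameter is used so the same call works on stdin or any other stream):

```c
#include <stdio.h>

#define WORD_CAP 10   /* buffer size, including the terminator */

/* Hypothetical helper: read one whitespace-delimited word of at most
   WORD_CAP - 1 characters into buf. Returns 1 if a word was stored,
   0 at end of input or on error. */
int read_word(FILE *in, char buf[WORD_CAP])
{
    /* The width 9 must stay equal to WORD_CAP - 1. */
    return fscanf(in, "%9s", buf) == 1;
}
```

A longer word is not discarded: the first 9 characters are returned and the remainder is picked up by the next call.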
Using scanf() (or fscanf() on data you don't control) with a standard "%s" specifier is a near-certain way to get yourself into trouble with buffer overflows.
The classic example is that if I enter the string "This string is way more than 10 characters" into your program, chaos will ensue, cats and dogs will begin sleeping together and a naked singularity may well appear and consume the Earth (most people just state "undefined behaviour" but I think my description is better).
I actively discourage the use of functions that cannot provide protection. I would urge you (especially as a newcomer to C) to use fgets() to read your input since you can control buffer overflows with it a lot easier, and it's more suited to simple line input than scanf().
Once you have a line, you can then call sscanf() on it to your heart's content which, by the way, you don't need to do in this particular case since you're only getting a raw string anyway.
I would use:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BUFFSZ 10
int main(int argc, char *argv[]) {
char *toParseStr = malloc(BUFFSZ+2);
if (toParseStr == NULL) {
printf ("Could not allocate memory!\n");
return 1;
}
printf ("Enter a string: ");
if (fgets (toParseStr, BUFFSZ+2, stdin) == NULL) {
printf ("\nGot end of file!\n");
return 1;
}
printf("Your string was: %s",toParseStr);
if (toParseStr[strlen (toParseStr) - 1] != '\n') {
printf ("\nIn addition, your string was too long!\n");
}
free (toParseStr);
return 0;
}
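If you would rather have the trailing newline removed and a status code returned, the same fgets() pattern can be wrapped in a function. This is only a sketch; read_line is a hypothetical name and the three return values are an assumed convention.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf (cap bytes in total).
   Returns  1 if a complete line was read (its newline is removed),
            0 if no newline was seen: the line was too long for buf,
              or the input ended without a final newline,
           -1 at end of file, on error, or if cap is below 2. */
int read_line(FILE *in, char *buf, int cap)
{
    if (cap < 2 || fgets(buf, cap, in) == NULL)
        return -1;

    size_t len = strcspn(buf, "\n");   /* index of '\n', or of the terminator */
    if (buf[len] == '\n') {
        buf[len] = '\0';
        return 1;
    }
    return 0;
}
```

When 0 is returned because the line was too long, the unread part of that line is still in the stream and will be returned by the following calls.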
You don't need an & before toParseStr in scanf, as it is already a pointer.
Also call free(toParseStr) afterwards.
First, the errors that were keeping your program from working: scanf(3) takes a format string, just like printf(3), not a string to print for the user. Second, you were passing the address of the pointer toParseStr, rather than the pointer toParseStr.
I also removed the needless cast from your call to malloc(3).
An improvement that your program still needs is to use scanf(3)'s a modifier (a glibc extension; POSIX.1-2008 standardizes the same feature as m) to allocate memory for you, so that some joker putting ten characters into your string doesn't start stomping on unrelated memory. (Yes, C will let someone overwrite almost the entire address space with this program, as written. Giant security flaw. :)
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
char *toParseStr = malloc(10);
printf("Enter a short string: ");
scanf("%s",toParseStr);
printf("%s\n",toParseStr);
return 0;
}