Is it dangerous merely to read outside an array in C [duplicate] - c

This question already has answers here:
How dangerous is it to access an array out of bounds?
(12 answers)
Closed 7 years ago.
I have seen several posts on the dangers of WRITING outside of array boundaries. I was wondering though, is there any problem with READING outside of them? My reason for this is as follows:
I have commands and data in a randomly generated array, but sometimes the commands require uncertain amounts of data. Do I need to put checks in each command's subroutine so that data is not read from outside the string, or can I temporarily read from outside the array, and realloc later?

According to Annex J of the (draft) C standard (ISO 9899:201x), undefined behaviour includes:
— Addition or subtraction of a pointer into, or just beyond, an array
object and an integer type produces a result that points just beyond
the array object and is used as the operand of a unary * operator that
is evaluated (6.5.6).
— An array subscript is out of range, even if an object is apparently
accessible with the given subscript
In C, the expression a[2] is equivalent to *(a+2), so both rules apply to array subscripts as well as to explicit pointer arithmetic.
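As a rough illustration of the rules quoted above, the sketch below walks an array with a pointer. The function name sum_ints is made up for this example. It forms the "just beyond" pointer but only compares against it and never dereferences it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums a[0] .. a[n-1].
   `end` points just beyond the last element. Forming and comparing that
   pointer is allowed; dereferencing it would be undefined behaviour. */
int sum_ints(const int *a, size_t n)
{
    assert(a != NULL);
    const int *end = a + n;
    int total = 0;
    for (const int *p = a; p != end; p++)
        total += *p;            /* *p is the same object as a[p - a] */
    return total;
}
```

Every read here stays inside [a, a + n), which is what the standard requires of both the subscript form and the pointer form.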

sometimes the commands require uncertain amounts of data
Initially, an amount of memory needs to be allocated based on the best guess (using malloc or calloc).
can I temporarily read junk from outside the array, and realloc later
As the program proceeds, the previous uncertainty about the amount of data presumably gives way to certainty as to how much new memory is needed to store new data.
Then, realloc needs to be called with the newly available information.
But the program must always check the return values of the allocation routines, keep proper accounting of the valid range(s) of the memory blocks it was given, and stay within those bounds.
Otherwise the behavior is undefined.
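A minimal sketch of that bookkeeping, assuming a simple byte buffer. The names struct buffer, buffer_push and buffer_get are hypothetical, as are the starting capacity of 16 and the doubling policy. The point is only that realloc is checked and every read is tested against the valid length first.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable buffer: tracks what is valid and what is allocated. */
struct buffer {
    unsigned char *data;
    size_t len;   /* number of valid bytes */
    size_t cap;   /* number of allocated bytes */
};

/* Appends one byte, growing the block with realloc when it is full.
   Returns 0 on success, -1 on failure (the buffer is left unchanged). */
int buffer_push(struct buffer *b, unsigned char byte)
{
    assert(b != NULL);
    if (b->len == b->cap) {
        if (b->cap > SIZE_MAX / 2)
            return -1;                       /* doubling would overflow */
        size_t new_cap = b->cap ? b->cap * 2 : 16;
        unsigned char *p = realloc(b->data, new_cap);
        if (p == NULL)
            return -1;                       /* old block is still valid */
        b->data = p;
        b->cap = new_cap;
    }
    b->data[b->len++] = byte;
    return 0;
}

/* Bounds-checked read: stores byte i in *out and returns 0,
   or returns -1 when i is outside the valid range. */
int buffer_get(const struct buffer *b, size_t i, unsigned char *out)
{
    assert(b != NULL && out != NULL);
    if (i >= b->len)
        return -1;
    *out = b->data[i];
    return 0;
}
```

A command that needs more data than is present would get -1 from buffer_get and could then append or request more, instead of reading past the end and reallocating afterwards.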
Better yet, if C++ is an option, you can use the C++ standard library to do the memory management for you. Its containers allocate and expand memory as needed, so you don't have to do it yourself.

Related

How can it be that I can write/read to/from beyond the defined size of an array in C language? [duplicate]

I created an array of 5 elements, so I should only be able to index 0 to 4. Why am I able to assign to index 5 in this case?
A good answer for this may be found here: https://stackoverflow.com/a/70276640/4441211
In the link, the answer refers to the use of pointers. In your case you are writing n[5]=1, which is essentially *(n+5)=1, so it is pointer arithmetic as well.
How can it be that I can write/read to/from beyond the defined size of an array in C language?
Defining an array or any other object reserves memory for it.
It does not create any checks on source code that you write to ensure that your source code stays within that reserved space. (There are often some checks built into the operating system, but they operate on units of memory pages, not on individual objects inside your program.)
Reserving memory is merely an accounting arrangement. Your process has a bunch of memory, and the C language gives you a lot of freedom to access it. When memory is reserved for one purpose, the C implementation will not, by itself, use it for any other purpose. It is up to you to write code that uses memory correctly.

C: how stack array with variable size works and result is correct? What happened in memory?

I am confused about why the following code works and gives the correct result:
int size = 5;
int array[size];
I remember that a stack array CANNOT take its size from a variable (except a const), because the variable's value is only known at run time. This code shall report a compiler error. However, today I came across this kind of code at work, and it compiles without any problem with -std=c89, -std=c99, etc.
Assigning 5 values to it works. Assigning 6 values to it works (no segfault). Assigning 100 values to it: the end part (out of range) gets wrong values, the rest is still correct.
I am not sure: 1. Why does this compile? Did gcc change, or could it always compile this? 2. Why does this work? A stack array is not allocated at run time and its size is fixed, so why does this "pseudo-dynamic" array work? Thanks
I remember that a stack array CANNOT take its size from a variable (except a const), because the variable's value is only known at run time.
This is false. A static array cannot have a variable length, since its length must be fixed during translation time (compilation time). An automatic array can have a variable length (subject to compiler support), since it is commonly implemented by using the hardware stack, which is designed to support run-time allocations.
However, today I came across this kind of code at work, and it compiles without any problem with -std=c89, -std=c99, etc.
Support for variable-length arrays was required in C 1999. Compilers earlier than that may have supported it as an extension. It was made an optional feature in later versions of the standard, but compilers that had support previously generally retained it.
This code shall report a compiler error.
Are you saying there is a rule that the compiler shall report an error for this? What rule, why do you say that?
Assigning 5 values to it works. Assigning 6 values to it works (no segfault). Assigning 100 values to it: the end part (out of range) gets wrong values, the rest is still correct.
When you use more elements than have been allocated for an array, the behavior is not defined by the C standard or, generally, by an implementation (except when using debugging or safety features to detect or avoid overruns). Commonly, there is nothing to prevent you from trying to use six or more elements of a five-element array. Doing so may corrupt your program's memory in various ways. It may appear to work in some circumstances, as long as the overrun does not destroy other data your program needs to appear to work.
I am not sure: 1. Why does this compile? Did gcc change?
It compiled because the compiler supported variable length arrays.
Why does this work? A stack array is not allocated at run time and its size is fixed, so why does this "pseudo-dynamic" array work?
Stack arrays are generally allocated at run time.
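To make the run-time allocation visible, here is a small sketch. It assumes a compiler with variable-length array support, and the function name vla_size_in_bytes is invented for this example. For a VLA, sizeof is evaluated at run time, so the result depends on the argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo: declares a variable-length array of n ints and
   returns its size in bytes. n must be positive and small enough to
   fit on the stack. */
size_t vla_size_in_bytes(int n)
{
    assert(n > 0);
    int array[n];                 /* size fixed when this line executes */
    for (int i = 0; i < n; i++)   /* stays within 0 .. n-1 */
        array[i] = i;
    return sizeof array;          /* computed at run time for a VLA */
}
```

Calling it with 5 and then with 100 gives two different sizes from the same source line, which a compile-time-sized array could not do. Writing to array[n] or beyond would still be an overrun.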

Why is there no built-in function to find the size of a pointer array in C? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
free() only needs the pointer value to release allocated memory. Which means that C knows how big assigned memory blocks are. Then how come there isn't some built-in function to find the size of a pointer array?
I know the convention is to keep track of array sizes, but given that there already is some memory management happening natively, why don't we harness that to have a convenient size() function for arrays?
Such a function would be possible. The question is why the C standard doesn't require it.
The GNU C library implementation provides a function malloc_usable_size() that's similar to what you're suggesting. It returns the number of usable bytes for a malloc()ed pointer -- which may be greater than the requested size for various reasons.
It's not 100% clear why no such function is in the standard. It's not mentioned in the 1989 ANSI C Rationale or in the ISO C99 Rationale.
One thing to keep in mind is that an implementation doesn't need to keep track of the number of bytes requested by a malloc() call. It will often round up the request to some larger value, and it only needs to keep track of that.
For that matter, free() doesn't necessarily need to know the size of the block being deallocated. It might just call some lower-level function that doesn't provide that information. Or, for example, allocated blocks might be organized into linked lists, one list for each allocated size; free() might simply release a block from that list without having to know the size.
Finally, C programmers have gotten along without such a function for decades. Adding a requirement to provide it would impose some (probably fairly small) overhead on all implementations. I think the attitude is that you can simply remember how much memory you asked for, and use that information as needed.
If you allocate a single object:
some_type *ptr = malloc(sizeof *ptr);
then sizeof *ptr gives you the size of the object. If you allocate an array:
some_type *ptr = malloc(count * sizeof *ptr);
then sizeof *ptr only gives you the size of a single element of the allocated array -- but if you remember the value of count you can compute the total requested size easily enough.
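One common way to "remember the value of count" is to keep it next to the pointer. The following is only a sketch, and struct int_array and its two functions are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pairing of a heap array with its element count. */
struct int_array {
    int *items;
    size_t count;
};

/* Allocates `count` zero-initialized ints.
   Returns 0 on success, -1 if the allocation fails. */
int int_array_init(struct int_array *a, size_t count)
{
    assert(a != NULL);
    a->items = calloc(count, sizeof *a->items);
    if (a->items == NULL && count != 0) {
        a->count = 0;
        return -1;
    }
    a->count = count;
    return 0;
}

/* Total requested size in bytes, computed from the remembered count. */
size_t int_array_bytes(const struct int_array *a)
{
    assert(a != NULL);
    return a->count * sizeof *a->items;
}
```

The size comes from the program's own bookkeeping, so it is the requested size rather than whatever the allocator rounded it up to.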
Bottom line: The C standard could have required such a function, but it's not really necessary.
UPDATE: Kerrek SB makes an excellent point in a comment, one that I hadn't thought of. I'll take the liberty of summarizing it here.
A function that operates on an array via a pointer to its initial element (and there are a lot of such functions) shouldn't have to care how the array was allocated. The proposed size() function, like the GNU-specific malloc_usable_size(), works only when the argument points to a heap-allocated array. This means that the function either has to assume that the array is heap-allocated (and be right about that assumption!) or be given extra information. In the latter case, it might as well be given the size of the array, making size() superfluous.
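A short sketch of that point, with made-up names ARRAY_LEN and max_of: the function receives the length explicitly, so it does not care whether the array is static, automatic, or heap-allocated.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a real array. This does NOT work on a pointer,
   because sizeof would then give the size of the pointer itself. */
#define ARRAY_LEN(arr) (sizeof (arr) / sizeof (arr)[0])

/* Hypothetical helper: largest element of a[0] .. a[n-1]; n must be >= 1. */
int max_of(const int *a, size_t n)
{
    assert(a != NULL && n > 0);
    int best = a[0];
    for (size_t i = 1; i < n; i++)
        if (a[i] > best)
            best = a[i];
    return best;
}
```

For a block obtained from malloc, the caller would pass the count it used in the allocation instead of ARRAY_LEN.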
free() may use internal data to reclaim the block of memory being released, but be aware that this data does not necessarily contain the exact size passed to malloc(), calloc(), or realloc() to allocate the block. The C Standard does not specify a function to retrieve this information.
Most malloc() implementations provide a non-standard function to retrieve the available size of the allocated block: in glibc, this function is size_t malloc_usable_size(void *ptr);. Other libraries may have a different function or no function at all to retrieve this information.
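For illustration only, the sketch below calls the glibc function. The wrapper name usable_bytes_for is hypothetical. The __GLIBC__ guard means it simply reports 0 on other C libraries, where no portable equivalent exists.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef __GLIBC__
#include <malloc.h>   /* glibc: declares malloc_usable_size() */
#endif

/* Hypothetical helper: allocates `requested` bytes, asks the allocator
   how many bytes are usable, frees the block, and returns that number.
   Returns 0 if the allocation fails or the C library is not glibc. */
size_t usable_bytes_for(size_t requested)
{
    void *p = malloc(requested);
    if (p == NULL)
        return 0;
#ifdef __GLIBC__
    size_t usable = malloc_usable_size(p);
    assert(usable >= requested);   /* may be larger, never smaller */
#else
    size_t usable = 0;             /* no portable way to ask */
#endif
    free(p);
    return usable;
}
```

The value is typically larger than what was requested, so it cannot stand in for the element count of an array.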
As for a generic solution to retrieve the size of the array to which you have a pointer, this is usually not possible. In efficient implementations, pointers do not carry size information. It is possible to implement fat pointers that would carry this information, but the whole system needs to be compiled this way. Some integrated compilers such as tcc support this approach to provide runtime pointer checking.
Basically, in a typical implementation the address points into a chunk of memory that also carries metadata (such as the size of the chunk). Freeing that entry just marks the block as available (if the pointer is valid).
If the caller accesses that memory location afterward, that's undefined behaviour. So even from that point of view, free() has done its job.
free() only needs the pointer value because you can only pass it a pointer that malloc() returned. In many implementations, malloc() writes the size of the allocation just before the address it returns. When you pass the pointer to free(), free() reads that size, so it knows how much space to release. Therefore no separate function for finding the size of a pointer array was needed.

C program with pointer

Is it possible to convert any C program that uses pointers into another C program that does not contain any pointers? If yes, can we automate the process?
I read a few papers on C to Java bytecode compilation and found that a major issue was "the pointer problem". So I was thinking that if the above conversion could be done, it could be included as a preprocessing step (though that may itself be a big task), and it might then be simpler to convert to JVM bytecode...
thanks in advance
In theory you can, by simulating individual data structures or even the entire memory (static data, heap and stack) with arrays. But the question is whether this is very practical; it may involve having to rewrite every pointer-based standard library function you need.
Anyway, there's a nice explanation on Wikipedia:
It is possible to simulate pointer behavior using an index to an (normally one-dimensional) array.
Primarily for languages which do not support pointers explicitly but do support arrays, the array can be thought of and processed as if it were the entire memory range (within the scope of the particular array) and any index to it can be thought of as equivalent to a general purpose register in assembly language (that points to the individual bytes but whose actual value is relative to the start of the array, not its absolute address in memory). Assuming the array is, say, a contiguous 16 megabyte character data structure, individual bytes (or a string of contiguous bytes within the array) can be directly addressed and manipulated using the name of the array with a 31 bit unsigned integer as the simulated pointer (this is quite similar to the C arrays example shown above). Pointer arithmetic can be simulated by adding or subtracting from the index, with minimal additional overhead compared to genuine pointer arithmetic.
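A minimal sketch of the idea in the quoted passage: a singly linked list whose "pointers" are integer indices into one array. All names here (pool, NIL, list_push, list_sum) and the pool size are assumptions made for this example.

```c
#include <assert.h>

#define POOL_SIZE 8
#define NIL (-1)                 /* plays the role of NULL */

struct node {
    int value;
    int next;                    /* index of the next node, or NIL */
};

static struct node pool[POOL_SIZE];   /* the simulated memory */
static int pool_used = 0;             /* slots handed out so far */

/* Pushes `value` in front of the list starting at index `head`.
   Returns the index of the new head, or NIL if the pool is full. */
int list_push(int head, int value)
{
    assert(head == NIL || (head >= 0 && head < pool_used));
    if (pool_used >= POOL_SIZE)
        return NIL;
    int i = pool_used++;
    pool[i].value = value;
    pool[i].next = head;
    return i;
}

/* Follows the index chain from `head` and adds up the values. */
int list_sum(int head)
{
    int total = 0;
    for (int i = head; i != NIL; i = pool[i].next)
        total += pool[i].value;
    return total;
}
```

Following pool[i].next here corresponds to following p->next with real pointers. Subscripting is still defined in terms of pointers in C itself, but the same structure carries over to a target language that only has arrays.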
Pointers are rather central to C. While C may be Turing-complete without pointers, it's not practical to rewrite arbitrary C without them.
Things you can't do without pointers:
- dynamic (manual) memory allocation
- passing by reference
Given that arrays decay into pointers at the drop of a hat, you also couldn't use arrays practically, so you are left with automatic, static and global variables that cannot be arrays.
tl;dr: No

Array upper bound checking in C language [duplicate]

In a C program we can declare an array like int array[10], so it can store 10 integer values. But when I read input in a loop, it accepts more than 10 values and doesn't show any error.
What is actually happening?
#include <stdio.h>

int main(void)
{
    int array[10], i;

    /* Bug under discussion: i runs from 0 to 11, so array[10] and
       array[11] are written out of bounds. */
    for (i = 0; i <= 11; i++)
        scanf("%d", &array[i]);
    for (i = 0; i < 10; i++)
        printf("%d", array[i]);
    return 0;
}
Because C doesn't do any array bounds checking. You as a programmer are responsible for making sure that you don't index out of bounds.
Depending on the used compiler and the system the code is running on, you might read random data from memory or get a SIGSEGV eventually when reading/writing out of bounds.
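Since the check is the programmer's job, one possible shape for it is sketched below. The function name parse_ints is hypothetical, and it reads from a string rather than from standard input so that it is easy to try out. The same count < capacity guard works in a scanf loop.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parses whitespace-separated integers from `text`
   into `array`, storing at most `capacity` of them.
   Returns the number of values stored. */
size_t parse_ints(const char *text, int *array, size_t capacity)
{
    assert(text != NULL && array != NULL);
    size_t count = 0;
    int consumed = 0;
    /* The capacity test comes first, so array[count] is never touched
       once the array is full. %n records how many characters were read. */
    while (count < capacity &&
           sscanf(text, "%d%n", &array[count], &consumed) == 1) {
        text += consumed;
        count++;
    }
    return count;
}
```

With a 10-element array and 12 numbers of input, the loop stops after the tenth value and leaves the remaining input unread.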
The C compiler and the runtime are not required to perform any array bounds checking.
What you describe is an example of a whole class of programming errors that result in undefined behavior. From Wikipedia:
In computer programming, undefined behavior refers to computer code whose behavior is specified to be arbitrary.
What this means is that the program is allowed to misbehave (or not) in any way it pleases.
In practice, any of the following are reasonably likely to happen when you write past the end of an array:
The program crashes, either immediately or at a later point.
Other, unrelated, data gets overwritten. This could result in arbitrary misbehaviour and/or in serious security vulnerabilities.
Internal data structures that are used to keep track of allocated memory get corrupted by the out-of-bounds write.
The program works exactly as if more memory had been allocated in the first place (memory is often allocated in blocks, and by luck there might happen to be some spare capacity after the end of the array).
(This is not an exhaustive list.)
There exist tools, such as Valgrind, that can help discover and diagnose this type of error.
The C-language standard does not dictate how variables should be allocated in memory.
So the theoretical answer is that you are performing an unsafe memory access operation, which will lead to undefined behavior (anything could happen).
In practice, however, most compilers allocate local variables on the stack and global variables in the data section, so the practical answer is:
In the case of a local array, you will either overwrite some other local variable or perform an illegal memory access operation.
In the case of a global array, you will either overwrite some other global variable or perform an illegal memory access operation.
