Iterate through memory using pointers [closed] - c

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I am new to C; I have more background knowledge with Java.
I want to try searching for a value in memory using a pointer:
#include <stdio.h>
void main(){
int value = 10;
find(value);
}
void find(int value){ // will change this to return array of long or int depending on which is better
int* intPointer;
for(int i = 40000; i < 40400 ; i+= 32){ /* not sure if i do it like this or another way,
* the starting value, ending value and condition
* is just for testing purpose */
intPointer = (int*)i;
if(*intPointer == value){
printf("found value"); //will actually be storing it to an array instead
}
}
}
The find method gives me a segmentation fault, most likely because the address does not store an int value. Is there a way I can find out what type of data is stored at a memory address?
Is there a better way of achieving a similar task? Please explain and show. The actual program will not have int value = ??; instead, only the find method will be used to get an array of the addresses that contain this value.
Also, what is the best type for storing addresses: int or long?

A couple things to know right off the bat:
Memory is not always guaranteed to be physically contiguous (even if the memory model appears to be as such).
Knowing what a segfault is would be helpful, as would knowing what segmented memory and memory protection are in general.
The find method gives me a segmentation fault most likely because the address does not store int value
From what I can tell you are trying to loop through memory addresses and find a specific value. Technically it's possible, but practically there's no reason to do it except a few rare cases.
Unfortunately (for your purposes), your operating system of choice handles memory rather intelligently: instead of just placing everything it is given in linear order, it divides (or segments) memory so that each process gets its own address space, which keeps the memory of process A from interfering with the memory of process B, and so forth. To keep track of which processes are where in memory, a mapping mechanism is used.
You tried to access memory outside of your program's assigned segment. To be able to do what it seems you want to do, you're going to need to find a way to get the memory map from the OS.
is there a way I can find out what type data is stored in the memory address
No, at least not in the way you want. A couple SO answers address this.
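On Linux specifically, one place a process can read its own memory map is the text file /proc/self/maps, where each line starts with an address range and a permission string. The following is a minimal sketch under that assumption; the helper name count_readable_regions is hypothetical, and other operating systems expose this information through different interfaces or not at all.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the readable regions listed in /proc/self/maps
 * (Linux-specific). Returns -1 if the file cannot be opened, for example on
 * an OS that has no such file. */
int count_readable_regions(void)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    char line[1024];
    int readable = 0;
    while (fgets(line, sizeof line, maps) != NULL) {
        unsigned long start, end;
        char perms[5];
        /* Each line begins "start-end perms ...", with addresses in hex. */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) == 3
            && perms[0] == 'r') {
            assert(start < end);
            readable++;
        }
    }
    fclose(maps);
    return readable;
}
```

Only addresses inside a range whose permission string starts with r can be dereferenced for reading without a fault; anything outside those ranges is what produced the segmentation fault in the question.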

There is no way to do what you are trying in a generic, portable way. The code has undefined behavior because it accesses memory that was not allocated to you.
On simple embedded systems it may be possible. To start with, it requires full insight into the system's memory layout and how the compiler works. For instance, you must know whether int is always 4-byte aligned in memory and the start and end addresses of RAM. If an MMU is present and active, you also need to know how it works.
is there a way I can find out what type data is stored in the memory address.
No

You are faking a pointer with a temporary var inside the loop.
The correct way to search for a value would be to have two arguments: the value to search for, and a collection (array, for example) containing some values to be searched.
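A minimal sketch of that two-argument approach follows. The function name find_addresses and its parameters are hypothetical. It only scans memory the caller owns, and it records each match as a uintptr_t, an unsigned integer type from <stdint.h> intended for holding pointer values, which is usually a safer choice than int or long for storing addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scans `count` ints starting at `data` and stores the
 * address of each element equal to `value` in `out` (at most `max_out`
 * entries). Returns the total number of matches found. */
size_t find_addresses(const int *data, size_t count, int value,
                      uintptr_t *out, size_t max_out)
{
    assert(data != NULL || count == 0);
    size_t found = 0;
    for (size_t i = 0; i < count; i++) {
        if (data[i] == value) {
            if (found < max_out)
                out[found] = (uintptr_t)&data[i];
            found++;
        }
    }
    return found;
}
```

A caller would pass an array it declared or allocated itself, its element count, the value to look for, and a result buffer; the return value says how many matches exist even if the result buffer was too small to hold them all.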


Can I find how much stack memory is currently used in a Mac thread?

When I try to google this, all I find is stuff about getting and setting the stack limit, such as -[NSThread stackSize], but that's NOT what I want. I want to know how much memory is actually in use on the stack in the current thread, or equivalently how much stack space remains available.
I'm hoping to figure out a stack overflow in a crash report submitted by a user. In my previous experience, a stack overflow has usually been caused by an infinite recursion, but not this time. So I'm wondering if some of my C++ functions are really using a heck of a lot more stack space than they should.
A comment suggested that I get the stack pointer at the start of the thread, and compare its value later. I happened across the question Print out value of stack pointer. It has several answers:
(The accepted answer) Take the address of a local variable.
Use a little assembly language to get the value of the stack pointer register.
Use the function __builtin_frame_address(0) in GCC or Clang.
I tried those techniques (Apple Clang, macOS 11.2). Methods 2 and 3 produced similar results, but method 1 produced absurdly different results. For one thing, method 1 gives values that increase as you go deeper into a call chain, while the others give values that decrease. What's up with this, are there two different kinds of stacks?
If you are trying to do that, I guess you want to know how much memory you are using so that you can estimate the optimum number of threads of some kind to create.
The answer is not easy, as you normally don't have access to the stack pointer. But I'll try to devise a solution for you that does not require access to the stack pointer, although it does require a global variable per thread.
The idea is to force a parameter to be on the stack. Even if the ABI on your system uses registers to pass parameters, taking the address of a parameter (the actual parameter variable) forces it into memory. You save that address in a global variable, and later call a second function that also takes a parameter (the type doesn't matter, as you are only going to use its address) and compares the two addresses:
#include <stddef.h>
static char *initial_stack_pseudo_addr;
void save_initial_stack(char dumb)
{
    /* the & operator forces dumb to be implemented in the stack */
    initial_stack_pseudo_addr = &dumb;
}
size_t how_much_stack(char dumb)
{
    /* assumes the stack grows toward lower addresses */
    return (size_t)(initial_stack_pseudo_addr - &dumb);
}
So when you start the thread, you call save_initial_stack(0);. When you want to know how much stack you have consumed, you can do the following:
size_t stack_size = how_much_stack(0);
printf("at this point I have %zu bytes of stack\n", stack_size);
Basically, what you have done is calculate how many bytes lie between the address of the parameter in the call to save_initial_stack() and the address of the parameter in the call you make now to get the stack size. This is approximate, but the stack changes too quickly to get a precise figure.
The following example will illustrate the thing. A recursive function is called after setting the initial pointer value, then at each recursive call the current size of the stack (approximate) is computed and printed, and a new recursive call is made. The program should run until the process gets a stack overflow.
#include <stdio.h>
char *stack_at_start;
void save_stack_pointer(char dumb)
{
stack_at_start = &dumb;
}
size_t get_stack_size(char dumb)
{
return stack_at_start - &dumb;
}
void recursive()
{
printf("Stack size: %zu\n", get_stack_size(0));
recursive();
}
int main()
{
save_stack_pointer(0);
recursive();
}
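A bounded variant of the same idea may be easier to experiment with, since it stops at a chosen depth instead of running into a stack overflow. This is only a sketch: the function names are hypothetical, the addresses are compared as uintptr_t values so the code does not depend on which direction the stack grows, and the result is an approximation. Tools that relocate local variables (some sanitizer modes do this) can make address-of-local measurements misleading.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uintptr_t stack_base; /* address of a local recorded at thread start */

/* Record the address of a local variable as the reference point. */
void mark_stack_base(void)
{
    volatile char marker = 0;
    stack_base = (uintptr_t)&marker;
}

/* Approximate distance in bytes between the reference point and a local
 * in the current frame, whichever direction the stack grows. */
size_t stack_distance(void)
{
    volatile char marker = 0;
    uintptr_t here = (uintptr_t)&marker;
    return (size_t)(here > stack_base ? here - stack_base : stack_base - here);
}

/* Recurse `depth` levels, then report the distance at the deepest frame. */
size_t stack_used_at_depth(int depth)
{
    volatile char pad[256]; /* gives every frame a visible size */
    assert(depth >= 0);
    pad[0] = (char)depth;
    if (depth == 0)
        return stack_distance();
    size_t used = stack_used_at_depth(depth - 1);
    pad[1] = (char)used; /* work after the call, so it is not a tail call */
    return used;
}
```

Calling mark_stack_base() once at the start of the thread and then stack_used_at_depth() with increasing depths should show the reported distance growing by roughly one frame size per level.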

What is the origin of garbage value? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
When we define a variable, and do not initialize it, the block of memory allocated to the variable still contains a value from previous programs, known as garbage value. But suppose, in a hypothetical case, a block of unused memory is present in the system, and when I declare and define a variable, that block of memory is allocated to the variable. If I do not initialize the variable, and try to print the value of the variable, the system doesn't have any garbage value to print. What will be the result? What will the system do?
When we define a variable, and do not initialize it, the block of memory allocated to the variable still contains a value from previous programs, known as garbage value.
If I do not initialize the variable, and try to print it, it doesn't have any garbage value to print.
C does not specify these behaviors. There is no specified garbage value.
If code attempts to print (or use) the value of an uninitialized object, the result is undefined behavior (UB). Anything may happen: a trap occurs, the value is 42, the code dies, anything.
There is a special case if the uninitialized object is an unsigned char: reading it yields an indeterminate value, just something in the range [0...UCHAR_MAX], but no UB. This is the closest thing to a garbage value that C specifies.
Firstly, the C standard does not define precisely how an implementation behaves when an uninitialised read is made, merely that the value is not defined. The system may use whatever method it wishes to choose the value. It is even possible, I believe, that the value is a trap representation and will crash the program.
However, on most real modern OSes, the data in reality comes from fresh pages that get mapped into the program's address space. For security reasons, most kernels explicitly ensure these are zeroed out, to prevent software from spying on data left in memory by programs that ran previously.
However, some OSes, as you say, will just leave this data in, meaning the page is either fresh and usually zeroed out or contains arbitrary data from previous programs (or even potentially arbitrary data defined by how the memory starts up, but at least with DRAM, that is generally in a zeroed state).
I think you need more of a hardware perspective.
What is memory? One example of memory is made up of transistors and capacitors. A transistor and a capacitor make a memory bit. A bit has the value 0 or 1; a hypothetical scenario in which this bit value does not exist cannot occur ;) as it has to hold either 0 or 1 and nothing else. If you think there is "nothing" in a bit, that means the hardware (transistor/capacitor) you are imagining is not working.
A bunch of bits makes a byte or word. A bunch of bytes holds an integer, a float or whatever variable you define. So even without initializing the variable, it contains 0's and 1's in each of the memory cells. When you access this, it's called garbage.
But suppose, in a hypothetical case, a block of unused memory is present in the system, and when I declare and define a variable, that block of memory is allocated to the variable. If I do not initialize the variable, and try to print it, it doesn't has any garbage value to print. What will it do?
Any given memory location has some value, no matter how it got there. The "garbage" value doesn't have to come from a program that ran in that space previously, it could just be the initial state of the memory location when the system starts up. The reason it's "garbage" is that you don't know what it is -- you didn't put it there, you don't have any idea how it got there, and you don't really care.
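To stay within behavior that C does define, the usual approach is to make sure an object has a value before reading it. The sketch below (helper names are hypothetical) relies on two guarantees: objects with static storage duration are initialized to zero when no initializer is given, and calloc returns memory with all bits zero. It inspects the bytes through unsigned char, the type C permits for looking at any object's representation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int zeroed_by_default; /* static storage: initialized to 0 by the language */

/* Counts nonzero bytes in an object by reading it as unsigned char. */
size_t count_nonzero_bytes(const void *object, size_t size)
{
    assert(object != NULL || size == 0);
    const unsigned char *bytes = object;
    size_t nonzero = 0;
    for (size_t i = 0; i < size; i++)
        if (bytes[i] != 0)
            nonzero++;
    return nonzero;
}

/* Returns 1 if a calloc'd buffer of `size` bytes is all zero, 0 if not,
 * and -1 if the allocation failed. */
int calloc_is_zeroed(size_t size)
{
    unsigned char *buffer = calloc(size, 1);
    if (buffer == NULL)
        return -1;
    int all_zero = (count_nonzero_bytes(buffer, size) == 0);
    free(buffer);
    return all_zero;
}
```

An automatic (local) variable without an initializer and memory from malloc get neither guarantee, which is why their contents are the "garbage" discussed above.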

How can I place an array at a specific address at boot time [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I am writing a small hobby OS as a learning experience. It's aimed at a 32-bit x86 architecture.
I am at the point where I need to create an initial page_directory so I can enable paging. At this point paging (and thus VM) is not enabled.
I have a function that reserves 4kb of unused memory and returns the starting address of this memory block.
I want to create an array, page_dir (consisting of 1024 int), at the memory location returned by the function described above.
I understand the basics of pointers (I think), but I can't figure out how to do this.
How can I define the array page_table at the physical address returned during runtime?
If I understood correctly, you want to treat an address returned by a function as the base address of an array of ints.
If that assumption is correct, there are two ways to do it: a cast at each access, or an intermediate pointer variable.
Using a cast:
void *pd = GetPhysicalAddress();
...
for (int i = 0; i < 1024; i++)
    ((int *)pd)[i] = SomeValue(); // cast for each access
Or:
int *pd = (int *)GetPhysicalAddress(); // cast only on assignment
...
for (int i = 0; i < 1024; i++)
    pd[i] = SomeValue();
In general, you cannot do this for an actual physical address, but you can use mmap to get a pointer to memory at a specified virtual address. Mapping physical addresses such as device specific memory is usually done in device drivers using operating system specific APIs.
EDIT: With the extra information you provided, this is not the general case!
In order to have a pointer to a physical address before the paging is even set up, I guess you can just use this:
p = (void*)0x00010000;
Or whatever actual physical address you want to use.
Even if paging is not set up, you may already be in protected mode with segmentation, so it really depends on how your DS segment is set up.
I suggest you study the bootstrap of actual operating systems, or just the bootloader that executes in the mode you are referring to.
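Putting the pieces together for the page directory case, a sketch might look like the following. The function name and macros are hypothetical, and the block is assumed to be 4 KiB in size, 4 KiB-aligned and otherwise unused, as described in the question. The two flag bits are the present bit (bit 0) and the read/write bit (bit 1) of a 32-bit x86 page directory entry. A fixed-width uint32_t is used for the entries instead of int so the entry size does not depend on the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_DIR_ENTRIES 1024u
#define PDE_PRESENT      0x1u  /* bit 0 of an x86 page directory entry */
#define PDE_WRITABLE     0x2u  /* bit 1 of an x86 page directory entry */

/* Treats `block` (assumed: 4 KiB, 4 KiB-aligned, otherwise unused) as an
 * array of 1024 32-bit entries and marks every entry writable but not
 * present. */
uint32_t *init_page_directory(void *block)
{
    assert(block != NULL); /* hosted-only check; a kernel would handle this itself */
    uint32_t *page_dir = block; /* void * converts to uint32_t * without a cast in C */
    for (size_t i = 0; i < PAGE_DIR_ENTRIES; i++)
        page_dir[i] = PDE_WRITABLE; /* PDE_PRESENT deliberately left clear */
    return page_dir;
}
```

In the hobby OS, the argument would be the address returned by the function that reserves the 4 KiB block; on a hosted system the same code can be exercised against an ordinary array of 1024 uint32_t.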

Is memcpy of array in C Vaxocentrist? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Is a memcpy of a chunk of one array to another in C guilty of Vaxocentrism?
Example:
double A[10];
double B[10];
// ... do stuff ...
// copy elements 3 to 7 in A to elements 2 to 6 in B
memcpy(B+2, A+3, 5*sizeof(double));
As a related question, is casting from an array to a pointer Vaxocentrist?
char A[10];
char* B = (char*)A;
B[0]=2;
A[1]=3;
B[2]=5;
I certainly appreciate the idea of writing code that works under different machine architectures and different compilers, but if I applied type safety to the extreme it would cripple many of C's useful features! How much / little can I assume about how the compiler implements arrays/pointers/etc.?
No. The model on which memcpy works is defined in the abstract machine specified by the C language standard and has nothing to do with any particular physical machine it might be running on. In particular, all objects in C have a representation which is defined as an overlaid array of type unsigned char[sizeof object], and memcpy works on this representation.
Likewise, the 'decay' of arrays to pointers via cast or implicit conversion is completely defined on the abstract machine and has nothing to do with physical machines.
Further, none of the points 1-14 in the linked article have anything to do with the code you're asking about.
In C code, memcpy() can be a useful optimization in a couple of cases. First, if the array of memory is very small then the copy operation can often be inlined directly by the compiler instead of calling a function. This can be a big win in a tight loop that runs a lot. Second, in the case where the array is very large and the hardware supports a faster mode of memory access for certain aligned memory cases then that faster code can be used for the vast majority of the memory. You honestly do not want to know the scary details of alignment and copy operations for different hardware, better to just put that stuff in memcpy() and let everyone use it.
For your first example, B+2 and A+3 are valid pointer arithmetic and refer to the same addresses as &B[2] and &A[3]; the indexed form below just makes the element being pointed to explicit. The copy is safe because both arrays are of size 10, array elements are stored contiguously starting from element 0, and the copy does not go outside the bounds of either A or the destination array B.
memcpy(&B[2], &A[3], 5*sizeof(double));
On your related point, the cast is likewise unnecessary, since A already converts to a pointer to its first element; you can write the same thing explicitly as follows:
char A[10];
char* B = &A[0];
B[0]=2;
A[1]=3;
B[2]=5;
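As a small illustration of the sub-array copy from the question, the sketch below wraps memcpy in a hypothetical helper, copy_doubles, and uses both spellings of the element address on purpose: pointer arithmetic for the destination and the indexed form for the source. Both are defined by the language itself and do not depend on the machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `n` doubles from src[src_start..] to dst[dst_start..].
 * The caller must ensure both ranges are in bounds and do not overlap
 * (memcpy requires non-overlapping regions; memmove does not). */
void copy_doubles(double *dst, size_t dst_start,
                  const double *src, size_t src_start, size_t n)
{
    assert(dst != NULL && src != NULL);
    /* dst + dst_start and &dst[dst_start] denote the same address */
    memcpy(dst + dst_start, &src[src_start], n * sizeof(double));
}
```

With the arrays from the question, copy_doubles(B, 2, A, 3, 5) copies elements 3 to 7 of A into elements 2 to 6 of B and leaves the rest of B untouched.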

Why can I write and read memory when I haven't allocated space?

I'm trying to build my own Hash Table in C from scratch as an exercise and I'm doing one little step at a time. But I'm having a little issue...
I'm declaring the Hash Table structure as a pointer so I can initialize it with the size I want and increase its size whenever the load factor is high.
The problem is that I'm creating a table with only 2 elements (it's just for testing purposes), I'm allocating memory for just those 2 elements but I'm still able to write to memory locations that I shouldn't. And I also can read memory locations that I haven't written to.
Here's my current code:
#include <stdio.h>
#include <stdlib.h>
#define HASHSIZE 2
typedef char *HashKey;
typedef int HashValue;
typedef struct sHashTable {
HashKey key;
HashValue value;
} HashEntry;
typedef HashEntry *HashTable;
void hashInsert(HashTable table, HashKey key, HashValue value) {
}
void hashInitialize(HashTable *table, int tabSize) {
*table = malloc(sizeof(HashEntry) * tabSize);
if(!*table) {
perror("malloc");
exit(1);
}
(*table)[0].key = "ABC";
(*table)[0].value = 45;
(*table)[1].key = "XYZ";
(*table)[1].value = 82;
(*table)[2].key = "JKL";
(*table)[2].value = 13;
}
int main(void) {
HashTable t1 = NULL;
hashInitialize(&t1, HASHSIZE);
printf("PAIR(%d): %s, %d\n", 0, t1[0].key, t1[0].value);
printf("PAIR(%d): %s, %d\n", 1, t1[1].key, t1[1].value);
printf("PAIR(%d): %s, %d\n", 2, t1[2].key, t1[2].value);
printf("PAIR(%d): %s, %d\n", 3, t1[3].key, t1[3].value);
return 0;
}
You can easily see that I haven't allocated space for (*table)[2].key = "JKL"; nor (*table)[2].value = 13;. I also shouldn't be able to read the memory locations in the last 2 printfs in main().
Can someone please explain this to me and if I can/should do anything about it?
EDIT:
Ok, I've realized a few things about my code above, which is a mess... But I have a class right now and can't update my question. I'll update this when I have the time. Sorry about that.
EDIT 2:
I'm sorry, but I shouldn't have posted this question because I don't want my code like I posted above. I want to do things slightly differently, which makes this question a bit irrelevant. So, I'm just going to assume this was a question that I needed an answer for and accept one of the correct answers below. I'll then post my proper questions...
Just don't do it, it's undefined behavior.
It might accidentally work because you write/read some memory the program doesn't actually use. Or it can lead to heap corruption because you overwrite metadata used by the heap manager for its purposes. Or you can overwrite some other unrelated variable and then have a hard time debugging the program that goes nuts because of that. Or anything else harmful - either obvious or subtle yet severe - can happen.
Just don't do it - only read/write memory you legally allocated.
Generally speaking (implementations differ between platforms), when malloc or a similar heap-based allocation call is made, the underlying library may translate it into a system call. When the library does that, it generally allocates space in regions that are equal to or larger than the amount the program requested.
Such an arrangement avoids frequent system calls to the kernel for allocation and lets program requests for heap memory be satisfied faster (this is certainly not the only reason; other reasons may exist as well).
The fallout of such an arrangement is the problem that you are observing. Once again, it's not guaranteed that your program will be able to write to a non-allocated zone without crashing or seg-faulting every time; that depends on the particular binary's memory arrangement. Try writing to an even higher array offset and your program will eventually fault.
As for what you should or should not do, the people who have responded above have summarized it fairly well. I have no better answer except that such issues should be prevented, and that can only be done by being careful while allocating memory.
One way of understanding this is through a crude example: when you request 1 byte in userspace, the kernel has to allocate at least a whole page (which would be 4 KB on some Linux systems, for example; this is the most granular allocation at the kernel level). To improve efficiency by reducing frequent calls, the kernel assigns this whole page to the calling library, which the library can hand out as more requests come in. Thus, writes or reads to such a region may not necessarily generate a fault. It would just mean garbage.
In C, you can read from any address that is mapped, and you can write to any address that is mapped to a page with read-write permissions.
In practice, the OS gives a process memory in chunks (pages), commonly 4K or 8K in size (this is OS-dependent). The C library then manages these pages and maintains lists of what is free and what is allocated, giving the user addresses of these blocks when asked to with malloc.
So when you get a pointer back from malloc(), you are pointing to an area within such a page that is read-writable. This area may contain garbage, or it may contain other malloc'd memory, it may contain the memory used for stack variables, or it may even contain the memory used by the C library to manage the lists of free/allocated memory!
So you can imagine that writing to addresses beyond the range you have malloc'ed can really cause problems:
Corruption of other malloc'ed data
Corruption of stack variables, or the call stack itself, causing crashes when a function returns
Corruption of the C-library's malloc/free management memory, causing crashes when malloc() or free() are called
All of which are a real pain to debug, because the crash usually occurs much later than when the corruption occurred.
Only when you read from or write to an address that does not correspond to a mapped page will you get a crash, e.g. reading from address 0x0 (NULL).
malloc, free and pointers are very fragile in C (and to a slightly lesser degree in C++), and it is very easy to shoot yourself in the foot accidentally.
There are many third-party tools for memory checking that wrap each memory allocation/free/access with checking code. They do tend to slow your program down, depending on how much checking is applied.
Think of memory as being a great big blackboard divided into little squares. Writing to a memory location is equivalent to erasing a square and writing a new value there. The purpose of malloc generally isn't to bring memory (blackboard squares) into existence; rather, it's to identify an area of memory (group of squares) that's not being used for anything else, and take some action to ensure that it won't be used for anything else until further notice. Historically, it was pretty common for microprocessors to expose all of the system's memory to an application. A piece of code `Foo` could in theory pick an arbitrary address and store its data there, but with a couple of major caveats:
Some other code `Bar` might have previously stored something there with the expectation that it would remain. If `Bar` reads that location expecting to get back what it wrote, it will erroneously interpret the value written by `Foo` as its own. For example, if `Bar` had stored the number of widgets that were received (23), and `Foo` stored the value 57, the earlier code would then believe it had received 57 widgets.
If `Foo` expects the data it writes to remain for any significant length of time, its data might get overwritten by some other code (basically the flip-side of the above).
Newer systems include more monitoring to keep track of what processes own what areas of memory, and kill off processes that access memory that they don't own. In many such systems, each process will often start with a small blackboard and, if attempts are made to malloc more squares than are available, processes can be given new chunks of blackboard area as needed. Nonetheless, there will often be some blackboard area available to each process which hasn't yet been reserved for any particular purposes. Code could in theory use such areas to store information without bothering to allocate it first, and such code would work if nothing happened to use the memory for any other purpose, but there would be no guarantee that such memory areas wouldn't be used for some other purpose at some unexpected time.
Usually malloc will allocate more memory than you request, for alignment purposes. Also, the process really does have read/write access to the heap memory region. So reading a few bytes outside of the allocated region seldom triggers any errors.
But still you should not do it. Since the memory you're writing to can be regarded as unoccupied or is in fact occupied by others, anything can happen, e.g. the 2nd and 3rd key/value pairs may become garbage later, or an unrelated but vital function may crash due to invalid data you've stomped onto its malloc-ed memory.
(Also, either use char[≥4] as the type of key or malloc the key, because if the key is unfortunately stored on the stack it will become invalid later.)
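One way to make out-of-range accesses fail loudly instead of silently touching memory that was never allocated is to store the capacity next to the pointer and check every index against it. The sketch below uses hypothetical names (Entry, Table, table_init, table_set, table_free) and not the question's own types; it is an illustration of the bounds check, not a complete hash table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    const char *key;
    int value;
} Entry;

typedef struct {
    Entry *entries;   /* heap block of `capacity` entries */
    size_t capacity;  /* remembered so every access can be checked */
} Table;

/* Allocates a zeroed table; returns false if the allocation fails. */
bool table_init(Table *table, size_t capacity)
{
    assert(table != NULL);
    table->entries = calloc(capacity, sizeof *table->entries);
    table->capacity = (table->entries != NULL) ? capacity : 0;
    return table->entries != NULL;
}

/* Stores key/value at `index`, refusing any index outside the allocation. */
bool table_set(Table *table, size_t index, const char *key, int value)
{
    if (index >= table->capacity)
        return false;
    table->entries[index].key = key;
    table->entries[index].value = value;
    return true;
}

/* Releases the entries and resets the table to an empty state. */
void table_free(Table *table)
{
    free(table->entries);
    table->entries = NULL;
    table->capacity = 0;
}
```

With a table created by table_init(&t, 2), storing at index 0 or 1 succeeds, while table_set(&t, 2, "JKL", 13) returns false and leaves memory outside the allocation alone.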
