Is it feasible for a general purpose programming language to not have a heap? [closed] - heap-memory

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion.
Closed 11 years ago.
I'm looking into creating a programming language. What I'm wondering is, in a language that contains a reference-like construct, is it feasible to not have a new/malloc operator? That is, all variables are stored either on the stack somewhere or are statically allocated.
The idea is that you get better type safety, as well as "free garbage collection" without actually having a garbage collector.
I'm not familiar with too many scripting languages, so if one already does this, feel free to point it out.
(Dynamic / unknown size data structures would be handled by a dynamic list structure, which would be handled (obviously) on the heap, behind the user's back.)

Fortran was always quite a "general purpose" language, but out of the box it had no support for any kind of dynamic memory allocation.
A usual practice was to allocate a big array statically and simulate your own memory management on top of it.
If what you're looking for is a way to get rid of both GC and manual memory management, then region analysis can help, but only in a few specific cases.
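A hedged sketch of the "big static array" approach described above: a trivial bump allocator over a statically allocated pool. The names pool_alloc and pool_reset are hypothetical, not taken from any particular Fortran runtime.

#include <stddef.h>
#include <stdio.h>

#define POOL_SIZE (1u << 20)            /* 1 MiB, statically allocated */

static unsigned char pool[POOL_SIZE];   /* the "big array" */
static size_t pool_used = 0;

/* Hand out the next chunk of the pool. Individual chunks cannot be freed;
   the whole pool is reset at once (typical of region-style allocation). */
static void *pool_alloc(size_t size) {
    size = (size + 15) & ~(size_t)15;   /* keep 16-byte alignment */
    if (pool_used + size > POOL_SIZE) return NULL;
    void *p = &pool[pool_used];
    pool_used += size;
    return p;
}

static void pool_reset(void) { pool_used = 0; }

int main(void) {
    double *xs = pool_alloc(100 * sizeof *xs);
    printf("allocated at %p, %zu bytes used\n", (void *)xs, pool_used);
    pool_reset();
    return 0;
}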

Region-based memory management was one approach for not having a heap managed in the traditional sense. This manifested in languages like FX and MLKit.

There is no requirement at all that you implement a stack or a heap. C does not specify a stack either, for example. In fact, in many languages you don't even need to care: you just ask the implementation (a compiler, an interpreter, or whatever) to make room for a variable, and perhaps say for how long it is needed.
Your language's interpreter (assuming there is one) could do int main(void) { char memory[1048576]; run_script_from_stdin_using(memory); }. You could even call mmap(2) to get an anonymous block of memory and stash your variables in that. It simply does not matter where objects live; "stack" and "heap" are terms of questionable meaning here, given that the two roles are often interchangeable.
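A hedged sketch of the mmap(2) variant (POSIX). run_script_from_stdin_using is the hypothetical interpreter entry point from the answer, stubbed out here so the example links:

#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>

#define MEMORY_SIZE 1048576

/* Hypothetical interpreter entry point: it treats `memory` as the only
   storage the scripting language ever sees. */
static void run_script_from_stdin_using(char *memory) {
    (void)memory;                       /* real interpreter logic would go here */
}

int main(void) {
    /* An anonymous, private mapping: just a block of zeroed bytes from the OS. */
    char *memory = mmap(NULL, MEMORY_SIZE, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (memory == MAP_FAILED) { perror("mmap"); return EXIT_FAILURE; }

    run_script_from_stdin_using(memory);

    munmap(memory, MEMORY_SIZE);
    return 0;
}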

You can allocate objects on the stack if you know they won't be referenced after the method terminates. That means the object is used solely within the method (e.g. a temp object), or solely within methods reached through nested invocations. This is, however, a severe restriction. There are some dynamic optimizations that go in this direction (escape analysis for temp objects, at least). You could also have a static checking mechanism that enforces this restriction, or possibly distinguish between heap and stack objects with types...
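A hedged, C-flavoured sketch of that restriction: an object that never escapes can live on the stack, and an object that must outlive the call can use caller-provided storage instead of the heap (the names Point, squared_length, and make_point are illustrative only):

#include <stdio.h>

typedef struct { double x, y; } Point;

/* Fine on the stack: `tmp` is a temporary that never escapes this function. */
static double squared_length(double x, double y) {
    Point tmp = { x, y };              /* automatic (stack) storage */
    return tmp.x * tmp.x + tmp.y * tmp.y;
}

/* If the object must outlive the call, have the caller provide the storage
   instead of returning the address of a local (which would dangle). */
static void make_point(Point *out, double x, double y) {
    out->x = x;
    out->y = y;
}

int main(void) {
    Point p;                           /* lives in main's stack frame */
    make_point(&p, 3.0, 4.0);
    printf("%f\n", squared_length(p.x, p.y));
    return 0;
}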

Related

Should C compilers immediately free "further unused" memories? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
The community reviewed whether to reopen this question 1 year ago and left it closed as opinion-based.
Context: while compiling some C code, compilers may show high RAM consumption. Preliminary investigation shows that (at least some) C compilers do not immediately free "further unused" memory: even though such (previously allocated) memory is no longer used, it is still kept in RAM. The C compiler continues processing the C code, allocating more memory, until it reaches OOM (out of memory).
The core question: should C compilers immediately free "further unused" memory?
Rationale:
Efficient RAM utilization: if mem_X is no longer needed, free mem_X so that other processes (the compiler itself included) can use it.
Ability to compile "RAM-demanding" C code.
UPD20210825. I've memory-profiled some C compiler and found that it keeps "C preprocessor data" in RAM, in particular:
the macro table (a memory pool for macros);
scanner token objects (a memory pool for tokens and for lists).
At a certain point X in the middle end (after the IR is built) these objects no longer seem to be needed and hence could be freed. (Currently, however, they are kept in RAM until a later point X+1.) The benefit shows up on "preprocessor-heavy" C programs. Example: a "preprocessor-heavy" C program using "ad hoc polymorphism" implemented via the C preprocessor (a set of macros progressively generates all the "machinery" needed to support a common interface for an arbitrary, supported set of individually specified types). The number of "polymorphic" entries is ~50k * 12 = ~600k (yes, that figure alone doesn't say much). Results:
before the fix: at point X the C compiler keeps ~1.5GB of unused "C preprocessor data" in RAM;
after the fix: at point X the C compiler frees ~1.5GB of unused "C preprocessor data" from RAM, letting OS processes (itself included) use those ~1.5GB.
I don't know where you get your analysis from. Most data structures, like the abstract syntax tree, are kept because they are used across all the different passes.
It might be that some compilers, especially simple ones, don't free things because it isn't considered necessary for a C compiler: compiling a translation unit is a one-shot operation, and then the process ends.
Of course, if you build a compiler library, as tinycc did, you need to free everything, but even that might happen in a final custom heap clearance at the end of the compilation run.
I have never seen this be a problem in the real world. But I don't do embedded work, where a lack of resources can be something to worry about.
allocating more memory, until it reaches OOM (out of memory).
None of the compilers I use ever runs out of memory. Please give an example of such behaviour.
If you are an Arduino user thinking about code that will not fit into memory, that is not a problem of the compiler, only of the programmer.

Why does C not require a garbage collector? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
My understanding of this has come down to C's origins as a "portable assembler" and the option of less overhead. Is that all there is to it?
First of all, let's be clear about what garbage is.
The Java definition of garbage is objects that are no longer reachable. The precise meaning of reachable is a bit abstruse, but a practical definition is that if you can get to an object by following references (pointers) from well known places like thread stacks or static variables, then it may be reachable. (In practice, some imprecision is OK, so long as objects that are reachable don't get deleted.)
You could try to apply the same definition to C and C++. An object is garbage if it cannot be reached.
However, the practical problem with this definition ... and garbage collection ... in C or C++ is whether a "pointer like" value is actually a valid pointer. For instance:
An uninitialized C variable can contain a random value that looks like a pointer to an object.
When a C union type overlays a pointer with a long, a garbage collector cannot be sure whether the union contains one or the other ... or both.
When C application code "compresses" pointers to word-aligned heap nodes by dividing them by 4 or 8, a garbage collector won't detect them as "pointer like". Or if it does, it will misinterpret them.
A similar issue arises when C application code represents pointers as offsets relative to something else.
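A hedged illustration of the union case from the list above; at runtime nothing tells a conservative collector which member is live:

#include <stdlib.h>

/* Overlays a pointer with an integer: both occupy the same bytes. */
union slot {
    long count;
    void *ptr;
};

int main(void) {
    union slot s;

    s.ptr = malloc(64);   /* now the bytes hold a valid heap pointer */
    s.count = 123456;     /* now the very same bytes are just a number */

    /* A collector scanning `s` sees only a bit pattern. It cannot know
       whether that pattern is a pointer that keeps the malloc'd block
       alive, or an integer that merely happens to look like an address. */
    return 0;
}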
However, it is clear that a C program can call malloc, forget to call free, and then forget the address of the heap node. That node is garbage.
There are two reasons why C / C++ doesn't have garbage collection.
It is "culturally inappropriate". The culture of these languages is to leave storage management to the programmer.
It would be technically difficult (and expensive) to implement a precise garbage collector for C / C++. Indeed, doing this would involve things that would make the language implementation slow.
Imprecise (i.e. conservative) garbage collectors are practical, but they have performance and (I have heard) reliability issues. (For instance, a conservative collector cannot move non-garbage objects.)
It would be simpler if the implementer (of a C / C++ garbage collector) could assume that the programmer only wrote code that strictly conformed to the C / C++ specs. But they don't.
But your question seems to be: why did they design C like that?
Questions like that can only be answered authoritatively by the designers (in this case, the late Dennis Ritchie) or their writings.
As you point out in the question, C was designed to be simple and "close to the hardware".
However, C was designed in the early 1970's. In those days programming languages which required a garbage collector were rare, and GC techniques were not as advanced as they are now.
And even now, it is still a fact that garbage collected languages (like Java) are not suitable for applications that require predictable "real-time" performance.
In short, I suspect that the designers were of the view that garbage collection would make the language impractical for its intended purpose.
There are some garbage collectors built for C or C++:
Please check http://www.hboehm.info/gc/.
As you stated, garbage collection works against the performance goals of C and C++, as it requires tracking allocations and/or reference counting.
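A minimal hedged sketch of using the Boehm-Demers-Weiser conservative collector linked above. It assumes libgc is installed; compile with something like cc demo.c -lgc:

#include <gc.h>
#include <stdio.h>

int main(void) {
    GC_INIT();                             /* initialise the collector */

    for (int i = 0; i < 1000000; i++) {
        /* Allocate from the GC heap and simply drop the pointer; the
           collector reclaims unreachable blocks, so no free() is needed. */
        int *p = GC_MALLOC(sizeof *p);
        *p = i;
    }

    printf("GC heap size: %lu bytes\n", (unsigned long)GC_get_heap_size());
    return 0;
}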

Direct stack and heap access; Virtual- or hardware- level?

When I'm on SO I read a lot of comments asserting (especially about C):
"dynamic allocation always goes to the heap, automatic allocation on the stack"
But especially regarding plain C I disagree with that, since ISO/IEC 9899 doesn't say a word about a heap or a stack. It just mentions three storage durations (static, automatic, and allocated) and specifies how each of them has to be treated.
That would even give a compiler the option to do it the other way round if it wanted to.
So my question is:
Do the heap and the stack physically exist, such that a standardized language (even if not C) could say "... has to happen on the heap and ... on the stack"?
Or are they just a virtual scheme for managing memory access, so that a language can't make rules about them, since it can't even be ensured that the environment supports them?
To my knowledge only the second makes sense. But I have already read many comments along the lines of "In language XY this WILL happen on the stack/heap". If I'm right, that would have to be undeterminable unless the language is made only for systems that guarantee a stack and a heap, and all those comments would be wrong.
That is what led me to ask this question. Am I that wrong, or is there a widespread error in reasoning about this?
You are correct in that the C spec doesn't require the use of a heap or stack, as long as it implements the storage classes correctly.
However, virtually every compiler will use stacks for automatic variables and heaps for allocated variables. While you could implement a compiler that doesn't use a stack or heap, it probably wouldn't perform very well and wouldn't be familiar to most devs.
So when people say "always", they really mean "virtually always".
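For reference, a minimal sketch of the three storage durations the C standard actually talks about; where each object physically ends up is entirely up to the implementation:

#include <stdlib.h>

int counter;                               /* static storage duration: exists for the whole program */

void demo(void) {
    int local = 0;                         /* automatic storage duration: lives until demo() returns */
    int *buf = malloc(100 * sizeof *buf);  /* allocated storage duration: lives until free() */
    (void)local;
    free(buf);
}

int main(void) {
    demo();
    return 0;
}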

How to calculate memory consumption in C [duplicate]

This question already has answers here:
How to determine CPU and memory consumption from inside a process
(10 answers)
Closed 9 years ago.
Is there any way to calculate memory consumption in C? I have checked other answers on Stack Overflow but they were not satisfactory.
Something similar to the one we have in Java:
// Get the Java runtime
Runtime runtime = Runtime.getRuntime();
// Run the garbage collector
runtime.gc();
// Calculate the used memory
long memory = runtime.totalMemory() - runtime.freeMemory();
System.out.println("Used memory is bytes: " + memory + "bytes");
System.out.println("Used memory is kilobytes: " + bytesTokilobytes(memory) +"kb");
The C language itself does not provide any means.
However, every specific platform gives you some support.
For example, on Windows you can look at the Task Manager, Details tab. Right-click on the list-view column headers to add/remove columns. Some of them give insight into how much memory the process consumes. There are a lot of other tools, including commercial ones (use Google), that give a more detailed picture.
On Windows there is also a dedicated API that allows writing your own tool. A while ago I wrote one; I do not want this answer to be an ad.
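A hedged sketch of the Windows API route alluded to above, using the Process Status API (link against psapi.lib):

#include <windows.h>
#include <psapi.h>
#include <stdio.h>

int main(void) {
    PROCESS_MEMORY_COUNTERS pmc;

    /* Query memory statistics for the current process. */
    if (GetProcessMemoryInfo(GetCurrentProcess(), &pmc, (DWORD)sizeof pmc)) {
        printf("Working set:   %zu bytes\n", (size_t)pmc.WorkingSetSize);
        printf("Pagefile used: %zu bytes\n", (size_t)pmc.PagefileUsage);
    }
    return 0;
}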
The real question seems to be, can you get the C heap to report how much space it's currently holding. I don't know of a portable way to do this.
Or you could plug in a "debugging heap" implementation which tracks this number and provides an API to retrieve it; debugging heaps are available as second-source libraries, and one MAY come with your compiler. (Many years ago I implemented a debugging heap as a set of macros which intercepted heap calls and redirected them through wrapper routines to perform several kinds of analysis; I wasn't maintaining a usage counter but I could have done so.) ((CAUTION: Anything allocated from a debugging heap MUST be returned to that heap, not the normal heap, and vice versa, or things get very ugly very quickly.))
Or your compiler may have some other nonstandard way to retrieve this information. Check its documentation.
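A hedged sketch of the wrapper-based "debugging heap" idea described above, assuming all allocations can be routed through your own functions (my_malloc, my_free, and my_heap_usage are hypothetical names):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h>

/* Header stored in front of every block so my_free() can account for it.
   The union keeps the payload suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} header_t;

static size_t bytes_in_use = 0;            /* running total of live payload bytes */

void *my_malloc(size_t size) {
    header_t *h = malloc(sizeof *h + size);
    if (!h) return NULL;
    h->size = size;
    bytes_in_use += size;
    return h + 1;                          /* give the caller the payload */
}

void my_free(void *ptr) {
    if (!ptr) return;
    header_t *h = (header_t *)ptr - 1;
    bytes_in_use -= h->size;
    free(h);
}

size_t my_heap_usage(void) { return bytes_in_use; }

int main(void) {
    char *s = my_malloc(100);
    if (!s) return 1;
    strcpy(s, "hello");
    printf("in use: %zu bytes\n", my_heap_usage());   /* 100 */
    my_free(s);
    printf("in use: %zu bytes\n", my_heap_usage());   /* 0 */
    return 0;
}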

"Rules" to use global or not variables [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I have some questions about the use of global variables in the C programming language. These days I hear many times that you shouldn't use global variables anymore. So my question is: when should you use global variables in C, and when not? Can someone give me some explanation? Performance considerations should, of course, be included.
A more specific case to help answer the question: for example, I have a global array that holds structs, accessed by almost all functions in the program, and I need to get two members of this array per function call. In this case I need two variables (pointers) to the members of this array that I want to access, like foo_t *x, *y. This happens several times and in different functions while the program is running. In this specific case, should x and y be global or local variables (of each function that uses them)?
Some people argue that it's too expensive for the computer's memory and, of course, for program performance.
I hope this is clear.
Most of the time, maintainability is more important than performance. You don't describe your application very much, so I'm going to assume it isn't performance critical, and also doesn't have issues with limited stack sizes, etc. I.e. maintainability wins.
In that case, the real question is which makes the code clearer, and reduces the opportunities for errors. That's a case by case decision, though my general rule is to minimize globals, particularly globals with a more than one file scope.
If the application is multithreaded, things get even clearer - and more exciting. Every global variable must either be const, or protected by some lock. And that includes variables declared with "static". Make your life easier by only making things global if they really are shared.
Back to your case - you have some kind of big global array of data structures, and some specific positions within it. Apparently a whole bunch of routines will operate on the same positions; otherwise the question doesn't make sense. I'd do whatever makes the code most clear and readable. From your description, that's probably to make them global, but you haven't given enough detail for me to be sure. If you do make them global, for heaven's sake don't call them x and y, unless those names make clear and obvious sense - e.g. if together they represent a Cartesian coordinate.
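A hedged sketch of the specific case from the question: a global array of structs with short-lived local pointers into it. All names here (apart from foo_t, taken from the question) are hypothetical:

#include <stdio.h>
#include <stddef.h>

typedef struct { int id; double value; } foo_t;

#define N_ENTRIES 16
static foo_t table[N_ENTRIES];          /* the shared global array */

/* Local pointers: cheap (two machine words in the stack frame), and each
   function documents which entries it works with via its parameters. */
static void accumulate(size_t src_idx, size_t dst_idx) {
    foo_t *src = &table[src_idx];
    foo_t *dst = &table[dst_idx];
    dst->value += src->value;
}

int main(void) {
    table[0].value = 1.5;
    table[1].value = 2.5;
    accumulate(0, 1);
    printf("%f\n", table[1].value);     /* 4.000000 */
    return 0;
}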
Although global memory and dynamic memory allocation are separate topics, I tend to see them used in pairs: lots of global memory with few allocations, or little global memory with lots of dynamic memory usage.
Global variables (or externally linkable memory, etc.) have a valid use today in applications that are life critical. Having worked in both air traffic control and life sciences, some applications use no dynamically allocated memory at all. There are often large global arrays instead. Such applications cannot tolerate running out of memory some time down the line and need to show on start-up that they can handle all specified capacity.
In some microcontroller applications, again, the scope and size are well known, so there is no dynamic allocation and data is often shared between functions via the global space. Even the worst-case stack size is known (recursion is not allowed).
In dynamic applications that need to scale with various processors, demands, etc., good programming practice avoids global memory. I find this coding style easier to maintain and enhance.
I do not see speed or code efficiency driving the use or non-use of global memory. It is more of a software architecture issue. Go with the approach that 1) meets your design goals and 2) is debuggable and maintainable.
Thought: video memory is like global memory in that all functions and the application share this space.

Resources