Memory footprint on Windows - C

My C application on Windows runs a for loop in which it dumps numerous entries into a data structure and then saves the same to an XML file. Now, I want to know the memory footprint it takes to do this. Are there any tools available?

Task Manager is the way I do it. It's simple and easy.
It only works well for measuring very large memory footprints, but applications with large footprints are probably the only cases where you'd need to measure the usage anyway.
If you want to measure memory usage accurate to the byte, I would just build a simple wrapper around malloc() and free() that updates a global counter. (If the app is threaded, a lock may also be needed.)
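A minimal sketch of such a wrapper, assuming a single-threaded app (the track_* names are made up for illustration; the size_t header may need padding to max_align_t if you allocate types with stricter alignment):

    #include <stdio.h>
    #include <stdlib.h>

    /* Each allocation is prefixed with a header recording its size, so the
     * running total can be adjusted again on free. */
    static size_t g_bytes_in_use = 0;

    void *track_malloc(size_t size)
    {
        size_t *block = malloc(sizeof(size_t) + size);
        if (block == NULL)
            return NULL;
        *block = size;
        g_bytes_in_use += size;
        return block + 1;               /* hand back the memory after the header */
    }

    void track_free(void *ptr)
    {
        if (ptr == NULL)
            return;
        size_t *block = (size_t *)ptr - 1;
        g_bytes_in_use -= *block;
        free(block);
    }

    void track_report(void)
    {
        printf("heap bytes currently in use: %zu\n", g_bytes_in_use);
    }

Call track_report() right after the loop and again after writing the XML to see how much of the footprint is your own data versus library overhead.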

Task Manager is one way to do it. I prefer Process Explorer because it gives a lot more info than Task Manager.

Directly accessing video memory within the Linux kernel in a driver-agnostic manner

TL;DR at the end if you don't want rationale/context (which I'm providing since it's always good to explain the core issue and not simply ask for help with Method X, which might not be the best approach).
I frequently do software performance analysis on older hardware, which exposes race conditions, single-frame graphical glitches and other issues more readily than more modern silicon does.
Often, it would be really cool to be able to take screenshots of a misbehaving application that might render garbage for one or two frames or display erroneous values for a few fractions of a second. Unfortunately, problems most frequently arise when the systems in question are swapping heavily to disk, making it consistently unlikely that the screenshots I try to take will contain the bugs I'm trying to capture.
The obvious solution would be a capture device, and I definitely want to explore pixel-perfect image and video recording in the future when I have the resources for that (it sounds like a hugely fun opportunity to explore FPGAs).
I recently realized, however, that the kernel is what performs the swapping. If I move screenshotting into kernelspace, I don't have to wait for my screenshot keystroke to make its way through the X input layer into the screenshot program, then wait for that program to do its XSHM dance and fetch the screenshot data, all while the system is heavily I/O loaded (e.g., a 5-second system load of >10). Instead, I can simply have the kernel memcpy() the displayed area of video memory to a preallocated buffer at the exact fraction of a second I hit PrtSc!
TL;DR: Where should I start looking to figure out how to "portably" (within the sense of Linux having different graphics drivers, each with different architectural designs) access the currently-displayed area of video memory?
I get the impression I should be looking at libdrm, possibly within KMS, but I would really appreciate some pointers on what actually accesses video memory.
I'm also guessing there are probably some caveats and gotchas to reading video memory directly on certain chipsets? I don't expect my code to make it into the Linux kernel (who knows, but I doubt it) but I'd still like whatever I build to be fairly portable across computers for convenience.
NOTE: I am not using compositing with the systems in question, in case this changes anything. I'm interested to know whether I could write a compositing-compatible system; I suspect this would be nontrivial.
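For reference, and not the kernelspace approach described above, here is a minimal userspace sketch that maps and copies the visible framebuffer via the legacy fbdev interface (/dev/fb0). On KMS drivers /dev/fb0 is usually an emulated device and may not reflect what the GPU is really scanning out (and display panning/yoffset is ignored here), so treat it purely as an illustration of the mapping and copying involved:

    #include <fcntl.h>
    #include <linux/fb.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/dev/fb0", O_RDONLY);
        if (fd < 0) { perror("open /dev/fb0"); return 1; }

        struct fb_var_screeninfo var;
        struct fb_fix_screeninfo fix;
        if (ioctl(fd, FBIOGET_VSCREENINFO, &var) < 0 ||
            ioctl(fd, FBIOGET_FSCREENINFO, &fix) < 0) {
            perror("ioctl");
            return 1;
        }

        /* Size of the currently displayed area, row pitch included. */
        size_t visible = (size_t)var.yres * fix.line_length;
        unsigned char *fb = mmap(NULL, visible, PROT_READ, MAP_SHARED, fd, 0);
        if (fb == MAP_FAILED) { perror("mmap"); return 1; }

        unsigned char *shot = malloc(visible);
        if (shot != NULL)
            memcpy(shot, fb, visible);  /* the "screenshot" */

        printf("captured %ux%u at %u bpp (%zu bytes)\n",
               var.xres, var.yres, var.bits_per_pixel, visible);

        free(shot);
        munmap(fb, visible);
        close(fd);
        return 0;
    }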

Why am I unable to upload a file using TFTP?

Is it necessary to establish the connection each time when uploading a file over multiple iterations, in order to keep the stack size under control?
I am getting a "calloc failed" error.
I am using FreeRTOS with multithreading.
According to Wikipedia, yes, TFTP does not allow keeping the connection alive for multiple files.
If you are working with a small embedded system, its filesystem might not be designed to handle many files (even small ones) and you would want to reorganize the data into fewer files.
Not sure what this has to do with stack size or running out of heap space. The question is very vague but you might want to account for scarce memory resources (using pencil and paper, even) to plan how the program will run, and avoid chasing these bugs every time a new feature is added.

Library or tools for managing shared mmapped files

Disclaimer: This is probably a research question as I cannot find what I am looking for, and it is rather specific.
Problem: I have a custom search application that needs to read between 100K and 10M files, each between 0.01 MB and about 10.0 MB. Each file contains one array that could be loaded directly as an array via mmap. I am looking for a solution that prefetches files into RAM before they are needed and, if system memory is full, evicts ones that have already been processed.
I know this sounds a lot like a combination of OS memory management and something like memcached. What I am actually looking for is something like memcached that doesn't return strings or values for a key, but rather the address for the start of a chosen array. In addition, (this is a different topic) I would like to be able to have the shared memory managed such that the distance between the CPU core and the RAM is the shortest on NUMA machines.
My question is: "does a tool/library like this already exist?"
Your question is related to this one
I'm not sure you need to find a library. You just need to understand how to efficiently use system calls.
I believe the readahead system call could help you.
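A minimal sketch of prefetching one file with readahead() (posix_fadvise() with POSIX_FADV_WILLNEED is a similar hint); prefetch_file is just an illustrative name:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Populate the page cache with the file's contents so that later
     * reads or mmap() faults are served from memory. */
    static void prefetch_file(const char *path)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0) {
            perror(path);
            return;
        }

        struct stat st;
        if (fstat(fd, &st) == 0 && readahead(fd, 0, st.st_size) < 0)
            perror("readahead");

        close(fd);
    }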
Indeed you have many, many files (and perhaps too many of them). I hope that your filesystem is good enough, or that they are spread across many directories. Having millions of files may become a concern if the filesystem is not tuned appropriately (but I won't dare help with that).
I don't know if it is your application that writes & reads that many files. Perhaps you might consider switching to a fast DBMS like PostgreSQL or MySQL, or perhaps you could use GDBM.
I once did this for a search-engine kind of application. It used an LRU chain, which was also addressable (via a hash table) by file-id and memory address, IIRC. On every access, the hot items were repositioned to the head of the LRU chain. When memory got tight (mmap can fail ...), the tail of the LRU chain was unmapped.
The pitfall of this scheme is that the program can get blocked on page faults. And since it was single-threaded, it was really blocked. Altering this to a multithreaded architecture would involve protecting the hash and LRU structures with locks and semaphores.
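A rough sketch of that kind of LRU of mappings (illustrative names; the hash-table lookup and the locking mentioned above are omitted):

    #include <fcntl.h>
    #include <stddef.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    struct mapping {
        void           *addr;
        size_t          len;
        struct mapping *prev, *next;    /* doubly linked LRU chain */
    };

    static struct mapping *lru_head, *lru_tail;

    static void lru_unlink(struct mapping *m)
    {
        if (m->prev) m->prev->next = m->next; else lru_head = m->next;
        if (m->next) m->next->prev = m->prev; else lru_tail = m->prev;
        m->prev = m->next = NULL;
    }

    static void lru_push_head(struct mapping *m)
    {
        m->next = lru_head;
        if (lru_head) lru_head->prev = m; else lru_tail = m;
        lru_head = m;
    }

    /* Memory is tight: unmap the coldest file. */
    static void evict_tail(void)
    {
        struct mapping *victim = lru_tail;
        if (victim == NULL) return;
        lru_unlink(victim);
        munmap(victim->addr, victim->len);
        free(victim);
    }

    /* Map a file and register it as most recently used.  On a later hit,
     * lru_unlink() + lru_push_head() moves the entry back to the head. */
    static struct mapping *map_file(const char *path)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return NULL;

        struct stat st;
        if (fstat(fd, &st) < 0) { close(fd); return NULL; }

        void *addr;
        while ((addr = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0))
               == MAP_FAILED && lru_tail != NULL)
            evict_tail();
        close(fd);                      /* the mapping keeps the data alive */
        if (addr == MAP_FAILED) return NULL;

        struct mapping *m = calloc(1, sizeof *m);
        if (m == NULL) { munmap(addr, st.st_size); return NULL; }
        m->addr = addr;
        m->len  = (size_t)st.st_size;
        lru_push_head(m);
        return m;
    }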
After that, I realised that I was doing double buffering: the OS itself has a perfect LRU disk-buffer mechanism, which is probably smarter than mine. Just open()ing or mmap()ing every single file on every request is only one system call away, and (given recent activity) just as fast as, or even faster than, the buffering layer.
Regarding a DBMS: using a DBMS is a clean design, but you have the overhead of a minimum of 3 system calls just to get the first block of data. And it will certainly (always) block. But it lends itself reasonably well to a multi-threaded design, and relieves you from the pain of locks and buffer management.

Are there operating systems that aren't based off of or don't use a file/directory system?

It seems like there isn't anything inherent in an operating system that would necessarily require that sort of abstraction/metaphor.
If so, what are they? Are they still used anywhere? I'd be especially interested in knowing about examples that can be run/experimented with on a standard desktop computer.
Examples are Persistent Haskell, Squeak Smalltalk, and KeyKOS and its descendants.
It seems like there isn't anything inherent in an operating system that would necessarily require that sort of abstraction/metaphor.
There isn't any necessity; it's completely bogus. In fact, forcing everything to be accessible via a human-readable name is fundamentally flawed, and precludes security due to Zooko's triangle.
Examples of similar hierarchies appear in DNS, URLs, programming language module systems (Python and Java are two good examples), torrents, and X.509 PKI.
One system that fixes some of the problems caused by DNS/URLs/X.509 PKI is Waterken's YURL.
All these systems exhibit ridiculous problems because the system is designed around some fancy hierarchy instead of for something that actually matters.
I've been planning on writing some blogs explaining why these types of systems are bad, I'll update with links to them when I get around to it.
I found this http://pages.stern.nyu.edu/~marriaga/papers/beyond-the-hfs.pdf but it's from 2003. Is something like that what you are looking for?
About 1995, I started to design an object-oriented operating system (SOOOS) that has no file system. Almost everything is an object that exists in virtual memory, which is mapped/paged directly to the disk (either local or networked, i.e., rudimentary cloud computing).
There is a lot of overhead in programs to read and write data in specific formats. Imagine never reading and writing files. In SOOOS there are no such things as files and directories. Autonomous objects, which essentially replace files, can be organized to suit your needs rather than being forced into a restrictive hierarchical file system. There are no low-level drive format structures (e.g., clusters) that add extra levels of abstraction and translation overhead. SOOOS data storage overhead is limited to page tables, which can be indexed as quickly as with basic virtual memory paging. Autonomous objects each have their own dynamic virtual memory space, which serves as the persistent data store. When active, they are given a task context, added to the active process task list, and then exist as processes.
A lot of complexity is eliminated in my design: simply instantiate objects in a program and let the memory manager and virtual memory system handle everything consistently with minimal overhead. Booting the operating system is simply a matter of loading the basic kernel, setting up the virtual memory page tables to point at the key OS objects, and (re)starting the OS object tasks. When the computer is turned off, shutdown is essentially analogous to hibernation, so the OS is nearly in instant-on status. The parts (pages) of data and code are loaded only as needed.
For example, to edit a document, instead of starting a program by loading the entire executable into memory, you simply load the task control structure of the autonomous object and set the instruction pointer to the function to be performed. The code is paged in only as the instruction pointer traverses its virtual memory. Data is always immediately ready to be used and is paged in only as accessed, with no need to parse files and manage data structures that often have a distinct representation in memory from the one in secondary storage. You simply use the program's native memory allocation mechanisms and abstract data types, without disparate and/or redundant data structures. Object Linking and Embedding style program interaction, memory-mapped I/O, and interprocess communication come practically for free, since memory sharing is implemented using the facilities of the processor's Memory Management Unit.
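SOOOS itself is only a design, but the core idea, that an object's in-memory representation is also its persistent representation, can be approximated on a conventional OS by backing a struct with a shared file mapping. A minimal illustrative sketch (the names are made up):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Whatever is written through the pointer is paged out to the backing
     * file by the kernel, so there is no explicit serialization step. */
    struct counter_object {
        long launches;
    };

    int main(void)
    {
        int fd = open("counter.obj", O_RDWR | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }
        if (ftruncate(fd, sizeof(struct counter_object)) < 0) {
            perror("ftruncate");
            return 1;
        }

        struct counter_object *obj =
            mmap(NULL, sizeof(struct counter_object),
                 PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (obj == MAP_FAILED) { perror("mmap"); return 1; }

        obj->launches++;                /* no read(), write() or parsing */
        printf("this object has been opened %ld times\n", obj->launches);

        munmap(obj, sizeof(struct counter_object));
        close(fd);
        return 0;
    }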

Why does an empty WPF Application consume 50MB?

I'm on Vista 64 and a blank 'WPF Application' template allocates 50MB when I press compile and run.
Surely this is way too much for an empty white box?!
Is there anything I can do to make my WPF applications less thirsty?
Jan
50 MB doesn't sound like that much for a modern application that makes heavy use of shared libraries.
Measuring memory usage is something of a black art. On some systems, tools that display the memory usage of a given app include in that total the memory used by any shared libraries the app loads. But that memory is in fact shared by all apps using those libraries.
What is reporting the "50 MB" number to you? Task Manager?
Generally speaking, I'd say that rather than worrying about unavoidable overhead for abstract use cases, it's better to develop your application and then analyze its memory usage in context to how it impacts performance.
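If it helps, here is a small sketch in plain C using PSAPI, just to illustrate the counters involved: it prints the calling process's working set versus private bytes, which is usually where confusion about numbers like this comes from (link with psapi.lib):

    #include <windows.h>
    #include <psapi.h>
    #include <stdio.h>

    int main(void)
    {
        PROCESS_MEMORY_COUNTERS_EX pmc;

        if (!GetProcessMemoryInfo(GetCurrentProcess(),
                                  (PROCESS_MEMORY_COUNTERS *)&pmc,
                                  sizeof(pmc))) {
            fprintf(stderr, "GetProcessMemoryInfo failed: %lu\n", GetLastError());
            return 1;
        }

        /* Working set counts shared pages; private bytes is memory that
         * only this process can use. */
        printf("working set:   %zu bytes\n", (size_t)pmc.WorkingSetSize);
        printf("private bytes: %zu bytes\n", (size_t)pmc.PrivateUsage);
        return 0;
    }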
Hope that helps.
