Simple C/C++ in-process cache similar to memcached

I need a simple (LRU) cache which should run in-process. I found memcached, which looks great, but there does not seem to be an easy way to host it in-process. I don't need a distributed cache, just a simple key/value store with some kind of LRU behaviour, plus a decent allocator to limit fragmentation, since entry sizes vary a lot (a few bytes to a few kilobytes). Surely an existing implementation of such a thing must exist? It should be C or C++.

I hate to answer this way, but it would be fairly simple to implement yourself.
Allocator. Use malloc and free. They do work, and they work well. This also makes it easier to interface with the rest of your program.
Data structure. A mutex in front of a hash table, tree, or trie. Use a doubly linked list to track LRU order. Don't try to do fancy lockless stuff.
It should weigh in at less than a couple hundred lines; you can knock it out in a good solid day.
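To make that concrete, here is a minimal sketch along those lines: malloc/free for storage, a small chained hash table for lookup, and a doubly linked list for LRU order. All the names (lru_cache_get, lru_cache_put, etc.) are illustrative, not from any existing library, and a real version would also handle duplicate keys and wrap the operations in the mutex mentioned above.

    #include <stdlib.h>
    #include <string.h>

    #define NBUCKETS 1024

    typedef struct entry {
        char *key;
        void *value;
        size_t value_len;
        struct entry *hnext;          /* hash-chain link */
        struct entry *prev, *next;    /* LRU list, head = most recently used */
    } entry;

    typedef struct {
        entry *buckets[NBUCKETS];
        entry *head, *tail;
        size_t count, capacity;       /* evict when count exceeds capacity */
    } lru_cache;

    static unsigned hash_key(const char *s) {
        unsigned h = 5381;
        while (*s) h = h * 33 + (unsigned char)*s++;
        return h % NBUCKETS;
    }

    static void list_unlink(lru_cache *c, entry *e) {
        if (e->prev) e->prev->next = e->next; else c->head = e->next;
        if (e->next) e->next->prev = e->prev; else c->tail = e->prev;
    }

    static void list_push_front(lru_cache *c, entry *e) {
        e->prev = NULL;
        e->next = c->head;
        if (c->head) c->head->prev = e; else c->tail = e;
        c->head = e;
    }

    void *lru_cache_get(lru_cache *c, const char *key) {
        for (entry *e = c->buckets[hash_key(key)]; e; e = e->hnext)
            if (strcmp(e->key, key) == 0) {
                list_unlink(c, e);        /* touch: move to front */
                list_push_front(c, e);
                return e->value;
            }
        return NULL;
    }

    static void evict_oldest(lru_cache *c) {
        entry *e = c->tail;
        entry **p;
        if (!e) return;
        list_unlink(c, e);
        for (p = &c->buckets[hash_key(e->key)]; *p != e; p = &(*p)->hnext)
            ;                             /* find the hash-chain link to e */
        *p = e->hnext;
        free(e->key);
        free(e->value);
        free(e);
        c->count--;
    }

    int lru_cache_put(lru_cache *c, const char *key, const void *val, size_t len) {
        unsigned h = hash_key(key);
        entry *e = calloc(1, sizeof *e);
        if (!e) return -1;
        e->key = strdup(key);             /* strdup is POSIX; use malloc+memcpy otherwise */
        e->value = malloc(len);
        if (!e->key || !e->value) { free(e->key); free(e->value); free(e); return -1; }
        memcpy(e->value, val, len);
        e->value_len = len;
        e->hnext = c->buckets[h];
        c->buckets[h] = e;
        list_push_front(c, e);
        if (++c->count > c->capacity) evict_oldest(c);
        return 0;
    }

A cache is just a zero-initialized lru_cache with capacity set to some positive value.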

I've had success using commoncache but the project doesn't appear to have any activity and issues raised (with patches) by my colleague are still unaddressed...

Related

What is an efficient heap implementation for an allocation subsystem?

Let's say I want to write a 3D game engine - something along the lines of DOOM and Quake. And I want to do it in pure C (not sure if that's relevant, but just in case...).
The first thing worth tackling, I think, would be the engine's memory allocation. I've looked at some source code for this (Quake 3, DOOM 3), and in terms of allocation management, a B-tree seems like a good way to go, but I'm not sure if that would be the right choice.
A binary heap would be simpler, but from what I've read I'm not sure it would scale well. Maybe I'm wrong?
Ideally, I'm looking for something between O(1) and O(n log n) runtime efficiency. I'm not sure if this is realistic or not though :)
Thoughts?
Basically, to start you can simply use the regular malloc provided by your compiler/build environment. Then, once your engine is looking good, you can attempt to write your own memory allocator.
Doom 3 offers both options (selected at compile time): either it uses an internal memory allocator (this is the default), or it can use the regular malloc.
You can have a look at Heap.cpp in the Doom 3 source code. It is based on B-trees. But honestly, you'll probably find it very difficult to understand!
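As a rough illustration of that compile-time switch (the names here are made up, not the actual Doom 3 symbols), the pattern looks something like:

    #include <stdlib.h>

    /* Build with -DUSE_SYSTEM_MALLOC to fall back to the C library allocator. */
    #ifdef USE_SYSTEM_MALLOC
    #define engine_alloc(n)  malloc(n)
    #define engine_free(p)   free(p)
    #else
    void *custom_alloc(size_t n);   /* your allocator: pools, a B-tree heap, ... */
    void  custom_free(void *p);
    #define engine_alloc(n)  custom_alloc(n)
    #define engine_free(p)   custom_free(p)
    #endif

This lets you write the whole engine against engine_alloc/engine_free first and defer the custom allocator until profiling says you need it.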

Handing out addresses in a custom memory allocator w.r.t. cache conflicts

I've spent my afternoon reading up on processor caches after reading about the effect powers of two can have on cache conflicts. Now I wish to apply this new knowledge to my memory allocator for multi-threaded programs. However, I don't fully understand it yet.
I was under the impression that processors loved powers of two, so my allocator rounds requested sizes to their next power of two and then slices pages into multiples of this size and hands them out. When a page is full, it simply maps a new page and slices it up the same way. This leads to very similar and predictable offsets into pages.
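For reference, the rounding step I'm describing is essentially the standard bit trick below (a sketch; sizes above 2^31 are ignored). Note how every allocation of a given size class then starts at the same handful of page offsets, which is exactly what creates the aliasing:

    #include <stdint.h>

    /* Round up to the next power of two (valid for 0 < n <= 2^31). */
    static uint32_t next_pow2(uint32_t n) {
        n--;
        n |= n >> 1;
        n |= n >> 2;
        n |= n >> 4;
        n |= n >> 8;
        n |= n >> 16;
        return n + 1;
    }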
To what extent should I adapt my allocator to avoid this issue? For example, should I try to randomize addresses slightly or am I screwed for using powers of two in the first place?
Thanks!
Until you have incontrovertible proof that this is performance-critical, just leave it be. The extra complication will most probably not be worth it.
Everybody should read (and understand!) Bentley's "Writing Efficient Programs" (sadly out of print now; his "Programming Pearls" contains a summary, and is well worth a read too).
Before embarking on a code-optimization bout, make sure it is worth it. If the performance is adequate, there are better uses of your time. Yes, you have to measure first.
Measure where the cost is being spent. Programmers are notoriously bad at guessing where the costs are.
The biggest performance gains come from restating the problem (sometimes it is enough to solve a slightly different problem that is cheaper to solve), then from the overall organization of the system, next from better algorithms and data structures; and only at the very, very end from detail optimizations like the one considered here.
Your friendly compiler, given a bit of prodding in the direction of "generate good code" will today generate much better code than an experienced assembly language programmer when given similar (full function scale) tasks. Most local source code reorganizations "for performance" are either moot (the compiler would have done so on its own) or deleterious (the compiler will recognize and rewrite the usual code sequences, unusual code can confuse it to do nothing or generate bad code).
Programmer time (writing, debugging, maintaining) is much more valuable than a few microseconds of computer time here and there, except in extremely unusual circumstances. Write the simplest code that does the job, and rework it only if experience shows it is worthwhile.

How can I implement cooperative lightweight threading with C on Mac OS X?

I'm trying to find a lightweight cooperative threading solution to try implementing an actor model.
As far as I know, the only solution is setcontext/getcontext,
but that functionality has been deprecated(?) by Apple. I'm confused about why they did this; in any case, I'm looking for a replacement.
Pthreads are not an option because I need a cooperative model instead of a preemptive one, to control context-switching timing precisely/manually without expensive locking.
-- edit --
Reason for avoiding pthreads:
Because pthreads are not cooperative/deterministic, and they are too expensive. I need the actor model for game logic code, so thousands of execution contexts are required at a minimum. Hardware threads cost megabytes of memory each and are expensive to create/destroy, and parallelism is not important; in fact, I just need concurrent execution of many functions. This could be implemented with many small functions and some kind of object model, but my goal is to reduce those overheads.
If I've got something wrong, please correct me. It'll be much appreciated.
The obvious 'lightweight' solution is to avoid complex nested calling except for limited situations where the execution time will be tightly bounded, then store an explicit state structure for each "thread" and implement the main program logic as a state machine that's easily suspendable/resumable at most points. Then you can simply swap out the pointer to the state structure for 'context switch'. Basically this technique amounts to keeping all of your important state variables, including what would conventionally be local variables, in the state structure.
Whether this is worthwhile probably depends on your reason for avoiding pthreads. If your reason is to be portable to non-POSIX systems, or if you really need deterministic program flow, then it may be worthwhile. But if you're just worried about performance overhead and memory synchronization issues, I think you should use pthreads and manage these issues. If you avoid unnecessary locking, use fine-grained locks, and minimize the amount of time locks are held, performance should not suffer.
Edit: Based on your further details posted in the comments on the main question, I think the solution I've proposed is the right one. Each actor should have their own context in which you store the state of the actor's action/thinking/etc. You would have a run_actor function which would take an actor context and a number of "ticks" to advance the actor's state by, and a run_all_actors function which would iterate over a list of active actors and call run_actor for each with the specified number of ticks.
Further, note that this solution still allows you to use real threads to take advantage of SMP/multicore machines. You simply divide the actors up between threads. You may need some degree of locking if one actor needs to examine another's context (e.g. for collision detection).
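A minimal sketch of what that might look like, assuming the run_actor/run_all_actors shape described above (all names here are hypothetical):

    #include <stddef.h>

    typedef struct {
        int   state;      /* where to resume; this replaces the call stack  */
        float x, y;       /* example per-actor data: everything that would  */
                          /* normally be a local variable lives here        */
    } actor_ctx;

    /* Advance one actor by a given number of ticks. */
    void run_actor(actor_ctx *a, int ticks) {
        for (int t = 0; t < ticks; t++) {
            switch (a->state) {
            case 0: /* deciding */ a->state = 1; break;
            case 1: /* moving   */ a->x += 1.0f; a->state = 0; break;
            }
        }
    }

    /* The "context switch" is just moving to the next context pointer. */
    void run_all_actors(actor_ctx *actors, size_t n, int ticks) {
        for (size_t i = 0; i < n; i++)
            run_actor(&actors[i], ticks);
    }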
I was researching this question as well, and I ran across GNU Pth (not to be confused with Pthreads). See http://www.gnu.org/software/pth/
It aims to be a portable solution for cooperative threads. It mentions that it is implemented via setcontext/getcontext where available (so that path may not work on Mac OS X); otherwise it says it uses longjmp/setjmp, though it's not clear to me how that works.
Hope this is helpful to anyone who searches for this question.
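For anyone who wants to see the shape of it, a toy Pth program looks roughly like this (API as I understand it from the Pth manual; please check it against your installed version):

    #include <stdio.h>
    #include <pth.h>

    static void *worker(void *arg) {
        for (int i = 0; i < 3; i++) {
            printf("%s: step %d\n", (const char *)arg, i);
            pth_yield(NULL);          /* cooperative: explicitly hand over control */
        }
        return NULL;
    }

    int main(void) {
        pth_init();
        pth_t a = pth_spawn(PTH_ATTR_DEFAULT, worker, "a");
        pth_t b = pth_spawn(PTH_ATTR_DEFAULT, worker, "b");
        pth_join(a, NULL);
        pth_join(b, NULL);
        pth_kill();
        return 0;
    }

Nothing preempts a thread between the pth_yield calls, which is exactly the deterministic switching the question asks for.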
I have discovered that some of the functionality required from setcontext/getcontext is implemented in libunwind.
Unfortunately that library won't compile on Mac OS X because of the setcontext/getcontext deprecation. However, Apple has implemented its own libunwind, which is source-compatible with GNU's implementation. The library exists on Mac OS X 10.6, 10.7, and iOS (I don't know the exact version in the iOS case).
This library is not documented, but I could find the headers in these locations:
/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS5.0.sdk/usr/include/libunwind.h
/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator4.3.sdk/usr/include/libunwind.h
/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator5.0.sdk/usr/include/libunwind.h
/Developer/SDKs/MacOSX10.6.sdk/usr/include/libunwind.h
/Developer/SDKs/MacOSX10.7.sdk/usr/include/libunwind.h
There is a note in the header file pointing to the GNU libunwind site for documentation.
I'm going to bet on this library.

Higher level languages with C functions

Maybe this has been asked before, but I couldn't find it. My question is simple: does it make sense to write an application in a higher-level language (Java, C#, Python) and the time/performance-critical functions in C? Or, at this point, unless you are doing very low-level OS/game/sensor programming, does it make no difference to just have a full, say, Java application?
It makes sense if you a) notice a performance issue, AND b) use performance measurements to locate where the problem occurs, AND c) can't achieve the desired performance by modifying the existing code.
If any of these items don't apply, then it's probably premature optimization.
If you are fluent and productive in a higher-level language such as Python or Lua, then by all means start writing in that language. Look for bottlenecks if and when they exist.
Speed can be quite similar with things like C#.
What is tricky is latency. So if you want to write something which you know must take < 10 ms, then C is reasonably predictable (ignoring whatever variability your operating system might introduce).
Having said that, for very tight, long-running loops (image processing, for example), C/C++ can offer some speed-up. You can get quite reasonable performance out of C#, though you do have to be careful how you program it; but I have found that, in general, you can still squeeze more out of C/C++.
Usually your preferred language will do whatever you need it to in acceptable time (er, blazing fast).
Sure, critical time/performance functions can be rewritten in a "more optimal/suitable" language like C or assembly - but whether that will actually make things faster is another story. There are laws that govern how much actual/overall speed-up you'll get, specifically Amdahl's law and the law of diminishing returns.
To answer your question: it only makes sense to rewrite these critical functions in lower-level languages if the speed-up is good enough to warrant the extra work.
I suggest you read Cliff Click's excellent "Java vs. C Performance... Again." It outlines many points of comparison between Java and C++.
Predictably, the conclusion is that it depends, but it's a worthwhile read.
You can only really answer this on a case-by-case basis; without reference to what you are doing, it's impossible to say.
But maybe what you actually want here is some kind of sanity check to ensure that this approach isn't crazy to consider. I have worked on tools ranging from a very large graphics application (~1 million lines) to relatively small physical simulation engines (~10,000 lines) that were written just as you describe: Python on the outside for the interface (both for the API and for the GUI), C/C++ on the inside for the heavy lifting. They all benefited from this division of responsibility.
This is done especially with scripting languages. Things that come to mind are games made in Python. Most of the time Python is too slow for some of the more number-crunching aspects of games, so those parts are made into a C module for speed. Be sure that you actually need the speed, though, and that number crunching is your performance bottleneck rather than a general algorithm issue: doing a brute-force search over a list is going to be slow in both C and Python.
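The C side of such a module can be as small as a single hot function compiled into a shared library (a hypothetical example; the Python side would load it with something like ctypes):

    /* fast.c -- build with:  gcc -O2 -shared -fPIC -o libfast.so fast.c */
    #include <stddef.h>

    /* The tight numeric loop worth moving out of Python. */
    double dot(const double *a, const double *b, size_t n) {
        double s = 0.0;
        for (size_t i = 0; i < n; i++)
            s += a[i] * b[i];
        return s;
    }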
I'd say that it depends a lot on your application.
What sort of performance is important? short startup-time? high throughput? low latency? Is it important that response time is always predictable?
Is the application short lived or does it run for long periods of time?
Java can give you high throughput, but occasional short freezes while doing garbage collection. C# is probably similar.
Python - well, performance there will often lag behind the others for anything not written in C (some things ARE written in C, even if you didn't write them yourself).
So as others said. It depends.
But as always with performance: Measure first, optimize when you know you need to.

What specific examples are there of knowing C making you a better high level programmer?

I know about the existence of questions such as this one and this one. Let me explain.
After reading Joel's article Back to Basics and seeing many similar questions on SO, I've begun to wonder what specific examples there are of situations where knowing something like C can make you a better high-level programmer.
What I want to know is whether there are many examples of this. Often, the answer to this question is something like "Knowing C gives you a better feel for what's happening under the covers" or "You need a solid foundation for your program", and these answers don't carry much meaning. I want to understand the different specific ways in which you will benefit from knowing low-level concepts.
Joel gave a couple of examples: Binary databases vs XML, and strings. But two examples don't really justify learning C and/or Assembly. So my question is this: What specific examples are there of knowing C making you a better high level programmer?
My experience with teaching students and working with people who only studied high-level languages is that they tend to think at a certain high level of abstraction, and they assume that "everything comes for free". They can become very competent programmers, but eventually they have to deal with some code that has performance issues, and then it comes back to bite them.
When you work a lot with C, you do think about memory allocation. You often think about memory layout (and cache locality if that's an issue). You understand how and why certain graphics operations just cost a lot. How efficient or inefficient certain socket behaviors are. How buffers work, etc. I feel that using the abstractions in a higher level language when you do know how it is implemented below the covers sometimes gives you "that extra secret sauce" when thinking about performance.
For example, Java has a garbage collector and you can't place things in memory directly. And yet, you can make certain design choices (e.g., with custom data structures) that affect performance, for the same reasons this would be an issue in C.
Also, and more generally, I feel that it is important for a power programmer to not only know big-O notation (which most schools teach), but that in real-life applications the constant is also important (which schools try to ignore). My anecdotal experience is that people with skills in both language levels tend to have a better understanding of the constant, perhaps because of what I described above.
In addition, many higher-level systems that I have seen interface with lower-level libraries and infrastructure: for instance, communications, database, or graphics libraries, drivers for certain devices, etc. If you are a power programmer, you may eventually have to venture out there, and it helps to at least have an idea of what is going on.
Knowing low level stuff can help a lot.
To become a racing driver, you have to learn and understand the basic physics of how tyres grip the road. Anyone can learn to drive pretty fast, but you need a good understanding of the "low level" stuff (forces and friction, racing lines, fine throttle and brake control, etc) to get those last few percent of performance that will allow you to win the race.
For example, if you understand how the CPU architecture works in your computer, you can write code that works better with it (e.g. if you know you have a certain CPU cache size or a certain number of bytes in each CPU cache line, you can arrange your data structures and the way that you access them to make the best use of the cache - for example, processing many elements of an array in order is often faster than processing random elements, due to the CPU cache). If you have a multi-core computer, then understanding how low-level techniques like threading work can give huge benefits (just as not understanding the low level can lead to disaster in threading).
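The array-traversal point is easy to demonstrate in a short sketch: both functions below compute the same sum, but the first walks memory sequentially (one cache miss per line of data), while the second strides through it and, for large N, misses on almost every access.

    #include <stddef.h>
    #define N 1024

    double sum_by_rows(double (*m)[N]) {      /* cache-friendly */
        double s = 0.0;
        for (size_t i = 0; i < N; i++)
            for (size_t j = 0; j < N; j++)
                s += m[i][j];
        return s;
    }

    double sum_by_cols(double (*m)[N]) {      /* cache-hostile */
        double s = 0.0;
        for (size_t j = 0; j < N; j++)
            for (size_t i = 0; i < N; i++)
                s += m[i][j];
        return s;
    }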
If you understand how disk I/O and caching work, you can modify file operations to work well with them (e.g. if you read from one file and write to another, working on large batches of data in RAM can help reduce I/O contention between the reading and writing phases of your code, and vastly improve throughput).
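A sketch of that batching idea: copy through a large RAM buffer so reads and writes each happen in long sequential runs instead of alternating small operations.

    #include <stdio.h>

    int copy_in_batches(FILE *in, FILE *out) {
        static char buf[1 << 20];             /* 1 MiB batch */
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, in)) > 0)
            if (fwrite(buf, 1, n, out) != n)
                return -1;
        return ferror(in) ? -1 : 0;
    }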
If you understand how virtual functions work, you can design high-level code that uses virtual functions well. If used incorrectly they can severely hamper performance.
If you understand how drawing is handled, you can use clever tricks to improve drawing speed. For example, you can draw a chessboard by alternately drawing 64 white and black squares. But it is often faster to draw 32 white squares and then 32 black ones (because you only have to change the drawing colour twice instead of 64 times). Better still, you can draw the whole board black, then XOR 4 stripes across the board and 4 stripes down the board in white, and this can be much faster again (2 colour changes, and only 9 rectangles to draw instead of 64). This chessboard trick teaches you a very important programming skill: lateral thinking. By designing your algorithm well, you can often make a big difference to how well your program operates.
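In code, the XOR version might look like this (fill_rect and xor_rect are hypothetical drawing primitives standing in for whatever your graphics API provides; squares where a white row stripe crosses a white column stripe get XORed twice and flip back to black):

    #define SQ 32   /* pixel size of one square */

    void fill_rect(int x, int y, int w, int h, unsigned color);  /* assumed API */
    void xor_rect(int x, int y, int w, int h, unsigned color);   /* assumed API */

    void draw_chessboard(void) {
        fill_rect(0, 0, 8 * SQ, 8 * SQ, 0x000000);          /* 1 black board    */
        for (int i = 0; i < 8; i += 2) {
            xor_rect(i * SQ, 0, SQ, 8 * SQ, 0xFFFFFF);      /* 4 column stripes */
            xor_rect(0, i * SQ, 8 * SQ, SQ, 0xFFFFFF);      /* 4 row stripes    */
        }
    }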
Understanding C, or for that matter, any low level programming language, gives you an opportunity to understand things like memory usage (i.e. why is it a bad thing to create several million heavy objects), how pointers/object references work, etc.
The problem is that as we've created ever increasing levels of abstraction, we find ourselves doing a lot of 'lego block' programming, without understanding how the legos actually function. And by having almost infinite resources, we start treating memory and resources like water, and tend to solve problems by throwing more iron at the situation.
While not limited to C, there's a tremendous benefit to working at a low level with much smaller, memory constrained systems like the Arduino or old-school 8-bit processors. It lets you experience close to the metal coding in a much more approachable package, and after spending time squeezing apps into 512K, you will find yourself applying these skills at a larger level within your day to day programming.
So the language itself is not important, but having a deeper appreciation for how all of the bits come together, and how to work effectively at a level closer to the hardware is a set of skills beneficial to any software developer.
For one, knowing C helps you understand how memory works in the OS and in other high-level languages. When your C# or Java program balloons in memory usage, understanding that references (which are basically just pointers) take memory too, and understanding how many of the data structures are implemented (which you get from building your own in C), helps you realize that your dictionary is reserving huge amounts of memory that aren't actually used.
For another, knowing C can help you understand how to make use of lower-level operating system features. You don't need this often, but sometimes you may need memory-mapped files, or marshalling in C#, and C will greatly help you understand what you're doing when that happens.
I think C has also helped my understanding of network protocols, but I can't put my finger on specific examples. I was reading another SO question the other day where someone was complaining about how C's bit-fields are "basically useless", and I was thinking how elegantly C bit-fields represent low-level network protocols. High-level languages dealing with structures of bits always end up a mess!
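For instance, the fixed header of an IPv4-style packet maps naturally onto a bit-field struct (simplified here; note that bit-field ordering and padding are implementation-defined, so portable protocol code pins them down or falls back to shifts and masks):

    #include <stdint.h>

    struct ipv4_header_start {
        uint8_t  version : 4;   /* 4 for IPv4 */
        uint8_t  ihl     : 4;   /* header length in 32-bit words */
        uint8_t  tos;           /* type of service */
        uint16_t total_length;  /* beware byte order on the wire */
        /* ...remaining header fields... */
    };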
In general, the more you know, the better programmer you will be.
However, sometimes knowing another language, such as C, can lead you to do the wrong thing, because an assumption that holds in C might not be true in a higher-level language (such as Python or PHP). For example, one might assume that finding the length of a list is O(N), where N is the length of the list. In many high-level languages this is not the case; in Python, for most list-like things the cost is O(1).
Knowing more about the specifics of a language will help, but knowing more in general might lead one to make incorrect assumptions.
Just "knowing" C would not make you better.
But if you understand the whole picture - how native binaries work, how the CPU executes them, what the architectural limitations are - you may write code that is easier on the CPU.
For example, knowing how the L1/L2 caches affect your work tells you how to write your code to get more hits in them. When working with C/C++ and doing heavy optimization, you will have to go down to that level.
It isn't so much knowing C as it is that C is closer to the bare metal than many other languages. You need to be more aware of how to allocate/deallocate memory because you have to do it yourself. Doing it yourself helps you understand the implications of many decisions that you make.
To me any language is acceptable as long as you understand how the compiler/interpreter (basically) maps your code onto the machine. It's a bit easier to do in a language that exposes this directly, but you should be able to, with a bit of reading, figure out how memory is allocated and organized, what sort of indexing patterns are more optimal than others, what constructs are more efficient for particular applications, etc.
More important, I think, is a good understanding of operating systems, memory architectures, and algorithms. If you understand how your algorithm works, why it would be better to choose one algorithm or data structure over another (e.g., HashSet vs. List), and how your code maps onto the machine, it shouldn't matter what language you are using.
This is my experience of how I learnt and taught myself programming, specifically understanding C. This goes back to the early 1990s, so it may be a bit antique, but the passion and the drive are what matter:
Learn to understand the low level principles of the computer, such as EGA/VGA programming, here's a link to the Simtel archive on the C programmer's guide to the PC.
Understand how TSRs (terminate-and-stay-resident programs) work
Download the whole archive of Bob Stout's snippets, a big collection of C code in which each snippet does one thing only - study them and understand them; better still, the collection strives to be portable.
Browse the International Obfuscated C Code Contest (IOCCC) online, and see how C code can be abused; understand the intricacies of the language (the worst code abuse is the winner!). Download the archives and study them.
I loved the infamous Ponzo's C Tutorial, which helped me immensely; unfortunately, the archive is very hard to find. If anyone knows where to obtain it, please leave a comment and I will amend this answer to include the link. There is another one that I can remember - Coronado's [Generic?] C Tutorial - but again, my memory on this one is hazy...
Look at Dr. Dobb's Journal and the C Users Journal - I do not know if you can still get them in print, but they were classics; I can remember the feeling of holding a printed copy in my hand and tearing off home to type in the code to see what happens!
Grab an ancient copy of Turbo C v2, which I believe you can get from borland.com, and just play with 16-bit C programming to get a feel for it and mess with the pointers... Sure, it is ancient and old, but playing with pointers on it is fine.
Understand and learn pointers - link here to the legacy Simtel.net - a crucial step towards achieving C guru-ship, for want of a better word. You will also find a host of downloads pertaining to the C programming language - I remember actually ordering the Simtel CD archive and looking for the C stuff...
A couple of things that you have to deal directly with in C that other languages abstract away from you include explicit memory management (malloc) and dealing directly with pointers.
My girlfriend is one semester from graduating MIT (where they mainly use Java, Scheme, and Python) with a Computer Science degree, and she is currently working at a company whose codebase is in C++. For the first few days she had a difficult time understanding all the pointers/references/etc.
On the other hand, I found moving from C++ to Java very easy, because I was never confused about pass-references-by-value vs pass-by-reference.
Similarly, in C/C++ it is much more apparent that primitives are just the compiler treating the same sets of bits in different ways, as opposed to a language like Python or Ruby where everything is an object with its own distinct properties.
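To make the pointer point above concrete: C passes the pointer itself by value, so a callee can change what it points at, but not the caller's pointer variable.

    #include <stdio.h>

    void set_to_42(int *p) { *p = 42; }                   /* changes the caller's int   */
    void rebind(int *p)    { static int x = 7; p = &x; }  /* caller never sees this     */

    int main(void) {
        int v = 0;
        int *q = &v;
        set_to_42(q);
        rebind(q);
        printf("%d\n", v);   /* prints 42; q still points to v */
        return 0;
    }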
A simple (not entirely realistic) example to illustrate some of the advice above. Consider the seemingly harmless:
    while (true) {
        for (Iterator iter = foo.iterator(); iter.hasNext(); ) {
            bar.doSomething(iter.next());
        }
    }
or the even higher-level:
    while (true) {
        for (Baz b : foo) {
            bar.doSomething(b);
        }
    }
A possible problem here is that each time round the while loop a new object (the iterator) is created. If all you care about is programmer convenience, then the latter is definitely better. But if the loop has to be efficient or the machine is resource constrained then you are pretty much at the mercy of the designers of your high level language.
For example, a typical complaint for doing high-performance Java is having execution stop while garbage (such as all those allocated Iterator objects) is reclaimed. Not very good if your software is charged with tracking incoming missiles, auto-piloting a passenger jet, or just not leaving the user wondering why the GUI has stopped responding.
One possible solution (still in the higher-level language) would be to weaken the convenience of the iterator to something like:
    Iterator iter = new MyIterator();  // some concrete, reusable iterator type
    while (true) {
        for (foo.initAlreadyAllocatedIterator(iter); iter.hasNext(); ) {
            bar.doSomething(iter.next());
        }
    }
But this would only make sense if you had some idea about memory allocation...otherwise it just looks like a nasty API. Convenience always costs somewhere, and knowing lower-level stuff can help you identify and mitigate those costs.
