How can you practically test a synchronized data structure (in C)?
Firing up a couple of threads and having them compete for access to the structure for a while to see if anything goes wrong doesn't sound very reliable.
EDIT in response to comments: I mean that there are several threads running functions that operate on the same set of data, with some kind of synchronization strategy (flags/semaphores/lock-free CAS/etc.) that is supposed to eliminate race conditions and deadlocks. The problem is programmatically testing that the workers are correctly synchronized.
No one really knows how to do this with 100% reliability. Here is just one example of a testing tool for finding concurrency bugs.
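For what it's worth, the brute-force approach usually looks something like the sketch below: hammer the structure from several threads, then check an invariant afterwards. The `counter_*` functions here are placeholders standing in for whatever structure you are actually testing, not a real API; finding an invariant you can verify after the fact is the hard part, and a passing run still says nothing about the schedules you didn't hit.

```c
/* A minimal stress-test sketch, assuming a hypothetical counter_* API
 * stands in for the synchronized structure under test.
 * Build: cc -O2 -pthread stress.c -o stress
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NTHREADS 8
#define NOPS     100000

/* --- the structure under test (placeholder implementation) --- */
static long counter;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void counter_add(long n) {
    pthread_mutex_lock(&counter_lock);
    counter += n;
    pthread_mutex_unlock(&counter_lock);
}

/* --- the test harness --- */
static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < NOPS; i++)
        counter_add(1);                 /* each thread does a known amount of work */
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);

    /* Invariant: nothing was lost. A data race typically shows up as a shortfall. */
    long expected = (long)NTHREADS * NOPS;
    if (counter != expected) {
        fprintf(stderr, "LOST UPDATES: got %ld, expected %ld\n", counter, expected);
        return EXIT_FAILURE;
    }
    puts("invariant held (which proves very little)");
    return EXIT_SUCCESS;
}
```

Running such a test under ThreadSanitizer (`-fsanitize=thread` with gcc/clang) or under valgrind's helgrind tool catches many races that the invariant check alone would miss.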
I'm trying to develop an application that will be running on multiple computers linked to a shared Lustre storage, performing various actions, including but not limited to:
Appending data to a file.
Reading data from a file.
Reading from and writing to a file, modifying all of its content past a certain offset.
Reading from and writing to a file, modifying its content at a specific offset.
As you can see, the basic I/O one can wish for.
Since most of that is concurrent, I will need some kind of locking to make the different writes safe, but I've seen that Lustre doesn't support flock(2) by default (and I'm not sure I want to use it over fcntl(2); I guess I will if it comes to that), and I haven't found anything confirming that fcntl(2) is supported.
Researching it mostly resulted in me reading a lot of papers about I/O optimization with Lustre, but those usually explain how their hardware / software / network is structured rather than how it's done in the code.
So, can I use fcntl(2) with Lustre? Should I use it? If not, what are other alternatives to allow different clients to perform concurrent modifications of the data?
Or is it even possible? (I've seen in Lustre tickets that mmap is possible, so fcntl should work too (no real logic behind that statement), but there might be limitations I would want to be aware of.)
I'll keep writing a test application to check it out, but I figured I should still ask in case there are better alternatives (or limitations to its functionality that I should be aware of, since my test will be limited and we don't want unknown limitations to become an issue later in the development process).
Thanks,
Edit: The base question has been properly answered by LustreOne; below I give more specific information about my use case so people can add pertinent details about concurrent access in Lustre.
The Lustre clients will be servers to other applications.
Clients of those applications will each have their own set of files, but we want to allow a client to log to its client space from multiple machines at the same time, and for that purpose we need to allow concurrent file reads and writes.
These, however, will always be a pretty small percentage of total I/O operations.
While LustreOne's answer gave really interesting insights, not many of them apply to this use case (or rather, they do apply, but adding that complexity to the overall system might not be desirable given the impact on performance).
That is, I'm sure they can be of much help to some, and to ourselves later on. However, what we are seeking right now is more a way to easily let two nodes (or two threads on a node responding to two requests) that want to modify the same data detect the conflict and let only one pass, effectively rejecting the other client's modification.
I believed file locking would be enough for that use case, but had a preference for byte-range locking since some of the files concerned are appended to non-stop by some clients and read/modified up to the end by others.
However, judging from what I understood from LustreOne's answer:
That said, there is no strict requirement for this if your application
knows what it is doing. Lustre will already keep non-overlapping
writes consistent, and can handle concurrent O_APPEND writes as well.
The latter case is already handled by Lustre out of the box.
Any opinion on what the best alternatives could be? Will a simple flock() on the complete file be enough?
Note that some files will also have an index, which can be used to determine the availability of data without locking any of the data files. Should that be used, or are byte-range locks quick enough for us to avoid growing the codebase to support both cases?
A final mention of mmap: I'm pretty sure it doesn't fit our use case much since we have so many files and so many clients that the OSTs might not be able to cache much, but to be sure... should it be used, and if so, how?
Sorry for being so verbose, it's one of my bad traits. :/
Have a nice day,
You should mount all clients with the "-o flock" mount option to enable globally coherent locking. Then flock() (and I think fcntl() locking) will work.
That said, there is no strict requirement for this if your application knows what it is doing. Lustre will already keep non-overlapping writes consistent, and can handle concurrent O_APPEND writes as well. However, since Lustre has to do internal locking for appends, this can hurt write performance significantly if there are a lot of different clients appending to the same file concurrently. (Note this is not a problem if only a single client is appending).
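For reference, byte-range locking through fcntl(2) looks roughly like the sketch below; with the clients mounted with "-o flock", this is the kind of call that should become globally coherent. The path and offsets are placeholders for illustration only, and these are advisory locks: they only exclude other processes that also take fcntl locks on the same ranges.

```c
/* Minimal sketch of advisory byte-range locking with fcntl(2).
 * The path and offsets are placeholders for illustration only.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int lock_range(int fd, off_t start, off_t len, short type /* F_WRLCK or F_RDLCK */) {
    struct flock fl = {0};
    fl.l_type   = type;
    fl.l_whence = SEEK_SET;
    fl.l_start  = start;
    fl.l_len    = len;                 /* 0 would mean "to end of file" */
    return fcntl(fd, F_SETLKW, &fl);   /* F_SETLKW blocks until the lock is granted */
}

int unlock_range(int fd, off_t start, off_t len) {
    struct flock fl = {0};
    fl.l_type   = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start  = start;
    fl.l_len    = len;
    return fcntl(fd, F_SETLK, &fl);
}

int main(void) {
    int fd = open("/lustre/fs/clientspace/data.log", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    if (lock_range(fd, 4096, 4096, F_WRLCK) == -1) { perror("lock"); return 1; }
    /* ... read-modify-write the locked 4 KiB region here ... */
    unlock_range(fd, 4096, 4096);
    close(fd);
    return 0;
}
```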
If you are writing the application yourself, then there are a lot of things you can do to make performance better:
- have some central thread assign a "write slot number" to each writer (essentially an incrementing integer), and then have the client write to offset = recordsize * slot number (see the sketch after this list). Beyond assigning the slot number (which could be done in batches for better performance), there is no contention between clients. In most HPC applications the threads use the MPI rank as the slot number, since it is unique, and threads on the same node will typically be assigned adjacent slots so Lustre can further aggregate the writes. That doesn't work if you use a producer/consumer model where threads may produce variable numbers of records.
- make the IO recordsize a multiple of 4KiB in size to avoid contention between threads. Otherwise, the clients or servers will be forced to do read-modify-write for the partial records in a disk block, which is inefficient.
- Depending on whether your workflow allows it or not, rather than doing read and write into the same file, it will probably be more efficient to write a bunch of records into one file, then process the file as a whole and write into a second file. Not that Lustre can't do concurrent read and write to a single file, but this causes unnecessary contention that could be avoided.
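To make the first point concrete, here is a minimal sketch of the slot scheme, assuming a fixed record size that is a multiple of 4 KiB and some external mechanism (central thread, MPI rank, shared counter, ...) that hands out the slot number; the file name is a placeholder.

```c
/* Sketch of non-overlapping slot writes: writer N writes record N at
 * offset N * RECORDSIZE, so no two writers ever touch the same bytes.
 * RECORDSIZE is a multiple of 4 KiB to avoid read-modify-write of disk blocks.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define RECORDSIZE (4 * 4096)   /* 16 KiB per record, a multiple of 4 KiB */

int write_record(int fd, long slot, const void *record, size_t len) {
    char buf[RECORDSIZE];
    if (len > sizeof(buf)) return -1;
    memset(buf, 0, sizeof(buf));            /* pad to the full record size */
    memcpy(buf, record, len);
    off_t offset = (off_t)slot * RECORDSIZE;
    /* pwrite writes at an explicit offset, so writers never disturb
     * each other's file positions. */
    ssize_t n = pwrite(fd, buf, sizeof(buf), offset);
    return n == (ssize_t)sizeof(buf) ? 0 : -1;
}

int main(void) {
    int fd = open("/lustre/fs/shared/records.dat", O_WRONLY | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    long my_slot = 7;   /* would come from the slot-assigning thread / MPI rank */
    const char msg[] = "record payload";
    if (write_record(fd, my_slot, msg, sizeof(msg)) != 0) perror("pwrite");

    close(fd);
    return 0;
}
```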
How can I write and run automated tests that check that my database transaction strategy is removing race conditions? At the moment all I do is test it in development by putting a breakpoint in the code and sending two requests; I can then see in slow motion what happens. This is not something I can automate, though; it's not even testing really, just part of development.
Your test can spawn two or more threads that make the same request, each isolated by the transaction.
Perform a load test with a realistic workload. Unfortunately, this is not easy to do. Race conditions are hard to discover on any platform. I know of no systematic way to find such bugs.
Sometimes you can exclude the possibility of inconsistencies by construction. For example:
A transaction running under SERIALIZABLE behaves as if it was the only transaction in the system. Therefore, there are never data races.
A read-only transaction under SNAPSHOT behaves the same way. Total data consistency.
A UNIQUE INDEX will never violate its integrity guarantees.
As you can see you can sometimes make your code safe by construction so that there is minimal need to test.
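The question doesn't name a database, but as an illustration of both ideas (spawn competing threads in the test, and rely on a constraint for safety by construction), here is a minimal sketch in C using SQLite: two threads race to claim the same slot, and the UNIQUE index guarantees that exactly one of them succeeds no matter how the schedule interleaves. The table, column names and file path are made up for the example.

```c
/* Sketch: two threads make "the same request" concurrently; a UNIQUE index
 * guarantees at most one succeeds regardless of interleaving.
 * Build: cc race_test.c -lsqlite3 -lpthread
 */
#include <pthread.h>
#include <sqlite3.h>
#include <stdio.h>

static const char *DB_PATH = "claims.db";

static void *claim_slot(void *arg) {
    long id = (long)arg;
    sqlite3 *db;
    if (sqlite3_open(DB_PATH, &db) != SQLITE_OK) return NULL;
    sqlite3_busy_timeout(db, 5000);   /* wait on write contention instead of failing */

    char *err = NULL;
    int rc = sqlite3_exec(db,
        "INSERT INTO claims(slot, owner) VALUES (42, 'worker');",
        NULL, NULL, &err);
    printf("thread %ld: %s\n", id, rc == SQLITE_OK ? "claimed slot" : err);
    sqlite3_free(err);
    sqlite3_close(db);
    return NULL;
}

int main(void) {
    sqlite3 *db;
    sqlite3_open(DB_PATH, &db);
    sqlite3_exec(db,
        "CREATE TABLE IF NOT EXISTS claims(slot INTEGER, owner TEXT);"
        "CREATE UNIQUE INDEX IF NOT EXISTS claims_slot ON claims(slot);"
        "DELETE FROM claims;",
        NULL, NULL, NULL);
    sqlite3_close(db);

    pthread_t t[2];
    for (long i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, claim_slot, (void *)i);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    /* A real test would now query the table and assert that exactly
     * one row exists for slot 42. */
    return 0;
}
```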
I have a large dataset of philosophical arguments, each of which connects to other arguments as proof or disproof of a given statement. A root statement can have many proofs and disproofs, each of which may also have proofs and disproofs. Statements can also be used in multiple graphs, and graphs can be analyzed under a "given context" or assumption.
I need to construct a Bayesian network of related arguments, so that each node propagates influence fairly and accurately to its connected arguments. I need to be able to calculate the probability of chains of connected nodes concurrently, with each node requiring datastore lookups that must block for results; the process is mostly I/O bound, and my datastore connection can run asynchronously in Java, Go and Python (Google App Engine). Once each lookup completes, it propagates the effects to all other connected nodes until the probability delta drops below a threshold of irrelevance (currently 0.1%). Each node of the process must calculate chains of connections, then sum up all the results across all queries to adjust validity results, with results chained outward to any connected arguments.
In order to avoid recursing infinitely, I was thinking of using an A*-like process in goroutines to propagate updates to the argument maps, with a heuristic based on compounding influence which ignores nodes once the probability of influence dips below, say, 0.1%. I'd tried to set up the calculations with SQL triggers, but it got complex and messy way too fast. Then I moved to Google App Engine to take advantage of asynchronous NoSQL, and it was better, but still too slow. I need to run the updates fast enough to get a snappy UI, so that when a user creates or votes for or against a proof or disproof, they can see the results reflected in the UI immediately.
I think Go is the language of choice to support the concurrency I need, but I'm open to suggestions. The client is a monolithic JavaScript app that just uses XHR and WebSockets to push and pull argument maps (and their updates) in real time. I have a Java prototype that can compute large chains in 10-15 s, but monitoring of performance shows that most of my runtime is wasted in synchronization and overhead from ConcurrentHashMap.
If there are other highly-concurrent languages worth trying out, please let me know. I know java, python, go, ruby and scala, but will learn any language if it suits my needs.
Similarly, if there are open source implementations of huge Bayesian networks, please leave a suggestion.
I think it's a bit difficult to tell what you are asking about. Maybe you can elaborate on your question.
Goroutines are quite cheap and a perfect match for modern web applications which use XHR or WebSockets heavily (and for other I/O-bound applications which have to wait for database responses and the like). Additionally, the Go runtime is able to execute those goroutines in parallel, so Go is also a good fit for CPU-bound tasks which should take advantage of multiple cores and the speed of a natively compiled language.
But you should also keep in mind that goroutines and channels aren't free. They still require some amount of memory, and each synchronization point (e.g. a channel send or receive) comes with a cost. That's normally not a problem, since the synchronization is extremely cheap compared to, say, a database query, but it might not be suited for building efficient Bayesian networks, especially if the actual work of each goroutine / node is negligible compared to the synchronization overhead.
Your primary goal for every concurrent program should be to avoid shared mutability as far as possible. So a Bayesian network modeled with goroutines and channels might be a good educational example and a great way to measure the performance of Go's channel implementation, but it's probably not the best fit for your problem.
I am implementing a small database like MySQL. It's part of a larger project.
Right now I have designed the core database, by which I mean I have implemented a parser and can now execute some basic SQL queries on my database. It can store, update, delete and retrieve data from files. As of now that's fine; however, I want to make this work over a network.
I want more than one user to be able to access my database server and execute queries on it at the same time. I am working under Linux, so there is no issue of portability right now.
I know I need to use sockets, which is fine. I also know that I need to use a concept like a thread pool, where I create a maximum number of threads initially and then, for each client request, wake up a thread and assign it to that client.
What I am unable to figure out is how all this is actually going to be bundled together. Where should I implement multithreading: on the client side or the server side? How is my parser going to be configured to take input from each of the clients separately (mostly via files, I think)?
If anyone has an idea about how I can implement this, please do tell me, because I am stuck here in this project.
Thanks. :)
If you haven't already, take a look at Beej's Guide to Network Programming to get your hands dirty in some socket programming.
Next I would take his example of a stream client and server and just use that as a single-threaded query system. Once you have got this down, you'll need to choose whether you're going to actually use threads or use select(). My gut says your on-disk database doesn't yet support parallel writes (maybe reads), so a single server thread servicing requests is likely your best bet for starters!
In the multiple-client model, you could use a simple per-socket hashtable of client information and return any results immediately when you process their query. Once you get into threading with the networking and DB queries, it can get pretty complicated. So work up from the single client, add polling for multiple clients, and then start reading up on and tackling threaded (probably with pthreads) client-server models.
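The single-threaded, select()-based stage has roughly the shape sketched below; handle_query() and the port number are placeholders for your own parser/executor, and error handling is kept minimal.

```c
/* Sketch of a single-threaded, select()-based server loop.
 * handle_query() is a placeholder for "parse and execute one SQL statement".
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

#define PORT 5433   /* placeholder port */

static void handle_query(int fd, const char *query) {
    /* placeholder: run the query through your parser/executor, write results */
    dprintf(fd, "OK: %s\n", query);
}

int main(void) {
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    int yes = 1;
    setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(PORT);
    bind(listener, (struct sockaddr *)&addr, sizeof(addr));
    listen(listener, 16);

    fd_set master, readfds;
    FD_ZERO(&master);
    FD_SET(listener, &master);
    int fdmax = listener;

    for (;;) {
        readfds = master;
        if (select(fdmax + 1, &readfds, NULL, NULL, NULL) < 0) { perror("select"); break; }

        for (int fd = 0; fd <= fdmax; fd++) {
            if (!FD_ISSET(fd, &readfds)) continue;
            if (fd == listener) {                       /* new client connecting */
                int client = accept(listener, NULL, NULL);
                if (client >= 0) {
                    FD_SET(client, &master);
                    if (client > fdmax) fdmax = client;
                }
            } else {                                    /* query from an existing client */
                char buf[1024];
                ssize_t n = recv(fd, buf, sizeof(buf) - 1, 0);
                if (n <= 0) {                           /* client went away */
                    close(fd);
                    FD_CLR(fd, &master);
                } else {
                    buf[n] = '\0';
                    handle_query(fd, buf);              /* one query at a time, no locks */
                }
            }
        }
    }
    return 0;
}
```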
Server side, as the server is the only one that can understand the information. You need to design locks, or come up with your own model, to make sure that modifications/edits don't affect the clients currently being served.
As an alternative to multithreading, you might consider event-based single threaded approach (e.g. using poll or epoll). An example of a very fast (non-SQL) database which uses exactly this approach is redis.
This design has two obvious disadvantages: you only ever use a single CPU core, and a lengthy query will block other clients for a noticeable time. However, if queries are reasonably fast, nobody will notice.
On the other hand, the single-thread design has the advantage of automatically serializing requests. There are no ambiguities and no locking needed. No write can come in between a read (or another write); it just can't happen.
If you don't have something like a robust, working MVCC built into your database (or are at least working on it), knowing that you need not worry can be a huge advantage. Concurrent reads are not so much an issue, but concurrent reads and writes are.
Alternatively, you might consider doing the input/output and syntax checking in one thread and running the actual queries in another (with queries passed via a queue). That, too, will remove the synchronisation woes, and it will at least offer some latency hiding and some use of multiple cores.
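That two-thread split boils down to a small blocking queue, something like this sketch (the job struct and execute_query() are placeholders for the real parser/engine):

```c
/* Sketch of the "network thread parses, worker thread executes" split:
 * a tiny blocking queue guarded by a mutex and condition variable.
 * execute_query() stands in for the actual database engine.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct job {
    int client_fd;          /* where to send the result */
    char sql[1024];         /* already syntax-checked by the I/O thread */
    struct job *next;
} job;

static job *head, *tail;
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  qcond = PTHREAD_COND_INITIALIZER;

static void enqueue(int fd, const char *sql) {         /* called by the I/O thread */
    job *j = calloc(1, sizeof *j);
    j->client_fd = fd;
    snprintf(j->sql, sizeof j->sql, "%s", sql);
    pthread_mutex_lock(&qlock);
    if (tail) tail->next = j; else head = j;
    tail = j;
    pthread_cond_signal(&qcond);
    pthread_mutex_unlock(&qlock);
}

static void execute_query(const job *j) {              /* placeholder for the engine */
    printf("fd %d: executing \"%s\"\n", j->client_fd, j->sql);
}

static void *query_worker(void *arg) {                 /* the single executor thread */
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&qlock);
        while (!head)
            pthread_cond_wait(&qcond, &qlock);
        job *j = head;
        head = j->next;
        if (!head) tail = NULL;
        pthread_mutex_unlock(&qlock);

        execute_query(j);          /* queries run strictly one after another */
        free(j);
    }
    return NULL;
}

int main(void) {
    pthread_t worker;
    pthread_create(&worker, NULL, query_worker, NULL);

    /* The I/O thread would call enqueue() for each parsed request: */
    enqueue(4, "SELECT * FROM t;");
    enqueue(5, "UPDATE t SET x = 1;");

    pthread_join(worker, NULL);    /* never returns in this sketch */
    return 0;
}
```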
I have a C# WinForms application. It has many forms with different functionalities. These forms wrap calls to a WCF service. For example:
form1 calls serviceMethod1 continuously and updates the results
form2 calls serviceMethod2 continuously and updates the results
The calls are made in a different thread for each form, but this ends up with too many threads, as we have many forms. Is this bad, and why? And is there a way to avoid it given my scenario?
Regards
How many threads are you talking about? If you have a lot of threads, you'll lose a bit of performance due to context switching - but in practice I wouldn't expect this to become a significant problem until you have an awful lot of them.
One alternative would be to use a Timer though (it sounds like a System.Timers.Timer or System.Threading.Timer would be most appropriate) - schedule each service call to be made on a regular basis, and the timer will use the threadpool to fire the calls. I suspect that although you say you're calling the services "continuously" you actually mean you're doing it regularly - which is exactly the kind of situation a timer is good for.
To answer the question frankly: It depends entirely on the OS and app design, but this question may indicate a shortcoming in the program's design.
Detail:
You want to learn the allocation requirements of a thread on your target architecture/OS, keep your threads relatively busy (avoid polling), and configure priorities correctly if you really do have a lot of threads. 'Many' threads may be 8 (or fewer, if they are busy), or 100+ if they have relatively little work to do; it ultimately depends on your needs and design.
In tests of some objects/operations, I have used more than 100, and occasionally more than 1000, worker threads. No explosions happened, though I have never had a true need for those operations to be that parallel in a shipping app (unless the aforementioned programs are being used in very unusual circumstances), and it made more sense to put the actual implementation into some centralized task manager. If you have time-critical/real-time work, those tasks may be best on another thread. If they are short lived, consider a thread pool. Well, there are many ways to attack many problem classes...
You can use a WCF asynchronous proxy.
In Visual Studio, when you add a Web Reference you can check "Generate Asynchronous operations" to generate an asynchronous proxy.
As long as the threads spend most of their time waiting for a server response, even hundreds of threads are unlikely to degrade performance (CPU-wise). Otherwise, use a thread pool and queue "request and update the form once" tasks when the previous update completes.
A more important problem might be loading the service with too many simultaneous requests.
As a general rule, you won't gain anything by having more threads than you have CPU cores. There are exceptions to the general rule, but I doubt they apply to your case.
From the OS's point of view, threads are no longer the lightweight things they used to be, but are almost as costly as full processes. Implementing thread synchronization correctly is not a simple task, and debugging a multi-threaded application is a lot harder than a single-threaded one.
With green threads this is not an issue. Green threads are a sort of virtual thread, which is what you will generally get with Java and C#.
The benefit of threads in many apps is not to crunch more numbers but to allow lots of things to go on at once with good responsiveness, so having a lot of threads can be very useful for some things and will not always have any real cost.