Is there a framework for running unit tests on Apache C modules?

I am about to make some changes to an existing Apache C module to fix some possible security flaws and general bad practices. However, the functionality of the code must remain unchanged (except where it's fixing a bug). Standard regression-testing stuff seems to be in order. I would like to know if anyone knows of a good way to run some regression unit tests against the code. I'm thinking of something along the lines of CUnit, but with all the tie-ins to the Apache APR and status structures, I was wondering if there is a good way to test this. Are there any pre-built frameworks that can be used with CUnit, for example?
Thanks
Peter

I've been thinking of answering this for a while, but figured someone else might come up with a better answer, because mine is rather unsatisfactory: no, I'm not aware of any such unit testing framework.
I think your best bet is to try to refactor your C module so that its dependencies on the httpd code base are contained in a very thin glue layer. I wouldn't worry too much about dependencies on APR; those can easily be linked into your unit-test code. It's things like uses of the request record that you should try to abstract away a little bit.
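To make the shape of that split concrete, here is a rough sketch, written in Python for brevity rather than C; the names (RequestInfo, glue_handler) are made up, and in the real module the glue would copy what it needs out of request_rec into a plain struct:

```python
# Hypothetical sketch of the "thin glue layer" split, in Python for brevity.
# In the C module, RequestInfo would be a plain struct the handler fills in
# from request_rec, and handle_request() would build and run without httpd.

class RequestInfo:
    """Carries only the fields the module logic actually needs."""
    def __init__(self, method, uri, headers):
        self.method = method
        self.uri = uri
        self.headers = headers

def handle_request(req):
    """Core logic: depends only on RequestInfo, so a unit test can build
    one by hand and assert on the result."""
    if req.method != 'GET':
        return 405, ''
    return 200, 'hello from ' + req.uri

def glue_handler(r):
    """The only code that touches the real request record; kept so thin
    there is almost nothing left here to unit test."""
    return handle_request(RequestInfo(r.method, r.uri, r.headers_in))

# In a test, no server is needed:
assert handle_request(RequestInfo('GET', '/x', {})) == (200, 'hello from /x')
```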
I'll go as far as to suggest that such a refactoring is a good idea anyway if the code is suspected to contain security flaws and bad practices. It's just usually a pretty big job.
What you might also consider is running integration tests rather than unit tests (ideally both): come up with a set of requests and expected responses from the server, and run a program that compares the actual responses to the expected ones.
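A minimal sketch of such a comparison harness, assuming the module is reachable on localhost and using the third-party requests library; the URLs and expected values are placeholders:

```python
# Compare actual HTTP responses against expected ones.
# Requires the third-party 'requests' library (pip install requests).
import requests

CASES = [
    # (method, url, expected status, substring expected in the body)
    ('GET', 'http://localhost:8080/mymodule/ping', 200, 'pong'),
    ('GET', 'http://localhost:8080/mymodule/nosuch', 404, 'Not Found'),
]

failures = 0
for method, url, want_status, want_body in CASES:
    resp = requests.request(method, url)
    if resp.status_code != want_status or want_body not in resp.text:
        failures += 1
        print('FAIL %s %s -> %d %r' % (method, url, resp.status_code, resp.text[:80]))

print('%d of %d cases failed' % (failures, len(CASES)))
```

Run the old module once to record the expected responses, then run the same cases against the patched module to catch regressions.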
So, not the answer you were looking for, and you've probably thought of something along these lines yourself. But at least I can tell you from experience that if the module can't be replaced with something new for business reasons, then refactoring it for testability will likely pay off in the longer term.

I spent some time looking around the interwebs for you, as it's a question I was curious about myself. I came across a wiki article stating that
http://cutest.sourceforge.net/
was used for testing the Apache Portable Runtime. It might be worth checking that out.

Consensus algorithm for Node.js

I'm trying to implement a collaborative canvas on which many people can draw freehand or with specific shape tools.
The server has been developed in Node.js and the client with AngularJS 1 (and I am pretty new to both).
I must use a consensus algorithm so that it always shows the same state to all users.
I'm seriously in trouble with it, since I cannot find a proper tutorial on using one. I have been looking at and studying Paxos implementations, but it seems like Raft is much more widely used in practice.
Any suggestions? I would really appreciate it.
Writing a distributed system is not an easy task[1], so I'd recommend using some existing strongly consistent system instead of implementing one from scratch. The usual suspects are ZooKeeper, Consul, etcd, and Atomix/Copycat. Some of them offer Node.js clients:
https://github.com/alexguan/node-zookeeper-client
https://www.npmjs.com/package/consul
https://github.com/stianeikeland/node-etcd
I've personally never used any of them with nodejs though, so I won't comment on maturity of clients.
If you insist on implementing consensus on your own, then Raft should be easier to understand; the paper is surprisingly accessible: https://raft.github.io/raft.pdf. There are also some Node.js implementations, but again, I haven't used them, so it is hard to recommend any particular one. The Gaggle readme contains an example, and Skiff has an integration test which documents its usage.
Taking a step back, I'm not sure that distributed consensus is what you need here. It seems you have multiple clients and a single server, so you can probably use a centralized data store. The problem domain is not really that distributed either: shapes can be overlaid one on top of the other in the order the server receives them, i.e. FIFO (imagine multiple people writing on the same whiteboard: the last one wins). The challenge is concurrent modification of existing shapes, but maybe you can fall back to last-change-wins (or first-change-wins) semantics there.
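As a rough sketch of that centralized approach (written in Python for brevity even though the server in question is Node.js; all names are illustrative), the server applies operations strictly in arrival order and appends them to a log that every client replays, so everyone converges on the same canvas:

```python
# Illustrative sketch: centralized server, FIFO ordering, last write wins.
class CanvasServer:
    def __init__(self):
        self.shapes = {}  # shape_id -> shape data
        self.log = []     # ordered history to broadcast/replay

    def apply(self, op):
        # op examples: {'type': 'upsert', 'id': 's1', 'shape': {...}}
        #              {'type': 'delete', 'id': 's1'}
        if op['type'] == 'upsert':
            # Last write wins: a later op on the same id overwrites it.
            self.shapes[op['id']] = op['shape']
        elif op['type'] == 'delete':
            self.shapes.pop(op['id'], None)
        self.log.append(op)  # clients replay this order and converge

server = CanvasServer()
server.apply({'type': 'upsert', 'id': 's1', 'shape': {'kind': 'rect'}})
server.apply({'type': 'upsert', 'id': 's1', 'shape': {'kind': 'circle'}})
assert server.shapes['s1']['kind'] == 'circle'  # the last change won
```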
Another interesting avenue to explore here would be Conflict-free Replicated Data Types (CRDTs). The folks at GitHub used them to implement collaborative "pair" programming in Atom. See the Atom Teletype blog post; their implementation may also be useful, since collaborative editing seems to be exactly the problem you are trying to solve.
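To give a flavor of the idea, here is a toy last-writer-wins register, one of the simplest CRDTs (again a Python sketch with made-up names, not Teletype's implementation); merge() is commutative, associative, and idempotent, so replicas converge no matter how updates are ordered, delayed, or duplicated:

```python
# Toy last-writer-wins (LWW) register CRDT.
class LWWRegister:
    def __init__(self):
        self.ts, self.writer, self.value = 0, '', None

    def set(self, value, ts, writer):
        # Only move forward; ties on timestamp are broken by writer id,
        # so every replica resolves them identically.
        if (ts, writer) > (self.ts, self.writer):
            self.ts, self.writer, self.value = ts, writer, value

    def merge(self, other):
        # Merging is just another set(); applying it twice, or in any
        # order, yields the same state on every replica.
        self.set(other.value, other.ts, other.writer)

a, b = LWWRegister(), LWWRegister()
a.set('red', ts=1, writer='alice')
b.set('blue', ts=2, writer='bob')
a.merge(b); b.merge(a)
assert a.value == b.value == 'blue'  # both replicas converged
```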
Hope this helps.
[1] Take a look at the Jepsen series (https://jepsen.io/analyses), where Kyle Kingsbury tests various failure conditions of distributed data stores.
Try reading Understanding Paxos. It's geared towards software developers rather than an academic audience. For this particular application you may also be interested in the Multi-Paxos Example Application referenced by the article. It's intended to help illustrate the concepts behind the consensus algorithm, and it sounds like it's almost exactly what you need for this application. Raft and most Multi-Paxos designs tend to get bogged down with an overabundance of accumulated history, which generates a new set of problems to deal with beyond simple consistency. An initial prototype could easily send the full state of the drawing on each update and ignore the history issue entirely, which is what the example application does. Later optimizations could be made to reduce network overhead.

How to prevent an application's DLLs from being decompiled?

As far as I know, there are applications that decompile DLLs to recover source code from application files.
Not only do I not want others to have the sources, I also don't want others to use them, meaning the DLL files themselves. So how should I lock down the DLLs, and how safe are they?
Before I get into anything else, I will state that it is impossible to protect your application entirely.
That being said, you can still make things more difficult. There are many obfuscators out there that will help you make it more difficult for someone to decompile your application and understand it.
http://en.wikipedia.org/wiki/List_of_obfuscators_for_.NET
.NET obfuscation tools/strategy
That's truly the best you can hope for.
Personally, I really wouldn't bother going too deep, if at all. You'll find that you are spending too much money or time (or both) trying to protect your application from ne'er-do-wells. These are the same people who, no matter what barriers you throw up at them, will continue to try, and given the nature of managed languages, they will most likely succeed. In fact, most obfuscation can be undone with simple tools. In the meantime, you've let other important features and bug fixes slip by because you spent your time and effort on security measures.
Obfuscation is one way to protect your code, but how far you take it depends on your needs. If you have a super-secretive program, then you may want to explore more expensive and in-depth strategies.
However, if you are developing a business application or something similar that would not be worth much of any hacker's time to reverse engineer, minimal to normal obfuscation strategies are good enough. As the main answer suggests, look at those links.
Recently, I came upon ConfuserEx, a free open-source obfuscator that does the job for WPF apps and more. It seems to be very powerful, effective, and customizable.
ConfuserEx on GitHub
For DLLs there is almost nothing you can do. Obfuscating the files is the best option, but public members will keep their original names. However, if you pack the DLLs into your exe file and obfuscate that, no one can use them easily.
I used ConfuserEx and it was very easy to use and effective.

Writing Angular Unit Tests After the App is Written?

I've inherited a medium-sized Angular app. It looks pretty well organized and well written, but there's no documentation and no unit tests implemented.
I'm going to make an effort to write unit tests after the fact, and eventually work in e2e tests and documentation via ngdoc.
I'm wondering what the best approach to writing unit tests is after the fact. Would you start with services and factories, then directives etc or some other strategy? I plan on using Jasmine as my testing framework.
I should also mention that I've only been with the code for a few days so I'm not 100% sure how everything ties together yet.
At the end of the day what you need to know is that your software works correctly, consistently, and within business constraints. That's all that matters, and TDD is just a tool to help get you there. So, that's the frame of mind from which I'm approaching this answer.
If there are any known bugs, I'd start there. Cover the current, or intended, functionality with tests and work to fix the bugs as you go.
After that, or if there aren't any currently known bugs, then I'd worry about adding tests as you begin to maintain the code and make changes. Add tests to cover the current, correct functionality to make sure you don't break it in unexpected ways.
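As a generic illustration of that reproduce-then-pin-down pattern (shown with Python's unittest for brevity; in this app it would be a Jasmine spec, and parse_price is a made-up stand-in for whatever code you're about to change):

```python
import unittest

def parse_price(text):
    # Hypothetical existing code with a known bug: it chokes on
    # thousands separators like '1,200'.
    return float(text)

class ParsePriceTests(unittest.TestCase):
    def test_pins_current_behavior(self):
        # Characterization test: locks in what already works so the
        # upcoming fix can't silently break it.
        self.assertEqual(parse_price('12.50'), 12.5)

    @unittest.expectedFailure
    def test_reproduces_known_bug(self):
        # Starts out failing; drop expectedFailure once the bug is
        # fixed and keep the test as a regression guard.
        self.assertEqual(parse_price('1,200'), 1200.0)

if __name__ == '__main__':
    unittest.main()
```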
In general, writing tests to cover things that appear to be working, just so that you can have test coverage, won't be a good use of time. While it might feel good, the point of tests is to tell you when something is wrong. So, if you already have working code and you never change it, then writing tests to cover it won't make the code any less buggy. Going over the code by hand might uncover as yet undiscovered bugs, but that has nothing to do with TDD.
That doesn't mean that writing tests after the fact should never be done, but exhaustive tests after the fact seems a bit overkill.
And if none of that advice applies to your particular situation, but you want to add tests anyway, then start with the most critical/dangerous parts of the code, the pieces where, if something goes wrong, you're going to be especially screwed, and make sure those sections are rock-solid.
I was recently in a similar situation with an AngularJS app. Apparently, one of the main rules of AngularJS development is TDD Always. We didn't learn about this until later though, after development had been ongoing for more than six months.
I did try adding some tests later on, and it was difficult. The most difficult aspect is that your code is less likely to be written in a way that's easily testable. This means a lot of refactoring (or rewriting) is in order.
Obviously, you won't have a lot of time to spend reverse engineering everything to add in tests, so I'd suggest following these guidelines:
Identify the most problematic areas of the application. This is the stuff that always seems to break whenever someone makes a change.
Order this list by importance, so that the most important components are at the top of your list and the lesser items are at the bottom.
Next, within each level of importance, order the items from least complex to most complex.
Start adding tests in slowly, from the top of your list.
You want to start with the most important yet least complex items so you can get some easy wins. Components with a lot of dependencies may also need to be refactored in the process, so components that depend on less should be easier to refactor and test.
In the end, it seems best to start with unit testing from the beginning, but when that's not possible, I've found this approach to be helpful.

Stress testing with pycassa

I've been trying to write a stress tester for a rather large Cassandra database. At first I was doing it from scratch, and then I found stress.py, which allows you to stress test your cluster. However, like all benchmarks, the test data is unrepresentative of the load this database will be seeing, so I decided to modify it to be more representative of my usage pattern.
I'm using pycassa for most of this project. However, stress.py uses the lower-level Thrift interface directly, which I find rather cumbersome. Are there any projects out there which stress test Cassandra using pycassa? Thanks!
I'm not aware of any existing general-purpose stress tests that make use of pycassa; I'd also love to hear about them if there are any.
In the past, I've modified stress.py to make use of pycassa. I believe I set it up to use one small ConnectionPool per process and I was pretty happy with the result; modifying the Operation class and get_client was the main chunk of work here.
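For what it's worth, the setup looked roughly like the following (a simplified sketch from memory, not the actual stress.py patch; the keyspace, column family, and data are placeholders):

```python
# Simplified sketch: one small ConnectionPool per worker process.
from multiprocessing import Process
from pycassa.pool import ConnectionPool
from pycassa.columnfamily import ColumnFamily

def worker(worker_id, n_ops):
    # Create the pool after the fork; connection pools shouldn't be
    # shared across processes.
    pool = ConnectionPool('MyKeyspace', ['localhost:9160'], pool_size=5)
    cf = ColumnFamily(pool, 'MyCF')
    for i in range(n_ops):
        cf.insert('key-%d-%d' % (worker_id, i), {'col': 'value%d' % i})
    pool.dispose()

if __name__ == '__main__':
    procs = [Process(target=worker, args=(w, 1000)) for w in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```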
It's hard to give more specific details about this without knowing what you want to do, so feel free to ask more detailed questions if you need to.

Which has a better code base to learn from: nginx or lighttpd?

Primary goal is to learn from a popular web server codebase (implemented in C) with priority given to structure/design instead of neat tricks throughout the code.
I didn't include Apache since its code base is an order of magnitude larger than the two mentioned.
Nginx might just be the best straight-C code base I have encountered. I have read large chunks of Apache, and I always came out feeling unclean; it is a monolithic mess.
You will not just learn about web servers by exploring Nginx, but pretty much the best practices for writing networked software in straight C under Unix, from code architecture to metaprogramming techniques.
I have heard nothing but good things about Lighttpd; however, it is limited in scope compared to Nginx, so I would invest my time in Nginx if I were you. Then again, Lighttpd's limited scope might be beneficial to you as a first target to study.
Neat tricks always happen in any code base worth its salt, to be honest. Nevertheless, the answer you probably don't want to hear is that it would be good to study both, so you can learn from their intersection. The alternative might leave you stuck in the box of the "lighttpd" way or the "nginx" way.
I didn't include Apache since its code base is an order of magnitude larger than the two mentioned.
Actually, the Apache code is quite readable. It has a large code base because it does lots of things, but it is well structured and quite easy to understand. You can also check the APR (Apache Portable Runtime) library, which has a plethora of small things to learn from.
IMO, if you want to learn programming, you should start with lower-profile projects, and not an HTTP server, but something simpler.
Both Nginx and Lighttpd (just like Apache) are production-quality software, which means a very steep learning curve. Unfortunately, learning them often means digging through the archives to see why things are the way they are; that comes with age in any mature project.
If you are simply into C and learning design, you might want to check out FreeBSD or its derivatives. In my experience it is a better place to start: there are lots of tools and libraries of all calibers there, and their TODO lists are never empty, which serves well as a guide to where to start.
