Program development - C

I am writing a C program. What I have seen from my earlier experience is that I make some changes to a correct version of my program, and after that change the program computes incorrectly.
Sometimes it is easy to detect where I made that change and undo it or do it some other way, but on other occasions I find it hard and laborious to detect where exactly the problem is.
Can you suggest a platform or tool that lets me put the new version and the old version of the program side by side and marks the changes that were made in the new version?
I am using gcc 4.3.2 to compile C programs on Ubuntu 10.04.
Any suggestion is welcome.
regards,
Anup

Use a version control system. I recommend Subversion. This will allow you to compare your newer version with the older one to see exactly what changed and you can revert to the older working version if you break your code.
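For example, a minimal Subversion session might look like this (the repository URL and file name are illustrative):
svn checkout http://example.com/svn/myproject    # get a working copy
svn commit -m "small, known-good change"         # record each working state
svn diff -r PREV main.c                          # what changed since the previous revision?
svn revert main.c                                # discard the bad change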

If you want a tiny, portable, one-file personal version control system, I can suggest Fossil. Documentation is available at https://fossil-scm.org.

What you are requesting is a diff-like tool. There is a plethora of such tools, ranging from the diff command-line utility to GUI frontends such as Meld (http://meld.sourceforge.net/). Most of these tools can be combined with (or have counterparts in) revision control systems such as Subversion.
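For instance (the file and directory names here are just placeholders):
diff -u old/program.c new/program.c    # unified diff on the command line
meld old/ new/                         # graphical side-by-side comparison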

I would do this with some version control system, like Git. Refer to Git for beginners: The definitive practical guide if you're new to version control systems and/or git.

That's the main purpose of all version control software.
I recommend looking at Subversion, Git or Mercurial.
There are thousands of internet resources for these programs.
And they all have good comparison tools integrated into their GUI tools.

Also, try to use Git/Mercurial/whatever together with Dropbox; it will allow you to continue developing on other computers. It will also teach you how to collaborate with others and with yourself.
This is slightly easier to do with newer VCSes like Git and Mercurial.

If you use a version control system, such as Git or Mercurial, and follow a few good practices:
committing small, self-contained changes (which includes committing, i.e. saving changes to version control, often)
committing known good states (or at least trying to, e.g. ensuring that the code compiles before you commit)
you will always be able to go back to a known good state.
If the bug is in your current changes, you can compare the current state with the previous version: somewhere in the changed lines there is a bug... or something in the changed lines uncovered an existing bug.
If the bug was introduced by some earlier, unknown commit, you can use bisection to search through history for the first commit that exhibits the bug: see for example the git bisect manpage.
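A minimal bisect session might look like this (the good tag v1.0 is illustrative):
git bisect start
git bisect bad                 # the version you have now is broken
git bisect good v1.0           # a commit you know was correct
# git now checks out commits in between; build, test, and answer each time:
git bisect good                # or: git bisect bad
git bisect reset               # when finished, go back to where you started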
HTH


NetBeans 6.1 - cannot get a variable's type by hovering the mouse over it

On the NetBeans version that I am using, I cannot do a basic thing such as hovering the mouse over a variable and getting its type.
How can I obtain this behaviour?
My system runs Windows XP with NetBeans 6.1, and my application is a Vega7000 application.
Why are you using such an old version? It may well be a bug or simply a missing feature.
So the answer is: upgrade!
https://netbeans.org/downloads/
If you need the old version for release builds, you can perhaps still do the actual development and debugging using a newer and shinier version.
Also, if you are not using one yet, use source control! I prefer Git, but this is very subjective; other DVCSes are good too. And note that a so-called distributed VCS works extremely well locally, inside one source tree directory; it does not have to be distributed to many places. When working with old, fragile, complex code you are afraid of breaking, very frequent local commits, and diffing against known-good versions when problems arise, are life savers. If you care about publishing work in progress, at least Git (and presumably all DVCSes) allows squashing multiple commits into one before pushing to a non-personal repository or branch; see the sketch below.
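A sketch of that purely local workflow (the commit counts and messages are illustrative):
git init                          # one repository, right inside the source tree
git add -A && git commit -m "known-good baseline"
# ...frequent small commits while you experiment...
git diff HEAD~3                   # diff against an earlier known-good state
git rebase -i HEAD~3              # squash the small commits before publishing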

Maintain a separate branch for each platform

I'm trying to port my project to another platform and I've found a few differences between this new platform and the one I started on. I've seen the autotools package and configure scripts which are supposed to help with that, but I was wondering how feasible it would be to just have a separate branch for each new platform.
The only problem I see is how to do development on the target platform and then merge in changes to other branches without getting the platform-dependent changes. If there is a way to do that, it seems to me it'd be much cleaner.
Has anyone done this who can recommend/discourage this approach?
I would definitely discourage that approach.
You're just asking for trouble if you keep the same code in branches that can't be merged. It's going to be incredibly confusing to keep track of what changes have been applied to what branches and a nightmare should you forget to apply a change to one of your platform branches.
You didn't mention the language, but whatever it is, use the features available in that language to separate the code differences between platforms while keeping one branch. For example, in C++ you should first use file-based separation. If you have sound code for Mac, Linux and Windows platforms, create sound_mac.cpp, sound_windows.cpp and sound_linux.cpp files, each containing the same classes and methods but with very different platform-specific implementations. Obviously, you only add the appropriate file to the IDE on the particular platform: your Xcode project gets the sound_mac.cpp file, while your Visual Studio project uses the sound_windows.cpp file. The files which reference those classes and methods use #ifdefs to determine which headers to include, as in the sketch below.
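A minimal sketch of that layout (all file and function names here are illustrative, not from any real project):
/* sound.h - one interface, shared by every platform */
void sound_play(const char *clip);
/* sound_mac.cpp, sound_windows.cpp and sound_linux.cpp each implement the
   same interface; e.g. sound_linux.cpp: */
#include "sound.h"
void sound_play(const char *clip)
{
    /* Linux-specific playback code goes here */
}
/* code that needs a platform-specific header picks it with #ifdef: */
#ifdef _WIN32
#include "sound_windows.h"
#else
#include "sound_posix.h"
#endif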
You'll use a similar approach for things like installer scripts. You may have a different installer on the Mac than on Windows, but the files for both will be in the branch. Your build script on the Mac will simply utilize the Mac-specific installer files and ignore the Windows-specific files.
Keeping things in one branch and just ignoring what doesn't apply to the current platform allows you merge back and forth between topic branches and the master, making your life much more sane.
Branching to work out compatibility for a target platform is doable. Just be sure to separate out changes that don't have to do with the target platform specifically into another branch.

Can I use Java NIO2?

I am working on a project where I have to implement a distributed file system, so for I/O operations I was thinking of using NIO2 (JDK7).
JDK7 will be released in August next year.
My questions are:
Would it be a good idea to use NIO2 from a snapshot of JDK7? What problems might I face?
If my code uses both JDK6 and JDK7 classes, is it possible to compile it using JDK7?
1 - Would it be a good idea to use NIO2 from a snapshot of JDK7? What problems might I face?
For a student / research project I see no major issues, apart from the general ones like:
new APIs may still be in a state of flux, and may change without notice,
you are more likely to encounter JDK/JRE/JVM bugs, and
people who want to try out your project have to use JDK 7.
For a project that needs to go into production before the actual release of JDK 7, you probably should be more cautious.
2 - If my code uses both JDK6 and JDK7 classes, is it possible to compile it using JDK7?
You cannot be certain until you actually try it, but I'd be very surprised if the answer is anything other than "yes". The Java team are very aware of the need to maintain backwards compatibility.
(However, it is unlikely that you'll be able to compile using JDK 6 ... unless they decide that it is technically feasible and worthwhile to provide a backport of the feature for JDK 6. For something like NIO2, it could be "no" on both counts.)

How does one go about understanding GNU source code?

I'm really sorry if this sounds kind of dumb. I just finished reading K&R and I worked on some of the exercises. This summer, for my project, I'm thinking of re-implementing a Linux utility to expand my understanding of C further, so I downloaded the source for GNU tar and sed, as they both seem interesting. However, I'm having trouble understanding where it all starts, where the main implementation is, where all the weird macros came from, etc.
I have a lot of time so that's not really an issue. Am I supposed to familiarize myself with the GNU toolchain (ie. make, binutils, ..) first in order to understand the programs? Or maybe I should start with something a bit smaller (if there's such a thing) ?
I have little bit of experience with Java, C++ and python if that matters.
Thanks!
The GNU programs are big and complicated. The size of GNU Hello World shows that even the simplest GNU project needs a lot of code and configuration around it.
The autotools are hard to understand for a beginner, but you don't need to understand them to read the code. Even if you modify the code, most of the time you can simply run make to compile your changes.
To read code, you need a good editor (VIM, Emacs) or IDE (Eclipse) and some tools to navigate through the source. The tar project contains a src directory; that is a good place to start. A program always starts with the main function, so do
grep main *.c
or use your IDE to search for this function. It is in tar.c. Now, skip all the initialization stuff, until
/* Main command execution. */
There, you see a switch for subcommands: if you pass -x it does this, if you pass -c it does that, etc. This is the branching structure for those commands. If you want to know what these macros are, run
grep EXTRACT_SUBCOMMAND *.h
There you can see that they are listed in common.h.
Below EXTRACT_SUBCOMMAND you see something funny:
read_and (extract_archive);
The definition of read_and() (again obtained with grep):
read_and (void (*do_something) (void))
The single parameter is a function pointer, like a callback, so read_and will presumably read something and then call the function extract_archive. Again, grep for it and you will see this:
if (prepare_to_extract (current_stat_info.file_name, typeflag, &fun))
  {
    if (fun && (*fun) (current_stat_info.file_name, typeflag)
        && backup_option)
      undo_last_backup ();
  }
else
  skip_member ();
Note that the real work happens when calling fun. fun is again a function pointer, which is set in prepare_to_extract. fun may point to extract_file, which does the actual writing.
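To make the pattern concrete, here is a minimal, self-contained sketch of the same callback idea (the names are illustrative; this is not tar's actual code):
#include <stdio.h>
static void extract_archive(void)
{
    puts("extracting one archive member");
}
/* a read_and-style driver: do the reading, then delegate via the callback */
static void read_and(void (*do_something)(void))
{
    /* ...read and validate an archive header here... */
    do_something();
}
int main(void)
{
    read_and(extract_archive);    /* just like read_and (extract_archive); in tar.c */
    return 0;
}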
I hope I have walked you through this in some detail and shown you how I navigate through source code. Feel free to contact me if you have related questions.
The problem with programs like tar and sed is twofold (this is just my opinion, of course!). First of all, they're both really old. That means they've had multiple people maintain them over the years, with different coding styles and different personalities. For GNU utilities, it's usually pretty good, because they usually enforce a reasonably consistent coding style, but it's still an issue. The other problem is that they're unbelievably portable. Usually "portability" is seen as a good thing, but when taken to extremes, it means your codebase ends up full of little hacks and tricks to work around obscure bugs and corner cases in particular pieces of hardware and systems. And for programs as widely ported as tar and sed, that means there's a lot of corner cases and obscure hardware/compilers/OSes to take into account.
If you want to learn C, then I would say the best place to start is not trying to study code that others have written. Rather, try to write code yourself. If you really want to start with an existing codebase, choose one that's being actively maintained where you can see the changes that other people are making as they make them, follow along in the discussions on the mailing lists and so on.
With well-established programs like tar and sed, you see the result of the discussions that would've happened, but you can't see how software design decisions and changes are being made in real-time. That can only happen with actively-maintained software.
That's just my opinion of course, and you can take it with a grain of salt if you like :)
Why not download the source of the coreutils (http://ftp.gnu.org/gnu/coreutils/) and take a look at tools like yes? Less than 100 lines of C code and a fully functional, useful and really basic piece of GNU software.
GNU Hello is probably the smallest, simplest GNU program and is easy to understand.
I know it's sometimes a mess to navigate through C code, especially if you're not familiar with it. I suggest you use a tool that will help you browse through the functions, symbols, macros, etc. Then look for the main() function.
You need to familiarize yourself with the tools, of course, but you don't need to become an expert.
Learn how to use grep if you don't know it already, and use it to search for the main function and everything else that interests you. You might also want to use code-browsing tools like ctags or cscope, which can integrate with vim and emacs, or use an IDE if you like that better.
I suggest using ctags or cscope for browsing. You can use them with vim/emacs. They are widely used in the open-source world.
They should be in the repositories of every major Linux distribution.
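Typical usage looks like this, run from the top of the source tree:
ctags -R .        # build a tags file; in vim, Ctrl-] then jumps to a definition
cscope -Rb        # build a cscope database; query it with :cs find in vim
grep -rn main .   # the quick-and-dirty fallback when you have no index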
Making sense of some code which uses a lot of macros, utility functions, etc, can be hard. To better browse the code of a random C or C++ software, I suggest this approach, which is what I generally use:
Install Qt development tools and Qt Creator
Download the sources you want to inspect, and set them up for compilation (usually just ./configure for GNU stuff).
Run qmake -project in the root of the source directory, to generate a Qt .pro file for Qt Creator.
Open the .pro file in Qt Creator (do not use shadow build, when it asks).
Just to be safe, in Qt Creator Projects view, remove the default build steps. The .pro file is just for navigation inside Qt Creator.
Optional: set up custom build and run steps, if you want to build and run/debug under Qt Creator. Not needed for navigation only.
Use Qt Creator to browse the code. Note especially the locator (kb shortcut Ctrl+K) to find stuff by name, and "follow symbol under cursor" (kb shortcut F2), and "find usages" (kb shortcut Ctrl-Shift-U).
I had to take a look at "sed" just to see what the problem was; it shouldn't be that big. I looked, and I see what the issue is, and I feel like Charlton Heston catching first sight of a broken statue on the beach. All of what I'm about to describe for "sed" might also apply to "tar". But I haven't looked at it (yet).
A lot of GNU code got seriously grunged up - to the point of unmaintainable morbid legacy - for reasons I don't know. I don't know exactly when it happened, maybe late 1990's or early 2000's, but it was like someone flipped a switch and suddenly, all the nice modular mostly self-contained code widgets got massively grunged with all sorts of extraneous entanglements having little or no connection to what the application itself was trying to do.
In your case, "sed": an entire library got (needlessly) dragged in with the application. This was the case at least as early as version 4.2 (the last version predating your query), probably before that - I'd have to check.
Another thing that got grunged up was the build system (again) to the point of unmaintainability.
So, you're really talking about legacy rescue here.
My advice ... which is generic for any codebase that's been around a long time ... is to dig as deep as you can and go back to its earliest forms first; and to branch out and look at other "sed"'s - like those in the UNIX archive.
https://www.tuhs.org/Archive/
or in the BSD archive:
https://github.com/freebsd
https://github.com/weiss/original-bsd
(the second one goes deeper into early BSD in its earlier commits.)
Many of the "sed"'s on the GNU page - but not all of them - may be found under "Downloads" as a link "mirrors" on the GNU sed page:
https://www.gnu.org/software/sed/
Version 1.18 is still intact. Version 1.17 is implicitly intact, since there is a 1.17-to-1.18 diff present there. Neither version has all the extra stuff piled on top of it. It's more representative of what GNU software looked like before becoming knotted up with all the entanglements.
It's actually pretty small - only 8863 lines for the *.c and *.h files, in all. Start with that.
For me the process of analysis of any codebase is destructive of the original and always entails a massive amount of refactoring and re-engineering; and simplification coming from just writing it better and more natively, while yet keeping or increasing its functionality. Almost always, it is written by people who only have a few years' experience (by which I mean: less than 20 years, for instance) and have thus not acquired full-fledged native fluency in the language, nor the breadth of background to be able to program well.
For this, if you do the same, it's strongly advised that you have some test suite already in place or added. There's one in the version 4.2 software, for instance, though it may be stress-testing new capabilities added between 1.18 and 4.2. Just be aware of that. (So, it might require reducing the test suite to fit 1.18.) Every change you make has to be validated by whatever tests you have in your suite.
You need to have native fluency in the language ... or else the willingness and ability to acquire it by carrying out the exercise and others like it. If you don't have enough years behind you, you're going to hit a soft wall. The deeper you go, the harder it might be to move forward. That's an indication that you're not experienced enough yet, and that you don't have enough breadth. So, this exercise then becomes part of your learning experience, and you'll just have to plod through.
Because of how early the first versions date from, you will have to do some rewriting anyhow, just to bring the code up to standard. Later versions can be used as a guide for this process. At a bare minimum, it should be brought up to C99, as this is virtually mandated as part of POSIX. In other words, you should be at least as up to date as the present century!
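For instance, a typical compiler invocation for flushing out pre-C99 constructs while you modernize (the file name is illustrative):
gcc -std=c99 -pedantic -Wall -Wextra -c sed.c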
Just the challenge of getting it to be functional will be exercise enough. You'll learn a lot of what's in it, just by doing that. The process of getting it to be operational is establishing a "baseline". Once you do that, you have your own version, and you can start with the "analysis".
Once a baseline is established, then you can proceed full throttle forward with refactoring and re-engineering. The test suite helps to provide cover against stumbles and inserted errors. You should keep all the versions that you have (re)made in a local repository so that you can jump back to earlier ones, in case you need to track down the sudden emergence of test failures or other bugs. Some bugs, you may find, were rooted all the way back in the beginning (thus: the discovery of hidden bugs).
After you have the baseline (re)written to your satisfaction, you can proceed to layer in the subsequent versions. On GNU's archive, 1.18 jumps straight to 2.05. You'll have to make a diff between the two to see where all the changes were, and then graft them into your version of 1.18 to get your version of 2.05 (see the sketch below). This will help you better understand both the issues that the changes addressed and what changes were made.
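A sketch of that diff-and-graft step (the directory names are illustrative):
diff -ru sed-1.18 sed-2.05 > 1.18-to-2.05.diff    # every change between the two releases
cd my-sed-1.18
patch -p1 < ../1.18-to-2.05.diff                  # graft the changes onto your tree
Expect to resolve some rejected hunks by hand wherever your baseline has diverged.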
At some point you're going to hit GNU's Grunge Wall. Version 2.05 jumped straight to 3.01 in GNU's historical archive. Some entanglements started slipping in with version 3.01. So, it's a soft wall we have here. But there's also an early test suite with 3.01, which you should use with 1.18, instead of 4.2's test suite.
When you hit the Grunge Wall, you'll see directly what the entanglements were, and you'll have to decide whether to go along for the ride or cast them aside. I can't tell you which direction is the rabbit hole, except that SED has been perfectly fine for a long time, most or all of it is what is listed in and mandated by the POSIX standard (even the current one), and what's there before version 3 serves that end.
I ran diffs. Between 2.05 and 3.01, the diff file is 5000 lines. OK. That's (mostly) fine and is natural for code that's in development, but some of that may be coming from the soft Grunge Wall. Running a diff on 3.01 versus 4.2 yields a diff file that is over 60000 lines. You need only ask yourself: how can a program that's under 10000 lines - and that abides by an international standard (POSIX) - produce 60000 lines of differences? The answer is: that's what we call bloat. So, between 3.01 and 4.2, you're witnessing a problem that is very common to code bases: the rise of bloat.
So, that pretty much tells you which direction ("go along for the ride" versus "cast it aside") is the rabbit hole. I'd probably just stick with 3.01, do a cursory review of the differences between 3.01 and 4.2 and of the change logs to get an overview of what the changes were, and leave it at that, except maybe to find a different way to write whatever they thought it was necessary to change, if the reason for it was valid.
I've done legacy rescue before, before the term "legacy" was even in most people's vocabulary and am quick to recognize the hallmark signs of it. This is the kind of process one might go through.
We've seen it happen with some large codebases already. In effect, the superseding of X11 by Wayland was a massive exercise in legacy rescue. It's also possible that the ongoing superseding of GNU's gcc by clang may be considered an instance of that.

make and alternatives, pros and cons on windows platform

I'm looking for a make platform. I've read a little about GNU make and that it's got some issues on Windows platforms (from slash/backslash to shell determination ...), so I would like to hear what my alternatives to it are.
If it matters, I'm doing Fortran development combined with (very) little C on small-sized projects (50k lines max), but I don't think that matters, since most of these issues are language-agnostic.
What are GNU make's drawbacks, and what alternatives do I have, with what advantages?
There are a couple of good tools for continuous integration and building on Windows. The two I have in mind are NAnt, which describes itself as a .NET build tool but could be used to build anything - it's open source and very extensible, although the UI is lacking - and Hudson, which I've recently started to use and which is brilliant; its output is way better than NAnt's, making it much easier to use. I have zero experience using these tools with Fortran, though, so good luck there.
My thought on make and its derivatives is to avoid them based on their age: make was a good tool in its time, but it must be 20 years old now, and the technology (even in the build area) has moved on a fair bit since then.
You can have a look at CMake. It's a kind of "meta-make" system: you write a description file for it which says how your project is structured and what libs and sources it needs, and it can generate make-files for GNU make and nmake (I believe), and project files for KDevelop and Visual Studio.
KDE adopted it from KDE4 onwards, and it has been greatly enhanced since: CMake
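Typical usage, assuming the project provides a CMakeLists.txt (an out-of-source build keeps the generated files separate):
mkdir build && cd build
cmake ..          # generate Makefiles here (or use -G to target Visual Studio, etc.)
make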
Another such system is Bakefile, which was built to generate make-files and project files for the wxWidgets GUI toolkit. It can be used for non-wx applications too, and is relatively young and modern (it uses XML as its makefile description).
There is also nmake, which is Microsoft's version of make. I would recommend sticking with GNU make, though. My advice is to always use Unix-like forward slashes; they also work on Windows. GNU make is widely used, so you can easily find tutorials and get advice about its use. It is also a better investment, since you can use it in other areas in the future. Finally, it is much richer in functionality.
I use GNU make under Windows and have no problems with it. However, I also use bash as my shell. Both make and bash are available as part of the Cygwin package from www.cygwin.com and I strongly recommend you install bash & all the common command line tools (grep, sed etc.) if you are going to use make from the command line.
Make has stood the test of time, even on Windows, and I use it every day, but there's also msbuild.
Details, details...
Given your small project, I would just start with MS nmake. Then if that doesn't suffice, move on to GNU make. The other advice above is also good. Ant and CMake are fine, but you don't need them, and there are so many make users who can help you if you have problems.
For that matter, since you are on Windows, doesn't the MS IDE have build tools built in? Just click and go.
Keep it simple. Plan to throw the first one away; you will anyway.
Wikipedia also has this to say:
http://en.wikipedia.org/wiki/List_of_build_automation_software
