The point of C-style naming conventions

One of the classes I'm taking in college involves a ton of programming in C. We are supposed to use 'C'-style naming conventions in the assignments or get docked marks (e.g. a variable is named like int line_counter, a function clear_array()). I find this convention really annoying, especially coming after a year of Java where such things are named more conveniently, like lineCounter or clearArray(). Underscores are annoying, a hassle to type, and they increase the number of syntax errors. Why should this convention be followed? Is there some logic to it or some point behind it? Or is it just another trick to make 'C' even harder to write code in?

Style naming conventions are a matter of tradition, local agreement and uniformity. You have to get used to a different style because there's no guarantee, once out in the job market, that you will use the code convention you like. In this sense, the point is that you have to learn that the Java style is not the only style you will ever deal with.
As to whether it's a good decision or not, that's hard to say. I am as annoyed by styling violations as you are, even with tens of years of programming experience, but you cannot really expect to retrofit old code to new conventions. It takes a lot of non-productive time and screws everything up for the other programmers.
You can mitigate the problem of the slow-to-type underscore by using tab completion in your editor (e.g. vim). Writing a method name will just become typing a few letters and pressing tab. It's unlikely you will hit an underscore in the first few letters.

Getting used to a specific style convention is just a question of the number of lines of code you have written. In this case, what you find annoying in C because you were used to Java is also a convention in Python. Unlike you (I program mostly in Python), I prefer the underscored variable names (although I understand the Java ones are also very clear to read).
On the other hand, and as a curiosity, you probably know that the difficulty of typing a given character depends on your local keyboard layout. Many of the symbols used in C are hell to type on, for example, a Spanish keyboard.
What I think is a really bad idea is ignoring the standard conventions for the language and developing a custom convention instead. This is really bad for others and also for you, because all the documentation, code, etc. that you have to study or interact with will be written in the standard style for the language.

There is no such thing as "C-style naming conventions"; you will find quite a number of different styles both in C code and in C++ code. That said, you will just have to suck it up and go with the convention required by your professor. The purpose of style conventions, in general, is to reduce errors and to make it easy to infer information about a symbol without needing to look up its declaration. There are, however, many differences of opinion about which style or styles are best. Having a consistent style, though, is important for the understanding of the code base as a whole, and it is probably easier for your professor to grade and understand the homework if it is all written in the same, consistent style.
Pretty much any company you work for will require you to adhere to the company's coding convention, so it is not unreasonable for your professor to have similar requirements. Although Google's C++ coding conventions took some getting used to when I first started, it is undoubtedly a boon to the code's readability for it all to be in a consistent style. Nothing is more unintelligible than a mix of different styles.

I disagree with your teacher's decision to dock points for capitalization, but you're going to have to follow his or her instructions.
Grades are intended to reflect understanding of the material. I personally found it sufficient when teaching introductory C courses to grade on understanding. Beginners have enough difficulty mastering language constructs. It is unnecessary and cruel to dock points for trivialities.
The merits of your teacher's particular style, or of following a corporate style, are separate questions.


Does C have namespaces similar to C++?

From Programming Language Pragmatics by Michael Scott
"Modern versions of C and C++ include a namespace mechanism that provides module-like data hiding."
Does C have namespaces similar to C++?
Are the "identifier name spaces" mentioned in C in a Nutshell the "namespaces" mentioned in Scott's book, and similar to namespaces in C++?
Thanks.
No, C does not have a namespace mechanism whereby you can provide “module-like data hiding”.
book quality
I do not know anything about the book you cited, but the word “namespaces” is one of those that gets overloaded to a lot of different meanings, just like “window”. (I question the validity of anything the author says for getting such a major point about one of the world’s oldest and most widespread computer languages so brazenly wrong.)
name spaces in C
“Name spaces” in C are a completely different mechanism, working for a completely different purpose. These are the name spaces discussed in “C in a Nutshell”. The words mean something different than C++ namespaces. Since David Rankin bothered to look up the chapter and section in the C11 Standard, these are the name spaces used in C (a short example follows the list):
label names
struct/union/enum tags
struct/union members
everything else (including enum values)
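To make the list concrete, here is a minimal sketch of my own (not from the Standard or the book) in which the identifier node legally appears in all four name spaces at once, and a conforming compiler accepts it:

#include <stdio.h>

struct node {                      /* "node" as a struct tag           */
    int node;                      /* "node" as a struct member        */
};

int main(void)
{
    struct node node = { 42 };     /* "node" as an ordinary identifier */
    goto node;                     /* "node" as a label name           */
node:
    printf("%d\n", node.node);     /* prints 42                        */
    return 0;
}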
a quick blurb about scope
Keep in mind that this says nothing about scope, which is a separate mechanism. For example, a global variable and a variable local to a function may have the same name; nevertheless they share the same name space. The difference is that the global’s visibility is obscured by the local variable.
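A tiny illustration of that point (again my own example): both variables below live in the same “everything else” name space; the inner one simply hides the outer one inside the function.

#include <stdio.h>

int counter = 1;                    /* file scope ("global")            */

int main(void)
{
    int counter = 2;                /* same name space, inner scope     */
    printf("%d\n", counter);        /* prints 2: the global is hidden   */
    return 0;
}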
value of namespaces in C++
It is still unclear whether namespaces were a very useful extension to C++, and the argument as to its righteousness continues. The C crowd (mostly) agrees that the headache that adding namespaces would involve doesn’t justify the ends. I couldn’t find anything particularly useful on the interwebs right off the top of my keyboard, except for a couple of bland blurbs about emulating them using structs or (even worse) using macro abuse. If you really want to dig, you could probably find some useful discussions archived on the comp.lang.c newsgroup.
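For illustration only, a minimal sketch of the struct-based emulation mentioned above (the names, like MyMath, are invented; this is merely a convention, not a language feature):

#include <stdio.h>

static int my_math_add(int a, int b) { return a + b; }
static int my_math_sub(int a, int b) { return a - b; }

/* A file-scope struct of function pointers stands in for a "namespace". */
static const struct {
    int (*add)(int, int);
    int (*sub)(int, int);
} MyMath = { my_math_add, my_math_sub };

int main(void)
{
    printf("%d\n", MyMath.add(2, 3));   /* call site reads a bit like MyMath::add */
    printf("%d\n", MyMath.sub(5, 1));
    return 0;
}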
No, C has nothing like C++ namespaces. Most people have to fake what C++ does using a kind of underscore notation at best. This is at least what I do instead of trying to pack things into structs. Your IDE will still help with code assist; you just have to get used to using an underscore instead of a . for everything.
C++
MyNamespace::MyObject.myMethodOrVar ...
Ends up looking like this in C
MyNamespace_MyObject_myMethodOrVar
May not be as smooth as C++ or Java, but it works and still helps avoid name collision. It's just a pain in the ass.
And yes, this doesn't give you syntactic devices like using. It is what it is, I'm afraid.

What parser-generators with code separation and language extensibility would you recommend?

I'm looking for a context-free grammar parser generator with grammar/code separation and the possibility to add support for new target languages. For instance, if I want a parser in Pascal, I can write my own Pascal code generator without reimplementing the whole thing.
I understand that most open-source parser generators can in theory be extended, but I'd still prefer something that has extensibility planned and documented.
Feature-wise I need the parser to at least support Python-style indentation, maybe with some additional work. No requirement on the type of parser generated, but I'd prefer something fast.
Which are the most well-known/maintained options?
Popular parser generators seem to mostly use a mixed grammar/code approach, which I really don't like. The comparison list on Wikipedia lists a few, but I'm a novice at this and can't tell which to try.
Why I don't like mixing grammar/code: because this approach seems like a mess. Grammar is grammar, implementation details are implementation details. They're different things written in different languages, it's intuitive to keep them in separate places.
What if I want to reuse parts of grammar in another project, with different implementation details? What if I want to compile a parser in a different language? All of this requires grammar to be kept separate.
Most parser generators won't handle arbitrary context-free grammars. They handle some subset (LL(1), LL(k), LL(*), LALR(1), LR(k), ...). If you choose one of these, you will almost certainly have to hack your grammar to match the limitations of the parser generator (no left recursion, limited lookahead, ...). If you want a real context-free parser generator you want an Earley parser generator (inefficient), a GLR parser generator (the most practical of the lot), or a PEG parser generator (and the last isn't context-free; it requires rules to be ordered to determine which ones take precedence).
You seem to be worried about mixing syntax and parser-actions used to build the trees.
If the tree you build isn't a direct function of the syntax, there has to be some way to tie the tree-building machinery to the grammar productions. Placing it "near" the grammar production is one way, but leads to your "mixed" notation objection.
Another way is to give each rule a name (or some unique identifier), and set the tree-building machinery off to the side indexed by the names. This way your grammar isn't contaminated with the "other stuff", which seems to be your objection. None of the parser generator systems I know of do this. An awkward issue is that you now have to invent lots of rule names, and anytime you have a few hundred names that's inconvenient by itself and it is hard to make them mnemonic.
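To picture what "indexed by the names" could look like, here is a purely hypothetical C sketch (the Node and BuildFn types, the rule names, and on_reduce are all invented for illustration; no real parser generator is being quoted): the grammar file would carry only the rule names, and the tree-building actions would live in a table beside it.

#include <stddef.h>
#include <string.h>

typedef struct Node Node;                         /* opaque parse-tree node */
typedef Node *(*BuildFn)(Node **children, int n);

static Node *build_if_stmt(Node **c, int n)   { (void)c; (void)n; return NULL; }
static Node *build_expr_plus(Node **c, int n) { (void)c; (void)n; return NULL; }

/* Tree-building machinery kept off to the side, indexed by rule name. */
static const struct {
    const char *rule_name;
    BuildFn     build;
} actions[] = {
    { "if_stmt",   build_if_stmt   },
    { "expr_plus", build_expr_plus },
};

/* The generated parser would call this on every reduction. */
static Node *on_reduce(const char *rule_name, Node **children, int n)
{
    for (size_t i = 0; i < sizeof actions / sizeof actions[0]; i++)
        if (strcmp(actions[i].rule_name, rule_name) == 0)
            return actions[i].build(children, n);
    return NULL;                                  /* no action registered   */
}

int main(void)
{
    return on_reduce("if_stmt", NULL, 0) ? 1 : 0; /* normally parser-driven */
}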
A third way is to make the tree a direct function of the syntax, and auto-generate the tree-building steps. This requires no extra stuff off to the side at all to produce the ASTs. The only tool I know that does it (there may be others, but I've been looking for 20-odd years and haven't seen one) is my company's product, the DMS Software Reengineering Toolkit. [DMS isn't just a parser generator; it is a complete ecosystem for building program analysis and transformation tools for arbitrary languages, using a GLR parsing engine; yes, it handles Python-style indents.]
One objection is that such trees are concrete, bloated and confusing; if done right, that's not true.
My SO answer to the question “What is the difference between an Abstract Syntax Tree and a Concrete Syntax Tree?” discusses how we get the benefits of ASTs from automatically generated compressed CSTs.
The good news about DMS's scheme is that the basic grammar isn't bloated with parsing support. The not so good news is that you will find lots of other things you want to associate with grammar rules (prettyprinting rules, attribute computations, tree synthesis,...) and you come right back around to the same choices. DMS has all of these "other things" and solves the association problem a number of ways:
By placing other related descriptive formalisms next to the grammar rule (producing the mixing you complained about). We tolerate this for pretty-printing rules because in fact it is nice to have the grammar (parse) rule adjacent to the pretty-print (anti-parse) rule. We also allow attribute computations to be placed near the grammar rules to provide an association.
While DMS allows rules to have names, this is only for convenient access by procedural code, not associating other mechanisms with the rule.
DMS provides a third way to associate these mechanisms (esp. attribute grammar computations) by using the rule itself as a kind of giant name. So, you write the grammar and prettyprint rules in one place, and somewhere else you can write the grammar rule again with an associated attribute computation. In principle, this is just like giving each rule a name (well, a signature) and associating the computation with the name. But it also allows us to define many, many different attribute computations (for different purposes) and associate them with their rules, without cluttering up the base grammar. Our tools check that a (rule, associated-computation) pair has a valid rule in the base grammar, so it makes it relatively easy to track down what needs fixing when the base grammar changes.
This being my tool (I'm the architect), you shouldn't take this as a recommendation, just a bias. That bias is supported by DMS's ability to parse (without whimpering) C, C++, Java, C#, IBM Enterprise COBOL, Python, F77/F90/F95 (with column-6 continuations, F90 continuations, and embedded C preprocessor directives to boot, under most circumstances), Mumps, PHP4/5 and many other languages.
First off, any decent parser generator is going to be robust enough to support Python's indenting. That isn't really all that weird as languages go. You should try parsing column-sensitive languages like Fortran77 some time...
Secondly, I don't think you really need the parser itself to be "extensible" do you? You just want to be able to use it to lex and parse the language or two you have in mind, right? Again, any decent parser-generator can do that.
Thirdly, you don't really say what about the mix between grammar and code you don't like. Would you rather it be all implemented in a meta-language (kinda tough), or all in code?
Assuming it is the latter, there are a couple of in-language parser generator toolkits I know of. The first is Boost's Spirit, which is implemented in C++. I've used it, and it works. However, back when I used it you pretty much needed a graduate degree in "boostology" to be able to understand its error messages well enough to get anything working in a reasonable amount of time.
The other I know about is OpenToken, which is a parser-generation toolkit implemented in Ada. Ada doesn't have the error-novel problem that C++ has with its templates, so OpenToken is far easier to use. However, you have to use it in Ada...
Typical functional languages allow you to implement any sublanguage you like (mostly) within the language itself, thanks to their inherently good support for things like lambdas and metaprogramming. However, their parsers tend to be slower. That's really no problem at all if you are just parsing a configuration file or two. It's a tremendous problem if you are parsing hundreds of files at a go.

Has the use of C to implement other languages constrained their designs in any way?

It seems that most new programming languages that have appeared in the last 20 years have been written in C. This makes complete sense as C can be seen as a sort of portable assembly language. But what I'm curious about is whether this has constrained the design of the languages in any way. What prompted my question was thinking about how the C stack is used directly in Python for calling functions. Obviously the programming language designer can do whatever they want in whatever language they want, but it seems to me that the language you choose to write your new language in puts you in a certain mindset and gives you certain shortcuts that are difficult to ignore. Are there other characteristics of these languages that come from being written in that language (good or bad)?
I tend to disagree.
I don't think it's so much that a language's compiler or interpreter is implemented in C — after all, you can implement a virtual machine with C that is completely unlike its host environment, meaning that you can get away from a C / near-assembly language mindset.
However, it's more difficult to claim that the C language itself didn't have any influence on the design of later languages. Take for example the usage of curly braces { } to group statements into blocks, the notion that whitespace and indentation are mostly unimportant, the names of the native types (int, char, etc.) and other keywords, or the way variables are defined (i.e. type declaration first, followed by the variable's name and an optional initialization). Many of today's popular and widespread languages (C++, Java, C#, and I'm sure there are even more) share these concepts with C. (These probably weren't completely new with C, but AFAIK C came up with that particular mix of language syntax.)
Even with a C implementation, you're surprisingly free in terms of implementation. For example, Chicken Scheme uses C as an intermediate language, but still manages to use the stack as a nursery generation in its garbage collector.
That said, there are some cases where there are constraints. Case in point: the GHC Haskell compiler has a Perl script called the Evil Mangler to alter the GCC-outputted assembly code to implement some important optimizations. They've been moving to internally-generated assembly and LLVM partially for that reason. That said, this hasn't constrained the language design, only the compiler's choice of available optimizations.
No, in short. The reality is, look around at the languages that are written in C. Lua, for example, is about as far from C as you can get without becoming Perl. It has first-class functions, fully automated memory management, etc.
It's unusual for new languages to be affected by their implementation language, unless said language contains serious limitations. While I definitely disapprove of C, it's not a limited language, just very error-prone and slow to program in compared to more modern languages. Oh, except for the C runtime (CRT). For example, Lua doesn't contain directory functionality, because it's not part of the CRT, so they can't portably implement it in standard C. That is one way in which C is limited. But in terms of language features, it's not limited.
If you wanted to construct an argument saying that languages implemented in C have XYZ limitations or characteristics, you would have to show that doing things another way is impossible in C.
The C stack is just the system stack, and this concept predates C by quite a bit. If you study theory of computing you will see that using a stack is very powerful.
Using C to implement languages has probably had very little effect on those languages, though the familiarity with C (and other C like languages) of people who design and implement languages has probably influenced their design a great deal. It is very difficult to not be influenced by things you've seen before even when you aren't actively copying the best bits of another language.
Many languages do use C as the glue between them and other things, though. Part of this is that many OSes provide a C API, so to access that it's easy to use C. Additionally, C is just so common and simple that many other languages have some sort of way to interface with it. If you want to glue two modules together which are written in different languages then using C as the middle man is probably the easiest solution.
Where implementing a language in C has influenced other languages the most is probably in things like how escapes are done in strings, which isn't all that limiting.
The only thing that has constrained language design is the imagination and technical skill of the language designers. As you said, C can be thought of as a "portable assembly language". If that is true, then asking if C has constrained a design is akin to asking if assembly has constrained language design. Since all code written in any language is eventually executed as assembly, every language would suffer the same constraints. Therefore, the C language itself imposes no constraints that would be overcome by using a different language.
That being said, there are some things that are easier to do in one language vs another. Many language designers take this into account. If the language is being designed to be, say, powerful at string processing but performance is not a concern, then using a language with better built-in string processing facilities (such as C++) might be more optimal.
Many developers choose C for several reasons. First, C is a very common language. Open source projects in particular like that it is relatively easier to find an experienced C-language developer than it is to find an equivalently-skilled developer in some other languages. Second, C typically lends itself to micro-optimization. When writing a parser for a scripted language, the efficiency of the parser has a big impact on the overall performance of scripts written in that language. For compiled languages, a more efficient compiler can reduce compile times. Many C compilers are very good at generating extremely optimized code (which is also part of the reason why many embedded systems are programmed in C), and performance-critical code can be written in inline assembly. Also, C is standardized and is generally a static target. Code can be written to the ANSI/C89 standard and not have to worry about it being incompatible with a future version of C. The revisions made in the C99 standard add functionality but don't break existing code. Finally, C is extremely portable. If at least one compiler exists for a given platform, it's most likely a C compiler. Using a highly-portable language like C makes it easier to maximize the number of platforms that can use the new language.
The one limitation that comes to mind is extensibility and compiler hosting. Consider the case of C#. The compiler is written in C/C++ and is entirely native code. This makes it very difficult to use in-process from a C# application.
This has broad implications for the tooling chain of C#. Any code which wants to take advantage of the real C# parser or binding engine has to have at least one component which is written in native code. This eventually results in most of the tooling chain for the C# language being written in C++ which is a bit backwards for a language.
This doesn't limit the language per se, but it definitely has an effect on the experience around the language.
Garbage collection. Language implementations on top of Java or .NET use the VM's GC. Those on top of C tend to use reference counting.
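A minimal sketch of that reference-counting approach (my own illustration, loosely in the spirit of CPython's Py_INCREF/Py_DECREF rather than any particular runtime's real code):

#include <stdlib.h>

typedef struct {
    long refcount;
    /* ... the language-level object's actual data would live here ... */
} Object;

static Object *obj_new(void)
{
    Object *o = malloc(sizeof *o);
    if (o) o->refcount = 1;              /* creator holds one reference  */
    return o;
}

static void obj_incref(Object *o) { o->refcount++; }

static void obj_decref(Object *o)
{
    if (--o->refcount == 0)
        free(o);                         /* last reference dropped       */
}

int main(void)
{
    Object *o = obj_new();
    if (!o) return 1;
    obj_incref(o);                       /* another owner appears        */
    obj_decref(o);                       /* ...and goes away             */
    obj_decref(o);                       /* original owner done: freed   */
    return 0;
}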
One thing I can think of is that functions are not necessarily first-class members in the language, and this can't be blamed on C alone (I am not talking about passing a function pointer, though it can be argued that C provides you with that feature).
If one were to write a DSL in Groovy (/Scheme/Lisp/Haskell/Lua/JavaScript/and some more that I am not sure of), functions can become first-class members. Making functions first-class members and allowing anonymous functions lets you write concise and more human-readable code (as demonstrated by LINQ).
Yes, eventually all of these are running under C (or assembly if you want to get to that level), but in terms of providing the user of the language the ability to express themselves better, these abstractions do a wonderful job.
Implementing a compiler/interpreter in C doesn't have any major limitations. On the other hand, implementing a language X to C compiler does. For example, according to the Wikipedia article on C--, when compiling a higher level language to C you can't do precise garbage collection, efficient exception handling, or tail recursion optimization. This is the kind of problem that C-- was intended to solve.

Does C's FILE have an object-oriented interface?

Does the FILE type used through standard C functions fopen, etc. have an object-oriented interface?
I'm looking for opinions with reasoning rather than an absolute answer, as definitions of OO vary by who you ask. What are the important OO concepts it meets or doesn't meet?
In response to JustJeff's comment below, I am not asking whether C is an OO language, nor whether C (easily or not) allows OO programming. (Isn't that a separate issue?)
Is C an object-oriented language?
Was OOP (object-oriented programming) anything more than a laboratory concept when C and FILE were created?
Answering these questions will answer your question.
EDIT:
Further thoughts:
Object Oriented specifically means several behaviors, including:
Inheritance: Can you derive new classes from FILE?
Polymorphism: Can you treat derived classes as FILEs?
Encapsulation: Can you put a FILE inside another object?
Methods & Properties: Does a FILE have methods and properties specific to it? (e.g. myFile.Name, myFile.Size, myFile.Delete())
Although there are well known C "tricks" to accomplish something resembling each of these behaviors, this is not built in to FILE, and is not the original intent.
I conclude that FILE is not Object Oriented.
If the FILE type were "object oriented", presumably we could derive from it in some meaningful way. I've never seen a convincing instance of such a derivation.
Let's say I have a new hardware abstraction, a bit like a socket, called a wormhole. Can I derive from FILE (or socket) to implement it? Not really; I've probably got to make some changes to tables in the OS kernel. This is not what I call object orientation.
But this whole issue comes down to semantics in the end. Some people insist that anything that uses a jump-table is object oriented, and IBM have always claimed that their AS/400 boxes are object-oriented, through & through.
For those of you that want to dip into the pit of madness and stupidity that is the USENET comp.object newsgroup, this topic was discussed quite exhaustively there a few years ago, albeit by mad and stupid people. If you want to trawl those depths, the Google Groups interface is a good place to start.
Academically speaking, certainly the actual files are objects. They have attributes and you can perform actions on them. Doesn't mean FILE is a class, just saying, there are degrees of OO-ness to think about.
The trouble with trying to say that the stdio FILE interface qualifies as OO, however, is that the stdio FILE interface doesn't represent the 'objectness' of the file very well. You could use FILEs under plain old C in an OO way, but of course you forfeit the syntactic clarity afforded by Java or C++.
It should probably further be added that you can't derive ('inherit') from FILE, which further disqualifies it as OO, but you could argue that's more a fault of its environment (plain C) than of the abstract idea of the file-as-object itself.
In fact, you could probably make a case for FILE being something like a Java interface. In the Linux world, you can operate almost any kind of I/O device through the open/close/read/write/ioctl calls; the FILE functions are just covers on top of those. Therefore in FILE you have something like an abstract class that defines the basic operations (open/read/etc.) on an 'abstract I/O device', leaving it up to the various sorts of derived types to flesh those out with type-specific behavior.
Granted, it's very hard to see the OO in a pile of C code, and very easy to break the abstractions, which is why the actual OO languages are so much more popular these days.
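To make the analogy concrete, here is my own hedged sketch of that idea (io_device and its members are invented names; real FILE implementations differ in detail): the struct's function pointers act as the "virtual methods" that an actual device type fills in.

#include <stdio.h>
#include <string.h>

typedef struct io_device io_device;
struct io_device {
    size_t (*read)(io_device *self, void *buf, size_t n);
    void   (*close)(io_device *self);
    void   *state;                           /* device-specific data      */
};

/* One "derived type": a device that reads from an in-memory string. */
static size_t mem_read(io_device *self, void *buf, size_t n)
{
    const char *s = self->state;
    size_t len = strlen(s);
    if (n > len) n = len;
    memcpy(buf, s, n);
    return n;
}
static void mem_close(io_device *self) { (void)self; }

int main(void)
{
    io_device dev = { mem_read, mem_close, "hello" };
    char buf[16] = { 0 };
    dev.read(&dev, buf, sizeof buf - 1);     /* "virtual" call through pointer */
    printf("%s\n", buf);                     /* prints hello              */
    dev.close(&dev);
    return 0;
}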
It depends. How do you define an "object-oriented interface"? As the comments to abelenky's post show, it is easy to construct an argument that FILE is object-oriented. It depends on what you mean by "object-oriented". It doesn't have any member methods. But it does have functions specific to it.
It can not be derived from in the "conventional" sense, but it does seem to be polymorphic. Behind a FILE pointer, the implementation can vary widely. It may be a file, it may be a buffer in memory, it may be a socket or the standard output.
Is it encapsulated? Well, it is essentially implemented as a pointer. There is no access to the implementation details of where the file is located, or even the name of the file, unless you call the proper API functions on it. That sounds encapsulated to me.
The answer is basically whatever you want it to be. If you don't want FILE to be object-oriented, then define "object-oriented" in a way that FILE can't fulfill.
C has the first half of object orientation: encapsulation, i.e. you can have compound types like FILE* or structs. But you can't inherit from them, which is the second (although less important) half.
No. C is not an object-oriented language.
I know that's an "absolute answer," which you didn't want, but I'm afraid it's the only answer. The reasoning is that C is not object-oriented, so no part of it can have an "object-oriented interface".
Clarification:
In my opinion, true object-orientation involves method dispatch through subtype polymorphism. If a language lacks this, it is not object-oriented.
Object-orientation is not a "technique" like GTK. It is a language feature. If the language lacks the feature, it is not object-oriented.
If object-orientation were merely a technique, then nearly every language could be called object-oriented, and the term would cease to have any real meaning.
There are different definitions of oo around. The one I find most useful is the following (inspired by Alan Kay):
objects hold state (i.e. references to other objects)
objects receive (and process) messages
processing a message may result in
messages being sent to the object itself or other objects
a change in the object's state
This means you can program in an object-oriented way in any imperative programming language - even assembler. A purely functional language has no state variables, which makes oo impossible or at least awkward to implement (remember: LISP is not pure!); the same should go for purely declarative languages.
In C, message passing is most often implemented as function calls with a pointer to a struct holding the object's state as the first argument, which is the case for the file-handling API. Still, C as a language can't be classified as OO, as it doesn't have syntactic support for this style of programming.
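A tiny example of that style, using an invented counter "object" rather than FILE itself: the struct carries the state, and each "message" is an ordinary function taking a pointer to that state as its first argument.

#include <stdio.h>

typedef struct {
    long count;
} counter;

/* "Messages" are plain functions whose first argument is the receiver. */
static void counter_increment(counter *self)   { self->count++; }
static long counter_value(const counter *self) { return self->count; }

int main(void)
{
    counter c = { 0 };
    counter_increment(&c);                /* send "increment" to c        */
    counter_increment(&c);
    printf("%ld\n", counter_value(&c));   /* prints 2                     */
    return 0;
}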
Also, some other definitions of oo include things like class-based inheritance (so what about prototypal languages?) and encapsulation - which aren't really essential in my opinion - but some of them can be implemented in C with some pointer- and casting magic.

Code Obfuscation?

So, I have a penchant for Easter Eggs... this dates back to me being part of the founding community of the Easter Egg Archive.
However, I also do a lot of open source programming.
What I want to know is: what do you think is the best way to SYSTEMATICALLY and METHODICALLY obfuscate code?
Examples in PHP/Python/C/C++ are preferred, but other languages are fine, if the methodology is explained properly.
Compile the code with full optimization. Completely strip the binary.
Use a decompiler on the code.
I can guarantee the result will be so utterly unreadable that you won't even be able to read it ;)
In that case, you should use/write an "obfuscator". A program that does the job for you.
The Salamander Obfuscator can be used to obfuscate .Net programs, but it is more to prevent decompilation, thus not exactly what you need.
A good place to learn about obfuscation in C is the International Obfuscated C Code Contest.
In the spirit of renaming symbols: overuse scope and visibility rules by naming different variables with the same name.
The question is how to create seemingly non-obfuscated code in plain sight (open source) without it appearing to perform another function.
Some obvious methods (a small before/after example follows the list):
remove comments and as much whitespace as you can without breaking things
join lines
rename variables and functions to be meaningless (preferably 1 character)
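A throwaway before/after sketch of those three steps applied by hand (my own example; both versions print 55):
Readable version:
/* sum of squares up to a limit */
#include <stdio.h>

int sum_of_squares(int limit)
{
    int total = 0;
    for (int i = 1; i <= limit; i++)
        total += i * i;
    return total;
}

int main(void) { printf("%d\n", sum_of_squares(5)); return 0; }
After removing comments and whitespace, joining lines, and renaming everything to single characters:
#include <stdio.h>
int a(int b){int c=0;for(int d=1;d<=b;d++)c+=d*d;return c;}
int main(void){printf("%d\n",a(5));return 0;}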
For systematic and methodical obfuscation of code, you cannot beat Perl. If you want something that compiles to a binary, there is always APL.
If you are targeting the .NET framework, put your easter egg source code in a resource file as a binhex string. Then you can have one of your initialising routines fetch it, decode it and compile it into memory. You can invoke it using reflection.
If you need help with the technical aspects of compiling into memory and calling into the resultant assembly, I can give you a library I wrote and a sample program that uses it.
You can use this technology to load plug-ins, which is a legit thing to do and reasonable in an initialiser.
