In a C program we usually declare only a small number of variables. Is it possible to declare an arbitrary number 'n' of variables?
For example: int a; int b; int c; ...
Or will the compiler give me an error?
Is there a maximum limit on the number of variable declarations?
A compiler is not obliged to have an upper limit. But if it has one, the standard restricts how small that limit may be.
C defines minimum values for these maximum limits.
See C11 5.2.4.1 Translation limits
Examples:
4095 external identifiers in one translation unit
511 identifiers with block scope declared in one block
127 parameters in one function definition
1023 members in a single structure or union
There is no explicit limit on the number of variables. The actual limit depends on the specific compiler, but the standard states the minimum number of scopes, local variables, and identifier lengths that an implementation must support to be called standard-conformant.
The limit itself is in no way connected to the memory available at runtime.
Any limit (constraint) on the source code must be explicitly given in the compiler documentation (this follows from the C standard's section on limits).
I know how to code in C well, but I thought of learning C from the book C - The Complete Reference by Herbert Schildt. Here is a quote from Chapter 2:
In C89, at least the first 6 characters of an external identifier and at least the first 31 characters of an internal identifier will be significant. C99 has increased these values. In C99, an external identifier has at least 31 significant characters, and an internal identifier has at least 63 significant characters.
Can somebody explain what it means to be significant?
It means that those characters are used by the compiler to distinguish between different names.
E.g. if only the first 6 characters are significant and you have two variables:
int abcdef_1;
int abcdef_2;
They will be treated as the same variable, and possibly the compiler will generate a warning or error.
About the minimal significance:
Maybe the compiler/assembler can handle more, but the linker cannot. Or maybe external tools which are out of control of the manufacturer of the assembler/linker can handle less, thus a minimum value (per type, internal/external) is defined in the C standard(s).
What is the maximum number of variables/identifiers you can have in C? While learning compiler theory and interpreter design, I've learned that identifiers and their values are stored in a symbol dictionary/hashmap.
Considering that hashmaps/dictionaries are limited by RAM, what is the maximum number of hashed identifiers possible in the C programming language?
In general the number of identifiers is a quality-of-implementation issue. All compilers I know are only limited by available resources (memory).
There is, however, a (nearly useless) specification of minimum limits in the C Standard, C11, emphasis for identifiers by me:
5.2.4.1 Translation limits
The implementation shall be able to translate and execute at least one
program that contains at least one instance of every one of the
following limits:
127 nesting levels of blocks
63 nesting levels of conditional inclusion
12 pointer, array, and function declarators (in any combinations) modifying an arithmetic, structure, union, or void type in a
declaration
63 nesting levels of parenthesized declarators within a full declarator
63 nesting levels of parenthesized expressions within a full expression
63 significant initial characters in an internal identifier or a macro name (each universal character name or extended source character
is considered a single character)
31 significant initial characters in an external identifier (each universal character name specifying a short identifier of 0000FFFF or
less is considered 6 characters, each universal character name
specifying a short identifier of 00010000 or more is considered 10
characters, and each extended source character is considered the same
number of characters as the corresponding universal character name, if
any)
4095 external identifiers in one translation unit
511 identifiers with block scope declared in one block
4095 macro identifiers simultaneously defined in one preprocessing translation unit
127 parameters in one function definition
127 arguments in one function call
127 parameters in one macro definition
127 arguments in one macro invocation
4095 characters in a logical source line
4095 characters in a string literal (after concatenation)
65535 bytes in an object (in a hosted environment only)
15 nesting levels for #included files
1023 case labels for a switch statement (excluding those for any nested switch statements)
1023 members in a single structure or union
1023 enumeration constants in a single enumeration
63 levels of nested structure or union definitions in a single struct-declaration-list
I consider it nearly useless due to the "at least one program" part. I think the intent is clear, but if your vendor sells you a compiler able to translate exactly one program testing these limits, then you won't get your money back :-)
The standard doesn't specify a limit so it's down to the compiler or interpreter to make the choice.
You should also note that identifiers can be compiled out in the final binary.
There does not seem to be any information in the C standard, but the C++ standard does mention some minimum recommendations which you probably could use as a guideline:
Annex B (informative)
Implementation quantities
[implimits]
(2.8) — Identifiers with block scope declared in one block [1 024].
I have the following code:
#include <stdio.h>
int main()
{
    int a12345678901234567890123456789012345;
    int a123456789012345678901234567890123456;
    int sum;
    scanf("%d", &a12345678901234567890123456789012345);
    scanf("%d", &a123456789012345678901234567890123456);
    sum = a12345678901234567890123456789012345 + a123456789012345678901234567890123456;
    printf("%d\n", sum);
    return 0;
}
The problem is that, as we know, the ANSI standard recognizes only the first 31 characters of a variable name, yet both variables are identical in their first 35 characters. Still, the program compiles without any error or warning and gives the correct output.
But how?
Shouldn't it give a redeclaration error?
Many compilers are built to exceed ANSI specification (for instance, in recognizing longer than 31 character variable names) as a protection to programmers. While it works in the compiler you're using, you can't count on it working in just any C compiler...
[...] we know that ANSI standard recognizes variables up to 31 characters [...] shouldn't it give an error of redeclaration?
Well, not necessarily. Since you mentioned ANSI C, this is the relevant part of the C89 standard:
"Implementation limits"
The implementation shall treat at least the first 31 characters of an internal name (a macro name or an identifier that does not have external linkage) as significant. Corresponding lower-case and upper-case letters are different. The implementation may further restrict the significance of an external name (an identifier that has external linkage) to six characters and may ignore distinctions of alphabetical case for such names. These limitations on identifiers are all implementation-defined.
Any identifiers that differ in a significant character are different identifiers. If two identifiers differ in a non-significant character, the behavior is undefined.
http://port70.net/~nsz/c/c89/c89-draft.html#3.1.2 (emphasis mine)
It's also explicitly described as a common extension:
Lengths and cases of identifiers
All characters in identifiers (with or without external linkage) are significant and case distinctions are observed (3.1.2)
http://port70.net/~nsz/c/c89/c89-draft.html#A.6.5.3
So, you're just exploiting a C implementation choice of your compiler.
The C89 rationale elaborates on this:
3.1.2 Identifiers
While an implementation is not obliged to remember more than the first
31 characters of an identifier for the purpose of name matching, the
programmer is effectively prohibited from intentionally creating two
different identifiers that are the same in the first 31 characters.
Implementations may therefore store the full identifier; they are not
obliged to truncate to 31.
The decision to extend significance to 31 characters for internal
names was made with little opposition, but the decision to retain the
old six-character case-insensitive restriction on significance of
external names was most painful. While strong sentiment was expressed
for making C ``right'' by requiring longer names everywhere, the
Committee recognized that the language must, for years to come,
coexist with other languages and with older assemblers and linkers.
Rather than undermine support for the Standard, the severe
restrictions have been retained.
Compilers like GCC may store the full identifier.
The number of significant initial characters in an identifier (C90 6.1.2, C90, C99 and C11 5.2.4.1, C99 and C11 6.4.2).
For internal names, all characters are significant. For external
names, the number of significant characters are defined by the linker;
for almost all targets, all characters are significant.
A conforming implementation must support at least 31 characters for an external identifier (and your identifiers are internal, where the limit is 63 for C99 and C11).
In fact, having all characters significant is the intent of the standard, but the committee doesn't want to make implementations non-conforming for not providing it. The limits for external identifiers originate from some linkers being unable to handle more (in C89, only 6 characters were required to be significant, which is why the old standard library functions have names no longer than 6 characters).
To be precise, the standard doesn't exactly mandate these limits, the language in the standard is quite permissive:
C11 (n1570) 5.2.4.1 Translation limits
The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits:18)
[...]
63 significant initial characters in an internal identifier or a macro name (each universal character name or extended source character is considered a single character)
31 significant initial characters in an external identifier (each universal character name specifying a short identifier of 0000FFFF or less is considered 6 characters, each universal character name specifying a short identifier of 00010000 or more is considered 10 characters, and each extended source character is considered the same number of characters as the corresponding universal character name, if any)19)
[...]
Footnote 18) clearly expresses the intent:
Implementations should avoid imposing fixed translation limits whenever possible.
Footnote 19) refers to Future language directions 6.11.3:
Restriction of the significance of an external name to fewer than 255 characters (considering each universal character name or extended source character as a single character) is an obsolescent feature that is a concession to existing implementations.
And to explain the permissiveness in the first sentence of 5.2.4.1, cf. the C99 rationale (5.10)
5.2.4 Environmental limits
The C89 Committee agreed that the Standard must say something about certain capacities and limitations, but just how to enforce these treaty points was the topic of considerable debate.
5.2.4.1 Translation limits
The Standard requires that an implementation be able to translate and execute some program that meets each of the stated limits. This criterion was felt to give a useful latitude to the implementor in meeting these limits. While a deficient implementation could probably contrive a program that meets this requirement, yet still succeed in being useless, the C89 Committee felt that such ingenuity would probably require more work than making something useful. The sense of both the C89 and C99 Committees was that implementors should not construe the translation limits as the values of hard-wired parameters, but rather as a set of criteria by which an implementation will be judged.
There is no limit.
Actually there is a limit: it has to be small enough to fit in memory, but otherwise no. If there is a built-in limit (I don't believe there is), it is so huge that you would be really hard-pressed to reach it. I generated C++ code with 2 variables whose names differ only in the last character, to ensure that names that long are distinct. I got to a 64KB file and decided that was enough.
I've just found this function definition in some embedded code:
float round_float_to_4(static float inputval);
I'm familiar with other uses for static (global variables, functions and local variables), but this is the first time I have seen it as a specifier for a function parameter. I assume that this forces the compiler to use a fixed memory location for inputval instead of the stack?
This is non-standard. I'd guess the same thing as you, and I'm not surprised to see such an extension in compilers with an embedded target.
That's not valid. The only valid place where static may be used in a function parameter that I'm aware of is in an array dimension:
float round_float_to_4(float inputval[static 4]);
This says that inputval will, in all calls to this function, point to memory providing at least 4 floats (this is a C99 addition; it doesn't appear in C89).
As per the C standard:
The only storage-class specifier that shall occur in a parameter
declaration is register.
Many embedded devices have a seriously limited stack. Such a feature would be of great benefit in reducing the chance of stack overflow, while still giving you the opportunity to write reentrant code.
Smaller chips don't have any opportunity to put variables on the stack, so all parameters are implicitly memory locations.
In my college days I read about the auto keyword, and over time I forgot what it is. It is defined as:
defines a local variable as having a
local lifetime
I have never seen it used anywhere. Is it really used, and if so, where and in which cases?
If you'd read the IAQ (Infrequently Asked Questions) list, you'd know that auto is useful primarily to define or declare a vehicle:
auto my_car;
A vehicle that's consistently parked outdoors:
extern auto my_car;
For those who lack any sense of humor and want "just the facts Ma'am": the short answer is that there's never any reason to use auto at all. The only time you're allowed to use auto is with a variable that already has auto storage class, so you're just specifying something that would happen anyway. Attempting to use auto on any variable that doesn't have the auto storage class already will result in the compiler rejecting your code. I suppose if you want to get technical, your implementation doesn't have to be a compiler (but it is) and it can theoretically continue to compile the code after issuing a diagnostic (but it won't).
Small addendum by kaz:
There is also:
static auto my_car;
which requires a diagnostic according to ISO C. This is correct, because it declares that the car is broken down. The diagnostic is free of charge, but turning off the dashboard light will cost you eighty dollars. (Twenty or less, if you purchase your own USB dongle for on-board diagnostics from eBay).
The aforementioned extern auto my_car also requires a diagnostic, and for that reason it is never run through the compiler, other than by city staff tasked with parking enforcement.
If you see a lot of extern static auto ... in any code base, you're in a bad neighborhood; look for a better job immediately, before the whole place turns to Rust.
auto is a modifier like static. It defines the storage class of a variable. However, since the default for local variables is auto, you don't normally need to manually specify it.
This page lists different storage classes in C.
The auto keyword is useless in the C language. It is there because before the C language there existed a B language in which that keyword was necessary for declaring local variables. (B was developed into NB, which became C).
Here is the reference manual for B.
As you can see, the manual is rife with examples in which auto is used. This is so because there is no int keyword. Some kind of keyword is needed to say "this is a declaration of a variable", and that keyword also indicates whether it is a local or external (auto versus extrn). If you do not use one or the other, you have a syntax error. That is to say, x, y; is not a declaration by itself, but auto x, y; is.
Since code bases written in B had to be ported to NB and to C as the language was developed, the newer versions of the language carried some baggage for improved backward compatibility that translated to less work. In the case of auto, the programmers did not have to hunt down every occurrence of auto and remove it.
It's obvious from the manual that the now obsolescent "implicit int" cruft in C (being able to write main() { ... } without any int in front) also comes from B. That's another backward compatibility feature to support B code. Functions do not have a return type specified in B because there are no types. Everything is a word, like in many assembly languages.
Note how a function can just be declared extrn putchar and then the only thing that makes it a function is that identifier's use: it is used in a function call expression like putchar(x), and that's what tells the compiler to treat that typeless word as a function pointer.
In C auto is a keyword that indicates a variable is local to a block. Since that's the default for block-scoped variables, it's unnecessary and very rarely used (I don't think I've ever seen it used outside of examples in texts that discuss the keyword). I'd be interested if someone could point out a case where the use of auto was required to get a correct parse or behavior.
However, in the C++11 standard the auto keyword has been 'hijacked' to support type inference, where the type of a variable can be taken from the type of its initializer:
auto someVariable = 1.5; // someVariable will have type double
Type inference is being added mainly to support declaring variables in templates or returned from template functions where types based on a template parameter (or deduced by the compiler when a template is instantiated) can often be quite painful to declare manually.
With the old Aztec C compiler, it was possible to turn all automatic variables to static variables (for increased addressing speed) using a command-line switch.
But variables explicitly declared with auto were left as-is in that case. (A must for recursive functions which would otherwise not work properly!)
The auto keyword is similar to the inclusion of semicolons in Python: it was required by a previous language (B), but developers realized it was redundant because most things were auto.
I suspect it was left in to help with the transition from B to C. In short, one use is for B language compatibility.
For example in B and 80s C:
/* The following function will print a non-negative number, n, to
the base b, where 2<=b<=10. This routine uses the fact that
in the ASCII character set, the digits 0 to 9 have sequential
code values. */
printn(n, b) {
    extern putchar;
    auto a;
    if (a = n / b) /* assignment, not test for equality */
        printn(a, b); /* recursive */
    putchar(n % b + '0');
}
auto can only be used for block-scoped variables. extern auto int is rubbish because the compiler can't determine whether this uses an external definition or whether to override the extern with an auto definition (also auto and extern are entirely different storage durations, like static auto int, which is also rubbish obviously). It could always choose to interpret it one way but instead chooses to treat it as an error.
There is one feature that auto does provide and that's enabling the 'everything is an int' rule inside a function. Unlike outside of a function, where a=3 is interpreted as a definition int a =3 because assignments don't exist at file scope, a=3 is an error inside a function because apparently the compiler always interprets it as an assignment to an external variable rather than a definition (even if there are no extern int a forward declarations in the function or in the file scope), but a specifier like static, const, volatile or auto would imply that it is a definition and the compiler takes it as a definition, except auto doesn't have the side effects of the other specifiers. auto a=3 is therefore implicitly auto int a = 3. Admittedly, signed a = 3 has the same effect and unsigned a = 3 is always an unsigned int.
Also note 'auto has no effect on whether an object will be allocated to a register (unless some particular compiler pays attention to it, but that seems unlikely)'
The auto keyword is a storage-class specifier (a storage class decides the lifetime of a variable and where it is stored). A variable declared with this keyword has a lifetime that is limited to the enclosing curly braces:
#include <stdio.h>

int main(void)
{
    auto int x = 8;
    printf("%d", x); // here x is 8
    {
        auto int x = 3; /* a new x that hides the outer one */
        printf("%d", x); // here x is 3
    }
    printf("%d", x); // here x is 8
    return 0;
}
I am sure you are familiar with storage class specifiers in C which are "extern", "static", "register" and "auto".
The definition of "auto" is pretty much given in the other answers, but here is a possible use of the "auto" keyword. I am not sure about it, and I think it is compiler-dependent.
You see, with respect to storage class specifiers, there is a rule: we cannot use multiple storage class specifiers for one variable. That is why static global variables cannot be declared extern. Therefore, they are known only to their file.
When you go to your compiler settings, you can enable an optimization flag for speed. One of the ways the compiler optimizes is that it looks for variables without storage class specifiers and then makes an assessment, based on the availability of cache memory and some other factors, to see whether it should treat a variable as if it had the register specifier. Now, what if we want to optimize our code for speed while knowing that a specific variable in our program is not very important, and we don't want the compiler to even consider it for a register? I thought that by writing auto, the compiler would be unable to add the register specifier to the variable, since typing "register auto int a;" or "auto register int a;" raises the error of using multiple storage class specifiers.
To sum it up, I thought auto could prohibit the compiler from treating a variable as register during optimization.
This theory did not work for the GCC compiler; however, I have not tried other compilers.