I'm working through a book exercise to generate some random serial number, and here's my function:
NSString *randomSerialNumber = [NSString stringWithFormat:@"%c%c%c%c%c",
'0' + random() % 10,
'A' + random() % 26,
'0' + random() % 10,
'A' + random() % 26,
'0' + random() % 10];
This works and has an output like: 2J6X7. Before that, though, I had the 0s and As wrapped in double quotes, and an example output was 11764ıÒ˜. What did I do wrong my first time around, and why did using single quotes fix it?
The difference between single and double quotes is that double quotes declare a string, and single quotes declare a single character. Try compiling this and you will get a warning or an error, depending on the compiler:
'More than one character'
The reason that your code outputted a bunch of random characters is that strings are not integers like most other data types, but pointers. This means that when you type "A string", the result of the expression is the memory location that the characters are stored at. This could be anywhere in memory, depending on when you start the program. So, when you added random() to the string, it gave you a random memory address! The statement was equivalent to this in English:
Store the characters "A" in memory, and then give me the memory
address a random amount of cells later.
The random amount of cells later could be anything else in your program. The pointer was interpreted as a character (because of %c), which it was never meant to be, giving you seemingly random output.
The single quotes are giving you an ASCII representation of each character (char), to which you are adding a number. The double quotes give you a string (char *), to which adding a number really doesn't make much sense.
I might be wrong here, but I think at least the first half is right!
Single quotes are used for single character constants. Double quotes are used for strings, either C strings or NSStrings (which have '@' prefixes). When you used double quotes, the + meant pointer arithmetic. The resulting pointers were passed where characters were expected, which leads to undefined behavior. The correct version does integer arithmetic on the chars, which works as expected.
The numbers that you use for the modulus control how many ASCII characters after the one in single quotes can show up in the final string. Take a look at this table: http://www.asciitable.com/, for a reference on the ASCII numbers.
'0' + random() % 10 will give you characters from '0' to '9'
'A' + random() % 26 will give you characters from 'A' to 'Z'
Apologies if this is really obvious..
I'd like to add a question to this: is there a nice way to generate a similar string using regular expressions instead?
Adrian's answer is right. I'm just trying to add clarification.
"A" does three things.
It reserves space for 2 characters, the A and a nul terminator.
It puts the codes which represent those two characters in that space.
It returns the address of the first one.
That address is of the type: char *
'A' just returns the value of the code for A,
the same value as was put in the first memory cell above.
the value is of type int in C (char in C++), and it fits in a char.
So as you can see there is a big difference.
This is the code I used to define an array:
int characters[126];
After that, I wanted to record the frequency of each character read, so I used a while loop like this:
while((a=getchar())!=EOF){
characters[a]=characters[a]+1;
}
Then using a for loop I print the values of integers in the array.
How exactly is this working?
Does C assign a specific number for letters ie. a,b,c, etc in the array?
What happens when we make an array defined using characters instead of integers in C?
Let's be sure we are clear: you are using integer values returned by getchar() as indexes into your array. This is not defining the array, it is just accessing its elements.
Does C assign a specific number for letters ie. a,b,c, etc in the array?
There are no letters in the array. There are ints. However, yes, the characters read by getchar() are encoded as integer values, so they are, in principle, suitable array indexes. Thus, this line ...
characters[a]=characters[a]+1;
... reads the int value then stored at index a in array characters, adds 1 to it, and then assigns the result back to element a of the array, provided that the value of a is a valid index into the array.
More generally, it is important to understand that although one of its major uses is to represent characters, type char is an integer type. Its values are numbers. The mapping from characters to numbers is implementation and context dependent, but it is common enough for the mapping to be consistent with the ASCII code that you will often see programs that assume such a mapping.
Indeed, your code makes exactly such an assumption (and others) by allowing only for character codes less than 126.
You should also be aware that if your characters array is declared inside a function then it is not initialized. The code depends on all elements being zero initially. I would recommend this declaration instead:
int characters[UCHAR_MAX + 1] = {0};
That upper bound will be sufficient for all the non-EOF values returned by getchar(), and the explicit zero-initialization will ensure the needed initial values regardless of where the array is declared.
I have realized that the character set that can serve as input for getchar() is part of the ASCII table and is represented as int values. I used the following code to find that out:
#include <stdio.h>
int main(){
int a[128];
a['b']=4;
printf("%d",a[98]); //it is 98 as according to the table 'b' is assigned the value of 98
}
When I run this code, I get an output of 4.
I am really new to coding so feel free to correct me.
Character values are represented using some kind of integer encoding - ASCII (very common), EBCDIC (mostly IBM mainframes), UTF-8 (backward-compatible to ASCII), etc.
The character value 'a' maps to some integer value - 97 in ASCII and UTF-8, 129 in EBCDIC. So yes, you can use a character value to index into an array - arr['a']++ would be equivalent to arr[97]++ if you were using ASCII or UTF-8.
The C language does not dictate this - it's determined by the underlying platform.
I'm still new to the forum so I apologize in advance for forum - etiquette issues.
I'm having trouble understanding the differences between int arrays and char arrays.
I recently wrote a program for a Project Euler problem that originally used a char array to store a string of numbers, and later called specific characters and tried to use int operations on them to find a product. When I used a char string I got a ridiculously large product, clearly incorrect. Even if I converted what I thought would be compiled as a character (str[n]) to an integer in-line ((int)str[n]) it did the exact same thing. Only when I actually used an integer array did it work.
Code is as follows
for the char string
char str[21] = "73167176531330624919";
This did not work. I got an answer of about 1.5 trillion for an answer that should have been about 40k.
for the int array
int str[] = {7,3,1,6,7,1,7,6,5,3,1,3,3,0,6,2,4,9,1,9};
This is what did work. I took off the in-line type casting too.
Any explanation as to why these things worked/did not work and anything that can lead to a better understanding of these ideas will be appreciated. Links to helpful stuff are as well. I have researched strings and arrays and pointers plenty on my own (I'm self taught as I'm in high school) but the concepts are still confusing.
Side question, are strings in C automatically stored as arrays or is it just possible to do so?
To elaborate on WhozCraig's answer, the trouble you are having does not have to do with strings, but with the individual characters.
Strings in C are stored by and large as arrays of characters (with the caveat that there exists a null terminator at the end).
The characters themselves are encoded in a system called ASCII, which assigns codes from 0 to 127 to the characters used in the English language (only). Thus '7' is not stored as 7 but as the ASCII code for '7', which is 55.
I think now you can see why your product got so large.
One elegant way to fix would be to convert
int num = (int) str[n];
to
int num = str[n] - '0';
//thanks for fixing, ' ' is used for characters, " " is used for strings
This solution subtracts the ASCII code for '0' from the ASCII code for your character, say '7'. Since the digits are encoded consecutively, this works for a single digit. For larger numbers, you should use atoi or strtol from stdlib.h.
Strings are just character arrays with a null terminating byte.
There is no separate string data type in c.
When using a char as an integer, the numeric ascii value is used. For example, saying something like printf("%d\n", (int)'a'); will result in 97 (the ascii value of 'a') being printed.
You cannot do numeric calculations directly on a string of digits; each character has to be converted to its integer value first. To convert a digit character into its integer form, you can do something like this:
char a = '2';
int a_num = a - '0';
//a_num now stores integer 2
This causes the ascii value of '0' (48) to be subtracted from ascii value '2' (50), finally leaving 2.
char str[21] = "73167176531330624919";
this code is equivalent to
char str[21] = {'7','3','1','6','7','1','7','6','5','3',
'1','3','3','0','6','2','4','9','1','9','\0'};
so what is stored in str is not the numbers themselves but chars, whose ASCII codes differ from the digits they display.
Side question answer: yes and no. Strings are automatically stored as char arrays, but a string has an extra character ('\0') as its last element, whereas a plain char array need not have one.
I ripped this from an ebook on C programming.
I understand that the ASCII representations of the characters '0' and '9' are integers, so I understand the compatibility with the integer array. I am simply not sure how the shown output is computed. The input is the code itself.
What does this statement mean?
++ndigit[c-'0'];
So, is the program essentially checking if the input is one of the first 10 installments of the ASCII code table?
No, it doesn't.
c - '0' subtracts the (not necessarily ASCII) character code of the character 0 from that of c. This will yield a number between 0 and 9 if c is a digit. Then, the resulting integer is used to index the zero-initialized ndigit array using the [] operator, and the prefix increment operator (++) is then used to increment the element at that particular index.
By the way, the code is erroneous at multiple places. I suggest you switch to another book because this one appears to be either outdated and/or encouraging the use of several types of bad programming practice.
First, main() doesn't have a return type, which is an error. It needs to be declared as int main() or int main(void) or int main(int, char **). Older compilers assumed an implicit int return type when it was omitted, but that rule was removed from the language in C99.
Second, it would be better to initialize the ndigit array, like this:
int ndigit[10] = { 0 };
The for loop is superfluous because we can use an initializer; it's also less readable than the initialization syntax, and it's dangerous: the author doesn't calculate the element count of the array using sizeof(ndigit) / sizeof(ndigit[0]) but hardcodes its length, which may cause a buffer overrun when the length of the array is changed (decreased) and the hard-coded length in the for loop is forgotten about.
The program computes the number of times a digit between 0 and 9 was introduced as input, how many white spaces and how many other characters were in the input.
++ndigit[c-'0'];
'0' - as integer is the ASCII code for 0.
c - is the read character (its ASCII code)
c - '0' = the actual digit (between 0 and 9) represented by the ASCII code c.
For example '3'(ASCII) would be 3(digit=integer) + '0'(ASCII)
So that's how you obtain the index in the array for your digit and you increment the number of times that digit showed up.
I am working on a C program that I did not write and integrating it with my C++ code. This C program has a character array and uses the putc function to print its contents. Like this:
printf("%c\n","01"[b[i]]);
This is a bit array and can have either ASCII 0 or ASCII 1 (NOT ASCII 48 and 49 PLEASE NOTE). This command prints "0" and "1" perfectly. However, I did not understand the use of "01" in the putc command. I can also print the contents like this:
printf("%d\n",b[i]);
Hence I was just curious. Thanks.
Newbie
The "01" is a string literal, which for all intents and purposes is an array. It's a bit weird-looking... you could write:
char *characters = "01";
printf("%c\n", characters[b[i]]);
or maybe even better:
char *characters = "01";
int bit = b[i];
printf("%c\n", characters[bit]);
And it would be a little easier to understand at first glance.
Nasty way of doing the work, but whoever wrote this was using the contents of b as an array dereference into the string, "01":
"foo"[0] <= 'f'
"bar"[2] <= 'r'
"01"[0] <= '0'
"01"[1] <= '1'
your array, b, contains 0s and 1s, and the author wanted a way to quickly turn those into '0's and '1's. He could, just as easily have done:
'0' + b[i]
But that's another criminal behavior. =]
The string "01" is treated as a character array (which is what strings are in C), and b[i] is either 0 or 1, so the "decomposed" view of it would be:
"01"[0]
or
"01"[1]
Which would select the "right" character from the char array "string". Note that this is only possible in C because a string literal is an array of char that decays to a pointer to its first character. Thus, the [...] operation becomes a memory offset operation, scaled by the size of one item of the pointed-to type (in this case, one char).
Yes, your printf would be much better, as it requires less knowledge of obscure "c" tricks.
This line is saying take the array of characters "01" and reference an array element. Get that index from the b[i] location.
Thus "01"[0] returns the character 0 and "01"[1] returns the character 1
Do the statement you understand.
Simplifying the other one, by replacing b[i] with index, we get
"01"[index]
The string literal ("01") is of type char[3]. Getting its index 0 or 1 (or 2) is ok and returns the character '0' or '1' (or '\0').
I know my way around ruby pretty well and am teaching myself C starting with a few toy programs. This one is just to calculate the average of a string of numbers I enter as an argument.
#include <stdio.h>
#include <string.h>
main(int argc, char *argv[])
{
char *token;
int sum = 0;
int count = 0;
token = strtok(argv[1],",");
while (token != NULL)
{
count++;
sum += (int)*token;
token = strtok(NULL, ",");
}
printf("Avg: %d", sum/count);
printf("\n");
return 0;
}
The output is:
mike@sleepycat:~/projects/cee$ ./avg 1,1
Avg: 49
Which clearly needs some adjustment.
Any improvements and an explanation would be appreciated.
Look for sscanf or atoi as functions to convert from a string (array of characters) to an integer.
Unlike higher-level languages, C doesn't automatically convert between string and integral/real data types.
49 is the ASCII value of '1' char.
It should be helpful to you....:D
The problem is that the character '1' has the value 49. You have to convert the character value to an integer and then average.
In C if you cast a char to an int you just get the ASCII value of it. So, you're averaging the ascii value of the character 1 twice, and getting what you'd expect.
You probably want to use atoi().
EDIT: Note that this is generally true of all typecasts in C. C doesn't reinterpret values for you, it trusts you to know what exists at a given location.
strtok(
Please, please do not use this. Even its own documentation says never to use it. I don't know how you, as a Ruby programmer, found out about its existence, but please forget about it.
(int)*token
This is not even close to doing what you want. There are two fundamental problems:
1) A char* does not "contain" text. It points at text. token is of type char*; therefore *token is of type char. That is, a single byte, not a string. Note that I said "byte", not "character", because the name char is actually wrong - an understandable oversight on the part of the language designers, because Unicode did not exist back then. Please understand that char is fundamentally a numeric type. There is no real text type in C! Interpreting a sequence of char values as text is just a convention.
2) Casting in C does not perform any kind of magical conversions.
What your code does is to grab the byte that token points at (after the strtok() call), and cast that numeric value to int. The byte that is rendered with the symbol 1 actually has a value of 49. Again, interpreting a sequence of bytes as text is just a convention, and thus interpreting a byte as a character is just a convention - specifically, here we are using the convention known as ASCII. When you hit the 1 key on your keyboard, and later hit enter to run the program, the chain of events set in motion by the command window actually passed a byte with the value 49 to your program. (In the same way, the comma has a value of 44.)
Both of the above problems are solved by using the proper tools to parse the input. Look up sscanf(). However, you don't even want to pass the input to your program this way, because you can't put any spaces in the input - each "word" on the command line will be passed as a separate entry in the argv[] array.
What you should do, in fact, is take advantage of that, by just expecting each entry in argv[] to represent one number. You can again use sscanf() to parse each entry, and it will be much easier.
Finally:
printf("Avg: %d", sum/count)
The quotient sum/count will not give you a decimal result. Dividing an integer by another integer yields an integer in C, discarding the remainder.
In this line: sum += (int)*token;
Casting a char to an int takes the ASCII value of the char. For '1', this value is 49.
Use the atoi function instead:
sum += atoi(token);
Note atoi is found in the stdlib.h file, so you'll need to #include it as well.
You can't convert a string to an integer via
sum += (int)*token;
Instead you have to call a function like atoi():
sum += atoi (token);
When you cast a char (which is what *token is) to int, you get its ASCII value in C, which is 49 here. So the average of the chars' ASCII values is in fact 49. You need to use atoi to get the value of the number represented.