Alphanumeric String to Unique Integer representation

My input data is a 4-character (or 3-character) alphanumeric ASCII string, and I need to convert each such string to a unique float: two digits for each half of the string, separated by a decimal point.
Ex:
Input string = 5405, output data = 54.05
Input string = 53BC, output data = 53.199 ( B ascii value is ~ 0x42 in hex and C is 0x43 )
The issue is that I get the same output for the input strings 560B and 5618: both result in 56.18.
Is there a way to uniquely generate a float number in these cases?
Max value of float allowed is 99.999.

Simple math tells us that this is not possible. The number of unique alphanumeric strings of length 4 (case-insensitive) is 36^4 = 1,679,616 while the number of non-negative unique floating point numbers with at most 3 fractional digits and less than 100 is 10^5 = 100,000.
If the string were restricted to hexadecimal digits, there would only be 16^4 = 65,536 possibilities in which case a unique encoding would be possible.
Slightly off-topic: when a mapping is needed into a domain which is too small to accommodate the result of a unique mapping, a hash function is the "standard tool", but collisions must be handled.
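Although a unique float within those limits is not possible, a unique integer is, if that is acceptable. The following is only a minimal sketch, assuming exactly 4 case-insensitive alphanumeric characters and an ASCII-like character set; the names base36_digit and base36_to_int are hypothetical. It reads the string as a base-36 number, so every distinct string gets a distinct value from 0 to 36^4 - 1.

```c
#include <ctype.h>

/* Hypothetical helper: value of one base-36 digit (0-9, then A-Z as 10-35),
   or -1 if the character is not alphanumeric. */
static int base36_digit(char ch)
{
    unsigned char c = (unsigned char)ch;
    if (isdigit(c))
        return c - '0';
    if (isalpha(c))
        return toupper(c) - 'A' + 10;  /* assumes contiguous letters, as in ASCII */
    return -1;
}

/* Map a string of exactly 4 alphanumeric characters to a unique integer
   in the range 0 .. 36^4 - 1 (1,679,615). Returns -1 on invalid input. */
long base36_to_int(const char *s)
{
    long value = 0;
    for (int i = 0; i < 4; i++) {
        int d = base36_digit(s[i]);    /* an early '\0' is rejected here too */
        if (d < 0)
            return -1;
        value = value * 36 + d;
    }
    return s[4] == '\0' ? value : -1;  /* reject strings longer than 4 */
}
```

With this sketch, 560B and 5618 map to different integers. A 3-character input would need a padding convention (for example a leading 0), which is an assumption outside the original question.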

Your encoding is somewhat confusing, but here is a simple solution:
use 2 digits for the integral part
use 2 digits for fractional parts 00 to 99
use a combination of 1 letter and 1 letter or digit for fractional parts 100 to 999. There are 26*36 = 936 such combinations, enough to cover the 900 possibilities.
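all values from 00.00 to 99.999 can be encoded, except those whose fractional part lies between .001 and .099 with a nonzero third digit (these fit neither the 2-digit form nor the 100 to 999 range).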
some 4 letter and digit combinations are not used.
the encoding is not unique. eg: 53A0 is 53.100, the same number as 53.10 encoded as 5310.
Here is an implementation:
#include <stdio.h>
#include <stdlib.h>

double fdecode(const char *s) {
    char a[7];
    a[0] = s[0];
    a[1] = s[1];
    a[2] = '.';
    if (s[2] >= '0' && s[2] <= '9') {
        // 2-digit fractional part: copy the last two characters as-is
        a[3] = s[2];
        a[4] = s[3];
        a[5] = '\0';
    } else {
        // assuming uppercase letters
        int n = 100 + (s[2] - 'A') * 36;
        if (s[3] >= '0' && s[3] <= '9') {
            n += s[3] - '0';
        } else {
            n += 10 + (s[3] - 'A') % 26;
        }
        snprintf(&a[3], 4, "%d", n);
    }
    return strtod(a, NULL);
}

int fencode(char *s, double d) {
    char a[7];
    if (d >= 0 && snprintf(a, 7, "%06.3f", d) == 6) {
        s[0] = a[0];
        s[1] = a[1];
        if (a[5] == '0') {
            s[2] = a[3];
            s[3] = a[4];
        } else {
            // subtract the 100 offset that fdecode adds back
            int n = atoi(a + 3) - 100;
            if (n < 0) {
                // fractional parts .001 to .099 have no letter code
                s[0] = '\0';
                return -1;
            }
            s[2] = 'A' + (n / 36);
            n %= 36;
            if (n < 10) {
                s[3] = '0' + n;
            } else {
                s[3] = 'A' + n - 10;
            }
        }
        s[4] = '\0';
        return 4;
    } else {
        s[0] = '\0';
        return -1;
    }
}

Related

=PSET 2 CAESAR= How do I convert ASCII range down to a value from 0 to 25?

I first did this:
// Convert ASCII range down to a value from 0 to 25
char uppercase[27] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char lowercase[27] = "abcdefghijklmnopqrstuvwxyz";
char convertedUppercase[27];
char convertedLowercase[27];
for (int i = 0; i <= 26; i++)
{
convertedUppercase[i] = uppercase[i] - 'A';
convertedLowercase[i] = lowercase[i] - 'a';
}
// For each character in the plaintext: (DOESN'T WORK)
for (int i = 0, n = strlen(p); i <= n; i++)
{
// Rotate the character if it's a letter // ci = (pi + k) % 26
if (isalpha(p[i]))
{
if (isupper(p[i]))
{
c[i] = ((p[i]) + k) % 26;
}
else if (islower(p[i]))
{
c[i] = ((p[i]) + k) % 26;
}
}
}
printf("ciphertext: %s\n", c);
but then I realized that the value of convertedUppercase will just be like 0 = NUL instead of 0 = A. Can anyone give me a hint what to do?
edit:
From the CS50 Discord:
"The caesar cipher formula (p + k) % 26 works on the premise that p (the plain text character) has a value of 0 - 25 (representing a - z or A - Z)
So if your plain char is 'x', that would have a value of 23, and if your key was 2, then the ciphered char would be:
(23 + 2) % 26
( 25 ) % 26
= 25 'z'"
I'm kinda lost on how to do it.

This would be so much easier if you would provide an MRE.
I guess what you are observing is a truncated ciphertext when you attempt to output it via printf() with "%s".
This is however only because any "A" (that is a cipher A, i.e. after shifting by key) results in a 0 (which terminates string output, being the '\0' terminator) and most other letters result in unprintable characters.
This is because the following line only shifts by the key and maps to 0-25, which produces a numeric representation of the cipher rather than a textual one:
c[i] = ((p[i]) + k) % 26;
In order to get a textual cipher instead of a numeric one, you need to do
convert textual to numeric, with -'A'
shift by key, with +k
map to 0-25, with %26
convert numeric to textual, with +'A'
I.e.
c[i] = ((p[i]-'A') + k) % 26 + 'A';
E.g. "H" from "Hello World".
textual to numeric, 'H' - 'A' -> 7
shift by key, 7 + 4 -> 11
map to 0-25, 11%26 -> 11
numeric to textual, 11 + 'A' -> 'L' is cipher
E.g. "W" from "Hello World".
textual to numeric, 'W' - 'A' -> 22
shift by key, 22 + 4 -> 26
map to 0-25, 26%26 -> 0
numeric to textual, 0 + 'A' -> 'A' is cipher
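The four steps above only cover uppercase letters. As a hedged sketch of how they might be applied to a whole string, the function below (caesar_encrypt is a hypothetical name, not part of the asker's code) uses 'A' as the base for uppercase letters, 'a' for lowercase letters, and copies everything else unchanged. It assumes a character set with contiguous letters, such as ASCII.

```c
#include <ctype.h>
#include <stddef.h>

/* Encrypt `plain` into `cipher`, which must have room for
   strlen(plain) + 1 bytes. Non-letters are copied unchanged. */
void caesar_encrypt(const char *plain, int key, char *cipher)
{
    size_t i;
    int k = ((key % 26) + 26) % 26;  /* normalise the key to 0..25, even if negative */

    for (i = 0; plain[i] != '\0'; i++) {
        unsigned char ch = (unsigned char)plain[i];
        if (isupper(ch))
            cipher[i] = (char)((ch - 'A' + k) % 26 + 'A');  /* textual -> numeric -> shift -> textual */
        else if (islower(ch))
            cipher[i] = (char)((ch - 'a' + k) % 26 + 'a');
        else
            cipher[i] = plain[i];
    }
    cipher[i] = '\0';  /* terminate so that printf("%s") works */
}
```

Calling caesar_encrypt("Hello World", 4, out) with a sufficiently large out buffer gives "Lipps Asvph", matching the 'H' -> 'L' and 'W' -> 'A' walk-throughs above.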

C program that sums a char with int

I have an exercise that asks me to find the uppercase letter that is K places after a given letter, stored in a char variable named C. The range is the uppercase letters from A to Z.
For example, if the input is B 3 the output should be E. For this specific input it's simple: you just sum the values and you get your answer. But what if we go out of the range? For example, for F 100 the program should output B, because once the value goes past Z the program starts again from A.
In case anything is unclear, here are some test cases and my code, which only works if we don't cross the range.
Input Output
B 3 E
X 12345 S
F 100 B
T 0 T
#include <stdio.h>
int main(){
int K;
char C,rez;
scanf("%c %d",&C,&K);
int ch;
for(ch = 'A';ch <= 'Z';ch++){
if(C>='A' && C<='Z'){
rez = C+K;
}
}
printf("%c",rez);
return 0;
}

Think of the letters [A-Z] as base 26 digits where zero is A, one is B and 25 is Z.
When we sum the letter (in base 26) and the offset, only the least significant base 26 digit is of interest, so use % 26 to find it, much like one uses % 10 to find the least significant decimal digit.
scanf(" %c %d",&C,&K);
// ^ space added to consume any white-space
if (C >= 'A' && C <= 'Z') {
int base26 = C - 'A';
base26 = base26 + K;
base26 %= 26;
int output = base26 + 'A';
printf("%c %-8d %c\n", C, K, output);
}
For negative offsets we need to do a little more work, as % is not the mod operator but the remainder operator. The two differ for some negative operands.
base26 %= 26;
if (base26 < 0) base26 += 26; // add
int output = base26 + 'A';
Pedantically, base26 + K may overflow with extreme K values. To account for that, reduce K before adding.
// base26 = base26 + K;
base26 = base26 + K%26;
We could be a little sneaky and add 26 to ensure the sum is not negative.
if (C >= 'A' && C <= 'Z') {
int base26 = C - 'A';
base26 = base26 + K%26 + 26; // base26 >= 0, even when K < 0
base26 %= 26; // base26 >= 0 and < 26
int output = base26 + 'A';
printf("%c %-8d %c\n", C, K, output);
}
... or make a complex one-liner
printf("%c %-8d %c\n", C, K, (C - 'A' + K%26 + 26)%26 + 'A');
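The same arithmetic can be wrapped in a small function so it is easy to check against the test cases from the question. This is a sketch only; shift_letter is a hypothetical name, and it assumes contiguous uppercase letters as in ASCII.

```c
/* Return the uppercase letter k places after c, wrapping from Z back to A.
   Returns '\0' if c is not an uppercase letter. Works for negative k too. */
char shift_letter(char c, int k)
{
    if (c < 'A' || c > 'Z')
        return '\0';
    /* k % 26 lies in -25..25, so adding 26 keeps the sum non-negative */
    int offset = (c - 'A' + k % 26 + 26) % 26;
    return (char)('A' + offset);
}
```

For the inputs listed in the question, shift_letter('B', 3), shift_letter('X', 12345), shift_letter('F', 100) and shift_letter('T', 0) give 'E', 'S', 'B' and 'T' respectively.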

This can be accomplished by using 2 concepts.
ASCII value
Modulus operator (%)
In C every character has a numeric value. On systems that use ASCII (nearly all of them), these values run from 0 to 127.
The character 'A' has the value of 65
The character 'B' has the value of 66 (65 + 1)
and so on...
Until Z which is 65 + 25 = 90
The 2nd concept I want to highlight is modulo arithmetic: if you always want to map a number to a certain range, you can use the modulus operator.
Modulus is the remainder that you get after dividing a number by another number.
In our case, we have 26 letters, so we can always get a number between 0 and 25.
For the example you took
100 % 26 = 22
But you have to consider the starting point too.
So, we always subtract the value of 'A', i.e. 65, from the initial letter so that 'A' maps to 0 and 'Z' maps to 25
So, if we start with 'F' and need to go 100 places..
Subtract 'A' value from 'F' value. Characters behave like numbers so you can actually store 'F' - 'A' in an integer
In this case 'F' - 'A' = 5
Next we add the offset to this.
5 + 100 = 105
Then we perform modulus with 26
105 % 26 = 1
Finally add the value of 'A' back to the result
'A' + 1 = 'B'
And you are done

Get the remainder of the input number divided by 26 using the modulo operator. If the sum of the input character and the remainder is less than or equal to 'Z', that is the answer; otherwise take the remainder of the sum divided by 26 again and that will be the answer (take care of the offset, because the ASCII decimal value of the letter A is 65).
Roughly the implementation will be:
#include <stdio.h>
int main(){
int K;
char C, rez;
scanf("%c %d",&C,&K);
// Validate the user input
int ch;
int rem = K % 26;
if ((rem + C) - 'A' < 26) {
rez = rem + C;
} else {
rez = ((rem + C - 'A') % 26) + 'A';
}
printf("%c\n",rez);
return 0;
}
Note that, I know there is scope of improvement in the implementation. But this is just to give an idea to OP about how it can be done.
Output:
# ./a.out
B 3
E
# ./a.out
X 12345
S
# ./a.out
F 100
B
# ./a.out
T 0
T

Converting ascii hex string to byte array

I have a char array say char value []={'0','2','0','c','0','3'};
I want to convert this into a byte array like unsigned char val[]={'02','0c','03'}
This is in an embedded application so i can't use string.h functions. How can i do this?

Since you talk about an embedded application, I assume that you want to save the numbers as values and not as strings/characters. So if you just want to store your character data as numbers (for example in an integer), you can use sscanf.
This means you could do something like this:
char source_val[] = "0A03B7"; // Represents the numbers 0x0A, 0x03 and 0xB7; a NUL-terminated string, as sscanf requires
uint8_t dest_val[3]; // We want to save 3 numbers (uint8_t comes from <stdint.h>)
for (int i = 0; i < 3; i++)
{
    sscanf(&source_val[i*2], "%2hhx", &dest_val[i]); // Every time we read exactly two hex digits into one byte --> %2hhx
}
// Now dest_val contains 0x0A, 0x03 and 0xB7
However if you want to store it as a string (like in your example), you can't use unsigned char
since this type is also just 8-Bit long, which means it can only store one character. Displaying 'B3' in a single (unsigned) char does not work.
edit: OK, according to the comments, the goal is to save the passed data as a numerical value. Unfortunately the asker's compiler does not support sscanf, which would be the easiest way to do so. Since this is (in my opinion) the simplest approach, I will leave this part of the answer as it is and add a more custom approach in this edit.
Regarding the data type, it actually doesn't matter whether you have uint8. Even though I would advise using some kind of integer data type, you can also store your data in an unsigned char. The problem here is that the data you get passed is a character/letter that you want to interpret as a numerical value. However, the internal storage of your character differs. You can check the ASCII table for the internal value of every character.
For example:
char letter = 'A'; // Internally 0x41
char number = 0x61; // Internally 0x61 - represents the letter 'a'
As you can see, there is also a difference between upper and lower case.
If you do something like this:
int myVal = letter;
myVal won't represent the value 0xA (decimal 10); it will have the value 0x41.
The fact that you can't use sscanf means you need a custom function. So first of all we need a way to convert one letter into an integer:
int charToInt(char letter)
{
    int myNumerical;
    // First we want to check if it's 0-9, A-F, or a-f --> See ASCII Table
    if (letter > 47 && letter < 58)
    {
        // 0-9
        // The character "0" is in the ASCII table at position 48 -> if we subtract 48 we get 0 and so on...
        myNumerical = letter - 48;
    }
    else if (letter > 64 && letter < 71)
    {
        // A-F
        // The letter "A" (dec 10) is at position 65 --> 65-55 = 10 and so on...
        myNumerical = letter - 55;
    }
    else if (letter > 96 && letter < 103)
    {
        // a-f
        // The letter "a" (dec 10) is at position 97 --> 97-87 = 10 and so on...
        myNumerical = letter - 87;
    }
    else
    {
        // Not a supported letter...
        myNumerical = -1;
    }
    return myNumerical;
}
Now we have a way to convert every single character into a number. The other problem, is to always append two characters together, but this is rather easy:
int appendNumbers(int higherNibble, int lowerNibble)
{
    int myNumber = higherNibble << 4;
    myNumber |= lowerNibble;
    return myNumber;
    // Example: higherNibble = 0x0A, lowerNibble = 0x03; -> myNumber = 0xA3
    // Of course you have to ensure that the parameters are not bigger than 0x0F
}
Now everything together would be something like this:
char source_val[] = {'0','A','0','3','B','7'}; // Represents the numbers 0x0A, 0x03 and 0xB7
int dest_val[3]; // We want to save 3 numbers
int temp_low, temp_high;
for (int i = 0; i < 3; i++)
{
    temp_high = charToInt(source_val[i*2]);
    temp_low = charToInt(source_val[i*2+1]);
    dest_val[i] = appendNumbers(temp_high, temp_low);
}
I hope that I understood your problem right, and this helps..

If you have a "proper" array, like value as declared in the question, then you loop over the size of it to get each character. If you're on a system which uses the ASCII alphabet (which is most likely) then you can convert a hexadecimal digit in character form to a decimal value by subtracting '0' for digits (see the linked ASCII table to understand why), and subtracting 'A' or 'a' for letters (make sure no letters are higher than 'F' of course) and add ten.
When you have the value from the first hexadecimal digit, convert the second hexadecimal digit the same way. Multiply the first value by 16 and add the second value. You now have a single byte value corresponding to two hexadecimal digits in character form.
Time for some code examples:
/* Function which converts a hexadecimal digit character to its integer value */
int hex_to_val(const char ch)
{
if (ch >= '0' && ch <= '9')
return ch - '0'; /* Simple ASCII arithmetic */
else if (ch >= 'a' && ch <= 'f')
return 10 + ch - 'a'; /* Because hex-digit a is ten */
else if (ch >= 'A' && ch <= 'F')
return 10 + ch - 'A'; /* Because hex-digit A is ten */
else
return -1; /* Not a valid hexadecimal digit */
}
...
/* Source character array */
char value[] = {'0','2','0','c','0','3'};
/* Destination "byte" array; unsigned so that values above 0x7f print correctly */
unsigned char val[3];
/* `i < sizeof(value)` works because `sizeof(char)` is always 1 */
/* `i += 2` because there are two digits per value */
/* NOTE: This loop can only handle an array with an even number of entries */
for (size_t i = 0, j = 0; i < sizeof(value); i += 2, ++j)
{
    int digit1 = hex_to_val(value[i]); /* Get value of first digit */
    int digit2 = hex_to_val(value[i + 1]); /* Get value of second digit */
    if (digit1 == -1 || digit2 == -1)
        continue; /* Not a valid hexadecimal digit */
    /* The first digit is multiplied with the base */
    /* Cast to the destination type */
    val[j] = (unsigned char) (digit1 * 16 + digit2);
}
for (size_t i = 0; i < 3; ++i)
    printf("Hex value %zu = %02x\n", i + 1, val[i]); /* %zu is the format for size_t */
The output from the code above is
Hex value 1 = 02
Hex value 2 = 0c
Hex value 3 = 03
A note about the ASCII arithmetic: The ASCII value for the character '0' is 48, and the ASCII value for the character '1' is 49. Therefore '1' - '0' will result in 1.

It's easy with strtol():
#include <stdlib.h>
#include <assert.h>
void parse_bytes(unsigned char *dest, const char *src, size_t n)
{
/** size 3 is important to make sure tmp is \0-terminated and
the initialization guarantees that the array is filled with zeros */
char tmp[3] = "";
while (n--) {
tmp[0] = *src++;
tmp[1] = *src++;
*dest++ = strtol(tmp, NULL, 16);
}
}
int main(void)
{
unsigned char d[3];
parse_bytes(d, "0a1bca", 3);
assert(d[0] == 0x0a);
assert(d[1] == 0x1b);
assert(d[2] == 0xca);
return EXIT_SUCCESS;
}
If that is not available (even though it is NOT from string.h), you could do something like:
int ctohex(char c)
{
if (c >= '0' && c <= '9') {
return c - '0';
}
switch (c) {
case 'a':
case 'A':
return 0xa;
case 'b':
case 'B':
return 0xb;
/**
* and so on
*/
}
return -1;
}
void parse_bytes(unsigned char *dest, const char *src, size_t n)
{
while (n--) {
*dest = ctohex(*src++) * 16;
*dest++ += ctohex(*src++);
}
}

Assuming 8-bit bytes (not actually guaranteed by the C standard, but ubiquitous), the range of `unsigned char` is 0..255, and the range of `signed char` is -128..127. ASCII was developed as a 7-bit code using values in the range 0-127, so the same value can be represented by both `char` types.
For the now discovered task of converting a counted hex-string from ascii to unsigned bytes, here's my take:
unsigned int atob(char a){
register int b;
b = a - '0'; // subtract '0' so '0' goes to 0 .. '9' goes to 9
if (b > 9) b = b - ('A' - '0') + 10; // too high! try 'A'..'F'
if (b > 15) b = b - ('a' - 'A'); // too high! try 'a'..'f'
return b;
}
void myfunc(const char *in, int n){
int i;
unsigned char *ba;
ba=malloc(n/2);
for (i=0; i < n; i+=2){
ba[i/2] = (atob(in[i]) << 4) | atob(in[i+1]);
}
// ... do something with ba
}

I need to add string characters in C. A + B must = C. Literally

I am writing a program that is due tonight at midnight, and I am utterly stuck. The program is written in C, and takes input from the user in the form SOS where S = a string of characters, O = an operator (I.E. '+', '-', '*', '/'). The example input and output in the book is the following:
Input> abc+aab
Output: abc + aab => bce
And that's literally, not variable. Like, a + a must = b.
What is the code to do this operation? I will post the code I have so far, however all it does is take the input and divide it between each part.
#include <stdio.h>
#include <string.h>
int main() {
system("clear");
char in[20], s1[10], s2[10], o[2], ans[15];
while(1) {
printf("\nInput> ");
scanf("%s", in);
if (in[0] == 'q' && in[1] == 'u' && in[2] == 'i' && in[3] == 't') {
system("clear");
return 0;
}
int i, hold, breakNum;
for (i = 0; i < 20; i++) {
if (in[i] == '+' || in[i] == '-' || in[i] == '/' || in[i] == '*') {
hold = i;
}
if (in[i] == '\0') {
breakNum = i;
}
}
int j;
for (j = 0; j < hold; j++) {
s1[j] = in[j];
}
s1[hold] = '\0';
o[0] = in[hold];
o[1] = '\0';
int k;
int l = 0;
for (k = (hold + 1); k < breakNum; k++) {
s2[l] = in[k];
l++;
}
s2[breakNum] = '\0';
printf("%s %s %s =>\n", s1, o, s2);
}
}

Since this is homework, let's focus on how to solve this, rather than providing a bunch of code which I suspect your instructor would frown upon.
First, don't do everything from within the main() function. Break it up into smaller functions each of which do part of the task.
Second, break the task into its component pieces and write out the pseudocode:
while ( 1 )
{
// read input "abc + def"
// convert input into tokens "abc", "+", "def"
// evaluate tokens 1 and 3 as operands ("abc" -> 123, "def" -> 456)
// perform the operation indicated by token 2
// format the result as a series of characters (579 -> "egi")
}
Finally, write each of the functions. Of course, if you stumble upon roadblocks along the way, be sure to come back to ask your specific questions.

Based on your examples, it appears “a” acts like 1, “b” acts like 2, and so on. Given this, you can perform the arithmetic on individual characters like this:
// Map character from first string to an integer.
int c1 = s1[j] - 'a' + 1;
// Map character from second string to an integer.
int c2 = s2[j] - 'a' + 1;
// Perform operation.
int result = c1 + c2;
// Map result to a character.
char c = result - 1 + 'a';
There are some things you have to add to this:
You have to put this in a loop, to do it for each character in the strings.
You have to vary the operation according to the operator specified in the input.
You have to do something with each result, likely printing it.
You have to do something about results that extended beyond the alphabet, like “y+y”, “a-b”, or “a/b”.
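As a rough sketch of how those pieces might fit together for the '+' operator only: the function below (add_letters is a hypothetical name) loops over both strings, maps each letter to 1..26, adds, and maps back. How to treat sums beyond 'z' is not specified in the question, so writing '?' for them is purely an assumption made for illustration, as is rejecting strings of different lengths.

```c
#include <stddef.h>

/* Add two lowercase strings letter by letter, with 'a' counting as 1.
   The result is written to out (capacity out_size, including the terminator).
   Returns 0 on success, -1 on invalid input or insufficient space.
   Assumption: a sum that goes past 'z' is written as '?'. */
int add_letters(const char *s1, const char *s2, char *out, size_t out_size)
{
    size_t i;
    for (i = 0; s1[i] != '\0' && s2[i] != '\0'; i++) {
        if (i + 1 >= out_size)
            return -1;                      /* no room for this letter plus '\0' */
        if (s1[i] < 'a' || s1[i] > 'z' || s2[i] < 'a' || s2[i] > 'z')
            return -1;                      /* only lowercase letters are handled */
        int sum = (s1[i] - 'a' + 1) + (s2[i] - 'a' + 1);
        out[i] = (sum <= 26) ? (char)(sum - 1 + 'a') : '?';
    }
    if (s1[i] != s2[i] || i >= out_size)
        return -1;                          /* different lengths, or no room for '\0' */
    out[i] = '\0';
    return 0;
}
```

With the book's example, add_letters("abc", "aab", out, sizeof out) leaves "bce" in out. The other operators would follow the same pattern with a different expression for the result.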

If we assume, from your example answer, that a represents 1, then you can find the value of any other letter by subtracting the character value of a from it and adding 1.
for (i = 0; i < str_len; i++) {
    int s1Int = (int)s1[i];
    int s2Int = (int)s2[i];
    int addAmount = 1 + abs((int)'a' - s2Int);
    output[i] = (char)(s1Int + addAmount);
}
Steps
1) For the length of the s1 or s2
2) Retrieve the decimal value of the first char
3) Retrieve the decimal value of the second char
4) Find the difference between the letter a (97) and the second char + 1 <-- assuming a is the representation of 1
5) Add the difference to the s1 char and convert the decimal representation back to a character.
Example 1:
if S1 char is a, S2 char is b:
s1Int = 97
s2Int = 98
addAmount = 1 + abs((int)'a' - s2Int) = 1 + abs(97 - 98) = 2
output = s1Int + addAmount = 97 + 2 = 99 = c
Example 2:
if S1 char is c, S2 char is a:
s1Int = 99
s2Int = 97
addAmount = 1 + abs((int)'a' - s2Int) = 1 + abs(97 - 97) = 1
output = s1Int + addAmount = 99 + 1 = 100 = d

Data types conversion (unsigned long long to char)

Can anyone tell me what is wrong with the following code?
__inline__
char* ut_byte_to_long (ulint nb) {
char* a = malloc(sizeof(nb));
int i = 0;
for (i=0;i<sizeof(nb);i++) {
a[i] = (nb>>(i*8)) & 0xFF;
}
return a;
}
This string is then concatenated as part of a larger one using strcat. The string prints fine but for the integers which are represented as character symbols. I'm using %s and fprintf to check the result.
Thanks a lot.
EDIT
I took one of the comments below (I was adding the terminating \0 separately, before calling fprintf, but after strcat. Modifying my initial function...
__inline__
char* ut_byte_to_long (ulint nb) {
char* a = malloc(sizeof(nb) + 1);
int i = 0;
for (i=0;i<sizeof(nb);i++) {
a[i] = (nb>>(i*8)) & 0xFF;
}
a[nb] = '\0' ;
return a;
}
This sample code still isn't printing out a number...
char* tmp;
tmp = ut_byte_to_long(start->id);
fprintf(stderr, "Value of node is %s \n ", tmp);

strcat is expecting a null byte terminating the string.
Change your malloc size to sizeof(nb) + 1 and append '\0' to the end.

You have two problems.
The first is that the character array a contains numbers, such as 2, instead of ASCII codes representing those numbers, such as '2' (=50 on ASCII, might be different in other systems). Try modifying your code to
a[i] = ((nb>>(i*8)) & 0xFF) + '0'; // parentheses needed: + binds tighter than &
The second problem is that the result of the above computation can be anything between 0 and 255, or in other words, a number which requires more than one digit to print.
If you want to print hexadecimal numbers (0-9, A-F), two digits per such computation will be enough, and you can write something like
a[2*i + 0] = int2hex( (nb>>(i*8)) & 0x0F ); //right hexa digit
a[2*i + 1] = int2hex( (nb>>(i*8+4)) & 0x0F ); //left hexa digit
where
char int2hex(int n) {
if (n <= 9 && n >= 0)
return n + '0';
else
return (n-10) + 'A';
}
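Putting the hexadecimal idea together, here is a minimal sketch that writes all nibbles of the value most significant first and terminates the string. The name ull_to_hex is hypothetical, it takes unsigned long long rather than the asker's ulint type, and it uses a lookup table in place of int2hex purely for brevity.

```c
#include <limits.h>
#include <stddef.h>

/* Write value as fixed-width uppercase hex, most significant digit first.
   out must have room for 2 * sizeof(unsigned long long) + 1 characters
   on a platform with 8-bit bytes. */
void ull_to_hex(unsigned long long value, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t ndigits = sizeof value * CHAR_BIT / 4;   /* one hex digit per 4 bits */

    for (size_t i = 0; i < ndigits; i++) {
        unsigned shift = (unsigned)(4 * (ndigits - 1 - i));
        out[i] = digits[(value >> shift) & 0xF];    /* pick one nibble */
    }
    out[ndigits] = '\0';                            /* so that %s and strcat work */
}
```

Because the output is NUL-terminated and contains only printable characters, it can be passed to strcat or printed with %s.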

If you don't want to use sprintf(target_string,"%lu",source_int) or the non-standard itoa(), here is a version of the function that transforms a long to a string:
__inline__
char* ut_byte_to_long (ulint nb) {
char* a = (char*) malloc(22*sizeof(char));
int i=21;
int j;
do
{
i--;
a[i] = nb % 10 + '0';
nb = nb/10;
}while (nb > 0);
// the number is stored from a[i] to a[20]
// shifting the string to a[0] : a[20-i]
for(j = 0 ; j < 21 && i < 21 ; j++ , i++)
{
a[j] = a[i];
}
a[j] = '\0';
return a;
}
I assumed that an unsigned long contains fewer than 21 digits (the biggest 64-bit number is 18,446,744,073,709,551,615, which equals 2^64 − 1 and has 20 digits).
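For comparison, the standard-library route mentioned at the start of this answer might look like the sketch below. It assumes a C99 snprintf and an unsigned long long argument (the asker's ulint type may differ); ull_to_string is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return a newly allocated decimal string for nb, or NULL on failure.
   The caller is responsible for calling free() on the result. */
char *ull_to_string(unsigned long long nb)
{
    /* With a NULL buffer and size 0, snprintf only reports the needed length */
    int len = snprintf(NULL, 0, "%llu", nb);
    if (len < 0)
        return NULL;

    char *s = malloc((size_t)len + 1);   /* +1 for the terminating '\0' */
    if (s == NULL)
        return NULL;

    snprintf(s, (size_t)len + 1, "%llu", nb);
    return s;
}
```

This avoids hard-coding the buffer size, at the cost of formatting the number twice.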
