How to turn non-printable characters into their hex values in C? - arrays

I'm trying to make a function that takes an array of characters as an input, and outputs printable characters normally, and non-printable characters in hexadecimal (by turning these character into decimal using Extended ASCII, then turning that decimal number into hex).
For example:
"This morning is ßright"
should turn into:
"This morning is E1right"
since ß in Extended ASCII is 225, and that in hexadecimal is E1.
Here is what I attempted:
void myfunction(char *str)
{
int size=0;
for (int i = 0; str[i] != NULL; i++) size++; //to identify how many characters are in the string
for (int i = 0; i < size; i++)
{
if (isprint(str[i]))
{
printf("%c", str[i]); //printing printable characters
}
else
{
if (str[i] == NULL) break; //to stop when reaching the end of the string
printf("%02x", str[i]); //This is where I'm having an issue
}
}
}
This function outputs this:
"This morning is ffffffc3ffffff9fright"
how can I turn the non-printable characters into their hex value? and what is causing this function to behave in this way?
Thanks in advance!

You're seeing a couple of issues here. The first is that the char type on your machine (as on most) is signed, so a char outside the ASCII range has a negative value. When it is passed to printf it is promoted to int and sign-extended, and %x then prints that int as an unsigned hex value, which is where the ffffff prefixes come from.
If you mask it to 8 bits, you'll see the hex values more clearly. Use
printf("%02X", str[i] & 0xff); // X to use upper-case hex chars for clarity
and you'll get the output
This morning is C39Fright
Now you see the second problem, which is that ß is not an ASCII character. It is Unicode character U+00DF, however, and when it is encoded in UTF-8 it shows up as the two-byte sequence C3 9F.
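The sign-extension effect described above can also be avoided by converting each char to unsigned char before it is printed. The following is a minimal sketch, not the asker's code: hex_escape() is a hypothetical helper that writes into a caller-supplied buffer instead of printing, so the result is easy to inspect. It assumes the default "C" locale, in which bytes above 0x7F are not printable.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies str into out, replacing every non-printable
   byte with two upper-case hex digits. Stops early if out is too small.
   Returns the number of characters written, not counting the terminator. */
size_t hex_escape(const char *str, char *out, size_t out_size)
{
    assert(str != NULL && out != NULL && out_size > 0);
    size_t used = 0;
    for (; *str != '\0'; str++) {
        /* Going through unsigned char keeps the value in 0..255,
           so nothing is sign-extended. */
        unsigned char byte = (unsigned char)*str;
        if (isprint(byte)) {
            if (used + 1 >= out_size) break;   /* keep room for '\0' */
            out[used++] = (char)byte;
        } else {
            if (used + 2 >= out_size) break;   /* two digits plus '\0' */
            snprintf(out + used, out_size - used, "%02X", (unsigned)byte);
            used += 2;
        }
    }
    out[used] = '\0';
    return used;
}
```

Called on the UTF-8 bytes of the example sentence, this should produce the same C39F pair as the masked printf call.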

Your code has several issues.
for (int i = 0; str[i] != NULL; i++) size++; NULL is a pointer, while str[i] is a char. You want to compare with zero, which is the null character ('\0'). The null character is not the same as the NULL pointer!
The same applies here: if (str[i] == NULL) break;
printf("%02x", str[i]); You use the wrong format to print a char value as a number. You should use the hh length modifier. See how it works in the code below.
Use the correct type for indexes and sizes: size_t instead of int.
Your code is overcomplicated.
#include <ctype.h>
#include <stdio.h>
void myfunction(const char *str)
{
    while (*str)
    {
        if (isprint((unsigned char)*str)) // cast: passing a negative char to isprint is undefined
        {
            printf("%c", *str); // printing printable characters
        }
        else
        {
            printf("%02hhX", *str); // hh makes printf convert the value to unsigned char
        }
        str++;
    }
}
int main(void)
{
    const char *str = "This morning is \xE1right";
    myfunction(str);
}
https://godbolt.org/z/6jKWdr3rM

Related

How to write a function that receives a string as a parameter and returns the maximum digit that appears in the string?

These are the details. (https://i.stack.imgur.com/cMY9J.png)
#include <stdio.h>
#include <stdlib.h>
int maxdigit(char sir[]);
int main()
{
char s[100];
gets(s);
printf("The greatest digits is %i\n", maxdigit(s));
}
int maxdigit(char sir[])
{
int i, max = 0;
for(i=0;i<100;i++)
{
if(max<sir[i])
{
max = sir[i];
}
}
return max;
}
I genuinely don't know how to get only the integer values in a string to return it. How can i do it so it doesn't compare with the ascii codes of the letters?
A few problems here:
gets() is a dangerous function (since it can lead to buffer overflows), and has been removed from the C standard. Consider using fgets() or scanf().
Your code assumes that the user has entered all 100 characters. If they enter fewer than 100, the rest of the array is uninitialised and will probably be full of garbage. You need to stop when you reach the '\0' that terminates the string.
To convert an ASCII code (for a digit) to the value of the digit, subtract '0' from it. But you'll need to ensure that all the entered digits are actually digits, maybe with isdigit().
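The three points above can be combined into one small function. This is a sketch under assumptions: the name max_digit() and the choice to return -1 when the string has no digit are illustrative, not part of the original answer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns the largest decimal digit found in s as a number 0..9,
   or -1 if s contains no digit at all (an assumed convention). */
int max_digit(const char *s)
{
    assert(s != NULL);
    int max = -1;
    for (; *s != '\0'; s++) {              /* stop at the terminator, not at 100 */
        if (isdigit((unsigned char)*s)) {  /* skip letters and punctuation */
            int value = *s - '0';          /* character code -> digit value */
            if (value > max)
                max = value;
        }
    }
    return max;
}
```

The input line could be read with fgets(s, sizeof s, stdin) instead of gets(s); the trailing newline that fgets keeps is simply ignored by the isdigit() test.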
Iterate until the null terminating character. In your code you go beyond the end of the string if the string is shorter than the array.
Convert the char representation of the digit to its integer value by subtracting '0'.
#include <ctype.h>
unsigned maxdigit(const char *str)
{
unsigned max = 0;
if(str)
{
while(*str)
{
if(isdigit((unsigned char)*str))
{
if(max < *str - '0') max = *str - '0';
}
str++;
}
}
return max;
}

Print bytes in C, only non-printable characters as hex

I have a program creating some byte strings that are a mix of human-readable text and control bytes (including null bytes). In order to debug these strings I would like to have a function that prints these strings, given a pointer and a size, in a way that I can read the printable ASCII characters on screen, as well as the hex value of the non-printable ones (à la Python), e.g.
first string\x00second string\x00\x01
So far I have a function that only prints the printable characters:
void print_bytes(unsigned char *bs, size_t size) {
size_t i;
for (i = 0; i < size; i++) {
fputc(bs[i], stdout);
}
printf("\n");
}
Other than that, I have only seen examples online print everything as hex sequences, which does not help me understand the contents of the strings.
How can I improve the function above to print the hex values of non-printable characters?
Using the suggestion in the comment, I rewrote the function to use an isprint check:
void print_bytes(const unsigned char *bs, const size_t size) {
for (size_t i = 0; i < size; i++) {
if (isprint(bs[i])) {
fputc(bs[i], stdout);
} else {
printf("\\x%02x", bs[i]);
}
}
printf("\n");
}
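For debugging it can be handy to build the escaped text in memory rather than print it directly. Below is a hedged sketch of that variant; escape_bytes() is a hypothetical name. It also doubles a literal backslash, which is an added assumption so that the output stays unambiguous when the data itself contains the characters \x.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes a printable rendering of bs[0..size) to out.
   Printable bytes are copied, a backslash is doubled, and everything else
   becomes \xNN. Stops early if out is too small. Returns the length written. */
size_t escape_bytes(const unsigned char *bs, size_t size,
                    char *out, size_t out_size)
{
    assert(bs != NULL && out != NULL && out_size > 0);
    size_t used = 0;
    for (size_t i = 0; i < size; i++) {
        if (bs[i] == '\\') {
            if (out_size - used < 3) break;   /* 2 chars + '\0' */
            out[used++] = '\\';
            out[used++] = '\\';
        } else if (isprint(bs[i])) {
            if (out_size - used < 2) break;   /* 1 char + '\0' */
            out[used++] = (char)bs[i];
        } else {
            if (out_size - used < 5) break;   /* 4 chars + '\0' */
            snprintf(out + used, out_size - used, "\\x%02x", (unsigned)bs[i]);
            used += 4;
        }
    }
    out[used] = '\0';
    return used;
}
```

Because the length is passed explicitly, embedded null bytes are rendered as \x00 instead of ending the output.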

how to fix this code so that it can test the integers present next to the character?

Given a string containing alphanumeric characters, calculate the sum of all numbers present in the string.
The problem with my code is that it displays the integers present before the characters, but it is not summing up the integers after the characters.
This is easy to do in Python and C++, but I can't get it done in C! Can anyone please point out where I have gone wrong? Thank you!
#include<stdio.h>
#include<string.h>
int convert(char[]);
int main()
{
char ch[100],temp[100]={0};
int i=0,s=0,j=0,n;
scanf("%s",ch);
for(i=0;i<strlen(ch);i++)
{
if((ch[i]>='0') && (ch[i]<='9'))
{
temp[j]=ch[i];
j++;
}
else
{
if(temp[0]== '\0')
{
continue;
}
else
{
n=convert(temp);
s+=n;
temp[0]= '\0';
j=0;
}
}
}
printf("%d",s);
return 0;
}
int convert(char s[]) //converting string to integer
{
int n=0;
for(int i=0;i<strlen(s);i++)
{
n= n * 10 + s[i] - '0';
}
return n;
}
Input : 12abcd4
Expected output : 16
But the output is 12 for my code.
There are two problems in your code. The first was mentioned in the comments: if the last character is a digit, the last run of digits is never converted, because convert() is only called when a non-digit is found. I don't think the solution given in the comments is good, though, because it gives a wrong value when the last character is not a digit. To correct this, I added an if statement that checks whether the last character is a digit and, if so, calls convert().
The second problem is that strlen() returns the number of characters in your string from the beginning up to the first '\0'. The way you reuse temp leads to the following problem:
ch = "12abcd4".
At first you have temp = '1' + '2' + '\0'...
After calling convert() you set temp[0] to '\0', thus temp = '\0' + '2' + '\0'... .
When you start reading digits again, you store '4' in temp[0]. Your string is now: '4' + '2' + '\0'... .
With only the first problem fixed, the n returned would be 42 and your result 54 (12+42). There are several ways to get the expected behavior; I chose to use your variable j to indicate how many characters should be read instead of using strlen():
#include<stdio.h>
#include<string.h>
int convert(char[], int size);
int main() {
char ch[100],temp[100]={0};
int i=0,s=0,j=0,n;
scanf("%s",ch);
for(i=0;i<strlen(ch);i++) {
if((ch[i]>='0') && (ch[i]<='9')) {
temp[j]=ch[i];
j++;
// change here
if(i == strlen(ch) - 1) {
n=convert(temp, j);
s+=n;
}
}
else {
// change here
n=convert(temp, j);
s+=n;
if(temp[0]== '\0') {
continue;
}
temp[0]= '\0';
j=0;
}
}
printf("%d\n",s);
return 0;
}
//change here
int convert(char s[], int size) {
int n=0;
for(int i=0;i<size;i++) {
n= n * 10 + s[i] - '0';
}
return n;
}
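A different approach, not the one used in the answer above, is to drop the temp buffer entirely and accumulate the current number while scanning. The sketch below assumes a hypothetical function name sum_numbers() and does not check for int overflow.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Sums every run of decimal digits in s, e.g. "12abcd4" -> 12 + 4.
   Overflow is not checked in this sketch. */
int sum_numbers(const char *s)
{
    assert(s != NULL);
    int sum = 0;
    int current = 0;                   /* value of the run being read */
    for (; *s != '\0'; s++) {
        if (isdigit((unsigned char)*s)) {
            current = current * 10 + (*s - '0');
        } else {
            sum += current;            /* a run (possibly empty) just ended */
            current = 0;
        }
    }
    return sum + current;              /* add a run that ends the string */
}
```

Since there is no buffer to reset, neither the stale '2' nor the missing final number can occur.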
You could use a combination of strtoul() and strpbrk() to do this.
Declare two character pointers start_ptr and end_ptr and make start_ptr point to the beginning of the string under consideration.
char *start_ptr=s, *end_ptr;
where s is the character array of size 100 holding the string.
Since your string has only alphanumeric characters, there is no - sign and hence there are no negative numbers. So we can get away with using unsigned integers.
We are using strtoul() from stdlib.h to perform the string to integer conversion. So let's declare two variables: rv for holding the value returned by strtoul() and sum_val to hold the sum of the numbers.
unsigned long rv, sum_val=0;
Now use a loop:
for(; start_ptr!=NULL; )
{
errno = 0; // clear any stale error so the ERANGE check below is reliable
rv = strtoul(start_ptr, &end_ptr, 10);
if(rv==ULONG_MAX && errno==ERANGE)
{
//out of range!
printf("\nOut of range.");
break;
}
else
{
printf("\n%lu", rv);
sum_val += rv;
start_ptr=strpbrk(end_ptr, "0123456789");
}
}
strtoul() will convert as much part of the string as possible and then make end_ptr point to the first character of the part of the string that could not be converted.
It will return ULONG_MAX if the number is too big and errno would be set to ERANGE.
Otherwise the converted number is returned.
strpbrk() would search for a set of characters (in this case the characters 0-9) and return a pointer to the first match. Otherwise NULL is returned.
Don't forget to include the following header files:
stdlib.h ---> strtoul
string.h ---> strpbrk
limits.h ---> ULONG_MAX
errno.h ---> errno
In short, the program could be condensed to something like
for(; start_ptr!=NULL; sum_val += rv, start_ptr=strpbrk(end_ptr, "0123456789"))
{
errno = 0; // clear any stale error so the ERANGE check below is reliable
rv = strtoul(start_ptr, &end_ptr, 10);
if(rv==ULONG_MAX && errno==ERANGE)
{
//out of range!
break;
}
}
printf("\n\n%lu", sum_val);
So the value of sum_val for the string "12abcd4" would be 16.
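Put together as one function, the strtoul()/strpbrk() idea might look like the sketch below. The function name and the 0 / -1 return convention are assumptions for illustration. It jumps to the first digit before each conversion and resets errno so that a stale ERANGE is not misread.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Stores the sum of all numbers found in s in *sum and returns 0.
   Returns -1 if a single number does not fit in unsigned long.
   Overflow of the running total itself is not checked in this sketch. */
int sum_numbers_strtoul(const char *s, unsigned long *sum)
{
    assert(s != NULL && sum != NULL);
    static const char digits[] = "0123456789";
    unsigned long total = 0;
    const char *p = strpbrk(s, digits);        /* first digit, or NULL */
    while (p != NULL) {
        char *end;
        errno = 0;
        unsigned long value = strtoul(p, &end, 10);
        if (value == ULONG_MAX && errno == ERANGE)
            return -1;                         /* number out of range */
        total += value;
        p = strpbrk(end, digits);              /* next digit after this number */
    }
    *sum = total;
    return 0;
}
```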
scanf() is usually not the best way to accept input that is not well-formatted. Maybe you can use fgets()-sscanf() combo instead.
If you must use scanf(), make sure that you check the value returned by it, which in your case must be 1 (the number of successful assignments that scanf() made).
And to prevent overflow, use a width specifier as in
scanf("%99s",ch);
instead of
scanf("%s",ch);
as 100 is the size of the ch character array and we need one extra byte to store the string delimiter (the \0 character).
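The fgets()-sscanf() combination mentioned above could be sketched as follows. The helper names first_word() and read_word() and the 256-byte line buffer are assumptions made for this example; the width 99 matches the 100-byte array from the question.

```c
#include <assert.h>
#include <stdio.h>

/* Extracts the first whitespace-delimited word of line into word, which
   must have room for 100 bytes. Returns 1 on success, 0 if there is no word. */
int first_word(const char *line, char word[100])
{
    assert(line != NULL && word != NULL);
    /* %99s stores at most 99 characters plus the terminating '\0'. */
    return sscanf(line, "%99s", word) == 1;
}

/* Reads one line from in and extracts its first word.
   Returns 1 on success, 0 on end of file, read error or an empty line. */
int read_word(FILE *in, char word[100])
{
    char line[256];
    if (fgets(line, sizeof line, in) == NULL)
        return 0;
    return first_word(line, word);
}
```

Checking the return value covers the case where the user enters nothing but whitespace, which would otherwise leave the array unset.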

Trying to remove all numbers from a string in C

I'm trying to take all of the numbers out of a string (char*)...
Here's what I have right now:
// Take numbers out of username if they exist - don't care about these
char * newStr;
strtoul(user, &newStr, 10);
user = newStr;
My understanding is that strtoul is supposed to convert a string to an unsigned long. The characters that are not numbers are put into the passed in pointer (the 2nd arg). When I reassign user to newStr and print it, the string remains unchanged. Why is this? Does anyone know of a better method?
From the documentation example:
#include <stdio.h>
#include <stdlib.h>
int main()
{
char str[30] = "2030300 This is test";
char *ptr;
unsigned long ret; // matches both the return type of strtoul and the %lu format below
ret = strtoul(str, &ptr, 10);
printf("The number(unsigned long integer) is %lu\n", ret);
printf("String part is |%s|", ptr);
return(0);
}
Let us compile and run the above program, this will produce the following result:
The number(unsigned long integer) is 2030300
String part is | This is test|
char* RemoveDigits(char* input)
{
char* dest = input;
char* src = input;
while(*src)
{
if (isdigit((unsigned char)*src)) { src++; continue; }
*dest++ = *src++;
}
*dest = '\0';
return input;
}
Test:
int main(void)
{
char inText[] = "123 Mickey 456";
printf("The result is %s\n", RemoveDigits(inText));
// Expected Output: " Mickey "
}
The numbers were removed.
Here is a C program to remove digits from a string without using inbuilt functions. The string is shifted left to overwrite the digits:
#include <stdio.h>
int main(void) {
char a[] = "stack123overflow";
int i, j;
for (i = 0; a[i] != '\0'; i ++) {
if (a[i] == '0' || a[i] == '1' || a[i] == '2' || a[i] == '3' || a[i] == '4' || a[i] == '5' || a[i] == '6' || a[i] == '7' || a[i] == '8' || a[i] == '9') {
for (j = i; a[j] != '\0'; j ++)
a[j] = a[j + 1];
i--;
}
}
printf("%s", a);
return 0;
}
Example of execution:
$ gcc shift_str.c -o shift_str
$ ./shift_str
stackoverflow
strtoul() does not extract all numbers from a string; it just tries to convert the string to a number, and the conversion stops when a non-digit is found. So if your string starts with a number, strtoul() works as you expect, but if the string starts with letters, strtoul() stops at the first character. To solve your task in a simple way, copy all non-digits to another string, which will be the result.
The problem you are having is that strtoul converts the characters at the beginning of the string into an unsigned long. Once it encounters a non-digit character, it stops.
The second parameter is a pointer into the original character buffer, pointing at the first non-numeric character.
http://www.cplusplus.com/reference/cstdlib/strtoul/
Parameter 2 : Reference to an object of type char*, whose value is set by the function to the next character in str after the numerical value.
So, if you tried to run the function on "123abc567efg" the returned value would be 123. The original string buffer would still be "123abc567efg" with the second parameter now pointing at the character 'a' in that buffer. That is, the pointer (ptr) will have a value 3 greater than original buffer pointer (str). Printing the string ptr, would give you "abc567efg" as it simply points back into the original buffer.
To actually remove ALL the digits from the string in C you would need to do something similar to this answer : Removing spaces and special characters from string
You build your allowable function to return false on 0-9 and true otherwise. Loop through and copy the non-digit characters to a new buffer.
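A minimal sketch of that copy loop is shown below. The name copy_without_digits() is hypothetical, and isdigit() stands in for the "allowable" function described above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copies src to dst, skipping every decimal digit. At most dst_size - 1
   characters are written, followed by '\0'. Returns the length written. */
size_t copy_without_digits(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL && dst_size > 0);
    size_t used = 0;
    for (; *src != '\0' && used + 1 < dst_size; src++) {
        if (!isdigit((unsigned char)*src))   /* keep only non-digits */
            dst[used++] = *src;
    }
    dst[used] = '\0';
    return used;
}
```

For the string "123abc567efg" discussed above, the destination buffer ends up holding "abcefg", while the original buffer is left untouched.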

How to print a string, which includes special characters and was read from a file? How to print it without special characters?

I have following function:
int Printf(const char *s, int length)
{
int i=0;
while(i < length)
{
printf("%c", s[i]);
i++;
}
}
But if I call it with a non null-terminated string like "Hello World\n" which I read from a file, it prints Hello World\n without making a new line, so it prints \n explicitly. What is wrong with my function?
There's nothing wrong with the function, but I guess the \n is literally in the string as two characters. When you write \n inside a string literal in your C/C++ program, the compiler replaces it with the proper line break. This doesn't happen if the \n is in your text (where it is essentially what "\\n" would be in source code).
Where is the string set? Seems like you might have to handle the escaped characters yourself.
Btw., you should be able to use something like this, which is a lot simpler. Note the dot: the precision limits how many characters are printed, whereas a plain %*s only sets a minimum field width.
printf("%.*s", length, s);
Edit:
Just read your comment above. You'll have to handle the \n -> linebreak replacement yourself if you read the string from a file. printf() won't handle it for you.
Escape sequences are handled by the compiler, not by printf. They are converted at compile time, so
char a[] = "a\n";
becomes equivalent to
char a[] = { 'a', 10, 0 };
printf never sees the two characters \ and n; the compiler has converted them to 10 (line feed in ASCII) beforehand.
And printf doesn't have the ability to convert escape sequences. When you read "Hello World\n" from a file, you can't expect it to be converted by the compiler.
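The difference between a compiled escape sequence and the same two characters read from a file can be checked directly. The two helpers below are hypothetical and exist only to illustrate the point; in source code, "a\\n" produces the same bytes that the text a\n has in a file.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if text contains a real line-feed character. */
int has_real_newline(const char *text)
{
    assert(text != NULL);
    return strchr(text, '\n') != NULL;
}

/* Returns 1 if text contains a backslash followed by the letter n,
   i.e. the two separate characters that a file would hold. */
int has_escaped_newline(const char *text)
{
    assert(text != NULL);
    return strstr(text, "\\n") != NULL;
}
```

A string for which only has_escaped_newline() returns 1 still needs its escape sequences translated by hand before printing.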
I have rewritten my function so:
int Printf(char *s, int length)
{
    int i=0;
    char c = '\0',
         special='\\',
         newline ='n',
         creturn ='r',
         tab ='t';
    while(i < length)
    {
        if(c == special)
        {
            if( s[i] == newline )
                printf("\n");
            else if(s[i] == creturn)
                printf("\r");
            else if(s[i] == tab)
                printf("\t");
            else if(s[i] == special)
                printf("\\");
            c = '\0'; // escape consumed, so an escaped backslash does not start a new escape
        }
        else
        {
            if (s[i] != special)
                printf("%c", s[i]);
            c = s[i];
        }
        i++;
    }
    return i; // the function is declared int, so return the number of characters consumed
}
and now it does work right!
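The same translation can also be done once, in memory, right after the file is read, so that any printing function works afterwards. This is a sketch under assumptions: unescape_in_place() is a hypothetical name, and unlike the function above it keeps unknown sequences such as \q unchanged instead of dropping them.

```c
#include <assert.h>
#include <stddef.h>

/* Rewrites the sequences \n, \r, \t and \\ inside s[0..length) in place
   and returns the new, possibly shorter, length. No '\0' is appended. */
size_t unescape_in_place(char *s, size_t length)
{
    assert(s != NULL);
    size_t out = 0;                    /* write position, never ahead of i */
    for (size_t i = 0; i < length; i++) {
        char c = s[i];
        if (c == '\\' && i + 1 < length) {
            char decoded = 0;
            switch (s[i + 1]) {
            case 'n':  decoded = '\n'; break;
            case 'r':  decoded = '\r'; break;
            case 't':  decoded = '\t'; break;
            case '\\': decoded = '\\'; break;
            default:   break;          /* unknown escape: keep as is */
            }
            if (decoded != 0) {
                s[out++] = decoded;
                i++;                   /* skip the character just consumed */
                continue;
            }
        }
        s[out++] = c;
    }
    return out;
}
```

The returned length can then be passed to a length-based printing function, or used to place a '\0' at s[length] if the buffer has room for it.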

Resources