Print bytes in C, only non-printable characters as hex

I have a program creating some byte strings that are a mix of human-readable text and control bytes (including null bytes). In order to debug these strings I would like to have a function that prints these strings, given a pointer and a size, in a way that I can read the printable ASCII characters on screen, as well as the hex value of the non-printable ones (à la Python), e.g.
first string\x00second string\x00\x01
So far I have a function that only prints the printable characters:
void print_bytes(unsigned char *bs, size_t size) {
size_t i;
for (i = 0; i < size; i++) {
fputc(bs[i], stdout);
}
printf("\n");
}
Other than that, I have only seen examples online print everything as hex sequences, which does not help me understand the contents of the strings.
How can I improve the function above to print the hex values of non-printable characters?

Using the suggestion in the comment, I rewrote the function to use isprint() from <ctype.h>:
void print_bytes(const unsigned char *bs, const size_t size) {
    for (size_t i = 0; i < size; i++) {
        if (isprint(bs[i])) {
            fputc(bs[i], stdout);       /* printable: emit as-is */
        } else {
            printf("\\x%02x", bs[i]);   /* non-printable: emit as \xNN */
        }
    }
    printf("\n");
}
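For testing it can be handy to build the escaped text in a buffer instead of printing it directly. The following is only a sketch under my own assumptions: the name escape_bytes, its signature, and the decision to also escape the backslash itself (so the output stays unambiguous) are not from the post above.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from the original post): writes an escaped copy
 * of bs[0..size) into out, which has room for cap bytes. The result is
 * always nul-terminated when cap > 0. Returns the number of characters the
 * complete result needs, excluding the nul, so the caller can detect
 * truncation by comparing the return value with cap. */
size_t escape_bytes(const unsigned char *bs, size_t size, char *out, size_t cap)
{
    size_t used = 0;

    for (size_t i = 0; i < size; i++) {
        char piece[5];              /* longest piece is "\xNN" plus nul */
        int n;

        if (bs[i] == '\\') {
            n = snprintf(piece, sizeof piece, "\\\\");   /* assumption: escape the backslash too */
        } else if (isprint(bs[i])) {
            n = snprintf(piece, sizeof piece, "%c", bs[i]);
        } else {
            n = snprintf(piece, sizeof piece, "\\x%02x", (unsigned)bs[i]);
        }

        for (int k = 0; k < n; k++) {
            if (used + 1 < cap) {   /* keep one byte free for the nul */
                out[used] = piece[k];
            }
            used++;
        }
    }

    if (cap > 0) {
        out[used < cap ? used : cap - 1] = '\0';
    }
    return used;
}
```

Called with the 14 bytes of "first\0second\0\x01" and a large enough buffer, this should produce the text first\x00second\x00\x01, which can then be passed to puts() or compared in a unit test.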

Related

Copying valid strings to 2d array in C

I am checking whether a function returns true, and if so I store the string so that the valid strings (as decided by some other function I was given) can be printed. At the moment it prints them correctly, but it also prints empty lines which seem to correspond to the invalid strings.
How can I make these empty lines go away?
Here is my code:
int main()
{
int i, count = 0;
char input[10];
char validStr[10][60] = {""};
for (i = 0; i < 60; ++i){
if(fgets(input,10, stdin) == NULL){
break;
}
input[strcspn(input,"\n")] = '\0';
if(checkIfValid(input)){
memcpy(validStr[i],input,sizeof(input));
count++;
}
}
printf("%d\n",count);
for (int j = 0 ; j < count; ++j){
printf("%s\n",validStr[j]);
}
}
The count indicates that only the valid strings are counted, but the output still contains blank lines.
Note: For various reasons the program needs to follow the current order so the output is printed after the first for loop.
Thanks in advance!
Instead of this:
if(checkIfValid(input)){
memcpy(validStr[i],input,sizeof(input));
count++;
}
This:
if(checkIfValid(input)){
memcpy(validStr[count],input,sizeof(input));
count++;
}
As others have pointed out in the comments, you want to make that string copy safe. May I suggest:
if(checkIfValid(input)){
    char* dst = validStr[count];
    size_t MAXLEN = 10;
    strncpy(dst, input, MAXLEN);
    dst[MAXLEN-1] = '\0';   /* strncpy does not terminate when input fills the buffer */
    count++;
}
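If you also want to know whether the copy was cut short, snprintf() can do the bounded copy and report the length in one call. This is a sketch only; copy_bounded is a name I made up for illustration and is not part of the code above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies src into dst, which has room for cap bytes,
 * truncating if necessary. dst is always nul-terminated when cap > 0.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t cap, const char *src)
{
    int n;

    if (cap == 0) {
        return 0;                       /* nowhere to store even the nul */
    }
    n = snprintf(dst, cap, "%s", src);  /* n is the length src needed */
    return n >= 0 && (size_t)n < cap;
}
```

With a 10-byte destination, an 8-character line such as 321qweve fits, while a 10-character line such as AAAAAAAAAA is stored as its first 9 characters and reported as truncated.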
Continuing from the comment: if you want to store the entire string, you need to provide adequate space for the nul-terminating character. Input lines such as
AAAAAAAAAA
QELETIURTE
...
are 10 characters long and will not fit in input as declared char[10].
Instead of looping with a for, let the return from fgets() control your read loop, and keep count as a loop condition to ensure you protect your array bounds, e.g.
#include <stdio.h>
#include <string.h>
#define MAXC 128 /* if you need a constant, #define one (or more) */
#define NSTR 10
int checkIfValid (const char *s) { (void)s; return 1; } /* stub: accepts every string */
int main(void)
{
size_t count = 0;
char input[MAXC];
char validStr[NSTR][MAXC] = {""};
while (count < NSTR && fgets (input, sizeof input, stdin)) {
input[strcspn(input,"\n")] = '\0';
if(checkIfValid(input)){
strcpy (validStr[count], input);
count++;
}
}
printf ("%zu\n",count);
for (size_t j = 0 ; j < count; ++j) {
printf("%s\n",validStr[j]);
}
}
(adjust your array declaration for 60 strings of 10 characters each)
If you want to cut off at 9 characters and ensure the strings are nul-terminated, @selbie has that covered.
Example Use/Output
With your data (as good as I could read it) in dat/validstr.txt you could do:
$ ./bin/validstring <dat/validstr.txt
6
AAAAAAAAAA
QELETIURTE
321qweve
sdsdsdfFF
GRSGGFDDSS
toLotssAAA

How to count the length of each word in a string of characters

I'm a beginner programmer working through some exercises, and I need help with one of them.
The exercise goes like this:
We need to read a string of characters as input and print out the length of each word.
This is what I did:
int main()
{
char str[N+1+1];
int i=0;
int pos=0;
int wordlen=0;
int word[60]={0,};
printf("Please enter the string of characters: ");
gets(str);
while(i<strlen(str))
{
if(!isalpha(str[i]))
{
wordlen=0;
i++;
}
if(isalpha(str[i]))
{
wordlen++;
i++;
pos=i;
}
word[pos]=wordlen;
wordlen=0;;
i++;
}
for(i=0;i<20;i++)
{
if(word[i]==0) // here im just trying to find a way to avoid printing 0's but you can ignore it if you want
{break;}
else
printf("%d ",word[i]);
}
return 0;
}
The problem is that when I run it and input "hi hi hi", for example, it is supposed to print 2 2 2 but instead it prints nothing.
Can I ask for some help?
I failed to follow OP's logic.
Perhaps begin again?
End-of-word
To "count the length of each word", code needs to identify the end of a word and when to print.
Detecting a non-letter while the current word length is > 0 indicates that the prior character was the end of a word. Note that every C string ends with a non-letter, '\0', so let us iterate over that too, to ensure the loop ends on a final non-letter.
int word_length = 0;
int strlength = strlen(str); // Call strlen() only once
for (i = 0; i <= strlength; i++) { // <= so the terminating '\0' is visited too
    if (isalpha(str[i])) {
        word_length++;
    } else {
        if (word_length > 0) {
            printf("%d ", word_length);
            word_length = 0;
        }
    }
}
printf("\n");
gets()
gets() was removed from the standard C library in C11 because it cannot prevent buffer overruns. Do not use it.
"We are supposed to use gets" is unfortunate and implies OP's instruction is out of date. Instead, research fgets() and maybe look for better instruction material.
Advanced
The is...() functions are better called as isalpha((unsigned char) str[i]) to handle str[i] < 0.
In general, better to use size_t than int for string sizing and indexing as the length may exceed INT_MAX. That is not likely to happen with OP's testing here.
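Putting the points above together, here is one way the end-of-word detection could be packaged as a function. It is a sketch, not OP's assignment solution: word_lengths and its parameters are names I invented, and it applies both "Advanced" notes (the unsigned char cast and size_t indexing).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: stores the length of each word of s in lens[]
 * (at most max entries are written) and returns the number of words
 * found. A "word" is a maximal run of letters as reported by isalpha(). */
size_t word_lengths(const char *s, size_t *lens, size_t max)
{
    size_t count = 0;
    size_t len = 0;

    for (size_t i = 0; ; i++) {
        if (isalpha((unsigned char)s[i])) {
            len++;
        } else {
            if (len > 0) {              /* previous character ended a word */
                if (count < max) {
                    lens[count] = len;
                }
                count++;
                len = 0;
            }
            if (s[i] == '\0') {         /* the terminator is the last non-letter */
                break;
            }
        }
    }
    return count;
}
```

A caller would read a line with fgets(), pass the buffer to word_lengths(), and print each stored length with the %zu conversion. For the input "hi hi hi" the function reports three words of length 2.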

How to turn non-printable characters into their hex values in C?

I'm trying to make a function that takes an array of characters as an input, and outputs printable characters normally, and non-printable characters in hexadecimal (by turning these character into decimal using Extended ASCII, then turning that decimal number into hex).
For example:
"This morning is ßright"
should turn into:
"This morning is E1right"
since ß in Extended ASCII is 225, and that in hexadecimal is E1.
Here is what I attempted:
void myfunction(char *str)
{
int size=0;
for (int i = 0; str[i] != NULL; i++) size++; //to identify how many characters are in the string
for (int i = 0; i < size; i++)
{
if (isprint(str[i]))
{
printf("%c", str[i]); //printing printable characters
}
else
{
if (str[i] == NULL) break; //to stop when reaching the end of the string
printf("%02x", str[i]); //This is where I'm having an issue
}
}
}
This function outputs this:
"This morning is ffffffc3ffffff9fright"
How can I turn the non-printable characters into their hex value, and what is causing this function to behave this way?
Thanks in advance!
You're seeing a couple of issues here. The first is that the char type on your machine (as on most) is signed, so a char that is not ASCII shows up as a negative number. That value is then sign-extended to int when it is passed to printf, and printing it as an unsigned hex value gives the ffffff prefixes you see.
If you mask it to 8 bits, you'll see the hex values more clearly. Use
printf("%02X", str[i] & 0xff); // X to use upper-case hex chars for clarity
and you'll get the output
This morning is C39Fright
Now you see the second problem, which is that ß is not an ASCII character. It is Unicode character U+00DF, and when it is encoded in UTF-8 it shows up as the two-byte sequence C3 9F.
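An equivalent way to avoid the sign extension is to convert the char to unsigned char before it is widened. The helper below is only an illustration under my own naming (hex_of_char is hypothetical); it formats into a small buffer so the result is easy to inspect.

```c
#include <stdio.h>

/* Hypothetical helper: formats one char as two upper-case hex digits.
 * out must have room for 3 bytes (two digits plus the nul). */
void hex_of_char(char c, char out[3])
{
    /* Converting to unsigned char first turns a negative char such as
     * (char)0xC3 into 195, so nothing is sign-extended when the value
     * is widened for snprintf. */
    unsigned value = (unsigned char)c;

    snprintf(out, 3, "%02X", value);
}
```

Feeding it the two bytes of the UTF-8 encoded ß gives "C3" and then "9F", regardless of whether plain char is signed on the platform.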
You have plenty of issues with your code.
for (int i = 0; str[i] != NULL; i++) size++; NULL is a pointer constant, while str[i] is a char. You simply want to compare with zero, which is the null character. The null character is not the same as the NULL pointer!
The same applies here: if (str[i] == NULL) break;
printf("%02x", str[i]); You use the wrong format to print a char value as a number. You should use the hh length modifier. See how it works in the code below.
Use the correct type for indexes and sizes: size_t instead of int.
Your code is overcomplicated.
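#include <ctype.h>
#include <stdio.h>
void myfunction(const char *str)
{
    while (*str)
    {
        if (isprint((unsigned char)*str)) // cast avoids passing a negative value to isprint
        {
            printf("%c", *str); // printable characters are printed as-is
        }
        else
        {
            printf("%02hhX", *str); // hh converts the value to unsigned char before printing
        }
        str++;
    }
}
int main(void)
{
    const char *str = "This morning is \xE1right";
    myfunction(str);
}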
https://godbolt.org/z/6jKWdr3rM

How to count the number of same character in C?

I'm writing code that prompts the user to enter a string, with a function of type void that prints the character that was used the most (that is, the one that appeared more often than any other) and also shows how many times it appeared in that string.
Here is what I have so far:
#include <stdio.h>
#include <string.h>
/* Write a program that prompts a user to input a string,
* accepts the string as input, and outputs the most
* frequent character in the string along with the length of the string (use strlen from string.h – this will require you to #include <string.h> at the top of your program).
* Use array syntax (e.g. array[5]) to access the elements of your array.
* You should implement a function called mostfrequent.
* The function prototype for mostfrequent is: void mostfrequent(int *counts, char *most_freq, int *qty_most_freq, int num_counts);
* Hint: Consider the integer value of the ASCII characters and how the offsets can be translated to ints.
* Assume the user inputs only the characters a through z (all lowercase, no spaces).
*/
void mostfrequent(int *counts, char *most_freq, int *qty_most_freq, int num_counts_)
{
int array[255] = {0}; // initialize all elements to 0
int i, index;
for(i = 0; most_freq[i] != 0; i++)
{
++array[most_freq[i]];
}
// Find the letter that was used the most
qty_most_freq = array[0];
for(i = 0; most_freq[i] != 0; i++)
{
if(array[most_freq[i]] > qty_most_freq)
{
qty_most_freq = array[most_freq[i]];
counts = i;
}
num_counts_++;
}
printf("The most frequent character was: '%c' with %d occurances \n", most_freq[index], counts);
printf("%d characters were used \n", num_counts_);
}
int main()
{
char array[5];
printf("Enter a string ");
scanf("%s", array);
int count = sizeof(array);
mostfrequent(count , array, 0, 0);
return 0;
}
I'm getting the wrong output too.
output:
Enter a string hello
The most frequent character was: 'h' with 2 occurances
5 characters were used
should be
The most frequent character was: 'l' with 2 occurances
5 characters were used
Let's keep it short (others will correct me if I write something wrong ^_^).
You declare an int like this:
int var;
and use it like this:
var = 3;
You declare a pointer like this:
int* pvar;
and use the pointed-to value like this:
*pvar = 3;
If you declared a variable and need to pass a pointer to it as a function argument, use the & operator like this:
functionA(&var);
or simply save its address in a pointer variable:
pvar = &var;
Those are the basics. I hope it helps.
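To see why this matters for a void function like mostfrequent, here is a small unrelated example of returning two results through pointer parameters. The function min_max is something I made up for illustration; it assumes n is at least 1.

```c
/* Hypothetical example: a void function that "returns" two values by
 * writing through the pointers it receives. Assumes n >= 1. */
void min_max(const int *values, int n, int *min_out, int *max_out)
{
    *min_out = values[0];
    *max_out = values[0];

    for (int i = 1; i < n; i++) {
        if (values[i] < *min_out) {
            *min_out = values[i];   /* writes into the caller's variable */
        }
        if (values[i] > *max_out) {
            *max_out = values[i];
        }
    }
}
```

The caller declares plain variables, for example int lo, hi;, and passes their addresses: min_max(v, 5, &lo, &hi);. Passing 0 or an int where a pointer is expected, as in the question's call, leaves the function with nowhere valid to write.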
The function prototype you are supposed to use seems to include at least one superfluous parameter (you have the total character count available in main()). In order to find the most frequently appearing character (or at least the first of the characters that occur that number of times), all you need to provide to your function is:
the character string to be evaluated;
an array sized so that each element represents one value in the range you want to examine (for ASCII characters 128 is fine; for the full range of unsigned char, 256 will do); and finally
a pointer through which to return the index in your frequency array of the most frequently used character (or the first character of a set if more than one is used that same number of times).
In your function, your goal is to loop over each character in your string. In the frequency array (that you have initialized all zero), you will map each character to an element in the frequency array and increment the value at that element each time the character is encountered. For example for "hello", you would increment:
frequency['h']++;
frequency['e']++;
frequency['l']++;
frequency['l']++;
frequency['o']++;
Above you can see that when you are done, the element frequency['l'] will hold the value 2. So when you are done, you just loop over all elements in frequency and find the index of the element that holds the largest value.
if (frequency[i] > frequency[most])
most = i;
(which is also why you will get the first of all characters that appear that number of times. If you change to >= you will get the last of that set of characters. Also, in your character count you ignore the 6th character, the '\n', which is fine for single-line input, but for multi-line input you need to consider how you want to handle that.)
In your case, putting it all together, you could do something similar to:
#include <stdio.h>
#include <ctype.h>
enum { CHARS = 256, MAXC = 1024 }; /* constants used below */
void mostfrequent (const char *s, int *c, int *most)
{
    for (; *s; s++) { /* loop over each char, fill c, set most index */
        unsigned char ch = (unsigned char)*s; /* safe index and isalpha() argument */
        if (isalpha (ch) && ++c[ch] > c[*most])
            *most = ch;
    }
}
int main (void) {
    char buf[MAXC];
    int c[CHARS] = {0}, n = 0, ndx = 0, ch; /* ndx must start at a valid index */
    /* read all chars into buf up to MAXC-1 chars */
    while (n < MAXC-1 && (ch = getchar()) != '\n' && ch != EOF)
        buf[n++] = (char)ch; /* keep ch as int so EOF is detected reliably */
    buf[n] = 0; /* nul-terminate buf */
    mostfrequent (buf, c, &ndx); /* fill c with the counts, set ndx */
    printf ("most frequent char: %c (occurs %d times, %d chars used)\n",
            ndx, c[ndx], n);
}
(note: by using isalpha() in the comparison it will handle both upper/lower case characters, you can adjust as desired by simply checking upper/lower case or just converting all characters to one case or another)
Example Use/Output
$ echo "hello" | ./bin/mostfreqchar3
most frequent char: l (occurs 2 times, 5 chars used)
(note: if you use "heello", you will still receive "most frequent char: e (occurs 2 times, 6 chars used)" due to 'e' being the first of two characters that are seen the same number of times)
There are many ways to handle frequency problems, but in essence they all work in the same manner. With ASCII characters, you can capture both the most frequent character and the number of times it occurs in a single array of int and an int holding the index where the max occurs. (You don't really need the index either; it just saves looping to find it each time it is needed.)
For more complex types, you will generally use a simple struct to hold the count and the object. For example if you were looking for the most frequent word, you would generally use a struct such as:
struct wfreq {
    char *word;
    int count;
};
Then you simply use an array of struct wfreq in the same way you are using your array of int here. Look things over and let me know if you have further questions.
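As a rough illustration of the struct-based approach, the sketch below keeps a small fixed-size table of words and counts. Everything beyond the struct itself (wfreq_add, wfreq_most, the linear search, and storing the caller's pointer without copying the word) is my own assumption for the example, and I have made the word member const here.

```c
#include <stddef.h>
#include <string.h>

struct wfreq {
    const char *word;
    int count;
};

/* Hypothetical helper: increments the count for word in table[0..*n),
 * adding a new entry if the word is not present yet. Returns the index
 * of the entry, or -1 if the table (capacity cap) is full. Only the
 * pointer is stored, so the caller must keep the string alive. */
int wfreq_add(struct wfreq *table, size_t *n, size_t cap, const char *word)
{
    size_t idx;

    for (size_t i = 0; i < *n; i++) {
        if (strcmp(table[i].word, word) == 0) {
            table[i].count++;
            return (int)i;
        }
    }
    if (*n == cap) {
        return -1;
    }
    idx = (*n)++;
    table[idx].word = word;
    table[idx].count = 1;
    return (int)idx;
}

/* Hypothetical helper: index of the entry with the largest count
 * (the first one on ties). Assumes n > 0. */
size_t wfreq_most(const struct wfreq *table, size_t n)
{
    size_t most = 0;

    for (size_t i = 1; i < n; i++) {
        if (table[i].count > table[most].count) {
            most = i;
        }
    }
    return most;
}
```

The comparison in wfreq_most is the same > test as in the character version, so the first of several equally frequent words wins here too. A linear search is fine for a handful of words; a hash table would be the usual choice for larger inputs.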
Here is what I came up with. I had messed up the pointers.
#include <stdio.h>
#include <string.h>
void mostfrequent(int *counts, char *most_freq, int *qty_most_freq, int num_counts_)
{
*qty_most_freq = counts[0];
*most_freq = 'a';
int i;
for(i = 0; i < num_counts_; i++)
{
if(counts[i] > *qty_most_freq)
{
*qty_most_freq = counts[i];
*most_freq = 'a' + i;
}
}
}
/* char string[80]
* read in string
* int counts[26]; // histogram
* zero counts (zero the array)
* look at each character in string and update the histogram
*/
int main()
{
int i;
int num_chars = 26;
int counts[num_chars];
char string[100];
/*zero out the counts array */
for(i = 0; i < num_chars; i++)
{
counts[i] = 0;
}
printf("Enter a string ");
scanf("%s", string);
for(i = 0; i < strlen(string); i++)
{
counts[(string[i] - 'a')]++;
}
int qty_most_freq;
char most_freq;
mostfrequent(counts , &most_freq, &qty_most_freq, num_chars);
printf("The most frequent character was: '%c' with %d occurances \n", most_freq, qty_most_freq);
printf("%zu characters were used \n", strlen(string)); /* strlen returns size_t, so use %zu */
return 0;
}

Reverse a string containing ASCII chars and non-ASCII chars

I have a problem: how do I reverse a string such as 'abcd汉字efg'?
str_to_reverse = "abcd汉字efg"; /* those non-ASCII chars are Chinese characters, each of them takes 2 bytes */
After reversal, it should be:
str_to_reverse = "gfe字汉dcba";
I think that to reverse the string I have to identify the non-ASCII chars, because simply reversing every byte won't give the right answer.
How can I do it?
PS:
I wrote this program under 32-bit Ubuntu.
When I printed every byte of the byte-reversed string:
for(i = 0; i < strlen(s); i++)
printf("%c", s[i]);
I got some gibberish text instead of "汉字".
Pure C89 answer:
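#include <stdlib.h>
#include <stdio.h>
#include <locale.h>
#include <string.h>
int main(void)
{
    char const* str;
    size_t slen;
    char* buf;
    char* rev;
    setlocale(LC_ALL, ""); /* mblen() follows the locale's multibyte encoding */
    str = "abcd汉字efg";
    printf("%s\n", str);
    slen = strlen(str);
    buf = malloc(slen + 1);
    if (buf == NULL) {
        fprintf(stderr, "Out of memory\n");
        return EXIT_FAILURE;
    }
    rev = buf + slen; /* fill the buffer from its end backwards */
    *rev = '\0';
    while (*str != '\0') {
        int clen, i;
        clen = mblen(str, slen); /* byte length of the next character */
        if (clen == -1) {
            fprintf(stderr, "Bad encoding\n");
            free(buf);
            return EXIT_FAILURE;
        }
        for (i = 0; i < clen; ++i) {
            *--rev = str[clen-1-i]; /* copy the character's bytes, keeping their order */
        }
        str += clen;
    }
    printf("%s\n", rev); /* rev == buf once every byte has been copied */
    free(buf);
    return 0;
}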
If the string is encoded as UTF-8, it is pretty simple. You can obtain the length of a well-formed UTF-8 sequence by inspecting only its first byte.
In a first pass, you reverse the bytes within each multi-byte UTF-8 sequence (those with length > 1).
In a second pass, you reverse the whole string byte by byte, which puts the bytes inside each sequence back in their original order.
Voilà.
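The two passes can be sketched roughly as follows. The function names are mine, the code assumes the input really is UTF-8, and I added a validation pass up front (not mentioned above) so that malformed input is rejected before anything is modified. Note that this reverses code points, so a base letter followed by a combining mark would end up with the mark in front.

```c
#include <stddef.h>
#include <string.h>

/* Reverses n bytes in place. */
static void reverse_bytes(char *p, size_t n)
{
    if (n < 2) {
        return;
    }
    for (size_t i = 0, j = n - 1; i < j; i++, j--) {
        char t = p[i];
        p[i] = p[j];
        p[j] = t;
    }
}

/* Length of a UTF-8 sequence judged from its first byte,
 * or 0 if the byte cannot start a sequence. */
static size_t utf8_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Hypothetical helper: reverses the code points of the UTF-8 string s in
 * place. Returns 0 on success, or -1 (leaving s untouched) if s is not
 * structurally well-formed UTF-8. */
int utf8_reverse(char *s)
{
    size_t len = strlen(s);

    /* Validation pass: every sequence must be complete. */
    for (size_t i = 0; i < len; ) {
        size_t n = utf8_len((unsigned char)s[i]);
        if (n == 0 || n > len - i) {
            return -1;
        }
        for (size_t k = 1; k < n; k++) {
            if (((unsigned char)s[i + k] & 0xC0) != 0x80) {
                return -1;              /* not a continuation byte */
            }
        }
        i += n;
    }

    /* First pass: reverse the bytes inside each sequence. */
    for (size_t i = 0; i < len; ) {
        size_t n = utf8_len((unsigned char)s[i]);
        reverse_bytes(s + i, n);
        i += n;
    }

    /* Second pass: reverse the whole string. */
    reverse_bytes(s, len);
    return 0;
}
```

The string must be in writable memory (for example char s[] = "..."), not a string literal accessed through a pointer, since the reversal happens in place.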
