I am trying to make a function to print/store the first n bytes of a u_char pointer. This is what I currently have (just trying to print the first value), but it doesn't work.
void print_first_value(const u_char * p) {
printf("%d", p[0]);
}
I have also tried this:
void print_first_value(const u_char * p) {
printf("%d", &p[0]);
}
How would I make this work? In the end I want to loop through the individual values in *p, but I can only print the entire string at the address pointed to by p via this code.
void print_first_value(const u_char * p) {
printf("%s", p);
}
So what I am printing out is packets, sorry I didn't mention that. The last snippet of code prints a packet in hex, so something like 0050 5686 7654 0000... and I want to print/store the values at certain indexes. So I want the first two blocks 00505686, then the next two and so on.
First of all, a few notes about your code:
u_char isn't a standard type. unsigned char is the standard way of spelling this type. While you might be using a typedef in your codebase (such as typedef unsigned char u_char;), it's a better idea to use the standard type, particularly when posting code using that typedef without the typedef itself.
&p[0] and p mean the exact same thing in C, regardless of the value of p (assuming that it is a pointer). By the same reasoning, p[0] and *p also mean the same thing. I'll be using p and *p exclusively in further examples, but keep in mind the equivalence.
unsigned char is an integral type. This means that its value is an integer. The fact that this value can also be interpreted as a character is incidental. This will be very relevant soon.
Now, as for your snippets. Let's go in reverse order. The last one just prints the string, as you know.
The second one is undefined behavior. printf("%d", p) (&p[0] = p, remember?) is passing a pointer as an argument (p is of type const unsigned char *), but %d expects an int. The arguments must match the types indicated by the format specifiers; it is an error to do otherwise. It will probably "work" (as in, not crash), but it's something you definitely shouldn't do. It's not valid C.
The first one is the most interesting one. First of all, printf("%d", *p) isn't undefined behavior, unlike the second snippet's case. *p is const unsigned char (the pointer has been dereferenced), and any type narrower than int gets promoted to int on variadic parameter lists (printf is defined as int printf(const char *, ...); the , ... at the end indicates that it accepts any number of arguments of any type, and it is often referred to as variadic because of this reason), so this is valid.
And in fact, it works. Let's try a full program using it:
#include <stdio.h>
void print_first_value (const unsigned char * p) {
printf("%d", *p);
}
int main (void) {
char str[] = "Hello world!";
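print_first_value((const unsigned char *)str); /* char * and unsigned char * are distinct pointer types, so convert explicitly */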
return 0;
}
Assuming you're not using a particularly strange computer or OS, you'll get 72 printed this way. This is not wrong! 72 happens to be the number (called a codepoint) that internally represents a capital letter H in ASCII. Remember how I said that unsigned char was an integral type? This is what it means: its value is really a number. You asked your computer to print the number, and it did.
If you want to print the character that this number represents, though, you have two choices: use %c as a format specifier in printf (which tells it to print the character) or use the putchar/putc functions (which take a single number and print the character they represent). Let's go with the latter:
#include <stdio.h>
void print_first_character (const char * p) {
// it doesn't matter if it is unsigned or signed,
// because we're just printing the character
putchar(*p);
}
int main (void) {
char str[] = "Hello world!";
print_first_character(str);
return 0;
}
Now you'll get H. Getting somewhere! Now, to print all the characters in the string, we need to know one extra detail: after all meaningful characters in a string, the very last one is always zero. As in, the number zero, not the character '0'. (This is often written as '\0', but that is the same as zero.) So, here we go:
#include <stdio.h>
void print_first_character (const char * p) {
putchar(*p);
}
int main (void) {
char message[] = "Hello world!";
const char * str = message; // initialize the pointer to the beginning of the string
while (*str) { // while *str isn't zero
print_first_character(str); // print the character...
str ++; // ...and advance to the next one
}
putchar('\n'); // let's print a newline too, so the output looks nicer
return 0;
}
And here we go! Hello world! will be printed. Of course, puts("Hello world!"); would have done the same, but that isn't as fun, now is it?
Per Your Edit You Are Printing Packets
Ah hah! That makes more sense. When you create an unsigned char pointer to an unsigned value you have a pointer to the beginning of the value in memory, but how the value is stored will depend on endianness of the machine and the byte-order of the bytes in the packet.
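Simply storing/printing out the bytes as they are currently stored in memory isn't difficult, nor is storing/printing each two-byte pair. Note that both helpers below walk the buffer from the highest address down to the lowest, which is why a little-endian machine prints 0xdeadbeef in its natural reading order. Each may be done with something similar to: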
/* all bytes stored in memory */
void prn_all (const unsigned char *p, size_t nbytes)
{
while (nbytes--)
printf ("0x%02x\n", p[nbytes]);
}
/* each 2-bytes stored in memory */
void prn_two (const unsigned char *p, size_t nbytes)
{
while (nbytes--) {
printf ("%02x", p[nbytes]);
if (nbytes % 2 == 0)
putchar ('\n');
}
}
...
unsigned u = 0xdeadbeef;
unsigned char *p = (unsigned char *)&u;
prn_all (p, sizeof u);
putchar ('\n');
prn_two (p, sizeof u);
Would result in:
$ /bin/prn_uchar_byte
0xde
0xad
0xbe
0xef
dead
beef
Now the caveat. Since you mention "packet", depending on whether the packet is in network-byte-order or host-byte-order, you may need a conversion (or simple bit shifts) to get the bytes in the order you need. POSIX provides functions to convert between network-byte-order and host-byte-order and vice-versa: htonl, htons, ntohl and ntohs (see man 3 byteorder). They are needed because network byte order is Big Endian while normal x86 and x86_64 is Little Endian. If your packets are in network byte order and you need host byte order, you can simply call ntohs (network to host short) to convert each two-byte value to host order, e.g.
/* each 2-bytes converted to host byte order from network byte order */
void prn_two_host_order (const unsigned char *p, size_t nbytes)
{
for (size_t i = 0; i + 1 < nbytes; i+=2) { /* stop before a trailing odd byte */
uint16_t hostorder = ntohs (*(const uint16_t*)(p+i)); /* keep the const qualifier */
printf ("%04" PRIx16 "\n", hostorder);
}
}
...
prn_two_host_order (p, sizeof u);
Results in:
efbe
adde
(note: the prototypes for ntohs and the other byte-order conversions use the exact-width types uint16_t and uint32_t; the associated print macros are in inttypes.h, which also includes stdint.h)
You will have to determine the byte order used in your "packets" to know whether a byte-order conversion is needed. That will depend on how you get your data.
Putting it all together in a short example, you could do something like:
#include <stdio.h>
#include <inttypes.h>
#include <arpa/inet.h>
/* all bytes stored in memory */
void prn_all (const unsigned char *p, size_t nbytes)
{
while (nbytes--)
printf ("0x%02x\n", p[nbytes]);
}
/* each 2-bytes stored in memory */
void prn_two (const unsigned char *p, size_t nbytes)
{
while (nbytes--) {
printf ("%02x", p[nbytes]);
if (nbytes % 2 == 0)
putchar ('\n');
}
}
/* each 2-bytes converted to host byte order from network byte order */
void prn_two_host_order (const unsigned char *p, size_t nbytes)
{
for (size_t i = 0; i + 1 < nbytes; i+=2) { /* stop before a trailing odd byte */
uint16_t hostorder = ntohs (*(const uint16_t*)(p+i)); /* keep the const qualifier */
printf ("%04" PRIx16 "\n", hostorder);
}
}
int main (void) {
unsigned u = 0xdeadbeef;
unsigned char *p = (unsigned char *)&u;
prn_all (p, sizeof u);
putchar ('\n');
prn_two (p, sizeof u);
putchar ('\n');
prn_two_host_order (p, sizeof u);
}
(note: some systems use the header netinet/in.h instead of arpa/inet.h for the byteorder conversion as listed in the man page)
Full Example Use/Output
$ /bin/prn_uchar_byte
0xde
0xad
0xbe
0xef
dead
beef
efbe
adde
You can store the values instead of printing -- but that is left to you. Look things over and let me know if you have questions.
Related
I'm trying to make an analogue of sscanf with a specifier %p.
I use this:
int res = ahex2num(buf);
*va_arg(ap, void **) = (void *) res;
It works correctly, and I actually get the address I pass, like 0x1A, but I am getting this warning:
warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
In main function:
int main(){
void *a;
readFromStr("0x1A", "%p", &a);
printf("%p", a);
return 0;
}
/*
out: 0x1a
*/
Can I somehow avoid this?
long ahex2num(unsigned char *in){
unsigned char *pin = in;
long out = 0;
while(*pin != 0){
out <<= 4;
out += (*pin < 'A') ? *pin & 0xF : (*pin & 0x7) + 9;
pin++;
}
return out;
}
Apparently pointers, particularly void *, have a different size than int on your system. E.g., pointers may be 64 bits and int may be 32 bits. Implementing %p in a routine like sscanf is a valid reason for converting an integer to void *, but you need to use an integer type that can hold all the bits needed for a pointer. A good type for this may be uintptr_t, declared in <stdint.h>.
You will need to ensure all the code that works with the integers from the scanning, such as ahex2num, can support the necessary width and signedness, including handling potential overflow as desired.
I can't test this without your entire code, but I would expect the warning to go away, without resorting to a pragma, once res has an integer type as wide as a pointer, for example long int on platforms where long and pointers are both 64 bits.
I solved this problem like this:
long long int res = ahex2num(buf);
Can you help me understand why the printed value of my dataStruct structure isn't the value of one of its members (as it is for the simpleDataStruct structure)?
I print the value with this line:
printf("dataStruct:..............0x%X\r\n", dataStruct);
And the result is:
dataStruct:..............0x22FE20
I use GCC.
My code is:
int main(void)
{
typedef struct Main_SimpleStructData_s
{
unsigned char a;
unsigned char b;
}
Main_SimpleStructData_t;
typedef struct Main_StructuredData_s
{
unsigned char a;
unsigned char* b;
}
Main_StructuredData_t;
unsigned char localDataA = 0xBE;
unsigned char localDataB = 0xEF;
unsigned char localDataC = 0xCA;
unsigned char localDataD = 0xFE;
Main_SimpleStructData_t simpleDataStruct;
Main_StructuredData_t dataStruct;
simpleDataStruct.a = localDataA;
simpleDataStruct.b = localDataB;
dataStruct.a = localDataC;
dataStruct.b = &localDataD;
printf("\r\n");
printf("simpleDataStruct:........0x%X\r\n", simpleDataStruct);
printf("Addr simpleDataStruct: 0x%X\r\n", &simpleDataStruct);
printf("Size simpleDataStruct: %u\r\n", (unsigned)sizeof(simpleDataStruct));
printf("\r\n");
printf("Addr localDataC: 0x%X\r\n", &localDataC);
printf("Size localDataC: %u\r\n", (unsigned)sizeof(localDataC));
printf("Addr localDataD: 0x%X\r\n", &localDataD);
printf("Size localDataD: %u\r\n", (unsigned)sizeof(localDataD));
printf("dataStruct:..............0x%X\r\n", dataStruct);
printf("dataStruct.a: 0x%X\r\n", dataStruct.a);
printf("dataStruct.b: 0x%X\r\n", dataStruct.b);
printf("Addr dataStruct: 0x%X\r\n", &dataStruct);
printf("Addr dataStruct.a: 0x%X\r\n", &(dataStruct.a));
printf("Addr dataStruct.b: 0x%X\r\n", &(dataStruct.b));
printf("Size dataStruct: %u\r\n", (unsigned)sizeof(dataStruct));
return (0);
}
And the result is:
simpleDataStruct:........0xEFBE
Addr simpleDataStruct: 0x22FE4A
Size simpleDataStruct: 2
Addr localDataC: 0x22FE4D
Size localDataC: 1
Addr localDataD: 0x22FE4C
Size localDataD: 1
dataStruct:..............0x22FE20
dataStruct.a: 0xCA
dataStruct.b: 0x22FE4C
Addr dataStruct: 0x22FE30
Addr dataStruct.a: 0x22FE30
Addr dataStruct.b: 0x22FE38
Size dataStruct: 16
In advance, thank you.
The %X conversion takes an unsigned int argument. You incorrectly pass a struct Main_StructuredData_s, which is not an unsigned int, which is undefined behaviour, so I don't know why you would expect to see something reasonable as the result.
edit: As for why the Main_SimpleStructData_t appears to "work" by showing its members, the answer is still that it's undefined behaviour and it may do whatever, including the "correct" thing. The underlying reason in this particular case is almost certainly:
printf tries to read an unsigned int argument (because it doesn't know what you actually passed, when you say that you passed an unsigned int)
The small Main_SimpleStructData_t happens to be passed as an argument in the same way as an unsigned int would be (on your platform), and the printf ends up reading in its members' values.
The larger Main_StructuredData_t happens to be passed as an argument in a different way (e.g., on the stack instead of in a register) and the printf reads some random value instead because the struct isn't in the place where the unsigned int argument would have been.
When you call printf, the arguments are passed the way the platform's calling convention dictates (in registers, on the stack, or both).
The arguments themselves carry no information about their data types; supplying that is the job of the format string.
Each format specifier tells printf the type, and therefore the size, of the corresponding argument. printf has no other way of knowing.
printf cannot handle a user-defined struct like that. If you give it the format specifier %X, it will just try to read an unsigned int from wherever one would have been passed, which is undefined behavior. You can print the address of the struct (prefix it with &, and use %p with a cast to void *) or the members of the struct, but not the struct itself.
You can write your own printf-like function with a custom format specifier that takes the struct by value and prints its members internally, but as it stands you have not. Search for stdarg.h for more info.
@EOF, @Ctx, @Anders and @Arkku: Thank you very much for your help. I understand that the problem is between the laptop and my chair ;) I was stupid but now I'm a man :)
To summarize, I didn't know how to use the printf function correctly. GCC had alerted me with warnings, but I didn't read them...
If my simple Main_SimpleStructData_t structure becomes more complicated, the behaviour is the same: undefined!
typedef struct Main_SimpleStructData_s
{
unsigned char a;
unsigned int c; // Add a little bit complication
unsigned char b;
}
Main_SimpleStructData_t;
The result becomes:
simpleDataStruct:......0x22FE20 // Undefined behaviour also !
Addr simpleDataStruct: 0x22FE40
Size simpleDataStruct: 12
I'm reimplementing printf from scratch and I need to store a pointer's address in a string and then print it, so first I cast the void* to an unsigned int and then itoa it to hexadecimal, but the last three characters are wrong.
int main(void)
{
char str[] = "printf from scrap!";
my_printf("MY_PRINTF:'%p'", (void*)str);
printf("\n PRINTF:'%p'\n\n", (void*)str);
return (0);
}
int conv_p(va_list args)
{
void *ptr;
unsigned int ptrint;
ptr = va_arg(args, void*);
ptrint = (unsigned int)&ptr;
my_putstr("0x7fff");
my_putstr(my_itoa_base_uint(ptrint, 16));
return (1);
}
Output:
MY_PRINTF:'0x7fff505247b0'
PRINTF:'0x7fff50524a20'
As you can see, the last three characters are wrong. Is there any documentation about that?
In the second case, you're converting the address of the variable ptr to an int, rather than its value (the pointer you're interested in).
Replacing (unsigned int)&ptr; with (unsigned int)ptr; will give you consistent values.
And an additional aside: there's no guarantee unsigned int is large enough to represent the pointer value: you should use intptr_t or uintptr_t from <stdint.h>.
I'm learning C and am now studying a simple code snippet that shows the byte representation of primitive values:
typedef unsigned char *byte_pointer;
void show_bytes(byte_pointer start, int len) {
int i;
for (i = 0; i < len; i++)
printf(" %.2x", start[i]);
printf("\n");
}
void show_float(float x) {
show_bytes((byte_pointer) &x, sizeof(float));
}
void show_int(int x) {
show_bytes((byte_pointer) &x, sizeof(int));
}
void show_pointer(void *x) {
show_bytes((byte_pointer) &x, sizeof(void *));
}
If I understand correctly, &x (with the ampersand) gives the address in memory (equal to *x).
So the program shows the hexadecimal value of each byte of each data type, with the number of bytes given by, for example, sizeof(int).
I don't really understand how it works. First, we typedef a pointer to unsigned char, and then use it with other types. What is the meaning of (byte_pointer) &x, and why does it work when byte_pointer is defined in terms of unsigned char? I understand that we get the address of the memory that contains the value, but I don't know exactly how it works and WHY it works with a char pointer. Could you explain that part?
Thanks.
The code simply takes the address of an arbitrary chunk of data and prints the contents byte by byte. It takes the address of whatever you pass to it, then converts that address to a pointer-to-byte (unsigned char *). Any object pointer type in C can be converted to another object pointer type, although in some cases doing so is dangerous practice. Converting to a character pointer is safe, though: you are guaranteed to get a pointer to the lowest addressed byte of the object.
Note that hiding a pointer behind a typedef is bad and dangerous practice. Just forget about that typedef; it adds nothing of value. A better way to write the same code (with uint8_t coming from <stdint.h>) would be:
void show_bytes (const uint8_t* start, int len)
or alternatively
void show_bytes (const void* s, int len)
{
const uint8_t* start = s;
...
byte_pointer is defined to be a pointer to an unsigned char; this is so show_bytes can print out each individual byte (in hexadecimal) of what the address passed to show_bytes points to.
I would have declared start to be a void* and converted it inside of show_bytes. That a) makes it clearer that show_bytes doesn't care what type of thing start points to, and b) avoids the cast in every call.
Here is the full code of it
#include <stdio.h>
#include <string.h>
void reverse_string(unsigned short *buf, int length)
{
int i;
unsigned short temp;
for (i = 0; i < length / 2; i++)
{
temp = buf[i];
buf[i] = buf[length - i - 1];
buf[length - i - 1] = temp;
}
}
int main(int argc, char **argv)
{
unsigned short* tmp = (unsigned short*)argv[1];
reverse_string(tmp,strlen(argv[1]) / 2);
printf("%s",argv[1]);
return 0;
}
As you can see, in main, we have
unsigned short* tmp = (unsigned short*)argv[1];
Aren't pointers supposed to point "to the address of" a variable? The one above doesn't (there is no ampersand). Yet the program works as intended.
Why is it like that?
And what does this part mean?
(unsigned short*)argv[1]
argv is a pointer to the first element of an array of pointers:
argv[0][0] (a char)
argv[0] (a char*)
argv (a char**)
unsigned char* tmp = (unsigned char*)argv[1];
...works, because you're referencing the second "string" in that set.
Note that char and unsigned short are different types and usually different sizes ("these days" a short is usually 16 bits and a char is usually 8 bits), so it is not a good idea to treat one as the other. For example, if you compiled to enable a "unicode" command line, you might get 16-bit units instead of char forwarded to you from the command line, but that is a dangerous assumption to build on.
Addressing the original questions:
argv is an array of pointers, each of which point to a character array. argv[1] is a pointer to the character array with the first argument (i.e. if you run ./program arg1 arg2, the pointer argv[1] points to the string arg1).
The ampersand is the address-of operator: &x yields a pointer to the variable x. It is only needed when you start from an object and want its address. argv[1] is already a pointer (a char *), so there is nothing to take the address of. The common example of the operator is scanf.
int x = 1;
scanf(..., &x, ...)
is equivalent to
int x = 1;
int *p = &x;
scanf(..., p, ...)
The program itself is designed to flip endianness. It's not sufficient to go character-by-character because you have to flip two bytes at a time (ie short-by-short), which is why it works using shorts.
(unsigned short*)argv[1] instructs the compiler to treat the address as if it were an array of shorts. To give an example:
unsigned char *c = (unsigned char *)argv[1];
c[1]; /* the byte located one byte past the start of the string argv[1] */
unsigned short *s = (unsigned short *)argv[1];
s[1]; /* the unsigned short located sizeof(unsigned short) bytes (typically two) past the start of argv[1] */
Take a look at a primer on type casting.