Question about a pointer in a program. [C] - c

Here is the full code of it
#include <stdio.h>
#include <string.h>
void reverse_string(unsigned short *buf, int length)
{
int i;
unsigned short temp;
for (i = 0; i < length / 2; i++)
{
temp = buf[i];
buf[i] = buf[length - i - 1];
buf[length - i - 1] = temp;
}
}
int main(int argc, char **argv)
{
unsigned short* tmp = (unsigned short*)argv[1];
reverse_string(tmp,strlen(argv[1]) / 2);
printf("%s",argv[1]);
return 0;
}
As you can see, in main, we have
unsigned short* tmp = (unsigned short*)argv[1];
Aren't pointers supposed to point to "the address of" a variable? The one above doesn't use the ampersand, yet the program works as intended.
Why is it like that?
And what does this part mean?
(unsigned short*)argv[1]

argv is a pointer to the first element of an array of pointers:
argv[0][0] (a char)
argv[0] (a char*)
argv (a char**)
unsigned char* tmp = (unsigned char*)argv[1];
...works, because you're referencing the second "string" in that set.
Note that "char" and "unsigned short" might look roughly equivalent here depending on the compiler and platform, but it is not a good idea to assume that. For example, if you compiled to enable a "unicode" command line, then you might get "short" instead of "char" forwarded to you from the command line. That is a dangerous assumption, though, because "these days" a "short" is usually 16 bits and a "char" is usually 8 bits.

Addressing the original questions:
argv is an array of pointers, each of which point to a character array. argv[1] is a pointer to the character array with the first argument (i.e. if you run ./program arg1 arg2, the pointer argv[1] points to the string arg1).
The ampersand is the address-of operator: &x yields a pointer to the variable x. You only need it when you have a variable and want its address. argv[1] is already a pointer (a char *), so no ampersand is required. The common example of the ampersand is scanf.
int x = 1;
scanf("%d", &x);
is equivalent to
int x = 1;
int *p = &x;
scanf("%d", p);
The program reverses the argument in two-byte units rather than byte by byte (for example, abcdef becomes efcdab on a system where a short is two bytes). Going character-by-character would reverse the individual bytes instead, which is why it works using shorts.
(unsigned short*)argv[1] instructs the compiler to treat the address as if it pointed to an array of shorts. To give an example:
unsigned char *c = (unsigned char *)argv[1];
c[1]; /* the byte located one byte after the address in argv[1] */
unsigned short *s = (unsigned short *)argv[1];
s[1]; /* the short located two bytes after the address in argv[1] (where a short is two bytes) */

Take a look at a primer on type casting.

Related

Print/store n bytes of u_char pointer in c?

I am trying to make a function to print/store the first n bytes of an u_char pointer. This is what I currently have (just trying to print the first value), but it doesn't work.
void print_first_value(const u_char * p) {
printf("%d", p[0]);
}
I have also tried this:
void print_first_value(const u_char * p) {
printf("%d", &p[0]);
}
How would I make this work? In the end I want to loop through the individual values in *p, but I can only print the entire string at the address pointed to by p via this code.
void print_first_value(const u_char * p) {
printf("%s", p);
}
So what I am printing out is packets, sorry I didn't mention that. The last snippet of code prints a packet in hex, so something like 0050 5686 7654 0000... and I want to print/store the values at certain indexes. So I want the first two blocks 00505686, then the next two and so on.
First of all, a few notes about your code:
u_char isn't a standard type. unsigned char is the standard way of spelling this type. While you might be using a typedef in your codebase (such as typedef unsigned char u_char;), it's a better idea to use the standard type, particularly when posting code using that typedef without the typedef itself.
&p[0] and p mean the exact same thing in C, regardless of the value of p (assuming that it is a pointer). By the same reasoning, p[0] and *p also mean the same thing. I'll be using p and *p exclusively in further examples, but keep in mind the equivalence.
unsigned char is an integral type. This means that its value is an integer. The fact that this value can also be interpreted as a character is incidental. This will be very relevant soon.
Now, as for your snippets. Let's go in reverse order. The last one just prints the string, as you know.
The second one is undefined behavior. printf("%d", p) (&p[0] = p, remember?) is passing a pointer as an argument (p is of type const unsigned char *), but %d expects an int. The arguments must match the types indicated by the format specifiers; it is an error to do otherwise. It will probably "work" (as in, not crash), but it's something you definitely shouldn't do. It's not valid C.
The first one is the most interesting one. First of all, printf("%d", *p) isn't undefined behavior, unlike the second snippet's case. *p is const unsigned char (the pointer has been dereferenced), and any type narrower than int gets promoted to int on variadic parameter lists (printf is defined as int printf(const char *, ...); the , ... at the end indicates that it accepts any number of arguments of any type, and it is often referred to as variadic because of this reason), so this is valid.
And in fact, it works. Let's try a full program using it:
#include <stdio.h>
void print_first_value (const unsigned char * p) {
printf("%d", *p);
}
int main (void) {
unsigned char str[] = "Hello world!"; // unsigned char, to match the parameter type
print_first_value(str);
return 0;
}
Assuming you're not using a particularly strange computer or OS, you'll get 72 printed this way. This is not wrong! 72 happens to be the number (called a codepoint) that internally represents a capital letter H in ASCII. Remember how I said that unsigned char was an integral type? This is what it means: its value is really a number. You asked your computer to print the number, and it did.
If you want to print the character that this number represents, though, you have two choices: use %c as a format specifier in printf (which tells it to print the character) or use the putchar/putc functions (which take a single number and print the character they represent). Let's go with the latter:
#include <stdio.h>
void print_first_character (const char * p) {
// it doesn't matter if it is unsigned or signed,
// because we're just printing the character
putchar(*p);
}
int main (void) {
char str[] = "Hello world!";
print_first_character(str);
return 0;
}
Now you'll get H. Getting somewhere! Now, to print all the characters in the string, we need to know one extra detail: after all meaningful characters in a string, the very last one is always zero. As in, the number zero, not the character '0'. (This is often written as '\0', but that is the same as zero.) So, here we go:
#include <stdio.h>
void print_first_character (const char * p) {
putchar(*p);
}
int main (void) {
char message[] = "Hello world!";
const char * str = message; // initialize the pointer to the beginning of the string
while (*str) { // while *str isn't zero
print_first_character(str); // print the character...
str ++; // ...and advance to the next one
}
putchar('\n'); // let's print a newline too, so the output looks nicer
return 0;
}
And here we go! Hello world! will be printed. Of course, puts("Hello world!"); would have done the same, but that isn't as fun, now is it?
Per Your Edit You Are Printing Packets
Ah hah! That makes more sense. When you create an unsigned char pointer to an unsigned value, you have a pointer to the beginning of the value in memory, but how the value is stored depends on the endianness of the machine and the byte order of the bytes in the packet.
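Simply storing/printing out the bytes as they are currently stored in memory isn't difficult, nor is storing/printing each two bytes. Note that the loops below walk from the last byte down to the first, which on a little-endian machine prints the most significant byte first. Each may be done with something similar to: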
/* all bytes stored in memory */
void prn_all (const unsigned char *p, size_t nbytes)
{
while (nbytes--)
printf ("0x%02x\n", p[nbytes]);
}
/* each 2-bytes stored in memory */
void prn_two (const unsigned char *p, size_t nbytes)
{
while (nbytes--) {
printf ("%02x", p[nbytes]);
if (nbytes % 2 == 0)
putchar ('\n');
}
}
...
unsigned u = 0xdeadbeef;
unsigned char *p = (unsigned char *)&u;
prn_all (p, sizeof u);
putchar ('\n');
prn_two (p, sizeof u);
Would result in:
$ /bin/prn_uchar_byte
0xde
0xad
0xbe
0xef
dead
beef
Now the caveat. Since you mention "packet", depending on whether the packet is in network byte order or host byte order, you may need a conversion (or simple bit shifts) to get the bytes in the order you need. POSIX provides functions to convert between network byte order and host byte order: htonl, htons, ntohl and ntohs (see man 3 byteorder). They are needed because network byte order is big-endian while x86 and x86_64 are little-endian. If your packets are in network byte order and you need host byte order, you can simply call ntohs (network to host short) to convert each two-byte value to host order, e.g.
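/* each 2-bytes converted to host byte order from network byte order */
void prn_two_host_order (const unsigned char *p, size_t nbytes)
{
for (size_t i = 0; i + 1 < nbytes; i+=2) {
uint16_t netorder;
/* copy the two bytes out (memcpy needs string.h) instead of casting p
to uint16_t *, which could be misaligned and discards const */
memcpy (&netorder, p + i, sizeof netorder);
uint16_t hostorder = ntohs (netorder);
printf ("%04" PRIx16 "\n", hostorder);
}
}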
...
prn_two_host_order (p, sizeof u);
Results in:
efbe
adde
(note: the prototype for ntohs (and all byteorder conversions) use exact-width types uint16_t and uint32_t -- for which the associated print macros are in inttypes.h -- which also automatically includes stdint.h)
You will have to determine the byte order of your "packets" to know whether a byte order conversion is needed. That will depend on how you get your data.
Putting it all together in a short example, you could do something like:
#include <stdio.h>
#include <string.h>
#include <inttypes.h>
#include <arpa/inet.h>
/* all bytes stored in memory, printed from the last byte to the first */
void prn_all (const unsigned char *p, size_t nbytes)
{
while (nbytes--)
printf ("0x%02x\n", p[nbytes]);
}
/* each 2-bytes stored in memory, printed from the last byte to the first */
void prn_two (const unsigned char *p, size_t nbytes)
{
while (nbytes--) {
printf ("%02x", p[nbytes]);
if (nbytes % 2 == 0)
putchar ('\n');
}
}
/* each 2-bytes converted to host byte order from network byte order */
void prn_two_host_order (const unsigned char *p, size_t nbytes)
{
for (size_t i = 0; i + 1 < nbytes; i+=2) {
uint16_t netorder;
/* copy the two bytes out instead of casting p to uint16_t *,
which could be misaligned and discards const */
memcpy (&netorder, p + i, sizeof netorder);
uint16_t hostorder = ntohs (netorder);
printf ("%04" PRIx16 "\n", hostorder);
}
}
int main (void) {
unsigned u = 0xdeadbeef;
unsigned char *p = (unsigned char *)&u;
prn_all (p, sizeof u);
putchar ('\n');
prn_two (p, sizeof u);
putchar ('\n');
prn_two_host_order (p, sizeof u);
}
(note: some systems use the header netinet/in.h instead of arpa/inet.h for the byteorder conversion as listed in the man page)
Full Example Use/Output
$ /bin/prn_uchar_byte
0xde
0xad
0xbe
0xef
dead
beef
efbe
adde
You can store the values instead of printing -- but that is left to you. Look things over and let me know if you have questions.

Printing pointer's address in C

I'm reimplementing printf from scratch and I need to store a pointer's address in a string and then print it. First I cast the void* to an unsigned int, then convert it to hexadecimal with itoa, but the last three characters are wrong.
int main(void)
{
char str[] = "printf from scrap!";
my_printf("MY_PRINTF:'%p'", (void*)str);
printf("\n PRINTF:'%p'\n\n", (void*)str);
return (0);
}
int conv_p(va_list args)
{
void *ptr;
unsigned int ptrint;
ptr = va_arg(args, void*);
ptrint = (unsigned int)&ptr;
my_putstr("0x7fff");
my_putstr(my_itoa_base_uint(ptrint, 16));
return (1);
}
Output:
MY_PRINTF:'0x7fff505247b0'
PRINTF:'0x7fff50524a20'
As you can see, the last three characters are wrong. Is there any documentation about that?
In conv_p, you're converting the address of the local variable ptr to an integer, rather than its value (the pointer you're interested in).
Replacing (unsigned int)&ptr; with (unsigned int)ptr; will give you consistent values.
And an additional aside: there's no guarantee unsigned int is large enough to represent the pointer value: you should use intptr_t or uintptr_t from <stdint.h>.

get first char from *char[] variable in C

I want to get the first character of a string (char[]) in C.
unsigned int N;
unsigned int F;
unsigned int M;
char C;
int main (int argc, char *argv[]){
if (argc!=5){
printf("Invalid number of arguments! (5 expected)\n");
exit(-1);
}
N = atoi(argv [1]);
F = atoi(argv [2]);
M = atoi(argv [3]);
C = (char) argv[4]; //this way gives a wrong char value to variable C
The code above gives me the warning: cast from pointer to integer of different size.
EDIT: as pointed in comments, argv is char *[], not char[].
There are two main ways to do this. The first is to simply dereference the pointer.
C = *argv[4];
You can also do it via array subscript (C = argv[4][0];), which implicitly adds an offset to the pointer before dereferencing it.
Be sure to check that the pointer isn't null first. In general, you need to be careful when dealing with pointers.
argv[4] is a pointer to the first character of a string (a char *), not a single char. You need to dereference it to obtain a single element:
C = *(argv[4]);

Can int pointer be casted to char *?

The program below tests for little/big endianness on an Intel processor; little-endian is the correct output. First I cast the address of the int to char * and read the value through it directly, without storing it in a pointer variable. I don't understand the second part of the output, where the int pointer is cast to char * and assigned to an int *. Why doesn't the pointer then behave like a char *?
00000000 00000000 00000011 01111111 = 895
0 0 3 127
int main() {
int num = 895;
if(*(char *)&num == 127)
{
printf("\nLittle-Endian\n");
}
else
{
printf("Big-Endian\n");
}
int *p = (char *)&num ;
if(*p == 127)
{
printf("\nLittle-Endian\n");
}
else
{
printf("Big-Endian\n");
}
printf("%d\n",*p);
}
o/p
Little-Endian
Big-Endian
895
The first half of your program using this comparison:
if(*(char *)&num == 127)
looks fine.
The second half of your program contains this assignment:
int *p = (char *)&num ;
This isn't valid code. You can't convert between incompatible pointer types without an explicit cast (conversions to and from void * are the exception). In this case, your compiler might be letting you get away with it, but strictly speaking, it's incorrect. This line should read:
int *p = (int *)(char *)&num;
or simply this equivalent statement:
int *p = &num;
From this example, I'm sure you can see why your second test doesn't work the way you'd like it to - you're still operating on the whole int, not on the single byte you were interested in. If you made p a char *, it would work the way you expected:
char *p = (char *)&num;
Can int pointer be cast to char *?
Yes. It's only the inverse that can invoke undefined behavior: more precisely, converting a char * to an int * when the address isn't suitably aligned for an int, or reading through the result when no int is stored there. Since char is 1-byte aligned, any data pointer type can safely be cast to char *.

pointers and memory addresses

I recently did an assignment using bit masking and shifting to manipulate a 4 byte int.
I got to wondering if it was possible to set a char pointer to the start of the int variable and then step through the int as if it was a 1 byte char by using the char pointer.
Is there a way to do this or something similar? I tried to set the char pointer to an int but when I step ahead by 1 it jumps 4 bytes instead.
Just trying to think of alternative ways of doing the same thing.
Of course you can, this code shows the behavior:
#include <stdio.h>
int main()
{
int value = 1234567;
char *pt = (char*) &value;
printf("first char: %p, second char: %p\n", pt, pt+1);
}
This outputs:
first char: 0x7fff5fbff448, second char: 0x7fff5fbff449
As you can see, the difference is just 1 byte as intended, because the pointer arithmetic is done after casting to a pointer to a smaller type.
I imagine this should do what you want:
int x = 42;
char *c = (char *) &x;
char byte0 = c[0];
char byte1 = c[1];
char byte2 = c[2];
char byte3 = c[3];
Yes, a char pointer steps 1 byte at a time; you probably inadvertently made it an int pointer.
Another complexity is the order of the bytes in an int, at least on Intel
