Coverity - overrun of struct type


I am getting the following error through the coverity tool -
overrun-buffer-arg: Overrunning struct type in_addr of 4 bytes by passing it to a function which accesses it at byte offset 7 using argument "8UL".
sample code:
static u_long addr;
static struct sockaddr_in remote_server;
addr = inet_addr(remote_servername);
memcpy((char *) &remote_server.sin_addr, (char *)&addr, sizeof(addr));
In the last line, I am getting the above error.
Can someone throw some light on what's going wrong?
Please let me know, if you need any more information.

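inet_addr() returns an in_addr_t, not a u_long. Where u_long is 8 bytes wide (as the "8UL" in the Coverity message indicates), sizeof(addr) is 8, so the memcpy() writes 8 bytes into the 4-byte struct in_addr.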
struct sockaddr_in's sin_addr is a struct in_addr, which holds an in_addr_t s_addr.
This should do the trick:
static struct sockaddr_in remote_server;
remote_server.sin_addr.s_addr = inet_addr(remote_servername);
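As a minimal sketch of the fix above (the helper name set_remote_address is hypothetical and not part of the original post), the address can be stored by plain assignment, so no byte count is involved at all:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: fills *sa with the IPv4 address given as a
 * dotted-quad string. Returns 0 on success, -1 on a parse failure. */
static int set_remote_address(struct sockaddr_in *sa, const char *dotted)
{
    in_addr_t addr = inet_addr(dotted);   /* in_addr_t, not u_long */
    if (addr == (in_addr_t)-1) {          /* inet_addr() error value */
        return -1;
    }
    memset(sa, 0, sizeof *sa);
    sa->sin_family = AF_INET;
    /* Both sides are in_addr_t, so their sizes cannot disagree. */
    sa->sin_addr.s_addr = addr;
    return 0;
}
```

Note that inet_addr() cannot distinguish an error from the valid address 255.255.255.255; inet_pton() avoids that ambiguity if it matters for your input.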

Standard warning: Do not cast a pointer to/from void *.
For the message: read it carefully, it states the problem very well. Just a hint: use proper types. You are apparently accessing a struct beyond its size. Which size does u_long actually have?
addr should be serialized properly to a uint8_t[], respecting endianness. As you take sizeof() from the second argument, apparently the first argument is shorter.
Why do you not just assign, but use memcpy()? Check both have the same type.

Related

How does "transfer/casting" between 2 structs that have different structures works in C?

I'm learning HTTP protocol following a tutorial which gives an understandable piece of code and here's part of it.
struct sockaddr_in address;
...
address.sin_family = AF_INET;
address.sin_addr.s_addr = INADDR_ANY;
address.sin_port = htons( PORT );
memset(address.sin_zero, '\0', sizeof address.sin_zero);
if (bind(server_fd, (struct sockaddr *)&address, sizeof(address))<0)
{
perror("In bind");
exit(EXIT_FAILURE);
}
The example code works well, but I don't understand the kind of transfer that happens between the two structs.
The definition of struct sockaddr_in in <netinet/in.h> is
struct sockaddr_in {
__uint8_t sin_len;
sa_family_t sin_family;
in_port_t sin_port;
struct in_addr sin_addr;
char sin_zero[8];
};
The definition of struct sockaddr in <sys/socket.h> is
struct sockaddr {
__uint8_t sa_len; /* total length */
sa_family_t sa_family; /* [XSI] address family */
char sa_data[14]; /* [XSI] addr value (actually larger) */
};
They have different structures, so how does the "transfer/casting" work there?

I don't understand the kind of transfer that happens between the two structs.
There is no data transfer between different structs, nor any conversion of structure objects. In bind(server_fd, (struct sockaddr *)&address, sizeof(address)), a pointer to a struct is converted to a different object pointer type. This is explicitly allowed by C.
The C language specification does not define any behavior for accessing the struct via the converted pointer. Any attempt to do so would violate the strict aliasing rule, but that's not your problem. The example you presented demonstrates an utterly standard usage idiom for the bind() function, for which it was designed. Therefore, you can rely on the bind() implementation to do the right thing with it, by whatever magic is required.
Conceptually, though, you can observe that the first two members of struct sockaddr and struct sockaddr_in have the same data types. You could imagine, then, that bind is able to access those two members via the converted pointer, despite it constituting a strict-aliasing violation. Although C does not define behavior for that, POSIX implicitly requires that it work in at least this case. Having then done that, the second of those members indicates the address family, by which bind() can invoke the appropriate behavior for the address's actual type.
That is a variation on C-style polymorphism. It is helped out by the third bind argument, the size of the address object, which enables bind() to copy the address object without knowing its true effective data type.
These structure types and the bind() API could have been defined a bit differently to avoid the implied strict-aliasing violation, but that wasn't necessary in early C, where member names corresponded directly to offsets from the beginning of the structure. And where those names were global, which is why you see the sin_ and sa_ prefixes in those member names, and similar in many other structure types provided by the system. Nowadays, it's best to just accept that that's how bind() is used, and it's up to the system to provide a bind() implementation that accommodates it.
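To make the "C-style polymorphism" concrete, here is a hedged sketch of the kind of dispatch a bind()-like function could perform. The function port_of is hypothetical; it is not how any real bind() is implemented. It copies the bytes out with memcpy() instead of dereferencing the converted pointer, which sidesteps the aliasing question:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical illustration: returns the port (host byte order) stored in
 * a generic socket address, or -1 if the family or length is unsupported. */
static int port_of(const struct sockaddr *sa, socklen_t len)
{
    struct sockaddr_storage copy;
    if (len > sizeof copy) {
        return -1;
    }
    memset(&copy, 0, sizeof copy);
    memcpy(&copy, sa, len);              /* the length argument makes this copy possible */

    /* The family member says what the bytes really are. */
    if (copy.ss_family == AF_INET && len >= sizeof(struct sockaddr_in)) {
        struct sockaddr_in in4;
        memcpy(&in4, &copy, sizeof in4);
        return ntohs(in4.sin_port);
    }
    if (copy.ss_family == AF_INET6 && len >= sizeof(struct sockaddr_in6)) {
        struct sockaddr_in6 in6;
        memcpy(&in6, &copy, sizeof in6);
        return ntohs(in6.sin6_port);
    }
    return -1;
}
```

The caller side looks exactly like the bind() call in the question: port_of((struct sockaddr *)&address, sizeof(address)).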

The casting works.
Looking at the two structures:
struct sockaddr_in {
__uint8_t sin_len;
sa_family_t sin_family;
in_port_t sin_port;
struct in_addr sin_addr;
char sin_zero[8];
};
struct sockaddr {
__uint8_t sa_len; /* total length */
sa_family_t sa_family; /* [XSI] address family */
char sa_data[14]; /* [XSI] addr value (actually larger) */
};
First two members, sin_len and sa_len, sin_family and sa_family will not be problematic as those are of the same data type. The padding for sa_family_t works exactly the same on both ends.
Looking at the reference,
in_port_t Equivalent to the type uint16_t as described in <inttypes.h>
in_addr_t Equivalent to the type uint32_t as described in <inttypes.h>
For windows, struct in_addr looks like below:
struct in_addr {
union {
struct {
u_char s_b1;
u_char s_b2;
u_char s_b3;
u_char s_b4;
} S_un_b;
struct {
u_short s_w1;
u_short s_w2;
} S_un_w;
u_long S_addr;
} S_un;
};
and that for a linux is:
struct in_addr {
uint32_t s_addr; /* address in network byte order */
};
The whole confusion you might have is because of how the contents align. However, it is a well-thought historic design. It is intended to accommodate implementation-dependent aspects in the design.
Here, "implementation-dependent" refers to the fact that the implementation of in_addr_t is not consistent across all systems, as seen above.
In a nutshell, this works because of two things: the first two members have exactly the same size and padding in both structures, and sa_data[14] is an array of char, that is, of a 1-byte data type. This design trick with a union inside a struct has been widely used.
Unix Network Programming Volume 1 states:
The reason the sin_addr member is a structure, and not just an in_addr_t, is historical. Earlier releases (4.2BSD) defined the in_addr structure as a union of various structures, to allow access to each of the 4 bytes and to both of the 16-bit values contained within the 32-bit IPv4 address. This was used with class A, B, and C addresses to fetch the appropriate bytes of the address. But with the advent of subnetting and then the disappearance of the various address classes with classless addressing, the need for the union disappeared. Most systems today have done away with the union and just define in_addr as a structure with a single in_addr_t member.
Not what you asked for, but good to know:
The same header states:
The sockaddr_in structure is used to store addresses for the Internet address family. Values of this type shall be cast by applications to struct sockaddr for use with socket functions.
So, sockaddr_in is a struct specific to IP-based communication and sockaddr is more of a generic structure for socket operations.
Just a try:
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
int main(void)
{
printf("sizeof(struct sockaddr_in) = %zu bytes\n", sizeof(struct sockaddr_in));
printf("sizeof(struct sockaddr) = %zu bytes\n", sizeof(struct sockaddr));
return 0;
}
Prints:
sizeof(struct sockaddr_in) = 16 bytes
sizeof(struct sockaddr) = 16 bytes

I think this cast breaks the strict aliasing rule and then is undefined behaviour if the bind function dereferences the pointer.
In practice the code assumes that all fields of struct sockaddr_in are contiguous, so you can access a buffer of bytes either as a struct sockaddr_in or as a struct sockaddr equivalently. But the fields of a structure are not guaranteed to be contiguous. If in_port_t is two bytes long, for example, there may very well be a hole between sin_port and sin_addr with a compiler for a 32-bit machine, because it may want to align the sin_addr field on a 32-bit boundary.
This way of coding is frequent when you develop a communication interface driver: you receive a buffer of bytes that needs to be interpreted as a data structure (like: first byte is an address, following bytes are a length, etc.). Casting from one structure to another avoids copying data.
Note that compilers usually provide non-standard ways to guarantee that all fields of a structure are contiguous. For example, with gcc it is __attribute__((packed)).
Now, to answer your question: provided the structures are packed and there is no undefined behaviour, the cast basically does nothing. sa_data will be the array of bytes located after the field sin_family. So this array will consist of sin_port, followed by sin_addr, followed by the array sin_zero.
EDIT: I compiled the following structures on STM32H7 (ARM Cortex-M7, 32-bit architecture) with arm-none-eabi-gcc:
struct in_addr {
uint32_t s_addr;
};
struct sockaddr_in {
uint8_t sin_len;
uint16_t sin_family;
uint16_t sin_port;
struct in_addr sin_addr;
char sin_zero[8];
};
struct sockaddr {
uint8_t sa_len;
uint16_t sa_family;
char sa_data[14];
};
The size of sockaddr_in is 20.
The size of sockaddr is 18.
Note that if sa_family_t is of type char and not short, due to alignment, both structures are same size.
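The padding can be inspected directly with offsetof. The sketch below uses renamed copies of the structures from the EDIT (the demo_ prefix and the helper demo_padding_bytes are mine, chosen to avoid clashing with system headers); the numbers it yields depend on the compiler and target:

```c
#include <stddef.h>
#include <stdint.h>

struct demo_in_addr {
    uint32_t s_addr;
};

struct demo_sockaddr_in {
    uint8_t  sin_len;
    uint16_t sin_family;            /* 2-byte member: may force a pad byte before it */
    uint16_t sin_port;
    struct demo_in_addr sin_addr;   /* 4-byte member: may force padding before it */
    char     sin_zero[8];
};

/* Total bytes of padding the compiler inserted into demo_sockaddr_in:
 * the struct size minus the sum of the member sizes (1 + 2 + 2 + 4 + 8). */
static size_t demo_padding_bytes(void)
{
    size_t members = sizeof(uint8_t) + 2 * sizeof(uint16_t)
                   + sizeof(struct demo_in_addr) + 8;
    return sizeof(struct demo_sockaddr_in) - members;
}
```

Printing offsetof(struct demo_sockaddr_in, sin_addr) next to the result of demo_padding_bytes() shows where the holes are; with the sizes reported above (20 bytes total) the helper would return 3.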

Type casting struct pointer

I'm trying to get into socket programming and came across an article at https://www.tenouk.com/Module43a.html. I'm having difficulty understanding how a char array is cast into a struct pointer:
char buffer[PCKT_LEN];
struct ipheader *ip = (struct ipheader *) buffer;
//some code here
ip->iph_ihl = 5;
ip->iph_ver = 4;
ip->iph_tos = 16;
As per my understanding, pointer ip will now hold the address of buffer and values for members of struct ipheader will now be stored in buffer. Please help understanding the same. If I'm right, then how would we be able to print values stored in buffer?

Your understanding is correct. The pointer ip will point to buffer. char buffer[PCKT_LEN] is an array of size sizeof(char) * PCKT_LEN. Since sizeof(char) is always 1, it is just a chunk of memory of PCKT_LEN bytes. PCKT_LEN is defined to be 8192.
The number of bytes needed to store a struct ipheader is much less than this. Try int a = sizeof(struct ipheader) and use a debugger to see the value assigned to a. For me it is 24 bytes, but it could be slightly different for you. This means that buffer can hold much more data than the struct ipheader needs. I haven't looked too deeply into the code, and I don't know much about socket programming. But one use for this could be to augment buffer with additional data outside of the struct. Since you know struct ipheader takes up sizeof(struct ipheader) bytes, you will have sizeof(char)*8192 - sizeof(struct ipheader) bytes left to augment the array.
Edit:
Upon further inspection, this is kinda what is happening:
struct ipheader *ip = (struct ipheader *) buffer;
struct udpheader *udp = (struct udpheader *) (buffer + sizeof(struct ipheader));
It tries to store the ip header at the beginning of the buffer, then it augments that same buffer with a udp header. By using buffer + sizeof(struct ipheader) it makes sure that it stores the udp header after ipheader by offsetting buffer by sizeof(struct ipheader) bytes. Basically struct ipheader *ip points to the beginning of the buffer and struct udpheader *udp points to buffer + sizeof(struct ipheader). I hope this makes sense. Obviously there is still a lot of space left over in buffer so you could potentially augment it even further.

how a char array is cast into struct pointer
You can't do that safely. The code invokes undefined behavior:
char buffer[PCKT_LEN];
struct ipheader *ip = (struct ipheader *) buffer;
//some code here
ip->iph_ihl = 5;
ip->iph_ver = 4;
ip->iph_tos = 16;
That code violates the strict aliasing rule. That basically means memory that isn't a certain type of object can't be treated as being that type of object, with the exception that any non-char object can be treated as an array of char.
That's not what's happening in the posted code. In the posted code, a char array is being treated as if it were a struct ipheader.
The memory is not a struct ipheader - it's an array of char - so the code violates strict aliasing.
The casting from char * to struct ipheader * can also result in an improperly aligned object and violate 6.3.2.3 Pointers, paragraph 7:
A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. ...
Code such as you've found here is unfortunately all too common, because x86-based machines, the platform most programmers use, are very forgiving of misaligned accesses, so such code tends to "work".
See Structure assignment in Linux fails in ARM but succeeds in x86 for an example of a platform where it doesn't work.
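One way to avoid both the aliasing and the alignment problem is to fill in a real struct object and memcpy() it into the char buffer. The sketch below is an assumption-laden illustration: struct demo_header and put_header are hypothetical stand-ins, not the ipheader from the linked article.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte header, loosely modeled on the first IPv4 fields. */
struct demo_header {
    uint8_t  ver_ihl;    /* version in the high nibble, header length in the low */
    uint8_t  tos;
    uint16_t total_len;  /* the caller stores this in network byte order */
};

/* Copies *h into buf at byte offset off. Returns the offset just past the
 * header, or 0 if the header does not fit in the buffer. */
static size_t put_header(unsigned char *buf, size_t buflen, size_t off,
                         const struct demo_header *h)
{
    if (off > buflen || buflen - off < sizeof *h) {
        return 0;
    }
    /* A byte-wise copy is valid for any object type and any alignment. */
    memcpy(buf + off, h, sizeof *h);
    return off + sizeof *h;
}
```

The returned offset can be fed back in to place a second header right after the first, which is the same layout the buffer + sizeof(struct ipheader) arithmetic produces, without ever treating the char array as a struct.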

Why glibc2.23 change struct sockaddr_storage?

I checked the git log as https://patchwork.sourceware.org/patch/12453/. This modification seems to fix an issue on a specific platform.
But I don't understand why __ss_align and __ss_padding were swapped in struct sockaddr_storage.
The Qualcomm platform I'm now developing on has lots of typecasts like the following:
struct sockaddr_storage prefix_addr;
((struct sockaddr_in6 *)&prefix_addr)->sin6_addr.s6_addr
On our Cortex-A7 platform, the struct layouts are as follows:
Before glibc2.23:
struct sockaddr_in6
{
sin6_family; //0th byte
sin6_port; //2nd byte
sin6_flowinfo; //4th byte
sin6_addr; //8th byte
};
struct sockaddr_storage
{
ss_family; //0th byte
__ss_align; //4th byte
__ss_padding; //8th byte
};
After glibc2.23:
struct sockaddr_storage
{
ss_family; //0th byte
__ss_padding; //2nd byte
__ss_align; //124th byte
};
glibc changed struct sockaddr_storage, but struct sockaddr_in6 did not change, so this modification causes many alignment issues on our platform, which lead to errors when getting IPv6 addresses.

Please refer to the reply from Florian:
On 09/01/2017 12:07 PM, honan li wrote:
#define SASTORAGE_DATA(addr) (addr).__ss_padding
typedef struct qcmap_cm_nl_prefix_info_s {
boolean prefix_info_valid;
unsigned char prefix_len;
unsigned int mtu;
struct sockaddr_storage prefix_addr;
struct ifa_cacheinfo cache_info;
} qcmap_cm_nl_prefix_info_t;
void QCMAP_Backhaul::GetIPV6PrefixInfo(char *devname,
qcmap_cm_nl_prefix_info_t
*ipv6_prefix_info)
{
struct sockaddr_in6 *sin6 = NULL;
...
sin6 = (struct sockaddr_in6 *)&ipv6_prefix_info->prefix_addr;
memcpy(SASTORAGE_DATA(ipv6_prefix_info->prefix_addr),
RTA_DATA(rta),
sizeof(sin6->sin6_addr));
...
}
Currently publicly available here:
https://github.com/Bigcountry907/HTC_a13_vzw_Kernel/blob/master/vendor/qcom/proprietary/data/mobileap_v2/server/src/QCMAP_ConnectionManager.cpp#L3658
I would expect applications do something like this instead:
struct sockaddr_in6 sin6;
memset (&sin6, 0, sizeof (sin6));
sin6.sin6_family = AF_INET6;
memcpy (&sin6.sin6_addr, RTA_DATA (rta), sizeof (sin6.sin6_addr));
memcpy (&ipv6_prefix_info->prefix_addr, &sin6, sizeof (sin6));
This avoids any aliasing issues and a dependency on the internals of
struct sockaddr_storage (which is only intended as a way to allocate a
generic struct sockaddr of sufficient size and alignment). It also
initializes the other components of the socket address (such as the
port number and flow label).
Thanks, Florian
Also refer to:
https://books.google.com.hk/books?id=ptSC4LpwGA0C&pg=PA72&lpg=PA72&dq=alignment+sufficient&source=bl&ots=Kt0BQhjiMt&sig=HTUbm2bzVNSoMxNX98EMzORFc30&hl=zh-CN&sa=X&ved=0ahUKEwiP78iMoovWAhVLipQKHYFdCxcQ6AEIQzAF#v=onepage&q=alignment%20sufficient&f=false
The conclusion is that Qualcomm's usage of sockaddr_storage is incorrect.
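Florian's suggestion can be wrapped into a small self-contained function. This is only a sketch: the name store_ipv6_address and the raw 16-byte input (standing in for RTA_DATA(rta)) are assumptions made for illustration.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: stores a raw 16-byte IPv6 address in *dst as a
 * complete struct sockaddr_in6, without touching __ss_padding or any
 * other internal member of struct sockaddr_storage. */
static void store_ipv6_address(struct sockaddr_storage *dst,
                               const unsigned char addr[16])
{
    struct sockaddr_in6 sin6;

    memset(&sin6, 0, sizeof sin6);       /* also zeroes port and flow label */
    sin6.sin6_family = AF_INET6;
    memcpy(&sin6.sin6_addr, addr, sizeof sin6.sin6_addr);

    memset(dst, 0, sizeof *dst);
    memcpy(dst, &sin6, sizeof sin6);     /* storage is large enough by design */
}
```

Because the code only relies on sockaddr_storage being big enough and suitably aligned, it is unaffected by the order of __ss_align and __ss_padding.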

I'm not sure this is the change in glibc that generates alignment warnings: this change has no effect on the source code you show.
Instead, I think Qualcomm is using clang, and beginning with clang 4.x there is a new alignment warning, -Waddress-of-packed-member, that I've seen generate false positives depending on the way you write your code.
You should check whether the clang version you are using changed at the same time as glibc. If this is the case, this could certainly explain your problem.
Using a temporary variable makes those warnings disappear.
Instead of making a call like:
my_function(((struct sockaddr_in6 *)&prefix_addr)->sin6_addr.s6_addr)
just write:
struct sockaddr_in6 p = *(struct sockaddr_in6 *) &prefix_addr;
my_function(p.sin6_addr.s6_addr);
This may avoid alignment warnings.

Safely converting from struct sockaddr to struct sockaddr_storage

I have a function which takes in "struct sockaddr *" as a parameter (let's call this input_address), and then I need to operate on that address, which may be a sockaddr_in or sockaddr_in6, since I support both IPv4 and IPv6.
I'm getting some memory corruption and trying to track it down to its source, and in the process found some code that seems suspect, so I would like to validate if this is the right way to do things.
struct sockaddr_storage *input_address_storage = (struct sockaddr_storage *) input_address;
struct sockaddr_storage result = [UtilityClass performSomeOperation: *input_address_storage];
At first I thought the cast in the first line was safe, but then in the second line I need to dereference that pointer, which seems like it may be wrong. The reason I am concerned is that it may end up copying memory that is beyond where the original structure is (since sockaddr_in is shorter than sockaddr_in6). I am not sure if this could cause a memory corruption (my guess is no), but nevertheless this code gives me a bad feeling.
I can't change the fact my function takes a "struct sockaddr *", so it seems like it would be difficult to work around this type of code, and yet I want to avoid copying from a memory location where I shouldn't be.
If anyone can validate whether what I am doing is wrong, and the best way to fix this, I'd appreciate it.
EDIT: An admin had changed my C tag for C# for some reason. The code I gave is primarily C, with one function call from objective C that doesn't really matter. That call could have been C.

The problem with your approach is that you are converting an existing struct sockaddr* into a struct sockaddr_storage*. Imagine what happens if the original was a struct sockaddr_in. Since sizeof(struct sockaddr_in) < sizeof(struct sockaddr_storage), dereferencing the converted pointer reads past the end of the original object, and a memory sanitizer will complain about an out-of-bounds memory reference.
struct sockaddr_storage is essentially a container to contain either your struct sockaddr_in or struct sockaddr_in6.
Hence, it is useful when you want to pass in a struct sockaddr* object but want to allocate enough memory for both sockaddr_in and sockaddr_in6.
A good example is the recvfrom(3) call:
ssize_t recvfrom(int socket, void *restrict buffer, size_t length,
int flags, struct sockaddr *restrict address,
socklen_t *restrict address_len);
Since address requires a struct sockaddr* object, we will construct a struct sockaddr_storage first, and pass it in:
struct sockaddr_storage address;
socklen_t address_length = sizeof(struct sockaddr_storage);
ssize_t ret = recvfrom(fd, buffer, buffer_length, 0, (struct sockaddr*)&address, &address_length);
if (address.ss_family == AF_INET) {
DoIpv4Work((struct sockaddr_in*)&address, ...);
} else if (address.ss_family == AF_INET6) {
DoIpv6Work((struct sockaddr_in6*)&address, ...);
}
The difference in your approach and mine is that I allocate a struct sockaddr_storage and then use it as struct sockaddr, but you do the REVERSE, and use a struct sockaddr and then use it as struct sockaddr_storage.
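If the function signature must keep taking a struct sockaddr *, one hedged option is to copy only as many bytes as the address family implies into a freshly allocated sockaddr_storage. The helper name copy_to_storage is hypothetical, and the sketch handles only AF_INET and AF_INET6:

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: copies the address behind `in` into *out.
 * Returns the number of bytes copied, or 0 for an unsupported family. */
static socklen_t copy_to_storage(const struct sockaddr *in,
                                 struct sockaddr_storage *out)
{
    socklen_t len;

    /* The family tells us how large the original object really is. */
    switch (in->sa_family) {
    case AF_INET:
        len = sizeof(struct sockaddr_in);
        break;
    case AF_INET6:
        len = sizeof(struct sockaddr_in6);
        break;
    default:
        return 0;
    }

    memset(out, 0, sizeof *out);
    memcpy(out, in, len);   /* never reads past the end of the original */
    return len;
}
```

The result can then be passed by value, as in the original performSomeOperation call, without reading memory beyond a sockaddr_in.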

Why do we cast sockaddr_in to sockaddr when calling bind()?

The bind() function accepts a pointer to a sockaddr, but in all examples I've seen, a sockaddr_in structure is used instead, and is cast to sockaddr:
struct sockaddr_in name;
...
if (bind (sock, (struct sockaddr *) &name, sizeof (name)) < 0)
...
I can't wrap my head around why is a sockaddr_in struct used. Why not just prepare and pass a sockaddr?
Is it just convention?

No, it's not just convention.
sockaddr is a generic descriptor for any kind of socket operation, whereas sockaddr_in is a struct specific to IP-based communication (IIRC, "in" stands for "InterNet"). As far as I know, this is a kind of "polymorphism" : the bind() function pretends to take a struct sockaddr *, but in fact, it will assume that the appropriate type of structure is passed in; i. e. one that corresponds to the type of socket you give it as the first argument.

I don't know if it's very relevant to this question, but I would like to provide some extra info which may make the typecast more understandable, as many people who haven't spent much time with C get confused by such a typecast.
I use macOS, so I am taking examples based on header files from my system.
struct sockaddr is defined as follows:
struct sockaddr {
__uint8_t sa_len; /* total length */
sa_family_t sa_family; /* [XSI] address family */
char sa_data[14]; /* [XSI] addr value (actually larger) */
};
struct sockaddr_in is defined as follows:
struct sockaddr_in {
__uint8_t sin_len;
sa_family_t sin_family;
in_port_t sin_port;
struct in_addr sin_addr;
char sin_zero[8];
};
Starting from the very basics, a pointer just contains an address. So struct sockaddr * and struct sockaddr_in * are pretty much the same. They both just store an address. The only relevant difference is how the compiler treats the objects they point to.
So when you say (struct sockaddr *) &name, you are just tricking the compiler and telling it that this address points to a struct sockaddr type.
So let's say the pointer is pointing to location 1000. If a struct sockaddr * stores this address, the compiler will consider the memory from 1000 to 1000 + sizeof(struct sockaddr) as holding the members given by that structure definition. If a struct sockaddr_in * stores the same address, it will consider the memory from 1000 to 1000 + sizeof(struct sockaddr_in).
Once you typecast that pointer, the compiler considers the same sequence of bytes, up to sizeof(struct sockaddr).
struct sockaddr *a = (struct sockaddr *) &name; // consider &name = 1000
Now if I access a->sa_len, the compiler would access sizeof(__uint8_t) bytes starting at location 1000, which is the same size as sin_len in sockaddr_in. So this accesses the same sequence of bytes.
The same pattern holds for sa_family.
After that there is a 14-byte character array in struct sockaddr which stores data from in_port_t sin_port (typedef'd 16-bit unsigned integer = 2 bytes), struct in_addr sin_addr (simply a 32-bit IPv4 address = 4 bytes) and char sin_zero[8] (8 bytes). These 3 add up to make 14 bytes.
Now these three are stored in this 14-byte character array and we can access any of these three by accessing the appropriate indices and typecasting them again.
user529758's answer already explains the reason to do this.

This is because bind can bind other types of sockets than IP sockets, for instance Unix domain sockets, which have sockaddr_un as their type. The address for an AF_INET socket has the host and port as their address, whereas an AF_UNIX socket has a filesystem path.
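For comparison with the sockaddr_in examples, here is a minimal sketch of preparing an AF_UNIX address. The helper name make_unix_address is hypothetical; struct sockaddr_un and its sun_family and sun_path members come from <sys/un.h>.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fills *un with a filesystem path.
 * Returns 0 on success, -1 if the path does not fit in sun_path. */
static int make_unix_address(struct sockaddr_un *un, const char *path)
{
    size_t n = strlen(path);

    if (n >= sizeof un->sun_path) {
        return -1;
    }
    memset(un, 0, sizeof *un);
    un->sun_family = AF_UNIX;           /* tells bind() which struct this is */
    memcpy(un->sun_path, path, n + 1);  /* includes the terminating '\0' */
    return 0;
}
```

The result is passed to bind() with the same cast as an IP address: bind(fd, (struct sockaddr *)&un, sizeof un). Only the family member and the size argument differ.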
