Segmentation fault when looking up host name and IP address - c

I have the following piece of code for getting the hostname and IP address,
#include <stdlib.h>
#include <stdio.h>
#include <netdb.h> /* This is the header file needed for gethostbyname() */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
int main(int argc, char *argv[])
{
struct hostent *he;
if (argc!=2){
printf("Usage: %s <hostname>\n",argv[0]);
exit(-1);
}
if ((he=gethostbyname(argv[1]))==NULL){
printf("gethostbyname() error\n");
exit(-1);
}
printf("Hostname : %s\n",he->h_name); /* prints the hostname */
printf("IP Address: %s\n",inet_ntoa(*((struct in_addr *)he->h_addr))); /* prints IP address */
}
But I am getting a warning during compilation:
$cc host.c -o host
host.c: In function ‘main’:
host.c:24: warning: format ‘%s’ expects type ‘char *’, but argument 2 has type ‘int’
Then there is a segmentation fault when I run the code:
./host 192.168.1.4
Hostname : 192.168.1.4
Segmentation fault
What is the error in the code?

I had similar code (if not the same), and it compiled fine on a machine in our school laboratory, but when I compiled it on my machine at home, it produced the same error (I hadn't edited the code). I read the man page for inet and found that I was missing one header file, #include <arpa/inet.h>. After I added that header to my C program, it compiled and ran fine.

The warning about the mismatch for the printf format is an important warning.
In this case, it comes because the compiler is thinking that the function inet_ntoa returns an int, but you specified to expect a string in the format-string.
The incorrect return-type for inet_ntoa is the result of an old C rule that states that if you try to use a function without a prior declaration, then the compiler must assume the function returns an int and takes an unknown (but fixed) number of arguments.
The mismatch between the assumed return type and the actual return type of the function results in undefined behaviour, which manifests itself as a crash in your case.
The solution is to include the correct header for inet_ntoa.
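A minimal sketch of the lookup with the missing header in place. The helper name first_ipv4 and its buffer-based interface are illustrative assumptions, not part of the original program; the only essential change is the #include <arpa/inet.h> line.

```c
#include <stdio.h>
#include <string.h>
#include <netdb.h>       /* gethostbyname(), struct hostent */
#include <sys/socket.h>  /* AF_INET */
#include <netinet/in.h>  /* struct in_addr */
#include <arpa/inet.h>   /* inet_ntoa(): the declaration the original program lacked */

/* Hypothetical helper: resolve `name` and write its first IPv4 address,
 * in dotted-decimal form, into `out`. Returns 0 on success, -1 on failure. */
int first_ipv4(const char *name, char *out, size_t outlen)
{
    struct hostent *he = gethostbyname(name);
    struct in_addr addr;
    int n;

    /* Guard against a failed lookup, a non-IPv4 result, or an empty list. */
    if (he == NULL || he->h_addrtype != AF_INET || he->h_addr_list[0] == NULL)
        return -1;

    /* Copy rather than cast, so the alignment of the source bytes is irrelevant. */
    memcpy(&addr, he->h_addr_list[0], sizeof addr);

    /* With the prototype visible, the compiler knows inet_ntoa() returns char *. */
    n = snprintf(out, outlen, "%s", inet_ntoa(addr));
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Calling first_ipv4(argv[1], buf, sizeof buf) from main() and printing buf should give the same output as the original program, without the format warning.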

Break this code:
printf("IP Address: %s\n",inet_ntoa(*((struct in_addr *)he->h_addr)));
Into this:
struct in_addr *address = (struct in_addr *) he->h_addr;
char* ip_address = inet_ntoa(*address);
printf("IP address: %s\n", ip_address);
It also makes it easier to debug and pinpoint the problem.

Actually, I just compiled that code on my FreeBSD machine at home and it works.

You could try dumping the value of he->h_addr before trying to dereference it and pass it to inet_ntoa. If it was NULL, that would result in a seg fault.
How about running it through strace?

Related

Declaration of unrelated char array needed to avoid program from freezing

I'm quite new to C and have run into a strange problem that I cannot explain or solve.
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>
void main ()
{
int sock = socket(AF_INET, SOCK_STREAM, 0);
struct sockaddr_in sock_addr;
sock_addr.sin_family = AF_INET;
sock_addr.sin_port = htons(1500);
connect(sock, (struct sockaddr *)&sock_addr, sizeof(sock_addr));
puts("A");
char foo[9];
puts("B");
close(sock);
}
Code above prints out following lines:
A
B
If I comment out char foo[9] or change 9 to some smaller value, then nothing is printed and the program hangs. It looks like connect is what makes the program freeze, but I don't see anything wrong on that line.
How can I fix the code above so that char foo[9] can be removed and the program still prints A and B and then exits? Why does the completely unrelated char foo[9] keep the program from freezing?
I'm using GCC 6.3.0 on Ubuntu.
Converting comments to an answer.
The code shown has an incorrect return type for the main() function on Linux. That is required to be int on all systems except Windows — only on Windows can you possibly hope to use void main(). See What should main() return in C and C++ for more information.
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>
int main(void)
{
int sock = socket(AF_INET, SOCK_STREAM, 0);
// missed error check - probably not critical
struct sockaddr_in sock_addr;
sock_addr.sin_family = AF_INET;
sock_addr.sin_port = htons(1500);
// missed initialization of sock_addr.sin_addr - crucial!
// omitted trace printing before the call
connect(sock, (struct sockaddr *)&sock_addr, sizeof(sock_addr));
// missed error check — possibly critical
// omitted trace printing after the call - not crucial because of puts() calls
puts("A");
char foo[9];
puts("B");
close(sock);
}
Have you tried error checking the system calls? You set the port and family but not the IP address when you try to connect — that is dubious at best, erroneous at worst. I'm not immediately sure why it causes the symptoms you're seeing, but there are problems en route to where the trouble occurs. It could be that your changed code changes the IP address part of sock_addr and your system is hanging trying to contact an uncontactable machine.
How long have you waited before deciding the program's frozen?
Have you tried adding fprintf(stderr, "BC\n"); before the call to connect() and fprintf(stderr, "AC\n"); after it? Does the call hang?
Are you using the optimizer at all?
Do you compile with warnings enabled, such as warnings for unused variables? (Use gcc -Wall -Werror -Wextra -Wstrict-prototypes -Wmissing-prototypes as a starting point — if it doesn't compile cleanly under those options, it quite possibly won't run cleanly either. Include -g for debug information and -O3; if you're doing serious debugging in a debugger, then drop the -O3.)
The code doesn't initialize the sock_addr variable properly: it never sets sin_addr, so you're connecting to an indeterminate IP address (you've literally no idea what you're trying to connect to). At minimum, use struct sockaddr_in sock_addr = { 0 }; to set it to zeros, or use memset(&sock_addr, '\0', sizeof(sock_addr));. You're invoking undefined behaviour because you don't initialize the structure properly, and variable responses from compilers and optimizers are symptomatic of undefined behaviour.
Karmo Rosental notes:
It is connecting to localhost when it's not freezing but your suggestion struct sockaddr_in sock_addr = { 0 }; helped to avoid freezing in GCC.
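A minimal sketch of the initialization and error checking suggested above. It assumes the intended target is the loopback address, since the question never says which host it means to reach; the helper names loopback_addr and connect_loopback are illustrative and not from the original code.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: build a fully initialized IPv4 loopback address. */
struct sockaddr_in loopback_addr(uint16_t port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof addr);                 /* no indeterminate bytes left */
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* assumption: 127.0.0.1 */
    return addr;
}

/* Hypothetical helper: connect to 127.0.0.1:port and report every failure.
 * Returns the connected socket, or -1 on error. */
int connect_loopback(uint16_t port)
{
    struct sockaddr_in addr = loopback_addr(port);
    int sock = socket(AF_INET, SOCK_STREAM, 0);

    if (sock < 0) {
        perror("socket");
        return -1;
    }
    if (connect(sock, (struct sockaddr *)&addr, sizeof addr) < 0) {
        perror("connect");
        close(sock);
        return -1;
    }
    return sock;
}
```

With every field set explicitly, the behaviour no longer depends on what happens to be on the stack, so adding or removing an unrelated array should make no difference.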

C-Socket Programming-"undefined reference"

I am a beginner in socket programming in C. I took this code from a book, and when I compiled it, I got the following "undefined reference" errors. Please give me some tips on how to fix this. Thank you!
Code:
#include <stdio.h>
#include <winsock.h>
#include <winsock2.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define RCVBUFSIZE 32
int main(int argc, char *argv[]){
int sock;
struct sockaddr_in echoServAddr;
unsigned short echoServPort;
char *servIP;
char *echoString;
char echoBuffer[RCVBUFSIZE];
unsigned int echoStringLen;
int bytesRcvd, totalBytesRcvd;
if(argc>3 || argc>4){
printf("Usage: %s <Server IP> <Echo Word> [<Echo Port>\n",argv[0]);
exit(1);
}
servIP=argv[1];
echoString=argv[2];
if(argc==4){
echoServPort=atoi(argv[3]);
}
else{
echoServPort=7;
}
if((sock=socket(PF_INET,SOCK_STREAM,IPPROTO_TCP))<0){
printf("socket() failed!");
}
memset(&echoServAddr,0,sizeof(echoServAddr));
echoServAddr.sin_family=AF_INET;
echoServAddr.sin_addr.s_addr=inet_addr(servIP);
echoServAddr.sin_port=htons(echoServPort);
if (connect(sock, (struct sockaddr *) &echoServAddr, sizeof(echoServAddr)) < 0){
printf("connect() failed!");
}
echoStringLen=strlen(echoString);
if(send(sock,echoString, echoStringLen,0)!=echoStringLen){
printf("send() send maximum bytes than expected");
}
totalBytesRcvd=0;
printf("Received");
while(totalBytesRcvd<echoStringLen){
if((bytesRcvd=recv(sock,echoBuffer,RCVBUFSIZE-1,0))<=0){
printf("recv() failed!");
}
totalBytesRcvd+=bytesRcvd;
echoBuffer[bytesRcvd]='\0';
printf(echoBuffer);
}
close(sock);
exit(1);
}
I got errors as follow:
In function `main':
client.cpp:34: undefined reference to `_socket@12'
client.cpp:39: undefined reference to `_inet_addr@4'
client.cpp:40: undefined reference to `_htons@4'
client.cpp:41: undefined reference to `_connect@12'
client.cpp:45: undefined reference to `_send@16'
client.cpp:51: undefined reference to `_recv@16'
collect2.exe [Error] ld returned 1 exit status
From what I can see, you are trying to build a Winsock application with MinGW.
By default, MinGW does not link the Winsock library (Ws2_32.lib) automatically, so you need to tell the linker about it with the -l flag followed by the library name:
gcc winsock.c -o winsock.exe -lws2_32
Edit: If you are using an IDE built on the MinGW suite (Code::Blocks, Dev-C++, etc.), look for the option that lets you add compiler flags manually, and add -lws2_32 to it.
Edit: Based on your comment above, you are using the Dev-C++ IDE. Here is how to use the ws2_32 library from that IDE:
Go to the Tools menu and click Compiler Options...
Inside Compiler Options, tick the checkbox Add the following commands when calling the compiler
Put the flag -lws2_32 into the text area below it
Press OK
Try compiling your source code again. If nothing goes wrong, your program should build successfully.
Regards
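Once the program links, two further Winsock differences are likely to surface in the code from the question: Winsock expects WSAStartup() to be called before socket(), and sockets are closed with closesocket() rather than close(). The sketch below wraps both behind small helpers; the names net_init, net_cleanup and net_close are hypothetical, and on non-Windows systems the helpers fall back to the plain POSIX behaviour.

```c
#ifdef _WIN32
#include <winsock2.h>   /* WSAStartup(), WSACleanup(), closesocket(); link with -lws2_32 */
#else
#include <unistd.h>     /* close() */
#endif

/* Hypothetical helper: initialize the socket layer. Returns 0 on success, -1 on failure. */
int net_init(void)
{
#ifdef _WIN32
    WSADATA wsa;
    /* Request Winsock 2.2; WSAStartup() returns 0 on success. */
    return WSAStartup(MAKEWORD(2, 2), &wsa) == 0 ? 0 : -1;
#else
    return 0;           /* nothing to initialize on POSIX systems */
#endif
}

/* Hypothetical helper: release the socket layer at program exit. */
void net_cleanup(void)
{
#ifdef _WIN32
    WSACleanup();
#endif
}

/* Hypothetical helper: close a socket with the call each platform expects. */
int net_close(int sock)
{
#ifdef _WIN32
    return closesocket((SOCKET)sock);
#else
    return close(sock);
#endif
}
```

In the program above, net_init() would go at the top of main(), before the socket() call, and net_close(sock) would replace close(sock).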

Ignoring return value of fscanf and Segmentation fault

I was wondering how to solve a Core dumped issue on my C code.
When I compile it with: g++ -g MatSim.cpp -o MatSim -lstdc++ -O3, I get three warnings, this is one (The other two are similar and are only differentiated by the string variable name):
MatSim.cpp: In function ‘int main()’:
MatSim.cpp:200037:27: warning: ignoring return value of ‘int fscanf(FILE*, const char*, ...)’, declared with attribute warn_unused_result [-Wunused-result]
fscanf(TM,"%255s",string2);
The principal aspects of my code and the related part that the compiler reports:
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>
#include <iostream>
#include <fstream>
#include <string.h>
using namespace std;
int to_int(char string[256])
{
if( strcmp(string,"0") == 0 )
{
return 0;
}
...
else if( strcmp(string,"50000") == 0 )
{
return 50000;
}
return -1;
}
int main()
{
int a,b,div,value,k,i,j,tm,ler;
char string[256];
char string1[256];
char string2[256];
FILE *TM;
TM = fopen("TM","r");
if( (TM = fopen("TM","r")) == NULL)
{
printf("Can't open %s\n","TM");
exit(1);
}
fscanf(TM,"%255s",string2);
tm = to_int(string2);
fclose(TM);
...
}
I have tried the reported suggestion in here and I tried to understand what was posted in here. But, I don't see its application on my code.
Finally, when I run the exe file, it returns:
Segmentation fault (core dumped)
In your code, you're fopen()ing the file twice. Just get rid of the
TM = fopen("TM","r");
before the if statement.
That said, you should check the return value of fscanf() to ensure success. Otherwise, if the conversion fails, string2 is left uninitialized and not null-terminated, and passing it on to strcmp() invokes undefined behaviour.
Please be aware, almost all string related functions expect a null-terminated char array. If your array is not null terminated, there will be UB. Also, it is a good practice to initialize your automatic local variables to avoid possible UB in later part of code.
You are opening the file twice.
All you need is this:
FILE *TM = fopen("TM","r");
if (TM == NULL) { /* file was not opened */ }
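Putting both pieces of advice together (open the file once, and check what fscanf() returns), a minimal sketch could look like this. The helper name read_first_token is an illustrative assumption; the "%255s" format and the 256-byte buffer are taken from the question.

```c
#include <stdio.h>

/* Hypothetical helper: read the first whitespace-delimited token (at most
 * 255 characters) from the file at `path` into `token`.
 * Returns 0 on success, -1 if the file cannot be opened or holds no token. */
int read_first_token(const char *path, char token[256])
{
    int converted;
    FILE *fp = fopen(path, "r");   /* open exactly once */

    if (fp == NULL)
        return -1;

    /* fscanf() returns the number of successful conversions (or EOF);
     * anything other than 1 means `token` was not filled in. */
    converted = fscanf(fp, "%255s", token);
    fclose(fp);
    return converted == 1 ? 0 : -1;
}
```

The caller then only passes the token to to_int() when the function returns 0, which also silences the -Wunused-result warning because the return value is actually used.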

unix socket error 14: EFAULT (bad address)

I have a very simple question, but I have not managed to find any answers to it all weekend. I am using the sendto() function and it is returning error code 14: EFAULT. The man pages describe it as:
"An invalid user space address was specified for an argument."
I was convinced that this was talking about the IP address I was specifying, but now I suspect it may be the memory address of the message buffer that it is referring to - I can't find any clarification on this anywhere, can anyone clear this up?
EFAULT happens if the memory address of some argument passed to sendto (or, more generally, to any system call) is invalid. Think of it as a sort of SIGSEGV in kernel land for your syscall. For instance, if you pass a null or invalid buffer pointer (for reading, writing, sending, receiving...), you get that error.
See the errno(3), sendto(2), etc. man pages.
EFAULT is not related to IP addresses at all.
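To see this with sendto() itself, here is a small sketch that passes a deliberately invalid buffer pointer and reports the resulting errno. It assumes a Linux-like system and uses a local AF_UNIX socket pair so that no network or IP address is involved at all; the helper name errno_for_bad_sendto_buffer is hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: call sendto() with an invalid buffer address and
 * return the errno it produced. Returns 0 if the call unexpectedly
 * succeeded and -1 if the socket pair could not be created. */
int errno_for_bad_sendto_buffer(void)
{
    int fds[2];
    int saved;
    ssize_t n;
    const void *bad_buf = (const void *)(uintptr_t)1;  /* not a mapped address */

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) < 0)
        return -1;

    errno = 0;
    /* The destination is NULL because the sockets are already paired;
     * only the buffer argument is wrong. */
    n = sendto(fds[0], bad_buf, 16, 0, NULL, 0);
    saved = (n < 0) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return saved;
}
```

On Linux the returned value should compare equal to EFAULT, which supports the reading that the "address" in the man page is a memory address, not an IP address.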
Minimal runnable example with getcpu
Just to make things more concrete, we can have a look at the getcpu system call, which is very simple to understand, and shows the same EFAULT behaviour.
From man getcpu we see that the signature is:
int getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *tcache);
and the memory pointed to by the cpu will contain the ID of the current CPU the process is running on after the syscall, the only possible error being:
ERRORS
EFAULT Arguments point outside the calling process's address space.
So we can test it out with:
main.c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>
int main(void) {
int err, ret;
unsigned cpu;
/* Correct operation. */
assert(syscall(SYS_getcpu, &cpu, NULL, NULL) == 0);
printf("cpu %u\n", cpu);
/* Bad trash address == 1. */
ret = syscall(SYS_getcpu, 1, NULL, NULL);
err = errno;
assert(ret == -1);
printf("errno %d\n", err);
perror("getcpu");
return EXIT_SUCCESS;
}
compile and run:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
./main.out
Sample output:
cpu 3
errno 14
getcpu: Bad address
so we see that the bad call with a trash address of 1 set errno to 14, which is EFAULT as seen from kernel code: https://stackoverflow.com/a/53958705/895245
Remember that the syscall itself returns -14, and then the syscall C wrapper detects that it is an error due to being negative, returns -1, and sets errno to the actual precise error code.
And since the syscall is so simple, we can confirm this from the kernel 5.4 implementation as well at kernel/sys.c:
SYSCALL_DEFINE3(getcpu, unsigned __user *, cpup, unsigned __user *, nodep,
struct getcpu_cache __user *, unused)
{
int err = 0;
int cpu = raw_smp_processor_id();
if (cpup)
err |= put_user(cpu, cpup);
if (nodep)
err |= put_user(cpu_to_node(cpu), nodep);
return err ? -EFAULT : 0;
}
so clearly we see that -EFAULT is returned if there is a problem with put_user.
It is worth mentioning that my glibc does have a getcpu wrapper as well in sched.h, but that implementation segfaults in case of bad addresses, which is a bit confusing: How do I include Linux header files like linux/getcpu.h? But it is not what the actual syscall does to the process, just whatever glibc is doing with that address.
Tested on Ubuntu 20.04, Linux 5.4.
EFAULT is a macro defined in a file "include/uapi/asm-generic/errno-base.h"
#define EFAULT 14 /* Bad address */

What's wrong with gethostbyname?

I am using this snippet of code I found in http://www.kutukupret.com/2009/09/28/gethostbyname-vs-getaddrinfo/ to perform dns lookups
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
int main(int argc, char *argv[ ]) {
struct hostent *h;
/* error check the command line */
if(argc != 2) {
fprintf(stderr, "Usage: %s hostname\n", argv[0]);
exit(1);
}
/* get the host info */
if((h=gethostbyname(argv[1])) == NULL) {
herror("gethostbyname(): ");
exit(1);
}
else
printf("Hostname: %s\n", h->h_name);
printf("IP Address: %s\n", inet_ntoa(*((struct in_addr *)h->h_addr)));
return 0;
}
I am facing a weird fact
./test www.google.com
Hostname: www.l.google.com
IP Address: 209.85.148.103
works fine, but if I try to resolve an incomplete IP address I get this
./test 10.1.1
Hostname: 10.1.1
IP Address: 10.1.0.1
I would expect an error like the following
./test www.google
gethostbyname(): : Unknown host
but the program seems to work.
Any idea why?
It is not a bug but rather a feature of inet_aton() function:
DESCRIPTION
The inet_aton() function converts the specified string, in the
Internet standard dot notation, to a network address, and stores the
address in the structure provided.
Values specified using dot notation take one of the following forms:
a.b.c.d
When four parts are specified, each is interpreted as a byte of data and assigned, from left to right, to the four bytes of an internet address.
a.b.c
When a three-part address is specified, the last part is interpreted as a 16-bit quantity and placed in the rightmost two bytes of the network address. This makes the three-part address format convenient for specifying Class B network addresses as 128.net.host.
You can read more about this there, for example.
POSIX.2004 says:
The name argument of gethostbyname() shall be a node name; the behavior of gethostbyname() when passed a numeric address string is unspecified. For IPv4, a numeric address string shall be in the dotted-decimal notation described in inet_addr().
So, when looking at it from the POSIX point of view, you cannot expect anything when passing it an IP address.
On my system, the man page says this:
If name is an IPv4 or IPv6 address, no lookup is performed and gethostbyname() simply copies name into the h_name field and its struct in_addr equivalent into the h_addr_list[0] field of the returned hostent structure.
It does not say anything about what happens if you pass it an incomplete IP address, so anything could happen, including the behavior you observed.
For more information on how gethostbyname is implemented on your system, you can check the documentation for the function and/or the source code (if available).
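If the goal is to reject incomplete addresses such as 10.1.1 before they ever reach gethostbyname(), one option is to validate the input with inet_pton(), which only accepts the full four-part form. The sketch below contrasts that with the legacy parsing; it uses inet_addr(), which POSIX documents as following the same dot-notation rules quoted above for inet_aton(). Both helper names are hypothetical.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: returns 1 only for a complete four-part
 * dotted-decimal IPv4 address, 0 otherwise. */
int is_strict_ipv4(const char *text)
{
    struct in_addr addr;

    return inet_pton(AF_INET, text, &addr) == 1;
}

/* Hypothetical helper: parse `text` with the legacy rules (one to four
 * parts) and write the normalized four-part form into `out`.
 * Returns 0 on success, -1 if the text is rejected. */
int legacy_ipv4(const char *text, char *out, size_t outlen)
{
    struct in_addr addr;

    addr.s_addr = inet_addr(text);
    /* inet_addr() signals failure with (in_addr_t)-1, which means this
     * sketch also rejects the valid address 255.255.255.255. */
    if (addr.s_addr == (in_addr_t)-1)
        return -1;
    return inet_ntop(AF_INET, &addr, out, (socklen_t)outlen) != NULL ? 0 : -1;
}
```

With these, is_strict_ipv4("10.1.1") reports 0 while legacy_ipv4("10.1.1", ...) yields 10.1.0.1, matching the output seen in the question.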
