Why does my PC send packets larger than 1514 bytes in one go? - c

I wrote a TCP client and server that continuously send 1460 bytes of data at a time. My system interface MTU is 1500.
Here is my client program:
if((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
{
    printf("\n Error : Could not create socket \n");
    return 1;
}
setsockopt(sockfd, SOL_TCP, TCP_NODELAY, &one, sizeof(one));
serv_addr.sin_family = AF_INET;
serv_addr.sin_port = htons(9998);
serv_addr.sin_addr.s_addr = inet_addr("10.10.12.1");
if(connect(sockfd, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0)
{
    printf("\n Error : Connect Failed \n");
    return 1;
}
while(1)
{
    write(sockfd, send_buff, 1448);
}
In Wireshark, the first 15 to 30 packets show up as 1514-byte frames, but after that the packets look like the one below.
Wireshark output for one such packet:
No. Time Source Destination Protocol Length Info
16 0.000000 10.10.12.2 10.10.12.1 TCP 5858 53649 > distinct32 [ACK] Seq=3086892290 Ack=250285353 Win=14608 Len=5792 TSval=23114307 TSecr=23833274
Frame 16: 5858 bytes on wire (46864 bits), 5858 bytes captured (46864 bits)
Ethernet II, Src: 6c:3b:e5:14:9a:a2 (6c:3b:e5:14:9a:a2), Dst: Ibm_b5:86:85 (00:1a:64:b5:86:85)
Internet Protocol Version 4, Src: 10.10.12.2 (10.10.12.2), Dst: 10.10.12.1 (10.10.12.1)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
Total Length: 5844
Identification: 0x8480 (33920)
Flags: 0x00
Fragment offset: 0
Time to live: 64
Protocol: TCP (6)
Header checksum: 0xb38d [correct]
Source: 10.10.12.2 (10.10.12.2)
Destination: 10.10.12.1 (10.10.12.1)
Transmission Control Protocol, Src Port: 53649 (53649), Dst Port: distinct32 (9998), Seq: 3086892290, Ack: 250285353, Len: 5792
Source port: 53649 (53649)
Destination port: distinct32 (9998)
[Stream index: 0]
Sequence number: 3086892290
[Next sequence number: 3086898082]
Acknowledgement number: 250285353
Header length: 32 bytes
Flags: 0x010 (ACK)
Window size value: 913
[Calculated window size: 14608]
[Window size scaling factor: 16]
Checksum: 0x42dd [validation disabled]
Options: (12 bytes)
No-Operation (NOP)
No-Operation (NOP)
Timestamps: TSval 23114307, TSecr 23833274
Data (5792 bytes)
Wireshark shows packets of 5792, 7000, even 65535 bytes going out.
But I am sending 1514-byte packets in one go, and on the other side I receive only 1514-byte packets because of the network MTU.
So my questions are:
Why are such huge packets being sent?
I also tried without the NODELAY option, but it made no difference.
Is there a way to send packets of a particular size (such as 1514 bytes), with no jumbo frames?
I also changed tcp_rmem and tcp_wmem for the TCP send and receive buffers, but that did not help either.

TCP is a byte stream, so by design it is free to bundle multiple write() calls into larger segments. By default it also coalesces small writes according to Nagle's algorithm (which is what TCP_NODELAY turns off).
If you want more control over the actual size of network packets, use UDP.

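These look like "jumbo frames", but with an MTU of 1500 they do not go out on the wire that way: the capture is taken before the NIC splits the large buffer into MTU-sized segments (segmentation offload), which is why the receiver only sees 1514-byte frames. Handing the NIC large buffers is faster than traditional per-frame sending because it doesn't load up the CPU as much.
Consider yourself fortunate that you're getting this without futzing around with your IP stack's settings.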

I searched a lot and found that we need to change some parameters on the interface.
The default offload settings on my interface eth0 are:
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: off
Now, using ethtool, we need to turn off segmentation offload on the sending side:
sudo ethtool -K eth0 tso off gso off
After this, the settings are:
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: off
After this, the packets in the capture match what actually goes out on the wire (at most 1514 bytes per frame).
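To see why the capture shows a 5858-byte frame even though the MTU is 1500, it may help to redo the arithmetic from the dissection in the question. This is only a sketch: the header sizes are read off the Wireshark output (the TCP header is 32 bytes because of the timestamp option), and the constant and function names are made up for illustration.

```python
# Header sizes as reported by Wireshark in the question.
ETH_HDR = 14   # Ethernet II header
IP_HDR = 20    # "Header length: 20 bytes"
TCP_HDR = 32   # "Header length: 32 bytes" (20 bytes + 12 bytes of options)
MTU = 1500     # interface MTU from the question


def frame_len(payload):
    """Length of an Ethernet frame carrying `payload` bytes of TCP data."""
    return ETH_HDR + IP_HDR + TCP_HDR + payload


# Largest TCP payload that fits in one MTU-sized IP packet.
payload_per_segment = MTU - IP_HDR - TCP_HDR

# The oversized "packet" in the capture carries 5792 bytes of data.
segments, leftover = divmod(5792, payload_per_segment)

print(payload_per_segment)              # payload of one on-the-wire segment
print(frame_len(payload_per_segment))   # frame size the receiver sees
print(frame_len(5792))                  # frame size the sender's capture shows
print(segments, leftover)               # wire segments the NIC cuts it into
```

So the 5858-byte entry is four full-sized segments that were still one buffer when the capture saw them; with tso and gso off, the kernel does the splitting before the capture point.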

Related

PCAP Coding :: My Code is Setting the Wrong Type of Ethernet

I’m writing a C program which builds an Ethernet/IPv4/TCP network packet, then writes the packet into a PCAP file for inspection. I built my code off the SO post here. The first version of my code worked perfectly, but it was one big main() function, and that is not portable into larger programs.
So I reorganized the code so I could port it into another program. I don’t want to get into the differences between Version 1 and Version 2 in this post. But needless to say, Version 2 works great, except for one annoying quirk. When Wireshark opened a Version 1 PCAP file, it saw that my Layer 2 was Ethernet II:
Frame 1: 154 bytes on wire (1232 bits), 154 bytes captured (1232 bits)
Ethernet II, Src: 64:96:c8:fa:fc:ff (64:96:c8:fa:fc:ff), Dst: Woonsang_04:05:06 (01:02:03:04:05:06)
Destination: Woonsang_04:05:06 (01:02:03:04:05:06)
Source: 64:96:c8:fa:fc:ff (64:96:c8:fa:fc:ff)
Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 10.10.10.10, Dst: 20.20.20.20
Transmission Control Protocol, Src Port: 22, Dst Port: 55206, Seq: 1, Ack: 1, Len: 100
SSH Protocol
But in Version 2, the Layer 2 header became 802.3 Ethernet:
Frame 1: 154 bytes on wire (1232 bits), 134 bytes captured (1072 bits)
IEEE 802.3 Ethernet
Destination: Vibratio_1c:08:00 (00:09:70:1c:08:00)
Source: 45:00:23:28:06:cf (45:00:23:28:06:cf)
Length: 64
Trailer: 050401040204000001020506040400070602040704060202…
Logical-Link Control
Data (61 bytes)
[Packet size limited during capture: Ethernet truncated]
I’m no expert in networking, but I’m guessing my Version 2 PCAP file is malformed somewhere. I should not have a Logical-Link Control header in there; my code thinks it is writing Ethernet II / IPv4 / TCP headers. At this point, my instinct is that either the PCAP Packet header (necessary to precede every packet in a PCAP file) or my Ethernet header is incorrect, somehow. Which would tell Wireshark “the next X bytes are an Ethernet II header”?
Here’s my code, in excerpts:
The structs for the PCAP header and Ethernet frames were cribbed directly from the aforementioned SO post. The solution in that post was to use the pcap_sf_pkthdr struct for the PCAP Packet header:
// struct for PCAP Packet Header - Timestamp
struct pcap_timeval {
    bpf_int32 tv_sec;  // seconds
    bpf_int32 tv_usec; // microseconds
};
// struct for PCAP Packet Header
struct pcap_sf_pkthdr {
    struct pcap_timeval ts; // time stamp
    bpf_u_int32 caplen;     // length of portion present
    bpf_u_int32 len;        // length this packet (off wire)
};
And the Ethernet header is from the original post:
// struct for the Ethernet header
struct ethernet {
    u_char mac1[6];
    u_char mac2[6];
    u_short protocol; // will be ETHERTYPE_IP, for IPv4
};
There’s not much to either struct, right? I don’t really understand how Wireshark looks at this and knows the first 14 bytes of the packet are Ethernet.
Here’s the actual code, slightly abridged:
#include <net/ethernet.h> // for ETHERTYPE_IP
struct pcap_sf_pkthdr* allocatePCAPPacketHdr(struct pcap_sf_pkthdr* pcapPacketHdr ){
    pcapPacketHdr = malloc( sizeof(struct pcap_sf_pkthdr) );
    if( pcapPacketHdr == NULL ){
        return NULL;
    }
    uint32_t frameSize = sizeof( struct ethernet) + …correctly computed here
    bzero( pcapPacketHdr, sizeof( struct pcap_sf_pkthdr ) );
    pcapPacketHdr->ts.tv_sec = 0;  // for now
    pcapPacketHdr->ts.tv_usec = 0; // for now
    pcapPacketHdr->caplen = frameSize;
    pcapPacketHdr->len = frameSize;
    return pcapPacketHdr;
}
void* allocateL2Hdr( packetChecklist* pc, void* l2header ){
    l2header = malloc( sizeof( struct ethernet ) );
    if( l2header == NULL ){
        return NULL;
    }
    bzero( ((struct ethernet*)l2header)->mac1, 6 );
    bzero( ((struct ethernet*)l2header)->mac2, 6 );
    // …MAC addresses filled in later…
    ((struct ethernet*)l2header)->protocol = ETHERTYPE_IP; // This is correctly set
    return l2header;
}
...and the code which uses the above functions...
struct pcap_sf_pkthdr* pcapPacketHdr;
pcapPacketHdr = allocatePCAPPacketHdr( pcapPacketHdr );
struct ethernet* l2header;
l2header = allocateL2Hdr( l2header );
Later, the code populates these structs and writes them into a file, along with an IPv4 header, a TCP header, and so on.
But I think my problem is that I don’t really understand how Wireshark is supposed to know that my Ethernet header is Ethernet II and not 802.3 Ethernet with a Logical-Link Control header. Is that communicated in the PCAP Packet Header? Or in the Ethernet frame somewhere? I’m hoping for advice. Thank you.
"... Wireshark is supposed to know that my Ethernet header is Ethernet II and not 802.3 Ethernet with a Logical-Link Control header. Is that communicated in the PCAP Packet Header?"
No.
"Or in the ethernet frame somewhere?"
Yes.
If you want the details, see, for example, the "Types" section of the Wikipedia "Ethernet frame" page.
However, the problem appears to be that the packet you're writing to the file doesn't have the full 6-byte destination and source addresses in it - the last two bytes of the destination address are 0x08 0x00, which are the first two bytes of a big-endian value of ETHERTYPE_IP (0x0800), and the first byte of the source address is 0x45, which is the first byte of an IPv4 header for an IPv4 packet with no IP options.
Somehow, Version 1 of your program put the destination and source addresses into the data part of the pcap record, but Version 2 didn't.
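As a rough illustration of the rule described in that Wikipedia section: bytes 12 and 13 of the frame are read as a big-endian 16-bit value, and a value of 0x0600 or more is treated as an EtherType (Ethernet II), while a value of 1500 or less is treated as an 802.3 length. The sketch below is not the asker's code; the function names are invented, and the "bad" header is just the 14 bytes Wireshark displayed for the Version 2 file.

```python
import struct

ETHERTYPE_IP = 0x0800


def build_ethernet_ii(dst_mac, src_mac, ethertype):
    """Pack a 14-byte Ethernet II header ('!' means network byte order)."""
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be exactly 6 bytes")
    return struct.pack("!6s6sH", dst_mac, src_mac, ethertype)


def classify(frame):
    """Interpret bytes 12-13 of a frame the way a dissector would."""
    (type_or_len,) = struct.unpack_from("!H", frame, 12)
    if type_or_len >= 0x0600:
        return "Ethernet II, EtherType 0x%04x" % type_or_len
    if type_or_len <= 1500:
        return "IEEE 802.3, length %d" % type_or_len
    return "undefined (1501-1535)"


# Version 1: both 6-byte addresses present, then the EtherType.
good = build_ethernet_ii(bytes.fromhex("010203040506"),
                         bytes.fromhex("6496c8fafcff"),
                         ETHERTYPE_IP)

# Version 2 as displayed: 08 00 and 45 00 ... have slid into the address
# fields, so bytes 12-13 are 00 40 and get read as a length of 64.
bad = bytes.fromhex("0009701c0800" + "4500232806cf" + "0040")

print(classify(good))
print(classify(bad))
```

Packing the header with an explicit length check like this makes it harder to write a short address field by accident, which is the kind of slip the answer is pointing at.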

scapy send tcp packet on established connection

I have the following:
Server side: a plain Python TCP server (not scapy)
Client side: scapy, used to establish the connection and send a TCP packet
I am trying to send a TCP packet via scapy on an established connection, after the 3-way handshake.
I am able to complete the 3-way handshake, and the server side (the plain Python TCP server: create TCP socket, bind, listen, accept, recv()) shows that a new connection comes in and accept() returns the created FD.
I then try to send a packet from scapy after the 3-way handshake has succeeded, but recv() on the non-scapy side never gets the packet.
scapy side:
#!/usr/bin/env python
from scapy.all import *
import time
# VARIABLES
src = sys.argv[1]
dst = sys.argv[2]
sport = random.randint(1024,65535)
dport = int(sys.argv[3])
# SYN
ip=IP(src=src,dst=dst)
SYN=TCP(sport=sport,dport=dport,flags='S',seq=1000)
SYNACK=sr1(ip/SYN)
# ACK
ACK=TCP(sport=sport, dport=dport, flags='A', seq=SYNACK.ack, ack=SYNACK.seq + 1)
send(ip/ACK)
time.sleep(15)
ip = IP(src=src, dst=dst)
tcp = ip / TCP(sport=sport, dport=dport, flags="PA", seq=123, ack=1) / "scapy packet 123"
tcp.show2()
send(tcp)
Not scapy side:
#!/usr/bin/python
import socket
import sys
ip = sys.argv[1]
port = sys.argv[2]
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind((ip, int(port)))
s.listen(1)
while True:
    conn, addr = s.accept()
    print('Connection address:', addr)
    data = conn.recv(1024)  # Stuck here .....
tcpdump output shows:
tcpdump: listening on ens1f1, link-type EN10MB (Ethernet), capture size 65535 bytes
18:09:35.820865 IP (tos 0x0, ttl 64, id 1, offset 0, flags [none], proto TCP (6), length 40)
11.4.3.31.63184 > 11.4.3.30.strexec-d: Flags [S], cksum 0x6543 (correct), seq 1000, win 8192, length 0
18:09:35.821017 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 44)
11.4.3.30.strexec-d > 11.4.3.31.63184: Flags [S.], cksum 0x748d (correct), seq 3017593595, ack 1001, win 29200, options [mss 1460], length 0
18:09:35.930593 IP (tos 0x0, ttl 64, id 1, offset 0, flags [none], proto TCP (6), length 40)
11.4.3.31.63184 > 11.4.3.30.strexec-d: Flags [.], cksum 0xde5a (correct), seq 1, ack 1, win 8192, length 0
18:09:51.057904 IP (tos 0x0, ttl 64, id 1, offset 0, flags [none], proto TCP (6), length 56)
11.4.3.31.63184 > 11.4.3.30.strexec-d: Flags [P.], cksum 0x8eef (correct), seq 4294966419:4294966435, ack 1277373702, win 8192, length 16
18:09:51.057996 IP (tos 0x0, ttl 64, id 1194, offset 0, flags [DF], proto TCP (6), length 40)
11.4.3.30.strexec-d > 11.4.3.31.63184: Flags [.], cksum 0x8c4a (correct), seq 1, ack 1, win 29200, length 0
My question: why is the receiver side not getting the sent packet?
Note: My goal is to send a TCP packet with a bad checksum on an established connection and have it received by the non-scapy TCP server.
Thanks in advance!!
Your sequence numbers must accurately track the payload bytes you send. A packet with the SYN or FIN flag set is an exception and is treated as if it had a payload of length 1. In other words, you can use whatever initial sequence number you like, but then it must increase byte-for-byte with your sent payload (+1 for SYN or SYN+ACK [or FIN]).
So, if you start with a sequence number of 1000 in the SYN packet, then the next packet with payload (call this pktA) should have a sequence number of 1001. Then your next packet (pktB) should have sequence number 1001 + pktA.payload_size, and so forth.
Likewise, you cannot simply set the acknowledgment number field in the TCP header to 1 (as you're doing with the "scapy packet 123" segment). Whenever you set the ACK flag, the acknowledgment number must be the next sequence number you expect from the other side: the sequence number of the last segment you received plus its payload length (+1 for a SYN or FIN). Here the server has only sent its SYN+ACK, so that is the server's initial sequence number plus 1, the same value you used in your bare ACK. Once the connection is established, every segment is expected to carry the ACK flag (a receiver in the ESTABLISHED state drops segments without it), so include the flag and set the acknowledgment number correctly.
See this link for a good overview:
http://packetlife.net/blog/2010/jun/7/understanding-tcp-sequence-acknowledgment-numbers/
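The bookkeeping above can be sketched in a few lines. This is plain Python rather than scapy, the helper names are invented for illustration, and the two initial sequence numbers are simply the ones visible in the tcpdump output in the question. It also shows where the odd-looking numbers in the capture come from: tcpdump prints sequence numbers relative to each side's initial sequence number, modulo 2^32.

```python
MOD = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


def next_seq(seq, payload_len, syn=False, fin=False):
    """Sequence number that follows a segment (SYN and FIN count as 1)."""
    return (seq + payload_len + int(syn) + int(fin)) % MOD


def relative(absolute, isn):
    """Sequence number as tcpdump prints it: relative to the initial one."""
    return (absolute - isn) % MOD


client_isn = 1000        # seq used in the SYN
server_isn = 3017593595  # seq of the server's SYN+ACK in the capture
payload = b"scapy packet 123"

# What the data segment should carry.
data_seq = next_seq(client_isn, 0, syn=True)
data_ack = next_seq(server_isn, 0, syn=True)
seq_after_data = next_seq(data_seq, len(payload))

# What the script actually sent (seq=123, ack=1), as tcpdump displays it.
shown_seq = relative(123, client_isn)
shown_ack = relative(1, server_isn)

print(data_seq, data_ack, seq_after_data)
print(shown_seq, shown_ack)
```

In the scapy script that would presumably mean reusing SYNACK.ack for seq and SYNACK.seq + 1 for ack in the data segment, the same values already used for the bare ACK.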

Sending don't fragment UDP packets at the server and receiving fragmented packets at the client

I've created a program in C that sends data in UDP packets.
The socket is set to don't fragment using:
int optval = IP_PMTUDISC_DO;
if(setsockopt(sd, IPPROTO_IP, IP_MTU_DISCOVER, &optval, sizeof(int)) != 0)
{
    perror("setsockopt()");
    return 0;
}
Checking with tshark at the server (Debian 8, KVM virtualized), don't fragment is set on all packets.
But at the client, large packets arrive fragmented!
Then I noticed something even weirder: the IPv4 ID field is set to 0!
I thought it might be a side effect of don't fragment (because no packet is going to be fragmented).
Then I started OpenVPN and sniffed its packets at the server, but those packets had IPv4 ID != 0 while don't fragment was set.
Does anyone have any idea why this is happening?
Edit: Here is another sample of a large packet at the server, copied and pasted from the tshark output.
Frame 1049: 1514 bytes on wire (12112 bits), 1514 bytes captured (12112 bits) on interface 0
Ethernet II, Src: server_mac, Dst: gateway_mac
Internet Protocol Version 4, Src: server_ip, Dst: client_ip
Version: 4
Header Length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
0000 00.. = Differentiated Services Codepoint: Default (0x00)
.... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
Total Length: 1500
Identification: 0x0000 (0)
Flags: 0x02 (Don't Fragment)
0... .... = Reserved bit: Not set
.1.. .... = Don't fragment: Set
..0. .... = More fragments: Not set
Fragment offset: 0
Time to live: 64
Protocol: UDP (17)
Header checksum: 0xa52d [validation disabled]
[Good: False]
[Bad: False]
Source: server_ip
Destination: client_ip
[Source GeoIP: Unknown]
[Destination GeoIP: Unknown]
User Datagram Protocol, Src Port: 7554 (7554), Dst Port: 45376 (45376)
Data (1472 bytes)
As you can see, don't fragment is set, yet all packets larger than the PMTU arrive fragmented at the client side.
This is a packet trace of OpenVPN on the same server. As you can see, it at least has the IPv4 ID calculated!
Frame 3749: 1455 bytes on wire (11640 bits), 1455 bytes captured (11640 bits) on interface 0
Ethernet II, Src: SERVER_MAC, Dst: GATEWAY_MAC
Internet Protocol Version 4, Src: SERVER_IP, Dst: CLIENT_IP
Version: 4
Header Length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
0000 00.. = Differentiated Services Codepoint: Default (0x00)
.... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
Total Length: 1441
Identification: 0xcc96 (52374)
Flags: 0x02 (Don't Fragment)
0... .... = Reserved bit: Not set
.1.. .... = Don't fragment: Set
..0. .... = More fragments: Not set
Fragment offset: 0
Time to live: 64
Protocol: UDP (17)
Header checksum: 0xe252 [validation disabled]
[Good: False]
[Bad: False]
Source: SERVER_IP
Destination: CLIENT_IP
[Source GeoIP: Unknown]
[Destination GeoIP: Unknown]
User Datagram Protocol, Src Port: 19234 (19234), Dst Port: 46921 (46921)
Data (1413 bytes)
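For reference, the sizes in the two traces line up as follows. This is only arithmetic on the numbers tshark printed, assuming a 20-byte IPv4 header without options and the usual 8-byte UDP header; the helper names and the 1400-byte example MTU are made up for illustration.

```python
ETH_HDR = 14  # Ethernet II header
IP_HDR = 20   # IPv4 header without options ("Header Length: 20 bytes")
UDP_HDR = 8   # UDP header


def ip_total_length(udp_payload):
    """Value of the IPv4 Total Length field for a given UDP payload."""
    return IP_HDR + UDP_HDR + udp_payload


def frame_length(udp_payload):
    """Bytes on the wire for the whole Ethernet frame."""
    return ETH_HDR + ip_total_length(udp_payload)


def max_udp_payload(mtu):
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HDR - UDP_HDR


print(ip_total_length(1472), frame_length(1472))  # first trace
print(ip_total_length(1413), frame_length(1413))  # OpenVPN trace
print(max_udp_payload(1500))                      # limit for a 1500-byte MTU
print(max_udp_payload(1400))                      # limit for a hypothetical 1400-byte hop
```

The first packet fills a 1500-byte MTU exactly, so any hop on the path with a smaller MTU cannot forward it in one piece.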

Neighbor solicitation sent instead of ICMP6 echo request

I'm trying to send an ICMPv6 echo request. Below is my code:
struct icmp6_hdr icmp6;
int sock;
struct icmp6_filter filterv6;
struct ifreq ifr;
sock = socket(AF_INET6, SOCK_RAW,IPPROTO_ICMPV6);
ICMP6_FILTER_SETBLOCKALL(&filterv6);
ICMP6_FILTER_SETPASS(ICMP6_DST_UNREACH, &filterv6);
ICMP6_FILTER_SETPASS(ICMP6_PACKET_TOO_BIG, &filterv6);
ICMP6_FILTER_SETPASS(ICMP6_TIME_EXCEEDED, &filterv6);
ICMP6_FILTER_SETPASS(ICMP6_PARAM_PROB, &filterv6);
ICMP6_FILTER_SETPASS(ICMP6_ECHO_REPLY, &filterv6);
ICMP6_FILTER_SETPASS(ND_REDIRECT, &filterv6);
setsockopt(sock, IPPROTO_ICMPV6, ICMP6_FILTER, &filterv6, sizeof (filterv6));
...
setsockopt(sock, SOL_SOCKET, SO_BINDTODEVICE, &ifr, sizeof ifr);
...
icmp6.icmp6_type = ICMP6_ECHO_REQUEST;
icmp6.icmp6_code = 0;
icmp6.icmp6_cksum = 0;
icmp6.icmp6_id = id;
icmp6.icmp6_seq = 100;
if( (sendto(sock, &icmp6, sizeof(struct icmp6_hdr), 0, (struct sockaddr *)dest, socklen)) != sizeof(struct icmp6_hdr))
However, for some unknown reason, the packet that is sent is a neighbor solicitation (NS):
[root#jingo ~]# tcpdump -v -i any -s0 | grep icmp6
tcpdump: WARNING: Promiscuous mode not supported on the "any" device
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
11:57:08.397368 IP6 (hlim 255, next-header: ICMPv6 (58), length: 32) 2001:db8:0:85a3::ac1f:8003 > ff02::1:ff1f:8009: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has 2001:db8:0:85a3::ac1f:8009
11:57:09.397331 IP6 (hlim 64, next-header: ICMPv6 (58), length: 112) 2001:db8:0:85a3::ac1f:8003 > 2001:db8:0:85a3::ac1f:8003: [icmp6 sum ok] ICMP6, destination unreachable, length 112, unreachable address 2001:db8:0:85a3::ac1f:8009
I'm using 2.6.18-308.el5PAE kernel , Red Hat Enterprise Linux Server release 5.1 (Tikanga).
This is normal behavior.
Since you can't send IP traffic until you have the correct MAC address to direct packets to, something has to find that MAC address. In IPv4, you would have seen an ARP packet. NDP (neighbor discovery protocol) replaced ARP in IPv6, which is why you're seeing NDP traffic.
The real problem here is that the destination host is not reachable. It may be down, or the router may not know how to reach it. Your router might be configured incorrectly, but that seems unlikely.
Try pinging a host that is up, and you will see the NDP traffic followed by your ICMP echo request.
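You can check that the solicitation in the capture really is the lookup for your ping target: a neighbor solicitation for address resolution is sent to the target's solicited-node multicast address, which is ff02::1:ff00:0/104 with the low 24 bits of the target address appended. A small sketch (the function name is invented; the address is the one from the tcpdump output):

```python
import ipaddress

SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_multicast(target):
    """Solicited-node multicast address for an IPv6 unicast address."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF  # keep the last 24 bits
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)


target = "2001:db8:0:85a3::ac1f:8009"
print(solicited_node_multicast(target))
```

The result matches the destination of the first packet in the capture, so the kernel was resolving the echo request's destination and never got an answer.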

Filtering packets in pcap dump file

I'm writing a network analyzer and I need to filter packets saved in a file. I have written some code to filter HTTP packets, but I'm not sure it works as it should: when I run my code on a pcap dump the result is 5 packets, but in Wireshark typing http into the filter gives me 2 packets, and if I use:
tcpdump port http -r trace-1.pcap
it gives me 11 packets.
Well, 3 different results; that's a little confusing.
The filter and the packet processing in my code are:
...
if (pcap_compile(handle, &fcode, "tcp port 80", 1, netmask) < 0)
...
while ((packet = pcap_next(handle, &header))) {
    u_char *pkt_ptr = (u_char *)packet;
    // parse the first (Ethernet) header, grabbing the type field
    int ether_type = ((int)(pkt_ptr[12]) << 8) | (int)pkt_ptr[13];
    int ether_offset = 0;
    if (ether_type == ETHER_TYPE_IP) // Ethernet II
        ether_offset = 14;
    else if (ether_type == ETHER_TYPE_8021Q) // 802.1Q VLAN tag
        ether_offset = 18;
    else {
        fprintf(stderr, "Unknown ethernet type, %04X, skipping...\n", ether_type);
        continue; // do not parse an IP header at offset 0
    }
    // parse the IP header
    pkt_ptr += ether_offset; // skip past the Ethernet header
    struct ip_header *ip_hdr = (struct ip_header *)pkt_ptr;
    int packet_length = ntohs(ip_hdr->tlen);
    printf("\n%d - packet length: %d, and the capture length: %d\n", cnt++, packet_length, header.caplen);
}
My question is: why are there 3 different results when filtering for HTTP? And if I'm filtering it wrong, how can I do it right? Also, is there a way to filter HTTP (or SSH, FTP, Telnet, ...) packets using something other than the port numbers?
Thanks
So I have figured it out. It took a little searching and understanding, but I did it.
With the Wireshark filter set to http, only packets that actually carry HTTP data are shown; in my trace those are the TCP port 80 packets with the PSH and ACK flags set. After realizing this, the tcpdump command parameters that produce the same number of packets were easy to write.
So now Wireshark and tcpdump give the same results.
What about my code? Well, I figured out that I actually had an error in my question: the filter
if (pcap_compile(handle, &fcode, "tcp port 80", 1, netmask) < 0)
does indeed give 11 packets (src or dst port set to 80, no matter what the TCP flags are).
Now, filtering the desired packets is a matter of either understanding the filter syntax well,
or filtering only on port 80 (21, 22, ...) and then, in the callback function or the while loop, getting the TCP header, reading the flags from it, and using a mask to see whether it is the right kind of packet (PSH, ACK, SYN, ...). The flag values are listed, for example, here
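The "get the TCP header and mask the flags" step could look roughly like this. It is a sketch in Python rather than the C of the question, it assumes Ethernet (optionally with one 802.1Q tag) carrying IPv4, and make_frame is a hypothetical helper that only exists to produce test input.

```python
import struct

ETHER_TYPE_IP = 0x0800
ETHER_TYPE_8021Q = 0x8100
IPPROTO_TCP = 6
TCP_PSH = 0x08
TCP_ACK = 0x10


def tcp_flags(frame):
    """Return the TCP flags byte of an Ethernet/IPv4/TCP frame, else None."""
    (ether_type,) = struct.unpack_from("!H", frame, 12)
    offset = 14
    if ether_type == ETHER_TYPE_8021Q:  # skip the 4-byte VLAN tag
        (ether_type,) = struct.unpack_from("!H", frame, 16)
        offset = 18
    if ether_type != ETHER_TYPE_IP:
        return None
    ip_header_len = (frame[offset] & 0x0F) * 4  # IHL is in 32-bit words
    if frame[offset + 9] != IPPROTO_TCP:        # IPv4 protocol field
        return None
    return frame[offset + ip_header_len + 13]   # flags are byte 13 of TCP


def is_psh_ack(frame):
    """True if both PSH and ACK are set (other flags are ignored)."""
    wanted = TCP_PSH | TCP_ACK
    flags = tcp_flags(frame)
    return flags is not None and (flags & wanted) == wanted


def make_frame(flags):
    """Hypothetical helper: minimal Ethernet/IPv4/TCP frame for testing."""
    eth = bytes(12) + struct.pack("!H", ETHER_TYPE_IP)
    ip = bytes([0x45, 0, 0, 40, 0, 0, 0, 0, 64, IPPROTO_TCP, 0, 0]) + bytes(8)
    tcp = struct.pack("!HHIIBBHHH", 1234, 80, 0, 0, 5 << 4, flags, 8192, 0, 0)
    return eth + ip + tcp


print(is_psh_ack(make_frame(TCP_PSH | TCP_ACK)))  # data segment
print(is_psh_ack(make_frame(0x02)))               # bare SYN
```

The same test can usually be pushed into the capture filter itself, for example 'tcp port 80 and tcp[tcpflags] & (tcp-push|tcp-ack) == (tcp-push|tcp-ack)', so that only matching packets reach the loop.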
