List windows route table - c

I want to list the entries of the Windows route table, with the same output as route print. I use the GetIpForwardTable2 function from the IP Helper API, but I get some weird results that differ from the output of the route command.
I run it on Windows 7 64-bit in VirtualBox, where I have 3 network cards (NAT, Bridged and Internal Network), and compile it under Cygwin with the following command:
gcc -D_WIN32_WINNT=0x0601 -DNTDDI_VERSION=0x06010000 win-iproute.c -liphlpapi
The _WIN32_WINNT and NTDDI_VERSION defines are only there to make the Windows 7 functionality available.
To make it simpler, I consider IPv4 only for now.
Here is the code:
#include <winsock2.h>   /* must be included before windows.h */
#include <windows.h>
#include <iphlpapi.h>
#include <Mstcpip.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    DWORD retval;
    MIB_IPFORWARD_TABLE2 *routes = NULL;
    MIB_IPFORWARD_ROW2 *route;
    ULONG idx;

    retval = GetIpForwardTable2(AF_INET, &routes);
    if (retval != ERROR_SUCCESS)
    {
        fprintf(stderr, "GetIpForwardTable2 failed (0x%lx).\n", (unsigned long)retval);
        return 1;
    }
    printf("Route entries count: %lu\n", (unsigned long)routes->NumEntries);
    for (idx = 0; idx < routes->NumEntries; idx++)
    {
        printf("\n -- Entry #%lu -- \n", (unsigned long)idx);
        route = routes->Table + idx;
        /* The LUID members are bit-fields and the others are enums or BOOLEAN,
           so cast each one to the type the format specifier expects. */
        printf("luid: \t\t Reserved: %u, NetLuidIndex %u, IfType %u\n",
               (unsigned)route->InterfaceLuid.Info.Reserved,
               (unsigned)route->InterfaceLuid.Info.NetLuidIndex,
               (unsigned)route->InterfaceLuid.Info.IfType);
        printf("protocol: \t %u\n", (unsigned)route->Protocol);
        printf("origin: \t %u\n", (unsigned)route->Origin);
        printf("loopback: \t %u\n", (unsigned)route->Loopback);
        printf("next hop: \t %s\n", inet_ntoa(route->NextHop.Ipv4.sin_addr));
        printf("site prefix length: \t %u\n", (unsigned)route->SitePrefixLength);
        printf("prefix length: \t %u\n", (unsigned)route->DestinationPrefix.PrefixLength);
        printf("prefix : \t %s\n", inet_ntoa(route->DestinationPrefix.Prefix.Ipv4.sin_addr));
    }
    FreeMibTable(routes);   /* the table is allocated by GetIpForwardTable2 */
    return 0;
}
And the output is:
Route entries count: 22
-- Entry #0 --
luid: Reserved: 0, NetLuidIndex 6, IfType 6
protocol: 0
origin: 0
loopback: 0
next hop: 0.0.0.0
site prefix length: 0
prefix length: 0
prefix : 0.0.0.0
-- Entry #1 --
luid: Reserved: 0, NetLuidIndex 0, IfType 0
protocol: 0
origin: 0
loopback: 0
next hop: 0.0.0.0
site prefix length: 0
prefix length: 3
prefix : 0.0.0.0
-- Entry #2 --
luid: Reserved: 0, NetLuidIndex 0, IfType 0
protocol: 4294967295
origin: 257
loopback: 0
next hop: 0.0.0.0
site prefix length: 0
prefix length: 10
prefix : 0.1.0.0
-- Entry #3 --
luid: Reserved: 17, NetLuidIndex 0, IfType 0
protocol: 11
origin: 0
loopback: 2
next hop: 0.0.0.0
site prefix length: 17
prefix length: 0
prefix : 2.0.0.0
-- Entry #4 --
luid: Reserved: 0, NetLuidIndex 0, IfType 0
protocol: 32
origin: 0
loopback: 2
next hop: 0.1.0.0
site prefix length: 0
prefix length: 255
prefix : 2.0.0.0
-- Entry #5 --
luid: Reserved: 0, NetLuidIndex 0, IfType 0
protocol: 0
origin: 256
loopback: 255
next hop: 0.0.0.0
site prefix length: 0
prefix length: 11
prefix : 255.255.255.255
-- Entry #6 --
luid: Reserved: 3, NetLuidIndex 65792, IfType 0
protocol: 201326592
origin: 2
loopback: 0
next hop: 0.0.0.0
site prefix length: 3
prefix length: 24
prefix : 0.0.6.0
-- Entry #7 --
luid: Reserved: 5855577, NetLuidIndex 89, IfType 0
protocol: 0
origin: 2
loopback: 0
next hop: 0.1.0.0
site prefix length: 89
prefix length: 0
prefix : 0.0.0.0
-- Entry #8 --
luid: Reserved: 0, NetLuidIndex 0, IfType 0
protocol: 0
origin: 4294967295
loopback: 0
next hop: 2.0.0.0
site prefix length: 0
prefix length: 0
prefix : 0.0.0.0
-- Entry #9 --
luid: Reserved: 16777215, NetLuidIndex 65791, IfType 0
protocol: 593
origin: 1572864
loopback: 0
next hop: 2.0.0.0
site prefix length: 255
prefix length: 0
prefix : 0.0.0.0
-- Entry #10 --
luid: Reserved: 1, NetLuidIndex 512, IfType 0
protocol: 0
origin: 0
loopback: 0
next hop: 255.255.255.255
site prefix length: 1
prefix length: 0
prefix : 0.0.0.0
-- Entry #11 --
luid: Reserved: 4, NetLuidIndex 512, IfType 0
protocol: 0
origin: 0
loopback: 0
next hop: 0.0.6.0
site prefix length: 4
prefix length: 81
prefix : 0.0.0.0
-- Entry #12 --
luid: Reserved: 0, NetLuidIndex 16776960, IfType 65535
protocol: 3
origin: 1
loopback: 0
next hop: 0.0.0.0
site prefix length: 0
prefix length: 0
prefix : 0.1.0.0
-- Entry #13 --
luid: Reserved: 0, NetLuidIndex 12, IfType 6
protocol: 4294967295
origin: 0
loopback: 0
next hop: 0.0.0.0
site prefix length: 0
prefix length: 0
prefix : 0.0.0.0
-- Entry #14 --
luid: Reserved: 0, NetLuidIndex 0, IfType 0
protocol: 0
origin: 0
loopback: 0
next hop: 0.0.0.0
site prefix length: 0
prefix length: 3
prefix : 0.0.0.0
-- Entry #15 --
luid: Reserved: 0, NetLuidIndex 0, IfType 0
protocol: 4294967295
origin: 257
loopback: 0
next hop: 0.0.0.0
site prefix length: 0
prefix length: 255
prefix : 0.1.0.0
-- Entry #16 --
luid: Reserved: 585, NetLuidIndex 0, IfType 0
protocol: 3449440
origin: 0
loopback: 0
next hop: 0.0.0.0
site prefix length: 73
prefix length: 0
prefix : 2.0.0.0
-- Entry #17 --
luid: Reserved: 3211321, NetLuidIndex 13056, IfType 65
protocol: 3342403
origin: 4325427
loopback: 49
next hop: 125.0.0.0
site prefix length: 53
prefix length: 68
prefix : 54.0.45.0
-- Entry #18 --
luid: Reserved: 3473453, NetLuidIndex 17408, IfType 54
protocol: 0
origin: 0
loopback: 0
next hop: 0.0.0.0
site prefix length: 0
prefix length: 0
prefix : 70.0.69.0
-- Entry #19 --
luid: Reserved: 0, NetLuidIndex 0, IfType 0
protocol: 7471205
origin: 7274610
loopback: 0
next hop: 115.0.97.0
site prefix length: 111
prefix length: 0
prefix : 0.0.0.0
-- Entry #20 --
luid: Reserved: 7274611, NetLuidIndex 26112, IfType 116
protocol: 3277144
origin: 50725
loopback: 0
next hop: 49.69.55.56
site prefix length: 51
prefix length: 56
prefix : 65.0.100.0
-- Entry #21 --
luid: Reserved: 3277144, NetLuidIndex 0, IfType 0
protocol: 0
origin: 0
loopback: 0
next hop: 0.0.0.0
site prefix length: 192
prefix length: 0
prefix : 16.0.0.0
The output of route print -4, on the other hand, is the following:
===========================================================================
Interface List
16...08 00 27 7e 98 16 ......Intel(R) PRO/1000 MT Desktop Adapter #3
14...08 00 27 86 3d 31 ......Intel(R) PRO/1000 MT Desktop Adapter #2
11...08 00 27 42 d2 16 ......Intel(R) PRO/1000 MT Desktop Adapter
1...........................Software Loopback Interface 1
12...00 00 00 00 00 00 00 e0 Microsoft ISATAP Adapter
13...00 00 00 00 00 00 00 e0 Teredo Tunneling Pseudo-Interface
15...00 00 00 00 00 00 00 e0 Microsoft ISATAP Adapter #2
17...00 00 00 00 00 00 00 e0 Microsoft ISATAP Adapter #3
===========================================================================
IPv4 Route Table
===========================================================================
Active Routes:
Network Destination Netmask Gateway Interface Metric
0.0.0.0 0.0.0.0 10.0.2.2 10.0.2.15 10
0.0.0.0 0.0.0.0 10.0.0.138 10.0.0.36 10
10.0.0.0 255.255.255.0 On-link 10.0.0.36 266
10.0.0.36 255.255.255.255 On-link 10.0.0.36 266
10.0.0.255 255.255.255.255 On-link 10.0.0.36 266
10.0.2.0 255.255.255.0 On-link 10.0.2.15 266
10.0.2.15 255.255.255.255 On-link 10.0.2.15 266
10.0.2.255 255.255.255.255 On-link 10.0.2.15 266
89.89.89.0 255.255.255.0 On-link 89.89.89.89 266
89.89.89.89 255.255.255.255 On-link 89.89.89.89 266
89.89.89.255 255.255.255.255 On-link 89.89.89.89 266
127.0.0.0 255.0.0.0 On-link 127.0.0.1 306
127.0.0.1 255.255.255.255 On-link 127.0.0.1 306
127.255.255.255 255.255.255.255 On-link 127.0.0.1 306
224.0.0.0 240.0.0.0 On-link 127.0.0.1 306
224.0.0.0 240.0.0.0 On-link 10.0.2.15 266
224.0.0.0 240.0.0.0 On-link 10.0.0.36 266
224.0.0.0 240.0.0.0 On-link 89.89.89.89 266
255.255.255.255 255.255.255.255 On-link 127.0.0.1 306
255.255.255.255 255.255.255.255 On-link 10.0.2.15 266
255.255.255.255 255.255.255.255 On-link 10.0.0.36 266
255.255.255.255 255.255.255.255 On-link 89.89.89.89 266
===========================================================================
Persistent Routes:
None
There is a lot of weird stuff in my program's output. Many entries have undocumented values, for example:
Protocol should be within the range 1-14 (almost no entry has such a value)
Luid.IfType shouldn't be 0 (again, almost all are zero)
almost no entry has a reasonable Prefix
The structures are described in the documentation for MIB_IPFORWARD_ROW2 and NET_LUID.
Should I just ignore the entries with invalid values? And if so, where are the valid ones? Or am I doing something terribly wrong?
I also discovered that when I start Windows with the cables unplugged, it returns fewer entries (which makes sense). When I then plug in the cables, entries are added, but when I unplug them again the entries are still there. The route command works as expected: when a cable is unplugged, the entries are removed.
When I try the older function GetIpForwardTable, it works, but it doesn't support IPv6.

So it seems that the problem was in Cygwin. When I compile the example code with the Microsoft C compiler (cl.exe), it works as expected. After updating Cygwin, it also works when compiled with gcc.
Interestingly, it was enough to update the packages using the Cygwin installer; cygwin1.dll can remain at the older version.
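
To line the API results up with the route print columns, the PrefixLength of each entry has to be shown as a dotted netmask (for example, the 224.0.0.0 / 240.0.0.0 rows above correspond to a prefix length of 4). A minimal, portable sketch of that conversion follows; the helper names prefix_to_mask and format_mask are made up for illustration and are not part of the IP Helper API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: IPv4 netmask for a prefix length, in host byte order. */
uint32_t prefix_to_mask(unsigned prefix_len)
{
    assert(prefix_len <= 32);
    /* Shifting a 32-bit value by 32 is undefined, so handle 0 separately. */
    return prefix_len == 0 ? 0u : 0xFFFFFFFFu << (32 - prefix_len);
}

/* Hypothetical helper: writes the mask as dotted decimal into buf. */
void format_mask(unsigned prefix_len, char buf[16])
{
    uint32_t m = prefix_to_mask(prefix_len);
    snprintf(buf, 16, "%u.%u.%u.%u",
             (unsigned)(m >> 24), (unsigned)((m >> 16) & 0xFF),
             (unsigned)((m >> 8) & 0xFF), (unsigned)(m & 0xFF));
}
```

Calling format_mask(route->DestinationPrefix.PrefixLength, buf) inside the loop would then give the Netmask column.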

What exactly do I need to alter to get the desired result in tracert?

Looking for batch file code that uses tracert, I found this:
Process "Tracert" output for IP addresses (Batch)
I'm not an expert in batch files. I tried to change the domain to a specific one, but I'm getting this error:
What exactly should I alter to get a result file like this, but for the domain that I want:
Hop: 1 Packetloss: 0 (0% loss) Average: 0ms IP: 192.168.1.1
Hop: 2 Packetloss: 0 (0% loss) Average: 0ms IP: 10.59.17.1
Hop: 3 Packetloss: 0 (0% loss) Average: 2ms IP: 172.17.2.137
Hop: 4 Packetloss: 0 (0% loss) Average: 6ms IP: 172.17.4.36
Hop: 5 Packetloss: 0 (0% loss) Average: 4ms IP: 172.17.4.10
Hop: 6 Packetloss: 0 (0% loss) Average: 2ms IP: 172.17.4.3
Hop: 7 Packetloss: 0 (0% loss) Average: 3ms IP: 80.72.159.241 Host: lt-0-0-0.mx-1a.ip.cirque.dk
Hop: 8 Packetloss: 0 (0% loss) Average: 2ms IP: 194.255.185.193 Host: 0xc2ffb9c1.linknet.dk.telia.net
Hop: 9 Packetloss: 0 (0% loss) Average: 2ms IP: 194.255.133.97 Host: 0xc2ff8561.linknet.dk.telia.net
Hop: 10 Packetloss: 0 (0% loss) Average: 5ms IP: 194.255.133.98 Host: 0xc2ff8562.linknet.dk.telia.net
Hop: 11 Packetloss: 0 (0% loss) Average: 3ms IP: 87.72.143.234
Hop: 12 Packetloss: --- Average: --- IP: Timeout
Hop: 13 Packetloss: --- Average: --- IP: Timeout
Hop: 14 Packetloss: 0 (0% loss) Average: 3ms IP: 62.61.131.22 Host: speedtest01-hor.aplus.dk

Issue reading line and printing everything together in c

I am trying to write code that reads input line by line and stores it in a variable sdp. The text further down is what I paste into the console, and I want sdp to hold it in that same format.
The idea is to read every line, append an end-of-line sequence, and reallocate extra memory for every line read.
Here is the code I wrote:
printf("[Description]: ");
char *line;
size_t len = 0;
ssize_t read = 0;
char *sdp = (char*) malloc(sizeof(char));
while ((read = getline(&line, &len, stdin)) != -1 && all_space(line)) {
    sdp = (char*) realloc (sdp,(sizeof(sdp)) +sizeof(line));
    strcat(sdp, line);
    strcat(sdp, "\r\n");
    free(line);
}
printf("%s\n",sdp);
rtcSetRemoteDescription(peer->pc, sdp, "offer");
free(sdp);
This is what I am trying to read and store in the variable sdp:
v=0
o=- 118260718 0 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0
m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
a=ice-ufrag:r4nc
a=ice-pwd:USf61+k7dRjSFJFZIlZkNn
a=ice-options:trickle
a=mid:0
a=setup:actpass
a=dtls-id:1
a=fingerprint:sha-256 21:C5:F4:72:3C:BA:4F:4A:DD:F1:14:C5:15:A9:57:4E:5B:61:44:CB:9B:7C:FC:2A:D3:0D:90:99:47:53:A9:57
a=sctp-port:5000
a=max-message-size:262144
But I keep running into a problem: sdp does not store all the values, and the result after printing is strange.
Can someone help me?
2020-05-11 21:24:28.287 DEBUG [15263] [rtc::InitLogger#34] Logger initialized
Peer created
Reached this point
***************************************************************************************
* 0: Exit / 1: Enter remote description / 2: Enter remote candidate / 3: Send message / 4: Print Connection Info *
[Command]: v=0
o=- 118260718 0 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0
m=application 9 UDP/DTLS/SCTP webrtc-datachannel
c=IN IP4 0.0.0.0
a=ice-ufrag:r4nc
a=ice-pwd:USf61+k7dRjSFJFZIlZkNn
a=ice-options:trickle
a=mid:0
a=setup:actpass
a=dtls-id:1
a=fingerprint:sha-256 21:C5:F4:72:3C:BA:4F:4A:DD:F1:14:C5:15:A9:57:4E:5B:61:44:CB:9B:7C:FC:2A:D3:0D:90:99:47:53:A9:57
a=sctp-port:5000
a=max-message-size:262144user@user-MXC6300:~/Desktop/WEBRTC/build$ o=- 118260718 0 IN IP4 127.0.0.1
118260718: command not found
user@user-MXC6300:~/Desktop/WEBRTC/build$ s=-
user@user-MXC6300:~/Desktop/WEBRTC/build$ t=0 0
0: command not found
user@user-MXC6300:~/Desktop/WEBRTC/build$ a=group:BUNDLE 0
0: command not found
user@user-MXC6300:~/Desktop/WEBRTC/build$ m=application 9 UDP/DTLS/SCTP webrtc-datachannel
9: command not found
user@user-MXC6300:~/Desktop/WEBRTC/build$ c=IN IP4 0.0.0.0
IP4: command not found
user@user-MXC6300:~/Desktop/WEBRTC/build$ a=ice-ufrag:r4nc
user@user-MXC6300:~/Desktop/WEBRTC/build$ a=ice-pwd:USf61+k7dRjSFJFZIlZkNn
user@user-MXC6300:~/Desktop/WEBRTC/build$ a=ice-options:trickle
user@user-MXC6300:~/Desktop/WEBRTC/build$ a=mid:0
user@user-MXC6300:~/Desktop/WEBRTC/build$ a=setup:actpass
user@user-MXC6300:~/Desktop/WEBRTC/build$ a=dtls-id:1
user@user-MXC6300:~/Desktop/WEBRTC/build$ a=fingerprint:sha-256 21:C5:F4:72:3C:BA:4F:4A:DD:F1:14:C5:15:A9:57:4E:5B:61:44:CB:9B:7C:FC:2A:D3:0D:90:99:47:53:A9:57
21:C5:F4:72:3C:BA:4F:4A:DD:F1:14:C5:15:A9:57:4E:5B:61:44:CB:9B:7C:FC:2A:D3:0D:90:99:47:53:A9:57: command not found
user@user-MXC6300:~/Desktop/WEBRTC/build$ a=sctp-port:5000
user@user-MXC6300:~/Desktop/WEBRTC/build$ a=max-message-size:262144
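
Several things in the loop from the question are undefined or do not do what was intended: line is passed to getline uninitialized, sizeof(sdp) and sizeof(line) are the size of a pointer rather than a string length, sdp is never set to an empty string before the first strcat, and line is freed inside the loop and then handed back to getline. getline also keeps the trailing newline, so appending "\r\n" yields "\n\r\n". The following is a minimal sketch of one way to accumulate the lines, not the only one. It assumes that the description ends at the first empty line or at end of input (the original all_space helper is not shown, so that stop rule is a guess), and read_sdp is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: reads lines from `in` until EOF or an empty line and
 * returns them joined with "\r\n" in one heap buffer (the caller frees it).
 * Returns NULL if memory runs out. */
char *read_sdp(FILE *in)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;
    ssize_t n;
    size_t used = 0;     /* bytes stored in sdp, not counting the NUL */
    char *sdp = malloc(1);

    assert(in != NULL);
    if (sdp == NULL)
        return NULL;
    sdp[0] = '\0';

    while ((n = getline(&line, &cap, in)) != -1) {
        /* Drop the line terminator that getline() leaves in place. */
        while (n > 0 && (line[n - 1] == '\n' || line[n - 1] == '\r'))
            line[--n] = '\0';
        if (n == 0)
            break;       /* assumed end-of-description marker */

        /* Room for the old text, the new line, "\r\n" and the NUL. */
        char *grown = realloc(sdp, used + (size_t)n + 3);
        if (grown == NULL) {
            free(sdp);
            free(line);
            return NULL;
        }
        sdp = grown;
        memcpy(sdp + used, line, (size_t)n);
        memcpy(sdp + used + (size_t)n, "\r\n", 3);  /* copies the NUL too */
        used += (size_t)n + 2;
    }
    free(line);          /* one free after the loop; getline() reuses it */
    return sdp;
}
```

With this helper the call site would become char *sdp = read_sdp(stdin); followed by the rtcSetRemoteDescription call and free(sdp).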

arm neon assembly performance issue in xiaomi5s

Consider the following code. The first code snippet:
void run_new(const float* src, float* dst,
             size_t IH, size_t IW, size_t OH, size_t OW,
             size_t N) {
    rep(n, N) {
        const float* src_ptr = src + IW * IH * n;
        float* outptr = dst;
        const float* r0 = src_ptr;
        const float* r1 = src_ptr + IW;
        float32x4_t k0123 = vdupq_n_f32(3.f);
        rep(h, OH) {
            size_t width = OW >> 2;
            asm volatile(
                    "dup v21.4s, %4.s[0] \n"
                    "dup v22.4s, %4.s[1] \n"
                    "dup v23.4s, %4.s[2] \n"
                    "dup v24.4s, %4.s[3] \n"
                    "mov x3, xzr \n"
                    "0: \n"
                    "ldr q0, [%1] \n"
                    "ld1 {v1.4s, v2.4s}, [%2], #32 \n"
                    "add x3, x3, #0x1 \n"
                    "cmp %0, x3 \n"
                    "ld1 {v3.4s, v4.4s}, [%3], #32 \n"
                    "fmla v0.4s, v1.4s, v21.4s \n" // src[i] * k[i]
                    "fmla v0.4s, v2.4s, v22.4s \n"
                    "fmla v0.4s, v3.4s, v23.4s \n"
                    "fmla v0.4s, v4.4s, v24.4s \n"
                    "str q0, [%1], #16 \n"
                    "bne 0b \n"
                    : "+r"(width), "+r"(outptr), "+r"(r0), "+r"(r1)
                    : "w"(k0123)
                    : "cc", "memory", "x3", "v0", "v1", "v2", "v3", "v4", "v21", "v22", "v23", "v24");
        }
    }
}
The second code snippet:
void run_origin(const float* src, float* dst,
                size_t IH, size_t IW, size_t OH, size_t OW,
                size_t N) {
    rep(n, N) {
        const float* src_ptr = src + IW * IH * n;
        float* outptr = dst;
        const float* r0 = src_ptr;
        const float* r1 = src_ptr + IW;
        float32x4_t k0123 = vdupq_n_f32(3.f);
        rep(h, OH) {
            size_t width = OW >> 2;
            asm volatile(
                    "dup v21.4s, %4.s[0] \n"
                    "dup v22.4s, %4.s[1] \n"
                    "dup v23.4s, %4.s[2] \n"
                    "dup v24.4s, %4.s[3] \n"
                    "mov x3, xzr \n"
                    "mov x4, xzr \n"
                    "0: \n"
                    "add x19, %2, x4 \n"
                    "ldr q0, [%1] \n" // load dst 0, 1, 2, 3
                    "ld1 {v1.4s, v2.4s}, [x19]\n" // 1, 2, 4, 6
                    "add x3, x3, #0x1 \n"
                    "cmp %0, x3 \n"
                    "add x19, %3, x4 \n"
                    "ld1 {v3.4s, v4.4s}, [x19]\n"
                    "fmla v0.4s, v1.4s, v21.4s \n" // src[i] * k[i]
                    "fmla v0.4s, v2.4s, v22.4s \n"
                    "fmla v0.4s, v3.4s, v23.4s \n"
                    "fmla v0.4s, v4.4s, v24.4s \n"
                    "add x4, x4, #0x20 \n"
                    "str q0, [%1], #16 \n"
                    "bne 0b \n"
                    "add %2, %2, x4 \n"
                    "add %3, %3, x4 \n"
                    : "+r"(width), "+r"(outptr), "+r"(r0), "+r"(r1)
                    : "w"(k0123)
                    : "cc", "memory", "x3", "x4", "x19", "v0", "v1", "v2", "v3", "v4", "v21", "v22", "v23", "v24");
        }
    }
}
The full code is in "Test performance of arm neon assembly".
I tested the performance of these two snippets on xiaomi5s, xiaomi6 and redmi. The detailed results are:
N: 12 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 325.35058 mflops --- new: 4275.63483 mflops --- speedup: 13.14162 xiaomi5s
perf origin: 3082.00078 mflops --- new: 3063.45047 mflops --- speedup: 0.99398 xiaomi6
perf origin: 1761.05058 mflops --- new: 1814.37185 mflops --- speedup: 1.03028 redmi
The following tests were run on xiaomi5s.
N: 12 IH:48-256 IW: 224
N: 12 IH: 48 IW: 224 OH: 24 OW: 112
perf origin: 3721.16633 mflops --- new: 4935.31729 mflops --- speedup: 1.32628
N: 12 IH: 80 IW: 224 OH: 40 OW: 112
perf origin: 1185.58378 mflops --- new: 3852.38266 mflops --- speedup: 3.24936
N: 12 IH: 112 IW: 224 OH: 56 OW: 112
perf origin: 1021.83468 mflops --- new: 3503.70672 mflops --- speedup: 3.42884
N: 12 IH: 144 IW: 224 OH: 72 OW: 112
perf origin: 797.61461 mflops --- new: 4167.12780 mflops --- speedup: 5.22449
N: 12 IH: 176 IW: 224 OH: 88 OW: 112
perf origin: 465.55073 mflops --- new: 4084.54206 mflops --- speedup: 8.77357
N: 12 IH: 208 IW: 224 OH: 104 OW: 112
perf origin: 373.99237 mflops --- new: 4255.78687 mflops --- speedup: 11.37934
N: 12 IH: 240 IW: 224 OH: 120 OW: 112
perf origin: 341.57406 mflops --- new: 4290.58840 mflops --- speedup: 12.56122
N: 12 IH:224 IW: 48-256
N: 12 IH: 224 IW: 48 OH: 112 OW: 24
perf origin: 3660.35916 mflops --- new: 4729.61877 mflops --- speedup: 1.29212
N: 12 IH: 224 IW: 80 OH: 112 OW: 40
perf origin: 2918.48755 mflops --- new: 4748.17285 mflops --- speedup: 1.62693
N: 12 IH: 224 IW: 112 OH: 112 OW: 56
perf origin: 951.03852 mflops --- new: 4051.84318 mflops --- speedup: 4.26044
N: 12 IH: 224 IW: 144 OH: 112 OW: 72
perf origin: 1186.74405 mflops --- new: 4160.18572 mflops --- speedup: 3.50555
N: 12 IH: 224 IW: 176 OH: 112 OW: 88
perf origin: 533.47286 mflops --- new: 4199.36622 mflops --- speedup: 7.87175
N: 12 IH: 224 IW: 208 OH: 112 OW: 104
perf origin: 447.30682 mflops --- new: 4092.22256 mflops --- speedup: 9.14858
N: 12 IH: 224 IW: 240 OH: 112 OW: 120
perf origin: 442.58206 mflops --- new: 4200.13672 mflops --- speedup: 9.49007
IC: 2-12 IH:224 IW: 224
N: 2 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 3794.45684 mflops --- new: 5236.48508 mflops --- speedup: 1.38004
N: 3 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 3790.20521 mflops --- new: 5150.30622 mflops --- speedup: 1.35885
N: 4 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 2117.55521 mflops --- new: 4329.34274 mflops --- speedup: 2.04450
N: 5 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 1290.43541 mflops --- new: 3915.65607 mflops --- speedup: 3.03437
N: 6 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 1038.86926 mflops --- new: 3747.69392 mflops --- speedup: 3.60747
N: 7 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 845.26878 mflops --- new: 4025.81237 mflops --- speedup: 4.76276
N: 8 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 658.23150 mflops --- new: 3971.62335 mflops --- speedup: 6.03378
N: 9 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 527.99489 mflops --- new: 4163.94501 mflops --- speedup: 7.88634
N: 10 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 416.75353 mflops --- new: 4119.03296 mflops --- speedup: 9.88362
N: 11 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 378.38875 mflops --- new: 4203.33717 mflops --- speedup: 11.10852
N: 12 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 350.36924 mflops --- new: 4202.19842 mflops --- speedup: 11.99363
I am confused by the results on xiaomi5s: why is the performance of the second snippet (run_origin) so bad on this device?
My guess is that the NEON pipeline stalls when a load has to wait for a general-purpose register, for example ld1 {v3.4s, v4.4s}, [x19] waiting for x19, which is computed by add x19, %3, x4, but I am not sure.
Additional details:
xiaomi5s cpu: Qualcomm Snapdragon 821
xiaomi6 cpu: Qualcomm Snapdragon 835
redmi cpu: MediaTek Helio X20
Compile options (clang version 5.0.0): clang++ -std=c++11 -Ofast.
I changed ldr q0, [%1] to ld1 {v0.4s}, [%1], but the result is the same; run_origin may be a little faster, by about 1%-3%.
N: 12 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 342.96631 mflops --- asm: 4288.51646 mflops --- speedup: 12.50419
I changed fmla v0.4s, v1.4s, v21.4s to smlsl2 v0.2d, v1.4s, v21.4s, but the result is the same.
N: 12 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 348.03699 mflops --- asm: 4245.18804 mflops --- speedup: 12.19752
I changed fmla v0.4s, v1.4s, v21.4s to fadd v0.4s, v1.4s, v21.4s, and the origin code gets faster.
N: 12 IH: 224 IW: 224 OH: 112 OW: 112
perf origin: 743.95433 mflops --- asm: 4756.65769 mflops --- speedup: 6.39375
A wild guess is that the bottleneck is just as likely to be in the memory/cache subsystem as the core. Perhaps the first case does something that inhibits automatic pre-loading (or the xiaomi5s lacks this or has it disabled)?
It might be interesting to try adding a pld (or rather prfm) instruction, though I've never found them to help much on Cortex-A9 at least.
An easy way to check if fmla is the bottleneck would be to comment out some or all of the data-processing instructions (of course, the output will be wrong!)
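
The prefetch idea can be tried from C before touching the assembly. The following is a rough sketch, not a drop-in replacement: it is a scalar rewrite of one row of the loop above using the GCC/Clang builtin __builtin_prefetch, which on AArch64 compilers generally lower to a prfm instruction. The function name row_kernel_prefetch and the distance PREFETCH_ITERS are assumptions chosen for illustration; a useful distance can only be found by measuring on the device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many loop iterations ahead to prefetch. */
#define PREFETCH_ITERS 8

/* One output row: each iteration updates 4 floats of dst from 8 floats of
 * r0 and 8 floats of r1, mirroring the four fmla instructions. */
void row_kernel_prefetch(float *dst, const float *r0, const float *r1,
                         const float k[4], size_t width)
{
    assert(dst && r0 && r1 && k);
    for (size_t i = 0; i < width; ++i) {
        /* Only prefetch addresses that are still inside the row. */
        if (i + PREFETCH_ITERS < width) {
            __builtin_prefetch(r0 + 8 * PREFETCH_ITERS, 0, 0);
            __builtin_prefetch(r1 + 8 * PREFETCH_ITERS, 0, 0);
        }
        for (int j = 0; j < 4; ++j) {
            dst[j] += r0[j] * k[0] + r0[j + 4] * k[1]
                    + r1[j] * k[2] + r1[j + 4] * k[3];
        }
        dst += 4;
        r0 += 8;
        r1 += 8;
    }
}
```

If the slow variant speeds up noticeably with prefetching while the fast one does not, that would support the memory-subsystem explanation over the fmla one.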
I'm still not as familiar with NEON64 as with NEON32, but there are several things I wouldn't do in your code:
Why are you using the VFP instruction "ldr"? Switching between VFP and NEON can cost lots of cycles, especially if the instructions access memory. That both share the registers doesn't mean they are the same unit. Change it to LD1 ...... 4s
Do you want it 32-bit or 64-bit? Choose x3 or w3, and stick to it.
Are you sure you want fused multiply with fmla? Maybe yes or maybe no, but note that fused multiplies cost more...
cheers

MS SQL Server crashing on Linux

I am running Microsoft SQL Server on Ubuntu 16.04.2 LTS in a QEMU VM.
SQL Agent is installed as well.
The VM has 16 GB of RAM and 6 processors assigned.
The SQL Server upper memory limit is set to 10 GB.
I have a single 1.2 GB database in Simple recovery mode.
There is a single SQL Agent job, which backs up the DB.
Problem: the sqlservr process is killed by the OOM killer shortly after the job finishes.
What settings should I be looking at to fix this?
I do not see anything in the SQL Server logs, only the messages in dmesg.
BACKUP JOB:
--Script 1: Backup specific database
-- 1. Variable declaration
DECLARE @path VARCHAR(500)
DECLARE @name VARCHAR(500)
DECLARE @pathwithname VARCHAR(500)
DECLARE @time DATETIME
DECLARE @year VARCHAR(4)
DECLARE @month VARCHAR(2)
DECLARE @day VARCHAR(2)
DECLARE @hour VARCHAR(2)
DECLARE @minute VARCHAR(2)
DECLARE @second VARCHAR(2)
-- 2. Setting the backup path
SET @path = 'C:\sqldata\SQLBACKUPS\'
-- 3. Getting the time values
SELECT @time = GETDATE()
SELECT @year = (SELECT CONVERT(VARCHAR(4), DATEPART(yy, @time)))
SELECT @month = (SELECT CONVERT(VARCHAR(2), FORMAT(DATEPART(mm,@time),'00')))
SELECT @day = (SELECT CONVERT(VARCHAR(2), FORMAT(DATEPART(dd,@time),'00')))
SELECT @hour = (SELECT CONVERT(VARCHAR(2), FORMAT(DATEPART(hh,@time),'00')))
SELECT @minute = (SELECT CONVERT(VARCHAR(2), FORMAT(DATEPART(mi,@time),'00')))
SELECT @second = (SELECT CONVERT(VARCHAR(2), FORMAT(DATEPART(ss,@time),'00')))
-- 4. Defining the filename format
SELECT @name ='DBNAME' + '_' + @year + @month + @day + @hour + @minute + @second
SET @pathwithname = @path + @name + '.bak'
--5. Executing the backup command
BACKUP DATABASE [DBNAME]
ERROR MESSAGE in dmesg:
[617521.605059] kthreadd invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=2, oom_score_adj=0
[617521.605060] kthreadd cpuset=/ mems_allowed=0
[617521.605076] CPU: 1 PID: 2 Comm: kthreadd Not tainted 4.8.0-46-generic #49~16.04.1-Ubuntu
[617521.605077] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[617521.605082] 0000000000000286 00000000ac5a0d51 ffff8806ed5dbb00 ffffffffa0e2e073
[617521.605086] ffff8806ed5dbc90 ffff8806ea450ec0 ffff8806ed5dbb68 ffffffffa0c2e97b
[617521.605088] 0000000000000000 ffff8802fb7b8a80 ffff8806ea450ec0 ffff8806ed5dbb58
[617521.605090] Call Trace:
[617521.605117] [<ffffffffa0e2e073>] dump_stack+0x63/0x90
[617521.605130] [<ffffffffa0c2e97b>] dump_header+0x5c/0x1dc
[617521.605143] [<ffffffffa0dbd629>] ? apparmor_capable+0xe9/0x1a0
[617521.605152] [<ffffffffa0ba58d6>] oom_kill_process+0x226/0x3f0
[617521.605154] [<ffffffffa0ba5e4a>] out_of_memory+0x35a/0x3f0
[617521.605156] [<ffffffffa0bab079>] __alloc_pages_slowpath+0x959/0x980
[617521.605157] [<ffffffffa0bab35a>] __alloc_pages_nodemask+0x2ba/0x300
[617521.605166] [<ffffffffa0a80726>] copy_process.part.30+0x146/0x1b50
[617521.605176] [<ffffffffa0a63eee>] ? kvm_sched_clock_read+0x1e/0x30
[617521.605183] [<ffffffffa0aa3ed0>] ? kthread_create_on_node+0x1e0/0x1e0
[617521.605194] [<ffffffffa0a2c78c>] ? __switch_to+0x2dc/0x700
[617521.605196] [<ffffffffa0a82327>] _do_fork+0xe7/0x3f0
[617521.605213] [<ffffffffa1295b17>] ? __schedule+0x307/0x790
[617521.605215] [<ffffffffa0a82659>] kernel_thread+0x29/0x30
[617521.605219] [<ffffffffa0aa48e0>] kthreadd+0x160/0x1b0
[617521.605222] [<ffffffffa129aa1f>] ret_from_fork+0x1f/0x40
[617521.605224] [<ffffffffa0aa4780>] ? kthread_create_on_cpu+0x60/0x60
[617521.605225] Mem-Info:
[617521.605231] active_anon:1075398 inactive_anon:4083 isolated_anon:0
active_file:2616493 inactive_file:328306 isolated_file:160
unevictable:1 dirty:327621 writeback:785 unstable:0
slab_reclaimable:21286 slab_unreclaimable:7420
mapped:10714 shmem:5451 pagetables:6225 bounce:0
free:33879 free_pcp:498 free_cma:0
[617521.605234] Node 0 active_anon:4301592kB inactive_anon:16332kB active_file:10465972kB inactive_file:1313224kB unevictable:4kB isolated(anon):0kB isolated(file):640kB mapped:42856kB dirty:1310484kB writeback:3140kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 3321856kB anon_thp: 21804kB writeback_tmp:0kB unstable:0kB pages_scanned:17790528 all_unreclaimable? yes
[617521.605235] Node 0 DMA free:15900kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[617521.605238] lowmem_reserve[]: 0 2952 15988 15988 15988
[617521.605240] Node 0 DMA32 free:64576kB min:12464kB low:15580kB high:18696kB active_anon:733012kB inactive_anon:0kB active_file:2107244kB inactive_file:145520kB unevictable:0kB writepending:145520kB present:3129192kB managed:3063624kB mlocked:0kB slab_reclaimable:6992kB slab_unreclaimable:1272kB kernel_stack:1280kB pagetables:2844kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[617521.605243] lowmem_reserve[]: 0 0 13036 13036 13036
[617521.605244] Node 0 Normal free:55040kB min:55048kB low:68808kB high:82568kB active_anon:3568580kB inactive_anon:16332kB active_file:8358728kB inactive_file:1167704kB unevictable:4kB writepending:1168104kB present:13631488kB managed:13352220kB mlocked:4kB slab_reclaimable:78152kB slab_unreclaimable:28400kB kernel_stack:5168kB pagetables:22056kB bounce:0kB free_pcp:1992kB local_pcp:100kB free_cma:0kB
[617521.605264] lowmem_reserve[]: 0 0 0 0 0
[617521.605266] Node 0 DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15900kB
[617521.605277] Node 0 DMA32: 208*4kB (UE) 148*8kB (UE) 260*16kB (UE) 115*32kB (UME) 121*64kB (UME) 73*128kB (UME) 67*256kB (UME) 22*512kB (UME) 9*1024kB (UME) 0*2048kB 0*4096kB = 64576kB
[617521.605284] Node 0 Normal: 856*4kB (UMEH) 604*8kB (UEH) 278*16kB (UMEH) 373*32kB (UMEH) 185*64kB (UMEH) 53*128kB (UMEH) 14*256kB (UMEH) 6*512kB (UME) 5*1024kB (MH) 0*2048kB 0*4096kB = 55040kB
[617521.605293] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[617521.605294] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[617521.605294] 2950382 total pagecache pages
[617521.605295] 0 pages in swap cache
[617521.605296] Swap cache stats: add 0, delete 0, find 0/0
[617521.605296] Free swap = 0kB
[617521.605297] Total swap = 0kB
[617521.605297] 4194168 pages RAM
[617521.605297] 0 pages HighMem/MovableOnly
[617521.605298] 86230 pages reserved
[617521.605298] 0 pages cma reserved
[617521.605298] 0 pages hwpoisoned
[617521.605299] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[617521.605304] [ 337] 0 337 10867 3412 25 3 0 0 systemd-journal
[617521.605306] [ 382] 0 382 25742 291 17 3 0 0 lvmetad
[617521.605307] [ 384] 0 384 11276 897 22 3 0 -1000 systemd-udevd
[617521.605308] [ 780] 108 780 90615 2349 78 3 0 0 whoopsie
[617521.605309] [ 789] 106 789 11833 986 27 3 0 -900 dbus-daemon
[617521.605311] [ 803] 0 803 1100 312 7 3 0 0 acpid
[617521.605312] [ 823] 104 823 65138 701 29 3 0 0 rsyslogd
[617521.605313] [ 835] 0 835 129671 2914 40 6 0 0 snapd
[617521.605314] [ 836] 0 836 7137 729 18 3 0 0 systemd-logind
[617521.605315] [ 838] 0 838 7252 644 20 3 0 0 cron
[617521.605316] [ 857] 0 857 84342 1436 65 3 0 0 ModemManager
[617521.605317] [ 965] 0 965 16380 1344 35 3 0 -1000 sshd
[617521.605318] [ 967] 0 967 4884 65 14 3 0 0 irqbalance
[617521.605320] [ 992] 0 992 17496 788 40 3 0 0 login
[617521.605321] [ 1098] 0 1098 74129 1986 47 3 0 0 polkitd
[617521.605322] [ 1116] 120 1116 11105 983 23 3 0 0 ntpd
[617521.605323] [ 1152] 0 1152 71840 2120 136 4 0 0 winbindd
[617521.605324] [ 1153] 0 1153 105122 3484 203 4 0 0 winbindd
[617521.605325] [ 1159] 0 1159 73413 2856 140 4 0 0 winbindd
[617521.605326] [ 1161] 0 1161 71832 1924 135 4 0 0 winbindd
[617521.605327] [ 1163] 0 1163 71832 1295 136 4 0 0 winbindd
[617521.605328] [ 1721] 1000 1721 11312 932 26 3 0 0 systemd
[617521.605329] [ 1722] 1000 1722 16318 466 34 3 0 0 (sd-pam)
[617521.605337] [ 1725] 1000 1725 5613 1066 16 3 0 0 bash
[617521.605338] [ 1789] 0 1789 14274 787 33 3 0 0 sudo
[617521.605339] [ 1790] 0 1790 14109 719 33 3 0 0 su
[617521.605340] [ 1791] 0 1791 5619 1120 17 3 0 0 bash
[617521.605342] [ 1935] 0 1935 60002 1421 114 4 0 0 nmbd
[617521.605343] [ 1948] 0 1948 86040 3924 165 3 0 0 smbd
[617521.605345] [ 1949] 0 1949 82452 1067 155 3 0 0 smbd
[617521.605347] [ 1951] 0 1951 86171 1589 160 3 0 0 smbd
[617521.605349] [19081] 0 19081 87063 4262 167 3 0 0 smbd
[617521.605351] [19253] 0 19253 24889 1458 52 3 0 0 sshd
[617521.605352] [19275] 1000 19275 24889 891 51 3 0 0 sshd
[617521.605354] [19276] 1000 19276 5605 1104 16 3 0 0 bash
[617521.605356] [19307] 0 19307 14274 778 33 3 0 0 sudo
[617521.605357] [19308] 0 19308 14109 737 32 3 0 0 su
[617521.605359] [19309] 0 19309 5618 1184 16 3 0 0 bash
[617521.605360] [16347] 999 16347 18952 4419 40 4 0 0 sqlservr
[617521.605361] [16349] 999 16349 3028846 1043058 2562 26 0 0 sqlservr
[617521.605362] [20193] 0 20193 88057 4618 168 3 0 0 smbd
[617521.605363] [30023] 0 30023 87931 4038 167 3 0 0 smbd
[617521.605364] [ 4801] 0 4801 87627 4088 167 3 0 0 smbd
[617521.605365] [ 5266] 0 5266 68705 2451 66 4 0 0 cups-browsed
[617521.605366] [ 7563] 0 7563 88008 4183 167 3 0 0 smbd
[617521.605368] [10495] 0 10495 88072 4621 168 3 0 0 smbd
[617521.605369] [12342] 0 12342 88008 4292 167 3 0 0 smbd
[617521.605371] [12797] 0 12797 12555 719 30 3 0 0 cron
[617521.605373] [12798] 0 12798 12555 719 30 3 0 0 cron
[617521.605375] [12799] 0 12799 1127 213 8 3 0 0 sh
[617521.605376] [12800] 0 12800 1127 187 7 3 0 0 sh
[617521.605377] [12801] 0 12801 4902 785 15 3 0 0 rsync
[617521.605378] [12802] 0 12802 4732 483 14 3 0 0 rsync
[617521.605379] [12803] 0 12803 3911 690 12 3 0 0 rsync
[617521.605380] [12804] 0 12804 3741 452 11 3 0 0 rsync
[617521.605381] [12805] 0 12805 4878 477 15 3 0 0 rsync
[617521.605382] [12806] 0 12806 3911 515 11 3 0 0 rsync
[617521.605383] Out of memory: Kill process 16349 (sqlservr) score 254 or sacrifice child
[617521.608484] Killed process 16349 (sqlservr) total-vm:12115384kB, anon-rss:4164616kB, file-rss:7616kB, shmem-rss:0kB
[617521.832626] oom_reaper: reaped process 16349 (sqlservr), now anon-rss:0kB, file-rss:236kB, shmem-rss:0kB
You can use the SQL Server sp_configure settings to limit its memory consumption if other processes on the machine are consuming memory and causing it to run out. Alternatively, add swap (though you don't want SQL Server to be swapped out) or add memory.
We can also tune the way the OOM killer handles OOM conditions. To make the SQL Server process (PID 3452 in this example) less likely to be killed by the OOM killer:
echo -15 > /proc/3452/oom_adj
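Note that on current kernels /proc/<pid>/oom_adj is a legacy interface (range -17 to 15); the preferred knob is /proc/<pid>/oom_score_adj (range -1000 to 1000). The sketch below shows roughly how a legacy value would translate. It follows my understanding of the kernel's own scaling (multiply by 1000/17, truncating toward zero, with 15 pinned to 1000), so treat the result as an approximation rather than something authoritative. The function name is hypothetical.

```python
OOM_DISABLE = -17         # lowest legacy oom_adj value (never kill)
OOM_ADJUST_MAX = 15       # highest legacy oom_adj value
OOM_SCORE_ADJ_MAX = 1000  # highest oom_score_adj value

def oom_adj_to_score_adj(oom_adj):
    """Approximate the oom_score_adj that corresponds to a legacy oom_adj."""
    if not OOM_DISABLE <= oom_adj <= OOM_ADJUST_MAX:
        raise ValueError("oom_adj must be between -17 and 15")
    if oom_adj == OOM_ADJUST_MAX:
        # Special case so that the maximum stays reachable.
        return OOM_SCORE_ADJ_MAX
    # Integer scaling, truncating toward zero as C division would.
    scaled = abs(oom_adj) * OOM_SCORE_ADJ_MAX // -OOM_DISABLE
    return scaled if oom_adj >= 0 else -scaled

if __name__ == "__main__":
    pid = 3452  # example PID from the text above
    value = oom_adj_to_score_adj(-15)
    print(f"echo {value} > /proc/{pid}/oom_score_adj")
```

Lowering the value needs root, and the setting is lost when the process restarts, so it usually has to be reapplied from a startup script or the service manager.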

OOM Killer is triggered despite free memory

I am having a problem where the OOM killer is sometimes triggered. I have searched the internet and found many related threads, but a few things still puzzle me. I hope someone can help.
Environment: iMX6 (32bit).
User/Kernelspace split: 2G-2G
Total RAM: 4GB
Some important logs:
top invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
I see that it is trying to allocate 1 page of (contiguous) memory (order=0) in the HIGHMEM zone (from gfp_mask). Please correct me if I am wrong.
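One way to double-check that reading of gfp_mask and order is to decode them with a small script. This is a sketch only: the bit values below are what I believe 3.x kernels use in include/linux/gfp.h, they have changed between kernel versions, and the table is deliberately partial, so verify against the headers of the kernel you are actually running.

```python
# Assumed (3.x-era) bit values; only a partial table, other bits are reported as unknown.
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x100: "__GFP_COLD",
    0x20000: "__GFP_HARDWALL",
}

def decode_gfp(mask):
    """Return the names of the set bits, lowest bit first."""
    names = []
    bit = 1
    while bit <= mask:
        if mask & bit:
            names.append(GFP_BITS.get(bit, "unknown(%#x)" % bit))
        bit <<= 1
    return names

def pages_for_order(order):
    """An order-n request asks for 2**n physically contiguous pages."""
    return 1 << order

print(decode_gfp(0x201da))
print(pages_for_order(0), "page(s)")
```

Under those assumptions 0x201da includes __GFP_HIGHMEM and __GFP_MOVABLE, and order=0 is a single page, which is consistent with the interpretation above.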
DMA free:1322780kB min:4492kB low:5612kB high:6736kB active_anon:0kB inactive_anon:0kB active_file:84kB
DMA: 941*4kB (UEMC) 1211*8kB (UEMC) 1185*16kB (UEMC) 836*32kB (UEMC) 554*64kB (UEMC) 295*128kB (UEMC) 106*256kB
HighMem free:480kB min:512kB low:2384kB high:4256kB active_anon:2021148kB inactive_anon:70364kB active_file:0kB
HighMem: 0*4kB 1*8kB (R) 0*16kB 7*32kB (R) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB
I believe the OOM killer is triggered because the free HighMem (480kB) is below the min watermark (512kB). Again, please correct me if I am wrong.
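The free-versus-min comparison can be done mechanically on the zone lines of the dump. The sketch below only compares the `free` and `min` fields; the kernel's real watermark test is more involved (it also takes per-zone reserves and the allocation order into account), so this is a rough first check and nothing more. The helper names are made up.

```python
import re

def parse_zone_line(line):
    """Split a zone line from the OOM dump into (zone name, {field: kB})."""
    name = line.split()[0]
    fields = {key: int(val) for key, val in re.findall(r"(\w+):(\d+)kB", line)}
    return name, fields

def below_min(fields):
    """True if the zone's free memory is under its min watermark."""
    return fields["free"] < fields["min"]

ZONE_LINES = [
    "DMA free:1322780kB min:4492kB low:5612kB high:6736kB active_anon:0kB",
    "HighMem free:480kB min:512kB low:2384kB high:4256kB active_anon:2021148kB",
]

for zone_line in ZONE_LINES:
    zone, f = parse_zone_line(zone_line)
    state = "below min" if below_min(f) else "above min"
    print(zone, f["free"], "kB free,", f["min"], "kB min:", state)
```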
My questions:
1. I thought the DMA zone is only about 16MB, the NORMAL zone runs from 16MB up to about 896MB, and the rest is the HIGHMEM zone. But the log shows more than 1GB of free memory (1322780kB) in the DMA zone.
2. Why doesn't the kernel use this zone for further allocations?
More logs (taken from the complete log):
DMA per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 0
CPU 2: hi: 186, btch: 31 usd: 0
CPU 3: hi: 186, btch: 31 usd: 0
HighMem per-cpu:
CPU 0: hi: 186, btch: 31 usd: 51
CPU 1: hi: 186, btch: 31 usd: 20
CPU 2: hi: 186, btch: 31 usd: 4
CPU 3: hi: 186, btch: 31 usd: 14
active_anon:505287 inactive_anon:17591 isolated_anon:0
active_file:21 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
free:330815 slab_reclaimable:1134 slab_unreclaimable:3487
mapped:15956 shmem:25014 pagetables:1982 bounce:0
25046 total pagecache pages
983039 pages of RAM
331349 free pages
9947 reserved pages
2772 slab pages
543663 pages shared
0 pages swap cached
cat /proc/pagetypeinfo
Page block order: 13
Pages per block: 8192
Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10     11     12     13
Node    0, zone      DMA, type    Unmovable      1      0      9      8      3      1      0      1      1      1      1      0      1      0
Node    0, zone      DMA, type  Reclaimable      4      5      5      1      2      0      1      1      1      0      1      0      1      0
Node    0, zone      DMA, type      Movable      1      6      4      0      0      0      1      1      2      4      3      3      4     28
Node    0, zone      DMA, type      Reserve      0      0      0      0      0      0      0      0      0      0      0      0      0      1
Node    0, zone      DMA, type          CMA      1      1      2      0      0      0      0      0      1      1      0      0      1      3
Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone  HighMem, type    Unmovable     11      7      2      2      9      6      5      3      3      1      0      1      1      0
Node    0, zone  HighMem, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone  HighMem, type      Movable     23    201   4771   4084   1803    403    105     69     57     38     23     21      8     23
Node    0, zone  HighMem, type      Reserve      0      0      0      0      0      0      0      0      0      0      0      0      0      1
Node    0, zone  HighMem, type          CMA      0      0      0      0      0      0      0      0      0      0      0      0      0      0
Node    0, zone  HighMem, type      Isolate      0      0      0      0      0      0      0      0      0      0      0      0      0      0
Number of blocks type     Unmovable  Reclaimable      Movable      Reserve          CMA      Isolate
Node 0, zone      DMA             5            1           33            1           16            0
Node 0, zone  HighMem             2            0           62            1            0            0
I would be glad to post further logs if necessary.
Thank you,
Srik
Probably a long shot, but have you tried adding
vm.overcommit_memory = 2
vm.overcommit_ratio = 80
to
/etc/sysctl.conf?
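To get a feel for what those two settings would allow: with vm.overcommit_memory = 2 the kernel refuses new allocations once committed memory would exceed swap plus vm.overcommit_ratio percent of RAM (shown as CommitLimit in /proc/meminfo). A rough sketch of the arithmetic follows. It ignores huge pages, and the swap size of 0 is purely an assumption, since the question does not say whether swap is configured; the RAM figure is the "983039 pages of RAM" from the log at 4 kB per page.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """Approximate CommitLimit for vm.overcommit_memory = 2 (huge pages ignored)."""
    return swap_kb + ram_kb * overcommit_ratio // 100

ram_kb = 983039 * 4  # pages of RAM from the log, assuming 4 kB pages
swap_kb = 0          # assumption: no swap configured
limit = commit_limit_kb(ram_kb, swap_kb, 80)
print("CommitLimit is roughly", limit, "kB")
```

On a running system the actual values can be read from the CommitLimit and Committed_AS lines of /proc/meminfo.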
Chances are high that you ran out of virtual memory, because a 32-bit kernel can only directly access 4 GB of virtual memory and there are heavy limitations on the usable address space for hardware access. For example, network adapter hardware acceleration could require memory in some specific address range, and if you run out of RAM in that specific range, the system has to either run the OOM killer or kill your network adapter. That is true even if your system has free memory available in some unrelated zone.
For details, try reviewing these links:
https://serverfault.com/questions/564068/linux-oom-situation-32-bit-kernel
https://serverfault.com/questions/548736/how-to-read-oom-killer-syslog-messages
and maybe this, too:
https://unix.stackexchange.com/questions/373312/oom-killer-doesnt-work-properly-leads-to-a-frozen-os
TL;DR: if you need more than 2 GB of RAM, install a 64 bit OS.

Resources