BPF: mark in structure __skbuff is not writeable?

BPF: mark in structure __skbuff is not writeable? - c

I have a BPF code (section "classifier"). I use this to load to an interface using the tc (traffic controller) utility. My code changes the mark in __skbuff. Later when I try to catch this mark using iptables, I observe that the mark I edited has disappeared.
Code:
__section("classifier")
int function(struct __sk_buff *skb)
{
skb->mark = 0x123;
I use the iptable mangle table's below rule to see if the mark is written correctly.
# iptables -t mangle -A PREROUTING -i <my_interface> \
-m mark --mark 0x123 \
-j LOG --log-prefix "MY_PRINTS" --log-level 7
Following are the TC commands I used to load my bpf programs ;
# tc qdisc add dev <myInterface> root handle 1: prio
# tc filter add dev <myInterface> parent 1: bpf obj bpf.o flowid 1:1 direct-action

The issue is in your tc commands. You are attaching your filter on the egress side.
The root parent refers to the egress side, used for traffic shaping. If instead you want to attach your filter on the ingress side, you should use something like this (no handle needed):
# tc qdisc add dev <myInterface> ingress
# tc filter add dev <myInterface> ingress bpf obj bpf.o direct-action
Or, better practice, use the BPF-specific qdisc clsact, which can be used to attach filters for both ingress and egress (not much documentation on it, besides its commit log and Cilium's BPF documentation (search for clsact)):
# tc qdisc add dev <myInterface> clsact
# tc filter add dev <myInterface> ingress bpf obj bpf.o direct-action

Related

How can I run dracut commands inside a non-root C code?

I'm developing a tool that modifies LUKS partitions and disks.
Everything is working very well. Until now...
To handle disks properly as a non-root user, I added some polkit rules to change password, open partition, change crypttab and many others.
But, I'm seeing problems when I change crypttab and I need to run dracut to apply some dracut modules (dracut --force). Specially, the last one.
My user is part of admin group and I added a rule into sudoers file to not ask sudo password when my application is executed.
So, I decided to use this code:
gchar *dracut[] = {"/usr/bin/sudo", "/usr/bin/dracut", "--force", NULL};
if ((child = fork()) > 0) {
waitpid(child, NULL, 0);
} else if (!child) {
execvp("/usr/bin/sudo", dracut);
}
It is not working because SELinux is preventing to run this command:
SELinux is preventing /usr/bin/sudo from getattr access on the chr_file /dev/hpet.
***** Plugin catchall (100. confidence) suggests **************************
If you believe that sudo should be allowed getattr access on the hpet chr_file by default.
Then you should report this as a bug.
You can generate a local policy module to allow this access.
Do
allow this access for now by executing:
# ausearch -c 'sudo' --raw | audit2allow -M my-sudo
# semodule -X 300 -i my-sudo.pp
Additional Information:
Source Context system_u:system_r:xdm_t:s0-s0:c0.c1023
Target Context system_u:object_r:clock_device_t:s0
Target Objects /dev/hpet [ chr_file ]
Source sudo
Source Path /usr/bin/sudo
Port <Unknown>
Host <Unknown>
Source RPM Packages sudo-1.8.25p1-4.el8.x86_64
Target RPM Packages
Policy RPM selinux-policy-3.14.1-61.el8.noarch
Selinux Enabled True
Policy Type targeted
Enforcing Mode Enforcing
Host Name jcfaracco#hostname
Platform Linux jcfaracco#hostname 4.18.0-80.el8.x86_64 #1
SMP Wed Mar 13 12:02:46 UTC 2019 x86_64 x86_64
Alert Count 9
First Seen 2019-06-14 19:32:42 -03
Last Seen 2019-06-14 19:42:46 -03
Local ID 772b2c41-2302-4ee0-8886-52789eb63e22
Raw Audit Messages
type=AVC msg=audit(1560552166.658:199): avc: denied { getattr } for pid=2291 comm="sudo" path="/dev/hpet" dev="devtmpfs" ino=10776 scontext=system_u:system_r:xdm_t:s0-s0:c0.c1023 tcontext=system_u:object_r:clock_device_t:s0 tclass=chr_file permissive=0
type=SYSCALL msg=audit(1560552166.658:199): arch=x86_64 syscall=stat success=no exit=EACCES a0=7ffd4a6dffb0 a1=7ffd4a6def20 a2=7ffd4a6def20 a3=7fe845a73181 items=0 ppid=1756 pid=2291 auid=4294967295 uid=982 gid=980 euid=0 suid=0 fsuid=0 egid=980 sgid=980 fsgid=980 tty=tty1 ses=4294967295 comm=sudo exe=/usr/bin/sudo subj=system_u:system_r:xdm_t:s0-s0:c0.c1023 key=(null)ARCH=x86_64 SYSCALL=stat AUID=unset UID=gnome-initial-setup GID=gnome-initial-setup EUID=root SUID=root FSUID=root EGID=gnome-initial-setup SGID=gnome-initial-setup FSGID=gnome-initial-setup
Hash: sudo,xdm_t,clock_device_t,chr_file,getattr
Do you know how to fix this issue? Any other idea to call dracut inside a C code is welcome too. In case of any other smart way to perform this issue.

Yocto: Adding Kernel Module to Image

I added iptables package to my device image, using CORE_IMAGE_EXTRA_INSTALL += "iptables".
I tried to run it on the device and get the following error message:
modprobe: FATAL: Module ip_tables not found in directory /lib/modules/4.9.11-1.0.0+gc27010d
iptables v1.6.1: can't initialize iptables table `filter': Table does not exist (do you need to insmod?)
Perhaps iptables or your kernel needs to be upgraded.
Seems like I have missing kernel module.
Need your help in adding standard kernel module to the image (Where can I find all modules files and how should I add and load it to the image).

You must add the iptables module to your kernel. I had the same problem and I could solve it with these steps:
Run bitbake -c menuconfig virtual/kernel
Activate CONFIG_IP_NF_IPTABLES module (you can search its location on that menu typing slash '/').
Save it and run bitbake -c savedefconfig virtual/kernel for saving that file as a defconfig.
Copy defconfig file from returned path to yocto-distro/layer-name/recipes-kernel/linux/files/ (make this directory if it does not exist).
Create a .bbappend file inside yocto-distro/layer-name/recipes-kernel/linux/ with the same name as your original recipe file from meta layer.
Edit your file and append lines below:
SRC_URI += "file://defconfig"
KERNEL_DEFCONFIG = "${WORKDIR}/defconfig"
FILESEXTRAPATHS_prepend := "${THISDIR}/files"
~
Relaunch bitbake your-image-name
It worked on my situation. Btw, I got that info from the following webs:
http://variwiki.com/index.php?title=Yocto_Customizing_the_Linux_kernel
https://www.linuxtopia.org/Linux_Firewall_iptables/x651.html
Have a nice day! :D

I've added a MAX7320 i2c output chip. How can I get the kernel to load the driver for it?

I've added a MAX7320 i2c expander chip to i2c bus 0 on my ARM Linux board.
The chip works correctly from userspace with commands such as /usr/sbin/i2cset -y 0 0x5d 0x02 and /usr/sbin/i2cget -y 0 0x5d.
There is a drivers/gpio/gpio-max732x.c file in the kernel source, which is compiled into the kernel that I'm running. (I've built it from source.)
How do I tell the kernel that it should instantiate the gpio-max732x driver on "i2c bus 0, chip id 0x5d"?
Do I need to modify the device tree .dts file and put a new .dtb file in /boot/dtbs/?
What would the clause for instantiating a gpio-max732x module look like?
P.S. I've seen https://lkml.org/lkml/2015/1/13/305 but I can't figure out how to get the patch files.

Device Tree
There must be appropriate Device Tree definition for your chip, in order for driver to instantiate. There are 2 ways to do so:
Modify .dts Device Tree file for your board (look in arch/arm/boot/dts/), then recompile it and re-flash it to your device.
This way is preferred in case when you have access you kernel sources for your board and you are able to re-flash .dtb file to your device.
Create Device Tree Overlay file, compile it and load it on your device.
This way is preferred when you don't have access to kernel sources for your board, or you are unable to flash new device tree blob to your device.
Your device definition in Device Tree should look like (according to Documentation/devicetree/bindings/gpio/gpio-max732x.txt):
&i2c0 {
expander: max7320#5d {
compatible = "maxim,max7320";
reg = <0x5d>;
gpio-controller;
#gpio-cells = <2>;
};
};
Kernel configuration
As your expander chip (MAX7320) has no input GPIOs, you don't need IRQ support for MAX732x. So you can disable CONFIG_GPIO_MAX732X_IRQ in your kernel configuration.
Matching device with driver
Once you have your Device Tree loaded (with definition for MAX7320), MAX732x driver will be matched with device definition, and instantiated. Below is explained how matching happens.
In Device Tree file you have compatible property:
compatible = "maxim,max7320";
In MAX732x driver you can see this table:
static const struct of_device_id max732x_of_table[] = {
...
{ .compatible = "maxim,max7320" },
...
When driver is being loaded, and when Device Tree blob is being loaded, kernel tries to find the match for each driver and Device Tree definition. Just by comparing strings above. If strings are matched -- kernel instantiates driver, passing corresponding device parameters to it. Look at i2c_device_match() function for details.
Obtaining patches
The best way is to use kernel sources that already have Device Tree support of MAX732x (v4.0+). But if it's not the case, then...
You can cherry-pick patches from upstream kernel to your kernel:
$ git remote add upstream git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
$ git fetch --all
$ git cherry-pick 43c4bcf9425e
$ git cherry-pick 479f8a5744d8
$ git cherry-pick 09afa276d52e
$ git cherry-pick 996bd13f28e6
And if you still want to apply patches manually (worst option, actually), here you can find direct links to patches. Click (patch) link to get a raw patch.
Also check later patches for gpio-max732x.c here.
Hardware concerns
To be sure that your chip has 0x5d I2C address, check that configuration pins are tied to next lines (as per datasheet):
Pin Line
-----------
AD2 V+
AD0 V+

Profiling sleep times with perf

I was looking for a way to find out where my program spends time. I read the perf tutorial and tried to profile sleep times as it is described there. I wrote the simplest possible program to profile:
#include <unistd.h>
int main() {
sleep(10);
return 0;
}
then I executed it with perf:
$ sudo perf record -e sched:sched_stat_sleep -e sched:sched_switch -e sched:sched_process_exit -g -o ~/perf.data.raw ./a.out
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB /home/pablo/perf.data.raw (~578 samples) ]
$ sudo perf inject -v -s -i ~/perf.data.raw -o ~/perf.data
build id event received for [kernel.kallsyms]: d62870685909222126e7070d2bafdf029f7ed3b6
failed to write feature 2
$ sudo perf report --stdio --show-total-period -i ~/perf.data
Error:
The /home/pablo/perf.data file has no samples!
Does anybody know how to avoid these errors? What do they mean? failed to write feature 2 doesn't look too user-friendly...
Update:
$ uname -a
Linux debian 3.12-1-amd64 #1 SMP Debian 3.12.9-1 (2014-02-01) x86_64 GNU/Linux

There is a error message from your second perf command from https://perf.wiki.kernel.org/index.php/Tutorial#Profiling_sleep_times - perf inject -s
$ sudo perf inject -v -s -i ~/perf.data.raw -o ~/perf.data
build id event received for [kernel.kallsyms]: d62870685909222126e7070d2bafdf029f7ed3b6
failed to write feature 2
failed to write feature 2 doesn't look too user-friendly...
... but it was added to perf to made errors more user-friendly: http://lwn.net/Articles/460520/ "perf: make perf.data more self-descriptive (v5)" by Stephane Eranian , 22 Sep 2011:
+static int do_write_feat(int fd, struct perf_header *h, int type, ....
+ pr_debug("failed to write feature %d\n", type);
All features are listed here http://lxr.free-electrons.com/source/tools/perf/util/header.h#L13
15 HEADER_TRACING_DATA = 1,
16 HEADER_BUILD_ID,
So, it sounds like perf inject was not able to write information about build ids (error from function write_build_id() from util/header.c) if I'm not wrong. There are two cases which can lead to error: unsuccessful call to perf_session__read_build_ids() or failing in writing buildid table dsos__write_buildid_table (this is not our case because there is no "failed to write buildid table" error message; check write_build_id)
You may check, do you have all buildids needed for the session. Also it may be useful to clear your buildid cache (rm -rf ~/.debug), and check that you have up-to-date vmlinux with debugging info or kallsyms enabled in your kernel.
UPDATE: in comments Pavel says that his pref record had no any sched:sched_stat_sleep events written to perf.data:
sudo perf record -e sched:sched_stat_sleep -e sched:sched_switch -e sched:sched_process_exit -g -o ~/perf.data.raw ./a.out
As he explains in his answer, his default debian kernel have CONFIG_SCHEDSTATS option disabled with vendor's patch. The redhat did the same thing with the option in release kernels since 3.11, and this is explained in Redhat Bug 1013225 (Josh Boyer 2013-10-28, comment 4):
We switched to enabling that only on debug builds a while ago. It seems that was turned off entirely with the final 3.11.0 build and has remained off since. Internal testing shows the option has a non-trivial performance impact for context switches.
We can turn this on in debug kernels again, but I'm not sure it's worthwhile.
Josh Poimboeuf 2013-11-04 in comment 8 says that performance impact is detectable:
In my tests I did a lot of context switches under various CPU loads. I saw a ~5-10% drop in average context switch speed when CONFIG_SCHEDSTATS was enabled. ...The performance hit only seemed to happen on post-CFS kernels (>= 2.6.23). The previous O(1) scheduler didn't seem to have this issue.
Fedora disabled CONFIG_SCHEDSTAT in non-debug kernels at 12 July 2013 "[kernel] Disable LATENCYTOP/SCHEDSTATS in non-debug builds." by Dave Jones. First kernel with disabled option: 3.11.0-0.rc0.git6.4.
In order to use any perf software tracepoint event with name like sched:sched_stat_* (sched:sched_stat_wait, sched:sched_stat_sleep, sched:sched_stat_iowait) we must recompile kernel with CONFIG_SCHEDSTATS option enabled and replace default Debian, RedHat or Fedora kernels which have no this option.
Thank you, Pavel Davydov.

I finally found out how to make it work. The problem was that the default debian kernel is built without some config options, that perf needs to be able to monitor sleep times. It looks like CONFIG_SCHEDSTATS should be enabled to make kernel collect scheduler statistics. This is told to have some runtime overhead. Also I enabled CONFIG_SCHED_TRACER and some lock tracing options, but I'm not sure if they matter in my case. Anyway, no statistic data is collected in scheduler without CONFIG_SCHEDSTATS (see kernel/sched/ directory of kernel source).
Also, there is a very good article about perf written by Brendan Gregg, with a lot of usefull examples and some kernel options that are needed to make perf work properly.
Update: I checked the history of CONFIG_SCHEDSTATS in debian. I've checked out debian kernel patches and build scripts repo:
svn checkout svn://svn.debian.org/svn/kernel/dists/trunk/linux/debian
And then found CONFIG_SCHEDSTATS option there
$ grep -R CONFIG_SCHEDSTAT config/
config/config:# CONFIG_SCHEDSTATS is not set
This string was added to the repo in commit 10837, on 2008-03-14, with comment "debian/config: Do complete reorganization". Also, in this and this (thanks to osgx) bug reports it is told that CONFIG_LATENCYTOP, CONFIG_SCHEDSTATS options are not enabled because they can affect kernel perfomance. So, I think it just was never switched on in default debian kernels. I haven't found the discussion about scheduler stats option, though. If I do, I will write back here.

This works for me for "perf version 3.11.1" on an "openSUSE 13.1 (x86_64)" box.
Here is the output if you care:
# ========
# captured on: Sun Feb 16 09:49:38 2014
# hostname : *****************
# os release : 3.11.10-7-desktop
# perf version : 3.11.1
# arch : x86_64
# nrcpus online : 8
# nrcpus avail : 8
# cpudesc : Intel(R) Core(TM) i7-3840QM CPU # 2.80GHz
# cpuid : GenuineIntel,6,58,9
# total memory : 32945368 kB
# cmdline : /usr/bin/perf inject -v -s -i perf.data.raw -o perf.data
# event : name = sched:sched_stat_sleep, type = 2, config = 0x48, config1 = 0x0, config2 = 0x
# event : name = sched:sched_switch, type = 2, config = 0x51, config1 = 0x0, config2 = 0x0, e
# event : name = sched:sched_process_exit, type = 2, config = 0x4e, config1 = 0x0, config2 =
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, software = 1, tracepoint = 2, uncore_cbox_0 = 6, uncore_cbox_1 = 7,
# ========
#
# Samples: 0 of event 'sched:sched_stat_sleep'
# Event count (approx.): 0
#
# Overhead Period Command Shared Object Symbol
# ........ ............ ....... ............. ......
#
# Samples: 8 of event 'sched:sched_switch'
# Event count (approx.): 80099958776
#
# Overhead Period Command Shared Object Symbol
# ........ ............ ....... ................. .................
#
100.00% 80099958776 bla [kernel.kallsyms] [k] thread_return
|
--- thread_return
thread_return
do_nanosleep
hrtimer_nanosleep
SyS_nanosleep
system_call_fastpath
0x7fbc0dec6570
__GI___libc_nanosleep
(nil)
# Samples: 0 of event 'sched:sched_process_exit'
# Event count (approx.): 0
#
# Overhead Period Command Shared Object Symbol
# ........ ............ ....... ............. ......
#
#
# (For a higher level overview, try: perf report --sort comm,dso)
#
}

setting up passive checks on nagios

hello board this question may be a little clean and green however,
I've been trying to set up Nagios NSCA for passive checks on a local ubuntu box as a prototype.
for those in the know, my nsca listening on 5667 and send_nsca is on the same ubuntu computer (localhost 127.0.0.1) . I've been reading and testing object definitions and service templates however I have been getting config errors when i try to access nagios web after modifications.
I hope to get clearer instructions on how I can create the service (directories/configurations) to process passive checks in Nagio3 for ubuntu.

There are a few things to consider, firstly that localhost is defined as a host and secondly that the check actually exists as it would for any other check but with a command that doesn't actually do anything, for example.. I've created a passiveservices.cfg file with services defined as follows:
define service{
use generic-service,service-pnp
host_name Server1,Server2
service_description Uptime
active_checks_enabled 1
passive_checks_enabled 1
check_command check_null
check_freshness 1
check_period none
}
define service{
use generic-service,service-pnp
host_name Server1,Server2
service_description Drive space
active_checks_enabled 1
passive_checks_enabled 1
check_command check_null
check_freshness 1
check_period none
Note that the check command is check_null, it's not actually doing anything.. and passive_checks_enabled is 1.
There are two lines within Nagios.cfg which you need to enable:
accept_passive_host_checks
accept_passive_service_checks
It's also a good idea to enable the following two lines aswell
check_service_freshness
check_host_freshness
If a server doesn't poll in after a set amount of time, it'll trigger a script (I trigger an email within my config)
Lastly, enable the following two lines:
log_external_commands
log_passive_checks
They'll help with debugging if this doesn't work. It writes out to /var/log/syslog on Ubuntu (well, it does on mine)..

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight