Autonomously sending a message from a kernel module to a user-space application without relying on the invocation of the .input callback from user space - C

I will give a detailed explanation of the program and lead up to the issue regarding netlink socket communication.
The last paragraph asks the actual question I need an answer for, so you might want to peek at it first.
Disclaimer before I start:
- I searched before asking here and did not find a complete solution or alternative for my issue.
- I know how to initialize a module and insert it into the kernel.
- I know how to handle communication between a module and user space without netlink sockets, i.e. by assigning struct file_operations function pointers that the module invokes whenever a user attempts to read/write etc., and answering the user with copy_to_user / copy_from_user.
- This topic refers to Linux, the Mint 17 distribution.
- The language is C.
Okay, so I am building a system with 3 components:
1. user.c : user application (user types commands here)
2. storage.c : storage device ('virtual' disk-on-key)
3. device.ko : kernel module (used as proxy between 1. and 2.)
The purpose of this system is to be able (as a user) to:
- Copy files to the virtual disk-on-key device (2), like an "upload" from a local directory that belongs to the user.
- Save files from the virtual device to a local directory, like a "download" from the device's storage to the user's directory.
Design:
Assuming programs (1) and (2) are compiled and running, and (3) has been successfully inserted using the bash command 'sudo insmod device.ko', the flow should work like this (a simulation, of course):
Step 1 (in user.c) -> the user types 'download file.txt'
Step 2 (in device.ko) -> the device recognizes the user has tried to 'write' to it (actually the user is just passing the string "download file.txt") and invokes the 'write' implementation we set on struct file_operations earlier in module_init().
The device (kernel module) now passes the data (a string with a command) to the storage.c application, expecting an answer to later be returned to the user.c application.
Step 3 (in storage.c) -> let's say this program runs a busy-wait loop around recvmsg(), and that is how a request-from-module event is triggered and recognized; the storage device now knows the module has sent a request (a string with a command / data). The storage program should then run some function 'X' that sends the requested data back using sendmsg() somewhere inside that function (a rough sketch of this loop is shown right below).
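For context, a rough sketch of what the storage.c side of Step 3 could look like; the netlink protocol number, buffer size and prints are illustrative and must match whatever device.ko registers with netlink_kernel_create():

#include <linux/netlink.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#define NETLINK_MY_PROTO 31   /* assumption: must match device.ko's protocol */
#define MAX_PAYLOAD 1024

int main(void)
{
    int sock_fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_MY_PROTO);
    struct sockaddr_nl src_addr = { .nl_family = AF_NETLINK, .nl_pid = getpid() };
    bind(sock_fd, (struct sockaddr *)&src_addr, sizeof(src_addr));

    char buf[NLMSG_SPACE(MAX_PAYLOAD)];
    struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
    struct sockaddr_nl dst_addr = { .nl_family = AF_NETLINK }; /* nl_pid 0 = kernel */
    struct msghdr msg = {
        .msg_name = &dst_addr, .msg_namelen = sizeof(dst_addr),
        .msg_iov = &iov, .msg_iovlen = 1,
    };

    for (;;) {                       /* the receive loop from Step 3 */
        recvmsg(sock_fd, &msg, 0);   /* blocks until the module sends a request */
        struct nlmsghdr *nlh = (struct nlmsghdr *)buf;
        printf("request from module: %s\n", (char *)NLMSG_DATA(nlh));
        /* ... look up the requested file and sendmsg() the answer back ... */
    }
}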
Now, here comes the issue.
Usually, in all the examples I've seen on the web, netlink communication between the kernel module and a user-space program (the storage.c program in our case) is triggered by user space and not vice versa. That is, the sendmsg() call from user space invokes the 'request(struct sk_buff *skb)' callback, which is set during module_init() as follows:
struct netlink_kernel_cfg cfg = {
    .input = request, // when storage.c sends something, it invokes the request function
};
so when storage.c performs something like:
sendmsg(sock_fd, &msg, 0); // send a msg to the module
the module invokes and runs:
static void request(struct sk_buff *skb)
{
    char *msg = "Hello from kernel";

    msg_size = strlen(msg);
    netlink_holder = (struct nlmsghdr *)skb->data;
    printk(KERN_INFO "Netlink received msg payload: %s\n", (char *)nlmsg_data(netlink_holder));

    pid = netlink_holder->nlmsg_pid; // pid of the sending process

    skb_out = nlmsg_new(msg_size, 0);
    if (!skb_out) {
        printk(KERN_ERR "Failed to allocate new skb\n");
        return;
    }

    netlink_holder = nlmsg_put(skb_out, 0, 0, NLMSG_DONE, msg_size, 0); // add a new netlink message to the skb. more info: http://elixir.free-electrons.com/linux/v3.2/source/include/net/netlink.h#L491
    NETLINK_CB(skb_out).dst_group = 0; // not in a multicast group
    strncpy(nlmsg_data(netlink_holder), msg, msg_size); // copy the payload (variable msg)

    result = nlmsg_unicast(sock_netlink, skb_out, pid); // send data to storage. more info: http://elixir.free-electrons.com/linux/latest/source/include/net/netlink.h#L598
    if (result < 0)
        printk(KERN_INFO "Error while sending back to user\n");
}
and out of that whole chunk, the only thing I'm actually interested in is this:
result = nlmsg_unicast(sock_netlink, skb_out, pid); // send data to storage.
BUT I can't use nlmsg_unicast() without having the struct sk_buff *, which is only provided for me automatically whenever there is an invocation from storage.c!
To sum everything up:
How do I send a message from device.ko (the kernel module) to user space without having to wait for a request to arrive, i.e. without relying on the struct sk_buff parameter provided to the 'request()' callback shown earlier?
Hope this sums up the point.
Thanks.

The only catch here is that you need the user-space program to connect to kernel space first, so that you can get the PID of your user program.
After getting the PID, you can manually construct skb_out and send it out through netlink_unicast() or nlmsg_unicast().
The PID is always needed; you can store it in a static variable and have your user-space program connect to your device.ko once, to maintain a long-lived link.
Although this question was asked in 2017, I believe the OP has already found the answer :D
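For illustration, here is a minimal sketch of that idea on the kernel side. The helper name push_to_storage() and the stripped-down request() handler are mine, not the OP's code; the netlink socket is assumed to be created in module_init() exactly as in the question:

#include <linux/module.h>
#include <linux/netlink.h>
#include <linux/string.h>
#include <net/netlink.h>
#include <net/sock.h>

/* sock_netlink is assumed to be created in module_init() with
 * netlink_kernel_create(&init_net, <your protocol>, &cfg), as in the question. */
static struct sock *sock_netlink;
static u32 storage_pid;   /* PID learned from storage.c's first sendmsg() */

/* .input callback: only used here to learn (register) the storage.c PID. */
static void request(struct sk_buff *skb)
{
    struct nlmsghdr *nlh = (struct nlmsghdr *)skb->data;

    storage_pid = nlh->nlmsg_pid;   /* remember who to unicast to later */
    printk(KERN_INFO "storage.c registered, pid=%u\n", storage_pid);
}

/* Callable from anywhere in the module (e.g. the file_operations write
 * handler, a timer, a workqueue), with no incoming sk_buff needed. */
static int push_to_storage(const char *msg)
{
    size_t len = strlen(msg);
    struct sk_buff *skb_out;
    struct nlmsghdr *nlh;

    if (!storage_pid)
        return -ENOTCONN;            /* storage.c has not registered yet */

    skb_out = nlmsg_new(len, GFP_KERNEL);
    if (!skb_out)
        return -ENOMEM;

    nlh = nlmsg_put(skb_out, 0, 0, NLMSG_DONE, len, 0);
    if (!nlh) {
        kfree_skb(skb_out);
        return -EMSGSIZE;
    }
    NETLINK_CB(skb_out).dst_group = 0;   /* unicast, not a multicast group */
    memcpy(nlmsg_data(nlh), msg, len);

    return nlmsg_unicast(sock_netlink, skb_out, storage_pid);
}

With this layout, the write handler from Step 2 can simply call push_to_storage("download file.txt") whenever the module decides on its own to notify storage.c.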

Related

change file descriptor without re-initializing the handle of uv_poll_t type

I have an application project running in a Linux environment which includes libuv and another third-party library. The third-party library provides APIs for starting a TCP connection to a remote server (say xxx_connect()) and for getting the file descriptor of the active connection (say xxx_get_socket()). So far I have managed to get a valid file descriptor from xxx_get_socket() after xxx_connect() completed successfully, and to initialize a uv_poll_t handle with that file descriptor in my program.
Currently I am working on a reconnect function. After reconnecting to the same server (by running xxx_connect() again), xxx_get_socket() returns a different file descriptor, which means it is necessary to update the io_watcher.fd member of the uv_poll_t handle to receive data on the new active connection.
AFAIK uv_poll_init() internally invokes uv__io_check_fd(), uv__nonblock() and uv__io_init(), so it seems possible to modify io_watcher.fd of a uv_poll_t handle without closing the handle and then initializing it again (see the sample code below), which would avoid the extra latency. However, I'm not sure whether it is safe to do so; I don't know whether the io_watcher.fd member of a uv_poll_t handle is referenced elsewhere in libuv (e.g. in uv_run()), which makes things more complex. Is my approach feasible, or should I re-initialize the uv_poll_t handle in such a case? I appreciate any feedback.
Possible approach , simplified sample code :
int uv_poll_change_fd( uv_poll_t *handle, int new_fd ) {
    if (uv__fd_exists(handle->loop, new_fd))
        // ..... some code ....
    err = uv__io_check_fd(handle->loop, new_fd);
    if (err)
        // ..... some code ....
    err = uv__nonblock(new_fd, 1);
    // ..... some code ....
    handle->io_watcher.fd = new_fd;
}

how to fix 'invalid argument' for ioctl requests to block device

I'm writing a small C program to make tape status and seek requests via
ioctl(int fd, long int request, &io_buf)
but after trial and plenty of error, ioctl returns -1 with the errno message "Invalid argument".
I'm on Linux and running my program as sudo. The device I want to issue requests to is an optical drive connected via SCSI. I've tried tape status and seek requests by passing the corresponding request codes (MTIOCGET and MTIOCTOP, respectively) to ioctl.
Code snippet for the tape status function, where fd is the file descriptor of the device returned by open() and mtgetbuf is an instance of the mtget struct from sys/mtio.h:
stat = ioctl(fd, MTIOCGET, &mtgetbuf);
if (stat == -1)
{
    perror("error on ioctl MTIOCGET request: ");
    return EXIT_FAILURE;
}
A similar code snippet for the tape seek function, except mtopbuf is an instance of the mtop structure and MTSEEK is the defined op code for the seek operation, also from sys/mtio.h:
mtopbuf.mt_op = MTSEEK;
stat = ioctl(fd, MTIOCTOP, &mtopbuf);
if (stat == -1)
{
    perror("error on ioctl MTIOCTOP request: ");
    return EXIT_FAILURE;
}
Instead of "Invalid argument" errors and a return of -1, I would have expected ioctl to return successfully and the respective structure instances, mtgetbuf and mtopbuf, to have their members populated with data provided by the device.
I.e. a successful ioctl() call with the MTIOCGET request would fill the mt_type member of mtgetbuf with a value of either MT_ISSCSI1, MT_ISSCSI2, or MT_ISUNKNOWN (I don't believe it is any of the other defined values for other vendor-specific devices).
Note: I'm aware of the linux/mtio.h header file and I have tried including it in place of sys/mtio.h, but the outcome is the same.
I've recently had success issuing requests to a block device using the Linux SCSI Generic (sg) driver. The three header files below provide op codes, structures used to pass data to and retrieve data from the device, and other information.
SCSI SG Header files:
/usr/include/scsi/scsi.h
/usr/include/scsi/scsi_ioctl.h
/usr/include/scsi/sg.h
A combination of online resources was instrumental in understanding how to package, send, and receive requests:
1) The TLDP SCSI Generic (sg) HOWTO guide is a font of information on communicating with SCSI devices via the SG driver. A link to it is provided here. It explains in detail the various commands that can be issued, how to package a command by creating an instance of the sg_io_hdr_t structure, and includes a programming example that sends a SCSI INQUIRY command, which returns basic vendor information about the device. There are also status and sense codes for error handling and for understanding unsuccessful SCSI requests.
2) Seagate's SCSI Command Reference Manual was helpful at times for understanding the structure of the bytes/bits in a SCSI command. Typically the op code occupies the first byte and the remaining bytes are zeros. The op codes from this reference manual are defined in the three header files mentioned above.
I have been able to send successful INQUIRY and GET_SG_VERSION_NUMBER requests, and most likely SEEK(6), READ_CAPACITY(10), and REZERO_UNIT commands as well. I say most likely because no -1/errno values are being returned and no information is being passed back in the sense buffer, which would be an indication of warnings/errors (SCSI, host adapter, or driver status codes).
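For reference, a minimal sketch of the INQUIRY example along the lines the sg HOWTO describes might look like this; the device path, buffer sizes and timeout are illustrative:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>
#include <unistd.h>

int main(void)
{
    unsigned char inquiry_cdb[6] = { 0x12, 0, 0, 0, 96, 0 }; /* INQUIRY, 96-byte allocation length */
    unsigned char inq_buf[96];
    unsigned char sense_buf[32];
    sg_io_hdr_t io_hdr;

    int fd = open("/dev/sg0", O_RDWR);   /* adjust to your device */
    if (fd < 0) { perror("open"); return 1; }

    memset(&io_hdr, 0, sizeof(io_hdr));
    io_hdr.interface_id = 'S';
    io_hdr.cmd_len = sizeof(inquiry_cdb);
    io_hdr.cmdp = inquiry_cdb;
    io_hdr.dxfer_direction = SG_DXFER_FROM_DEV;
    io_hdr.dxferp = inq_buf;
    io_hdr.dxfer_len = sizeof(inq_buf);
    io_hdr.sbp = sense_buf;              /* sense data lands here on errors */
    io_hdr.mx_sb_len = sizeof(sense_buf);
    io_hdr.timeout = 5000;               /* milliseconds */

    if (ioctl(fd, SG_IO, &io_hdr) < 0) { perror("SG_IO"); close(fd); return 1; }

    /* Bytes 8-15 of a standard INQUIRY response hold the vendor id. */
    printf("vendor: %.8s\n", inq_buf + 8);
    close(fd);
    return 0;
}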
Hope this answers the OP's question.

How does the Linux Kernel know which file descriptor to write input events to?

I would like to know the mechanism in which the Linux Kernel knows which file descriptor (e.g. /dev/input/eventX) to write the input to. For example, I know that when the user clicks the mouse, an interrupt occurs, which gets handled by the driver and propagated to the Linux input core via input_event (drivers/input/input.c), which eventually gets written to the appropriate file in /dev/input/. Specifically, I want to know which source files I need to go through to see how the kernel knows which file to write to based on the information given about the input event. My goal is to see if I can determine the file descriptors corresponding to specific input event codes before the kernel writes them to the /dev/input/eventX character files.
You may go through two files:
drivers/input/input.c
drivers/input/evdev.c
In evdev.c, evdev_init() calls input_register_handler() to add the evdev handler to input_handler_list.
Then, in an input device driver, after initializing its input_dev, the driver calls:
input_register_device(input_dev)
-> get device kobj path, like /devices/soc/78ba000.i2c/i2c-6/6-0038/input/input2
-> input_attach_handler()
-> handler->connect(handler, dev, id);
-> evdev_connect()
In evdev_connect(), the following happens:
1. Dynamically allocate a minor for the new evdev.
2. dev_set_name(&evdev->dev, "event%d", dev_no);
3. Call input_register_handle() to connect the input_dev and evdev->handle.
4. Create a cdev and call device_add().
After this, you will find the input node /dev/input/eventX, where X is the value of dev_no.
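To make the chain concrete, a hedged sketch of the driver side that kicks it off could look like the following; registering an input_dev is what ultimately reaches evdev_connect() and creates /dev/input/eventX (the device name and key used here are illustrative):

#include <linux/bitops.h>
#include <linux/input.h>
#include <linux/module.h>

static struct input_dev *demo_dev;

static int __init demo_init(void)
{
    demo_dev = input_allocate_device();
    if (!demo_dev)
        return -ENOMEM;

    demo_dev->name = "demo-button";
    __set_bit(EV_KEY, demo_dev->evbit);      /* this device reports key events */
    __set_bit(KEY_ENTER, demo_dev->keybit);

    /* -> input_attach_handler() -> evdev_connect() -> /dev/input/eventX */
    return input_register_device(demo_dev);
}

static void __exit demo_exit(void)
{
    /* After a successful register, unregister drops the reference;
     * do not call input_free_device() here. */
    input_unregister_device(demo_dev);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");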

Get signal level of the connected WiFi network

Using wpa_supplicant 2.4 on ARM Debian.
Is there a way to get the signal level, in decibels or percent, of the wireless network I'm currently connected to?
The STATUS command only returns the following set of values: bssid, freq, ssid, id, mode, pairwise_cipher, group_cipher, key_mgmt, wpa_state, ip_address, p2p_device_address, address, uuid
I can run SCAN afterwards, wait for the results and search by SSID, but that's slow and error-prone; I'd like to do better.
The driver should already know that information (because it is connected and adjusts transmit levels for energy saving); is there a way to just query for it?
This question is not about general computing hardware and software. I'm using wpa_supplicant through a C API defined in wpa_ctrl.h header, interacting with the service through a pair of unix domain sockets (one for commands, another one for unsolicited events).
One reason I don't like my current SCAN + SCAN_RESULTS solution is that it doesn't work for hidden-SSID networks: the scan doesn't find the network, so I don't get the signal level this way. Another issue is a minor visual glitch at application startup. My app is launched by systemd, After=multi-user.target. Unless it's the very first launch, Linux is already connected to Wi-Fi by then. In my app's GUI (the product will feature a touch screen) I render a phone-like status bar that includes a Wi-Fi signal strength icon. Currently it initially shows the minimal level (I know it's connected because the STATUS command shows the SSID); only after ~1 second do I get the CTRL-EVENT-SCAN-RESULTS event from wpa_supplicant, run the SCAN_RESULTS command and update the signal strength to the correct value.
At the API level my code is straightforward. I have two threads, both of which call wpa_ctrl_open(); the command thread calls wpa_ctrl_request(), and the event thread has an endless loop that calls poll() on the wpa_ctrl_get_fd() descriptor with a POLLIN event mask, followed by wpa_ctrl_pending() and wpa_ctrl_recv().
And here's the list of files in /sys/class/net/wlan0:
./mtu
./type
./phys_port_name
./netdev_group
./flags
./power/control
./power/async
./power/runtime_enabled
./power/runtime_active_kids
./power/runtime_active_time
./power/autosuspend_delay_ms
./power/runtime_status
./power/runtime_usage
./power/runtime_suspended_time
./speed
./dormant
./name_assign_type
./proto_down
./addr_assign_type
./phys_switch_id
./dev_id
./duplex
./gro_flush_timeout
./iflink
./phys_port_id
./addr_len
./address
./operstate
./carrier_changes
./broadcast
./queues/rx-0/rps_flow_cnt
./queues/rx-0/rps_cpus
./queues/rx-1/rps_flow_cnt
./queues/rx-1/rps_cpus
./queues/rx-2/rps_flow_cnt
./queues/rx-2/rps_cpus
./queues/rx-3/rps_flow_cnt
./queues/rx-3/rps_cpus
./queues/tx-0/xps_cpus
./queues/tx-0/tx_maxrate
./queues/tx-0/tx_timeout
./queues/tx-0/byte_queue_limits/limit
./queues/tx-0/byte_queue_limits/limit_max
./queues/tx-0/byte_queue_limits/limit_min
./queues/tx-0/byte_queue_limits/hold_time
./queues/tx-0/byte_queue_limits/inflight
./queues/tx-1/xps_cpus
./queues/tx-1/tx_maxrate
./queues/tx-1/tx_timeout
./queues/tx-1/byte_queue_limits/limit
./queues/tx-1/byte_queue_limits/limit_max
./queues/tx-1/byte_queue_limits/limit_min
./queues/tx-1/byte_queue_limits/hold_time
./queues/tx-1/byte_queue_limits/inflight
./queues/tx-2/xps_cpus
./queues/tx-2/tx_maxrate
./queues/tx-2/tx_timeout
./queues/tx-2/byte_queue_limits/limit
./queues/tx-2/byte_queue_limits/limit_max
./queues/tx-2/byte_queue_limits/limit_min
./queues/tx-2/byte_queue_limits/hold_time
./queues/tx-2/byte_queue_limits/inflight
./queues/tx-3/xps_cpus
./queues/tx-3/tx_maxrate
./queues/tx-3/tx_timeout
./queues/tx-3/byte_queue_limits/limit
./queues/tx-3/byte_queue_limits/limit_max
./queues/tx-3/byte_queue_limits/limit_min
./queues/tx-3/byte_queue_limits/hold_time
./queues/tx-3/byte_queue_limits/inflight
./tx_queue_len
./uevent
./statistics/rx_fifo_errors
./statistics/collisions
./statistics/rx_errors
./statistics/rx_compressed
./statistics/rx_dropped
./statistics/tx_packets
./statistics/tx_errors
./statistics/rx_missed_errors
./statistics/rx_over_errors
./statistics/tx_carrier_errors
./statistics/tx_heartbeat_errors
./statistics/rx_crc_errors
./statistics/multicast
./statistics/tx_fifo_errors
./statistics/tx_aborted_errors
./statistics/rx_bytes
./statistics/tx_compressed
./statistics/tx_dropped
./statistics/rx_packets
./statistics/tx_bytes
./statistics/tx_window_errors
./statistics/rx_frame_errors
./statistics/rx_length_errors
./dev_port
./ifalias
./ifindex
./link_mode
./carrier
You can get the signal level of the connected Wi-Fi with the wpa_supplicant command SIGNAL_POLL.
wpa_supplicant will return something like:
RSSI=-60
LINKSPEED=867
NOISE=9999
FREQUENCY=5745
The value of the RSSI is the signal level.
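Since the question already talks to wpa_supplicant through the wpa_ctrl.h API, issuing SIGNAL_POLL over the command socket could look roughly like this; the control socket path is an assumption, adjust it to your setup:

#include <stdio.h>
#include <string.h>
#include "wpa_ctrl.h"

int get_rssi(void)
{
    char reply[512];
    size_t reply_len = sizeof(reply) - 1;
    int rssi = 0;

    struct wpa_ctrl *ctrl = wpa_ctrl_open("/var/run/wpa_supplicant/wlan0");
    if (!ctrl)
        return 0;

    if (wpa_ctrl_request(ctrl, "SIGNAL_POLL", strlen("SIGNAL_POLL"),
                         reply, &reply_len, NULL) == 0) {
        reply[reply_len] = '\0';
        /* The reply looks like "RSSI=-60\nLINKSPEED=867\n..." */
        char *p = strstr(reply, "RSSI=");
        if (p)
            sscanf(p, "RSSI=%d", &rssi);
    }

    wpa_ctrl_close(ctrl);
    return rssi;   /* dBm, or 0 if it could not be read */
}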
You can also get the signal level of the connected Wi-Fi with the wpa_supplicant command BSS <bssid>.
You can get the BSSID of the connected Wi-Fi from the wpa_supplicant STATUS command.
https://android.googlesource.com/platform/external/wpa_supplicant_8/+/622b66d6efd0cccfeb8623184fadf2f76e7e8206/wpa_supplicant/ctrl_iface.c#1986
For iw-compatible devices:
The following command gives the signal strength of the current station (aka the AP):
iw dev wlp2s0 station dump -v
If you need a C API, just dig into the source code of iw.
After a quick glance, the function you need is here
For Broadcom devices, try searching for "broadcom wl". It is closed source; I don't know whether a C API is provided.

How to create multiple network namespace from a single process instance

I am using the following C function to create multiple network namespaces from a single process instance:
void create_namespace(const char *ns_name)
{
    char ns_path[100];

    snprintf(ns_path, 100, "%s/%s", "/var/run/netns", ns_name);
    close(open(ns_path, O_RDONLY|O_CREAT|O_EXCL, 0));
    unshare(CLONE_NEWNET);
    mount("/proc/self/ns/net", ns_path, "none", MS_BIND, NULL);
}
After my process creates all the namespaces and I add a tap interface to any one of them (with the ip link set tap1 netns ns1 command), I actually see this interface in all of the namespaces (presumably this is really a single namespace that goes under different names).
But if I create multiple namespaces using multiple processes, then everything works just fine.
What could be wrong here? Do I have to pass any additional flags to unshare() to get this working from a single process instance? Is there a limitation that a single process instance can't create multiple network namespaces? Or is there a problem with the mount() call, because /proc/self/ns/net is actually mounted multiple times?
Update:
It seems that the unshare() call creates multiple network namespaces correctly, but all the mount points in /var/run/netns/ actually reference the first network namespace that was mounted in that directory.
Update2:
It seems that the best approach is to fork() another process and execute the create_namespace() function from there. Still, I would be glad to hear a better solution that does not involve a fork() call, or at least to get confirmation that it is impossible to create and manage multiple network namespaces from a single process.
Update3:
I am able to create multiple namespaces with unshare() by using the following code:
int main() {
    create_namespace("a");
    system("ip tuntap add mode tap tapa");
    system("ifconfig -a"); // shows the lo and tapa interfaces
    create_namespace("b");
    system("ip tuntap add mode tap tapb");
    system("ifconfig -a"); // shows lo and tapb, but not tapa, so this is the second namespace
}
But after the process terminates and I execute ip netns exec a ifconfig -a and ip netns exec b ifconfig -a, it seems that both commands execute in namespace a. So the actual problem is storing the references to the namespaces (or calling mount() the right way, though I am not sure this is possible).
Network namespaces are, by design, created with a call to clone(), and can be modified afterwards with unshare(). Note that even if you create a new network namespace with unshare(), you are in fact just replacing the network stack of your running process. unshare() is unable to modify the network stack of other processes, so you won't be able to create another namespace with unshare() alone.
In order to work, a new network namespace needs a new network stack, and so it needs a new process. That's all.
The good news is that this can be made very lightweight with clone(); see:
clone() differs from the traditional fork() system call in UNIX in that it allows the parent and child processes to selectively share or duplicate resources.
You can diverge only on the network stack (and keep sharing the memory space, the table of file descriptors and the table of signal handlers). Your new network process can be made more like a thread than a real fork.
You can manipulate them with C code or with Linux Kernel and/or LXC tools.
For instance, to add a device to new network namespace, it's as simple as:
echo $PID > /sys/class/net/ethX/new_ns_pid
See this page for more info about CLI available.
On the C side, one can take a look at the lxc-unshare implementation. Despite its name, it uses clone(), as you can see (lxc_clone is here). One can also look at the LTP implementation, where the author chose to use fork() directly.
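For illustration, a rough clone()-based sketch of that idea (the stack size, flags and the command run in the child are illustrative; it needs root / CAP_NET_ADMIN):

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

static int child_fn(void *arg)
{
    (void)arg;
    /* This child runs with its own, freshly created network namespace. */
    system("ip link show");   /* only "lo" is visible here */
    return 0;
}

int main(void)
{
    char *stack = malloc(1024 * 1024);
    if (!stack)
        return 1;

    /* Child shares everything except the network stack. */
    pid_t pid = clone(child_fn, stack + 1024 * 1024,
                      CLONE_NEWNET | SIGCHLD, NULL);
    if (pid < 0) {
        perror("clone");
        return 1;
    }
    waitpid(pid, NULL, 0);
    free(stack);
    return 0;
}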
EDIT: There is a trick you can use to make the namespaces persistent, but you will still need to fork, even if only temporarily.
Take a look at this code from iproute2 (I have removed the error checking for clarity):
snprintf(netns_path, sizeof(netns_path), "%s/%s", NETNS_RUN_DIR, name);

/* Create the base netns directory if it doesn't exist */
mkdir(NETNS_RUN_DIR, S_IRWXU|S_IRGRP|S_IXGRP|S_IROTH|S_IXOTH);

/* Create the filesystem state */
fd = open(netns_path, O_RDONLY|O_CREAT|O_EXCL, 0);
[...]
close(fd);
unshare(CLONE_NEWNET);

/* Bind the netns last so I can watch for it */
mount("/proc/self/ns/net", netns_path, "none", MS_BIND, NULL);
If you execute this code in a forked process, you'll be able to create new network namespaces at will. In order to delete them, you can simply umount and unlink the bind mount:
umount2(netns_path, MNT_DETACH);
if (unlink(netns_path) < 0) [...]
EDIT 2: Another (dirty) trick would simply be to execute the "ip netns add ..." CLI with system().
You only have to bind-mount /proc/*/ns/* if you need to access these namespaces from another process, or if you need handles to be able to switch back and forth between them. It is not needed just to use multiple namespaces from a single process.
unshare() does create a new namespace.
clone() and fork() by default do not create any new namespaces.
There is one "current" namespace of each kind assigned to a process. It can be changed with unshare() or setns(). The set of namespaces is (by default) inherited by child processes.
Whenever you do open("/proc/N/ns/net"), it creates an inode for that file, and all subsequent open()s will return a file bound to the same namespace. The details are lost in the depths of the kernel dentry cache.
Also, each process has only one /proc/self/ns/net entry, and a bind mount does not create new instances of this proc file.
Opening those mounted files is exactly the same as opening /proc/self/ns/net directly (which will keep pointing to the namespace it pointed to when you first opened it).
It seems that /proc/*/ns is half-baked like this.
So, if you only need 2 namespaces, you can:
open /proc/1/ns/net
unshare
open /proc/self/ns/net
and switch between the two.
For more than 2 you might have to clone(). There seems to be no way to create more than one /proc/N/ns/net file per process.
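A minimal sketch of that two-namespace dance (error handling trimmed; needs root; the handle to the starting namespace is taken from /proc/self/ns/net before the unshare, which is equivalent to the /proc/1/ns/net step above when the process starts in init's namespace):

#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

int main(void)
{
    /* Handle to the namespace we start in. */
    int original_ns = open("/proc/self/ns/net", O_RDONLY);

    unshare(CLONE_NEWNET);                              /* now in a fresh netns */
    int new_ns = open("/proc/self/ns/net", O_RDONLY);   /* handle to the new one */

    /* ... create sockets/interfaces in the new namespace ... */

    setns(original_ns, CLONE_NEWNET);   /* switch back */
    /* ... work in the original namespace ... */

    setns(new_ns, CLONE_NEWNET);        /* and forth again */

    close(original_ns);
    close(new_ns);
    return 0;
}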
However, if you do not need to switch between namespaces at runtime, or to share them with other processes, you can use many namespaces like this:
open sockets and run processes for main namespace.
unshare
open sockets and run processes for 2nd namespace (netlink, tcp, etc)
unshare
...
unshare
open sockets and run processes for Nth namespace (netlink, tcp, etc)
Open sockets keep reference to their network namespace, so they will not be collected until sockets are closed.
You can also use netlink to move interfaces between namespaces, by sending the netlink command in the source namespace and specifying the destination namespace either by PID or by namespace FD (the latter of which you don't have here).
You need to switch the process namespace before accessing /proc entries that depend on that namespace. Once a proc file is open, it keeps a reference to the namespace.
