Why are open file descriptors not being reused, but increasing in number instead? - C

I have a simple C HTTP server. I close the file descriptors for disk files and for the connections returned by accept(...), but I noticed that the new file descriptor numbers keep getting bigger: for example, the descriptor returned by accept starts at 4, then 5, then 4 again, and so on, until it reaches the maximum number of open file descriptors allowed on the system.
I have set that limit to 10,000 on my system, but I am not sure why the file descriptor number climbs to the maximum. I am fairly sure that my program is closing the file descriptors.
Since there are not thousands of connections, why do the new file descriptor numbers keep increasing? After around 24 hours I get the message accept: too many open files. What does this message mean?
Also, does the ulimit -n value get reset automatically without a system reboot?
As suggested in the answer, the output of ls -la /proc/$$/fd is
dr-x------ 2 fawad fawad 0 Oct 11 11:15 .
dr-xr-xr-x 9 fawad fawad 0 Oct 11 11:15 ..
lrwx------ 1 fawad fawad 64 Oct 11 11:15 0 -> /dev/pts/3
lrwx------ 1 fawad fawad 64 Oct 11 11:15 1 -> /dev/pts/3
lrwx------ 1 fawad fawad 64 Oct 11 11:15 2 -> /dev/pts/3
lrwx------ 1 fawad fawad 64 Oct 11 11:25 255 -> /dev/pts/3
and the output of ps aux | grep lh is
root 49855 0.5 5.4 4930756 322328 ? Sl Oct09 15:58 /usr/share/atom/atom --executed-from=/home/fawad/Desktop/C++-work/lhparse --pid=49844 --no-sandbox
root 80901 0.0 0.0 25360 5952 pts/4 S+ 09:32 0:00 sudo ./lh
root 80902 0.0 0.0 1100852 2812 pts/4 S+ 09:32 0:00 ./lh
fawad 83419 0.0 0.0 19976 916 pts/3 S+ 11:27 0:00 grep --color=auto lh
I would like to know what the column containing pts/4 etc. is. Is it the file descriptor number?

It's likely that the sockets behind those file descriptors are sitting in the CLOSE_WAIT state: the peer has closed the connection, but your program has not yet called close() on its end, so the descriptor stays allocated. A socket in TIME_WAIT, by contrast, is held only by the kernel's TCP stack and no longer occupies a descriptor in your process.
Once close() has been called on a descriptor, its number is available for reuse inside your program, and the next open() or accept() returns the lowest free number.
See: https://en.m.wikipedia.org/wiki/Transmission_Control_Protocol
Protocol Operation and specifically Wait States.
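A quick way to check descriptor reuse in isolation is a sketch like the following. It relies on the POSIX rule that open() returns the lowest-numbered descriptor not currently open in the process. The helper name fd_reused_after_close and the use of /dev/null are illustrative assumptions, not part of the server in the question.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open /dev/null, close it, open it again, and report whether the second
 * open() returned the same descriptor number as the first.
 * Returns 1 if the number was reused, 0 if not, -1 on error.
 * Assumes no other thread opens or closes descriptors in between. */
int fd_reused_after_close(void) {
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    if (close(first) < 0)
        return -1;

    int second = open("/dev/null", O_RDONLY);
    if (second < 0)
        return -1;
    close(second);

    return second == first;
}
```

If this returns 1 while the numbers in the server still climb, some code path in the server is probably skipping close().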
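To see which files are still open, you can run the following. Note that $$ expands to the PID of the current shell, so substitute your server's PID to inspect the server itself:
ls -la /proc/$$/fd
The output of the following command will also help: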
ss -tan | head -5
LISTEN 0 511 *:80 *:*
SYN-RECV 0 0 192.0.2.145:80 203.0.113.5:35449
SYN-RECV 0 0 192.0.2.145:80 203.0.113.27:53599
ESTAB 0 0 192.0.2.145:80 203.0.113.27:33605
TIME-WAIT 0 0 192.0.2.145:80 203.0.113.47:50685

Related

mkdir throws No space left on device , while creating a large file is fine( plenty space and inodes available )

very strange behaviour
I cannot create a directory
[root@XXXXXX DEV]# mkdir 1
mkdir: cannot create directory `1': No space left on device
[root@dev-albert DEV]# pwd
/deployment/.octopus/Applications/OctopusServer/DEV
[root@XXXXXX DEV]# df -P /deployment
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/mapper/deploymentvg-deployment 10321208 5229888 4567096 54% /deployment
[root@dev-albert DEV]# df -Pi /deployment
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/deploymentvg-deployment 655360 69129 586231 11% /deployment
As you can see, there is plenty of space and a good number of free inodes.
Does anyone have any clue what is happening with my system?
[root@dev-albert DEV]# dmsetup ls
rootvg-tmp (252:6)
rootvg-usr (252:7)
rootvg-var (252:8)
deploymentvg-usropenv (252:3)
deploymentvg-deployment (252:2)
rootvg-agent (252:4)
rootvg-oracle (252:11)
rootvg-varlock (252:9)
rootvg-deployment (252:5)
rootvg-swap (252:1)
rootvg-root (252:0)
rootvg-varspool (252:10)
top output
top - 14:44:35 up 347 days, 20:40, 2 users, load average: 2.02, 2.02, 2.05
Tasks: 125 total, 2 running, 123 sleeping, 0 stopped, 0 zombie
Cpu(s):100.0%us, 0.0%sy, 0.0%ni,117100.0%id,-42916200.0%wa, 0.0%hi, 0.0%si,200.0%st
Mem: 4071932k total, 3394132k used, 677800k free, 780312k buffers
Swap: 4194300k total, 22604k used, 4171696k free, 1742552k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
747 root 20 0 0 0 0 S 3.0 0.0 303:33.05 jbd2/dm-2-8
20679 root 20 0 0 0 0 S 2.7 0.0 0:14.24 kworker/0:2
16319 root 20 0 0 0 0 R 2.3 0.0 266:30.45 flush-252:2
When I run mkdir with strace
open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=99158576, ...}) = 0
mmap(NULL, 99158576, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fd4c4b04000
close(3) = 0
mkdir("1", 0777) = -1 ENOSPC (No space left on device)
open("/usr/share/locale/locale.alias", O_RDONLY) = 3
write(2, ": No space left on device", 25: No space left on device) = 25
uname output
Linux 2.6.39-400.17.1.el6uek.x86_64 #1 SMP Fri Feb 22 18:16:18 PST 2013 x86_64 x86_64 x86_64 GNU/Linux
Sometimes that can be caused by the b-tree used by ext4 as directory index hitting its height limit. If you get the No space left on device error on mkdir for some names but not others, or there's plenty of space and inodes, check your dmesg for these warnings:
EXT4-fs warning (device dm-0): ext4_dx_add_entry:2226: Directory (ino: 80087286) index full, reach max htree level :2
EXT4-fs warning (device dm-0): ext4_dx_add_entry:2230: Large directory feature is not enabled on this filesystem
That means you're hitting the b-tree limit, and in a pinch you can enable large directories with:
tune2fs -O large_dir <dev>
It doesn't require unmounting or rebooting, and it will increase the limit from 10M to 2B. Depending on what you're doing, you're likely to hit performance bottlenecks or actually fill the disk before hitting the limit again, but I recommend rethinking your directory structure to avoid creating too many files and subdirectories in the same directory, and using the above solution only in an emergency.
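To confirm from a program that neither blocks nor inodes are exhausted when mkdir fails with ENOSPC, a sketch along these lines may help. It uses statvfs(3); the function name fs_free and its output parameters are illustrative assumptions.

```c
#include <sys/statvfs.h>

/* Report the free space and free inodes available to unprivileged users
 * on the filesystem containing `path`.
 * Returns 0 on success, -1 on error. */
int fs_free(const char *path, unsigned long long *free_bytes,
            unsigned long long *free_inodes) {
    struct statvfs st;
    if (statvfs(path, &st) != 0)
        return -1;

    *free_bytes = (unsigned long long)st.f_bavail * st.f_frsize;
    *free_inodes = (unsigned long long)st.f_favail;
    return 0;
}
```

If mkdir() sets errno to ENOSPC while both values are well above zero, the cause is likely something other than capacity, such as the directory index limit described above.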

trying to run arbitrary commands and parse their output

here is part of code
scanf("%[^\n]%*c",command);
int pid;
pid=fork();
if (pid == 0) {
    // Child process
    char *argv[]={command ,NULL};
    execvp(argv[0], argv);
    exit (0);
}
When I give as input ls I want as output
1 copy of mysh1.c mysh1.c mysh3.c mysh.c New Folder
a.out helpmanual.desktop mysh2.c mysh4.c New File
and when I give ls -l /tmp
I expect
total 12
-rw------- 1 antre antre 0 Nov 4 17:31 config-err-KT9sEZ
drwx------ 2 antre antre 4096 Nov 4 19:21 mozilla_antre0
drwx------ 2 antre antre 4096 Jan 1 1970 orbit-antre
drwx------ 2 antre antre 4096 Nov 4 17:31 ssh-HaOFtKdeIQnQ
but I get:
1 copy of mysh1.c mysh1.c mysh3.c mysh.c New Folder
a.out helpmanual.desktop mysh2.c mysh4.c New File
It seems that you're trying to parse the output of ls -l in a C program for some reason.
That's unlikely to be the “right” thing to do. The usual mechanism is to use opendir and readdir to read the directory file, directly.
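As a minimal sketch of that approach, the function below lists a directory with opendir and readdir. The name list_dir and its return convention are assumptions made for illustration.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Print every entry of `path` except "." and "..", one per line.
 * Returns the number of entries printed, or -1 if the directory
 * cannot be opened. */
int list_dir(const char *path) {
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        puts(entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling list_dir("/tmp") prints the names only; details such as size or permissions, which ls -l shows, would need an extra stat() call per entry.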
If you have some truly strange situation in which you cannot opendir (the only case that comes to mind is if you're running ls on a remote system, e.g., over ssh), there is a mode in GNU ls specifically for producing an output record format that can be parsed by another program.
From the GNU coreutils info:
10.1.2 What information is listed
‘-D’
‘--dired’
With the long listing (‘-l’) format, print an additional line after
the main output:
//DIRED// BEG1 END1 BEG2 END2 ...
The BEGN and ENDN are unsigned integers that record the byte
position of the beginning and end of each file name in the output.
This makes it easy for Emacs to find the names, even when they
contain unusual characters such as space or newline, without fancy
searching.
If directories are being listed recursively (‘-R’), output a
similar line with offsets for each subdirectory name:
//SUBDIRED// BEG1 END1 ...
Finally, output a line of the form:
//DIRED-OPTIONS// --quoting-style=WORD
where WORD is the quoting style (*note Formatting the file
names::).
Here is an actual example:
$ mkdir -p a/sub/deeper a/sub2
$ touch a/f1 a/f2
$ touch a/sub/deeper/file
$ ls -gloRF --dired a
a:
total 8
-rw-r--r-- 1 0 Jun 10 12:27 f1
-rw-r--r-- 1 0 Jun 10 12:27 f2
drwxr-xr-x 3 4096 Jun 10 12:27 sub/
drwxr-xr-x 2 4096 Jun 10 12:27 sub2/
a/sub:
total 4
drwxr-xr-x 2 4096 Jun 10 12:27 deeper/
a/sub/deeper:
total 0
-rw-r--r-- 1 0 Jun 10 12:27 file
a/sub2:
total 0
//DIRED// 48 50 84 86 120 123 158 162 217 223 282 286
//SUBDIRED// 2 3 167 172 228 240 290 296
//DIRED-OPTIONS// --quoting-style=literal
Note that the pairs of offsets on the ‘//DIRED//’ line above
delimit these names: ‘f1’, ‘f2’, ‘sub’, ‘sub2’, ‘deeper’, ‘file’.
The offsets on the ‘//SUBDIRED//’ line delimit the following
directory names: ‘a’, ‘a/sub’, ‘a/sub/deeper’, ‘a/sub2’.
Here is an example of how to extract the fifth entry name,
‘deeper’, corresponding to the pair of offsets, 222 and 228:
$ ls -gloRF --dired a > out
$ dd bs=1 skip=222 count=6 < out 2>/dev/null; echo
deeper
Note that although the listing above includes a trailing slash for
the ‘deeper’ entry, the offsets select the name without the
trailing slash. However, if you invoke ‘ls’ with ‘--dired’ along
with an option like ‘--escape’ (aka ‘-b’) and operate on a file
whose name contains special characters, notice that the backslash
is included:
$ touch 'a b'
$ ls -blog --dired 'a b'
-rw-r--r-- 1 0 Jun 10 12:28 a\ b
//DIRED// 30 34
//DIRED-OPTIONS// --quoting-style=escape
If you use a quoting style that adds quote marks (e.g.,
‘--quoting-style=c’), then the offsets include the quote marks. So
beware that the user may select the quoting style via the
environment variable ‘QUOTING_STYLE’. Hence, applications using
‘--dired’ should either specify an explicit
‘--quoting-style=literal’ option (aka ‘-N’ or ‘--literal’) on the
command line, or else be prepared to parse the escaped names.
I only needed to use strtok to split the command line into separate arguments.
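For readers with the same problem, here is a minimal sketch of that fix, assuming the command is split on whitespace only, with no quoting or escaping. The names split_args, run_command and MAX_ARGS are hypothetical. execvp needs each argument as its own argv element, which is why passing the whole line as argv[0] does not work.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 64

/* Split `line` in place on spaces, tabs and newlines. Fills `argv` with
 * pointers into `line`, terminates it with NULL, and returns the number
 * of arguments. Quoting and escaping are not handled. */
int split_args(char *line, char *argv[], int max_args) {
    int argc = 0;
    char *tok = strtok(line, " \t\n");
    while (tok != NULL && argc < max_args - 1) {
        argv[argc++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    argv[argc] = NULL;
    return argc;
}

/* Run the command in `line` and wait for it to finish.
 * Returns its exit status, or -1 on error. */
int run_command(char *line) {
    char *argv[MAX_ARGS];
    if (split_args(line, argv, MAX_ARGS) == 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        perror("execvp");   /* reached only if exec failed */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because strtok writes into its input, the line must live in a writable buffer (for example the array filled by scanf), not in a string literal.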

how to open /dev/console in C

I was reading wayland/weston code, the setting up tty part. I found it tries to acquire an available tty for doing KMS and start windows.
This is how it does:
if (!wl->new_user) {
    wl->tty = STDIN_FILENO;
} else if (tty) {
    t = ttyname(STDIN_FILENO);
    if (t && strcmp(t, tty) == 0)
        wl->tty = STDIN_FILENO;
    else
        wl->tty = open(tty, O_RDWR | O_NOCTTY);
} else {
    int tty0 = open("/dev/tty0", O_WRONLY | O_CLOEXEC);
    char filename[16];
    if (tty0 < 0)
        error(1, errno, "could not open tty0");
    if (ioctl(tty0, VT_OPENQRY, &wl->ttynr) < 0 || wl->ttynr == -1)
        error(1, errno, "failed to find non-opened console");
    snprintf(filename, sizeof filename, "/dev/tty%d", wl->ttynr);
    wl->tty = open(filename, O_RDWR | O_NOCTTY);
    close(tty0);
}
in src/weston-launch.c.
If no tty is specified, it tries to open /dev/tty0 and find an available tty.
But you can't do that, since neither /dev/tty0 nor the 'available tty' belongs to you. I tested with my own simpler version, and of course I couldn't open /dev/tty0.
Does anyone know how this magic is done?
The actual available devices for a tty depend on the system. On most interactive Unix/Unix-like systems you will have a "tty" whose name can be found from the command-line program tty. For example:
$ tty
/dev/pts/2
Likely, you also have a device named "tty", e.g.,
$ ls -l /dev/tty
lrwxrwxrwx 1 root other 26 Feb 9 2014 /dev/tty -> ../devices/pseudo/sy#0:tty
$ ls -lL /dev/tty
crw-rw-rw- 1 root tty 22, 0 Feb 9 2014 /dev/tty
You cannot open just any tty device, because most of them are owned by root (or other users to which they have been assigned).
For further discussion about the differences between /dev/console, /dev/tty and other tty-devices, see Cannot open /dev/console.
According to the console_ioctl(4) manual page:
VT_OPENQRY
Returns the first available (non-opened) console. argp points to an int which is set to the number of the vt (1 <= *argp <= MAX_NR_CONSOLES).
and for example on a Linux system I see this in /dev:
crw-rw-rw- 1 root 5, 0 Mon 04:20:13 tty
crw------- 1 root 4, 0 Mon 03:58:52 tty0
crw------- 1 root 4, 1 Mon 04:00:41 tty1
crw------- 1 tom 4, 2 Mon 04:30:31 tty2
crw------- 1 root 4, 3 Mon 04:00:41 tty3
crw------- 1 root 4, 4 Mon 04:00:41 tty4
crw------- 1 root 4, 5 Mon 04:00:41 tty5
crw------- 1 root 4, 6 Mon 04:00:41 tty6
crw------- 1 root 4, 7 Mon 03:58:52 tty7
crw------- 1 root 4, 8 Mon 03:58:52 tty8
crw------- 1 root 4, 9 Mon 03:58:52 tty9
crw------- 1 root 4, 10 Mon 03:58:52 tty10
crw------- 1 root 4, 11 Mon 03:58:52 tty11
All of those tty devices except one for which I have opened a console session are owned by root. To be able to log into one, a program such as getty acts to temporarily change its ownership. Doing a ps on my machine shows for example
root 2977 1 0 04:00 tty1 00:00:00 /sbin/getty 38400 tty1
root 2978 1 0 04:00 tty2 00:00:00 /bin/login --
root 2979 1 0 04:00 tty3 00:00:00 /sbin/getty 38400 tty3
root 2980 1 0 04:00 tty4 00:00:00 /sbin/getty 38400 tty4
root 2981 1 0 04:00 tty5 00:00:00 /sbin/getty 38400 tty5
root 2982 1 0 04:00 tty6 00:00:00 /sbin/getty 38400 tty6
Note that getty is running as root. That gives it the privilege to change the ownership of the tty device as needed. That is, while the ioctl may identify an unused tty, you need elevated privileges to actually open it. Linux (like any other Unix-like system) does not have a way to ensure that one process has truly exclusive access to a terminal, so it uses the device ownership and permissions to control this access.
If you're not the superuser then you should only try to access /dev/tty. That is a special device synonym for whichever tty is controlling the current process.
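The two cases can be sketched as follows (Linux-specific). The function names find_free_vt and open_controlling_tty are hypothetical; the ioctl is the same VT_OPENQRY used in the weston snippet. As an unprivileged user, the first function is expected to fail at the open() call, which is the behaviour described in the question.

```c
#include <fcntl.h>
#include <linux/vt.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask the kernel for the first unused virtual console.
 * Returns the VT number (>= 1), or -1 if /dev/tty0 cannot be opened
 * (typically for lack of privilege) or no console is free. */
int find_free_vt(void) {
    int tty0 = open("/dev/tty0", O_WRONLY);
    if (tty0 < 0)
        return -1;

    int vt = -1;
    if (ioctl(tty0, VT_OPENQRY, &vt) < 0)
        vt = -1;
    close(tty0);
    return vt;
}

/* Open the controlling terminal of the calling process.
 * Returns a descriptor, or -1 if the process has none. */
int open_controlling_tty(void) {
    return open("/dev/tty", O_RDWR | O_NOCTTY);
}
```

A launcher that must work for ordinary users would need to run with elevated privileges, which appears to be how weston-launch is meant to be installed.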

Please explain the ps -aef response on RHEL

What does pts/2 indicate in the output below? Why is there no such value for the other dd processes?
$ ps -aef |grep dd
root 6553672 15073352 3 02:32:19 - 0:01 dd of=/dev/lv01 bs=1024k
padmin 9437410 16515110 1 02:43:32 pts/2 0:00 grep dd
root 13828156 11010220 0 02:32:33 - 0:00 dd of=/dev/lv02 bs=1024k
root 14155860 13828156 2 02:32:33 - 0:01 dd of=/dev/lv02 bs=1024k
root 15073352 13762812 0 02:32:19 - 0:00 dd of=/dev/lv01 bs=1024k
root 15532200 15925276 2 02:40:47 pts/1 0:03 dd of=/home/padmin/sample-dd-op bs=1024k
pts/X in the TTY column means that the process is connected to a pseudo terminal slave.
A "-" in that column means the process has no controlling terminal, for example because:
the terminal session has ended
the command was started by a daemon
This nice Answer shows the difference between PTS and TTY.

How to set CPU affinity for a process from C or C++ in Linux?

Is there a programmatic method to set CPU affinity for a process in c/c++ for the Linux operating system?
You need to use sched_setaffinity(2).
For example, to run on CPUs 0 and 2 only:
#define _GNU_SOURCE
#include <sched.h>
cpu_set_t mask;
CPU_ZERO(&mask);
CPU_SET(0, &mask);
CPU_SET(2, &mask);
int result = sched_setaffinity(0, sizeof(mask), &mask);
(0 for the first parameter means the calling thread, which in a single-threaded program is the whole process; supply a PID if it's some other process you want to control.)
See also sched_getcpu(3).
Use sched_setaffinity at the process level, or pthread_attr_setaffinity_np for individual threads.
I put a lot of effort into understanding what is happening, so I am adding this answer to help people like me (I use the gcc compiler on Linux Mint).
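#include <sched.h>
#include <stdio.h>

/* Pin the calling thread to one core. Returns 0 on success, -1 on error. */
static inline int assignToThisCore(int core_id)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(core_id, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask);
}

int main(void) {
    // call this with the core index: 0, 1, 2, ...
    if (assignToThisCore(2) != 0)
        perror("sched_setaffinity");
    return 0;
}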
But don't forget to add this option to the compiler command: -D _GNU_SOURCE
Because the operating system might schedule other processes on that core, you can add GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=2,3" to the grub file located in /etc/default and then run sudo update-grub in a terminal to reserve the cores you want.
UPDATE:
If you want to assign more cores you can follow this piece of code:
static inline int assignToThisCores(int core_id1, int core_id2)
{
    cpu_set_t mask1;
    CPU_ZERO(&mask1);
    CPU_SET(core_id1, &mask1);
    CPU_SET(core_id2, &mask1);
    //__asm__ __volatile__ ( "vzeroupper" : : : ); // It is here because of the bug that dirtied the AVX registers; if you rely on AVX, uncomment it.
    return sched_setaffinity(0, sizeof(mask1), &mask1);
}
sched_setaffinity + sched_getaffinity minimal C runnable example
This example was extracted from my answer at: How to use sched_getaffinity and sched_setaffinity in Linux from C? I believe the questions are not duplicates since that one is a subset of this one, as it asks about sched_getaffinity only, and does not mention C++.
In this example, we get the affinity, modify it, and check if it has taken effect with sched_getcpu().
main.c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void print_affinity(void) {
    cpu_set_t mask;
    long nproc, i;

    if (sched_getaffinity(0, sizeof(cpu_set_t), &mask) == -1) {
        perror("sched_getaffinity");
        assert(false);
    }
    nproc = sysconf(_SC_NPROCESSORS_ONLN);
    printf("sched_getaffinity = ");
    for (i = 0; i < nproc; i++) {
        printf("%d ", CPU_ISSET(i, &mask));
    }
    printf("\n");
}

int main(void) {
    cpu_set_t mask;

    print_affinity();
    printf("sched_getcpu = %d\n", sched_getcpu());
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);
    if (sched_setaffinity(0, sizeof(cpu_set_t), &mask) == -1) {
        perror("sched_setaffinity");
        assert(false);
    }
    print_affinity();
    /* TODO is it guaranteed to have taken effect already? Always worked on my tests. */
    printf("sched_getcpu = %d\n", sched_getcpu());
    return EXIT_SUCCESS;
}
GitHub upstream.
Compile and run:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
./main.out
Sample output:
sched_getaffinity = 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
sched_getcpu = 9
sched_getaffinity = 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
sched_getcpu = 0
Which means that:
initially, all of my 16 cores were enabled, and the process was randomly running on core 9 (the 10th one)
after we set the affinity to only the first core, the process was moved necessarily to core 0 (the first one)
It is also fun to run this program through taskset:
taskset -c 1-3 ./main.out
Which gives output of form:
sched_getaffinity = 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
sched_getcpu = 2
sched_getaffinity = 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
sched_getcpu = 0
and so we see that it limited the affinity from the start.
This works because the affinity mask is inherited by child processes created with fork and is preserved across execve, so the program that taskset launches starts with the restricted mask: How to prevent inheriting CPU affinity by child forked process?
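The inheritance can be observed directly with a sketch like the following. The function name child_inherits_affinity is hypothetical, and the code assumes glibc's CPU_* macros, which need _GNU_SOURCE to be defined before the first include.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Pin the caller to the first CPU it is currently allowed to use, fork,
 * and let the child report whether it sees the same one-CPU mask.
 * Returns 1 if the child inherited the mask, 0 if not, -1 on error. */
int child_inherits_affinity(void) {
    cpu_set_t allowed, pinned;
    if (sched_getaffinity(0, sizeof(allowed), &allowed) == -1)
        return -1;

    int cpu = -1;
    for (int i = 0; i < CPU_SETSIZE; i++) {
        if (CPU_ISSET(i, &allowed)) {
            cpu = i;
            break;
        }
    }
    if (cpu < 0)
        return -1;

    CPU_ZERO(&pinned);
    CPU_SET(cpu, &pinned);
    if (sched_setaffinity(0, sizeof(pinned), &pinned) == -1)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: compare its own mask with the one the parent set. */
        cpu_set_t seen;
        if (sched_getaffinity(0, sizeof(seen), &seen) == -1)
            _exit(2);
        _exit(CPU_EQUAL(&seen, &pinned) ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Note that the function leaves the caller pinned to that single CPU; a real program would probably save the original mask and restore it afterwards.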
Python: os.sched_getaffinity and os.sched_setaffinity
See: How to find out the number of CPUs using python
Tested in Ubuntu 16.04.
In short
cpu_set_t mask;
CPU_ZERO(&mask);
for (int cpu = 0; cpu <= 2; cpu++) /* processors 0, 1, and 2 */
    CPU_SET(cpu, &mask);
if (sched_setaffinity(0, sizeof(mask), &mask) < 0) {
    perror("sched_setaffinity");
}
Look in CPU Affinity for more details
It is also possible to do this from the shell, without modifying the programs, by using cgroups and the cpuset subsystem. Cgroups (v1 at least) are typically mounted on /sys/fs/cgroup, under which the cpuset subsystem resides. For example:
$ ls -l /sys/fs/cgroup/
total 0
drwxr-xr-x 15 root root 380 nov. 22 20:00 ./
drwxr-xr-x 8 root root 0 nov. 22 20:00 ../
dr-xr-xr-x 2 root root 0 nov. 22 20:00 blkio/
[...]
lrwxrwxrwx 1 root root 11 nov. 22 20:00 cpuacct -> cpu,cpuacct/
dr-xr-xr-x 2 root root 0 nov. 22 20:00 cpuset/
dr-xr-xr-x 5 root root 0 nov. 22 20:00 devices/
dr-xr-xr-x 3 root root 0 nov. 22 20:00 freezer/
[...]
Under cpuset, the cpuset.cpus defines the range of CPUs on which the processes belonging to this cgroup are allowed to run. Here, at the top level, all the CPUs are configured for all the processes of the system. Here, the system has 8 CPUs:
$ cd /sys/fs/cgroup/cpuset
$ cat cpuset.cpus
0-7
The list of processes belonging to this cgroup is listed in the cgroup.procs file:
$ cat cgroup.procs
1
2
3
[...]
12364
12423
12424
12425
[...]
It is possible to create a child cgroup into which a subset of CPUs are allowed. For example, let's define a sub-cgroup with CPU cores 1 and 3:
$ pwd
/sys/fs/cgroup/cpuset
$ sudo mkdir subset1
$ cd subset1
$ pwd
/sys/fs/cgroup/cpuset/subset1
$ ls -l
total 0
-rw-r--r-- 1 root root 0 nov. 22 23:28 cgroup.clone_children
-rw-r--r-- 1 root root 0 nov. 22 23:28 cgroup.procs
-rw-r--r-- 1 root root 0 nov. 22 23:28 cpuset.cpu_exclusive
-rw-r--r-- 1 root root 0 nov. 22 23:28 cpuset.cpus
-r--r--r-- 1 root root 0 nov. 22 23:28 cpuset.effective_cpus
-r--r--r-- 1 root root 0 nov. 22 23:28 cpuset.effective_mems
-rw-r--r-- 1 root root 0 nov. 22 23:28 cpuset.mem_exclusive
-rw-r--r-- 1 root root 0 nov. 22 23:28 cpuset.mem_hardwall
-rw-r--r-- 1 root root 0 nov. 22 23:28 cpuset.memory_migrate
-r--r--r-- 1 root root 0 nov. 22 23:28 cpuset.memory_pressure
-rw-r--r-- 1 root root 0 nov. 22 23:28 cpuset.memory_spread_page
-rw-r--r-- 1 root root 0 nov. 22 23:28 cpuset.memory_spread_slab
-rw-r--r-- 1 root root 0 nov. 22 23:28 cpuset.mems
-rw-r--r-- 1 root root 0 nov. 22 23:28 cpuset.sched_load_balance
-rw-r--r-- 1 root root 0 nov. 22 23:28 cpuset.sched_relax_domain_level
-rw-r--r-- 1 root root 0 nov. 22 23:28 notify_on_release
-rw-r--r-- 1 root root 0 nov. 22 23:28 tasks
$ cat cpuset.cpus
$ sudo sh -c "echo 1,3 > cpuset.cpus"
$ cat cpuset.cpus
1,3
The cpuset.mems file must be populated before moving any process into this cgroup. Here we move the current shell into the new cgroup by writing its pid to the cgroup.procs file; the first attempt fails because cpuset.mems is still empty:
$ cat cgroup.procs
$ echo $$
4753
$ sudo sh -c "echo 4753 > cgroup.procs"
sh: 1: echo: echo: I/O error
$ cat cpuset.mems
$ sudo sh -c "echo 0 > cpuset.mems"
$ cat cpuset.mems
0
$ sudo sh -c "echo 4753 > cgroup.procs"
$ cat cgroup.procs
4753
12569
The latter shows that the current shell (pid 4753) is now located in the newly created cgroup. The second pid, 12569, belongs to the cat command: as a child of the current shell, it inherits the shell's cgroups. With a formatted ps command, it is possible to verify on which CPU the processes are running (PSR column):
$ ps -o pid,ppid,psr,command
PID PPID PSR COMMAND
4753 2372 3 bash
12672 4753 1 ps -o pid,ppid,psr,command
We can see that the current shell is running on CPU#3 and that its child (the ps command), which inherits its cgroups, is running on CPU#1.
As a conclusion, instead of using sched_setaffinity() or any pthread service, it is possible to create a cpuset hierarchy in the cgroups tree and move the processes into them by writing their pids in the corresponding cgroup.procs files.
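The same move can be made from a C program by writing the pid to the cgroup.procs file. The sketch below assumes a cgroup v1 cpuset directory such as the /sys/fs/cgroup/cpuset/subset1 created above; the function name move_pid_to_cgroup is hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>

/* Write `pid` as decimal text to <cgroup_dir>/cgroup.procs.
 * Returns 0 on success, -1 on error. */
int move_pid_to_cgroup(const char *cgroup_dir, pid_t pid) {
    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/cgroup.procs", cgroup_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = fprintf(f, "%ld\n", (long)pid) > 0;
    /* stdio buffers the write, so a rejection by the kernel (for example
     * an empty cpuset.mems) only shows up when the stream is flushed. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

A call such as move_pid_to_cgroup("/sys/fs/cgroup/cpuset/subset1", getpid()) would need the same privileges as the sudo commands shown above.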

Resources