Why is nice defined as long in 'set_user_nice'?

I am a student learning OS. When I read the source code of set_user_nice, I found that nice is defined as a long. But I have learned that the range of nice is -20 ~ 19, so why not define nice as an int to save room in RAM?
Is this related to x86 or x64 systems?
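For reference, a minimal sketch (illustration only, not kernel code) showing the type sizes in play: on 32-bit x86 Linux (ILP32) int and long are both 4 bytes, while on x86-64 Linux (LP64) long grows to 8 bytes.

#include <stdio.h>

/* Illustration only: print the sizes this question is asking about.
 * On ILP32 (32-bit x86) both print 4; on LP64 (x86-64 Linux)
 * sizeof(long) prints 8. */
int main(void)
{
    printf("sizeof(int)  = %zu\n", sizeof(int));
    printf("sizeof(long) = %zu\n", sizeof(long));
    return 0;
}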
Here is the source code (./kernel/sched/core.c):
void set_user_nice(struct task_struct *p, long nice)
{
    bool queued, running;
    int old_prio, delta;
    struct rq_flags rf;
    struct rq *rq;

    if (task_nice(p) == nice || nice < MIN_NICE || nice > MAX_NICE)
        return;
    /*
     * We have to be careful, if called from sys_setpriority(),
     * the task might be in the middle of scheduling on another CPU.
     */
    rq = task_rq_lock(p, &rf);
    /*
     * The RT priorities are set via sched_setscheduler(), but we still
     * allow the 'normal' nice value to be set - but as expected
     * it wont have any effect on scheduling until the task is
     * SCHED_DEADLINE, SCHED_FIFO or SCHED_RR:
     */
    if (task_has_dl_policy(p) || task_has_rt_policy(p)) {
        p->static_prio = NICE_TO_PRIO(nice);
        goto out_unlock;
    }
    queued = task_on_rq_queued(p);
    running = task_current(rq, p);
    if (queued)
        dequeue_task(rq, p, DEQUEUE_SAVE);
    if (running)
        put_prev_task(rq, p);

    p->static_prio = NICE_TO_PRIO(nice);
    set_load_weight(p);
    old_prio = p->prio;
    p->prio = effective_prio(p);
    delta = p->prio - old_prio;

    if (queued) {
        enqueue_task(rq, p, ENQUEUE_RESTORE);
        /*
         * If the task increased its priority or is running and
         * lowered its priority, then reschedule its CPU:
         */
        if (delta < 0 || (delta > 0 && task_running(rq, p)))
            resched_curr(rq);
    }
    if (running)
        set_curr_task(rq, p);
out_unlock:
    task_rq_unlock(rq, p, &rf);
}
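For context, the NICE_TO_PRIO macro used above maps the nice value into the kernel's internal static-priority range; the definitions in include/linux/sched/prio.h look roughly like this (paraphrased, not copied verbatim):

/* Paraphrased: nice -20..19 maps onto static priorities 100..139,
 * with DEFAULT_PRIO = 120 corresponding to nice 0. */
#define MAX_NICE        19
#define MIN_NICE        (-20)
#define NICE_WIDTH      (MAX_NICE - MIN_NICE + 1)
#define MAX_RT_PRIO     100
#define DEFAULT_PRIO    (MAX_RT_PRIO + NICE_WIDTH / 2)
#define NICE_TO_PRIO(nice)  ((nice) + DEFAULT_PRIO)
#define PRIO_TO_NICE(prio)  ((prio) - DEFAULT_PRIO)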

Shortest Remaining Time Scheduling Timing Issue

I am working on finishing an Operating Systems Process Scheduling Algorithms project, and I am running into a timing issue related to the Shortest Time Remaining First algorithm. My teammates and I hand calculated the results of the waiting time, turnaround time, etc., and compared them to an online source, so we think the problem is in our code, specifically for the waiting time.
Here is the relevant function where we calculate the timing and run the algorithm:
bool shortest_remaining_time_first(dyn_array_t *ready_queue, ScheduleResult_t *result)
{
    if (!ready_queue || !result)
        return false;

    const int numProcesses = (int)ready_queue->size;
    /* should provide array in lowest arrival times */
    dyn_array_sort(ready_queue, &compare_arrival_times);
    dyn_array_t *run_queue = dyn_array_create(ready_queue->capacity, ready_queue->data_size, NULL);
    /* ProcessControlBlock_t looks like:
       typedef struct {
           uint32_t arrival;
           uint32_t remaining_burst_time;
           uint32_t priority;
       } ProcessControlBlock_t;
       and dyn_array_t looks like:
       typedef struct {
           size_t capacity;
           size_t size;
           const size_t data_size;
           void *array;                  // an array of ProcessControlBlock_t's
           void (*destructor)(void *);
       } dyn_array_t; */
    ProcessControlBlock_t pcb, pcbRun;
    dyn_array_extract_front(ready_queue, &pcb);
    /* we need 3 of these to use separately */
    int processCount = numProcesses;
    int loopCounter = numProcesses;
    int runCounter = 0;
    /* timing stuff */
    int ticks = 0;
    int waitTime = 0;
    int timeComplete = 0;
    /* used to check if the arrival time/burst time is the same, to detect the same process */
    int arrivalCheck = 0;
    int burstCheck = 0;
    /* while there are still processes to run */
    while (loopCounter > 0)
    {
        /* will wait to push new process onto run_queue */
        while ((int)pcb.arrival <= ticks && processCount > 0)
        {
            runCounter++;
            /* this will not push a new value until that much time has passed */
            dyn_array_push_front(run_queue, &pcb);
            if (processCount > 0)
            {
                dyn_array_extract_front(ready_queue, &pcb);
                processCount--;
            }
        }
        if (runCounter > 0)
        {
            /* we want to re-sort each time so we can always pull the smallest burst time */
            dyn_array_sort(run_queue, &compareBurstTimes);
            dyn_array_extract_front(run_queue, &pcbRun);
            /* assuming there is still a process to run */
            if (pcbRun.remaining_burst_time > 0)
            {
                /* if we are on the same process, do nothing;
                   else, change the waitTime */
                if ((int)pcbRun.arrival == arrivalCheck && (int)pcbRun.remaining_burst_time == burstCheck) {}
                else
                {
                    /* this is where we think the problem is occurring */
                    waitTime += ticks - pcbRun.arrival;
                }
                /* the process has started; run it on the "CPU" and increase time passed */
                pcbRun.started = true;
                virtual_cpu(&pcbRun);
                ticks++;
                /* if still > 0 after being decremented */
                if (pcbRun.remaining_burst_time > 0)
                {
                    dyn_array_push_front(run_queue, &pcbRun);
                    arrivalCheck = pcbRun.arrival;
                    burstCheck = pcbRun.remaining_burst_time;
                }
            }
            /* if the process has run all the way through, decrease the number of processes */
            if (pcbRun.remaining_burst_time <= 0)
            {
                loopCounter--;
                runCounter--;
                timeComplete += ticks - pcbRun.arrival;
            }
        }
        else
        {
            /* if there is a gap between arrival times and processes */
            ticks++;
        }
    }
    /* calculate timing values */
    result->average_waiting_time = (float)waitTime / (float)numProcesses;
    result->average_turnaround_time = (float)timeComplete / (float)numProcesses;
    result->total_run_time = ticks;
    /* this frees run_queue AND run_queue->array */
    dyn_array_destroy(run_queue);
    return true;
}
Our first Process Control Block comes in with a table that looks like this:
pid  arrival  remaining_burst_time  priority
  0        0                    15         0
  1        1                    10         0
  2        2                     5         0
  3        3                    20         0
From these values, we hand-calculated and verified the average waiting time to be 11.75, yet our program calculates it as 12.25. Any idea why this may be?
Our approach of recalculating each value on essentially every tick has made it more difficult to find an appropriate solution.
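One way to cross-check the per-tick accumulation is to compute waiting time the other way around, from completion times (waiting = completion - arrival - burst). A minimal sketch with the table above hard-coded; the completion times 30, 16, 7 and 50 are hand-derived from the SRTF schedule, not produced by the code in question:

#include <stdio.h>

int main(void)
{
    /* table from the question */
    int arrival[]    = { 0, 1, 2, 3 };
    int burst[]      = { 15, 10, 5, 20 };
    /* hand-derived SRTF completion times for that table */
    int completion[] = { 30, 16, 7, 50 };
    int wait_sum = 0;

    for (int i = 0; i < 4; i++)
        wait_sum += completion[i] - arrival[i] - burst[i];

    printf("average waiting time = %.2f\n", wait_sum / 4.0); /* prints 11.75 */
    return 0;
}

If the per-tick version disagrees with this, the drift usually comes from charging (or failing to charge) wait time on the tick where a preempted process resumes.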

Confusion regarding #interrupt-cells configuration on PCA9555 expander

I'm trying to set up a device tree source file for the first time on my custom platform. On the board is an NXP PCA9555 GPIO expander. I'm attempting to set up a node for the device and am a bit confused.
Here is where I'm at with the node in the dts file:
ioexp0: gpio-exp@21 {
    compatible = "nxp,pca9555";
    reg = <21>;
    interrupt-parent = <&gpio>;
    interrupts = <8 0>;
    gpio-controller;
    #gpio-cells = <2>;
    /* I don't understand the following two lines */
    interrupt-controller;
    #interrupt-cells = <2>;
};
I got to this point by using the armada-388-gp.dts source as a guide.
My confusion is about what code processes the #interrupt-cells property. The binding documentation for this chip is not very helpful, as it doesn't say anything about how the interrupt cells are interpreted.
Looking at the pca953x_irq_setup function in the source code for the pca9555 driver, I don't see anywhere that the #interrupt-cells property is handled. Is this handled in the Linux interrupt handling code? I'm just confused as to how I'm supposed to know the meaning of the two interrupt cells.
pca953x_irq_setup for your convenience:
static int pca953x_irq_setup(struct pca953x_chip *chip,
                             int irq_base)
{
    struct i2c_client *client = chip->client;
    int ret, i;

    if (client->irq && irq_base != -1
            && (chip->driver_data & PCA_INT)) {
        ret = pca953x_read_regs(chip,
                                chip->regs->input, chip->irq_stat);
        if (ret)
            return ret;
        /*
         * There is no way to know which GPIO line generated the
         * interrupt. We have to rely on the previous read for
         * this purpose.
         */
        for (i = 0; i < NBANK(chip); i++)
            chip->irq_stat[i] &= chip->reg_direction[i];
        mutex_init(&chip->irq_lock);
        ret = devm_request_threaded_irq(&client->dev,
                                        client->irq,
                                        NULL,
                                        pca953x_irq_handler,
                                        IRQF_TRIGGER_LOW | IRQF_ONESHOT |
                                        IRQF_SHARED,
                                        dev_name(&client->dev), chip);
        if (ret) {
            dev_err(&client->dev, "failed to request irq %d\n",
                    client->irq);
            return ret;
        }
        ret = gpiochip_irqchip_add_nested(&chip->gpio_chip,
                                          &pca953x_irq_chip,
                                          irq_base,
                                          handle_simple_irq,
                                          IRQ_TYPE_NONE);
        if (ret) {
            dev_err(&client->dev,
                    "could not connect irqchip to gpiochip\n");
            return ret;
        }
        gpiochip_set_nested_irqchip(&chip->gpio_chip,
                                    &pca953x_irq_chip,
                                    client->irq);
    }
    return 0;
}
This is my first time working with device tree so I'm hoping it's something obvious that I'm just missing.
After looking at all of the comments I did some additional reading and figured out my answer.
I now understand that I was misinterpreting some properties of the device tree. I was previously under the impression that the driver had to specify how all properties were handled. I now see that Linux will actually handle many of the generic properties, such as gpios or interrupts (which makes a lot of sense).
The documentation on the actual interrupts binding was very helpful, not the documentation for the device driver.
Here is a bit more of a detailed explanation of how the translation from intspec to IRQ_TYPE* happens:
The function of_irq_parse_one copies the interrupt specifier integers to a struct of_phandle_args. This arg is then passed to irq_create_of_mapping via a consumer function (e.g. of_irq_get). That function then maps these args to a struct irq_fwspec via of_phandle_args_to_fwspec and passes its fwspec data to irq_create_fwspec_mapping. These functions are all found in irqdomain.c. At this point the irq will belong to an irq_domain or use the irq_default_domain. As far as I can tell, the pca953x driver uses the default domain. This domain is often set up by platform-specific code. I found mine by searching for irq_domain_ops on cross-reference. A lot of these seem to do a simple copy of intspec[1] & IRQ_TYPE_SENSE_MASK into the type variable in irq_create_fwspec_mapping via irq_domain_translate. From there the type is set on the irq's irq_data via irqd_set_trigger_type.
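For reference, the generic two-cell translation helper in kernel/irq/irqdomain.c (which many of those irq_domain_ops point at) shows the convention: the first cell is taken as the hardware interrupt number and the second as the trigger flags. It looks roughly like this:

int irq_domain_xlate_twocell(struct irq_domain *d, struct device_node *ctrlr,
                             const u32 *intspec, unsigned int intsize,
                             irq_hw_number_t *out_hwirq, unsigned int *out_type)
{
    if (WARN_ON(intsize < 2))
        return -EINVAL;
    *out_hwirq = intspec[0];                       /* first cell: hwirq (pin number) */
    *out_type = intspec[1] & IRQ_TYPE_SENSE_MASK;  /* second cell: trigger type flags */
    return 0;
}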
of_irq_parse_one:
/**
 * of_irq_parse_one - Resolve an interrupt for a device
 * @device: the device whose interrupt is to be resolved
 * @index: index of the interrupt to resolve
 * @out_irq: structure of_irq filled by this function
 *
 * This function resolves an interrupt for a node by walking the interrupt tree,
 * finding which interrupt controller node it is attached to, and returning the
 * interrupt specifier that can be used to retrieve a Linux IRQ number.
 */
int of_irq_parse_one(struct device_node *device, int index, struct of_phandle_args *out_irq)
{
    struct device_node *p;
    const __be32 *intspec, *tmp, *addr;
    u32 intsize, intlen;
    int i, res;

    pr_debug("of_irq_parse_one: dev=%s, index=%d\n", of_node_full_name(device), index);

    /* OldWorld mac stuff is "special", handle out of line */
    if (of_irq_workarounds & OF_IMAP_OLDWORLD_MAC)
        return of_irq_parse_oldworld(device, index, out_irq);

    /* Get the reg property (if any) */
    addr = of_get_property(device, "reg", NULL);

    /* Try the new-style interrupts-extended first */
    res = of_parse_phandle_with_args(device, "interrupts-extended",
                                     "#interrupt-cells", index, out_irq);
    if (!res)
        return of_irq_parse_raw(addr, out_irq);

    /* Get the interrupts property */
    intspec = of_get_property(device, "interrupts", &intlen);
    if (intspec == NULL)
        return -EINVAL;

    intlen /= sizeof(*intspec);

    pr_debug(" intspec=%d intlen=%d\n", be32_to_cpup(intspec), intlen);

    /* Look for the interrupt parent. */
    p = of_irq_find_parent(device);
    if (p == NULL)
        return -EINVAL;

    /* Get size of interrupt specifier */
    tmp = of_get_property(p, "#interrupt-cells", NULL);
    if (tmp == NULL) {
        res = -EINVAL;
        goto out;
    }
    intsize = be32_to_cpu(*tmp);

    pr_debug(" intsize=%d intlen=%d\n", intsize, intlen);

    /* Check index */
    if ((index + 1) * intsize > intlen) {
        res = -EINVAL;
        goto out;
    }

    /* Copy intspec into irq structure */
    intspec += index * intsize;
    out_irq->np = p;
    out_irq->args_count = intsize;
    for (i = 0; i < intsize; i++)
        out_irq->args[i] = be32_to_cpup(intspec++);

    /* Check if there are any interrupt-map translations to process */
    res = of_irq_parse_raw(addr, out_irq);
out:
    of_node_put(p);
    return res;
}
EXPORT_SYMBOL_GPL(of_irq_parse_one);
irq_create_fwspec_mapping:
unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec)
{
    struct irq_domain *domain;
    struct irq_data *irq_data;
    irq_hw_number_t hwirq;
    unsigned int type = IRQ_TYPE_NONE;
    int virq;

    if (fwspec->fwnode) {
        domain = irq_find_matching_fwspec(fwspec, DOMAIN_BUS_WIRED);
        if (!domain)
            domain = irq_find_matching_fwspec(fwspec, DOMAIN_BUS_ANY);
    } else {
        domain = irq_default_domain;
    }

    if (!domain) {
        pr_warn("no irq domain found for %s !\n",
                of_node_full_name(to_of_node(fwspec->fwnode)));
        return 0;
    }

    if (irq_domain_translate(domain, fwspec, &hwirq, &type))
        return 0;

    /*
     * WARN if the irqchip returns a type with bits
     * outside the sense mask set and clear these bits.
     */
    if (WARN_ON(type & ~IRQ_TYPE_SENSE_MASK))
        type &= IRQ_TYPE_SENSE_MASK;

    /*
     * If we've already configured this interrupt,
     * don't do it again, or hell will break loose.
     */
    virq = irq_find_mapping(domain, hwirq);
    if (virq) {
        /*
         * If the trigger type is not specified or matches the
         * current trigger type then we are done so return the
         * interrupt number.
         */
        if (type == IRQ_TYPE_NONE || type == irq_get_trigger_type(virq))
            return virq;

        /*
         * If the trigger type has not been set yet, then set
         * it now and return the interrupt number.
         */
        if (irq_get_trigger_type(virq) == IRQ_TYPE_NONE) {
            irq_data = irq_get_irq_data(virq);
            if (!irq_data)
                return 0;

            irqd_set_trigger_type(irq_data, type);
            return virq;
        }

        pr_warn("type mismatch, failed to map hwirq-%lu for %s!\n",
                hwirq, of_node_full_name(to_of_node(fwspec->fwnode)));
        return 0;
    }

    if (irq_domain_is_hierarchy(domain)) {
        virq = irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, fwspec);
        if (virq <= 0)
            return 0;
    } else {
        /* Create mapping */
        virq = irq_create_mapping(domain, hwirq);
        if (!virq)
            return virq;
    }

    irq_data = irq_get_irq_data(virq);
    if (!irq_data) {
        if (irq_domain_is_hierarchy(domain))
            irq_domain_free_irqs(virq, 1);
        else
            irq_dispose_mapping(virq);
        return 0;
    }

    /* Store trigger type */
    irqd_set_trigger_type(irq_data, type);
    return virq;
}
EXPORT_SYMBOL_GPL(irq_create_fwspec_mapping);

C multithread performance issue

I am writing a multi-threaded program to traverse an n x n matrix, where the elements along each diagonal are processed in parallel, as shown in the code below:
int main(int argc, char *argv[])
{
    /* VARIABLES INITIALIZATION HERE */
    gettimeofday(&start_t, NULL); // start timing
    for (int slice = 0; slice < 2 * n - 1; ++slice)
    {
        z = slice < n ? 0 : slice - n + 1;
        int L = 0;
        pthread_t threads[slice - z - z + 1];
        struct thread_data td[slice - z - z + 1];
        for (int j = z; j <= slice - z; ++j)
        {
            td[L].index = L;
            printf("create:%d\n", L);
            pthread_create(&threads[L], NULL, mult_thread, (void *)&td[L]);
            L++;
        }
        for (int j = 0; j < L; j++)
        {
            pthread_join(threads[j], NULL);
        }
    }
    gettimeofday(&end_t, NULL);
    printf("Total time taken by CPU: %ld \n", ((end_t.tv_sec - start_t.tv_sec) * 1000000 + end_t.tv_usec - start_t.tv_usec));
    return (0);
}

void *mult_thread(void *t)
{
    struct thread_data *my_data = (struct thread_data *)t;
    /* SOME ADDITIONAL CODE LINES HERE */
    printf("ThreadFunction:%d\n", (*my_data).index);
    return (NULL);
}
The problem is that this multithreaded implementation gives me very bad performance compared with the serial (naive) implementation.
Are there adjustments that could be made to improve the performance of the multithreaded version?
A thread pool may make it better.
Define a new struct type as follows:
typedef struct {
    struct thread_data *data;
    int status;     // 0: ready
                    // 1: adding data
                    // 2: data handling
                    // 3: done
    int next_free;
} thread_node;
Init:
size_t thread_size = 8;
thread_node *nodes = (thread_node *)malloc(thread_size * sizeof(thread_node));
for (int i = 0; i < thread_size - 1; i++) {
    nodes[i].next_free = i + 1;
    nodes[i].status = 0;
}
nodes[thread_size - 1].next_free = -1;
nodes[thread_size - 1].status = 0;  /* the last node must be marked ready too */
int current_free_node = 0;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
Get a thread:
int alloc() {
    pthread_mutex_lock(&mutex);
    int rt = current_free_node;
    if (current_free_node != -1) {
        current_free_node = nodes[current_free_node].next_free;
        nodes[rt].status = 1;
    }
    pthread_mutex_unlock(&mutex);
    return rt;
}
Return a thread:
void back(int idx) {
    pthread_mutex_lock(&mutex);
    nodes[idx].next_free = current_free_node;
    current_free_node = idx;
    nodes[idx].status = 0;
    pthread_mutex_unlock(&mutex);
}
Create the threads first, and use alloc() to try to get an idle thread, updating the free-list pointer.
Don't use join to judge the status.
Turn your mult_thread into a loop; after a job finishes, just change its node's status to 3 (a sketch follows below).
On each loop iteration in the thread, you may give it more work.
I hope this gives you some help.
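To illustrate the loop idea, a hypothetical sketch of the worker body (the status values follow the thread_node comments above; a production version would block on a condition variable instead of busy-spinning):

/* Hypothetical worker: picks up work when its node is marked 2
 * ("data handling") and flags 3 ("done") when finished. */
void *pool_worker(void *arg)
{
    thread_node *node = (thread_node *)arg;
    for (;;) {
        if (node->status == 2) {
            mult_thread((void *)node->data);  /* do the real work */
            node->status = 3;                 /* done: owner may call back() */
        }
        /* a real pool would wait on a condition variable here */
    }
    return NULL;
}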
------------ UPDATED Apr. 23, 2015 -------------------
Here is an example. Compile and run with:
yu:thread_pool yu$ g++ tp.cc -o tp -pthread --std=c++11 && ./tp
1227135.147 1227176.546 1227217.944 1227259.340...
time cost 1 : 1068.339091 ms
1227135.147 1227176.546 1227217.944 1227259.340...
time cost 2 : 548.221607 ms
You may also remove the timer, and the code can then be compiled as a standard C99 file.
Currently the thread size has been limited to 2. You may also adjust the thread_size parameter and recompile and run again. More threads may give you more of an advantage (on my PC, if I change the thread size to 4, the task finishes in 280 ms), while too large a thread count may not help much if you don't have enough CPU threads.

How to make a microsecond-precise timer on the STM32L-Discovery ARM board?

I'm trying to implement the Dallas OneWire protocol, but I'm having trouble generating a microsecond delay on the STM32L-Discovery.
How do I implement a timer accurate enough to delay the program for x microseconds?
To start, I must tell you that there is no way to accomplish a precise usec delay purely in software. Even if you use an interrupt-based system you will have latencies. Of course, you can achieve better accuracy with a higher CPU frequency.
In order to connect with a 1-Wire device you can use:
An external interface like the DS2482-100
A software 1-Wire implementation using pin polling.
For the second solution you have to use a software-based delay. You can make a flag-polling delay or an interrupt-based flag-polling delay. In both cases you can be sure that a certain amount of time has passed, but you cannot be sure how much more time has passed. This is because of CPU latency, the CPU clock, etc.
For example, consider the following implementation. We program a HW TIMER to count up continuously and we check the TIMER's value. We name "jiffy" the time between TIMER ticks, and "jiffies" the TIMER's max value:
Low level driver part (ex: driver.h)
// ...
#define JF_TIM_VALUE (TIM7->CNT)
int JF_setfreq (uint32_t jf_freq, uint32_t jiffies);
// ...
Low level driver part (ex: driver.c)
// ...
#include <stm32l1xx.h>
#include <misc.h>
#include <stm32l1xx_rcc.h>
#include <stm32l1xx_tim.h>

/*
 * Time base configuration using the TIM7
 * \param jf_freq The TIMER's frequency
 * \param jiffies The TIMER's max count value
 */
int JF_setfreq (uint32_t jf_freq, uint32_t jiffies) {
    uint32_t psc = 0;

    RCC_APB1PeriphClockCmd(RCC_APB1Periph_TIM7, ENABLE);
    SystemCoreClockUpdate ();
    if (jf_freq)
        psc = (SystemCoreClock / jf_freq) - 1;
    if (psc < 0xFFFF)  TIM7->PSC = psc;
    else               return 1;

    if (jiffies < 0xFFFF)  TIM7->ARR = jiffies;
    else                   return 1;

    TIM7->CR1 |= TIM_CR1_CEN;
    return 0;
}
// ...
Middleware jiffy system with some delay implementations
jiffy.h:
#include "string.h"
typedef int32_t jiffy_t; // Jiffy type 4 byte integer
typedef int (*jf_setfreq_pt) (uint32_t, uint32_t); //Pointer to setfreq function
typedef volatile struct {
jf_setfreq_pt setfreq; // Pointer to driver's timer set freq function
jiffy_t *value; // Pointer to timers current value
uint32_t freq; // timer's frequency
uint32_t jiffies; // jiffies max value (timer's max value)
jiffy_t jpus; // Variable for the delay function
}jf_t;
/*
 * ============= PUBLIC jiffy API =============
 */

/*
 * Link functions
 */
void jf_link_setfreq (jf_setfreq_pt pfun);
void jf_link_value (jiffy_t* v);

/*
 * User Functions
 */
void jf_deinit (void);
int  jf_init (uint32_t jf_freq, uint32_t jiffies);
jiffy_t jf_per_usec (void);
void jf_delay_us (int32_t usec);
int  jf_check_usec (int32_t usec);
jiffy.c:
#include "jiffy.h"
static jf_t _jf;
#define JF_MAX_TIM_VALUE (0xFFFF) // 16bit counters
//Connect the Driver's Set frequency function
void jf_link_setfreq (jf_setfreq_pt pfun) {
_jf.setfreq = pfun;
}
// Connect the timer's value to jiffy struct
void jf_link_value (jiffy_t* v) {
_jf.value = v;
}
// De-Initialize the jf data and un-connect the functions
// from the driver
void jf_deinit (void) {
memset ((void*)&_jf, 0, sizeof (jf_t));
}
// Initialise the jf to a desired jiffy frequency f
int jf_init (uint32_t jf_freq, uint32_t jiffies) {
if (_jf.setfreq) {
if ( _jf.setfreq (jf_freq, jiffies) )
return 1;
_jf.jiffies = jiffies;
_jf.freq = jf_freq;
_jf.jpus = jf_per_usec ();
return 0;
}
return 1;
}
// Return the systems best approximation for jiffies per usec
jiffy_t jf_per_usec (void) {
jiffy_t jf = _jf.freq / 1000000;
if (jf <= _jf.jiffies)
return jf;
else
// We can not count beyond timer's reload
return 0;
}
/*!
* \brief
* A code based delay implementation, using jiffies for timing.
* This is NOT accurate but it ensures that the time passed is always
* more than the requested value.
* The delay values are multiplications of 1 usec.
* \param
* usec Time in usec for delay
*/
void jf_delay_us (int32_t usec) {
jiffy_t m, m2, m1 = *_jf.value;
usec *= _jf.jpus;
if (*_jf.value - m1 > usec) // Very small delays will return here.
return;
// Delay loop: Eat the time difference from usec value.
while (usec>0) {
m2 = *_jf.value;
m = m2 - m1;
usec -= (m>0) ? m : _jf.jiffies + m;
m1 = m2;
}
}
/*!
* \brief
* A code based polling version delay implementation, using jiffies for timing.
* This is NOT accurate but it ensures that the time passed is always
* more than the requested value.
* The delay values are multiplications of 1 usec.
* \param
* usec Time in usec for delay
*/
int jf_check_usec (int32_t usec) {
static jiffy_t m1=-1, cnt;
jiffy_t m, m2;
if (m1 == -1) {
m1 = *_jf.value;
cnt = _jf.jpus * usec;
}
if (cnt>0) {
m2 = *_jf.value;
m = m2-m1;
cnt-= (m>0) ? m : _jf.jiffies + m;
m1 = m2;
return 1; // wait
}
else {
m1 = -1;
return 0; // do not wait any more
}
}
Hmmm you made it till here. Nice
So now you can use it in your application like this:
main.c:
#include "driver.h"
#include "jiffy.h"
void do_some_job1 (void) {
// job 1
}
void do_some_job2 (void) {
// job 2
}
int main (void) {
jf_link_setfreq ((jf_setfreq_pt)JF_setfreq); // link with driver
jf_link_value ((jiffy_t*)&JF_TIM_VALUE);
jf_init (1000000, 1000); // 1MHz timer, 1000 counts, 1 usec per count
// use delay version
do_some_job1 ();
jf_delay_us (300); // wait for at least 300 usec
do_some_job1 ();
// use polling version
do_some_job1 ();
while (jf_check_usec (300)) {
do_some_job2 (); // keep calling for at least 300 usec
}
}

Writing a scheduler for a Userspace thread library

I am developing a userspace preemptive thread library (fibre) that uses context switching as the base approach. For this I wrote a scheduler. However, it's not performing as expected. Can I have any suggestions?
The structure of the thread_t used is :
typedef struct thread_t {
    int thr_id;
    int thr_usrpri;
    int thr_cpupri;
    int thr_totalcpu;
    ucontext_t thr_context;
    void *thr_stack;
    int thr_stacksize;
    struct thread_t *thr_next;
    struct thread_t *thr_prev;
} thread_t;
The scheduling function is as follows:
void schedule(void)
{
    thread_t *t1, *t2;
    thread_t *newthr = NULL;
    int newpri = 127;
    struct itimerval tm;
    ucontext_t dummy;
    sigset_t sigt;

    t1 = ready_q;
    // Select the thread with the highest priority
    while (t1 != NULL)
    {
        if (newpri > t1->thr_usrpri + t1->thr_cpupri)
        {
            newpri = t1->thr_usrpri + t1->thr_cpupri;
            newthr = t1;
        }
        t1 = t1->thr_next;
    }
    if (newthr == NULL)
    {
        if (current_thread == NULL)
        {
            // No more threads? (stop itimer)
            tm.it_interval.tv_usec = 0;
            tm.it_interval.tv_sec = 0;
            tm.it_value.tv_usec = 0; // ZERO Disable
            tm.it_value.tv_sec = 0;
            setitimer(ITIMER_PROF, &tm, NULL);
        }
        return;
    }
    else
    {
        // TO DO :: Re-enabling of signals must be done.
        // Switch to new thread
        if (current_thread != NULL)
        {
            t2 = current_thread;
            current_thread = newthr;
            timeq = 0;
            sigemptyset(&sigt);
            sigaddset(&sigt, SIGPROF);
            sigprocmask(SIG_UNBLOCK, &sigt, NULL);
            swapcontext(&(t2->thr_context), &(current_thread->thr_context));
        }
        else
        {
            // No current thread? might be terminated
            current_thread = newthr;
            timeq = 0;
            sigemptyset(&sigt);
            sigaddset(&sigt, SIGPROF);
            sigprocmask(SIG_UNBLOCK, &sigt, NULL);
            swapcontext(&(dummy), &(current_thread->thr_context));
        }
    }
}
It seems that the "ready_q" (the head of the list of ready threads?) never changes, so the search for the highest-priority thread always finds the first suitable element. If two threads have the same priority, only the first one ever has a chance to gain the CPU. There are many algorithms you can use: some are based on a dynamic change of priority, others use a sort of rotation inside the ready queue. In your example you could remove the selected thread from its place in the ready queue and put it at the last place (it's a doubly linked list, so the operation is trivial and quite inexpensive); a sketch follows below.
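For example, a sketch of the rotation idea (hypothetical helper, assuming ready_q is the head of a non-circular doubly linked list using the thr_next/thr_prev fields shown above):

/* Hypothetical: unlink the chosen thread and append it at the tail,
 * so equal-priority threads take turns at the front of the list. */
void move_to_tail(thread_t **ready_q, thread_t *thr)
{
    if (thr->thr_next == NULL)  /* already the tail */
        return;
    /* unlink */
    if (thr->thr_prev)
        thr->thr_prev->thr_next = thr->thr_next;
    else
        *ready_q = thr->thr_next;  /* thr was the head */
    thr->thr_next->thr_prev = thr->thr_prev;
    /* walk to the tail and append */
    thread_t *tail = *ready_q;
    while (tail->thr_next)
        tail = tail->thr_next;
    tail->thr_next = thr;
    thr->thr_prev = tail;
    thr->thr_next = NULL;
}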
Also, I'd suggest you consider the performance cost of the linear search in ready_q, since it may become a problem when the number of threads is large. In that case a more sophisticated structure may help, with separate lists of threads for different priority levels.
Bye!
