runtime scheduling abstraction - c

I am asked to write a program, an abstraction of runtime scheduling algorithms, in C. The one I'm having a problem with is "priority scheduling". It takes runtimes from the user like the other algorithms, except it also takes the priorities of these processes. What troubles me is that I can't relate the priorities to the runtimes. How am I supposed to make a relationship between the priority and the runtime? According to the algorithm, it runs the one with the highest priority first. I just need to know how to make this connection, thanks.

EDIT :-
Store the priorities in an array, sort the array in decreasing order of priority, and map each priority to its runtime. Then simply display the sorted array from the start, along with each entry's process time.
If processes are added at run-time, you need to adopt a greedy algorithm to solve the problem: select the process with the highest priority, in increasing order of arrival time.
Check the Interval Scheduling Algorithm to learn more.
By the way, for implementing the real scheduling algorithm:
If the first process is currently running and a new process arrives with the same or a lower priority level, just add the new process to a FIFO data structure (queue) so that it gets executed next (for equal/lower priority).
If the first process is currently running and a new process with a higher priority arrives and requests memory, you must push the current process's upcoming instruction onto the stack, execute the higher-priority process, and then return from the interrupt to execute the saved instruction.
I hope you are no longer confused about which data structures to use in each particular case. Also, note the significant role the higher priority plays!
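To make the idea concrete, here is a minimal C sketch of the sorted-array approach, assuming a struct ties each priority to its runtime; the field names and sample values are invented:

#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int pid;        /* process id */
    int priority;   /* higher value = runs earlier */
    int runtime;    /* burst time entered by the user */
} Process;

/* comparator for qsort: decreasing priority */
static int by_priority_desc(const void *a, const void *b) {
    return ((const Process *)b)->priority - ((const Process *)a)->priority;
}

int main(void) {
    Process p[] = { {1, 2, 5}, {2, 9, 3}, {3, 5, 8} };   /* sample input */
    int n = sizeof p / sizeof p[0];

    qsort(p, n, sizeof p[0], by_priority_desc);          /* sort by priority */

    int clock = 0;
    for (int i = 0; i < n; i++) {                        /* "run" in sorted order */
        printf("t=%d: run P%d (priority %d) for %d units\n",
               clock, p[i].pid, p[i].priority, p[i].runtime);
        clock += p[i].runtime;
    }
    return 0;
}

The struct itself is the connection the question asks about: once priority and runtime travel together in one record, sorting by one keeps the other attached.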

Related

Does Flink's windowing operation process elements at the end of window or does it do a rolling processing?

I am having some trouble understanding the way windowing is implemented internally in Flink and could not find any article which explains this in depth. In my mind, there are two ways this can be done. Consider the simple window wordcount code below:
env.socketTextStream("localhost", 9999)
.flatMap(new Splitter())
.groupBy(0)
.window(Time.of(500, TimeUnit.SECONDS)).sum(1)
Method 1: Store all events for 500 seconds and, at the end of the window, process all of them by applying the sum operation to the stored events.
Method 2: Use a counter to store a rolling sum for every window. As each event in a window arrives, we do not store the individual event but keep adding 1 to the previously stored counter, and output the result at the end of the window.
Could someone kindly help me understand which of the above methods (or maybe a different approach) is used by Flink in reality? The reason I ask is that there are pros and cons to both approaches, and it is important to understand them in order to configure the resources for the cluster correctly.
E.g., Method 1 seems very close to batch processing and might potentially suffer from a spike in processing at every 500-second interval while sitting idle otherwise, while Method 2 would need to maintain a common counter between all task managers.
sum is a reducing function, as mentioned here (https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/windows/#reducefunction). Internally, Flink will apply the reduce function to each input element and simply save the reduced result in ReducingState.
For other window functions, like apply(WindowFunction), there is no incremental aggregation, so all input elements will be saved in ListState.
This document (https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/windows/#window-functions) about window functions describes how the elements are handled internally in Flink.
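The distinction is easy to see in plain C, leaving Flink aside entirely; this is only a conceptual sketch of the two storage strategies, with invented event values, not Flink code:

#include <stdio.h>

int main(void) {
    int events[] = {1, 1, 1, 1, 1};   /* per-word counts arriving in one window */
    int n = sizeof events / sizeof events[0];

    /* ReducingState analog (Method 2): state is a single value, updated per element */
    int running_sum = 0;
    for (int i = 0; i < n; i++)
        running_sum += events[i];     /* reduce applied as each element arrives */

    /* ListState analog (Method 1): every element is buffered until the window fires */
    int buffer[5];
    int buffered = 0;
    for (int i = 0; i < n; i++)
        buffer[buffered++] = events[i];
    int fired_sum = 0;
    for (int i = 0; i < buffered; i++)
        fired_sum += buffer[i];       /* window function runs only at fire time */

    printf("incremental: %d, buffered: %d\n", running_sum, fired_sum);
    return 0;
}

So Method 2 is what happens for sum, and Method 1 is what happens for a general WindowFunction. Note also that window state in Flink is keyed, so each running sum lives with the task instance that owns its key; no counter needs to be shared across task managers.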

How to do priority based processing

I have a linked list containing n processes, and these processes are sorted in decreasing order of priority to run. So the 1st process, i.e. the 1st node of the linked list, has the maximum priority, then the 2nd node, and so on.
At any one time instant I can only run 8 processes. What I want is for the 6 highest-priority and the 2 lowest-priority processes to run at one time instant.
What I had done is rearrange the linked list so that the first six nodes have the highest priority and the next two nodes the lowest priority, then a further 6 high-priority nodes and 2 low-priority nodes, repeating until all nodes are covered. But this does not do what I want: if any of the 6 highest-priority processes ends, another high-priority process should take its place, and if a low-priority process ends, a low-priority one may take its place.
How can I implement this? (Also, this is my first question, so if there are any problems with the way I have put it, please point them out.)
I can solve this problem with two methods.
1. Use this list as a deque. A deque is a data structure like a queue, except that we can push and pop elements at both the front and the back. We maintain two counters, High-counter and Low-counter: High-counter keeps track of the 6 high-priority processes and Low-counter keeps track of the 2 low-priority processes, so High-counter has a maximum value of 6 and Low-counter a maximum of 2. At the start, initialize both counters to zero. While High-counter < 6, pop an element from the front of the deque and increment High-counter; while Low-counter < 2, pop an element from the back of the deque and increment Low-counter. Whenever a high-priority process completes, decrement High-counter, and similarly for a low-priority process. Since the counters are then below their respective maximums, repeat the two popping loops inside an outer loop until the deque is empty. Note that "popping" a process here means running it. (A rough C sketch of this method follows below.)
2. Use the same approach, but with semaphores in place of the counters.
I am not claiming my solution is 100% correct, but it is close. Also, tag the question with operating-system to attract better answers.
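Here is that single-threaded C sketch of the first method, assuming the process list is already sorted by decreasing priority and that one "tick" of running decrements a remaining-time field; the ids and runtimes below are invented:

#include <stdio.h>

typedef struct { int id; int remaining; } Proc;

int main(void) {
    /* deque contents, already sorted by decreasing priority */
    Proc dq[] = {{1,3},{2,1},{3,2},{4,2},{5,1},{6,3},{7,2},{8,1},{9,2},{10,1}};
    int front = 0, back = (int)(sizeof dq / sizeof dq[0]) - 1;

    Proc run[8];                        /* at most 8 resident processes */
    int is_high[8];                     /* 1 if the slot holds a high-priority process */
    int nrun = 0, high = 0, low = 0;    /* High-counter (max 6), Low-counter (max 2) */

    while (front <= back || nrun > 0) {
        /* refill high slots from the front, low slots from the back */
        while (high < 6 && front <= back) { run[nrun] = dq[front++]; is_high[nrun++] = 1; high++; }
        while (low  < 2 && front <= back) { run[nrun] = dq[back--];  is_high[nrun++] = 0; low++;  }

        /* run every resident process for one tick */
        for (int i = 0; i < nrun; i++) {
            printf("running P%d (%s)\n", run[i].id, is_high[i] ? "high" : "low");
            run[i].remaining--;
        }
        /* retire finished processes, freeing their counter for a replacement */
        for (int i = 0; i < nrun; ) {
            if (run[i].remaining == 0) {
                if (is_high[i]) high--; else low--;
                nrun--; run[i] = run[nrun]; is_high[i] = is_high[nrun];
            } else {
                i++;
            }
        }
    }
    return 0;
}

The key point the sketch shows is that retiring a process only decrements its counter; the refill loops at the top of the outer loop then pull the replacement from the correct end of the deque automatically.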

Getting JMeter to work with Throughput Shaping timer and Concurrency Thread Group

I am trying to shape a JMeter test involving a Concurrency Thread Group and a Throughput Shaping Timer, as documented here and here. The timer is configured to run ten ramps and stages with RPS from 1 to 333.
I want to set up the Concurrency Thread Group to use the schedule feedback function, so I added the formula in the Target Concurrency field (I have updated the example from tst-name to the actual timer name). Ramp-up time and steps I have set to 1, as I assume those properties are not that important if the throughput is managed by the timer; the Hold Target Rate time is 8000, which is longer than the steps added in the timer (6200).
When I run the test, it ends without any exceptions within 3 seconds or so. The log file shows a few rows about starting and ending threads but nothing alarming.
The only thing I find suspicious is the log entry "VirtualUserController: Test limit reached, thread is done" plus the thread name.
I am not getting enough clues from the documentation linked here to figure this out myself; do you have any hints?
According to the documentation, Ramp Up Time and Ramp-Up Steps Count should be left blank:
"When using this approach, leave Concurrency Thread Group Ramp Up Time and Ramp-Up Steps Count fields blank"
So your assumption that setting them to 1 is OK seems to be false...
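For reference, a minimal sketch of the relevant Concurrency Thread Group fields, assuming the Throughput Shaping Timer's __tstFeedback function; the timer name my-shaping-timer and the 1/400/10 bounds are placeholder values, not taken from the question:

Target Concurrency:     ${__tstFeedback(my-shaping-timer,1,400,10)}
Ramp Up Time:           (blank)
Ramp-Up Steps Count:    (blank)
Hold Target Rate Time:  8000

The function's arguments are the timer name, the minimum concurrency, the maximum concurrency, and the spare threads, so the feedback loop can grow and shrink the pool to hit the shaped RPS.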

Using JGraphT to Manage Ordering of Dependent Tasks

I have a list of tasks that have dependencies between them, and I was considering how I could use JGraphT to manage the ordering of the tasks. I would set up the graph as a directed graph and remove vertices as I processed them (or should I mask them?). I could use TopologicalOrderIterator if I were only going to execute one task at a time, but I'm hoping to parallelize the tasks. I could get TopologicalOrderIterator and check Graphs.vertexHasPredecessors until I find as many as I want to execute at once, but ideally there would be something like Graphs.getVerticesWithNoPredecessors. I see that Netflix provides a utility to get leaf vertices, so I could reverse the graph and use that, but it's probably not worth it. Can anyone point me to a better way? Thanks!
A topological order may not necessarily be what you want. Here's an example of why not. Take the topological ordering of tasks [1,2,3,4] and the arcs (1,3), (2,4); that is, task 1 needs to be completed before task 3, and similarly for 2 and 4. Let's also assume that task 1 takes a really long time to complete. We can start processing tasks 1 and 2 in parallel, but we cannot start 3 before 1 completes. Even when task 2 completes, we cannot start task 4, because task 3 is the next task in our ordering and it is still being blocked by 1.
Here's what you could do instead. Create an array dep[] which tracks the number of unfulfilled dependencies per task. dep[i]==0 means that all dependencies for task i have been fulfilled, so we can now perform task i; if dep[i]>0, we cannot perform task i yet. Let's assume that there is a task j which needs to be performed prior to task i. As soon as we complete task j, we can decrement the number of unfulfilled dependencies of task i, i.e. dep[i]=dep[i]-1. Again, if dep[i]==0 afterwards, we are now ready to process task i.
So in short, the algorithm in pseudocode would look like this:
Initialize the dep[] array.
Start processing, in parallel, all tasks i with dep[i]==0.
When a task i completes, decrement dep[j] for every task j which depends on i. If dep[j]==0, start processing task j. (A C sketch of this bookkeeping follows below.)
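A minimal single-threaded C sketch of that bookkeeping, with an invented arc list (task 0 before task 2, task 1 before task 3); in a real executor the complete() calls would come from worker callbacks rather than main():

#include <stdio.h>

enum { N = 4, M = 2 };
int arc_from[M] = {0, 1};   /* arcs: 0 before 2, 1 before 3 */
int arc_to[M]   = {2, 3};
int dep[N];                 /* unfulfilled-dependency count per task */

/* called when task t finishes: release the tasks that were waiting on it */
void complete(int t) {
    for (int e = 0; e < M; e++)
        if (arc_from[e] == t && --dep[arc_to[e]] == 0)
            printf("task %d is now ready\n", arc_to[e]);
}

int main(void) {
    for (int e = 0; e < M; e++)     /* initialize the counts from the arc list */
        dep[arc_to[e]]++;
    for (int i = 0; i < N; i++)
        if (dep[i] == 0)
            printf("task %d is ready at start\n", i);
    complete(0);                    /* simulate completions; a real executor */
    complete(1);                    /* would call this from worker threads   */
    return 0;
}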
You could certainly use a directed graph to model the dependencies. Each time you complete a task, you would simply iterate over its outgoing neighbors (in JGraphT, the Graphs.successorListOf(graph, vertex) function). The DAG can also simply be used to check feasibility: if the graph contains a cycle, you have a problem in your dependencies. However, if you don't need this heavy machinery, I would simply create a 2-dimensional array where for each task i you store the tasks that depend on i.
The resulting algorithm runs in O(n+m) time, where n is the number of tasks and m the number of arcs (dependencies). So this is very efficient.

which size of chunk will yield to best performance using master-worker with MPI?

I'm using MPI to parallelize a program that tries to solve the metric TSP problem. I have P processors and N cities to visit.
Each worker asks the master for work and receives a chunk, which is a range of permutations it should check, and calculates the minimum among them. I optimize this by pruning bad routes in advance.
There are (N-1)! routes in total to calculate. Each worker gets a chunk containing the number of the first route it has to check and also the last. In addition, the master sends it the most recent best result known, so it can easily prune bad routes in advance using a lower bound on their remaining length.
Each time a worker finds a result that is better than the global best, it asynchronously sends it to all the other workers and to the master.
I'm not looking for a better solution; I'm just trying to determine which chunk size is best.
The best chunk size I've found so far is n!/(n/2)!, but it doesn't yield very good results.
Please help me understand which chunk size is best here; I'm trying to balance the amount of computation against communication.
Thanks
This depends heavily on factors beyond your control: MPI implementation, total load on the machine, etc. However, I'd hazard a guess that it also heavily depends on how many worker processes there are. On that note, understand that MPI spawns processes, not threads.
Ultimately, as is often the case with most optimization questions, the answer is simply "test a lot of different settings and see which one is best". You may want to do this manually, or write a tester app that implements some sort of heuristic (e.g. a genetic algorithm).
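In that spirit, here is a hedged C sketch of a master-worker loop where the chunk size is a single knob you can sweep across runs; check_range() is a stand-in for the permutation evaluation, the total/chunk values are placeholders rather than numbers from the question, and the asynchronous best-result exchange is omitted:

#include <mpi.h>
#include <stdio.h>

#define TAG_WORK 1
#define TAG_STOP 2

/* stand-in for evaluating permutations first..last-1 and returning a local best */
static long check_range(long first, long last) {
    return last - first;
}

int main(int argc, char **argv) {
    long total = 3628800;   /* placeholder: e.g. (11-1)! routes for N = 11 */
    long chunk = 10000;     /* the knob to sweep when benchmarking */
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {        /* master: deal out [first,last) ranges on demand */
        long next = 0, range[2] = {0, 0};
        int active = size - 1;
        MPI_Status st;
        while (active > 0) {
            long req;
            MPI_Recv(&req, 1, MPI_LONG, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (next < total) {
                range[0] = next;
                range[1] = (next + chunk < total) ? next + chunk : total;
                next = range[1];
                MPI_Send(range, 2, MPI_LONG, st.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD);
            } else {
                MPI_Send(range, 2, MPI_LONG, st.MPI_SOURCE, TAG_STOP, MPI_COMM_WORLD);
                active--;
            }
        }
    } else {                /* worker: request, compute, repeat until stopped */
        long range[2], req = 0;
        MPI_Status st;
        for (;;) {
            MPI_Send(&req, 1, MPI_LONG, 0, TAG_WORK, MPI_COMM_WORLD);
            MPI_Recv(range, 2, MPI_LONG, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP) break;
            check_range(range[0], range[1]);
        }
    }
    MPI_Finalize();
    return 0;
}

Because the master hands out ranges only when a worker asks, smaller chunks give better load balance at the cost of more request/reply round trips; timing the whole run for several chunk values is exactly the "test a lot of different settings" approach above.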
