FreeBSD Manual Pages


SCHEDULER(9)		 BSD Kernel Developer's	Manual		  SCHEDULER(9)

     curpriority_cmp, maybe_resched, resetpriority, roundrobin,
     roundrobin_interval, sched_setup, schedclock, schedcpu, setrunnable,
     updatepri -- perform round-robin scheduling of runnable processes

     #include <sys/param.h>
     #include <sys/proc.h>

     curpriority_cmp(struct proc *p);

     maybe_resched(struct thread *td);

     propagate_priority(struct proc *p);

     resetpriority(struct ksegrp *kg);

     roundrobin(void *arg);

     roundrobin_interval(void);

     sched_setup(void *dummy);

     schedclock(struct thread *td);

     schedcpu(void *arg);

     setrunnable(struct thread *td);

     updatepri(struct thread *td);

     Each process has three different priorities stored in struct proc:
     p_usrpri, p_nativepri, and p_priority.

     The p_usrpri member is the user priority of the process, calculated from
     the process's estimated CPU time and nice level.

     The p_nativepri member is the saved priority used by
     propagate_priority().  When a process obtains a mutex, its priority is
     saved in p_nativepri.  While it holds the mutex, the process's priority
     may be bumped by another process that blocks on the mutex.  When the
     process releases the mutex, its priority is restored to the value saved
     in p_nativepri.

     The p_priority member is the actual priority of the process and is used,
     for example, to determine which runqueue(9) it is placed on.

     The curpriority_cmp() function compares the cached priority of the
     currently running process with that of process p.  If the currently
     running process has a higher priority, it returns a value less than
     zero.  If the currently running process has a lower priority, it returns
     a value greater than zero.  If both have the same priority,
     curpriority_cmp() returns zero.  The cached priority of the currently
     running process is updated when a process resumes from tsleep(9) or
     returns to userland in userret(), and is stored in the private variable
     curpriority.
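
     The return convention above can be sketched as a simple subtraction.
     This is an illustration only, not the kernel source: the struct and the
     names ending in _sketch are hypothetical stand-ins, and it assumes that
     a numerically lower value means a higher priority.

```c
/* Hypothetical stand-in for struct proc; only the field used here. */
struct proc_sketch {
    int p_priority;     /* assumed: lower value = higher priority */
};

/* Hypothetical stand-in for the private cached priority variable. */
static int curpriority_sketch;

/*
 * Negative if the running process has the higher priority, positive if
 * p does, zero if they are equal.
 */
static int
curpriority_cmp_sketch(const struct proc_sketch *p)
{
    return (curpriority_sketch - p->p_priority);
}
```

     With a cached priority of 100, comparing against a process whose
     p_priority is 120 yields a negative result, so the running process
     keeps the CPU.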

     The maybe_resched() function compares the priorities of the current
     thread and td.  If td has a higher priority than the current thread, then
     a context switch is needed, and KEF_NEEDRESCHED is set.
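
     A minimal sketch of that check follows.  The struct layout, the flag
     value, and the choice to store the flag on the current thread are
     assumptions made for illustration; only the name KEF_NEEDRESCHED comes
     from this page.  Lower values are again assumed to mean higher
     priority.

```c
#define KEF_NEEDRESCHED_SKETCH 0x1   /* hypothetical flag value */

/* Hypothetical stand-in for struct thread. */
struct thread_sketch {
    int priority;   /* assumed: lower value = higher priority */
    int flags;
};

/* Hypothetical stand-in for the currently running thread. */
static struct thread_sketch *curthread_sketch;

/* Request a context switch if td should preempt the current thread. */
static void
maybe_resched_sketch(const struct thread_sketch *td)
{
    if (td->priority < curthread_sketch->priority)
        curthread_sketch->flags |= KEF_NEEDRESCHED_SKETCH;
}
```

     The function only records that a switch is wanted; performing the
     switch is left to whatever later inspects the flag.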

     The propagate_priority() function looks at the process that owns the
     mutex p is blocked on.  That process's priority is bumped to the priority
     of p if needed.  If the process is currently running, then the function
     returns.  If the process is on a runqueue(9), then the process is moved
     to the appropriate runqueue(9) for its new priority.  If the process is
     blocked on a mutex, its position in the list of processes blocked on that
     mutex is updated to reflect its new priority.  Then, the function repeats
     the procedure using the process that owns the mutex just encountered.
     Note that a process's priority is only ever bumped to the priority of the
     original process p, not to the priority of the previously encountered
     process.
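
     The chain walk described above might look roughly like the following.
     All types and names here are hypothetical, queue manipulation is
     reduced to comments, and lower values are assumed to mean higher
     priority.  It is a sketch of the description, not of the kernel code.

```c
#include <stddef.h>

enum pstate_sketch { PS_RUNNING, PS_ON_RUNQ, PS_BLOCKED };

struct mtx_sketch;

/* Hypothetical stand-in for struct proc. */
struct proc_sketch {
    int priority;                   /* assumed: lower = higher priority */
    enum pstate_sketch state;
    struct mtx_sketch *blocked_on;  /* mutex being waited for, or NULL */
};

/* Hypothetical stand-in for a mutex; only its owner matters here. */
struct mtx_sketch {
    struct proc_sketch *owner;
};

static void
propagate_priority_sketch(struct proc_sketch *p)
{
    const int pri = p->priority;    /* always the original p's priority */
    struct mtx_sketch *m = p->blocked_on;

    while (m != NULL && m->owner != NULL) {
        struct proc_sketch *owner = m->owner;

        if (owner->priority > pri)
            owner->priority = pri;  /* bump only if needed */

        if (owner->state == PS_RUNNING)
            return;                 /* running: nothing to requeue */
        if (owner->state == PS_ON_RUNQ)
            return;                 /* real code would switch run queues */

        /* Blocked: real code would re-sort the mutex's waiter list. */
        m = owner->blocked_on;
    }
}
```

     Because pri is captured once, an owner further down the chain is
     lifted to the priority of p even when an intermediate owner already
     had a higher priority than p.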

     The resetpriority() function recomputes the user priority of the ksegrp
     kg (stored in kg_user_pri) and calls maybe_resched() to force a
     reschedule of each thread in the group if needed.

     The roundrobin() function is used as a timeout(9) function to force a
     reschedule every sched_quantum ticks.

     The roundrobin_interval() function returns the number of clock ticks
     between reschedules triggered by roundrobin(), that is, the current
     value of sched_quantum.

     The sched_setup() function is a SYSINIT(9) that is called to start the
     callout-driven scheduler functions.  It just calls the roundrobin() and
     schedcpu() functions for the first time.  After the initial call, the two
     functions keep themselves running by registering their callout again at
     the end of each invocation.

     The schedclock() function is called by statclock() to adjust the priority
     of the currently running thread's ksegrp.  It updates the group's
     estimated CPU time and then adjusts the priority via resetpriority().

     The schedcpu() function updates all process priorities.  First, it
     updates statistics that track how long processes have been in various
     process states.  Secondly, it updates the estimated CPU time for the
     current process such that about 90% of the CPU usage is forgotten in 5 *
     load average seconds.  For example, if the load average is 2.00, then
     about 90% of the estimated CPU time for the process should be based on
     the amount of CPU time the process has had in the last 10 seconds.  It
     then recomputes the priority of the process and moves it to the
     appropriate runqueue(9) if necessary.  Thirdly, it updates the %CPU
     estimate used by utilities such as ps(1) and top(1) so that 95% of the
     CPU usage is forgotten in 60 seconds.  Once all process priorities have
     been updated, schedcpu() calls vmmeter() to update various other
     statistics including the load average.  Finally, it schedules itself to
     run again in hz clock ticks.
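
     The two forgetting rates can be checked with a little arithmetic.  The
     sketch below assumes the classic BSD per-second decay factor of
     (2 * load) / (2 * load + 1) for estimated CPU time and a per-second
     factor of exp(-1/20) for the %CPU estimate.  Neither constant is
     stated on this page, so treat both as assumptions; the function and
     macro names are hypothetical.

```c
/* Assumed per-second decay factor for estimated CPU time. */
static double
estcpu_decay_factor(double loadavg)
{
    return (2.0 * loadavg) / (2.0 * loadavg + 1.0);
}

/* Fraction of an initial value left after applying factor once a second. */
static double
fraction_remaining(double per_second_factor, int seconds)
{
    double remaining = 1.0;
    int i;

    for (i = 0; i < seconds; i++)
        remaining *= per_second_factor;
    return (remaining);
}

/* Assumed per-second %CPU decay factor: exp(-1/20), written out. */
#define CCPU_SKETCH 0.95122942450071400909
```

     With a load average of 2.00 the factor is 0.8, and
     fraction_remaining(0.8, 10) is about 0.107, so roughly 90% of the old
     estimate is gone after 10 seconds.  fraction_remaining(CCPU_SKETCH, 60)
     is just under 0.05, which matches 95% forgotten in 60 seconds.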

     The setrunnable() function is used to change a process's state to be
     runnable.  The process is placed on a runqueue(9) if needed, and the
     swapper process is woken up and told to swap the process in if the
     process is swapped out.  If the process has been asleep for at least one
     run of schedcpu(), then updatepri() is used to adjust the priority of the
     process.

     The updatepri() function is used to adjust the priority of a process that
     has been asleep.  It retroactively decays the estimated CPU time of the
     process for each schedcpu() event that the process was asleep.  Finally,
     it calls resetpriority() to adjust the priority of the process.
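
     The retroactive decay can be pictured as applying the per-run decay
     once for every schedcpu() run that was missed.  The sketch below uses
     hypothetical names and integer arithmetic, and it assumes the decay
     step is estcpu * loadfac / (loadfac + 1) with loadfac equal to twice
     the load average.  That formula is not given on this page.

```c
/* Hypothetical stand-in for the fields of struct ksegrp used here. */
struct ksegrp_sketch {
    unsigned int estcpu;    /* estimated CPU time */
    unsigned int slptime;   /* schedcpu() runs spent asleep */
};

/* Apply one decay step per missed schedcpu() run. */
static void
updatepri_sketch(struct ksegrp_sketch *kg, unsigned int loadfac)
{
    unsigned int i;

    for (i = 0; i < kg->slptime; i++)
        kg->estcpu = (kg->estcpu * loadfac) / (loadfac + 1);

    /* The real function would now call resetpriority(). */
}
```

     For example, with loadfac set to 2, an estimate of 90 that missed two
     runs decays to 60 and then to 40.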

     mi_switch(9), runqueue(9), sleepqueue(9), tsleep(9)

     The curpriority variable really should be per-CPU.  In addition,
     maybe_resched() should compare the priority of td with that of each CPU,
     and then send an IPI to the processor with the lowest priority to trigger
     a reschedule if needed.

     Priority propagation is broken and is thus disabled by default.  The
     p_nativepri variable is only updated if a process does not obtain a sleep
     mutex on the first try.  Also, if a process obtains more than one sleep
     mutex in this manner, and had its priority bumped in between, then
     p_nativepri will be clobbered.

BSD			       November	3, 2000				   BSD

