= Scheduler job matchmaking = Currently the scheduler's work-sending algorithm is: * if host is reliable, scan entire job array, looking for retries * if using HR, scan entire job array, looking for jobs committed to this HR class * if still need work, scan entire job array with no restrictions This is bad for several reasons: * inefficient: repeated array scans * inflexible policy: why should reliable be more important than HR? * hard to add new policies, like sending "hard" jobs to fasts hosts To solve these problems we're going to change this part of the scheduler. Basic idea: given a job J and a host H, there is a function V(J, H) that represents the value of sending J to H. V(J, H) might reflect various factors: * the computational "hardness" of J * CPU/deadline ratio * RAM or disk requirements * H already has files required by J(or jobs already planned for sending have such files) * J is a retry and H is a fast/reliable host * J has already been committed to H's HR class BOINC includes a default value function. Projects can tweak its weights, or define their own value function. == Details == functions: job_set_feasible(set):: checks if a set of jobs is feasible (no DB access) * disk usage * deadline check (EDF sim or crude) * one result per user/host (no DB) job_feasible_fast(j):: feasibility checks that can be done with no DB access * WU committed to different platform (no DB check) * app filtering * memory usage job_feasible_slow(j):: feasibility checks that need DB access * one result per user or host per WU (look in DB) * WU committed to different platform (look in DB) Parameters: N:: scan at least this many slots (if scan N slots and have enough jobs to send, stop) M:: scan at most this many slots (even if don't have enough jobs to send yet) L:: if scan this many locked slots, print warning msg (should increase shmem size) logic: {{{ acquire semaphore i = random index in shmem x = ordered list of jobs to send (empty) slots_scanned = 0 slots_locked = 0 loop i = i+1 % array_size slots_scanned++ if slots_scanned > M break if shmem[i] is empty continue if shmem[i] is locked slots_locked++ continue j = shmem[i] if !job_feasible_fast(j) continue; v = v(h, j) if v <= lowest score in x continue S = jobs in x with value >= v if !job_set_feasible(S+j) continue lock j release semaphore if !job_feasible_slow(j) acquire semaphore unlock j continue acquire semaphore add j to x while (x minus lowest-val element) satisfies work request remove lowest-val element of x while !job_set_feasible(x) remove lowest-value element of x if x satisfies work request and slots_scanned >= N break; for each job j in x mark slot j as empty release semaphore if slots_locked > L print "need bigger array" message }}}