
A job sent to a client is associated with an app version,
which uses some number (possibly fractional) of CPUs
and some number of instances of a particular coprocessor type.

== Scheduler request and reply message ==

New fields in the scheduler request message:

'''double cpu_req_secs''':: number of CPU seconds requested
'''double cpu_req_instances''':: send enough jobs to occupy this many CPUs

And for each coprocessor type:

'''double req_secs''':: number of instance-seconds requested
'''double req_instances''':: send enough jobs to occupy this many instances

The semantics: a scheduler should send jobs for a resource type
only if the request for that type is nonzero.

For compatibility with old servers, the message still has '''work_req_seconds''',
which is the max of the per-resource req_secs values.

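The backward-compatibility rule can be sketched as follows. The struct and function names here are illustrative, not the actual BOINC types:

```cpp
#include <algorithm>
#include <vector>

// Illustrative sketch of the request-message semantics described above;
// the struct and field names are assumptions, not the real BOINC types.
struct CoprocRequest {
    double req_secs;       // instance-seconds requested
    double req_instances;  // enough jobs to occupy this many instances
};

// For old servers, work_req_seconds is the max of the per-resource
// req_secs values (CPU and each coprocessor type).
double work_req_seconds(double cpu_req_secs,
                        const std::vector<CoprocRequest>& coprocs) {
    double m = cpu_req_secs;
    for (const CoprocRequest& c : coprocs) {
        m = std::max(m, c.req_secs);
    }
    return m;
}
```

An old server sees a single scalar request, so a large GPU request still triggers a work fetch even if the CPU request is small.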
== Per-resource-type backoff ==

We need to handle the situation where, e.g., there's a GPU shortfall
but no projects are supplying GPU work
(for either permanent or transient reasons).
We don't want an overall work-fetch backoff from those projects.

Instead, we maintain a separate backoff timer per (project, resource type).
The backoff interval is doubled, up to a limit of one day, whenever we ask for work of that type and get none;
it is cleared whenever we ask for work of that type and get a job.
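A minimal sketch of this timer; the 1-minute initial interval is an assumption (the text doesn't specify the starting value), while the 1-day cap and clear-on-success behavior are as described above:

```cpp
#include <algorithm>

// Per-(project, resource type) backoff timer, as described above.
// The 1-minute initial interval is an assumption.
struct RscProjectBackoff {
    double backoff_interval = 0;  // seconds; 0 = not backed off
    double backoff_time = 0;      // back off until this time

    // We asked for this type of work and got none: double the
    // interval, up to one day.
    void request_failed(double now) {
        const double max_interval = 86400;  // cap: 1 day
        const double min_interval = 60;     // assumed initial value
        backoff_interval = (backoff_interval == 0)
            ? min_interval
            : std::min(backoff_interval * 2, max_interval);
        backoff_time = now + backoff_interval;
    }

    // We asked for this type of work and got a job: clear the backoff.
    void request_succeeded() {
        backoff_interval = 0;
        backoff_time = 0;
    }

    bool backed_off(double now) const { return now < backoff_time; }
};
```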
Currently there are two resource types: CPU and NVIDIA GPUs.

Summary of the new policy: it's like the old policy,
but with a separate copy for each resource type,
and scheduler requests can now ask for work for particular resource types.


== Client data structures ==

=== RSC_WORK_FETCH ===

Work-fetch state for a particular resource type.
There are instances for CPU ('''cpu_work_fetch''') and NVIDIA GPUs ('''cuda_work_fetch''').
Data members:

'''ninstances''':: number of instances of this resource type

Used/set by rr_simulation():

'''double shortfall''':: shortfall for this resource
'''double nidle''':: number of currently idle instances

Member functions:

'''rr_init()''':: called at the start of RR simulation. Compute project shares for this PRSC, and clear overall and per-project shortfalls.
'''set_nidle()''':: called by RR sim after initial job assignment.
Set nidle to # of idle instances.
'''accumulate_shortfall()''':: called by RR sim for each time interval during the work-buffering period.
{{{
shortfall += dt*(ninstances - instances in use)
for each project p not backed off for this PRSC
    p->PRSC_PROJECT_DATA.accumulate_shortfall(dt)
}}}
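
A rough C++ transcription of the pseudocode above, with the structures pared down to the fields the formula needs. The per-project accrual is assumed to use the same idle-capacity term, since PRSC_PROJECT_DATA::accumulate_shortfall() isn't spelled out here:

```cpp
#include <vector>

// Rough transcription of accumulate_shortfall(); names are illustrative.
struct ProjectRscState {
    bool backed_off = false;  // backed off for this PRSC?
    double shortfall = 0;
};

struct RscWorkFetch {
    double ninstances = 0;
    double shortfall = 0;

    // One RR-sim interval of length dt with `in_use` instances busy:
    // idle capacity accrues as shortfall, overall and per project
    // (only projects not backed off for this resource share in it).
    void accumulate_shortfall(double dt, double in_use,
                              std::vector<ProjectRscState>& projects) {
        double idle_capacity = dt * (ninstances - in_use);
        shortfall += idle_capacity;
        for (ProjectRscState& p : projects) {
            if (!p.backed_off) p.shortfall += idle_capacity;
        }
    }
};
```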

'''select_project()''':: select the best project from which to request this type of work: the project that is not backed off for this PRSC and for which LTD + p->shortfall is largest, also taking into account overworked projects, etc.

'''accumulate_debt(dt)''':: for each project p:
{{{
x = instances of this device used by p's running jobs
y = p's share of this device
update p's LTD
}}}
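
The LTD update isn't fully specified above; one plausible reading, sketched here as an assumption, is that debt grows when a project receives less than its share of the device:

```cpp
// Assumed reading of the LTD update in accumulate_debt(dt):
// debt changes by dt * (share - usage). The actual client may
// normalize differently.
struct ProjectDebt {
    double long_term_debt = 0;

    // x = instances of this device used by the project's running jobs
    // y = the project's share of this device
    void accumulate_debt(double dt, double x, double y) {
        long_term_debt += dt * (y - x);
    }
};
```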

=== RSC_PROJECT_WORK_FETCH ===

State for a (resource type, project) pair.
It has the following "persistent" members (i.e., saved in the state file):

'''backoff_interval''':: how long to wait before asking this project for work specifically for this PRSC;
doubled (up to a maximum of 24 hours) any time we ask for work for this resource and get none. Cleared when we ask for work for this PRSC and get a job.
'''backoff_time''':: back off until this time
'''debt''':: long-term debt

And the following transient members (used by rr_simulation()):

'''double runnable_share''':: # of instances this project should get, based on its resource share
relative to the set of projects not backed off for this PRSC.
'''instances_used''':: # of instances currently being used

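For example, runnable_share might be computed like this (a sketch; the function shape is an assumption):

```cpp
#include <vector>

// Sketch of runnable_share: this project's slice of the resource's
// instances, weighted by resource share among the projects not backed
// off for this PRSC. The signature is illustrative.
double runnable_share(double my_resource_share, double ninstances,
                      const std::vector<double>& active_shares) {
    // active_shares holds the resource shares of all projects not
    // backed off for this PRSC, including this one.
    double total = 0;
    for (double s : active_shares) total += s;
    if (total <= 0) return 0;
    return ninstances * my_resource_share / total;
}
```
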
=== PROJECT_WORK_FETCH ===

Per-project work-fetch state.
Members:
'''overall_debt''':: weighted sum of per-resource debts

=== WORK_FETCH ===

Overall work-fetch state.

'''PROJECT* choose_project()''':: choose a project from which to fetch work.

* Do a round-robin simulation.
* If a GPU is idle, choose a project to ask for that type of work (using RSC_WORK_FETCH::select_project()).
* If a CPU is idle, choose a project to ask for CPU work.
* If a GPU has a shortfall, choose a project to ask for GPU work.
* If a CPU has a shortfall, choose a project to ask for CPU work.
In the case where a resource type was idle, ask for only that type of work.
== Summary of the new policy ==

Every 60 seconds, and when various events happen (e.g., jobs finish),
the following is done.
CI is the "connect interval" preference;
AW is the "additional work" preference.

Auxiliary functions:

'''get_major_shortfall(resource)'''

If the resource will have an idle instance before CI,
return the greatest-overall-debt non-backed-off project P
(P may be overworked). Otherwise return NULL.

'''get_minor_shortfall(resource)'''

If the resource will have an idle instance between CI and CI+AW,
return the greatest-overall-debt non-backed-off, non-overworked project P.

'''get_starved_project(resource)'''

If any project is not overworked, not backed off, and has no runnable jobs
for any resource, return the one with greatest overall debt.

Main logic:
* Do a round-robin simulation of currently queued jobs.
* p = get_major_shortfall(NVIDIA GPU); if p != NULL, ask it for work and return.
* ... same for other coprocessor types (we assume that coprocessors are faster, hence more important, than the CPU).
* ... same for the CPU.
* p = get_minor_shortfall(NVIDIA GPU); if p != NULL, ask it for work and return.
* ... same for other coprocessor types, then the CPU.
* p = get_starved_project(NVIDIA GPU); if p != NULL, ask it for work and return.
* ... same for other coprocessor types, then the CPU.

In the get_major_shortfall() case, ask only for work of that resource type.
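
The main logic above amounts to three passes over the resource types, coprocessors before CPU in each pass. A sketch, with types and helper signatures as assumptions:

```cpp
#include <vector>

// Sketch of the main logic: try get_major_shortfall for every resource
// (coprocessor types first, CPU last), then get_minor_shortfall, then
// get_starved_project. Types and signatures are illustrative.
struct Project { int id; };
using Getter = Project* (*)(int rsc);

// `resources` lists resource-type ids, coprocessors first, CPU last.
Project* choose_project(const std::vector<int>& resources,
                        Getter get_major_shortfall,
                        Getter get_minor_shortfall,
                        Getter get_starved_project) {
    // (A round-robin simulation of queued jobs would run first.)
    for (Getter get : {get_major_shortfall, get_minor_shortfall,
                       get_starved_project}) {
        for (int rsc : resources) {
            if (Project* p = get(rsc)) return p;  // ask p for work
        }
    }
    return nullptr;  // no work fetch this time
}
```

Note that a major shortfall anywhere beats a minor shortfall everywhere: the passes are ordered by urgency, not by resource.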
== Scheduler changes ==