Proposal for a new credit system
This is a tentative proposal for a new credit system for BOINC.
- The system should reflect how much "utility" volunteers provide to projects. There are many types of utility (see below). Projects should be able to define utility however they want.
- The system should normalize the amount of credit granted by different projects, so that average volunteers have no incentive to select projects based on credit. Volunteers with unusual computers, however, should have an incentive to select projects that need such resources.
- The average amount of credit per host per day should remain constant over time.
Types of utility
Possible types of utility:
- Computation (FP, integer, or mixed) without tight deadlines (this is typical of most current projects).
- Computation with a tight latency bound (minutes or hours).
- Computation that needs a large amount of RAM.
- Computation that needs a large amount of storage.
- Storage (e.g. GB/day).
- Storage, with network bandwidth and/or availability requirements.
- Network bandwidth (e.g. web crawling).
- Network bandwidth at particular times of day (e.g. Internet performance study).
- Deployment on a wide range of computer types (e.g. studies of computer usage).
- Computation with human "steering".
- Human activity (e.g. Stardust@home-type projects).
Proposal
- Computers are described by a set of parameters (FP and int benchmarks, #CPUs, cache sizes, memory bandwidth, available RAM, available disk, presence of particular GPUs, network bandwidth, available fraction, connected fraction, maybe others).
- Each project P publishes a "credit function" C(H) specifying, for a host H with given parameters, how much credit per day would be granted if H is attached exclusively to P.
- Normalization rule: for each project, the average of C(H) over all hosts participating in BOINC must be about 100.
- Accounting rule: the credit granted by a project P cannot exceed the sum over hosts H of RS(H, P)*C(H), where RS(H, P) is the fractional resource share of H's attachment to P.
The normalization and accounting rules would be evaluated by cross-project statistics sites.
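The two rules can be sketched as simple checks. The following Python is a minimal illustration, not part of BOINC: the `Host` fields, the helper names, and the 5% tolerance standing in for "about 100" are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

TARGET_MEAN = 100.0  # normalization target stated in the proposal


@dataclass
class Host:
    # Hypothetical subset of the host parameters listed above.
    name: str
    flops: float              # FP benchmark, arbitrary units
    shares: Dict[str, float]  # fractional resource share per project name


# C(H): credit per day if H were attached exclusively to the project.
CreditFn = Callable[[Host], float]


def mean_credit(c: CreditFn, hosts: List[Host]) -> float:
    """Average of C(H) over the whole host population."""
    return sum(c(h) for h in hosts) / len(hosts)


def satisfies_normalization(c: CreditFn, hosts: List[Host],
                            tolerance: float = 0.05) -> bool:
    """Normalization rule: the population mean of C(H) is about 100.
    The tolerance is an assumption; the proposal only says "about"."""
    return abs(mean_credit(c, hosts) - TARGET_MEAN) <= tolerance * TARGET_MEAN


def normalize(raw: CreditFn, hosts: List[Host]) -> CreditFn:
    """Rescale an arbitrary utility function so it meets the rule."""
    scale = TARGET_MEAN / mean_credit(raw, hosts)
    return lambda h: scale * raw(h)


def accounting_cap(project: str, c: CreditFn, hosts: List[Host]) -> float:
    """Accounting rule: sum over hosts of RS(H, P) * C(H)."""
    return sum(h.shares.get(project, 0.0) * c(h) for h in hosts)


# Two-host toy population; project "P" values hosts by FP speed alone.
hosts = [
    Host("a", flops=1.0, shares={"P": 1.0}),
    Host("b", flops=3.0, shares={"P": 0.5}),
]
c = normalize(lambda h: h.flops, hosts)
print(satisfies_normalization(c, hosts))  # True
print(accounting_cap("P", c, hosts))      # 125.0
```

Here the normalized function gives 50 credits/day to host a and 150 to host b, so project P may grant at most 1.0*50 + 0.5*150 = 125 credits/day in total.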
- Computational projects would have to derive a credit function based on how fast various types of computers run their applications (and how much they value each application). We can supply automated tools for this.
- Suppose a project's application needs at least 16 GB RAM. Its credit function would be zero for hosts with < 16 GB RAM. Its value for hosts with at least 16 GB would be limited by the normalization rule.
- For a storage-only project, C(H) would be proportional to available disk space (possibly with some additional consideration for network bandwidth etc.).
- Using the published credit functions, it would be possible to develop a "credit maximizer" web site, where users can enter the parameters of their computer, and it tells them how much credit/day each project would give them.
- Similarly, it would be possible to develop a web site that tells users how much additional credit/day they could get by adding more disk/RAM/network bandwidth etc. to an existing computer.
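A "credit maximizer" and an upgrade estimator could be little more than loops over the published credit functions. The sketch below is hypothetical: the project names, the host fields, and the scale constants (which in practice would be fixed by the normalization rule) are invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

Host = Dict[str, float]             # e.g. {"flops": ..., "ram_gb": ..., "disk_gb": ...}
CreditFn = Callable[[Host], float]  # credit/day if the host ran only this project


def big_memory_credit(h: Host) -> float:
    # Zero below the 16 GB threshold; the value above it is an assumed constant.
    return 400.0 if h["ram_gb"] >= 16 else 0.0


def storage_credit(h: Host) -> float:
    # Proportional to available disk space; the factor is an assumption.
    return 0.5 * h["disk_gb"]


def cpu_credit(h: Host) -> float:
    # Proportional to the FP benchmark; the factor is an assumption.
    return 50.0 * h["flops"]


PROJECTS: Dict[str, CreditFn] = {
    "big_memory_project": big_memory_credit,
    "storage_project": storage_credit,
    "cpu_project": cpu_credit,
}


def rank_projects(host: Host, projects: Dict[str, CreditFn]) -> List[Tuple[str, float]]:
    """Credit/day each project would grant this host, highest first."""
    results = [(name, fn(host)) for name, fn in projects.items()]
    return sorted(results, key=lambda item: item[1], reverse=True)


def marginal_credit(host: Host, projects: Dict[str, CreditFn],
                    param: str, delta: float) -> Dict[str, float]:
    """Extra credit/day per project if `param` were increased by `delta`."""
    upgraded = dict(host)
    upgraded[param] += delta
    return {name: fn(upgraded) - fn(host) for name, fn in projects.items()}


host = {"flops": 2.0, "ram_gb": 8.0, "disk_gb": 100.0}
print(rank_projects(host, PROJECTS))
print(marginal_credit(host, PROJECTS, "ram_gb", 8.0))
```

For this host the CPU project ranks first (100 credits/day), and adding 8 GB of RAM would raise only the big-memory project's offer, from 0 to 400.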
Questions and comments
- BOINC doesn't currently have code to measure memory bandwidth or cache sizes; we'd need to develop this.
- How should credit functions be represented (e.g., as a data structure or as a PHP function)?
- What should the default credit function be? What tools can we provide to projects to let them develop appropriate credit functions easily?
- As computers become faster and bigger, credit functions will have to change in order to continue to satisfy the normalization rule.
- Credit can no longer be used as a basis for FLOPS estimates; we'll need something else for that purpose (or keep around the existing credit system and use it only to estimate FLOPS).
- Suppose a project's application needs X GB of RAM, and there is only one host in the BOINC population with that much RAM. Then its credit per day for that host can be 100N, where N is the size of the population. In other words, that computer can get as much credit as all other computers combined. Is this desirable?
- The accounting rule doesn't take into account non-competing projects. E.g. suppose project A uses only CPU and project B uses only disk. If hosts attach to both projects with equal resource shares, each project will be limited to issuing an average of 50 credits/day per host, even though the two do not compete for resources.
- Suppose a project uses a resource that other projects use little or none of (e.g. QCN uses the accelerometer found in some laptops; DepSpid uses network). How much credit should the project be allowed to grant? Under the current proposal, it would be allowed to grant something like 100/(N+1) per host per day, where N is the average number of other attached projects.
- Also in the above case, the resource share for QCN doesn't affect its utility (since it doesn't use any bottleneck resources), and yet users' resource share settings will affect how much credit can be granted by QCN and other projects attached to hosts running QCN. This doesn't seem right.
- Suppose that a large number of projects arise that don't use CPU or other bottleneck resources. If they're allowed to grant as much credit as CPU-intensive projects, then credit becomes meaningless.
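- This proposal doesn't say anything about the last 3 types of utility (deployment on a wide range of computer types, computation with human "steering", and human activity).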
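The figures quoted in the questions above (100N for a single qualifying host, 50 credits/day for two equal-share projects, and 100/(N+1) in general) follow directly from the two rules. The sketch below reproduces them numerically. The function names and the toy populations are assumptions made for the example.

```python
TARGET_MEAN = 100.0  # normalization target stated in the proposal


def normalized_credits(raw):
    """Scale raw per-host values so their mean over the population is 100."""
    scale = TARGET_MEAN * len(raw) / sum(raw)
    return [scale * r for r in raw]


def average_cap(credits, share):
    """Average per-host accounting cap when every host gives the
    project the same fractional resource share."""
    return sum(share * c for c in credits) / len(credits)


# Case 1: only one host out of N = 1000 has enough RAM, so the credit
# function is zero everywhere else and that host is worth 100 * N.
n = 1000
single = normalized_credits([0.0] * (n - 1) + [1.0])
print(single[-1])  # 100000.0

# Case 2: a project sharing every host equally with one other project
# is capped at half of its normalized mean.
credits = normalized_credits([1.0, 3.0])
print(average_cap(credits, 0.5))  # 50.0

# Case 3: with 3 other attached projects and equal shares, the cap is
# 100 / (3 + 1) per host per day.
print(average_cap(credits, 1.0 / 4))  # 25.0
```

In cases 2 and 3 the cap does not depend on whether the projects actually compete for the same resource, which is the concern raised about non-competing projects.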