v6.10.3 ATI request gets lost and only CPU request occurs

Jon Sonntag
Joined: 4 Sep 09
Posts: 3
United States
Message 27059 - Posted: 4 Sep 2009, 13:15:28 UTC
Last modified: 4 Sep 2009, 13:17:54 UTC

An ATI scheduling problem exists in BOINC client 6.10.3.

The test project server is set up so that no app_info is needed. It knows about ATI clients and will send ATI apps to them along with ATI WUs. The problem is that the client has an identity problem and keeps requesting CPU work for the ATI card. It NEVER requests ATI work. Instead, it takes the numbers from the ATI request and sends them as the CPU seconds. Compare the ATI GPU shortfall (194400.86) with the CPU work request in the log below to see what I mean.


04-Sep-2009 08:02:32 [---] [wfd] ------- start work fetch state -------
04-Sep-2009 08:02:32 [---] [wfd] target work buffer: 0.86 + 194400.00 sec
04-Sep-2009 08:02:32 [---] [wfd] CPU: shortfall 774998.77 nidle 3.50 saturated 0.00 busy 0.00 RS fetchable 100.00 runnable 100.00
04-Sep-2009 08:02:32 [Collatz Conjecture] [wfd] CPU: fetch share 0.00 debt 0.00 backoff dt 0.00 int 0.00 (no new tasks)
04-Sep-2009 08:02:32 [CollatzTest] [wfd] CPU: fetch share 1.00 debt 0.00 backoff dt 0.00 int 120.00
04-Sep-2009 08:02:32 [Milkyway@home] [wfd] CPU: fetch share 0.00 debt 0.00 backoff dt 0.00 int 0.00 (susp via GUI) (no new tasks)
04-Sep-2009 08:02:32 [---] [wfd] ATI GPU: shortfall 194400.86 nidle 1.00 saturated 0.00 busy 0.00 RS fetchable 0.00 runnable 0.00
04-Sep-2009 08:02:32 [Collatz Conjecture] [wfd] ATI GPU: fetch share 0.00 debt 0.00 backoff dt 0.00 int 0.00 (no new tasks)
04-Sep-2009 08:02:32 [CollatzTest] [wfd] ATI GPU: fetch share 0.00 debt 0.00 backoff dt 12.91 int 120.00
04-Sep-2009 08:02:32 [Milkyway@home] [wfd] ATI GPU: fetch share 0.00 debt 0.00 backoff dt 0.00 int 0.00 (susp via GUI) (no new tasks)
04-Sep-2009 08:02:32 [Collatz Conjecture] [wfd] overall_debt 0
04-Sep-2009 08:02:32 [CollatzTest] [wfd] overall_debt 0
04-Sep-2009 08:02:32 [Milkyway@home] [wfd] overall_debt 0
04-Sep-2009 08:02:32 [---] [wfd] ------- end work fetch state -------
04-Sep-2009 08:02:32 [---] [wfd] ------- start work fetch state -------
04-Sep-2009 08:02:32 [---] [wfd] target work buffer: 0.86 + 194400.00 sec
04-Sep-2009 08:02:32 [---] [wfd] CPU: shortfall 774998.77 nidle 3.50 saturated 0.00 busy 0.00 RS fetchable 100.00 runnable 100.00
04-Sep-2009 08:02:32 [Collatz Conjecture] [wfd] CPU: fetch share 0.00 debt 0.00 backoff dt 0.00 int 0.00 (no new tasks)
04-Sep-2009 08:02:32 [CollatzTest] [wfd] CPU: fetch share 1.00 debt 0.00 backoff dt 0.00 int 120.00
04-Sep-2009 08:02:32 [Milkyway@home] [wfd] CPU: fetch share 0.00 debt 0.00 backoff dt 0.00 int 0.00 (susp via GUI) (no new tasks)
04-Sep-2009 08:02:32 [---] [wfd] ATI GPU: shortfall 194400.86 nidle 1.00 saturated 0.00 busy 0.00 RS fetchable 0.00 runnable 0.00
04-Sep-2009 08:02:32 [Collatz Conjecture] [wfd] ATI GPU: fetch share 0.00 debt 0.00 backoff dt 0.00 int 0.00 (no new tasks)
04-Sep-2009 08:02:32 [CollatzTest] [wfd] ATI GPU: fetch share 0.00 debt 0.00 backoff dt 12.91 int 120.00
04-Sep-2009 08:02:32 [Milkyway@home] [wfd] ATI GPU: fetch share 0.00 debt 0.00 backoff dt 0.00 int 0.00 (susp via GUI) (no new tasks)
04-Sep-2009 08:02:32 [Collatz Conjecture] [wfd] overall_debt 0
04-Sep-2009 08:02:32 [CollatzTest] [wfd] overall_debt 0
04-Sep-2009 08:02:32 [Milkyway@home] [wfd] overall_debt 0
04-Sep-2009 08:02:32 [---] [wfd] ------- end work fetch state -------
04-Sep-2009 08:02:32 [CollatzTest] [wfd] request: CPU (194400.86 sec, 4) ATI GPU (0.00 sec, 0)
04-Sep-2009 08:02:32 [CollatzTest] [sched_op_debug] Starting scheduler request
04-Sep-2009 08:02:32 [CollatzTest] Sending scheduler request: To fetch work.
04-Sep-2009 08:02:32 [CollatzTest] Requesting new tasks
04-Sep-2009 08:02:32 [CollatzTest] [sched_op_debug] CPU work request: 194400.86 seconds; 4 idle CPUs

How did 194400.86 turn from an ATI shortfall into a CPU request?
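
The comparison asked about here can be checked mechanically. What follows is a minimal Python sketch, not part of the BOINC client: the function name find_misrouted_requests and both regular expressions are illustrative assumptions, based only on the [wfd] line formats visible in the log above.

```python
import re

# Patterns are assumptions derived from the [wfd] lines quoted in this thread.
SHORTFALL = re.compile(r"\[wfd\] (CPU|ATI GPU): shortfall ([\d.]+)")
REQUEST = re.compile(
    r"\[wfd\] request: CPU \(([\d.]+) sec, \d+\) ATI GPU \(([\d.]+) sec, \d+\)"
)

def find_misrouted_requests(log_text):
    """Return request lines whose CPU seconds equal the last ATI GPU shortfall
    while the ATI GPU request itself is zero."""
    shortfall = {}   # most recent shortfall seen per resource
    problems = []
    for line in log_text.splitlines():
        m = SHORTFALL.search(line)
        if m:
            shortfall[m.group(1)] = float(m.group(2))
            continue
        m = REQUEST.search(line)
        if m:
            cpu_req, ati_req = float(m.group(1)), float(m.group(2))
            if (ati_req == 0.0
                    and cpu_req == shortfall.get("ATI GPU")
                    and cpu_req != shortfall.get("CPU")):
                problems.append(line)
    return problems

# Three lines copied from the log above.
SAMPLE = """\
04-Sep-2009 08:02:32 [---] [wfd] CPU: shortfall 774998.77 nidle 3.50 saturated 0.00 busy 0.00 RS fetchable 100.00 runnable 100.00
04-Sep-2009 08:02:32 [---] [wfd] ATI GPU: shortfall 194400.86 nidle 1.00 saturated 0.00 busy 0.00 RS fetchable 0.00 runnable 0.00
04-Sep-2009 08:02:32 [CollatzTest] [wfd] request: CPU (194400.86 sec, 4) ATI GPU (0.00 sec, 0)
"""

for hit in find_misrouted_requests(SAMPLE):
    print(hit)
```

A match only flags the symptom. The same figure also equals the target work buffer in the log (0.86 + 194400.00 sec), so a hit does not by itself show where the client took the number from.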
ID: 27059
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5091
United Kingdom
Message 27060 - Posted: 4 Sep 2009, 14:02:38 UTC

Probably the same problem (or a resurrected version of it) that occurred with CUDA cards; see the thread "Not getting new WU's for CPU projects after upgrade to 6.6.20".

I believe that one's been fixed (haven't seen it for a while, but to be honest I haven't been looking), so if David can remember what he did, something similar could be done for ATI.

Though come to think of it, the symptoms weren't quite the same: in that case, a CUDA shortfall existed, but a CPU request was issued for the true CPU shortfall - which happened to be zero. You are at least getting a request for the value of the ATI shortfall, just in the wrong request slot.
ID: 27060
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15506
Netherlands
Message 27061 - Posted: 4 Sep 2009, 14:05:30 UTC - in response to Message 27059.  
Last modified: 4 Sep 2009, 14:06:26 UTC

"The problem is that the client has an identity problem and keeps requesting CPU work for the ATI card. It NEVER requests ATI work."

That's known and fixed (we hope) in 6.10.4.

The cause was a missing work request message in scheduler_op.cpp.
I pointed it out to David and he fixed it, but too late for it to be included in 6.10.3.

See changeset 18995 in trac.
Even with that fix, the message will only show when <sched_op_debug> is on in cc_config.xml.
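
For anyone who wants to reproduce the log output discussed in this thread, log flags are set under <log_flags> in cc_config.xml. The following is a minimal Python sketch that builds such a file's contents. The helper name enable_log_flags is hypothetical, and it assumes the [wfd] lines come from the <work_fetch_debug> flag.

```python
import xml.etree.ElementTree as ET

def enable_log_flags(config_xml, flags):
    """Return cc_config XML text with each named log flag set to 1.

    config_xml may be an empty string, in which case a new <cc_config>
    root is created. Existing settings are left in place.
    """
    if config_xml.strip():
        root = ET.fromstring(config_xml)
    else:
        root = ET.Element("cc_config")
    log_flags = root.find("log_flags")
    if log_flags is None:
        log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        node = log_flags.find(name)
        if node is None:
            node = ET.SubElement(log_flags, name)
        node.text = "1"
    return ET.tostring(root, encoding="unicode")

# Flag names: sched_op_debug is named in this thread; work_fetch_debug is
# assumed to be the source of the [wfd] lines.
print(enable_log_flags("", ["sched_op_debug", "work_fetch_debug"]))
```

The printed text can be saved as cc_config.xml in the BOINC data directory. The client has to re-read its config file (or be restarted) before the extra messages appear.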
ID: 27061
Jon Sonntag
Joined: 4 Sep 09
Posts: 3
United States
Message 27167 - Posted: 8 Sep 2009, 19:13:48 UTC

I see there is a 6.10.4 version. I can't wait to test it, but I first have to revive the Collatz database, which took a dive this weekend.
ID: 27167


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.