Message boards : BOINC client : BOINC 6.6.2: work fetch
Author | Message |
---|---|
Joined: 12 Aug 06 Posts: 5 |
It seems the BOINC 6.6.2 client is fetching too much work. I have it running on a Vista x64 machine with a 0-day work buffer and a 0.1-day network connect interval. I ended up with tasks running in high priority mode, and the client is still downloading a ton of new workunits from all of my projects. |
Joined: 29 Aug 05 Posts: 15554 |
Please make a cc_config.xml file in your BOINC Data directory and enable the <work_fetch_debug> and <sched_op_debug> flags. Run BOINC for one round of work fetch and post the messages for that work fetch loop here. (Disable the flags afterwards, as otherwise your stdoutdae.txt file will fill up quite quickly.) |
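For anyone unsure what that file should contain, here is a minimal sketch in Python that generates it. It assumes the usual layout of a `<cc_config>` root with a `<log_flags>` section holding one element per flag; the helper names `build_cc_config` and `write_cc_config` are made up for this example, and the data directory path is something you have to supply yourself.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# The two flags requested in the post above.
DEBUG_FLAGS = {"work_fetch_debug": True, "sched_op_debug": True}

def build_cc_config(flags):
    """Return cc_config.xml content with one <log_flags> entry per flag."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name, enabled in flags.items():
        ET.SubElement(log_flags, name).text = "1" if enabled else "0"
    return ET.tostring(root, encoding="unicode")

def write_cc_config(data_dir, flags):
    """Write cc_config.xml into data_dir (hypothetical helper, path is yours)."""
    path = Path(data_dir) / "cc_config.xml"
    path.write_text(build_cc_config(flags), encoding="utf-8")
    return path

print(build_cc_config(DEBUG_FLAGS))
```

Calling `build_cc_config` again with both values set to `False` produces the content for switching the flags back off once the log has been captured.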
Joined: 12 Aug 06 Posts: 5 |
Currently I have all projects set to No New Work (NNW), since I first have to get rid of all the work I have. Once that is done I will get the logs. |
Joined: 13 Dec 08 Posts: 6 |
Jord, I have some info on this problem, including the log output from the cc_config.xml flags. It is over 380 lines long. Would you like me to post it here? Kevin |
Joined: 5 Oct 06 Posts: 5124 |
Yes please - other people may be interested in helping to try and diagnose the problem too, and posting it here would save you sending a separate PM/email to each of us. |
Joined: 13 Dec 08 Posts: 6 |
I cut it down from 380 lines. Here are the first three requests for work from SETI@home. I have more work fetch requests if needed. The first one got me 19 new tasks, the second one got backed off, and the third one got 11 new tasks.
26-Jan-2009 06:06:02 [---] [wfd] ------- start work fetch state -------
26-Jan-2009 06:06:02 [---] [wfd] CPU: shortfall 0.00 nidle 0.00 total RS 375.00 runnable RS 0.00
26-Jan-2009 06:06:02 [Einstein@Home] [wfd] CPU: runshare 0.44 debt 39615.93 backoff t -20962.19 int 60.00
26-Jan-2009 06:06:02 [lhcathome] [wfd] CPU: runshare 0.00 debt -8046.72 backoff t -1232967962.58 int 0.00
26-Jan-2009 06:06:02 [SETI@home] [wfd] CPU: runshare 0.56 debt -182352.88 backoff t -1232967962.58 int 0.00
26-Jan-2009 06:06:02 [SETI@home Beta Test] [wfd] CPU: runshare 0.00 debt 150783.68 backoff t -1232967962.58 int 0.00
26-Jan-2009 06:06:02 [---] [wfd] CUDA: shortfall 72035.52 nidle 0.00 total RS 375.00 runnable RS 325.00
26-Jan-2009 06:06:02 [Einstein@Home] [wfd] CUDA: runshare 0.00 debt 84782.50 backoff t -20902.19 int 120.00
26-Jan-2009 06:06:02 [lhcathome] [wfd] CUDA: runshare 0.00 debt -0.00 backoff t -1232967962.58 int 0.00
26-Jan-2009 06:06:02 [SETI@home] [wfd] CUDA: runshare 0.00 debt 96429.90 backoff t -1232967962.58 int 0.00
26-Jan-2009 06:06:02 [SETI@home Beta Test] [wfd] CUDA: runshare 1.00 debt -181212.41 backoff t -1232967962.58 int 0.00
26-Jan-2009 06:06:02 [Einstein@Home] [wfd] overall_debt 2256802.797653
26-Jan-2009 06:06:02 [lhcathome] [wfd] overall_debt -8046.724735
26-Jan-2009 06:06:02 [SETI@home] [wfd] overall_debt 2339430.609219
26-Jan-2009 06:06:02 [SETI@home Beta Test] [wfd] overall_debt -4588186.682137
26-Jan-2009 06:06:02 [---] [wfd] ------- end work fetch state -------
26-Jan-2009 06:06:02 [SETI@home] [wfd] request: CPU (75600.00 sec, 0) CUDA (75600.00 sec, 0)
26-Jan-2009 06:06:02 [SETI@home] [sched_op_debug] Starting scheduler request
26-Jan-2009 06:06:02 [SETI@home] Sending scheduler request: To fetch work.
26-Jan-2009 06:06:02 [SETI@home] CPU work request: 75600.00 seconds, 0 instances
26-Jan-2009 06:06:02 [SETI@home] CUDA work request: 75600.00 seconds, 0 instances
26-Jan-2009 06:06:12 [SETI@home] Scheduler request completed: got 19 new tasks
26-Jan-2009 06:06:12 [SETI@home] [sched_op_debug] Server version 607
26-Jan-2009 06:06:12 [SETI@home] Project requested delay of 11.000000 seconds
26-Jan-2009 06:06:12 [SETI@home] [sched_op_debug] estimated total job duration: 73981 seconds
26-Jan-2009 06:06:12 [SETI@home] [sched_op_debug] Deferring communication for 11 sec
26-Jan-2009 06:06:12 [SETI@home] [sched_op_debug] Reason: requested by project
26-Jan-2009 06:06:12 [---] [work_fetch_debug] Request work fetch: RPC complete
26-Jan-2009 06:06:18 [---] [wfd] ------- start work fetch state -------
26-Jan-2009 06:06:18 [---] [wfd] CPU: shortfall 0.00 nidle 0.00 total RS 0.00 runnable RS 0.00
26-Jan-2009 06:06:18 [Einstein@Home] [wfd] CPU: runshare 0.44 debt 39630.64 backoff t -20977.86 int 60.00
26-Jan-2009 06:06:18 [lhcathome] [wfd] CPU: runshare 0.00 debt -8046.72 backoff t -1232967978.25 int 0.00
26-Jan-2009 06:06:18 [SETI@home] [wfd] CPU: runshare 0.56 debt -182381.48 backoff t -1232967978.25 int 0.00
26-Jan-2009 06:06:18 [SETI@home Beta Test] [wfd] CPU: runshare 0.00 debt 150797.57 backoff t -1232967978.25 int 0.00
26-Jan-2009 06:06:18 [---] [wfd] CUDA: shortfall 70805.57 nidle 0.00 total RS 0.00 runnable RS 325.00
26-Jan-2009 06:06:18 [Einstein@Home] [wfd] CUDA: runshare 0.00 debt 84787.20 backoff t -20917.86 int 120.00
26-Jan-2009 06:06:18 [lhcathome] [wfd] CUDA: runshare 0.00 debt 0.00 backoff t -1232967978.25 int 0.00
26-Jan-2009 06:06:18 [SETI@home] [wfd] CUDA: runshare 0.00 debt 96435.78 backoff t -1232967978.25 int 0.00
26-Jan-2009 06:06:18 [SETI@home Beta Test] [wfd] CUDA: runshare 1.00 debt -181222.98 backoff t -1232967978.25 int 0.00
26-Jan-2009 06:06:18 [Einstein@Home] [wfd] overall_debt 2256940.400995
26-Jan-2009 06:06:18 [lhcathome] [wfd] overall_debt -8046.724735
26-Jan-2009 06:06:18 [SETI@home] [wfd] overall_debt 2339555.621289
26-Jan-2009 06:06:18 [SETI@home Beta Test] [wfd] overall_debt -4588449.297550
26-Jan-2009 06:06:18 [---] [wfd] ------- end work fetch state -------
26-Jan-2009 06:06:18 [---] No project chosen for work fetch
26-Jan-2009 06:06:23 [---] [work_fetch_debug] Request work fetch: Project backoff ended
26-Jan-2009 06:06:23 [---] [wfd] ------- start work fetch state -------
26-Jan-2009 06:06:23 [---] [wfd] CPU: shortfall 0.00 nidle 0.00 total RS 375.00 runnable RS 0.00
26-Jan-2009 06:06:23 [Einstein@Home] [wfd] CPU: runshare 0.44 debt 39635.88 backoff t -20983.44 int 60.00
26-Jan-2009 06:06:23 [lhcathome] [wfd] CPU: runshare 0.00 debt -8046.72 backoff t -1232967983.83 int 0.00
26-Jan-2009 06:06:23 [SETI@home] [wfd] CPU: runshare 0.56 debt -182391.67 backoff t -1232967983.83 int 0.00
26-Jan-2009 06:06:23 [SETI@home Beta Test] [wfd] CPU: runshare 0.00 debt 150802.51 backoff t -1232967983.83 int 0.00
26-Jan-2009 06:06:23 [---] [wfd] CUDA: shortfall 70796.17 nidle 0.00 total RS 375.00 runnable RS 325.00
26-Jan-2009 06:06:23 [Einstein@Home] [wfd] CUDA: runshare 0.00 debt 84788.88 backoff t -20923.44 int 120.00
26-Jan-2009 06:06:23 [lhcathome] [wfd] CUDA: runshare 0.00 debt 0.00 backoff t -1232967983.83 int 0.00
26-Jan-2009 06:06:23 [SETI@home] [wfd] CUDA: runshare 0.00 debt 96437.87 backoff t -1232967983.83 int 0.00
26-Jan-2009 06:06:23 [SETI@home Beta Test] [wfd] CUDA: runshare 1.00 debt -181226.75 backoff t -1232967983.83 int 0.00
26-Jan-2009 06:06:23 [Einstein@Home] [wfd] overall_debt 2256989.425570
26-Jan-2009 06:06:23 [lhcathome] [wfd] overall_debt -8046.724735
26-Jan-2009 06:06:23 [SETI@home] [wfd] overall_debt 2339600.159914
26-Jan-2009 06:06:23 [SETI@home Beta Test] [wfd] overall_debt -4588542.860749
26-Jan-2009 06:06:23 [---] [wfd] ------- end work fetch state -------
26-Jan-2009 06:06:23 [SETI@home] [wfd] request: CPU (75600.00 sec, 0) CUDA (75600.00 sec, 0)
26-Jan-2009 06:06:23 [SETI@home] [sched_op_debug] Starting scheduler request
26-Jan-2009 06:06:23 [SETI@home] Sending scheduler request: To fetch work.
26-Jan-2009 06:06:23 [SETI@home] CPU work request: 75600.00 seconds, 0 instances
26-Jan-2009 06:06:23 [SETI@home] CUDA work request: 75600.00 seconds, 0 instances
26-Jan-2009 06:06:29 [SETI@home] Scheduler request completed: got 11 new tasks
26-Jan-2009 06:06:29 [SETI@home] [sched_op_debug] Server version 607
26-Jan-2009 06:06:29 [SETI@home] Project requested delay of 11.000000 seconds
26-Jan-2009 06:06:29 [SETI@home] [sched_op_debug] estimated total job duration: 123987 seconds
26-Jan-2009 06:06:29 [SETI@home] [sched_op_debug] Deferring communication for 11 sec
26-Jan-2009 06:06:29 [SETI@home] [sched_op_debug] Reason: requested by project
26-Jan-2009 06:06:29 [---] [work_fetch_debug] Request work fetch: RPC complete
I have 359 tasks downloaded, which will take over 440 hours to complete. That is about 20 times the 21-hour cache I wanted. My settings are a 3-hour network connect interval with 0.75 extra days. The work fetch requests stopped once work went into high priority mode. I have lowered the extra days to 0.125 (3 hours) to see if that helps. This computer is crunching away. I only pushed the update button to make sure these errors were fixed after I restarted the BOINC manager and client. It was midnight, I was tired, and I did not want to wait. It was fixed after the restart.
Message:
25-Jan-2009 23:06:39 [SETI@home] Starting 19dc08ab.3908.34815.6.8.132_0
25-Jan-2009 23:06:44 [SETI@home] [error] Can't create link file slots/2/AK_v8_win_SSSE3x.exe
25-Jan-2009 23:06:45 [SETI@home] Computation for task 19dc08ab.3908.34815.6.8.132_0 finished
25-Jan-2009 23:06:45 [SETI@home] Output file 19dc08ab.3908.34815.6.8.132_0_0 for task 19dc08ab.3908.34815.6.8.132_0 absent
Message:
26-Jan-2009 00:02:01 [Einstein@Home] Starting h1_0725.25_S5R4__956_S5R5a_0
26-Jan-2009 00:02:07 [Einstein@Home] [error] Can't create link file slots/2/einstein_S5R5_3.01_windows_intelx86.exe
26-Jan-2009 00:02:08 [Einstein@Home] Computation for task h1_0725.25_S5R4__956_S5R5a_0 finished
26-Jan-2009 00:02:08 [Einstein@Home] Output file h1_0725.25_S5R4__956_S5R5a_0_0 for task h1_0725.25_S5R4__956_S5R5a_0 absent
Kevin |
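With a log this long it can help to total things up rather than read them by eye. The sketch below is only an illustration: the sample lines are copied from the post above, while the function name `summarize_fetches` and the regular expressions are assumptions of mine, not anything BOINC provides.

```python
import re

# Patterns for three message types that appear in the posted log.
REQUEST_RE = re.compile(r"CPU work request: ([\d.]+) seconds")
GOT_RE = re.compile(r"got (\d+) new tasks")
ESTIMATE_RE = re.compile(r"estimated total job duration: (\d+) seconds")

def summarize_fetches(log_text):
    """Total the CPU seconds requested, tasks received and estimated job seconds."""
    return {
        "requested_sec": sum(float(s) for s in REQUEST_RE.findall(log_text)),
        "tasks": sum(int(n) for n in GOT_RE.findall(log_text)),
        "estimated_sec": sum(int(s) for s in ESTIMATE_RE.findall(log_text)),
    }

# The two successful SETI@home requests from the post.
SAMPLE = """\
26-Jan-2009 06:06:02 [SETI@home] CPU work request: 75600.00 seconds, 0 instances
26-Jan-2009 06:06:12 [SETI@home] Scheduler request completed: got 19 new tasks
26-Jan-2009 06:06:12 [SETI@home] [sched_op_debug] estimated total job duration: 73981 seconds
26-Jan-2009 06:06:23 [SETI@home] CPU work request: 75600.00 seconds, 0 instances
26-Jan-2009 06:06:29 [SETI@home] Scheduler request completed: got 11 new tasks
26-Jan-2009 06:06:29 [SETI@home] [sched_op_debug] estimated total job duration: 123987 seconds
"""

summary = summarize_fetches(SAMPLE)
print(summary)
```

On these two requests alone the client received 30 tasks with an estimated 197968 seconds of work, and the second request was the same size as the first even though the first had just been filled.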
Joined: 5 Mar 08 Posts: 272 |
"I cut it down from 380 lines."
That looks like it's only getting work from SETI. Is that correct? Did it pick up work from other projects, or just from SETI? SETI did have a bug on the server side, which has supposedly been corrected. That doesn't mean 6.6.2 doesn't have issues, but you may need to retry it to see if it's still a problem. MarkJ |
Joined: 12 Aug 06 Posts: 5 |
In my case it was different projects as well. Currently I still have around 10 WCG and 20 PrimeGrid tasks (which is still the reason I can't get the logs). |
Joined: 30 Oct 05 Posts: 1239 |
Use something like PasteBin to store the logs, and give us the link. Kathryn :o) |
Joined: 13 Dec 08 Posts: 6 |
MarkJ, both SETI main and Einstein were making constant requests for work when the computer already had enough for three weeks. I had 359 workunits just for SETI main, approximately 440 hours of work, with BOINC requesting more work approximately every 11 seconds and getting 2 to 20 workunits for each request, 24 times in a row. Something is amiss somewhere. Einstein blew out: I reached my max limit of 16 per day. This error message helped me reach my quota for Einstein.
Message:
26-Jan-2009 00:02:01 [Einstein@Home] Starting h1_0725.25_S5R4__956_S5R5a_0
26-Jan-2009 00:02:07 [Einstein@Home] [error] Can't create link file slots/2/einstein_S5R5_3.01_windows_intelx86.exe
26-Jan-2009 00:02:08 [Einstein@Home] Computation for task h1_0725.25_S5R4__956_S5R5a_0 finished
26-Jan-2009 00:02:08 [Einstein@Home] Output file h1_0725.25_S5R4__956_S5R5a_0_0 for task h1_0725.25_S5R4__956_S5R5a_0 absent
If I had let the SETI main requests continue, I believe I would have had over 1200 workunits to do. That is with a 3-hour contact server interval and 0.75 extra days: 21 hours max, nowhere near the 440-plus hours I had. Luckily BOINC went into high priority mode and stopped requesting more work. I am still running BOINC 6.6.2. It has not had a problem since. Getting ready to test it again. Regards, Kevin |
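The numbers in that post can be checked with a few lines of arithmetic. This sketch takes the cache target to be the connect interval plus the extra days, as the post describes it; the function name `cache_target_hours` is made up for the example.

```python
SECONDS_PER_HOUR = 3600

def cache_target_hours(connect_interval_hours, extra_days):
    """Cache size as described in the post: connect interval plus extra days."""
    return connect_interval_hours + extra_days * 24

target = cache_target_hours(3, 0.75)       # settings quoted in the post
request_hours = 75600 / SECONDS_PER_HOUR   # size of each logged work request
overshoot = 440 / target                   # reported hours of work on hand

print(f"target cache: {target} h")
print(f"each request: {request_hours} h")
print(f"overshoot:    {overshoot:.2f}x")
```

Each logged request of 75600 seconds works out to exactly the 21-hour target, which would be consistent with the client asking for a full cache on every request, although the log alone does not prove that.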
Joined: 21 Dec 08 Posts: 4 |
Is there any progress on this yet? I have the same issue with Aqua (38 WUs), Einstein (16 WUs), Milkyway (19 WUs) and WCG (17 WUs). I am also running GPUGrid, but it has a limit of one WU per processor, so I'm okay there. I set preferences to 1/2 day and set all projects except GPU to no new work. The cc_config.xml file is in place, but there is nothing in the log. On the positive side, it seems like WUs are running faster on 6.6.2. That's just based on observation of CPU times on a few completed WUs, and hopefully it turns out to be true. Thanks! |
Joined: 5 Oct 06 Posts: 5124 |
Report copied from SETI Beta: Has fetching been repaired in BM 6.6.3? No... Again (as in 6.6.2), pressing Update on the beta project leads to a request for 10 WUs with an already full queue, even with NNT set. (BM 6.6.3, Windows Server 2003 x86.) Also, the time format displayed in BM's tasks window on a computer with a non-English localization again differs sporadically (but when selected lines with different time formats are copied to the clipboard, they all appear in the same locale-specific format). |
Joined: 21 Dec 08 Posts: 4 |
"On the positive side, it seems like WUs are running faster on 6.6.2. That's just based on observation of CPU times on a few completed WUs, and hopefully it turns out to be true."
OK |
Joined: 12 Aug 06 Posts: 5 |
It seems that this problem occurs when the client runs into a CUDA shortfall. It doesn't only ask the CUDA projects for work, though; it also asks the non-CUDA projects. By doing this it ends up with a lot more CPU work than needed. boinc logs |
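One way to look for that pattern in a log is to flag every work request that asks for CPU seconds while the preceding work fetch state reported a CPU shortfall of zero. This is only a sketch of the idea: the sample lines come from the log posted earlier in the thread, the function name `find_unneeded_cpu_requests` is hypothetical, and reading "shortfall 0.00" as "no CPU work needed" is an assumption based on the hypothesis above.

```python
import re

# Global shortfall lines, e.g. "[wfd] CUDA: shortfall 72035.52 ..."
SHORTFALL_RE = re.compile(r"\[wfd\] (CPU|CUDA): shortfall ([\d.]+)")
# Request lines, e.g. "[wfd] request: CPU (75600.00 sec, 0) CUDA (75600.00 sec, 0)"
REQUEST_RE = re.compile(
    r"\[wfd\] request: CPU \(([\d.]+) sec, \d+\) CUDA \(([\d.]+) sec, \d+\)"
)

def find_unneeded_cpu_requests(lines):
    """Return requests that ask for CPU work although the CPU shortfall was 0."""
    shortfall = {}
    flagged = []
    for line in lines:
        match = SHORTFALL_RE.search(line)
        if match:
            shortfall[match.group(1)] = float(match.group(2))
            continue
        match = REQUEST_RE.search(line)
        if match and float(match.group(1)) > 0 and shortfall.get("CPU", 0.0) == 0.0:
            flagged.append({
                "cpu_requested": float(match.group(1)),
                "cuda_shortfall": shortfall.get("CUDA", 0.0),
            })
    return flagged

SAMPLE = [
    "26-Jan-2009 06:06:02 [---] [wfd] CPU: shortfall 0.00 nidle 0.00 total RS 375.00 runnable RS 0.00",
    "26-Jan-2009 06:06:02 [---] [wfd] CUDA: shortfall 72035.52 nidle 0.00 total RS 375.00 runnable RS 325.00",
    "26-Jan-2009 06:06:02 [SETI@home] [wfd] request: CPU (75600.00 sec, 0) CUDA (75600.00 sec, 0)",
]

flagged = find_unneeded_cpu_requests(SAMPLE)
print(flagged)
```

On the sample, the single SETI@home request is flagged: it asks for 75600 CPU seconds while only CUDA shows a shortfall, which matches the behaviour described in the post.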
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.