scheduler choking on seti GPU WUs

Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 22660 - Posted: 24 Jan 2009, 17:36:26 UTC
Last modified: 24 Jan 2009, 17:45:20 UTC

I have 4 cores but cannot get any WUs for non-SETI projects. I can only get some work if I suspend SETI and reset all the other projects. Then, when one project does get some jobs in, I have to suspend that one so that other projects can get jobs. I did this about 12 hours ago and now it seems I will have to do it again. If SETI is not suspended I get "asking for 0 cpu and for 0 gpu". If SETI is suspended I still get "asking for 0 cpu" but I see "asking for xxxx gpu" on all other projects. The only way I can force BOINC 6.6.2 to get a non-GPU WU is to then reset a project, and it will then get a minimum, maybe 3 "aqua" tasks. The rest of the projects do not get anything unless I then suspend "aqua" and reset "einstein", for example, to get Einstein a very few WUs.

This is too much babysitting.

This is just a "schedule N resources M ways" problem that surely has been solved before.

Looking at the "transfer" tab in BM, there are hundreds of tasks trying to upload, stuck uploading, retrying, etc. (throughput of 1.2kb). I checked the task manager and there are only 3 threads, so I assume one thread is handling all the uploads. That is nice. I don't mind seeing 201,000,000 page faults for BOINC in under 4 hours, as that shows it is busy, at least for those 4 hours, but I would hate to see 200 threads, or one for each upload. Thanks.

BTW, I have 16 screens of about 50 tasks per screen in BM. All but a handful are SETI GPU-bound.
ID: 22660

Gundolf Jahn
Joined: 20 Dec 07
Posts: 1069
Germany
Message 22671 - Posted: 24 Jan 2009, 22:00:48 UTC - in response to Message 22660.  

...This is too much babysitting...

If you don't wanna babysit, don't run beta software (i.e. CUDA and/or BOINC 6.6 ;-)

Regards,
Gundolf

P.S. I'm pretty sure that your problems are caused by Long Term Debt discrepancies, which in turn are caused by CUDA scheduling bugs.
Computers aren't everything in life. (Little joke)
ID: 22671

Joseph Stateson
Volunteer tester
Joined: 27 Jun 08
Posts: 641
United States
Message 22684 - Posted: 25 Jan 2009, 13:05:02 UTC - in response to Message 22671.  
Last modified: 25 Jan 2009, 13:15:59 UTC

...This is too much babysitting...

If you don't wanna babysit, don't run beta software (i.e. CUDA and/or BOINC 6.6 ;-)

Regards,
Gundolf

P.S. I'm pretty sure that your problems are caused by Long Term Debt discrepancies, which in turn are caused by CUDA scheduling bugs.


After running down all tasks except SETI, I uninstalled 6.6.2-w32 and installed 6.6.2-w64. BM then picked up with the BAM! manager that I was using when I had 6.6.0-w64. I had installed 6.6.2-w32 by mistake. When I was last using it, BAM! had SETI suspended as I was working on SETI Beta. Guess what? The system seems fine: I have a nice mix of Milkyway, SETI Beta, Einstein, and AQUA.

However, under messages I see that Einstein has hit its quota limit of 64 and Milkyway its quota limit of 8 per CPU (32). I do not see that message from AQUA only because I had stopped new tasks.

I think the SETI people are simply running their jobs through and ignoring "no new tasks" (NNT), and that is contributing to the problem. Since NNT relies on the server to not send more, they can just ignore it, and I think that is what has been happening.

What makes my new BM schedule look better is the quota limit set by the projects I am servicing. I no longer have 800 pending tasks. I have had about 120 pending tasks since about 10 hours ago. This is not due to the BOINC scheduler; it is due to the current projects using a quota limit. I don't think that SETI or SETI Beta have a quota limit. Right now I have SETI suspended and SETI Beta on NNT, and that keeps my CPU nicely busy and not overrun or underrun. This is babysitting, of course, and the quota limit is a major player.

Something else to think about: if GPUGRID started ignoring NNT, I suspect the BOINC developers would quickly come up with a BM fix to prevent GPUGRID from hogging bandwidth and cores, probably within hours of noticing the problem. This won't happen with SETI, of course, since they are both in bed together. An easy fix would be to suspend any project that ignored NNT. I suspect they would do this to any project other than SETI.
ID: 22684

Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5091
United Kingdom
Message 22686 - Posted: 25 Jan 2009, 13:37:47 UTC - in response to Message 22684.  

I don't think that SETI or SETI Beta have a quota limit.

Yes they do, but it's probably bigger than you're used to seeing:

From message 856650:

01/22/09 20:21:10|SETI@home|Message from server: No work sent
01/22/09 20:21:10|SETI@home|Message from server: No work is available for SETI@home Enhanced
01/22/09 20:21:10|SETI@home|Message from server: (reached daily quota of 700 results)

In another post, someone worked out that with their multi-core CPUs and multiple graphics cards, their 'per computer' daily quota would be 2800 (with the newly-increased quota limits counting a CUDA card as 5 CPU cores for quota purposes).
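
As a rough sketch of that arithmetic (not the actual server code): assume the per-host quota is a per-CPU figure multiplied by the core count, with each CUDA card counted as 5 cores, as described above. The function name `daily_quota`, the per-CPU value of 100, and the 8-core, 4-card host are all hypothetical, chosen only because they reproduce the totals mentioned in this thread.

```python
def daily_quota(per_cpu_quota: int, ncpus: int, ngpus: int = 0,
                gpu_weight: int = 5) -> int:
    """Estimate a per-host daily result quota.

    Assumption: the quota scales linearly with core count and each CUDA
    card counts as `gpu_weight` CPU cores (5, per the post above).
    """
    if per_cpu_quota < 0 or ncpus < 0 or ngpus < 0 or gpu_weight < 0:
        raise ValueError("quota inputs must be non-negative")
    return per_cpu_quota * (ncpus + gpu_weight * ngpus)


# Figure reported earlier in the thread: 8 per CPU on a 4-core host.
milkyway_total = daily_quota(per_cpu_quota=8, ncpus=4)

# Hypothetical host: 100 per CPU, 8 cores, 4 CUDA cards (assumed values).
big_host_total = daily_quota(per_cpu_quota=100, ncpus=8, ngpus=4)

print(milkyway_total, big_host_total)
```

Under these assumptions the first call gives the 32 reported for Milkyway and the second gives 2800; other combinations of cores and cards could produce the same total.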
ID: 22686

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.