Thread 'Feeder'

Author	Message
DJStarfox Joined: 19 Jul 07 Posts: 17	Message 26070 - Posted: 17 Jul 2009, 17:28:54 UTC Based on a discussion in this SETI thread, I would like to know whether adding a double buffer to the feeder would result in a more efficient, higher-throughput system. As most of you know, SETI is the largest project by number of volunteers, and its servers are extremely busy. What performance metrics do we have for measuring this part of the BOINC server-side system? My goal is to help BOINC scale up better.

Jord Volunteer tester Help desk expert Joined: 29 Aug 05 Posts: 15581	Message 26072 - Posted: 17 Jul 2009, 18:21:30 UTC - in response to Message 26070. Forwarded to developers.

David Anderson Volunteer moderator Project administrator Project developer Joined: 10 Sep 05 Posts: 728	Message 26074 - Posted: 17 Jul 2009, 18:32:23 UTC - in response to Message 26070. Not sure what you mean by double buffer. But in any case, the feeder isn't usually a bottleneck. When it is a bottleneck, it's because its DB query runs slowly, and this is a MySQL issue (typically it means the MySQL server doesn't have enough RAM).

DJStarfox Joined: 19 Jul 07 Posts: 17	Message 26075 - Posted: 17 Jul 2009, 18:41:56 UTC - in response to Message 26074. Having a slow database connection (relative to local storage or system memory) is typical for most applications, so in order to work around that... What I meant by a double buffer system was for the feeder to have two threads and two buffers (queues). The scheduler pulls from queue 1 while the feeder is busy filling queue 2. When queue 1 is empty, they swap buffers. In theory, this should give the scheduler more throughput and tolerate higher latencies on the database connection when filling the buffer.
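
The double buffer idea in the post above can be sketched roughly as follows. This is only an illustration in Python, not BOINC code: the class name `DoubleBuffer`, its methods, and the use of a single lock are all assumptions made for the example. The feeder would call `put` as jobs arrive from the database, and the scheduler would call `get`, which swaps the two queues once the front one is empty.

```python
import threading
from collections import deque


class DoubleBuffer:
    """Hypothetical two-queue buffer: one queue is drained while the other fills."""

    def __init__(self):
        self._front = deque()  # the scheduler takes jobs from this queue
        self._back = deque()   # the feeder appends jobs to this queue
        self._lock = threading.Lock()

    def put(self, job):
        """Feeder side: append a job to the back queue."""
        with self._lock:
            self._back.append(job)

    def get(self):
        """Scheduler side: take the next job, swapping queues if the front is empty.

        Returns None when both queues are empty.
        """
        with self._lock:
            if not self._front:
                # Front queue is drained: swap roles with the back queue.
                self._front, self._back = self._back, self._front
            return self._front.popleft() if self._front else None
```

The lock is held only for the in-memory append, pop, or swap, so a slow database query on the feeder side would not block `get` in this sketch.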

ZPM Joined: 14 Mar 09 Posts: 215	Message 26079 - Posted: 17 Jul 2009, 20:43:00 UTC - in response to Message 26075. Last modified: 17 Jul 2009, 20:45:30 UTC I see what you mean: running off generator 1 while generator 2 is being refilled, then switching back and forth, like a greenhouse running on battery backup (in this case, two battery backup systems). It's sort of the way we have the splitters. This way, requests for work would be served 24/7 and everyone would get work. But it doesn't help the bandwidth issue one darn bit. Instead of work being created at 10-minute intervals or whatever, there would be constant work, as long as we have enough raw data to go around.

DJStarfox Joined: 19 Jul 07 Posts: 17	Message 26083 - Posted: 17 Jul 2009, 22:26:26 UTC - in response to Message 26079. By reducing the failure rate (the times clients connect but don't get any work because the feeder/scheduler is too busy), clients will request work less often, reducing bandwidth (a little). Also, by making the feeder/scheduler faster, the response time may improve too, further reducing overall bandwidth. Just a theory. The big assumption is that the feeder is heavily delayed by the database and that the scheduler spends a significant amount of time waiting on the feeder.

Nicolas Joined: 19 Jan 07 Posts: 1179	Message 26101 - Posted: 19 Jul 2009, 1:54:43 UTC - in response to Message 26074. "When it is a bottleneck, it's because its DB query runs slowly, and this is a MySQL issue (typically it means the MySQL server doesn't have enough RAM)" Or it may mean that the database layout is horrible (XML blobs causing fragmentation).

Nicolas Joined: 19 Jan 07 Posts: 1179	Message 26102 - Posted: 19 Jul 2009, 1:56:46 UTC - in response to Message 26075. "What I meant by a double buffer system was to have the feeder have two threads and two buffers (queues). The scheduler pulls from queue 1 while the feeder is busy filling queue 2. When queue 1 is empty, they both swap buffers. In theory, this should give the scheduler more throughput and allow for higher latencies on the database connection when filling the buffer." I don't understand how that would help. In the current code, while the feeder is busy filling the one and only buffer, the scheduler can still "pull from it". You seem to think the scheduler is locked out of the buffer while the feeder is doing a DB query to refill it.
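
The behaviour described in the post above, a single buffer that the scheduler can keep reading while the feeder refills it, might look something like the sketch below. It is a simplified Python model, not the actual BOINC feeder: `SlotArray`, `refill`, and `take` are hypothetical names, and the key assumption is that the slow database query runs outside the lock, so the lock covers only the in-memory slot updates.

```python
import threading

EMPTY = None  # marker for a slot that currently holds no job


class SlotArray:
    """Hypothetical fixed-size array of job slots shared by feeder and scheduler."""

    def __init__(self, size):
        self._slots = [EMPTY] * size
        self._lock = threading.Lock()

    def empty_count(self):
        """Number of slots the feeder could refill right now."""
        with self._lock:
            return sum(1 for slot in self._slots if slot is EMPTY)

    def refill(self, jobs):
        """Feeder side: copy already-fetched jobs into empty slots.

        The database query is assumed to have finished before this call.
        Returns how many jobs were stored; extras are left for the caller.
        """
        stored = 0
        with self._lock:
            for i, slot in enumerate(self._slots):
                if stored == len(jobs):
                    break
                if slot is EMPTY:
                    self._slots[i] = jobs[stored]
                    stored += 1
        return stored

    def take(self):
        """Scheduler side: claim the first filled slot, or return None."""
        with self._lock:
            for i, slot in enumerate(self._slots):
                if slot is not EMPTY:
                    self._slots[i] = EMPTY
                    return slot
        return None
```

In this model the scheduler only comes up empty when every slot has been drained, not merely because a refill query is in flight.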

DJStarfox Joined: 19 Jul 07 Posts: 17	Message 26106 - Posted: 19 Jul 2009, 2:45:10 UTC - in response to Message 26102. Well, my assumption was that there is a concurrency issue and that having two buffers would reduce the latency. But if a larger buffer would have the same benefit as two smaller buffers, then it would make a lot more sense to just have a bigger buffer (no programming required).
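
One way to reason about the buffer-size question above is a toy simulation that counts how many work requests find the buffer empty for a given capacity and query latency. The Python sketch below is purely illustrative and rests on strong assumptions that are not taken from BOINC: time advances in fixed ticks, demand is constant, only one refill query is in flight at a time, and each query returns exactly as many jobs as there were empty slots when it started. The function name `simulate` and its parameters are hypothetical.

```python
def simulate(capacity, latency, demand, ticks):
    """Return the number of work requests that found the buffer empty.

    capacity: number of job slots in the buffer
    latency:  ticks one refill query takes to complete
    demand:   work requests arriving per tick
    ticks:    number of ticks to simulate
    """
    filled = capacity  # start with a full buffer
    pending = None     # (finish_tick, job_count) of the query in flight
    misses = 0
    for now in range(ticks):
        # A query that finishes on this tick delivers its jobs.
        if pending is not None and pending[0] == now:
            filled += pending[1]
            pending = None
        # If no query is running, start one for the currently empty slots.
        if pending is None and filled < capacity:
            pending = (now + latency, capacity - filled)
        # Serve this tick's requests from whatever the buffer holds.
        served = min(demand, filled)
        filled -= served
        misses += demand - served
    return misses
```

Running `simulate` over a range of `capacity` values for a fixed `latency` and `demand` gives a rough failure-rate curve for a single buffer under these assumptions; it says nothing about how a real server would behave.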

Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.