Message boards : Server programs : Feeder
Author | Message |
---|---|
Joined: 19 Jul 07 Posts: 17 |
Based on a discussion in this SETI thread, I would like to know if adding a double buffer to the feeder would result in a more efficient (and higher throughput) system. As most of you know, SETI is the largest project by number of volunteers, and their servers are extremely busy. What performance metrics do we have to measure this part of the BOINC server-side system? My goal is to help BOINC scale up better. |
Joined: 29 Aug 05 Posts: 15581 |
Forwarded to developers. |
Joined: 10 Sep 05 Posts: 728 |
Not sure what you mean by double buffer. But in any case, the feeder isn't usually a bottleneck. When it is a bottleneck, it's because its DB query runs slowly, and this is a MySQL issue (typically it means the MySQL server doesn't have enough RAM). |
Joined: 19 Jul 07 Posts: 17 |
Having a slow database connection (relative to local storage or system memory) is typical for most applications, so here is the idea for working around that. What I meant by a double buffer system was for the feeder to have two threads and two buffers (queues). The scheduler pulls from queue 1 while the feeder is busy filling queue 2. When queue 1 is empty, the two swap buffers. In theory, this should give the scheduler more throughput and tolerate higher latency on the database connection while a buffer is being filled. |
Joined: 14 Mar 09 Posts: 215 |
I see what you mean: running off generator 1 while generator 2 is being refilled, then switching back and forth, like a greenhouse running on battery backup (in this case, two battery backup systems)... sort of the way we have the splitters. This way, requests for work would be served 24/7 and everyone would get work... but it doesn't help the bandwidth issue one darn bit. Instead of work being created in 10-minute intervals or whatever, there would be constant work, as long as we have enough raw data to go around. |
Joined: 19 Jul 07 Posts: 17 |
By reducing the failure rate (the times clients connect but don't get any work because the feeder/scheduler is too busy), clients will request work less often, reducing bandwidth (a little). Also, by making the feeder/scheduler faster, the response time may improve too, further reducing overall bandwidth. Just a theory. The big assumption is that the feeder is heavily delayed by the database and that the scheduler spends a significant amount of time waiting on the feeder. |
Joined: 19 Jan 07 Posts: 1179 |
"When it is a bottleneck, it's because its DB query runs slowly," Or it may mean that the database layout is horrible (XML blobs causing fragmentation). |
Joined: 19 Jan 07 Posts: 1179 |
"What I meant by a double buffer system was to have the feeder have two threads and two buffers (queues). The scheduler pulls from queue 1 while the feeder is busy filling queue 2. When queue 1 is empty, they both swap buffers. In theory, this should give the scheduler more throughput and allow for higher latencies on the database connection when filling the buffer." I don't understand how that would help. In the current code, while the feeder is busy filling the one and only buffer, the scheduler can still "pull from it". You seem to think the scheduler is locked from using the buffer while the feeder is doing a DB query to refill it. |
Joined: 19 Jul 07 Posts: 17 |
Well, my assumption was that there is a concurrency issue, and having two buffers would reduce the latency. But if just having a larger buffer would have the same benefit as two smaller buffers, then it would just make a lot more sense to have a bigger buffer (no programming). |
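
To make the two designs compared in this thread concrete, here is a minimal Python sketch. It is not BOINC code: the class names `DoubleBuffer` and `SlotBuffer`, their methods, and the `fetch` callback (standing in for the feeder's DB query) are all hypothetical, chosen only for illustration. The first class is the proposed two-queue swap; the second is one shared buffer whose empty slots are refilled in place, with the slow fetch done while no lock is held, so a consumer is never blocked for the duration of the query.

```python
import threading
from collections import deque


class DoubleBuffer:
    """Proposed design: consumers drain a front queue, the producer fills a back queue."""

    def __init__(self):
        self._front = deque()
        self._back = deque()
        self._lock = threading.Lock()

    def fill(self, items):
        """Producer side (the feeder's role): append items to the back queue."""
        with self._lock:
            self._back.extend(items)

    def take(self):
        """Consumer side (the scheduler's role): pop one item, swapping queues when the front is empty."""
        with self._lock:
            if not self._front:
                self._front, self._back = self._back, self._front
            return self._front.popleft() if self._front else None


class SlotBuffer:
    """Single shared buffer: a fixed array of slots, refilled slot by slot."""

    def __init__(self, size):
        self._slots = [None] * size  # None marks an empty slot
        self._lock = threading.Lock()

    def refill(self, fetch):
        """Fill empty slots with items from fetch(n); returns how many were placed.

        fetch is a hypothetical stand-in for a slow DB query. It is called
        with no lock held, so take() keeps working while it runs.
        Assumes a single producer, so slots can only become emptier meanwhile.
        """
        with self._lock:
            wanted = self._slots.count(None)
        items = fetch(wanted) if wanted else []
        placed = 0
        with self._lock:
            for i in range(len(self._slots)):
                if placed == len(items):
                    break
                if self._slots[i] is None:
                    self._slots[i] = items[placed]
                    placed += 1
        return placed

    def take(self):
        """Remove and return the first occupied slot's item, or None if all slots are empty."""
        with self._lock:
            for i, item in enumerate(self._slots):
                if item is not None:
                    self._slots[i] = None
                    return item
        return None
```

In both sketches the lock is held only for short in-memory operations, never across the fetch. Under that assumption the consumer can always take whatever is currently buffered, which is why two queues of size N behave much like one buffer of size 2N here, in line with the conclusion reached above.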
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.