Message boards : News : New BOINC project studies machine learning
Joined: 8 Nov 19, Posts: 718
System requirements (64-bit CPU only): Linux, Windows
Future support: OS X, Linux/arm, Linux/ppc64le
Not supported: GPU

Too bad, this kind of project could gain so much traction if GPUs were used...
Joined: 8 Nov 10, Posts: 310
> Too bad, this kind of project could gain so much traction if GPUs were used...

I was wondering about that. You could probably raise that point more effectively than I could on their forum. Maybe some volunteers could help? It is a nice project.
Joined: 28 Jun 10, Posts: 2676
> System requirements (64 bit CPU only):

If it is similar to CPDN, where each calculation depends on the previous one, then it is possible a GPU won't give much of a boost, if any. Of course, it may well be able to make use of parallel processing; I am just putting forward a possible reason.
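To make the dependency point concrete, here is a minimal CUDA sketch (purely illustrative, not taken from this project; the update rule and sizes are invented): when each step of a calculation needs the result of the previous step, a GPU can only spread threads across independent elements or ensemble members, never across the steps themselves.

```cuda
// Illustrative sketch only (not project code): why a calculation in which
// each step depends on the previous one limits what a GPU can add.
// Threads are spread across independent elements; the time loop inside
// each thread stays strictly sequential.
#include <cuda_runtime.h>

__global__ void evolve(float *state, int n_elements, int n_steps)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_elements) return;

    float x = state[i];
    for (int t = 0; t < n_steps; ++t) {
        // Hypothetical update rule: the next value needs the current one,
        // so these iterations cannot run concurrently.
        x = 0.5f * x * (1.0f - x);
    }
    state[i] = x;
}

int main()
{
    const int n = 1 << 20;   // many independent elements to spread over threads
    const int steps = 1000;  // sequential steps per element

    float *d_state = nullptr;
    cudaMalloc((void **)&d_state, n * sizeof(float));
    cudaMemset(d_state, 0, n * sizeof(float));

    // All parallelism comes from the element dimension, none from the step dimension.
    evolve<<<(n + 255) / 256, 256>>>(d_state, n, steps);
    cudaDeviceSynchronize();

    cudaFree(d_state);
    return 0;
}
```

If the workload is one long chain with few independent elements, most of the GPU sits idle, which is the scenario described above; with many independent elements (or ensemble runs), the same kernel keeps thousands of cores busy.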
Joined: 12 Feb 11, Posts: 419
> Too bad, this kind of project could gain so much traction if GPUs were used...

We are at the beginning of the project...
Joined: 8 Nov 19, Posts: 718
> System requirements (64 bit CPU only):

That only holds if the project depends on double precision (FP64); even then, an RTX 2080 at ~300 W is about 10x faster than a 15 W dual-core Celeron at the same frequency. GPUs can also keep results in VRAM, re-reading and re-processing them through the shaders/cores without another trip across the PCIe bus.

If the project can run in single precision (FP32), an RTX 2080 Ti is roughly 1,000-1,500x faster (in terms of FLOPS) than a dual-core Celeron at the same frequency. Long WUs may require a lot of VRAM, especially if they are tailored to the GPU they're feeding: an RTX 2080 Ti has ~4,350 cores and 11 GB of VRAM, so running one WU per core would leave each WU less than 2.5 MB of VRAM, unless the work is parallelized so that blocks of cores (shaders) run simultaneously while sharing blocks of VRAM.

If half precision (16-bit) worked, Nvidia and AMD are doubling their half-precision throughput by packing two 16-bit values per core: rather than one value per instruction, they fit two. I'm not sure whether this will be implemented in the current line of GPUs. But to give you an idea, current AI runs at quarter (8-bit) or half (16-bit) precision and uses between 10 and 300 cores. Can you imagine the processing power if it ran on an RTX 2080 Ti (with over 4,000 cores)? It should push learning algorithms and AI toward near real time.
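For the packed half-precision point above, here is a minimal CUDA sketch (again illustrative only, assuming an NVIDIA GPU of compute capability 5.3 or newer, and saying nothing about this project's numerics): the `half2` type holds two 16-bit values, and a single intrinsic such as `__hfma2` operates on both at once, which is where the "two per core" throughput doubling comes from.

```cuda
// Illustrative sketch only: packed FP16 on NVIDIA GPUs (compute capability >= 5.3).
// One half2 register holds two 16-bit values, and one intrinsic processes both,
// doubling arithmetic throughput relative to scalar half precision.
#include <cuda_fp16.h>

__global__ void axpy_half2(const __half2 *x, __half2 *y, const __half2 a, int n_pairs)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_pairs) return;

    // Fused multiply-add on BOTH packed halves in a single call:
    //   low half:  y.lo = a.lo * x.lo + y.lo
    //   high half: y.hi = a.hi * x.hi + y.hi
    y[i] = __hfma2(a, x[i], y[i]);
}
```

Whether a given workload can tolerate FP16 at all is, as the post notes, the open question; the sketch only shows where the claimed 2x comes from.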
Joined: 12 Feb 11, Posts: 419
The source code of the project is open, so if someone wants to help with the development...