The BOINC Wrapper
An existing application (or sequence of applications) can be run under BOINC using a wrapper program supplied by BOINC. The wrapper runs the applications as subprocesses, and handles all communication with the BOINC client (e.g., to report CPU time and fraction done).
The source code of the wrapper is in boinc/samples. You can get pre-compiled versions here:
The job description file
The wrapper reads a file with logical name 'job.xml'. This file has the format:
<job_desc>
    <task>
        <application>worker</application>
        [ <stdin_filename>stdin_file</stdin_filename> ]
        [ <stdout_filename>stdout_file</stdout_filename> ]
        [ <stderr_filename>stderr_file</stderr_filename> ]
        [ <command_line>--foo bar</command_line> ]
        [ <weight>X</weight> ]
        [ <checkpoint_filename>filename</checkpoint_filename> ]
        [ <fraction_done_filename>filename</fraction_done_filename> ]
        [ <exec_dir>dirname</exec_dir> ]
        [ <multi_process/> ]
        [ <setenv>VARNAME=VAR_VALUE</setenv> ]
        [ <daemon/> ]
        [ <append_cmdline_args/> ]
        [ <time_limit>X</time_limit> ]
        [ <priority>N</priority> ]
    </task>
    [ other <task>s ]
    [ <unzip_input>
        <zipfilename>foo.zip</zipfilename>
        ...
    </unzip_input> ]
    [ <zip_output>
        <zipfilename>foo.zip</zipfilename>
        <filename>regexp</filename>
        ...
    </zip_output> ]
    [ <rename_output>
        <filename>foo</filename>
    </rename_output> ]
</job_desc>
The job file describes a sequence of tasks. The descriptor for each task includes:
- application: the logical name of the application, or 'worker program'.
- stdin_filename, stdout_filename, stderr_filename: the logical names of the files to which stdin, stdout, and stderr are to be connected (if any).
- command_line: command-line arguments to be passed to the worker program. This string is macro-substituted as follows:
  - $NTHREADS is replaced with the number of CPUs the client is allocating for this job.
  - $GPU_DEVICE_NUM is replaced with the device number of the GPU allocated to this job.
  - $PROJECT_DIR is replaced with the absolute path of the project directory.
  - $PWD is replaced with the absolute path of the current working directory.
  - boinc_resolve(foo) is replaced with the return value of boinc_resolve_filename(foo), which eliminates the need to use the <copy_file/> attribute for foo. Note: this substitution is applied to the whole command line, including any parts added via <append_cmdline_args/>, immediately before the task is executed. The other substitutions are applied when the wrapper reads job.xml at startup.
- weight: the contribution of each task to the overall fraction done is proportional to its weight (floating-point, default 1). For example, if your job has tasks A and B, and A uses 100 times more CPU time than B, set A.weight=100 and B.weight=1.
- checkpoint_filename: the name of the checkpoint file used by the app, if any. When this file is modified, the wrapper assumes that a checkpoint has been completed and notifies the core client.
- fraction_done_filename: the name of a file to which the app will periodically write its fraction done (0 to 1). The wrapper uses this to report the overall fraction done.
- exec_dir: the directory to start the application in (relative to the slot directory, or use the $PROJECT_DIR macro).
- multi_process: include this if the application creates multiple processes. Note: each parent process must wait for its children to exit.
- setenv: an environment variable needed by the application. You can have more than one <setenv> entry. Use the VARNAME=VAR_VALUE form, e.g. LD_LIBRARY_PATH=$PROJECT_DIR:$LD_LIBRARY_PATH. You can also use the $NTHREADS or $GPU_DEVICE_NUM macro.
- daemon: this task is a 'daemon' process that runs in the background asynchronously while the other tasks are run sequentially. The wrapper shuts down this daemon when the last task has exited.
- append_cmdline_args: if set, the wrapper's command-line arguments (specified in the input template) are passed to the worker program, after those in <command_line>.
- time_limit: if given, kill the task after the given amount of elapsed (running) time. Note: on Windows, tasks are killed using TerminateProcess(), which doesn't flush stdio buffers; take this into account.
- priority: set the process priority of the task based on N:
  - 1: lowest (Win: IDLE; Unix: 19)
  - 2: low (Win: BELOW_NORMAL; Unix: 10)
  - 3: normal (Win: NORMAL; Unix: 0)
  - 4: high (Win: ABOVE_NORMAL; Unix: -10)
  - 5: highest (Win: HIGH; Unix: -16)
Use high priorities only for tasks that use little CPU time.
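To make the weight and fraction-done fields concrete, here is a minimal Python sketch of how per-task weights might combine into an overall fraction done. The function name and the exact formula (weight of finished tasks, plus the running task's weight times its own fraction done, divided by the total weight) are assumptions for illustration; the wrapper's actual computation may differ in detail.

```python
def overall_fraction_done(weights, current_index, current_fraction):
    """Estimate overall progress for a sequence of weighted tasks.

    weights          -- per-task weights, in execution order
    current_index    -- index of the task that is currently running
    current_fraction -- that task's own fraction done (0 to 1)
    """
    total = sum(weights)
    if total <= 0:
        return 0.0
    # Tasks before the current one are assumed to be fully done.
    done = sum(weights[:current_index])
    # Clamp the reported fraction so a bad value cannot exceed the task's share.
    clamped = min(max(current_fraction, 0.0), 1.0)
    done += weights[current_index] * clamped
    return done / total
```

With tasks A (weight 100) and B (weight 1), calling overall_fraction_done([100, 1], 0, 0.5) gives 50/101, so the progress bar tracks A almost entirely, which is the point of weighting by CPU time.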
The job file can specify multiple tasks. This is useful for two purposes:
- To handle jobs that involve multiple steps (e.g., pre-processing and post-processing).
- To break a long job up into smaller pieces. This provides a form of checkpointing: the wrapper checkpoints at the task level, so lost CPU time is limited even if the legacy applications themselves are not restartable.
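Task-level checkpointing can be pictured with the following Python sketch. It is not the wrapper's code: the state file name, the run_one callback, and the "count of completed tasks" format are all hypothetical, chosen only to show why a restart loses at most one task's worth of work.

```python
import os

def run_tasks(tasks, run_one, state_path="wrapper_checkpoint.txt"):
    """Run tasks in order, skipping those finished in an earlier run.

    tasks      -- list of task descriptions (any objects)
    run_one    -- callable that runs one task and returns its exit status
    state_path -- hypothetical file holding the number of completed tasks
    """
    completed = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            completed = int(f.read().strip() or 0)

    for index in range(completed, len(tasks)):
        status = run_one(tasks[index])
        if status != 0:
            # A nonzero exit status is treated as an error; stop here.
            return status
        # Write to a temporary file and rename it, so a crash during the
        # write cannot leave a corrupt state file behind.
        tmp = state_path + ".tmp"
        with open(tmp, "w") as f:
            f.write(str(index + 1))
        os.replace(tmp, state_path)
    return 0
```

If the process is interrupted during the third of five tasks, the next call starts again at the third task rather than the first.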
The wrapper can optionally unzip input files:
- unzip_input: before running tasks, the wrapper unzips the specified input files.
The wrapper can optionally zip output files:
- zip_output: after all tasks are completed, the wrapper zips the output files (specified by one or more regular expressions) into a zip file with the given name.
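The effect of selecting output files by regular expression can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, and it assumes a pattern matches when it is found anywhere in the file name, which may not be the matching rule the wrapper applies.

```python
import os
import re
import zipfile

def zip_outputs(zip_name, patterns, directory="."):
    """Zip every file in `directory` whose name matches one of `patterns`."""
    compiled = [re.compile(p) for p in patterns]
    added = []
    with zipfile.ZipFile(zip_name, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            if not os.path.isfile(path):
                continue
            if os.path.abspath(path) == os.path.abspath(zip_name):
                continue  # never add the archive to itself
            # Assumption: a pattern matches if found anywhere in the name.
            if any(c.search(name) for c in compiled):
                archive.write(path, arcname=name)
                added.append(name)
    return added
```

Anchoring the pattern (for example ^out.*\.txt$) keeps unrelated files such as logs out of the archive.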
The wrapper can optionally rename output files:
- rename_output: for this to work, a result file with <open_name>foo.link</open_name> has to be specified in the result template. After all tasks are completed, a file "foo" created by the application is moved to become the result file, without using additional disk space or copy time. Note: the link name to be specified in the result template is currently hardcoded, but this may change in a future version.
Notes:
- One or more of the tasks may be multi-threaded and/or use GPUs.
- Normally the job file is part of the application version (it's the same between workunits). Alternatively, it can be part of the workunit (e.g. if its command line elements differ between workunits). This requires that you use the same worker program logical names for all platforms.
- Files opened directly by a worker program must have the <copy_file/> tag. This requires version 5.5 or higher of the BOINC core client (you can specify this limit at either the application or project level).
- Worker programs must exit with zero status; nonzero values are interpreted as errors by the wrapper.
- If you run the wrapper in standalone mode (while debugging), you must provide input files with the proper logical, not physical, names.
- The job file may be slightly different for different platforms (i.e. app_versions) due to directory requirements (exec_dir) and environment variables (setenv) required. You will therefore want to make and track different versions for each app_version you are supporting.
The wrapper has the following command-line options:
--device N: macro-substitute N for $GPU_DEVICE_NUM in worker command lines and environment variables.
--nthreads X: macro-substitute X for $NTHREADS in worker command lines and environment variables.
--trickle X: send a trickle-up message reporting runtime every X seconds.
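The macro substitution driven by --device and --nthreads can be illustrated with a short Python sketch. The function name and signature are hypothetical, and plain string replacement is an assumption about how the substitution behaves; the sketch covers only the $NTHREADS, $GPU_DEVICE_NUM, $PROJECT_DIR, and $PWD macros and leaves any other $NAME text untouched.

```python
import os

def substitute_macros(text, nthreads, gpu_device_num, project_dir, cwd=None):
    """Replace the documented macros in a command line or setenv value."""
    if cwd is None:
        cwd = os.getcwd()
    replacements = [
        ("$NTHREADS", str(nthreads)),
        ("$GPU_DEVICE_NUM", str(gpu_device_num)),
        ("$PROJECT_DIR", project_dir),
        ("$PWD", cwd),
    ]
    # Assumption: each macro is replaced wherever it appears in the string.
    for macro, value in replacements:
        text = text.replace(macro, value)
    return text
```

For example, substitute_macros("--threads $NTHREADS --gpu $GPU_DEVICE_NUM", 4, 1, "/proj", "/slot") returns "--threads 4 --gpu 1".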
Assume you have an executable program for a particular platform (say "worker_windows_intelx86_0.exe" for Win32). The program reads from a file named 'in' and writes to a file named 'out'. (You can use the program in samples/worker/ for this purpose.)
We assume that you have already created a project with root directory PROJECT/. Now:
- Download the wrapper for Win32 (see links above) to your server. Assume the filename is wrapper_25825_windows_intelx86.exe.
- Create an application named 'worker'.
- Create the directory hierarchy
apps/
    worker/
        1.0/
            windows_intelx86/
- Put the files wrapper_25825_windows_intelx86.exe and worker_windows_intelx86_0.exe in the bottom directory.
- In the same directory, create a file worker_job_1.0.xml (1.0 is a version number) containing
<job_desc>
    <task>
        <application>worker</application>
        <command_line>10</command_line>
    </task>
</job_desc>
- In the same directory, create a file version.xml containing
<version>
    <file>
        <physical_name>wrapper_25825_windows_intelx86.exe</physical_name>
        <main_program/>
    </file>
    <file>
        <physical_name>worker_windows_intelx86_0.exe</physical_name>
        <logical_name>worker</logical_name>
    </file>
    <file>
        <physical_name>worker_job_1.0.xml</physical_name>
        <logical_name>job.xml</logical_name>
    </file>
</version>
- In the 'PROJECT/templates' directory create a workunit template file called 'worker_in':
<file_info>
    <number>0</number>
</file_info>
<workunit>
    <file_ref>
        <file_number>0</file_number>
        <open_name>in</open_name>
        <copy_file/>
    </file_ref>
    <rsc_fpops_bound>1e12</rsc_fpops_bound>
    <rsc_fpops_est>1e14</rsc_fpops_est>
</workunit>
and a result template file called 'worker_out':
<file_info>
    <name><OUTFILE_0/></name>
    <generated_locally/>
    <upload_when_present/>
    <max_nbytes>5000000</max_nbytes>
    <url><UPLOAD_URL/></url>
</file_info>
<result>
    <file_ref>
        <file_name><OUTFILE_0/></file_name>
        <open_name>out</open_name>
        <copy_file/>
    </file_ref>
</result>
- Run bin/update_versions to create an app version.
- Run bin/start to start the daemons.
- Create an input file named input, and run a script like
#! /bin/sh
bin/stage_file input
bin/create_work --appname worker --wu_name worker_nodelete input
to generate a workunit.
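The directory and job-file steps of this walkthrough can also be scripted. The Python sketch below is a hypothetical helper, not part of BOINC: it creates the apps/worker/1.0/windows_intelx86/ hierarchy under a given project root and writes the job file shown earlier. Copying the two executables and writing version.xml are left out, and the default argument values are simply the ones used in this example.

```python
from pathlib import Path

# Job file contents from the walkthrough; only the command line varies.
JOB_XML = """<job_desc>
    <task>
        <application>worker</application>
        <command_line>{command_line}</command_line>
    </task>
</job_desc>
"""

def create_version_dir(project_root, app="worker", version="1.0",
                       platform="windows_intelx86", command_line="10"):
    """Create apps/<app>/<version>/<platform>/ and write the job file there."""
    version_dir = Path(project_root) / "apps" / app / version / platform
    version_dir.mkdir(parents=True, exist_ok=True)
    # The file is named <app>_job_<version>.xml, as in the steps above.
    job_path = version_dir / f"{app}_job_{version}.xml"
    job_path.write_text(JOB_XML.format(command_line=command_line))
    return version_dir
```

Calling create_version_dir("PROJECT") from the directory that contains the project root produces PROJECT/apps/worker/1.0/windows_intelx86/worker_job_1.0.xml.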
Physical file management
You can use the wrapper together with physical file management, where you directly access files in your project directory. For example, you could create a job whose first task unpacks a zip file into the project directory, and whose subsequent tasks access these files.
The support for this is:
- If a worker program name begins with "$PROJECT_DIR", that substring is replaced with the project directory, and the name is treated as a physical name.
- In task command lines, "$PROJECT_DIR" is replaced with the project directory.
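The first rule can be summarized with a small Python sketch. The function and its resolve_logical parameter are hypothetical stand-ins (the latter for whatever maps a logical name to a physical path, such as boinc_resolve_filename); the sketch only shows the branching between physical and logical names.

```python
def resolve_worker_path(app_name, project_dir, resolve_logical):
    """Return (path, is_physical) for a worker program name.

    app_name        -- the <application> value from the job file
    project_dir     -- absolute path of the project directory
    resolve_logical -- hypothetical callable mapping a logical name to a path
    """
    prefix = "$PROJECT_DIR"
    if app_name.startswith(prefix):
        # Treated as a physical name: only the prefix is substituted.
        return project_dir + app_name[len(prefix):], True
    # Otherwise the name is logical and goes through normal resolution.
    return resolve_logical(app_name), False
```

A name such as $PROJECT_DIR/bin/tool therefore bypasses logical-name resolution entirely, which is what lets later tasks use files that an earlier task unpacked into the project directory.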
You can include a graphics app with a wrapper-based application.
GenWrapper: A more general BOINC wrapper
When the functionality of the BOINC Wrapper is not enough, there is a generic solution which uses POSIX-like shell scripting, instead of the XML config file, for describing jobs: You can have complex control flows (loops, branches, etc), but remember "with great power must also come -- great responsibility!"