Changes between Version 56 and Version 57 of RemoteJobs


Timestamp:
Feb 24, 2017, 2:20:30 PM (7 years ago)
Author:
davea
Comment:

--

  • RemoteJobs

    v56 v57  
    11[[PageOutline]]
     2
    23= RPCs for remote job submission =
    3 
    4 This document describes APIs for remotely submitting,
    5 monitoring, and controlling jobs on a BOINC server.
    6 The APIs support the submission of '''batches''' of jobs,
    7 which may contain a single job or many thousands of jobs.
    8 Currently, the API has two restrictions:
     4This document describes APIs for remotely submitting, monitoring, and controlling jobs on a BOINC server. The APIs support the submission of '''batches''' of jobs, which may contain a single job or many thousands of jobs. Currently, the API has two restrictions:
    95
    106 * All jobs in a batch must use the same application.
    117 * There can be no dependencies between jobs.
    128
    13 At the bottom level, the interface consists of Web RPCs.
    14 BOINC provides client-side bindings in PHP, C++, and Python.
    15 These bindings differ slightly; they expose different details of the Web RPCs.
    16 
    17 Interfaces for [RemoteInputFiles staging input files]
    18 and [RemoteOutputFiles fetching output files] are described separately.
    19 
    20 There are various options for managing input files.
    21 If you use [RemoteInputFiles#Job-basedfilemanagement Job-based file management],
    22 which maintains batch/file associations,
    23 the order of operations is:
     9At the bottom level, the interface consists of Web RPCs. BOINC provides client-side bindings in PHP, C++, and Python. These bindings differ slightly; they expose different details of the Web RPCs.
     10
     11Interfaces for [wiki:RemoteInputFiles staging input files] and [wiki:RemoteOutputFiles fetching output files] are described separately.
     12
     13There are various options for managing input files. If you use [wiki:RemoteInputFiles#Job-basedfilemanagement Job-based file management], which maintains batch/file associations, the order of operations is:
    2414
    2515 * Create a batch (initially empty); returns the batch ID.
     
    2717 * Submit jobs, passing the batch ID
    2818
    29 If you manage input files a different way,
    30 then you create the batch and submit jobs in a single API call.
     19If you manage input files a different way, then you create the batch and submit jobs in a single API call.
    3120
    3221Once you have submitted the batch, you can
     22
    3323 * '''Monitor''' the batch with query_batches(), query_batch(), or query_job().
    34  * '''Abort''' the batch (if you see errors, or if enough jobs have been finished)
    35   using abort_jobs() or abort_batch().
    36  * [RemoteOutputFiles download output files].
    37  * '''Retire''' the batch using retire_batch().
    38   This tells the server to clean up the files and job records associated with the batch,
    39   and to mark the batch as "retired";
    40   retired batches are normally not shown in the web interface.
     24 * '''Abort''' the batch (if you see errors, or if enough jobs have been finished) using abort_jobs() or abort_batch().
     25 * [wiki:RemoteOutputFiles download output files].
     26 * '''Retire''' the batch using retire_batch(). This tells the server to clean up the files and job records associated with the batch, and to mark the batch as "retired"; retired batches are normally not shown in the web interface.
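The monitoring step can be sketched as a simple polling loop. This is only an illustration: `wait_for_batch` and its `query_state` argument are hypothetical names, not part of the BOINC bindings. `query_state` stands for any callable that wraps query_batch() and returns the batch's numeric state, where 1 means "in progress" as in the state list for boinc_query_batches().

```python
import time
from typing import Callable

# State code for a batch that is still running (from the batch state list).
IN_PROGRESS = 1


def wait_for_batch(query_state: Callable[[], int],
                   poll_seconds: float = 60.0,
                   max_polls: int = 1000,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll until the batch leaves the 'in progress' state.

    query_state is a hypothetical wrapper around query_batch() that
    returns the batch's numeric state. Returns the final state code.
    """
    for _ in range(max_polls):
        state = query_state()
        if state != IN_PROGRESS:
            return state
        sleep(poll_seconds)
    raise TimeoutError("batch still in progress after %d polls" % max_polls)
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays, for example `wait_for_batch(lambda: 2, sleep=lambda s: None)`.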
    4127
    4228== PHP interface ==
    43 
    44 The following functions are provided in the PHP file
    45 [/trac/browser/trunk/boinc/html/inc/submit.inc submit.inc],
    46 which is independent of other BOINC PHP code.
    47 The file html/user/submit_test.php has code to exercise and test these functions.
     29The following functions are provided in the PHP file [/trac/browser/trunk/boinc/html/inc/submit.inc submit.inc], which is independent of other BOINC PHP code. The file html/user/submit_test.php has code to exercise and test these functions.
    4830
    4931=== boinc_submit_batch() ===
    50 
    5132Submits a batch.
    5233
    5334Arguments: a "request object" whose fields include
     35
    5436 * '''project''': the project URL
    5537 * '''authenticator''': the user's authenticator
    5638 * '''app_name''': the name of the application for which jobs are being submitted
    57  * '''batch_name''': a symbolic name for the batch.
    58    Must be unique.
    59    If omitted, a name of the form "batch_unixtime" will be used.
    60  * '''workunit_template_file''': an optional [JobTemplates input template file name].
    61  * '''result_template_file''': an optional [JobTemplates output template file name].
     39 * '''batch_name''': a symbolic name for the batch. Must be unique. If omitted, a name of the form "batch_unixtime" will be used.
     40 * '''workunit_template_file''': an optional [wiki:JobTemplates input template file name].
     41 * '''result_template_file''': an optional [wiki:JobTemplates output template file name].
    6242 * '''jobs''': an array of job descriptors, each of which contains
    63   * '''rsc_fpops_est''': an estimate of the FLOPs used by the job
    64   * '''command_line''': command-line arguments to the application
    65   * '''wu_template''': optional; the [http://boinc.berkeley.edu/trac/wiki/JobTemplates#Inputtemplates input template]
    66     to use for this job, as an XML string.
    67   * '''result_template''': optional;
    68     the [http://boinc.berkeley.edu/trac/wiki/JobTemplates#Outputtemplates output template]
    69     to use for this job, as an XML string.
    70   * '''input_files''': an array of input file descriptors, each of which contains
    71    * '''mode''': "local", "semilocal", "local_staged", "inline", or "remote" (see below).
    72    * '''source''': meaning depends on mode:
    73     * local: path on the BOINC server
    74     * semilocal: the file's URL
    75     * local_staged: physical name
    76     * inline: the file's contents
    77    * For "remote" mode, instead of "source" you must specify:
    78     * '''url''': the file's URL
    79     * '''nbytes''': file size
    80     * '''md5''': the file's MD5 checksum
     43   * '''rsc_fpops_est''': an estimate of the FLOPs used by the job
     44   * '''command_line''': command-line arguments to the application
     45   * '''wu_template''': optional; the [http://boinc.berkeley.edu/trac/wiki/JobTemplates#Inputtemplates input template] to use for this job, as an XML string.
     46   * '''result_template''': optional; the [http://boinc.berkeley.edu/trac/wiki/JobTemplates#Outputtemplates output template] to use for this job, as an XML string.
     47   * '''input_files''': an array of input file descriptors, each of which contains
     48     * '''mode''': "local", "semilocal", "local_staged", "inline", or "remote" (see below).
     49     * '''source''': meaning depends on mode:
     50       * local: path on the BOINC server
     51       * semilocal: the file's URL
     52       * local_staged: physical name
     53       * inline: the file's contents
     54     * For "remote" mode, instead of "source" you must specify:
     55       * '''url''': the file's URL
     56       * '''nbytes''': file size
     57       * '''md5''': the file's MD5 checksum
    8158
    8259Result: a 2-element array containing
     60
    8361 * The batch ID
    8462 * An error message (null if success)
    8563
    8664Input files can be supplied in any of the following ways:
    87  * '''local''': the file is on the BOINC server and is not [JobStage staged].
    88   It's specified by its full path.
    89  * '''local_staged''': the file has been [JobStage staged] on the BOINC server.
    90   It's specified by its physical name.
    91  * '''semilocal''': the file is on a data server that's accessible to the BOINC server
    92    but not necessarily to the outside world.
    93    The file is specified by its URL.
    94    It will be downloaded by the BOINC server during job submission,
    95    and served to clients from the BOINC server.
    96  * '''inline''': the file is included in the job submission request XML message.
    97    It will be served to clients from the BOINC server.
    98  * '''remote''': the file is on a data server other than the BOINC server,
    99    and will be served to clients from that data server.
    100    It's specified by the URL, the file size, and the file MD5.
     65
     66 * '''local''': the file is on the BOINC server and is not [wiki:JobStage staged]. It's specified by its full path.
     67 * '''local_staged''': the file has been [wiki:JobStage staged] on the BOINC server. It's specified by its physical name.
     68 * '''semilocal''': the file is on a data server that's accessible to the BOINC server but not necessarily to the outside world. The file is specified by its URL. It will be downloaded by the BOINC server during job submission, and served to clients from the BOINC server.
     69 * '''inline''': the file is included in the job submission request XML message. It will be served to clients from the BOINC server.
     70 * '''remote''': the file is on a data server other than the BOINC server, and will be served to clients from that data server. It's specified by the URL, the file size, and the file MD5.
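As a rough illustration of the two descriptor shapes described above, the following sketch builds input file descriptors as plain dictionaries. The helper names `source_file` and `remote_file` are hypothetical, and representing the MD5 as a lowercase hex string is an assumption; the field names (mode, source, url, nbytes, md5) are the ones listed for '''input_files'''.

```python
import hashlib

# Modes that are described by a single "source" field.
SOURCE_MODES = {"local", "semilocal", "local_staged", "inline"}


def source_file(mode: str, source: str) -> dict:
    """Descriptor for local, semilocal, local_staged, or inline files."""
    if mode not in SOURCE_MODES:
        raise ValueError("mode %r does not use a 'source' field" % mode)
    return {"mode": mode, "source": source}


def remote_file(url: str, data: bytes) -> dict:
    """Descriptor for a "remote" file: URL, size, and MD5 instead of source.

    The hex-digest form of the MD5 is an assumption of this sketch.
    """
    return {
        "mode": "remote",
        "url": url,
        "nbytes": len(data),
        "md5": hashlib.md5(data).hexdigest(),
    }
```

For large files you would hash the file in chunks rather than reading it into memory; the in-memory version is used here only to keep the sketch short.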
    10171
    10272The following mode has been proposed but is not implemented yet:
    103  * '''sandbox''': the file is in the user's [RemoteInputFiles#Per-userfilesandbox sandbox],
    104    and is specified by its name in the sandbox.
     73
     74 * '''sandbox''': the file is in the user's [wiki:RemoteInputFiles#Per-userfilesandbox sandbox], and is specified by its name in the sandbox.
    10575
    10676The following example submits a 10-job batch:
     77
    10778{{{
    10879$req = new StdClass;
     
    132103}
    133104}}}
    134 
    135 Note: this interface is inconsistent; it lets you do some things but not others.
    136 Let me know if you need additions.
     105Note: this interface is inconsistent; it lets you do some things but not others. Let me know if you need additions.
    137106
    138107=== boinc_estimate_batch() ===
    139 
    140108Returns an estimate of the elapsed time required to complete a batch.
    141109
    142 Arguments: same as '''boinc_submit_batch()'''
    143 (only relevant fields need to be populated).
     110Arguments: same as '''boinc_submit_batch()''' (only relevant fields need to be populated).
    144111
    145112Return value: a 2-element array containing
     113
    146114 * The elapsed time estimate, in seconds
    147115 * An error message (null if success)
    148116
    149117=== boinc_query_batches() ===
    150 
    151118Returns a list of this user's batches, both in progress and complete.
    152119
    153120Argument: a request object with elements
     121
    154122 * '''project''' and '''authenticator''': as above.
    155123 * '''get_cpu_time''' (optional): if nonzero, get CPU time of each batch
    156124
    157 Result: a 2-element array.
    158 The first element is an array of batch descriptor objects,
    159 each with the following fields:
     125Result: a 2-element array. The first element is an array of batch descriptor objects, each with the following fields:
     126
    160127 * '''id''': batch ID
    161128 * '''state''': values are
    162   * 1: in progress
    163   * 2: completed (all jobs either succeeded or had fatal errors)
    164   * 3: aborted
    165   * 4: retired
     129   * 1: in progress
     130   * 2: completed (all jobs either succeeded or had fatal errors)
     131   * 3: aborted
     132   * 4: retired
    166133 * '''name''': the batch name
    167134 * '''app_name''': the application name
     
    172139 * '''nerror_jobs''': the number of jobs that had fatal errors
    173140 * '''completion_time''': when the batch was completed
    174  * '''credit_estimate''': BOINC's initial estimate of the credit that
    175   would be granted to complete the batch, including replication
     141 * '''credit_estimate''': BOINC's initial estimate of the credit that would be granted to complete the batch, including replication
    176142 * '''credit_canonical''': the actual credit granted to canonical instances
    177143 * '''credit_total''': the actual credit granted to all instances
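The numeric state codes are easier to work with when given names. The sketch below is one way to do that; `BatchState` and `describe` are hypothetical names, and the batch descriptor is modeled as a plain dictionary using only the '''name''', '''state''', and '''nerror_jobs''' fields listed above.

```python
from enum import IntEnum


class BatchState(IntEnum):
    """Batch state codes as listed in the batch descriptor."""
    IN_PROGRESS = 1
    COMPLETED = 2   # all jobs either succeeded or had fatal errors
    ABORTED = 3
    RETIRED = 4


def describe(batch: dict) -> str:
    """One-line summary of a batch descriptor (modeled here as a dict)."""
    state = BatchState(int(batch["state"]))
    label = state.name.lower().replace("_", " ")
    return "%s: %s, %d jobs with fatal errors" % (
        batch["name"], label, int(batch["nerror_jobs"]))
```

Converting through `int()` first means the helper accepts the values whether they arrive as numbers or as numeric strings parsed from XML.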
    178144
    179145=== boinc_query_batch() ===
    180 
    181146Gets batch details.
    182147
    183148Argument: a request object with elements
     149
    184150 * '''project''' and '''authenticator''': as above
    185151 * '''batch_id''': specifies a batch.
    186  * '''get_cpu_time''' (optional): if nonzero, get CPU time of batch
     152 * '''get_cpu_time''' (optional): if nonzero, get CPU time of batch. This includes all job instances, and doesn't include GPU time, so it may not be meaningful.
    187153 * '''get_job_details''' (optional): if nonzero, return job details (see below).
    188154
    189 Result: a 2-element array.
    190 The first element is a batch descriptor object as described above,
    191 with one additional field:
     155Result: a 2-element array. The first element is a batch descriptor object as described above, with one additional field:
     156
    192157 * '''jobs''': an array of job descriptor objects, each one containing
    193   * '''id''': the database ID of the job's workunit record
    194   * '''canonical_instance_id''': if the job has a canonical result, its database ID
     158   * '''id''': the database ID of the job's workunit record
     159   * '''canonical_instance_id''': if the job has a canonical result, its database ID
    195160
    196161If '''get_job_details''' was set, the job descriptors also contain:
    197   * '''status''': "unsent", "in_progress", "error", or "done".
    198   * '''cpu_time''': if done, the CPU time
    199   * '''exit_status''': if error, the exit status of one of the error instances.
     162
     163 * '''status''': "unsent", "in_progress", "error", or "done".
     164 * '''cpu_time''': if done, the CPU time
     165 * '''exit_status''': if error, the exit status of one of the error instances.
    200166
    201167The order of job descriptors matches their order in the batch submission.
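Because the order is preserved, returned job descriptors can be matched back to the submitted jobs by position. The following sketch assumes '''get_job_details''' was set, so each descriptor carries a '''status''' field; `pair_and_tally` is a hypothetical helper, and descriptors are modeled as dictionaries.

```python
from collections import Counter


def pair_and_tally(submitted: list, descriptors: list):
    """Match submitted jobs to returned descriptors by position.

    Returns (pairs, tally), where tally counts jobs per status
    ("unsent", "in_progress", "error", "done").
    """
    if len(submitted) != len(descriptors):
        raise ValueError("expected one descriptor per submitted job")
    pairs = list(zip(submitted, descriptors))
    tally = Counter(d["status"] for d in descriptors)
    return pairs, tally
```

For example, if the submitted list holds each job's command line, `pairs` tells you which command line produced which workunit ID.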
    202168
    203169=== boinc_query_job() ===
    204 
    205170Gets job details.
    206171
    207172Argument: a request object with elements:
     173
    208174 * '''project''' and '''authenticator''': as above
    209175 * '''job_id''': specifies a job.
    210176
    211 Result: a 2-element array.
    212 The first element is a job descriptor object with the following fields:
     177Result: a 2-element array. The first element is a job descriptor object with the following fields:
     178
    213179 * '''instances''': an array of job instance descriptors, each containing:
    214   * '''name''': the instance's name
    215   * '''id''': the ID of the corresponding result record
    216   * '''state''': a string describing the instance's state
    217    (unsent, in progress, complete, etc.)
    218   * '''outfile''': if the instance is over,
    219    a list of output file descriptors, each containing
    220    * '''size''': file size in bytes
     180   * '''name''': the instance's name
     181   * '''id''': the ID of the corresponding result record
     182   * '''state''': a string describing the instance's state (unsent, in progress, complete, etc.)
     183   * '''outfile''': if the instance is over, a list of output file descriptors, each containing
     184     * '''size''': file size in bytes
    221185
    222186=== boinc_abort_batch() ===
    223 
    224187Argument: a request object with elements
     188
    225189 * '''project''' and '''authenticator''': as above,
    226190 * '''batch_id''': specifies a batch.
     
    229193
    230194=== boinc_retire_batch() ===
    231 
    232195Delete server storage (files, DB records) associated with a batch.
    233196
    234197Argument: a request object with elements
     198
    235199 * '''project''' and '''authenticator''': as above,
    236200 * '''batch_id''': specifies a batch.
     
    239203
    240204== C++ interface ==
    241 
    242 A C++ interface to the following functions is available in lib/remote_submit.cpp.
    243 Include lib/remote_submit.h.
    244 
    245 All functions return zero on success,
    246 else an error code as defined in lib/error_numbers.h.
     205A C++ interface to the following functions is available in lib/remote_submit.cpp. Include lib/remote_submit.h.
     206
     207All functions return zero on success, else an error code as defined in lib/error_numbers.h.
    247208
    248209=== create_batch() ===
    249 
    250210Create a batch - a set of jobs, initially empty.
     211
    251212{{{
    252213int create_batch(
     
    260221);
    261222}}}
    262 
    263223 project_url:: the project URL
    264224 authenticator:: the authenticator of the submitting user
    265225 batch_name:: a name for the batch.  Must be unique over all batches.
    266226 app_name:: the name of an application on the BOINC server
    267  expire_time:: if nonzero, the Unix time when the batch should
    268    be aborted and removed from the server, whether or not it's completed.
     227 expire_time:: if nonzero, the Unix time when the batch should be aborted and removed from the server, whether or not it's completed.
    269228 batch_id:: (out) the batch's database ID
    270229 error_msg:: (out) an error message if the operation failed
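Since expire_time is an absolute Unix time rather than a duration, callers usually compute it from the current time. A minimal sketch, with `expire_time_in` as a hypothetical helper name:

```python
import time
from typing import Optional

SECONDS_PER_DAY = 86400


def expire_time_in(days: float, now: Optional[float] = None) -> int:
    """Unix time `days` from now, suitable as a nonzero expire_time.

    `now` defaults to the current time and is a parameter only so the
    function can be checked with a fixed clock.
    """
    if days <= 0:
        raise ValueError("days must be positive; use 0 as expire_time for no expiry")
    if now is None:
        now = time.time()
    return int(now + days * SECONDS_PER_DAY)
```

The result can be passed as expire_time to create_batch() or set_expire_time().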
    271230
    272231=== estimate_batch() ===
    273 
    274232Get an estimate of the makespan of a (potential) batch.
    275233
     
    284242);
    285243}}}
    286  jobs :: description of jobs; same as for '''submit_jobs()''' (see below).
    287  est_makespan :: the estimated makespan (time to completion).
     244 jobs:: description of jobs; same as for '''submit_jobs()''' (see below).
     245 est_makespan:: the estimated makespan (time to completion).
    288246
    289247=== submit_jobs() ===
    290 
    291248Submit a set of jobs; place them in an existing batch, and make them runnable.
     249
    292250{{{
    293251int submit_jobs(
     
    323281
    324282For each job:
     283
    325284 job_name:: must be unique over all jobs
    326285 cmdline_args:: command-line arguments
     
    328287
    329288For each input file:
     289
    330290 physical_name:: BOINC's physical name for the file.  The file must already be staged.
    331291
    332 
    333292=== query_batches() ===
    334 
    335293Query the status of this user's batches.
    336294
     
    362320};
    363321}}}
    364 
    365322=== query_batch() ===
    366 
    367323Return the detailed status of jobs in a given batch (can specify by either ID or name).
    368324
     
    386342
    387343}}}
    388 
    389344=== abort_jobs() ===
    390 
    391345Abort a set of jobs.
     346
    392347{{{
    393348extern int abort_jobs(
     
    399354
    400355}}}
    401 
    402356=== query_completed_job() ===
    403 
    404357Query a completed job.
     358
    405359{{{
    406360extern int query_completed_job(
     
    422376};
    423377}}}
    424 
    425378 canonical_resultid:: database ID of the "canonical" instance of the job.
    426379 error_mask:: a bitmask of error conditions (see db/boinc_db_types.h)
     
    432385
    433386=== retire_batch() ===
    434 
    435 "Retire" a batch.
    436 The server is then allowed to delete the batch's input and output files,
    437 and its database records.
     387"Retire" a batch. The server is then allowed to delete the batch's input and output files, and its database records.
     388
    438389{{{
    439390extern int retire_batch(
     
    444395);
    445396}}}
    446 
    447397=== set_expire_time() ===
    448 
    449398Change the expiration time of a batch.
     399
    450400{{{
    451401extern int set_expire_time(
     
    457407);
    458408}}}
    459 
    460409=== ping_server() ===
    461 
    462410Ping the project's server; return zero if the server is up.
     411
    463412{{{
    464413extern int ping_server(
     
    467416);
    468417}}}
    469 
    470418=== query_batch_set() ===
    471 
    472 Return the status of the jobs in a given set of batches.
    473 This is used by the Condor interface; it's probably not useful outside of that.
     419Return the status of the jobs in a given set of batches. This is used by the Condor interface; it's probably not useful outside of that.
     420
    474421{{{
    475422extern int query_batch_set(
     
    492439
    493440}}}
    494 
    495441== Python binding ==
    496 
    497 The file tools/submit_api.py contains a Python binding of some of the above RPCs.
    498 For examples of its use, see tools/submit_api_test.py.
    499 
    500 To build a description of a batch of jobs,
    501 use the BATCH_DESC, JOB_DESC, and FILE_DESC classes:
     442The file tools/submit_api.py contains a Python binding of some of the above RPCs. For examples of its use, see tools/submit_api_test.py.
     443
     444To build a description of a batch of jobs, use the BATCH_DESC, JOB_DESC, and FILE_DESC classes:
     445
    502446{{{
    503447def make_batch():
     
    527471}}}
    528472You can then pass this to either estimate_batch() or submit_batch():
     473
    529474{{{
    530475    from submit_api import *
     
    536481    print 'estimated time: ', r.text, ' seconds'
    537482}}}
    538 
    539 The return value of all the API functions is an !EntityTree representation
    540 of the XML returned by the RPC.
    541 
    542 Other requests use a REQUEST object.
    543 For example, to query the status of batch 271:
     483The return value of all the API functions is an !EntityTree representation of the XML returned by the RPC.
     484
     485Other requests use a REQUEST object. For example, to query the status of batch 271:
     486
    544487{{{
    545488    req = REQUEST()
     
    563506
    564507}}}
    565 
    566508Possible attributes of FILE_DESC:
     509
    567510 * mode
    568511 * url
     
    572515
    573516Possible attributes of JOB_DESC:
     517
    574518 * rsc_fpops
    575519 * command_line
     
    579523
    580524Possible attributes of BATCH_DESC:
     525
    581526 * project (URL of project)
    582527 * authenticator (submitter's account key)
     
    586531 * jobs (list of JOB_DESC)
    587532
    588 Available functions are listed below.
    589 Each function takes a request object whose attributes include
    590 at least project and authenticator.
    591 
    592  abort_batch(req)::
    593     req attributes: batch_id
    594  abort_jobs(req)::
    595     req attributes: jobs (list of job names)
    596  create_batch(req)::
    597     req attributes: app_name, batch_name, expire_time
    598  estimate_batch(req)::
    599     req is a BATCH_DESC
    600  query_batch(req)::
    601     req attributes: batch_id, get_cpu_time
    602  query_batches(req)::
    603     req attributes: get_cpu_time
    604  query_completed_job(req)::
    605     req attributes: job_name
    606  query_job(req)::
    607     req attributes: job_id
    608  get_output_file(req)::
    609     req attributes: instance_name, file_num
    610  get_output_files(req)::
    611     req attributes: batch_id
    612  retire_batch(req)::
    613     req attributes: batch_id
     533Available functions are listed below. Each function takes a request object whose attributes include at least project and authenticator.
     534
     535 abort_batch(req):: req attributes: batch_id
     536 abort_jobs(req):: req attributes: jobs (list of job names)
     537 create_batch(req):: req attributes: app_name, batch_name, expire_time
     538 estimate_batch(req):: req is a BATCH_DESC
     539 query_batch(req):: req attributes: batch_id, get_cpu_time
     540 query_batches(req):: req attributes: get_cpu_time
     541 query_completed_job(req):: req attributes: job_name
     542 query_job(req):: req attributes: job_id
     543 get_output_file(req):: req attributes: instance_name, file_num
     544 get_output_files(req):: req attributes: batch_id
     545 retire_batch(req):: req attributes: batch_id
    614546
    615547== HTTP/XML interface ==
    616 
    617 At a lower level, the APIs are accessed by sending a POST request,
    618 using HTTP or HTTPS, to PROJECT_URL/submit_rpc_handler.php.
    619 The inputs and outputs of each function are XML documents.
    620 The format of the request and reply XML documents
    621 can be inferred from user/submit_rpc_handler.php.
     548At a lower level, the APIs are accessed by sending a POST request, using HTTP or HTTPS, to PROJECT_URL/submit_rpc_handler.php. The inputs and outputs of each function are XML documents. The format of the request and reply XML documents can be inferred from user/submit_rpc_handler.php.
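The following sketch shows how such a POST request might be assembled with the Python standard library. It rests on assumptions that should be checked against submit_rpc_handler.php: that the XML document is sent as a form field named `request`, and that the caller supplies a correctly formed XML body. `build_rpc_request` is a hypothetical helper, not part of BOINC.

```python
import urllib.parse
import urllib.request


def build_rpc_request(project_url: str, xml_body: str) -> urllib.request.Request:
    """Build (but do not send) a POST to PROJECT_URL/submit_rpc_handler.php.

    Assumption: the XML request travels in a form field named "request".
    """
    url = project_url.rstrip("/") + "/submit_rpc_handler.php"
    data = urllib.parse.urlencode({"request": xml_body}).encode("ascii")
    return urllib.request.Request(url, data=data, method="POST")
```

The request would then be sent with `urllib.request.urlopen(req)` and the reply parsed as XML. The element names inside `xml_body` are deliberately left to the caller, since the document states they must be inferred from the handler source.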
    622549
    623550== Example web interface ==
    624 
    625 An example of a web interface for job submission and control,
    626 based on this API, can be found here:
    627 http://boinc.berkeley.edu/trac/browser/trunk/boinc/html/user/submit_example.php
    628 
    629 This example is functional and it shows how to use the API.
    630 However, you will have to modify it heavily for your particular
    631 applications and web site.
    632 
    633 
     551An example of a web interface for job submission and control, based on this API, can be found here: http://boinc.berkeley.edu/trac/browser/trunk/boinc/html/user/submit_example.php
     552
     553This example is functional and it shows how to use the API. However, you will have to modify it heavily for your particular applications and web site.