[Leiden Classical] Messed up scheduler url

Theadalus

Joined: 6 Apr 10
Posts: 12
Netherlands
Message 35742 - Posted: 15 Nov 2010, 23:38:40 UTC
Last modified: 15 Nov 2010, 23:42:35 UTC

Hi,

I don't know if this issue has been reported before, anyway:

As can be read in this thread on the Leiden Classical (LC) forum, several people running BOINC 6.10.58 on Ubuntu 10.04 Server x64 have a problem with the scheduler url.

Within client_state.xml:

<scheduler_url>http://boinc.gorlaeus.net/Classical_ci//cgi</scheduler_url>


While it should be:

<scheduler_url>http://boinc.gorlaeus.net/Classical_cgi/cgi</scheduler_url>
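
For anyone who wants to check their own client_state.xml for this, here is a minimal sketch (not an official BOINC tool). It pulls out every <scheduler_url> value and flags the ones with a doubled slash in the path. The function name is made up for illustration, and the heuristic only catches the "//" symptom shown above, not other misspellings.

```python
import re
from urllib.parse import urlparse


def suspicious_scheduler_urls(state_text):
    """Return scheduler URLs whose path contains an empty segment ('//')."""
    urls = re.findall(r"<scheduler_url>\s*(.*?)\s*</scheduler_url>", state_text)
    # urlparse().path excludes the 'http://' prefix, so any '//' left is in the path.
    return [u for u in urls if "//" in urlparse(u).path]


# The two values quoted above: the broken one and the expected one.
sample = """
<scheduler_url>http://boinc.gorlaeus.net/Classical_ci//cgi</scheduler_url>
<scheduler_url>http://boinc.gorlaeus.net/Classical_cgi/cgi</scheduler_url>
"""
print(suspicious_scheduler_urls(sample))
```

To run it on a real file, read client_state.xml into a string and pass that in place of `sample`.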


LC Project admin says this is not their fault/problem, but a BOINC issue...
However, I have another project (malariacontrol.net) running on the same machine without any problem.

What i tested:
- Changing url manually within client_state.xml => after each update the url gets changed back to the wrong one
- Installed BOINC 6.6.41 on same machine => same issue
- Installed BOINC 6.10.58 on Ubuntu 9.10 Server x64 => no problems
- Installed BOINC 6.10.56 x64 on Windows 7 Ultimate x64 => no problems


So my conclusion: issue appears only on Ubuntu 10.04 Server x64 (with any BOINC version) at Leiden Classical?

Les Bayliss
Help desk expert
Joined: 25 Nov 05
Posts: 1654
Australia
Message 35743 - Posted: 15 Nov 2010, 23:59:34 UTC

There's a similar problem at cpdn, where the uploader url for HADAM3P Pacific North West models is changed for Linux users.
Correct url on cpdn server, wrong one in client_state on user's computers.
(I think that there are 2 different spelling variations.)

Correcting the spelling in client state fixes it.

1) Exit from BOINC.
2) Make a "just in case" copy of client_state.xml, and paste it somewhere outside of the BOINC folders.
3) Edit the original, correcting the mistake.
4) Save the file.
5) Restart BOINC.
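
Steps 2 to 4 can be sketched as a small script. This is only an illustration under assumptions: the function name, the paths, and the idea of doing a plain string replace are mine, not anything BOINC ships. BOINC must already be stopped before running it (step 1) and restarted afterwards (step 5).

```python
import shutil
from pathlib import Path


def backup_and_fix(state_path, backup_dir, wrong, right):
    """Copy the state file to backup_dir, then replace `wrong` with `right` in place.

    Returns the number of occurrences that were replaced.
    """
    state_path = Path(state_path)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Step 2: the "just in case" copy, kept outside the BOINC folders.
    shutil.copy2(state_path, backup_dir / state_path.name)
    text = state_path.read_text()
    count = text.count(wrong)
    if count:
        # Steps 3 and 4: correct the mistake and save the file.
        state_path.write_text(text.replace(wrong, right))
    return count
```

A call would look like `backup_and_fix(state_file, backup_folder, wrong_url, right_url)`, where all four values depend on your own installation and on the misspelling you actually see in the file.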


Theadalus
Joined: 6 Apr 10
Posts: 12
Netherlands
Message 35746 - Posted: 16 Nov 2010, 17:05:57 UTC - in response to Message 35743.  

Correcting the spelling in client state fixes it.

1) Exit from BOINC.
2) Make a "just in case" copy of client_state.xml, and paste it somewhere outside of the BOINC folders.
3) Edit the original, correcting the mistake.
4) Save the file.
5) Restart BOINC.

Already tried this, but after a project update (./boinccmd --project http://boinc.gorlaeus.net/ update) all scheduler url instances are changed back to the wrong one.

On a project update, is the scheduler url (re)fetched from the server, or from some local config file (like master_boinc.gorlaeus.net.xml)?

Les Bayliss
Help desk expert
Joined: 25 Nov 05
Posts: 1654
Australia
Message 35747 - Posted: 16 Nov 2010, 19:44:56 UTC - in response to Message 35746.  

The trick with this is DON'T do an update.
But this cure may only work with cpdn, where the failure is when uploading the zipped results files, and then it's another few days before the next model finishes.

One of the cpdn threads is here.

Have to wait until/if the devs find and fix it.
I'll get one of the cpdn people to put in a bug report.


Theadalus
Joined: 6 Apr 10
Posts: 12
Netherlands
Message 35749 - Posted: 17 Nov 2010, 0:12:43 UTC

I already did some tests before, but to be sure did following:

1. Manually corrected scheduler url within client_state.xml
2. Start BOINC, no manual update, just let it run...
3. LC wu's get downloaded and machine starts crunching, so far so good...
4. After checking client_state.xml, I find the file upload url is also messed up:

  <file_info>
    <name>wu_898976128_1289242539_11664_2_0</name>
    <nbytes>0.000000</nbytes>
    <max_nbytes>16777216.000000</max_nbytes>
    <generated_locally/>
    <status>0</status>
    <upload_when_present/>
    <url>http://boinc.gorlaeus.net/Classical_cgi/file_upload_hnddler</url>
    <signed_xml>
      <name> wu_898976128_1289242539_11664_2_0 </name>
      <max_nbytes> 16777216 </max_nbytes>  
      <url> http://boinc.gorlaeus.net/Classical_cgi/file_upload_handler </url>
      <generated_locally/>
      <upload_when_present/>
    </signed_xml>
    <xml_signature>
26c434eddcdbc609a3cbc7bb63adafc6f72ea362a3e4f5f7c8956b7981371378
7b07420c82e4a14944f496b94d4797cfdb752852fcc2cde4b373ae28d4b73a7b
982fd892aab5ad3d53f5ee359893b94cf32a663be2f3eed10ec02a78d38a4663
8fbe5618e7e587e7ea18d404ca1c1fb7d935b06b97d33d280071d4248542a469
.
  </xml_signature>
</file_info>
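
In the block above, the outer <url> ends in file_upload_hnddler while the <url> inside <signed_xml> still says file_upload_handler. A rough way to list every entry where the two disagree is sketched below. It uses regular expressions rather than a real XML parser, the function name is invented for illustration, and it only compares the first <url> it finds on each side, so treat it as a quick check rather than a complete tool.

```python
import re

FILE_INFO = re.compile(r"<file_info>(.*?)</file_info>", re.DOTALL)
SIGNED = re.compile(r"<signed_xml>(.*?)</signed_xml>", re.DOTALL)
URL = re.compile(r"<url>\s*(.*?)\s*</url>")
NAME = re.compile(r"<name>\s*(.*?)\s*</name>")


def mismatched_upload_urls(state_text):
    """Return (name, client_url, signed_url) for entries whose two URLs differ."""
    mismatches = []
    for block in FILE_INFO.findall(state_text):
        signed = SIGNED.search(block)
        if not signed:
            continue
        signed_url = URL.search(signed.group(1))
        # Drop the signed section so the remaining <url> is the client's own copy.
        outer = block.replace(signed.group(0), "")
        client_url = URL.search(outer)
        name = NAME.search(outer)
        if signed_url and client_url and signed_url.group(1) != client_url.group(1):
            mismatches.append(
                (name.group(1) if name else "", client_url.group(1), signed_url.group(1))
            )
    return mismatches


# Shortened copy of the entry quoted above.
sample = """
<file_info>
    <name>wu_898976128_1289242539_11664_2_0</name>
    <url>http://boinc.gorlaeus.net/Classical_cgi/file_upload_hnddler</url>
    <signed_xml>
      <name> wu_898976128_1289242539_11664_2_0 </name>
      <url> http://boinc.gorlaeus.net/Classical_cgi/file_upload_handler </url>
    </signed_xml>
</file_info>
"""
for name, client_url, signed_url in mismatched_upload_urls(sample):
    print(name, client_url, signed_url)
```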


So this means i have to manually correct all wrong url instances to get files uploaded. But this also means i have to monitor the client 24/7, because each newly downloaded wu will have messed up url...?

Now my question is: is this a BOINC bug or a LC bug (so i know who to approach to get this issue solved)?

Theadalus
Joined: 6 Apr 10
Posts: 12
Netherlands
Message 35751 - Posted: 17 Nov 2010, 2:00:30 UTC

Just to test what will happen i fixed file upload url. After wu's were finished and uploaded, client state says 'compute error'

Result:

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Unrecognized XML in parse_init_data_file: hostid
Skipping: 80872
Skipping: /hostid
Unrecognized XML in parse_init_data_file: starting_elapsed_time
Skipping: 0.000000
Skipping: /starting_elapsed_time
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1290292392.136000
Skipping: /computation_deadline
Unrecognized XML in GLOBAL_PREFS::parse_override: mod_time
Skipping: /mod_time
Unrecognized XML in GLOBAL_PREFS::parse_override: run_gpu_if_user_active
Skipping: 0
Skipping: /run_gpu_if_user_active
Unrecognized XML in GLOBAL_PREFS::parse_override: suspend_cpu_usage
Skipping: 0.000000
Skipping: /suspend_cpu_usage
Unrecognized XML in GLOBAL_PREFS::parse_override: max_ncpus_pct
Skipping: 0.000000
Skipping: /max_ncpus_pct
Unrecognized XML in GLOBAL_PREFS::parse_override: daily_xfer_limit_mb
Skipping: 0.000000
Skipping: /daily_xfer_limit_mb
Unrecognized XML in GLOBAL_PREFS::parse_override: daily_xfer_period_days
Skipping: 0
Skipping: /daily_xfer_period_days

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>wu_164284800_1289242539_5341_2_1</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>


Seems some kind of transfer error?

I don't understand why LC is giving so many problems while other projects are running fine on the same machine. *sigh*

I think it's total waste of time right now to run LC on Ubuntu 10.x x64...

Gundolf Jahn
Joined: 20 Dec 07
Posts: 1069
Germany
Message 35756 - Posted: 17 Nov 2010, 7:55:23 UTC - in response to Message 35751.  

<message>
<file_xfer_error>
  <file_name>wu_164284800_1289242539_5341_2_1</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>

Perhaps you messed up the client_state.xml (<xml_signature>)?

From BOINC FAQ Service:
ERR_NOT_FOUND -161

This happens when you have an inconsistent client_state.xml file. Files aren't written to it.
Task not found would be the error message.
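
If you have many failed results to go through, the <file_xfer_error> entries can be pulled out of the result text with a short script. This is just a sketch: the function name is hypothetical, and the lookup table deliberately contains only the one code quoted in this thread (-161, ERR_NOT_FOUND); any other code is reported without a name.

```python
import re

# Only the code quoted in this thread; other codes are left unnamed on purpose.
KNOWN_CODES = {-161: "ERR_NOT_FOUND"}

XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*<file_name>(.*?)</file_name>\s*<error_code>(-?\d+)</error_code>",
    re.DOTALL,
)


def file_xfer_errors(result_text):
    """Return (file_name, error_code, symbolic_name_or_None) per <file_xfer_error>."""
    return [
        (name.strip(), int(code), KNOWN_CODES.get(int(code)))
        for name, code in XFER_ERROR.findall(result_text)
    ]


sample = """
<message>
<file_xfer_error>
  <file_name>wu_164284800_1289242539_5341_2_1</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
</message>
"""
print(file_xfer_errors(sample))
```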

Regards,
Gundolf

Theadalus
Joined: 6 Apr 10
Posts: 12
Netherlands
Message 35760 - Posted: 17 Nov 2010, 9:09:36 UTC - in response to Message 35756.  

<message>
<file_xfer_error>
  <file_name>wu_164284800_1289242539_5341_2_1</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>

Perhaps you messed up the client_state.xml (<xml_signature>)?

The error message I included also refers to tags within global_prefs.xml, which I did not edit, and the other project is running fine with the same edited client_state.xml.

But errors are probably all related to url issue, so when that's fixed...

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15481
Netherlands
Message 35761 - Posted: 17 Nov 2010, 9:33:10 UTC - in response to Message 35760.  

The error message I included also refers to tags within global_prefs.xml, which I did not edit, and the other project is running fine with the same edited client_state.xml.

The "Unrecognized" and "Skipping" parts of stderr.txt only show that the science application does not know about these preferences. The LHC app was made way before BOINC 6, and LHC has never released updated applications built against a later BOINC API that does know those tags.

It won't interfere with the running of the science application.
I don't think this is a BOINC problem, especially since you tested it against other Linux distros and with other BOINC versions on this Linux distro.

What i tested:
- Changing url manually within client_state.xml => after each update the url gets changed back to the wrong one
- Installed BOINC 6.6.41 on same machine => same issue
- Installed BOINC 6.10.58 on Ubuntu 9.10 Server x64 => no problems
- Installed BOINC 6.10.56 x64 on Windows 7 Ultimate x64 => no problems

So the adding of the forward slash to a URL is probably something that the OS does. Yet, I forwarded it to the developers anyway. So far, I have not had a reaction.

Theadalus
Joined: 6 Apr 10
Posts: 12
Netherlands
Message 35762 - Posted: 17 Nov 2010, 11:47:11 UTC - in response to Message 35761.  

So the adding of the forward slash to a URL is probably something that the OS does. Yet, I forwarded it to the developers anyway. So far, I have not had a reaction.

Ok, let's hope they are able to solve problem, thnx! :)

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.