*** buffer overflow detected ***: boinc_client terminated
Author | Message |
---|---|
Send message Joined: 11 Aug 17 Posts: 10 |
I keep hitting this, again and again for months, after running a fresh BOINC client. I am quite desperate as I have no idea where to start debugging. Hardware: Intel(R) Xeon(R) CPU E3-1231 v3 @ 3.40GHz, Intel S1200RP platform. OS: Gentoo Linux, Linux bacztwo 4.9.34-gentoo #1 SMP Thu Jul 6 18:16:42 CST 2017 x86_64 Intel(R) Xeon(R) CPU E3-1231 v3 @ 3.40GHz GenuineIntel GNU/Linux. I tried BOINC client version 7.2.44 for x86_64-pc-linux-gnu, BOINC client version 7.6.33 for x86_64-pc-linux-gnu and BOINC client version 7.8.1 for x86_64-pc-linux-gnu; all crashed. There is no special sequence to reproduce it: all I do is start a fresh BOINC client, attach it to BAM, and let it run for a few days or weeks, then I hit this, probably caused by one of my jobs or something. The following are the logs for assessment: https://pastebin.com/BQ4Lkf6T stdoutdae.txt https://pastebin.com/ZprXtnku stderrdae.txt https://pastebin.com/WfWfH1di strace.log [huge text, don't open with browser] https://pastebin.com/raw/zk0P7k6G |
Send message Joined: 20 Nov 12 Posts: 801 |
Can you recompile the client with debugging symbols? It's not exactly obvious from the strace where it's crashing. |
Send message Joined: 11 Aug 17 Posts: 10 |
It would be nice if you could help me with this. I have no experience with C development and debugging; 99% of the time I do python/bash/web, and I only use strace for simple jobs like finding out which files a certain program touched. # file /usr/bin/boinc /usr/bin/boinc: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, not stripped, with debug_info So it would be helpful if you could enlighten me on how I should do it. |
Send message Joined: 11 Aug 17 Posts: 10 |
I followed some random Google results and got some output; I don't know if this helps: # gdb boinc_client (gdb) catch syscall exit_group (gdb) run ... Thread 1 "boinc_client" received signal SIGABRT, Aborted. 0x00007ffff5cd1118 in raise () from /lib64/libc.so.6 (gdb) bt #0 0x00007ffff5cd1118 in raise () from /lib64/libc.so.6 #1 0x00007ffff5cd256a in abort () from /lib64/libc.so.6 #2 0x00007ffff5d0de91 in ?? () from /lib64/libc.so.6 #3 0x00007ffff5d95f47 in __fortify_fail () from /lib64/libc.so.6 #4 0x00007ffff5d93fe0 in __chk_fail () from /lib64/libc.so.6 #5 0x00007ffff5d93489 in ?? () from /lib64/libc.so.6 #6 0x00007ffff5d115e0 in _IO_default_xsputn () from /lib64/libc.so.6 #7 0x00007ffff5ce5bba in vfprintf () from /lib64/libc.so.6 #8 0x00007ffff5d9351c in __vsprintf_chk () from /lib64/libc.so.6 #9 0x00007ffff5d93475 in __sprintf_chk () from /lib64/libc.so.6 #10 0x000000000041d1ef in CLIENT_STATE::report_result_error(RESULT&, char const*, ...) () #11 0x000000000041fb7f in CLIENT_STATE::garbage_collect_always() () #12 0x000000000042071e in CLIENT_STATE::garbage_collect() () #13 0x0000000000420a59 in CLIENT_STATE::poll_slow_events() () #14 0x000000000047363a in boinc_main_loop() () #15 0x00000000004085a0 in main () (gdb) |
Send message Joined: 11 Aug 17 Posts: 10 |
client_state.xml https://pastebin.com/raw/0YY5yA5d |
Send message Joined: 20 Nov 12 Posts: 801 |
I thought you had a corrupted client_state.xml, but I can't see anything obviously wrong in it. Could you run the following commands in gdb: frame 10 list info args print res.name info locals frame 9 info args Depending on how gdb decides to handle whatever garbage those commands print out, your terminal may crash. |
Send message Joined: 11 Aug 17 Posts: 10 |
I had missed -g for gcc to compile with debug info. Here is a bt with much more info: (gdb) bt #0 0x00007ffff5cd1118 in raise () from /lib64/libc.so.6 #1 0x00007ffff5cd256a in abort () from /lib64/libc.so.6 #2 0x00007ffff5d0de91 in ?? () from /lib64/libc.so.6 #3 0x00007ffff5d95f47 in __fortify_fail () from /lib64/libc.so.6 #4 0x00007ffff5d93fe0 in __chk_fail () from /lib64/libc.so.6 #5 0x00007ffff5d93489 in ?? () from /lib64/libc.so.6 #6 0x00007ffff5d115e0 in _IO_default_xsputn () from /lib64/libc.so.6 #7 0x00007ffff5ce5bba in vfprintf () from /lib64/libc.so.6 #8 0x00007ffff5d9351c in __vsprintf_chk () from /lib64/libc.so.6 #9 0x00007ffff5d93475 in __sprintf_chk () from /lib64/libc.so.6 #10 0x000000000041d1ef in sprintf (__fmt=0x4bcd7d "<message>\n%s\n</message>\n", __s=0x7fffffffbb80 "<message>\nupload failure: <file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_1.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_x"...) at /usr/include/bits/stdio2.h:34 #11 CLIENT_STATE::report_result_error (this=this@entry=0x6fddc0 <gstate>, res=..., format=format@entry=0x4bcea6 "upload failure: %s") at client_state.cpp:1864 #12 0x000000000041fb7f in CLIENT_STATE::garbage_collect_always (this=this@entry=0x6fddc0 <gstate>) at client_state.cpp:1588 #13 0x000000000042071e in CLIENT_STATE::garbage_collect (this=0x6fddc0 <gstate>) at client_state.cpp:1409 #14 0x0000000000420a59 in CLIENT_STATE::poll_slow_events (this=0x6fddc0 <gstate>) at client_state.cpp:1054 #15 0x000000000047363a in boinc_main_loop () at main.cpp:373 #16 0x00000000004085a0 in main (argc=1, argv=0x7fffffffe058) at main.cpp:548 (gdb) And the following is what you requested. Because I recompiled boinc_client, the stack changed a little: (gdb) frame 11 #11 CLIENT_STATE::report_result_error (this=this@entry=0x6fddc0 <gstate>, res=..., format=format@entry=0x4bcea6 "upload failure: %s") at client_state.cpp:1864 1864 in client_state.cpp (gdb) list 1859 in 
client_state.cpp (gdb) info args https://pastebin.com/raw/1UfczAzP (gdb) print res.name $1 = "wah2_global_a04w_208812_145_613_011130398_1\000\062\060\061\061-01-30.gz\000\000\000\000\000\000\000wah2_sas!\a\000\000\000\000\000\000X\033\003\366\377\177\000\000X\033\003\366\377\177", '\000' <repeats 18 times>, "_2101-01\361\006\000\000\000\000\000\000X\033\003\366\377\177\000\000X\033\003\366\377\177", '\000' <repeats 27 times>, "\276y\000\000\000\000\000P\207\377\377\377\177\000\000\320\330\377\377\377\177\000\000\300xz\000\000\000\000\000"... (gdb) info locals buf = "<message>\nupload failure: <file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_1.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_x"... err_msg = "upload failure: <file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_1.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>"... i = <optimized out> failnum = 538976266 va = <error reading variable va (Attempt to dereference a generic pointer.)> (gdb) frame 10 #10 0x000000000041d1ef in sprintf (__fmt=0x4bcd7d "<message>\n%s\n</message>\n", __s=0x7fffffffbb80 "<message>\nupload failure: <file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_1.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_x"...) at /usr/include/bits/stdio2.h:34 34 __bos (__s), __fmt, __va_arg_pack ()); (gdb) info args __fmt = 0x4bcd7d "<message>\n%s\n</message>\n" __s = 0x7fffffffbb80 "<message>\nupload failure: <file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_1.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_x"... |
Send message Joined: 11 Aug 17 Posts: 10 |
The code declares char buf[4096], err_msg[4096]; // The above store 1-line messages and short XML snippets. // Shouldn't exceed a few hundred bytes. But here err_msg is maxed out, not just a few hundred bytes, and then the code calls sprintf( buf, "<message>\n%s\n</message>\n", err_msg); So sprintf overflows buf because of the extra message tags? |
Send message Joined: 11 Aug 17 Posts: 10 |
(gdb) set print elements 0 (gdb) print err_msg $6 = "upload failure: <file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_1.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_2.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_3.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_4.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_5.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_6.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_7.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_8.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_9.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_10.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_11.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n 
<file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_12.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_13.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_14.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_15.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_16.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_17.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_18.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_19.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_20.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_21.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_22.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_23.zip</file_name>\n <error_code>-161 (not 
found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_145_613_011130398_1_r113524241_24.zip</file_name>\n <error_code>-161 (not found)</error_code>\n</file_xfer_error>\n<file_xfer_error>\n <file_name>wah2_global_a04w_208812_1" |
Send message Joined: 11 Aug 17 Posts: 10 |
https://github.com/BOINC/boinc/blob/08c5b3b276b079c1433ed8f1e569314d9df88168/client/client_state.cpp#L1570 https://github.com/BOINC/boinc/blob/08c5b3b276b079c1433ed8f1e569314d9df88168/client/client_state.cpp#L1588 The errors are stacked up and then reported all at once, which easily goes over 4096 bytes. |
Send message Joined: 20 Nov 12 Posts: 801 |
Guess the person who wrote the comment didn't expect that tasks could have 147 output files. Since you can compile the code yourself, you can change the code in report_result_error() from sprintf(buf to snprintf(buf, sizeof(buf) |
Send message Joined: 11 Aug 17 Posts: 10 |
That doesn't look like the right fix because it cuts off the closing </message> tag. I think we should reduce the size of err_msg or increase buf, but I know nothing about the workflow of report_result_error, so this could be just a temporary fix. Will you take the bug upstream, or do I have to open an issue on GitHub? |
Send message Joined: 29 Aug 05 Posts: 15482 |
If you haven't done so yet, please do make it an issue at https://github.com/BOINC/boinc/issues |
Send message Joined: 11 Aug 17 Posts: 10 |
Fixed upstream: "client: eliminate possible buffer overflow in reporting result errors" https://github.com/BOINC/boinc/commit/8e7857623edeed8ecfdcc8f1395a4d0b625e06f9 |
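
For readers landing here with the same crash, the arithmetic discussed in the thread can be sketched in a few lines of C. A full 4096-byte err_msg holds up to 4095 characters, and the "<message>\n" and "\n</message>\n" wrapper adds 22 more, so the result cannot fit into a 4096-byte buf. The sketch below is only an illustration of one way to bound the write while keeping the closing tag; `wrap_message` and `MSG_BUF_SIZE` are hypothetical names, not BOINC functions, and the upstream commit may well solve this differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MSG_BUF_SIZE 4096  /* same size as buf and err_msg in the thread */

/*
 * Hypothetical helper (not part of BOINC): wrap err_msg in <message> tags
 * without ever writing more than buf_size bytes into buf.
 * Returns 0 if the whole message fit, 1 if err_msg had to be shortened.
 */
static int wrap_message(char *buf, size_t buf_size, const char *err_msg) {
    static const char open_tag[] = "<message>\n";
    static const char close_tag[] = "\n</message>\n";
    size_t overhead = strlen(open_tag) + strlen(close_tag);

    /* Precondition: room for both tags plus the terminating NUL. */
    assert(buf_size > overhead);

    /* Space left for the payload once tags and NUL are accounted for. */
    size_t room = buf_size - overhead - 1;
    size_t len = strlen(err_msg);
    int truncated = len > room;
    if (truncated) {
        len = room;
    }

    /* "%.*s" copies at most len bytes, so the closing tag always survives. */
    int n = snprintf(buf, buf_size, "%s%.*s%s",
                     open_tag, (int)len, err_msg, close_tag);
    assert(n >= 0 && (size_t)n < buf_size);
    return truncated;
}
```

Calling `wrap_message(buf, sizeof(buf), err_msg)` with a maxed-out err_msg returns 1 and leaves a well-formed, shortened message in buf, which addresses the objection that a plain snprintf would drop the end of the closing tag.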
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.