Message boards : Web interfaces : BOINC needs spam bot protection
Author | Message |
---|---|
Joined: 27 Jun 06 Posts: 305 |
There must be robots that create spam accounts in BOINC projects. Some small test projects have more *viagra* accounts than regular users but regular projects are affected as well. I haven't dealt with the account manager interface, maybe that can even be used for project account spamming. The accounts all have 0 credits so they are not contained in the exported XMLs but it's still very annoying :-/ A 2 stage sign in would be a good thing, i.e. sending a short phrase to the user by email, together with a verify URL - and the user has to type that phrase somewhere on the verify page in order to activate the account. Unverified accounts should not be listed on the stats pages, accounts that haven't been activated within a month should go to the wastebasket. Maybe someone could bring this to the attention of the BOINC project developers. Thanks in advance Volker |
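The two-stage sign-in proposed in the post above could look roughly like the following. This is only a minimal sketch, not BOINC code: the in-memory `pending` and `active` stores and the function names `sign_up`, `verify` and `purge_expired` are hypothetical, and a real project would keep this state in its user table and send the phrase by e-mail together with the verify URL.

```python
import secrets
from datetime import datetime, timedelta

# "Within a month", as suggested in the post; the exact span is an assumption.
VERIFY_WINDOW = timedelta(days=30)

# Hypothetical in-memory stores standing in for the project database.
pending = {}    # e-mail address -> (phrase, time of sign-up)
active = set()  # e-mail addresses of verified accounts


def sign_up(email, now):
    """Stage 1: store the account as unverified and return the phrase to e-mail."""
    phrase = secrets.token_hex(4)  # short random phrase, 8 hex characters
    pending[email] = (phrase, now)
    return phrase


def verify(email, typed_phrase, now):
    """Stage 2: activate only if the typed phrase matches and has not expired."""
    entry = pending.get(email)
    if entry is None:
        return False
    phrase, created_at = entry
    if now - created_at > VERIFY_WINDOW:
        return False
    # Compare as bytes so non-ASCII input is rejected instead of raising.
    if not secrets.compare_digest(phrase.encode(), typed_phrase.encode()):
        return False
    del pending[email]
    active.add(email)
    return True


def purge_expired(now):
    """Send accounts that were never activated in time to the wastebasket."""
    expired = [e for e, (_, created) in pending.items()
               if now - created > VERIFY_WINDOW]
    for email in expired:
        del pending[email]
    return expired
```

Only the entries in `active` would be shown on stats pages; a bot that never reads the e-mail leaves nothing but a `pending` entry that `purge_expired` removes later.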
Joined: 29 Aug 05 Posts: 15585 |
The developers are looking into different options. 1. is to teach Akismet what is SPAM and what isn't. At this moment it doesn't know, so posting URLs, even pointing to a thread elsewhere on the same forum, is considered spam. 2. is to use reCAPTCHA for signing up through the website. 3. this code is available, see change log 13835, do not allow a profile to be made unless you have at least credit. |
Joined: 27 Jun 06 Posts: 305 |
Well, it's not forum entries or profiles, it's the account name itself. user.name is a varchar(254) field, long enough to carry spam even without a profile entry, e.g. http://boinc.bakerlab.org/rosetta/view_profile.php?userid=212831 (not made into a link because I don't want to raise its rating). reCAPTCHA might help - but will it help for remote signup too (BAM etc.)? |
Joined: 29 Aug 05 Posts: 15585 |
Well, it's not forum entries or profiles, it's the account name itself. Akismet checks anything you type, starting with usernames. If you want to test how reCAPTCHA works, try to unlock my email address. |
Joined: 19 Jan 07 Posts: 1179 |
Adding a captcha to create_account_form.php would help, but not for long. Spammers could get smarter and start using the same requests the BOINC client uses to create accounts from the manager. |
Joined: 29 Aug 05 Posts: 15585 |
But can you automate that? Or do you then need someone behind the keyboard filling in all the information all the time? |
Joined: 19 Jan 07 Posts: 1179 |
But can you automate that? Or do you then need someone behind the keyboard filling in all the information all the time? Enable [http_debug] and attach to a project, you'll see a lookup_account.php request. There is a similar one for creating accounts. Easily automated. |
Joined: 13 Aug 06 Posts: 778 |
The spambots have had an automated method of solving captchas for more than a year. On the CPDN independent forum when a mini-task was added to the registration process after the captcha, the bots disappeared overnight. There are still a very few spammers, but they seem to be nasty people, not nasty bits of code. The spambots could probably do the entire Gutenberg project, word by word. And they are probably much better than people at spelling boinc. As Volker says, the best thing is to prevent them from registering in the first place. If Akismet is to be trained to recognise spam on boinc forums and web pages, I think the training would have to be done on a hidden forum/hidden project where its mistakes wouldn't affect ordinary honest posters and members. |
Joined: 19 Jan 07 Posts: 1179 |
The spambots have had an automated method of solving captchas for more than a year. Poor captchas, you mean. If spambots could solve reCAPTCHA's challenges, the project wouldn't exist in the first place. The whole idea is that OCR isn't good enough yet, which is why they're using captchas to digitize the books. See PWNtcha for a list of lame captchas that are easily broken. |
Joined: 27 Jun 06 Posts: 305 |
Automating remote signup - BOINC is open source, just extract the modules that do that part in the BOINC client and make its functions part of a robot. The major difference between a robot and the BOINC client is that the BOINC client waits for a person to choose name and project and then trigger that signup whereas a robot would take those from a list - and then trigger the same event. Beyond that layer, robot and BOINC client can be identical. p.s.: if you're not a program developer, you can even use the BOINC client together with a scripted macro recorder. |
Joined: 29 Aug 05 Posts: 15585 |
**grin** I'm wondering, do I now need to ban Ananas for posting how to do that? ;-) |
Joined: 27 Jun 06 Posts: 305 |
Well, forward it to someone who can fix it, then you can hide the entry ;-) |
Joined: 19 Jan 07 Posts: 1179 |
Well, forward it to someone who can fix it, then you can hide the entry ;-) There is no way to "fix it" without adding a captcha to the manager, making current systems incompatible. Ageless: it might be a good idea to hide the post. |
Joined: 29 Aug 05 Posts: 15585 |
I think that spammers already know how to do that, so I'll leave it be. I've sent this thread to Rytis. He'll probably get by sometime tomorrow. Besides, you didn't post anything that would get you banned until the end of Unix time. :-) On second thought, best to be hiding the post anyway, in case it gives someone ideas. I've sent it off to Rytis anyway. |
Joined: 27 Jun 06 Posts: 305 |
... The two-step signup (mentioned in my thread starter) could fix it. The script snippet will still work but it will only create unverified accounts that will never be verified - they will never get listed anywhere and they will disappear after waiting for verification for a certain time span. |
Joined: 27 Jun 06 Posts: 305 |
Search for : view_profile.php viagra gives me nearly a million hits. Those hits are the profiles themselves but also links to those profiles that are used in automated guestbook and shoutbox entries. So the spammers use those profiles as free anonymous web space and increase their rating through that. Another 700k hits for Tramadol, 960k for Cialis, 780k for Phentermine As the UserIDs are very close together, they must have used a script or program, e.g. : http://dist.ist.tugraz.at/cape5/ view_profile.php?userid=8780 key=viagra-online http://dist.ist.tugraz.at/cape5/ view_profile.php?userid=8785 key=Phentermine-online http://dist.ist.tugraz.at/cape5/ view_profile.php?userid=8783 key=Soma-online http://dist.ist.tugraz.at/cape5/ view_profile.php?userid=8781 key=cialis-online http://dist.ist.tugraz.at/cape5/ view_profile.php?userid=8784 key=Ultram-online http://dist.ist.tugraz.at/cape5/ view_profile.php?userid=8782 key=Levitra-online (I inserted spaces in those URLs to make them invalid) The same project has another spam sequence from UserID 9423 to 9427 and one more from 8813 to 8819 (probably more that I haven't found), plus some single entries that don't stand in sequences, e.g. 9112 _________________________ The reference links in shoutboxes and guestbooks mostly contain several links, often mixed together from profiles in different projects. By adding the &key= to the guestbook spam entries they can even count the hits for each chemical in each project from the inserted images / HTTP_REFERER. |
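The observation that the spam UserIDs sit very close together can be checked mechanically. The sketch below is one possible way for a project admin to find such runs in a list of flagged accounts; the function name `consecutive_runs` and the `min_length` threshold are assumptions, and the sample list fills in the IDs between 9423 and 9427 on the assumption that the whole range is spam, as the post implies.

```python
def consecutive_runs(user_ids, min_length=3):
    """Return (first, last) pairs for runs of consecutive IDs.

    Runs shorter than min_length are ignored, so isolated spam
    accounts do not show up as a scripted batch.
    """
    runs = []
    ids = sorted(set(user_ids))
    if not ids:
        return runs
    start = prev = ids[0]
    for uid in ids[1:]:
        if uid != prev + 1:
            # The current run has ended; keep it if it is long enough.
            if prev - start + 1 >= min_length:
                runs.append((start, prev))
            start = uid
        prev = uid
    if prev - start + 1 >= min_length:
        runs.append((start, prev))
    return runs


# IDs from the post; 9424-9426 are assumed to belong to the 9423-9427 run.
flagged = [8780, 8785, 8783, 8781, 8784, 8782,
           9423, 9424, 9425, 9426, 9427, 9112]
print(consecutive_runs(flagged))  # [(8780, 8785), (9423, 9427)]
```

The single entry 9112 is dropped because it does not form a run, which matches the distinction the post draws between sequences and stand-alone accounts.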
Joined: 12 Feb 06 Posts: 232 |
Ageless wrote: The developers are looking into different options. Both good ideas. I have heard of another method which can stop many bots. It's used to prevent posting, but could likely also be used to prevent account creation. The idea is that you add to the form for posting (or account creation) an extra form field, which is hidden from users by CSS display: none;. The bots still see it and usually put something into it. Real people won't. Then if there is something in this input field you know it's a bot and drop it. The form could be adjusted so that on browsers where the field shows up it is clearly marked so that humans know not to enter anything. This isn't a complete solution by itself, but could be another line of defence. (I'm glad to see we have preview before post!) -- Eric Myers "Education is not the filling of a pail, but the lighting of a fire." -- William Butler Yeats |
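The hidden-field idea from the post above can be sketched as follows. This is an illustration only: the field name `website`, the `FORM_SNIPPET` markup and the `is_probable_bot` helper are hypothetical and not part of the BOINC web code, and the form is modelled as a plain dictionary of submitted values.

```python
# Hypothetical name for the trap field; anything a bot would like to fill works.
HONEYPOT_FIELD = "website"

# The field is hidden with CSS and labelled for browsers that show it anyway.
FORM_SNIPPET = (
    '<div style="display: none;">'
    '<label>Leave this field empty: '
    f'<input type="text" name="{HONEYPOT_FIELD}" value=""></label>'
    '</div>'
)


def is_probable_bot(form):
    """Return True if the hidden trap field came back with any content."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())


# A person never sees the field, so it stays empty; many bots fill every input.
print(is_probable_bot({"user_name": "Eric", HONEYPOT_FIELD: ""}))
print(is_probable_bot({"user_name": "x", HONEYPOT_FIELD: "http://example.com"}))
```

As the post says, this is not a complete solution by itself: a bot that skips fields hidden by CSS passes the check, so it only works as one more line of defence.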
Joined: 27 Jun 06 Posts: 305 |
It is very likely that those user accounts have not been created through the web site (HTML) interface but through client / scheduler communication. The post that Ageless has hidden above was a concept of how to do it. Btw., I agree that it was a good idea to hide it - I had no better way to show how serious this issue might become (or already be), it wasn't meant to give anyone "good" ideas. Web site protection will only help against HTML based bots. |
Joined: 19 Jan 07 Posts: 1179 |
It is very likely that those user accounts have not been created According to the logs I have seen from many project admins, bots are using only the web-based part. |
Joined: 27 Jun 06 Posts: 305 |
For those HTML bots even a simple maths question to solve (in prose) would help. One more quick solution would exist: - don't show a profile for accounts unless they have crunched one workunit OR become founder of a team (some teams have technical accounts for team founders that do not crunch) - show only the first 10 characters of the username, followed by 3 periods - avoid choosing them as UotD (I have already seen viagra ads on a project entry page - not nice, makes me want to kill) It wouldn't keep the spammers from signing up but they wouldn't get anything out of it anymore :-) p.s.: this project has some spam accounts too btw., scroll through top_users.php and you will see them. As there is most likely no scheduler running, that's surely HTML bot work. |
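The three display rules in the last post could be sketched like this. The `User` class, its fields and the helper names are hypothetical and do not reflect the BOINC database schema, and applying the 10-character cut only to accounts that fail the profile rule is one reading of the post, not something it states.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical stand-in for a project account."""
    name: str
    workunits_done: int = 0
    founded_team: bool = False


def is_trusted(user):
    """Crunched at least one workunit OR founded a team."""
    return user.workunits_done >= 1 or user.founded_team


def profile_visible(user):
    """Rule 1: only trusted accounts get a public profile."""
    return is_trusted(user)


def display_name(user, limit=10):
    """Rule 2: cut untrusted names to `limit` characters plus three periods."""
    if is_trusted(user) or len(user.name) <= limit:
        return user.name
    return user.name[:limit] + "..."


def uotd_eligible(user):
    """Rule 3: never pick an untrusted account as User of the Day."""
    return is_trusted(user)


fresh = User(name="x" * 40)
founder = User(name="team-technical-account", founded_team=True)
print(display_name(fresh), profile_visible(fresh))
print(display_name(founder), profile_visible(founder))
```

Sign-up stays untouched in this sketch, which matches the point of the post: the spammer can still register, but a truncated name and a missing profile give nothing to link to.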
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.