Wikipedia:Bots/Requests for approval/DPL bot
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: JaGa (talk · contribs)
Time filed: 19:39, Thursday October 6, 2011 (UTC)
Automatic or Manual: Automatic unsupervised
Programming language(s): PHP
Source code available: Not currently
Function overview: Tags and removes tags from articles based on whether they should have the {{dablinks}} template.
Links to relevant discussions (where appropriate): Wikipedia talk:Disambiguation pages with links#Dablinks update and proposal
Edit period(s): Twice daily
Estimated number of pages affected: 238 pages on first run; after that, somewhere around 5/day
Exclusion compliant (Y/N): Y
Already has a bot flag (Y/N):
Function details: The {{dablinks}} template was introduced to tag articles with an "excessive" number of disambig links, which at the time was defined as linking to 25+ disambigs. When the template was introduced, there were 300 such articles; now there are six. I've proposed to lower the "excessive" threshold to 15 dablinks, and to do the tagging/de-tagging with a bot so I no longer have to do it manually. This would affect 238 articles on the first run. The bot, which queries and submits via the API, would make no edit when an edit conflict occurs or if max lag is greater than 5 seconds; it would just wait until the next scheduled run and try again.
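The skip-and-retry behaviour described in the function details can be sketched roughly as follows. This is an illustrative Python sketch only, not DPL bot's code (the bot is written in PHP and its source is not published). It assumes the API response has already been parsed into a dict, and that failures appear under an `error` key with codes such as `maxlag` and `editconflict`. The names `with_maxlag`, `should_defer` and `pages_to_retry` are hypothetical.

```python
# Hypothetical sketch: decide whether an edit should be left for the next run.
# Not the bot's actual code; names and structure are assumptions.

MAXLAG_SECONDS = 5  # threshold stated in the request

# Error codes treated as "skip now, retry on the next scheduled run"
# (an assumed set for this sketch).
DEFER_CODES = {"maxlag", "editconflict"}


def with_maxlag(params: dict) -> dict:
    """Return a copy of the request parameters with the maxlag threshold added."""
    return dict(params, maxlag=MAXLAG_SECONDS)


def should_defer(api_response: dict) -> bool:
    """True if the response reports excessive lag or an edit conflict."""
    error = api_response.get("error")
    if not error:
        return False
    return error.get("code") in DEFER_CODES


def pages_to_retry(results: dict) -> list:
    """Given {title: api_response}, list the titles to leave for the next run."""
    return sorted(title for title, resp in results.items() if should_defer(resp))
```

In this sketch nothing is retried immediately; a deferred page is simply picked up again by the next twice-daily run, which matches the behaviour the operator describes.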
Discussion
A few questions:
- How are you generating the list of articles that have a lot of links to dab pages?
- With only 238 affected articles total, how do you figure that you'll find an additional 5 articles per day?
- With such a relatively small number of affected articles (consider, for example, that {{Unreferenced}} has over 250,000 transclusions), wouldn't it be better to develop a script (perhaps AWB or javascript) that would assist you in actually fixing the problem rather than creating a bot to automatically slap another cleanup tag on it? You could fix 238 articles in a few days. Seems to me like a better use of your time, but that's just my opinion.
—SW— confess 16:18, 10 October 2011 (UTC)
Answers:
- This report gives me my base data, and I update with Toolserver and API data to make sure the dablink count hasn't changed since the report was generated.
- That's based on current experience. Right now, I tag or de-tag 1-3 articles/day manually. Since I'm lowering the excessive dablink threshold from 25 to 15, I expect this rate to rise slightly.
- The DPL project definitely could not fix these in a few days; it actually takes quite a bit of time to fix the dablinks on many of these pages. Most of them are badly neglected lists. ICD-9-CM Volume 3, for instance, took hours to fix since I had to clean it up before I could even begin with the disambiguation.
--JaGa talk 05:38, 11 October 2011 (UTC)
- What if you designed a tool which would:
- Find all links to dab pages in an article, and for each link:
- Display to you the sentence that the link is used in.
- Automatically load the associated dab page, and display to you each possible link from the dab page
- You simply click on the correct link, and the tool remembers your decision. Then move on to the next link.
- At the end, the tool updates all the links on the page.
- This actually sounds like an interesting project. If you're not interested in developing this, I might try it myself. —SW— communicate 14:14, 11 October 2011 (UTC)
- AWB has some of that functionality. Brush up your C# and implement the rest! Rich Farmbrough, 15:57, 11 October 2011 (UTC).
- Looks like there's already a tool on toolserver which helps out immensely with this. —SW— converse 15:11, 11 October 2011 (UTC)
- Yeah, Dab Solver is already gloriously complete; we've been using that tool for some time now. --JaGa talk 21:58, 13 October 2011 (UTC)
- Fair enough. I don't have any other questions or objections at this time. I don't think it would be unreasonable to lower the threshold below 15 if you could get consensus from a wider discussion for it. However, I would want to verify whether or not the toolserver report you're relying on filters out things like dab links in hatnotes and other potentially valid dab links in articles. —SW— speak 22:39, 13 October 2011 (UTC)
- Yes. The tool is compliant with WP:INTDABLINK - that is, it ignores intentional disambig links when it performs its counts. --JaGa talk 23:44, 13 October 2011 (UTC)
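As a rough illustration of counting that ignores intentional disambiguation links: WP:INTDABLINK asks editors to route intentional links through a title containing "(disambiguation)", so one plausible filter is to skip links written that way. The Python sketch below rests on that assumption and is not the Toolserver report's actual logic, which is not shown in this discussion. The function name and the input shape (pairs of the link title as written and a flag saying whether it leads to a disambiguation page) are hypothetical, and hatnote handling is not modelled.

```python
from typing import Iterable, Tuple

# Marker that WP:INTDABLINK asks editors to use for intentional links.
INTENTIONAL_MARKER = "(disambiguation)"


def count_unintentional_dablinks(links: Iterable[Tuple[str, bool]]) -> int:
    """Count links that lead to a disambiguation page but are not marked intentional.

    Each item is (title_as_written, leads_to_dab_page).
    """
    return sum(
        1
        for title, leads_to_dab in links
        if leads_to_dab and not title.endswith(INTENTIONAL_MARKER)
    )
```

For example, a link written as "Mercury" that lands on a disambiguation page would be counted, while "Mercury (disambiguation)" would not.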
- The threshold is now lowered from 25 to 15 links. Is it necessary to come back here with a request for every lowering of the threshold? Or is a simple notification enough? Night of the Big Wind talk 21:52, 18 October 2011 (UTC)
- There should be a discussion to find a consensus on what the proper threshold should be (per WP:BOTPOL). Otherwise, there would be nothing (other than common sense) stopping anyone from lowering the threshold to 1. My personal opinion is that 15 is still a bit high; I think somewhere between 5 and 10 would be good. Others might disagree. If I were the bot operator, I would finish out this BRFA first and then worry about starting a wider discussion about lowering the threshold. In my experience, discussing too many things at once on Wikipedia produces a cacophony. —SW— confabulate 00:14, 19 October 2011 (UTC)
- That sounds good to me - just stick with 15 for now, see about lowering again later. --JaGa talk 02:37, 19 October 2011 (UTC)
- OK, but I am optimistic enough to dream of fewer than 25,000 links to dab pages in total. Night of the Big Wind talk 05:49, 19 October 2011 (UTC)
So... are we ready for a trial? --JaGa talk 16:35, 20 October 2011 (UTC)
Approved for trial (30 tags and 30 de-tags). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Please use a descriptive edit summary, especially mentioning the threshold when removing the {{dablinks}} tag, because I suspect most editors don't know of the 25+ (now 15+) recommendation. If I were you, I would remove tags only when the dab links are at <10 if you use 15+ to add {{dablinks}}, to prevent removing too many manually added ones. — HELLKNOWZ ▎TALK 17:16, 20 October 2011 (UTC)
- Funny you should say that; I was also considering a lower removal threshold, but my reason was to keep from a tag/re-tag loop when an article has 14 dablinks and the 15th is in a "This is a disambig! No it isn't!" edit war. So I've put it in; DPL bot tags at 15 dablinks or greater, and removes tags at fewer than 10 dablinks. I've done the first run (results on talk), but there was only one article ready for de-tagging at the time. I've halted all tagging and will update the talk page once I have more de-tagging edits. Would it be OK if I had fewer than 30 de-tag diffs? It would take a lot of time to fix that many articles. --JaGa talk 04:17, 21 October 2011 (UTC)
- That's a purely bureaucratic number; if there aren't any pages to work with, you don't need to worry about matching it. I just thought there would be. Anyway, edits look fine. — HELLKNOWZ ▎TALK 07:37, 21 October 2011 (UTC)
- One minor note: on your one de-tagging edit, the edit summary says that it was de-tagged for having fewer than 15 dablinks; it should be 10. —SW— confess 13:59, 21 October 2011 (UTC)
- I updated the edit summary and ran a few more de-tags. --JaGa talk 16:46, 21 October 2011 (UTC)
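The two-threshold rule agreed in this exchange (tag at 15 or more dablinks, de-tag only below 10) can be sketched as a small decision function. This is a hedged Python illustration, not the bot's PHP code; the function name `decide` and the returned strings are hypothetical.

```python
TAG_AT = 15       # add {{dablinks}} at this many dablinks or more
UNTAG_BELOW = 10  # remove the tag only when the count drops below this


def decide(dablink_count: int, is_tagged: bool) -> str:
    """Return "tag", "untag", or "none" for one article."""
    if not is_tagged and dablink_count >= TAG_AT:
        return "tag"
    if is_tagged and dablink_count < UNTAG_BELOW:
        return "untag"
    # Counts from 10 to 14 change nothing either way, which is what
    # prevents a tag/re-tag loop around a single contested link.
    return "none"
```

Under this rule a tagged article that has been reduced to 12 dablinks keeps its tag, and an untagged article hovering between 14 and 15 is tagged once but is not de-tagged again until it falls below 10.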
Trial complete. — well, 30 tags and 15 de-tags, at least. Couple of notes:
- George Stobbart (video game character) has a good diff since it actually had content above the tag as well.
- The number of dablinks from Kosher fish list has been lowered to 12, and DPL bot has not attempted to de-tag it. --JaGa talk 14:16, 24 October 2011 (UTC)
Looks good Approved. --Chris 03:07, 27 October 2011 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.