Wikipedia talk:Edit filter/RfC/Archive 1

From Wikipedia, the free encyclopedia

First thoughts from Risker

  • This is way too many words. There's too much repetition - "Basics" is almost a repeat of "Introduction" section.
    • I've made Introduction clearly the page's lead, which means duplication isn't so bad. Sam Walton (talk) 09:36, 10 August 2015 (UTC)[reply]
  • Remove the word "derogation" and use a non-legal term.
    • Removed those sentences entirely; disallowing should only be done on certified unwanted edits, it doesn't really make sense to talk about stopping everyone being able to edit. Sam Walton (talk) 09:36, 10 August 2015 (UTC)[reply]
  • There are edit filters that are used to collect information that may or may not lead to other actions not directly linked to the filter. This is a legitimate use of edit filters.
    • Removed the sentences in Recommended Uses which implied use only for bad faith edits. Sam Walton (talk) 09:36, 10 August 2015 (UTC)[reply]
  • You may not be aware of this, but the enwiki edit filter is configured to only take the actions that you list. It is entirely possible to configure the edit filter to block accounts (not just edits); however, the enwiki community decided long ago that blocking required human eyes and decision-making, and did not include that option in our configuration. I am not sure whether or not other Wikipedias have enabled account blocking.
  • Much of the problem came from the fact that when the extension was first enabled, only a limited number of edit filters could run concurrently without seriously adversely affecting the manner in which the encyclopedia functioned; thus, all filters were regularly reviewed and reassessed by knowledgeable filter managers. Improvements in the extension and other areas have really reduced the impact of filters, and their use has grown exponentially as a result.
  • For the first couple of years, I thought I would only ever need "edit filter reader" and wondered why I had to have the whole shooting match. However, I've found myself making minor modifications to a couple of filters over the years (mainly fixing typos) which would have been a waste of the time of the originator, so I'm not so sure that splitting the right is a good idea.
  • Remember that "readers" would be able to read the private filters. It is important not to treat even "reading" as an easy-to-obtain userright. Experience tells us that there are plenty of LTA-type editors who are patient enough to develop accounts over time (good hand) and take advantage of that experienced/respected account to facilitate inappropriate behaviour by bad-hand accounts.
  • Might be worth mentioning that oversighters can suppress inappropriate content that shows up in filter logs.
  • Not included in this, or in anywhere else, but we should have an action plan so that useful edit filters whose creators become unavailable/retire/are no longer interested in maintaining them can still be operational but some other specifically designated individual takes over responsibility for its ongoing maintenance.
  • Not included in this, but it seems to me to be worthwhile to explore the value of designating a half-dozen or so experienced and well-qualified edit filter managers to oversee the use of filters generally, and be given latitude to remove/significantly alter/disable edit filters that have significant error rates, have other problem effects, or are being used inappropriately. I believe this sort of occurs on an informal basis now, but it might be worthwhile to look at recognizing that authority in a more formal way, similar to the way that FA delegates and TFA delegates operate.

Hope that helps. Risker (talk) 19:13, 30 July 2015 (UTC)[reply]

The presumption underlying your fourth point is false. The limits actually haven't changed. We have perhaps gotten a little better at writing filters that fit within those limits, and the filters themselves consume fewer resources so perhaps we worry about them less, but we had a large number of active filters even in 2009. The biggest change is probably that edit filters have become a less interesting area to work in now that it is more mature, and so it draws less attention. Dragons flight (talk) 20:23, 5 August 2015 (UTC)[reply]
  • I don't like this hierarchical idea of a "half-dozen or so experienced and well-qualified edit filter managers to oversee the use of filters generally", for several reasons. One is that there are really only about half a dozen active EFMs at a time. More importantly, any EFM at the moment feels qualified to change or disable any edit filter; there is little concept of "ownership", except perhaps in new EFs that are still being developed.
  • For that reason we also do not need succession planning.
All the best: Rich Farmbrough, 13:28, 10 August 2015 (UTC).[reply]
"we should have an action plan so that useful edit filters whose creators become unavailable/retire/are no longer interested in maintaining them can still be operational but some other specifically designated individual takes over responsibility for its ongoing maintenance." Agree, and in particular, if there is no maintainer for a significant period of time (aka >1 month), don't leave their orphaned-abuserfilter-bots configured at disallow, please.  :-)     75.108.94.227 (talk) 18:53, 10 September 2015 (UTC)[reply]

Some thoughts

This may duplicate some of Risker's points but I'm just going to jot down some thoughts as I read through the proposal:

  • I agree somewhat with Risker that this page is quite a bit longer than it should be, though I'm not certain where to reduce content.
  • The 'basics of usage' section makes it seem as though log only mode is only used for testing. In practice many filters remain in log only mode indefinitely and are simply routinely patrolled by interested editors.
  • While the extension was initially intended to stop abuse of the encyclopedia, it has been extended, particularly recently, to uses not related to vandalism or long term abuse. Examples include addition of WikiLove, technical restrictions (arbitration enforcement), warnings regarding edits which may not definitely be spam, and filters like this one, which log changes that are useful to track but aren't necessarily vandalism. I think the "Recommended uses" section is a little too harsh in specifying that filters should only be used to counter vandalism.
  • "is there an off-wiki venue? should there be?" - There is no off-wiki venue, though I often discuss some filters on IRC in en-admins. Perhaps a #wikipedia-en-editfilters would be useful, though in practice it's unlikely to differ much from -admins. That said we do have, I think, two non-admin edit filter managers currently.
  • "Each edit filter should be reviewed no less than every (time interval) to identify filters that are no longer useful." Given that this is entirely unenforceable I don't think it's particularly useful. Edit filter managers tend to look through the active filters every so often to see how they're running anyway, or at least I do.
  • While establishing an edit filter noticeboard might be useful, I don't think WT:EF receives enough traffic to justify it.
  • I think the "Reviewing edit filters and troubleshooting" section implies too much ownership of filters by their creating editor. They're edited and monitored pretty communally right now, and I like it that way.
  • I'll probably have many more thoughts as this progresses, but those are the things that stuck out to me on a first read. Sam Walton (talk) 23:20, 30 July 2015 (UTC)[reply]
Also this discussion at VPI that I started to collect the community's thoughts on edit filters should be of interest if you happened to miss it. Sam Walton (talk) 23:42, 30 July 2015 (UTC)[reply]
  • I actually think the "Reviewing edit filters and troubleshooting" section should probably be removed entirely; as mentioned above I don't think enforcing regular reviews is workable or necessary, and it implies too much ownership of filters by their creators. Sam Walton (talk) 09:40, 10 August 2015 (UTC)[reply]
  • Having thought about it I'm actually more open to an Edit Filter/Noticeboard page, WT:EF doesn't receive a whole lot of traffic, but a noticeboard might open the process up to more visibility and interest. Sam Walton (talk) 09:52, 10 August 2015 (UTC)[reply]
    • My reaction to an EF noticeboard was substantially the same as yours. Do we need it? Not technically, we have places to discuss. Would it be useful? Can't tell until we try it. All the best: Rich Farmbrough, 13:30, 10 August 2015 (UTC).

A note

I will probably want to have some input here, but I'm not sure when. I am very busy this week and probably into next. Dragons flight (talk) 20:34, 5 August 2015 (UTC)[reply]

Policy or guideline?

Is this page designed to be a policy or guideline? Sam Walton (talk) 09:49, 10 August 2015 (UTC)[reply]

I hope a guideline. We have too many policies. I do think clarification on "secrecy" is important. Anything short of systematic leaking is probably not a problem, it would be nice to have a good faith statement to the effect that EFM's should use their good judgement in revealing the content of private filters to third parties. All the best: Rich Farmbrough, 13:37, 10 August 2015 (UTC).[reply]
I'll go out on a limb here and say this should be categorized in the 'wikiproject-essay' subset. Almost everything in here is a good idea. Almost nobody who is working the edit-filter and abuse-filter circuit is going to disagree. So why put the imprimatur of officialdom on something that the people who are involved already see as worth doing? I'm not an insider where the abusefilters are concerned, so maybe there is heated discussion on the new sekrit mailing list or whatever, but if this is just common sense stuff, with no strong opposition, then I'd say there is no need to add Yet Another WP:PAG. I consider this document to be in the same category as wikiproject-traditions, even though I don't think there actually is a wikiproject-editFilters; such wikiproject-essays are stronger than a regular essay (which sometimes is just one or two people), but weaker than the guidelines/policies/pillars ... theoretically, anything in this document that conflicts with WP:POLITICIAN is not gonna fly, right? And that is 'merely' a wiki-notability-guideline, so if it trumps this draft-document, then this draft-document must be an essay of some sort. By contrast, of course, if somebody adds "abusefilters shall not be applicable to any pages in the remit of wikiproject USA" into the guideline, then very likely this draft-wikiproject-'essay' would quickly *become* upgraded to a wiki-policy, right? So it is an essay-with-legs. I consider that modify-an-unrelated-guideline-to-screw-with-EFN scenario vastly unlikely, hence I recommend that this draft-document become a wikiproject-editFilter 'essay' rather than trying to make it official. Even if it were an official policy, WP:IAR still trumps it, eh? So why add 'policy' at the top, when it is neither necessary (yet), nor will it change anything on-the-ground (indefinitely true)? 75.108.94.227 (talk) 18:50, 10 September 2015 (UTC)

Way too process heavy

  • Edit filter disputes at WP:AN/I? Whatever next?
  • Tries to hold edit filter to a mythical standard. I have encouraged review of page protection, IP blocking, discretionary sanctions etc., as NYB will attest, but apart from one or two splurges, these have not been regularly occurring processes.

All the best: Rich Farmbrough, 12:35, 10 August 2015 (UTC).[reply]

As I mentioned above, I don't think a review process would work or is needed, and I would support removing that section entirely. Sam Walton (talk) 12:54, 10 August 2015 (UTC)[reply]
MusikAnimal@ seems to agree so I will remove. All the best: Rich Farmbrough, 01:31, 16 August 2015 (UTC).[reply]
I meant to say and Sam Walton@.... Anyway  Done. All the best: Rich Farmbrough, 01:34, 16 August 2015 (UTC).[reply]

Would be nice

Though not as part of a policy/guideline. A Wiki-version of the edit filter list, to allow for better explanation/discussion than the "notes" field - which is often used as an edit summary. Clearly only limited explanation of private filters would be available, but these are rarely a problem. All the best: Rich Farmbrough, 12:40, 10 August 2015 (UTC).[reply]

(Actually, it's pretty obvious that the filters themselves should be wiki-pages with limited creation/editing/visibility according to the appropriate rights. Then talk pages would exist, history would be straightforward and navigable, watch-lists would work, etc. etc...) All the best: Rich Farmbrough, 13:41, 10 August 2015 (UTC).[reply]

Thoughts from MusikAnimal

Just going to comment on a few things I had concerns about:

  • edit filters should generally be tested (in "log only" mode) for at least several days
    I think this should instead be X amount of hits. Sometimes even after several days you only have a handful of hits to evaluate; other times you have hundreds within hours, which offer an ample amount of data. So how many edits are enough before you can move beyond log-only? It depends on the complexity and intent of the filter, I think. Some are designed to catch a wide variety of edits, so we'll need to wait until we see that all the scenarios are being tested. If I had to put a number on it, I'd say wait for 5 to 10 hits for the simplest of filters. Anything you plan to disallow should of course require more thorough testing. In the end, it's at the author's judgement whether or not they are sure the filter is accurate. They wrote it and are hopefully somewhat sure of themselves, but if not, they should seek help from other experienced filter managers.
    I've changed this to something hopefully more suitable; "until a good number of edits have been logged and checked". I don't know if we want to set a particular number, or Sam Walton (talk) 16:25, 16 August 2015 (UTC)[reply]
    Again this is really so dependent on the filter. For very specific purposes the fact that you get zero or few hits is a sign that the filter is working well. Hypothetically, suppose we had wanted to prevent the posting of the Texas Instruments key and used the regex

    (B709D3A0CD2FEC08EAFCCF540D8A100BB38E5E091D646ADB7B14D021096FFCD|B7207B… …)

    If the filter was not logging any hits after a short while, it would be a good indication that false positives were unlikely. In general the errors in this type of regex either make it hit everything (for example a "*" after the closing parenthesis, or a "|" before it) or make it miss things. Other sorts of testing are required to rule out false negatives.
    All the best: Rich Farmbrough, 22:18, 10 September 2015 (UTC).[reply]
  • abusive banned user, an alternative is ... to raise the issue at
    IRC would be ideal, but anyone could log on. However, isn't there a way to make it so that messages by non-ops are only seen by ops, like at #wikipedia-en-revdel? That way there'd be no eavesdropping. Once the request is seen, edit filter managers could talk in a dedicated room amongst themselves, so that non-ops don't see their messages in the requests room. Just a thought. Beyond IRC we could consider a mailing list.
    Could we make an IRC channel that uses the same list as en-admins? We could then manually add non-admin EFMs, of which there are at least a couple. Alternatively a mailing list could work. Sam Walton (talk) 16:25, 16 August 2015 (UTC)[reply]
  • edit filters should be monitored regularly
    There's no way to really enforce this. I would propose doing what I do, which is when I go to create a new filter, first see if I can find an old one that we don't need anymore. This for me is mainly out of concern for performance, where I don't want the filters to just keep piling up and we forget about the old ones that aren't getting any hits. Taking it a step further would be to actually evaluate the performance or the need of filters still actively getting hits. Lastly, I think we should point out that private filters require the most attention, as the author of the filter and the regex, description, etc, are hidden from the vast majority of users who might otherwise question its effectiveness.
    This has been removed. Sam Walton (talk) 16:25, 16 August 2015 (UTC)[reply]
  • If an editor believes that an existing edit filter is unnecessary...
    I think WT:FILTER is the best place for discussion, as it will attract attention from other edit filter managers, which I believe to be most relevant. If it's a policy-based issue or something requiring broader discussion, WP:VPP might be a better place than AN/I (as not all edit filter managers are admins), or you could simply put a "see WT:FILTER#discussion" message on the more popular noticeboards.
  • For example, when an edit filter has been designed to combat abusive...
    The prose here is a little confusing, in my opinion. If this means what I think it does, we should also note that any private filters are not to be discussed on public noticeboards.
  • Filter managers may share the contents of private edit filters with...
    I'm not sure if we'll be able to get this down in a guideline. It should be evaluated on a case by case basis, at the discretion of the edit filter manager. The EF managers are already highly trusted users, and we should trust they won't reveal confidential details to the wrong users. I don't think there should be set qualifications that users must meet in order to be given such details, instead just use your judgement.
  • If an edit filter manager is misusing the user right...
    Removal of any user rights should be done at AN/I, I think, as that will require admin intervention, and be at the forefront of the community where it would receive valuable input.

I'm not too great at prose but hopefully I can help shape this guideline here on the talk page. Overall it looks good, and I'm glad we're finally doing this MusikAnimal talk 16:35, 14 August 2015 (UTC)[reply]
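
Rich Farmbrough's point above about a stray "*" or "|" can be illustrated with a small sketch. It uses Python's `re` module rather than the filter's own regex engine, and the two short keys are hypothetical stand-ins for the long (elided) hex strings in the original pattern; the behaviour of these particular constructs is assumed to carry over.

```python
import re

# Hypothetical stand-ins for the long hex keys; the real pattern is elided above.
KEY_A = "B709D3A0"
KEY_B = "B7207B11"

intended = re.compile(f"({KEY_A}|{KEY_B})")
stray_star = re.compile(f"({KEY_A}|{KEY_B})*")  # "*" after the closing parenthesis
stray_pipe = re.compile(f"({KEY_A}|{KEY_B}|)")  # "|" before the closing parenthesis


def hits(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


innocent = "A perfectly ordinary edit about calculators."
leak = f"The key is {KEY_B}."

for name, pattern in [("intended", intended),
                      ("stray_star", stray_star),
                      ("stray_pipe", stray_pipe)]:
    print(name, hits(pattern, innocent), hits(pattern, leak))
```

Both broken variants can match the empty string, so they report a hit on every edit. That is why a log-only filter of this kind that records no hits at all is unlikely to contain either mistake.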

Encouraging use of regex tools

I find the external tool debuggex to be indispensable when working with complicated regex (there may be others that are better). This is not always applicable, say when you are using lots of functions, but for complex regex-driven filters I think usage of this tool should be strongly encouraged. Authors can create and save a "debuggex" and add unit tests for what they are trying to capture, and perhaps some similar strings that they are not trying to capture. You can then include a link to that debuggex in the filter description. That indicates to other EF managers that the filter has unit tests that need to be run against any new changes. They should also update the debuggex with the new regex, add any new relevant unit tests, and update the link in the description to the updated debuggex.

Following this process will make maintaining complex filters considerably easier, and super helpful when you've got multiple EF managers working on the same filter. Everyone can create a debuggex account for free (they say there are limitations but that doesn't seem to be the case). Thoughts? MusikAnimal talk 19:51, 14 August 2015 (UTC)[reply]

  • We could definitely include a "useful tools" or something section at the bottom of the page. Sam Walton (talk) 19:53, 14 August 2015 (UTC)[reply]
  • There is some subtlety with debuggex. For example a unit test for "lololol" that matches the entire string will be deemed a failure if it matches "lolol" without requiring the extra "ol". Whether this is an actual failure, or an improvement in the matching of the regex is dependent on various subtle factors. (If "lolol" is considered a false positive, then it should be included as a negative unit test.)
All the best: Rich Farmbrough, 14:03, 15 August 2015 (UTC).[reply]
Also the usual concerns about private filters apply. All the best: Rich Farmbrough, 01:35, 16 August 2015 (UTC).[reply]
Yes just like with other languages your tests might need to be updated as well. I realize this is probably too much for us to include in the guideline, given the stipulations, conditional practically, and just having to learn to use a new tool. I still think we should make note of it, however, and perhaps throw in that you can save the debuggex and include it in the filter description as sort of a go-to for how the regex is constructed. As for private filters, it is possible to make private debuggex's that can be shared amongst specific accounts, but that costs money I think. If we really want it, we could look to the foundation to create an account for us, I'm trying to do something similar for WHOIS services for anti-vandal purposes. I guess what I'm getting at is it can truly take a single mistyped character or other small modification to break what some complicated regex was meant to target. As with any software with critical workflow (such as disallowing filters) it's crucial to ensure we don't break existing functionality or introduce new errors when attempting to make updates. MusikAnimal talk 03:22, 16 August 2015 (UTC)[reply]
I'm not sure the 'Testing, techniques and suggestions' section is really necessary in the guideline; an external link suggesting use of debuggex for testing regex is all that's necessary imo. Sam Walton (talk) 11:21, 18 August 2015 (UTC)[reply]
Yeah I've warmed up to this. While I do believe it can play an integral role in maintaining filters for some, most have gotten along fine without debuggex. The whole saving and resharing the debuggex's is a cool idea but could be cumbersome to standardize, and we don't want to make people think they are forced to use something they are not. Frankly I primarily use it because many of the normalizing functions aren't as comprehensive as I want them to be, so I construct my regex from scratch and use debuggex for testing (as the regex can become quite large and difficult to read). In most cases this probably isn't necessary. MusikAnimal talk 14:39, 18 August 2015 (UTC)[reply]
I've removed the section on the basis that it's filter help, which is likely more useful at one of the other Edit Filter pages, rather than the main guideline. I have, however, added Debuggex as a recommendation to the bottom of the page. I'm not decided whether it should stay there or be moved to, but I definitely don't think the whole big recommendations section should be here. Sam Walton (talk) 21:54, 18 August 2015 (UTC)[reply]
I'm not a big fan of debuggers myself - I do use them but they tend to be used as a crutch too often and there are some tasks (eg: UI debugging, such as anything involving drag and drop in Javascript) where they're more a hindrance than a help. I've used http://regex101.com for testing regexes as it claims to use Perl Compatible Regular Expressions, which is what the filter (according to the documentation) uses. The best "dead tree" source I think is O'Reilly's Owl Book, which is to regexes what the Camel Book is to Perl. Ritchie333 (talk) (cont) 11:45, 11 September 2015 (UTC)[reply]
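
The unit-testing workflow discussed in this thread, including the "lololol" subtlety, can be sketched outside any particular tool. This is a minimal illustration in Python's `re` module, not debuggex's own behaviour; `run_unit_tests` and the patterns are hypothetical names chosen for the example.

```python
import re


def run_unit_tests(pattern, should_match, should_not_match):
    """Return a list of failure descriptions; an empty list means all tests passed."""
    regex = re.compile(pattern)
    failures = []
    for text in should_match:
        # Positive tests require the WHOLE string to match, as in the "lololol" example.
        if regex.fullmatch(text) is None:
            failures.append(f"expected full match: {text!r}")
    for text in should_not_match:
        # Negative tests fail if the pattern matches anywhere in the string.
        if regex.search(text) is not None:
            failures.append(f"unexpected match: {text!r}")
    return failures


# r"lolol" still finds a hit inside "lololol", but the whole-string test reports a failure.
print(run_unit_tests(r"lolol", ["lololol"], []))

# If "lolol" is considered a false positive, list it as a negative test.
print(run_unit_tests(r"l(ol){3,}", ["lololol"], ["lolol"]))
```

Whether the first result is a real regression or just a looser match is the judgement call described above; adding explicit negative tests makes that decision visible to the next person who edits the filter.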

Suggestions

Just two quick suggestions:

  • Number and list of edit filter managers could be linked for ease of reference (see 3rd lead para of WP:Template editor).
  • From the same WP page: The recommendation of "Have a strong password" (for all critical user rights) seems sensible and would only take up 3 more lines.

Maybe other aspects of the TE guideline could be used as model too, but not sure which one. GermanJoe (talk) 16:30, 16 August 2015 (UTC)[reply]

Both sensible suggestions, I think. I've added that sentence to the lead and copied the strong password section almost verbatim. Sam Walton (talk) 16:42, 16 August 2015 (UTC)[reply]

Noticeboard

What are your thoughts on the suggestion of a WP:Edit Filter Noticeboard? While I initially dismissed the idea, pointing out that WT:FILTER hardly gets any traffic as it is, I think I'm actually in support of the idea now. It provides a more official-seeming venue for raising concerns and discussions about filters, and would leave WT:EF free for discussion of the guideline (assuming this passes). That said, the noticeboard could be set up regardless of the guideline proposal, and probably before so it can be properly integrated (or not) into this draft. Sam Walton (talk) 16:32, 16 August 2015 (UTC)

I'm up for giving it a try, guideline or not. We can add it to {{noticeboard links}} and see what happens. I think just having the name "noticeboard" would attract welcomed input from non-EF managers, where WT:FILTER seems to be more for internal use. MusikAnimal talk 16:42, 16 August 2015 (UTC)[reply]
@MusikAnimal: Is this the kind of thing that needs a proper proposal somewhere, or should we just go ahead and do it? Sam Walton (talk) 17:10, 16 August 2015 (UTC)[reply]
I don't think so. Here we're inviting the community to take part in this tiny corner of the project, which would seem like an uncontroversially good thing. If it doesn't work out we can simply redirect the noticeboard page back to WT:FILTER. Note also we should update Wikipedia:Noticeboards and possibly Wikipedia:Dashboard MusikAnimal talk 14:20, 18 August 2015 (UTC)[reply]
Alright, I'll see about finishing off the page and publicising it. Do you know anything about setting up a mailing list? I have literally no idea where to start. Sam Walton (talk) 14:27, 18 August 2015 (UTC)[reply]
We did it recently for xTools, apparently you need to request a list on phabricator. I searched around and found meta:Mailing lists#Create a new list. I can help with this, but it'll have to wait a little bit. There's different formats and options that we'll need to set up properly for our purposes MusikAnimal talk 14:47, 18 August 2015 (UTC)[reply]
Alright no worries, it's not urgent. Sam Walton (talk) 14:52, 18 August 2015 (UTC)[reply]

Off-wiki communications

As mentioned above, I think it would be beneficial to set up a venue for communication between edit filter managers regarding private filters. I tend to use the #wikipedia-en-admins IRC channel for discussing filters (mostly to pester MusikAnimal), but we should probably have somewhere more specific to filter management, primarily to allow non-admin managers (of which we have at least a couple) to join discussions. A mailing list might be the best bet so that users can email with any private concerns they have about private filters, or to request more details on private filters. On the latter note, I don't think there's much harm in limited discussion of private filters with trusted users who can help with them; at least a couple of times I've had useful input from trusted members of the community who know more about a particular issue than I do once they know how the filter works. Sam Walton (talk) 20:18, 16 August 2015 (UTC)[reply]

Ideally we should have private wiki-pages. If edit filters were implemented to run off wiki-pages, as they should be, then this would be simple enough. However, the quick fix would definitely be a mailing list. All the best: Rich Farmbrough, 00:32, 17 August 2015 (UTC).
Legoktm has offered to set up a mailing list for us. Will update here when I know more; a number of changes can be made to the draft once it's implemented. Sam Walton (talk) 21:52, 18 August 2015 (UTC)[reply]
The process for requesting a new mailing list is pretty simple (m:Mailing_lists#Create_a_new_list); if there's a rough consensus that a mailing list is needed I can take care of the paperwork. Legoktm (talk) 03:22, 20 August 2015 (UTC)
Please do. Happy to be a moderator, if that helps - but no guarantees I can spend time on it. All the best: Rich Farmbrough, 00:49, 24 August 2015 (UTC).[reply]
I have created the list at wikipedia-en-editfilters per the request above with Samwalton9 and MusikAnimal as list administrators. Thanks, John F. Lewis (talk) 00:05, 5 September 2015 (UTC)[reply]

Edit filter user right removal

Where should discussions regarding removal of the user right from editors who are abusing the right, or are otherwise not competent enough to edit filters, take place? The EF Noticeboard seems like the obvious place, but that may not be considered to be central enough, in which case the administrators' noticeboard or similar might be more appropriate. I'm undecided though; the EFN might be fine seeing as it's only regarding the EF user right. Sam Walton (talk) 22:09, 18 August 2015 (UTC)

I'd say EFN would be a good place to start. It can always escalate elsewhere, if necessary.  —SMALLJIM  22:31, 20 August 2015 (UTC)[reply]
AN for administrator EFMs, and EFN for non-admin EFMs. Cases of misuse of the edit filter manager user right are more complicated and may also have elements of "admin abuse" when related to an admin EFM. Esquivalience t 14:24, 21 August 2015 (UTC)[reply]
I would go for EFN, since it's a specialist topic. Notes could be posted at AN/I. All the best: Rich Farmbrough, 00:09, 26 August 2015 (UTC).[reply]

Private filters section, use of EF2 for testing

I don't recall there being a consensus that EF2 should be used for testing. I don't think using any low-numbered filter for testing is a good idea, as I said here (4th comment from the end). If a test filter was drastically wrong and used up lots of conditions, it could stop the whole system (if, as the evidence suggests, filters generally run in numerical order). Far better that a much higher number is used, if we want prescribe a test filter at all – what if two people want to test different things at the same time, will we have edit filter edit wars? ;-)  —SMALLJIM  22:30, 20 August 2015 (UTC)[reply]
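
The concern about low-numbered test filters can be made concrete with a toy model. It assumes, as the comment above suggests but does not confirm, that filters run in ascending numerical order against a shared per-edit condition budget, and that later filters are skipped once the budget is spent. The `Filter` class, the `filters_evaluated` function, and every number below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Filter:
    filter_id: int
    conditions_used: int  # conditions this filter consumes on one edit (made-up figures)


def filters_evaluated(filters, condition_limit):
    """Return the IDs of filters that run before the shared budget is exhausted.

    Assumption: filters run in ascending ID order, and a filter is skipped
    once the running total has already reached the limit.
    """
    used = 0
    evaluated = []
    for f in sorted(filters, key=lambda f: f.filter_id):
        if used >= condition_limit:
            break
        used += f.conditions_used
        evaluated.append(f.filter_id)
    return evaluated


live = [Filter(10, 50), Filter(20, 50)]
LIMIT = 1000  # hypothetical budget for this sketch

# A drastically wrong test filter at a low ID starves every filter after it.
print(filters_evaluated(live + [Filter(2, 5000)], LIMIT))

# The same test filter at a high ID only affects itself.
print(filters_evaluated(live + [Filter(900, 5000)], LIMIT))
```

Under these assumptions the first call runs only the test filter, while the second runs both live filters before the expensive test filter is reached.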

  1. Location: Perhaps this should be at the Noticeboard?
  2. Consensus:
  3. Little numbers: I agree, ideally test filters should be at n+1, n+2... as the "least important". We could re-order our filters, and it might not be a bad idea in the medium term. (I am thinking that the mailing list might be a good place for a detailed FAQ about existing private filters, which would help with the documentation crisis.)
    • Caveat Testing a filter for days, only to find that the few items it should have picked up may have been missed due to conditions is not a good way to go. We have had this with new filters under test. All the best: Rich Farmbrough, 00:07, 26 August 2015 (UTC).[reply]
  4. Collision: generally, I get the feeling, people create a new filter to test a new filter, and run the existing filter without the actions to test an existing filter. I have only "collided" with another tester once, as far as I am aware. Named test filters are fine, if someone wants "Smalljim's general test filter" I don't think there'd be any opposition. A question arises, perhaps, if there are a lot of private test filters, community oversight is much harder.
  5. Test nirvana: separate test (and acceptance?) and live filter lists. Test would log the putative action, rather than carry it out. Is this un-wiki though?
All the best: Rich Farmbrough, 23:59, 25 August 2015 (UTC).[reply]
Just to throw a final monkeywrench into the considerations, I would support a high-numeral-grouping (or separate list) for defunct-slash-obsolete filters, and for deprecated-slash-obsolete codebases. There are two scenarios here. In the first scenario, the one that will see the most use, filter#666 has a bug, and the regex is tweaked. After some dev-testing (and maybe acceptance-testing), the new codebase goes live. But what about the 'obsolete' codebase that just got replaced? It would often be nice, I would presume, to have a decommission-list, and to version the codebases that are dubbed filter#666. For instance, filter#666_2015_01_01 would be the 'old' codebase in this example, and it would be in the test-list for a month or whatever, then moved to the live-list (where it is simply known as 'filter#666' and 2015_01_01 becomes the version-number rather than the filter-id). At that time in February 2015, the previous codebase would be moved from the live-list to the decommissioned-list. Later, come September 2015, a new codebase filter#666_2015_09_09 has been undergoing test-list efforts, and is ready to go live and become 'filter#666' ... at which time, the then-current filter#666_2015_01_01 codebase should be moved from the live-list to the decommissioned list. I imagine that most of the entries in the decommissioned-list would be disabled after a few months, but I also imagine that they would be kept as tag-only for at least a short while, subject to server-load-considerations. Thataway, if there are corner-cases where the recently-decommissioned filter#666_2015_01_01 codebase does a better job than the recently-made-live filter#666_2015_09_09 , it is an easy comparison to make. 
The second scenario, which is a subset of the first, is where an entire filter#666 is retired, and the filter#666 jersey hung up on the wall: in that case, the final filter#666 codebase would be moved to the decommissioned-list, but there would be no codebase that replaced it, in the live-list. 75.108.94.227 (talk) 18:17, 10 September 2015 (UTC)
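A minimal sketch of the three-list lifecycle described in the two scenarios above (test-list, live-list, decommissioned-list). The `FilterRegistry` class and its method names are hypothetical and are not part of the AbuseFilter extension; the patterns are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class FilterRegistry:
    """Hypothetical registry with separate test, live and decommissioned lists."""
    test: dict = field(default_factory=dict)            # filter_id -> {version: pattern}
    live: dict = field(default_factory=dict)            # filter_id -> (version, pattern)
    decommissioned: list = field(default_factory=list)  # [(filter_id, version, pattern)]

    def add_to_test(self, filter_id, version, pattern):
        """Register a new codebase version for testing."""
        self.test.setdefault(filter_id, {})[version] = pattern

    def promote(self, filter_id, version):
        """First scenario: a tested version goes live and the version it replaces is retired."""
        pattern = self.test[filter_id].pop(version)
        if filter_id in self.live:
            old_version, old_pattern = self.live[filter_id]
            self.decommissioned.append((filter_id, old_version, old_pattern))
        self.live[filter_id] = (version, pattern)

    def retire(self, filter_id):
        """Second scenario: the whole filter is retired with no replacement."""
        version, pattern = self.live.pop(filter_id)
        self.decommissioned.append((filter_id, version, pattern))


registry = FilterRegistry()
registry.add_to_test(666, "2015_01_01", r"foo")      # placeholder pattern
registry.promote(666, "2015_01_01")                  # goes live after testing
registry.add_to_test(666, "2015_09_09", r"foo|bar")  # tweaked placeholder pattern
registry.promote(666, "2015_09_09")                  # the 2015_01_01 version is decommissioned
```

Keeping the replaced version in `decommissioned` is what makes the side-by-side comparison of old and new codebases possible; whether such entries stay enabled as tag-only is a separate policy choice that this sketch does not model.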

Requests for assignment of the group to non-admins can be made...

A couple of points:

  1. Why at EFN? Shouldn't this permission be acquired the same way as others, via WP:PERM? Even if the request at PERM is then discussed at EFN this would seem to be more above-board.
  2. What criteria would be applied in assessing requests? Number of contributions, time on the wiki, absence of blocks, interaction at EFN? The EFN is so new there are no previous requests in its history to use as a model.

Bazj (talk) 16:29, 2 September 2015 (UTC)

There is a record of previous non-admin requests here. Jo-Jo Eumerus (talk, contributions) 16:51, 2 September 2015 (UTC)
I think EFN has been put as the venue because so few non-admins ever request the userright that it seems like overkill to set up a whole PERM page for it. I don't think it's a big deal either way though. Sam Walton (talk) 17:26, 2 September 2015 (UTC)
Another problem with WP:PERM is the hat collecting there. Every time I check a PERM request page, I see users begging for the right in question and unnecessary, mildly disruptive clerking (MusikBot already does enough of it). 65% of requests for rights there get denied. Mandating requests be placed in a more low-profile page will at least prevent some chaffy requests. Esquivalience t 03:56, 7 September 2015 (UTC)
I added the method that WMF staffers should use. All the best: Rich Farmbrough, 04:20, 7 September 2015 (UTC).

A link back to the previous discussions (as provided by Jo-Jo above) would provide guidance for users requesting the right and cut down on "chaffy" requests. I also noted that there were a number of requests to relinquish the right as it was no longer being used. Some guidance along those lines would also be useful. Perhaps with the assurance of a light-touch path to re-acquiring the right when needed again, in a similar vein to returning admins who previously gave up the bit voluntarily. Bazj (talk) 09:22, 7 September 2015 (UTC)

Hm, that archive of permission requests is quite out of date; there are quite a few requests from the last few years which haven't been archived there. Sam Walton (talk) 09:13, 8 September 2015 (UTC)
The lack of housekeeping around the granting/revocation discussions makes a pretty compelling case for moving it to WP:PERM where it would be taken care of properly. Bazj (talk) 17:11, 9 September 2015 (UTC)

Progress

I feel like this guideline might almost be at a level where we can propose it properly. Does anyone have any major concerns with it? FWIW there's a mass message on the way to EFMs regarding the mailing list which also includes a note about contributing to this guideline so I don't plan to propose it just yet, but please raise anything you think should be changed. Sam Walton (talk) 16:16, 9 September 2015 (UTC)

So... a viewing only permission for non-admins was decided against? Kharkiv07 (T) 16:36, 9 September 2015 (UTC)
I forget exactly where there was discussion about this but I don't recall there being consensus for non-admin viewing of hidden filters. That said, one of the reasons we set up the mailing list was to allow EFMs to communicate with trusted users about private filters. It would probably be worth holding a proper discussion about that aside from this guideline. Sam Walton (talk) 16:46, 9 September 2015 (UTC)
Such as meta:Special:GlobalGroupPermissions/abusefilter-helper granted to meta:Special:GlobalUsers/abusefilter-helper, not all of whom are en-admins? Are they all on the mailing list? Bazj (talk) 17:07, 9 September 2015 (UTC)
They are not automatically on the mailing list as members need to manually subscribe, but I suppose we could look into mass messaging those folks on meta. I do recall a small discussion about an enwiki abusefilter-helper user group, but nothing came of it. I suspect that won't happen anytime soon, and either way I don't think it should be a blocker for moving forward with this guideline. MusikAnimal talk 19:17, 9 September 2015 (UTC)
I would be happy for this to become a guideline. That doesn't mean it's finished. All the best: Rich Farmbrough, 22:32, 10 September 2015 (UTC).

process by which existing abusefilters may be judged

full disclosure

arbcom

  • "the community is encouraged to establish a policy or guideline for the use of edit filters"  Done
  • "and a process by which existing and proposed edit filters may be judged against these" Not done (emphasis added)

current draft

  • If the filter is receiving more than a very small percentage of false positives, or is designed to catch good faith edits, it should usually not be placed in 'disallow' or 'throttle' modes.

suggested wording

  • If the filter is receiving more than one confirmed false positive per month, it should usually not be placed in 'disallow' or 'throttle' modes (without consensus).
  • If the filter is designed to catch good faith edits, it must not be placed in 'disallow' or 'throttle' modes (without broad consensus).
rationale

I think splitting the tag-good-faith-edits type of filters into their own sentence, and saying that such edit-filters MUST not be disallow, is a no-brainer. Maybe I'm not understanding what type of good-faith edits ought to be disallowed... but I don't think so. If the edit is good-faith, then action=warn ought to be enough. The good-faith editor will read the warning, and then stop, unless they (in good faith) believe the warning to not be applicable. Edit-filters ought not be used to try and prevent good-faith editors from shooting themselves in the foot, because regex string-match is simply incapable of knowing when the editor is about to do so. Blanking the main page is not a good-faith edit; see next paragraph.

    I'm really only concerned with disallow. And I'm really only concerned with abusefilters, not 'edit filters'. Warnings made by abusefilters, like captcha, are annoying... but can only delay good-faith editing. (Cf WP:CONSENSUS about which I sometimes feel similarly.  ;-)     Disallow is very, very strong. There is no indication WHY something was disallowed. There is no friendly tone to the message. There is no manual override. There is no escape. See also, GORT.

    Why is disallow dangerous? Regex is the technical underpinning here. This draft-policy is ludicrously optimistic about how easy it is to write bulletproof regex. "Disallow edit filters should be used only to prevent edits that substantially all good-faith editors would agree are undesirable, or where a clear consensus has been reached that a specific type of edit should not be allowed." This is literally impossible; although it is unknown whether a Turing complete programming language is capable (in theory) of being able to detect "edits that substantially all good-faith HUMAN editors would agree are undesirable", see Turing test, it is most definitely known that regexes are not Turing complete. They are fast string comparison. They have zero understanding of the meaning of the strings. They are not attempting to evaluate what they disallow; they are just string-pattern-matchers. The human who programmed the edit filter does have a sense of what the ascii strings being pattern-matched actually symbolize, but they are expressing their understanding purely through the weak vessel of regex. There is an old saying.[1][2] There is also a not-uncommon fantasy.[3] Regex is no silver bullet.

    So, my goal here is to help nudge the abusefilter folks into a quantitative absolute threshold, not the weasel-worded relative fuzzy language we have now. My specific suggestion is: any individually-numbered abusefilter, that is set to action=disallow, and which is confirmed (via reported false positives as evaluated by the folks that staff the false-poz noticeboard) to have 2-aka-two false-positive events, during any given 30.25-day-timespan, is henceforth considered a HUFPAF (high-unacceptably-false-poz-abusefilter). WP:Edit_filter_noticeboard is the place where HUFPAF status is tracked; there should be a graph and/or sortable-table of the edit-filters which are currently HUFPAFs, which shows the worst offenders (either in percentage-false-poz or in absolute-confirmed-false-poz). There are three possible ways to correct the situation: fix#1, fine-tune the regex, so that fewer false-poz events occur, and the HUFPAF falls off the graph (this is ideal). Fix#2, downgrade from disallow to warn (this would require discussion at the edit-filter-noticeboard which could plausibly involve RfC should the local-consensus-decision be contentious). Fix#3, gaming the system by simply splitting up the filter into multiple chunks, and using the chunks to divvy up the targetspace in round-robin fashion, which should NOT be permitted -- rigging the numerals so that a set of abusefilters flies under the HUFPAF-threshold radar, is Not Kewl™.

    Again, though, I will stress here that I don't care what the exact quantitative threshold is. There could even be a totally fake puffball quantitative threshold, like defining HUFPAF as "more than 200 false-poz per month" which would exempt all but the most brain-dead of abusefilters from being tracked in a graph-or-table at WP:EFN. Even that puffball threshold, however, requires that the people staffing the false-poz noticeboard, keep *some* kind of track of how many confirmed false-poz events each edit-filter is generating per month. Because in theory, one of them might cross the 200/month false-poz faux-threshold, and they don't wanna risk the wrath of arbcom, right? Therefore, even a faux-quantitative threshold is some use. But I am serious in my proposal for 2-false-poz-per-month, as a good tight threshold, that will encourage the regex wizards to fine-tune the abusefilters that are confirmed by false-poz-noticeboard reports to be causing problems for good-faith editors (plural) every month. And I guess I should specify, the 2-false-poz-per-month disallow threshold, should not apply to socks, nor to multiple hits of the same good-faith-editor (necessarily). But once the abusefilter is causing multiple people headaches, somebody should tweak the regex, or somebody should de-weaponize that particular GORT.

    p.s. Yes, WP:NOTBUREAUCRACY. And yes, this means more work for overworked false-poz-noticeboard-volunteers. But it's pretty easy work: just notch the stick, every time you confirm a false-poz. Once the notches are greater than N, where N is the HUFPAF threshold, start a new thread at WP:EFN, which will snow-close as "worth it" most of the time, until next month. See also, WP:ABUSEFILTERSCANBEABUSED. Agree with Dennis that there should be a regular scan, every month or so, of *all* the abuse filters, checking whether any of the wiki-guardians needs guarding against tool-abuse. But isolated incidents of bad judgement are not my real worry; my real worry is systematic lack of concern about the inherent weakness of regex, as well as systematic expansion of the scope of the abusefilter... to the point where the euphemism 'edit filter' is now considered the new norm. With great power comes great responsibility, and all that. 75.108.94.227 (talk) 18:05, 10 September 2015 (UTC)
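A rough sketch of the "notch the stick" tally proposed above, assuming confirmed false-positive reports are available as `(filter_id, editor, date_confirmed)` tuples. The function name, the data layout, and the sample reports are all hypothetical. The window is rounded to 30 days and only the window ending "today" is checked, which simplifies "any given 30.25-day-timespan". Repeat hits by the same editor are counted once, following the caveat in the proposal.

```python
from datetime import date, timedelta

HUFPAF_THRESHOLD = 2          # proposed: two confirmed false positives
WINDOW = timedelta(days=30)   # the proposal says 30.25 days; rounded here


def hufpaf_filters(reports, today):
    """Return filter ids with enough distinct affected editors in the trailing window."""
    editors_by_filter = {}
    for filter_id, editor, confirmed in reports:
        if today - WINDOW <= confirmed <= today:
            editors_by_filter.setdefault(filter_id, set()).add(editor)
    return sorted(
        filter_id
        for filter_id, editors in editors_by_filter.items()
        if len(editors) >= HUFPAF_THRESHOLD
    )


# Illustrative reports only; not real noticeboard data.
reports = [
    (666, "EditorA", date(2015, 9, 1)),
    (666, "EditorA", date(2015, 9, 2)),  # same editor again, counted once
    (666, "EditorB", date(2015, 9, 5)),
    (42, "EditorC", date(2015, 9, 3)),
    (42, "EditorD", date(2015, 7, 1)),   # outside the 30-day window
]
today = date(2015, 9, 10)
flagged = hufpaf_filters(reports, today)  # only filter 666 reaches the threshold
```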

Thanks for your thoughts. I broadly agree with you, but the reason that "a small percentage" was added was that different filters have differing numbers of hits. Some may receive hundreds a day while others only get tripped a few times a month. I think it's fairly obvious that 1 false positive a month means very different things for these two examples. I do think I agree with your splitting up of the two sentences though, and can't think of an example where a good-faith catching filter should be in disallow mode (though throttle is sometimes used on what would be classified individually as good faith edits), so could agree with saying never without consensus. Sam Walton (talk) 18:21, 10 September 2015 (UTC)
Yeah, I figured that glomming the good-faith-stuff in with the disallow-stuff was a mishap of being a rough draft; thanks for fixing it, by sentence-splitting, when you are comfy with your newly-worked-out-double-sentence-prose. Now, on the other matter, I do still think that 'existing' abusefilters don't have enough process, under the vague 'very small percentage' rules. Here is the original section of edit-filter-reviewing-mechanisms.[4] It was deleted as too-process-heavy. And almost certainly was.  ;-)
    But as you prolly know, the worst abusefilters are the ones where they disallow 1000 bad-apple edits a month... and that same month disallow 50 good-faith editors. That's quote unquote a very small percentage. But still an unacceptable regex, in my opinion. Sometimes this cannot be helped: there are a lot of false-poz reports when people are trying to update articles about rapper-discographies, and get a false-poz when the name of the song contains some kind of ethnic slur or expletive. I'm expecting those will get consensus to remain high-false-poz... but maybe I'll be surprised, and somebody will tweak the abusefilter to be more lenient on articles that have talkpages marked with the wikiProject:RapMusic template, or something intelligent like that. Anyways, my main point is that without a quantitative absolute-non-relative threshold, there is little hope that existing abuse-filters will get attention, like exempting specific rapper-articles from the abusefilters related to specific slang-terms. The squeaky wheel gets the grease: I'm attempting to inject some squeak, in the hopes it will attract grease. 75.108.94.227 (talk) 18:37, 10 September 2015 (UTC)
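To make the relative-versus-absolute point concrete, here is a small illustration. It uses Python's `re` module as a stand-in for the filter rule language (an assumption; real filters are written in the extension's own syntax), and `badword` is a placeholder for an actual slur or expletive. The share is computed over all disallowed edits, which is one possible reading of "percentage".

```python
import re

# Hypothetical word-list filter; "badword" stands in for a real slur or expletive.
PATTERN = re.compile(r"\bbadword\b", re.IGNORECASE)

edits = [
    ("vandal", "badword badword badword"),
    ("good-faith", "Added the single 'Badword Anthem' to the discography"),
    ("good-faith", "Fixed a typo in the infobox"),
]

# The pattern matches strings, not intent: the song title is caught as well.
disallowed = [(kind, text) for kind, text in edits if PATTERN.search(text)]


def false_positive_share(true_hits, false_positives):
    """Fraction of all disallowed edits that were false positives."""
    total = true_hits + false_positives
    return false_positives / total if total else 0.0


# The example from the comment above: 1000 bad edits and 50 good-faith edits disallowed.
share = false_positive_share(1000, 50)  # 50 / 1050, a little under 5%
```

A share under 5% can pass as "a very small percentage" while still representing 50 blocked good-faith edits in a month, which is the gap an absolute threshold is meant to expose.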
Regex not Turing complete? In this context true, but neither is any other language since we have finite storage to work with.
And it's as capable of passing the Turing test as the Chinese room, trivially.
It's certainly true that the FPs need attention, and systematic and systemic attention too. I'm hoping that the combination of the noticeboard and the mailing list will help.
There is potential benefit to maintaining charts of reported fps. But there is also the risk of tampering.
All the best: Rich Farmbrough, 21:50, 10 September 2015 (UTC).
For what it's worth, by the way, I don't think (and variations of this opinion have been echoed above) that some official process for judging current filters would be useful. Edit filter managers periodically monitor other filters and I don't think there's a major issue in this area that a bureaucratic process would fix. I think I may propose an in-depth review of all active filters sometime soon, but only as a check that they're all functioning as intended and the edits being caught are the intended targets. Sam Walton (talk) 18:41, 13 September 2015 (UTC)

Ready to be proposed?

While there are a few outstanding discussions, primarily whether granting the edit filter manager user group should be handled at WP:PERM, I think they should be handled separately from the proposal of this draft becoming a guideline. I think we're at a good stage with this now where we've laid out the community's wishes ready to place them in an official guideline which can be the base for further proposals, and I'd be interested to hear everyone else's thoughts. Sam Walton (talk) 18:48, 13 September 2015 (UTC)

Agreed. The move to WP:PERM can be taken care of independently at WT:PERM or at WP:EFN. The guideline otherwise covers all the major issues that have thus far gone undocumented. I assume the next step is a proper RfC? We can also look into adding a watchlist notice. MusikAnimal talk 17:37, 15 September 2015 (UTC)