Wikipedia talk:Pages needing translation into English/Archive 4

From Wikipedia, the free encyclopedia

Request to translate Slavery in Malta

I saw a post on WP:Human_Rights earlier today by Xwejnusgozo requesting a translation of this article (I assume from French to English), so I searched for a project related to translating articles and found this one. I wanted to share the request here, because I thought there might be more of a chance that people in this project could help. It's a pretty long article, so I think it's too daunting for me to imagine tackling it on my own, but I could help work on it with other people if anyone else is interested. I majored in French language and literature at college, but it's been over 10 years since I've really used French on a regular basis. I hadn't seriously considered working on a translation until now, so I have no idea how the process works best. Any and all advice is welcome! :) Permstrump (talk) 20:17, 30 December 2015 (UTC)

Hi. Requests to create articles here by translating articles on other Wikipedias are covered at WP:Translation#Requesting a translation from a foreign language to English. Basically, it involves requesting a new article and proposing that it be created by translating the existing article. —Largo Plazo (talk) 20:43, 30 December 2015 (UTC)

April 2016 - Malaysian articles about law

PNT seems to be experiencing a tidal wave of Malaysian-language articles about law. - HyperGaruda (talk) 05:42, 9 April 2016 (UTC)

After searching around a bit, I stumbled upon several accounts that may belong to one group:

Some other accounts that I have seen in connection to recent Malaysian law listings at PNT:

Maybe a student assignment? - HyperGaruda (talk) 19:35, 7 April 2016 (UTC)

Another one for the list: Kebebasan Bersuara- Perkara 10 (1)(a).+1 - HyperGaruda (talk) 05:42, 9 April 2016 (UTC)

December 2016

Another wave of Malay law-related articles, and a sockpuppet investigation related to one of their creators. --HyperGaruda (talk) 10:14, 10 December 2016 (UTC)

Template to warn off AWB and other bots

Perhaps a part of the boilerplate instructions on the WP:Pages needing translation into English page should be to tag all such non-English pages with a template designed to ward off any deleterious changes by well-meaning bots. (This actually happened recently at L'Immortel [which may now be in Draft space].)

Please see the discussion at Wikipedia talk:AutoWikiBrowser#How do I tag a page to scare AWB away? concerning changes made by AWB (and potentially other bots) that "correct" non-English text into English words. Mathglot (talk) 08:20, 25 April 2016 (UTC)

Largoplazo, I noticed you just reverted my restoration of the entry for Draft:L'Immortel. At the talk page linked to above by Mathglot, editors not active at WP:PNT decided that the article should be moved to draft space. If articles are going to be moved by some people and then have their listing removed from here by others, then I think it is defeating the purpose of having this page at all. Don't you think it is better to restore the entry rather than just hiding the article in the draft space for six months? AtHomeIn神戸 (talk) 00:36, 26 April 2016 (UTC)
@Athomeinkobe: I just left a reply on your talk page, but to respond here now that I understand the situation: OK, then the other editors were mistaken (assuming being in another language was the only rationale given for moving it to Draft space); the article should be moved back to article space; and then it should be put through the process here. —Largo Plazo (talk) 00:41, 26 April 2016 (UTC)
User:Mathglot says at Draft talk:L'Immortel that the article appears to be original research. Looking at it myself, I see that that's true: It's a blow-by-blow elaboration of the plot of the novel. WP:PLOT and WP:PLOTBLOAT scream for a drastic reduction in the plot description, with some attention given instead to expound on the significance of the book. Under the circumstances, the text doesn't qualify to be an article anyway—and there's no reason why anyone should be asked here to devote any effort to translating it anyway, especially the gobs and gobs of words that should be excised. —Largo Plazo (talk) 00:49, 26 April 2016 (UTC)
Yes, I agree that in this particular case the article is information overkill that would give even a paid translator a headache. My concern is not for the fate of this particular article, but rather the fact that if decisions like the one to move to draft space are made away from this page without keeping a way of finding the page after the move, then the system loses its value.
On the one hand, moving a non-English article like this to draft space is perhaps better than the current method of deleting after listing for three weeks. It gives the author (and regular helpers) plenty of time to tinker and get an article ready without time pressure. But if we are to do that, we need a space like the current WP:PNT to keep track of articles requiring the assistance of the regular translators. AtHomeIn神戸 (talk) 01:43, 26 April 2016 (UTC)
@Largoplazo and Athomeinkobe:
> "Under the circumstances, the text doesn't qualify to be an article anyway..."
I agree
> "...and there's no reason why anyone should be asked here to devote any effort to translating it anyway"
strongly agree
> "...gobs and gobs of words [...] should be excised"
agree.
Beyond that, irrespective of improvements and/or translation, I don't think the topic meets notability guidelines for a book on FR WP, let alone here, as it meets none of the conditions listed in the five bullet points at the top of the Book notability guideline page. For me, that makes this a Delete recommendation.
But we're getting rather far afield from issues concerning a Template for AWB bot, and I'm wondering if we shouldn't port this discussion over to the Draft_talk:L'Immortel page where it more properly belongs? If no one objects, I will do that (and link to it from here). Mathglot (talk) 03:25, 26 April 2016 (UTC)
Bump- @Largoplazo and Athomeinkobe: So, no objections to moving this? Mathglot (talk) 05:05, 1 May 2016 (UTC)

So, page Template:Bots gives the following method to stop AWB: {{bots|deny=AWB}}. Mathglot (talk) 19:03, 2 October 2016 (UTC)
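For anyone writing a bot or script that should honour this tag, here is a minimal sketch of the check in Python. It is not AWB's own implementation: the function name `denies_bot` is hypothetical, the parsing is deliberately simplified (no nested templates, and any `{{nobots}}` is treated as deny-all), and the `allow`/`deny` handling follows my reading of the Template:Bots documentation.

```python
import re

# Simplified matcher for {{nobots}} and {{bots|...}}.
# Assumption: the template call contains no nested templates.
_BOTS_RE = re.compile(r"\{\{\s*(nobots|bots)\s*(\|[^{}]*)?\}\}", re.IGNORECASE)


def denies_bot(wikitext: str, bot_name: str) -> bool:
    """Return True if the page text asks bot_name to stay away (hypothetical helper)."""
    name = bot_name.strip().lower()
    for match in _BOTS_RE.finditer(wikitext):
        template = match.group(1).lower()
        if template == "nobots":
            # Simplification: treat any {{nobots}} as "deny all bots".
            return True
        # Parse "key=value1,value2" parameters separated by "|".
        params = {}
        raw_params = (match.group(2) or "").lstrip("|")
        for part in raw_params.split("|"):
            key, sep, value = part.partition("=")
            if sep:
                params[key.strip().lower()] = [
                    v.strip().lower() for v in value.split(",")
                ]
        denied = params.get("deny", [])
        if "all" in denied or name in denied:
            return True
        if "allow" in params:
            allowed = params["allow"]
            if "none" in allowed:
                return True
            if "all" not in allowed and name not in allowed:
                return True
    return False
```

With this sketch, `denies_bot("{{bots|deny=AWB}}", "AWB")` is true, while the same page text does not exclude a bot with a different name.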

Why PROD?

The article رِۆلی گومرگ له‌ ژیانی ئابووری و كۆمه‌ڵایه‌تی و ته‌ندروستی له‌ناو كۆمه‌ڵگادا, listed here on April 11, was PRODded on April 25 after being here for two weeks. This is the usual procedure. Today, April 30, User:Salter212 removed the PROD tag. User:Widr restored it.

Once someone has removed a PROD tag, it shouldn't be restored. Unless an exception I don't know about is spelled out somewhere, Widr's action was incorrect. On an earlier occasion when a PROD was removed following a similar respite at WP:PNT, I submitted it to AFD. But that seemed a waste of time, given that the Delete outcome is supposed to be a foregone conclusion.

This led me to wonder why we don't just use speedy deletion criterion G6, "technical deletions", also known as "housekeeping". This seems like something that would have been discussed already, early on, though. Has it been? —Largo Plazo (talk) 19:51, 30 April 2016 (UTC)

I rolled back all edits by the now-blocked user. You can remove it again. Widr (talk) 19:56, 30 April 2016 (UTC)
@Largoplazo: The removal of PROD templates is only valid if the original concern has been addressed and the article was improved. In this case, there was a user who went on a template removal spree without providing any reasons for their actions. While a de-prodded page that still has issues should normally be listed for a full deletion discussion, I don't see anything wrong here with Widr restoring the PROD for procedural reasons. De728631 (talk) 20:04, 30 April 2016 (UTC)
To clarify: "The removal of PROD templates is only valid if the original concern has been addressed and the article was improved." That pertains to maintenance tags, not deletion tags. Your remark contradicts WP:Proposed deletion: "Any editor (including the article's creator) may object to the deletion by simply removing the tag; this action permanently cancels the proposed deletion via PROD." The guideline defines no distinction between valid and invalid reasons for objecting, and imposes no requirement that the issues expressed by the nominator have been addressed. The person removing the tag has only to object to the deletion.
The guideline does provide the exception to which you refer: "This excludes removals that are clearly not an objection to deletion, such as page blanking or obvious vandalism, and tags removed by banned users may be restored." Based on Widr's explanation and yours, this was the case here. That's fine, no problem. I have no further concerns about the restoration of this particular PROD tag.
The question that this situation inspired remains, though! In general, anybody can remove, in good faith, the PROD tag placed on an expired WP:PNT article. So we have to treat the matter as controversial, and drag it through a discussion. But it really shouldn't be controversial. A process has been followed already. The article has been posted for two weeks of scrutiny. Due process has been followed. So there shouldn't be any objection to deleting PNT-expired articles speedily. —Largo Plazo (talk) 21:58, 30 April 2016 (UTC)
  • I've been browsing through the archives and found that, in the beginning, the policy was to nominate pages directly at VfD (AfD's former incarnation) after the two-week grace period had passed. There was a discussion in 2005 to change this into speedy deletion, but after Jmabel mentioned a 15% save rate for untranslated VfDed pages, the opposition grew to roughly 50%. Another frequently used argument was that people needed more time for translations. The proposal failed, but VfD was replaced by PROD not much later. That all was 10 years ago; I can imagine that things have changed by now. - HyperGaruda (talk) 06:31, 1 May 2016 (UTC)
  • I disagree with the assertion that the time it takes to translate text is an excuse for keeping an untranslated article longer. I think the current period of three weeks (two listed at PNT plus the one-week PROD) is generous. From my own observation, sometimes articles get translated soon after they are prodded, which suggests some editors must keep an eye on prodded articles but not PNT. In any event, I do not think speedy deletion should be used. The placement of a PROD template can be considered a "final warning" to the author (or any other editor), giving them one week to fix the problem. On the other hand, if speedy deletion is used, the article could be deleted before the author sees the warning (if they are given one).
If anything, the original two-week period at PNT seems too long for me and it could be reduced to one week. Most articles listed at this page are barely two paragraphs long and do not require days of work to translate; they need someone who knows how to do it and is willing to spend an hour on it. For longer articles, if someone actually wants to translate the article they can move it to draft or user space, which is where these non-English articles really belong in the first place. AtHomeIn神戸 (talk) 02:16, 6 May 2016 (UTC)

Special PNTPROD

Since there's also a BLPPROD, maybe we should create a PROD for PNTs specifically, one that can only be removed if the page is translated or adopted/userfied/draftified. Any thoughts on that? - HyperGaruda (talk) 06:31, 1 May 2016 (UTC)

Don't think it's worth having a special prod just for this, at a given time there are rarely more than 2 dozen untranslated articles--Jac16888 Talk 21:36, 2 May 2016 (UTC)
I agree with Jac16888. The regular prod does the job well enough. I seem to remember that BLPPROD was only introduced because the rule concerning mandatory reliable sources for new BLP articles was new at the time. De728631 (talk) 21:49, 3 May 2016 (UTC)

June 2016 - Machine translations

In my opinion, obvious machine translations of other-language wikis should be treated as if they are still written in a foreign language. That is, listed at PNT for 2 weeks (perhaps under a separate section) and then PRODded. Some of you might think: why not just extend WP:A2 to machine translations? Well, there was a discussion about that a few years ago (Wikipedia talk:Criteria for speedy deletion/Archive 43#A2 and Machine translations), but the idea was turned down.

Unlike ordinary rough translations, deletion of pages machine-translated from other wikis does not cause a loss of information - it is still there at the other wiki. This way we can turn WP:MACHINETRANSLATION into practice, and I think this can clear our backlog considerably. Any support for this? - HyperGaruda (talk) 10:23, 15 June 2016 (UTC)

I agree that at the moment we have a problem where horrible translations are left to rot for months because they fall through gaps in the deletion criteria. EFX The Chosen, which I have just finished addressing at this entry, is an interesting case because the same user created the Spanish wiki article after originally posting the same contents to the English wiki then translating it. Questions of notability aside, I see no point in keeping such a bad translation for months or years until somebody decides to clean it up.
As an alternative idea, assuming that a topic is quite clearly notable, we could create a brief article in legible English to establish that notability then use the "Expand (language)" template to show that more information is available at a foreign-language wiki. AtHomeIn神戸 (talk) 03:11, 16 June 2016 (UTC)

New template: Needs English sources

With ample help from User:PanchoS, I've set up a template called {{Needs English sources}}. Articles transcluding this template are added to Category:Articles needing sources in English. This is useful for any articles you translate that also have mostly or entirely non-English sources. The validity of sources is independent of the language they're in, but valid sources in English are still preferred when available (see WP:NONENG), so it's a good idea to flag articles that can use some. Largoplazo (talk) 17:39, 26 July 2016 (UTC)

The template was deleted for policy reasons but what I usually do in this situation is use one of the general citation sources at article/section or inline level, such as {{Refimprove}} (with a reason= param) or {{Better source}}, or one of these:
I'm pretty sure there's a statement on a guideline page somewhere that states that "English sources are preferred" and you could link that in your "reason" param. Mathglot (talk) 19:19, 2 October 2016 (UTC)
Here it is: WP:NONENG: "Citations to non-English sources are allowed on the English Wikipedia. However, because this project is in English, English-language sources are preferred over non-English ones when available and of equal quality and relevance." Mathglot (talk) 04:05, 3 October 2016 (UTC)

Nomination for merging of Template:Rough translation

Template:Rough translation has been nominated for merging with Template:Cleanup translation. You are invited to comment on the discussion at the template's entry on the Templates for discussion page. Thank you. Pppery (talk) 23:35, 10 August 2016 (UTC)

Google translate blacklisted

Others may have noticed, but Google translate was blacklisted as a spam website yesterday, because it was being used to circumvent the blocking of other blacklisted sites. This means that languages cannot be added to the Not English template. I have raised a concern about it at the blacklist's talkpage. AtHomeIn神戸 (talk) 01:36, 18 August 2016 (UTC)

Completed pages

@SimonTrew, HyperGaruda, Elinruby, De728631, LargoPlazo, Jac16888, Athomeinkobe, JohnCD, and I dream of horses:

Hi. I'm proposing a change to how entries about translated articles get removed from WP:PNT, to make it more subject to consensus. As things stand now, removal is arbitrary.

Summary

In brief: I'd like to see entries on PNT about a translated page that looks good to go be moved down to a staging area section called Completed pages before being deleted from PNT. Any editor could decide to move an entry there. Once there, entry deletion is open for discussion. If no discussion after a certain interval, any editor could delete the entry, simultaneously removing the Translate banner from the underlying article page.

tl;dr: HyperGaruda has come up with what I think is a better proposal; see the #Threaded discussion below for their comments. This tl;dr comment interpolated by Mathglot (talk) 19:00, 25 September 2016 (UTC)

Details

My concern about the current procedure is that entries sometimes get removed from the page too soon, when translation or cleanup work on the underlying page hasn't really been completed and the translation quality is not up to par. Under the current system, any editor can simply delete the entry without discussion; that seems too arbitrary to me and doesn't adhere to the spirit of WP:CONSENSUS.

Once an entry on PNT is removed, it doesn't draw the eyeballs from other editors who might have offered their input about translation quality (unless they look in the History) before deciding that an article is good to go or not. As long as an entry about a translated article still is present on the PNT page, there is sometimes discussion about it. Much more often there isn't any discussion, but the point is that as long as you can see it, you can choose to discuss or not to discuss. If it's removed from the page, you don't have that opportunity anymore.

Scope

This change in procedure would affect only those pages whose translations have been worked on after being listed. Pages listed at WP:PNT that just stagnate without any attention or copyediting to clean them up, or that have had some cleanup but not enough to merit being considered "done", would either just sit there, or continue to get removed however they do now.

Procedure summary

So, to summarize the proposed procedure for migration of page entries on WP:PNT:

  1. a page needing translation has an entry added to WP:PNT (no change from current procedure)
  2. one or more editors make changes and improvements to the article (no change)
  3. any single editor decides the translated page is up to par, and removes it from WP:PNT (old procedure)
  4. any single editor decides the translated page is up to par, and moves it to section #Completed pages (new)
  5. If discussion there ensues:
    • consensus that translation is good: any editor removes it from WP:PNT, and removes the rough translation banner from the article page
    • consensus achieved that translation is still not ready; any editor moves entry out of #Completed_pages, back up to the appropriate section
    • no consensus should probably err on the conservative side: i.e., since at least some editors would like to see more work done, they should be given an opportunity to do so, and the entry should be moved back up and out of #Completed_pages
  6. If no discussion ensues:
    • after a suitable interval, any editor removes it from WP:PNT, and removes the rough translation banner from the article page.
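
Steps 5 and 6 above amount to a small decision rule, which can be sketched in Python as follows. This is only an illustration of the proposal: the names `Outcome` and `resolve_completed_entry` are hypothetical, and the length of the "suitable interval" is left to the caller because the proposal does not fix it.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    """Possible results of a discussion in #Completed pages (hypothetical names)."""
    GOOD = "consensus that the translation is good"
    NOT_READY = "consensus that the translation is not ready"
    NO_CONSENSUS = "no consensus"


def resolve_completed_entry(outcome: Optional[Outcome], interval_elapsed: bool) -> str:
    """Decide what happens to an entry sitting in #Completed pages.

    outcome is None when no discussion took place (step 6);
    interval_elapsed says whether the "suitable interval" has passed.
    """
    if outcome is None:
        # Step 6: no discussion, so removal only needs the interval to pass.
        return "remove" if interval_elapsed else "wait"
    if outcome is Outcome.GOOD:
        # Step 5, first bullet: remove the entry and the banner.
        return "remove"
    # Step 5, second and third bullets: not ready, or no consensus
    # (which errs on the conservative side).
    return "move back"
```

For example, an undiscussed entry whose interval has passed resolves to "remove", whereas a discussion that ends without consensus resolves to "move back".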

Sample page for viewing

To make this more concrete, I have included a real-world example. IMHO, the article L'Obs is now ready for prime time. So, I have created a sample page with a new #Completed pages section as a container to hold it and other entries, and moved the entry for L'Obs down to that section, so you can see what it would look like. This can be viewed in this sample page.

This was prompted by my having received a ping alert-notice on a change to PNT by user A, and by the time I went to look for it (same day) that entry was already deleted by user B. That's too soon, and too arbitrary.

Appreciate your thoughts on this. Mathglot (talk) 18:30, 23 September 2016 (UTC)

Threaded discussion

  • Honestly, I think it's needless double handling and we have enough of a backlog on this page as it is, without checking everything twice. Any editor is free to undo the removal of a rough translation tag or removal of an item from this page, anything beyond that is just not necessary--Jac16888 Talk 19:11, 23 September 2016 (UTC)
    • The problem is that a large number of the deletions that are going on are done with no opportunity for discussion at all. In some cases, I have been alert-noticed by one editor to look at a PNT discussion, and by the time I get there, the entry is already gone. I looked at a few of the deleted entries, and I don't agree with all of them. The alternative to the proposal would be to go into history and undelete some of the deletions that have already been made that in my judgment were made improperly. That seems highly likely to lead to edit warring, since by definition an undelete would mean a disagreement between editors and a lot of extra work. The whole point of the proposal is to follow guidelines about consensus and to avoid having editors make unilateral decisions about this, to avoid the painstaking work of having to examine history and do diffs to see if you agree that each unilateral deletion involves a page that was properly translated, and to make things simpler. Otherwise, you're privileging any editor who wants to delete entries, with or without reason, and with no consensus. I think that gives too much power to individual editors and is contrary to the spirit of the Project. Mathglot (talk) 23:30, 23 September 2016 (UTC)
      • I disagree with all of this. Undeleting would never be considered edit-warring unless the remover and the restorer kept doing it, and there has never been an edit war of that type on this page. There is no rule anywhere on Wikipedia that says an editor cannot remove a tag, just as there is no rule that says an editor cannot restore a tag, that's all that is happening here, an editor sees that there is no longer a need for a tag and removes it, the only difference being that the page is listed here as well--Jac16888 Talk 09:38, 24 September 2016 (UTC)
      • An individual editor has the power to create a brand new article in poorly written English that they have translated from elsewhere. If the flow, instead, is that an article is posted on this Wikipedia in another language and then translated by an individual editor, I don't see how the reversal of events incurs the need for a group review that didn't exist in the first case.
I have occasionally translated an article but realized that there were some holes in my competence to do so. In those cases, I've moved the article to the Cleanup section myself. There's no need for a special section for that.
I'm envisioning an ever-growing backlog of pages no one ever marks "complete", many of which were fine in the first place. If you want to give them your own review, you can always scan the history for reductions in the number of bytes on this page and check the diff to see if there are any pages you want to glance at. Largoplazo (talk) 23:00, 25 September 2016 (UTC)
  • Meh, this is a bit too much extra bureaucracy for this already way too backlogged page. Alternatively, you could just add a "Done" comment, and let someone else remove the entry on PNT, if no further discussion takes place within a reasonable amount of time - let's say 48h? That way, there are at least two people who have looked at the affected article (the translation improver and the one who checks if it really has been improved), in addition to the tacit approval of those who watch this page. --HyperGaruda (talk) 04:52, 24 September 2016 (UTC)
  • People using the done tag already takes place, but often articles are tidied by editors who aren't even aware of this page. I believe adding any kind of firm rule to this is just unnecessary--Jac16888 Talk 09:38, 24 September 2016 (UTC)
I have not seen the "done" tag used in most cases where entries have been deleted from PNT, but if we can move to a system like HyperGaruda's proposal, that works for me, as it offers the possibility of discussion and consensus for an interval, and avoids the problems of arbitrary deletion. And it doesn't matter if other editors tidy a page; if you see that they have, just add the check mark and it still works. HyperGaruda's proposal is a lot simpler, and solves the existing problem. I'm for it. Mathglot (talk) 15:06, 24 September 2016 (UTC)

I can't remember ever seeing a done tag. I'm not necessarily against this proposal, but I have had a note on one article asking someone to decide if it was done, since I had spent way too long on it and felt it should be assessed by fresh eyes. And it's been months. Also, we need a definition of "done"... the translation is done, the article has no issues with tone, references, inbound links? These are different criteria... Don't get me wrong, I think this could have better workflow; pending further discussion I am not sure this is it. Elinruby (talk) 20:43, 24 September 2016 (UTC)

The definition of "done" that I had in mind was "translation done" and that's it. If the original is full of factual errors, missing references, original research, no inlinks, or any other problems, then in a faithful translation, those would get carried over. We're talking about a check mark on the PNT page, and in that context it should mean only "translation done". All the many other ways of improving an article are to be desired, but not relevant here, imho.
P.S. If you can leave a pointer to the note you are talking about, if it's one of my languages I'll have a look. Mathglot (talk) 21:29, 25 September 2016 (UTC)
And why do we need this? The current process is that pages are either PRODded for lack of any translation, or when the translation has been done including factual errors, poor referencing and the like, the entry gets removed from the list. Insufficient translations like machine-translations or other gibberish are routinely added to "Pages needing cleanup" so I don't see why we would even need a "Done" tag. De728631 (talk) 23:34, 26 September 2016 (UTC)
  • I use the {{done}} tag quite often for exactly the reasons stated above. I dislike that we don't archive PNT listings/discussions, but that's by-the-by. Perhaps a new section is too much, but a "done" is hardly a burden. Sure, it shifts the deletion/move task off me onto another editor, but that second set of eyes is important, I feel. Si Trew (talk) 06:32, 3 October 2016 (UTC)
    • I too dislike that detailed discussions on the contents of the article or its translation can suddenly disappear, such as this one which was removed over the weekend. I think it should be pasted to the talk page of the article rather than simply vanishing from the PNT page. In this instance I have copied the text to Talk:Ana Garrido Ramos#Discussion regarding initial translation. I don't actually intend to "close" the discussion, so if there is a better way to highlight the portion which has been moved without discouraging further discussion on the topic, please make any changes that are necessary. AtHomeIn神戸 (talk) 07:26, 3 October 2016 (UTC)
Yeah I was surprised that one disappeared when I don't think you or I had quite "completed" it yet. Thanks for moving it to the talk page. Perhaps that is the best solution in these cases, but had I just done it myself after my usual rambling translator's comments, I would at least have left a pointer to it at the PNT page so that others (e.g. you) could find it, and if that got deleted we would both be in the "What the — was that article called? Oh, I have to check history to remember..." (Yes, I could check my contributions instead, but I'm currently going through the backlog of mass-created redirects at User:Anomie/Neelix/List 4 and so on, so my contributions list tends to get very long with very short contributions to numerous pages, so that I can't see the meat for the rice.) Si Trew (talk) 07:39, 3 October 2016 (UTC)

Translation attribution and copyvio

Something Elinruby said above reminded me of the attribution requirement per WP:Translation to mark the Talk page of translated articles with {{Translated page}} when the translation is done. I checked five translated articles corresponding to entries previously on WP:PNT and in only one case was this requirement observed:

The translating user(s) is(are) responsible for this, but since PNT is being used to track these, it wouldn't hurt for us to monitor this requirement as well. (This isn't a guideline, but a requirement, due to the possibility of copyright violation.) Mathglot (talk) 22:16, 25 September 2016 (UTC)

Huh? Ideally this tag should never apply to pages translated via WP:PNT: if the page has been copy-pasted from another Wikimedia project, it can be speedily deleted from EN:WP per WP:A2, "foreign language articles that exist on another Wikimedia project". The person listing it at PNT could do a quick search to see if the text occurs in another Wikipedia, before they list it at PNT. But more commonly, the content in the article is not a (partial) copy-paste from another Wikipedia, so the {{translated page}} tag is inappropriate.
For example Henri Rol-Tanguy has existed in English since the last day of 2004 and its WP:PNTCU listing ({{cleanup translation|fr}} tag) was in January 2016 (by you, User:Mathglot). The only differences between the tagged version and when the tag was removed in September by User:Jac16888 (diff here) – after I had given the all-clear at WP:PNTCU, as your link shows – have nothing to do with translating from the French WP, so the {{translated page|fr}} tag would be entirely unwarranted merely by its listing at PNTCU. If it needed such a tag, it needed it even before it was at PNTCU. It has nothing to do with PNT.
Similarly, Tom Nordtvedt and Unión de Todos do not have any Interwiki links so I am not sure what you would expect the {{translated page}} tags to contain.
Charanguita is my tr. from the middle of last month and I put my hands up to that one that I just forgot, will add it now. However, it wasn't me who deleted it from PNT in the first place, but User:De728631. Similarly, there was probably good reason I didn't remove the {{not English}} because I was expecting that to be done by the person "closing" the PNT (otherwise, how does anyone know that it is listed at PNT?) This adds weight to those who say that things should not just magically disappear from PNT without checking (i.e. the #Completed pages discussion); I probably just completely forgot once it disappeared from PNT, to the extent I didn't remember even now that I had translated it until I checked the history. From now on, though, I'll add the {{translated page}} tag first, rather than last. I don't think I can find out whether the IW link was present (or added by me) when I translated it, that's one downside of Wikidata rather than having the IW links embedded in the text. But will tag to be on the safe side. Si Trew (talk) 11:07, 3 October 2016 (UTC)
@Mathglot: {{translated page}} is not a hint at the fact that the original version of the page in the English Wikipedia was not written in English but it serves to attribute content that was first translated off-wiki from another Wikipedia and then added here in (hopefully proper) English. It's just a note to retain the attribution of authors that is required for the Creative Commons licence of all written content on the source page. So, we at PNT should not use this template at all unless it is clear that certain phrases or sections were copied and pasted from another language WP and were afterwards kept in the English article in their translated form. De728631 (talk) 19:04, 4 October 2016 (UTC)
OKay thanks for the clarifications. Mathglot (talk) 04:59, 5 October 2016 (UTC)
Thanks all for that. {{translated page}} also puts it in a small little box unless you put small=no. It used to be the same size as other boxes on the talk page, and I argued against changing the default: or rather, the small= parameter was introduced withou discussion and suddenly when I was translating lots of Hungarian articles and attributing them I did a double take to find where I had put the things. It is not as if it says translated by User:SimonTrew, please bow and pay penance to the lord and chief, it just needs to be as prominent as other on the talk page. I might boldly change it. But as usual, we translators get pushed aside, because translation is easy right? Even Google Translate can do it. Si Trew (talk) 12:21, 21 October 2016 (UTC)
oh, and I can't change Template:Translated page because it is protected. Welcome to Wikipedia, the encyclopaedia that anyone can edit. Why was it protected? Oh probably because I suggested a change that was not discussed and was dismissed without discussion and then protected. The talk page is protected too. Si Trew (talk) 12:26, 21 October 2016 (UTC)

monolinguals and translation copy-editing

@SimonTrew, De728631, Athomeinkobe, HyperGaruda, Elinruby, and Noyster:
I'm running into a translation issue on some of the articles I've been working on that I'd like to raise here for discussion, because I'm really not sure what the best approach is.

The issue concerns articles in poorly translated English that are then partially or fully copy-edited into good English by editors who don't speak the source language. I'm starting to wonder if this is actually a good idea or not. The problem in a nutshell is that copy-editing machine-translated cruft into good English certainly makes the article read a whole lot better, but may obscure an inaccurate translation that completely misstates facts.

When I work on a translation listed at PNT, I usually scan the article looking for what areas need work. I usually start by looking for the most poorly written sections. I figure those are the ones that need attention, because they look like they haven't been worked on yet. The portions of the article that seem to be in perfectly grammatical, smoothly flowing English I usually skip, because they seem like they've already been worked on and don't need my help, and so I'd be wasting my time to go there. Kind of by accident, I found out recently that that's not always the case at all.

Sometimes a machine translation (or other poor translation) is understandable but just plain wrong or unfaithful to the original, because the state of the technology just isn't there yet to ensure accuracy. Copy-editing a rough translation that contains a factual error into proper English merely sets the wrong translation in stone, and makes it less likely to draw attention from others. Other times, a snippet of rough translation is barely intelligible, and a good-faith copy editor may take their best guess at what it was trying to say, and turn it into something completely different from what the original was saying.

I've been looking at a translated article where some of the sentences say the direct opposite of what the original source says, and where others seem to be talking about some other topic than the original, or even refer to people that don't exist in the original.

If the text were left in rough, obviously wrong English, at least that would catch the eye and draw a translator to that portion to review. But if the English looks perfect, there's nothing to indicate that it needs attention, and so either you won't notice it at all, or only by accident, as happened to me recently.

Machine translation keeps getting better, and already it can often generate a lot of factually accurate and understandable English, even if the phrasing is a bit wooden or awkward. In those cases, simply tidying up the English so it reads better is sufficient. But how often are the machine translations basically accurate? Should we accept a certain percentage of factually incorrect statements in translated articles on English Wikipedia, as "the price of doing business" using machine translation + monolingual copy-editing? And what percentage? What if 80% of an article accurately represents the original, but 20% is just wrong, is that acceptable? What about 95% to 5%? 99 to 1?

Note also that in some cases, the smoothly flowing but mistranslated statements may even have references carried over from the original source that appear to verify them. In these cases, however, the references don't verify the statement: they verify the original but contradict the translation. But who's to know, especially if the citation is in some other language?

On the one hand, there are a whole lot more monolingual editors willing to do copy-editing than there are translators, and I suspect the output volume of this project would decrease if we don't take advantage of them. On the other hand, are we sacrificing accuracy on articles on English Wikipedia if we do?

As far as working on translations, personally I'd still rather see Google's rough cut, even unintelligible text, than smoothly flowing English that is completely wrong, because at least in the former case I know there is work there to be done, whereas in the latter case, I might just skate right past it. On the other hand, I can only work on a very limited number of articles.

What do others think about this? Mathglot (talk) 01:25, 1 October 2016 (UTC)

Unfortunately this is a well-known issue at PNT. Years ago, @Jac16888: made {{No-rough}} which can be used to notify editors after they have added a "rough" translation to the original article. Maybe we should include this template's text into {{uw-notenglish}} so people are warned early not to use machine translations if they aren't capable of translating their text properly. A more radical measure would be treating poorly translated pages as we do with pages that have not been worked on at all: prod them after 14 days. This would also save us the cleanup effort and, in the end, one could claim that gibberish that was produced by a translation program (or a monolingual speaker) is not a translation at all, so the page should be deleted. Not sure though if the latter approach would need community consensus before being implemented. I seem to remember a Village Pump poll of sorts where a proposal of introducing speedy deletion for non-English articles was turned down and I was among those to !vote for keeping the 14-days "standard offer" of PNT, but it may be time to pull the brake at least in terms of quality control. De728631 (talk) 02:23, 1 October 2016 (UTC)
The thing is, one man's gibberish....there's the thing. There was just a discussion about machine translation on some administrator list whose name escapes me (I got there from following someone's RfC-gone-bad if that helps)...anyway, many people were horrified that this is a thing now on wikipedia in certain languages, and apparently a few users did make hundreds of bad machine-translated articles by which they could be horrified. My take is this: yes there is such a thing as bad translation, absolutely, however... I use the machine translation tool myself. And my translations speak for themselves. At least, ok, I hope and believe that they do. Nobody came back and said yeah sure but your translations suck. Or we found this error, even, or whatever. But the stuff that winds up on the cleanup translation list *also* tends to lend itself to highly arcane questions, which can go by even a native or near-native speaker. For example Mathglot checked something I did at my request[1] and mentioned that he looked up "closed village" which had struck *me*, as someone who grew up immersed in that culture, as something completely obvious.[2]

Or, in the really bad articles where you are essentially rebuilding, you run into questions like "why does Castellane say that Boniface de Castellane won a battle in 852, when House of Castellane says the first baron was in 11 something something?" As another example, a lot of the original Castellane article seems to be a direct translation of an early-19th-century history which says stuff like "Monsieur X's property is part of the baronetcy of Roger d'Estaing de l'Alouette who as we all know was also the Duke of Nantes and therefore heir to the king of Sardinia, but that was before he became a Huguenot" Or it references Alfonse I about 300 years too late; aha Alfonse I Duke of something who became Alfonse III of Provence when his daddy the king of something else gave it to him....I made those up as an illustration, but you get the idea. I don't want to even discuss what I went through to verify that Brun de Caille was a person. Now then. Does translation in this instance include explaining why "something something, but protestants had blown the roof off in the previous century"? I dimly remember from my time at a lycée oh yeah, wars of religion. Why it took a hundred years to fix the roof and what country it was in at the time I am less sure about, but my point here is that there are a pretty small number of people who *can* figure it out. I *guess* this stuff is true. Half the place names are in Occitan and some exist in different forms in different dialects of Provençal. I worked on Aiguilles de Bavella and think the English is fine now. But there is one reference along the lines of "as you can tell from the name, such and such" when the name tells me exactly nothing. Does wikipedia have anyone who speaks Corsican? Who knows?
Some articles sit around for years on the clean-up translation list, so if someone can answer that Corsican question in five minutes then huzzah, better than me spending a couple of hours with reference materials trying to figure out what some other editor was trying to say. I flagged it and went on. But I think that there is something to be said for organizing the effort. Why did I spend a couple of hours translating an article about an African soccer player only to be told he wasn't notable? It's not that I begrudge the time, but I spent it thinking well, there aren't many articles about Africa. That would be one thing to do -- triage articles *before* they make the list. Make a separate draft space for articles that need verification? Sorry to bombard you guys, and thank you for letting me get that off my chest. I realize that I am not quite answering @Mathglot's question but I just want to point out that there are many issues that could be improved with some sort of system of work. Put simply: it would be nice to define where we should start, and also where we should stop. Elinruby (talk) 06:51, 1 October 2016 (UTC)
I've often thought what you're thinking Mathglot, but can't see a simple solution. Yes we could with benefit change the wording in {{uw-notenglish}} from "We invite you to translate it into English" into "We invite you to get it translated into English by a competent translator, not relying on machine translation". The problem is not just machine translation, as there are human editors who, without much knowledge of the source language, think they can guess at the meaning of the original, and end up making more of a pudding than Google Translate would have done! Also, what about the many poorly translated pieces that never get flagged, and the copyeditors never realise that it was a translation? But since this is the talk page for WP:PNT the change I'd make there is to add to "Standard procedures" something to the effect "Only people with good knowledge of the source language should attempt translations or cleanup of poor translations", and include a link to "Translators available" (I've had my name on there for years but hardly received any requests). Oh, and move the table of contents down to the "Pages for consideration" section, so that the advice in "Standard procedures" is not so easy to miss: Noyster (talk), 07:59, 1 October 2016 (UTC)
  • I can see the point about bad facts being hidden under the veneer of good English. (My rough translation of Ana Garrido Ramos from Spanish yesterday might be a good case in point). The thing is, when translating an article from PNT, I always check every reference if for no other reason than trying to add a trans-title to the {{cite}}, and to add dates and other bits and pieces that are often missing: often I might spot a discrepancy that way and also other things such as potential WP:COPYVIOs of paragraphs etc. I think that helps a lot.
I always tend to do the "scaffolding" first (infoboxes, references, general article layout, links, categories etc) and then translate the text afterwards. For me this helps in two ways:
  1. Some other editors hate doing that scaffolding, so even if I only do that much then they can take over (and often do if I abandon/forget the translation) and fill in the meat of the text with English-language sources once the scaffolding is in place.
  2. With the scaffolding in place it's easier for me to edit the article chunk-by-chunk
but that's just my own preference and is complementary to the way others work, not in opposition to it. In short, working together as a collaborative effort is what ensures accuracy, for translated articles no more nor less than any other article.
I use Google Translate a lot, but only as "machine assisted translation", as I use online dictionaries, corpora, RS that are multilingual, etc: especially I use it when I forget the idiomatic English while thinking in French etc.
I would actually like a {{rough translation}} parameter so that if that template were placed on an article by a translator, the message that "it may have been created by a machine translation" could be substituted or suppressed when it is not in fact a machine translation. Admittedly one can add comments but not remove this text, as it stands. A translator could thus make it clear that they may be a subject expert but not a language expert, or vice versa. To expect a volunteer translator to be both (over the entire range of topics we get at PNT) is to expect too much. Si Trew (talk) 18:19, 1 October 2016 (UTC)
These are all great comments (keep 'em coming!) and I'll have more to say about it another time but while I agree it's a very thorny problem, several points you've all made make me think there may be something we can do to improve the situation after all. I'll go more into detail later, but it basically amounts to a division of labor, where editors not fluent in the source lang (and that's all of us, for most sources, unless you're centilingual) can build the "scaffolding" (including the not insignificant effort in converting citations and wikilinks and adjusting See-also's) and then turn it over to the bilinguals, who would pick up the building already framed, and concentrate on doing just the actual text translation. That would let each group concentrate on what they do best, and be a workable collaboration, that could pump out higher quality articles in optimum time, I think. Thanks again for the responses, I feel encouraged.
(P.S. I've set off the response to the "Corsican" question below into its own H2 section so that end of this thread is more evident.) Mathglot (talk) 07:08, 2 October 2016 (UTC)
Thanks @Mathglot:: I took an extended Wikibreak over the summer so had not been following some of the interesting discussions here (particularly the one about #June 2016 - Machine translations and its sibling over at Wikipedia:Administrators' noticeboard/CXT). It tends to confirm me in my feeling that doing the "scaffolding" is valuable work in itself – and one that this CXT tool does not seem to do very well, by the accounts of the various ways it cocks up templates, adds unnecessary HTML/Wiki markup, and so on. I just know both from explicitly being asked, and implicitly from looking at how quickly someone will "pounce", that quite often once the scary scaffolding is in place, other editors are more than happy to fill in the "meat" of the text.
I do like the idea of splitting the roles at least nominally, and thus I can see some merit in the idea of having a separate section, analogous to that discussed at #Completed pages, for "Scaffolding done" ("Pages needing translation after cleanup", perhaps?!) which is kinda the flipside of "Pages needing cleanup after translation" kinda thing, because the text translation might be sadly lacking or rough but all the "cleanup" has been done... I don't want to add to editors' burdens, but I think the separation of concerns is a good thing. It might be hard to categorize these, since they'd probably not be stubs as such but just that the text itself had not been very well scrubbed up. Of course, some translators (including me) are happy to do both, but if we think of those two as distinct stages then it's helpful to stage them that way at PNT. Of course that doesn't help the poor editor who takes a third approach of translating linearly from top to bottom, or some other strategy.
To give you some background, my "major" for my bachelor's degree (we don't have "majors" in the UK but essentially that) was in computational linguistics studying in the early 1990s under one of the best in the UK, at a time when statistical machine translation was really mostly a research project (mainly because online bilingual corpora were not common: IBM had an early system using the Canadian Hansard, which would translate "hear" as "Bravo!" for example because of the expression "Hear, Hear!" used in Parliament a lot). So I learned some Spanish and Japanese and a smattering of other languages mostly to learn about language structures etc, rather than learning vocabulary. (For example I remember that Spanish has different words for external and internal corners but can't remember what the words are.) ángulo and rincón Mathglot (talk) 06:15, 4 October 2016 (UTC)
I live in Hungary (Budapest) and speak basic Hungarian but would not be competent to translate most articles from Hungarian; but there are exceptions when the source is largely a collection of "undisputable" facts (e.g. a geography article) rather than those on subjects which have RS to expert opinions and so on. Hungarian is a good idea of where MT goes badly wrong because it is highly agglutinative, even though word order is mostly the same as in English, and also it omits articles and pronouns usually because the context (conjugations of verbs etc) tend to indicate those and they are only used for emphasis, like Spanish (but not French). Also the online bilingual corpora are not vast, there are really trivial things like having to switch the order of Hungarian names, yes, dates are written in different orders, the use of Roman numerals in dates, addresses etc (you don't really want to find out someone was born on 19 xi, for example) and all that jazz. While a professional machine translation tool no doubt would enable one to store these kind of macro search-and-replace things, on WP it is easier to do it manually and then at the same time actually check the dates and so on, let alone trivia that we can write "9 September 2016" or "September 9, 2016". Some years ago (before I moved to Hungary) I created a version of {{Infobox Hungarian settlement}} ({{magyar település infobox}} IIRC) that allowed instances of hu:https://hu.wikipedia.org/wiki/Sablon:Település_infobox to be copy-pasted directly and would provide the conversions from SI units, add flag icons, and so on: however, this was essentially deleted by a trend to get rid of specialised settlement infoboxes in EN:WP generally, let alone those that were translated versions. Although I only ever intended it as a halfway house to then be bot-substituted, having it deleted set me back a lot of effort and I abandoned my attempt to get infoboxes in for all 2500 or so Hungarian settlements.
So, it seems that the idea of using templates as little hops to do bits and pieces of "scaffolding" translation generally doesn't (or didn't) have much consensus. I did a similar thing for {{Hungarian Revolution of 1848 participant}} so that all the actors in the Hungarian Revolution of 1848 – something the missus and I were translating a lot of articles on at the time – had the right flags in the infoboxes consistent across articles etc. That is such a backwater that it seems still to be in place.
Another major thing of course, which was referred by other editors, is that what may be common knowledge to a Hungarian-speaking community (for example), needs more elaboration to an English-speaking one. Having lived here for four years, and with a Hungarian wife, I am more easily amenable to idiom, usage and cultural norms – and how they might be confusing to an English speaker who has not visited Hungary – than a Hungarian with an English teaching degree who has never left Hungary (I know a couple of those). So sometimes exactly because I am an intelligent but ignorant foreigner I am more qualified to translate some kinds of Hungarian articles than Hungarians are, because I can more easily "see it from the outside": that's where collaboration is useful, no matter how good their English is. For example, very few Hungarians would think to add {{convert}} templates, and certainly a machine translation tool usually won't, any more than the annoying British habit of measuring distances and volumes in units of double-decker buses or areas in units of football fields (Which game? Which field?) or the Size of Wales. (Similarly the American habit of measuring distance in terms of time: "how far is it to New Orleans?" "Oh, it's about six hours". How far away are you if you're Twenty Four Hours from Tulsa?)
In short I can see a lot of good in having a kinda sandpit for editors like me who are sometimes more than happy to do the scaffolding, and who know their way around EN:WP templates, who have a basic competence in the target language and a kinda schoolboy knowledge of a subject, but would not like their translations to be considered "polished" i.e. the point you started with, that just because it is good English can be a veneer over a translation full of dry rot. To me, the words are the easy bit, those are the bits that can be looked up in dictionaries and so on, once the structure is in place. But that probably is because as an engineer and computer scientist that is just my normal way of using structural decomposition to attack a problem, and I quite realise that not everyone would attack a problem that way, nor that it is always the best way.
As usual I have packed the maximum amount of words into the minimum amount of thought. I hope some of it may help when you have "more to say about it another time". Si Trew (talk) 07:58, 3 October 2016 (UTC)
This discussion is moribund. Wake up, wake up, @Mathglot: wake up. Did I kill em all?

References

  1. ^ See article Daniel DeShaime. Mathglot (talk)
  2. ^ I couldn't imagine what a "closed village" was: a ghost town? A super-fund site closed due to toxic waste? A coastal town being reclaimed by the sea or a glacier? For the true answer, see Talk:Daniel DeShaime#Closed villages. Mathglot (talk)

Off-topic about the Aiguilles

@Elinruby: I don't know any Corsican (see discussion above), but I do have enough knowledge of romance languages to be able to compare the French and Corsican versions of Aiguilles de Bavella. We've got:

  • "la Punta Tafunata di i Paliri (montagne trouée comme son nom l'indique, à l'instar du Capu Tafunatu, 1 312 m)." (fr)
  • "a Punta Tafunata di i Paliri (muntagna tafunata com'è u so nomu l'evuchighja, com'è u Capu Tafunatu, 1312 metri)." (co)

The sentences are clearly direct translations of each other, so the Corsican adjective tafunata means trouée in French, which is "pierced" (or something along the lines of "having a hole") in English. What the sentence is thus trying to say, is (freely translated): "the Punta Tafunata -literally translated "Pierced Peak"- is a 1312 m tall mountain with a hole in it and has been named as such; by the way, there is this other completely unrelated mountain on Corsica, the Capu Tafunatu, that has the same etymology." --HyperGaruda (talk) 07:48, 1 October 2016 (UTC)

@HyperGaruda: sorry to respond so slowly; I got pulled into something else. Thank you for confirming that. In that case the Corsican cleanup tag can come off I guess, and I should probably re-read to see if it needs some other flag... That was intended as an example, but I am glad we resolved it. In ten words or less I think I agree with the person that said that of course we are not experts in all these topics. It would still be nice to have some sort of checklist for translation articles though.... Elinruby (talk) 04:12, 7 October 2016 (UTC)
Perhaps that was me: although the chances of my putting things in ten words or less are rather slim. But, yes, I think it is too much to expect volunteer translators to be both subject experts and language experts: were they thus, they would make a lot more money as being a specialist translator full-time. Oddly enough, I find the hardest things to translate are when through any lucky coincidence I do have a good knowledge of the subject, because I find myself so tempted essentially to add WP:OR rather than just stick to the translation. It's far easier for me to translate when I have no idea what I am talking about. (Of course, during the course of the translation I usually end up learning about the subject, but I just find it a lot easier to resist temptation to add WP:OR when it's something I know little about.) Si Trew (talk) 06:58, 7 October 2016 (UTC)