
Wikipedia talk:Naming conventions (standard letters with diacritics)/Archive 1

From Wikipedia, the free encyclopedia

So what actually is this saying?

Noticeable by its absence is any indication of what the author actually wants to do with diacritics.

Are we intending to encourage or discourage their use?

FWIW my opinion is that articles should have titles which are correct, and alternate spellings should be catered for using REDIRECTs labelled with {{R from title without diacritics}}. HTH HAND —Phil | Talk 11:16, 30 January 2006 (UTC)

I concur with your proposal. —Nightstallion (?) 12:34, 30 January 2006 (UTC)
I think that it says that all diacritics used in modern languages should be used, while some diacritics typical only of old, inactive languages that now have a new form of spelling should be avoided. It also excludes languages using alphabets other than Latin. The proposed label might be added to the policy.--SylwiaS | talk 19:35, 30 January 2006 (UTC)
I fully agree. For example, "Jaromir Jagr" is an incorrect spelling; when it is used, it is only for technical reasons. Wikipedia is supposed to be an encyclopedia, so it should give correct information. - Mike Rosoft 01:35, 8 February 2006 (UTC)
Ditto. Priority in choice of spelling for a proper name (e.g., of a ruler or other person, or of a geographic entity) should be given to that regarded as correct by the people with whom the name originates. That is the practice in serious scholarship, and we should not patronize any parties involved by talking down to them and homogenizing to the least common denominator. logologist|Talk 07:22, 8 February 2006 (UTC)
I know I'm repeating myself here, but -- 100% support from my side. —Nightstallion (?) 11:53, 8 February 2006 (UTC)
Totally false. Jaromir Jagr, the spelling routinely used in thousands upon thousands of newspaper, television, and magazine reports about this person, is most definitely not a misspelling. It is never a misspelling to use the English alphabet when writing in the English language. We have every bit as much right to use our own alphabet as anybody else does. We can certainly choose to include diacritics in his name; to do so because not to do so would be a misspelling is patently false. Gene Nygaard 10:13, 5 February 2007 (UTC)
Gene, I have read that argument from you many times. Wards kan bee speled rong youing tha English alphabet wen riting in tha Inglish langwage two. Bendono 12:29, 5 February 2007 (UTC)

From here after the first version of the guideline formulation had been written down --Francis Schonken 22:23, 8 February 2006 (UTC)


With provision 4 this seems like a reasonable proposal. I see no harm it can do. But perhaps it would be useful if the author (or anybody else interested in this) could show us a few examples where this policy would actually force us to rename an article? I see little point in creating a policy that would do nothing. This has to 1) solve some current disputes 2) make us rename some articles.--Piotr Konieczny aka Prokonsul Piotrus Talk 17:21, 8 February 2006 (UTC)

Note that according to the proposal, anybody who wants to write an article on a subject which should include diacritics, let's say the Łobzów district of Kraków, would have to do BOTH of the following.
  1. Find 10 reliable publications about the district written in English which only use the diacritics version when referring to the place.
    Please bear in mind here that less than 1% of wikipedia articles provide 10 references.
  2. Do a google search (or something equivalent) and from that make a convincing argument that 20% of all English webpages referring to Łobzów outside of wikipedia use the diacritics. Here is a test [1] but note that even if I tried to exclude wikipedia, still at least two of the first 10 google pages are copies of wikipedia articles.

But don't take my word for it. Let's ask the author of the proposal (or anybody else) whether any of the following pages would have to be moved if this was accepted:

Stefán Ingi 19:52, 8 February 2006 (UTC)

@Piotr: The only question is if it would make a positive difference for wikipedia: if it would bring down the number & length of discussions about diacritics that would already be something; if it would bring down the number of WP:RM requests (either by additionally defining some types of moves as "non-controversial", or even better, because nobody would even see the need any more to move or to WP:RM certain diacritics-related pagenames) that would be a good point in itself, wouldn't it? Even today a new Village pump topic was started with emotions going high: Wikipedia:Village pump (policy)#Using diacritics (or national alphabet) in the name of the article (quote: "Man, I feel like the bottom man in a dogpile.", etc). If that could be tackled more easily by a guideline many people feel easy with, then that's enough for me. Even if not a single page needs to be changed.

@Stefan:

  • "ß" and "ð" are presently outside the scope of this guideline, see Wikipedia:Naming conventions (standard letters with diacritics)#Scope. The present guideline wouldn't change a thing about Straße des 17. Juni and Davíð Oddsson, neither, for example, about Weißenhof or Weissenhof...
  • "Łobzów" wouldn't be so difficult. It appears Wladyslaw IV Vasa was born there. It appears to have a famous garden. It appears to have a lot of restaurants presently. So I could find at least 10 references in English for it having a lot of restaurants. If you'd reject these (wikipedia is not a tourist guide), even finding 10 non-wikipedia references in English for the "Wladyslaw IV birthplace" + "the famous garden" seems not too difficult. Anyhow, I had a higher Google result on "Łobzów" than on "Lobzow". Further, note that at the Wikipedia:WikiProject Geography of Poland naming conventions are being built presently (there's a vote going on at the moment): so in the future there might come a guideline that takes over (unless that guideline says nothing about diacriticals).
  • Antonín Dvořák might have to move back to Antonin Dvorak (at first sight less than 20% of google results, but would need further checking), but then he's without diacritics at the Radio Prague website [2] (the same website spells of course Antonín Dvořák on their pages in the Czech language, [3], so this is not about "technical limitations"); further, he lived a few years in America (and many Americans would of course throw the diacritics overboard when they had to write his name); and at Wikipedia his compositions are disambiguated with "(Dvorak)" and not by "(Dvořák)", except his 7th symphony (see Category:compositions by Antonin Dvorak) - so the prevailing spelling is even at Wikipedia "Antonin Dvorak" - wouldn't hurt to see all these a bit on the same line would it? Note also that there is a wikipedia:WikiProject Composers - if they decide to make a naming conventions guideline the naming could be settled on the Czech name too - I wouldn't have a problem with that! Even Wikipedia:Naming policy (Czech) could be policy within a week or so...
  • Alliance française has far above 20% of google hits, and as it is an organisation that publishes a lot (and about which a lot is published) 10 reference works wouldn't be a problem, would it? So this would stay where it is.
  • Well, I did three examples now, maybe you do some?

--Francis Schonken 22:23, 8 February 2006 (UTC)

Hmm, there seem to be even more conventions around, limiting the scope of this proposal, than I had realised. So it seems very difficult to foresee which moves this proposal will mandate. That certainly makes me worry. Another problem I have, aside from the instruction creep of asking people to dig up 10 references and comb through google results before they can write an article, is that I just don't agree that we need to have a fallback convention severely limiting the use of diacritics. In fact I don't think the use of them needs to be limited at all in articles on Irish, Norwegian, Polish, Croatian or German subjects and others. And it seems to me that this is a feeling shared by most editors who are actively working on these articles and for the most part we should just let them get on with it. So finally, rather than give examples, I would simply suggest that we drop this proposal. Stefán Ingi 23:01, 8 February 2006 (UTC)
Concur heartily. I propose one overarching convention: that the presumption will favor use of authentic spellings, with diacritics, in all cases. logologist|Talk 23:18, 8 February 2006 (UTC)
I strongly oppose that. It is exactly the same type of arguments used by people to argue that an 80-year long American citizen, who never used diacritics in his own name at least as an adult from anything found after extensive arguing, an American educated college physics professor who devised a chess rating scale known throughout the world as the Elo rating system (though the words for "rating system" differ of course, but always "Elo" without any diacritics, in pretty much any language, at least in anything published before the name was butchered and corrupted by Wikipedia, though sometimes capitalized as ELO by people who mistake it for an acronym), should be at Árpád Élő. Furthermore, while the ship's manifest when he left to come to America as a child did include diacritics in his name, it was different diacritics, in addition to two l's instead of one in his surname, in that manifest. Nobody has ever provided any evidence whatsoever, contemporary to that usage, as to the spelling of his name before he came to America, other than that ship's manifest information which I provided. Gene Nygaard 10:44, 5 February 2007 (UTC)
Furthermore, that 10 publications bit is ludicrous when it comes to something that has thousands upon thousands of publications in English which use only the form without diacritics. Plus, that very same corruption due to Wikipedia's past misusage can play a role in this usage in other "reliable sources" (which in Wikijargon usage is a property of the publication and not of the accuracy of the contents of that publication) as well. Gene Nygaard 10:44, 5 February 2007 (UTC)

Using diacritics (or national alphabet) in the name of the article

The discussion below has been copied from Wikipedia:Village pump (policy)#Using diacritics (or national alphabet) in the name of the article - 07:41, 2 March 2006 (UTC)

I came across the problem of national alphabet letters in article names. They are commonly used but I have found no mention of them in the naming conventions (WP:NAME). The only related convention is to use the English name, but it probably does not apply to the names of people. National alphabets are widely used in wikipedia. Examples are Luís de Camões, Auguste and Louis Lumière, or Karel Čapek. There are redirects from the English spelling (Camoes, Lumiere, Capek).

On the other hand, wikiproject ice hockey WP:HOCKEY states a rule for ice hockey players that their names should be written in English spelling. Currently some articles are being moved from Czech spelling to the English spelling (for example Patrik Eliáš to Patrick Elias). I object to this as I do not see general consensus and it will only lead to moving back and forth. WP:HOCKEY is not wikipedia policy nor guideline. In addition I do not see any reason why ice hockey players should be treated differently than other people.

There is a mention of using the most recognized name in the naming conventions policy. But this does not help in the case of many ice hockey players. It is very likely that for American and Canadian NHL fans the most recognised versions are Jagr, Hasek or Patrick Elias. But these people also played for the Czech Republic in the Olympics and there they are known as Jágr, Hašek or Patrik Eliáš.

I would like to find out what is the current consensus about this. -- Jan Smolik 18:53, 7 February 2006 (UTC)

The only convention related is to use English name, but it probably does not apply to the names of people - incorrect. "Use the most common name of a person or thing that does not conflict with the names of other people or things" - Wikipedia:Naming conventions (common names). Raul654 18:54, 7 February 2006 (UTC)
I mentioned this in the third article but it does not solve the problem. Americans are familiar with different spelling than Czechs. --Jan Smolik 19:11, 7 February 2006 (UTC)
Well, since this is the English Wikipedia, really we should use the name most familiar to English speakers. The policy doesn't say this explicitly, but I believe this is how it's usually interpreted. This is the form that English speakers will recognize most easily. Deco 19:02, 7 February 2006 (UTC)
Well it is wikipedia in English but it is read and edited by people from the whole world. --Jan Smolik 19:11, 7 February 2006 (UTC)

There was a straw poll about this with regard to place names: Wikipedia talk:Naming conventions (use English)/Archive 3#Proposal and straw poll regarding place names with diacritical marks. The proposal was that "whenever the most common English spelling is simply the native spelling with diacritical marks omitted, the native spelling should be used". It was close, but those who supported the proposal had more votes. Since, articles like Yaoundé have remained in place with no uproar. I would support a similar convention with regard to personal names. — BrianSmithson 19:17, 7 February 2006 (UTC)

I'm the user who initiated the WP:HOCKEY-based renaming with Alf. The project Player Pages Format Talk page has the discussion we had along with my reasoning, pasted below:

OK, team, it's simple. This is en-wiki. We don't have non-English characters on our keyboards, and people likely to come to en-wiki are mostly going to have ISO-EN keyboards, whether they're US, UK, or Aussie (to name a few) it doesn't matter. I set up a page at User:RasputinAXP/DMRwT for double move redirects with twist and started in on the Czech players that need to be reanglicized.

Myself and others interpret the policy just the same as Deco and BrianSmithson do: the familiar form in English is Jaromir Jagr, not Jaromír Jágr; we can't even type that. Attempting to avoid redirects is pretty tough as well. Is there a better way to build consensus regarding this? RasputinAXP talk contribs 19:36, 7 February 2006 (UTC)

I think you misread my statement above. My stance is that if the native spelling of the name varies from the English spelling only in the use of diacritics, use the native spelling. Thus, the article title should be Yaoundé and not Yaounde. Likewise, use Jōchō, not Jocho. Redirection makes any arguments about accessibility moot, and not using the diacritics makes us look lazy or ignorant. — BrianSmithson 16:34, 8 February 2006 (UTC)
Tentative overview (no cut-and-paste solutions, however):
  • Article names for names of people: wikipedia:naming conventions (people) - there's nothing specific about diacritics there (just mentioning this guideline because it is a naming conventions guideline, while there are no "hockey" naming conventions mentioned at wikipedia:naming conventions).
  • wikipedia:naming conventions (names and titles) is about royal & noble people: this is guideline, and *explicitly* mentions that wikipedia:naming conventions (common names) does NOT apply for these kind of people. But makes no difference: doesn't mention anything about diacritics.
  • Wikipedia talk:naming conventions (Polish rulers): here we're trying to solve the issue for Polish monarchs (some of which have diacritics in their Polish name): but don't expect to find answers there yet, talks are still going on. Anyway we need to come to a conclusion there too, hopefully soon (but not rushing).
  • Wikipedia:Naming conventions (standard letters with diacritics), early stages of a guideline proposal, I started this on a "blue monday" about a week ago. No guideline yet: the page contains merely a "scope" definition, and a tentative "rationale" section. What the basic principles of the guideline proposal will become I don't know yet (sort of waiting till after the "Polish rulers" issue gets sorted out I suppose...). But if any of you feel like being able to contribute, ultimately it will answer Jan Smolik's question (but I'd definitely advise not to hold your breath on it yet).
  • Other:
    • Some people articles with and without diacritics are mentioned at wikipedia talk:naming conventions (use English)#Diacritics, South Slavic languages - some of these after undergoing a WP:RM, but note that isolated examples are *not* the same as a guideline... (if I'd know a formulation of a guideline proposal that could be agreeable to the large majority of Wikipedians, I'd have written it down already...)
    • Talking about Lumiere/Lumière: there's a planet with that name: at a certain moment a few months ago it seemed as if the issue was settled to use the name with accent, but I don't know how that ended, see Wikipedia:WikiProject Astronomical objects, Andrewa said she was going to take the issue there. Didn't check whether they have a final conclusion yet.
Well, that's all I know about (unless you also want to involve non-standard characters, then there's still the wikipedia:naming conventions (þ) guideline proposal) --Francis Schonken 19:58, 7 February 2006 (UTC)
Note that I do not believe no En article should contain diacritics in its title. There are topics for which most English speakers are used to names containing diacritics, such as El Niño. Then there are topics for which the name without diacritics is widely disseminated throughout the English speaking world, like Celine Dion (most English speakers would be confused or surprised to see the proper "Céline Dion"). (Ironically enough, the articles for these don't support my point very well.) Deco 20:42, 7 February 2006 (UTC)
Sticking in diacritics, particularly the Polish Ł, is highly annoying, esp. when applied to Polish monarchs. It just gives editors much more work, and unless you're in Poland or know the code, you will be unable to type the name in the article. - Calgacus (ΚΑΛΓΑΚΟΣ) 20:45, 7 February 2006 (UTC)
Redirects make the issue of difficulty in visiting or linking to the article immaterial (I know we like to skip redirects, but as long as you watch out for double redirects you're fine). The limitations of our keyboards are not, by themselves, a good reason to exclude any article title. Deco 20:50, 7 February 2006 (UTC)
Deco, I should rephrase what I said. I agree with you that some English articles do require diacritics, like El Niño. Articles like Jaromir Jagr that are lacking diacritics in their English spellings should remain without diacritics because you're only going to find the name printed in any English-speaking paper without diacritics. RasputinAXP talk contribs 21:20, 7 February 2006 (UTC)
I checked articles about Czech people and in 90 % of cases (rough guess) they have diacritics in the name of the article. This includes soccer players playing in England (like Vladimír Šmicer, Petr Čech, Milan Baroš). And no one actually complains. So this seems to be a consensus. The only exceptions are extremely short stubs that did not receive much input. Articles with Czech diacritics are readable in English; you only need a redirect because of problems with typing. This is an international project written in English. It should not fulfill only the needs of native English speakers but of all people of the world. --Jan Smolik 22:33, 7 February 2006 (UTC)
Very many names need diacritics to make sense. Petr Cech instead of Petr Čech makes a different impression as a name, does not look half as Czech and is much more likely to be totally mispronounced when you see it. Names with diacritics are also not IMHO such a big problem to use for editors because you can usually go through the redirect in an extra tab and cut and paste the correct title. I also don't see a problem at all in linking through redirects (that's part of what they are there for). Leaving out diacritics only where they are "not particularly useful" would be rather inconsequent. Kusma (討論) 22:48, 7 February 2006 (UTC)
As a matter of fact, "Petr Sykora" and "Jaromir Jagr" are not alternate spellings; they are incorrect ones which are only used for technical reasons. Since all other articles about Czech people use proper Czech diacritics, I don't know of any justification for making an exception in case of hockey players. - Mike Rosoft 01:13, 8 February 2006 (UTC)
Man, I feel like the bottom man in a dogpile. Reviewing Wikipedia:Naming conventions (common names), there's "What word would the average user of the Wikipedia put into the search engine?" Making the name of the article include diacritics goes against the Use English guideline. The most common input into the search box over here on the left, for en-wiki, is going to be Jaromir Jagr. Yes, we're supposed to avoid redirects. Yes, in Czech it's not correct. In English, it is correct. I guess I'm done with the discussion. There's no consensus in either direction, but it's going to be pushed back to the diacritic version anyhow. Go ahead and switch them back. I'm not dead-set against it, but I was trying to follow guidelines. RasputinAXP talk contribs 15:48, 8 February 2006 (UTC)
There are many names, and even words, in dominant English usage that use diacritics. Whether or not these will ever be typed in a search engine, they're still the proper title. However, if English language media presentations of a topic overwhelmingly omit diacritics, then clearly English speakers would be most familiar with the form without diacritics and it should be used as the title on this Wikipedia. This is just common sense, even if it goes against the ad hoc conventions that have arisen. Deco 18:30, 8 February 2006 (UTC)
Czech names: almost all names with diacritics use it also in the title (and all of them have redirect). Adding missing diacritics is automatic behavior of Czech editors when they spot it. So for all practical purposes the policy is set de-facto (for Cz names) and you can't change it. Pavel Vozenilek 03:18, 8 February 2006 (UTC)

See Wikipedia:Naming policy (Czech) --Francis Schonken 11:01, 8 February 2006 (UTC)

and: Wikipedia:Naming conventions (hockey) --Francis Schonken 17:41, 8 February 2006 (UTC)

There are those among us trying to pull the ignorant North American card. I mentioned the following over at Wikipedia talk:WikiProject Ice Hockey/Player pages format...
Here's the Czech hockey team in English, courtesy of the Torino Italy Olympic Committee [4]. Here they are in Italian: [5], French: [6]. Here are the rosters from the IIHF (INTERNATIONAL Ice Hockey Federation) based in Switzerland: [7].
Those examples are straight from 2 international organizations (one based in Italy, one in Switzerland). I'm hard pressed to find any English publication that uses diacritics in hockey player names. I don't see why en.wiki should be setting a precedent otherwise. ccwaters 02:19, 9 February 2006 (UTC)
Over at WP:HOCKEY we have/had 3 forces promoting non-English characters in en.wiki hockey articles: native Finns demanding native spellings of Finnish players, native Czechs demanding native spellings of Czech players, and American stalkers of certain Finnish goaltenders. I did a little research and here are my findings:
Here's a Finnish site profiling NHL players. Here's an "incorrectly" spelt Jagr, but the Finnish and German alphabets both happen to have umlauts so here's a "correct" Olaf Kölzig. Who is Aleksei Jashin?
Here's a Czech article about the recent Montreal-Philadelphia game [8] Good luck finding any Finnish players names spelt "correctly"... here's a snippet from the MON-PHI article:
Flyers však do utkání nastoupili značně oslabeni. K zraněným oporám Peteru Forsbergovi, Keithu Primeauovi, Ericu Desjardinsovi a Kimu Johnssonovi totiž po posledním zápase přibyli také Petr Nedvěd a zadák Chris Therrien.
Well...I recognize Petr Nedvěd, he was born in Czechoslovakia. Who did the Flyers have in goal??? Oh, it's the Finnish guy, "Antero Niitymakiho".
My point? Different languages spell names differently. I found those sites just by searching yahoo in the respective languages. I admit I don't speak either and therefore I couldn't search thoroughly. If someone with a background in either language can demonstrate patterns of Finnish publications acknowledging Czech characters and vice versa, then I may change my stance. ccwaters 03:45, 9 February 2006 (UTC)
I support every word Ccwaters said, albeit with not as much conviction. There is a reason why we have Wikipedia in different languages, and although there are a few instances in which English uses some sort of extra-curricular lettering (i.e. café), most English speaking people do not use those. Croat Canuck 04:25, 9 February 2006 (UTC)
I must make a strong point that seems to be overlooked: this is not the international English language wikipedia. It is the English language wikipedia. It just so happens that the international community contributes. There is a reason that there are other language sections to wikipedia, and this is one of them. The Finnish section of wikipedia should spell names the Finnish way and the English wikipedia should spell names the English way. The vast majority of English publications drop the foreign characters and diacritics. Why? Because they aren't part of the English language, hence the term "foreign characters". Masterhatch 04:32, 9 February 2006 (UTC)
I agree in every particular with Masterhatch. The NHL's own website and publications do not use diacriticals, nor does any other known English-language source. The absurdity of the racist card is breathtaking: in the same fashion as the Finnish and Czech language Wikipedias follow their own national conventions for nomenclature (the name of the country in which I live is called the "United States" on neither ... should I feel insulted?), the English language Wikipedia reflects the conventions of the various English-speaking nations. In none are diacriticals commonly used. I imagine the natives of the Finnish or Czech language Wikipedias would go berserk if some peeved Anglos barge in and demand they change their customary linguistic usages. I see no reason to change the English language to suit in a similar situation. RGTraynor 06:46, 9 February 2006 (UTC)
People like Jagr, Rucinsky or Elias are not only NHL players but also members of the Czech team for the winter Olympics. Therefore I do not see any reason why the spelling of their names in NHL publications should be prioritized. I intentionally wrote the names without diacritics. I accept the fact that foreigners do that because they cannot write those letters properly and use them correctly. There are also technical restrictions. I also accepted the fact that my US social security card bears the name Jan Smolik instead of Jan Smolík. I do not have a problem with this. I even sign my posts Jan Smolik. But Wikipedia does not have technical restrictions. I can even type weird letters such as Æ. And it has plenty of editors who are able to write names with diacritics correctly. The name without diacritics is sufficient for normal information but I still think it is wrong. I think that removing diacritics is a step back. Anyway it is true that I am not able to use diacritics in Finnish names. But somebody can fix that for me.
I do not care which version will win. But I just felt there was not a clear consensus for the non-diacritics side and this discussion has proven me right. As for the remark about Czechs writing names incorrectly: we inflect names, which makes writing even more difficult (my name is Smolík but when you want to say we gave it to Smolík you will use the form we gave it Smolíkovi). One last argument for diacritics, before I retire from this discussion as I think I said all I wanted to say. Without diacritics you cannot distinguish some names. For example Czech surnames Čapek and Cápek are both Capek. Anyway we also have language purists in the Czech Republic. I am not one of them. --Jan Smolik 19:11, 9 February 2006 (UTC)
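The Čapek/Cápek point above can be illustrated with a minimal sketch. It assumes diacritics are stripped by Unicode canonical decomposition (NFD) followed by dropping combining marks; the helper name strip_diacritics is hypothetical and is not how Wikipedia or any particular publication actually generates such spellings.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Hypothetical helper: decompose to NFD, drop combining marks, recompose."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


# Two distinct Czech surnames collapse to the same plain spelling.
for name in ["Čapek", "Cápek", "Jágr", "Łobzów", "Davíð"]:
    print(name, "->", strip_diacritics(name))
```

Under this assumption both surnames come out as Capek, so a plain-letter title cannot tell them apart, while a redirect from the stripped form can still point to the intended article. Letters such as Ł and ð have no canonical decomposition and pass through unchanged, which fits with "ß" and "ð" being treated as outside the scope of a diacritics guideline.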
People like Jagr, Rucinsky or Elias are not only NHL players but also members of Czech team for winter olympics. Therefore I do not see any reason why spelling of their name in NHL publications should be prioritized -Fine we'll use the spellings used by the IIHF, IOC, NHLPA, AHL, OHL, WHL, ESPN, TSN, The Hockey News, Sports Illustrated, etc, etc, etc.
This isn't about laziness. It's about using the alphabet afforded to the respective language. We don't refer to Алексей Яшин because the English language doesn't use the Cyrillic alphabet. So why should we subject language A to the version of the Latin alphabet used by language B? Especially when B modifies proper names from languages C & D.
My main beef here is that the use of such characters in en.wiki is a precedent, and not a common practice. If you think the English hockey world should start spelling Czech names natively, then start a campaign amongst Czech hockey players demanding so. It may work: languages constantly infiltrate and influence each other. Wikipedia should take a passive role in such things, and not be an active forum for them. ccwaters 20:09, 9 February 2006 (UTC)
People like Jagr, Rucinsky or Elias are not only NHL players but also members of Czech team for winter olympics. Therefore I do not see any reason why spelling of their name in NHL publications should be prioritized Great, in which case for Czech Olympic pages, especially on the Czech Wikipedia, spell them as they are done in the Czech Republic. Meanwhile, in the NHL-related articles, we'll spell them as per customary English-language usage. RGTraynor 08:05, 10 February 2006 (UTC)
I wish I understood why User:ccwaters has to be rude in his posts on this subject. "Stalkers of Finnish goaltenders" isn't the way I'd describe a Wikipedia contributor. Also, since you asked, Aleksei Jashin is the Finnish transliteration of Alexei Yashin. Russian transliterates differently into Finnish than into English. Of course you must know this, since you have such a habit of lecturing to us on languages. As for diacritics, I object to the idea of dumbing down Wikipedia. There are no technical limitations that stop us from writing Antero Niittymäki instead of Antero Niittymaki. The reason so many hockey publications all over the world don't use Finnish-Scandinavian letters or diacritics is simple laziness, and Wikipedia can do much better. Besides, it isn't accepted translation practice to change the spelling of proper names if they can be easily reproduced and understood, so in my opinion it's simply wrong to do so. Since it seems to be obvious there isn't a consensus on this matter, I think a vote would be in order. Elrith 16:40, 14 February 2006 (UTC)
Alas, a Finnish guy lecturing native English speakers on how they have to write Czech names in English (not to mention the lecturing regarding the laziness) is but a variation on the same theme of rudishness.
So, Elrith, or whoever reads this, if the lecturing is finished, could you maybe devote some attention to the Dvořák/Dvorak problem I mentioned below? I mean, whoever one asks, this would not be problematic - but nobody volunteered thus far to get it solved. Am I the only one who experiences this as problematic inconsistency? --Francis Schonken 21:05, 14 February 2006 (UTC)
So is "Jagr" the Finnish transliteration of "Jágr"??? On that note, the Finnish "Ä" is not an "A" with "funny things" on top (that's an umlaut), it's a completely separate letter nonexistent in the English language and is translated to "Æ". "Niittymaki" would be the English transliteration. "Nittymeki" or (more traditionally "Nittymæki") would be the English transcription.
In the past I've said our friend's contributions were "thorough." I'll leave it at that. There will be nothing else about it from me unless asked. ccwaters 21:02, 14 February 2006 (UTC)
My opinion on the Dvořák/Dvorak issue is that his name is spelled Dvořák, and that's how the articles should be titled, along with redirects from Dvorak. Similarly, the article on Antero Niittymäki should be called just that, with a redirect from Niittymaki. You're right that it is a problematic inconsistency, and it needs to be fixed.
The only reason I may sound like I'm lecturing is that there are several people contributing to these discussions who don't understand the subject at all. Ccwaters's remarks on transliteration are one example. It isn't customary or even acceptable to transliterate or transcribe Finnish letters into English; the accepted translation practice is to reproduce them, which is perfectly possible, for example, in Wikipedia. Niittymaki or anything else that isn't Niittymäki isn't a technically correct "translation". The reason North American, or for that matter, Finnish, hockey publications write Jagr instead of Jágr is ignorance and/or laziness. Wikipedia can do better than that.

However, since this discussion has, at least to me, established that there is no consensus on Wikipedia on diacritics and national letters, apart from a previous vote on diacritics, I'm going to continue my hockey edits and use Finnish/Scandinavian letters unless the matter is otherwise resolved. Elrith 04:32, 20 February 2006 (UTC)
Hi Elrith, your new batch of patronising declarations simply doesn't work. Your insights into language (and how language works) seem very limited, reducing everything you don't like about a language to "laziness" and "ignorance".
Seems like we might need an RfC on you, if you continue to oracle like this, especially when your technique seems to consist in calling anyone who doesn't agree with you incompetent.
Re. consensus, I think you would be surprised to see how much things have evolved since the archived poll you speak about. --Francis Schonken 23:14, 20 February 2006 (UTC)
My 2 cents:
1) This should NOT be settled as a local consensus for hockey players; this is about how we name persons in the English Wikipedia. It is wrong to have a local consensus for hockey players only.
2) I have tried to do some research on how names are represented. It is wrong to say that since these names are normally spelled like this they should be spelled like this; many wrongs do not make a right. So I did a few checks.
If I look at the online version of Encyclopædia Britannica I get a hit on both Björn Borg and Bjorn Borg, but in the article it is spelled with Swedish characters, same for Selma Lagerlöf and Dag Hammarskjöld. I could not find any more Swedes in EB :-) (I did not check all..)
I also checked as many Swedes as I could think of in Wikipedia to see how it is done for non-hockey Swedes. I found the following Swedes by looking at list of swedish ... and adding a few more that I could think of. ALL had their articles spelled with the Swedish characters (I'm sure you can find a few that are spelled without the Swedish characters, but the majority for sure seem to be spelled the same way as in their birth certificates). So IF you are proposing that we should 'rename' the Swedish hockey players, I think we must rename all other Swedes also. Do we really think that is correct? I cannot check this as easily for other countries, but I would guess that it is the same.
Dag Hammarskjöld, Björn Borg, Annika Sörenstam, Björn Ulvaeus, Agnetha Fältskog, Selma Lagerlöf, Stellan Skarsgård, Gunnar Ekelöf, Gustaf Fröding, Pär Lagerkvist, Håkan Nesser, Bruno K. Öijer, Björn Ranelid, Fredrik Ström, Edith Södergran, Hjalmar Söderberg, Per Wahlöö, Gunnar Ekelöf, Gustaf Fröding, Pär Lagerkvist, Maj Sjöwall, Per Wästberg, Isaac Hirsche Grünewald, Tage Åsén, Gösta Bohman, Göran Persson, Björn von Sydow, Lasse Åberg, Helena Bergström, Victor Sjöström, Gunder Hägg, Sigfrid Edström, Anders Gärderud, Henrik Sjöberg, Patrik Sjöberg, Tore Sjöstrand, Arne Åhman, so there seems to be a consensus for non-hockey-playing Swedes? Stefan 13:33, 21 February 2006 (UTC)
I also checked Encarta for Björn Borg and Dag Hammarskjöld; both have the Swedish characters in the main name of the article. Selma Lagerlöf is not available unless you pay, so I cannot check. I'm sure you can find examples of the 'wrong' way also, but we cannot say that there is consensus in the encyclopedic area for respelling foreign names the 'correct' English way. Stefan 14:16, 21 February 2006 (UTC)
This seems like a very constructive step to me. So I'll do the same as I did for Czech, i.e.:
  1. start Wikipedia:Naming conventions (Swedish) as a proposal, starting off with the content you bring in here.
  2. list that page in Wikipedia:Naming conventions#Conventions under consideration
  3. also list it on wikipedia:current surveys#Discussions
  4. list it in the guideline proposal Wikipedia:Naming conventions (standard letters with diacritics)#Specifics_according_to_language_of_origin
OK to work from there? --Francis Schonken 15:22, 21 February 2006 (UTC)
Works for me :-) Stefan 00:26, 22 February 2006 (UTC)
Tx for finetuning Wikipedia:Naming conventions (Swedish). I also contributed to further finetuning, but add a small note here to clarify what I did: page names in English wikipedia are in English per WP:UE. Making a Swedish name like Björn Borg English means that the ö (a "character" in the Swedish language) is turned into an "o" character with a precombined diacritic mark (Unicode: U+00F6, which is the same character used to write the last name of Johann Friedrich Böttger – note that böttger ware, named after this person, uses the same ö according to Webster's, and in that dictionary is sorted between "bottery tree" and "bottine"). Of course (in English!) the discussion whether it is a separate character or an "o" with a diacritic is rather futile *except* for alphabetical ordering: for alphabetical ordering in English wikipedia the ö is treated as if it were an o, hence the remark about the "category sort key" I added to the intro of the "Swedish NC" guideline proposal. In other words, you can't expect English wikipedians who try to find something in an alphabetic list to know in advance (a) what the language of origin of a word is, and (b) whether any "special rules" for alphabetical ordering are applicable in that language. That would be putting things on their head. "Bö..." will always be sorted in the same way, whatever the language of origin.
What I mean is that "Björn Borg" (in Swedish) is transcribed/translated/transliterated to "Björn Borg" in English, the only (invisible!) difference being that in Swedish ö is a character, and in English ö is a letter o with a diacritic.
Or (still the same in other words): Ö is always treated the same as "O" in alphabetical ordering, whether it's a letter of Ötzi or of Öijer--Francis Schonken 10:56, 22 February 2006 (UTC)
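
The sort-key idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how MediaWiki or any Wikipedia guideline actually builds category sort keys: the function name `sort_key` and the sample names are made up for the example. It decomposes each precomposed letter (such as U+00F6) into a base letter plus a combining mark and drops the mark, so that ö files under o whatever the language of origin.

```python
import unicodedata


def sort_key(title: str) -> str:
    """Hypothetical helper: file letters with diacritics under their base letter."""
    # NFD splits a precomposed letter such as "ö" (U+00F6) into "o" + U+0308.
    decomposed = unicodedata.normalize("NFD", title)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Case-fold so that upper and lower case sort together.
    return stripped.casefold()


# Sample names chosen for illustration only.
names = ["Ötzi", "Orr", "Öijer", "Olsen"]
ordered = sorted(names, key=sort_key)
```

Letters that have no Unicode decomposition, such as ß, þ and ð, pass through this function unchanged, so they would need separate handling.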

For consistency with the rest of Wikipedia, hockey player articles should use non-English alphabet characters if the native spelling uses a Latin-based alphabet (with the exception of naturalized players like Petr Nedved). Why should Dominik Hasek be treated differently than Jaroslav Hašek? Olessi 20:48, 21 February 2006 (UTC)

If we are using other encyclopedias as litmus tests, why don't we look at a few hockey players: Dominik Hasek at Encarta; Dominik Hasek at Britannica; Jaromir Jagr at Encarta; Teemu Selanne in Encarta list of top scorers.

Last argument: We use the names that these players are overwhelmingly known by in the English language. We speak of Bobby Orr, not Robert Orr. Scotty Bowman, not William Scott Bowman. Ken Dryden, not Kenneth Dryden. Tony Esposito, not Anthony Esposito. Gordie Howe, not Gordon Howe... etc., etc., etc. The NHL/NHLPA/media call these players by what they request to be called. Vyacheslav Kozlov used to go by Slava Kozlov. Evgeni Nabokov "americanized" himself for a season as "John Nabokov" but changed his mind again.

ccwaters 22:54, 25 February 2006 (UTC)

Dvořák

Could someone clean this up:

Article/category name without diacritics
Category:Compositions by Antonin Dvorak
Category:Operas by Antonin Dvorak
Cello Concerto (Dvorak)
String Quartet No. 11 (Dvorak)
String Quartet No. 12 (Dvorak)
Symphony No. 6 (Dvorak)
Symphony No. 8 (Dvorak)
Symphony No. 9 (Dvorak)
Violin Concerto (Dvorak)
Page name with diacritics
Antonín Dvořák
List of compositions by Antonín Dvořák
Symphony No. 7 (Dvořák)

I'd do it myself if I only knew which way the wikipedia community wants it... --Francis Schonken 10:53, 10 February 2006 (UTC)

I've been bold and renamed the articles to use diacritics in the title, since they already use them in the text. I've also slapped {{categoryredirect}} tags on the two categories: a bot should be along shortly to complete the job. —Ilmari Karonen (talk) 14:54, 21 February 2006 (UTC)
Tx!!! - I'll remove Dvořák as an exception from Wikipedia:Naming policy (Czech)#Exceptions --Francis Schonken 15:22, 21 February 2006 (UTC)

Moving up to guideline

Since the discussion on some related NC proposals (Czech, Swedish,... - see links above) appears to be concluded, I see no further problem to move this general treatment of diacritics up to NC guideline too --Francis Schonken 08:10, 2 March 2006 (UTC)

Francis, I think you need to advertise this widely and run a strawpoll before you change it into a guideline. The page has had only 5 editors, and that is hardly enough of a consensus to create a guideline which has an impact on lots of articles across en.wikipedia. --Philip Baird Shearer 09:02, 2 March 2006 (UTC)
It seems clear to me that there is absolutely no consensus for this proposal. Aside from the objections that I have already stated, I might point out that this would be about as inconsistent as can be imagined. E.g. we should use diacritics for Serbian names according to the Cyrillic convention but would have to trawl through the strict conditions before we could use them for Croatian names. This is despite the fact that these two nations use a language which is very closely related and uses the same set of diacritics. Also, with the Czech and Swedish proposal pages, both of which Francis started, we would have two islands where diacritics can be applied regularly (perhaps with a few exceptions), but for every nation which borders these we would have to do the same trawling for any page which wanted to use diacritics.
Finally, to reiterate my main point: This proposal goes against the current practice on Wikipedia. Therefore there needs to be demonstrated a lot of support for it before the shift which it dictates is carried out. Stefán Ingi 10:16, 2 March 2006 (UTC)

The five editors of the page (six if you include my reversal). Apart from yourself, none has made more than one edit.

  • 11:08, 28 January 2006 Francis Schonken (start)
  • 11:17, 30 January 2006 Phil Boswell (→Scope - {{{1}}})
  • 15:42, 9 February 2006 CesarB (→See also - Wikipedia:Naming conventions (technical restrictions)#Browser support limitations)
  • 08:38, 10 February 2006 TShilo12 (→Other - dab Hebrew, avoid redirects on other languages, changing description of Chinese)
  • 14:33, 20 February 2006 Nikai m (→Rationale - sp)
  • 08:56, 2 March 2006 Philip Baird Shearer (Reverted)

Francis, as an old hand in this controversial area you will be well aware (I certainly am) that there are strong feelings among many editors about using Google or other search engines to decide issues like these. I suspect many people would object to such a suggestion. Further, there are those who argue that blog pages should not carry the same weight as research papers, books and other encyclopaedia entries.

So at the moment I know that you do not have a true consensus for changing this from a proposal into a guideline. Whether you have a Wikipedia:consensus is debatable. --Philip Baird Shearer 10:23, 2 March 2006 (UTC)

@Philip: your criticism is too absurd for words. It comes down to: "write a lower quality guideline, so that other wikipedians feel massively compelled to edit it". I did invite others to improve the text:
"[...] could you have a look at Wikipedia:Naming conventions (standard letters with diacritics)? I mean, both w.r.t. (ab)use of the English language and content of the thing, [...] --Francis Schonken 13:56, 14 February 2006 (UTC)" [9]
reply: "Looks OK to me, [...] Bill 14:21, 14 February 2006 (UTC)" [10]
So, no, Bill does not appear in the list of minor changes to the guideline proposal. Which proves the absurdity of your newly invented method for assessing guideline proposals. To be remarked also that you're moving way out of consensus by even imagining that such flimsy method would be acceptable to the wikipedia community.
I know that you have trouble accepting wikipedia:google test *is* part of wikipedia consensus. It's a how-to guideline, so I need not defend that I rely on it. Of course that guideline is a lot about caution re. the application of google test. That's one of the reasons why this NC is only on "standard letters with diacritics": Google is unreliable in filtering out non-standard letters like ß, þ and ð (I commented on that at wikipedia talk:naming conventions (thorn)). The same unreliability does not exist for standard letters with diacritics, see for example wikipedia:naming conventions (Swedish)#Rationale (but that's of course not the only testing I did to check reliability of filtering out variants of the same word with and without diacritics).
Further, it's not "the search engine that decides", as you erroneously try to present it. There's only a check required that the version with diacritics is not totally uncommon (20% is not really a high threshold, and takes account of the internet's bias towards diacritic-less variants). A minimum of references that use the variant with diacritics in English is a requirement of no lesser stature.
@Stefan: Your criticism is inconsistent: first you ask me to prove the guideline changes something (following Piotr who didn't see the need for a guideline if it doesn't change anything), then you reproach me it *would* change some things (on which you exaggerate, but that's another point). Why should I answer to such inconsistencies?
It has been established long ago that the same rule for all words with diacritics wouldn't work (the famous diacritics poll). So the "standard letters with diacritics" NC distinguishes between languages, and is also an invitation to come up with NC's for languages that would be problematic (I invited Haukur not so long ago to come with such proposal for Icelandic at wikipedia talk:naming conventions (thorn), of course, if you wouldn't have seen that invitation, I invite you likewise!). I wrote/copied the Czech NC, taken all time together, including talk page, in less than half an hour. I don't think, for example, that a Croatian NC would take more than that. Maybe Croatian isn't even problematic seen its proximity to Serbian (and Serbo-Croatian that has both Latin and Cyrillic spelling? – I'm not that much of an expert in those languages)? - Anyway, if there would be a deeper problem, in that case the "standard letters with diacritics" NC would probably only offer a temporary solution, until the specific guideline is written, but the diacritics NC may help in writing such guideline (like it helped me while writing the Swedish NC - without knowing Swedish).
You're trying to make a reproach of me writing some of the NC's for specific languages. How absurd can you be? I wrote them (or a part of them), even collaborated to the hockey NC (as if I know anything about ice hockey). What problem could you have with that? They all settled disputes over *differences of opinion* regarding current practice, and did so as straight derivations of the diacritics NC proposal. So, please, don't make problems where there are none. --Francis Schonken 16:01, 2 March 2006 (UTC)
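
As a rough illustration of the 20% check mentioned above, here is a minimal Python sketch. It assumes the share is computed as hits for the spelling with diacritics divided by the combined hits for both spellings; the proposal's exact wording is not quoted in this thread, so that formula, the function names and the hit counts below are all hypothetical.

```python
def diacritic_share(hits_with: int, hits_without: int) -> float:
    """Fraction of all hits that use the spelling with diacritics (assumed formula)."""
    total = hits_with + hits_without
    if total == 0:
        raise ValueError("no hits for either spelling")
    return hits_with / total


def passes_threshold(hits_with: int, hits_without: int, threshold: float = 0.20) -> bool:
    """True if the spelling with diacritics is 'not totally uncommon'."""
    return diacritic_share(hits_with, hits_without) >= threshold


# Made-up counts, for illustration only.
common_enough = passes_threshold(300, 700)   # share is 0.3
too_rare = passes_threshold(100, 900)        # share is 0.1
```

A check like this would only be one input; as noted above, a minimum number of English-language references using the variant with diacritics is a separate requirement.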
I have thought and still do think that if this were to become policy then it would change many names, or at least make a lot more effort for those defending them. I disagreed with Piotr when he said that it would not change anything. I asked you to confirm that it would change something but the examples I took weren't very good and nobody offered any other examples so it was inconclusive. I'm also saying that since I think that it changes many things, then it has to be shown to have some sort of Wikipedia:Consensus before it can be put up as a guideline. As Philip says, whether that consensus exists is debatable, from looking around on Wikipedia I would say that it definitely doesn't, if you disagree then you should offer some evidence for that statement.
Also, I wasn't reproaching you for writing these specific guidelines, I was just pointing it out for the benefit of people who were to come to this discussion. Perhaps there wasn't much point in doing that, I'm not sure. I apologise that I worded it in such a way that you took it to imply that I was reproaching you. Stefán Ingi 16:41, 2 March 2006 (UTC)

Spanish accents

I would like to know if this proposal covers Spanish accents. My interest is in regards to accents in a person's name. Should accents be used in the article name if that person's original name has them? Joelito 14:38, 3 March 2006 (UTC)

As it stands, this proposal would cover Spanish accents so if you wanted to include them and this proposal were a policy or guideline you would have to go through the motions to justify the accents every time. But, this is just a proposal so instead you might just as well look around you, e.g. on the list of Prime Ministers of Spain and see whether the accents are used there. In the example I'm taking they are. Stefán Ingi 14:47, 3 March 2006 (UTC)
Well as the proposal stands it would be very hard to prove points 2 and 3 if the person is not well covered in English publications. For example Eddie Miró, a person known by all the people from Puerto Rico as a television show host has had very little English coverage. It would be very hard to provide references in English for him. Possibly 7-10 relevant English references could be found.Joelito 14:58, 3 March 2006 (UTC)
Yes, I think that in many cases it would mean a lot of unproductive work. That's one of the reasons why I am opposed to this proposal Stefán Ingi 17:52, 3 March 2006 (UTC)
I think that the burden to prove the spelling should be on the non-native name (English), not the other way around. When it comes to articles about the non-English world, much of the work is done by non-native English speakers, who are more familiar with local spellings. Such work usually gets copyedited and such by native English speakers, and if at that point they think a move to a more English-friendly name is useful, they should do the search and if there is a much more widely used English name variant, a move can be done.--Piotr Konieczny aka Prokonsul Piotrus Talk 15:44, 5 March 2006 (UTC)

Speaking of Spanish accents, there is a small group of people who regularly edit Major League Soccer articles who have decided among themselves, without any other notification or documentation, that every MLS soccer player who now has US citizenship should have no accents in their names, even if they were born in countries where their names would normally be accented. BlankVerse 07:26, 29 March 2006 (UTC)

Japanese

An independent mediation supported a past change to the Japanese MoS so that now the inclusion of macrons in titles of articles with Japanese content is acceptable. (Don't blame me!) Japanese romanization utilizes (ō), (ū), and (').  freshgavinΓΛĿЌ  03:19, 20 April 2006 (UTC)

Technicality and an alternative proposal

As was raised in the very beginning, there is a problem with the provision "There are at least 10 reliable publications that are fully in English". Besides the question of what is meant by 'reliable' (and note that on WP:RS there is no specific list or easy 'how to determine' process), there are many, many cases when there are far fewer than 10 publications. Lots of smaller towns or villages are not mentioned in 10+ English publications; the same problem exists with many historical persons who might be notable in their country but are barely (or not at all) known outside. But if the proposal passes, then we will be forced to invent diacritic-less versions of many names; thus, for example, Okopy Świętej Trójcy would become Okopy Swietej Trojcy, because this former village is apparently almost completely unknown to the English-speaking world. Instead I'd like to draw attention to a similar naming convention, which proposes a different approach and seems to attract mostly positive comments. The Wikipedia:Naming conventions (geographic names) proposed policy supports the use of English names, but states that if there is no widely accepted English name (with 'widely' being defined later) then the local name should be used. I personally believe that this policy is more realistic, and it can be expanded beyond geographic names to other 'rare' names.--Piotr Konieczny aka Prokonsul Piotrus Talk 21:51, 5 June 2006 (UTC)

FYI,
...if you ask me not the best example of a stable Polish name... Up till now 20% of the total number of edits to that page have been page moves... For me this rings a bell that maybe a good guideline would be better than this move-warring... no?
Then you also make a link to Wikipedia:Naming conventions (geographic names) which is at a no-consensus for "proposal F" state, afaik *longer than the diacritics proposal exists* - I wouldn't boast too much on the "near to consensus" state of Wikipedia:Naming conventions (geographic names). Anyway, I don't even see "competition" there, its parameters for determining a choice for a name are comparable to those of the more general "diacritics" proposal ([12]) - and it certainly isn't a guideline that would be less complex to put in practice.
For instance, when applying the recommendations of that proposal (version F) to your example, I'd need to check Britannica, Columbia, Encarta, Google Scholar and Google Books. I pick one of these 5 recommended reference sources (Google Books):
  • "Okopy Świętej Trójcy" - did not match any documents. [13]
  • "Okopy Swietej Trojcy" - did not match any documents. [14]
  • 2 pages on "Ramparts of the Holy Trinity" [15]
  • "Stronghold of the Holy Trinity" - did not match any documents. [16]
  • 1 page on "Okopy Sw. Trojcy" [17]
IMHO, this confirms the present name of the article at Ramparts of the Holy Trinity, and not any version of the Polish name. But yeah, true, if the "geographic names" proposal were a guideline I'd need to check 4 more reference sources in the same fashion. For the "standard letters with diacritics" proposal, the case would already have been settled: translation appears indicated... no need to discuss Polish versions with or without diacritics. --Francis Schonken 10:08, 6 June 2006 (UTC)
Your search above is actually misleading. The 3 (not 2) pages on Ramparts are actually:
[18] should be discarded because it does not refer to the village but to the ramparts of the castle.
This is the same case; note the lower case used in ramparts (surely if it were the village's name it would use an upper case?).
The third source is the only one which capitalizes it.
Finally, it should be noted that while translation makes sense in a literary text (like #2, or discussions about it like #3), it makes no sense when we are referring to the geographical place.--Piotr Konieczny aka Prokonsul Piotrus Talk 17:30, 9 June 2006 (UTC)
My feeling is this: If a particular town or structure isn't being written up in any English publications, then by some standards we shouldn't have an article on it at all, because it doesn't pass the "Notability" test. On the other hand, I do see some advantages to having articles about not-necessarily-in-the-press things, such as schools and towns and some obscure bits of history. So I'm willing to accept the idea of having an article at Wikipedia about it, as long as the article is titled by whatever the most common English name is. If there's no English name, then okay, I'd say list the title, but without diacritics. Personally, even though I have a passing familiarity with several languages, I find diacritic names jarring to look at in what is supposed to be an English reference work, because it looks like an article has been written solely for the use of the locals in that area, which makes it less accessible to other nationalities (including people on other continents for whom English is a second language anyway). It's unsettling to see a name that is so clearly unpronounceable to most English speakers. So I would rather see the title use the non-diacritic version, which is how it would probably show up anyway in an English-language newspaper if they *did* end up writing an article about the town. And in this case, the problem would be self-correcting. If a town with an odd spelling did become famous in English-language literature, and genuinely notable, then the Wikipedia article would have a standard to follow: If the article was showing up in English-language newspapers with diacritics, a move could be requested to the more commonly-used version of the name. But until then, I'd say let's stick with the "no diacritics unless it can be shown that it's common usage in English" guideline. --Elonka 18:06, 9 June 2006 (UTC)

Related poll

Interested editors are invited to participate in: a poll on whether or not to use diacritics in the titles of Polish monarchs. --Elonka 18:13, 13 June 2006 (UTC)

Major rewrite

I took a stab at simplifying and condensing the proposed guideline to make it easier to read and understand. If I removed anybody's favorite section, please feel free to add it back in. :) --Elonka 06:05, 26 June 2006 (UTC)

I think it reads quite well now. Honestly, i think this is all common sense and I would like to see this get moved up one to become a guideline (eventually). Masterhatch 17:51, 26 June 2006 (UTC)

somewhat related

There is a somewhat related poll here Talk:Voss-strasse if anyone is interested in adding their two cents. Masterhatch 17:47, 27 June 2006 (UTC)

Scope addition

As regards the other "letters not included in this guideline", such as þ and Đ and ß, what is the feeling about adding wording such as: "Because of the limited geographic regions in which these letters are used, English-speakers in other parts of the world (especially those for whom English is a second language) often find these symbols incomprehensible and unpronounceable. As a result, this guideline recommends that their use be avoided in article titles." --Elonka 20:46, 27 June 2006 (UTC)

That makes sense. Personally, i don't know the difference between the sounds the diacritics make and i am a native speaker of English. Masterhatch 22:36, 27 June 2006 (UTC)

Other options

I hope that we have finally reached the agreement that linking through redirects is not a relevant issue, and that we can concentrate on English usage in Wikipedia articles. There are some important points to be made:

  • Wikipedia conventions are just that, conventions of Wikipedia. They are not natural laws, and we can choose to implement any naming convention we like, including numbering articles by timestamp at creation or some other scheme. Use English is a convention of Wikipedia because we decided so, not because that's how it must be.
  • There is no central authority on correct usage in English.
  • Popular usage is not always correct - even if most sources refer to Tories, the party's proper name is still the Conservative Party.
  • We choose article titles which we judge as the most appropriate for Wikipedia, not those that are the most correct (United States is properly called United States of America) nor always those which are most commonly expected (China is overwhelmingly used to mean People's Republic of China in the real world).
  • There are sources in English which use diacritics regularly, those which use them irregularly, those which use just certain diacritics or use them just in certain languages, and those that don't use them at all.
  • Foreign words used in English text don't automatically become English words.

So, we should approach this question with an open mind. Use English does not require us to not make exceptions for classes of special cases. It also does not require us to use or omit diacritics, as neither omitting nor including them is wrong in English per se. We should discuss what the advantages and disadvantages of using diacritics are. One obvious advantage I see is providing the additional information. The one real practical disadvantage mentioned so far is that they make it hard to search for the name inside the browser page.

There is the real possibility that both ways are equally correct and that it's a matter of taste, and tastes are hard to change through debate. In such cases, we should be looking for widely acceptable rather than hypercorrect solutions. IMO, we should aim to avoid the ridiculous situations when the spelling of people's names depends on and changes with where they currently live or work, or when the same first or last name is spelt differently in different article titles without a clear criterion.

The rewritten proposal is much better than previous attempts, but it has two major problems: (1) it's too long, and (2) it goes against the current practice, which has many supporters, and which will be hard to change, even if this is promoted into a guideline. Keep in mind that putting a guideline tag on something doesn't magically make all articles conform with it nor all editors agree with it.

Other solutions include:

  • Use no diacritics at all
This would have the advantage of being short, clear and easy to enforce, but as the current proposal says, it would sometimes force us to use wrong titles even for English names, which makes it unacceptable.
  • Always use the original spelling
This would also be short and clear, but it would be simply wrong for monarchs and many other historic people, which makes it unacceptable.

There are other, more nuanced options, none of which should exclude per-case decisions in special circumstances, nor using English names when they are spelt entirely differently.

  • Use whatever makes more sense, or what the first editor used if no choice is substantially better
This is how BE/AE spelling and CE/AD era notations are handled.
  • Use diacritics if the common English spelling is the same as the original one, but without the diacritics.
This is more or less how place names are handled.
  • Use the original spelling unless the person has legally adopted the spelling without diacritics or regularly uses it even in their native language.
This would cover naturalized citizens or other people who have genuinely changed their name and language identity, while leaving most articles as they are now.

In short, I'd prefer names to be spelt by default the way properly translated sources in English from the country of origin spell them (i.e. use the local transliteration), but any of the nuanced solutions are acceptable to me. Zocky | picture popups 01:29, 28 June 2006 (UTC)

What is this "natural laws" thing you keep talking about? are we discussing physics? Well, we decided to use english because this is the English language section of wikipedia. It would be kinda strange to use korean here, now wouldn't it? anyways, that is why there are multiple language sections on wikipedia.
Well, the Brits have their English and the Yanks have theirs. For wikipedia, we blend the two and use the most common form of English.
While popular usage isn't always correct, wikipedia has a policy of using the most common form of English in usage because wikipedia is for the layman and it is the most common form of English that the layman understands the best.
See above, we use the most common form, unless there is a disambig problem, of course.
That is why we go with the most common form. Simple, eh? If a word or name is most commonly written with diacritics, then wikipedia should use diacritics. If it is most commonly written without diacritics, then wikipedia should follow suit. Masterhatch 07:26, 28 June 2006 (UTC)
Look, the only foreseeable solution I can see is that for names, places, etc. where diacritics are most commonly used in English, they keep the diacritics here in Wikipedia article titles. For articles where English most commonly drops diacritics, Wikipedia should reflect that. Isn't that simple? How can anyone logically argue against that? That way both sides win. People who like diacritics get them on articles where they are most commonly found in English, and people who don't like diacritics don't have to have them rammed down their throats for words that they almost never see them on in daily life. Masterhatch 07:26, 28 June 2006 (UTC)
The above was inserted into middle of my comment which made it hard to read. All I can say is that it has been previously established on numerous occasions that reader ignorance is not a valid concern for editorial decisions, so most of Masterhatch's comment is irrelevant. We also know that common usage is one of the factors used for these decisions, not the overriding deciding factor, which makes the rest irrelevant. Zocky | picture popups 10:30, 28 June 2006 (UTC)

Zurich -- (Talk:Zürich/Archive1#Move (Zürich -> Zurich)) --Philip Baird Shearer 07:43, 28 June 2006 (UTC)

I've reverted some of the changes to the proposal:

(1) categories don't have redirects
So it's Category:Compositions by George Frideric Handel (without diacritic) if we have the composer at George Frideric Handel, and Category:Compositions by Camille Saint-Saëns (with diacritic) if we have the composer at Camille Saint-Saëns. I tried to draw a bit more attention to the category aspect of being consistent (since categories don't have redirects). Also the section is important, because it draws attention to the fact that being consistent only applies to the name of a topic, not to "copying all diacritics of a language".
(2) this is NC not MoS
refers to the "first contributor" rule (copied from MoS) that I removed from this NC proposal, since the "first contributor" rule can only be used when the competition is between varieties that are acceptable in English. In other words, one doesn't fight non-English nationalistic POV by inciting people to start as many wrongly named articles as possible, to give way to "I was the first" claims.
(3) remove "national varieties" doubling (Irish not nat. var. of Eng)
don't know why the "national varieties of English" link, that was already in the intro of Wikipedia:Naming conventions (standard letters with diacritics)#Specifics according to language of origin was doubled in a rephrased format in one of the subsections. Note that Irish is not a "variety of English" (the rephrased intro was a bit more ambiguous on this point).
(4) keep all commented out proposals until further notice
why delete some, and keep some others? Some of the deleted ones were pages in Wikipedia "naming conventions" format, some of those that were kept, were merely a link to an encyclopedia article (so not guidance on how such specifics are handled in wikipedia page names) --Francis Schonken 09:41, 28 June 2006 (UTC)
  • If you really mean this: "After the choice has been made whether a name is written with or without diacritics in a page name, all other Wikipedia pages", then I find it unacceptable. It is a step way beyond WP:NC. The reason for redirects is so that Wikipedia can accommodate other names for the same subject.
    • I see the confusion I generated: changed that to content pages. I hope "content page" is clear enough as a concept, or should "except redirect pages and disambiguation pages" be added to that? --Francis Schonken 11:24, 28 June 2006 (UTC)
  • I think that when there is a dispute over the name which can not be resolved then first is a reasonable compromise. If not how does one decide as a page has to have a name? It cuts down on revert wars while an alternative consensus is reached.
    • WP:RM always leads to a resolution (even if "stalling by lack of consensus"). And it has a slight bias towards "keep where it is" (since 60% is the usual threshold for a move). And part of the rules is, as far as I know, that a WP:RM should not start from a place where the page has just been moved to (recently there was still a WP:RM vote broken off for that reason). --Francis Schonken 11:24, 28 June 2006 (UTC)
  • Depends on what you mean by "Irish"; let's call it "Irish English", as that reduces the ambiguity.
  • Only those which effect National verities of English should be mentioned here. The rest should not, because there are potentially hundreds of these and there is no reason why this general guideline should be explicitly subservient to other potentially POV-laden guidelines like this proposed one: Wikipedia:Naming policy (Czech):
Czech names: almost all names with diacritics use it also in the title (and all of them have redirect). Adding missing diacritics is automatic behavior of Czech editors when they spot it. So for all practical purposes the policy is set de-facto (for Cz names) and you can't change it.
"Only those which effect National verities of English" - the title of the section is "Specific languages using the (extended) Latin alphabet". Neither Irish nor Māori language are a National variety of English. French has a more profound effect on UK English than on US English; it seems also that, for instance, Spanish has a more profound effect on US English than on UK English. But this is not the point of this section (while these US/UK style variants are treated in the MoS). If "böttger ware" turns up in Webster's, with a diacritic as in German, this is not an issue limited to "national varieties of English", but it should be part of Wikipedia's diacritic-related guidelines. FYI, German is a language using "extended Latin alphabet". --Francis Schonken 11:24, 28 June 2006 (UTC)
--Philip Baird Shearer 10:16, 28 June 2006 (UTC)

Catering for dumbed-down and lazy usage

A café is a café is a café. It is a French word which we, English speakers, have adopted to mean a particular type of building. The word is not an English word and it would be incorrect to treat it as such--it is a French word used by English speakers. Sure, some people spell it "cafe", some people even pronounce it "caff" - that's fine, local variation is good - language develops and evolves. One day, in a few decades' time, the spelling "café" may seem quite alien - at that time, it will have been fully adopted. The word role, once spelt rôle, is an example of a French word which has been fully adopted into English and whose original spelling looks odd to most. This is the distinction between a non-English word in common usage among English speakers, and a fully adopted word. The misspelling of café is one thing, but proper names are quite another - "Antonín Dvořák" is spelt one way and there is no alternative. Ultimately, we are trying to write an encyclopedia, doing something somehow authoritative. In a casual e-mail I may miss the diacritics due to laziness and in the understanding that the recipient would understand who I was referring to. We are not writing a casual e-mail, we are not texting our friends and we are not instant messaging our colleagues. As such, we should not treat language as if we were. It would also be wrong to go too far in the opposite direction and become too prescriptive about language, insisting that those who don't use diaereses in words such as the verb meander (thus meänder) are somehow wrong or illiterate. Of course, we are still a dynamic work that is able to stay current and appeal to all, but we have at our disposal a range of tools which enable us to use the correct characters for a wider range of languages than anyone has ever had in history. Wikipedia is such that if someone doesn't understand, or recognise, a particular character they are able to look it up and educate themselves. We should make the effort to provide truth.

"Diacritics should only be used in an article's title, if it can be shown that the word is routinely used in that way, with diacritics, in common usage" is entirely flawed as a guideline. Common usage varies from continent to continent, from country to country and from culture to culture. I am sure it is common for most people who are writing about Dvořák to spell his name "Dvorak" because they aren't sure how to get that funny "ř" character to appear (this was particularly the case in the days of the typewriter). "Dvorak" would therefore be considered common usage, but this doesn't detract from "Dvorak" being wrong through-and-through when referring to the man Dvořák. --Oldak Quill 17:17, 28 June 2006 (UTC)

In that case, the article about the man should definitely include the proper spelling of his name, in the body of the text. This guideline is not referring to the main article, but strictly to the titles of articles, and trying to come up with a consistent method which allows for ease of linking, reading, and finding, for the vast majority of English speakers. If I, myself, were looking for an article on the man you mentioned, I would type "Dvorak" into the search box, not something with diacritics. --Elonka 17:36, 28 June 2006 (UTC)
I cannot see your reasoning. Surely the title of an article should be spelt correctly? This is particularly the case in Wikipedia as we have redirects which allow us to use correct characters in titles without inconveniencing our visitors. Redirects allow us to both maintain a high standard of spelling and lexical correctness while making the browsing experience easy for the visitor. Potentially, all people who have primarily Cyrillic names could have Cyrillic article titles, redirects would ensure that finding the correct article would be as easy as finding a non-Cyrillic counterpart (Pyotr Ilyich Tchaikovsky, or whichever transliteration we chose to use, would redirect to Пётр Ильич Чайкoвский). Of course, using non-Latin alphabets in titles goes too far for most so it is something we don't do. But diacritics are easy to use and understand, they are part of our Latin alphabet- we should not incorrectly label people and things due to sheer institutional laziness. --Oldak Quill 17:46, 28 June 2006 (UTC)
I have to disagree with you, Quill. I am a strong believer that the most common form of English should be used in article titles (except of course in the event of disambig problems). Diacritics, in most cases, are foreign to English, and if you look around, most people, places, and things that have diacritics in their native language lose them when mentioned in English. Take Jaromir Jagr for example: his name in Czech includes diacritics. If you look around publications in English, the vast majority of the time the diacritics are dropped, even on his hockey sweater. The native spelling, with the diacritics, should be (and is) shown in the first line of the first paragraph of the article. The title should be the most common form found in English, whether the most common form includes diacritics or not. Your basic reasoning is that English is spelling the names wrong. Well, I am telling you, English is not spelling it wrong, it is just spelling it in English. That is what, in fact, got me into this. I used to not care either way if there were diacritics in titles until someone called the English spelling wrong. That lit a fire in my arse because it is pure ignorance for someone to call an accepted English spelling wrong. I don't go and say that "États-Unis d'Amérique" is spelt wrong and they must spell it the English way! I understand that French has its spellings and you must understand that English has its spellings. Masterhatch 18:37, 28 June 2006 (UTC)
I would say that as long as a name is written in a Latin-derived alphabet and there is no commonly accepted English name (such as Spain for España, Rome for Roma etc) then the name should be written with its original diacritics. A name is a name, it is either spelled correctly or incorrectly and we shouldn't start disfiguring it. The majority of non-English names, whether they are of places or people are too little known in English to have commonly accepted English spellings. Simply because the name of a Czech village or a minor figure from Paraguayan history has diacritics that are not normally used in English doesn't mean they should be removed when written in English as there are no commonly accepted English spellings for such names. Booshank 19:21, 28 June 2006 (UTC)
Well, maybe some of those places that aren't well enough known to English speakers aren't notable enough to have their own article. If those places can't be found in an English atlas, then why would Wikipedia have an article about them? If there are no English publications in regard to that place (or person), are they really notable enough to have an article? Most small Czech villages can be found in a thorough English atlas. If those English atlases have diacritics, then Wikipedia should follow suit. If those atlases drop the diacritics, then, again, Wikipedia should follow suit. Same with people. If there are no or only a very few English publications about a person, then is he/she really notable enough to have an article? If there are English publications about that person, have a look at them and, again, whichever form is most common (with or without diacritics), use that form. I have no problem with the use of diacritics if that is the most common way to spell that name in English. I only have a problem with the use of diacritics when the most common way of writing that name in English drops those very diacritics. Masterhatch 20:15, 28 June 2006 (UTC)
I don't believe a single source should ever be used to push an agenda. Just because a single atlas rejects diacritics (for the sake of space, perhaps) does not mean that we should follow suit. Diacritics should only be excluded if a non-diacriticed version has become more popular in English. On a side point, whether there are any publications on something in a particular language (where there might be plenty in another language) is not a measure of notability. Czech villages should be covered as extensively as British villages in Wikipedia (the latter of which will be covered far more thoroughly in the English language), for example. Articles on small Czech villages which won't be widely known enough among English speakers to have their own spelling variants (regardless of what a particular atlas states) should always be given the correct Czech name. --Oldak Quill 20:53, 28 June 2006 (UTC)
Single source? You must have misunderstood my comment. I would never say that a single source is good enough. Anyway, to clarify, for notability's sake, if there is no mention of a small Czech village in an English reference book, then I ask, is it really notable enough for an article? But that is sidetracking and that is a debate for a different day and a different place, so I won't discuss it further. Back to my point: if most English atlases and reference books aren't using diacritics for a Czech town (city, village, person, whatever), then Wikipedia shouldn't either. If most English atlases and reference books are using diacritics, then Wikipedia should too. Funny thing is, with all this arguing back and forth no one has told me what is wrong with that. It is fair for everyone and it follows the Wikipedia naming convention policy to a tee. Masterhatch 02:28, 30 June 2006 (UTC)
[Was replying to Masterhatch and had an editing conflict. This is a response to Masterhatch, but I completely agree with Booshank]. Of course "États-Unis d'Amérique" isn't wrong; this example is not analogous to what I have been saying. États-Unis d'Amérique and United States of America are both correct, it is "Etats-Unis d'Amerique" which is not. If one is going to use French words then spell them correctly - E is a different letter to É. In many languages with diacritics, forgetting the diacritic can result in an entirely different word with an entirely different meaning. In Afrikaans, for example, if one forgets the diacritic on the ë in the word "hoërskool" (meaning high school) and so produces "hoerskool", one would be expressing "whore school". This is an example which demonstrates the fact that a letter with and a letter without a diacritic are different letters and to confuse them is to arrogantly thrust the English non-use of diacritics onto loaned words. Jaromir Jagr chooses to spell his name differently in English, to transliterate his name for an English-speaking sport. That is perfectly acceptable and we use his adoptive English name in articles. But we, as an encyclopedia, cannot thrust new names onto people because it suits us, because it is easier for us to type. Proper nouns, and some adopted words, exist outside the rules of the language by which they are adopted - they are words which should still be treated with the spelling rules of the language from which they came until their usage is so common that those rules are dropped. My name is Oldak Quill and I wouldn't expect speakers of a language which doesn't use "Q" or "Qu" to change my name to "Oldak Kwill" to give themselves an easier time - it is just incorrect. Dropping the diacritics of a French or Polish word is nothing short of transliteration and is entirely comparable to changing "Quill" to "Kwill".
Transliterations can only ever give rough and fuzzy approximations of a word and should therefore be avoided as much as possible.--Oldak Quill 19:17, 28 June 2006 (UTC)
You are missing my point (and I am probably missing yours) that we here on Wikipedia aren't about changing English, but following the majority of English publishers to arrive at the "most common form of English". It is simple: if the majority of English publications drop the diacritics, then Wikipedia must follow suit. If the majority of other English encyclopaedias and reputable publishers don't use diacritics for names, why should we on Wikipedia? As I have said before, if the majority of reputable English publishers are using diacritics for a given name, then, by all means, include them in the article's title in Wikipedia. I am not trying to eliminate diacritics from Wikipedia, I am just trying to make sure that Wikipedia doesn't stray too far from "the most commonly used form of English".
See my reply below. Reputable English encyclopaedias do use diacritics in many, many names. --Oldak Quill 20:46, 28 June 2006 (UTC)
Agreeing with Oldak.--Piotr Konieczny aka Prokonsul Piotrus Talk 22:02, 28 June 2006 (UTC)

Not a place for change

Really, I think it is all pretty simple: if the majority of reputable English publications include diacritics, then Wikipedia should too. If the majority of reputable English publications don't use diacritics, then Wikipedia shouldn't either. It's a case-by-case situation. What is wrong with that? Honestly, if people have problems with English dropping diacritics, don't come to Wikipedia with your grievances. Wikipedia isn't here to change the English language or how English does things. Go to Webster's or Oxford or whoever. Wikipedia is not the place for change. Masterhatch 20:15, 28 June 2006 (UTC)

I have no desire to make Wikipedia radically different from other works of reference in English. As I said, if it has come to be that a word without diacritics is more common than one with, then it should be the one used (such as role instead of rôle). What I do object to is purposefully going out of our way to force change by standardly rejecting diacritics when they play a very valid rôle ;) in many words. Further, no works of reference that I know of force change in people's names to exclude diacritics. Dvořák is Dvořák is Dvořák, there is simply no alternative way to spell this name - the diacritics are not just aesthetic but functional. I am striving to do exactly what you accuse me of doing: all I see in this proposal is an enforced rejection of diacritics for the sake of dumbing-down and laziness. --Oldak Quill 20:46, 28 June 2006 (UTC)
I am not standardly rejecting diacritics!! That would be ignorant of me. I will repeat what I have said all along, because people for some reason keep missing it: if most English works use diacritics, then Wikipedia should. If most works don't, then Wikipedia shouldn't. This is a case-by-case situation, not a blanket covering! Someone please tell me why that won't work? Masterhatch 02:28, 30 June 2006 (UTC)
Nonsense. There are zillions of reputable English publications discussing Antonin Dvorak and other Dvoraks, and things such as the Dvorak Simplified Keyboard as well as its inventor August Dvorak. And there are probably as many English publications using Antonin Dvořak as Antonín Dvořák, so why is the former a redlink as I write this? Redirects are cheap enough that we could even include August Dvořák for fools who mistakenly think "Dvořák is Dvořák is Dvořák". Gene Nygaard 16:56, 18 August 2006 (UTC)

Dumbed down?

Café/cafe
IMHO, it's rather "dumbed down" to state that in English only the version with diacritic is correct:
Webster's 1981 international printed edition:
  • café, cafe
  • café au kirsch
  • café au lait
  • café brûlot
  • cafe car
  • café chantant
  • café concert
  • café crème
  • cafe curtain
  • café noir
  • café society
OED minidictionary (1994):
  • café
I was using café as an example - café is, as far as I can tell, the most commonly used form and the form used by most reputable dictionaries (including the current OED). I did not deny that some people did use the word "cafe" (in fact, I stated that they did) nor that derivative words might use that spelling (such as "cafe curtain"). --Oldak Quill 20:46, 28 June 2006 (UTC)
Well, "dumbing down" comes from calling a less used version a misspelling (that's the word you used). even caff (Brit : café) is in the addenda of the 1981 international printed Webster's, and so not a "misspelling". --Francis Schonken 21:31, 28 June 2006 (UTC)
I did not call caff a misspelling, I called cafe a misspelling. I was quite wrong about "cafe" being a misspelling though, but that isn't the discussion at hand. In trying to get my point across I erroneously emphasised something too strongly. All of my points (except calling cafe a misspelling) still stand. --Oldak Quill 22:02, 28 June 2006 (UTC)
Tx for taking the point. My next point is that "Antonin Dvorak" is not a misspelling in English. And that was your main point (at least your main example). I'm prepared to discuss whether the article on this composer should be at Antonin Dvorak or at Antonín Dvořák. In fact I already did, twice, as you can see in #So what actually is this saying? and #Using diacritics (or national alphabet) in the name of the article above on this page. As a result of these discussions, among others Category:Compositions by Antonin Dvorak was moved to Category:Compositions by Antonín Dvořák. I'm prepared to discuss the page naming for this composer again, but not on the basis of the "dumbed down" assumption that Antonin Dvorak is a "misspelling". --Francis Schonken 22:29, 28 June 2006 (UTC)
"ř" is an entirely different letter to "r". "Dvořák" is a different word to "Dvorak" with a different pronunciation. "ř" produces that particular "j" sound where "r" would produce, well... you know. It is not appropriate to replace "ř" with "r" just because they look similar. --Oldak Quill 23:11, 28 June 2006 (UTC)
"Antonin Dvorak" is not a misspelling in English. --Francis Schonken 23:17, 28 June 2006 (UTC)
"Antonin Dvorak" is not a misspelling in English. Gene Nygaard 12:20, 5 February 2007 (UTC)
Antonin Dvorak
As you might know, this composer lived and worked a few years in the USA. I always wondered how they wrote his name in the music school where he was director at that time? How was his name written on the concert programs when his music was performed during his stay in New York? Does anyone have any info on that? --Francis Schonken 20:11, 28 June 2006 (UTC)
That would be a good piece of info, but the spelling at the time is probably not what we want to use as a criterion. It would make us spell a bunch of historic names weirdly, if nothing else. Zocky | picture popups 13:01, 29 June 2006 (UTC)
Neither do I intend to do so! Unless it were clear that, in English, the composer used the diacritic-less version of his name exclusively. In that case we'd have the same situation as for Arnold Schoenberg, who clearly changed his name to a diacritic-less variant when moving to the USA. For this composer the version of his name with diacritics could be considered a "misspelling" in English. And only in English - all other languages write Arnold Schönberg afaik. --Francis Schonken 13:52, 29 June 2006 (UTC)

Diacretics are not English, full stop

This is the English language Wikipedia, so diacretics should not be used, other than to indicate the form used in an original language. We don't use Chinese characters, and there is no more justification for using diacretics, other than that to some degree we can get away with it. Words written with diacretics are not English, and that is the end of the matter. Chicheley 20:51, 28 June 2006 (UTC)

This is simply not true. English does make use of diacritics natively (such as the diaeresis). Further, it makes extensive use of diacritics in loan words because letters with diacritics are not the same as the similar-looking letter which doesn't have one. --Oldak Quill 20:55, 28 June 2006 (UTC)
I disagree, but that counts for little. Of rather more importance are the vast number of reliable sources which use diacritics where conventions dictate that they should be used. Angus McLellan (Talk) 21:43, 28 June 2006 (UTC)
I must agree with the title of this section: "Diacretics are not English, full stop". Indeed "diacretics" is not an English word. "Diacritics" is. And they are used in some English words, all dictionaries agree on that. Sorry for the pun. Couldn't resist :) --Francis Schonken 22:58, 28 June 2006 (UTC)

Foreign words are not English words. Full or partial stop or whatever. Zocky | picture popups 11:43, 29 June 2006 (UTC)

Of course loanwords are English words. Some of these have diacritics. See Wikipedia:Naming conventions (standard letters with diacritics)#Rationale--Francis Schonken 13:59, 29 June 2006 (UTC)
Personal names and other foreign words used in English texts are not loanwords, they're cited foreign words. Zocky | picture popups 14:04, 29 June 2006 (UTC)
Wasn't talking about proper nouns but about loanwords. The question was whether there are "English" diacritics. Proper nouns are afaik of no use when trying to prove whether diacritics are part of English or not. Loanwords, on the contrary, are useful in that context. And then the answer is yes, some diacritics are English. --Francis Schonken 15:07, 29 June 2006 (UTC)

Technology, not "it's not English," is why diacritics were historically stripped in English

The question was whether there are "English" diacritics. et al...

The notion that diacritics are "not English" is fundamentally rooted in pre-computer typesetting technology, whose ultimate technological achievement was the Linotype machine. Typesetting consisted of assembling a set of molds for all the letters and spaces in a line of text, then pouring hot metal into the assembled mold to create the line of print. It was simply not feasible (if not mechanically, then certainly not economically) to create Linotype machines which could do what we can do now from any computer keyboard, that is, type (almost) any language on the planet. And so, what you had were Linotype machines set up by language, with a few "extras" tossed in which were used often enough that including them in the set of additional "special" characters was not overly burdensome.
   The result, when printing foreign names in English, particularly in the case of Eastern Europe, were all sorts of variations from just the elimination of diacritics to a seemingly endless supply of semi-transliterations.
   References recognized this and began moving away from non-diacriticalized versions as early as the 1980s as phototypesetting technology began rolling out in force (having become commercially affordable in the late 1970s), using instead the original language form. However, it's only more recently that typing technology for the masses has caught up, from the RIGHT-ALT "GRE" language character shift standard to all the standard font faces supporting all the extant "code pages" and computer programs accepting alternate font "code pages" besides just Latin and "extended" Latin.
   Just in the tiny little article naming corner of the world I find myself embroiled in, we find the following question (leaving issues of monarchal titles, other languages, etc., out). What, for example, defines common English usage and/or current English usage for the Polish "Władysław"? Let's do the "Wiki" thing (google et al. searches, library searches, etc.):
  • Wladyslaw (drop the diacritics),
  • Vladislav,
  • Ladislau,
  • Ladislas,
  • along with the native Władysław.
Here we find three variants based on some sort of transliteration, one based on stripping the diacritics, and the native "Władysław," quite frequently used in major/popular references (not just obscure academic history journals) and simply indexed for English usage as if there were no diacritics.
   I submit that "Władysław is not English" is a red herring. Enough current English references, and going back 3 decades, simply use the native Polish syntax. So, what are the arguments in favor of the four non-Polish variants above?
  • A person "won't know how to type the ł's"—a moot point because redirects handle all the variants.
  • It looks strange—articles should mention all the historical English usage variants; I think it would be a net benefit for (many) English speakers to have a less parochial view of the Latin script; since it already appears to be generally accepted that one can use the native syntax within articles, there's no need to restrict the title.
  • Typing those ł's is damn inconvenient−shift to Polish keyboard, RIGHT-ALT plus "l", it really couldn't be easier... łłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłł (repeating key). Took less than 30 seconds to install Polish keyboard support on my PC.
  • A person typing it without the diacriticals won't find the Wiki article from a search engine—perhaps once upon a time, but as far as I can tell, search engines now pretty much ignore diacritics.
  • I can't save the file in Notepad—save as UTF-8.
It seems exceedingly odd to me that an encyclopedia which exists only because of computer technology, which provides a built in editor supporting all the (major) "font pages" so one can insert all manner of characters with the click of a mouse, would insist that article titles remain rooted in a technology that became obsolete thirty years ago.
   English usage no longer means "restricted to the English language character set." Yes, "Władysław is not English"—it's Polish, but "Władysław" is accepted and increasingly preferred English usage. —Pēters J. Vecrumba 03:49, 19 November 2006 (UTC)
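As a rough illustration of the redirect point in the list above (not any editor's actual tooling), the diacritic-free variant of a title can be derived mechanically, so such a redirect could in principle be generated rather than typed by hand. The sketch below is Python; the function name `strip_diacritics` and the small `EXTRA` table are assumptions made for this example. Note that Unicode decomposition alone does not cover "ł", which has no decomposed form, so it needs an explicit mapping.

```python
import unicodedata

# Letters that have no Unicode decomposition need an explicit fallback.
# This table is illustrative and deliberately incomplete.
EXTRA = str.maketrans({"ł": "l", "Ł": "L", "ø": "o", "Ø": "O", "đ": "d", "Đ": "D"})

def strip_diacritics(title: str) -> str:
    """Return the title with combining marks removed and EXTRA applied."""
    # NFD splits e.g. "ř" into "r" followed by a combining caron.
    decomposed = unicodedata.normalize("NFD", title)
    # Drop the combining marks, keeping only the base characters.
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever is left, then map the letters NFD cannot split.
    return unicodedata.normalize("NFC", without_marks).translate(EXTRA)

for title in ["Władysław", "Antonín Dvořák", "Zürich"]:
    print(title, "->", strip_diacritics(title))
```

Whether such a variant should be the title or only a redirect is exactly the editorial question under discussion; the code only shows that producing the variant is cheap.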
Don't be silly.
It isn't generally a technology limitation. Newspapers and magazines have long been able to include diacritics when they choose to. They do not choose to.
It's a "moot point because redirects handle all the variants"? Utter hogwash. First of all, redirects don't just happen. Just look through my contributions listing, and see all the redlinks mentioned in my edit summaries that are still red even after I have called the problems to the attention of those watching the articles.
Furthermore, there are much broader implications than what happens when you type something into the go box, or when you put a link into an article. The use of diacritics or not also has significant effects on searches in various search engines; and on the major search engines, there are so many different parameters that affect the results that those results are never predictable. I can show you a great many cases in which diacritics do indeed make significant differences, on various search engines.
We English speakers have every damn bit as much right to establish our own identity in the characters we use in our language as the users of some other language who can't think of any better way to establish their identity than to see how cute they can get with the squiggles they put on the letters they use. I get tired of hearing from far too many editors claiming not that we should choose to use the options with diacritics here on Wikipedia, but rather that it is an "error" for us not to do so. It is never an error to use the English alphabet when writing in English. We can at times choose to retain some other language's characters for some purposes; that does not mean that we are in any way obligated to do so.
I never used to be in favor of an "English as an official language" law in the United States. After dealing for two years on Wikipedia with POV-pushers who insist on sticking diacritics in hundreds of places where they clearly do not belong, I am now urging my representatives in Congress to not only pass such a law, but not to pass some wimpy, empty mollification of constituents clamoring for such laws by passing some meaningless nonsense, but rather to make a real law with real language police with real authority.
It may have taken you less than 30 seconds to install the Polish keyboard on your computer, but big fucking deal. What's the point? Yes, I can do that too. But the caps on my keys do not change. What the hell am I supposed to do? Be like those monkeys they talk about in math classes, who given enough time would duplicate all the works ever written? Do I just hit keys at random, until something shows up that resembles what I'm supposed to be looking for? Does that keyboard include "dead keys" that don't do anything themselves, but rather change what happens with the next key you hit? I may get it installed in 30 seconds, but it is hard to teach an old dog like me new tricks. It would take me 30 years to learn how to use that Polish keyboard, and I likely don't have that much left. By the way, I have for many years had both the German and the Norwegian keyboards used on my computers. Even there, where I know what the letters are, I have found it more trouble than it is worth to try to learn the keyboard layout. Rather, I find it much, much simpler to remember a few numbers and use the Alt-numeric keypad method to create them as I need them (and in most cases, it is things like Alt-134 that I remember, the old DOS operating system versions which still work with Windows, rather than the Alt-0229 Windows version which I had to look up because I don't know it). Of course, in the case of Alt-0248 and Alt-0216, which weren't available in DOS, I learned the Windows numbers.
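The Alt-number pairs mentioned above can be checked against the code pages they index into. This is a sketch under two assumptions that are not stated in the comment: that the DOS-style codes (no leading zero) refer to code page 437, and that the Windows-style codes (leading zero) refer to Windows-1252. Other systems may use different code pages. The helper names are made up for this example.

```python
def alt_code_char(code: int, codepage: str) -> str:
    """Return the character a single byte value maps to in the given code page."""
    return bytes([code]).decode(codepage)


def in_codepage(char: str, codepage: str) -> bool:
    """Report whether the character can be encoded in the given code page."""
    try:
        char.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


print("Alt-134  (cp437): ", alt_code_char(134, "cp437"))
print("Alt-0229 (cp1252):", alt_code_char(229, "cp1252"))
print("Alt-0248 (cp1252):", alt_code_char(248, "cp1252"))
print("Alt-0216 (cp1252):", alt_code_char(216, "cp1252"))
print("ø available in cp437?", in_codepage("ø", "cp437"))
```

Under those assumptions, Alt-134 and Alt-0229 both give "å", while "ø" and "Ø" exist only in the Windows code page, which matches the remark that those two had no DOS number.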
Then, what in the world do I do for the 500 or so other languages used here on the English Wikipedia. How many keyboards can I install? How much memory does each one of them take up? How am I ever going to remember what I need to do to switch to the one I want to use, let alone remember what the layout of the keys is if I do figure that out?
Your example of the "native" Władysław spelling is the most common late-20th century/early 21st century spelling in the Polish language, of a name that has had dozens of variant spellings both in the Polish language and in other languages throughout history, including the times when many of the people bearing that name lived, and including various other languages actually spoken by the people bearing equivalent names under whatever spelling. Gene Nygaard 03:04, 3 February 2007 (UTC)

Another thing

Another thing: We indeed don't use Chinese characters, but we do use the transliteration which the Chinese use. Zocky | picture popups 13:04, 29 June 2006 (UTC)

For that reason romanization systems (like pinyin) are defined as outside the scope of this proposal at Wikipedia:Naming conventions (standard letters with diacritics)#Scope --Francis Schonken 13:59, 29 June 2006 (UTC)
So, the Chinese and Serbs get to use their own transliteration, but Swedes and Czechs don't? Zocky | picture popups 14:03, 29 June 2006 (UTC)

OK, forget Serbian and take Macedonian, which uses a Latin spelling very similar to Serbian's, but only as transliteration. Keeping a person with the same last name at Buckovski or Bučkovski (which both would spell Bučkovski themselves), depending on whether they're from Serbia or Macedonia, sounds unworkable.

I really have no clue what you're trying to get at. How to romanize the Macedonian Cyrillic script is described at Wikipedia:Naming conventions (Cyrillic)#Macedonian. Indeed, there, it is described as "may be written as Serbian" (with a few specifics/variants). If you have a problem with that, please direct your concerns to Wikipedia talk:Naming conventions (Cyrillic). Wikipedia:Naming conventions (standard letters with diacritics) is not about romanizing Cyrillic scripts. If you want the people involved in the romanization of Cyrillic script languages to read your suggestions, then this talk page is not the right place, Wikipedia talk:Naming conventions (Cyrillic) is. --Francis Schonken 16:19, 29 June 2006 (UTC)

My idea is that all languages should be treated the same - use the same spelling as used in English texts produced in the country of the language's origin. Zocky | picture popups 15:50, 29 June 2006 (UTC)

Don't see what that would solve. These English texts produced in the country of the language's origin don't all use the same spelling. And a side-effect would be that you'd make the current *agreement* on National varieties of English (as described at WP:MoS) explode: "the country of the language's origin" would be the UK in that case I suppose, so you'd get all the USA people against you. --Francis Schonken 16:19, 29 June 2006 (UTC)

Read that as "the country of the original language's origin", of course. In other words, spell Slovenian names as English texts produced in Slovenia do, spell Chinese names as English texts produced in China do, and spell American names as English texts produced in the US do.

UK would still be "the country of the original language's origin" when speaking about English (the original language) --Francis Schonken 16:49, 29 June 2006 (UTC)

The problem with this proposal excluding romanization is that it would e.g. force Serbian and Croatian names to drop diacritics while the same names used in Macedonia would keep them. Imagine a situation where the presidents of Serbia and Macedonia both had the same first or last name, which includes a diacritic both in the Serbian Latin spelling and in the Macedonian romanization. A sentence saying "Sasa Cacic visited Saša Čačovski in Skopje" would look ridiculous. Zocky | picture popups 16:34, 29 June 2006 (UTC)

An example showing that English texts produced in the country of the language's origin don't all use the same spelling:
Note that all the mentioned websites are Polish (.pl), and that for the Polish pages of each of these websites always the version with diacritics is used... (I mean: the differences in the English spelling don't result from the often gratuitously assumed "laziness" in this case).
So, no, I don't think Zocky's alternate proposal would solve much.
Nor for Chinese, for that matter: Lao Tzu as well as Laozi (and some other variants) can be encountered in English texts produced in China. --Francis Schonken 16:47, 29 June 2006 (UTC)
Of course they don't all use the same spelling, but that's in no way different from English texts produced in English speaking countries, but it would still be the same rule for all languages. Zocky | picture popups 17:19, 29 June 2006 (UTC)

There are two aspects to your proposal (apart from the US/UK English thing, but that could be worked away with a diligent way of formulating the principle):

1. Use local sources in English for determining spelling in English Wikipedia
This has several problems; for one, it would be less compatible with the current provisions of Wikipedia:Naming conflict. For example, for Lech Walesa/Wałęsa, using the table provided by that guideline:
Criterion                                                | Lech Walesa | Lech Wałęsa
1. Most commonly used name in English                    | 1           | 0
2. Current undisputed official name of entity            | 0           | 1
3. Current self-identifying name of entity (in English!) | 1           | 0
1 point = yes, 0 points = no. Add totals to get final scores.
This is a weighted result: it doesn't give precedence to a single principle, and it is compatible with the present "diacritics" proposal. What you propose is that a single principle gets precedence, a principle that doesn't apply equally to all countries/languages (not all countries/languages produce readily available "reliable sources" in English covering everything that is notable about the country, for instance; for several countries the majority of reliable sources in English are produced outside the country).
So, as far as your "published in English in home country single principle" proposal is concerned: this might seem a good idea on first sight, but I foresee too many problems, and won't support it.
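The tally in the table above can be written out as a short sketch. The points are copied from the table as given; the variable names are made up for this illustration, and nothing here is part of the naming conflict guideline itself.

```python
# Criteria and points as listed in the table above (1 = yes, 0 = no).
criteria = [
    "Most commonly used name in English",
    "Current undisputed official name of entity",
    "Current self-identifying name of entity (in English)",
]
points = {
    "Lech Walesa": [1, 0, 1],
    "Lech Wałęsa": [0, 1, 0],
}

# Each criterion counts equally; the final score is the plain sum.
totals = {name: sum(row) for name, row in points.items()}

for name, total in totals.items():
    print(f"{name}: {total} of {len(criteria)}")
```

With these points the version without diacritics totals 2 and the version with diacritics totals 1, so no single criterion decides the outcome on its own.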
"self-identifying name of entity (in English)" is roughly what I'm talking about, and in this case would probably more commonly be the one with diacritics. In fact, for almost all foreign names, items 2 and 3 give points for the name with diacritics and trump the English common usage. I guess that's why most articles are at the names with diacritics now. Also, Wikipedia:Use English says that for languages which use the Latin alphabet, no transliteration is necessary, which I interpret as "use the original spelling". Zocky | picture popups 02:43, 30 June 2006 (UTC)
I've no idea what, in sum, you're trying to say:
  • Currently "self-identifying name of entity" should determine for 33% (the other two thirds being "official name" and "common name in English"), per the naming conflict guideline;
  • Then you say: no, "self-identifying name of entity" should determine for 100%, it is an appropriate formulation of the "published in English in home country single principle";
  • Then you say: no, "self-identifying name of entity" should determine for 0% while it trumps English common usage (which, furthermore, it obviously didn't in the example given above).
...all in all a quite confusing comment.
Also, your quote of Wikipedia:Use English is very questionable. The sentence where you quote from has quite clearly "If there is no commonly used English name". Arguably, for example, "Andre" is the common English format of the French name "André". Seems also as if you never read the guideline till the end. It has very clearly: "There is disagreement over what article title to use when a native name uses the Latin alphabet with diacritics", in Wikipedia:Use English#Disputed issues. It is (a part of) that dispute we're trying to solve with the present "diacritics" NC proposal. Your comments above seem so confused to me, that I still don't know what grounds you have to either support the thing getting solved, or not. --Francis Schonken 07:38, 30 June 2006 (UTC)
The problem here is the idea that everything has an "English name", which is simply not true. Some things are named in other languages and English uses them as citations. With substantial usage some of these become English words, and sometimes the spelling changes (that's how "Andrew", the real English equivalent of "André" came about). But in most cases where diacritics are used there are no English words, just cited foreign ones.
"Self-identifying name" to my mind is simple - what the person or entity uses themself. I have never said that anything trumps common English usage automatically or common English names at all (in fact, I supported titles like Oder, Drave, Save, Styria, etc.). I just meant to comment that if the above template is applied, diacritics would win in most cases, even if versions without diacritics were really "English".

How citations are rendered, is a matter of choice, but there's no magic formula that says that dropping the funny dots makes a foreign name or word English. Zocky | picture popups 03:33, 15 July 2006 (UTC)

2. treat the Latin alphabet languages and those with other native scripts with the same rules.
In fact I agree with you there. The "caveat" for the non-Latin alphabet languages is a practical one. Wikipedians have elaborated guidelines for Japanese, Chinese, etc... I think they did a good job. I'm not remotely experienced enough in these languages to doubt their assertions that on some level somewhere a more "formal" linguistic romanization system should be used, like pinyin, which results in some diacritics being used. Anyway, that's a different problem, and is, for those languages, covered by active guidelines. I don't think it would be a good idea to undermine that work. Of course, in the short term the guidance for the natively "dual script" languages (how many are there: 2 or 3?) should be clear. Which for Serbian means that, unless the "Cyrillic" naming conventions page is updated in view of the impending diacritics NC guideline, things will be as said if and when this diacritics guideline goes live (change "Latin spelling is used" to "Latin spelling is used including native diacritics" on the Cyrillic NC page, and the thing would be settled too, without Serbian names needing to be changed).
Whether at a later stage Japanese, Cyrillic, Chinese, etc. guidelines are to be brought in line with the "Latin alphabets" diacritics guideline is not a problem to be solved now. Maybe it never happens. If it happens, and it's a language I'm remotely acquainted with (Greek might fall in that category [19]) I'd support a diacritic-free romanization. --Francis Schonken 19:08, 29 June 2006 (UTC)