User:RandomP/somethings

From Wikipedia, the free encyclopedia

Some things I would like to see on Wikipedia

Computer-readable Wikipedia / Wikipedia Metadata

A computer-readable "language" version of Wikipedia. Right now, we're doing an okay job on English-language content (and an acceptable job on other languages, though I hate to see people wasting time on other languages — English is the only language that can reach a large share of the world population), but it's too hard to make computers do cool stuff with it.

By computer-readable, I mean that it should be easily possible for a client (someone with an okay net connection and some computing power and memory, essentially) to get the result of semantic queries of the facts used for Wikipedia.

Getting this started would be fairly easy — add a new language wiki, and start writing articles like this:

[[ [[Carl Friedrich Gauss]] was born on [[April 30, 1777]] ]]. [[ [[Carl Friedrich Gauss]] died on [[February 23, 1855]] ]]. [[ [[Carl Friedrich Gauss]] was a [[mathematician]] ]].

or

[[ According to [[stats.gov.nz]], [[ as of [[2006]], [[ the [[population]] of the country of [[New Zealand]] is [[ [[4,116,900]] people]] ]] ]] ]]
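As a rough illustration of how little machinery a client would need to read statements like these, here is a minimal sketch of a parser for the nested double-bracket notation. The function names `parse` and `template` are made up for this example and are not part of any existing Wikipedia software; the tree representation (nested Python lists) is likewise just one possible choice.

```python
def parse(text):
    """Parse nested [[...]] markup into a tree of lists and strings."""
    root = []
    stack = [root]   # stack[-1] is the node currently being filled
    buf = []         # characters of the plain-text run being read

    def flush():
        # Store the pending plain text (if any) in the current node.
        s = "".join(buf).strip()
        if s:
            stack[-1].append(s)
        buf.clear()

    i = 0
    while i < len(text):
        if text.startswith("[[", i):
            flush()
            node = []
            stack[-1].append(node)
            stack.append(node)
            i += 2
        elif text.startswith("]]", i):
            flush()
            if len(stack) == 1:
                raise ValueError("unbalanced ]]")
            stack.pop()
            i += 2
        else:
            buf.append(text[i])
            i += 1
    flush()
    if len(stack) != 1:
        raise ValueError("unbalanced [[")
    return root


def template(node):
    """Return the statement pattern with empty slots, plus the slot contents."""
    pattern = " ".join("[[]]" if isinstance(p, list) else p for p in node)
    args = [p for p in node if isinstance(p, list)]
    return pattern, args


tree = parse("[[ [[Carl Friedrich Gauss]] was born on [[April 30, 1777]] ]]")
pattern, args = template(tree[0])
print(pattern)  # [[]] was born on [[]]
print(args)     # [['Carl Friedrich Gauss'], ['April 30, 1777']]
```

The pattern string produced by `template` has the same shape as the redirect title `[[]] was born on [[]]`, so it could serve as the lookup key for whatever meta-article describes that kind of statement.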

Nothing that couldn't be in the English Wikipedia, except for the tedious repetitions and the extra double-brackets. The way I'd envision it, there would be meta-articles with titles like [[X]] was born on [[Y]] (and a redirect to that from [[]] was born on [[]]) that would contain, in a limited number of computer languages, ways of dealing with that information.

(For example, [[X]] was born on [[Y]] should contain much the same information as the birthdate of [[X]] was [[Y]], but should also contain computer code resulting in the above implying [[ [[Carl Friedrich Gauss]] was born in the [[18th century]] ]], and so on.)
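The kind of code such a meta-article might carry could look roughly like the sketch below. It assumes dates are written in the "Month day, year" form used in the examples above and that years are AD; the function names are invented for illustration.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 18 -> '18th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def century_of(date_text):
    """Century of a date written like 'April 30, 1777' (AD years only)."""
    year = int(date_text.rsplit(",", 1)[1])
    # Years 1701..1800 all belong to the 18th century.
    return (year - 1) // 100 + 1


def born_in_century(person, date_text):
    """Derive the implied 'born in the Nth century' statement."""
    century = ordinal(century_of(date_text))
    return f"[[ [[{person}]] was born in the [[{century} century]] ]]"


print(born_in_century("Carl Friedrich Gauss", "April 30, 1777"))
# [[ [[Carl Friedrich Gauss]] was born in the [[18th century]] ]]
```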

Maybe more importantly, articles/programs like [[As of [[TIME]], [[STATEMENT]] ]] and [[According to [[SOURCE]], [[STATEMENT]] ]] would help tag the statements made, so users could choose which sources to trust.
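To make the "choose which sources to trust" idea concrete, here is a small sketch in which every statement carries its source and date, and a client keeps only the statements from sources the user has picked. The `Claim` class and `trusted` function are hypothetical names; the second claim and its source `example.org` are placeholders invented purely to show the filtering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    source: str     # the SOURCE slot of "According to [[SOURCE]], ..."
    as_of: str      # the TIME slot of "As of [[TIME]], ..."
    statement: str  # the wrapped STATEMENT


def trusted(claims, trusted_sources):
    """Keep only the claims whose source the user has chosen to trust."""
    return [c for c in claims if c.source in trusted_sources]


CLAIMS = [
    # Taken from the New Zealand example above.
    Claim("stats.gov.nz", "2006", "the population of New Zealand is 4,116,900 people"),
    # Placeholder claim from a made-up source, for illustration only.
    Claim("example.org", "2006", "some competing statement"),
]

for claim in trusted(CLAIMS, {"stats.gov.nz"}):
    print(f"According to {claim.source}, as of {claim.as_of}, {claim.statement}")
```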

How and where software should live that actually reads that information, and replaces the "categories" currently available in the Wikimedia software, is quite another issue. I'd imagine that the mvs client that already exists would be quite useful, but that Wikipedia would also want to run a client itself.

Just as an example of how useful this could be: find the average population density of countries that are members of both the EU and NATO.

Doing that with the CR Wikipedia would be trivial: there are only some 200 articles that say, after mangling, [[As of [[Right Now]], [[X]] is a country]]. Even easier would be to look directly for NATO members (which are disjoint), or EU members, then restrict yourself further to those which are members of the other organisation, find total area and population, and there you go!
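Once the facts have been extracted, the query itself is only a few lines. The sketch below assumes the facts have already been collected into a dictionary; the country names and all numbers are toy values made up for the example, not real statistics.

```python
# Toy facts, invented for illustration; a real client would fill this
# table from the computer-readable statements.
FACTS = {
    "Country A": {"population": 1_000_000, "area_km2": 10_000, "memberships": {"EU", "NATO"}},
    "Country B": {"population": 2_000_000, "area_km2": 40_000, "memberships": {"EU", "NATO"}},
    "Country C": {"population": 5_000_000, "area_km2": 50_000, "memberships": {"EU"}},
    "Country D": {"population": 8_000_000, "area_km2": 20_000, "memberships": {"NATO"}},
}


def density_of_members(facts, *organisations):
    """Total population divided by total area of countries in all given organisations."""
    wanted = set(organisations)
    members = [f for f in facts.values() if wanted <= f["memberships"]]
    total_area = sum(f["area_km2"] for f in members)
    if total_area == 0:
        raise ValueError("no matching countries with known area")
    total_population = sum(f["population"] for f in members)
    return total_population / total_area


print(density_of_members(FACTS, "EU", "NATO"))  # 60.0 people per square kilometre
```

Note that this divides total population by total area, as described above, rather than averaging the individual densities; the two give different answers, which is one more reason to state precisely what is being computed.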

(As an aside: I think "templates" of the form used above would make sense even for the English-language Wikipedia, though whether to consider them encyclopedic articles is another issue. For example, [[ 50 km ]] could be a template that results, for standard users, in "50 km", but would display as "31 miles" for users who have expressed a preference for that, or "50 kilometers" for users who have expressed a preference for explicit units.)
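A sketch of what such a template could do, assuming the stored value is always in kilometres and that the preference is one of three made-up settings ("default", "miles", "explicit"):

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile


def render_distance(km, preference="default"):
    """Render a distance stored in kilometres according to a user preference."""
    if preference == "miles":
        # Rounded to whole miles, matching the precision of the example.
        return f"{round(km / KM_PER_MILE)} miles"
    if preference == "explicit":
        return f"{km:g} kilometers"
    return f"{km:g} km"


print(render_distance(50))              # 50 km
print(render_distance(50, "miles"))     # 31 miles
print(render_distance(50, "explicit"))  # 50 kilometers
```

A real template would also have to decide how many significant figures to keep after conversion; rounding to whole miles is just the simplest choice that reproduces the example.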

Ultimately, I could see this quite changing the way Wikipedia works: for example, when describing a territorial entity, instead of trying to describe its borders (which we don't, currently), we would just describe its subentities. Only for the smallest entities would we have to describe the borders.

"Wikistats"[edit]

This has a lot of overlap with the previous entry; but even if a computer-readable "language" wiki is vetoed (for which there are good reasons), or never happens, I'd like to see much more hard data on Wikipedia, and some way to make access to this data easier than having to read through an entire article.

This applies, in particular, to historic data, which is often available. Once you've got a dozen data points or so, you get a free plot, too, and can make interesting statements.

Tell us what they did. Don't tell us where they're born.

Many biographical articles still start out with that typical encyclopedia formula:

"Magnus Gustaf Mittag-Leffler was a Swedish mathematician."

The first bit of information in that is Mittag-Leffler's nationality — something that's absolutely irrelevant to the rest of the article.

Of course, nationalities used to be a bit more relevant; people spent most of their lives in one country, and if they published, published in one language. Today, that's simply no longer the case, and unless we want articles of the form

"XY was a Colombian-born German who emigrated to the United States before retiring to France and Switzerland"

we should just stop making this the most prominent bit of information we include in biographies. Put it into the lead paragraph if necessary, but tell us what the person has a WP article for first.

As exceptions to this, I see two cases where the nationality is indeed the most important bit of information:

  • politicians who were born in, spent the majority of their lives in, were (and self-identified as) citizens of, and participated in the national politics of exactly one country. Those would accurately be described as "American politician", for example.
  • poets who wrote in exactly one language, simultaneously the national language of one (relevant) country, for which the same adjective is used in English, and who lived there and were (and self-identified as) citizens of that country. For those it would be accurate to write, say, "Swedish poet".

Note that I'm not advocating removing this information; I merely want it moved further back in the lead paragraph, and clarified where necessary.

I have no real hope of ever succeeding in this.

There are issues with national adjectives at least for the following groups of people:

  • British citizens, who might be English, Welsh, English but living in Wales, Scottish, Scottish but not self-identifying as British, Cornish and English, Cornish and not self-identifying as English, ...
  • Chinese people living in Taiwan
  • Residents and citizens of Hong Kong and similar special areas of China.
  • Citizens of the Soviet Union
  • German citizens living between 1945 and 1990, or living (or having been born) outside Germany's current borders.
  • French-Canadians (vs Quebecois)
  • Americans who consider themselves more involved with their state than with the union

It might actually be easier to list the countries for which there are no problems.

Units

Being careful with units isn't just being pedantic: it's about being correct, it's about avoiding incorrect comparisons, and it would be tremendously helpful to young readers who have yet to be introduced to the scientific concept.

People do make real mistakes because they're sloppy with units. It appears to me quite rare that people fail to understand something "because the units confuse them", and much more common that people realise they've failed to understand something simply because of the inclusion of units.

I'd actually like to move to a double-wikilinked approach for this, as in the computer-readable Wikipedia. Just use double brackets around all units, and double brackets around the number part, and the Wikipedia software should deal with it automatically.

Regardless of the question of people sometimes using (arguably) the wrong unit, it is totally unacceptable ever to use a plain number where a unit would be accepted.

Particularly when money is involved, people seem altogether too happy to leave out the currency and time units and talk in numbers. An investment doesn't yield 5%, it yields 5% p.a. This is particularly important when exponential and linear growth are conflated, in expressions such as "an average rate of growth".
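To see how much the two readings of "an average rate of growth" can differ, consider a quantity that doubles over ten years. The sketch below computes both the linear average (total growth divided by the number of years) and the compound average (the constant yearly rate that would produce the same result); the function names and the doubling example are chosen for illustration.

```python
def linear_average_rate(start, end, years):
    """Total relative growth spread evenly over the years (per year, of the start value)."""
    return (end / start - 1) / years


def compound_average_rate(start, end, years):
    """Constant yearly rate that compounds from start to end (per year, p.a.)."""
    return (end / start) ** (1 / years) - 1


linear = linear_average_rate(100, 200, 10)
compound = compound_average_rate(100, 200, 10)
print(f"linear:   {linear:.2%} per year of the initial value")
print(f"compound: {compound:.2%} p.a.")
```

Doubling in ten years is 10% per year on the linear reading but only about 7.2% p.a. on the compound reading, so a bare "average rate of growth" without its unit and convention is ambiguous by nearly three percentage points.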

Jargon and special conventions

Many areas of research, or even of quackery, have a highly specialised jargon. In some cases (mathematics, for example), this is unavoidable: mathematical terms are different from normal English words in that they have a formal definition.

For example, economists use "marginal" as jargon for "derivative". Chemists use "mole fraction" when they simply mean "proportion of particles". Engineers like to measure both thrust and mass in pounds, resulting in quotients that would be perfectly understandable if only they included the implicit g.

Write about things, not words

Too many articles still really consist of several articles, which just happen to share an English term.

We need numbers

Whenever avoidable, a Wikipedia article should not make a qualitative statement where a quantitative one is available.

Most world maps should be based on population, not geography

Yes, maps are cool. We all like maps. But maps like Image:Metric system are simply misleading, because the colours visible simply do not correspond to what the map is about (in this case, people using metric units). A standard world-map-by-population, where one pixel would correspond to a certain population, would make more sense, and would make the representation of Europe, Japan, Taiwan, etc. much easier.

Images shouldn't be in the wiki. Scripts generating images should

Uploading images to Wikipedia is a bad idea, because it's extremely difficult to edit images. Usually, you want your image to be created by a script (that should be on Wikipedia or the commons) from other data (that should be in the computer-readable Wikipedia or elsewhere).
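As a small illustration of what such a script might look like, the sketch below turns a list of (label, value) pairs into an SVG bar chart using nothing but the Python standard library. The function name, the layout constants, and the sample data are all invented for this example, and values are assumed to be non-negative; a real script would read its data from wherever the hard data lives.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(data, bar_width=40, height=100):
    """Build a minimal SVG bar chart from (label, value) pairs."""
    if not data:
        raise ValueError("need at least one data point")
    peak = max(value for _, value in data)
    if peak <= 0:
        raise ValueError("need at least one positive value")
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{bar_width * len(data)}" height="{height}">'
    ]
    for i, (label, value) in enumerate(data):
        bar = height * value / peak  # the tallest bar fills the full height
        parts.append(
            f'<rect x="{i * bar_width}" y="{height - bar:g}" '
            f'width="{bar_width - 4}" height="{bar:g}">'
            f"<title>{escape(str(label))}: {value}</title></rect>"
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up sample series, for illustration only.
print(bar_chart_svg([("1990", 10), ("2000", 20)]))
```

Because the image is regenerated from the data, correcting a number or adding a data point is an ordinary text edit rather than a job for an image editor.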