Human Language

   Human language is language used mostly by [1]humans to communicate with
   each other; such languages are very hard for [2]computers to handle (only
   quite recently have [3]neural network computer programs become able to
   show true understanding of human language). They are studied by
   [4]linguists.
   It is estimated (very roughly) that there are about 5000 human languages.
   Human languages are most commonly natural languages, i.e. ones that
   evolved naturally over many centuries such as [5]English, [6]Chinese,
   French or [7]Latin, but there also exist a great number of so called
   [8]constructed languages (conlangs), i.e. artificially made ones such as
   [9]Esperanto, Interslavic or [10]Lojban. But all of these are still human
   languages, different from e.g. [11]computer languages such as [12]C or
   [13]XML. Natural human languages practically always show significant
   irregularities (exceptions to general rules) while constructed languages
   typically try to eliminate irregularities as much as possible so as to
   make them easier to learn, but even a constructed human language is still
   extremely difficult for a computer to understand.

   Human language is a social construct so according to [14]pseudoleftists
   it's an illusion, doesn't exist, doesn't work and has no significance.

   Why are human languages so hard for computers to handle? Well, firstly
   there are minor annoyances like syntactic ambiguity, irregularities,
   redundancy, complex rules of grammar -- for example the sentence "I know
   Bob likes computers, and so does John." can either mean that John knows
   that Bob likes computers or that both Bob and John like computers. Things
   like this can be addressed by designing the [15]grammar unambiguously, but
   analysis of already existing natural languages still suffers from them.
   Furthermore in real life there are countless quirks of playing with
   language, things like sarcasm, parody, exaggeration, indirect hints,
   politeness, rhetorical questions, faux pas, memes and references. For
   example when we think of the imperative, we imagine sentences such as
   "Close the window." --
   in real life we'll rather say something like "I'm cold, it wouldn't hurt
   to close the window.", i.e. something that's semantically an imperative
   but not syntactically, a dumb computer would deduce here we are stating a
   fact that closing the window will not hurt anyone; it takes human-like
   intelligence AND experience in how the real life works and abilities like
   being able to guess feelings and plans of others to correctly conclude
   this sentence in fact means "Please close the window." Just try to talk to
   someone for a while and focus on what the sentences mean literally and
   what they actually imply. So things revolving around this pose the
   first issue, but a yet greater issue dwells in how to actually define
   meanings of words -- human language is not just "text strings" as it might
   seem at first glance, behind the text strings lies a deep
   understanding of the extremely complex [16]real world. More details of the
   issues of semantics will be given below.

   What is the most [17]LRS human language? This is not [18]settled yet but
   [19]Esperanto looks pretty cool. [20]English is actually one of the most
   [21]suckless languages, it's extremely easy and everyone speaks it -- it's
   not perfect but it is like [22]C in programming, likely the best thing we
   have at the moment. As a part of [23]less retarded society we
   should aim to create a constructed language that will be universally
   spoken by everyone and which, if at all possible, will solve the issue of
   the great language curse described below.

The Grand Curse Of Human Language

   { The following is a thought dump made without much research, please
   inform me if you're a linguist or something and have something
   enlightening to say, thank you <3 ~drummyfish }

   On one hand human languages are cool when viewed from cultural or
   [24]artistic perspective, they allow us to write poetry, describe feelings
   and nature around us -- in this way they can be considered [25]beautiful.
   However from the perspective of others, e.g. programmers or historians,
   human languages are a [26]nightmare. There is unfortunately an enormous,
   inherent curse connected to any human language, both natural and
   constructed, that comes from its inevitably [27]fuzzy nature stemming from
   the fuzziness of real life concepts: it's the problem of defining
   [28]semantics of words and constructs (no, Lojban doesn't solve this).
   [29]Syntax (i.e. the rules that say which sentences are valid and which
   are not) doesn't pose such a problem, we can quite easily define what's
   grammatically correct or not (it's not as hard to write a program that
   checks grammatical correctness), it is semantics (i.e. meanings) that is
   extremely hard to grasp -- even in rigorous languages (such as
   mathematical notation or programming languages) semantics is a bit harder
   to define (quite often still relying on bits of human language), but while
   in a programming language we are essentially able to define quite EXACTLY
   what each construct means (e.g. a + b returns the sum of values a and b),
   in a natural language we are basically never able to do that, we can only
   ever form fuzzy connections between other fuzzy concepts and we can never
   have anything fixed.
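
   To illustrate the claim about syntax being the easy part, here is a
   minimal sketch of a grammar checker for a made-up toy fragment of English.
   The lexicon, the single sentence pattern and the name is_grammatical are
   all assumptions invented for this example, not a real grammar of English.

```python
# Toy lexicon: every word gets exactly one part of speech. This is an
# assumption made for simplicity; real languages are far messier.
LEXICON = {
    "the": "ART", "a": "ART",
    "dog": "NOUN", "fish": "NOUN", "window": "NOUN", "idea": "NOUN",
    "eats": "VERB", "closes": "VERB", "likes": "VERB",
}

def is_grammatical(sentence):
    """Return True if the sentence matches the pattern ART NOUN VERB ART NOUN."""
    words = sentence.lower().rstrip(".").split()
    tags = [LEXICON.get(word) for word in words]  # unknown words become None
    return tags == ["ART", "NOUN", "VERB", "ART", "NOUN"]

print(is_grammatical("The dog eats the fish."))     # True (sensible)
print(is_grammatical("The window eats the idea."))  # True (nonsense, yet valid)
print(is_grammatical("Dog the the eats fish."))     # False
```

   The checker happily accepts the nonsensical second sentence: it knows
   which strings are valid but nothing about what any of them mean, which is
   exactly the part that is hard to pin down.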

   Due to this fuzziness human languages inevitably change over time no
   matter how hard we try to counter this, any text written a few thousand
   years ago is nowadays very hard to understand -- not because the old
   languages aren't spoken anymore, but because the original meanings of
   specific words, phrases and constructs are distorted by time; when
   learning an old language we learn what each word meant by reading its
   translation to some modern word, but the modern word is always more or
   less different. Even if it's a very simple word such as "fish", our modern
   word for fish means a slightly different thing than let's say the ancient
   Romans' word for fish because theirs had slightly different connotations
   such as potential references to other things: fish for example used to be
   a symbol of Christianity, nowadays people don't even commonly make this
   connection. Fishermen were a despised class of workers, to some fish may
   have signified food and abundance, to others something that "smells bad",
   to others something or someone who's "slippery". Some words may have
   referred to some contemporary "[30]meme" that's been long forgotten and if
   some text makes the reference, we won't understand it. The word "book" for
   example meant something a bit different 2000 years ago than it means now:
   back then a book might have been just a relatively short scroll, it was
   expensive and people didn't read books the same way as we do today, they
   commonly just read them out loud to others, so "reading a book" and the
   word "book" itself don't conjure the same picture in our heads as they did
   back then. Or another example showing the difference between languages
   existing at the same time is this: while the Spanish word "perro"
   translates to English as "dog", the meanings aren't the same; some English
   speakers use the word as a synonym for "friend" but in Spanish the word
   can be used as an insult so shouting "perro" and "dog" in the street may
   lead to different (possibly completely opposite) images popping up in the
   heads of those who hear it. How do you describe a word precisely if you
   can only describe it with other imprecise words that are changing
   constantly? No, not even pictures will help -- if you attach the picture
   of a cat to the word "cat", it's still not clear what it means -- does it
   stand for the picture of the cat or for the cat that's in the picture,
   does it stand ONLY for the one cat that's in the picture or all other
   animals that are similar to the one in the picture? How similar? Is a lion
   a cat? Is a toy cat or cartoon cat a cat? Or does the picture signify that
   anything with fur is a cat? If it looks like a cat but walks on two legs
   and speaks, is it still a cat? Now imagine describing a more abstract term
   such as thought, number or existence. There is no solid ground, even such
   essential words as "to want" or "to be" have different meanings between
   languages ("to be" can stand for "to exist", "to be in a place", "to
   temporarily have some property", "to permanently have some property"
   etc.). Even dictionaries admit defeat and are happy with having circular
   definitions because there aren't any foundations to build upon, circular
   definitions are inevitable, dictionaries just help you connect fuzzy
   concepts together. All of this extends to tenses, moods, cases and
   everything else. This can be very well seen e.g. with people interpreting
   old texts such as the Bible, for example some say [31]Jesus claimed to be
   the son of God while others reject it, saying that even if he stated the
   sentence, it actually wasn't meant literally as it was a commonly used
   phrase that meant something else -- these people will argue about
   everything and they can comfortably interpret the same text in completely
   opposite ways. The point is that we just can't know.
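
   The circularity of dictionary definitions mentioned above can be shown on
   a toy example. The following is only a sketch: the tiny dictionary is made
   up (each word maps to the words used in its definition) and find_cycle is
   a hypothetical helper, a plain depth-first search for a loop.

```python
# Made-up toy dictionary: word -> words appearing in its definition.
TOY_DICTIONARY = {
    "big": ["large"],
    "large": ["great", "size"],
    "great": ["big"],
    "size": ["how", "big"],
    "how": [],  # left undefined in this toy example
}

def find_cycle(dictionary, start):
    """Follow definitions depth-first from start.

    Returns the first circular chain found as a list of words (first and
    last word are the same), or None if no cycle is reachable from start.
    """
    path = []          # words on the current chain of definitions
    on_path = set()    # same words, for fast membership tests
    finished = set()   # words fully explored without finding a cycle

    def visit(word):
        if word in on_path:
            return path[path.index(word):] + [word]
        if word in finished:
            return None
        path.append(word)
        on_path.add(word)
        for used in dictionary.get(word, []):
            cycle = visit(used)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(word)
        finished.add(word)
        return None

    return visit(start)

print(find_cycle(TOY_DICTIONARY, "big"))  # ['big', 'large', 'great', 'big']
```

   In any finite dictionary that defines all of its words using only its own
   words such a loop has to appear sooner or later, there is nowhere else for
   the chain of definitions to go.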

   { Just one more of other countless examples I recently encountered: it
   used to be generally believed that [32]Jesus was crucified so that he was
   nailed on the cross through his palms, however it was shown this wouldn't
   work and also other evidence showed people were nailed more in the arms,
   in a way that would hold the weight of the body but wouldn't hit the
   artery. The confusion came from translation -- the Greek word for "hand"
   also includes part of the arm, i.e. the Greek word covers more than the
   word for hand does in some other languages. ~drummyfish }

   In addition there are ALWAYS a great many hidden implicit assumptions that
   both communicating sides have to share to be able to communicate (and
   these can only be assured by many years of learning, spent in the same
   environment) -- for example if I tell someone "Drive to the city and buy
   food.", in fact I mean something like "Right now walk with your feet to
   our car, open the door, sit in, take the wheel in your hands, start the
   car, drive only on the road with your eyes open, ..."; the guy can
   technically satisfy my order by waiting 10 years, then driving a truck
   through forests with eyes closed over the whole globe and back. Just as
   it's impossible to perfectly define all words, it is impossible to
   explicitly recount all assumptions. Though the mentioned example is
   exaggerated, it shows an ever present phenomenon we have to deal with, a
   phenomenon which can cause misunderstanding or be easily abused.

   Of course this barrier exists between contemporary languages too, the
   idiom "lost in translation" exists for a reason -- translating something
   always loses or at least changes something. Translating one sentence over
   and over to different languages and back to the original one will most
   likely produce a sentence with a very different meaning.

   This is the grand issue that common people almost universally overlook,
   most will naively think that with careful effort it is possible to express
   oneself so clearly that others simply won't be able to misunderstand --
   this is sadly false, even with the most carefully crafted sentences language
   always extremely easily allows any word to be twisted by politicians to
   anything they want, it destroys old knowledge and prevents us from
   communicating with clarity and recording ideas so that they would last
   into the future. This damnation of language plagues every book, authors
   constantly complain "I should have rather used this and that word" but
   that wouldn't even help, it's impossible to say something so as to not be
   misunderstood because human language is a weak, crippled tool just based
   on shouting weird sounds in hopes someone will get a vague idea of what's
   going on in your head. Due to this limitation of language it is absolutely
   worthless to discuss anything if after 5 minutes you don't come to
   agreement, the discussion will lead nowhere, it's best to just leave it at
   communication being impossible because even if linguistically you speak
   the same language, you cannot communicate correct meanings, even words
   like "is", "when", "bad" or "will" will have absolutely different
   meanings, you would have to define every word of every sentence and then
   every word of every new sentence you produce for 1000 years until you come
   to circular definitions when you'll still be disagreeing but won't even be
   able to waste time further.

   This issue is very hard to solve, maybe impossible. It seems that due to
   the extreme complexity of [33]real life our language can't operate with
   precise equations but rather has to settle for concepts that are just
   fuzzy blobs that our brains -- [34]neural networks in our heads -- learn
   by trial and error over many years. We learn that if we hear the word X,
   it's best to react by feeling fear or turning our head or closing our eyes
   etc.

   { The only idea of a solution on how to make a "mathematically precise"
   human language for real world communication is the following. Firstly make
   a mathematical model of some artificial world that's similar to ours, for
   simplicity we can now just consider something like a 2D grid with
   differently colored cells, i.e. something like a [35]cellular automaton.
   The world changes in steps and each cell can "talk", i.e. at any frame it
   can emit a text string. Now make a language that's precisely defined in
   this world; if the world is simple, it's pretty doable e.g. like this:
   write a function in some programming language that takes the world and
   checks whether what the cells are saying classifies as your language used
   in a correct way within this world (so the function just returns
   true/false, nothing else is needed). Now this single function
   mathematically defines your language -- by looking at your function's
   source code anyone can derive the absolutely correct meaning of any word
   or sentence because he can see how the function checks whether that word
   or phrase is used correctly, he will know exactly which situations fit a
   given sentence and which don't. Now the final step is only to find
   correspondence between the
   real life and your simplified mathematical world, e.g. that cells
   represent humans and so on (but this will have shortcomings, e.g. our
   simple world will make it difficult or impossible to talk about body parts
   since cells have none; also making the connection between the mathematical
   world and real world relies on intuition). ~drummyfish }
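
   The idea in the note above can be sketched in code. Everything here is an
   assumption made up for illustration: the three colors, the wrap-around
   grid, the tiny language with just two sentence forms ("I AM <color>" and
   "<direction> IS <color>") and the function name. For brevity it also
   checks only a single frame of the world rather than a sequence of steps.

```python
COLORS = {"RED", "GREEN", "BLUE"}
# Direction words mapped to (row offset, column offset); the grid wraps around.
DIRECTIONS = {"NORTH": (-1, 0), "SOUTH": (1, 0), "WEST": (0, -1), "EAST": (0, 1)}

def language_is_used_correctly(world):
    """world: list of rows, each cell being a (color, utterance) pair.

    Returns True iff every non-empty utterance is a well-formed sentence
    that is also true in this world. This one function is the whole
    definition of the toy language.
    """
    height, width = len(world), len(world[0])
    for r, row in enumerate(world):
        for c, (color, utterance) in enumerate(row):
            if utterance == "":  # silence is always acceptable
                continue
            words = utterance.split()
            if len(words) != 3 or words[2] not in COLORS:
                return False
            if words[:2] == ["I", "AM"]:
                described = color  # the cell talks about itself
            elif words[0] in DIRECTIONS and words[1] == "IS":
                dr, dc = DIRECTIONS[words[0]]
                described = world[(r + dr) % height][(c + dc) % width][0]
            else:
                return False  # not a sentence of the language
            if described != words[2]:
                return False  # well-formed but false
    return True

honest_world = [
    [("RED", "I AM RED"), ("BLUE", "WEST IS RED")],
    [("GREEN", ""), ("GREEN", "NORTH IS BLUE")],
]
lying_world = [
    [("RED", "I AM BLUE"), ("BLUE", "")],
    [("GREEN", ""), ("GREEN", "")],
]
print(language_is_used_correctly(honest_world))  # True
print(language_is_used_correctly(lying_world))   # False
```

   Here the meaning of e.g. "WEST" is fully given by the source code: it is
   the cell one column to the left, wrapping around the edge. The hard part,
   mapping such a world onto real life, stays unsolved as the note admits.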

   { Yet another, maybe more practical idea would be to create a set of very
   few core words -- let's say 100, which we would try to define extremely
   precisely by all the current imperfect means but with very elevated
   effort, i.e. each word would have a detailed description, translations to
   20 other natural languages, positive and negative examples, pictures
   attached etc. Then the rest of the language would be defined only using
   these core words. But maybe it wouldn't work -- the language would
   possibly be a bit more stable but would eventually degenerate as well.
   ~drummyfish }
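
   The core word idea above at least allows a mechanical check that every
   word really reduces to the core. The following is a rough sketch with
   made-up data: the five core words stand in for the roughly 100 suggested
   above, and the definitions and the helper grounded_words are hypothetical.

```python
# Hypothetical core vocabulary (the note above suggests about 100 words).
CORE_WORDS = {"animal", "small", "water", "live", "in"}

# Made-up definitions: word -> words used to define it.
DEFINITIONS = {
    "fish": ["animal", "live", "in", "water"],
    "minnow": ["small", "fish"],
    "pet": ["animal", "companion"],
    "companion": ["pet"],  # defined only through "pet", i.e. circular
}

def grounded_words(core, definitions):
    """Return all words that reduce to core words in finitely many steps."""
    grounded = set(core)
    changed = True
    while changed:  # repeat until no new word can be added
        changed = False
        for word, used in definitions.items():
            if word not in grounded and all(u in grounded for u in used):
                grounded.add(word)
                changed = True
    return grounded

grounded = grounded_words(CORE_WORDS, DEFINITIONS)
print(sorted(set(DEFINITIONS) - grounded))  # ['companion', 'pet']
```

   Such a check only guarantees that definitions bottom out in the core, it
   says nothing about how precisely the core words themselves are understood.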

Links:
1. human.md
2. computer.md
3. neural_net.md
4. linguistics.md
5. english.md
6. chinese.md
7. latin.md
8. conlang.md
9. esperanto.md
10. lojban.md
11. computer_language.md
12. c.md
13. xml.md
14. pseudoleft.md
15. grammar.md
16. irl.md
17. lrs.md
18. settled.md
19. esperanto.md
20. english.md
21. suckless.md
22. c.md
23. less_retarded_society.md
24. art.md
25. beauty.md
26. nightmare.md
27. fuzzy.md
28. semantics.md
29. syntax.md
30. meme.md
31. jesus.md
32. jesus.md
33. irl.md
34. neural_net.md
35. cellular_automaton.md