[HN Gopher] FLOSS and Linguistic Diversity
___________________________________________________________________
 
FLOSS and Linguistic Diversity
 
Author : BerislavLopac
Score  : 28 points
Date   : 2021-05-15 08:11 UTC (14 hours ago)
 
web link (www.paulox.net)
 
| torstenvl wrote:
| FLOSS*
| 
| This has nothing to do with oral hygiene. I honestly clicked
| through expecting to find some breakthrough study on the effects
| of oral hygiene on second language acquisition, either
| individually or sociologically.
 
  | dang wrote:
  | Fixed. Sorry. Thanks!
 
  | severine wrote:
  | Upvoted. Mods (or OP), please edit the title, the article is
  | great and can bring an interesting discussion!
 
| danhor wrote:
| I'm not sure if more linguistic diversity will help FLOSS. For a
| lot of FLOSS software, there isn't even enough documentation in
| English and looking at communities that tend to have resources in
| their own languages (Chinese and Japanese seem to me to be the
| main ones), there's usually a large divide from the rest of the
| English-speaking FLOSS community, which hurts both sides. For
| countries with more English speakers (e.g. most European
| countries) I don't think translations provided to developers will
| prove beneficial.
| 
| For example, here in Germany a lot of MCU-related German-language
| resources are written in the mikrocontroller.net wiki. It has
| some good content, but it's often very apparent that not much has
| been updated in the last ~10 years, in many cases misdirecting
| potential beginners to less-than-optimal progression paths.
| 
| You can also see a lot of people with programming experience
| switching their system to English, to easily find resources in
| case of e.g. obscure errors.
| 
| The language-mastery issue is real, especially for very young
| people, but I'm not sure if "forcing" people isn't the better
| call. At least in my peer group it seems like most interested
| people became pretty adept at English and I'm not aware of any
| that gave up because of language issues (although I'm sure there
| are many examples).
| 
| The suggestions for more simplified English in the article seem
| like good ideas to me. Oftentimes the subject matter is
| complicated enough, even for people who don't have issues
| understanding the language.
 
  | pauloxnet wrote:
  | Thanks a lot for your feedback and point of view; it's very
  | useful to me.
  | 
  | I reply to some points below, but only to better share the
  | point of view I tried to express in the article.
  | 
  | > I'm not sure if more linguistic diversity will help FLOSS.
  | 
  | I think linguistic diversity in FLOSS means a FLOSS world that
  | is more accessible to people who do not speak English.
  | 
  | > For countries with more English speakers (e.g. most European
  | countries) I don't think translations provided to developers
  | will prove beneficial.
  | 
  | But here in Europe (I'm from Italy) not all countries have the
  | same level of English skills. I also wrote the article thinking
  | of the whole world, with so many countries that have a low
  | level of education.
  | 
  | > The language-mastery issue is real, especially for very young
  | people, but I'm not sure if "forcing" people isn't the better
  | call.
  | 
  | I don't think I wrote about forcing people in the article.
  | 
  | > The suggestions for more simplified English in the article
  | seem like good ideas to me.
  | 
  | Thanks, I'm happy you found the idea good.
 
  | zzo38computer wrote:
  | > For a lot of FLOSS software, there isn't even enough
  | documentation in English and looking at communities that tend
  | to have resources in their own languages ...
  | 
  | I agree; many projects (both FOSS and non-FOSS) lack sufficient
  | documentation. However, this is the case whether or not it is
  | English; it is another issue.
  | 
  | More simplified English might be a good idea. There is what is
  | called Simplified Technical English, but that seems to be for
  | aerospace, and perhaps it could be adapted for computer
  | documentation, too.
  | 
  | I am willing to accept contributions of documentation in any
  | language, whether English or otherwise, for my projects.
  | Although I am only writing in English myself, others can write
  | in other languages if they want to. (However, I generally have
  | no intention to support commands and status messages etc. in
  | languages other than English, nor in character encodings other
  | than ASCII.)
  | 
  | For names of stuff (and comments) in the source code of the
  | program itself, I do not consider it too important to write in
  | other languages, although people can try to do that if they
  | want to. At least for my own projects though, I intend to
  | limit the source code to ASCII.
 
    | pauloxnet wrote:
    | Thanks a lot for your feedback.
    | 
    | > More simplified English might be a good idea. There is what
    | is called Simplified Technical English, but that seems to be
    | for aerospace, and perhaps it could be adapted for computer
    | documentation, too.
    | 
    | I'm happy that you think that the idea in the article is
    | good.
    | 
    | I didn't know about "Simplified Technical English"; it
    | seems interesting.
 
| hirundo wrote:
| There could be automated linguistic diversity in the code itself,
| rather than just the docs.
| 
| Say, you must pick identifier names from a subset dictionary of
| language concepts (pruned of synonyms and multiple senses), for
| which we have translations into X other languages. E.g.
| WidgetFactory is a valid name, since we can translate that, but
| FoznozzleBodega raises an error. So the resulting code can be
| trivially output in any language in the master dictionary.
| 
| There could also be multilingual literary programming by making
| the output syntax conform to the target language as well, in
| terms of conjugation, part of speech order, etc.
| 
| Code is a relatively low hanging fruit for debabelization.
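
A minimal sketch of the dictionary idea described above, in Python. The dictionary contents, the function names, and the CamelCase-only rule are all assumptions made for illustration; nothing here is an existing tool.

```python
import re

# Hypothetical controlled vocabulary: one sense per word, with a
# translation for every supported language (illustrative entries only).
DICTIONARY = {
    "widget": {"es": "widget", "de": "Widget"},
    "factory": {"es": "fábrica", "de": "Fabrik"},
}


def split_camel_case(name):
    """Split 'WidgetFactory' into ['Widget', 'Factory']."""
    return re.findall(r"[A-Z][a-z0-9]*", name)


def translate_identifier(name, lang):
    """Translate a CamelCase identifier word by word.

    Raises ValueError when a word is missing from the dictionary,
    which is the "FoznozzleBodega raises an error" case.
    """
    words = split_camel_case(name)
    if not words or "".join(words) != name:
        raise ValueError(f"{name!r} is not a CamelCase identifier")
    translated = []
    for word in words:
        entry = DICTIONARY.get(word.lower())
        if entry is None or lang not in entry:
            raise ValueError(f"{word!r} has no {lang!r} translation")
        target = entry[lang]
        # Re-capitalize so the result is still CamelCase.
        translated.append(target[0].upper() + target[1:])
    return "".join(translated)
```

Note that this keeps the English word order: `translate_identifier("WidgetFactory", "es")` yields `WidgetFábrica`, which is not how a Spanish speaker would name it. That is the limitation of word-by-word mapping raised in the replies.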
 
  | SpicyLemonZest wrote:
  | Word-by-word translation doesn't really work, unfortunately,
  | even in the limited context of variable names. You'd have to
  | translate for example MoveTime into MoverTiempo, which is
  | awkward in general and the wrong translation entirely if the
  | variable refers to a time that someone's moving house.
 
  | torstenvl wrote:
  | Current output l10n tends to translate whole strings in
  | context. It sounds like your preference is to make automatic
  | translation part of every binary? Maybe I'm misunderstanding
  | you, but I don't think that is the best way to go about
  | achieving localization, given the massive overhead and
  | lackluster results.
 
  | geofft wrote:
  | I like this idea. It seems like it's related to another idea
  | that I've always liked but never seen in practice: assuming
  | that the code checked in always parses, you can re-lay-out code
  | style (indentation, tabs/spaces, line length) according to the
  | individual developer's preferences upon checkout, and transform
  | them back to some (arbitrary) standard upon checkin.
  | 
  | Git can handle this sort of thing with smudge/clean filters.
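
A rough sketch of such a filter pair, assuming the repository standard is four-space indentation and the developer prefers tabs. Git's `filter.<name>.clean` and `filter.<name>.smudge` config keys and the `filter=<name>` attribute in `.gitattributes` are real; the script itself and the `indent` driver name are hypothetical.

```python
import sys

INDENT = "    "  # assumed repository standard: four spaces per level


def leading_to_spaces(line):
    """Clean direction: expand each leading tab to the standard indent."""
    stripped = line.lstrip("\t")
    tabs = len(line) - len(stripped)
    return INDENT * tabs + stripped


def leading_to_tabs(line):
    """Smudge direction: turn each full leading indent group into a tab."""
    stripped = line.lstrip(" ")
    spaces = len(line) - len(stripped)
    levels, rest = divmod(spaces, len(INDENT))
    return "\t" * levels + " " * rest + stripped


def run_filter(mode, text):
    """Apply the chosen direction to every line, preserving line endings."""
    convert = leading_to_spaces if mode == "clean" else leading_to_tabs
    return "".join(convert(line) for line in text.splitlines(keepends=True))


# Git pipes file content through stdin/stdout when the script is
# registered as a filter driver, e.g. "python indent_filter.py clean".
if (__name__ == "__main__" and len(sys.argv) == 2
        and sys.argv[1] in ("clean", "smudge")):
    sys.stdout.write(run_filter(sys.argv[1], sys.stdin.read()))
```

To try it, one would set `filter.indent.clean` and `filter.indent.smudge` to the two invocations and add a line such as `*.py filter=indent` to `.gitattributes`. A real setup would more likely call an existing formatter than hand-rolled string handling like this.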
 
  | corty wrote:
  | GUI translation works like this, usually. You have a list of
  | strings/phrases occurring in the GUI and a translation table for
  | each language that assigns the English string to its translated
  | equivalent. With the help of a dictionary (general and domain
  | specific parts) you can even do automatic translations.
  | 
  | But the devil is in the details: Most English speakers don't
  | know foreign languages, so they are unaware of lots of problems
  | that occur from just a word list. E.g. there is a difference
  | between "end" as a noun and "end" as a verb. In German, one
  | would translate the former as "Ende", the latter as "beenden".
  | So for the string table, the programmer would need two
  | different annotated entries "end (n)" and "end (v)" for the
  | translation to be correct. But usually, the translation team
  | only gets "end" without context, annotations or anything, and
  | of course all occurrences of "end" will be conflated into one
  | line in the translation table. There are other frequent
  | problems, like sentences containing numbers that require logic
  | to get right and distinguish between Singular/Plural, nothing
  | and maybe even Dual cases. Or the problem that designers got
  | their fingers into a GUI, leaving just enough room for English
  | language strings, but the translations will be cut off or
  | awkwardly skew the layout because they are too
  | short/overflowing/unaligned.
  | 
  | This frequently leads to the situation where as a German, I
  | have to translate to English and back to understand the
  | meaning. Which is why I nowadays avoid translated software and
  | do everything in English. I think translation without lots of
  | effort is pointless. And nobody will invest any significant
  | effort into "small overseas markets".
  | 
  | In FOSS it may be easier to get good translations because of
  | the feedback from real users. They are often more proficient in
  | the target language (than paid translators) and can recognize
  | awkward phrasings and misleading situations. Translation is
  | also an opportunity for small incremental improvements by tons
  | of otherwise non-technical volunteers.
  | 
  | But for the aforementioned reasons, automatic translation makes
  | things worse, not better. I have yet to find a useful
  | translation of technical content, i.e. anything that goes
  | beyond just understanding the general topic of a newspaper
  | article.
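
The two problems described above (the same English word needing different translations, and number-dependent forms) can be sketched with a toy catalog in Python. The catalog layout and all function names are assumptions for illustration; gettext-style tools expose comparable ideas as message contexts (`pgettext`) and plural forms (`ngettext`).

```python
# Hypothetical German catalog keyed by (context, English string), so the
# noun and the verb "end" get separate, annotated entries.
CATALOG_DE = {
    ("noun", "end"): "Ende",
    ("verb", "end"): "beenden",
}

# Plural entries hold one template per grammatical number category.
PLURALS_DE = {
    "{n} file": {"one": "{n} Datei", "other": "{n} Dateien"},
}


def translate(context, text):
    """Look up a string with its context; fall back to the English text."""
    return CATALOG_DE.get((context, text), text)


def plural_category_de(n):
    """German uses the singular form only for exactly one item."""
    return "one" if n == 1 else "other"


def translate_plural(singular, plural, n):
    """Pick the template for n, falling back to the English forms."""
    forms = PLURALS_DE.get(singular)
    if forms is None:
        template = singular if n == 1 else plural
    else:
        template = forms[plural_category_de(n)]
    return template.format(n=n)
```

Here `translate("noun", "end")` and `translate("verb", "end")` return different strings, which a plain word list cannot do. Languages with dual or other number categories would need more entries per string and a different category function.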
 
| pauloxnet wrote:
| Hi @BerislavLopac, I'm Paolo Melchiorre, the author of the
| article. Thanks for sharing!
 
| xupybd wrote:
| This is a hard problem to solve. The amount of work required is
| huge. I understand the problem. I work on Italian CNC machines.
| Half my time debugging is in Google translate trying to figure
| out comments, variable names and error messages. I don't think
| it's impossible to translate all of the material to English but
| it would cost hundreds of thousands. I know FOSS doesn't have the
| same economic model but it does have opportunity costs. If
| someone wants to advocate for this, great, but I'd be cautious
| about ever disparaging a project for not doing this, simply
| because it is a tremendous amount of work.
 
  | paulryanrogers wrote:
  | Is translation really that expensive? Once they're in a
  | spreadsheet there are plenty of services which will translate
  | them.
 
    | pauloxnet wrote:
    | Unfortunately it is not like that. We experienced various
    | difficulties translating only a few sections of the Django
    | documentation because, even for technical text, mechanical
    | translation is not effective; you have to adapt the text.
 
  | pauloxnet wrote:
  | Hi, thanks for your feedback. My suggestion in the article was
  | more to take care of all people in the community than to
  | translate everything. Using better, simpler English would be a
  | great starting point.
 
| WalterBright wrote:
| In the 80s I worked pretty hard to support multiple languages
| with the Zortech C/C++ compiler. Error messages were switchable
| between English, German, French and Japanese. Translated versions
| of the manual were made.
| 
| The trouble was, I had to hire translators. They weren't
| programmers, so the translations were (so I've been told)
| peculiar. When I'd modify the compiler, trying to keep the
| translated text in sync was a nightmare.
| 
| The last straw was when I found out that essentially none of our
| customers used the messages in their native language. They
| preferred the English versions.
| 
| So I just gave up on that with D. (Although the D language itself
| has excellent Unicode support, the user interface is all in
| English.)
| 
| Some members of the D community have taken the initiative to
| create documentation in their native languages, which is great.
 
  | jan_Inkepa wrote:
  | Interesting story! Thanks for sharing.
  | 
  | I'm consistently impressed by Microsoft's approach to
  | translation. Their online database of annotated translations
  | is an invaluable source for translating and just talking about
  | computer/software stuff in non-English languages
  | (https://www.microsoft.com/en-us/language; they made their
  | entire translation database available online).
  | 
  | I do some work on a compiler and as much as I'd like to enable
  | people to use it to teach young schoolkids in their native
  | language (with localizable keywords + error messages), the
  | implementation/maintenance burden would be massive and then
  | people wouldn't be able to share source code so easily
  | globally.
  | 
  | It's hard to balance - I like languages, and usually my code
  | isn't in English, but...yeah, I can only go so far in practice.
 
  | pauloxnet wrote:
  | Thanks a lot for sharing your experience; it's really great to
  | learn more about FLOSS and its community.
 
    | WalterBright wrote:
    | There were some fun moments in this. In attempting to
    | translate "destructor" to Japanese, we'd get "death tractor".
    | Now, personally I felt that "deathTractor" was a far more
    | apropos term than "destructor" (sorry Bjarne). For years my
    | circle of colleagues called them deathTractors.
 
___________________________________________________________________
(page generated 2021-05-15 23:00 UTC)