Tiny tools for great writers
============================

Unix has traditionally had a close relationship with electronic text
production and editing. In fact, most of the early Unix development at
Bell Labs was funded under the promise that those long-bearded folks
were developing a new text-processing system for the AT&T patent office.
So when the operating system started circulating inside and outside the
Labs, well, people expected it to have powerful text production tools
bundled in. And the auld craftsmen of Murray Hill managed to do the
needed magic, and stuffed the early Unix releases with lots of
professional tools for writers. Most of them have survived until the
current era. We will briefly look at some of them here, in particular
dict(1), spell(1), diction(1) and style(1). Future phlogs from The
Dwarven Blacksmith will most probably feature other tools from that pack.

The first tool is dict(1). This is a client for the DICT protocol
(RFC 2229 [1]), which lets you query a remote or local dictionary server
to obtain word definitions:

  $ dict dict://dict.org/d:GNU:wn
  1 definition found
  
  From WordNet (r) 3.0 (2006) [wn]:
  
    gnu
        n 1: large African antelope having a head with horns like an ox
             and a long tufted tail [syn: {gnu}, {wildebeest}]
  $

In this case we have asked dict(1) to contact the dictionary server at
dict.org, and to ask for all the existing definitions of the word
`GNU` in the dictionary `wn` (WordNet). If you want to query all the
existing dictionaries, just remove the `:wn` from the query. Although
using remote dictionary servers is possible, for small installations it
makes sense to have a dictionary server such as dictd(8) running
locally. In that case, you just need to type:

  $ dict -d jargon PDP-11
  1 definition found
  
  From The Jargon File (version 4.4.7, 29 Dec 2003) [jargon]:
  
    PDP-11
    
    
        Possibly the single most successful minicomputer design in history, a
        favorite of hackers for many years, and the first major Unix machine, The
        first PDP-11s (the 11/15 and 11/20) shipped in 1970 from {DEC}; the last
        (11/93 and 11/94) in 1990. Along the way, the 11 gave birth to the {VAX},
        strongly influenced the design of microprocessors such as the Motorola 6800
        and Intel 386, and left a permanent imprint on the C language (which has an
        odd preference for octal embedded in its syntax because of the way PDP-11
        machine instructions were formatted). There is a history site.
  $

Notice that the option '-d' lets you choose a specific dictionary. In
this case we chose the Jargon File [2].
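
If you are curious about what travels over the wire, the DICT protocol
is plain text over TCP (port 2628 according to RFC 2229), and a lookup
boils down to a DEFINE command followed by QUIT. Here is a minimal
sketch: dict_request is a made-up helper name, not part of dict(1), it
only handles single words, and the commented-out pipe into nc(1) assumes
you have some netcat installed.

```shell
# dict_request WORD [DATABASE]
# Hypothetical helper: print the DICT commands for one lookup.
# Without DATABASE it falls back to "*", which means "all databases".
dict_request() {
    word=$1
    db=${2:-"*"}
    # DICT commands are terminated by CR LF
    printf 'DEFINE %s %s\r\nQUIT\r\n' "$db" "$word"
}

# Show the request in readable form (CRs stripped for the terminal)
dict_request GNU wn | tr -d '\r'

# To actually talk to a server (needs network access and netcat):
# dict_request GNU wn | nc dict.org 2628
```

The database name `*` asks the server for matches in all its
dictionaries, while `!` stops at the first dictionary that has a match.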

The second one is spell(1), which survives today in at least two
separate incarnations, namely ispell(1) and aspell(1). These tools take
a file as input, look for spelling mistakes, and propose corrections
using a system-wide wordlist. To check the spelling of a file you could
use either:

  $ ispell textfile.txt
  
or:

  $ aspell -c textfile.txt

Needless to say, these are quite useful for a life on the terminal,
and several editors have interfaces to one or both of them. Just refer
to their man pages for more info.
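
The basic idea behind a wordlist-based checker fits in a short pipeline.
This is only a rough sketch of the principle, not how ispell(1) or
aspell(1) actually work inside: misspelled is a hypothetical helper, and
the wordlist is assumed to hold one word per line (many systems ship one
as /usr/share/dict/words). The demo below builds its own tiny wordlist
with mktemp(1) so that it does not depend on your system.

```shell
# misspelled WORDLIST FILE
# Hypothetical helper: print the words of FILE that are not in WORDLIST.
misspelled() {
    tr -cs 'A-Za-z' '\n' < "$2" |       # one word per line
        tr 'A-Z' 'a-z' |                # fold to lowercase
        sort -u |                       # keep each word once
        grep -v -x -i -F -f "$1" |      # drop words found in the wordlist
        grep .                          # drop empty lines, if any
}

# Tiny demo with a throwaway wordlist and text
words=$(mktemp)
text=$(mktemp)
printf '%s\n' the quick brown fox > "$words"
printf 'The quikc brown fox.\n' > "$text"
misspelled "$words" "$text"             # prints: quikc
rm -f "$words" "$text"
```

Unlike the real tools, this only reports unknown words: it does not
propose any correction.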

Another cool pair of tools for writers is diction(1) and style(1). The
former scans a text and finds typical mistakes in grammar and sentence
construction. Let's run it on the current phlog:

  $ diction -s 20190129_texttools.txt
   
  20190129_texttools.txt:5: In fact, [most -> Do not use as substitute for
  "almost."] of early unix development at Bell Labs was funded under the
  impression that those long-bearded [folks -> Avoid using "folks", when
  writing formally, to refer to your family or friends.] were developing a
  new text-processing [system -> Frequently used without need.] for the
  AT&T patent office.
  
  20190129_texttools.txt:8: [So -> (do not use as intensifier)] when the
  operating [system -> Frequently used without need.] started circulating
  inside and outside the Labs, well, [people -> Do not use with numbers or
  as substitute for "public".] [expected -> Use "expect" for simple
  predictions and "anticipate" for more complex actions in advance of an
  event.] it to have [powerful -> Overused, especially in computer
  industry press releases.] text production tools bundled in.
  
  20190129_texttools.txt:15: This is a client for the DICT protocol (RFC
  2229 [1]) [which -> (use "that" if clause is restrictive)] allows to
  query a remote or local dictionary server to obtain words definitions:
  
  .....
  20 phrases in 22 sentences found.
  $

diction(1) marks all the 'suspect' words and phrases using brackets [].
The option '-s' adds suggestions for alternative wordings, whenever
they are available.
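
To give an idea of the bracket marking, here is a toy sketch in awk. It
only knows the single words you pass on the command line, while the real
diction(1) works on whole sentences and uses a database of phrases;
flag_words is a hypothetical name, and the output format only imitates
the one shown above.

```shell
# flag_words "WORD1 WORD2 ..." FILE
# Hypothetical helper: print each line of FILE that contains one of the
# listed words, with the word wrapped in brackets.
flag_words() {
    awk -v list="$1" '
        BEGIN {
            n = split(list, w, " ")
            for (i = 1; i <= n; i++) bad[w[i]] = 1
        }
        {
            hit = 0
            for (i = 1; i <= NF; i++) {
                bare = tolower($i)
                gsub(/[^a-z]/, "", bare)    # strip punctuation
                if (bare in bad) { $i = "[" $i "]"; hit = 1 }
            }
            if (hit) print FILENAME ":" FNR ": " $0
        }
    ' "$2"
}

# Example, with the file name of this phlog:
# flag_words "powerful folks" 20190129_texttools.txt
```

Punctuation attached to a word ends up inside the brackets, which is
good enough for a toy.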

Finally, style(1) is a tool to compute readability statistics on a
text file. For instance:

  $ style  20190129_texttools.txt 
  readability grades:
          Kincaid: 8.5
          ARI: 9.4
          Coleman-Liau: 9.2
          Flesch Index: 68.2/100 (plain English)
          Fog Index: 11.7
          Lix: 39.1 = school year 6
          SMOG-Grading: 10.6
  sentence info:
          1910 characters
          424 words, average length 4.50 characters = 1.41 syllables
          22 sentences, average length 19.3 words
          50% (11) short sentences (at most 14 words)
          18% (4) long sentences (at least 29 words)
          7 paragraphs, average length 3.1 sentences
          0% (0) questions
          31% (7) passive sentences
          longest sent 54 wds at sent 16; shortest sent 5 wds at sent 17
  word usage:
          verb types:
          to be (8) auxiliary (2) 
          types as % of total:
          conjunctions 4% (19) pronouns 4% (17) prepositions 12% (49)
          normalisations 2% (9)
  sentence beginnings:
          pronoun (2) interrogative pronoun (0) article (2)
          subordinating conjunction (1) conjunction (1) preposition (6)
  $

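Now, I am not a linguist, but most of these readability grades (Kincaid,
ARI, Coleman-Liau, Fog and SMOG) estimate the years of schooling needed
to follow the text, so lower values mean easier reading. The Flesch
Index works the other way round: it runs from 0 to 100, and higher means
more readable. Each measure has its own range, so it's better to have a
look at the corresponding man page.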
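
Two of these grades are easy to reproduce by hand, since they only need
figures that style(1) prints anyway. The sketch below uses the standard
Flesch and Flesch-Kincaid formulas, as listed in the style(1) man page;
readability is a hypothetical helper name. It is fed the rounded 1.41
syllables per word from the output above, which is probably why the
results land close to, but not exactly on, the values style(1) reports.

```shell
# readability WORDS SENTENCES SYLLABLES_PER_WORD
# Hypothetical helper: compute two readability grades with awk.
readability() {
    awk -v w="$1" -v s="$2" -v spw="$3" 'BEGIN {
        wps = w / s     # average words per sentence
        printf "Kincaid: %.1f\n", 0.39 * wps + 11.8 * spw - 15.59
        printf "Flesch Index: %.1f\n", 206.835 - 1.015 * wps - 84.6 * spw
    }'
}

readability 424 22 1.41
# prints:
#   Kincaid: 8.6
#   Flesch Index: 68.0
```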

[1] gopher://gopher.rbfh.de/0/RFC/rfc2229.txt
[2] gopher://orion.ka10.de/0/books/Jargon.2003

  -+-+-+-
spell(1) appeared in Unix-v7 (January 1979)
diction(1) and style(1) were part of the Unix Documenter's WorkBench (DWB)
dict(1) was written by Rik Faith in the early 1990s