TEXT JUNIOR

So I ended my previous post with an idea for plain-text formatting
that went something like this (to paraphrase myself):

      I want to write my content in an unobtrusive Markdown-like
      format  but  I  don't  feel like maintaining a complicated
      text formatting engine.

      Raw troff/groff can do the hard work, but  is  no  fun  to
      write.

      But,  if I write a pre-processor for *roff, we can get the
      best of both worlds cheap!

Well, it turns out that the idea was viable.  I've got a working solu-
tion.

The  hardest part, by far, was figuring out which of the cryptic *roff
commands would accomplish what I wanted in the  "ascii"  output  type.
At one point, I even found myself reading the man macro file
("an-old.tmac") and the C++ source of "grotty" (the groff output
"device" that produces ANSI terminal output as well as the "ascii"
and "utf8" output types).

I spent more time trying to figure out how to produce output as one
continuous "page" than on all of the other tasks combined.  I wanted
no page breaks and no vertical padding (groff pads the last page so
that the "printer" completely feeds out the whole sheet of paper!).

Here's  the  answer  to  that one, by the way.  Just stick this at the
very end of your document:

    .pl 0

It tells groff that your page "length" should be 0 so that it won't
attempt to pad the last "page" with any additional vertical space.

But  you  can't  put that line anywhere else in your document or groff
will think that each line is a separate "page", which will cause it to
collapse the other vertical spacing in your document.
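
A preprocessor can enforce that rule by emitting the request exactly
once, after everything else.  Here is a minimal sketch in Python (tjr
itself is a Perl script; this is not its source, and the function
name finish_document is made up for illustration):

```python
def finish_document(roff_lines):
    """Return a copy of the roff lines ending with one '.pl 0' request."""
    lines = list(roff_lines)
    # Drop trailing blank lines so the request directly follows content.
    while lines and not lines[-1].strip():
        lines.pop()
    # Page length 0: groff will not pad the final "page".
    lines.append(".pl 0")
    return lines
```

The caller would build the rest of the roff output first and pass it
through this function as the very last step.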



A simple formatting language
======================================================================

I've written a couple homebrew text formats before (always  with  HTML
as the target output).  I've written line-based parsers and tokenizing
parsers for them.

It always ends up being harder than I'd expected.

I knew I wanted this to be dirt-simple, so I made the syntax strictly
line-based with on/off syntax.

For example, a code block begins and ends with exactly three backticks
(```) on a line.  Nothing more or less is allowed.

Example:

    ```
    if(foo){
          echo "Hello!";
    }
    ```

Block quotes begin and end with exactly three double-quotes (""").

Titles and headings must follow this precise format:

  # Title
  ## A Heading
  ### A Sub-Heading

All whitespace outside of code blocks is normalized (for example, out-
put  paragraphs are separated by exactly one blank line).  Pretty nor-
mal stuff, just very strict about the block syntax.
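
To show how little machinery the on/off approach needs, here is a
rough sketch of such a line-based parser in Python.  It is not the
tjr source (that is Perl and not shown here); the names parse,
HEADING and KINDS are invented, and details like returning
(kind, text) pairs are my own assumptions:

```python
import re

# "# Title", "## A Heading", "### A Sub-Heading": 1-3 hashes, one space.
HEADING = re.compile(r"^(#{1,3}) (\S.*)$")
KINDS = {1: "title", 2: "heading", 3: "subheading"}

def parse(source):
    """Split source text into a list of (kind, text) blocks."""
    blocks = []
    words = []        # words of the paragraph being collected
    code = []         # lines of the code block being collected
    in_code = False
    in_quote = False

    def flush():
        # Close the current paragraph, if any, with normalized spacing.
        if words:
            kind = "quote" if in_quote else "para"
            blocks.append((kind, " ".join(words)))
            words.clear()

    for line in source.splitlines():
        if line == "```":               # exactly three backticks
            if in_code:
                blocks.append(("code", "\n".join(code)))
                code.clear()
            else:
                flush()
            in_code = not in_code
        elif in_code:
            code.append(line)           # kept verbatim
        elif line == '"""':             # exactly three double-quotes
            flush()
            in_quote = not in_quote
        elif not line.strip():          # blank lines end a paragraph
            flush()
        else:
            match = HEADING.match(line)
            if match and not in_quote:
                flush()
                level = len(match.group(1))
                blocks.append((KINDS[level], match.group(2).rstrip()))
            else:
                words.extend(line.split())
    flush()
    return blocks
```

Because every marker must be the whole line, the parser never has to
look inside a line to decide what state it is in.  Any run of blank
lines collapses to a single paragraph boundary, which is where the
"exactly one blank line" output rule comes from.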

All output formatting is specified with groff commands, with the
exception of the heading lines, which I'm drawing with my preprocessor
because that was just so much easier than figuring out how to do it
programmatically with groff!
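
Drawing a heading in the preprocessor really is just string work.  A
small Python sketch of the idea, with everything guessed from how
this post looks rather than taken from tjr: the 70-column width, the
upper-cased title, and especially the sub-heading style (which does
not appear in this post at all) are assumptions, as is the name
draw_heading:

```python
WIDTH = 70  # assumed line length, matching the rules in this post

def draw_heading(kind, text, width=WIDTH):
    """Return the output lines for a title, heading or sub-heading."""
    if kind == "title":
        return [text.upper()]
    if kind == "heading":
        # Heading text followed by a full-width rule of '=' characters.
        return [text, "=" * width]
    if kind == "subheading":
        # Pure guess: underline only as wide as the text itself.
        return [text, "-" * len(text)]
    raise ValueError("unknown heading kind: " + kind)
```

The returned lines would then be passed to groff unfilled so that it
leaves the rule alone instead of trying to wrap or justify it.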



The tool
======================================================================

By yesterday, I had a test.groff document which produced  the  desired
text document output when I ran it through groff like this:

  $ groff -Tascii test.groff > output.txt

By this morning, I had a 20-line Perl script which did pretty much ev-
erything I needed.  By the end of the day, it's grown to 92 lines  and
seems to be able to handle whatever I throw at it.



        ___            ___         ___
       /\  \          /\  \       /\  \
       \:\  \         \:\  \     /::\  \
        \:\  \    ___ /::\__\   /:/\:\  \
        /::\  \  /\  /:/\/__/  /::\~\:\  \
       /:/\:\__\ \:\/:/  /    /:/\:\ \:\__\
      /:/  \/__/  \::/  /     \/_|::\/:/  /
     /:/  /        \/__/         |:|::/  /
     \/__/                       |:|\/__/
                                 |:|  |
                                  \|__|


I  named it Text Junior (tjr) to emphasize how small it is because I'm
giving all of the crappy work to groff. :-)

This post was processed with tjr.

I'll put the source up on RGB (this Gopher burrow) soon.

Until next time, happy hacking!