Below is the script that, with Mark Wallen's help, I wrote to convert
troff documents to RTF so they could be imported into Word without
losing all of the formatting.

Please let me know how it works for you, and what changes or additions
you make.

Bruce Jones                     Department of Communication
bjones@ucsd.edu                 University of California, San Diego
(619) 534-0417/4410             9500 Gilman Drive
FAX (619) 534-7315              La Jolla, Ca. 92093-0503

-=- -=- -=- -=- -=- -=- -=- CUT HERE -=- -=- -=- -=- -=- -=- -=- -=- -=-

# troff2rtf
#
# This script will take ordinary troff raw files which use -ms macros,
# and convert them to Interchange Format (RTF) text files, which can
# then be exported to a Macintosh (and probably a PC) running MS Word.
#
# Word recognizes the RTF codes and converts the document into a Word doc.
#
# Please note: this script is more or less custom fitted to my way of
# working and my troff typesetting techniques.  This is the header
# file I prepend to all of my troff-set texts:
#
#     .ST
#     .fp 1 Z     # Bookman
#     .fp 2 ZI    # Bookman Italic
#     .fp 3 ZB    # Bookman Bold
#     .fp 4 ZX    # Bookman Bold Italic
#     .fp 5 CW    # Courier
#     .fp 6 CI    # Courier Italic
#     .fp 7 CB    # Courier Bold
#     .fp 8 CX    # Courier Bold Italic
#     .ds CH
#     .ds CF \s10-%-\s0
#     .PZ 12
#     .PO 1i
#     .LL 6.5i
#     .bd 3 2
#     .na
#     .nh
#     .ce
#     TITLE HERE
#     .bp1
#     .ls 2
#
# It is a 90% solution at best.  First, it is easier to ignore some
# aspects of troff code and just apply them by hand in Word - headers
# and footers are the most obvious.
#
# Second, there are some things that you just can't get sed to do -
# like search out the end of a list of centered lines and tell RTF
# to turn off centering.  Maybe some awk programmer would care to
# hack this feature into the script?
#
# It has some other weirdnesses as well - most notably that I can't
# figure out how to make it shift from double-spaced text (or 1.5
# spaced) to single-spaced.
#
# At the end of the script there is a short list of things
# I'd like to get working but haven't.
#
# The script will also do things that troff - at least as far as I
# know for this machine (weber@ucsd.edu) - will not do: underlining,
# bold-italics-underlined text, etc.  Check the comments below.
#
# I am releasing it to the public now because I don't have time to
# hack it into more reasonably working shape.  I hope that people will
# add lines to the script and then share their new versions with the
# machintosh@ucsd.edu mailing list.
#
# Of course, this comes with the usual cautions and disclaimers: the
# author of the script makes no claims about its usefulness, no
# warranties are expressed or implied, your mileage may vary,
# guaranteed for nine miles or nine minutes, whichever comes first.
#
# Comments and suggestions to:
#
#     Bruce Jones          Department of Communication
#     bjones@ucsd.edu      University of California, San Diego
#
# My thanks to Mark Wallen for his suggestion on using RTF, for
# creating the original sed script from which this was spawned, and
# for his additions to my script.

#! /bin/csh

if ( $#argv != 1) then
    echo "Usage for $0; $0 troff-file-name"
    exit 1
endif

if (-e $1.msword) /bin/rm -i $1.msword

# Create the output file by pre-pending the RTF header file: \

cat << 'EOF' > $1.msword
{\rtf1\mac\deff2{\fonttbl
{\f0\fswiss Chicago;} {\f2\froman New York;} {\f3\fswiss Geneva;}
{\f4\fmodern Monaco;} {\f5\fscript Venice;} {\f6\fdecor London;}
{\f7\fdecor Athens;} {\f8\fdecor San Francisco;} {\f9\fnil Toronto;}
{\f10\fdecor Seattle;} {\f11\fnil Cairo;} {\f12\fnil Los Angeles;}
{\f13\fnil Zapf Dingbats;} {\f14\fnil Bookman;}
{\f15\fnil N Helvetica Narrow;} {\f16\fnil Palatino;}
{\f18\fnil Zapf Chancery;} {\f20\froman Times;} {\f21\fswiss Helvetica;}
{\f22\fmodern Courier;} {\f23\ftech Symbol;} {\f24\fnil Vectors;}
{\f30\fnil Silicon Valley;} {\f31\fnil Broadway;}
{\f32\fnil Chicago by Night;} {\f33\fnil Avant Garde;}
{\f34\fnil New Century Schlbk;} {\f40\fnil pica;} {\f50\fnil Ravenna;}
{\f100\fnil Alice;} {\f129\fnil ChicMath;} {\f130\fnil Palo Alto;}
{\f131\fnil Princeton;} {\f132\fnil Santiago;} {\f201\fnil Math;}
{\f203\fnil NY Nights;} {\f204\fnil Scan;} {\f205\fnil Broadway E;}
{\f207\fnil Tiny;} {\f208\fnil KeyMap;} {\f210\fnil NYItalic;}
{\f1024\fnil Mobile;} {\f2515\fnil MT Extra;}
{\f15000\fnil BI Optima BoldOblique;} {\f15001\fnil B Optima Bold;}
{\f15002\fnil I Optima Oblique;} {\f15003\fnil Optima;}}
{\colortbl\red0\green0\blue0; \red0\green0\blue255;
\red0\green255\blue255; \red0\green255\blue0; \red255\green0\blue255;
\red255\green0\blue0; \red255\green255\blue0; \red255\green255\blue255;}
{\stylesheet{\s245 \f3\fs18\up6 \sbasedon0\snext0 footnote reference;}
{\s246 \f3\fs20 \sbasedon0\snext246 footnote text;}
{\f3 \sbasedon222\snext0 Normal;}}
{\info
{\title Troff to MS Word Conversion}
{\author PUT YOUR NAME HERE IF YOU LIKE}}
\margl1440\margr1440\ftntj\pgnstart0\sectd\sbknone
\linemod0\linex0\cols1\endnhere\pard\plain\f3
'EOF'

sed < $1 \
# Remove all troff comment lines\
-e '/^\.\\"/d'\
# Substitute RTF current date info for troff current date:\
-e 's/\\\*(DY/{\\field{\\*\\fldinst date \\\\@ "MMMM d, yyyy"}}/'\
# Italics\
-e 's/\\f2/{\\i /'\
# Boldface:\
-e 's/\\f3/{\\b /'\
# Bold-italics:\
-e 's/\\f4/{\\b\\i /'\
# The underline codes below are tried longest first, so that \fu\
# cannot match the front of \fui and \fbu the front of \fbui.\
# Bold, underlined italics (\fbui is not a troff command)\
-e 's/\\fbui/{\\b\\i\\ul /' \
# Bold underlined (\fbu is not a troff command)\
-e 's/\\fbu/{\\b\\ul /' \
# Underlined italics (\fui is not a troff command)\
-e 's/\\fui/{\\i\\ul /' \
# Underlined (\fu is not a troff command)\
-e 's/\\fu/{\\ul /'\
# End special typeface, return to fp 1:\
# DO NOT USE TO INITIATE FONT 1\
-e 's/\\f1/ }/' \
# A Standard tab-indented paragraph:\
-e 's/\.[Pp][Pp]/\\par\\par\\tab/'\
# A Left-Aligned Paragraph:\
-e 's/\.[Ll][Pp]/\\par\\par/'\
# Blank lines:\
-e 's/^\.sp.*/\\par\\par/'\
# Line breaks:\
-e 's/^\.br/\\par/' \
# Marks centered text:\
# DOES NOT SUBSTITUTE RTF CODE\
-e 's/^\.ce.*/CENTERED LINES:\\par/' \
# Substitute a tab (the pattern is one literal tab character)\
-e 's/	/{\\tab}/g' \
# Substitute an em-dash\
-e 's/\\(em/\\emdash/g' \
# Block quote:\
-e 's/^\.QS/\\li720\\ri720/' \
# Block quote End\
-e 's/^\.QE/\\pard/' \
# Shift to 1.5 line-spaced text\
-e 's/^\.ls 1.5/\\sl360 /' \
# Shift to double-spaced text\
-e 's/^\.ls 2/\\sl480 /' \
# Shift to single-spaced text (doesn't work)\
# -e 's/.ls$/\\pard/' \\
# Start page 1 - doesn't work\
# -e 's/^\.bp 1/\\page \\pgstart0 /'\\
# Page Break:\
-e 's/^\.bp/\\page\\par\\pard /'\
# Footnote citation number:\
-e 's/\\\*\*/{\\fs18\\up6 \\chftn /' \
# Footnote Start\
-e 's/^\.FS/{\\footnote \\pard\\plain \\s246 \\fs20 {\\fs18\\up6 \\chftn }/' \
# Footnote End\
-e 's/^\.FE/}}/' \
# Shift to point size 10\
-e 's/^\.PZ 10/{\\fs20/'\
# Shift to point size 12\
-e 's/^\.PZ 12/{\\fs24/'\
# Shift to point size 14\
-e 's/^\.PZ 14/{\\fs28/'\
# Shift to point size 16\
-e 's/^\.PZ 16/{\\fs32/'\
# Mark the beginning of a keep:\
# DOES NOT SUBSTITUTE RTF CODE\
-e 's/^\.KS/\\par KEEP START\\par/' \
# Mark the end of a keep:\
# DOES NOT SUBSTITUTE RTF CODE\
-e 's/^\.KE/\\par KEEP END\\par/' \
# Mark the beginning of a Section:\
# DOES NOT SUBSTITUTE RTF CODE\
-e 's/^\.SH/\\par KEEP START\\par/' \
# Mark the beginning of a zero indent:\
# DOES NOT SUBSTITUTE RTF CODE\
-e 's/^\.in 0/\\par INDENT ZERO\\par/' \
# Mark the beginning of a temp zero indent:\
# DOES NOT SUBSTITUTE RTF CODE\
-e 's/^\.ti 0/\\par TEMPORARY INDENT ZERO \\par/' \
# Mark the beginning of an indented section:\
# DOES NOT SUBSTITUTE RTF CODE\
-e 's/^\.in.*/\\par INDENT\\par/' \
# Remove all remaining unchanged troff command lines:\
-e '/^\./d' \
# Place space at the end of each line of file:\
-e 's/$/ /' \
# Deletes all "newline" characters:\
| tr -d "\012" \
>> $1.msword

exit

# THINGS I'D LIKE TO GET WORKING BUT HAVEN'T FIGURED OUT HOW:
#
# Get the font set to Geneva
# -e 's/^\.ST/\\f3 \\chftn /' \
# Right-Aligned text: can't figure out how to turn it off:
# -e 's/^\.RP/\qr/' \
# An attempt to get RTF codes for "begin page one" to work:
# -e 's/^\.bp 1/\\page\\pgnstart1\\pard/' \
# -e 's/^\.bp$/\\page \\par \\pard/' \
#
# I don't know how to make the RTF shift from 1.5 or double-spaced
# text to single-spaced text.  These don't work:
# -e 's/.ls 1/\\pard/' \
# -e 's/.ls $/\\pard/' \
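
The header comments ask whether an awk programmer could find the end of a run of centered lines and tell RTF to turn centering off. What follows is only a rough sketch of one way that might be done, not part of the original script and not tested against Word. It is written for /bin/sh rather than csh, the function name center_to_rtf is made up for illustration, and it assumes that the standard RTF control words \qc (centered paragraph) and \pard (reset paragraph properties) are acceptable here. Note that \pard would also reset any line spacing set earlier by ".ls".

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of troff2rtf).
# Reads troff text on stdin and rewrites ".ce N" requests into RTF:
# the next N text lines become centered paragraphs, then centering stops.
center_to_rtf() {
    awk '
    # ".ce" alone centers one line; ".ce N" centers N lines; ".ce 0" cancels.
    /^\.ce/ {
        was = n + 0
        n = (NF < 2) ? 1 : $2 + 0
        if (was <= 0 && n > 0) printf "\\par\\pard\\qc "
        if (was > 0 && n <= 0) printf "\\pard "
        next
    }
    # Only text lines count toward N; troff requests pass through below.
    n > 0 && !/^\./ {
        printf "%s\\par ", $0
        if (--n == 0) printf "\\pard "
        next
    }
    # Everything else is copied unchanged.
    { print }
    '
}
```

If it behaves as intended, it would run ahead of the sed command, for example "center_to_rtf < paper.ms > paper.tmp" followed by running troff2rtf on paper.tmp (both file names are placeholders). Because no ".ce" lines remain afterwards, the CENTERED LINES marker rule in the sed command would simply never fire.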