Regexp to create bold from an asterisk-encapsulated string
----------------------------------------------------------

Last edited: $Date: 2015/11/22 19:56:27 $


## awkiawki is wonderful

> awkiawki is a very lightweight wiki that is great as your personal
> wiki

Awkiawki is a wiki based on awk, hence the name. It works as a CGI
script and uses plain text files for data storage.

The plain text files are in a somewhat Markdown-like format, and each
time a page is requested, awkiawki renders the text file into HTML on
the fly. Awkiawki offers a search field and performs full-text
searches on the plain text files. This is fast and practical.

I have been using awkiawki for some time now. These are my
experiences:

  * awkiawki runs perfectly on a Raspberry Pi (very lightweight, so
    it performs well)
  * awkiawki saves your data as plain text files, which is imho the
    best way: plain text files are application-independent and can
    be worked on with the very powerful Unix text tools.
  * awkiawki uses a wiki-style syntax close to the original wiki
    style (as in the old days)
  * awkiawki is a great wiki to use as your personal wiki, mainly
    because it is so easy to add files to the wiki by just typing a
    CamelCase word and because it is easy to do full text searches
    in your wiki.

A personal wiki can be used to write notes and to build your own
knowledge base, but also for personal leadership (thinking up and
writing down your personal plans and goals and stuff like that) or
for maintaining todo sections.

## Wiki style markup

Awkiawki mimics the original wiki style from the early 2000s
(roughly 2000-2004).

As I have been using ikiwiki and vimwiki for a long time, I prefer a
syntax that is closer to these two wikis and also closer to
Markdown.

Fortunately, awkiawki can be hacked upon :)

## Trying to create the perfect regexp to create strong, bold text

I like to use asterisks to create bold text, like this:
    
    
     This is a *very bold* part of this sentence
    

So the trick is to wrap the part "very bold" in HTML strong tags, so
that it renders like this:

This is a __very bold__ part of this sentence

Also, it must be possible to emphasize several parts on the same
line, like this:
    
    
     This is a *very bold* part of this sentence, and *this* too
    

So, we must be careful with "greediness" in our regexp.

I have been trying to come up with a regexp for this, by tinkering
with the original awkiawki code. Awkiawki uses gsub a lot; gsub is
an awk function for string replacement.

So far, this is what I have come up with:
    
    
    /\*[A-Za-z0-9]/ { if ( $0 !~ /https?:/ && $0 !~ /^ / ) { gsub(/\*([^\*]+)\*/, "<strong>&</strong>"); gsub (/<strong>\*/, "<strong>" ); gsub (/\*<\/strong>/, "</strong>" ); } }
    

This is a line in the file parser.awk. Here is a short explanation
of its parts.

### First part

    /\*[A-Za-z0-9]/ {
    

Select lines that contain an asterisk directly followed by a letter
or a digit.

### Second part

    
    
    if ( $0 !~ /https?:/ && $0 !~ /^ / )
    

Only continue if the line does not contain a URL and does not start
with a space (lines starting with spaces are shown as preformatted
text / typewriter text).
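
To see which lines pass this guard, here is a small sketch that can be run from a shell. The three sample lines and the "rewrite:" / "skip:" labels are made up for illustration; only the pattern and the condition are taken from the rule above.

```shell
# Hypothetical sample input: a normal line, a line with a URL,
# and a line that starts with a space (preformatted text).
sample='a *bold* word
see http://example.com/*star* here
 indented *code* line'

# Same pattern and guard as in the parser.awk rule. Instead of
# rewriting anything, the action only labels the lines it would touch.
guarded=$(printf '%s\n' "$sample" | awk '
/\*[A-Za-z0-9]/ {
    if ($0 !~ /https?:/ && $0 !~ /^ /) { print "rewrite: " $0; next }
}
{ print "skip: " $0 }')

printf '%s\n' "$guarded"
```

Only the first line should be labelled "rewrite:"; the URL line and the preformatted line are left alone.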

### Third part

    
    
    gsub(/\*([^\*]+)\*/, "<strong>&</strong>")
    

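Wrap every substring that is enclosed in asterisks in HTML "strong"
tags. The & in the replacement stands for the whole matched text.

The part ([^*]+) means one or more characters that are not an
asterisk: [^*] matches a single character that is not an asterisk,
and the + repeats it one or more times. Because the match can never
run past an asterisk, it stops at the first closing asterisk.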

This part  of the code  was the hardest to  come up with,  given the
constraint, mentioned above, that multiple parts in one line must be
possible.
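
The difference can be made visible with a small experiment from the shell. This is only a sketch: the greedy pattern with .+ is my own counter-example and is not in parser.awk, and the square brackets are just there to mark what was matched. The number in front is the count of replacements that gsub reports.

```shell
# The sample line with two bold parts.
line='This is a *very bold* part of this sentence, and *this* too'

# Greedy: .+ runs on to the last asterisk, so there is only one match.
greedy=$(printf '%s\n' "$line" |
    awk '{ n = gsub(/\*.+\*/, "[&]"); print n ": " $0 }')

# Negated class: [^*]+ stops at the next asterisk, so both parts match.
nongreedy=$(printf '%s\n' "$line" |
    awk '{ n = gsub(/\*[^*]+\*/, "[&]"); print n ": " $0 }')

printf '%s\n%s\n' "$greedy" "$nongreedy"
```

The first line of output should show one bracketed stretch that swallows everything between the first and the last asterisk, the second line two separate ones.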

### Remainder
    
    
    gsub (/<strong>\*/, "<strong>" ); gsub (/\*<\/strong>/, "</strong>" );
    

The remainder of the code just removes the surrounding asterisks,
which are still there because & stands for the whole match,
asterisks included.
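
Put together, the rule can be tried outside awkiawki as a stand-alone awk program. This is a sketch under two assumptions of mine: the final { print } is added so that the result becomes visible (inside parser.awk the output is handled elsewhere), and [^*] is written without the backslash, which is not needed inside a bracket expression.

```shell
# The sample line with two bold parts.
line='This is a *very bold* part of this sentence, and *this* too'

html=$(printf '%s\n' "$line" | awk '
/\*[A-Za-z0-9]/ {
    if ($0 !~ /https?:/ && $0 !~ /^ /) {
        gsub(/\*([^*]+)\*/, "<strong>&</strong>")  # wrap the match, asterisks included
        gsub(/<strong>\*/, "<strong>")             # drop the leading asterisk
        gsub(/\*<\/strong>/, "</strong>")          # drop the trailing asterisk
    }
}
{ print }')

printf '%s\n' "$html"
```

Both parts of the line should come out wrapped in strong tags, without the asterisks.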

### Not perfect, yet

I am sure there must be a neater, faster and shorter way to do this,
but this is the present status of my quest for that ...
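
One possible direction, as an untested-in-awkiawki sketch: walk through the line with match() and substr() and leave the asterisks out while copying, so the two clean-up gsub calls are not needed. It is not shorter, and the shell function name bold and the variables out, rest and inner are made up for this example. The final { print } is again only there to show the result.

```shell
# Hypothetical helper: reads lines on stdin, writes them with
# *...* turned into <strong>...</strong>.
bold() {
    awk '
    /\*[A-Za-z0-9]/ && $0 !~ /https?:/ && $0 !~ /^ / {
        out = ""
        rest = $0
        # match() sets RSTART and RLENGTH for the leftmost *...* pair.
        while (match(rest, /\*[^*]+\*/)) {
            inner = substr(rest, RSTART + 1, RLENGTH - 2)  # text between the asterisks
            out = out substr(rest, 1, RSTART - 1) "<strong>" inner "</strong>"
            rest = substr(rest, RSTART + RLENGTH)          # continue after the match
        }
        $0 = out rest
    }
    { print }'
}

printf '%s\n' 'This is a *very bold* part of this sentence, and *this* too' | bold
```

match(), RSTART and RLENGTH are standard awk, so this should not depend on a particular awk implementation.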

## Awkiawki

The wonderful code of Awkiawki can be found on bogosoft.com.


$Id: boldregexp.txt,v 1.2 2015/11/22 19:56:27 matto Exp $