TITLE: Playfair cipher in R
DATE: 2021-01-25
AUTHOR: John L. Godlee
====================================================================


I was designing a treasure hunt as a Christmas present. I wanted to 
create a Playfair cipher as the final clue which when decoded would 
reveal the location of the Christmas present.

I used R to construct a function which produces a cipher matrix and 
key lookup table, and an encoded message. Here is a brief 
description of how the playfair cipher works:

Start with a matrix of letters:

a|J|N|G|o|k S|R|h|B|Z|b x|E|w|z|u|f d|l|t|W|H|p r|K|n|I|c|M 
s|q|g|P|T|D

and a lookup table:

keypair|letter wD|A bM|B aq|C sB|T

and an encoded message: sJgfSP

Find each pair of characters in the encoded message in the matrix, 
here starting with aq:

a|J|N|G|o|k S|R|h|B|Z|b x|E|w|z|u|f d|l|t|W|H|p r|K|n|I|c|M 
s|q|g|P|T|D

and take the "opposite" corners of the box formed by the keypair. 
In this case the answers are aq, wD, and sB.

Then take the output keypairs and match them in the lookup table. 
The answer here is CAT.

My function actually uses a slightly adapted version of the 
Playfair cipher. The differences are:

-   In my version if two key values are on the same row or column 
in the matrix they are simply swapped round, rather than transposed 
to the right or down.
-   In my version the matrix is 6x6 rather than 5x5 and uses a 
sample of 36 uppercase and lowercase letters rather than 25 (-J) 
uppercase letters.

Here is the function, which takes the message to be encoded as its 
single argument. It returns the encoded message, the matrix and the 
key lookup table:

    #' Create a playfair-style cipher
    #'
    #' @param x character string to encode
    #'
    #' @return list with three slots: (1) encoded message (2) 
decoder matrix 
    #'     (3) decoder lookup table
    #'
    #' @details Creates a cipher based on the original playfair 
cipher. 
    #'     Unlike the original playfair cipher this method produces 
a 
    #'     6x6 grid of upper and lowercase letters. Additionally, 
the 
    #'     behaviour when a keypair appear on the same row or 
column of 
    #'     the decoder matrix is different. In this version 
keypairs which
    #'     appear on the same row or column are merely swapped 
rather than
    #'     transposed as in the original cipher. 
    #'     Messages to be encoded are converted to uppercase and all
    #'     non-alphabet characters are stripped out.
    #'
    #' @examples
    #' x <- "This is a test"
    #' playfair(x)
    #' 
    #' @export
    #' 
    playfair <- function(x) {
      # List all letters, upper and lowercase (52 chr)
      all_chr <- c(letters, LETTERS)

      # Create 6x6 matrix of distinct letters
      mat <- matrix(sample(all_chr, 6*6), 6, 6)

      # Get all pairwise combinations of grid positions
      locs_pairs <- matrix(combn(seq(length(mat)), 2), ncol = 2)
      locs_clean <- unique(locs_pairs[locs_pairs[,1] != 
locs_pairs[,2],])

      # Randomly sample pairs of grid positions 
      # 26 times to create windows for each letter
      locs_letters <- locs_clean[sample(nrow(locs_clean), 26),]

      # Order the pairs to always take the top left of each pair
      locs_pairs <- apply(locs_letters, 1, function(y) {
        c(min(y), max(y))
        })

      # Search matrix for grid positions to get letter combinations
      combins <- apply(locs_pairs, 2, function(y) {
        paste0(mat[y[1]], mat[y[2]])
      })

      # Make tidy dataframe of letter codes
      code_df <- data.frame(input = combins,
        output = LETTERS)

      # Split x into component characters, 
      # remove spaces and non-letter characters
      x_string <- unlist(strsplit(toupper(x), 
        split = ""))
      x_string_clean <- x_string[x_string %in% LETTERS]
      decoded <- code_df[match(x_string_clean, code_df$output), 
"input"]

      # For each character, encode
      out <- unlist(lapply(decoded, function(i) {
        # Split string
        i_split <- unlist(strsplit(i, split = ""))

        # Find locations in matrix
        letter_one <- c(which(mat == i_split[1], arr.ind = TRUE))
        letter_two <- c(which(mat == i_split[2], arr.ind = TRUE))

        # Get opposite locations
        if (letter_one[1] == letter_two[1]) {
          opp_one <- mat[letter_one[1], letter_two[2]]
          opp_two <- mat[letter_two[1], letter_one[2]]
        } else {
          opp_one <- mat[letter_two[1], letter_one[2]]
          opp_two <- mat[letter_one[1], letter_two[2]]
        }

        # Combine into one string
        out <- paste0(opp_one, opp_two)

        out
      }))
      return(list(code = out, matrix = mat, key = code_df))
    }