While attending an introductory lecture on cryptography last week, the professor guided us through an attack on a simplified DES cypher, and obviously I had a BQN repl open through the whole thing. I felt that the equational approach that array languages propose really helped me reason through the problem, so here’s what I learned from it.

Cypher description

The cypher we attacked was a simplified version of DES, with less rounds and shorter key and blocks. The scheme is as follows:

The message is split in blocks of 12 total bits. each block is processed separately, so from now on we will focus on a single block. Each block is split in a 6-bit left half l and a 6-bit right half r, which are then passed through 3 rounds of the following scheme:

  • \(l' := r\)

  • \(r' := l \oplus f(r, k)\) where \(f\) is a Feistel-style function and \(k\) is the round key.

  • To compute \(f(r, k)\),

    • first set \(h := e(r)\) (an expansion operation from 6 to 8 bits)

    • then compute \(i := h \oplus k\)

    • then use \(i_{0..4}\) and \(i_{4..8}\) to index into the \(s_1, s_2\) S-boxes respectively.

If we do this for all three rounds, we get the following equations:

\[\begin{align} & l_1 := r_0 & r_1 := l_0 \oplus f(r_0, k_0) \\ & l_2 := r_1 & r_2 := l_1 \oplus f(r_1, k_1) \\ & l_3 := r_2 & r_3 := l_2 \oplus f(r_2, k_2) \\ \end{align}\]

The round keys are generated by successive rotations of a single 9-bit key, so that \(k_j := k[j..j+8 \mod 9 ]\)

Normalizing the notation

The first order of business is to make this abstract description executable, and as mentioned I used BQN. In the process I will also fill in the details, like the content of the S-boxes and what the expansion operation does precisely. The data representation is chosen so that

# blocks are 2‿3 arrays, keys are 2‿4.
s  (52)
	101, 010, 001, 110, 011, 100, 111, 000,
	001, 100, 110, 010, 000, 111, 101, 011,
	100, 000, 110, 101, 111, 001, 011, 010,
	101, 011, 000, 111, 110, 010, 001, 100,

E        [0132,3245]⊏⥊
F        {k𝕊r: (k≠E r)˘s}
Round    {k𝕊lr: rlk F r}
Encrypt  {k𝕊lr: lr Round´ (3)(24⥊⌽)¨<k}

Attacking the cypher

We went through a chosen plaintext attack on the cypher, with two pairs of plaintext and cyphertext:

\[\begin{align} & l_0 r_0 := 000111011011 & l_3 r_3 := 000011100101 \\ & l^*_0 r^*_0 := 101110011011 & l^*_3 r^*_3 := 100100011000 \end{align}\]

using the equations above, we can rewrite:

\[\begin{align} r_3 \oplus r^*_3 &= l_2 \oplus l^*_2 \oplus f(r_2, k_2) \oplus f(r^*_2, k_2) \\ &= r_1 \oplus r^*_1 \oplus f(r_2, k_2) \oplus f(r^*_2, k_2) \\ &= l_0 \oplus l^*_0 \oplus f(r_0, k_0) \oplus f(r^*_0,k_0) \oplus f(r_2, k_2) \oplus f(r^*_2, k_2) \end{align}\]

noting that in this specific case \(r_0 = r^*_0\) and that \(r^{(*)}_2 = l^{(*)}_3\), we can cancel out \(f(r_0,k_0) \oplus f(r^*_0,k_0)\) and obtain the equation:

\[r_3 \oplus r^*_3 \oplus l_0 \oplus l^*_0 = f(l_3, k_2) \oplus f(l^*_3, k_2)\]

which we will be solving for \(k_2\).

Equational BQN

We explicitly find the values of k2 that satisfy the equation:

m  ´ r3r3sl0l0s
m  (k2 F l3)  k2 F l3s

BQN is particularly useful for program manipulation because many primitives can be defined in terms of their equational properties, which we can use to simplify until we have a direct program to compute the solution.

m  ((k2≠E l3)˘s)  (k2≠E l3s)˘s˘    # expand F
m  ((2|k2+E l3)˘s)  (2|k2+E l3s)˘s  # ≠ on bool
m  (k2˘(E l3)˘s)  k2˘(E l3s)˘s    # ((≢s)|i+j)⊑s ←→ i⊑j⌽s
m  k2˘((E l3)˘s)  (E l3s)˘s        # ⊑˘ distributes over ≠
m  k2˘´((E l3)˘s)  (E l3s)˘s      # curry ≠
m  k2˘´˘s¨E¨l3l3s                 # map ⌽˘⟜s and E

the equation that lifts most of the weight of this transformation is the following property of Rotate(): ((≢s)|i+j)⊑s ←→ i⊑j⌽s.

this tells us that candidate values of the first and second half k2 are the indices of 1 in the cells of w in the following program. The last line searches for these positions and stitches together the guesses for the two halves, giving the full list of keys.

m  ´r3r3sl0l0s
w  (<˘m)¨´˘s¨E¨l3l3s
>⥊∾⌜˝/(⥊↕42)˘˘w