While attending an introductory lecture on cryptography last week, the professor guided us through an attack on a simplified DES cypher, and obviously I had a BQN repl open through the whole thing. I felt that the equational approach that array languages propose really helped me reason through the problem, so here’s what I learned from it.
Cypher description
The cypher we attacked was a simplified version of DES, with less rounds and shorter key and blocks. The scheme is as follows:
The message is split in blocks of 12 total bits. each block is processed separately, so from now on we will focus on a single block.
Each block is split in a 6-bit left half l and a 6-bit right half r, which are then passed through 3 rounds of the following scheme:
-
\(l' := r\)
-
\(r' := l \oplus f(r, k)\) where \(f\) is a Feistel-style function and \(k\) is the round key.
-
To compute \(f(r, k)\),
-
first set \(h := e(r)\) (an expansion operation from 6 to 8 bits)
-
then compute \(i := h \oplus k\)
-
then use \(i_{0..4}\) and \(i_{4..8}\) to index into the \(s_1, s_2\) S-boxes respectively.
-
If we do this for all three rounds, we get the following equations:
The round keys are generated by successive rotations of a single 9-bit key, so that \(k_j := k[j..j+8 \mod 9 ]\)
Normalizing the notation
The first order of business is to make this abstract description executable, and as mentioned I used BQN. In the process I will also fill in the details, like the content of the S-boxes and what the expansion operation does precisely. The data representation is chosen so that
# blocks are 2‿3 arrays, keys are 2‿4.
s ← (5⥊2)⥊⟨
1‿0‿1, 0‿1‿0, 0‿0‿1, 1‿1‿0, 0‿1‿1, 1‿0‿0, 1‿1‿1, 0‿0‿0,
0‿0‿1, 1‿0‿0, 1‿1‿0, 0‿1‿0, 0‿0‿0, 1‿1‿1, 1‿0‿1, 0‿1‿1,
1‿0‿0, 0‿0‿0, 1‿1‿0, 1‿0‿1, 1‿1‿1, 0‿0‿1, 0‿1‿1, 0‿1‿0,
1‿0‿1, 0‿1‿1, 0‿0‿0, 1‿1‿1, 1‿1‿0, 0‿1‿0, 0‿0‿1, 1‿0‿0,
⟩
E ← [0‿1‿3‿2,3‿2‿4‿5]⊏⥊
F ← {k𝕊r: (k≠E r)⊑˘s}
Round ← {k𝕊l‿r: r⋈l≠k F r}
Encrypt ← {k𝕊l‿r: l‿r Round´ (↕3)(2‿4⥊⌽)¨<k}
Attacking the cypher
We went through a chosen plaintext attack on the cypher, with two pairs of plaintext and cyphertext:
using the equations above, we can rewrite:
noting that in this specific case \(r_0 = r^*_0\) and that \(r^{(*)}_2 = l^{(*)}_3\), we can cancel out \(f(r_0,k_0) \oplus f(r^*_0,k_0)\) and obtain the equation:
which we will be solving for \(k_2\).
Equational BQN
We explicitly find the values of k2 that satisfy the equation:
m ← ≠´ r3‿r3s‿l0‿l0s
m ≡ (k2 F l3) ≠ k2 F l3s
BQN is particularly useful for program manipulation because many primitives can be defined in terms of their equational properties, which we can use to simplify until we have a direct program to compute the solution.
m ≡ ((k2≠E l3)⊑˘s) ≠ (k2≠E l3s)⊑˘s⊑˘ # expand F
m ≡ ((2|k2+E l3)⊑˘s) ≠ (2|k2+E l3s)⊑˘s # ≠ on bool
m ≡ (k2⊑˘(E l3)⌽˘s) ≠ k2⊑˘(E l3s)⌽˘s # ((≢s)|i+j)⊑s ←→ i⊑j⌽s
m ≡ k2⊑˘((E l3)⌽˘s) ≠ (E l3s)⌽˘s # ⊑˘ distributes over ≠
m ≡ k2⊑˘≠´((E l3)⌽˘s) ⋈ (E l3s)⌽˘s # curry ≠
m ≡ k2⊑˘≠´⌽˘⟜s¨E¨l3‿l3s # map ⌽˘⟜s and E
the equation that lifts most of the weight of this transformation is the
following property of
Rotate(⌽):
((≢s)|i+j)⊑s ←→ i⊑j⌽s.
this tells us that candidate values of the first and second half k2 are the indices of 1
in the cells of w in the following program.
The last line searches for these positions and stitches together the guesses for the two halves,
giving the full list of keys.
m ← ≠´r3‿r3s‿l0‿l0s
w ← (<˘m)≡¨≠´⌽˘⟜s¨E¨l3‿l3s
>⥊∾⌜˝/⟜(⥊↕4⥊2)˘⥊˘w