Cracking Word Searches with Haskell
After learning Haskell a few months ago, I've had disappointingly few opportunities to use it. For a simple data-crunch a python repl starts up faster and lets me work quicker, and many of my other projects require a specific language. So when I was browsing reddit and saw /u/SWEAR_WORD_SEARCH, I thought it would be a fun, quick project to crack these in Haskell. (quick warning: if you didn't guess from the SWEAR_WORD part, this article contains some profane word searches ;-)
The format of the word searches posted by this user are pretty simple. Here's an example:
N Z C G S M Z C M Q A C B O S W R E D V
X O K P G H N R E X T C V P E Z G W F V
W C D T N C P I Y F M E R J S Z W L E S
T R L K Z A S T M Z J V O N W F C Z X K
E S N N O H R W Y J I S Z A S M Y I S P
K T M W J R A D F W F M G W L I G O Q Q
C L D O L W G A M U V K G N I K C U F S
V X L K U O G H H I E Y Y Z C I S F B H
G J W B V U O R F H P E Y M N I I T O Q
G P V S N Y Y M R N H S A C J F J X H Z
T G J M F B R L E M H N R C S R I J M M
K V S A J V O N A W J E A F U U D W L B
P B H R H H Y A D B D W D Y G S W H A A
I Q I T M K I A Q I Y M W E A T U M J M
P P X W A T C H B H Y T Y Z D R S U Y U
V Q J N T L X L F V S D P G P A E Q E K
S Z N A E C Y N E J K H T W M T S P Z M
G Q M S E X T L R G Y G U B Y I J I A B
J C U D P E Z P M D W F Z G I N Z L L Y
P Y G X F W H P N T X R S I F G C C L T
Find the words:
FRUSTRATING INCREDIBLY FUCKING WATCH SMART
There's two parts here: the word search itself, and the word list we need to find. We could parse this with a combinator library like Parsec, but it's simple enough that lines should be enough. Indeed, after a bit of hacking in ghci and refactoring, I got
parsePuzzle :: String -> ([[Char]], [String])
parsePuzzle p = (parseLetters $ lines p, parseWords $ lines p)
where dup f a = f a a
parseLine = map head . words
parseLetters = map parseLine . dup (take . (subtract 2) . length)
parseWords = words . last
While the point-free style didn't work perfectly on parseLetters (I'm not a fan of Forth-like argument manipulation in Haskell, such as flip
and dup
), I think it's better than the alternative with names.
Now we need to locate the words in the 2d letter matrix. This is surprisingly easy, using Data.List.transpose
. We need to search through rows, columns, and diagonals. Rows are easy, they're simply the fst of the tuple we get from parsePuzzle
. Columns are similar, simply use transpose
on the rows. We need a little trick for the diagonals, though, however it's not as hard as it initially seems.
First, we need a helper function mapIndex. This is similar to the function map, which you are hopefully familiar with, however it also provides the function with the index of the current item.
mapIndex :: (a -> Int -> b) -> [a] -> [b]
mapIndex f l = map (uncurry f) $ zip l [0..length l - 1]
(uncurry
has signature (a -> b -> c) -> (a, b) -> c
, and is used to unwrap the tuple from zip
into two arguments)
This might not seem particularly useful, but in fact we're only a step away from finding the diagonals now. By passing flip drop
to mapIndex
, and transposing the result, we get
ghci> putStrLn $ unlines p
NZCGSMZCMQACBOSWREDV
XOKPGHNREXTCVPEZGWFV
WCDTNCPIYFMERJSZWLES
TRLKZASTMZJVONWFCZXK
ESNNOHRWYJISZASMYISP
KTMWJRADFWFMGWLIGOQQ
CLDOLWGAMUVKGNIKCUFS
VXLKUOGHHIEYYZCISFBH
GJWBVUORFHPEYMNIITOQ
GPVSNYYMRNHSACJFJXHZ
TGJMFBRLEMHNRCSRIJMM
KVSAJVONAWJEAFUUDWLB
PBHRHHYADBDWDYGSWHAA
IQITMKIAQIYMWEATUMJM
PPXWATCHBHYTYZDRSUYU
VQJNTLXLFVSDPGPAEQEK
SZNAECYNEJKHTWMTSPZM
GQMSEXTLRGYGUBYIJIAB
JCUDPEZPMDWFZGINZLLY
PYGXFWHPNTXRSIFGCCLT
ghci> putStrLn $ unlines $ transpose $ mapIndex (flip drop) p
NODKORGHFNHEDEDASILT
ZKTZHAAHHHNAYAREPAY
CPNARDMIPSRFGTSQZB
GGCSWFUEEACUSUUEM
SHPTYWVYYCSUWMYK
MNIMJFKYMJRDHJU
ZRYZIMGZNFIWAM
CEFJSGNCIJJLA
MXMVZWIIIXMB
QTEOALKSTHM
ACRNSICFOZ
CVJWMGUBQ
BPSFYOFH
OEZCIQS
SZWZSQ
WGLXP
RWEK
EFS
DV
V
... the left to right diagonals! We can get the left to right diagonals easily as well, by using (map reverse p
) instead of p
in the right of the pipeline. Finally, to cover the bottom diagonals as well we need to reverse
p before we process it, these are the prime versions.
Now, we should easily be able to find if a word exists in the word search.
findWord :: String -> [[Char]] -> String
findWord s p = head $ filter (checkIn s) (rows p ++ cols p ++ dial p ++ diar p ++ dial' p ++ diar' p)
where checkIn s line = s `isInfixOf` line || reverse s `isInfixOf` line
rows p = p
cols p = transpose p
dial p = transpose $ mapIndex (flip drop) p
diar p = transpose $ mapIndex (flip drop) (map reverse p)
dial' p = transpose $ mapIndex (flip drop) (reverse p)
diar' p = transpose $ mapIndex (flip drop) (map reverse (reverse p))
ghci> findWord "FUCKING" (fst $ parsePuzzle puzzle)
"CLDOLWGAMUVKGNIKCUFS"
ghci> take 7 $ drop 1 $ reverse "CLDOLWGAMUVKGNIKCUFS"
"FUCKING"
That's all well and good - we can see if a word appeared in the word search! But we really need to know what letter positions the word appears in, so we can do something special when printing those positions out. There's a few ways we could do this: a zipper, printing as we go, etc. However, the simplest solution is to simply keep position data with the characters. It will be preserved through the transformations, and at the end we can simply check to see if the characters we're printing are in those positions. First, let's add a type synonym.
type Letter = (Char, Int, Int)
and modify parsePuzzle
to generate Letter
s instead of Char
s:
parsePuzzle :: String -> ([[Letter]], [String])
parsePuzzle p = (parseLetters $ lines p, parseWords $ lines p)
where dup f a = f a a
parseLine chars y = mapIndex (\[c] x -> (c, x, y)) $ words chars
parseLetters = mapIndex parseLine . dup (take . (subtract 2) . length)
parseWords = words . last
Sadly the point-free style doesn't work for parseLine
anymore, but luckily we had to make only one change to parseLetters
, changing map
to our mapIndex
function from earlier.
Finally, we need to change findWord
to use Letter
as well. This is pretty simple:
findWord :: String -> [[Letter]] -> [Letter]
findWord s p = head $ filter (checkIn s) (rows p ++ cols p ++ dial p ++ diar p ++ dial' p ++ diar' p)
where lts = map (\(c, _, _) -> c)
checkIn s line = s `isInfixOf` lts line || reverse s `isInfixOf` lts line
rows p = p
cols p = transpose p
dial p = transpose $ mapIndex (flip drop) p
diar p = transpose $ mapIndex (flip drop) (map reverse p)
dial' p = transpose $ mapIndex (flip drop) (reverse p)
diar' p = transpose $ mapIndex (flip drop) (map reverse (reverse p))
ghci> findWord "FUCKING" (fst $ parsePuzzle puzzle)
[('C',0,6),('L',1,6),('D',2,6),('O',3,6),('L',4,6),('W',5,6),('G',6,6),('A',7,6),('M',8,6),('U',9,6),('V',10,6),('K',11,6),('G',12,6),('N',13,6),('I',14,6),('K',15,6),('C',16,6),('U',17,6),('F',18,6),('S',19,6)]
It might not seem like much, but in fact, we're incredibly close to finished! First, let's collect the positions for every word into one list:
ghci> let (p, w) = parsePuzzle puzzle
ghci> nub $ foldl (\pos word -> findWord word p ++ pos) [] w
[('G',3,0),('P',3,1),('T',3,2),('K',3,3),('N',3,4),('W',3,5),('O',3,6),('K',3,7),('B',3,8),('S',3,9),('M',3,10),('A',3,11),('R',3,12),('T',3,13),('W',3,14),('N',3,15),('A',3,16),('S',3,17),('D',3,18),('X',3,19),('P',0,14),('P',1,14),('X',2,14),('A',4,14),('T',5,14),('C',6,14),('H',7,14),('B',8,14),('H',9,14),('Y',10,14),('T',11,14),('Y',12,14),('Z',13,14),('D',14,14),('R',15,14),('S',16,14),('U',17,14),('Y',18,14),('U',19,14),('C',0,6),('L',1,6),('D',2,6),('L',4,6),('W',5,6),('G',6,6),('A',7,6),('M',8,6),('U',9,6),('V',10,6),('K',11,6),('G',12,6),('N',13,6),('I',14,6),('K',15,6),('C',16,6),('U',17,6),('F',18,6),('S',19,6),('W',15,0),('Z',15,1),('Z',15,2),('F',15,3),('M',15,4),('I',15,5),('I',15,7),('I',15,8),('F',15,9),('R',15,10),('U',15,11),('S',15,12),('T',15,13),('A',15,15),('T',15,16),('I',15,17),('N',15,18),('G',15,19)]
(nub, from Data.List
, removes duplicates from a list)
Finally, we need to print out the puzzle. First, let's focus on just printing it without any decorations for where words are.
mapM_ (\ row -> do
mapM_ (\ (c, _, _) -> putChar c) row
putStrLn ""
) p
Expanding it to highlight the rows is easy enough as well.
main = do
let (p, w) = parsePuzzle puzzle
positions = nub $ foldl (\pos word -> findWord word p ++ pos) [] w
mapM_ (\ row -> do
mapM_ (\ l@(c, _, _) ->
if l `notElem` positions
then putChar c
else putStr ("\x1b[32m" ++ [c] ++ "\x1b[0m")
) row
putStrLn ""
) p
While using raw escape codes to color the output is a bit ugly, and doesn't work on Windows, it's better than pulling in a whole library just for this. This works OK:
However, you might see a problem. Usually when solving a word search, you would only highlight the word, not the entire line. To fix this, we need to change findWord
. Right now, findWord
returns the entire line. This is a relic from before we returned positions. Now, since we return positions, there's enough context in the word itself.
The current code of findWord looks like this:
findWord :: String -> [[Letter]] -> [Letter]
findWord s p = head $ filter (checkIn s) (rows p ++ cols p ++ dial p ++ diar p ++ dial' p ++ diar' p)
where lts = map (\(c, _, _) -> c)
checkIn s line = s `isInfixOf` lts line || reverse s `isInfixOf` lts line
rows p = p
cols p = transpose p
dial p = transpose $ mapIndex (flip drop) p
dial' p = transpose $ mapIndex (flip drop) (reverse p)
diar p = transpose $ mapIndex (flip drop) (map reverse p)
diar' p = transpose $ mapIndex (flip drop) (map reverse (reverse p))
There's a few places we could change the code to make this work. However, I'm going to add an additional step after we extract the filtered list's head. This is a bit inefficient, since it means we have to locate s
/reverse s
twice, however it keeps the code cleaner and simpler.
To start out, we need a function, similar to isInfixOf
, but that returns the position of the infix list. Sadly this doesn't exist in the standard library, but happily it exists on the Haskell wiki.
Now that we have this function, we can modify findWord
to work like we want.
findWord :: String -> [[Letter]] -> [Letter]
findWord s p = onlyString s $ head $ filter (checkIn s) (rows p ++ cols p ++ dial p ++ diar p ++ dial' p ++ diar' p)
where lts = map (\(c, _, _) -> c)
checkIn s line = s `isInfixOf` lts line || reverse s `isInfixOf` lts line
indexOfR s line = fromJust $ findSublistIndex s (lts line) `orElse` findSublistIndex (reverse s) (lts line)
onlyString s line = take (length s) $ drop (indexOfR s line) $ line
rows p = p
cols p = transpose p
dial p = transpose $ mapIndex (flip drop) p
dial' p = transpose $ mapIndex (flip drop) (reverse p)
diar p = transpose $ mapIndex (flip drop) (map reverse p)
diar' p = transpose $ mapIndex (flip drop) (map reverse (reverse p)
We added the functions onlyString
, which returns only the range of letters in line corresponding to s
, and indexOfR
, which returns the index of either s
or reverse s
in line. It uses the function Data.Generics.Aliases.orElse
, which returns the first or second Maybe
, depending on which is not Nothing
. We then unwrap that with fromJust
, since we know that at least one will not be Nothing.
This works a lot better, only highlighting the actual words:
This about wraps everything up. Here's the full code to the project:
import Data.List (transpose, findIndex, isInfixOf, isPrefixOf, tails, nub)
import Data.Maybe (fromJust)
import Data.Generics.Aliases (orElse)
type Letter = (Char, Int, Int)
main = do
let (p, w) = parsePuzzle puzzle
positions = nub $ foldl (\pos word -> findWord word p ++ pos) [] w
mapM_ (\ row -> do
mapM_ (\ l@(c, _, _) ->
if l `notElem` positions
then putChar c
else putStr ("\x1b[32m" ++ [c] ++ "\x1b[0m")
) row
putStrLn ""
) p
findSublistIndex :: Eq a => [a] -> [a] -> Maybe Int
findSublistIndex xss xs = findIndex (isPrefixOf xss) $ tails xs
mapIndex :: (a -> Int -> b) -> [a] -> [b]
mapIndex f l = map (uncurry f) $ zip l [0..length l - 1]
findWord :: String -> [[Letter]] -> [Letter]
findWord s p = onlyString s $ head $ filter (checkIn s) (rows p ++ cols p ++ dial p ++ diar p ++ dial' p ++ diar' p)
where lts = map (\(c, _, _) -> c)
checkIn s line = s `isInfixOf` lts line || reverse s `isInfixOf` lts line
indexOfR s line = fromJust $ findSublistIndex s (lts line) `orElse` findSublistIndex (reverse s) (lts line)
onlyString s line = take (length s) $ drop (indexOfR s line) $ line
rows p = p
cols p = transpose p
dial p = transpose $ mapIndex (flip drop) p
dial' p = transpose $ mapIndex (flip drop) (reverse p)
diar p = transpose $ mapIndex (flip drop) (map reverse p)
diar' p = transpose $ mapIndex (flip drop) (map reverse (reverse p))
parsePuzzle :: String -> ([[Letter]], [String])
parsePuzzle p = (parseLetters $ lines p, parseWords $ lines p)
where dup f a = f a a
parseLine chars y = mapIndex (\[c] x -> (c, x, y)) $ words chars
parseLetters = mapIndex parseLine . dup (take . (subtract 2) . length)
parseWords = words . last
puzzle = "\
\N Z C G S M Z C M Q A C B O S W R E D V\n\
\X O K P G H N R E X T C V P E Z G W F V\n\
\W C D T N C P I Y F M E R J S Z W L E S\n\
\T R L K Z A S T M Z J V O N W F C Z X K\n\
\E S N N O H R W Y J I S Z A S M Y I S P\n\
\K T M W J R A D F W F M G W L I G O Q Q\n\
\C L D O L W G A M U V K G N I K C U F S\n\
\V X L K U O G H H I E Y Y Z C I S F B H\n\
\G J W B V U O R F H P E Y M N I I T O Q\n\
\G P V S N Y Y M R N H S A C J F J X H Z\n\
\T G J M F B R L E M H N R C S R I J M M\n\
\K V S A J V O N A W J E A F U U D W L B\n\
\P B H R H H Y A D B D W D Y G S W H A A\n\
\I Q I T M K I A Q I Y M W E A T U M J M\n\
\P P X W A T C H B H Y T Y Z D R S U Y U\n\
\V Q J N T L X L F V S D P G P A E Q E K\n\
\S Z N A E C Y N E J K H T W M T S P Z M\n\
\G Q M S E X T L R G Y G U B Y I J I A B\n\
\J C U D P E Z P M D W F Z G I N Z L L Y\n\
\P Y G X F W H P N T X R S I F G C C L T\n\
\Find the words:\n\
\FRUSTRATING INCREDIBLY FUCKING WATCH SMART"
Exercises for the reader:
- Allow it to read word searches from standard input
- Allow it to load them from Reddit comments, using HTTP and a JSON library
- Many thanks to https://reddit.com/user/Verroq, who pointed out a bug in my code and showed how arrows made it cleaner.