I have been reading about haskell for a while now and felt that it’s time to “get my hands dirty” by (re-)implementing some of the well known algorithms in computer science. For my first exercise I chose merge sort.
The resulting code (sans command line parameter handling, auxiliary I/O functions etc.) is quite neat and was written in little time. Haskell is very expressive and its attractiveness immediately obvious when formulating pseudo-mathematical problem solutions.
1 module Main(main) where 2 3 import qualified IO 4 import System(getArgs) 5 import Monad(mapM_) 6 import qualified Scaffolding 7 8 mergesort l = 9 if (length l) <= 1 then l -- The list is already sorted. 10 else -- Split the list into two halves and sort these. 11 merge (mergesort lpart) (mergesort rpart)  12 where llen_half = (length l) `div` 2 13 (lpart, rpart) = splitAt llen_half l 14 15 -- Case #1: the right list is empty; just append the left list to the 16 -- accumulator. The latter is reversed because we were prepending 17 -- to it in case #3. 18 merge lpart  acc = (reverse acc) ++ lpart 19 20 -- Case #2: the left list is empty; just append the right list to the 21 -- (reversed) accumulator list. 22 merge  rpart acc = (reverse acc) ++ rpart 23 24 -- Case #3: neither of the left/right lists to be merged is empty; 25 -- prepend the lesser head element to the accumulator list. 26 merge (l:ls) (r:rs) acc = 27 if l <= r then merge ls (r:rs) (l:acc) 28 else merge (l:ls) rs (r:acc) 29 30 -- The main method, handles command line arguments, input and output. 31 main = do 32 args <- getArgs 33 fileh <- case head args of 34 -- The text to be sorted is to be read from stdin. 35 "-" -> do return (Just IO.stdin) 36 -- The text to be sorted is to be read from the file 37 -- specified. 38 "-f" -> do Scaffolding.openFile (head (tail args)) 39 -- Just sort the command line parameters. 40 _ -> do let sorted_args = mergesort args 41 IO.putStrLn ("mergesort(" ++ show args ++ 42 ") = " ++ show sorted_args) 43 return Nothing 44 case fileh of 45 -- Read text from file handle, sort it and print to stdout. 46 Just fileh -> do text_to_sort <- Scaffolding.readLines fileh  47 mapM_ IO.putStrLn (mergesort text_to_sort) 48 -- We failed to open the file specified (if any). 49 Nothing -> do return ()
I also liked the fact that the haskell standard library is rich and nicely documented.
Just to get an idea of how my naive
mergesort implementation performs I created a file listing all the files on my Dell D630 laptop running Hardy Heron (please ignore the recursive nature of this problem 🙂 ).
1 u804: haskell $ ghc --version 2 The Glorious Glasgow Haskell Compilation System, version 6.8.2 3 u804: haskell $ ls -l all-files.txt 4 -rw-r--r-- 1 mhr mhr 26245382 2008-05-31 17:18 all-files.txt 5 u804: haskell $ wc -l all-files.txt 6 399754 all-files.txt 7 u804: haskell $ uname -a 8 Linux u804 2.6.24-18-generic #1 SMP Wed May 28 20:27:26 UTC 2008 i686 GNU/Linux
I then let loose the compiled haskell binary on that file (
all-files.txt) and compared how it performed against
1 u804: haskell $ time ./mergesort -f all-files.txt >/dev/null 2 3 real 0m12.994s 4 user 0m12.349s 5 sys 0m0.632s 6 u804: haskell $ time sort all-files.txt >/dev/null 7 8 real 0m10.675s 9 user 0m10.613s 10 sys 0m0.052s 11 u804: haskell $ time sort all-files.txt >/dev/null
The difference in performance was moderate (22%) given that I did not tune the haskell implementation in any way.
However, I gasped when I observed the RAM utilisation while the haskell program was running: it used 726 MB of RAM!
Compare this to 34 MB that were used by the standard
I guess it’s time to get acquainted with
ghc -prof and friends 🙂
Last but not least I’d like to point to a great haskell resource, the Real World Haskell beta book.