Wrap-up: mergesort in haskell

I have to admit that I didn’t fully understand the apfelmus example code. Nevertheless, I made an effort to address both of his criticisms:

  1. The merge() function does not use an accumulator argument any more and is indeed much simpler now.
  2. The recursive mergesort_ function now does not use the haskell list length operator any more. Instead, the length of the list to be sorted is passed down the recursive chain.

Please see the optimised mergesort implementation for details.

These improvements reduced the RAM utilisation and improved the run-time performance by another 20% respectively.

Nothing to scoff at, eh?


4 thoughts on “Wrap-up: mergesort in haskell

  1. You can possibly do better by writing a simple function to find the middle of the list in one pass without using length.

    The trick is to maintain two pointers, and move one ahead twice as fast as the other.

    findMiddle [] [] = ([], [])
    findMiddle (x:xs) [] = ([x], xs)
    findMiddle (x:xs) (_:[]) = ([x], xs)
    findMiddle (x:xs) (_:_:ys) = (x:xs’, ys’)
    where (xs’, ys’) = findMiddle xs ys

    that’s the idea, I think it’s got an off-by-1 error somewhere in there.

  2. To understand the different approaches to dividing the list, you can use a dummy merge and have a look at the resulting tree

    import Data.Tree
    leaf x = Node (“[“++show x++”]”) []
    merge a b = Node “merge” [a,b]
    empty = Node “[]” []

    In other words, the mergesorts won’t sort a list anymore but instead return a tree that shows how the calls to merge are nested. Use drawTree to print it.

    — apelmus’ version
    mergesortA [] = empty
    mergesortA xs = foldtree1 merge $ map leaf xs

    foldtree1 f [x] = x
    foldtree1 f xs = foldtree1 f $ pairs xs
    pairs (x:x’:xs) = f x x’ : pairs xs
    pairs xs = xs

    — muharem’s version
    mergesortB xs = mergesort’ xs (length xs)
    mergesort’ [] 0 = empty
    mergesort’ [x] 1 = leaf x
    mergesort’ xs n = merge (mergesort’ ls ln) (mergesort’ rs rn)
    (ls,rs) = splitAt middle xs
    middle = n `div` 2
    (ln,rn) = (middle, n – middle)

    — mrd’s version
    mergesortC [] = empty
    mergesortC [x] = leaf x
    mergesortC xs = merge (mergesortC ls) (mergesortC rs)
    where (ls,rs) = findMiddle xs xs

    findMiddle xs [] = ([], xs)
    findMiddle xs (_:[]) = ([], xs)
    findMiddle (x:xs) (_:_:ys) = (x:xs’, ys’)
    where (xs’, ys’) = findMiddle xs ys

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s