performance - Ackermann very inefficient with Haskell/GHC -
i try computing ackermann(4,1)
, there's big difference in performance between different languages/compilers. below results on core i7 3820qm, 16g, ubuntu 12.10 64bit,
c: 1.6s, gcc -o3
(with gcc 4.7.2)
int ack(int m, int n) { if (m == 0) return n+1; if (n == 0) return ack(m-1, 1); return ack(m-1, ack(m, n-1)); } int main() { printf("%d\n", ack(4,1)); return 0; }
ocaml: 3.6s, ocamlopt
(with ocaml 3.12.1)
let rec ack = function | 0,n -> n+1 | m,0 -> ack (m-1, 1) | m,n -> ack (m-1, ack (m, n-1)) in print_int (ack (4, 1))
standard ml: 5.1s mlton -codegen c -cc-opt -o3
(with mlton 20100608)
fun ack 0 n = n+1 | ack m 0 = ack (m-1) 1 | ack m n = ack (m-1) (ack m (n-1)); print (int.tostring (ack 4 1));
racket: 11.5s racket
(with racket v5.3.3)
(require racket/unsafe/ops) (define + unsafe-fx+) (define - unsafe-fx-) (define (ack m n) (cond [(zero? m) (+ n 1)] [(zero? n) (ack (- m 1) 1)] [else (ack (- m 1) (ack m (- n 1)))])) (time (ack 4 1))
haskell: unfinished, killed system after 22s ghc -o2
(with ghc 7.4.2)
haskell: 1.8s ajhc
(with ajhc 0.8.0.4)
main = print $ ack 4 1 ack :: int -> int -> int ack 0 n = n+1 ack m 0 = ack (m-1) 1 ack m n = ack (m-1) (ack m (n-1))
the haskell version 1 fails terminate because takes memory. freezes machine , fills swap space before getting killed. can improve without heavily fuglifying code?
edit: appreciate of asymptotically smarter solutions, not asking for. more seeing whether compiler handles patterns in reasonably efficient way (stack, tail calls, unboxing, etc.) computing ackermann function.
edit 2: pointed out several responses, seems a bug in recent versions of ghc. try same code ajhc , better performance.
thank :)
nb: high memory usage issue is bug in ghc rts, upon stack overflow , allocation of new stacks on heap not checked whether garbage collection due. has been fixed in ghc head.
i able better performance cps-converting ack
:
module main data p = p !int !int main :: io () main = print $ ack (p 4 1) id ack :: p -> (int -> int) -> int ack (p 0 n) k = k (n + 1) ack (p m 0) k = ack (p (m-1) 1) k ack (p m n) k = ack (p m (n-1)) (\a -> ack (p (m-1) a) k)
your original function consumes available memory on machine, while 1 runs in constant space.
$ time ./test 65533 ./test 52,47s user 0,50s system 96% cpu 54,797 total
ocaml still faster, however:
$ time ./test 65533./test 7,97s user 0,05s system 94% cpu 8,475 total
edit: when compiled jhc, original program fast ocaml version:
$ time ./hs.out 65533 ./hs.out 5,31s user 0,03s system 96% cpu 5,515 total
edit 2: else i've discovered: running original program larger stack chunk size (+rts -kc1m
) makes run in constant space. cps version still bit faster, though.
edit 3: managed produce version runs fast ocaml 1 manually unrolling main loop. however, works when run +rts -kc1m
(dan doel has filed bug behaviour):
{-# language cpp #-} module main data p = p {-# unpack #-} !int {-# unpack #-} !int ack0 :: int -> int ack0 n =(n+1) #define c(a) #define concat(a,b) c(a)c(b) #define acktype(m) concat(ack,m) :: int -> int acktype(1) acktype(2) acktype(3) acktype(4) #define ackdecl(m,m1) \ concat(ack,m) n = case n of { 0 -> concat(ack,m1) 1 \ ; 1 -> concat(ack,m1) (concat(ack,m1) 1) \ ; _ -> concat(ack,m1) (concat(ack,m) (n-1)) } ackdecl(1,0) ackdecl(2,1) ackdecl(3,2) ackdecl(4,3) ack :: p -> (int -> int) -> int ack (p m n) k = case m of 0 -> k (ack0 n) 1 -> k (ack1 n) 2 -> k (ack2 n) 3 -> k (ack3 n) 4 -> k (ack4 n) _ -> case n of 0 -> ack (p (m-1) 1) k 1 -> ack (p (m-1) 1) (\a -> ack (p (m-1) a) k) _ -> ack (p m (n-1)) (\a -> ack (p (m-1) a) k) main :: io () main = print $ ack (p 4 1) id
testing:
$ time ./test +rts -kc1m 65533 ./test +rts -kc1m 6,30s user 0,04s system 97% cpu 6,516 total
edit 4: apparently, space leak is fixed in ghc head, +rts -kc1m
won't required in future.
Comments
Post a Comment