## Sunday, March 23, 2008

### Projecteuler.net problem 14 - Csharp vs Fsharp

Problem 14 stated as following:

 The following iterative sequence is defined for the set of positive integers:n → n/2 (n is even)n → 3n + 1 (n is odd)Using the rule above and starting with 13, we generate the following sequence:13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.Which starting number, under one million, produces the longest chain?

Ok, so here it is. Lets have fun with F#:

`let seq_length n = n > Seq.unfold(fun x -> match x withx when x = 0L -> Nonex when x = 1L -> Some((1L,0L))x when x%2L=0L -> Some((x,x/2L))_ -> Some((x,3L*x + 1L))) > Seq.length[for i in 1L..1000000L -> (i, seq_length i)]> List.fold1_left(fun (ai,al) (i,l) -> if l > al then (i,l) else (ai,al) )> (fun (x,y) -> x ) > print_any`

Elegant, slow version of solution-> ~20 lines; ~10 seconds.

Now lets try C#:

`static long CountTerms(long x, ref long[] terms){ long tc = 1; long k = x; while (k!=1) {  tc++;  k = (k % 2 == 0) ? k / 2 : 3 * k + 1;  if (k <= x)   {   if (terms[k] > 0)   {    terms[x] = terms[k] + tc;    return terms[x];   }  } } terms[x] = tc; return terms[x] ;}static long LongestSeq(){ long max = 1000000; long[] termcount = new long[max+1]; long tcmax = 0; long c = 0; long ix = 0; for (int i = 1; i <= max; i++)  {  c = CountTerms(i, ref termcount);  if ( c > tcmax)  {   tcmax = c;   ix = i;  } } return ix;}static void Main(string[] args){ Console.WriteLine(LongestSeq()); Console.ReadLine();}`

Ugly, fast version of solution -> ~0.18 second; ~60 lines.

Conclusion:

F# solutions can be done in more elegant/less code way, but in contrary it is slower that C#
counterpart. Also C# solutions can be written in fast way, but needs more code and attention.
F# solution was by factor 55 slower than C# solution, but also code was by factor of 3 shorter than C# counterpart. So if you want elegancy/generics power - take F#; and if you want only perfomance- take C#.
P.S. There is a way in F# to code (using mutable keyword) very very ugly like in C#, but i`ve not
suggest using it on F#,- why to code ugly in F# if you can grab C# instead ?? :-)

1. You can considerable improve the preformance of your F# code with out resorting to use the mutable keyword. This code preforms is a factor of 10 slower that the C#, still not perfect but better that a factor of 55 slower:

let rec seq_length x n =
match x with
| x when x = 0L -> (n + 1)
| x when x = 1L -> seq_length 0L (n + 1)
| x when x%2L=0L ->seq_length (x/2L) (n + 1)
| _ -> seq_length (3L*x + 1L) (n + 1)

let rec loop i imax n =
let n' = seq_length i 0
let imax, n = if n' > n then i, n' else imax, n
if i < 1000000L then loop (i + 1L) imax n else imax

print_any (loop 1L 0L 0)

2. It was pointed out to me that the C# version used memorization were as the my F# version didn't once correct the F# version execution time is less that a second and it's still half the number of lines:

#light
open System.Collections.Generic

let dict = new Dictionary< int64, int >()

let seq_length x n =
let updateDict n =
n
let rec seq_length x n =
if dict.ContainsKey(x) then
updateDict (n + dict.[x])
else
match x with
| x when x = 0L -> updateDict (n + 1)
| x when x = 1L -> seq_length 0L (n + 1)
| x when x%2L=0L ->seq_length (x/2L) (n + 1)
| _ -> seq_length (3L*x + 1L) (n + 1)
seq_length x n

let rec loop i imax n =
let n'= seq_length i 0
let imax, n = if n' > n then i, n' else imax, n
if i < 1000000L then loop (i + 1L) imax n else imax

print_any (loop 1L 0L 0)

3. The following F# code is only about 30% slower than the C# version, but still shorter and more readable.

Some notes:
- I would expect the F# optimizer to improve over the next months, so that this F# implementation will probably catch up performance-wise.

- Comparing the original two implementations really is comparing apples with oranges.

- F# is not Haskell. You are allowed to use imperative code constructs and you should do so if that makes your code easier to understand or orders of magnitude faster.

- The readability of the F# code suffers somewhat from the need to make all conversions between int and int64 explicit. This is the price you pay for code inference.

- Interestingly the JIT manages to replace the integer division/ remainder operation with appropriate bit operations even though long is signed. On the other hand, if one replaces all long/int64 with ulongs/uint64 and thus makes the signlessness explicit, the braindead JIT inserts integer division operations instead.

- The ref qualification of the array argument in your C# code is superfluous.

#light
open System
open System.Diagnostics

let max = 1000000L

let countTerms x (terms: int64[]) =
let rec count c = function
| k when k = 1L -> c
| k when k <= x && terms.[int k] > 0L -> terms.[int k] + c
| k when k % 2L = 0L -> count (c + 1L) (k/2L)
| k -> count (c + 1L) (3L*k + 1L)
count 1L (int64 x)

let longestSequence () =
let termcount = Array.zero_create (int max + 1)
let mutable tcmax, ix = int64 0, 0
for i in 1 to int max do
let c = countTerms (int64 i) termcount
termcount.[i] <- c
if c > tcmax then
tcmax <- c
ix <- i
ix

let main () =
let sw = new Stopwatch()
sw.Start()
let result = longestSequence()
sw.Stop()
printf "Result: %i, Runtime: %ims" result sw.ElapsedMilliseconds

main ()

4. The following F# code is 3x shorter and 40% faster than your C#:

let rec inside (a : _ array) n =
if n<=1L || a.[int n]>0s then a.[int n] else
let p =
if n &&& 1L = 0L then inside a (n >>> 1) else
let n = 3L*n + 1L
if n < int64 a.Length then inside a n else outside a n
a.[int n] <- 1s + p
1s + p
and outside (a : _ array) n =
let n = if n &&& 1L = 0L then n >>> 1 else 3L*n + 1L
1s + if n < int64 a.Length then inside a n else outside a n

let euler14 n =
let a = Array.create (n+1) 0s
let a = Array.init (n+1) (fun n -> inside a (int64 n))
let i = Array.find_index (Array.fold1_left max a |> (=)) a
i, a.[i]

euler14 1000000

5. If possible please investigate an example that uses the performance of multi-core CPU's using ParallelFX: multi-processing extensions for .NET Framework.

Ref.:
http://weblogs.asp.net/esanchez/archive/2007/11/30/parallelfx-multi-processing-extensions-for-net-framework.aspx

Thanks,
art

6. Well.. by replace one c# line to
(guess which line..):

k = ((k & 0x1) == 0) ? (k >> 1) : ((k << 1) + k) + 1;

I squeezed another 20% performance improvement out of the ugly version...

7. You might like to look at my solution to Problem 14 which uses a Functional approach in C#, and has an elegant approach to memoize the calculations.

Comment will be posted after comment moderation.