Sunday, March 2, 2014

Akka actors based, generic primes calculation parallel algorithm

Actors programming is an interesting solution to take benefits of all your server CPU core to compute primes number while still being able to compute their order positions.

Here the principle is to create an actor router, which will load balance primes computation, CheckerActors. The load balancing is based on the size of each load balanced actors, a new task will be given to the actor with the lowest mailbox size.

All checked values are then sent to a single actor which will reorder received values and attribute them their respective positions. This actor, the ValuesManagerActor, is also responsible of managing the load, and check that final destination of all results can manage the result flow.

Here all results are sent to PrinterActor, this actor output results on the standard ouput, if you try here to print all results this actor may not as fast as the received result flow, so its mailbox will grow, up to the jvm maximum heap size thus generating OutOfMemory exception. That's why here I've added an acknowledge mechanism that verify we do not have too many results sent and not taken into account. If we have too many results not taken into account then ValuesManagerActor suspend the load balancing to CheckerActors.

My primes project at github for experiments and tests.

How long time to get the 500000th prime (sbt console) (first results, jvm 1.7.0.51, Intel Core i7 2.3 Ghz, 4 CPUs) :

Algo Duration Comments
Classic61s pgen.primes.take(500000).last
Parallel22s pgen.primesPar.take(500000).last
Actor (ack on each response)38s with fork-join-executor
Actor (ack every 5000 responses)35s with fork-join-executor
Actor (ack on each response)33s with thread-pool-executor
Actor (ack every 5000 responses)30s with thread-pool-executor

Actors based algorithm is quite faster than a classic sequential one, but not as fast as using a parallel collection in order to parallelize the classic algorithm.

Although my implementation is not as fast as the parallel solution (I'll have to investigate that point), it has a major benefits, it can be distributed across several computers. It'll be the subject of a new article.

Saturday, February 1, 2014

Generic primes generator

An example of how to implement a generic primes generator that you can use with any numerical type. Of course, performances will be impacted by your choice :

  • howlongfor(intPrimes.drop(25000).head) // res0: (String, Int) = (459ms,287137)
  • howlongfor(longPrimes.drop(25000).head) // res1: (String, Long) = (1349ms,287137)
  • howlongfor(bigIntPrimes.drop(25000).head) // res2: (String, BigInt) = (4361ms,287137)

My primes project can be found here.

Simplified source code

Sunday, November 24, 2013

Situations when you must not forget to use the scala blocking statement when working with futures...

Futures imply the use of an execution context, a custom one or the default one. An execution context is a mechanism that manage threads for you, map operations to be executed on them, they come with internal default limits which are most of time proportional to the number of CPU you have. This default behavior is the right one when dealing with CPU consuming operations, but with IO this is not the case. If you need to call 100 remote calls (1 seconds required for each of them) on an other system and you have only 2 cpu, your total response time will be 50 seconds, for an almost 0% cpu usage on your host...

That's why the blocking statement was introduced, it allows you to mark a code section as blocking, by blocking we mean time consuming and not cpu consuming. Thanks to this statement, the execution context will be able to grow as necessary, and you'll get very low latency.

On my 6 cores CPU, 50 virtual remote calls (2 seconds for each call) require only 2056ms with the blocking statement, and it requires of course 18 seconds without it (18s * 6cpu = 108s * 1cpu).

Take care with Futures created in scala for-comprehension...

Create Futures outside the for-comprehension, in order for them to start immediately their operations. If you create your futures inside the for-comprehension, your execution will be sequential and not concurrent ! This is a mistake very easy to make, not so easy to detect because it won't fail, you just won't be able to use all your CPU as you should when the processing involves CPU, or for other cases check for example how many network connections are established simultaneously...

Tuesday, October 1, 2013

Ulam spiral in scala

Just published a scala project on github, to play with primes and ulam spiral. To generate a 500x500 ulam spiral use :
$ sbt "run 500"
It will generate a PNG image file named "ulam-spiral-500.png"
To starts the console and play with primes :
$ sbt console"

scala> sexyPrimeStream(4)
res3: primes.Primes.PInteger = 17

scala> primeStream.take(15).toList
res3: List[primes.Primes.PInteger] = List(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47)

scala> sexyPrimeStream.take(5).toList
res5: List[primes.Primes.PInteger] = List(5, 7, 11, 13, 17)

scala> isolatedPrimeStream.take(10).toList
res2: List[primes.Primes.PInteger] = List(23, 37, 47, 53, 67, 79, 83, 89, 97, 113)

Saturday, September 28, 2013

Prime number scala stream based on an int stream

An integer stream filtered out as an prime number stream. So 21 seconds on my laptop to compute the first 100,001 prime numbers, the last one is 1 299 721, of course here only one processor is used...

Saturday, September 21, 2013

Scala collection chisel

First release and not yet good enough from my point of view but it works almost as it should :