Wednesday, December 14, 2011

Easy SSH scala scripting

Visit janalyse-ssh and download "jassh.jar", a standalone executable jar.
In the directory where you downloaded the "jassh.jar" file, create a file named "helloworld" and copy & paste the following content into it:
#!/bin/sh
exec java -jar jassh.jar -nocompdaemon -usejavacp -savecompiled "$0" "$@"
!#
import fr.janalyse.ssh._
SSH.connect(host="localhost", username="test", password="testtest") { ssh =>
  println(ssh.execute("""echo -n "Hello World from `hostname`" """))
}
Change the specified user to an existing one, or create a "test" user. Provide the right password, or remove it if you've exported your public SSH key into the remote user's ~test/.ssh/authorized_keys file.
Then make your "helloworld" file executable (chmod u+x helloworld) and execute it:
toto@myhost $ ./helloworld 
Hello World from myhost
Easy! And you get all the benefits of the scala language using just vi, for any kind of remote operation! No graphical user interface required, no explicit compilation step (automatic and transparent compilation on the fly when needed), high performance (I've reached a command throughput of up to 545 cmd/s on my host), parallelism using actors, easy parsing using combinators, ...
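For instance, here is a small sketch (using only the SSH.connect and execute calls shown above; the host names are just placeholders to adapt to your own environment) that collects the load average from several machines in a row:
#!/bin/sh
exec java -jar jassh.jar -nocompdaemon -usejavacp -savecompiled "$0" "$@"
!#
import fr.janalyse.ssh._

// Placeholder host list - adapt to your environment
val hosts = List("localhost", "server1", "server2")

for (host <- hosts) {
  SSH.connect(host=host, username="test", password="testtest") { ssh =>
    // execute returns the command standard output as a String
    println(host + " : " + ssh.execute("uptime"))
  }
}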

Sunday, December 11, 2011

JMX grep scala script

A simple script to search for mbeans / attribute names / attribute values matching regular expressions given as script parameters. The script uses the janalyse-jmx scala API; the jajmx.jar executable jar is used to start the script.
This jmx grep script can be used against itself, through its own embedded jmx platform. This feature is enabled through the java options (JAVA_OPTS) specified in the script startup header.
$ ./jmxgrep localhost 9999  operating
java.lang:type=OperatingSystem - MaxFileDescriptorCount = 4096
java.lang:type=OperatingSystem - OpenFileDescriptorCount = 18
java.lang:type=OperatingSystem - CommittedVirtualMemorySize = 4520407040
java.lang:type=OperatingSystem - FreePhysicalMemorySize = 7900864512
java.lang:type=OperatingSystem - FreeSwapSpaceSize = 10742177792
java.lang:type=OperatingSystem - ProcessCpuTime = 1490000000
java.lang:type=OperatingSystem - TotalPhysicalMemorySize = 16848113664
java.lang:type=OperatingSystem - TotalSwapSpaceSize = 10742177792
java.lang:type=OperatingSystem - Name = Linux
java.lang:type=OperatingSystem - AvailableProcessors = 6
java.lang:type=OperatingSystem - Arch = amd64
java.lang:type=OperatingSystem - SystemLoadAverage = 0.0
java.lang:type=OperatingSystem - Version = 3.0.6-gentoo

$ ./jmxgrep localhost 9999 threadcount
java.lang:type=GarbageCollector,name=PS MarkSweep - LastGcInfo = javax.management.openmbean.CompositeDataSupport(compositeTyp...
java.lang:type=Runtime - SystemProperties = javax.management.openmbean.TabularDataSupport(tabularType=ja...
java.lang:type=Threading - DaemonThreadCount = 12
java.lang:type=Threading - PeakThreadCount = 13
java.lang:type=Threading - ThreadCount = 13
java.lang:type=Threading - TotalStartedThreadCount = 13
java.lang:type=GarbageCollector,name=PS Scavenge - LastGcInfo = javax.management.openmbean.CompositeDataSupport(compositeTyp...
jmxgrep code (just copy & paste this code into a file named jmxgrep, and make it executable using chmod a+x jmxgrep):
#!/bin/sh
JAVA_OPTS=""
JAVA_OPTS=$JAVA_OPTS" -Dcom.sun.management.jmxremote"
JAVA_OPTS=$JAVA_OPTS" -Dcom.sun.management.jmxremote.port=9999"
JAVA_OPTS=$JAVA_OPTS" -Dcom.sun.management.jmxremote.authenticate=false"
JAVA_OPTS=$JAVA_OPTS" -Dcom.sun.management.jmxremote.ssl=false"
SCA_OPTS="-nocompdaemon -usejavacp -savecompiled"
exec java $JAVA_OPTS -jar jajmx.jar $SCA_OPTS "$0" "$@"
!#

import fr.janalyse.jmx._
import JMXImplicits._

if (args.size < 2) {
  println("Usage   : jmxgrep host port searchMask1 ... searchMaskN")
  println("Example : jmxgrep localhost 1099  vendor")
  println("   will self connect to jmx server, and looks for vendor keyword")
  System.exit(1)
}
val host  = args(0)
val port  = args(1).toInt
val masks = args.toList.drop(2) map {s=>("(?i)"+s).r}

def truncate(str:String, n:Int=60) = if (str.size>n) str.take(n)+"..." else str

JMX.connect(host, port) { implicit jmx =>
  for(on <- jmx.browse ; attr <- on.browse) {
    val value = try { on.get[Any](attr).toString} catch { case _ => "**error**"}
    val found = List(on.toString, attr, value) exists { item =>
      masks exists {re => (re findFirstIn item).isDefined} 
    }
    if (found || masks.isEmpty) println("%s - %s = %s".format(on, attr, truncate(value)))
  }
}
Notice the "-savecompiled" scala option, which stores the script compilation result in a file named "jmxgrep.jar". At startup, the already compiled code is reused if the script hasn't been modified since the last compilation. This allows very fast script startup.

Scala JMX console & scripts examples

I've released a JMX API for scala (project-link) which is straightforward to use in scripts: a single executable jar file named "jajmx.jar" contains everything required to start a scala console or to run a scala script dedicated to jmx operations.
Console mode usage example :
$ java -jar jajmx.jar -usejavacp
Welcome to Scala version 2.9.1.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_29).
Type in expressions to have them evaluated.
Type :help for more information.

scala> import fr.janalyse.jmx._
import fr.janalyse.jmx._

scala> val jmx=JMX("localhost", 9999, None)
jmx: fr.janalyse.jmx.JMX = fr.janalyse.jmx.JMX@1ce59895

scala> jmx.domains
res0: List[java.lang.String] = List(JMImplementation, com.sun.management, java.lang, java.util.logging)

scala> jmx.runtime map {_.get[Long]("Uptime")}
res5: Option[Long] = Some(149014)

scala> jmx.os map {_.get[String]("Version")}
res6: Option[String] = Some(3.0.6-gentoo)

scala> jmx("java.lang:type=Memory").set("Verbose", true)

scala> jmx("java.lang:type=Memory").call("gc")
[GC 45515K->28456K(310720K), 0.0010380 secs]
[Full GC 28456K->26792K(310720K), 0.2427710 secs]
res11: Option[Nothing] = None

A simple script example (Invoke an explicit GC from JMX) :
#!/bin/sh
exec java -jar jajmx.jar -nocompdaemon -usejavacp -savecompiled "$0" "$@"
!#
import fr.janalyse.jmx._

if (args.size != 2) {
  println("Usage : gcforce host port")
  System.exit(1)
}
val host=args(0)
val port=args(1).toInt

JMX.connect(host, port) { jmx =>
  val mem  = jmx("java.lang:type=Memory")
  mem.call("gc")
  println("Explicit GC invoked on host %s port %d".format(host,port))
}
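
In the same spirit, here is a small sketch that prints the remote JVM uptime and OS version, reusing only calls already shown above (JMX.connect, jmx.runtime, jmx.os and get):
#!/bin/sh
exec java -jar jajmx.jar -nocompdaemon -usejavacp -savecompiled "$0" "$@"
!#
import fr.janalyse.jmx._

if (args.size != 2) {
  println("Usage : jvminfo host port")
  System.exit(1)
}
val host = args(0)
val port = args(1).toInt

JMX.connect(host, port) { jmx =>
  // jmx.runtime and jmx.os return Option-wrapped MBean handles, as in the console session above
  jmx.runtime map {_.get[Long]("Uptime")} foreach {up => println("JVM uptime : " + up + " ms")}
  jmx.os map {_.get[String]("Version")} foreach {v => println("OS version : " + v)}
}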

Saturday, December 3, 2011

Scala project skeleton using SBT

Let's create a minimal scala project skeleton, based on the SBT build system, with some SBT plugins enabled (eclipse and assembly). The goal is to generate a single standalone executable jar (without any external dependencies), using the scala language (and/or java) with eclipse as an optional IDE.
The scala project skeleton is available as a google code project; visit scala-dummy-project or check out the project:
# Quick start
$ git clone git://github.com/dacr/scala-dummy-project.git
$ cd scala-dummy-project
$ sbt assembly
$ java -jar target/dummy.jar
Hello userX
Requirements are:
- a JVM >= 1.5 available in your PATH
- SBT (Simple Build Tool) downloaded and installed
- optionally, the Scala IDE for eclipse, if you want to use eclipse

To use eclipse, you must first initialize the project for it: run "sbt eclipse", or just "eclipse" from the SBT console.
All steps :
$ cd scala-dummy-project
$ sbt 
> eclipse
> run
> test
> assembly
> exit
$ java -jar target/dummy.jar
#OR
$ sbt run
The project is configured as follows:
import AssemblyKeys._

seq(assemblySettings: _*)

name := "ScalaDummyProject"

version := "0.1"

scalaVersion := "2.9.1"

mainClass in assembly := Some("dummy.Dummy")

jarName in assembly := "dummy.jar"

libraryDependencies += "org.scalatest" %% "scalatest" % "1.6.1" % "test"

// Junit is just required for eclipse to be able to start tests.
libraryDependencies += "junit" % "junit" % "4.10" % "test"

What's great with sbt is that it manages all dependencies for you (even transitive ones!); no need to download anything, everything is done automatically.
CONTEXT : Scala 2.9.1 / SBT 0.11 / ScalaTest 1.6.1 / sbt-assembly 0.7.2 / sbteclipse 1.5.0

Sunday, November 13, 2011

Get the elements which are duplicated in a collection

The groupBy method is a powerful feature that can be used for many kinds of operations; it behaves like a SQL "group by". Let's use it to find the elements that are duplicated in a given collection:
val alist = List(1,2,3,4,3,2,5,7,5)
val duplicatesItem = alist groupBy {x=>x} filter {case (_,lst) => lst.size > 1 } keys
// The result is : Set(5, 2, 3)

case class Dog(name:String)
val dogs=List(Dog("milou"), Dog("zorglub"), Dog("milou"), Dog("gaillac"))
val duplicatesDog = dogs groupBy {_.name} filter {case (_,lst) => lst.size > 1 } keys
// The result is : Set(milou)

Now let's define a function to do the job for us:
def findDuplicates[A,B](list:List[B])(crit:(B)=>A):Iterable[A] = {
     list.groupBy(crit) filter {case (_,l) => l.size > 1 } keys 
}
Let's use this function :
scala> findDuplicates(alist) {x=>x}
res21: Iterable[Int] = Set(5, 2, 3)

scala> findDuplicates(dogs) {dog=>dog.name}
res22: Iterable[String] = Set(milou)

Saturday, November 12, 2011

IO Read binary file with Stream continually helper method

A straightforward approach to reading a binary stream in Scala. Using the Stream.continually helper method gives the code a functional flavor.
#!/bin/sh
exec scala -J-Xmx1g -J-Xms512m -savecompiled "$0" "$@"
!#

import java.io._
import util.Properties.{userHome}
import java.io.File.{separatorChar=> sep, pathSeparatorChar=>pathSep}

def readBinaryFile(input:InputStream):Array[Byte] = {
  val fos = new ByteArrayOutputStream(65535)
  val bis = new BufferedInputStream(input)
  val buf = new Array[Byte](1024)
  Stream.continually(bis.read(buf))
      .takeWhile(_ != -1)
      .foreach(fos.write(buf, 0, _))
  fos.toByteArray
}

// Read a 191 MB file
val fname=userHome+sep+"LibO_3.4.4_Win_x86_install_multi.exe"
val fb = readBinaryFile(new FileInputStream(fname))
println("%s, %d bytes".format(fname, fb.size))
This is much nicer than writing:
  var c=0
  do {
    c = bis.read(buf)
    if (c > 0) fos.write(buf,0,c)
  } while(c > -1)

Wednesday, November 9, 2011

md5sum in scala

How to compute the md5sum of a file (and get the same result as the one given by the linux md5sum command).
Inspired by MD5Sum in Scala and modified to work with input streams.

 import java.io.{InputStream, BufferedInputStream}

 def md5sum(input:InputStream):String = {
    val bis = new BufferedInputStream(input)
    val buf = new Array[Byte](1024)
    val md5 = java.security.MessageDigest.getInstance("MD5")
    Stream.continually(bis.read(buf)).takeWhile(_ != -1).foreach(md5.update(buf, 0, _))
    md5.digest().map(0xFF & _).map { "%02x".format(_) }.foldLeft(""){_ + _}
  }
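
A quick usage sketch (the file path is just an example); the printed digest should match the output of the linux md5sum command on the same file:

  println(md5sum(new java.io.FileInputStream("/etc/hosts")))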

Some notes about how to simplify scala code (and monads)

These are just some notes... To be continued and completed... Updated on 2011-11-17
Interesting articles (and their comment threads) to understand and recognize monads:
- Monads are not metaphors
- Monads are elephants

Exploring the best way to get a value from a map, or a default value if the key doesn't exist :
val codes=Map("A"->10, "B"->5, "C"->20)

{ // Basic approach
  var code = 30
  if (codes contains "D") code = codes.get("D").get
}

{ // More compact approach
  val code = if (codes contains "D") codes.get("D").get else 30
}

{ // Match approach
  val code = codes.get("D") match {
    case Some(v)=>v
    case None=>30
  }
}

{ // Option monad approach
  val code = codes get "D" getOrElse 30
}

{ // Compact approach
  val code = codes.getOrElse("D",30)
}
With monads, many explicit tests can be avoided; the code reads like a data flow:
import collection.JavaConversions._
import java.io.File
val sp = java.lang.System.getProperties.toMap

val lookInto = sp.get("baseDir") orElse sp.get("rootDir") orElse sp.get("user.home")
val lookIntoDirFile = lookInto map {new File(_)} filter {_.exists}
val count = lookIntoDirFile map {_.list.size}

println(count map {_+" files"} getOrElse "Dir not found")

// will print "Dir not found" or for example "184 files"

Another example using the Option monad; the second fullName implementation becomes very simple:
// Using classical approach, a Java like approach
case class Person1(firstName:String, lastName:String) {
  def fullName() = {
    if (firstName!=null) {
      if (lastName!=null) {
        firstName+" "+lastName
      } else null
    } else null
  }
}

{
  val p1 = Person1("Einstein", "Albert")
  val p2 = Person1("Aristote", null)
  println("1-%s" format p1.fullName)
  val fullName = if (p2.fullName!=null) p2.fullName else "Unknown"
  println("1-%s" format fullName)
}

// Using Option monad approach  
case class Person2(firstName:Option[String], lastName:Option[String]) {
  def fullName() = firstName flatMap {fn => lastName map {ln => fn+" "+ln}}
}

{
  val p1 = Person2(Some("Einstein"), Some("Albert"))
  val p2 = Person2(Some("Aristote"), None)
  println("2-%s" format p1.fullName.getOrElse("Unknown"))
  println("2-%s" format p2.fullName.getOrElse("Unknown"))
}

// The best and cleanest one, same usage as the previous one
case class Person3(firstName:Option[String], lastName:Option[String]) {
  def fullName() = for(fn<-firstName ; ln <- lastName) yield fn+" "+ln
}


To be continued and completed ...

Sunday, November 6, 2011

Automatic resource liberation

It is very easy to implement automatic resource release in Scala, thanks to structural typing: the type parameter doesn't need to name a specific class, it can accept any class that implements the given method signature!
In our case, we define a method which accepts all classes with a close method.
def using[T <: { def close()}, R] (resource: T) (block: T => R) = {
  try     block(resource)
  finally resource.close
}
An example usage :
import java.io._

using(new PrintStream("/tmp/dummy")) { o =>
  o.print("Hello world")
}
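
The result of the block is returned by using, so it also works nicely for read operations. For example, scala.io.Source has a close() method, so it satisfies the structural type (the file path is just an example):

// Count the lines of a file; the Source is closed automatically, even if an exception occurs
val lineCount = using(io.Source.fromFile("/etc/hosts")) { src =>
  src.getLines().size
}
println(lineCount + " lines")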

Tuesday, October 4, 2011

Scala serialize simple test case

A simple use case showing how to make scala classes serializable. This example checks that the instance reference graph is entirely stored. Once the instances are restored, we modify the mutable java Properties instance to check that the change is seen from both the service and the server instances.
import java.io._

case class Common(properties:java.util.Properties) extends Serializable 
case class Server(name:String, ip:String, common:Common) extends Serializable
case class Service(server:Server, name:String, port:Int, common:Common) extends Serializable

val cmm1=Common(new java.util.Properties())
val srv1=Server("localhost", "127.0.0.1",cmm1)
val svc1=Service(srv1, "httpd", 80, cmm1)

val bufout = new ByteArrayOutputStream()
val obout = new ObjectOutputStream(bufout)

obout.writeObject(svc1)

val bufin = new ByteArrayInputStream(bufout.toByteArray)
val obin = new ObjectInputStream(bufin)

val svc2 = obin.readObject().asInstanceOf[Service]


assert(svc2 == svc1)

svc2.common.properties.put("toto", "tata")

assert(svc2.common == svc2.server.common)


CONTEXT : Linux Gentoo / Scala 2.9.1 / Java 1.6.0_26

Generic operations on scala collections - Using builders

Scala collections are very impressive: they achieve a high degree of genericity, so a single method can be applied to any kind of collection! Let's see an example through an operation consisting of removing identical consecutive elements.

Let's see a first simple implementation :
def compact[A](list:List[A]):List[A] = {
  var buf  = collection.mutable.ListBuffer.empty[A]
  var prev:Option[A] = None
  for(item <- list) {
    if (prev==None || prev.get!=item) buf += item
    prev=Some(item)
  }
  buf.toList
}

Or using folding :
def compact[A](list:List[A]):List[A] = {
  (List[A]() /: list) { (nl, I) => nl.headOption match { 
      case Some(I) => nl
      case Some(_)|None=> I::nl
    }
  } reverse
}

The problem with those implementations is that the collection type is "hardcoded": if you need to perform the same operation on Queues, Stacks, Vectors, ... you will have to provide as many implementations as there are collection types you want the compact operation to support.
You might first think this is not a problem: just parameterize not only the item type but also the collection type, using a method declaration such as:
def compact[A, I[A]<:Iterable[A]](list:I[A]) : I[A] = {
   ...
}

But such an approach is a dead end, because you need to create a new collection of type I[A], which should inherit from Iterable[A], and of course it is not possible to create such an instance from a type parameter: trying I.empty[A] or I[A]() won't work and will generate a compiler error.

Scala provides an elegant solution based on implicit builders; in fact, all collections provide builders that we are free to reuse in order to implement generic operations on collections.
// Fully generic implementation using collection builder
def compact[A, I[A]<:Iterable[A]](list:I[A]) (implicit bf: CanBuildFrom[I[A], A, I[A]]) : I[A] = {
  var builder = bf.apply()
  var prev:Option[A]=None
  for(item <- list) {
    if (prev==None || prev.get != item) builder += item
    prev=Some(item)
  }
  builder.result
}

Here is the complete example, ready to run and test:
#!/bin/sh
exec scala -deprecation -savecompiled "$0" "$@"
!#
import scala.collection._
import scala.collection.immutable.{Queue,Stack}
import scala.collection.generic.CanBuildFrom

// Simple implementation
def compact0[A](list:List[A]) = {
  var buf  = collection.mutable.ListBuffer.empty[A]
  var prev:Option[A] = None
  for(item <- list) {
    if (prev==None || prev.get!=item) buf += item
    prev=Some(item)
  }
  buf.toList
}

// A naive (non-tail-recursive) implementation
def compact1[A](list:List[A], prev:Option[A]=None):List[A] = {
  list match {
    case Nil => Nil
    case that::remain if (prev == None || prev.get != that) => that::compact1(remain, Some(that))
    case that::remain => compact1(remain, Some(that))
  }
}

// Implementation using folding
def compact2[A](list:List[A]):List[A] = {
  (List[A]() /: list) { (nl, I) => nl.headOption match { 
      case Some(I) => nl
      case Some(_)|None=> I::nl
    }
  } reverse
}

// Implementation using folding and a dedicated test function
def compactT1[A](list:List[A], tst:(A,A)=>Boolean):List[A] = {
  (List[A]() /: list) { (nl, i) => nl.headOption match { 
      case Some(iprev) if tst(iprev,i) => nl 
      case Some(_)|None=> i::nl
    }
  } reverse
}


// Fully generic implementation using builder
def compact[A, I[A]<:Iterable[A]](list:I[A])(implicit bf: CanBuildFrom[I[A], A, I[A]]):I[A] = {
  var builder = bf.apply()
  var prev:Option[A]=None
  for(item <- list) {
    if (prev==None || prev.get != item) builder += item
    prev=Some(item)
  }
  builder.result
}


// Fully generic implementation using builder and dedicated test function
def compactT[A, I[A]<:Iterable[A]](list:I[A], tst:(A,A)=>Boolean)
          (implicit bf: CanBuildFrom[I[A], A, I[A]]):I[A] = {
  var builder = bf.apply()
  var prev:Option[A]=None
  for(item <- list) {
    if (prev==None || !tst(prev.get,item)) builder += item
    prev=Some(item)
  }
  builder.result
}


// ----------------------------------
assert(compact(List.empty) == List())

// ----------------------------------
val l1 = List(1,1,1,2,3,3,2,2,2,4,4,1,1)
val l1compacted = List(1,2,3,2,4,1)
assert(compact0(l1) == l1compacted)
assert(compact1(l1) == l1compacted)
assert(compact2(l1) == l1compacted)
assert(compact(l1) == l1compacted)


// ----------------------------------
case class Cell(t:Int, v:Int)
val l2=List(Cell(1,10), Cell(2,11), Cell(2,4), Cell(3,1), Cell(3,2), Cell(2,0), Cell(2,30))
val l2compacted= List(Cell(1,10), Cell(2,11), Cell(3,1), Cell(2,0))
assert(compactT1(l2,(x:Cell,y:Cell)=>x.t==y.t) == l2compacted)
assert(compactT(l2,(x:Cell,y:Cell)=>x.t==y.t) == l2compacted)

// ----------------------------------
val testcase=List(1,1,1,2,3,3,2,2,2,4,4,1,1)
val testresult=List(1,2,3,2,4,1)

// ----------------------------------
val v1 = Vector(testcase :_ *)
val v1compacted = Vector(testresult :_ *)
assert(compact(v1) == v1compacted)

// ----------------------------------
val s1 = Stack(testcase :_ *)
val s1compacted =Stack(testresult :_ *)
assert(compact(s1) == s1compacted)

// ----------------------------------
val q1 = Queue(testcase :_ *)
val q1compacted =Queue(testresult :_ *)
assert(compact(q1) == q1compacted)

// ----------------------------------
val ab1 = collection.mutable.ArrayBuffer(testcase :_ *)
val ab1compacted = collection.mutable.ArrayBuffer(testresult :_ *)
assert(compact(ab1) == ab1compacted)

// ----------------------------------
val sq1 = Seq(testcase :_ *)
val sq1compacted =Seq(testresult :_ *)
assert(compact(sq1) == sq1compacted)




Sunday, September 25, 2011

Timed numeric series library for Scala

The first public release of my open-source timed-series library, janalyse-series, is available. It comes with a CSV parser which guesses the CSV format used.

A use case : Google stock quote operations.
#!/bin/sh
export CP=../target/scala-2.9.1.final/janalyse-series_2.9.1-0.5.1.jar
exec scala -cp $CP "$0" "$@"
!#

import fr.janalyse.series._

val allSeries = CSV2Series.fromURL("http://ichart.finance.yahoo.com/table.csv?s=GOOG")
val closeSeries = allSeries("Close")

println("GOOGLE stock summary")
println("Higher : "+closeSeries.max)
println("Lowest : "+closeSeries.min)
println("Week Trend : "+closeSeries.stat.linearApproximation.slope*1000*3600*24*7)
println("Latest : "+closeSeries.last)


Notice that the startup script example has a hardcoded janalyse-series library dependency; update it if necessary.
$ ./stock.scala 
GOOGLE stock summary
Higher : (07-11-06 00:00:00 -> 741,79)
Lowest : (04-09-03 00:00:00 -> 100,01)
Week Trend : 0.9271503465158119
Latest : (11-03-25 00:00:00 -> 579,74)

Sunday, September 18, 2011

Shell like code in Scala...

When you start a Java Virtual Machine, the current directory is set to the directory in which the JVM was started, and it can't be modified afterwards. If you want to write some shell-like commands in scala, you'll have to find a way to write your own change directory command (cd), but as the current directory can't be modified, it is not so simple.

One approach is to use an implicit mutable variable (which holds the current directory) and override some of the implicit definitions that come with scala.sys.process.
#!/bin/sh
exec scala "$0" "$@"
!#
import sys.process.Process
import sys.process.ProcessBuilder._

case class CurDir(cwd:java.io.File)
implicit def stringToCurDir(d:String) = CurDir(new java.io.File(d))
implicit def stringToProcess(cmd: String)(implicit curDir:CurDir) = Process(cmd, curDir.cwd)

implicit var cwd:CurDir="/tmp"
def cd(dir:String=util.Properties.userDir) = cwd=dir

// ----------------------
"pwd"!

cd("/var/tmp/")
"pwd"!

cd("/var/log/")
"pwd"!

cd()
"pwd"!



Scala map implicit conversion into Java Properties

As there appears to be no implicit conversion from a Scala Map (of Strings) to Java Properties, even in the latest Scala release, let's define one.
  implicit def map2Properties(map:Map[String,String]):java.util.Properties = {
    val props = new java.util.Properties()
    map foreach { case (key,value) => props.put(key, value)}
    props
  }


Or even in a one-line fashion, using folding:
  implicit def map2Properties(map:Map[String,String]):java.util.Properties = {
    (new java.util.Properties /: map) {case (props, (k,v)) => props.put(k,v); props}
  }
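
With either definition in scope, the conversion kicks in wherever a java.util.Properties is expected. A small usage sketch:

  // The Scala Map is converted on the fly thanks to the implicit definition above
  val props: java.util.Properties = Map("user" -> "guest", "timeout" -> "30")
  println(props.getProperty("user"))   // prints "guest"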


CONTEXT : Linux Gentoo / Scala 2.9.1 / Java 1.6.0_26

Sunday, September 11, 2011

Using O/R Broker for JDBC SQL operations from Scala

O/R Broker is an interesting tool which aims to simplify relational database operations from the scala language. It has many interesting features: no need for external configuration files (the mapping is done in scala), SQL requests can be stored in plain files which makes SQL tuning easier, no leaks (connections, statements, resultsets, ... are all "managed"), ...

Let's first install & prepare a mysql service on a gentoo linux system :
As root user : 
$ emerge mysql
     ==> To install the latest mysql
$ emerge --config =dev-db/mysql-5.1.56
     ==> initialize / prepare the database; it will ask for a root password
     ==> choose 'mysql2011root' for example
     ==> You may have to adjust the version to the exact one that has been installed
$ eselect rc add mysql default
     ==> For automatic startup when the system boots
$ /etc/init.d/mysql start
     ==> Manual start
$ mysql -u root -h localhost -p            (Will ask you the root password  = 'mysql2011root')
     SHOW DATABASES;
     CREATE DATABASE science;
     USE science;
     CREATE TABLE SCIENTIST (
             SCIENTIST_ID INT NOT NULL PRIMARY KEY AUTO_INCREMENT, 
             FIRST_NAME VARCHAR(128) NOT NULL,
             LAST_NAME VARCHAR(128) NOT NULL,
              BIRTH_DATE DATE);
     INSERT INTO SCIENTIST (FIRST_NAME, LAST_NAME, BIRTH_DATE) VALUES ('Albert', 'Einstein', STR_TO_DATE('1879-03-18','%Y-%m-%d'));
     INSERT INTO SCIENTIST (FIRST_NAME, LAST_NAME, BIRTH_DATE) VALUES ('Thomas', 'Edison', STR_TO_DATE('1847-02-11','%Y-%m-%d'));
     INSERT INTO SCIENTIST (FIRST_NAME, LAST_NAME, BIRTH_DATE) VALUES ('Isaac', 'Newton', STR_TO_DATE('1643-01-16','%Y-%m-%d'));
     INSERT INTO SCIENTIST (FIRST_NAME, LAST_NAME, BIRTH_DATE) VALUES ('Samuel', 'Morse', STR_TO_DATE('1781-04-27','%Y-%m-%d'));
     INSERT INTO SCIENTIST (FIRST_NAME, LAST_NAME, BIRTH_DATE) VALUES ('Kurt', 'Gödel', STR_TO_DATE('1906-04-28','%Y-%m-%d'));
     select * from SCIENTIST;
     GRANT ALL ON science.* TO 'guest'@'localhost' IDENTIFIED BY 'guest2011';
$ mysql -u guest -h localhost -D science -p          (Password = guest2011)
     select * from SCIENTIST;

Let's create the SBT build configuration file (Build.scala) with all required dependencies :
import sbt._
import Keys._
object SQLSandboxBuild extends Build { 
  val ssbsettings = Defaults.defaultSettings  ++ Seq(
    name         := "sqlSandbox",
    version      := "1.0",
    scalaVersion := "2.9.1",
    libraryDependencies ++= Seq (
       "org.orbroker"   % "orbroker"             % "3.1.1"    % "compile",
       "mysql"          % "mysql-connector-java" % "5.1.6"    % "compile",
       "org.scalatest"  %% "scalatest"           % "1.6.1"    % "test",
       "c3p0"           % "c3p0"                 % "0.9.1.2"  % "compile"
    ),     
    resolvers += "GEOTools" at "http://download.osgeo.org/webdav/geotools/",
    resolvers += "JBoss Repository" at "http://repository.jboss.org/maven2",
    resolvers += "Typesafe Repository" at "http://repo.typesafe.com/typesafe/releases/"
  )
  lazy val ssbproject = Project("ssbproject",file("."), settings = ssbsettings)
}
The next step is to define the mapping for our scientists database:
import java.util.Date
import org.orbroker._

case class Scientist(firstName:String, lastName:String, birthDate:Option[Date]=None, id:Option[Int]=None)

object ScientistExtractor extends RowExtractor[Scientist] {
  val key = Set("SCIENTIST_ID")  
  def extract(row: Row) = {
    val id = row.integer("SCIENTIST_ID")
    val List(firstName,lastName) = List("FIRST_NAME", "LAST_NAME") map {row.string(_).get}
    val birthDate = row.date("BIRTH_DATE")
    Scientist(firstName, lastName, birthDate, id)
  }
}

object Tokens extends TokenSet(true) {
  val scientistSelectAll   = Token('scientistSelectAll, ScientistExtractor)
  val scientistSelectById  = Token('scientistSelectById, ScientistExtractor)
  val scientistInsert      = Token[Int]('scientistInsert)
  val scientistDelete      = Token('scientistDelete)
  val scientistUpdate      = Token('scientistUpdate)
}

So we defined a Scientist case class, and its extractor, which tells how to build a Scientist instance from a row of "SCIENTIST" table data.

We store our SQL requests in plain text files:
 $ find src/main/resources/sql/ -name "*.sql"
src/main/resources/sql/scientistSelectAll.sql
src/main/resources/sql/scientistDelete.sql
src/main/resources/sql/scientistSelectById.sql
src/main/resources/sql/scientistUpdate.sql
src/main/resources/sql/scientistInsert.sql

For example, scientistSelectById looks like:
SELECT *
FROM SCIENTIST
where SCIENTIST_ID = :id

In order to execute SQL requests, you first need to create a "broker":
  val broker =  {
    val url = "jdbc:mysql://localhost/science"
    val username = "guest"
    val password = "guest2011"  
    val ds = getPooledDataSource(url,username,password) //new SimpleDataSource(url)
    val builder = new BrokerBuilder(ds)
    val res:Map[Symbol,String] = Tokens.idSet map {sym => sym -> "/sql/%s.sql".format(sym.name)} toMap;
    ClasspathRegistrant(res).register(builder)
    builder.verify(Tokens.idSet)
    builder.setUser(username, password)
    builder.build
  }

Get a scientist :
    val scientist = broker.readOnly() {
      _.selectOne(scientistSelectById, "id"->1)
    }
    scientist should not equal(None)

Get all scientists :
    val scientists = broker.readOnly() {
      _.selectAll(scientistSelectAll)
    }

Browse scientists :
    broker.readOnly() {
      _.select(scientistSelectAll) { scientist =>
        println(scientist)
        true
      }
    }

And now the classical CRUD operations :
    // Create
    val newscientist = broker.transaction() { txn =>
      val scientist = Scientist("Benoist","Mandelbrot")
      val newid = txn.executeForKey(scientistInsert, "scientist"->scientist)
      scientist.copy(id = newid)
    }
    // Update
    val birth = new java.text.SimpleDateFormat("yyyy-MM-dd").parse("1924-11-20")
    val updatedScientist = newscientist.copy(birthDate = Some(birth))
    broker.transaction() { txn =>
      txn.execute(scientistUpdate, "scientist"->updatedScientist)
    }
    // Read
    val foundScientist = broker.readOnly() {
      _.selectOne(scientistSelectById, "id"->updatedScientist.id).get
    }
    foundScientist should equal(updatedScientist)
    // Delete
    broker.transaction() { txn =>
      txn.execute(scientistDelete, "id" -> updatedScientist.id.get)
    }
Notice that once the new scientist has been created in the database, we get back the automatically generated numeric ID, so we return a new scientist instance containing the id chosen by the database. The original instance has not been modified: we want to keep all the benefits of immutability!

And now let's generate some load on the database (40*10000 SQL requests executed in parallel) and watch the c3p0 database connection pool using an external jconsole:
    import actors.Actor._
    val howmany=40
    val creator = self
    for (n <- 1 to howmany) {
      actor {
        try {
          for(i<-1 to 10000) {
            val scientists = broker.readOnly() {
               _.selectAll(scientistSelectAll)
            }
          }
        } finally {
          creator ! "finished"
        }
      }
    }    
    for (i<- 1 to howmany) receive {
      case _ =>
    }

The full SBT project is available here : sqlsandbox.tar.gz.

CONTEXT : Linux Gentoo / Scala 2.9.1 / Java 1.6.0_26 / SBT 0.10 / ScalaTest 1.6.1 / ORBroker 3.1.1

Sunday, September 4, 2011

Method receiver implicit conversions - data units use case

Scala implicit conversion is a great concept: basically you can use this feature to simplify object instantiation, avoiding the profusion of java constructors by providing, in scala, just one constructor plus a set of implicit conversions (and of course default values). What's great with implicits is that they also apply to the receiver of a method call; let's see an example.

We want to be able to express durations or byte sizes in a natural way. Let's start a scala console through the sbt console and try out my "unittools" library (download link: unittools.tar.gz):
dcr@lanfeust ~/dev/unittools $ sbt
[info] Set current project to default-0b115b (in build file:/home/dcr/dev/unittools/)
> console
[info] Starting scala interpreter...
[info] 
Welcome to Scala version 2.9.1.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_26).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import unittools.UnitTools._
import unittools.UnitTools._

scala> "10h50m30s".toDuration
res0: Long = 39030000

scala> "1gb100kb".toSize
res1: Long = 1073844224

scala> 15444002.toDurationDesc
res1: String = 4h17m24s2ms

scala> 68675454743L.toSizeDesc
res2: String = 63gb982mb17kb791b

It's much nicer to write "10h50m30s" in your configuration files instead of "39030000", or to report "4h17m24s2ms" instead of "15444002" when you have to display a response time!

Of course strings don't have toDuration and toSize methods; in fact the "import unittools.UnitTools._" makes new implicit conversion definitions available to the compiler, and those conversions make it possible to convert a String instance into an internal object instance on which toDuration or toSize is available.

the "plumbing" is the following :
package unittools

class DurationHelper(value:Long, desc:String) {
  def toDuration()=value
  def toDurationDesc()=desc
}

class SizeHelper(value:Long, desc:String) {
  def toSize()=value
  def toSizeDesc()=desc
}

object UnitTools {
  // Implicit conversions
  implicit def string2DurationHelper(desc:String) = new DurationHelper(desc2Duration(desc), desc)
  implicit def long2DurationHelper(value:Long) = new DurationHelper(value, duration2Desc(value))

  implicit def string2SizeHelper(desc:String) = new SizeHelper(desc2Size(desc), desc)
  implicit def long2SizeHelper(value:Long) = new SizeHelper(value, size2Desc(value))

  //...
}

It's worth noting that the scala unittools internal algorithm contains less code than the java release I first wrote: 25 lines for scala versus 40 lines for java.
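
As an aside, here is a minimal standalone sketch of the same receiver-side implicit mechanism, unrelated to unittools (the names are made up for the illustration): the compiler rewrites 5.squared into int2Squarer(5).squared because Int has no squared method.

class Squarer(n: Int) {
  def squared = n * n
}
// The implicit conversion is applied to the receiver when the method is missing on Int
implicit def int2Squarer(n: Int) = new Squarer(n)

println(5.squared)   // prints 25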

CONTEXT : Linux Gentoo / Scala 2.9.1 / Java 1.6.0_26 / SBT 0.10 / ScalaTest 1.6.1

Thursday, August 25, 2011

Playing with JMX and Scala : MBean Creation & "Remote" access

** Update (2011-12-07) : scala JMX API project just created... I've started refactoring an older JMX scala API, and I'll release the new API as an open source project. The project link : janalyse-jmx : scala JMX API

An example (download the corresponding sbt project) of how to create an MBean in Scala. We define the MBean interface using a scala trait, and make our class implement the trait capabilities. No need to define the getter, thanks to the "@BeanProperty" annotation: scala does the job for us. The MBean registration is done directly within the Supervisor class instantiation, via the jmxRegister method and an implicit conversion that automatically converts a String to an ObjectName.

package jmxsandbox

import scala.actors.Actor
import scala.reflect.BeanProperty
import java.lang.management.ManagementFactory
import javax.management.ObjectName

// Some definitions to simplify the code
private object JMXHelpers {
    implicit def string2objectName(name:String):ObjectName = new ObjectName(name)
    def jmxRegister(ob:Object, obname:ObjectName) = 
      ManagementFactory.getPlatformMBeanServer.registerMBean(ob, obname)
}
import JMXHelpers._

// Some messages managed by the Supervisor actor
sealed abstract class Message
case object MAlive extends Message
case object MDead  extends Message
case object MExit  extends Message
case object MGet   extends Message

// The defined MBean Interface
trait SupervisorMBean {
  def getAlive():Int
}

// The class with JMX MBean
class Supervisor extends Actor with SupervisorMBean {
  jmxRegister(this, "JMXSandbox:name=Supervisor")
  @BeanProperty
  var alive=0
  def act() {
    loop { 
      react {
        case MAlive => alive+=1
        case MDead  => alive-=1
        case MExit  => exit
        case MGet   => reply(alive)
      }
    }
  }
}


And now the test case, showing how to test our Supervisor MBean within the same JVM. The test case creates an internal MBean server, which is used by the Supervisor's internal MBean registration. So we get the benefits of both local and remote-like JMX access in our test case.

package jmxsandbox

import java.rmi.registry.LocateRegistry
import java.lang.management.ManagementFactory
import javax.management.remote.JMXConnectorServerFactory
import javax.management.remote.JMXConnectorFactory
import javax.management.remote.JMXServiceURL

import org.scalatest.FunSuite
import org.scalatest.matchers.ShouldMatchers

import JMXHelpers._

class SelfTest extends FunSuite with ShouldMatchers {
  // Let's create and start a local JMX service
  LocateRegistry.createRegistry(4500)
  val url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:4500/jmxapitestrmi")
  val mbs = ManagementFactory.getPlatformMBeanServer()
  val cs = JMXConnectorServerFactory.newJMXConnectorServer(url, null, mbs)
  cs.start
  

  test("Checking supervisor state with standard access and from JMX interface") {
    // Create and start the supervisor actor
    val supervisor = new Supervisor {start}
    
    // Send a MAlive message to the supervisor
    supervisor ! MAlive
    
    // get the current alive value from the actor
    val stateDirect = (supervisor !? MGet) match {
      case e:Int => e
      case _ => -1
    }
    
    // The check
    stateDirect should equal (1)

    // Do the same but using Supervisor MBean to read the alive value
    val mbserver = JMXConnectorFactory.connect(url).getMBeanServerConnection
    val stateViaJMX = mbserver.getAttribute("JMXSandbox:name=Supervisor","Alive").asInstanceOf[Int]
    
    // The check
    stateViaJMX should equal (stateDirect)
    
    // Ask the actor to exit
    supervisor ! MExit
  }
}


CONTEXT : Linux Gentoo / Scala 2.9.0 / Java 1.6.0_26 / SBT 0.10 / ScalaTest 1.6.1

Monday, August 22, 2011

Simple SBT use case

Simple Build Tool (sbt) is an impressive tool: building is really fast, and configuration is quite simple, in particular with small projects. No need to learn a new language, as sbt configuration is written in scala! Of course automatic dependency resolution is available!

Consider the following simple project, naturalsort.tar.gz, whose file hierarchy is:

naturalsort/
naturalsort/build.sbt
naturalsort/src/main/scala/naturalsort/NaturalSort.scala
naturalsort/src/test/scala/naturalsort/NaturalSortTest.scala


"build.sbt" just contains :

name := "NaturalSort"

version := "1.0"

scalaVersion := "2.9.0"

libraryDependencies += "org.scalatest" % "scalatest_2.9.0" % "1.6.1" % "test"


Once sbt is set up, just enter the "naturalsort" directory and run sbt; it will resolve dependencies and give you the sbt console:

$ cd naturalsort/
$ sbt
>  compile
[info] Updating {file:/home/dcr/dev/naturalsort/}default-76224f...
[info] Done updating.
[info] Compiling 1 Scala source to /home/dcr/dev/naturalsort/target/scala-2.9.0.final/classes...
[success] Total time: 6 s, completed 22 août 2011 22:21:14
> test
[info] Compiling 1 Scala source to /home/dcr/dev/naturalsort/target/scala-2.9.0.final/test-classes...
[info] NaturalSortTest:
[info] - basic tests
[info] - advanced tests
[info] - special cases tests *** FAILED ***
[info]   List(1.1, 1.002, 1.3, 1.010) did not equal List(1.001, 1.002, 1.010, 1.02, 1.1, 1.3) (NaturalSortTest.scala:37)
[error] Failed: : Total 3, Failed 1, Errors 0, Passed 2, Skipped 0
[error] Failed tests:
[error]  naturalsort.NaturalSortTest
[error] {file:/home/dcr/dev/naturalsort/}default-76224f/test:test: Tests unsuccessful
[error] Total time: 4 s, completed 22 août 2011 22:21:35
> ~compile
[success] Total time: 0 s, completed 22 août 2011 22:23:02
1. Waiting for source changes... (press enter to interrupt)
> ~ test-only naturalsort.NaturalSortTest
[info] Compiling 1 Scala source to /home/dcr/dev/naturalsort/target/scala-2.9.0.final/test-classes...
[info] NaturalSortTest:
[info] - extreme tests
...
> package
[info] Packaging /home/dcr/dev/naturalsort/target/scala-2.9.0.final/naturalsort_2.9.0-1.0.jar ...
[info] Done packaging.
[success] Total time: 0 s, completed 4 sept. 2011 10:55:24


CONTEXT : Linux Gentoo / Scala 2.9.0 / Java 1.6.0_26 / SBT 0.10 / ScalaTest 1.6.1


Sunday, August 21, 2011

Simple hack to get response time

To use, for example, from the scala console in order to quickly get a function's response time:

  def now=System.currentTimeMillis
  def duration[T](proc : =>T) = {
    val started = now
    val result = proc
    (now - started, result)
  }


Use case: ordering a list of 1000000 integers, built as follows:

  val lst = (1 to 1000000).reverse.toList


Get / view response time :

  duration {lst.sortBy{x=>x}}


Or store the response time result :

  val (responsetime, _) = duration {lst.sortBy{x=>x}}


Since we're using a Java Virtual Machine, one measurement is not enough: we must run several iterations in order to let the jvm reach its steady state and to mask garbage collector side effects on response times. We can use an approach such as:

  def howlong[T](proc : =>T, howmany:Long=5) {
    val durs = (1L to howmany) map {i =>
      val (dur,_) = duration {proc}
      dur
    }
    println("Duration : avg=%d min=%d max=%d all=%s".
         format(durs.sum/durs.size, durs.min, durs.max, durs))
  }


It gives, in milliseconds :

  scala> howlong {lst.sortBy{x=>x}}
  Duration : avg=670 min=524 max=903 all=Vector(816, 573, 524, 903, 534)


CONTEXT : Linux Gentoo / Scala 2.9.0 / Java 1.6.0_26

Friday, August 19, 2011

Natural Sorting...

A natural sort implementation using scala :

implicit val ord = new Ordering[String] {
  def groupIt(str:String) = 
       if (str.nonEmpty && str.head.isDigit) str.takeWhile(_.isDigit)
       else str.takeWhile(!_.isDigit)
  val dec="""(\d+)""".r
  def compare(str1: String, str2: String) = {
    (groupIt(str1), groupIt(str2)) match {
      case ("","") => 0
      case (dec(x),dec(y)) if (x.toInt==y.toInt) =>  
         compare(str1.substring(x.size), str2.substring(y.size))
      case (dec(x),dec(y)) => (x.toInt - y.toInt)
      case (x,y) if (x == y) =>
         compare(str1.substring(x.size), str2.substring(y.size))
      case (x,y) => x compareTo y
    }
  }
}


And now a usage example:

test("basic tests") {
  import scala.collection.immutable.TreeSet

  val t0 = new TreeSet[String]() ++ List("10", "5", "1")
  t0.toList should equal (List("1", "5", "10"))

  val t1 = new TreeSet[String]() ++ List("a-100","a-10", "a-5", "a-1")
  t1.toList should equal (List("a-1", "a-5", "a-10", "a-100"))
}


Of course it doesn't handle floating-point numbers: "1.01" will be sorted like "1.1"... and "a-2-050" will appear after "a-2-25".

CONTEXT : Linux Gentoo / Scala 2.9.0 / Java 1.6.0_26

Wednesday, August 17, 2011

Scala script to merge and archive PDF Files...

Just configure, for example, a Thunar custom action so you can select N pdf files, right-click, and launch the script. It will merge the pdf files into a single one whose name starts with a timestamp. The original files are backed up into a Trash directory, each file being renamed with its full path. Example:

bill.pdf check.pdf order.pdf   ==> 20110810-bill.pdf

#!/bin/sh
exec scala -nocompdaemon -savecompiled "$0" "$@"
!#
import sys.process._
if (args.size>0) {
  def now     = new java.util.Date()
  val defname = args(0).split("/").last
  val sdf     = new java.text.SimpleDateFormat("yyyyMMdd")
  val dest    = "%s-%s".format(sdf.format(now),defname)
  val res = Process("/usr/bin/pdftk"::args.toList:::"cat"::"output"::dest::Nil) !!
  val trash = scala.util.Properties.userHome+"/"+"Trash"+"/"
  Process("mkdir"::"-p"::trash::Nil)!
  
  for (file <- args) {
    Process("mv"::file::(trash+file.replace("/","++")+"-"+sdf.format(now))::Nil)!
  }
}


Of course this script relies on "app-text/pdftk-1.44" (gentoo package name); for more information see pdftk-the-pdf-toolkit.
CONTEXT : Linux Gentoo / Scala 2.9.0 / Java 1.6.0_26 / XFCE 4.8.0 / Thunar 1.2.1

Using scala as a system admin scripting language is quite cool...

No comment is necessary ! (Using Linux and scala 2.9.x)

import scala.sys.process._
"date"!
"date" #>> new java.io.File("dates.tmp") !
"cat dates.tmp" !

val files="ls"!!
val filteredFiles="ls" #| "grep .tar.gz" !!
val filesArray=("ls"!!).trim.split("\\n")

Using this language and its advanced features makes system scripting quite easy and safer. No more problems with spaces in file names or directories, and so on... sh coding is really a nightmare.
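
For instance, when a file name contains spaces, you can bypass shell word-splitting entirely by building the command as a sequence of arguments (a small sketch with a made-up file name):

import scala.sys.process._

// Each element of the Seq is passed as a single argument, spaces included - no quoting headaches
val exitCode = Process(Seq("ls", "-l", "my report 2011.pdf")).!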

CONTEXT : Linux Gentoo / Scala 2.9.0 / Java 1.6.0_26