Monday, January 30, 2012

Simplifying scala scripts : Adding #include support to your scripts

Try out the #include support example by running the following 5 commands :
$ wget http://dnld.crosson.org/bootstrap.tar.gz
$ tar xvfz bootstrap.tar.gz
$ cd bootstrap
$ sbt assembly
$ ./scripts/test.scala
The test.scala example script is the following :
#!/bin/sh
DIRNAME=`dirname "$0"`
exec java -jar "$DIRNAME"/bootstrap.jar "$0" "$@"
!#

#include "shell.scala"

cd("/etc/")

"ls" #| "grep net" !

This script changes to the /etc directory, then prints the names of the files that contain the net keyword.

The test.scala script includes the following file : scripts/include/shell.scala
import sys.process.Process
import sys.process.ProcessBuilder._
 
case class CurDir(cwd:java.io.File)
implicit def stringToCurDir(d:String) = CurDir(new java.io.File(d))
implicit def stringToProcess(cmd: String)(implicit curDir:CurDir) = Process(cmd, curDir.cwd)
implicit def stringSeqToProcess(cmd:Seq[String])(implicit curDir:CurDir) = Process(cmd, curDir.cwd)

implicit var cwd:CurDir=scala.util.Properties.userDir
def cd(dir:String=util.Properties.userDir) = cwd=dir
This file contains a few definitions that let the user change the current directory : the implicit cwd value is handed to every process built from a string or a sequence of strings.
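The mechanism can be isolated in a small sketch. It is only an illustration: the ShellSketch object and the sh, dirOf and pwd helpers are hypothetical names, and cd takes an explicit path instead of relying on the string-to-CurDir implicit conversion used in shell.scala.

```scala
import java.io.File
import scala.sys.process.{Process, ProcessBuilder}

object ShellSketch {
  // Wrapper type, so that the implicit lookup cannot pick up an unrelated File.
  final case class CurDir(dir: File)

  // Mutable "current directory": each implicit lookup reads its current value.
  implicit var cwd: CurDir = CurDir(new File(scala.util.Properties.userDir))

  // Explicit replacement for the implicit String => CurDir conversion.
  def cd(path: String): Unit = { cwd = CurDir(new File(path)) }

  // Hypothetical helper: builds (without starting) a process whose working
  // directory is the implicit current directory.
  def sh(cmd: String)(implicit cur: CurDir): ProcessBuilder = Process(cmd, cur.dir)

  // Hypothetical helpers: the directory a command built right now would run in.
  private def dirOf(implicit cur: CurDir): File = cur.dir
  def pwd: File = dirOf
}
```

With these definitions in scope (import ShellSketch._), calling cd("/etc") and then sh("ls").!! would run ls in /etc, because the implicit parameter is resolved again at every call.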

All the include mechanism logic is defined as follows :
package fr.janalyse.script

import scala.tools.nsc.ScriptRunner
import scala.tools.nsc.GenericRunnerCommand
import scala.io.Source
import java.io.File

object Bootstrap {
  val defaultOptions = List("-nocompdaemon","-usejavacp","-savecompiled", "-deprecation")
  val defaultExpandedScriptExt = ".pscala"
  
  val includeRE = """\s*#include\s+"(.+)"\s*"""r
  
  def expand(file:File, availableIncludes:List[File]) : List[String] = {
    val content=Source.fromFile(file).getLines().toList
    // First we remove "shell" startup lines, everything between #! and !#
    val cleanedContent = content.indexWhere { _.trim.startsWith("!#") } match {
      case -1 => content
      case i  => content.drop(i+1)
    }
   // Then we expand #include directives
   cleanedContent flatMap {
     case includeRE(filename) =>
       val fileOpt = availableIncludes find {_.getName() == filename}
       fileOpt orElse {
          throw new RuntimeException("%s : Couldn't find include file '%s'".format(file.getName, filename))
       }
       fileOpt map { file => expand(file, availableIncludes)} getOrElse List.empty[String]  
     case line => line::Nil
   }
  }
    
  def main(cmdargs:Array[String]) {
    val command = new GenericRunnerCommand(defaultOptions ++ cmdargs.toList)
    val scriptDir = new File(cmdargs(0)).getParentFile()
    val includePath = List(new File(scriptDir, "include"), scriptDir)
    val availableIncludes = includePath filter {_.exists()} flatMap {_.listFiles()}
    val scriptname = command.thingToRun 
    val script = new File(scriptname)
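    // The dot is escaped and the pattern anchored, so only a trailing ".scala" extension is replaced
    val richerScript = new File(scriptname.replaceFirst("\\.scala$", defaultExpandedScriptExt))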
    
    if (script.exists()) {
      val jars = util.Properties.javaClassPath.split(File.pathSeparator) map {new File(_)} collect {
        case f if (f.exists() && f.isFile()) => f 
      }
      val jarsLastModified = (jars map {_.lastModified()} max)
      
      if (!richerScript.exists ||  // -- nothing already available
          (jarsLastModified > richerScript.lastModified) ||   // -- Bootstrap jar is newer
          (script.lastModified > richerScript.lastModified)) { // -- Script has been modified
        val newcontent =  expand(script, availableIncludes).mkString("\n")
        new java.io.FileOutputStream(richerScript) {
          write(newcontent.getBytes())
        }.close()
      }
    }
    ScriptRunner.runScript(command.settings, richerScript.getPath, command.arguments)
  }
}

How does it work :
The principle is to override the standard scala script startup mechanism by adding a step that expands the script with all the includes it contains, then hands the resulting script over to scala.
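Here is a minimal sketch of that expansion step, working on in-memory sources instead of files so that it stays self-contained. The IncludeSketch object, the sources map and the cycle check are assumptions made for this illustration; they are not part of the Bootstrap code.

```scala
object IncludeSketch {
  // Same directive shape as in Bootstrap: #include "name"
  private val Include = """\s*#include\s+"(.+)"\s*""".r

  // `sources` is a hypothetical in-memory stand-in for the include directories.
  // `seen` tracks the files currently being expanded, to reject cyclic includes.
  def expand(name: String,
             sources: Map[String, List[String]],
             seen: Set[String] = Set.empty): List[String] = {
    if (seen(name)) sys.error("Cyclic include of '" + name + "'")
    val lines = sources.getOrElse(name, sys.error("Couldn't find include file '" + name + "'"))
    // Drop the shell preamble: everything up to and including the "!#" line.
    val body = lines.indexWhere(_.trim.startsWith("!#")) match {
      case -1 => lines
      case i  => lines.drop(i + 1)
    }
    // Replace each directive by the (recursively expanded) included lines.
    body.flatMap {
      case Include(included) => expand(included, sources, seen + name)
      case line              => List(line)
    }
  }
}
```

Keeping the expansion a pure function from names to lines makes it easy to check without touching the file system; the file-based version only has to build the map.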
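test.scala becomes test.pscala, which in turn generates the saved compilation file test.pscala.jar. No recompilation is required as long as neither test.scala nor the bootstrap.jar file has changed. Note that a change in an included file alone does not trigger a new expansion, since only the timestamps of the script and of the classpath jars are checked.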
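The regeneration rule can be written as a small pure function of timestamps. This is only a sketch: StalenessSketch and needsExpansion are hypothetical names, and unlike the Bootstrap code it tolerates an empty jar list.

```scala
object StalenessSketch {
  // True when the expanded script has to be generated again:
  // it does not exist yet, or a classpath jar or the script itself is newer.
  def needsExpansion(expandedExists: Boolean,
                     expandedTime: Long,
                     scriptTime: Long,
                     jarTimes: Seq[Long]): Boolean = {
    val newestJar = if (jarTimes.isEmpty) 0L else jarTimes.max
    !expandedExists || newestJar > expandedTime || scriptTime > expandedTime
  }
}
```

The timestamps would come from File.lastModified, as in the main method of Bootstrap.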

You should also notice that the script is started using 'exec java -jar "$DIRNAME"/bootstrap.jar "$0" "$@"' and not 'exec scala ...'. This is because bootstrap is an assembly jar that contains everything needed to compile and run scala scripts. It can also embed any third-party library you may need : just add the library dependencies to the build ! So you only need one file, bootstrap.jar, to run any scala script : nothing to install, just one file to upload.

SBT build configuration : bootstrap/build.sbt
import AssemblyKeys._

seq(assemblySettings: _*)

name := "bootstrap"

version := "0.1"

scalaVersion := "2.9.1"

libraryDependencies <++=  scalaVersion { sv =>
   ("org.scala-lang" % "scala-swing" % sv) ::
   ("org.scala-lang" % "jline"           % sv  % "compile") ::
   ("org.scala-lang" % "scala-compiler"  % sv  % "compile") ::
   ("org.scala-lang" % "scala-dbc"       % sv  % "compile") ::
   ("org.scala-lang" % "scalap"          % sv  % "compile") ::
   ("org.scala-lang" % "scala-swing"     % sv  % "compile") ::Nil   
}

mainClass in assembly := Some("fr.janalyse.script.Bootstrap")

jarName in assembly := "bootstrap.jar"

SBT Plugins configuration : bootstrap/project/plugins.sbt file
resolvers += Classpaths.typesafeResolver

addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "2.0.0-M3")

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.7.2")

Thursday, January 26, 2012

Simplifying scala scripts

I've already written several dozen scala scripts (series computations, ssh automation, various jmx operations, remote administration, garbage collector log analysis, ...) and found the language quite interesting to use for scripting. There are many reasons for that :
  • Automatic script compilation reduces runtime errors. I am often amazed to get a script that works on the first attempt, without any runtime error !
  • You benefit from the powerful scala collections, which make it possible to write "sql"-like operations
  • It becomes straightforward to parallelize tasks using Actors; one of my favorite use cases is a short script that triggers an explicit garbage collection on several dozen remote JVMs in a very short time
But I miss some features that would help make scala scripts even simpler and more concise :
  • A #include-like feature within scripts
  • A way to modify the default imports, to avoid always adding the same imports to every script
  • The #! !# shell bootstrap of a scala script can become long (and not DRY) once you want to add many external java dependencies
In fact, those missing features are not so difficult to implement, and the following source code is a proof of concept. It defines an object, Bootstrap, which can be used to start a scala script and which brings new imports and definitions to your script.
package fr.janalyse.script

import scala.tools.nsc.ScriptRunner
import scala.tools.nsc.GenericRunnerCommand
import scala.io.Source

object Bootstrap {

  val header = 
"""// WARNING
// Automatically generated file - do not edit !
import sys.process.Process
import sys.process.ProcessBuilder._
 
case class CurDir(cwd:java.io.File)
implicit def stringToCurDir(d:String) = CurDir(new java.io.File(d))
implicit def stringToProcess(cmd: String)(implicit curDir:CurDir) = Process(cmd, curDir.cwd)
implicit def stringSeqToProcess(cmd:Seq[String])(implicit curDir:CurDir) = Process(cmd, curDir.cwd)

implicit var cwd:CurDir=scala.util.Properties.userDir
def cd(dir:String=util.Properties.userDir) = cwd=dir

"""

  val footer = 
"""
"""

  def main(cmdargs:Array[String]) {

    def f(name:String) = new java.io.File(name)
    
    val na = List("-nocompdaemon","-usejavacp","-savecompiled", "-deprecation") ++ cmdargs.toList
    
    val command = new GenericRunnerCommand(na)
    
    import command.settings
    
    val scriptname = command.thingToRun 
    val script = f(scriptname)
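    // The dot is escaped and the pattern anchored, so only a trailing ".scala" extension is replaced
    val richerScript = f(scriptname.replaceFirst("\\.scala$", ".scala-plus"))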
    
    if (script.exists()) {
      if (!richerScript.exists || (script.lastModified > richerScript.lastModified)) {
        val content=Source.fromFile(script).getLines().toList
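        // Drop the shell header (up to and including "!#"); keep everything if there is none
        val cleanedContent = content.indexWhere(_.startsWith("!#")) match {
          case -1 => content.mkString("\n")
          case i  => content.drop(i+1).mkString("\n")
        }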
        val newcontent =  List(header, cleanedContent, footer).mkString("\n")
        new java.io.FileOutputStream(richerScript) {
          write(newcontent.getBytes())
        }.close()
      }
    }
    
    val args = command.arguments
    
    // getPath (not getName), so the script is also found when started from another directory
    ScriptRunner.runScript(settings, richerScript.getPath, args)
  }  
}

Then generate a standalone executable jar containing this class and all the needed dependencies, using an SBT build specification such as :
import AssemblyKeys._

seq(assemblySettings: _*)

name := "bootstrap"

version := "0.1"

scalaVersion := "2.9.1"

libraryDependencies <++=  scalaVersion { sv =>
   ("org.scala-lang" % "scala-swing" % sv) ::
   ("org.scala-lang" % "jline"           % sv  % "compile") ::
   ("org.scala-lang" % "scala-compiler"  % sv  % "compile") ::
   ("org.scala-lang" % "scala-dbc"       % sv  % "compile") ::
   ("org.scala-lang" % "scalap"          % sv  % "compile") ::
   ("org.scala-lang" % "scala-swing"     % sv  % "compile") ::Nil   
}

mainClass in assembly := Some("fr.janalyse.script.Bootstrap")

jarName in assembly := "bootstrap.jar"

You'll then be able to run any scala script directly, like this :
#!/bin/sh
exec java -jar bootstrap.jar "$0" "$@"
!#

cd("/etc/")

"ls" #| "grep net" !

Thanks to the SBT assembly plugin, you've generated a standalone executable jar, which contains the scala compiler and our custom scala script startup mechanism.
In a future post, I'll describe in more detail a new bootstrap implementation that brings a #include feature to scala scripts.

Tuesday, January 24, 2012

A simple approach to generate reports using scala

I was wondering whether it was possible to create a single executable script containing both a report template and its configuration. So I wrote a scala script to test the idea :
#!/bin/sh
exec scala -nocompdaemon -usejavacp -savecompiled -deprecation "$0" "$@"
!#

import sys.process._
import java.io.{File, FileOutputStream}

println("Let's generate and display a report")

val message = "hello"
val guys = List("Marge", "Bart", "Omer")

val smallreport = 
< html>
  < head>
    < title>Hello guys report< /title>
  < /head>
  < body>
    < h1>{message}< /h1>
    {guys map {guy=> 
      < h2>{guy}< /h2>
    } }
  < /body>
< /html>

val reportfile = File.createTempFile("report", ".html")
reportfile.deleteOnExit

new FileOutputStream(reportfile) {
  write(smallreport.toString.getBytes)
}.close()

List("firefox", "-new-window", reportfile.toURI.toURL.toString) !

Note : I've added a space inside each HTML tag, because SyntaxHighlighter doesn't seem to like HTML or XML embedded within scala. Remove those spaces before running the script.

Make this script executable (chmod u+x reportingScript.scala) and run it (./reportingScript.scala). If firefox is available on your operating system, a new window displaying the report opens.

Now we can say :
- It is possible to combine a report template and data access in a single executable script !
- By using a custom scala startup command, it should be possible to simplify many operations, extend the default imports, ...
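The script above relies on XML literals, which in recent scala versions require the separate scala-xml module. As a hedged alternative, the same "template plus data in one script" idea can be sketched with plain strings only; ReportSketch, escape and render are hypothetical names used for this illustration.

```scala
object ReportSketch {
  // Minimal HTML escaping for text nodes.
  def escape(text: String): String =
    text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

  // The "template": data comes in as parameters, markup stays in one place.
  def render(message: String, guys: List[String]): String = {
    val items = guys.map(guy => "    <h2>" + escape(guy) + "</h2>").mkString("\n")
    List(
      "<html>",
      "  <head><title>Hello guys report</title></head>",
      "  <body>",
      "    <h1>" + escape(message) + "</h1>",
      items,
      "  </body>",
      "</html>"
    ).mkString("\n")
  }
}
```

The resulting string can be written to a temporary file and opened in a browser exactly as in the script above.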

In a future post I'll try to make such scripts simpler.

Sunday, January 22, 2012

How to generate google stock trend chart in 3 lines...

Using the janalyse-series scala API, generating a trend chart becomes very simple : 3 lines are enough to download and parse the CSV data, then generate the chart.
#!/bin/sh
exec java -jar jaseries.jar -nocompdaemon -usejavacp -savecompiled "$0" "$@"
!#

import fr.janalyse.series.CSV2Series
import fr.janalyse.series.view.Chart

val allSeries = CSV2Series.fromURL("http://ichart.finance.yahoo.com/table.csv?s=GOOG")
val closeSeries = allSeries("Close").rename("Google stock value")    
Chart(closeSeries).toJpgFile(new java.io.File("googleStockTrend.jpg"), 800, 400)
The chart is written to the googleStockTrend.jpg file.

CSV files merge, based on series names...

This example illustrates how the janalyse-series API can be used to merge CSV files together while respecting series names. It takes a set of directories containing CSV files, merges all the series and writes the results to a new directory. The merge is based only on series names, not on file names, so if different file names were used for the same series, only one of them is kept in the destination directory : the smallest one in lexicographic order (the script applies min to the file base names).
#!/bin/sh
exec jaseries -deprecation -savecompiled "$0" "$@"
!#
/*
 * Copyright 2011 David Crosson
 * 
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 * 
 *   http://www.apache.org/licenses/LICENSE-2.0
 * 
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
import fr.janalyse.series._
import java.io.File

if (args.size < 2) {
  println("usage : csv-merge srcDir1 ... srcDirN destNewDir")
  println("  Merge and reduce numeric series stored in CSV files")
  System.exit(0)
}

case class InputSeries(name:String, filebasename:String, series:Series[Cell])
def file(filename:String) = new File(filename)
def file(dirname:String, filename:String) = new File(dirname, filename)
def file(dir:File, filename:String) = new File(dir, filename)

def csvFileFilter(f:String):Boolean = {f.endsWith(".csv") ||f.endsWith(".jxtcsv")}
val basename1RE="""(.*)(?:[-_]\d+)[.].+"""r
val basename2RE="""(.*)[.].+"""r
def basename(filename:String) = filename match {
  case basename1RE(basename) => basename
  case basename2RE(basename) => basename
  case other => other
}

val toMerge = args.init
val destNewDir = file(args.last)

destNewDir.mkdirs()

// --- Read everything
var inputSeriesList=List.empty[InputSeries]
for( dirname<-toMerge ; 
     filename<-file(dirname).list filter csvFileFilter;
     (seriesname, series)<-CSV2Series.fromFile(file(dirname, filename))) {
  val filebasename = basename(filename)
  inputSeriesList = InputSeries(seriesname, filebasename, series)::inputSeriesList
}

// --- Merge and Reduce
val seriesGroupByName=inputSeriesList groupBy {is => is.name}
val mergedSeriesList = for( (seriesname, inputSeriesList4Name) <- seriesGroupByName) yield {
  val mergedseries   = inputSeriesList4Name.map(_.series) reduceLeft {_ <<< _}
  val mergedbasename = inputSeriesList4Name.map(_.filebasename).min
  (mergedseries, mergedbasename)
}

// --- Write everything
val mergedSeriesGroupByBasename = mergedSeriesList groupBy { case (_, basename) => basename}
for( (mergedbasename, mergedTuples) <- mergedSeriesGroupByBasename) {
  val seriesList = mergedTuples map {case (series,_) => series}
  CSV2Series.toFile(seriesList, file(destNewDir, mergedbasename+".csv"))
}
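The "merge and reduce" logic of this script can be sketched without the janalyse-series library. In the sketch below a series is approximated by a plain map from timestamp to value, and the map concatenation ++ stands in for the <<< operator, whose exact semantics are not described here; MergeSketch, Input and Points are hypothetical names.

```scala
object MergeSketch {
  // Stand-in for a janalyse-series Series: timestamp -> value.
  type Points = Map[Long, Double]
  final case class Input(name: String, fileBasename: String, points: Points)

  // Returns: destination file basename -> (series name -> merged points).
  def merge(inputs: List[Input]): Map[String, Map[String, Points]] = {
    // Merge all inputs sharing a series name, and pick one basename for them.
    val merged = inputs.groupBy(_.name).toList.map { case (name, group) =>
      val points   = group.map(_.points).reduceLeft(_ ++ _) // later input wins on equal timestamps
      val basename = group.map(_.fileBasename).min          // lexicographically smallest
      (basename, name, points)
    }
    // Regroup the merged series by destination file.
    merged.groupBy(_._1).map { case (basename, entries) =>
      basename -> entries.map { case (_, name, points) => name -> points }.toMap
    }
  }
}
```

Each key of the result corresponds to one CSV file to write in the destination directory.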

Thursday, January 5, 2012

Build Play20 with latest sources, and let's play.

Here are all the steps to run your first play 2 web application with the latest play improvements, until the first stable release is ready. The idea is to build Play yourself. Eclipsify is now taken into account, so once all the steps described here are done, you can import the application into the eclipse IDE.
$ cd ~/workdir

$ git clone https://github.com/playframework/Play20.git

$ cd Play20/framework

$ ./build
> build-repository
> exit

$ export PATH=$PATH:~/workdir/Play20

$ cd ~/workdir

$ play new firstapp

Getting play console_2.9.1 2.0-RC1-SNAPSHOT ...
:: retrieving :: org.scala-tools.sbt#boot-app
 confs: [default]
 5 artifacts copied, 0 already retrieved (5161kB/19ms)
       _            _ 
 _ __ | | __ _ _  _| |
| '_ \| |/ _' | || |_|
|  __/|_|\____|\__ (_)
|_|            |__/ 
             
play! 2.0-RC1-SNAPSHOT, http://www.playframework.org

The new application will be created in /home/dcr/experiments/firstapp

What is the application name? 
> firstapp

Which template do you want to use for this new application? 

  1 - Create a simple Scala application
  2 - Create a simple Java application
  3 - Create an empty project

> 1

OK, application firstapp is created.

Have fun!

$ cd firstapp

$ play 
> eclipsify
> run
--- (Running the application from SBT, auto-reloading is enabled) ---

[info] play - Listening for HTTP on port 9000...

(Server started, use Ctrl+D to stop and go back to the console...)

Everything is now ready to test and modify the simple scala application using eclipse. More information can be found in the play20 and scala ide 20 tutorial.