Showing posts with label java. Show all posts

Friday, June 20, 2014

Migrating from Maven to Gradle

I thought I'd share some of my experiences with migrating from Maven to Gradle for a small Java open source project.

The Strategy

First, what's the best way to do this?   The project is a fairly straightforward Java project without complex Maven pom.xml files, so maybe the best way forward is to just create a Gradle build alongside the Maven one.

Some advantages over Maven


 Here are some of the advantages I found when using Gradle:
  • The 'java' plugin does almost all the work.   It defines something equivalent to the Maven lifecycle in terms of compilation, testing, and packaging.
  • Much smaller configuration.  No more verbose pom.xml files!
  • A multi-module project can be configured from the top-level build.gradle file.
  • Dependency specifications are more terse and also more readable.
  • It's much more straightforward to get Gradle to use libraries that are not in the Maven repositories, e.g. in version control.   (However, I do believe that it's best to make a private repository with Artifactory or Nexus and install the libraries there, rather than keeping them in version control.)
  • Declaring dependencies between sub-modules is also very easy.
  • The whole parent/aggregator/dep-management thing in Maven is a bit clunky.   Gradle makes this much easier.  You can even do a multi-module build with a single Gradle build file if you want.

 First Attempt

Here are the steps I took.
  • Using IDEA, create a new Gradle project where the existing sources are.  Set the location of the Gradle installation.   You should see the Gradle tab on the right side panel.
  •  Create a build.gradle file and a settings.gradle file in the project root directory.
  • The basic multi-module structure can be the same as a Maven multi-module build:
    • A 'main' build.gradle file in the root directory, along with a settings.gradle file that has the overall settings.
    • Sub-directories for each module.
    • Each module directory has its own build.gradle file.
    • NOTE: If the module dependencies are defined correctly, building a module will also build the other dependent modules when you are in the module sub-directory!   Major win over Maven here, IMO.
  • Apply the plugins for a Java project, set the group and version, add repositories.  In this case I have a multi-module project so I'm putting all of that in the allprojects closure:

    allprojects {
      apply plugin: 'java'
      group = 'org.jegrid'
      version = '1.0-SNAPSHOT'
      repositories {
        mavenCentral()
        maven {
          url 'http://repository.jboss.org/nexus/content/groups/public'
        }
        flatDir {
          dirs "$rootDir/lib" // If we use just 'lib', the dir will be relative.
        }
      }
    }
    

    I also have some libraries in the lib directory at the top level because they are not in the global Maven repos, or in the JBoss repo. The flatDir closure will allow Gradle to look in this directory to resolve dependencies. 
  • Add dependencies.   For a multi-module build this is done inside each project closure.   Use the 'compileJava' task to make sure they are right.
In the end, this project didn't really work with Gradle because the dependencies are too old.   So, I will need to rebuild the project from the ground up anyway.   Some of the basic libraries have undergone many significant changes since the project started, so it's time to upgrade!

Basic Gradle Multi-Module Java Project Structure

Okay, so in creating a brand new project, the canonical structure is much like a Maven project.

  • In the root directory (an 'aggregator' project) there is a main build.gradle file and a settings.gradle file.   This is roughly equivalent to the root pom.xml file.
  • In each sub-project directory (module) there is a build.gradle file.   This is roughly equivalent to the module pom.xml files.
  • The settings.gradle file has an include for each sub-project.   This is roughly equivalent to the '<modules>' section of the root pom.xml file.
  • An allprojects closure in the root build.gradle file can contain dependencies to be used for all modules.   This is similar to a 'parent pom.xml' (but much easier to read!).
One thing I wanted to do right away is to create the source directories in a brand new module.  This is pretty darn easy with Gradle.   Just add a new task that iterates through the source sets and creates the directories:

  task createSourceDirectories {
    // doLast runs at execution time; the older '<<' shorthand was removed in Gradle 5.
    doLast {
      sourceSets.all { set ->
        set.allSource.srcDirs.each {
          println "creating $it ... "
          it.mkdirs()
        }
      }
    }
  }

I added this in the allprojects closure, and boom! - I have the task for all of the modules.  Neato!   I can now run this on each sub-project as needed.

Porting The Code


Once I had the directory layout and basic project files, I could begin moving in some of the code.    I started with the basic utility code for the project and the unit tests.   Like I mentioned, this was using a very old version of JUnit, so I needed to upgrade the tests.

Diversion One - Upgrading to JUnit 4.x

Upgrading to JUnit 4.x is actually pretty easy.   For the most part it retains backwards compatibility.   There are a few reasons you might want to upgrade the tests.
  • I prefer annotations over extending TestCase.   This is a pretty simple transform:
    1. Remove 'extends TestCase'
    2. Remove the constructor that calls super.
    3. Remove the import for TestCase
    4. Add 'import static org.junit.Assert.*'
    5. Add @Test to each test method.
  • (already mentioned) Take advantage of 'import static'! import static org.junit.Assert.*
  • Expected exceptions:
    @Test(expected=java.lang.ArrayIndexOutOfBoundsException.class)
     
  • @Before and @After annotations to replace setUp() and tearDown().   (@BeforeClass and @AfterClass are for static methods that run once per test class.)

Diversion Two - Using Guice or Dagger instead of PicoContainer?

I really enjoy using DI containers.  It takes so much of the boilerplate 'factory pattern' code out of the project and makes for easy de-coupling and configuring of components.   In the previous version of the project I had used PicoContainer.   

  • Pico - Pro: Good lifecycle support.   Really small JAR file.   Con: Not as type safe.  Project seems to have stalled.
  • Guice - Pro: Still very small.   More type safe.  Large community.  Con: Bigger JAR than Pico (but not too bad... without AOP it's smaller).  No real lifecycle support.
  • Dagger - Pro: Really small, with a compiler! Con: Gradle doesn't have a built-in plugin for running the Dagger compiler (well, as far as I can tell).
I think I'll give Dagger a try as it will cause me to learn how to make a Gradle plugin.   Even if I don't succeed, I'll learn more about how Gradle works.


Saturday, May 24, 2014

Thinking about Java 8

With all the fanfare of the impending Java 8 release, I thought it would be a good opportunity to brush up on some of the new features and think about how useful they might be at work.   Here's what I've come up with so far:

  • @FunctionalInterface - I like this as it allows me to lock down interfaces that I want to have only one method (which is what makes them functional, or function-like).   I know a co-worker or two who will really like this.
  • java.time - Finally!   Joda-Time users (like me) will find this to be very familiar looking.
  • Lambdas - I think any Groovy user will say "finally, something like groovy closures!".   This will probably come in handy, but...
    1. As with anything concise and powerful, it could be misused.  Golden hammer problems might happen (suddenly everything has to be a Lambda).
    2. The syntax is close to what Groovy does, so it might be a little confusing to those of us who switch back and forth between Groovy and Java.
    3. The combination of Lambdas and function/method reference shorthand can result in some very 'tight' code.
     
  • No more Permanent Generation -  Okay, so class metadata now lives in native memory (Metaspace), and interned strings and static fields already moved to the regular heap in Java 7.   Sounds good to me initially, since I'm a big fan of 'one kind of stuff'.   However, I'm not sure how this will affect GC configurations such as the one that I use frequently at work (ParNew + CMS).
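
Here's a minimal sketch of the first three items in one place. All the names (Java8Sketch, Transformer, transformAll, twoWeeksAfter) are made up for illustration; it only uses standard Java 8 library calls.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class and method names; a sketch, not production code.
class Java8Sketch {

    // With this annotation the compiler rejects a second abstract method.
    @FunctionalInterface
    interface Transformer {
        String apply(String input);
    }

    // Applies the transformer to every item and collects the results.
    static List<String> transformAll(List<String> items, Transformer t) {
        return items.stream().map(t::apply).collect(Collectors.toList());
    }

    // java.time values are immutable, much like Joda-Time.
    static LocalDate twoWeeksAfter(LocalDate date) {
        return date.plusWeeks(2);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("gradle", "maven");
        // A lambda: prints [gradle!, maven!]
        System.out.println(transformAll(names, s -> s + "!"));
        // A method reference, the 'tight' form: prints [GRADLE, MAVEN]
        System.out.println(transformAll(names, String::toUpperCase));
        // Prints 2014-06-07
        System.out.println(twoWeeksAfter(LocalDate.of(2014, 5, 24)));
    }
}
```

The method reference line is a good example of how 'tight' this can get: it is shorter than the lambda, but you have to know that String::toUpperCase is being matched against Transformer.apply.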
 These features reduce the gap between Java and Scala.   As good as Scala is, it's not easy to justify using it in many cases, and with Java 8, I think that set of cases got quite a bit smaller.   I'll probably learn Scala anyway, just because, but for production code I'm thinking Java 8 would be a safer bet.

Friday, December 27, 2013

Reading Java Properties files with shell

This is an oldie but a goodie, so I'll re-post it here on this new blog.   How do you read Java properties files with shell?  Maybe another good question is: Why would you want to read Java properties files with shell?   Let's start with that.

Why?

Quite a few Java Enterprise systems that I've worked on used Java (ANT) properties files for build and runtime configuration.   It's a format that is easy to edit and easy to read for both humans and Java programs.   However, when running Java Enterprise on Linux, you invariably get some shell scripts for process control, deployment, etc.   These need to know about database host names, and other things that are included in the configuration file.

Method 1 - Scan through the file to get an individual property

This works well if you just want to read a specific property in a Java-style properties file. Here's what it does:
  1. Remove comments (lines that start with '#')
  2. Grep for the property
  3. Get the last value using tail -n 1
  4. Strip off the property name and the '=' using cut -d "=" -f2-, which also handles values with '=' in them (as a commenter suggested).
sed '/^\#/d' myprops.properties | grep 'someproperty'  | tail -n 1 | cut -d "=" -f2-

It can also be handy to strip off the leading and trailing blanks with this sed expression:
 
s/^[[:space:]]*//;s/[[:space:]]*$//

Which makes the whole thing look like this:

sed '/^\#/d' myprops.properties | grep 'someproperty'  | tail -n 1 | cut -d "=" -f2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'

Shell scripts can use this technique to set environment variables, for example:

JAVA_HOME=`sed '/^\#/d' build.properties | grep 'jdk.home'  | tail -n 1 | cut -d "=" -f2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'`

Method 2 - Convert the properties file to a shell script, then run it

This method is good for when you want to read the whole properties file. However, it only works if the property names are valid shell variable names. For example, property names with a '.' in them will cause this to break. The method was inspired by an article from Michael Slinn.

Security Note

These techniques generate shell code from the properties file, then execute the generated code. This can be a potential security issue if the properties file has hidden shell code in it. Make sure you have appropriate access control on the properties file in this case. 


The steps are:
  1. Create a temp file.
  2. Stream the properties file, convert it to unix LF format (unless you are sure the file was produced on Linux).
  3. Escape any double quotes already in the values, so they survive the quoting in the next step.
  4. Double quote all the values, to preserve spaces.
  5. Source in the file. Shell should ignore Java properties file comments automatically.
  6. Erase the temp file.
Here is the code:

TEMPFILE=$(mktemp)
cat somefile.properties | dos2unix |
sed -re 's/"/\\"/g' | sed -re 's/=(.*)/="\1"/' > "$TEMPFILE"
source "$TEMPFILE"
rm "$TEMPFILE"

Method 2 reloaded - Use AWK

This method is the same as the previous one except we use an inline awk script to replace any non-alpha-digit-underscore characters in the property names with underscore. This solves the problem the previous method has when it processes properties that have '.' in the name, for example.
Some clever person could probably figure out how to do the same thing with sed, but awk works.
Steps:
  1. Create a temporary file.
  2. Stream the properties file to stdout and (optionally) filter out dos style linefeeds.
  3. Split each line using "=" as the field separator. If the line has at least two fields, substitute the non-identifier characters in the first one (the name) with underscore, and print everything after the first "=" with quotes around it.
  4. If the line is a comment or doesn't have an =, just print it out.
  5. Source in the temporary file and delete it.
Teh codez:

TEMPFILE=$(mktemp)
cat somefile.properties | dos2unix |
awk 'BEGIN { FS="="; } \
/^\#/ { print; } \
!/^\#/ { if (NF >= 2) { n = $1; gsub(/[^A-Za-z0-9_]/,"_",n); v = substr($0, index($0, "=") + 1); print n "=\"" v "\""; } else { print; } }' \
 >$TEMPFILE
source $TEMPFILE
rm $TEMPFILE

Method 3 - Using Groovy

This is my current favorite!   Since Groovy has all of the ANT facilities available, it can do all the property interpolation things that ANT does, plus handle all the quirks of Java Properties files that the other methods may not handle easily.

The example here can read two properties files: one for default values, and one for overrides that are environment specific.   This is great for when you have a lot of values that are the same most of the time, and only a few which change based on the environment.


#!/usr/bin/groovy
def generateEnvVars(File propertiesFile,File defaultsFile) {
    println "#!/bin/bash"
    AntBuilder ant = new AntBuilder()
    print "# Reading ${propertiesFile} ..."
    ant.property (file: propertiesFile)
    println "OK"
    if (defaultsFile) {
            print "# Reading ${defaultsFile} ..."
            ant.property (file: defaultsFile)
            println "OK"
    }
    // The properties are now in ant.project.properties
    def props = ant.project.properties
    props.keySet().toList().sort().each { String name ->
      // Shell-ify the property name.
      def shellVar = name.replaceAll(/[^A-Za-z0-9_]/, '_')
      def value = props[name]
      println "${shellVar}=\'${value}\'"
    }
}
 
// Main
// args[0] - Name of the main properties file
// args[1] (optional) - Name of the defaults properties file
generateEnvVars(new File(args[0]), args.length > 1 ? new File(args[1]) : null)

This reads the properties file (the first argument) and optionally the default properties file (the second argument), and writes the shell statements to stdout.  Since we're using ANT to read in the properties files, this version has a lot more functionality:
  1. All the standard ANT properties will be processed. 
  2. ANT will interpolate the property values. 

So, if we have the following properties file:

# properties example
some.property=this is a property
multiline.property=this is a java \
multi line property
interpolated=the value of some.property is "${some.property}"

The generated shell script will contain:
 
some_property='this is a property'
multiline_property='this is a java multi line property'
interpolated='the value of some.property is "this is a property"'

Edge Case - Shell commands in property values

What happens when the properties file looks like this?

zzzhack=' echo "all your base are belong to us" '

When we use the Groovy script with this file, we hit an edge case:
 
$ groovy readprops.groovy my.properties > env-vars.sh
$ source env-vars.sh
all your base are belong to us

Uh oh! This property value could have done something very bad. We can change the Groovy script to escape any single quotes in the value. Inside a single-quoted shell string a backslash is literal, so each ' has to become '\'' (close the string, add an escaped quote, reopen the string):
 
def value = props[name].replace("'", "'\\''")

Of course, this isn't a problem if you don't embed shell commands into your properties files... with single quotes around them. (yeah... well... maybe you shouldn't do that!)
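
If Groovy isn't installed on the box, roughly the same idea can be sketched in plain Java. This is only a sketch under some assumptions: the class and method names (PropsToShell, shellName, shellQuote, toShell) are made up, and it uses java.util.Properties rather than ANT, so it handles multi-line values but does NOT interpolate ${...} references the way the Groovy version does.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical class name; a sketch, not a drop-in replacement for the Groovy script.
class PropsToShell {

    // Replace characters that cannot appear in a shell variable name with '_'.
    // (A name that starts with a digit is not handled here.)
    static String shellName(String propertyName) {
        return propertyName.replaceAll("[^A-Za-z0-9_]", "_");
    }

    // Wrap the value in single quotes; each embedded ' becomes '\''
    // (close the string, add an escaped quote, reopen the string).
    static String shellQuote(String value) {
        return "'" + value.replace("'", "'\\''") + "'";
    }

    // Emit one NAME='value' line per property, sorted by property name.
    static String toShell(Properties props) {
        StringBuilder out = new StringBuilder();
        for (String name : new TreeSet<>(props.stringPropertyNames())) {
            out.append(shellName(name)).append('=')
               .append(shellQuote(props.getProperty(name))).append('\n');
        }
        return out.toString();
    }

    // args[0] - Name of the properties file
    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Properties files are ISO-8859-1 by specification.
        try (Reader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.ISO_8859_1)) {
            props.load(in);
        }
        System.out.print(toShell(props));
    }
}
```

Because every value ends up inside one single-quoted string, sourcing the output for the zzzhack example just assigns the text to the variable instead of running the echo.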

Here's the previous generation of this post: http://shrubbery.homeip.net/c/display/W/Reading+Java-style+Properties+Files+with+Shell