Friday, December 27, 2013

Reading Java Properties files with shell

This is an oldie but a goodie, so I'll re-post it here on this new blog.   How do you read Java properties files with shell?  Maybe another good question is: Why would you want to read Java properties files with shell?   Let's start with that.

Why?

Quite a few Java Enterprise systems that I've worked on used Java (ANT) properties files for build and runtime configuration.   The format is easy to edit and easy to read for both humans and Java programs.   However, when running Java Enterprise on Linux, you invariably get some shell scripts for process control, deployment, etc.   These scripts need to know about database host names and other settings that live in the configuration file.

Method 1 - Scan through the file to get an individual property

This works well if you just want to read a specific property in a Java-style properties file. Here's what it does:
  1. Remove comments (lines that start with '#')
  2. Grep for the property
  3. Get the last value using tail -n 1
  4. Strip off the property name and the '=' (using cut -d "=" -f2- to handle values with '=' in them, as a commenter suggested).

sed '/^\#/d' myprops.properties | grep 'someproperty'  | tail -n 1 | cut -d "=" -f2-

It can also be handy to strip off the leading and trailing blanks with this sed expression:
 
s/^[[:space:]]*//;s/[[:space:]]*$//

Which makes the whole thing look like this:

sed '/^\#/d' myprops.properties | grep 'someproperty'  | tail -n 1 | cut -d "=" -f2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'

Shell scripts can use this technique to set environment variables, for example:

JAVA_HOME=`sed '/^\#/d' build.properties | grep 'jdk.home'  | tail -n 1 | cut -d "=" -f2- | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'`
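
One thing to watch with the pipeline above is that grep matches the property name anywhere on the line, so 'jdk.home' would also pick up 'jdk.home.backup' or a value that happens to contain the name. Here is a minimal sketch of the same steps wrapped in a function that anchors the match to the start of the line. The function name get_property and the sample file are made up for illustration, and it assumes property names contain no regex characters other than '.'.

```shell
#!/bin/sh
# get_property FILE NAME: print the last value assigned to NAME in FILE.
# (Hypothetical helper; same steps as the pipeline above.)
get_property() {
    # Escape dots so "jdk.home" cannot match "jdkXhome".
    _gp_pattern=$(printf '%s' "$2" | sed 's/\./\\./g')
    sed '/^[[:space:]]*#/d' "$1" \
        | grep "^[[:space:]]*${_gp_pattern}[[:space:]]*=" \
        | tail -n 1 \
        | cut -d '=' -f2- \
        | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'
}

# Sample file, made up for this sketch.
props=$(mktemp)
cat > "$props" <<'EOF'
# jdk.home=/commented/out
jdk.home = /opt/jdk1.6
jdk.home.backup=/opt/old
jdk.home = /opt/jdk1.7
db.url=jdbc:foo://host/db?user=me
EOF

JDK_HOME=$(get_property "$props" jdk.home)
DB_URL=$(get_property "$props" db.url)
rm -f "$props"

printf '%s\n' "$JDK_HOME"   # prints /opt/jdk1.7
printf '%s\n' "$DB_URL"     # prints jdbc:foo://host/db?user=me
```

A property that is missing from the file simply yields an empty string, so the caller can test for that and fall back to a default.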

Method 2 - Convert the properties file to a shell script, then run it

This method is good for when you want to read the whole properties file. However, it only works if the property names are valid shell variable names. For example, property names with a '.' in them will cause this to break. The original article from Michael Slinn inspired this method.

Security Note

These techniques generate shell code from the properties file, then execute the generated code. This can be a potential security issue if the properties file has hidden shell code in it. Make sure you have appropriate access control on the properties file in this case. 


The steps are:
  1. Create a temp file.
  2. Stream the properties file and convert it to Unix LF format (unless you are sure the file was produced on Linux).
  3. Escape any double quotes in the values with a backslash (presumably the purpose of the first 'sed'), so they survive the quoting in the next step.
  4. Double quote all the values, to preserve spaces.
  5. Source in the file. The shell ignores '#' comment lines automatically.
  6. Erase the temp file.
Here is the code:

TEMPFILE=$(mktemp)
cat somefile.properties | dos2unix |
sed -re 's/"/\\"/g' | sed -re 's/=(.*)/="\1"/g' > "$TEMPFILE"
source "$TEMPFILE"
rm "$TEMPFILE"
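
If you would rather not manage a temp file, the same conversion can be evaluated directly. This is only a sketch under a few assumptions: the function name load_properties is made up, 'tr' stands in for dos2unix (which is not installed everywhere), the first sed is assumed to be there to escape embedded double quotes, and the property names must already be valid shell variable names. Values containing '$', backticks or backslashes are still expanded by the shell, so the security note above applies here too.

```shell
#!/bin/sh
# load_properties FILE: convert FILE to shell assignments and evaluate them.
# (Hypothetical helper name.)
load_properties() {
    # Strip carriage returns, escape embedded double quotes, then wrap each
    # value in double quotes so spaces are preserved.
    _lp_code=$(tr -d '\r' < "$1" | sed -e 's/"/\\"/g' -e 's/=\(.*\)/="\1"/')
    eval "$_lp_code"
}

# Sample file with DOS line endings, made up for this sketch.
props=$(mktemp)
printf '# sample file with DOS line endings\r\n' > "$props"
printf 'db_host=db01.example.com\r\n' >> "$props"
printf 'greeting=say "hello" to everyone\r\n' >> "$props"

load_properties "$props"
rm -f "$props"

printf '%s\n' "$db_host"    # prints db01.example.com
printf '%s\n' "$greeting"   # prints say "hello" to everyone
```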

Method 2 reloaded - Use AWK

This method is the same as the previous one except we use an inline awk script to replace any non-alpha-digit-underscore characters in the property names with underscore. This solves the problem the previous method has when it processes properties that have '.' in the name, for example.
Some clever person could probably figure out how to do the same thing with sed, but awk works.
Steps:
  1. Create a temporary file.
  2. Stream the properties file to stdout and (optionally) filter out DOS-style line endings.
  3. Split each line using "=" as the field separator. If there are at least two fields, substitute the non-identifier characters in the first one with underscores, and print everything after the first "=" with quotes around it, so values that contain "=" stay intact.
  4. If the line is a comment or doesn't have an "=", just print it out.
  5. Source in the temporary file and delete it.
Teh codez:

TEMPFILE=$(mktemp)
cat somefile.properties | dos2unix |
awk 'BEGIN { FS="="; } \
/^#/ { print; } \
!/^#/ { if (NF >= 2) { n = $1; gsub(/[^A-Za-z0-9_]/,"_",n); v = substr($0, index($0,"=")+1); print n "=\"" v "\""; } else { print; } }' \
 > "$TEMPFILE"
source "$TEMPFILE"
rm "$TEMPFILE"
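
Given the security note earlier, it may be worth sketching a variant that never turns the property values into shell code at all. The idea here is to read the file line by line, rename the property the same way the awk script does, and assign the value through a variable reference. The function name load_properties_safely is made up, and the sketch assumes there are no blanks around the '=' and no multi-line values.

```shell
#!/bin/sh
# load_properties_safely FILE: set one shell variable per property in FILE.
# (Hypothetical helper name.)
load_properties_safely() {
    _cr=$(printf '\r')
    while IFS= read -r _line || [ -n "$_line" ]; do
        _line=${_line%"$_cr"}                 # drop a trailing DOS carriage return
        case $_line in
            '#'*|'!'*) continue ;;            # comment lines
            *=*) ;;                           # name=value lines are handled below
            *) continue ;;                    # anything else is skipped
        esac
        _value=${_line#*=}                    # everything after the first '='
        # Same renaming rule as the awk script: non-identifier characters become '_'.
        _name=$(printf '%s' "${_line%%=*}" | tr -c 'A-Za-z0-9_' '_')
        case $_name in
            ''|[0-9]*) continue ;;            # not usable as a variable name
        esac
        # eval only sees "name=$_value"; the value is expanded as data,
        # never parsed as code.
        eval "$_name=\$_value"
    done < "$1"
}

# Sample file, made up for this sketch.
props=$(mktemp)
cat > "$props" <<'EOF'
# sample properties
db.host=db01.example.com
db.url=jdbc:foo://db01/app?user=me
danger=$(echo pwned) `echo pwned`
EOF

load_properties_safely "$props"
rm -f "$props"

printf '%s\n' "$db_host"   # prints db01.example.com
printf '%s\n' "$db_url"    # prints jdbc:foo://db01/app?user=me
printf '%s\n' "$danger"    # prints the text literally; nothing is executed
```

The trade-off is that this handles less of the properties format than sourcing a converted file, but a hostile value can no longer run commands.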

Method 3 - Using Groovy

This is my current favorite!   Since Groovy has all of the ANT facilities available, it can do all the property interpolation things that ANT does, plus handle all the quirks of Java Properties files that the other methods may not handle easily.

The example here can read two properties files: one for default values, and one for overrides that are environment specific.   This is great for when you have a lot of values that are the same most of the time, and only a few which change based on the environment.


#!/usr/bin/groovy
def generateEnvVars(File propertiesFile,File defaultsFile) {
    println "#!/bin/bash"
    AntBuilder ant = new AntBuilder()
    print "# Reading ${propertiesFile} ..."
    ant.property (file: propertiesFile)
    println "OK"
    if (defaultsFile) {
        print "# Reading ${defaultsFile} ..."
        ant.property (file: defaultsFile)
        println "OK"
    }
    // The properties are now in ant.project.properties
    def props = ant.project.properties
    props.keySet().toList().sort().each { String name ->
      // Shell-ify the property name.
      def shellVar = name.replaceAll(/[^A-Za-z0-9_]/, '_')
      def value = props[name]
      println "${shellVar}=\'${value}\'"
    }
}
 
// Main
// args[0] - Name of the main properties file
// args[1] (optional) - Name of the defaults properties file
generateEnvVars(new File(args[0]), args.length > 1 ? new File(args[1]) : null)

This reads the properties file (the first argument) and optionally the default properties file (the second argument), and writes the shell statements to stdout.  Since we're using ANT to read in the properties files, this version has a lot more functionality:
  1. All the standard ANT properties will be processed. 
  2. ANT will interpolate the property values. 

So, if we have the following properties file:

# properties example
some.property=this is a property
multiline.property=this is a java \
multi line property
interpolated=the value of some.property is "${some.property}"

The generated shell script will contain:
 
some_property='this is a property'
multiline_property='this is a java multi line property'
interpolated='the value of some.property is "this is a property"'

Edge Case - Shell commands in property values

What happens when the properties file looks like this?

zzzhack=' echo "all your base are belong to us" '

When we use the Groovy script with this file, we hit an edge case:
 
$ groovy readprops.groovy my.properties > env-vars.sh
$ source env-vars.sh
all your base are belong to us

Uh oh! This property value could have done something very bad. We can change the Groovy script to escape any single quotes like this:
 
// Close the quoted string, add an escaped quote, then reopen it: ' becomes '\''
def value = props[name].replace("'", "'\\''")

However, this may break some things. Of course, this isn't a problem if you don't embed shell commands into your properties files... with single quotes around them. (yeah... well... maybe you shouldn't do that!)
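
For reference, the usual shell idiom for putting a single quote inside a single-quoted string is to close the quote, add a backslash-escaped quote, and reopen the quote. Here is a small sketch of that idiom on the shell side, using the troublesome value from the example. The function name shell_quote is made up for illustration.

```shell
#!/bin/sh
# shell_quote VALUE: print VALUE wrapped in single quotes, safe to evaluate.
# (Hypothetical helper name.)
shell_quote() {
    # Replace each ' with '\'' (close quote, escaped quote, reopen quote).
    _sq_escaped=$(printf '%s' "$1" | sed "s/'/'\\\\''/g")
    printf "'%s'" "$_sq_escaped"
}

# The value from the edge case above.
value="' echo \"all your base are belong to us\" '"
line="zzzhack=$(shell_quote "$value")"
printf '%s\n' "$line"

# Evaluating the generated line only assigns the variable; nothing is echoed.
eval "$line"
```

After the eval, zzzhack holds the original text including its single quotes, and the embedded echo never runs.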

Here's the previous generation of this post: http://shrubbery.homeip.net/c/display/W/Reading+Java-style+Properties+Files+with+Shell
