Where Java system properties come from

The reference implementation of Java, OpenJDK, has been open source for a long time now, which means we can look under the covers and see the implementation details of anything we find interesting or that the API doesn’t fully specify. For example, let’s look at how system properties like file.encoding, os.name and java.io.tmpdir get their values.

Starting in the source code of the getProperty method in the java.lang.System class, we see that the system properties are held in a private static field called props, and that its values are filled in by a native method called initProperties. Its implementation is in System.c, and the very first thing it does is call another function named GetJavaProperties:

java_props_t *sprops = GetJavaProperties(env);

If you grep the OpenJDK source code for this function you’ll find it defined in two places: one for Windows and another for Solaris. (The source code under solaris is also used on Linux.) What it does is define a static struct sprops of type java_props_t, fill its fields with values based on API calls to the OS, and then return it. The initProperties function mentioned earlier takes this struct and combines its values with properties passed in from the command line to set the properties that Java’s System.getProperty method will eventually return.
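
The layering described here can be mimicked in plain Java. What follows is only a sketch of the idea, not the JDK's code: the class PropertyLayering and its merge method are hypothetical names, the sample values are made up, and it assumes that a value given on the command line wins over the platform default (which is what you observe when you pass -D to the java launcher).

```java
import java.util.Properties;

final class PropertyLayering {

    // Hypothetical helper: start from the platform defaults and let every
    // command-line entry replace the default stored under the same key.
    static Properties merge(Properties platformDefaults, Properties commandLine) {
        Properties result = new Properties();
        result.putAll(platformDefaults);
        result.putAll(commandLine);
        return result;
    }

    public static void main(String[] args) {
        // Made-up values standing in for what the native code detected.
        Properties defaults = new Properties();
        defaults.setProperty("java.io.tmpdir", "/var/tmp");
        defaults.setProperty("os.name", "Linux");

        // Stands in for "java -Djava.io.tmpdir=/scratch ..."
        Properties commandLine = new Properties();
        commandLine.setProperty("java.io.tmpdir", "/scratch");

        Properties merged = merge(defaults, commandLine);
        System.out.println("java.io.tmpdir = " + merged.getProperty("java.io.tmpdir"));
        System.out.println("os.name = " + merged.getProperty("os.name"));
    }
}
```

In this sketch the overridden key takes the command-line value while os.name keeps its detected default.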

For example, this is how java.io.tmpdir, the directory where temporary files are stored, gets its value: On Solaris and Linux the preprocessor macro P_tmpdir is used; if it’s not defined, the hardcoded path /var/tmp is used instead. On Windows the API function GetTempPathW is used:

WCHAR tmpdir[MAX_PATH + 1];
/* we might want to check that this succeed */
GetTempPathW(MAX_PATH + 1, tmpdir);
sprops.tmp_dir = _wcsdup(tmpdir);
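
Whichever platform branch produced the value, from the Java side it is visible through System.getProperty, and File.createTempFile is documented to use it as the default temporary-file directory. The small check below is a sketch that assumes the temp directory is writable; TmpDirCheck and tempFileIsInTmpDir are illustrative names only.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

final class TmpDirCheck {

    // Returns true when a freshly created temp file lives directly inside
    // the directory named by the java.io.tmpdir system property.
    static boolean tempFileIsInTmpDir() {
        File tmpDir = new File(System.getProperty("java.io.tmpdir"));
        try {
            File file = File.createTempFile("props-demo", ".tmp");
            try {
                return tmpDir.equals(file.getParentFile());
            } finally {
                file.delete(); // clean up the probe file
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.io.tmpdir = " + System.getProperty("java.io.tmpdir"));
        System.out.println("temp file created there: " + tempFileIsInTmpDir());
    }
}
```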

The case for os.name is similar: On Solaris and Linux the JVM calls the uname function:

struct utsname name;
uname(&name);
sprops.os_name = strdup(name.sysname);

On Windows it’s a little more complicated: The JVM calls the Windows API function GetVersionEx, which returns the OS version number, like 6.1. This number is then translated into the product name, like “Windows 7”, by checking a series of conditions.
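
To give a feel for that translation step, here is a deliberately simplified sketch in Java. It is not the JDK's native code: WindowsNameSketch and productName are hypothetical names, only a handful of desktop versions are covered, and the fallback string is a placeholder of my own. The real function has to check more conditions, for instance because server editions share version numbers with desktop ones.

```java
final class WindowsNameSketch {

    // Placeholder used by this sketch for versions it does not know.
    static final String UNKNOWN = "Windows (unknown)";

    // Hypothetical helper: map a major.minor version number to the
    // marketing name of the matching desktop edition.
    static String productName(int major, int minor) {
        if (major == 5 && minor == 1) {
            return "Windows XP";
        }
        if (major == 6) {
            switch (minor) {
                case 0: return "Windows Vista";
                case 1: return "Windows 7";
                case 2: return "Windows 8";
                default: return UNKNOWN;
            }
        }
        return UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println("6.1 -> " + productName(6, 1));
        System.out.println("this JVM reports os.name = " + System.getProperty("os.name"));
    }
}
```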

Then the case for file.encoding: On Linux and Solaris the encoding is obtained by calling nl_langinfo(CODESET). On Windows the GetLocaleInfo function is used to get the “default ANSI code page”, and then the appropriate prefix is added to produce an encoding name such as Cp1252 or MS932. Some CJK encodings, like EUC-JP or GBK, need special handling on both platforms, though.
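
The Windows naming rule can be sketched in a few lines of Java. Treat this as an illustration under assumptions, not as the JDK's logic: CodePageNameSketch and encodingName are hypothetical names, and the set of code pages that get the MS prefix or a special name here is my own small selection, not the full list the native code handles.

```java
final class CodePageNameSketch {

    // Hypothetical helper: turn a Windows ANSI code page number into a
    // Java-style encoding name by adding a prefix, with one special case.
    static String encodingName(int codePage) {
        switch (codePage) {
            case 932: // Japanese
            case 949: // Korean
            case 950: // Traditional Chinese
                return "MS" + codePage;
            case 936: // Simplified Chinese gets a name of its own, no prefix
                return "GBK";
            default:
                return "Cp" + codePage;
        }
    }

    public static void main(String[] args) {
        System.out.println("1252 -> " + encodingName(1252));
        System.out.println("932  -> " + encodingName(932));
        System.out.println("this JVM reports file.encoding = " + System.getProperty("file.encoding"));
    }
}
```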