Project Nayuki


Java SE 5 is the most significant release

If I were to pick one version of Java whose features I could not live without, it would easily be version 5 (released in 2004, officially known as J2SE 5.0). I started learning real programming and the Java programming language with version 1.4, which I used for those first few years of software development. Back in those days, I didn’t learn to use the standard data structure classes in the collections framework. Instead, I manually wrote dynamically sized arrays for specific types, which was quite painful (but sort of type-safe, and avoided iterators and casting).

I later began adopting version 5, but it took me a few more years to fully embrace its new features like generics, the for-each loop, auto-boxing, varargs, printf(), etc. Later versions of Java (particularly 7 and 8) add a number of features that I enjoy, but I consider them to be nice-to-haves, rather than core features that make the Java programming language bearable to use.

Java SE 5 features that I use

The following features are listed in roughly descending order of personal importance, and explained with a usage example from my experience. This is not meant to be an exhaustive list of all features added in version 5; there are links to publicly available documentation for that information.

Generic data structures

The collections framework was introduced in version 1.2, but using it is clunky and unsafe:

// Cannot declare that a map's keys
// and values must have certain types
Map m = new HashMap();

// Cannot constrain the argument
// types when calling put()
m.put("a", new Integer(4));

// Cannot guarantee that objects obtained
// from the map have certain types
Object o = m.get("a");

// Need to cast manually, and
// may throw ClassCastException
Integer v = (Integer)o;

// Need to unbox the raw value
int i = v.intValue();

The generics introduced in Java 5 are a significant feature (in terms of changes and learning), but greatly improve safety and documentation:

// Java 5 generics
Map<String,Integer> m;

// Java 7 diamond operator
m = new HashMap<>();

// Java 5 auto-boxing
m.put("a", 4);

// Java 5 generics
Integer v = m.get("a");

// Java 5 auto-unboxing
int i = v;

Generics make the element types visible to the programmer, and are checked by the compiler. After compilation though, generics are translated into raw types and casting, which means that the two pieces of example code behave identically from the JVM’s point of view.
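To make the erasure point concrete, here is a minimal sketch (the class name ErasureDemo and its methods are made up for illustration). It shows that two differently parameterized lists share one runtime class, and that a raw-typed reference can slip a wrong element in, with the failure only surfacing at the cast the compiler inserts.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists have the same runtime class because type arguments are erased.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    // A raw reference bypasses the compile-time check; the error appears
    // later, at the hidden (String) cast generated for the assignment.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean castFailsAtUseSite() {
        List<String> strings = new ArrayList<String>();
        List raw = strings;           // Raw view of the same list object
        raw.add(Integer.valueOf(4));  // Compiles; nothing is checked at run time here
        try {
            String s = strings.get(0);  // Compiler-inserted cast fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());    // true
        System.out.println(castFailsAtUseSite());  // true
    }
}
```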

For-each loop

The old way of iterating forward through language-level arrays and standard library containers:

int[] arr = (...);
for (int i = 0; i < arr.length; i++) {
  int val = arr[i];
  (...)
}

List lst = (...);
for (Iterator iter = lst.iterator();
    iter.hasNext(); ) {
  Token val = (Token)iter.next();
  (...)
}

The new way with the enhanced for loop/statement (official name):

int[] arr = (...);
for (int val : arr) {
  (...)
}

List<Token> lst = (...);
for (Token val : lst) {
  (...)
}

The for-each loop drastically cuts down on the amount of verbose syntax in both cases. For the array case, some small ad hoc arrays are meant to be traversed from front to back – but many arrays need advanced control over the indexing (e.g. offset, length, stride, etc.), so using explicit loop variables is not considered unusual. For container classes that implement Iterable<E> (such as List<E> and Set<E>), it is very common to use the for-each syntax, and almost the only reason to use an explicit iterator is to be able to call remove() on elements.

The traditional for-loop, which comes from the C programming language, is rather low-level with its three sections and explicit mechanisms. By contrast, the for-each loop expresses the programmer’s high-level intent more succinctly, and is the type of for-loop found in Python and shell scripting languages.
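As a small runnable sketch of both styles on a real list (ForEachDemo and its method names are invented for this example): the for-each loop handles plain traversal, while removal during traversal still calls for an explicit iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ForEachDemo {
    // Plain traversal: no index variable or explicit iterator needed.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values)  // Each Integer is auto-unboxed
            total += v;
        return total;
    }

    // Removing elements while traversing still needs an explicit iterator.
    static void removeOdd(List<Integer> values) {
        for (Iterator<Integer> iter = values.iterator(); iter.hasNext(); ) {
            if (iter.next() % 2 != 0)
                iter.remove();
        }
    }

    public static void main(String[] args) {
        List<Integer> lst = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(sum(lst));  // 15
        removeOdd(lst);
        System.out.println(lst);       // [2, 4]
    }
}
```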

Auto-boxing/unboxing

Java has an uncomfortable divide between primitive numeric types and object-boxed versions of these types (e.g. float vs. java.lang.Float). I personally don’t know if this language feature can be designed better (see also: C/C++ values vs. pointers, C# generics, and Java’s own Project Valhalla), but auto-boxing certainly reduces the syntactical pain in a low-cost, low-risk way. Auto-boxing plays well with generics and the for-each loop (both introduced in Java 5), making it far less painful to put int values into a List<Integer> for example. However, auto-boxing has some pitfalls when using the equality operator (==) and dealing with null values.
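The two pitfalls can be sketched as follows (BoxingPitfalls is a hypothetical class name). Note that the result of == on two boxed values depends on whether they happen to be the same cached object, so the sketch prints it without relying on it.

```java
class BoxingPitfalls {
    // equals() compares the numeric values of the two boxes.
    static boolean valueEquals(Integer a, Integer b) {
        return a.equals(b);
    }

    // Auto-unboxing a null reference throws NullPointerException.
    static boolean unboxingNullThrows() {
        Integer missing = null;
        try {
            int x = missing;  // Compiles to missing.intValue()
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Integer a = 1000, b = 1000;
        // Reference comparison: may be false even though the values match.
        System.out.println(a == b);
        System.out.println(valueEquals(a, b));     // true
        System.out.println(unboxingNullThrows());  // true
    }
}
```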

Variadic methods

Old example without the feature (a.k.a. varargs):

int sum(int[] arr) {
  (...)
}
print(sum(new int[]{1, 2, 3}));

With varargs, the method declaration is changed slightly and the caller’s code is simplified (but in the JVM it behaves exactly like the old example):

int sum(int... arr) {
  (...)
}
print(sum(1, 2, 3));

The C function printf() is a flexible, powerful, and concise way to express string formatting logic. Varargs enable this function to be included in Java with a sane syntax, made available in PrintWriter/PrintStream.printf() and String.format(). I use these methods only occasionally, but they are preferable to workarounds that avoid varargs.
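Here is a minimal runnable version of the sum() example together with a String.format() call (VarargsDemo and describe() are names invented for this sketch). Passing Locale.ROOT is my own addition so that the decimal separator does not vary between machines.

```java
import java.util.Locale;

class VarargsDemo {
    // The caller may pass zero or more ints, or an explicit int[] array.
    static int sum(int... values) {
        int total = 0;
        for (int v : values)
            total += v;
        return total;
    }

    // String.format() is itself a variadic method.
    static String describe(String name, int count, double price) {
        return String.format(Locale.ROOT, "%s: %d items at %.2f each", name, count, price);
    }

    public static void main(String[] args) {
        System.out.println(sum());                   // 0
        System.out.println(sum(1, 2, 3));            // 6
        System.out.println(sum(new int[]{4, 5}));    // 9
        System.out.println(describe("pen", 3, 1.5)); // pen: 3 items at 1.50 each
    }
}
```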

Return type covariance

Previously, it was impossible for a subclass to override a method to have a more specific return type:

class Animal { }
class AnimalFactory {
  Animal makeAnimal();
}

class Cat extends Animal { }
class CatFactory extends AnimalFactory {
  Cat makeAnimal();  // Compile error before Java 5
}

This notion is somewhat more difficult to handle and verify for language/compiler designers. But it is logically sound, and it became legal in Java 5.

One consequence is that if you choose to override Object.clone(), you can and should make it return your class type instead of the Object type:

class Object {  // Package java.lang
  protected Object clone();  // Base definition
  (...)
}

// Implicitly extends Object
class Thing implements Cloneable {
  public Thing clone();  // Recommended!
}

Another consequence is that calling the implicit clone() method on an array will give a result of the same type, so you no longer need to cast:

byte[] b = (...);

// Required in Java 1.4 and below
byte[] c = (byte[])b.clone();

// Okay in Java 5 and above
byte[] d = b.clone();

Note that when implementing clone(), return type covariance will help tremendously when you clone sub-objects and arrays that your class contains.
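A sketch of what such a clone() implementation might look like (ByteBox is a hypothetical class, not taken from any library). Both the overridden method and the array clone() call benefit from covariance; only the call to super.clone() still needs a cast.

```java
class ByteBox implements Cloneable {
    private byte[] data;

    ByteBox(byte[] data) {
        this.data = data.clone();  // Array clone() needs no cast
    }

    byte get(int index) { return data[index]; }
    void set(int index, byte value) { data[index] = value; }

    // Covariant return type: ByteBox instead of Object
    public ByteBox clone() {
        try {
            ByteBox copy = (ByteBox) super.clone();  // Object.clone() returns Object
            copy.data = data.clone();                // Deep-copy the array field
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);  // Unreachable: this class is Cloneable
        }
    }

    public static void main(String[] args) {
        ByteBox a = new ByteBox(new byte[]{1, 2, 3});
        ByteBox b = a.clone();  // No cast at the call site
        b.set(0, (byte) 9);
        System.out.println(a.get(0) + " " + b.get(0));  // 1 9
    }
}
```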

Annotations

Annotations mark classes, methods, fields, local variables, etc. with additional information that can be used at compile time and/or run time. Although the annotation mechanism required a change in the Java programming language, each annotation definition does not – hence annotations are more lightweight than language features.

I make use of annotations when defining a JUnit test suite:

// Using JUnit 3 framework (no annotations)
class FooTest extends TestCase {
  // Names of test methods must start with "test"
  public void testBar();
  public void testQux();
  public void doNotTest();
}

// JUnit 4 (requires annotations and Java 5+)
import org.junit.Test;
class FooTest {
  // Names of test methods can be anything
  @Test public void bar();
  @Test public void TestQux();
  public void doNotTest();
}

I also use SuppressWarnings for various compile-time messages like raw types, unsafe generics, serialVersionUID, unused variables, etc. I am still undecided about whether using the Override annotation is a good or bad thing.
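To illustrate how a framework can discover annotated methods at run time, here is a rough sketch using a made-up marker annotation named Check. This only demonstrates the reflection mechanism; it is not how JUnit itself is implemented.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

class AnnotationDemo {
    // Hypothetical marker annotation, retained at run time for reflection
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Check { }

    static class Sample {
        @Check public void bar() { }
        @Check public void anyNameWorks() { }
        public void doNotRun() { }
    }

    // Returns the sorted names of all declared methods that carry @Check.
    static Set<String> findChecked(Class<?> cls) {
        Set<String> names = new TreeSet<String>();
        for (Method m : cls.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Check.class))
                names.add(m.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(findChecked(Sample.class));  // [anyNameWorks, bar]
    }
}
```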

Enumerations

Enums are a great type-safe alternative to sequential constants (e.g. NORTH=0, EAST=1, SOUTH=2, WEST=3) and bit flags (e.g. BOLD=1, ITALIC=2, UNDERLINE=4). A basic, no-frills enum is very easy to declare:

public enum Direction {
  NORTH, EAST, SOUTH, WEST
  // Optional trailing semicolon
}

You can use these enum values in equality comparisons (e.g. if (dir == EAST)) and in switch statements. Moreover, each enum value is assigned an integer value starting from 0, which can be obtained by calling the ordinal() method (e.g. WEST.ordinal() == 3). This allows you to manually convert a strongly typed enum value into a plain integer. Conversely, you can call your enum type’s static values() method to get an array of all enum values in the declared order.

As an alternative to carefully defining bit flag constants and doing arithmetic on them (which seems to be a pastime of C/C++ programmers), you should definitely declare an enum type and put values into an EnumSet (which is even more convenient than BitSet).

A Java enum is a full-fledged class, and can have fields and methods (in addition to the implicit ones already mentioned). Even before Java 5, this view of enums as classes was already advocated by Joshua Bloch’s book Effective Java (1st edition), item 21 “Replace enum constructs with classes”. Enums are a concise way to implement the multiton pattern. (Whereas in C/C++, an enum is not much more than a way to define integer constants with auto-incrementing values. At least C++11’s scoped enums (enum class) are type-safe and won’t implicitly convert to int.)
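The following sketch pulls these points together: an enum with fields and a method, ordinal() and values(), and an EnumSet in place of bit flags. The dx/dy fields, the opposite() method, and the Style enum are illustrative choices of mine, not anything standard.

```java
import java.util.EnumSet;

class EnumDemo {
    enum Direction {
        NORTH(0, -1), EAST(1, 0), SOUTH(0, 1), WEST(-1, 0);

        final int dx, dy;  // Each constant carries its own field values

        Direction(int dx, int dy) {
            this.dx = dx;
            this.dy = dy;
        }

        // Uses ordinal() and values() to step halfway around the compass.
        Direction opposite() {
            return values()[(ordinal() + 2) % values().length];
        }
    }

    enum Style { BOLD, ITALIC, UNDERLINE }

    public static void main(String[] args) {
        System.out.println(Direction.WEST.ordinal());    // 3
        System.out.println(Direction.NORTH.opposite());  // SOUTH

        // EnumSet replaces the BOLD|UNDERLINE style of bit arithmetic.
        EnumSet<Style> styles = EnumSet.of(Style.BOLD, Style.UNDERLINE);
        System.out.println(styles.contains(Style.ITALIC));  // false
        styles.add(Style.ITALIC);
        System.out.println(styles);  // [BOLD, ITALIC, UNDERLINE]
    }
}
```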

Standard library
  • Integer adds functions such as bitCount(), rotateLeft(), numberOfLeadingZeros(), which are useful in occasional low-level arithmetic (e.g. cryptography) and bitwise manipulation (e.g. packing data fields into bits). As a bonus, almost every function is intrinsified into a single machine instruction (at least on x86/x86-64).

  • The new System.nanoTime() aids benchmarking algorithms because it has much finer granularity (~1 μs) than System.currentTimeMillis() (~10 ms).

  • The new java.util.concurrent package and subpackages add tons of classes for building multithreaded/concurrent/parallel applications. I have only used a few of them personally (like ReentrantLock, Condition, AtomicInteger, AtomicLongArray, BlockingQueue, CountDownLatch, ExecutorService), but they are tremendously helpful. This is because without these classes, the alternative is either to implement concurrent data structures and algorithms (e.g. BlockingQueue) based on Java’s primitive monitor, or (even worse) use Unsafe or even JNI to implement some of these features at all (e.g. a Lock with multiple Condition variables).

  • Other Java 5 things I like from the standard library are already covered in the section on varargs (such as printf()).
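A brief sketch of a few of these library additions in action (Java5LibraryDemo and countConcurrently() are hypothetical names; the pool size of 4 and the 1000 increments per job are arbitrary). It uses an ExecutorService, an AtomicInteger, and a CountDownLatch for the concurrent part, and times the run with System.nanoTime().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class Java5LibraryDemo {
    // Runs the given number of jobs on a small pool; each job adds 1000
    // to a shared counter without any explicit locking.
    static int countConcurrently(int jobs) {
        final AtomicInteger counter = new AtomicInteger();
        final CountDownLatch done = new CountDownLatch(jobs);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            for (int i = 0; i < jobs; i++) {
                pool.execute(new Runnable() {
                    public void run() {
                        for (int j = 0; j < 1000; j++)
                            counter.incrementAndGet();
                        done.countDown();
                    }
                });
            }
            done.await();  // Block until every job has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(Integer.bitCount(0xF0));             // 4
        System.out.println(Integer.rotateLeft(0x80000001, 1));  // 3
        System.out.println(Integer.numberOfLeadingZeros(1));    // 31

        long start = System.nanoTime();
        int total = countConcurrently(8);
        long elapsed = System.nanoTime() - start;
        System.out.println(total + " increments in " + elapsed + " ns");
    }
}
```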

Static imports

Before, if you wanted to call a static method of another class without prefixing the class’s name, you either needed to subclass that class (impure) or write your own proxy method (verbose):

class Factorial {
  static double gamma(double x) { ... }
}

class FooApp extends Factorial {
  static void main() {
    print(gamma(1.5));
  }
}

class BarApp {
  static void main() {
    print(gamma(-2.4));
  }
  static double gamma(double x) {
    return Factorial.gamma(x);
  }
}

Now, it is a simple matter of adding import static Factorial.gamma; or import static Factorial.*; to the top of your source file. This helps with frequently used functions or constants like Math.sin() or Math.PI. In the case of the old JUnit 3, your test suite extends TestCase which contains static methods such as assertEquals(). But in JUnit 4, your test suite doesn’t need to extend anything, and you can statically import org.junit.Assert.assertEquals() to achieve the same convenient effect.
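For instance, a minimal sketch with members of java.lang.Math imported statically (StaticImportDemo and its two helper methods are invented for illustration):

```java
import static java.lang.Math.PI;
import static java.lang.Math.sin;
import static java.lang.Math.sqrt;

class StaticImportDemo {
    // PI and sqrt() are used without the "Math." prefix.
    static double circleArea(double radius) {
        return PI * radius * radius;
    }

    static double hypotenuse(double a, double b) {
        return sqrt(a * a + b * b);
    }

    public static void main(String[] args) {
        System.out.println(circleArea(1.0));
        System.out.println(hypotenuse(3.0, 4.0));  // 5.0
        System.out.println(sin(0.0));              // 0.0
    }
}
```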


Java SE 6 features that I use

Java 6 was released only about two years after Java 5, whereas versions 5, 7, and 8 each took roughly three or more years of development. This version had improvements behind the scenes, but not many affected me personally in practice.

Standard library

The new ArrayDeque<E> class is what I reflexively grab when I need a plain Queue<E> (not to be confused with a PriorityQueue<E> or heap). Previously in Java 5, the only real choice for a queue was LinkedList<E>, ignoring all the concurrent and blocking implementations.

The Arrays utility class adds the copyOf() and copyOfRange() families of functions, along with range-limited overloads of binarySearch(), all of which are quite useful for manipulating low-level array buffers.
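A quick sketch of these Java 6 additions (Java6LibraryDemo and drain() are hypothetical names, and the sample values are arbitrary):

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

class Java6LibraryDemo {
    // Empties a FIFO queue into a string, showing insertion order is kept.
    static String drain(Queue<String> queue) {
        StringBuilder sb = new StringBuilder();
        while (!queue.isEmpty())
            sb.append(queue.remove());
        return sb.toString();
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<String>();
        queue.add("a");
        queue.add("b");
        queue.add("c");
        System.out.println(drain(queue));  // abc

        int[] buf = {10, 20, 30, 40};
        int[] grown = Arrays.copyOf(buf, 6);           // Padded with zeros
        int[] middle = Arrays.copyOfRange(buf, 1, 3);  // Indexes 1 and 2
        int pos = Arrays.binarySearch(buf, 1, 4, 30);  // Searches indexes [1, 4)
        System.out.println(Arrays.toString(grown));    // [10, 20, 30, 40, 0, 0]
        System.out.println(Arrays.toString(middle));   // [20, 30]
        System.out.println(pos);                       // 2
    }
}
```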


Java SE 7 features that I use

Java 7 introduces a handful of syntactic and library conveniences, which I appreciate a decent amount. These features are definitely more significant than Java 6, and for now they play a much bigger role in my work than Java 8 features.

Generic diamond operator

Although Java 5’s generics improve safety and documentation, the redundant verboseness has always been a sore point that many programmers complain about.

// Required in Java 5 and 6
List<Byte> lst = new ArrayList<Byte>();
Map<String,Long> map = new TreeMap<String,Long>();

// Okay in Java 7 and above
List<Byte> lst = new ArrayList<>();
Map<String,Long> map = new TreeMap<>();

A couple of important notes:

  • Java 5 already has type inference when calling methods that have type parameters (e.g. Collections.max()), but it does not apply to constructors. This means we could actually have even less syntax at the call site by implementing a helper function:

    static <K,V> Map<K,V> newTreeMap() {
      return new TreeMap<K,V>();
    }
    
    Map<String,Long> map = newTreeMap();

  • The diamond operator is not the same as omitting it, i.e. not the same as raw types. Raw types are unsafe and generate lots of compile-time warnings. The diamond operator infers the generic type only when it is safe and unambiguous to do so.

  • This sort of type inference in Java applies to the value being constructed, not to the variable being assigned. It is the opposite behavior of type inference like var x = new List<int>(); (C# 3.0 and above) or auto it = myvec.cbegin(); (C++11 and above).

Try-with-resources statement

Also known as automatic resource management (ARM), this reduces the syntactic burden of writing a finally block and a call to close(). For example:

// Required before Java 7
InputStream in = new FileInputStream(...);
try {
  (...)
} finally {
  in.close();
}

// Okay in Java 7 and above
try (InputStream in = new FileInputStream(...)) {
  (...)
}

Compared to other languages, this puts Java on par with Python’s with statement in the typical use case, and brings Java a little closer to the full power of RAII in C++.
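To see exactly when close() gets called, here is a small sketch with a made-up resource class named Tracked that logs its own lifecycle. It shows that resources are closed in reverse order of declaration, and that they are closed before a catch block on the same statement runs.

```java
import java.util.ArrayList;
import java.util.List;

class TryWithResourcesDemo {
    // Hypothetical resource that records when it is opened and closed
    static class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override public void close() {
            log.add("close " + name);
        }
    }

    // Returns the sequence of events, optionally failing inside the body.
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try (Tracked a = new Tracked("a", log);
             Tracked b = new Tracked("b", log)) {
            log.add("body");
            if (fail)
                throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            log.add("caught " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        // [open a, open b, body, close b, close a]
        System.out.println(run(true));
        // [open a, open b, body, close b, close a, caught boom]
    }
}
```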

Numeric literal syntax

Numeric literals can now contain underscores (which help as thousands separators), and can also be expressed in binary (useful for low-level bit arithmetic) – for example:

1_234_999L == 1234999L
0b11110000 == 0xF0

Switch on strings

Previously, the switch statement could only be applied to int (or any narrower integer type) and, since Java 5, to enum values. Switching on strings is useful for applications like parsing and handling network protocols and file formats.
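A minimal sketch of a string switch, using a hypothetical helper that classifies request method names from a text protocol (the classification is only for illustration):

```java
class StringSwitchDemo {
    // Hypothetical helper: reports whether a request method only reads data.
    // Note that switching on a null string throws NullPointerException.
    static boolean isReadOnly(String method) {
        switch (method) {
            case "GET":
            case "HEAD":
                return true;
            case "POST":
            case "PUT":
            case "DELETE":
                return false;
            default:
                throw new IllegalArgumentException("Unknown method: " + method);
        }
    }

    public static void main(String[] args) {
        System.out.println(isReadOnly("GET"));     // true
        System.out.println(isReadOnly("DELETE"));  // false
    }
}
```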

Multi-catch

I found this to be helpful when writing real applications that need to store the exception in a log and/or display it to the user. For example:

try {
  (...)
} catch (InterruptedException|IOException e) {
  logger.log(Level.SEVERE, "Exception", e);
  e.printStackTrace();
}

Standard library: Various utilities
  • I write explicit null checks at the beginning of many methods as a defensive programming practice. Now instead of writing if (obj == null) throw new NullPointerException();, I simply write Objects.requireNonNull(obj);.

  • Objects.hash() makes it quick and easy to implement a good-enough hashCode() method in your custom class. Of course if hash speed or uniformity were important, you would still write custom hashing logic.

  • The StandardCharsets class defines constant Charset objects like UTF_8, which are safer than string constants like "UTF-8". A Charset (or string) can be given to APIs like new OutputStreamWriter() and String.getBytes().

  • Integer.compare(), Short.compare(), etc. make it easier to write your own compareTo() methods or Comparators that examine primitive integer values. Previously, you either had to subtract numbers (could overflow), or test the three cases <, ==, > manually, or box the values into objects to call compareTo(). Curiously though, Float.compare() and Double.compare() have existed since Java 1.4.

  • The new Path objects (which are like File objects) can be used in the new Files utility class. Files provides powerful functions like copy(), readAttributes(), readAllBytes(), readAllLines(), write(), and much more.
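Several of these utilities fit naturally into one small value class. The sketch below uses a made-up class named Item; the Files and Path functions are left out to keep the example free of file system access.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;

class Java7UtilitiesDemo {
    // Hypothetical value class exercising the java.util.Objects helpers
    static final class Item implements Comparable<Item> {
        final String name;
        final int value;

        Item(String name, int value) {
            this.name = Objects.requireNonNull(name);  // NullPointerException if null
            this.value = value;
        }

        @Override public boolean equals(Object obj) {
            if (!(obj instanceof Item))
                return false;
            Item other = (Item) obj;
            return value == other.value && name.equals(other.name);
        }

        @Override public int hashCode() {
            return Objects.hash(name, value);
        }

        @Override public int compareTo(Item other) {
            // Integer.compare() cannot overflow, unlike value - other.value
            int c = Integer.compare(value, other.value);
            return c != 0 ? c : name.compareTo(other.name);
        }
    }

    // A Charset constant instead of the error-prone string "UTF-8"
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        Item low = new Item("a", Integer.MIN_VALUE);
        Item high = new Item("a", 1);
        System.out.println(low.compareTo(high) < 0);                         // true
        System.out.println(high.equals(new Item("a", 1)));                   // true
        System.out.println(high.hashCode() == new Item("a", 1).hashCode());  // true
        System.out.println(utf8Length("h\u00e9"));                           // 3
    }
}
```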


Java SE 8 features that I use

As of writing, I have been aware of Java 8 features for only about a year (despite the technology being released in 2014). Java 8 is by far the most significant release since SE 5, and might have more raw features and changes than version 5. However, the number of Java 8 features I have adopted and chosen to rely on is much fewer than Java 5 features.

There is still much Java 8 functionality for me to discover, especially in lambdas and streams. Also, some computer systems (such as certain Linux distributions) only provide up to JDK 7, so when I publish code for a wide public audience, I generally stay away from Java 8 features because they might hinder acceptance of my programs.

Better generic type inference

Java 7’s diamond operator is a leap forward in convenience and conciseness, but it doesn’t cover all the cases. For example, the Java 7 compiler makes this incorrect inference:

Map<String,List<Integer>> m = new HashMap<>();
// ^ Okay

m.put("s", new ArrayList<>());
// ^ The method put(String, List<Integer>) in the
// type Map<String,List<Integer>> is not applicable
// for the arguments (String, ArrayList<Object>)

This type-inference behavior is fixed in Java 8, and the second line of code compiles correctly as you would expect.

Capture of effectively final variables

An “effectively final” local variable is a variable that isn’t declared as final, but hypothetically adding the final modifier would be legal and not result in a compile-time error. For example:

// Never final
int i = 0;
i++;  // Reassignment

// Explicitly final
final int j = 1;

// Effectively final (if we don't add more code)
int k = 2;
int[] m = {3};
m[0]++;  // Okay

new Runnable() {
  public void run() {
    print(i);  // Never legal
    print(j);  // Always legal
    print(k);  // Okay in Java 8 and above
    print(m[0]);  // Okay in Java 8 and above
  }
};

When creating an anonymous inner class within a method, the rule is that it can only capture local variables (and parameters) that are final. In Java 7 and earlier, the captured variables/parameters must be literally declared as final, but Java 8 relaxes this restriction so that effectively final variables are allowed too. Thus, we can reduce syntactical noise when writing anonymous classes and lambda functions (the latter being new in Java 8).

Standard library
  • Base64’s encoder and decoder are really helpful in those few times that Base64-encoded text comes up in web/HTTP/email programming. It’s not hard to write your own encoder or decoder from scratch (around 20 lines per direction), but why go through the pain and risk having bugs?

  • Integer adds functions to treat the 32-bit int type as an unsigned value, and likewise for Long. This comes up rarely (like in making your own bigint implementation), but it’s still good to see official library support for these features. The functions can be emulated with a modest amount of knowledge and effort, and these ideas are discussed on my page Unsigned int considered harmful for Java. (I argue that adding new primitive unsigned types like uint is bad, but adding functions to treat the existing int as unsigned is just fine.)

  • BigInteger.multiply() became much faster for huge numbers. The naive Θ(n^2) algorithm is now supplemented by Karatsuba (Θ(n^1.585)) and 3-way Toom–Cook (Θ(n^1.465)) algorithms. (Incidentally, this makes my Java Karatsuba multiplication implementation redundant.)
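A short sketch of the Base64 and unsigned-integer functions (Java8LibraryDemo and its encode()/decode() wrappers are hypothetical names; the sample inputs are arbitrary):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Java8LibraryDemo {
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encode("hello"));     // aGVsbG8=
        System.out.println(decode("aGVsbG8="));  // hello

        // Treat the bit pattern of -1 as the unsigned value 2^32 - 1.
        int allOnes = -1;
        System.out.println(Integer.toUnsignedString(allOnes));        // 4294967295
        System.out.println(Integer.toUnsignedLong(allOnes));          // 4294967295
        System.out.println(Integer.divideUnsigned(allOnes, 2));       // 2147483647
        System.out.println(Integer.compareUnsigned(allOnes, 1) > 0);  // true
    }
}
```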

Interface default methods

I haven’t used this language feature yet, but I think it would make me redesign some of my old class hierarchies and abstract classes as interfaces instead.
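As a rough idea of what such a redesign could look like, here is a sketch with a made-up Shape interface. Shared behavior that would otherwise live in an abstract base class sits in default methods, and implementing classes may still override it.

```java
class DefaultMethodDemo {
    // Hypothetical interface: one abstract method plus shared default behavior
    interface Shape {
        double area();

        default boolean largerThan(Shape other) {
            return area() > other.area();
        }

        default String describe() {
            return "shape with area " + area();
        }
    }

    // Inherits both default methods unchanged.
    static final class Square implements Shape {
        private final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    // Overrides one default method, as a subclass of an abstract class could.
    static final class Rectangle implements Shape {
        private final double width, height;
        Rectangle(double width, double height) {
            this.width = width;
            this.height = height;
        }
        public double area() { return width * height; }
        @Override public String describe() {
            return "rectangle " + width + " by " + height;
        }
    }

    public static void main(String[] args) {
        Shape sq = new Square(2);
        Shape rect = new Rectangle(1, 3);
        System.out.println(sq.describe());        // shape with area 4.0
        System.out.println(rect.describe());      // rectangle 1.0 by 3.0
        System.out.println(sq.largerThan(rect));  // true
    }
}
```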
