Skip to main content

Posts

Showing posts with the label Java

Update to the Allocation Instrumenter

A couple of years ago, I open-sourced a tool that lets you instrument all of the allocations in your Java program . Some quick updates today: It now supports JDK7. You can now optionally use it to instrument constructors of a particular type, instead of allocations. So, if you wanted to find out where your Thread s were being instantiated, you could instrument them with a callback that invokes Thread.dumpStack(). I know that most of the readers of this blog are more interested in concurrency, and that it has been a very, very long time since I posted. The spirit is willing, but the flesh is insanely busy... I have it in mind to do a post that discusses a bunch of strategies for doing efficient non-blocking concurrency in C++, because the attempts are very amusing, but I haven't had the chance to sit down and do it.

Scala / Java Shootout

I should probably comment on this paper from my colleague Robert Hundt , where he writes the same benchmark in Java, Scala, Go, Python and C++ ( TL;DR here ). Short version: Initially, C++ was faster than Scala, and Scala was faster than Java. I volunteered to make some improvements, which ended up with the C++ and Java implementations being about the same speed. Some C++ hackers then went and improved the C++ benchmark by another ~6x, and I didn't think I should go down that road. Imagine my surprise when I got this email from a friend of mine this morning: Date: Fri, 3 Jun 2011 03:20:48 -0700 Subject: Java Tunings To: Jeremy "Note that Jeremy deliberately refused to optimize the code further, many of the C++ optimizations would apply to the Java version as well." Slacker. Gee, thanks for making me look good, Robert. (Actually, I had approved that language, but I didn't realize anyone would actually read the paper. :) ) Anyway, I was not just being obdurate here;...

Java Puzzlers@Google I/O

Josh Bloch and I gave one of his Java Puzzlers talk at Google I/O this year. If you hate Java, you can waste a perfectly good hour listening to us make unfunny jokes at its expense.

Cliff Click in "A JVM Does What?"

Cliff Click gave a very good talk at Google last week. For your viewing enjoyment: The slides are available here . (For those who watched it: The inventors of bytecode were in the audience. I think he was just used to making that joke elsewhere.)

Why Many Profilers have Serious Problems (More on Profiling with Signals)

This is a followon to a blog post I made in 2007 about using signal handling to profile Java apps . I mentioned in another post why this might be a good idea , but I wanted to expand on the theme. Hiroshi Yamauchi and I are giving a talk at this year's JavaOne . Part of this talk will be about our experiences writing profilers at Google, and in preparing for it, I realized my previous entry on this subject only told half the story. The topic of the original post is that there is an undocumented JVMTI call in OpenJDK that allows you to get a stack trace from a running thread, regardless of the state of that thread. In Unix-like systems, you can use a SIGPROF signal to call this function at (semi-)regular intervals, without having to do anything to your code. The followon post, intended to describe why you would use this, described a couple of things that are true: That it tells you what is actually running on your CPU, not just what might be scheduled, which is what profilers ...

Garbage Collection, [Soft]References, Finalizers and the Memory Model (Part 2)

In which I ramble on at greater length about finalizers and multithreading, as well as applying that discussion to SoftReferences . Read Part 1 first . It explains why finalizers are more multithreaded than you think, and why they, like all multithreaded code, aren't necessarily going to do what you think they should do. I mentioned in the last entry that a co-worker asked me a question about when objects are allowed to be collected. My questioner had been arguing with his colleagues about the following example: Object f() { Object o = new Object(); SoftReference<Object> ref = new SoftReference<Object>(o); return ref.get(); } He wanted to know whether f() was guaranteed to return o , or whether the new Object could be collected before f() returned. The principle he was operating under was that because the object reference was still on the stack, the object would not be collected. Those of you who read the last entry will know better, of course. To put it in...

A Note on the Thread (Un)safety of Format Classes

I recently got a note on another blog post asking this question: I have a general question on the thread safety and this is not directly related with your blog. I would appreciate if you could post it on your blog. I have a class that has only one static method that accepts two string parameters and does some operation locally (as shown below). Do I need to synchronize this method? public class A {   public static boolean getDate(String str, String str2){     SimpleDateFormat formatter =       new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");     boolean isBefore = false;     Date date1;     Date date2;     try {       date1 = formatter.parse(str);       date2 = formatter.parse(str);       isBefore = date1.after(date2);     } catch (Exception...

Allocation Instrumenter for Java

In brief: We've open sourced a tool that allows you to provide a callback every time your program performs an allocation. The Java Allocation Instrumenter can be found here. Give it a whirl, if you are interested. One thing that crops up a lot at my employer is the need to take an action on every allocation. This can happen in a lot of different contexts: The programmer has a task, and wants to know how much memory the task allocates, so wants to increment a counter on every allocation. The programmer wants to keep a histogram of most frequently accessed call sites. The programmer wants to prevent a task from allocating too much memory, so it keeps a counter on every allocation and throws an exception when the counter reaches a certain value. Because of the demand for this, a few of us put together a tool that instruments your code and invokes a callback on every allocation. The Allocation Instrumenter is a Java agent written using the java.lang.instrument API and ASM . Each al...

Cliff Click on Java-vs-C Performance

Cliff Click has a terrific post on Java-vs-native-code performance . Recommended. Cliff is the Chief JVM Architect at Azul Systems, was the lead designer behind Hotspot's server JIT compiler, and is an all-around smart guy. His main drawback is that he (apparently) gets dragged into flamewars on YouTube's comment section.

How Hotspot Decides to Clear SoftReferences

I got asked about this twice in one day, and I didn't know the answer, so I sat down and puzzled it out a bit. A SoftReference is a reference that the garbage collector can decide to clear if it is the only reference left to an object, and the GC decides (through some undefined process) that there is enough memory pressure to warrant clearing it. SoftReferences are generally used to implement memory-sensitive caches. A mistake many people make is to confuse it with a WeakReference , which is a reference that, if it is the only reference left to the object, the garbage collector will aggressively clear. I got asked how Hotspot's GC actually decides whether to clear SoftReferences. Twice. On the same day. (Hotspot is the Sun / OpenJDK JVM). The answer was that I had no idea, so I had to go look it up. Here is what I found out; I hope it is a useful reference for someone (pun intended). First, there is a global clock variable that is set with the current time (in millis)...

Faster Logging with Faster Logger Classes

Today, I'll discuss a little tweak I made to java.util.Logging that made my logging throughput double. I want to use it as an illustration that it often isn't very difficult to improve the performance of concurrent code by doing things that are actually pretty easy to do. So, "I" have an application that is running a couple of hundred threads on an 8-core machine, and it wants to log about 2MB a second using java.util.Logger . When I say "I have", I actually mean "someone else has", because if "I" had to log a megabyte a second, there is no way I would use java.util.Logger to do it. Still, we all make our choices. When I came to this code, it was already doing sensible things like buffering its output. It wasn't doing something more complicated, like using a dedicated logging thread. It could log about 1MB a second, and was chewing through CPU pretty rapidly. I just decided to run our profiler on the code and see where the bot...

Small Language Changes for JDK7

For those who haven't been paying attention, there is a new project from Sun to attract proposals for small language changes to JDK7 . Even in the first day of open calls for proposals , the entries have been very interesting: Strings in Switch, a proposal from Joe Darcy to be able to use Strings in switch statements. Automated Resource Blocks, a proposal from Josh Bloch to be able to say things like: try (BufferedReader br = new BufferedReader(new FileReader(path)) { return br.readLine(); } instead of: BufferedReader br = new BufferedReader(new FileReader(path)); try { return br.readLine(); } finally { br.close(); } based on having BufferedReader implement a Disposable interface. Block expressions, a proposal from Neal Gafter to be allow things like: double pi2 = (double pi = Math.PI ; pi*pi)**; // pi squared* Basically, the principle here is that the (parenthesis-delimited) block would return a value. Exception handling improvements, another proposal from Neal Gaft...

Date-Race-Ful Lazy Initialization for Performance

I was asked a question about benign data races in Java this week, so I thought I would take the opportunity to discuss one of the (only) approved patterns for benign races. So, at the risk of encouraging bad behavior (don't use data races in your code!), I will discuss the canonical example of "benign races for performance improvement". Also, I'll put in another plug for Josh Bloch's new revision of Effective Java (lgt amazon) , which I continue to recommend. As a reminder, basically, a data race is when you have one (or more) writes, and potentially some reads; they are all to the same memory location; they can happen at the same time; and that there is nothing in the program to prevent it. This is different from a race condition , which is when you just don't know the order in which two actions are going to occur. I've put more discussion of what a data race actually is at the bottom of this post. A lot of people think that it is okay to have a data...

What Volatile Means in Java

Today, I'm going to talk about what volatile means in Java. I've sort-of covered this in other posts, such as my posting on the ++ operator , my post on double-checked locking and the like, but I've never really addressed it directly. First, you have to understand a little something about the Java memory model. I've struggled a bit over the years to explain it briefly and well. As of today, the best way I can think of to describe it is if you imagine it this way: Each thread in Java takes place in a separate memory space (this is clearly untrue, so bear with me on this one). You need to use special mechanisms to guarantee that communication happens between these threads, as you would on a message passing system. Memory writes that happen in one thread can "leak through" and be seen by another thread, but this is by no means guaranteed. Without explicit communication, you can't guarantee which writes get seen by other threads, or even the order in whic...

G1 Garbage Collector in Latest OpenJDK Drop

ETA: The G1 collector is now in the latest JDK6 release, which you can download from Sun here . You can enable it for testing with the flags -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC . Original post follows. For those of you who are interested in tinkering with new garbage collectors, the "Garbage-first" G1 garbage collector is in the new JDK7 drop from OpenJDK . I haven't tried it yet, but I am looking forward to seeing what it does. (You will notice the link doesn't say anything about the new GC. Here's the appropriate bug .) G1 is supposed to provide a dramatic improvement on existing GCs. There was a rather good talk about it at this year's JavaOne. It allows the user to provide pause time goals, both in terms of actual seconds and in terms of percentage of runtime. The principle is simple: the collector splits the heap up into fixed-size regions and tracks the live data in those regions. It keeps a set of pointers — the "remembered set...

Don't Use StringBuffer!

One thing I have noticed among Java API users is that some don't seem to know that there is a difference between StringBuffer and StringBuilder. There is: StringBuffer is synchronized, and StringBuilder isn't. (The same is true for Vector and ArrayList, as well as Hashtable and HashMap). StringBuilder avoids extra locking operations, and therefore you should use it where possible. That's not the meat of this post, though. Among those who know that StringBuffer is synchronized, they sometimes think that it has magical thread-safety properties. Here's a simplified version of some code I wrote recently final StringBuilder sb = new StringBuilder(); Runnable runner = new Runnable() { @Override public void run() { sb.append("1"); } }; Thread t = new Thread(runner); t.start(); try { t.join() } catch (InterruptedException e) {} // use sb. A fellow looked at this code, and said that he thought I should use StringBuffer, because it is safer with multiple th...

Top 3 Reasons Why Constructors are Worthless

In my last post, I made an offhand reference to the fact that object constructors are worthless in Java. I was asked why this is, so I thought I'd fill in the details. This topic is a little more well-understood by the developer community than some of the concurrency tidbits that I usually discuss. People who use dependency injection toolkits like Guice and Spring 's inversion of control tend to have very strong feelings on the advanced suckitude of constructors. I know quite a few developers who claim that use of Guice changed the way they code (for the better). ETA: I've had a few people tell me just to use SPI. The point of this post is not that DI is wonderful, it is that constructors are awful. SPI doesn't stop constructors from being awful. In fact, it is yet another horrible solution, what with all of those Foo and FooImpl classes. If you like SPI, go ahead and use it. Furthermore, neither DI nor SPI addresses two of the three issues I mentioned. So, wi...

Immutability in Java, Part 3: Deserialization and Reflection

This time, I'll talk about deserialization, immutability and final fields. I'll try to be a little shorter. I spent the last two posts on immutability ( here and here ) talking about what it takes to make an object's fields immutable. This advice mostly boiled down to "it has to be final and set before the constructor ends". When you have deserialization, though, the fields can't be set before the constructor ends. An instance object is constructed using the default constructor for the object, and the fields have to be filled in later. We gave deserialization special semantics, so that when an object was constructed this way, it would behave as if it had been constructed by an ordinary constructor. No problems there. The problems come when and if you need to write custom deserialization (using, for example, the readObject method of ObjectInputStream ). You can do this using reflection: Class cl = // get the class instance Field f = cl.getDeclaredFiel...

Immutability in Java, Part 2

I'd like to talk a little more about what it takes to ensure thread-safe immutability in Java, following on from a (semi)recent post I made on the subject. The basic gist of that post was that if you make data immutable, then they can be shared between threads without additional synchronization. I call this "thread-safe immutability". In that post, I said this: Now, in common parlance, immutability means "does not change". Immutability doesn't mean "does not change" in Java. It means "is transitively reachable from a final field, has not changed since the final field was set, and a reference to the object containing the final field did not escape the constructor". I just wanted to go over these points again, because I get a lot of questions about them and there are a couple of things I glossed over. Let's take them one by one. Immutability means... the object is transitively reachable from a final field What this means is that y...

Double Checked Locking

I still get a lot of questions about whether double-checked locking works in Java, and I should probably post something to clear it up. And I'll plug Josh Bloch's new book, too. Double Checked Locking is this idiom: // Broken -- Do Not Use! class Foo {   private Helper helper = null;   public Helper getHelper() {     if (helper == null) {       synchronized(this) {         if (helper == null) {           helper = new Helper();         }       }     }   return helper; } The point of this code is to avoid synchronization when the object has already been constructed. This code doesn't work in Java. The basic principle is that compiler transformations (this includes the JIT, which is the optimizer that the JVM uses...