Tuesday, December 29, 2009

JVM process memory

Have you ever wondered what consumes memory in a JVM process? Here is most of the list:
  • The Java heap. The maximum size is controlled by flag -Xmx. This is where Java objects are allocated.
  • The permanent generation (perm gen) heap. The maximum size is controlled by -XX:MaxPermSize. The default is 64MB on Linux/x86. This is where the JVM-level class metadata objects, interned strings (String.intern), and JVM-level symbol data are allocated. This often fills up unexpectedly when you use dynamic code/class generation in your application.
  • The code cache. The JIT compiled native code is allocated here.
  • The memory-mapped .jar and .so files. The JDK's standard class library jar files and the application's jar files are often memory mapped (typically only parts of the files). Various JDK shared library files (.so files) and application shared library files (JNI) are also memory mapped.
  • The thread stacks. The maximum size of a thread's stack is controlled by the flag -Xss or -XX:ThreadStackSize. On Linux/x86, the default is 320KB per thread.
  • The C/malloc heap. Both the JVM itself and any native code (either the JDK's or the application's) typically use malloc to allocate memory from this heap. NIO direct buffers are allocated via malloc on Linux/x86.
  • Any other mmap calls. Any native code can call mmap to allocate pages in the address space.
As a side note, most of the above are allocated lazily. That is, they are reserved as virtual memory early on, but physical pages are committed only on demand. Your application's physical memory use (RSS) may look small under light load and then grow substantially under heavy load. The takeaway is that it makes sense to consider all of the above factors when diagnosing memory footprint problems in the JVM.
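
To make the lazy allocation point concrete for the Java heap, here is a minimal sketch that compares how much heap the JVM may grow to with how much it has claimed so far. The class name HeapCommitReport and its methods are made up for this illustration. It shows only the JVM's own view of the Java heap, not the process RSS, and it says nothing about the other items in the list.

```java
class HeapCommitReport {
    /** Returns {used, committed, max} for the Java heap, in bytes. */
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory();       // heap the JVM has claimed so far
        long used = committed - rt.freeMemory(); // part of that currently holding objects
        long max = rt.maxMemory();               // upper bound, normally set by -Xmx
        return new long[] { used, committed, max };
    }

    static String mb(long bytes) {
        return (bytes / (1024 * 1024)) + "MB";
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.println("heap used      = " + mb(s[0]));
        System.out.println("heap committed = " + mb(s[1]));
        System.out.println("heap max       = " + mb(s[2]));
    }
}
```

Under light load the committed figure is usually well below the maximum, and it tends to grow toward it as the application allocates more.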


JamesB said...

So how do you get a breakdown of where the memory is used in a running JVM?

Hiroshi Yamauchi said...

There is no easy way, unfortunately.

The data about the heap, the perm gen and the code cache are accessible via the java.lang.management API or the JDK tools like jmap.
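
As a rough sketch of the java.lang.management route, the following lists every memory pool the JVM reports, with its used, committed and maximum sizes. The class name PoolBreakdown is made up for this example. Pool names differ between JVM versions and garbage collectors, so the code prints whatever is reported instead of looking pools up by name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class PoolBreakdown {
    /** Builds one line per memory pool known to the running JVM. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage();
            if (u == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined
            String max = u.getMax() < 0 ? "undefined" : (u.getMax() / 1024) + "KB";
            lines.add(String.format("%-36s %-16s used=%dKB committed=%dKB max=%s",
                    pool.getName(), pool.getType(),
                    u.getUsed() / 1024, u.getCommitted() / 1024, max));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

The heap pools together make up the Java heap, and the non-heap pools cover areas such as the perm gen and the code cache. None of the pools account for thread stacks, the malloc heap or memory-mapped files.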

The memory-mapped files are visible via /proc/pid/smaps on Linux.
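
Here is a minimal sketch of how the smaps output could be summarized, assuming the usual Linux layout: a header line per mapping (address range, permissions, offset, device, inode, optional path) followed by "Name: value kB" lines. The class name SmapsRss and the "[anonymous]" label are made up for this example. It sums the Rss values per path and lumps all unnamed mappings together.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SmapsRss {
    // Header line: address range, perms, offset, dev, inode, optional path
    private static final Pattern HEADER = Pattern.compile(
            "^[0-9a-f]+-[0-9a-f]+\\s+\\S+\\s+[0-9a-f]+\\s+\\S+\\s+\\d+\\s*(.*)$");

    /** Sums the Rss values (in kB) of all mappings that share a name. */
    static Map<String, Long> rssByMapping(List<String> lines) {
        Map<String, Long> rssKb = new LinkedHashMap<>();
        String current = null;
        for (String line : lines) {
            Matcher m = HEADER.matcher(line);
            if (m.matches()) {
                String path = m.group(1).trim();
                current = path.isEmpty() ? "[anonymous]" : path;
            } else if (current != null && line.startsWith("Rss:")) {
                String[] parts = line.trim().split("\\s+"); // "Rss:", "40", "kB"
                rssKb.merge(current, Long.parseLong(parts[1]), Long::sum);
            }
        }
        return rssKb;
    }

    public static void main(String[] args) throws IOException {
        // Pass a pid as the first argument, or omit it to inspect this JVM itself.
        String pid = args.length > 0 ? args[0] : "self";
        // ISO-8859-1 decodes any byte sequence, so unusual path names cannot fail.
        List<String> lines = Files.readAllLines(
                Paths.get("/proc", pid, "smaps"), StandardCharsets.ISO_8859_1);
        rssByMapping(lines).forEach((name, kb) -> {
            if (kb > 0) {
                System.out.println(kb + " kB  " + name);
            }
        });
    }
}
```

The jar and .so files show up under their own paths, while everything without a path ends up in the single "[anonymous]" total.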

The others (thread stacks, malloc heap) are much harder to figure out because they are simply anonymous memory regions in /proc/pid/smaps.