Scanning Java Class Files #2: Magic, Minor and Major

Marcio Endo, Mar 2, 2025

In the previous post of this series, we used the javap tool to inspect Java class files. In doing so, we learned that Java class files contain a section called the constant pool, which stores, among other things, the string literals defined in source code and the names of any CLASS and RUNTIME annotations applied to the class.

In this post, we'll learn about the ClassFile structure and a few of its data types. We'll see that it begins with the magic number, followed by the minor and major version numbers, and that these three items immediately precede the constant pool in a Java class file. We'll also begin our Java class scanner implementation by reading these three values.

A Note on the Class-File API

The JDK provides an API for reading class files. However, in this blog post series, we won’t be using it. The API abstracts away the low-level details of the Java class file format that we want to understand as part of this series.

Our Class File

To ensure our implementation is correct, we'll compare its output to the one from the javap tool. We'll use the same class from the previous post:

void main() {
  IO.println("Hello, World!");
}

We are using JDK 23 with preview language features enabled:

$ javac --enable-preview --release 23 HelloWorld.java

And here's the trimmed output from javap to guide us during our implementation:

Classfile /tmp/HelloWorld.class
  minor version: 65535
  major version: 67

Let's begin our implementation: we'll have it print the first line of the output.

01: Print the Full Path of the Class File

The first line of the output contains the full path name of the class file being scanned, so we write the following code:

void main(String[] args) {
  final String pathName;
  pathName = args[0];

  final Path file;
  file = Path.of(pathName);

  System.out.printf("ClassFile %s%n", file.toAbsolutePath());
}

It is an instance main method of an implicitly declared class. It expects the first argument used to invoke the application to contain the path name to the class file. Let's run it:

$ java --enable-preview ClassFile2.java HelloWorld.class
ClassFile /tmp/HelloWorld.class

It works, but it throws an ArrayIndexOutOfBoundsException at the first statement if we run it with no arguments.

02: Print a "Usage" Message

To prevent the out-of-bounds exception from being thrown, we'll add the following check:

void main(String[] args) {
  if (args.length == 1) {
    execute(args);
  } else {
    println("Usage: java --enable-preview ClassFile2.java <class file>");
  }
}

private void execute(String[] args) {
  ...
}

If we have exactly one argument, we continue executing the program. Otherwise, we print a "Usage" message.

Let's execute the program with no arguments:

$ java --enable-preview ClassFile2.java
Usage: java --enable-preview ClassFile2.java <class file>

And with two arguments:

$ java --enable-preview ClassFile2.java HelloWorld.class Second.class
Usage: java --enable-preview ClassFile2.java <class file>

That's good enough.

The Java Class File Format

To inspect a Java class file in a manner similar to how the javap tool does, we need to know how the class file is structured. Thankfully, we can find all the required information in the Java Virtual Machine Specification (JVMS). It is laid out in Chapter 4, titled The class File Format.

The ClassFile Structure

In the first section of The class File Format chapter, we find the definition of the ClassFile structure:

ClassFile {
    u4             magic;
    u2             minor_version;
    u2             major_version;
    u2             constant_pool_count;
    cp_info        constant_pool[constant_pool_count-1];
    u2             access_flags;
    u2             this_class;
    u2             super_class;
    u2             interfaces_count;
    u2             interfaces[interfaces_count];
    u2             fields_count;
    field_info     fields[fields_count];
    u2             methods_count;
    method_info    methods[methods_count];
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}

A Java class file is a stream of bytes that conforms to this ClassFile structure. It is made of items, each having a type and a name.

The first item in any Java class file is named magic and its type is u4. A u4 type is an unsigned four-byte quantity stored in big-endian order. The second item in any Java class file is named minor_version and its type is u2. A u2 type is an unsigned two-byte quantity stored in big-endian order. The third item in any Java class file is named major_version and its type is also u2. And so on.
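As a rough illustration of this layout, the sketch below builds the first eight bytes of a class file by hand and prints them in hexadecimal. The HeaderLayoutSketch class and its header method are hypothetical names, and the sketch leans on java.nio.ByteBuffer only because it writes in big-endian order by default; it is not part of the scanner built in this post.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: the first three ClassFile items occupy bytes 0 to 7.
final class HeaderLayoutSketch {

  // Lays out magic (u4), minor_version (u2) and major_version (u2) in big-endian order.
  static byte[] header(int magic, int minor, int major) {
    return ByteBuffer.allocate(8)  // a new ByteBuffer is big-endian by default
        .putInt(magic)             // offsets 0..3: u4 magic
        .putShort((short) minor)   // offsets 4..5: u2 minor_version
        .putShort((short) major)   // offsets 6..7: u2 major_version
        .array();
  }

  public static void main(String[] args) {
    byte[] data = header(0xCAFEBABE, 65535, 67);

    StringBuilder hex = new StringBuilder();
    for (byte b : data) {
      hex.append(String.format("%02X ", b));
    }

    System.out.println(hex.toString().trim()); // CA FE BA BE FF FF 00 43
  }
}
```

The values 65535 and 67 are the minor and major versions that javap reported for HelloWorld.class, so these eight bytes are what we expect to find at the start of that file.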

Items like constant_pool, fields, methods and attributes are tables whose entries have structured types of their own. For example, each entry of the constant pool has the cp_info type. This type is defined elsewhere in Chapter 4, and it will be discussed later in this series.

Right now, we're interested in reading the first three items: magic, minor_version and major_version. Before we can do that, we need to load the contents of the file into memory.

Read the Class File into Memory

In the previous post, we learned that the constant pool is referenced by other parts of a class file. For example, code in a method body might reference a string literal from the constant pool. So, it might be useful to keep the constant pool in memory. For this reason, we will read the full contents of the class file into memory. Here's the code:

private byte[] data;

private void readClassFile(String[] args) throws IOException {
  final String pathName;
  pathName = args[0];

  final Path file;
  file = Path.of(pathName);

  System.out.printf("ClassFile %s%n", file.toAbsolutePath());

  data = Files.readAllBytes(file);
}

After printing the absolute path to the class file, we read its full contents, storing them in an instance byte array field named data.

EOF Check

We have the full contents of the specified file in memory, and we expect it to be a valid class file. But it might not be. So, every time we need to read a byte from the array, we must first check that there's actually a byte to be read. In other words, we must first check that we haven't reached the end of the file (EOF). Here's the code:

private int idx;

private void check(int required) throws IOException {
  final int remaining;
  remaining = data.length - idx;

  if (remaining < required) {
    throw new EOFException("Unexpected end of file: required=" + required);
  }
}

We've created the int instance field named idx. It serves as the index to the component of the data byte array that is currently under examination.

The check method computes the total number of bytes remaining to be read in the data byte array. If this number is less than the required amount, an exception is thrown indicating an unexpected end of the file.
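To see the check in action, here is a minimal standalone sketch. The BoundsCheckSketch class and its canRead helper are hypothetical and exist only for this illustration; the check method mirrors the one above, applied to a deliberately truncated byte array.

```java
import java.io.EOFException;

// Hypothetical standalone version of the bounds check, for illustration only.
final class BoundsCheckSketch {

  private final byte[] data;
  private int idx;

  BoundsCheckSketch(byte[] data) {
    this.data = data;
  }

  // Throws if fewer than 'required' bytes remain from the current index.
  void check(int required) throws EOFException {
    int remaining = data.length - idx;
    if (remaining < required) {
      throw new EOFException("Unexpected end of file: required=" + required);
    }
  }

  // Returns true when 'required' bytes can still be read, false otherwise.
  boolean canRead(int required) {
    try {
      check(required);
      return true;
    } catch (EOFException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // A truncated input: only three of the four magic bytes are present.
    byte[] truncated = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA};
    BoundsCheckSketch sketch = new BoundsCheckSketch(truncated);

    System.out.println(sketch.canRead(2)); // true
    System.out.println(sketch.canRead(4)); // false
  }
}
```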

Read a u4 Value

The first item of the class file is the magic number. It is a u4 value, i.e., an unsigned four-byte quantity stored in big-endian order. In other words, a u4 value is made of 32 bits and, therefore, fits in an int value. We just need to remember that u4 is unsigned while int is signed, so Integer.toUnsignedLong might eventually be required.
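The signed versus unsigned distinction can be sketched as follows. The UnsignedU4Sketch class and its asUnsigned method are hypothetical names used only here; the sketch shows how the same 32 bits print as a negative int but as a positive value once widened with Integer.toUnsignedLong.

```java
// Hypothetical sketch: the same 32 bits seen as a signed int and as an unsigned u4.
final class UnsignedU4Sketch {

  // Interprets the 32 bits of an int as an unsigned four-byte quantity.
  static long asUnsigned(int u4) {
    return Integer.toUnsignedLong(u4);
  }

  public static void main(String[] args) {
    int magic = 0xCAFEBABE;

    System.out.println(magic);                      // -889275714
    System.out.println(asUnsigned(magic));          // 3405691582
    System.out.println(Integer.toHexString(magic)); // cafebabe
  }
}
```

Comparisons with == against a hexadecimal literal such as 0xCAFEBABE are unaffected, since both sides hold the same bit pattern. The conversion only matters when a u4 value is printed in decimal or compared by magnitude.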

Here's a method for reading a u4 value:

private int readU4() throws IOException {
  check(4);

  byte b0 = data[idx++];
  byte b1 = data[idx++];
  byte b2 = data[idx++];
  byte b3 = data[idx++];

  int v0 = toInt(b0, 24);
  int v1 = toInt(b1, 16);
  int v2 = toInt(b2, 8);
  int v3 = toInt(b3, 0);

  return v0 | v1 | v2 | v3;
}

private int toInt(byte b, int shift) {
  return Byte.toUnsignedInt(b) << shift;
}

First, the check(4) call ensures there are at least 4 bytes in the data array from the current idx position. The method then reads four bytes, b0, b1, b2, and b3, from the data array, incrementing idx after each read to advance the position. They represent the most significant to the least significant bytes of the u4 value.

Let's say the four bytes are [0xCA, 0xFE, 0xBA, 0xBE]. To construct the final 32-bit integer, we have the following:

data = [0xCA, 0xFE, 0xBA, 0xBE];

0x000000CA << 24 = 0xCA000000 OR
0x000000FE << 16 = 0x00FE0000 OR
0x000000BA <<  8 = 0x0000BA00 OR
0x000000BE <<  0 = 0x000000BE
                   ----------
                   0xCAFEBABE

Here's what's happening:

  • First, each byte is converted to an int by an unsigned conversion.

  • The first value 0x000000CA is shifted to the left by 24 bits, resulting in 0xCA000000.

  • The second value 0x000000FE is shifted to the left by 16 bits, resulting in 0x00FE0000.

  • The third value 0x000000BA is shifted to the left by 8 bits, resulting in 0x0000BA00.

  • The fourth value 0x000000BE is shifted to the left by 0 bits, resulting in 0x000000BE.

  • These values are then combined with bitwise OR operations, resulting in 0xCAFEBABE.
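The unsigned conversion in the first step matters because Java's byte type is signed. The following sketch, with the hypothetical class SignExtensionSketch and method names chosen for this illustration, compares the steps above against a variant that relies on Java's implicit byte-to-int promotion, which sign-extends bytes such as 0xCA.

```java
// Hypothetical sketch: why each byte needs an unsigned conversion before shifting.
final class SignExtensionSketch {

  // Unsigned conversion, shift, then bitwise OR, as in the steps above.
  static int unsignedCombine(byte b0, byte b1, byte b2, byte b3) {
    return Byte.toUnsignedInt(b0) << 24
        | Byte.toUnsignedInt(b1) << 16
        | Byte.toUnsignedInt(b2) << 8
        | Byte.toUnsignedInt(b3);
  }

  // Same shifts, but each byte is implicitly promoted to int with sign extension.
  static int signExtendedCombine(byte b0, byte b1, byte b2, byte b3) {
    return b0 << 24 | b1 << 16 | b2 << 8 | b3;
  }

  public static void main(String[] args) {
    byte b0 = (byte) 0xCA;
    byte b1 = (byte) 0xFE;
    byte b2 = (byte) 0xBA;
    byte b3 = (byte) 0xBE;

    System.out.printf("%08X%n", unsignedCombine(b0, b1, b2, b3));     // CAFEBABE
    System.out.printf("%08X%n", signExtendedCombine(b0, b1, b2, b3)); // FFFFFFBE
  }
}
```

In the second variant, the byte 0xBE is promoted to 0xFFFFFFBE, and its leading one bits overwrite the three more significant bytes during the OR.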

03: Validate the Magic Number

With our readU4 method implemented, we can read the first item of the ClassFile structure, the magic number. It has a fixed value of 0xCAFEBABE. So, if the file we're reading does not have the correct value, it is an invalid class file:

private void verifyMagic() throws IOException {
  final int magic;
  magic = readU4();

  if (magic != 0xCAFEBABE) {
    throw new IOException("Invalid class file: magic not found");
  }
}

Let's run our implementation on a PDF file:

$ java --enable-preview ClassFile2.java report.pdf
ClassFile /tmp/report.pdf
Exception in thread "main" java.io.IOException: Invalid class file: magic not found
  at ClassFile2.verifyMagic(ClassFile2.java:57)
  at ClassFile2.execute(ClassFile2.java:33)
  at ClassFile2.main(ClassFile2.java:24)

Next, let's run on a proper class file:

$ java --enable-preview ClassFile2.java HelloWorld.class
ClassFile /tmp/HelloWorld.class

No exception was thrown.
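One possible variation, not used in this series, is to report the value that was actually found. The sketch below assumes a hypothetical MagicMessageSketch class with a describe method, and it uses java.nio.ByteBuffer for brevity instead of our readU4 method. The sample input is just the text "%PDF-1.7", standing in for the start of a PDF file, since PDF files begin with the characters %PDF.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical sketch: a magic check that names the value it found.
final class MagicMessageSketch {

  static final int MAGIC = 0xCAFEBABE;

  // Returns an error message, or an empty Optional when the magic number matches.
  static Optional<String> describe(byte[] data) {
    if (data.length < 4) {
      return Optional.of("Invalid class file: fewer than 4 bytes");
    }

    int found = ByteBuffer.wrap(data).getInt(); // big-endian by default

    if (found == MAGIC) {
      return Optional.empty();
    }

    return Optional.of(String.format(
        "Invalid class file: expected magic 0x%08X but found 0x%08X", MAGIC, found));
  }

  public static void main(String[] args) {
    // Assumed sample input: the first bytes of a PDF file.
    byte[] pdf = "%PDF-1.7".getBytes(StandardCharsets.US_ASCII);

    // Invalid class file: expected magic 0xCAFEBABE but found 0x25504446
    System.out.println(describe(pdf).orElse("ok"));
  }
}
```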

Read a u2 Value

The second and third items of the ClassFile structure are of the u2 type. We need a method for reading this kind of value:

private int readU2() throws IOException {
  check(2);

  byte b0 = data[idx++];
  byte b1 = data[idx++];

  int v0 = toInt(b0, 8);
  int v1 = toInt(b1, 0);

  return v0 | v1;
}

It works the same way as the readU4 method does, so we won't discuss it.
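For completeness, here is a small worked example of the two-byte case. The U2Sketch class is hypothetical, and its readU2 takes the array and an offset as parameters so the sketch stays self-contained. The input bytes are the ones that encode the versions reported by javap: 0xFF 0xFF for 65535 and 0x00 0x43 for 67.

```java
// Hypothetical sketch: decoding two u2 values from four bytes.
final class U2Sketch {

  // Reads a big-endian unsigned two-byte value starting at 'offset'.
  static int readU2(byte[] data, int offset) {
    int high = Byte.toUnsignedInt(data[offset]) << 8;
    int low = Byte.toUnsignedInt(data[offset + 1]);
    return high | low;
  }

  public static void main(String[] args) {
    byte[] versions = {(byte) 0xFF, (byte) 0xFF, 0x00, 0x43};

    // 0xFF << 8 = 0xFF00, OR 0xFF = 0xFFFF
    System.out.println(readU2(versions, 0)); // 65535

    // 0x00 << 8 = 0x0000, OR 0x43 = 0x0043
    System.out.println(readU2(versions, 2)); // 67
  }
}
```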

04: Print the Major and Minor Versions

We can now read u2 values, so let's write a method for reading and printing the second and third items of the ClassFile structure. Here's the code:

private void printVersion() throws IOException {
  final int minor;
  minor = readU2();

  System.out.printf("  minor version %d%n", minor);

  final int major;
  major = readU2();

  System.out.printf("  major version %d%n", major);
}

And we call it from the execute method:

private void execute(String[] args) throws IOException {
  readClassFile(args);

  verifyMagic();

  printVersion();
}

Let's run our implementation on the HelloWorld.class file:

$ java --enable-preview ClassFile2.java HelloWorld.class 
ClassFile /tmp/HelloWorld.class
  minor version 65535
  major version 67

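The output matches the trimmed output from the javap tool, except for two cosmetic differences: javap writes the label as Classfile, and it puts a colon after minor version and major version.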
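These two numbers line up with how we compiled the class. As a hedged sketch of how they can be interpreted, the hypothetical VersionSketch class below relies on two rules from the JVMS: for Java SE 5 and later, the major version is the release number plus 44, and for major versions 56 and above, a minor version of 65535 marks a class file that depends on preview features.

```java
// Hypothetical sketch: interpreting the version numbers of a class file.
final class VersionSketch {

  // For major versions 49 and above, the Java SE release is major - 44 (52 -> 8, 67 -> 23).
  static int javaRelease(int major) {
    return major - 44;
  }

  // For major versions 56 and above, a minor version of 65535 signals preview features.
  static boolean usesPreview(int major, int minor) {
    return major >= 56 && minor == 65535;
  }

  public static void main(String[] args) {
    int minor = 65535;
    int major = 67;

    System.out.println(javaRelease(major));        // 23
    System.out.println(usesPreview(major, minor)); // true
  }
}
```

This agrees with the javac invocation used earlier, which passed --enable-preview and --release 23.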

In the Next Blog Post in This Series

In this second blog post in this series, we learned that the structure of the Java class file is defined in Chapter 4 of the Java Virtual Machine Specification. The chapter, titled The class File Format, defines that a class file consists of a single ClassFile structure. Knowing the Java class file structure, we began our class file scanner implementation, which now reads the first three items of the ClassFile structure.

In the next blog post in this series, we will keep improving our implementation, and have it traverse the constant pool of a Java class file.

You can find the source code used in this blog post in this Gist.