Java Byte Code Implementation
The Inside Story of the JVM
Credits
This blog post is based on what I learned from Ted Neward's lecture on the Java bytecode instruction set.
It could also be described as a look at the Java bytecode implementation.
As a programmer in any language, it's good to understand what's happening underneath it.
We won't dig too deep here. This explanation is mainly concerned with the bytecode instruction set and the tools for working with it.
History
HotSpot, the Java Virtual Machine (JVM) most of us run today, traces its roots to a Smalltalk virtual machine called Strongtalk. Sun Microsystems purchased the company behind it, improved the VM, and the result became HotSpot.
Java itself was originally intended as a language for embedded devices.
Reasons to build the JVM
The goal was an intermediate language that could be interpreted by different VMs on different hardware. That way only one compiler has to be written, plus one VM per kind of machine.
FYI: after the JVM was built, James Gosling wanted to run it inside the browser. As a result, "Applets" were born.
[Image: Take a bow, people, for James Gosling]
Java was designed as a fundamentally interpreted language. Most C++ developers back in those days didn't like the idea of an interpreted language, and they predicted that Java would fail.
Java started to gather momentum in 1997, and by 2000 it was the leading choice for building enterprise applications.
How Did That Happen?
Mainly because of Java's garbage collection. Most developers don't like to call DELETE, ever. According to Ted Neward, almost 50% of a developer's time was consumed by thinking about the deletion process, and C++ developers knew that pain pretty well (pointer ownership semantics).
Today the JVM takes care of memory for us, and we can spend that time worrying about other things, such as managing threads, managing concurrency, lock-free programming and so on.
Java Assembly Language
Ted Neward doesn't recommend writing projects in Java assembly language, but he does recommend having an understanding of what's happening under the hood. To read the bytecode, Oracle provides a tool (the only one bundled with the JDK) called "javap". (Even Ted Neward doesn't know what "javap" stands for.)
JAVAP
- It is present in every JDK.
- It operates on the classpath, so there is no need to pass a file name. The class name is enough.
- Most importantly, it lets us look at the compiled code.
- It is absolutely crucial in some debugging cases.
[Figure: How we get the disassembled Java bytecode]
Why is studying this worthwhile?
Instead of writing hundreds of unit tests to understand a concept, we can compile the source code and look at the generated bytecode. For example, it helps decide whether String concatenation or StringBuilder is the better choice. It also saves a lot of debugging time.
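To make the String concatenation versus StringBuilder comparison concrete, here is a minimal sketch. The class and method names (ConcatDemo, withPlus, withBuilder) are my own and only for illustration. Compile it and run javap -c ConcatDemo to compare the bytecode of the two methods.

```java
// Hypothetical demo class, not taken from the lecture.
// Compile with: javac ConcatDemo.java
// Inspect with: javap -c ConcatDemo
class ConcatDemo {
    // Concatenation with the + operator.
    static String withPlus(String a, String b) {
        return a + b;
    }

    // The same result built explicitly with StringBuilder.
    static String withBuilder(String a, String b) {
        return new StringBuilder().append(a).append(b).toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("byte", "code"));
        System.out.println(withBuilder("byte", "code"));
    }
}
```

What javap shows for withPlus depends on the compiler version. Older compilers (Java 8 and earlier) turn the + into StringBuilder calls, while newer ones emit an invokedynamic instruction instead, so it is worth checking on the JDK you actually use.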
Fundamentals in JVM
- As far as the JVM is concerned, there is no such thing as a "jar" file. The JVM has no concept of Java archives, web archives or similar files. They are purely units of deployment.
- In the JVM the fundamental atom is the "class".
- A class is represented in "memory" by the class file format. Here "memory" means any kind of storage from which the JVM loads code.
- The JVM is fundamentally stack-based. No registers are involved.
- According to Ted Neward, the JVM bytecode language is the "simplest assembly language" you will ever see in your life.
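As a small illustration of the stack-based model, here is a sketch with a hypothetical StackDemo class (the name is mine). The comments show the instructions I would expect javap -c to print for the add method: each value is pushed onto the operand stack, and the arithmetic instruction pops its inputs and pushes the result.

```java
// Hypothetical demo class for looking at stack-based bytecode.
// Compile with: javac StackDemo.java
// Inspect with: javap -c StackDemo
class StackDemo {
    static int add(int a, int b) {
        return a + b;
    }
    // Expected disassembly of add (offsets are relative to the method):
    //   0: iload_0   // push the first argument onto the operand stack
    //   1: iload_1   // push the second argument
    //   2: iadd      // pop both values, push their sum
    //   3: ireturn   // pop the sum and return it

    public static void main(String[] args) {
        System.out.println(add(2, 3));
    }
}
```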
A Simple javap Walkthrough
1. First, let me show the simple code segment which I will be using for the explanation.
2. Let me compile my code and generate the class file.
3. Let's check how the default "javap" command works.
By default javap shows package-private, protected and public members. Private members are left out.
4. Let's try to get some additional information by using the command
javap -c -p -verbose
Here -c disassembles the code, -verbose prints extra details such as the constant pool and stack sizes, and -p shows all members, including the private ones. The output I got is walked through below.
How to understand the result we got?
As we know, I added only one method in the source file, the main method. But in the compiled code javap found two methods (look at the method names next to the "descriptor" entries). What is that extra method?
[Figure: The two methods found by javap]
The numbers you see in front of each instruction are bytecode offsets, not source line numbers. They are relative to each method and start at 0.
The extra method is there because Java requires every class to have a constructor. If we don't write one, the compiler creates a no-argument constructor for us.
What that constructor does is invoke the java.lang.Object constructor (the base class constructor).
Now there is a line in the output that says invokespecial #1, followed by a small comment:
// Method java/lang/Object."<init>":()V
The reason for this kind of output is that the JVM always thinks in fully qualified terms (more about that below). Some may wonder why the Sun engineers used "/" and not the dot ".". Ted Neward still doesn't know the reason, and apparently neither do the Sun engineers.
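Here is a minimal sketch of the generated constructor, using a hypothetical NoCtorDemo class of my own (it is not the class from my original example). The source declares no constructor, yet reflection reports one, and the comments show what I would expect javap -c -p to print for it.

```java
// Hypothetical class with no constructor written in the source.
// Compile with: javac NoCtorDemo.java
// Inspect with: javap -c -p NoCtorDemo
class NoCtorDemo {
    // Counts the constructors the compiled class really has.
    static int constructorCount() {
        return NoCtorDemo.class.getDeclaredConstructors().length;
    }

    public static void main(String[] args) {
        System.out.println(constructorCount());
    }
}
// Expected disassembly of the compiler-generated constructor:
//   0: aload_0                // push "this"
//   1: invokespecial #1       // Method java/lang/Object."<init>":()V
//   4: return
```

The constant pool index (#1 here) can differ from class to class, so treat the number as an example rather than a fixed value.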
Below that line you can also see output saying "(Ljava/lang/String;)V".
This is a "method descriptor". It says that the "println" method takes a Java String and returns nothing ("V" means void). An array is marked with a leading "[", so the main method, which takes an array of Strings, gets "([Ljava/lang/String;)V", and a two-dimensional array would show up as "[[Ljava/...". Being a non-primitive type, a class name is always prefixed with "L". The semicolon (";") is essential too, because it terminates the name of the class.
You can see I use some Java primitive types as well. FYI, the JVM has a different one-letter encoding for each primitive type. Some of them are listed below.
- I - int
- J - long (L is already taken by class names)
- V - void
- B - byte
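If you want to check a descriptor without reading javap output, the JDK can build one for you. The sketch below uses java.lang.invoke.MethodType from the standard library. The DescriptorDemo class and its descriptor helper are names I made up for the example.

```java
import java.lang.invoke.MethodType;

// Hypothetical helper class that prints JVM method descriptors.
class DescriptorDemo {
    // Builds the descriptor string for a method with the given
    // return type and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // println(String): one reference type, returns void
        System.out.println(descriptor(void.class, String.class));     // (Ljava/lang/String;)V
        // main(String[]): the leading [ marks an array
        System.out.println(descriptor(void.class, String[].class));   // ([Ljava/lang/String;)V
        // primitives use single letters: I = int, B = byte, J = long
        System.out.println(descriptor(long.class, int.class, byte.class)); // (IB)J
    }
}
```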
Execution Stack
The execution stack is the heart and soul of the JVM. It made a huge difference compared to the VMs that existed in those days, because those VMs used registers. According to the JVM specification, each slot of the execution stack is 32 bits wide, so anything larger than 32 bits (e.g. longs and doubles) has to be pushed into two slots. Ted Neward says this is the biggest mistake made by the Sun JVM engineers.
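The two-slot rule can be seen with a very small method. This is a sketch using a hypothetical WideDemo class (my own name). The comments show what I would expect javap -c -verbose to report: a single long argument occupies two local slots, and two longs on the stack need a depth of four.

```java
// Hypothetical demo class for looking at 64-bit values on the stack.
// Compile with: javac WideDemo.java
// Inspect with: javap -c -verbose WideDemo
class WideDemo {
    static long twice(long x) {
        return x + x;
    }
    // Expected javap output for twice (roughly):
    //   stack=4, locals=2, args_size=1
    //   0: lload_0   // push x (takes two slots)
    //   1: lload_0   // push x again (two more slots)
    //   2: ladd      // pop both longs, push the sum
    //   3: lreturn

    public static void main(String[] args) {
        System.out.println(twice(21L));
    }
}
```

Note that args_size counts arguments, not slots, which is why it stays at 1 while locals is 2.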
Naming Strategy
Names in the JVM are fully qualified class names. The concept of a package does not exist in the JVM, because the JVM identifies a class by its full name (the package is just a prefix of the class name), e.g. src.com.MyCOmpany.MyAppClass
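A quick way to see these full names from inside a program is to ask the Class object. The sketch below is only an illustration. NameDemo, Inner and internalName are names I invented, and the "/" form is produced here by a simple string replace to mimic what javap prints.

```java
// Hypothetical demo class that prints the names the JVM uses for classes.
class NameDemo {
    // A nested class, to show that it also ends up with one flat name.
    static class Inner {}

    // Converts the dotted name into the slash-separated form seen in javap output.
    static String internalName(Class<?> c) {
        return c.getName().replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(String.class.getName());      // java.lang.String
        System.out.println(internalName(String.class));  // java/lang/String
        // The nested class name is flattened into a single name as well.
        System.out.println(Inner.class.getName());
    }
}
```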
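Did you know...
"$" is legal in Java identifiers, but by convention it is left to compiler-generated names, and the interesting point is that the JVM does not restrict it at all. The compiler relies on that to avoid class name clashes.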
Magic Numbers?
The magic number is a 4-byte header (written in hexadecimal). Java's inventors wanted a mechanism to mark class files so that they could be identified when loaded. Referring to their favorite cafe, they decided that the first four bytes would be "CAFEBABE". In the class file header it looks like CA FE BA BE.
There was also a second magic number, CA FE DE AD, but the format that used it was later replaced by Java RLI (Request Library Interpreter; I'm not sure about the name, and it may well have been RMI, Remote Method Invocation). CAFEBABE still stands.
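You can check the magic number yourself by reading the first four bytes of any class file. This is a minimal sketch, assuming the class file of java.lang.Object can be loaded as a resource through the system class loader. MagicDemo and readMagic are names I made up for the example.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class that reads the 4-byte header of a class file.
class MagicDemo {
    // Reads the first four bytes of the given class file resource as one int.
    static int readMagic(String resource) {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("class file not found: " + resource);
            }
            // readInt consumes exactly four bytes, big-endian.
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int magic = readMagic("java/lang/Object.class");
        System.out.println(Integer.toHexString(magic)); // cafebabe
    }
}
```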
In the next post I would like to cover the more advanced parts of this story under these headings.
- Stack Manipulation
- Object Model Constructions
- Virtual Dispatch
- Arithmetic Operations
- Constant Pool
- The Three Question








