A Java virtual machine ( JVM) is a virtual machine that enables a computer to run Java programs as well as programs written in other languages, other languages referred to as JVM languages that are also compiled to Java bytecode. The JVM is detailed by a specification that formally describes what is required in a JVM implementation. Having a specification ensures interoperability of Java programs across different implementations so that program authors using the Java Development Kit (JDK) need not worry about idiosyncrasies of the underlying hardware platform.
The JVM reference implementation is developed by the OpenJDK project as open source code and includes a JIT compiler called HotSpot. The commercially supported Java releases available from Oracle are based on the OpenJDK runtime. Eclipse OpenJ9 is another open source JVM for OpenJDK.
Starting with Java Platform, Standard Edition (J2SE) 5.0, changes to the JVM specification have been developed under the Java Community Process as JSR 924. , changes to the specification to support changes proposed to the class file format (JSR 202) are being done as a maintenance release of JSR 924. The specification for the JVM was published as the blue book, The Java Virtual Machine Specification (the first and second editions are also available online). whose preface states:
The most commonly used Java virtual machine is Oracle's HotSpot.
Oracle owns the Java trademark and may allow its use to certify implementation suites as fully compatible with Oracle's specification.
| + Java versions and their Garbage Collectors | ||
| Serial, Parallel, CMS, G1 (E) | ||
| Serial, Parallel, CMS, G1 | ||
| G1 | ||
| Serial, Parallel, CMS, G1, Epsilon (E), ZGC (E) | ||
| Serial, Parallel, CMS, G1, Epsilon (E), ZGC (E), Shenandoah (E) | ||
| Serial, Parallel, G1, Epsilon (E), ZGC (E), Shenandoah (E) | ||
| Serial, Parallel, G1, Epsilon (E), ZGC, Shenandoah | ||
| Serial, Parallel, G1, Epsilon (E), ZGC, Shenandoah, GenZGC (E) | ||
| Serial, Parallel, G1, Epsilon (E), ZGC, Shenandoah, GenZGC (default ZGC) | ||
| Serial, Parallel, G1, Epsilon (E), Shenandoah, GenZGC, GenShen (E) | ||
| Serial, Parallel, G1, Epsilon (E), Shenandoah, GenZGC, GenShen | ||
| (E) = experimental | ||
Java's design philosophy revolves around the assumption of a garbage collector. Unlike languages such as C++ and Rust, deterministic memory management through a delete keyword (as in C++) is not possible. Even introducing such a feature is not possible, due to a lack of ownership, aside from using or through to allocate/deallocate memory outside the Java heap. Due to Java primarily using heap-based allocation, objects are stored as references, and deletion would result in .
delete a;
System.out.println(b);
The JVM has a garbage-collected heap for storing objects and arrays. Code, constants, and other class data are stored in the "method area". The method area is logically part of the heap, but implementations may treat the method area separately from the heap, and for example might not garbage collect it. Each JVM thread also has its own call stack (called a "Java Virtual Machine stack" for clarity), which stores frames. A new frame is created each time a method is called, and the frame is destroyed when that method exits.
Each frame provides an "operand stack" and an array of "local variables". The operand stack is used for operands to run computations and for receiving the return value of a called method, while local variables serve the same purpose as registers and are also used to pass method arguments. Thus, the JVM is both a stack machine and a register machine. In practice, HotSpot eliminates every stack besides the native thread/call stack even when running in Interpreted mode, as its Templating Interpreter technically functions as a compiler.
The JVM uses references and stack/array indexes to address data; it does not use byte addressing like most physical machines do, so it does not neatly fit the usual categorization of 32-bit or 64-bit machines. In one sense, it could be classified as a 32-bit machine, since this is the size of the largest value it natively stores: a 32-bit integer or floating-point value or a 32-bit reference. Because a reference is 32 bits, each program is limited to at most 2 unique references and therefore at most 2 objects. However, each object can be more than one byte large, and potentially very large; the largest possible object is an array of long of length 2 - 1 which would consume 16 GiB of memory, and there could potentially be 2 of these if there were enough memory available. This results in upper bounds that are more comparable to a typical 64-bit byte-addressable machine. A JVM implementation can be designed to run on a processor that natively uses any bit width as long as it correctly implements the integer (8-, 16-, 32-, and 64-bit) and floating-point (32- and 64-bit) math that the JVM requires. Depending on the method used to implement references (native pointers, compressed pointers, or an indirection table), this can limit the number of objects to less than the theoretical maximum. An implementation of the JVM on a 64-bit platform has access to a much larger address space than one on a 32-bit platform, which allows for a much larger heap size and an increased maximum number of threads, which is needed for certain kinds of large applications; however, there can be a performance hit from using a 64-bit implementation compared to a 32-bit implementation.
There are several JVM languages, both old languages ported to JVM and completely new languages. JRuby and Jython are perhaps the most well-known ports of existing languages, i.e. Ruby and Python respectively. Of the new languages that have been created from scratch to compile to Java bytecode, Clojure, Apache Groovy, Scala and Kotlin may be the most popular ones. A notable feature with the JVM languages is that they are compatible with each other, so that, for example, Scala libraries can be used with Java programs and vice versa.
Java 7 JVM implements JSR 292: Supporting Dynamically Typed Languages on the Java Platform, a new feature which supports dynamically typed languages in the JVM. This feature is developed within the Da Vinci Machine project whose mission is to extend the JVM so that it supports languages other than Java.
The class loader performs three basic activities in this strict order:
In general, there are three types of class loader: bootstrap class loader, extension class loader and System / Application class loader.
Every Java virtual machine implementation must have a bootstrap class loader that is capable of loading trusted classes, as well as an extension class loader or application class loader. The Java virtual machine specification does not specify how a class loader should locate classes.
The aim is binary compatibility. Each particular host operating system needs its own implementation of the JVM and runtime. These JVMs interpret the bytecode semantically the same way, but the actual implementation may be different. More complex than just emulating bytecode is compatibly and efficiently implementing the Java core API that must be mapped to each host operating system.
These instructions operate on a set of common rather the native data types of any specific instruction set architecture.
The JVM verifies all bytecode before it is executed. This verification consists primarily of three types of checks:
The first two of these checks take place primarily during the verification step that occurs when a class is loaded and made eligible for use. The third is primarily performed dynamically, when data items or methods of a class are first accessed by another class.
The verifier permits only some bytecode sequences in valid programs, e.g. a jump (branch) instruction can only target an instruction within the same method. Furthermore, the verifier ensures that any given instruction operates on a fixed stack location, allowing the JIT compiler to transform stack accesses into fixed register accesses. Because of this, that the JVM is a stack architecture does not imply a speed penalty for emulation on register machine when using a JIT compiler. In the face of the code-verified JVM architecture, it makes no difference to a JIT compiler whether it gets named imaginary registers or imaginary stack positions that must be allocated to the target architecture's registers. In fact, code verification makes the JVM different from a classic stack architecture, of which efficient emulation with a JIT compiler is more complicated and typically carried out by a slower interpreter. Additionally, the Interpreter used by the default JVM is a special type known as a Template Interpreter, which translates bytecode directly to native, register based machine language rather than emulate a stack like a typical interpreter. In many aspects the HotSpot Interpreter can be considered a JIT compiler rather than a true interpreter, meaning the stack architecture that the bytecode targets is not actually used in the implementation, but merely a specification for the intermediate representation that can well be implemented in a register based architecture. Another instance of a stack architecture being merely a specification and implemented in a register based virtual machine is the Common Language Runtime.
The original specification for the bytecode verifier used natural language that was incomplete or incorrect in some respects. A number of attempts have been made to specify the JVM as a formal system. By doing this, the security of current JVM implementations can more thoroughly be analyzed, and potential security exploits prevented. It will also be possible to optimize the JVM by skipping unnecessary safety checks, if the application being run is proven to be safe.
Formal proof of bytecode verifiers have been done by the Javacard industry (Formal Development of an Embedded Verifier for Java Card Byte Code)
When Java bytecode is executed by an interpreter, the execution will always be slower than the execution of the same program compiled into native machine language. This problem is mitigated by just-in-time (JIT) compilers for executing Java bytecode. A JIT compiler may translate Java bytecode into native machine language while executing the program. The translated parts of the program can then be executed much more quickly than they could be interpreted. This technique gets applied to those parts of a program frequently executed. This way a JIT compiler can significantly speed up the overall execution time.
There is no necessary connection between the Java programming language and Java bytecode. A program written in Java can be compiled directly into the machine language of a real computer and programs written in other languages than Java can be compiled into Java bytecode.
Java bytecode is intended to be platform-independent and secure.David J. Eck, Introduction to Programming Using Java , Seventh Edition, Version 7.0, August 2014 at Section 1.3 "The Java Virtual Machine" Some JVM implementations do not include an interpreter, but consist only of a just-in-time compiler. Oracle JRockit Introduction Release R28 at 2. "Understanding Just-In-Time Compilation and Optimization"
Java 22 introduces the Foreign Function and Memory API, which can be seen as the successor to Java Native Interface.
Native methods are enable by JNI to handle situations when an application cannot be written entirely in the Java programming language, e.g. when the standard Java class library does not support the platform-specific features or program library.
The JNI framework lets a native method use Java objects in the same way that Java code uses these objects. A native method can create Java objects and then inspect and use these objects to perform its tasks. A native method can also inspect and use objects created by Java application code.
JNI also allows direct access to assembly code, without even going through a C bridge.3 Accessing Java applications from assembly is possible in the same way.
The NPAPI Java browser plug-in was designed to allow the JVM to execute so-called Java applets embedded into HTML pages. For browsers with the plug-in installed, the applet is allowed to draw into a rectangular region on the page assigned to it. Because the plug-in includes a JVM, Java applets are not restricted to the Java programming language; any language targeting the JVM may run in the plug-in. A restricted set of APIs allow applets access to the user's microphone or 3D acceleration, although applets are not able to modify the page outside its rectangular region. Adobe Flash Player, the main competing technology, works in the same way in this respect.
according to W3Techs, Java applet and Silverlight use had fallen to 0.1% each for all web sites, while Flash had fallen to 10.8%.
Compiling the JVM bytecode, which is universal across JVM languages, allows building upon the language's existing compiler to bytecode. The main JVM bytecode to JavaScript transpilers are TeaVM, the compiler contained in Dragome Web SDK, Bck2Brwsr, and j2js-compiler.Wolfgang Kuehn (decatur). j2js-compiler GitHub
Leading transpilers from JVM languages to JavaScript include the Java-to-JavaScript transpiler contained in Google Web Toolkit, J2CL, Clojurescript (Clojure), GrooScript (Apache Groovy), Scala.js (Scala) and others.
|
|