Thursday, June 18, 2015

The truth about P-Code

Microsoft Visual Basic is a Rapid Application Development (RAD) tool that offers the flexibility of compiling applications to p-code (pseudo code) or native code.

Compiling to p-code optimizes for the smallest size, making p-code the choice for creating Internet applications in low bandwidth situations. Native code compilation is highly optimized for speed, but the executable files produced are larger than the p-code versions. Visual Basic is the only RAD tool to support both rapid application development through p-code as well as native code compiling for performance.

NOTE: Visual Basic projects compiled as p-code or native code programs still require the Visual Basic run-time DLL (MSVBVM50.DLL or MSVBVM60.DLL) to be installed in the target system. This run-time DLL provides a number of services for your compiled program, such as startup and shutdown code for your application, functionality for forms and intrinsic controls, and run-time functions such as Format and CLng.


I wrote this article to give users of decompilers some essential information, because I have received many complaints from customers of other Visual Basic decompilers whose applications could not be recovered by those tools.
The reason is that when you write a Visual Basic 6.0 application you can choose between P-Code and Native Code compilation. These are very different approaches to compilation, so decompilation differs too.
There is one question you must ask yourself when you need a decompiler for a specific application: was my application compiled to Native Code or to P-Code?
The main problem is that many users get a P-Code decompiler in order to decompile an application that was actually released as Native Code.
In fact, P-Code decompilers are almost useless today, because 90% of Visual Basic 6 applications are released as Native Code. This article first explains the difference between P-Code and Native Code, and then explains why Native Code applications so heavily outnumber P-Code applications.

I. P-Code Versus Native Code

When you write a line of code in the IDE, Visual Basic breaks it down into expressions and encodes the expressions into a preliminary format called opcodes. In other words, each line is partially precompiled as it is written. Some lines contain shared information that cannot be precompiled independently (mainly Dim statements and procedure definitions). This is why you have to restart if you change certain lines in break mode. The opcodes are compiled into p-code instructions when you compile (in the background if you have the Compile On Demand and Background Compile options set).
At run time, the p-code interpreter works through the program, decoding and executing p-code instructions. These p-code instructions are smaller than equivalent native code instructions, thus dramatically reducing the size of the executable program. But the system must load the p-code interpreter into memory in addition to the code, and it must decode each instruction.
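The decode-and-execute loop described above can be pictured with a very small sketch. This is not Visual Basic's real p-code: the opcode names (PUSH, ADD, MUL) and the tuple layout are hypothetical, chosen only to show why each compact instruction costs a decoding step at run time.

```python
# Toy stack-based interpreter. Illustrative only: the opcodes below are
# hypothetical and do not correspond to Visual Basic's actual p-code.

def run(program):
    """Decode and execute each (opcode, operand) pair in order."""
    stack = []
    for opcode, operand in program:
        # Every instruction is decoded here before it can be executed.
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()

# (2 + 3) * 4 expressed as five compact instructions
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
print(run(program))  # prints 20
```

The program itself is tiny, but nothing runs without the interpreter loop, which mirrors the trade-off between small code and decoding overhead.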
It’s a different story with native code. You start with the same opcodes, but instead of translating to p-code instructions, the compiler translates to native instructions. Because you’re not going to be expecting an instant response while stepping through native code instructions in the IDE, the compiler can look at code from a greater distance; it can analyze blocks of code and find ways to eliminate inefficiency and duplication. The compiler philosophy is that, since you compile only once, you can take as long as you want to analyze as much code as necessary to generate the best results possible.
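As a rough illustration of what "analyzing blocks of code" can mean, here is a minimal sketch of one classic optimization, constant folding, applied to a hypothetical stack-style instruction list. It is an assumption made for illustration and says nothing about which optimizations the Visual Basic native compiler actually performs.

```python
# Toy constant-folding pass. Illustrative only: the instruction format is
# hypothetical and this is not the Visual Basic compiler's algorithm.

def fold_constants(program):
    """Replace PUSH a, PUSH b, ADD/MUL with a single precomputed PUSH."""
    out = []
    for opcode, operand in program:
        both_constant = (
            len(out) >= 2 and out[-1][0] == "PUSH" and out[-2][0] == "PUSH"
        )
        if opcode in ("ADD", "MUL") and both_constant:
            right = out.pop()[1]
            left = out.pop()[1]
            value = left + right if opcode == "ADD" else left * right
            out.append(("PUSH", value))  # work done once, at compile time
        else:
            out.append((opcode, operand))
    return out

# (2 + 3) * 4 collapses to a single instruction
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
print(fold_constants(program))  # prints [('PUSH', 20)]
```

A pass like this needs to see several instructions at once, which is the kind of "greater distance" analysis a compile-once compiler can afford.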
These two approaches create a disjunction. How can you guarantee that such different ways of analyzing code will generate the same results? Well, you can’t. In fact, if you look at the Advanced Optimizations dialog box (available from the Compile tab of the Project Properties dialog box) you’ll see a warning: "Enabling the following optimizations might prevent correct execution of your program." This might sound like an admission of failure, but welcome to the real world of compilers. Users of other compiled languages understand that optimization is a bonus. If it works, great. If not, turn it off.
On the other hand, very few developers are going to be used to the idea of working in an interpreter during development but releasing compiled code. Most compilers have a debug mode for fast compiles and a release mode for fast code. Visual Basic doesn’t worry about fast compiles because it has a no-compile mode that is faster than the fastest compiler. You get the best of both worlds, but it’s going to take a little while for people to really trust the compiler to generate code that they can’t easily see and debug.

II. Proportion of P-Code applications in the world

The number of applications compiled to P-Code is very small compared to the number compiled to Native Code: 90% of Visual Basic 6 applications are compiled with the Native Code setting, which is the default in VB6. That is one of the reasons why I decided to develop VBReFormer more for Native Code than for P-Code.
Native Code applications probably outnumber P-Code applications so heavily because the compiler defaults to « Native Code », and of course because native applications are almost as fast as C++ applications, unlike P-Code applications.
Before choosing a decompiler, you must know whether it was built for Native Code applications or for P-Code applications, and whether your application was released in P-Code or Native Code mode.
Note that P-Code is easier to decompile than Native Code because it is a higher-level representation.

If you have the Professional or Enterprise edition of Visual Basic, you can compile your code either in standard Visual Basic p-code format or in native code format. Native code compilation provides several options for optimizing and debugging that aren't available with p-code.
P-code, or pseudo code, is an intermediate step between the high-level instructions in your Basic program and the low-level native code your computer's processor executes. At run time, Visual Basic translates each p-code statement to native code. By compiling directly to native code format, you eliminate the intermediate p-code step.
You can debug compiled native code using standard native code debugging tools, such as the debugging environment provided by Visual C++. You can also use options available in languages such as Visual C++ for optimizing and debugging native code. For example, you can optimize code for speed or for size.
NOTE: All projects created with Visual Basic use the services of the run-time DLL (MSVBVM60.DLL). Among the services provided by this DLL are startup and shutdown code for your application, functionality for forms and intrinsic controls, and run-time functions like Format and CLng.
Compiling a project with the Native Code option means that the code you write will be fully compiled to the native instructions of the processor chip, instead of being compiled to p-code. This will greatly speed up loops and mathematical calculations, and may somewhat speed up calls to the services provided by MSVBVM60.DLL. However, it does not eliminate the need for the DLL.
To compile a project to native code
  1. In the Project window, select the project you want to compile.
  2. From the Project menu, choose Project Properties.
  3. In the Project Properties dialog box, click the Compile tab.
    Figure 8.6   The Compile tab in the Project Properties dialog box
  4. Select Compile to Native Code.
    Visual Basic enables several options for customizing and optimizing the executable file. For example, to create compiled code that will be optimized for size, select the Optimize for Small Code option.
    For additional advanced optimization options, click the Advanced Optimizations button.
  5. Select the options you want, then click OK.
  6. From the File menu, choose Make Exe, or Make Project Group.
The following table describes the native code options for optimization.
Assume No Aliasing (Advanced Optimization): Tells the compiler that your program does not use aliasing. Checking this option allows the compiler to apply optimization such as storing variables in registers and performing loop optimizations.
Create Symbolic Debug Info: Produces a .pdb file and .exe or .dll file containing information to allow for debugging using Microsoft Visual C++ 5.0 or another compatible debugger.
Favor Pentium Pro(tm): Optimizes code to favor the Pentium Pro(tm) processor.
No Optimization: Disables all optimizations.
Optimize for Fast Code: Maximizes the speed of .exe and .dll files by telling the compiler to favor speed over size.
Optimize for Small Code: Minimizes the size of .exe and .dll files by telling the compiler to favor size over speed.
Remove Array Bounds Checks (Advanced Optimization): Disables Visual Basic array bounds checking.
Remove Floating Point Error Checks (Advanced Optimization): Disables Visual Basic floating-point error checking.
Remove Integer Overflow Checks (Advanced Optimization): Disables Visual Basic integer overflow checking.
Remove Safe Pentium(tm) FDIV Checks (Advanced Optimization): Disables checking for safe Pentium(tm) processor floating-point division.
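
To make the effect of removing a run-time check more concrete, here is a small sketch that simulates 16-bit signed addition (the range of a Visual Basic Integer) with and without an overflow check. The function names are hypothetical, and the two's complement wraparound in the unchecked version is an assumption about how unchecked 16-bit arithmetic typically behaves, not a statement of what a given Visual Basic build will produce.

```python
# Simulated 16-bit signed addition. Illustrative only: function names are
# hypothetical and the wraparound behavior is an assumption.

INT16_MIN, INT16_MAX = -32768, 32767

def add_checked(a, b):
    """Add with an overflow check, like a build that keeps the check."""
    result = a + b
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError("Overflow")
    return result

def add_unchecked(a, b):
    """Add without a check: the result silently wraps around."""
    return (a + b + 32768) % 65536 - 32768

print(add_unchecked(32767, 1))  # prints -32768, a silently wrong result
try:
    add_checked(32767, 1)
except OverflowError as err:
    print("checked build reports:", err)
```

The unchecked version is cheaper because it skips the range test, but a wrong value can propagate unnoticed, which is the reason for the warning shown in the Advanced Optimizations dialog box.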
