dawwinci/ic71dump-opcodes

PHP 7.1 ionCube Opcode Dumper

Overview

This project produces a structured opcode dump from PHP source files that have been encoded with ionCube for PHP 7.1 (NTS, x86, Windows). For each input file the tool writes two output files:

  • <filename>.opcodes.txt — human-readable, print_r-style dump of the raw C-level data
  • <filename>.opcodes.json — structured IR in ic71dump-ir-v1 format, ready for analysis or decompilation

The tool works by loading the ionCube loader as a Zend extension (as normal), letting ionCube compile the encoded file into the Zend engine's in-memory op arrays, then calling our own PHP extension (php_opcodedump.dll) to walk those op arrays, decode any that ionCube has left in their encrypted state, and serialise the result. A PHP orchestration script (opcodedump.php) then transforms the raw C output into the final IR.

The tool is composed of two layers:

  1. C extension (php_opcodedump.dll) — hooks into the PHP 7.1 runtime after the ionCube loader has decrypted the op_array in memory, and walks all data structures to extract them as PHP arrays.
  2. PHP IR layer (opcodedump.php) — normalises, enriches, and re-encodes that raw data into the clean JSON IR.

Architecture

[ionCube-encoded .php file]
         │
         ▼
[PHP 7.1 runtime]
         │  ← zend_extension: loader_win_7.1.dll  (decrypts in-memory op_arrays)
         │  ← extension:      php_opcodedump.dll  (reads the decrypted structures)
         ▼
[opcodedump_get_opcodes()]   ← C function, called from opcodedump.php
         │
         ▼  raw PHP array (mirrors C structs directly)
[opcodedump.php]
         │  normalize_dump()  — key aliases, type coercion
         │  build_decompile_ir()  — constructs ic71dump-ir-v1 object
         │  json_encode()  — serialises to JSON (polyfill if extension absent)
         ▼
[<file>.opcodes.txt]   ← print_r of the raw dump
[<file>.opcodes.json]  ← ic71dump-ir-v1 JSON IR

Runtime Setup

Component Versions

Component           Version / Details
PHP runtime         7.1.33 NTS x86, VC14 (VS 2015)
ionCube Loader      v14.4 (loader_win_7.1.dll, 777,728 bytes)
php_opcodedump.dll  built with VS2022 19.44.35223, build ID patched to VC14
Target encoding     ionCube PHP 7.1 encoded files only

The ionCube Loader used in this setup is v14.4, confirmed via ioncube_loader_version() at runtime.

Files

ic71dump/
├── dump.bat                  entry point — wraps php.exe call
├── opcodedump.php            PHP IR layer
├── runtime/
│   ├── php.exe               PHP 7.1.33 NTS x86 (VC14 build)
│   ├── php7.dll
│   ├── php.ini
│   └── ext/
│       ├── loader_win_7.1.dll    ionCube Loader v14.4 for PHP 7.1
│       └── php_opcodedump.dll    our C extension
├── src/
│   ├── opcodedump.c
│   └── php_opcodedump.h
└── build/
    ├── build.bat
    └── out/
        └── php_opcodedump.dll

php.ini

[PHP]
extension_dir = "ext"
zend_extension = "loader_win_7.1.dll"   ; ionCube Loader v14.4 — must load BEFORE opcodedump
extension      = php_opcodedump
error_reporting = 0
display_errors  = On

The ionCube loader must be a zend_extension (not a plain extension) because it needs to hook zend_compile_file to intercept encoded files before the engine sees them. php_opcodedump is loaded as a plain extension afterwards.


C Extension — opcodedump.c

Source: php_v1_dump/opcodedump-master (original)

The starting point was an open-source extension skeleton from opcodedump-master that could walk zend_op_array structures. It was designed for unencrypted PHP files.

What the extension does

After PHP loads and the ionCube loader has decrypted the file, the extension:

  1. Receives the zend_op_array * from PHP's compiler hook.
  2. Detects whether the op_array is an ionCube sentinel (protected wrapper) and resolves it to the real decrypted op_array via the ionCube internal descriptor chain.
  3. Walks all fields: opcodes, literals, arg_info, try/catch, live_range, vars, static variables, properties, class tables, function tables.
  4. Returns the entire structure as a nested PHP array.

ionCube Sentinel Detection

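ionCube replaces the real opcodes pointer with a small odd integer (1, 3, 5, 7) as a sentinel. The check below tests the two low bits of the pointer, which are always zero for a genuine, aligned zend_op array. When a sentinel is detected, the extension follows the ionCube internal descriptor chain: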

op_array->reserved[3]  →  ic_descriptor (80-byte struct)
  [offset 76 / dword 19]  →  fn_ptr
    [offset 40 / dword 10]  →  real zend_op_array *

if (src->opcodes && ((uintptr_t)(src->opcodes) & 3)) {
    // sentinel detected
    const uint32_t *_desc = (const uint32_t *)src->reserved[3];
    uint32_t _fn_ptr_val  = _desc[19];   // descriptor offset 76
    const uint32_t *_fn_ptr = (const uint32_t *)_fn_ptr_val;
    uint32_t _real_oa_addr = _fn_ptr[10]; // fn_ptr offset 40
    const zend_op_array *_real = (const zend_op_array *)_real_oa_addr;
    // validate and use _real
}

The entire resolution is wrapped in __try/__except to guard against bad pointers.
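The sentinel test can be illustrated in isolation. The following is a minimal, portable sketch under the assumption that genuine zend_op arrays are at least 4-byte aligned; the names ic_sketch_is_sentinel and ic_sketch_sentinel_demo are hypothetical and do not appear in opcodedump.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from opcodedump.c.
 * Returns 1 if the opcodes field holds a sentinel rather than a real
 * pointer: a non-zero value with either of its two low bits set cannot
 * be the address of a 4-byte-aligned zend_op array. */
int ic_sketch_is_sentinel(uintptr_t opcodes_field)
{
    return opcodes_field != 0 && (opcodes_field & 3u) != 0;
}

/* The sentinel values named in the text are all caught by the test,
 * while an aligned address and NULL are not. */
void ic_sketch_sentinel_demo(void)
{
    assert(ic_sketch_is_sentinel(1));
    assert(ic_sketch_is_sentinel(3));
    assert(ic_sketch_is_sentinel(5));
    assert(ic_sketch_is_sentinel(7));
    assert(!ic_sketch_is_sentinel(0x1000));
    assert(!ic_sketch_is_sentinel(0));
}
```

Note that the low-bit test also flags even, non-aligned values such as 2 or 6; that is harmless here because no valid opcodes pointer can take those values either.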

Opcode XOR Decode

ionCube XOR-encodes the opcode byte of each zend_op. The key for each opcode is derived from the ionCube key table inside loader_win_7.1.dll:

IC_LOADER_NAME  = "loader_win_7.1.dll"
IC_KEY_TABLE_RVA = 0xB786C   // RVA of the key table pointer inside the DLL

Decode path per opcode at index op_index:

HMODULE hLoader = GetModuleHandleA("loader_win_7.1.dll");
uint32_t key_table = *(uint32_t*)((uintptr_t)hLoader + IC_KEY_TABLE_RVA);
uint32_t key_index = desc[1];          // from ionCube descriptor
uintptr_t key_entry = key_table + key_index * 4;
const uint8_t *key_stream = *(const uint8_t **)key_entry;
zend_uchar key_byte = key_stream[op_index];
zend_uchar real_opcode = stored_opcode ^ key_byte;

This is called from dasm_ic_display_opcode() for every opcode before it is recorded.
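The XOR step itself is simple once the key stream has been located. The sketch below shows only that step, on a flat byte array rather than on real zend_op structs (a simplification; in the engine the opcode byte is one field of each zend_op). The function name ic_sketch_xor_decode is hypothetical, and the key bytes in any usage are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from opcodedump.c.
 * Decodes `count` stored opcode bytes in place: the byte at opline
 * index i is XORed with key_stream[i], mirroring
 * real_opcode = stored_opcode ^ key_byte. */
void ic_sketch_xor_decode(uint8_t *opcodes, const uint8_t *key_stream,
                          size_t count)
{
    size_t i;

    assert(opcodes != NULL && key_stream != NULL);
    for (i = 0; i < count; i++) {
        opcodes[i] ^= key_stream[i];
    }
}
```

Because XOR is its own inverse, running the same function a second time with the same key stream restores the stored bytes, which is convenient when comparing the raw and decoded views.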

Memory Safety

Because ionCube-encrypted structures can have dangling or invalid pointers (especially for properties, arg_info, and literals of partially-decoded functions), every dereference that might fault is wrapped in one of:

  • __try { ... } __except(EXCEPTION_EXECUTE_HANDLER) { ... } — Windows SEH
  • dasm_ic_committed_readable_ptr(ptr) — calls VirtualQuery() to verify the page is committed and readable before dereferencing

Additionally, the literals and opcodes arrays are temporarily unlocked with VirtualProtect(..., PAGE_EXECUTE_READWRITE, ...) before reading (ionCube may leave them with restricted page permissions), then restored immediately after.
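A page-level readability check of this kind might look roughly as follows. This is a sketch, not the body of dasm_ic_committed_readable_ptr: the names sketch_committed_readable and sketch_read_u32_checked are hypothetical, the set of rejected protections (PAGE_NOACCESS, PAGE_GUARD) is an assumption, and the non-Windows branch is a stub that performs no page query at all.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: returns 1 if the page holding ptr looks safe to read. */
int sketch_committed_readable(const void *ptr)
{
    if (ptr == NULL) {
        return 0;
    }
#ifdef _WIN32
    {
        MEMORY_BASIC_INFORMATION mbi;
        if (VirtualQuery(ptr, &mbi, sizeof(mbi)) == 0) {
            return 0;                       /* query failed */
        }
        if (mbi.State != MEM_COMMIT) {
            return 0;                       /* reserved or free page */
        }
        if (mbi.Protect & (PAGE_NOACCESS | PAGE_GUARD)) {
            return 0;                       /* a read would fault */
        }
        return 1;
    }
#else
    return 1;   /* stub for illustration only: nothing is verified here */
#endif
}

/* Reads a 32-bit value only if its first and last bytes are readable. */
int sketch_read_u32_checked(const void *ptr, uint32_t *out)
{
    assert(out != NULL);
    if (!sketch_committed_readable(ptr) ||
        !sketch_committed_readable((const char *)ptr + sizeof(*out) - 1)) {
        return 0;
    }
    memcpy(out, ptr, sizeof(*out));
    return 1;
}
```

A check like this narrows the window for faults but cannot close it, since page state may change between the query and the read; that is one reason to keep the SEH guard as well.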


Four Fixes Applied to the C Extension

Fix 1 — Lineno Masking

Problem: ionCube sets bits 0x600000 in the lineno field of each zend_op as an internal marker. Without masking, every line number appears nonsensically large (e.g., 6291458, which is 0x600002, instead of 2).

Fix (dasm_zend_op, line ~562):

/* Mask ionCube-injected bits from lineno (bits 0x600000 set by ionCube encoder) */
add_assoc_long_ex(dst, ("lineno"), (sizeof("lineno")), src->lineno & ~0x600000u);

The same mask is applied to line_start and line_end of the op_array:

add_assoc_long_ex(dst, ("line_start"), ..., src->line_start & ~0x600000u);
add_assoc_long_ex(dst, ("line_end"),   ..., src->line_end   & ~0x600000u);

Result: Line numbers are now either correct (plain PHP files) or 0 (ionCube-encoded, where the encoder deliberately erases line info as an obfuscation measure).
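The effect of the mask can be checked with a few values. This is a small standalone sketch; the names IC_SKETCH_LINENO_MARKER, ic_sketch_mask_lineno, and ic_sketch_mask_lineno_demo are hypothetical, and only the constant 0x600000 comes from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define IC_SKETCH_LINENO_MARKER 0x600000u   /* marker bits described above */

/* Hypothetical helper: clears the marker bits, keeps everything else. */
uint32_t ic_sketch_mask_lineno(uint32_t raw_lineno)
{
    return raw_lineno & ~IC_SKETCH_LINENO_MARKER;
}

void ic_sketch_mask_lineno_demo(void)
{
    assert(ic_sketch_mask_lineno(0x600002u) == 2u);  /* marker + line 2 */
    assert(ic_sketch_mask_lineno(0x600000u) == 0u);  /* marker only     */
    assert(ic_sketch_mask_lineno(25u) == 25u);       /* plain PHP file  */
}
```

The mask is safe for plain PHP files as long as real line numbers stay below 0x200000 (2,097,152), since only then are the masked bits guaranteed to be unused.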


Fix 2 — ZEND_JMPZNZ True Branch (jmpznz_true_opline)

Problem: ZEND_JMPZNZ has two jump targets:

  • The false/null branch: stored in op2 as the standard jump operand.
  • The true branch: stored in extended_value.

The original code exported extended_value as a raw integer, which is useless without knowing the opline base address. The true branch target opline index was missing.

Fix (end of dasm_zend_op):

if (display_opcode == ZEND_JMPZNZ) {
    zend_long ev_index = -1;
#if ZEND_USE_ABS_JMP_ADDR   // 32-bit: extended_value is an absolute address
    ev_index = dasm_index_from_address_base(
        (uintptr_t)(uint32_t)raw_src->extended_value,
        (uintptr_t)op_array->opcodes, op_array->last);
#else                        // 64-bit: extended_value is a relative byte offset
    {
        const zend_op *ev_target = (const zend_op *)
            ((const char *)raw_src + (int32_t)raw_src->extended_value);
        if (op_array->opcodes && ev_target >= op_array->opcodes &&
            ev_target < (op_array->opcodes + op_array->last)) {
            ev_index = (zend_long)(ev_target - op_array->opcodes);
        }
    }
#endif
    add_assoc_long(dst, "jmpznz_true_opline", ev_index);
}

Result: The JSON now contains "jmpznz_true_opline": N for every ZEND_JMPZNZ opcode, giving the decompiler both branch targets directly as opline indices.
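The relative-offset case boils down to bounds-checked index arithmetic. The sketch below works on indices and byte counts instead of real pointers, so it can be run anywhere; sketch_op is a stand-in whose only relevant property is its size, and sketch_rel_offset_to_index is a hypothetical name, not a function from opcodedump.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for zend_op; only sizeof(sketch_op) matters here. */
typedef struct sketch_op {
    uint32_t extended_value;
    uint8_t  opcode;
} sketch_op;

/* Converts a byte offset, relative to the opline at index `from`, into
 * an opline index. Returns -1 if the target falls outside the array of
 * `count` oplines or does not land on an opline boundary. */
long sketch_rel_offset_to_index(size_t count, size_t from, int32_t byte_offset)
{
    const long long op_size = (long long)sizeof(sketch_op);
    long long target;

    assert(from < count);
    target = (long long)from * op_size + byte_offset;
    if (target < 0 || target % op_size != 0) {
        return -1;                  /* before the array, or mid-opline */
    }
    target /= op_size;
    if (target >= (long long)count) {
        return -1;                  /* past the last opline */
    }
    return (long)target;
}
```

Returning -1 for any out-of-range target matches the convention in the fix above, where ev_index stays -1 unless the target is validated.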


Fix 3 — Property Default Values (has_default_value / default_value)

Problem: The original dasm_properties_info correctly walked zend_property_info but never exported the actual default value of the property (the initial value assigned in the class definition). The prop pointer was resolved but discarded.

Fix (after _dasm_properties_info(&zv, prop_info)):

#ifdef PHP_WIN32
__try {
    if (dasm_ic_committed_readable_ptr(prop) && Z_TYPE_P(prop) != IS_UNDEF) {
        zval prop_copy;
        ZVAL_COPY_VALUE(&prop_copy, prop);
        add_assoc_zval(&zv, "default_value", &prop_copy);
        add_assoc_bool(&zv, "has_default_value", 1);
    } else {
        add_assoc_null(&zv, "default_value");
        add_assoc_bool(&zv, "has_default_value", 0);
    }
} __except(EXCEPTION_EXECUTE_HANDLER) {
    add_assoc_null(&zv, "default_value");
    add_assoc_bool(&zv, "has_default_value", 0);
}
#else
    // same without SEH guard
#endif

Result: Each property in the JSON now has:

"has_default_value": true,
"default_value": "some_string_or_int_or_null"

This is essential for reconstructing the original class definition.


Fix 4 — Return Type Info (return_type_info)

Problem: PHP 7.1 stores the return type hint of a function in the slot immediately before arg_info[0], i.e. at arg_info[-1], when the flag ZEND_ACC_HAS_RETURN_TYPE (0x40000000) is set. The original code never read this slot.

Fix (after required_num_args in dasm_zend_op_array):

if ((src->fn_flags & ZEND_ACC_HAS_RETURN_TYPE) && src->arg_info != NULL
#ifdef PHP_WIN32
    && dasm_ic_committed_readable_ptr(src->arg_info - 1)
#endif
) {
    zval ret_zv;
    array_init(&ret_zv);
#ifdef PHP_WIN32
    __try { dasm_zend_arg_info(&ret_zv, src->arg_info - 1); }
    __except(EXCEPTION_EXECUTE_HANDLER) { add_assoc_null(&ret_zv, "name"); }
#else
    dasm_zend_arg_info(&ret_zv, src->arg_info - 1);
#endif
    add_assoc_zval(dst, "return_type_info", &ret_zv);
} else {
    add_assoc_null(dst, "return_type_info");
}

Result: Every op_array now has "return_type_info": { "name": "int", "type_hint": 16, ... } or null if no return type is declared. Required for correct function signature reconstruction.
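The arg_info[-1] layout is easy to misread, so here is a small sketch of it. The struct sketch_arg_info is a simplified stand-in rather than the real zend_arg_info, and the function names are hypothetical; only the flag value 0x40000000 is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ACC_HAS_RETURN_TYPE 0x40000000u

/* Simplified stand-in for zend_arg_info. */
typedef struct sketch_arg_info {
    const char *type_name;
    uint8_t     allow_null;
} sketch_arg_info;

/* Returns the slot that precedes arg_info[0] when the flag says a
 * return type was declared, or NULL otherwise. */
const sketch_arg_info *sketch_return_type_slot(const sketch_arg_info *arg_info,
                                               uint32_t fn_flags)
{
    if (arg_info == NULL || (fn_flags & SKETCH_ACC_HAS_RETURN_TYPE) == 0) {
        return NULL;
    }
    return arg_info - 1;
}

void sketch_return_type_demo(void)
{
    /* slots[0] is the return-type slot; parameters start at slots[1],
     * which is where the op_array's arg_info pointer would point. */
    sketch_arg_info slots[3] = { { "int", 0 }, { "string", 0 }, { "array", 1 } };
    const sketch_arg_info *arg_info = &slots[1];

    assert(sketch_return_type_slot(arg_info, SKETCH_ACC_HAS_RETURN_TYPE) == &slots[0]);
    assert(sketch_return_type_slot(arg_info, 0) == NULL);
}
```

Checking the flag before stepping back is what makes the negative index safe: without a declared return type there is no allocated slot in front of arg_info[0].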


PHP 7.1-Specific Engine Structures

live_range vs brk_cont_array

PHP 7.0 used brk_cont_array (for break/continue tracking). PHP 7.1 replaced it with live_range (for tracking temporary variable lifetimes). Both are conditionally compiled:

#if defined(ZEND_ENGINE_7_1)
// live_range — array of zend_live_range { var, start, end }
#endif

#if defined(ZEND_ENGINE_7_0)
// brk_cont_array — array of zend_brk_cont_element { start, cont, brk, parent }
#endif

The PHP 7.1 source tree defines ZEND_ENGINE_7_1, so live_range is active.


Build Process

Requirements

  • PHP 7.1.33 source tree at C:\dev\php_v1_dump\php-7.1.33-src
  • Visual Studio Build Tools at C:\BuildTools\VC\Auxiliary\Build\vcvars32.bat
  • php7.lib at php-7.1.33-src\Release\php7.lib

Build Command (build/build.bat)

cl.exe /nologo /c /MD /O2 /W3 /wd4996         ^
    /DWIN32 /D_WINDOWS /DZEND_WIN32=1           ^
    /DPHP_WIN32=1 /DCOMPILE_DL_OPCODEDUMP       ^
    /DZEND_DEBUG=0 /D_USE_32BIT_TIME_T=1        ^
    /I"%PHP_INC%" /I"%PHP_INC%\main"            ^
    /I"%PHP_INC%\Zend" /I"%PHP_INC%\TSRM"      ^
    /I"%PHP_INC%\win32" /I"%PHP_INC%\ext"       ^
    /Fo"%OUT%\opcodedump.obj"                   ^
    "%SRC%\opcodedump.c"

link.exe /nologo /DLL                          ^
    /OUT:"%OUT%\php_opcodedump.dll"             ^
    /LIBPATH:"%PHP_LIB%"                        ^
    "%OUT%\opcodedump.obj" php7.lib

Problem Solved: VC Version Mismatch

PHP 7.1 on Windows embeds a compiler ID in every module's build string (ZEND_MODULE_BUILD_ID). At load time PHP checks that the ID of the extension matches its own. The runtime binary was compiled with VC14 (VS 2015). Our build environment is VS2022 (compiler version 19.44.35223).

PHP's build system generates main/config.w32.h during configure.js, and it had:

#define PHP_COMPILER_ID "19.44.35223"

This caused the mismatch:

Module compiled with build ID=API20160303,NTS,19.44.35223
PHP    compiled with build ID=API20160303,NTS,VC14

Fix: Edit main/config.w32.h in the source tree:

// Before:
#define PHP_COMPILER_ID "19.44.35223"
// After:
#define PHP_COMPILER_ID "VC14"

After patching and rebuilding, the module loads cleanly. This change only affects the build ID string — no code behaviour is altered.
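To see why a single string decides whether the module loads, the comparison can be modelled as plain string equality. This sketch only mirrors the shape of the build IDs shown in the error message above; it is not PHP's actual loader code, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats "API<api_no>,<thread_model>,<compiler_id>" into buf.
 * Returns 1 on success, 0 if the result did not fit. */
int sketch_format_build_id(char *buf, size_t len, const char *api_no,
                           const char *thread_model, const char *compiler_id)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "API%s,%s,%s", api_no, thread_model, compiler_id);
    return n >= 0 && (size_t)n < len;
}

/* The module is accepted only when both IDs are byte-for-byte equal. */
int sketch_build_ids_match(const char *module_id, const char *php_id)
{
    return strcmp(module_id, php_id) == 0;
}
```

With "19.44.35223" as the compiler component the two strings differ, and with "VC14" they are identical, which is the whole effect of the config.w32.h patch.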


PHP IR Layer — opcodedump.php

Purpose

The raw PHP array from the C extension mirrors the C structs directly: key names are truncated, some values are raw integers, and jump targets are raw pointers. The PHP layer cleans this up into a structured, analysis-ready JSON IR.

JSON Polyfill

The custom PHP 7.1 build does not include the json extension (php_json.dll was not compiled in). A pure-PHP polyfill is injected at the top of opcodedump.php:

if (!function_exists('json_encode')) {
    define('JSON_PRETTY_PRINT',    128);
    define('JSON_UNESCAPED_SLASHES', 64);
    // ...
    function json_encode($value, $flags = 0, $depth = 512) {
        return _jenc($value, (bool)($flags & 128), 0);
    }
    function _jenc_str($s) { /* per-character escaping loop */ }
    function _jenc($v, $p, $d) {
        // handles null, bool, int, float, string, array (list + object)
        // builds string directly, no intermediate $parts array
    }
}

Critical implementation note: An earlier version used a $parts = array() accumulator inside _json_encode_value. In PHP 7.1 running under the ionCube loader, this variable was being pre-populated with data from the last processed opcode array. We attribute this scope-leaking anomaly to the ionCube loader's deep hooks into PHP's variable/opcode dispatch machinery. The fix was to eliminate the intermediate array entirely and build the JSON string by direct concatenation with a $first flag, using short unique variable names ($out, $ks, $il, $ind, etc.) that do not collide with any opcode-processing variable in the outer script context.

Key Normalisation (normalize_key)

The C extension truncates key names to avoid PHP's add_assoc_string length limits. The PHP layer maps them back:

'function_nam'  → 'function_name'
'arg_inf'       → 'arg_info'
'op_arra'       → 'op_array'
'typ'           → 'type'
// etc.

IR Format: ic71dump-ir-v1

The top-level JSON object:

{
  "format":      "ic71dump-ir-v1",
  "source_file": "/path/to/file.php",
  "summary": {
    "op_array_count": 5,
    "function_count": 4,
    "closure_count":  0,
    "class_count":    1
  },
  "entry":         "main",
  "op_arrays":     { ... },
  "function_index":{ ... },
  "closure_index": { ... },
  "class_index":   { ... }
}

Per-Op-Array IR

Each entry in op_arrays:

{
  "id":           "main",
  "kind":         "main",
  "function_name": { "type": "null", "value": null },
  "filename":     { "type": "string", "value": "...", "sha1": "...", ... },
  "line_start":   1,
  "line_end":     147,
  "fn_flags":     0,
  "fn_flags_decoded": { "visibility": "none", "is_static": false, ... },
  "num_args":     0,
  "required_num_args": 0,
  "return_type_info": null,
  "vars":         { "type": "array", "value": ["parola"] },
  "literals":     [ { "index": 0, "type": "string", "value": "...", ... } ],
  "opcodes":      [ ... ],
  "try_catch":    [],
  "live_range":   []
}

Per-Opcode IR

{
  "index":         3,
  "line":          25,
  "lineno_raw":    null,
  "opcode":        38,
  "opcode_name":   "ZEND_ASSIGN",
  "extended_value": 0,
  "extended_value_decoded": null,
  "op1": {
    "type": 16, "type_name": "IS_CV",
    "cv_name": "parola",
    ...
  },
  "op2": { "type": 1, "type_name": "IS_CONST", "literal": { ... } },
  "result": { ... },
  "jump_targets":       [],
  "is_call":            false,
  "is_include_or_eval": false,
  "is_lambda_declare":  false
}

For ZEND_JMPZNZ, an additional field is present:

"jmpznz_true_opline": 42

safe_value Envelope

Every value that comes from a PHP zval is wrapped in a safe_value envelope:

{
  "type":      "string",
  "length":    6,
  "printable": true,
  "value":     "parola",
  "preview":   "parola",
  "hex":       "7061726f6c61",
  "base64":    "cGFyb2xh",
  "sha1":      "83592796bc..."
}

This is used for literals, property default values, variable names, and any zval field that could contain arbitrary binary data.
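The envelope itself is produced by the PHP layer, but the hex and printable fields are simple enough to sketch in C for readers who want to verify a dump by hand. The function names are hypothetical, and the definition of "printable" used here (every byte in the ASCII range 0x20 to 0x7E) is an assumption, not necessarily the rule opcodedump.php applies.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the lowercase hex form of data[0..len) into out, NUL-terminated.
 * Returns 1 on success, 0 if out_size is too small. */
int sketch_hex_encode(const unsigned char *data, size_t len,
                      char *out, size_t out_size)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    assert(data != NULL && out != NULL);
    if (out_size < len * 2 + 1) {
        return 0;
    }
    for (i = 0; i < len; i++) {
        out[2 * i]     = digits[data[i] >> 4];
        out[2 * i + 1] = digits[data[i] & 0x0F];
    }
    out[len * 2] = '\0';
    return 1;
}

/* Assumed rule: printable means every byte is visible ASCII or a space. */
int sketch_is_printable(const unsigned char *data, size_t len)
{
    size_t i;

    assert(data != NULL);
    for (i = 0; i < len; i++) {
        if (data[i] < 0x20 || data[i] > 0x7E) {
            return 0;
        }
    }
    return 1;
}
```

For the six bytes of "parola" the encoder yields 7061726f6c61, matching the hex field in the example envelope.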

Per-Class IR

{
  "id":       "class:PHASED...:method:verify",
  "kind":     "method",
  "meta": {
    "class_name": "PHPGangsta_GoogleAuthenticator",
    "method_name": { "type": "string", "value": "verifyCode" }
  },
  "opcodes":  [ ... ]
}

Observed Differences: ionCube-Encoded vs Plain PHP

When dumping both example_ic.php (ionCube-encoded) and example.php (decoded source), the following differences were observed and confirmed to be semantically irrelevant:

Field              Plain PHP                      ionCube-encoded
line (per opcode)  Real line numbers (2, 3, 25…)  0 on all opcodes
lineno_raw         null (already clean)           null (after masking 0x600000)
Literals count     51                             52 (extra "date" literal)
Literal order      Compiler order                 Re-ordered by ionCube encoder
ZEND_INIT_FCALL    Used for known built-ins       Replaced with ZEND_INIT_FCALL_BY_NAME
ZEND_DO_ICALL      Used for internal functions    Replaced with ZEND_DO_FCALL_BY_NAME
ZEND_SEND_VAL      Standard argument passing      Replaced with ZEND_SEND_VAL_EX
refcount           Runtime-managed                Different initial value

Why these differences don't matter for decompilation:

  • Line 0: Line numbers are metadata. Control flow is determined by JMP operands, not line numbers. Reconstructed PHP will not have line directives anyway.
  • Extra literal: ionCube stores function names explicitly as literals even for built-ins. All opcode operands still resolve to the correct values.
  • Literal reorder: The dump resolves every IS_CONST operand to its actual literal value at dump time. The literal index in the operand field already reflects the correct value — reordering is invisible to the decompiler.
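  • BY_NAME vs ICALL variants: All of ZEND_INIT_FCALL, ZEND_INIT_FCALL_BY_NAME, ZEND_DO_FCALL, ZEND_DO_FCALL_BY_NAME, ZEND_DO_ICALL map to the same PHP construct: a function call. The variant only affects whether PHP resolves the callee at compile time or at runtime; the reconstructed PHP source is identical either way. The same holds for ZEND_SEND_VAL versus ZEND_SEND_VAL_EX: the _EX form only adds a runtime check for by-reference parameters, and both represent passing a value argument.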
  • refcount: Internal memory management counter, irrelevant to program logic.
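A consumer of the IR can fold these variants together before any further analysis. The sketch below classifies opcodes by the opcode_name strings that appear in the dump; the enum and function names are hypothetical, and the grouping reflects only the equivalences argued above.

```c
#include <assert.h>
#include <string.h>

typedef enum sketch_call_kind {
    SKETCH_NOT_CALL = 0,
    SKETCH_CALL_INIT,       /* start of a call: callee is selected */
    SKETCH_CALL_SEND_VAL,   /* a value argument is pushed          */
    SKETCH_CALL_DO          /* the call is executed                */
} sketch_call_kind;

/* Maps compile-time-resolved and runtime-resolved variants of the same
 * step to one kind, so plain and ionCube-encoded dumps compare equal. */
sketch_call_kind sketch_classify_call_opcode(const char *opcode_name)
{
    assert(opcode_name != NULL);
    if (strcmp(opcode_name, "ZEND_INIT_FCALL") == 0 ||
        strcmp(opcode_name, "ZEND_INIT_FCALL_BY_NAME") == 0) {
        return SKETCH_CALL_INIT;
    }
    if (strcmp(opcode_name, "ZEND_SEND_VAL") == 0 ||
        strcmp(opcode_name, "ZEND_SEND_VAL_EX") == 0) {
        return SKETCH_CALL_SEND_VAL;
    }
    if (strcmp(opcode_name, "ZEND_DO_FCALL") == 0 ||
        strcmp(opcode_name, "ZEND_DO_FCALL_BY_NAME") == 0 ||
        strcmp(opcode_name, "ZEND_DO_ICALL") == 0) {
        return SKETCH_CALL_DO;
    }
    return SKETCH_NOT_CALL;
}
```

Classifying by name rather than by numeric opcode keeps the sketch independent of the opcode numbering of any particular PHP version.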

Usage

dump.bat path\to\encoded.php
dump.bat file1.php file2.php ...

Output files are written next to each input file:

  • <name>.opcodes.txt
  • <name>.opcodes.json

Example

cd C:\dev\ic71dump
dump.bat example_ic.php

Output summary:

== C:\...\example_ic.php ==
lines: 1-147
opcodes: 86 | literals: 52 | vars: 1
exported preview: 86 opcodes | 52 literals
closures: declared=0 | dumped=0 | missing=0
vars: parola
first opcodes:
  [000] line 0    ZEND_INCLUDE_OR_EVAL     (73)
  [001] line 0    ZEND_NOP                 (0)
  ...

saved: C:\...\example_ic.opcodes.txt
saved: C:\...\example_ic.opcodes.json

End-to-End Data Flow

1. PHP loads example_ic.php
2. ionCube loader intercepts the file, decrypts op_arrays in memory
3. opcodedump extension is called via the compiler hook
4. For each op_array:
   a. Detect sentinel ((opcodes pointer & 3) != 0)
   b. Follow descriptor chain → real op_array pointer
   c. For each opcode: XOR decode opcode byte with key from loader's key table
   d. Resolve jump targets from raw pointer/offset to opline index
   e. Unlock page permissions on literals/opcodes if needed
   f. Walk all structures, wrap unsafe reads in SEH __try/__except
   g. Return as nested PHP array
5. opcodedump.php receives the raw array
6. normalize_dump() → fix truncated key names, normalise types
7. build_decompile_ir() → construct ic71dump-ir-v1 object
   - Each opcode: mask lineno, decode extended_value, annotate call/jump types
   - Each op_array: add return_type_info, fn_flags_decoded, literal table with sha1/hex
   - Each class: add properties with has_default_value/default_value
8. json_encode($ir, JSON_PRETTY_PRINT) → polyfill used (no json extension)
9. Write .opcodes.txt (print_r) and .opcodes.json (IR)
