DexTrace is a lightweight core for Android APK / DEX parsing and call-tracing.
It does not decide whether an APK is malicious.
Instead, DexTrace focuses on producing a clean, standardized, and reproducible representation of:
- APK metadata
- AndroidManifest structure
- DEX internal tables
- API call evidence and method-level cross-references (XREF)
These results are designed to be consumed by higher-level engines, such as
Quark Engine or other static / hybrid analysis frameworks.
DexTrace is intended to provide:
- lightweight APK and DEX parsing without depending on a full Android analysis framework
- deterministic Dalvik bytecode disassembly and inspection
- structured API extraction from DEX bytecode
- manifest and APK metadata parsing
- a Python API and CLI for inspection, debugging, and integration
- reproducible outputs suitable for downstream rule engines
- File hashes (MD5 / SHA1 / SHA256)
- File size and ZIP entries
- APK archive reading for downstream parsing workflows
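The metadata items above can be approximated with the Python standard library alone. The following is a minimal sketch, not DexTrace's implementation: the function name `apk_metadata` and the dictionary keys are hypothetical and chosen only for illustration.

```python
import hashlib
import io
import zipfile


def apk_metadata(data: bytes) -> dict:
    """Return hashes, size, and ZIP entry names for raw APK bytes.

    Hypothetical helper: the key names are illustrative, not DexTrace's schema.
    """
    meta = {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "file_size": len(data),
        "zip_entries": [],
    }
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            # Sort the names so the output does not depend on archive order.
            meta["zip_entries"] = sorted(archive.namelist())
    except zipfile.BadZipFile:
        # Not a valid ZIP archive: keep the hashes and report no entries.
        pass
    return meta
```

Sorting the entry names is one simple way to keep the result reproducible across archives that store the same files in a different order.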
- Supports binary AXML and plain XML
- Extracts:
- package name
- permissions
- activities
- services
- receivers
- providers
- Safe fallback for malformed or missing manifests
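For the plain-XML case only, the extraction and fallback behavior listed above might look roughly like the sketch below. It uses `xml.etree.ElementTree` from the standard library; the function name `parse_plain_manifest` and the summary keys are assumptions for illustration, and binary AXML decoding is not covered.

```python
import xml.etree.ElementTree as ET

# Namespace used by android:* attributes in AndroidManifest.xml.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Manifest tag -> summary key (key names are hypothetical).
COMPONENT_TAGS = (
    ("activity", "activities"),
    ("service", "services"),
    ("receiver", "receivers"),
    ("provider", "providers"),
)


def parse_plain_manifest(xml_text: str) -> dict:
    """Summarize a plain-XML manifest; return an empty summary on bad input."""
    summary = {"package": None, "permissions": []}
    summary.update({key: [] for _, key in COMPONENT_TAGS})
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return summary  # malformed or empty manifest: safe fallback
    summary["package"] = root.get("package")
    for node in root.iter("uses-permission"):
        name = node.get(ANDROID_NS + "name")
        if name:
            summary["permissions"].append(name)
    for tag, key in COMPONENT_TAGS:
        for node in root.iter(tag):
            name = node.get(ANDROID_NS + "name")
            if name:
                summary[key].append(name)
    return summary
```

Returning the same dictionary shape for valid and invalid input lets callers skip special-casing missing manifests.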
- Strict DEX magic validation (dex\n035, cdex)
- Full header field decoding
- Defensive handling of truncated / invalid DEX files
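As an illustration of magic validation and header decoding, the sketch below unpacks the standard 112-byte (0x70) DEX header with `struct`. It is a simplified example under stated assumptions: it accepts only the `dex\n035` magic, does not handle cdex, and the name `parse_dex_header` is hypothetical rather than part of DexTrace's API.

```python
import struct

DEX_MAGIC = b"dex\n035\x00"

# magic (8 bytes), checksum (uint32), SHA-1 signature (20 bytes), 20 x uint32.
HEADER_STRUCT = struct.Struct("<8sI20s20I")

UINT_FIELDS = (
    "file_size", "header_size", "endian_tag", "link_size", "link_off",
    "map_off", "string_ids_size", "string_ids_off", "type_ids_size",
    "type_ids_off", "proto_ids_size", "proto_ids_off", "field_ids_size",
    "field_ids_off", "method_ids_size", "method_ids_off",
    "class_defs_size", "class_defs_off", "data_size", "data_off",
)


def parse_dex_header(data: bytes) -> dict:
    """Decode the fixed-size DEX header, rejecting short or mislabeled input."""
    if len(data) < HEADER_STRUCT.size:
        raise ValueError("truncated DEX header")
    magic, checksum, signature, *uints = HEADER_STRUCT.unpack_from(data)
    if magic != DEX_MAGIC:
        raise ValueError(f"unexpected DEX magic: {magic!r}")
    header = {
        "magic": magic,
        "version": magic[4:7].decode("ascii"),
        "checksum": checksum,
        "signature": signature.hex(),
    }
    header.update(zip(UINT_FIELDS, uints))
    return header
```

Checking the length before unpacking keeps truncated files from surfacing as a low-level `struct.error`.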
- code_item parsing
- instruction iteration
- offset-aware bytecode handling
- method/code mapping support
- designed to scale toward richer control-flow or data-flow analysis later
DexTrace implements progressive API tracing stages aligned with
Quark Engine's 5-stage detection model.
- extracts all invoke-* instructions
- resolves:
- caller class / method / prototype
- callee class / method / prototype
- opcode type and bytecode offset
- produces structured XREF output
- safe against malformed indices and corrupted tables
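To make the invoke extraction step concrete, the sketch below decodes a single non-range invoke instruction in Dalvik format 35c (three 16-bit code units). It is illustrative only: `decode_invoke_35c` is a hypothetical name, the offset here is counted in bytes (DexTrace's own offset unit is not specified in this README), the `invoke-*/range` variants are omitted, and resolving `method_idx` to a class / method / prototype requires the DEX method table, which is not shown.

```python
import struct

# Non-range invoke opcodes, all encoded in format 35c.
INVOKE_35C = {
    0x6E: "invoke-virtual",
    0x6F: "invoke-super",
    0x70: "invoke-direct",
    0x71: "invoke-static",
    0x72: "invoke-interface",
}


def decode_invoke_35c(insns: bytes, offset: int = 0):
    """Decode one 35c invoke at a byte offset; return None if not applicable."""
    if offset < 0 or offset + 6 > len(insns):
        return None  # truncated instruction stream
    unit0, method_idx, unit2 = struct.unpack_from("<3H", insns, offset)
    opcode = INVOKE_35C.get(unit0 & 0xFF)
    if opcode is None:
        return None  # not a non-range invoke
    arg_count = unit0 >> 12  # A: number of argument registers
    if arg_count > 5:
        return None  # 35c carries at most five registers
    # Registers C, D, E, F come from the third unit; G from the first.
    regs = [(unit2 >> shift) & 0xF for shift in (0, 4, 8, 12)]
    regs.append((unit0 >> 8) & 0xF)
    return {
        "opcode": opcode,
        "method_idx": method_idx,
        "registers": regs[:arg_count],
        "offset": offset,
    }
```

For example, the bytes `70 10 05 00 00 00` decode to `invoke-direct` on register 0 with method index 5. Returning `None` rather than raising is one way to stay robust against corrupted input.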
- groups APIs per caller method
- represents which APIs are used together
- order-independent
- designed for combination-based rule matching
- preserves static call order within each method
- offset-aware ordering of invoke-* instructions
- method-local (no CFG explosion)
- designed for sequence-based rule matching
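The difference between the order-independent grouping and the order-preserving one can be sketched as follows. The input records follow the caller / invoke / callee shape of DexTrace's JSON output; the helper names `group_api_calls` and `_signature`, and the signature string format, are assumptions made for this example.

```python
from collections import defaultdict


def _signature(ref: dict) -> str:
    """Join class, method, and prototype into one string (illustrative format)."""
    return f"{ref['class']}->{ref['method']}{ref['proto']}"


def group_api_calls(records):
    """Return (api_sets, api_sequences), both keyed by caller signature."""
    sets = defaultdict(set)
    ordered = defaultdict(list)
    for rec in records:
        caller = _signature(rec["caller"])
        callee = _signature(rec["callee"])
        sets[caller].add(callee)  # order-independent: which APIs co-occur
        ordered[caller].append((rec["invoke"]["offset"], callee))
    sequences = {
        # Sort by bytecode offset to preserve static call order per method.
        caller: [callee for _, callee in sorted(calls, key=lambda c: c[0])]
        for caller, calls in ordered.items()
    }
    return dict(sets), sequences
```

A set supports combination-style rules ("both A and B appear in this method"), while the offset-sorted list supports sequence-style rules ("A appears before B") without building a control-flow graph.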
DexTrace includes an iterative Dalvik bytecode interpreter (src/dextrace/vm/)
that can actually execute a method instead of only statically inspecting it.
- executes a single entry method by signature, with caller-supplied arguments
- supports integer/long/float/double arithmetic, branches, comparisons, array and field access, type checks/conversions, throw, and try/catch exception flow
- resolves virtual calls through a constructed class hierarchy / vtable
- simulates common Android/Java framework calls via Android API stubs (vm/android_stubs/) so malware-style flows can run without a device
- records a per-instruction execution trace and a call tree of internal calls and stubbed API calls
Exposed through the dextrace run command (see below).
src/dextrace/
api.py # public Python API entry point
cli/ # CLI entry points
core/ # APK/DEX parsing and API extraction core
dalvik/ # Dalvik disassembly and opcode utilities
vm/ # Dalvik bytecode execution engine, handlers, Android stubs
manifest/ # binary AndroidManifest parsing
errors.py # shared project exceptions
version.py # package version
tests/
fixtures/ # synthetic fixtures used by tests
test_*.py # pytest-based test suite
docs/
modules-overview.md # module-by-module handoff guide
development-workflow.md # contributor workflow and validation notes
current-status.md # current state, known gaps, handoff notes
Command-line entry points.
- main.py: top-level CLI dispatcher
- cmd_meta.py: metadata-oriented inspection
- cmd_disasm.py: disassembly-oriented inspection
- cmd_dex.py: DEX/API-oriented inspection
Core APK / DEX parsing and API extraction logic.
Includes:
- APK reading
- APK metadata extraction
- manifest parsing bridge
- DEX structure parsing
- method/code mapping
- API extraction
- method/API resolution
Dalvik bytecode internals.
Includes:
- opcode format metadata
- operand decoding
- instruction size handling
- payload decoding
- disassembly support
- smali-oriented helpers
Dalvik bytecode execution (dynamic analysis), distinct from dalvik/ disassembly.
Includes:
- engine.py: the iterative DalvikVM execution engine
- opcode handlers under handlers/ (arithmetic, array, branch, compare, field, move, throw, type-check, type-conversion)
- simulated Android/Java framework methods under android_stubs/ (content, filesystem, intent, network, runtime, sms, telephony, text)
- execution state, register file, object heap, call frames, class hierarchy / vtable resolution, and execution tracing
This subsystem powers the dextrace run command.
Low-level binary AXML parsing used by manifest-related workflows.
git clone https://github.com/ev-flow/DexTrace.git
cd DexTrace
pip install -e .
Or, if you use the Pipenv workflow:
pipenv install --dev
pipenv shell
DexTrace exposes a single CLI entry point:
dextrace --help
Show hashes, manifest summary, and DEX presence:
dextrace meta sample.apk
Parse and display full DEX header fields:
dextrace dex --header sample.apk
Show a concise overview of DEX structure:
dextrace dex --summary sample.apk
List extracted API calls, per-method API sets, and per-method API sequences:
dextrace dex --apis sample.apk
dextrace dex --api-sets sample.apk
dextrace dex --api-seq sample.apk
All commands support structured JSON output:
dextrace dex --api-seq --json sample.apk
Execute a single method with the Dalvik VM and print its return value.
The input may be a .dex file or a .apk (the embedded DEX is loaded automatically).
dextrace run --help
Run an entry method by signature:
dextrace run sample.dex --entry 'Lp1;->main()I'
Pass arguments (--arg/-a, repeatable; ints are auto-detected from decimal or 0x hex,
everything else is a string). Use --args for an explicit JSON list of mixed int/string:
dextrace run sample.dex --entry 'Lp2/Fib;->fib(I)I' --arg 10
dextrace run sample.dex --entry 'Lp1;->main()I' --args '["+15555550100","hi"]'
Useful flags:
- --json: emit the result (and, with --trace, api_calls) as JSON
- --trace: print the call tree of internal calls and stubbed API calls
- --strict-stubs: treat every unstubbed external call as an error (default: void misses are silent no-ops)
- --dump-regs: print non-zero registers after execution
- --verbose / -v: print [INFO] progress to stderr
Exit codes: 0 success, 1 user error (bad args / method not found), 2 VM runtime error, 3 parse error.
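When scripting around `dextrace run`, the documented exit codes can be mapped back to their meanings. The wrapper below is a small sketch: it uses only the flags described above (`--entry`, `--json`), assumes `dextrace` is on `PATH`, and the helper names `describe_exit_code` and `run_entry` are hypothetical.

```python
import subprocess

# Exit codes as documented for `dextrace run`.
EXIT_MEANINGS = {
    0: "success",
    1: "user error (bad args / method not found)",
    2: "VM runtime error",
    3: "parse error",
}


def describe_exit_code(code: int) -> str:
    """Translate a documented exit code into a readable label."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def run_entry(dex_path: str, entry: str):
    """Run one entry method; return (exit code, meaning, raw stdout)."""
    proc = subprocess.run(
        ["dextrace", "run", dex_path, "--entry", entry, "--json"],
        capture_output=True,
        text=True,
        check=False,  # inspect the exit code instead of raising
    )
    return proc.returncode, describe_exit_code(proc.returncode), proc.stdout
```

Passing `check=False` keeps non-zero exits as data, so a batch script can distinguish a parse error from a VM runtime error instead of stopping at the first failure.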
{
"dex": {
"summary": {
"magic": "dex\n035\u0000",
"version": "035",
"file_size": 717940,
"string_ids_size": 6285,
"method_ids_size": 5455,
"class_defs_size": 534
},
"api_calls": [
{
"caller": {
"class": "Landroid/support/v4/accessibilityservice/AccessibilityServiceInfoCompat;",
"method": "<clinit>",
"proto": "()V"
},
"invoke": {
"opcode": "invoke-direct",
"offset": 16
},
"callee": {
"class": "Landroid/support/v4/accessibilityservice/AccessibilityServiceInfoCompat$AccessibilityServiceInfoJellyBeanMr2;",
"method": "<init>",
"proto": "()V"
}
}
],
"api_calls_count": 1
}
}
Run the full test suite:
pytest
If you use the Pipenv workflow, run it through Pipenv instead:
pipenv run pytest
Run a targeted test file:
pytest tests/test_dex_parser.py
Run tests by keyword:
pytest -k api_extractor
Suggested tests by area of change:
- CLI changes: pytest tests/test_cli_meta.py tests/test_smoke.py
- APK / metadata changes: pytest tests/test_apk_reader.py tests/test_apk_metadata.py
- manifest changes: pytest tests/test_manifest_parser.py
- DEX parser changes: pytest tests/test_dex_parser.py tests/test_dex_header.py
- API extraction changes: pytest tests/test_dex_api_extractor.py
- Dalvik / disassembly changes: pytest -k disassembler
- VM execution / dextrace run changes: pytest -k vm
DexTrace is organized by subsystem, so contributors should usually:
- identify the affected subsystem first
- make the smallest targeted change possible
- run the closest subsystem tests first
- broaden validation only if the change touches shared logic
- update documentation when contributor-facing behavior changes
When DexTrace is used under Quark Engine, Quark-facing mismatches should be investigated conservatively and evidence-first. Preserve:
- APK identifier or sample path
- rule IDs
- exact commands used
- DexTrace output
- comparison-core output such as Androguard
- diff excerpts
- current hypothesis
Prefer wording such as:
- inconsistent API matching
- resolution difference
- invoke extraction gap
until the exact root cause is verified in code and tests.
Additional contributor documentation is available under docs/: modules-overview.md, development-workflow.md, and current-status.md.
The repository may include:
- extracted sample APK directories for validation or reproduction
- generated build artifacts under
dist/
These are useful for testing and packaging, but they are not the core implementation surface.
DexTrace can be used as an analysis core under Quark Engine. In that setup:
- DexTrace is responsible for parsing APK / DEX input and extracting evidence
- Quark Engine is responsible for higher-level rule matching and scoring
When validating Quark-facing behavior, comparisons should keep the APK, rule set, and Quark version fixed while only changing the analysis core.
See LICENSE.