Byte-Equivalent Decompilation of GPL-Violating Devices: A Genetic Programming Approach
This post explores a hard problem: byte-equivalent decompilation of a Linux kernel binary taken from a GPL-violating device, that is, recovering C code that compiles back to exactly the same bytes. The author proposes a genetic programming-based optimization approach whose goal is a "perfect" solution (an exact byte-for-byte match), not a "good enough" approximation. The open challenges include generating the initial population, choosing a representation for the C code (ASTs), choosing a representation for the binary code (disassembly or an IR), and improving the readability of the recovered C code. The author argues that population-based metaheuristics such as genetic algorithms are better suited to this problem than single-point search heuristics. It is framed as a long-term research project that requires a deep understanding of decompilation techniques, kernel code, and optimization algorithms.
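
To make the idea of a population-based search toward an exact byte match more concrete, here is a minimal toy sketch in Python. It is not the author's implementation, and every name in it (`TOY_OPCODES`, `toy_compile`, `byte_distance`, `evolve`) is hypothetical. It assumes a fake one-token-to-one-byte "compiler" in place of a real toolchain, flat token lists in place of C ASTs, and a candidate length fixed to the target size. What it does illustrate is the fitness criterion: a candidate counts as solved only when its compiled output differs from the target in zero bytes.

```python
import random

# ASSUMPTION: a toy stand-in for "compile the candidate C code".
# A real pipeline would invoke an actual compiler; here one token maps to one byte.
TOY_OPCODES = {"load": 0x10, "add": 0x20, "store": 0x30, "ret": 0xC3, "nop": 0x90}
TOKENS = list(TOY_OPCODES)


def toy_compile(candidate):
    """Hypothetical compiler: turn a token list into a byte string."""
    return bytes(TOY_OPCODES[token] for token in candidate)


def byte_distance(a, b):
    """Differing bytes plus length difference; 0 means byte-equivalent."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


def fitness(candidate, target):
    """Lower is better; 0 is the 'perfect' solution."""
    return byte_distance(toy_compile(candidate), target)


def mutate(candidate, rng):
    """Replace one randomly chosen token with a random token."""
    child = list(candidate)
    child[rng.randrange(len(child))] = rng.choice(TOKENS)
    return child


def crossover(a, b, rng):
    """Single-point crossover of two equal-length parents (length >= 2)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(target, pop_size=40, max_generations=500, seed=0):
    """Evolve token lists until one compiles to exactly `target`."""
    rng = random.Random(seed)
    length = len(target)  # simplification: candidate size is fixed up front
    population = [[rng.choice(TOKENS) for _ in range(length)]
                  for _ in range(pop_size)]
    for generation in range(max_generations):
        population.sort(key=lambda c: fitness(c, target))
        if fitness(population[0], target) == 0:
            return population[0], generation
        # Keep the better half unchanged, refill with mutated offspring.
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = survivors + children
    population.sort(key=lambda c: fitness(c, target))
    return population[0], max_generations


if __name__ == "__main__":
    target = toy_compile(["load", "add", "store", "load", "add", "ret"])
    best, generations = evolve(target)
    print(best, generations, toy_compile(best) == target)
```

In the real problem the expensive parts are exactly the ones this sketch fakes: compiling each candidate, mutating ASTs so that they stay valid C, and comparing binaries at a level (raw bytes, disassembly, or IR) that gives the search a useful gradient.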