Compare commits

1 Commits

Author SHA1 Message Date
nemerle
2a59d07ef2 Original dcc code
Only 2 differences from the original release:
  add return type to main
  in disassem.cpp, popPosStack(): add cast to intptr_t
2015-05-28 17:17:26 +02:00
147 changed files with 15828 additions and 29984 deletions

6
.gitattributes vendored

@ -1,6 +0,0 @@
* text=auto
*.c text
*.cpp text
*.ui text
*.qrc text
*.h text

4
.gitignore vendored

@ -4,6 +4,4 @@ tests/prev
tests/outputs/*
tests/errors
*.autosave
bld*
*.user
*.idb
bld*

@ -1,68 +0,0 @@
PMOVMSKB
Gd, Pq1H
PMOVMSKB
(66)
Gd, Vdq1H
should be
PMOVMSKB
Gd, Qq1H
PMOVMSKB
(66)
Gd, Wdq1H
The instruction represented by this opcode expression does not support any
operand to be a memory location.
MASKMOVQ
Pq, Pq1H
MASKMOVDQU
(66)
Vdq, Vdq1H
should be
MASKMOVQ
Pq, Pq1H
MASKMOVDQU
(66)
Vdq, Wdq1H
MOVMSKPS
Gd, Vps1H
MOVMSKPD
(66)
Gd, Vpd1H
should be
MOVMSKPS
Gd, Wps1H
MOVMSKPD
(66)
Gd, Wpd1H
The opcode table entries for LFS, LGS, and LSS
L[FGS]S
Mp
should be
L[FGS]S
Gv,Mp
MOVHLPS
Vps, Vps
MOVLHPS
Vps, Vps
should be
MOVHLPS
Vps, Wps
MOVLHPS
Vps, Wps
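
The corrections above turn on which ModR/M field an operand code refers to. In the Intel opcode-map notation, G, P and V name a register selected by the `reg` field of the ModR/M byte, while E, Q and W name an operand selected by the `mod` and `r/m` fields (a register or a memory location), and M allows memory only. A minimal sketch of splitting the byte into those fields follows; `ModRM`, `split_modrm` and `rm_is_register` are hypothetical names used for illustration, not part of libdisasm.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for illustration; not a libdisasm structure.
struct ModRM {
    std::uint8_t mod; // bits 7..6: 3 = r/m names a register, 0..2 = memory forms
    std::uint8_t reg; // bits 5..3: register number (G, P, V) or opcode extension
    std::uint8_t rm;  // bits 2..0: register number or addressing form (E, Q, W, M)
};

// Split a raw ModR/M byte into its three fields.
inline ModRM split_modrm(std::uint8_t byte) {
    ModRM m;
    m.mod = static_cast<std::uint8_t>(byte >> 6);
    m.reg = static_cast<std::uint8_t>((byte >> 3) & 7);
    m.rm  = static_cast<std::uint8_t>(byte & 7);
    return m;
}

// True when the r/m operand is a register rather than a memory location.
inline bool rm_is_register(const ModRM &m) {
    return m.mod == 3;
}
```

An operand coded as M would be rejected whenever `rm_is_register` returns true, whereas E, Q and W accept both cases.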

@ -1,137 +0,0 @@
The "Clarified Artistic License"
Preamble
The intent of this document is to state the conditions under which a
Package may be copied, such that the Copyright Holder maintains some
semblance of artistic control over the development of the package,
while giving the users of the package the right to use and distribute
the Package in a more-or-less customary fashion, plus the right to make
reasonable modifications.
Definitions:
"Package" refers to the collection of files distributed by the
Copyright Holder, and derivatives of that collection of files
created through textual modification.
"Standard Version" refers to such a Package if it has not been
modified, or has been modified in accordance with the wishes
of the Copyright Holder as specified below.
"Copyright Holder" is whoever is named in the copyright or
copyrights for the package.
"You" is you, if you're thinking about copying or distributing
this Package.
"Distribution fee" is a fee you charge for providing a copy of this
Package to another party.
"Freely Available" means that no fee is charged for the right to use
the item, though there may be fees involved in handling the item.
1. You may make and give away verbatim copies of the source form of the
Standard Version of this Package without restriction, provided that you
duplicate all of the original copyright notices and associated disclaimers.
2. You may apply bug fixes, portability fixes and other modifications
derived from the Public Domain, or those made Freely Available, or from
the Copyright Holder. A Package modified in such a way shall still be
considered the Standard Version.
3. You may otherwise modify your copy of this Package in any way, provided
that you insert a prominent notice in each changed file stating how and
when you changed that file, and provided that you do at least ONE of the
following:
a) place your modifications in the Public Domain or otherwise make them
Freely Available, such as by posting said modifications to Usenet or
an equivalent medium, or placing the modifications on a major archive
site allowing unrestricted access to them, or by allowing the Copyright
Holder to include your modifications in the Standard Version of the
Package.
b) use the modified Package only within your corporation or organization.
c) rename any non-standard executables so the names do not conflict
with standard executables, which must also be provided, and provide
a separate manual page for each non-standard executable that clearly
documents how it differs from the Standard Version.
d) make other distribution arrangements with the Copyright Holder.
e) permit and encourge anyone who receives a copy of the modified Package
permission to make your modifications Freely Available in some specific
way.
4. You may distribute the programs of this Package in object code or
executable form, provided that you do at least ONE of the following:
a) distribute a Standard Version of the executables and library files,
together with instructions (in the manual page or equivalent) on where
to get the Standard Version.
b) accompany the distribution with the machine-readable source of
the Package with your modifications.
c) give non-standard executables non-standard names, and clearly
document the differences in manual pages (or equivalent), together
with instructions on where to get the Standard Version.
d) make other distribution arrangements with the Copyright Holder.
e) offer the machine-readable source of the Package, with your
modifications, by mail order.
5. You may charge a distribution fee for any distribution of this Package.
If you offer support for this Package, you may charge any fee you choose
for that support. You may not charge a license fee for the right to use
this Package itself. You may distribute this Package in aggregate with
other (possibly commercial and possibly nonfree) programs as part of a
larger (possibly commercial and possibly nonfree) software distribution,
and charge license fees for other parts of that software distribution,
provided that you do not advertise this Package as a product of your own.
If the Package includes an interpreter, You may embed this Package's
interpreter within an executable of yours (by linking); this shall be
construed as a mere form of aggregation, provided that the complete
Standard Version of the interpreter is so embedded.
6. The scripts and library files supplied as input to or produced as
output from the programs of this Package do not automatically fall
under the copyright of this Package, but belong to whoever generated
them, and may be sold commercially, and may be aggregated with this
Package. If such scripts or library files are aggregated with this
Package via the so-called "undump" or "unexec" methods of producing a
binary executable image, then distribution of such an image shall
neither be construed as a distribution of this Package nor shall it
fall under the restrictions of Paragraphs 3 and 4, provided that you do
not represent such an executable image as a Standard Version of this
Package.
7. C subroutines (or comparably compiled subroutines in other
languages) supplied by you and linked into this Package in order to
emulate subroutines and variables of the language defined by this
Package shall not be considered part of this Package, but are the
equivalent of input as in Paragraph 6, provided these subroutines do
not change the language in any way that would cause it to fail the
regression tests for the language.
8. Aggregation of the Standard Version of the Package with a commercial
distribution is always permitted provided that the use of this Package is
embedded; that is, when no overt attempt is made to make this Package's
interfaces visible to the end user of the commercial distribution.
Such use shall not be construed as a distribution of this Package.
9. The name of the Copyright Holder may not be used to endorse or promote
products derived from this software without specific prior written permission.
10. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The End

@ -1,12 +0,0 @@
The rewritten libdisasm code uses the following namespaces:
Prefix Namespace
----------------------------------------------------
x86_ Global 'libdisasm' namespace
ia32_ Internal IA32 ISA namespace
ia64_ Internal IA64 ISA namespace
ix64_ Internal X86-64 ISA namespace
Note that the 64-bit ISAs are not yet supported/written.

@ -1,2 +0,0 @@
This is a cut-up version of libdisasm originally from the bastard project http://bastard.sourceforge.net/

@ -1,43 +0,0 @@
x86_format.c
------------
intel: jmpf -> jmp, callf -> call
att: jmpf -> ljmp, callf -> lcall
opcode table
------------
finish typing instructions
fix flag clear/set/toggle types
ix64 stuff
----------
document output file formats in web page
features doc: register aliases, implicit operands, stack mods,
ring0 flags, eflags, cpu model/isa
ia32_handle_* implementation
fix operand 0F C2
CMPPS
* sysenter, sysexit as CALL types -- preceded by MSR writes
* SYSENTER/SYSEXIT stack : overwrites SS, ESP
* stos, cmps, scas, movs, ins, outs, lods -> OP_PTR
* OP_SIZE in implicit operands
* use OP_SIZE to choose reg sizes!
DONE?? :
implicit operands: provide action ?
e.g. add/inc for stack, write, etc
replace table numbers in opcodes.dat with
#defines for table names
replace 0 with INSN_INVALID [or maybe FF for invalid and 00 for Not Applicable]
no wait that is only for prefix tables -- n/p
if (prefix) only use if insn != invalid
these should cover all the wacky disasm exceptions
for the rep one we can cheat, match only a 0x90
todo: privilege | ring

@ -1,36 +0,0 @@
#include <stdio.h>
static const char * mem_fixup[256] = {
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 00 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 08 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 10 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 18 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 20 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 28 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 30 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 38 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 40 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 48 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 50 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 58 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 60 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 68 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 70 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 78 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 80 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 88 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 90 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 98 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* A0 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* A8 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* B0 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* B8 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* C0 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* C8 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* D0 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* D8 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* E0 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* E8 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* F0 */
NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL /* F8 */
};

@ -20,81 +20,81 @@ typedef struct {
static op_implicit_list_t list_aaa[] =
/* 37 : AAA : rw AL */
/* 3F : AAS : rw AL */
{{ OP_R | OP_W, REG_BYTE_OFFSET }, {0,0}}; /* aaa */
{{ OP_R | OP_W, REG_BYTE_OFFSET }, {0}}; /* aaa */
static op_implicit_list_t list_aad[] =
/* D5 0A, D5 (ib) : AAD : rw AX */
/* D4 0A, D4 (ib) : AAM : rw AX */
{{ OP_R | OP_W, REG_WORD_OFFSET }, {0,0}}; /* aad */
{{ OP_R | OP_W, REG_WORD_OFFSET }, {0}}; /* aad */
static op_implicit_list_t list_call[] =
/* E8, FF, 9A, FF : CALL : rw ESP, rw EIP */
/* C2, C3, CA, CB : RET : rw ESP, rw EIP */
{{ OP_R | OP_W, REG_EIP_INDEX },
{ OP_R | OP_W, REG_ESP_INDEX }, {0,0}}; /* call, ret */
{ OP_R | OP_W, REG_ESP_INDEX }, {0}}; /* call, ret */
static op_implicit_list_t list_cbw[] =
/* 98 : CBW : r AL, rw AX */
{{ OP_R | OP_W, REG_WORD_OFFSET },
{ OP_R, REG_BYTE_OFFSET}, {0,0}}; /* cbw */
{ OP_R, REG_BYTE_OFFSET}, {0}}; /* cbw */
static op_implicit_list_t list_cwde[] =
/* 98 : CWDE : r AX, rw EAX */
{{ OP_R | OP_W, REG_DWORD_OFFSET },
{ OP_R, REG_WORD_OFFSET }, {0,0}}; /* cwde */
{ OP_R, REG_WORD_OFFSET }, {0}}; /* cwde */
static op_implicit_list_t list_clts[] =
/* 0F 06 : CLTS : rw CR0 */
{{ OP_R | OP_W, REG_CTRL_OFFSET}, {0,0}}; /* clts */
{{ OP_R | OP_W, REG_CTRL_OFFSET}, {0}}; /* clts */
static op_implicit_list_t list_cmpxchg[] =
/* 0F B0 : CMPXCHG : rw AL */
{{ OP_R | OP_W, REG_BYTE_OFFSET }, {0,0}}; /* cmpxchg */
{{ OP_R | OP_W, REG_BYTE_OFFSET }, {0}}; /* cmpxchg */
static op_implicit_list_t list_cmpxchgb[] =
/* 0F B1 : CMPXCHG : rw EAX */
{{ OP_R | OP_W, REG_DWORD_OFFSET }, {0,0}}; /* cmpxchg */
{{ OP_R | OP_W, REG_DWORD_OFFSET }, {0}}; /* cmpxchg */
static op_implicit_list_t list_cmpxchg8b[] =
/* 0F C7 : CMPXCHG8B : rw EDX, rw EAX, r ECX, r EBX */
{{ OP_R | OP_W, REG_DWORD_OFFSET },
{ OP_R | OP_W, REG_DWORD_OFFSET + 2 },
{ OP_R, REG_DWORD_OFFSET + 1 },
{ OP_R, REG_DWORD_OFFSET + 3 }, {0,0}}; /* cmpxchg8b */
{ OP_R, REG_DWORD_OFFSET + 3 }, {0}}; /* cmpxchg8b */
static op_implicit_list_t list_cpuid[] =
/* 0F A2 : CPUID : rw EAX, w EBX, w ECX, w EDX */
{{ OP_R | OP_W, REG_DWORD_OFFSET },
{ OP_W, REG_DWORD_OFFSET + 1 },
{ OP_W, REG_DWORD_OFFSET + 2 },
{ OP_W, REG_DWORD_OFFSET + 3 }, {0,0}}; /* cpuid */
{ OP_W, REG_DWORD_OFFSET + 3 }, {0}}; /* cpuid */
static op_implicit_list_t list_cwd[] =
/* 99 : CWD/CWQ : rw EAX, w EDX */
{{ OP_R | OP_W, REG_DWORD_OFFSET },
{ OP_W, REG_DWORD_OFFSET + 2 }, {0,0}}; /* cwd */
{ OP_W, REG_DWORD_OFFSET + 2 }, {0}}; /* cwd */
static op_implicit_list_t list_daa[] =
/* 27 : DAA : rw AL */
/* 2F : DAS : rw AL */
{{ OP_R | OP_W, REG_BYTE_OFFSET }, {0,0}}; /* daa */
{{ OP_R | OP_W, REG_BYTE_OFFSET }, {0}}; /* daa */
static op_implicit_list_t list_idiv[] =
/* F6 : DIV, IDIV : r AX, w AL, w AH */
/* FIXED: first op was EAX, not Aw. TODO: verify! */
{{ OP_R, REG_WORD_OFFSET },
{ OP_W, REG_BYTE_OFFSET },
{ OP_W, REG_BYTE_OFFSET + 4 }, {0,0}}; /* div */
{ OP_W, REG_BYTE_OFFSET + 4 }, {0}}; /* div */
static op_implicit_list_t list_div[] =
/* F7 : DIV, IDIV : rw EDX, rw EAX */
{{ OP_R | OP_W, REG_DWORD_OFFSET + 2 },
{ OP_R | OP_W, REG_DWORD_OFFSET }, {0,0}}; /* div */
{ OP_R | OP_W, REG_DWORD_OFFSET }, {0}}; /* div */
static op_implicit_list_t list_enter[] =
/* C8 : ENTER : rw ESP w EBP */
{{ OP_R | OP_W, REG_DWORD_OFFSET + 4 },
{ OP_R, REG_DWORD_OFFSET + 5 }, {0,0}}; /* enter */
{ OP_R, REG_DWORD_OFFSET + 5 }, {0}}; /* enter */
static op_implicit_list_t list_f2xm1[] =
/* D9 F0 : F2XM1 : rw ST(0) */
@ -109,7 +109,7 @@ static op_implicit_list_t list_f2xm1[] =
/* D9 FE : FSIN : rw ST(0) */
/* D9 FA : FSQRT : rw ST(0) */
/* D9 F4 : FXTRACT : rw ST(0) */
{{ OP_R | OP_W, REG_FPU_OFFSET }, {0,0}}; /* f2xm1 */
{{ OP_R | OP_W, REG_FPU_OFFSET }, {0}}; /* f2xm1 */
static op_implicit_list_t list_fcom[] =
/* D8, DC, DE D9 : FCOM : r ST(0) */
@ -117,17 +117,17 @@ static op_implicit_list_t list_fcom[] =
/* DF, D8 : FIST : r ST(0) */
/* D9 E4 : FTST : r ST(0) */
/* D9 E5 : FXAM : r ST(0) */
{{ OP_R, REG_FPU_OFFSET }, {0,0}}; /* fcom */
{{ OP_R, REG_FPU_OFFSET }, {0}}; /* fcom */
static op_implicit_list_t list_fpatan[] =
/* D9 F3 : FPATAN : r ST(0), rw ST(1) */
{{ OP_R, REG_FPU_OFFSET }, {0,0}}; /* fpatan */
{{ OP_R, REG_FPU_OFFSET }, {0}}; /* fpatan */
static op_implicit_list_t list_fprem[] =
/* D9 F8, D9 F5 : FPREM : rw ST(0) r ST(1) */
/* D9 FD : FSCALE : rw ST(0), r ST(1) */
{{ OP_R | OP_W, REG_FPU_OFFSET },
{ OP_R, REG_FPU_OFFSET + 1 }, {0,0}}; /* fprem */
{ OP_R, REG_FPU_OFFSET + 1 }, {0}}; /* fprem */
static op_implicit_list_t list_faddp[] =
/* DE C1 : FADDP : r ST(0), rw ST(1) */
@ -135,67 +135,67 @@ static op_implicit_list_t list_faddp[] =
/* D9 F1 : FYL2X : r ST(0), rw ST(1) */
/* D9 F9 : FYL2XP1 : r ST(0), rw ST(1) */
{{ OP_R, REG_FPU_OFFSET },
{ OP_R | OP_W, REG_FPU_OFFSET + 1 }, {0,0}}; /* faddp */
{ OP_R | OP_W, REG_FPU_OFFSET + 1 }, {0}}; /* faddp */
static op_implicit_list_t list_fucompp[] =
/* DA E9 : FUCOMPP : r ST(0), r ST(1) */
{{ OP_R, REG_FPU_OFFSET },
{ OP_R, REG_FPU_OFFSET + 1 }, {0,0}}; /* fucompp */
{ OP_R, REG_FPU_OFFSET + 1 }, {0}}; /* fucompp */
static op_implicit_list_t list_imul[] =
/* F6 : IMUL : r AL, w AX */
/* F6 : MUL : r AL, w AX */
{{ OP_R, REG_BYTE_OFFSET },
{ OP_W, REG_WORD_OFFSET }, {0,0}}; /* imul */
{ OP_W, REG_WORD_OFFSET }, {0}}; /* imul */
static op_implicit_list_t list_mul[] =
/* F7 : IMUL : rw EAX, w EDX */
/* F7 : MUL : rw EAX, w EDX */
{{ OP_R | OP_W, REG_DWORD_OFFSET },
{ OP_W, REG_DWORD_OFFSET + 2 }, {0,0}}; /* imul */
{ OP_W, REG_DWORD_OFFSET + 2 }, {0}}; /* imul */
static op_implicit_list_t list_lahf[] =
/* 9F : LAHF : r EFLAGS, w AH */
{{ OP_R, REG_FLAGS_INDEX },
{ OP_W, REG_BYTE_OFFSET + 4 }, {0,0}}; /* lahf */
{ OP_W, REG_BYTE_OFFSET + 4 }, {0}}; /* lahf */
static op_implicit_list_t list_ldmxcsr[] =
/* 0F AE : LDMXCSR : w MXCSR SSE Control Status Reg */
{{ OP_W, REG_MXCSG_INDEX }, {0,0}}; /* ldmxcsr */
{{ OP_W, REG_MXCSG_INDEX }, {0}}; /* ldmxcsr */
static op_implicit_list_t list_leave[] =
/* C9 : LEAVE : rw ESP, w EBP */
{{ OP_R | OP_W, REG_ESP_INDEX },
{ OP_W, REG_DWORD_OFFSET + 5 }, {0,0}}; /* leave */
{ OP_W, REG_DWORD_OFFSET + 5 }, {0}}; /* leave */
static op_implicit_list_t list_lgdt[] =
/* 0F 01 : LGDT : w GDTR */
{{ OP_W, REG_GDTR_INDEX }, {0,0}}; /* lgdt */
{{ OP_W, REG_GDTR_INDEX }, {0}}; /* lgdt */
static op_implicit_list_t list_lidt[] =
/* 0F 01 : LIDT : w IDTR */
{{ OP_W, REG_IDTR_INDEX }, {0,0}}; /* lidt */
{{ OP_W, REG_IDTR_INDEX }, {0}}; /* lidt */
static op_implicit_list_t list_lldt[] =
/* 0F 00 : LLDT : w LDTR */
{{ OP_W, REG_LDTR_INDEX }, {0,0}}; /* lldt */
{{ OP_W, REG_LDTR_INDEX }, {0}}; /* lldt */
static op_implicit_list_t list_lmsw[] =
/* 0F 01 : LMSW : w CR0 */
{{ OP_W, REG_CTRL_OFFSET }, {0,0}}; /* lmsw */
{{ OP_W, REG_CTRL_OFFSET }, {0}}; /* lmsw */
static op_implicit_list_t list_loop[] =
/* E0, E1, E2 : LOOP : rw ECX */
{{ OP_R | OP_W, REG_DWORD_OFFSET + 1 }, {0,0}};/* loop */
{{ OP_R | OP_W, REG_DWORD_OFFSET + 1 }, {0}};/* loop */
static op_implicit_list_t list_ltr[] =
/* 0F 00 : LTR : w Task Register */
{{ OP_W, REG_TR_INDEX }, {0,0}}; /* ltr */
{{ OP_W, REG_TR_INDEX }, {0}}; /* ltr */
static op_implicit_list_t list_pop[] =
/* 8F, 58, 1F, 07, 17, 0F A1, 0F A9 : POP : rw ESP */
/* FF, 50, 6A, 68, 0E, 16, 1E, 06, 0F A0, 0F A8 : PUSH : rw ESP */
{{ OP_R | OP_W, REG_ESP_INDEX }, {0,0}}; /* pop, push */
{{ OP_R | OP_W, REG_ESP_INDEX }, {0}}; /* pop, push */
static op_implicit_list_t list_popad[] =
/* 61 : POPAD : rw esp, w edi esi ebp ebx edx ecx eax */
@ -206,12 +206,12 @@ static op_implicit_list_t list_popad[] =
{ OP_W, REG_DWORD_OFFSET + 3 },
{ OP_W, REG_DWORD_OFFSET + 2 },
{ OP_W, REG_DWORD_OFFSET + 1 },
{ OP_W, REG_DWORD_OFFSET }, {0,0}}; /* popad */
{ OP_W, REG_DWORD_OFFSET }, {0}}; /* popad */
static op_implicit_list_t list_popfd[] =
/* 9D : POPFD : rw esp, w eflags */
{{ OP_R | OP_W, REG_ESP_INDEX },
{ OP_W, REG_FLAGS_INDEX }, {0,0}}; /* popfd */
{ OP_W, REG_FLAGS_INDEX }, {0}}; /* popfd */
static op_implicit_list_t list_pushad[] =
/* FF, 50, 6A, 68, 0E, 16, 1E, 06, 0F A0, 0F A8 : PUSH : rw ESP */
@ -223,102 +223,102 @@ static op_implicit_list_t list_pushad[] =
{ OP_R, REG_DWORD_OFFSET + 3 },
{ OP_R, REG_DWORD_OFFSET + 5 },
{ OP_R, REG_DWORD_OFFSET + 6 },
{ OP_R, REG_DWORD_OFFSET + 7 }, {0,0}}; /* pushad */
{ OP_R, REG_DWORD_OFFSET + 7 }, {0}}; /* pushad */
static op_implicit_list_t list_pushfd[] =
/* 9C : PUSHFD : rw esp, r eflags */
{{ OP_R | OP_W, REG_ESP_INDEX },
{ OP_R, REG_FLAGS_INDEX }, {0,0}}; /* pushfd */
{ OP_R, REG_FLAGS_INDEX }, {0}}; /* pushfd */
static op_implicit_list_t list_rdmsr[] =
/* 0F 32 : RDMSR : r ECX, w EDX, w EAX */
{{ OP_R, REG_DWORD_OFFSET + 1 },
{ OP_W, REG_DWORD_OFFSET + 2 },
{ OP_W, REG_DWORD_OFFSET }, {0,0}}; /* rdmsr */
{ OP_W, REG_DWORD_OFFSET }, {0}}; /* rdmsr */
static op_implicit_list_t list_rdpmc[] =
/* 0F 33 : RDPMC : r ECX, w EDX, w EAX */
{{ OP_R, REG_DWORD_OFFSET + 1 },
{ OP_W, REG_DWORD_OFFSET + 2 },
{ OP_W, REG_DWORD_OFFSET }, {0,0}}; /* rdpmc */
{ OP_W, REG_DWORD_OFFSET }, {0}}; /* rdpmc */
static op_implicit_list_t list_rdtsc[] =
/* 0F 31 : RDTSC : rw EDX, rw EAX */
{{ OP_R | OP_W, REG_DWORD_OFFSET + 2 },
{ OP_R | OP_W, REG_DWORD_OFFSET }, {0,0}}; /* rdtsc */
{ OP_R | OP_W, REG_DWORD_OFFSET }, {0}}; /* rdtsc */
static op_implicit_list_t list_rep[] =
/* F3, F2 ... : REP : rw ECX */
{{ OP_R | OP_W, REG_DWORD_OFFSET + 1 }, {0,0}};/* rep */
{{ OP_R | OP_W, REG_DWORD_OFFSET + 1 }, {0}};/* rep */
static op_implicit_list_t list_rsm[] =
/* 0F AA : RSM : r CR4, r CR0 */
{{ OP_R, REG_CTRL_OFFSET + 4 },
{ OP_R, REG_CTRL_OFFSET }, {0,0}}; /* rsm */
{ OP_R, REG_CTRL_OFFSET }, {0}}; /* rsm */
static op_implicit_list_t list_sahf[] =
/* 9E : SAHF : r ah, rw eflags (set SF ZF AF PF CF) */
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* sahf */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* sahf */
static op_implicit_list_t list_sgdt[] =
/* 0F : SGDT : r gdtr */
/* TODO: finish this! */
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* sgdt */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* sgdt */
static op_implicit_list_t list_sidt[] =
/* 0F : SIDT : r idtr */
/* TODO: finish this! */
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* sidt */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* sidt */
static op_implicit_list_t list_sldt[] =
/* 0F : SLDT : r ldtr */
/* TODO: finish this! */
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* sldt */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* sldt */
static op_implicit_list_t list_smsw[] =
/* 0F : SMSW : r CR0 */
/* TODO: finish this! */
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* smsw */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* smsw */
static op_implicit_list_t list_stmxcsr[] =
/* 0F AE : STMXCSR : r MXCSR */
/* TODO: finish this! */
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* stmxcsr */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* stmxcsr */
static op_implicit_list_t list_str[] =
/* 0F 00 : STR : r TR (task register) */
/* TODO: finish this! */
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* str */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* str */
static op_implicit_list_t list_sysenter[] =
/* 0F 34 : SYSENTER : w cs, w eip, w ss, w esp, r CR0, w eflags
* r sysenter_cs_msr, sysenter_esp_msr, sysenter_eip_msr */
/* TODO: finish this! */
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* sysenter */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* sysenter */
static op_implicit_list_t list_sysexit[] =
/* 0F 35 : SYSEXIT : r edx, r ecx, w cs, w eip, w ss, w esp
* r sysenter_cs_msr */
/* TODO: finish this! */
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* sysexit */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* sysexit */
static op_implicit_list_t list_wrmsr[] =
/* 0F 30 : WRMST : r edx, r eax, r ecx */
/* TODO: finish this! */
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* wrmsr */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* wrmsr */
static op_implicit_list_t list_xlat[] =
/* D7 : XLAT : rw al r ebx (ptr) */
/* TODO: finish this! */
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* xlat */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* xlat */
/* TODO:
* monitor 0f 01 c8 eax OP_R ecx OP_R edx OP_R
* mwait 0f 01 c9 eax OP_R ecx OP_R
*/
static op_implicit_list_t list_monitor[] =
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* monitor */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* monitor */
static op_implicit_list_t list_mwait[] =
{{ OP_R, REG_DWORD_OFFSET }, {0,0}}; /* mwait */
{{ OP_R, REG_DWORD_OFFSET }, {0}}; /* mwait */
op_implicit_list_t *op_implicit_list[] = {
/* This is a list of implicit operands which are read/written by
@ -407,21 +407,7 @@ unsigned int Ia32_Decoder::ia32_insn_implicit_ops( unsigned int impl_idx ) {
if (!op) {
op = m_decoded->x86_operand_new();
/* all implicit operands are registers */
if(m_decoded->addr_size==2)
{
if(list->operand==REG_EIP_INDEX)
handle_impl_reg( op, REG_IP_INDEX );
else if(list->operand<REG_WORD_OFFSET)
{
handle_impl_reg( op, (list->operand-REG_DWORD_OFFSET)+REG_WORD_OFFSET);
assert((list->operand-REG_DWORD_OFFSET)<REG_WORD_OFFSET-REG_DWORD_OFFSET);
}
else
handle_impl_reg( op, list->operand);
}
else
handle_impl_reg( op, list->operand );
handle_impl_reg( op, list->operand );
/* decrement the 'explicit count' incremented by default in
* x86_operand_new */
m_decoded->explicit_count = m_decoded->explicit_count -1;
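
Every list in this file ends with an all-zero entry, spelled `{0,0}` on one side of the diff and `{0}` on the other. In C and C++ aggregate initialization, members without an explicit initializer are zero-initialized, so both spellings produce the same terminator. The sketch below assumes that consumers walk a list until they reach an entry whose access type is zero; the struct, the constants and `count_implicit` are hypothetical stand-ins, not the library's definitions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of a two-field implicit-operand entry.
struct implicit_op {
    unsigned int type;    // access flags; 0 marks the end of the list
    unsigned int operand; // register id
};

// Hypothetical constants, chosen only for this example.
const unsigned int ACC_R = 1;
const unsigned int ACC_W = 2;
const unsigned int REG_A = 1;
const unsigned int REG_B = 2;

// Terminator written with one initializer: operand is zero-initialized.
static const implicit_op list_short[] = {
    { ACC_R | ACC_W, REG_A }, {0}
};

// Terminator written with both initializers: same resulting entry.
static const implicit_op list_long[] = {
    { ACC_R | ACC_W, REG_A }, { ACC_R, REG_B }, {0, 0}
};

// Count entries up to, but not including, the zero terminator.
inline std::size_t count_implicit(const implicit_op *list) {
    std::size_t n = 0;
    while (list->type != 0) {
        ++n;
        ++list;
    }
    return n;
}
```

Some compilers warn about the shorter form (for example with a missing-field-initializers warning), which is one plausible reason for preferring `{0,0}`.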

@ -240,7 +240,7 @@ void Ia32_Decoder::ia32_handle_prefix( unsigned int prefixes ) {
}
static void reg_32_to_16( x86_op_t *op, x86_insn_t */*insn*/, void */*arg*/ ) {
static void reg_32_to_16( x86_op_t *op, x86_insn_t *insn, void *arg ) {
/* if this is a 32-bit register and it is a general register ... */
if ( op->type == op_register && op->data.reg.size == 4 &&
@ -539,11 +539,12 @@ size_t ia32_table_lookup( unsigned char *buf, size_t buf_len,
size_t Ia32_Decoder::handle_insn_suffix( unsigned char *buf, size_t buf_len,
ia32_insn_t *raw_insn ) {
// ia32_table_desc_t *table_desc;
ia32_table_desc_t *table_desc;
ia32_insn_t *sfx_insn;
size_t size;
unsigned int prefixes = 0;
//table_desc = &ia32_tables[raw_insn->table];
table_desc = &ia32_tables[raw_insn->table];
size = ia32_table_lookup( buf, buf_len, raw_insn->table, &sfx_insn,
&prefixes );
if (size == INVALID_INSN || sfx_insn->mnem_flag == INS_INVALID ) {

@ -137,7 +137,7 @@ static int ia32_invariant_modrm( unsigned char *in, unsigned char *out,
}
static int ia32_decode_invariant( unsigned char *buf, size_t /*buf_len*/,
static int ia32_decode_invariant( unsigned char *buf, size_t buf_len,
ia32_insn_t *t, unsigned char *out,
unsigned int prefixes, x86_invariant_t *inv) {
@ -251,13 +251,13 @@ static int ia32_decode_invariant( unsigned char *buf, size_t /*buf_len*/,
case ADDRMETH_X:
inv->operands[x].flags.op_signed=true;
inv->operands[x].flags.op_pointer=true;
inv->operands[x].flags.op_seg=(x86_op_flags::op_ds_seg)>>8;
inv->operands[x].flags.op_seg=x86_op_flags::op_ds_seg;
inv->operands[x].flags.op_string=true;
break;
case ADDRMETH_Y:
inv->operands[x].flags.op_signed=true;
inv->operands[x].flags.op_pointer=true;
inv->operands[x].flags.op_seg=x86_op_flags::op_es_seg>>8;
inv->operands[x].flags.op_seg=x86_op_flags::op_es_seg;
inv->operands[x].flags.op_string=true;
break;
case ADDRMETH_RR:
@ -307,7 +307,6 @@ size_t ia32_disasm_invariant( unsigned char * buf, size_t buf_len,
}
size_t ia32_disasm_size( unsigned char *buf, size_t buf_len ) {
x86_invariant_t inv;
memset(&inv,0,sizeof(x86_invariant_t));
x86_invariant_t inv = { {0} };
return( ia32_disasm_invariant( buf, buf_len, &inv ) );
}
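
The last hunk swaps between two ways of zeroing the local `inv` before use: a declaration followed by `memset`, and the aggregate initializer `= { {0} }`. For a plain aggregate, both leave every named member zero. A minimal sketch with a hypothetical stand-in struct (the real `x86_invariant_t` has different members):

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for illustration only.
struct invariant_like {
    unsigned char bytes[64];
    unsigned int  size;
    int           type;
};

// Zero the object byte by byte; valid for trivially copyable types.
inline invariant_like make_with_memset() {
    invariant_like inv;
    std::memset(&inv, 0, sizeof inv);
    return inv;
}

// Zero the object through aggregate initialization: the first array
// element is set to 0 and everything not mentioned is zero-initialized.
inline invariant_like make_with_braces() {
    invariant_like inv = { {0} };
    return inv;
}
```

The brace form only compiles while the type remains an aggregate, and `memset` is only well-defined while the type remains trivially copyable, so which spelling fits depends on how the struct is declared on each side of the diff.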

@ -155,12 +155,12 @@ static size_t modrm_decode16( unsigned char *buf, unsigned int buf_len,
ia32_handle_register(&ea->base, REG_WORD_OFFSET + 3);
ia32_handle_register(&ea->index, REG_WORD_OFFSET + 7);
case MOD16_RM_BPSI:
op->flags.op_seg = x86_op_flags::op_ss_seg>>8;
op->flags.op_seg = x86_op_flags::op_ss_seg;
ia32_handle_register(&ea->base, REG_WORD_OFFSET + 5);
ia32_handle_register(&ea->index, REG_WORD_OFFSET + 6);
break;
case MOD16_RM_BPDI:
op->flags.op_seg = x86_op_flags::op_ss_seg>>8;
op->flags.op_seg = x86_op_flags::op_ss_seg;
ia32_handle_register(&ea->base, REG_WORD_OFFSET + 5);
ia32_handle_register(&ea->index, REG_WORD_OFFSET + 7);
break;
@ -172,7 +172,7 @@ static size_t modrm_decode16( unsigned char *buf, unsigned int buf_len,
break;
case MOD16_RM_BP:
if ( modrm->mod != MOD16_MOD_NODISP ) {
op->flags.op_seg = x86_op_flags::op_ss_seg>>8;
op->flags.op_seg = x86_op_flags::op_ss_seg;
ia32_handle_register(&ea->base,
REG_WORD_OFFSET + 5);
}
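
The `op_ss_seg` assignments in `modrm_decode16` reflect a rule of 16-bit addressing: effective addresses that use BP as the base default to the SS segment, and all other forms default to DS. The sketch below tabulates the eight `r/m` encodings for `mod` values 0 to 2; `Seg`, `Ea16` and `ea16_form` are hypothetical names for illustration and do not use the library's register ids.

```cpp
#include <cassert>
#include <cstdint>

enum class Seg { DS, SS };

// Hypothetical description of one 16-bit effective-address form.
struct Ea16 {
    const char *base;  // nullptr when the form has no base register
    const char *index; // nullptr when the form has no index register
    Seg         seg;   // default segment before any override prefix
};

// Map the ModR/M mod and r/m fields to a 16-bit addressing form.
// Only meaningful for mod 0..2; mod 3 names a register, not memory.
inline Ea16 ea16_form(std::uint8_t mod, std::uint8_t rm) {
    switch (rm & 7) {
    case 0: return { "bx", "si", Seg::DS };
    case 1: return { "bx", "di", Seg::DS };
    case 2: return { "bp", "si", Seg::SS };
    case 3: return { "bp", "di", Seg::SS };
    case 4: return { "si", nullptr, Seg::DS };
    case 5: return { "di", nullptr, Seg::DS };
    case 6:
        // mod 0 with r/m 6 is a bare 16-bit displacement, not [bp].
        if (mod == 0) return { nullptr, nullptr, Seg::DS };
        return { "bp", nullptr, Seg::SS };
    default: return { "bx", nullptr, Seg::DS };
    }
}
```

The special case for `r/m` 6 matches the `modrm->mod != MOD16_MOD_NODISP` test in the `MOD16_RM_BP` branch above.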

File diff suppressed because it is too large

File diff suppressed because it is too large

@ -20,17 +20,17 @@ static void apply_seg( x86_op_t *op, unsigned int prefixes ) {
switch ( prefixes & PREFIX_REG_MASK ) {
/* NOTE: that op->flags for segment override are not a bitfield */
case PREFIX_CS:
op->flags.op_seg = x86_op_flags::op_cs_seg>>8; break;
op->flags.op_seg = x86_op_flags::op_cs_seg; break;
case PREFIX_SS:
op->flags.op_seg = x86_op_flags::op_ss_seg>>8; break;
op->flags.op_seg = x86_op_flags::op_ss_seg; break;
case PREFIX_DS:
op->flags.op_seg = x86_op_flags::op_ds_seg>>8; break;
op->flags.op_seg = x86_op_flags::op_ds_seg; break;
case PREFIX_ES:
op->flags.op_seg = x86_op_flags::op_es_seg>>8; break;
op->flags.op_seg = x86_op_flags::op_es_seg; break;
case PREFIX_FS:
op->flags.op_seg = x86_op_flags::op_fs_seg>>8; break;
op->flags.op_seg = x86_op_flags::op_fs_seg; break;
case PREFIX_GS:
op->flags.op_seg = x86_op_flags::op_gs_seg>>8; break;
op->flags.op_seg = x86_op_flags::op_gs_seg; break;
}
return;
@ -107,17 +107,19 @@ size_t Ia32_Decoder::decode_operand_value( unsigned char *buf, size_t buf_len,
/* No MODRM : note these set operand type explicitly */
case ADDRMETH_A: /* No modR/M -- direct addr */
op->type = op_absolute;
//according to Intel Manuals, offset goes first
/* segment:offset address used in far calls */
x86_imm_sized( buf, buf_len,
&op->data.absolute.segment, 2 );
if ( m_decoded->addr_size == 4 ) {
x86_imm_sized( buf, buf_len, &op->data.absolute.offset.off32, 4 );
size = 4;
x86_imm_sized( buf, buf_len,
&op->data.absolute.offset.off32, 4 );
size = 6;
} else {
x86_imm_sized( buf, buf_len, &op->data.absolute.offset.off16, 2 );
size = 2;
x86_imm_sized( buf, buf_len,
&op->data.absolute.offset.off16, 2 );
size = 4;
}
x86_imm_sized( buf+size, buf_len-size, &op->data.absolute.segment, 2 );
size+=2;
break;
case ADDRMETH_I: /* Immediate val */
@ -134,28 +136,21 @@ size_t Ia32_Decoder::decode_operand_value( unsigned char *buf, size_t buf_len,
size = op_size;
break;
case ADDRMETH_J: /* Rel offset to add to IP [jmp] */
/* this fills op->data.near_offset or
/* this fills op->data.near_offset or
op->data.far_offset depending on the size of
the operand */
op->flags.op_signed = true;
switch(op_size)
{
case 1:
/* one-byte near offset */
op->type = op_relative_near;
size = x86_imm_signsized(buf, buf_len, &op->data.relative_near, 1);
break;
case 2:
/* far offset...is this truly signed? */
op->type = op_relative_far;
int16_t offset_val; // easier upcast to int32_t
size = x86_imm_signsized(buf, buf_len, &offset_val, 2 );
op->data.relative_far=offset_val;
break;
default:
assert(false);
size=0;
if ( op_size == 1 ) {
/* one-byte near offset */
op->type = op_relative_near;
x86_imm_signsized(buf, buf_len, &op->data.relative_near, 1);
} else {
/* far offset...is this truly signed? */
op->type = op_relative_far;
x86_imm_signsized(buf, buf_len,
&op->data.relative_far, op_size );
}
size = op_size;
break;
case ADDRMETH_O: /* No ModR/M; op is word/dword offset */
/* NOTE: these are actually RVAs not offsets to seg!! */
@ -177,20 +172,20 @@ size_t Ia32_Decoder::decode_operand_value( unsigned char *buf, size_t buf_len,
case ADDRMETH_X: /* Memory addressed by DS:SI [string] */
op->type = op_expression;
op->flags.op_hardcode = true;
op->flags.op_seg = x86_op_flags::op_ds_seg>>8;
op->flags.op_seg = x86_op_flags::op_ds_seg;
op->flags.op_pointer = true;
op->flags.op_string = true;
ia32_handle_register( &op->data.expression.base,
gen_regs + 6 );
REG_DWORD_OFFSET + 6 );
break;
case ADDRMETH_Y: /* Memory addressed by ES:DI [string] */
op->type = op_expression;
op->flags.op_hardcode = true;
op->flags.op_seg = x86_op_flags::op_es_seg>>8;
op->flags.op_seg = x86_op_flags::op_es_seg;
op->flags.op_pointer = true;
op->flags.op_string = true;
ia32_handle_register( &op->data.expression.base,
gen_regs + 7 );
REG_DWORD_OFFSET + 7 );
break;
case ADDRMETH_RR: /* Gen Register hard-coded in opcode */
op->type = op_register;
@ -260,10 +255,10 @@ size_t Ia32_Decoder::decode_operand_size( unsigned int op_type, x86_op_t *op ) {
break;
case OPTYPE_p: /* 32/48-bit ptr [op size attr] */
/* technically these flags are not accurate: the
* value s a 16:16 pointer or a 16:32 pointer, where
* the first '16' is a segment */
* value s a 16:16 pointer or a 16:32 pointer, where
* the first '16' is a segment */
size = (m_decoded->addr_size == 4) ? 6 : 4;
op->datatype = (size == 6) ? op_descr32 : op_descr16;
op->datatype = (size == 4) ? op_descr32 : op_descr16;
break;
case OPTYPE_b: /* byte, ignore op-size */
size = 1;
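
The `ADDRMETH_A` hunk above differs in the order in which the two parts of a direct far pointer are read. As the comment on one side notes, the Intel manuals place the offset first in the instruction stream, followed by the 16-bit segment selector, both little-endian. A minimal sketch of that layout follows; `FarPtr` and `read_far_ptr` are hypothetical names and do not reproduce the `x86_imm_sized` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type for illustration.
struct FarPtr {
    std::uint16_t segment;
    std::uint32_t offset;
};

// Read a far pointer operand: offset_size bytes of offset (2 or 4),
// then 2 bytes of segment selector. Returns the number of bytes
// consumed, or 0 if the arguments are invalid or the buffer is short.
inline std::size_t read_far_ptr(const unsigned char *buf, std::size_t len,
                                unsigned offset_size, FarPtr *out) {
    if ((offset_size != 2 && offset_size != 4) || len < offset_size + 2)
        return 0;

    std::uint32_t off = 0;
    for (unsigned i = 0; i < offset_size; ++i)
        off |= static_cast<std::uint32_t>(buf[i]) << (8 * i);

    out->offset  = off;
    out->segment = static_cast<std::uint16_t>(
        buf[offset_size] | (buf[offset_size + 1] << 8));
    return offset_size + 2;
}
```

The total operand length is 4 bytes with a 16-bit offset and 6 bytes with a 32-bit offset, which corresponds to the `size` values 4 and 6 that appear in the hunk.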

@ -31,45 +31,45 @@
* of the MMX registers, so this aliasing is not 100% accurate.
* */
static struct {
unsigned char alias; /* id of register this is an alias for */
unsigned char shift; /* # of bits register must be shifted */
unsigned char alias; /* id of register this is an alias for */
unsigned char shift; /* # of bits register must be shifted */
} ia32_reg_aliases[] = {
{ 0,0 },
{ REG_DWORD_OFFSET, 0 }, /* al : 1 */
{ REG_DWORD_OFFSET, 8 }, /* ah : 2 */
{ REG_DWORD_OFFSET, 0 }, /* ax : 3 */
{ REG_DWORD_OFFSET + 1, 0 }, /* cl : 4 */
{ REG_DWORD_OFFSET + 1, 8 }, /* ch : 5 */
{ REG_DWORD_OFFSET + 1, 0 }, /* cx : 6 */
{ REG_DWORD_OFFSET + 2, 0 }, /* dl : 7 */
{ REG_DWORD_OFFSET + 2, 8 }, /* dh : 8 */
{ REG_DWORD_OFFSET + 2, 0 }, /* dx : 9 */
{ REG_DWORD_OFFSET + 3, 0 }, /* bl : 10 */
{ REG_DWORD_OFFSET + 3, 8 }, /* bh : 11 */
{ REG_DWORD_OFFSET + 3, 0 }, /* bx : 12 */
{ REG_DWORD_OFFSET + 4, 0 }, /* sp : 13 */
{ REG_DWORD_OFFSET + 5, 0 }, /* bp : 14 */
{ REG_DWORD_OFFSET + 6, 0 }, /* si : 15 */
{ REG_DWORD_OFFSET + 7, 0 }, /* di : 16 */
{ REG_EIP_INDEX, 0 }, /* ip : 17 */
{ REG_FPU_OFFSET, 0 }, /* mm0 : 18 */
{ REG_FPU_OFFSET + 1, 0 }, /* mm1 : 19 */
{ REG_FPU_OFFSET + 2, 0 }, /* mm2 : 20 */
{ REG_FPU_OFFSET + 3, 0 }, /* mm3 : 21 */
{ REG_FPU_OFFSET + 4, 0 }, /* mm4 : 22 */
{ REG_FPU_OFFSET + 5, 0 }, /* mm5 : 23 */
{ REG_FPU_OFFSET + 6, 0 }, /* mm6 : 24 */
{ REG_FPU_OFFSET + 7, 0 } /* mm7 : 25 */
};
{ 0,0 },
{ REG_DWORD_OFFSET, 0 }, /* al : 1 */
{ REG_DWORD_OFFSET, 8 }, /* ah : 2 */
{ REG_DWORD_OFFSET, 0 }, /* ax : 3 */
{ REG_DWORD_OFFSET + 1, 0 }, /* cl : 4 */
{ REG_DWORD_OFFSET + 1, 8 }, /* ch : 5 */
{ REG_DWORD_OFFSET + 1, 0 }, /* cx : 6 */
{ REG_DWORD_OFFSET + 2, 0 }, /* dl : 7 */
{ REG_DWORD_OFFSET + 2, 8 }, /* dh : 8 */
{ REG_DWORD_OFFSET + 2, 0 }, /* dx : 9 */
{ REG_DWORD_OFFSET + 3, 0 }, /* bl : 10 */
{ REG_DWORD_OFFSET + 3, 8 }, /* bh : 11 */
{ REG_DWORD_OFFSET + 3, 0 }, /* bx : 12 */
{ REG_DWORD_OFFSET + 4, 0 }, /* sp : 13 */
{ REG_DWORD_OFFSET + 5, 0 }, /* bp : 14 */
{ REG_DWORD_OFFSET + 6, 0 }, /* si : 15 */
{ REG_DWORD_OFFSET + 7, 0 }, /* di : 16 */
{ REG_EIP_INDEX, 0 }, /* ip : 17 */
{ REG_FPU_OFFSET, 0 }, /* mm0 : 18 */
{ REG_FPU_OFFSET + 1, 0 }, /* mm1 : 19 */
{ REG_FPU_OFFSET + 2, 0 }, /* mm2 : 20 */
{ REG_FPU_OFFSET + 3, 0 }, /* mm3 : 21 */
{ REG_FPU_OFFSET + 4, 0 }, /* mm4 : 22 */
{ REG_FPU_OFFSET + 5, 0 }, /* mm5 : 23 */
{ REG_FPU_OFFSET + 6, 0 }, /* mm6 : 24 */
{ REG_FPU_OFFSET + 7, 0 } /* mm7 : 25 */
};
/* REGISTER TABLE: size, type, and name of every register in the
* CPU. Does not include MSRs since they are, after all,
* model specific. */
static struct {
unsigned int size;
enum x86_reg_type type;
unsigned int alias;
char mnemonic[8];
unsigned int size;
enum x86_reg_type type;
unsigned int alias;
char mnemonic[8];
} ia32_reg_table[NUM_X86_REGS + 2] = {
{ 0, reg_undef, 0, "" },
/* REG_DWORD_OFFSET */
@ -189,7 +189,7 @@ static struct {
{ REG_DWORD_SIZE, reg_sys, 0, "esp_msr" },
/* REG_EIPMSR_INDEX : SYSENTER_EIP_MSR : 92 */
{ REG_DWORD_SIZE, reg_sys, 0, "eip_msr" },
{ 0,reg_undef,0,"" }
{ 0 }
};
@ -197,38 +197,38 @@ static size_t sz_regtable = NUM_X86_REGS + 1;
void ia32_handle_register( x86_reg_t *reg, size_t id ) {
unsigned int alias;
if (! id || id > sz_regtable ) {
unsigned int alias;
if (! id || id > sz_regtable ) {
return;
}
memset( reg, 0, sizeof(x86_reg_t) );
strncpy( reg->name, ia32_reg_table[id].mnemonic, MAX_REGNAME );
reg->type = ia32_reg_table[id].type;
reg->size = ia32_reg_table[id].size;
alias = ia32_reg_table[id].alias;
if ( alias ) {
reg->alias = ia32_reg_aliases[alias].alias;
reg->shift = ia32_reg_aliases[alias].shift;
}
reg->id = id;
return;
}
memset( reg, 0, sizeof(x86_reg_t) );
strncpy( reg->name, ia32_reg_table[id].mnemonic, MAX_REGNAME );
reg->type = ia32_reg_table[id].type;
reg->size = ia32_reg_table[id].size;
alias = ia32_reg_table[id].alias;
if ( alias ) {
reg->alias = ia32_reg_aliases[alias].alias;
reg->shift = ia32_reg_aliases[alias].shift;
}
reg->id = id;
return;
}
size_t ia32_true_register_id( size_t id ) {
size_t reg;
size_t reg;
if (! id || id > sz_regtable ) {
return 0;
}
if (! id || id > sz_regtable ) {
return 0;
}
reg = id;
if (ia32_reg_table[reg].alias) {
reg = ia32_reg_aliases[ia32_reg_table[reg].alias].alias;
}
return reg;
reg = id;
if (ia32_reg_table[reg].alias) {
reg = ia32_reg_aliases[ia32_reg_table[reg].alias].alias;
}
return reg;
}
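
The ia32_reg_aliases table pairs each sub-register with the id of its parent register and a bit shift (0 for al/ax, 8 for ah). A minimal sketch of how such a (shift, size) pair could be used to recover a sub-register's value from the 32-bit parent follows. read_subregister is a hypothetical helper, not a libdisasm function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of libdisasm: extract an aliased
// sub-register from its 32-bit parent, given the alias table's shift
// (in bits) and the sub-register's size (in bytes).
inline std::uint32_t read_subregister(std::uint32_t parent,
                                      unsigned shift_bits,
                                      unsigned size_bytes) {
    // Build a mask of size_bytes * 8 low bits; avoid shifting by 32.
    const std::uint32_t mask =
        (size_bytes >= 4) ? 0xFFFFFFFFu : ((1u << (size_bytes * 8)) - 1u);
    return (parent >> shift_bits) & mask;
}
```

For eax = 0x12345678 this gives al = 0x78 (shift 0, size 1), ah = 0x56 (shift 8, size 1) and ax = 0x5678 (shift 0, size 2).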


@ -6,7 +6,6 @@
#endif
#include <cstring>
#include <cstdlib>
#include <cassert>
#include <stdint.h>
/* 'NEW" types
@ -90,7 +89,7 @@ enum x86_options { /* these can be ORed together */
opt_none= 0,
opt_ignore_nulls=1, /* ignore sequences of > 4 NULL bytes */
opt_16_bit=2, /* 16-bit/DOS disassembly */
opt_att_mnemonics=4 /* use AT&T syntax names for alternate opcode mnemonics */
opt_att_mnemonics=4, /* use AT&T syntax names for alternate opcode mnemonics */
};
/* ========================================= Instruction Representation */
@ -134,12 +133,12 @@ enum x86_reg_type { /* NOTE: these may be ORed together */
/* x86_reg_t : an X86 CPU register */
struct x86_reg_t {
char name[MAX_REGNAME];
enum x86_reg_type type; /* what register is used for */
unsigned int size; /* size of register in bytes */
unsigned int id; /* register ID #, for quick compares */
unsigned int alias; /* ID of reg this is an alias for */
unsigned int shift; /* amount to shift aliased reg by */
char name[MAX_REGNAME];
enum x86_reg_type type; /* what register is used for */
unsigned int size; /* size of register in bytes */
unsigned int id; /* register ID #, for quick compares */
unsigned int alias; /* ID of reg this is an alias for */
unsigned int shift; /* amount to shift aliased reg by */
x86_reg_t * aliased_reg( ) {
x86_reg_t * reg = (x86_reg_t * )calloc( sizeof(x86_reg_t), 1 );
reg->x86_reg_from_id( id );
@ -159,11 +158,11 @@ typedef struct {
/* x86_absolute_t : an X86 segment:offset address (descriptor) */
typedef struct {
unsigned short segment; /* loaded directly into CS */
union {
unsigned short off16; /* loaded directly into IP */
uint32_t off32; /* loaded directly into EIP */
} offset;
unsigned short segment; /* loaded directly into CS */
union {
unsigned short off16; /* loaded directly into IP */
uint32_t off32; /* loaded directly into EIP */
} offset;
} x86_absolute_t;
enum x86_op_type { /* mutually exclusive */
@ -251,87 +250,57 @@ struct x86_op_flags { /* ORed together, but segs are mutually exclusive */
/* x86_op_t : an X86 instruction operand */
struct x86_op_t{
friend struct x86_insn_t;
enum x86_op_type type; /* operand type */
enum x86_op_datatype datatype; /* operand size */
enum x86_op_access access; /* operand access [RWX] */
x86_op_flags flags; /* misc flags */
union {
/* sizeof will have to work on these union members! */
/* immediate values */
char sbyte;
short sword;
int32_t sdword;
qword_t sqword;
unsigned char byte;
unsigned short word;
uint32_t dword;
qword_t qword;
float sreal;
double dreal;
/* misc large/non-native types */
unsigned char extreal[10];
unsigned char bcd[10];
qword_t dqword[2];
unsigned char simd[16];
unsigned char fpuenv[28];
/* offset from segment */
uint32_t offset;
/* ID of CPU register */
x86_reg_t reg;
/* offsets from current insn */
char relative_near;
int32_t relative_far;
/* segment:offset */
x86_absolute_t absolute;
/* effective address [expression] */
x86_ea_t expression;
} data;
/* this is needed to make formatting operands more sane */
void * insn; /* pointer to x86_insn_t owning operand */
size_t size() const
enum x86_op_type type; /* operand type */
enum x86_op_datatype datatype; /* operand size */
enum x86_op_access access; /* operand access [RWX] */
x86_op_flags flags; /* misc flags */
union {
/* sizeof will have to work on these union members! */
/* immediate values */
char sbyte;
short sword;
int32_t sdword;
qword_t sqword;
unsigned char byte;
unsigned short word;
uint32_t dword;
qword_t qword;
float sreal;
double dreal;
/* misc large/non-native types */
unsigned char extreal[10];
unsigned char bcd[10];
qword_t dqword[2];
unsigned char simd[16];
unsigned char fpuenv[28];
/* offset from segment */
uint32_t offset;
x86_reg_t reg; /* ID of CPU register */
char relative_near; /* offsets from current insn */
int32_t relative_far;
x86_absolute_t absolute; /* segment:offset */
x86_ea_t expression; /* effective address [expression] */
} data;
/* this is needed to make formatting operands more sane */
void * insn; /* pointer to x86_insn_t owning operand */
size_t size()
{
return operand_size();
}
/* get size of operand data in bytes */
size_t operand_size() const;
size_t operand_size();
/* format (sprintf) an operand into 'buf' using specified syntax */
int x86_format_operand(char *buf, int len, enum x86_asm_format format );
bool is_address( ) const {
bool is_address( ) {
return ( type == op_absolute || type == op_offset );
}
bool is_relative( ) const {
bool is_relative( ) {
return ( type == op_relative_near || type == op_relative_far );
}
bool is_immediate( ) const { return ( type == op_immediate ); }
int32_t getAddress()
{
assert(is_address()||is_relative());
switch ( type ) {
case op_relative_near:
return (int32_t) data.relative_near;
case op_absolute:
if(datatype==op_descr16)
return int32_t((data.absolute.segment)<<4) + data.absolute.offset.off16;
else
return int32_t((data.absolute.segment)<<4) + data.absolute.offset.off32;
case op_offset:
return data.offset;
case op_relative_far:
if (data.relative_far & 0x8000)
return (data.relative_far & 0xFFFF) | 0xFFFF0000;
else
return (int32_t)data.relative_far;
default:
assert(false);
break;
}
return ~0;
}
char * format( enum x86_asm_format format );
x86_op_t * copy()
{
x86_op_t *op = (x86_op_t *) calloc( sizeof(x86_op_t), 1 );
if ( op ) {
memcpy( op, this, sizeof(x86_op_t) );
}
@ -470,7 +439,7 @@ enum x86_insn_note {
insn_note_smm = 2, /* "" in System Management Mode */
insn_note_serial = 4, /* Serializing instruction */
insn_note_nonswap = 8, /* Does not swap arguments in att-style formatting */
insn_note_nosuffix = 16 /* Does not have size suffix in att-style formatting */
insn_note_nosuffix = 16, /* Does not have size suffix in att-style formatting */
};
/* This specifies what effects the instruction has on the %eflags register */
@ -551,6 +520,7 @@ enum x86_insn_prefix {
/* TODO: maybe provide insn_new/free(), and have disasm return new insn_t */
/* FOREACH types: these are used to limit the foreach results to
* operands which match a certain "type" (implicit or explicit)
* or which are accessed in certain ways (e.g. read or write). Note
@ -602,8 +572,8 @@ private:
void x86_oplist_append(x86_oplist_t *op);
public:
/* information about the instruction */
uint32_t addr; /* load address */
uint32_t offset; /* offset into file/buffer */
uint32_t addr; /* load address */
uint32_t offset; /* offset into file/buffer */
x86_insn_group group; /* meta-type, e.g. INS_EXEC */
x86_insn_type type; /* type, e.g. INS_BRANCH */
x86_insn_note note; /* note, e.g. RING0 */
@ -634,36 +604,29 @@ public:
void *block; /* code block containing this insn */
void *function; /* function containing this insn */
int tag; /* tag the insn as seen/processed */
x86_op_t *x86_operand_new();
/* convenience routine: returns count of operands matching 'type' */
size_t x86_operand_count( enum x86_op_foreach_type type );
x86_op_t * x86_operand_new();
size_t x86_operand_count( enum x86_op_foreach_type type );
/* accessor functions for the operands */
x86_op_t * operand_1st( );
x86_op_t * operand_2nd( );
x86_op_t * operand_3rd( );
const x86_op_t * get_dest() const;
int32_t x86_get_rel_offset( );
x86_op_t * x86_get_branch_target( );
x86_op_t * x86_get_imm( );
/* More accessor functions, this time for user-defined info... */
x86_op_t * x86_operand_1st( );
x86_op_t * x86_operand_2nd( );
x86_op_t * x86_operand_3rd( );
x86_op_t * get_dest();
int32_t x86_get_rel_offset( );
x86_op_t * x86_get_branch_target( );
x86_op_t * x86_get_imm( );
uint8_t * x86_get_raw_imm( );
/* set the address (usually RVA) of the insn */
void x86_set_insn_addr( uint32_t addr );
/* format (sprintf) an instruction mnemonic into 'buf' using specified syntax */
int x86_format_mnemonic( char *buf, int len, enum x86_asm_format format);
int x86_format_insn( char *buf, int len, enum x86_asm_format);
void x86_oplist_free( );
/* returns 0 if an instruction is invalid, 1 if valid */
/* More accessor functions, this time for user-defined info... */
void x86_set_insn_addr( uint32_t addr );
int x86_format_mnemonic( char *buf, int len, enum x86_asm_format format);
int x86_format_insn( char *buf, int len, enum x86_asm_format);
void x86_oplist_free( );
bool is_valid( );
uint32_t x86_get_address( );
void make_invalid(unsigned char *buf);
void make_invalid(unsigned char *buf);
/* instruction tagging: these routines allow the programmer to mark
* instructions as "seen" in a DFS, for example. libdisasm does not use
* the tag field.*/
/* set insn->tag to 1 */
void x86_tag_insn( );
/* return insn->tag */
int x86_insn_is_tagged();
/* set insn->tag to 0 */
void x86_untag_insn();
@ -759,7 +722,7 @@ public:
* offset : Offset in buffer to disassemble
* insn : Structure to fill with disassembled instruction
*/
unsigned int x86_disasm(const unsigned char *buf, unsigned int buf_len,
unsigned int x86_disasm( unsigned char *buf, unsigned int buf_len,
uint32_t buf_rva, unsigned int offset,
x86_insn_t * insn );
/* x86_disasm_range: Sequential disassembly of a range of bytes in a buffer,
@ -840,7 +803,7 @@ public:
* void x86_get_aliased_reg( x86_reg_t *alias_reg, x86_reg_t *output_reg )
* where 'alias_reg' is a reg operand and 'output_reg' is filled with the
* register that the operand is an alias for */
//#define x86_get_aliased_reg( alias_reg, output_reg )
//#define x86_get_aliased_reg( alias_reg, output_reg ) \
// x86_reg_from_id( alias_reg->alias, output_reg )
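
The getAddress() member that appears on one side of the x86_op_t hunk in this header combines two small computations: a segment:offset pair is turned into a linear address as (segment << 4) + offset, and a 16-bit displacement is sign-extended to 32 bits. A minimal sketch of both, using hypothetical free functions (real_mode_linear, sign_extend16) that are not part of libdisasm:

```cpp
#include <cassert>
#include <cstdint>

// Real-mode linear address: the segment is scaled by 16 and added to
// the 16-bit offset.
inline std::uint32_t real_mode_linear(std::uint16_t segment,
                                      std::uint16_t offset) {
    return (static_cast<std::uint32_t>(segment) << 4) + offset;
}

// Sign-extend the low 16 bits of 'raw' to a signed 32-bit value without
// relying on implementation-defined narrowing conversions.
inline std::int32_t sign_extend16(std::uint32_t raw) {
    const std::uint32_t low = raw & 0xFFFFu;
    return (low & 0x8000u) ? static_cast<std::int32_t>(low) - 0x10000
                           : static_cast<std::int32_t>(low);
}
```

This only illustrates the arithmetic; how the result relates to the load address of the instruction is left to the caller, as in the original member function.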


@ -1,49 +0,0 @@
;libdisasm.def : Declares the module parameters
LIBRARY "libdisasm.dll"
DESCRIPTION "libdisasm exported functions"
EXPORTS
x86_addr_size @1
x86_cleanup @2
x86_disasm @3
x86_disasm_forward @4
x86_disasm_range @5
x86_endian @6
x86_format_header @7
x86_format_insn @8
x86_format_mnemonic @9
x86_format_operand @10
x86_fp_reg @11
x86_get_branch_target @12
x86_get_imm @13
x86_get_options @14
x86_get_raw_imm @15
x86_get_rel_offset @16
x86_imm_signsized @17
x86_imm_sized @18
x86_init @19
x86_insn_is_tagged @20
x86_insn_is_valid @21
x86_invariant_disasm @22
x86_ip_reg @23
x86_max_insn_size @24
x86_op_size @25
x86_operand_1st @26
x86_operand_2nd @27
x86_operand_3rd @28
x86_operand_count @29
x86_operand_foreach @30
x86_operand_new @31
x86_operand_size @32
x86_oplist_free @33
x86_reg_from_id @34
x86_report_error @35
x86_set_insn_addr @36
x86_set_insn_block @37
x86_set_insn_function @38
x86_set_insn_offset @39
x86_set_options @40
x86_set_reporter @41
x86_size_disasm @42
x86_sp_reg @43
x86_tag_insn @44


@ -9,8 +9,8 @@
#ifdef _MSC_VER
#define snprintf _snprintf
#define inline __inline
#define snprintf _snprintf
#define inline __inline
#endif
void x86_insn_t::make_invalid(unsigned char *buf)
{
@ -21,9 +21,9 @@ void x86_insn_t::make_invalid(unsigned char *buf)
type = insn_invalid;
memcpy( bytes, buf, 1 );
}
unsigned int X86_Disasm::x86_disasm( const unsigned char *buf, unsigned int buf_len,
uint32_t buf_rva, unsigned int offset,
x86_insn_t *insn ){
unsigned int X86_Disasm::x86_disasm( unsigned char *buf, unsigned int buf_len,
uint32_t buf_rva, unsigned int offset,
x86_insn_t *insn ){
int len, size;
unsigned char bytes[MAX_INSTRUCTION_SIZE];
@ -52,8 +52,8 @@ unsigned int X86_Disasm::x86_disasm( const unsigned char *buf, unsigned int buf_
/* copy enough bytes for disassembly into buffer : this
* helps prevent buffer overruns at the end of a file */
memset( bytes, 0, MAX_INSTRUCTION_SIZE );
memcpy( bytes, &buf[offset], (len < MAX_INSTRUCTION_SIZE) ? len :
MAX_INSTRUCTION_SIZE );
memcpy( bytes, &buf[offset], (len < MAX_INSTRUCTION_SIZE) ? len :
MAX_INSTRUCTION_SIZE );
/* actually do the disassembly */
/* TODO: allow switching when more disassemblers are added */
@ -81,139 +81,140 @@ unsigned int X86_Disasm::x86_disasm( const unsigned char *buf, unsigned int buf_
}
unsigned int X86_Disasm::x86_disasm_range( unsigned char *buf, uint32_t buf_rva,
unsigned int offset, unsigned int len,
DISASM_CALLBACK func, void *arg ) {
x86_insn_t insn;
unsigned int buf_len, size, count = 0, bytes = 0;
unsigned int offset, unsigned int len,
DISASM_CALLBACK func, void *arg ) {
x86_insn_t insn;
unsigned int buf_len, size, count = 0, bytes = 0;
/* buf_len is implied by the arguments */
buf_len = len + offset;
/* buf_len is implied by the arguments */
buf_len = len + offset;
while ( bytes < len ) {
size = x86_disasm( buf, buf_len, buf_rva, offset + bytes,
&insn );
if ( size ) {
/* invoke callback if it exists */
if ( func ) {
(*func)( &insn, arg );
}
bytes += size;
count ++;
} else {
/* error */
bytes++; /* try next byte */
while ( bytes < len ) {
size = x86_disasm( buf, buf_len, buf_rva, offset + bytes,
&insn );
if ( size ) {
/* invoke callback if it exists */
if ( func ) {
(*func)( &insn, arg );
}
bytes += size;
count ++;
} else {
/* error */
bytes++; /* try next byte */
}
insn.x86_oplist_free();
}
insn.x86_oplist_free();
}
return( count );
return( count );
}
static inline int follow_insn_dest( x86_insn_t *insn ) {
if ( insn->type == insn_jmp || insn->type == insn_jcc ||
insn->type == insn_call || insn->type == insn_callcc ) {
return(1);
}
return(0);
if ( insn->type == insn_jmp || insn->type == insn_jcc ||
insn->type == insn_call || insn->type == insn_callcc ) {
return(1);
}
return(0);
}
static inline int insn_doesnt_return( x86_insn_t *insn ) {
return( (insn->type == insn_jmp || insn->type == insn_return) ? 1: 0 );
return( (insn->type == insn_jmp || insn->type == insn_return) ? 1: 0 );
}
static int32_t internal_resolver( x86_op_t *op, x86_insn_t *insn ){
int32_t next_addr = -1;
if ( x86_optype_is_address(op->type) ) {
next_addr = op->data.sdword;
} else if ( op->type == op_relative_near ) {
next_addr = insn->addr + insn->size + op->data.relative_near;
} else if ( op->type == op_relative_far ) {
next_addr = insn->addr + insn->size + op->data.relative_far;
}
return( next_addr );
int32_t next_addr = -1;
if ( x86_optype_is_address(op->type) ) {
next_addr = op->data.sdword;
} else if ( op->type == op_relative_near ) {
next_addr = insn->addr + insn->size + op->data.relative_near;
} else if ( op->type == op_relative_far ) {
next_addr = insn->addr + insn->size + op->data.relative_far;
}
return( next_addr );
}
unsigned int X86_Disasm::x86_disasm_forward( unsigned char *buf, unsigned int buf_len,
uint32_t buf_rva, unsigned int offset,
DISASM_CALLBACK func, void *arg,
DISASM_RESOLVER resolver, void *r_arg ){
x86_insn_t insn;
x86_op_t *op;
int32_t next_addr;
int32_t next_offset;
unsigned int size, count = 0, bytes = 0, cont = 1;
uint32_t buf_rva, unsigned int offset,
DISASM_CALLBACK func, void *arg,
DISASM_RESOLVER resolver, void *r_arg ){
x86_insn_t insn;
x86_op_t *op;
int32_t next_addr;
uint32_t next_offset;
unsigned int size, count = 0, bytes = 0, cont = 1;
while ( cont && bytes < buf_len ) {
size = x86_disasm( buf, buf_len, buf_rva, offset + bytes,
while ( cont && bytes < buf_len ) {
size = x86_disasm( buf, buf_len, buf_rva, offset + bytes,
&insn );
if ( size ) {
/* invoke callback if it exists */
if ( func ) {
(*func)( &insn, arg );
}
bytes += size;
count ++;
} else {
/* error */
bytes++; /* try next byte */
}
if ( follow_insn_dest(&insn) ) {
op = insn.operand_1st();//x86_get_dest_operand
next_addr = -1;
/* if caller supplied a resolver, use it to determine
* the address to disassemble */
if ( resolver ) {
next_addr = resolver(op, &insn, r_arg);
} else {
next_addr = internal_resolver(op, &insn);
}
if (next_addr != -1 ) {
next_offset = next_addr - buf_rva;
/* if offset is in this buffer... */
if ( next_offset >= 0 && next_offset < buf_len ) {
/* go ahead and disassemble */
count += x86_disasm_forward( buf,
buf_len,
buf_rva,
next_offset,
func, arg,
resolver, r_arg );
} else {
/* report unresolved address */
x86_report_error( report_disasm_bounds,
(void*)(long)next_addr );
if ( size ) {
/* invoke callback if it exists */
if ( func ) {
(*func)( &insn, arg );
}
bytes += size;
count ++;
} else {
/* error */
bytes++; /* try next byte */
}
}
} /* end follow_insn */
if ( insn_doesnt_return(&insn) ) {
/* stop disassembling */
cont = 0;
if ( follow_insn_dest(&insn) ) {
op = insn.x86_operand_1st();//x86_get_dest_operand
next_addr = -1;
/* if caller supplied a resolver, use it to determine
* the address to disassemble */
if ( resolver ) {
next_addr = resolver(op, &insn, r_arg);
} else {
next_addr = internal_resolver(op, &insn);
}
if (next_addr != -1 ) {
next_offset = next_addr - buf_rva;
/* if offset is in this buffer... */
if ( next_offset >= 0 &&
next_offset < buf_len ) {
/* go ahead and disassemble */
count += x86_disasm_forward( buf,
buf_len,
buf_rva,
next_offset,
func, arg,
resolver, r_arg );
} else {
/* report unresolved address */
x86_report_error( report_disasm_bounds,
(void*)(long)next_addr );
}
}
} /* end follow_insn */
if ( insn_doesnt_return(&insn) ) {
/* stop disassembling */
cont = 0;
}
insn.x86_oplist_free( );
}
insn.x86_oplist_free( );
}
return( count );
return( count );
}
/* invariant instruction representation */
size_t x86_invariant_disasm( unsigned char *buf, int buf_len,
x86_invariant_t *inv ){
if (! buf || ! buf_len || ! inv ) {
return(0);
}
size_t x86_invariant_disasm( unsigned char *buf, int buf_len,
x86_invariant_t *inv ){
if (! buf || ! buf_len || ! inv ) {
return(0);
}
return ia32_disasm_invariant(buf, buf_len, inv);
return ia32_disasm_invariant(buf, buf_len, inv);
}
size_t x86_size_disasm( unsigned char *buf, unsigned int buf_len ) {
if (! buf || ! buf_len ) {
return(0);
}
if (! buf || ! buf_len ) {
return(0);
}
return ia32_disasm_size(buf, buf_len);
return ia32_disasm_size(buf, buf_len);
}
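
x86_disasm_range in this file is a linear sweep: it decodes at the current offset, advances by the instruction size on success, and advances by a single byte on failure so that decoding can resynchronise. A minimal sketch of that loop, assuming a hypothetical decoder callback and a toy two-opcode decoder in place of the real x86_disasm:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical decoder signature: returns the instruction length at
// 'offset', or 0 if the bytes there do not decode.
using Decoder = std::function<std::size_t(const unsigned char *buf,
                                          std::size_t buf_len,
                                          std::size_t offset)>;

// Linear sweep over [offset, offset + len); returns the number of
// instructions decoded successfully.
inline std::size_t linear_sweep(const unsigned char *buf, std::size_t offset,
                                std::size_t len, const Decoder &decode) {
    std::size_t count = 0, bytes = 0;
    const std::size_t buf_len = offset + len;  // implied by the arguments
    while (bytes < len) {
        const std::size_t size = decode(buf, buf_len, offset + bytes);
        if (size) {
            bytes += size;
            ++count;
        } else {
            ++bytes;                           // error: try the next byte
        }
    }
    return count;
}

// Toy decoder for illustration only: 0x90 is a 1-byte instruction,
// 0xB8 is a 5-byte instruction, anything else is invalid.
inline std::size_t toy_decode(const unsigned char *buf, std::size_t buf_len,
                              std::size_t offset) {
    if (offset >= buf_len) return 0;
    if (buf[offset] == 0x90) return 1;
    if (buf[offset] == 0xB8 && offset + 5 <= buf_len) return 5;
    return 0;
}
```

With the toy decoder, the byte sequence 90 B8 01 02 03 04 FF 90 yields three instructions; the invalid FF byte is skipped rather than aborting the sweep.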


@ -46,7 +46,7 @@
} \
} while( 0 )
static const char *prefix_strings[] = {
static char *prefix_strings[] = {
"", /* no prefix */
"repz ", /* the trailing spaces make it easy to prepend to mnemonic */
"repnz ",
@ -115,7 +115,7 @@ static void get_operand_data_str( x86_op_t *op, char *str, int len ){
static void get_operand_regtype_str( int regtype, char *str, int len )
{
static struct {
const char *name;
char *name;
int value;
} operand_regtypes[] = {
{"reg_gen" , 0x00001},
@ -284,7 +284,7 @@ static int format_expr( x86_ea_t *ea, char *buf, int len,
static int format_seg( x86_op_t *op, char *buf, int len,
enum x86_asm_format format ) {
int len_orig = len;
const char *reg = "";
char *reg = "";
if (! op || ! buf || ! len || ! op->flags.whole) {
return(0);
@ -295,9 +295,8 @@ static int format_seg( x86_op_t *op, char *buf, int len,
if (! (int) op->flags.op_seg) {
return(0);
}
uint16_t seg_ov=uint16_t(op->flags.op_seg)<<8;
switch (seg_ov)
{
switch (op->flags.op_seg) {
case x86_op_flags::op_es_seg: reg = "es"; break;
case x86_op_flags::op_cs_seg: reg = "cs"; break;
case x86_op_flags::op_ss_seg: reg = "ss"; break;
@ -329,9 +328,9 @@ static int format_seg( x86_op_t *op, char *buf, int len,
return( len_orig - len ); /* return length of appended string */
}
static const char *get_operand_datatype_str( x86_op_t *op ){
static char *get_operand_datatype_str( x86_op_t *op ){
static const char *types[] = {
static char *types[] = {
"sbyte", /* 0 */
"sword",
"sqword",
@ -406,7 +405,7 @@ static int format_insn_eflags_str( enum x86_flag_status flags, char *buf,
int len) {
static struct {
const char *name;
char *name;
int value;
} insn_flags[] = {
{ "carry_set ", 0x0001 },
@ -441,9 +440,9 @@ static int format_insn_eflags_str( enum x86_flag_status flags, char *buf,
return( len_orig - len );
}
static const char *get_insn_group_str( enum x86_insn_t::x86_insn_group gp ) {
static char *get_insn_group_str( enum x86_insn_t::x86_insn_group gp ) {
static const char *types[] = {
static char *types[] = {
"", // 0
"controlflow",// 1
"arithmetic", // 2
@ -468,10 +467,10 @@ static const char *get_insn_group_str( enum x86_insn_t::x86_insn_group gp ) {
return types[gp];
}
static const char *get_insn_type_str( enum x86_insn_type type ) {
static char *get_insn_type_str( enum x86_insn_type type ) {
static struct {
const char *name;
char *name;
int value;
} types[] = {
/* insn_controlflow */
@ -593,8 +592,8 @@ static const char *get_insn_type_str( enum x86_insn_type type ) {
return "";
}
static const char *get_insn_cpu_str( enum x86_insn_cpu cpu ) {
static const char *intel[] = {
static char *get_insn_cpu_str( enum x86_insn_cpu cpu ) {
static char *intel[] = {
"", // 0
"8086", // 1
"80286", // 2
@ -621,8 +620,8 @@ static const char *get_insn_cpu_str( enum x86_insn_cpu cpu ) {
return "";
}
static const char *get_insn_isa_str( enum x86_insn_isa isa ) {
static const char *subset[] = {
static char *get_insn_isa_str( enum x86_insn_isa isa ) {
static char *subset[] = {
NULL, // 0
"General Purpose", // 1
"Floating Point", // 2
@ -881,11 +880,11 @@ static int format_operand_xml( x86_op_t *op, x86_insn_t *insn, char *buf,
return( strlen( buf ) );
}
static int format_operand_raw( x86_op_t *op, x86_insn_t */*insn*/, char *buf,
static int format_operand_raw( x86_op_t *op, x86_insn_t *insn, char *buf,
int len){
char str[MAX_OP_RAW_STRING];
const char *datatype = get_operand_datatype_str(op);
char *datatype = get_operand_datatype_str(op);
switch (op->type) {
case op_register:
@ -1043,7 +1042,7 @@ char * x86_op_t::format( enum x86_asm_format format ) {
static int format_att_mnemonic( x86_insn_t *insn, char *buf, int len) {
int size = 0;
const char *suffix;
char *suffix;
if (! insn || ! buf || ! len )
return(0);
@ -1052,8 +1051,8 @@ static int format_att_mnemonic( x86_insn_t *insn, char *buf, int len) {
/* do long jump/call prefix */
if ( insn->type == insn_jmp || insn->type == insn_call ) {
if (! is_imm_jmp( insn->operand_1st() ) ||
(insn->operand_1st())->datatype != op_byte ) {
if (! is_imm_jmp( insn->x86_operand_1st() ) ||
(insn->x86_operand_1st())->datatype != op_byte ) {
/* far jump/call, use "l" prefix */
STRNCAT( buf, "l", len );
}
@ -1077,11 +1076,11 @@ static int format_att_mnemonic( x86_insn_t *insn, char *buf, int len) {
insn->type == insn_out
)) {
if ( insn->x86_operand_count( op_explicit ) > 0 &&
is_memory_op( insn->operand_1st() ) ){
size = insn->operand_1st()->operand_size();
is_memory_op( insn->x86_operand_1st() ) ){
size = insn->x86_operand_1st()->operand_size();
} else if ( insn->x86_operand_count( op_explicit ) > 1 &&
is_memory_op( insn->operand_2nd() ) ){
size = insn->operand_2nd()->operand_size();
is_memory_op( insn->x86_operand_2nd() ) ){
size = insn->x86_operand_2nd()->operand_size();
}
}
@ -1095,6 +1094,7 @@ static int format_att_mnemonic( x86_insn_t *insn, char *buf, int len) {
return ( strlen( buf ) );
}
/** format (sprintf) an instruction mnemonic into 'buf' using specified syntax */
int x86_format_mnemonic(x86_insn_t *insn, char *buf, int len,
enum x86_asm_format format){
char str[MAX_OP_STRING];
@ -1137,7 +1137,7 @@ static int format_insn_note(x86_insn_t *insn, char *buf, int len){
return( len_orig - len );
}
static int format_raw_insn( x86_insn_t *insn, char *buf, size_t len ){
static int format_raw_insn( x86_insn_t *insn, char *buf, int len ){
struct op_string opstr = { buf, len };
int i;
@ -1223,24 +1223,24 @@ static int format_xml_insn( x86_insn_t *insn, char *buf, int len ) {
len -= format_insn_eflags_str( insn->flags_tested, buf, len );
STRNCAT( buf, "\"/>\n\t</flags>\n", len );
if ( insn->operand_1st() ) {
insn->operand_1st()->x86_format_operand(str,
if ( insn->x86_operand_1st() ) {
insn->x86_operand_1st()->x86_format_operand(str,
sizeof str, xml_syntax);
STRNCAT( buf, "\t<operand name=dest>\n", len );
STRNCAT( buf, str, len );
STRNCAT( buf, "\t</operand>\n", len );
}
if ( insn->operand_2nd() ) {
insn->operand_2nd()->x86_format_operand(str,sizeof str,
if ( insn->x86_operand_2nd() ) {
insn->x86_operand_2nd()->x86_format_operand(str,sizeof str,
xml_syntax);
STRNCAT( buf, "\t<operand name=src>\n", len );
STRNCAT( buf, str, len );
STRNCAT( buf, "\t</operand>\n", len );
}
if ( insn->operand_3rd() ) {
insn->operand_3rd()->x86_format_operand(str,sizeof str,
if ( insn->x86_operand_3rd() ) {
insn->x86_operand_3rd()->x86_format_operand(str,sizeof str,
xml_syntax);
STRNCAT( buf, "\t<operand name=imm>\n", len );
STRNCAT( buf, str, len );
@ -1342,13 +1342,13 @@ int x86_insn_t::x86_format_insn( char *buf, int len,
STRNCAT( buf, "\t", len );
/* dest */
if ( (dst = operand_1st()) && !(dst->flags.op_implied) ) {
if ( (dst = x86_operand_1st()) && !(dst->flags.op_implied) ) {
dst->x86_format_operand(str, MAX_OP_STRING, format);
STRNCAT( buf, str, len );
}
/* src */
if ( (src = operand_2nd()) ) {
if ( (src = x86_operand_2nd()) ) {
if ( !(dst->flags.op_implied) ) {
STRNCAT( buf, ", ", len );
}
@ -1357,9 +1357,9 @@ int x86_insn_t::x86_format_insn( char *buf, int len,
}
/* imm */
if ( operand_3rd()) {
if ( x86_operand_3rd()) {
STRNCAT( buf, ", ", len );
operand_3rd()->x86_format_operand(str, MAX_OP_STRING,format);
x86_operand_3rd()->x86_format_operand(str, MAX_OP_STRING,format);
STRNCAT( buf, str, len );
}
@ -1373,8 +1373,8 @@ int x86_insn_t::x86_format_insn( char *buf, int len,
/* not sure which is correct? sometimes GNU as requires
* an imm as the first operand, sometimes as the third... */
/* imm */
if ( operand_3rd() ) {
operand_3rd()->x86_format_operand(str, MAX_OP_STRING,format);
if ( x86_operand_3rd() ) {
x86_operand_3rd()->x86_format_operand(str, MAX_OP_STRING,format);
STRNCAT( buf, str, len );
/* there is always 'dest' operand if there is 'src' */
STRNCAT( buf, ", ", len );
@ -1382,13 +1382,13 @@ int x86_insn_t::x86_format_insn( char *buf, int len,
if ( (note & insn_note_nonswap ) == 0 ) {
/* regular AT&T style swap */
src = operand_2nd();
dst = operand_1st();
src = x86_operand_2nd();
dst = x86_operand_1st();
}
else {
/* special-case instructions */
src = operand_1st();
dst = operand_2nd();
src = x86_operand_1st();
dst = x86_operand_2nd();
}
/* src */
@ -1431,20 +1431,20 @@ int x86_insn_t::x86_format_insn( char *buf, int len,
/* print operands */
/* dest */
if ( operand_1st() ) {
operand_1st()->x86_format_operand(str, MAX_OP_STRING,format);
if ( x86_operand_1st() ) {
x86_operand_1st()->x86_format_operand(str, MAX_OP_STRING,format);
STRNCATF( buf, "%s\t", str, len );
}
/* src */
if ( operand_2nd() ) {
operand_2nd()->x86_format_operand(str, MAX_OP_STRING,format);
if ( x86_operand_2nd() ) {
x86_operand_2nd()->x86_format_operand(str, MAX_OP_STRING,format);
STRNCATF( buf, "%s\t", str, len );
}
/* imm */
if ( operand_3rd()) {
operand_3rd()->x86_format_operand(str, MAX_OP_STRING,format);
if ( x86_operand_3rd()) {
x86_operand_3rd()->x86_format_operand(str, MAX_OP_STRING,format);
STRNCAT( buf, str, len );
}
}
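
Most hunks in this file toggle string tables and their accessors between `char *` and `const char *`. Since C++11 a string literal no longer converts to plain `char *`, so the const-qualified form is the one that compiles cleanly as C++. A minimal sketch of a const-correct, bounds-checked lookup in the shape of get_insn_cpu_str() follows; cpu_name is a hypothetical name and the table is truncated to the three entries visible in the hunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical lookup in the shape of get_insn_cpu_str(); only the
// first three table entries shown in the diff are reproduced here.
inline const char *cpu_name(std::size_t cpu) {
    static const char *const names[] = {
        "",       // 0
        "8086",   // 1
        "80286",  // 2
    };
    const std::size_t count = sizeof names / sizeof names[0];
    return (cpu < count) ? names[cpu] : "";   // out of range: empty string
}
```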


@ -17,6 +17,7 @@ int x86_insn_is_valid( x86_insn_t *insn ) {
return 0;
}
/** \returns false if an instruction is invalid, true if valid */
bool x86_insn_t::is_valid( )
{
if ( this && this->type != insn_invalid && this->size > 0 )
@ -93,12 +94,13 @@ x86_op_t * x86_insn_t::x86_get_branch_target() {
return NULL;
}
const x86_op_t * x86_insn_t::get_dest() const {
x86_op_t * x86_insn_t::get_dest() {
x86_oplist_t *op_lst;
assert(this);
if ( ! operands ) {
return NULL;
}
assert(this->x86_operand_count(op_dest)==1);
for (op_lst = operands; op_lst; op_lst = op_lst->next ) {
if ( op_lst->op.access & op_write)
return &(op_lst->op);
@ -169,7 +171,7 @@ uint8_t *x86_insn_t::x86_get_raw_imm() {
}
size_t x86_op_t::operand_size() const {
size_t x86_op_t::operand_size() {
switch (datatype ) {
case op_byte: return 1;
case op_word: return 2;
@ -201,12 +203,13 @@ size_t x86_op_t::operand_size() const {
return(4); /* default size */
}
/** set the address (usually RVA) of the insn */
void x86_insn_t::x86_set_insn_addr( uint32_t _addr ) {
addr = _addr;
}
void x86_insn_t::x86_set_insn_offset( unsigned int _offset ){
offset = _offset;
void x86_insn_t::x86_set_insn_offset( unsigned int offset ){
offset = offset;
}
void x86_insn_t::x86_set_insn_function( void * func ){
@ -217,6 +220,7 @@ void x86_insn_t::x86_set_insn_block( void * _block ){
block = _block;
}
/** set insn->tag to 1 */
void x86_insn_t::x86_tag_insn(){
tag = 1;
}
@ -225,6 +229,7 @@ void x86_insn_t::x86_untag_insn(){
tag = 0;
}
/** \return insn->tag */
int x86_insn_t::x86_insn_is_tagged(){
return tag;
}
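
One side of the x86_set_insn_offset hunk in this file names the parameter `offset`, the same as the member, and then writes `offset = offset;`. Inside the function the parameter shadows the member, so that statement assigns the parameter to itself and leaves the member untouched. A minimal sketch of the difference, using a hypothetical miniature struct rather than the real x86_insn_t:

```cpp
#include <cassert>

// Hypothetical miniature of x86_insn_t, for illustration only.
struct Insn {
    unsigned int offset = 0;

    // The parameter shadows the member: this is a self-assignment of the
    // parameter and has no effect on this->offset.
    void set_offset_shadowed(unsigned int offset) { offset = offset; }

    // A distinct parameter name (or writing this->offset) updates the member.
    void set_offset(unsigned int new_offset) { offset = new_offset; }
};
```

Calling set_offset_shadowed(7) on a fresh Insn leaves offset at 0, while set_offset(7) stores 7. Some compilers flag the shadowed form with a self-assignment warning.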


@ -4,31 +4,31 @@
void x86_insn_t::x86_oplist_append( x86_oplist_t *op ) {
x86_oplist_t *list;
x86_oplist_t *list;
assert(this);
list = operands;
if (! list ) {
operand_count = 1;
/* Note that we have no way of knowing if this is an
list = operands;
if (! list ) {
operand_count = 1;
/* Note that we have no way of knowing if this is an
* explicit operand or not, since the caller fills
* the x86_op_t after we return. We increase the
* explicit count automatically, and ia32_insn_implicit_ops
* decrements it */
explicit_count = 1;
operands = op;
return;
}
/* get to end of list */
for ( ; list->next; list = list->next )
;
operand_count = operand_count + 1;
explicit_count = explicit_count + 1;
list->next = op;
explicit_count = 1;
operands = op;
return;
}
/* get to end of list */
for ( ; list->next; list = list->next )
;
operand_count = operand_count + 1;
explicit_count = explicit_count + 1;
list->next = op;
return;
}
bool x86_insn_t::containsFlag(x86_eflags tofind, x86_flag_status in)
@ -48,7 +48,7 @@ bool x86_insn_t::containsFlag(x86_eflags tofind, x86_flag_status in)
return (in & (insn_dir_set | insn_dir_clear))!=0;
case insn_eflag_sign:
return (in & (insn_sign_set | insn_sign_clear | insn_zero_set_or_sign_ne_oflow |
insn_sign_eq_oflow | insn_sign_ne_oflow))!=0;
insn_sign_eq_oflow | insn_sign_ne_oflow))!=0;
case insn_eflag_parity:
return (in & (insn_parity_set | insn_parity_clear))!=0;
}
@ -56,31 +56,31 @@ bool x86_insn_t::containsFlag(x86_eflags tofind, x86_flag_status in)
}
x86_op_t * x86_insn_t::x86_operand_new( ) {
x86_oplist_t *op;
x86_oplist_t *op;
assert(this);
op = (x86_oplist_t *)calloc( sizeof(x86_oplist_t), 1 );
op->op.insn = this;
x86_oplist_append( op );
return( &(op->op) );
op = (x86_oplist_t *)calloc( sizeof(x86_oplist_t), 1 );
op->op.insn = this;
x86_oplist_append( op );
return( &(op->op) );
}
/** free the operand list associated with an instruction -- useful for
* preventing memory leaks when free()ing an x86_insn_t */
void x86_insn_t::x86_oplist_free( )
{
x86_oplist_t *op, *list;
x86_oplist_t *op, *list;
assert(this);
for ( list = operands; list; ) {
op = list;
list = list->next;
free(op);
}
for ( list = operands; list; ) {
op = list;
list = list->next;
free(op);
}
operands = NULL;
operand_count = 0;
explicit_count = 0;
operands = NULL;
operand_count = 0;
explicit_count = 0;
return;
return;
}
/* ================================================== LIBDISASM API */
@ -88,121 +88,122 @@ void x86_insn_t::x86_oplist_free( )
enum... yet one more confusing thing in the API */
int x86_insn_t::x86_operand_foreach( x86_operand_fn func, void *arg, enum x86_op_foreach_type type )
{
x86_oplist_t *list;
char _explicit = 1, implicit = 1;
x86_oplist_t *list;
char _explicit = 1, implicit = 1;
assert(this);
if ( ! func ) {
return 0;
}
if ( ! func ) {
return 0;
}
/* note: explicit and implicit can be ORed together to
/* note: explicit and implicit can be ORed together to
* allow an "all" limited by access type, even though the
* user is stupid to do this since it is default behavior :) */
if ( (type & op_explicit) && ! (type & op_implicit) ) {
implicit = 0;
}
if ( (type & op_implicit) && ! (type & op_explicit) ) {
_explicit = 0;
if ( (type & op_explicit) && ! (type & op_implicit) ) {
implicit = 0;
}
if ( (type & op_implicit) && ! (type & op_explicit) ) {
_explicit = 0;
}
type = (x86_op_foreach_type)((int)type & 0x0F); /* mask out explicit/implicit operands */
for ( list = operands; list; list = list->next ) {
if (! implicit && (list->op.flags.op_implied) ) {
/* operand is implicit */
continue;
}
type = (x86_op_foreach_type)((int)type & 0x0F); /* mask out explicit/implicit operands */
for ( list = operands; list; list = list->next ) {
if (! implicit && (list->op.flags.op_implied) ) {
/* operand is implicit */
continue;
}
if (! _explicit && ! (list->op.flags.op_implied) ) {
/* operand is not implicit */
continue;
}
switch ( type ) {
case op_any:
break;
case op_dest:
if (! (list->op.access & op_write) ) {
continue;
}
break;
case op_src:
if (! (list->op.access & op_read) ) {
continue;
}
break;
case op_ro:
if (! (list->op.access & op_read) ||
(list->op.access & op_write ) ) {
continue;
}
break;
case op_wo:
if (! (list->op.access & op_write) ||
(list->op.access & op_read ) ) {
continue;
}
break;
case op_xo:
if (! (list->op.access & op_execute) ) {
continue;
}
break;
case op_rw:
if (! (list->op.access & op_write) ||
! (list->op.access & op_read ) ) {
continue;
}
break;
case op_implicit: case op_explicit: /* make gcc happy */
break;
}
/* any non-continue ends up here: invoke the callback */
(*func)( &list->op, this, arg );
if (! _explicit && ! (list->op.flags.op_implied) ) {
/* operand is not implicit */
continue;
}
return 1;
switch ( type ) {
case op_any:
break;
case op_dest:
if (! (list->op.access & op_write) ) {
continue;
}
break;
case op_src:
if (! (list->op.access & op_read) ) {
continue;
}
break;
case op_ro:
if (! (list->op.access & op_read) ||
(list->op.access & op_write ) ) {
continue;
}
break;
case op_wo:
if (! (list->op.access & op_write) ||
(list->op.access & op_read ) ) {
continue;
}
break;
case op_xo:
if (! (list->op.access & op_execute) ) {
continue;
}
break;
case op_rw:
if (! (list->op.access & op_write) ||
! (list->op.access & op_read ) ) {
continue;
}
break;
case op_implicit: case op_explicit: /* make gcc happy */
break;
}
/* any non-continue ends up here: invoke the callback */
(*func)( &list->op, this, arg );
}
return 1;
}
static void count_operand( x86_op_t */*op*/, x86_insn_t */*insn*/, void *arg ) {
size_t * count = (size_t *) arg;
*count = *count + 1;
static void count_operand( x86_op_t *op, x86_insn_t *insn, void *arg ) {
size_t * count = (size_t *) arg;
*count = *count + 1;
}
/** convenience routine: returns count of operands matching 'type' */
size_t x86_insn_t::x86_operand_count( enum x86_op_foreach_type type ) {
size_t count = 0;
size_t count = 0;
/* save us a list traversal for common counts... */
if ( type == op_any ) {
return operand_count;
} else if ( type == op_explicit ) {
return explicit_count;
}
/* save us a list traversal for common counts... */
if ( type == op_any ) {
return operand_count;
} else if ( type == op_explicit ) {
return explicit_count;
}
x86_operand_foreach( count_operand, &count, type );
return count;
x86_operand_foreach( count_operand, &count, type );
return count;
}
/* accessor functions */
x86_op_t * x86_insn_t::operand_1st() {
if (! explicit_count ) {
return NULL;
}
x86_op_t * x86_insn_t::x86_operand_1st() {
if (! explicit_count ) {
return NULL;
}
return &(operands->op);
return &(operands->op);
}
x86_op_t * x86_insn_t::operand_2nd( ) {
if ( explicit_count < 2 ) {
return NULL;
}
x86_op_t * x86_insn_t::x86_operand_2nd( ) {
if ( explicit_count < 2 ) {
return NULL;
}
return &(operands->next->op);
return &(operands->next->op);
}
x86_op_t * x86_insn_t::operand_3rd( ) {
if ( explicit_count < 3 ) {
return NULL;
}
return &(operands->next->next->op);
x86_op_t * x86_insn_t::x86_operand_3rd( ) {
if ( explicit_count < 3 ) {
return NULL;
}
return &(operands->next->next->op);
}


@ -1,47 +1,37 @@
PROJECT(dcc_original)
cmake_minimum_required(VERSION 3.1)
set(CMAKE_INCLUDE_CURRENT_DIR ON)
set(CMAKE_AUTOMOC ON)
find_package(Qt5Core)
CMAKE_MINIMUM_REQUIRED(VERSION 2.8)
OPTION(dcc_build_tests "Enable unit tests." OFF)
#SET(LIBRARY_OUTPUT_PATH ${PROJECT_SOURCE_DIR})
ADD_DEFINITIONS(-D_CRT_SECURE_NO_WARNINGS -D__UNIX__ -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS)
IF("${CMAKE_CXX_COMPILER_ID}" STREQUAL "MSVC")
ADD_DEFINITIONS(-D_CRT_SECURE_NO_WARNINGS -D__UNIX__ -D_CRT_NONSTDC_NO_DEPRECATE -DNOMINMAX)
#OPTION(dcc_build_tests "Enable unit tests." OFF)
ADD_DEFINITIONS(-D_CRT_SECURE_NO_WARNINGS -D__UNIX__ -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS)
IF(CMAKE_BUILD_TOOL MATCHES "(msdev|devenv|nmake)")
ADD_DEFINITIONS(-D_CRT_SECURE_NO_WARNINGS -D__UNIX__ -D_CRT_NONSTDC_NO_DEPRECATE)
ADD_DEFINITIONS(/W4)
ELSE()
#-D_GLIBCXX_DEBUG
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -std=c++11")
SET(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} " ) #--coverage
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall --std=c++0x")
SET(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -D_GLIBCXX_DEBUG " ) #--coverage
ENDIF()
SET(CMAKE_CXX_STANDARD 11)
SET(CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/CMakeScripts;${CMAKE_MODULE_PATH})
SET(EXECUTABLE_OUTPUT_PATH ${PROJECT_SOURCE_DIR})
include(cotire)
FIND_PACKAGE(Boost)
IF(dcc_build_tests)
enable_testing()
FIND_PACKAGE(GMock)
ENDIF()
#FIND_PACKAGE(LLVM)
#FIND_PACKAGE(Boost)
#IF(dcc_build_tests)
# FIND_PACKAGE(GMock)
#ENDIF()
#ADD_SUBDIRECTORY(3rd_party)
#llvm_map_components_to_libraries(REQ_LLVM_LIBRARIES jit native mc support)
INCLUDE_DIRECTORIES(
3rd_party/libdisasm
# 3rd_party/libdisasm
include
include/idioms
common
${Boost_INCLUDE_DIRS}
# include/idioms
# ${Boost_INCLUDE_DIRS}
# ${LLVM_INCLUDE_DIRS}
)
ADD_SUBDIRECTORY(3rd_party)
ADD_SUBDIRECTORY(common)
ADD_SUBDIRECTORY(tools)
set(dcc_LIB_SOURCES
src/CallConvention.cpp
set(dcc_SOURCES
src/ast.cpp
src/backend.cpp
src/bundle.cpp
@ -49,94 +39,47 @@ set(dcc_LIB_SOURCES
src/comwrite.cpp
src/control.cpp
src/dataflow.cpp
src/dcc.cpp
src/disassem.cpp
src/DccFrontend.cpp
src/error.cpp
src/fixwild.cpp
src/frontend.cpp
src/graph.cpp
src/hlicode.cpp
src/hltype.cpp
src/machine_x86.cpp
src/icode.cpp
src/RegisterNode
src/idioms.cpp
src/idioms/idiom1.cpp
src/idioms/arith_idioms.cpp
src/idioms/call_idioms.cpp
src/idioms/epilogue_idioms.cpp
src/idioms/mov_idioms.cpp
src/idioms/neg_idioms.cpp
src/idioms/shift_idioms.cpp
src/idioms/xor_idioms.cpp
src/locident.cpp
src/liveness_set.cpp
src/parser.cpp
src/perfhlib.cpp
src/procs.cpp
src/project.cpp
src/Procedure.cpp
src/proplong.cpp
src/reducible.cpp
src/scanner.cpp
src/symtab.cpp
src/udm.cpp
src/BasicBlock.cpp
src/dcc_interface.cpp
)
set(dcc_SOURCES
src/dcc.cpp
)
set(dcc_HEADERS
include/ast.h
include/bundle.h
include/BinaryImage.h
include/DccFrontend.h
include/Enums.h
include/dcc.h
include/disassem.h
include/dosdcc.h
include/error.h
include/graph.h
include/hlicode.h
include/machine_x86.h
include/icode.h
include/idioms/idiom.h
include/idioms/idiom1.h
include/idioms/arith_idioms.h
include/idioms/call_idioms.h
include/idioms/epilogue_idioms.h
include/idioms/mov_idioms.h
include/idioms/neg_idioms.h
include/idioms/shift_idioms.h
include/idioms/xor_idioms.h
include/locident.h
include/CallConvention.h
include/project.h
include/perfhlib.h
include/scanner.h
include/state.h
include/symtab.h
include/types.h
include/Procedure.h
include/StackFrame.h
include/BasicBlock.h
include/dcc_interface.h
)
SOURCE_GROUP(Source FILES ${dcc_SOURCES})
SOURCE_GROUP(Headers FILES ${dcc_HEADERS})
ADD_LIBRARY(dcc_lib STATIC ${dcc_LIB_SOURCES} ${dcc_HEADERS})
qt5_use_modules(dcc_lib Core)
#cotire(dcc_lib)
ADD_EXECUTABLE(dcc_original ${dcc_SOURCES} ${dcc_HEADERS})
ADD_DEPENDENCIES(dcc_original dcc_lib)
TARGET_LINK_LIBRARIES(dcc_original dcc_lib dcc_hash disasm_s)
qt5_use_modules(dcc_original Core)
SET_PROPERTY(TARGET dcc_original PROPERTY CXX_STANDARD 11)
SET_PROPERTY(TARGET dcc_original PROPERTY CXX_STANDARD_REQUIRED ON)
#ADD_SUBDIRECTORY(gui)
if(dcc_build_tests)
ADD_SUBDIRECTORY(src)
endif()
#TARGET_LINK_LIBRARIES(dcc_original disasm_s ${REQ_LLVM_LIBRARIES})
#if(dcc_build_tests)
#ADD_SUBDIRECTORY(src)
#endif()

File diff suppressed because it is too large

339
LICENSE

@ -1,339 +0,0 @@
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License.

129
Readme.md

@ -1,129 +0,0 @@
I've fixed many issues in this codebase, including memory reallocation during decompilation.
To reflect those fixes, I've edited the original readme a bit.
* * *
dcc Distribution
================
[![Join the chat at https://gitter.im/nemerle/dcc](https://badges.gitter.im/nemerle/dcc.svg)](https://gitter.im/nemerle/dcc?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
The code provided in this distribution is (C) by their authors:
- Cristina Cifuentes (most of dcc code)
- Mike van Emmerik (signatures and prototype code)
- Jeff Ledermann (some disassembly code)
and is provided "as is". Additional contributor list is available
[on GitHub](https://github.com/nemerle/dcc/graphs/contributors).
The following files are included in the dccoo.tar.gz distribution:
- dcc.zip (dcc.exe DOS program, 1995)
- dccsrc.zip (source code *.c, *.h for dcc, 1993-1994)
- dcc32.zip (dcc_oo.exe 32 bit console (Win95/Win-NT) program, 1997)
- dccsrcoo.zip (source code *.cpp, *.h for "oo" dcc, 1993-1997)
- dccbsig.zip (library signatures for Borland C compilers, 1994)
- dccmsig.zip (library signatures for Microsoft C compilers, 1994)
- dcctpsig.zip (library signatures for Turbo Pascal compilers, 1994)
- dcclibs.dat (prototype file for C headers, 1994)
- test.zip (sample test files: *.c *.exe *.b, 1993-1996)
- makedsig.zip (creates a .sig file from a .lib C file, 1994)
- makedstp.zip (creates a .sig file from a Pascal library file, 1994)
- readsig.zip (reads signatures in a .sig file, 1994)
- dispsrch.zip (displays the name of a function given a signature, 1994)
- parsehdr.zip (generates a prototype file (dcclibs.dat) from C *.h files, 1994)
Note that the dcc_oo.exe program (in dcc32.zip) is a 32 bit program,
so it won't work under Windows 3.1. Also, it is a console mode program,
meaning that it has to be run in the "Command Prompt" window (sometimes
known as the "Dos Box"). It is not a GUI program.
The following files are included in the test.zip file: fibo,
benchsho, benchlng, benchfn, benchmul, byteops, intops, longops,
max, testlong, matrixmu, strlen, dhamp.
The version of dcc included in this distribution (dccsrcoo.zip and
dcc32.exe) is a bit better than the first release, but it is still
broken in some cases, and we do not have the time to work on this
project at present, so we cannot provide any changes.
Comments on individual files:
- fibo (fibonacci): the small model (fibos.exe) decompiles correctly,
the large model (fibol.exe) expects an extra argument for
`scanf()`. This argument is the segment and is not displayed.
- benchsho: the first `scanf()` takes loc0 as an argument. This is
part of a long variable, but dcc does not have any clue at that
stage that the stack offset pushed on the stack is to be used
as a long variable rather than an integer variable.
- benchlng: as part of the `main()` code, `LO(loc1) | HI(loc1)` should
be displayed instead of `loc3 | loc9`. These two integer variables
are equivalent to the one long loc1 variable.
- benchfn: see benchsho.
- benchmul: see benchsho.
- byteops: decompiles correctly.
- intops: the def-use (du) analysis for `DIV` and `MOD` is broken. dcc currently
generates code for a long and an integer temporary register that
were used as part of the analysis.
- longops: decompiles correctly.
- max: decompiles correctly.
- testlong: this example decompiles correctly given the algorithms
implemented in dcc. However, it shows that when long variables
are defined and used as integers (or long) without giving dcc
any hint that this is happening, the variable will be treated as
two integer variables. This is because the assembly code is
expressed in terms of integer registers, and long registers are not
available on the 80286, so a long variable is equivalent to two integer
registers. dcc only learns of this through idioms such as the
addition of two long variables.
- matrixmu: decompiles correctly. Shows that arrays are not supported
in dcc.
- strlen: decompiles correctly. Shows that pointers are partially
supported by dcc.
- dhamp: this program has far more data types than what dcc recognizes
at present.
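
The benchlng and testlong notes above both come down to a 32-bit long living in two 16-bit halves. The following is a minimal sketch of that idea, not code taken from dcc: the helper names `LO`, `HI`, `makeLong` and `addLong16` are hypothetical, chosen only to mirror the `LO(loc1) | HI(loc1)` notation used above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not part of dcc), named after the LO()/HI() notation above.

// Low 16 bits of a 32-bit value: the half a 16-bit compiler keeps in one register.
inline uint16_t LO(uint32_t v) { return static_cast<uint16_t>(v & 0xFFFFu); }

// High 16 bits of a 32-bit value: the half kept in a second register.
inline uint16_t HI(uint32_t v) { return static_cast<uint16_t>(v >> 16); }

// Recombine two 16-bit halves into one 32-bit value.
inline uint32_t makeLong(uint16_t hi, uint16_t lo)
{
    return (static_cast<uint32_t>(hi) << 16) | lo;
}

// 32-bit addition the way 16-bit code performs it: add the low words,
// then add the high words plus the carry out of the low addition.
// This add-then-add-with-carry pair is the kind of idiom that reveals a long.
inline uint32_t addLong16(uint32_t a, uint32_t b)
{
    uint32_t lowSum = static_cast<uint32_t>(LO(a)) + LO(b);
    uint16_t carry  = static_cast<uint16_t>(lowSum >> 16);
    uint16_t lo     = static_cast<uint16_t>(lowSum & 0xFFFFu);
    uint16_t hi     = static_cast<uint16_t>(HI(a) + HI(b) + carry);
    return makeLong(hi, lo);
}
```

Without an idiom like the one in `addLong16`, the two halves look like unrelated integers, which is why a test such as `LO(loc1) | HI(loc1)` can surface as `loc3 | loc9`.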
Our thanks to Gary Shaffstall for some debugging work. Current bugs
are:
- [ ] if the code generated for a single output line is too long, the
(static) buffer used for that line is clobbered. Solution: make the
buffer larger (currently 200 chars).
- [ ] the large memory model problem with `scanf()` (see fibo above)
- [ ] dcc's error message shows a p option available which doesn't
exist, and doesn't show an i option which exists.
- [x] there is a nasty problem whereby some arrays can get reallocated
to a new address, and some pointers can become invalid. This mainly
tends to happen to larger executable files. A major rewrite will
probably be required to fix this.
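
As an illustration of the first bug in the list, here is a hedged sketch of a bounded write into a fixed line buffer. It is not dcc's code: `kLineBufSize` and `formatLine` are hypothetical names, and the 200-character size is simply the figure quoted above. Instead of enlarging the buffer, the sketch reports truncation so the caller can react.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Assumed size, taken from the "currently 200 chars" remark above.
const std::size_t kLineBufSize = 200;

// Hypothetical helper: formats "lhs = rhs;" into a fixed-size line buffer.
// snprintf never writes past the buffer; the return value tells us whether
// the whole line fitted (true) or had to be truncated (false).
inline bool formatLine(char (&buf)[kLineBufSize], const char *lhs, const char *rhs)
{
    int n = std::snprintf(buf, sizeof buf, "%s = %s;", lhs, rhs);
    return n >= 0 && static_cast<std::size_t>(n) < sizeof buf;
}
```

A caller that sees `false` could fall back to a dynamically sized `std::string`, which avoids the fixed limit altogether.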
For more information refer to the thesis "Reverse Compilation
Techniques" by Cristina Cifuentes, Queensland University of
Technology, 1994, and the dcc home page:
http://www.it.uq.edu.au/groups/csm/dcc_readme.html
Please note that the executable version of dcc provided in this
distribution does not necessarily match the source code provided;
some changes were made without us keeping track of every one.
Using dcc
---------
Here is a very brief summary of switches for dcc:
* `a1`, `a2`: assembler output, before and after re-ordering of input code
* `c`: Attempt to follow control through indirect call instructions
* `i`: Enter interactive disassembler
* `m`: Memory map
* `s`: Statistics summary
* `v`, `V`: verbose (and Very verbose)
* `o` filename: Use filename as assembler output file
If dcc encounters illegal instructions, it will attempt to enter the so-called
interactive disassembler. The idea of this was to allow commands to fix the
problem so that dcc could continue, but no such changes are implemented
as yet. (Note: the Unix versions do not have the interactive disassembler.) If
you get into this, you can get out of it by pressing `^X` (control-X). Once dcc
has entered the interactive disassembler, however, there is little chance that
it will recover and produce useful output.
If dcc loads the signature file `dccxxx.sig`, this means that it has not
recognised the compiler library used. You can place the signatures in a
directory other than the one you are working in if you set the DCC
environment variable to point to their path. Note that if dcc can't find its
signature files, it will be severely handicapped.
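
The lookup through the DCC environment variable can be sketched as follows. This is an assumption-laden illustration, not dcc's actual loader: `signaturePath` and `locateSignatureFile` are hypothetical names, and falling back to the current directory when DCC is unset is a guess about reasonable behaviour rather than something the text above confirms.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: joins the directory named by the DCC variable with a
// signature file name. A null or empty directory means "use the file name as
// given", i.e. look in the current directory (assumed fallback).
inline std::string signaturePath(const char *dccDir, const std::string &fileName)
{
    if (dccDir == nullptr || *dccDir == '\0')
        return fileName;
    std::string path(dccDir);
    // Add a separator unless the directory already ends with one.
    if (path.back() != '/' && path.back() != '\\')
        path += '/';
    return path + fileName;
}

// Hypothetical wrapper: reads the DCC environment variable with std::getenv.
inline std::string locateSignatureFile(const std::string &fileName)
{
    return signaturePath(std::getenv("DCC"), fileName);
}
```

Keeping the path logic separate from the `std::getenv` call makes it possible to check the joining rules without touching the process environment.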

View File

@ -1,7 +1,6 @@
#!/bin/bash
#cd bld
#make -j5
#cd ..
mkdir -p tests/outputs
cd bld
make -j5
cd ..
./test_use_base.sh
./regression_tester.rb ./dcc_original -s -c 2>stderr >stdout; diff -wB tests/prev/ tests/outputs/
./regression_tester.rb ./bld/dcc_original -s -c 2>stderr >stdout; diff tests/prev/ tests/outputs/

View File

@ -1,7 +0,0 @@
set(SRC
perfhlib.cpp
perfhlib.h
PatternCollector.h
)
add_library(dcc_hash STATIC ${SRC})

View File

@ -1,82 +0,0 @@
#ifndef PATTERNCOLLECTOR
#define PATTERNCOLLECTOR
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <vector>
#define SYMLEN 16 /* Number of chars in the symbol name, incl null */
#define PATLEN 23 /* Number of bytes in the pattern part */
struct HASHENTRY
{
char name[SYMLEN]; /* The symbol name */
uint8_t pat [PATLEN]; /* The pattern */
uint16_t offset; /* Offset (needed temporarily) */
};
struct PatternCollector {
uint8_t buf[100], bufSave[7]; /* Temp buffer for reading the file */
uint16_t readShort(FILE *f)
{
uint8_t b1, b2;
if (fread(&b1, 1, 1, f) != 1)
{
printf("Could not read\n");
exit(11);
}
if (fread(&b2, 1, 1, f) != 1)
{
printf("Could not read\n");
exit(11);
}
return (b2 << 8) + b1;
}
void grab(FILE *f,int n)
{
if (fread(buf, 1, n, f) != (size_t)n)
{
printf("Could not read\n");
exit(11);
}
}
uint8_t readByte(FILE *f)
{
uint8_t b;
if (fread(&b, 1, 1, f) != 1)
{
printf("Could not read\n");
exit(11);
}
return b;
}
uint16_t readWord(FILE *fl)
{
uint8_t b1, b2;
b1 = readByte(fl);
b2 = readByte(fl);
return b1 + (b2 << 8);
}
/* Called by map(). Return the i+1th key in *pKeys */
uint8_t *getKey(int i)
{
return keys[i].pat;
}
/* Display key i */
void dispKey(int i)
{
printf("%s", keys[i].name);
}
std::vector<HASHENTRY> keys; /* array of keys */
virtual int readSyms(FILE *f)=0;
};
#endif // PATTERNCOLLECTOR

View File

@ -1,38 +0,0 @@
#pragma once
#include <stdint.h>
/** Perfect hashing function library. Contains functions to generate perfect
hashing functions */
struct PatternCollector;
struct PerfectHash {
uint16_t *T1base;
uint16_t *T2base; /* Pointers to start of T1, T2 */
short *g; /* g[] */
int NumEntry; /* Number of entries in the hash table (# keys) */
int EntryLen; /* Size (bytes) of each entry (size of keys) */
int SetSize; /* Size of the char set */
char SetMin; /* First char in the set */
int NumVert; /* c times NumEntry */
/** Set the parameters for the hash table */
void setHashParams(int _numEntry, int _entryLen, int _setSize, char _setMin, int _numVert);
public:
void map(PatternCollector * collector); /* Part 1 of creating the tables */
void hashCleanup(); /* Frees memory allocated by setHashParams() */
void assign(); /* Part 2 of creating the tables */
int hash(uint8_t *string); /* Hash the string to an int 0 .. NUMENTRY-1 */
const uint16_t *readT1(void) const { return T1base; }
const uint16_t *readT2(void) const { return T2base; }
const uint16_t *readG(void) const { return (uint16_t *)g; }
uint16_t *readT1(void){ return T1base; }
uint16_t *readT2(void){ return T2base; }
uint16_t *readG(void) { return (uint16_t *)g; }
private:
void initGraph();
void addToGraph(int e, int v1, int v2);
bool isCycle();
bool DFS(int parentE, int v);
void traverse(int u);
PatternCollector *m_collector; /* used to retrieve the keys */
};

View File

@ -1,4 +1,3 @@
#!/bin/bash
makedir -p tests/outputs
./test_use_all.sh
./regression_tester.rb ./dcc_original -s -c 2>stderr >stdout; diff -wB tests/prev/ tests/outputs/
./regression_tester.rb ./bld/dcc_original -s -c 2>stderr >stdout; diff tests/prev/ tests/outputs/

View File

@ -1,135 +0,0 @@
#pragma once
#include <list>
#include <vector>
#include <bitset>
#include <string>
#include <boost/range/iterator_range.hpp>
#include "icode.h"
#include "types.h"
#include "graph.h"
//#include "icode.h"
/* Basic block (BB) node definition */
struct Function;
class CIcodeRec;
struct BB;
struct LOCAL_ID;
struct interval;
//TODO: consider default address value -> INVALID
struct TYPEADR_TYPE
{
uint32_t ip; /* Out edge icode address */
BB * BBptr; /* Out edge pointer to next BB */
interval *intPtr; /* Out edge ptr to next interval*/
TYPEADR_TYPE(uint32_t addr=0) : ip(addr),BBptr(nullptr),intPtr(nullptr)
{}
TYPEADR_TYPE(interval *v) : ip(0),BBptr(nullptr),intPtr(v)
{}
};
struct BB
{
friend struct Function;
private:
BB(const BB&);
BB() : nodeType(0),traversed(DFS_NONE),
numHlIcodes(0),flg(0),
inEdges(0),
edges(0),beenOnH(0),inEdgeCount(0),reachingInt(0),
inInterval(0),correspInt(0),
dfsFirstNum(0),dfsLastNum(0),immedDom(0),ifFollow(0),loopType(NO_TYPE),latchNode(0),
numBackEdges(0),loopHead(0),loopFollow(0),caseHead(0),caseTail(0),index(0)
{
}
//friend class SymbolTableListTraits<BB, Function>;
typedef boost::iterator_range<iICODE> rCODE;
rCODE instructions;
rCODE &my_range() {return instructions;}
public:
struct ValidFunctor
{
bool operator()(BB *p) {return p->valid();}
};
iICODE begin();
iICODE end() const;
riICODE rbegin();
riICODE rend();
ICODE &front();
ICODE &back();
size_t size();
uint8_t nodeType; /* Type of node */
eDFS traversed; /* Id of the last traversal that visited this node */
int numHlIcodes; /* No. of high-level icodes */
uint32_t flg; /* BB flags */
/* In edges and out edges */
std::vector<BB *> inEdges; // does not own held pointers
//int numOutEdges; /* Number of out edges */
std::vector<TYPEADR_TYPE> edges;/* Array of ptrs. to out edges */
/* For interval construction */
int beenOnH; /* #times been on header list H */
int inEdgeCount; /* #inEdges (to find intervals) */
BB * reachingInt; /* Reaching interval header */
interval *inInterval; /* Node's interval */
/* For derived sequence construction */
interval *correspInt; //!< Corresponding interval in derived graph Gi-1
// For live register analysis
// LiveIn(b) = LiveUse(b) U (LiveOut(b) - Def(b))
LivenessSet liveUse; /* LiveUse(b) */
LivenessSet def; /* Def(b) */
LivenessSet liveIn; /* LiveIn(b) */
LivenessSet liveOut; /* LiveOut(b) */
/* For structuring analysis */
int dfsFirstNum; /* DFS #: first visit of node */
int dfsLastNum; /* DFS #: last visit of node */
int immedDom; /* Immediate dominator (dfsLast index) */
int ifFollow; /* node that ends the if */
eNodeHeaderType loopType; /* Type of loop (if any) */
int latchNode; /* latching node of the loop */
size_t numBackEdges; /* # of back edges */
int loopHead; /* most nested loop head to which this node belongs (dfsLast) */
int loopFollow; /* node that follows the loop */
int caseHead; /* most nested case to which this node belongs (dfsLast) */
int caseTail; /* tail node for the case */
int index; /* Index, used in several ways */
static BB * Create(void *ctx=0,const std::string &s="",Function *parent=0,BB *insertBefore=0);
static BB * CreateIntervalBB(Function *parent);
static BB * Create(const rCODE &r, eBBKind _nodeType, Function *parent);
void writeCode(int indLevel, Function *pProc, int *numLoc, int latchNode, int ifFollow);
void mergeFallThrough(CIcodeRec &Icode);
void dfsNumbering(std::vector<BB *> &dfsLast, int *first, int *last);
void displayDfs();
void display();
/// getParent - Return the enclosing method, or null if none
///
const Function *getParent() const { return Parent; }
Function *getParent() { return Parent; }
void writeBB(QTextStream & ostr, int lev, Function *pProc, int *numLoc);
BB * rmJMP(int marker, BB *pBB);
void genDU1();
void findBBExps(LOCAL_ID &locals, Function *f);
bool valid() {return 0==(flg & INVALID_BB); }
bool wasTraversedAtLevel(int l) const {return traversed==l;}
ICODE * writeLoopHeader(int &indLevel, Function* pProc, int *numLoc, BB *&latch, bool &repCond);
void addOutEdge(uint32_t ip) // TODO: fix this
{
edges.push_back(TYPEADR_TYPE(ip));
}
void addOutEdgeInterval(interval *i) // TODO: fix this
{
edges.push_back(TYPEADR_TYPE(i));
}
void RemoveUnusedDefs(eReg regi, int defRegIdx, iICODE picode);
private:
bool FindUseBeforeDef(eReg regi, int defRegIdx, iICODE start_at);
void ProcessUseDefForFunc(eReg regi, int defRegIdx, ICODE &picode);
bool isEndOfPath(int latch_node_idx) const;
Function *Parent;
};

View File

@ -1,25 +0,0 @@
#pragma once
#include <stdint.h>
#include <vector>
struct PROG /* Loaded program image parameters */
{
uint16_t initCS=0;
uint16_t initIP=0; /* These are initial load values */
uint16_t initSS=0; /* Probably not of great interest */
uint16_t initSP=0;
bool fCOM=false; /* Flag set if COM program (else EXE)*/
int cReloc=0; /* No. of relocation table entries */
std::vector<uint32_t> relocTable; /* Ptr. to relocation table */
uint8_t * map=nullptr; /* Memory bitmap ptr */
int cProcs=0; /* Number of procedures so far */
int offMain=0; /* The offset of the main() proc */
uint16_t segMain=0; /* The segment of the main() proc */
bool bSigs=false; /* True if signatures loaded */
int cbImage=0; /* Length of image in bytes */
uint8_t * Imagez=nullptr; /* Allocated by loader to hold entire program image */
int addressingMode=0;
public:
const uint8_t *image() const {return Imagez;}
void displayLoadInfo();
};

View File

@ -1,33 +0,0 @@
#pragma once
#include "ast.h"
class QTextStream;
struct CConv {
enum Type {
eUnknown=0,
eCdecl,
ePascal
};
virtual void processHLI(Function *func, Expr *_exp, iICODE picode)=0;
virtual void writeComments(QTextStream &)=0;
static CConv * create(Type v);
protected:
};
struct C_CallingConvention : public CConv {
virtual void processHLI(Function *func, Expr *_exp, iICODE picode);
virtual void writeComments(QTextStream &);
private:
int processCArg(Function *callee, Function *pProc, ICODE *picode, size_t numArgs);
};
struct Pascal_CallingConvention : public CConv {
virtual void processHLI(Function *func, Expr *_exp, iICODE picode);
virtual void writeComments(QTextStream &);
};
struct Unknown_CallingConvention : public CConv {
void processHLI(Function *func, Expr *_exp, iICODE picode) {}
virtual void writeComments(QTextStream &);
};

View File

@ -1,19 +0,0 @@
#pragma once
#include "Procedure.h"
/* CALL GRAPH NODE */
struct CALL_GRAPH
{
ilFunction proc; /* Pointer to procedure in pProcList */
std::vector<CALL_GRAPH *> outEdges; /* array of out edges */
public:
void write();
CALL_GRAPH()
{
}
public:
void writeNodeCallGraph(int indIdx);
bool insertCallGraph(ilFunction caller, ilFunction callee);
bool insertCallGraph(Function *caller, ilFunction callee);
void insertArc(ilFunction newProc);
};
//extern CALL_GRAPH * callGraph; /* Pointer to the head of the call graph */

View File

@ -1,17 +0,0 @@
#pragma once
#include <QObject>
class Project;
class DccFrontend : public QObject
{
Q_OBJECT
void LoadImage();
void parse(Project &proj);
std::string m_fname;
public:
explicit DccFrontend(QObject *parent = 0);
bool FrontEnd(); /* frontend.c */
signals:
public slots:
};

View File

@ -1,285 +0,0 @@
#pragma once
/* Register types */
enum regType
{
BYTE_REG,
WORD_REG
};
enum condId
{
UNDEF=0,
GLOB_VAR, /* global variable */
REGISTER, /* register */
LOCAL_VAR, /* negative disp */
PARAM, /* positive disp */
GLOB_VAR_IDX, /* indexed global variable *//*** should merge w/glob-var*/
CONSTANT, /* constant */
STRING, /* string */
LONG_VAR, /* long variable */
FUNCTION, /* function */
OTHER /* other **** tmp solution */
};
enum condOp
{
/* For conditional expressions */
LESS_EQUAL, /* <= */
LESS, /* < */
EQUAL, /* == */
NOT_EQUAL, /* != */
GREATER, /* > */
GREATER_EQUAL, /* >= */
/* For general expressions */
AND, /* & */
OR, /* | */
XOR, /* ^ */
NOT, /* ~ */ /* 1's complement */
ADD, /* + */
SUB, /* - */
MUL, /* * */
DIV, /* / */
SHR, /* >> */
SHL, /* << */
MOD, /* % */
DBL_AND, /* && */
DBL_OR, /* || */
DUMMY /* */
};
/* LOW_LEVEL operand location: source or destination */
enum opLoc
{
SRC, /* Source operand */
DST, /* Destination operand */
LHS_OP /* Left-hand side operand (for HIGH_LEVEL) */
};
/* LOW_LEVEL icode flags */
#define NO_SRC_B 0xF7FFFF /* Masks off SRC_B */
enum eLLFlags
{
B = 0x0000001, /* uint8_t operands (value implicitly used) */
I = 0x0000002, /* Immed. source */
NOT_HLL = 0x0000004, /* Not HLL inst. */
FLOAT_OP = 0x0000008, /* ESC or WAIT */
SEG_IMMED = 0x0000010, /* Number is relocated segment value */
IMPURE = 0x0000020, /* Instruction modifies code */
WORD_OFF = 0x0000040, /* Inst has uint16_t offset ie.could be address */
TERMINATES = 0x0000080, /* Instruction terminates program */
CASE = 0x0000100, /* Label as case part of switch */
SWITCH = 0x0000200, /* Treat indirect JMP as switch stmt */
TARGET = 0x0000400, /* Jump target */
SYNTHETIC = 0x0000800, /* Synthetic jump instruction */
NO_LABEL = 0x0001000, /* Immed. jump cannot be linked to a label */
NO_CODE = 0x0002000, /* Hole in Icode array */
SYM_USE = 0x0004000, /* Instruction uses a symbol */
SYM_DEF = 0x0008000, /* Instruction defines a symbol */
NO_SRC = 0x0010000, /* Opcode takes no source */
NO_OPS = 0x0020000, /* Opcode takes no operands */
IM_OPS = 0x0040000, /* Opcode takes implicit operands */
SRC_B = 0x0080000, /* Source operand is uint8_t (dest is uint16_t) */
HLL_LABEL = 0x0100000, /* Icode has a high level language label */
IM_DST = 0x0200000, /* Implicit DST for opcode (SIGNEX) */
IM_SRC = 0x0400000, /* Implicit SRC for opcode (dx:ax) */
IM_TMP_DST = 0x0800000, /* Implicit rTMP DST for opcode (DIV/IDIV) */
JMP_ICODE = 0x1000000, /* Jmp dest immed.op converted to icode index */
JX_LOOP = 0x2000000, /* Cond jump is part of loop conditional exp */
REST_STK = 0x4000000 /* Stack needs to be restored after CALL */
#define ICODEMASK 0x0FF00FF /* Masks off parser flags */
};
/* Types of icodes */
enum icodeType
{
NOT_SCANNED_ICODE = 0, // not even scanned yet
LOW_LEVEL_ICODE, // low-level icode
HIGH_LEVEL_ICODE // high-level icode
};
/* LOW_LEVEL icode opcodes */
enum llIcode
{
iINVALID=-1,
iCBW, /* 0 */
iAAA,
iAAD,
iAAM,
iAAS,
iADC,
iADD,
iAND,
iBOUND,
iCALL,
iCALLF, /* 10 */
iCLC,
iCLD,
iCLI,
iCMC,
iCMP,
iCMPS,
iREPNE_CMPS,
iREPE_CMPS,
iDAA,
iDAS, /* 20 */
iDEC,
iDIV,
iENTER,
iESC,
iHLT,
iIDIV,
iIMUL,
iIN,
iINC,
iINS, /* 30 */
iREP_INS,
iINT,
iIRET,
iJB,
iJBE,
iJAE,
iJA,
iJE,
iJNE,
iJL, /* 40 */
iJGE,
iJLE,
iJG,
iJS,
iJNS,
iJO,
iJNO,
iJP,
iJNP,
iJCXZ, /* 50 */
iJMP,
iJMPF,
iLAHF,
iLDS,
iLEA,
iLEAVE,
iLES,
iLOCK,
iLODS,
iREP_LODS, /* 60 */
iLOOP,
iLOOPE,
iLOOPNE,
iMOV, /* 64 */
iMOVS,
iREP_MOVS,
iMUL, /* 67 */
iNEG,
iNOT,
iOR, /* 70 */
iOUT,
iOUTS,
iREP_OUTS,
iPOP,
iPOPA,
iPOPF,
iPUSH, // 77
iPUSHA,
iPUSHF,
iRCL, /* 80 */
iRCR,
iROL,
iROR,
iRET, /* 84 */
iRETF,
iSAHF,
iSAR,
iSHL,
iSHR,
iSBB, /* 90 */
iSCAS,
iREPNE_SCAS,
iREPE_SCAS,
iSIGNEX,
iSTC,
iSTD,
iSTI,
iSTOS,
iREP_STOS,
iSUB, /* 100 */
iTEST,
iWAIT,
iXCHG,
iXLAT,
iXOR,
iINTO,
iNOP,
iREPNE,
iREPE,
iMOD /* 110 */
};
/* Conditional Expression enumeration nodes and operators */
enum condNodeType
{
UNKNOWN_OP=0,
BOOLEAN_OP, /* condOps */
NEGATION, /* not (2's complement) */
ADDRESSOF, /* addressOf (&) */
DEREFERENCE, /* contents of (*) */
IDENTIFIER, /* {register | local | param | constant | global} */
/* The following are only available to C programs */
POST_INC, /* ++ (post increment) */
POST_DEC, /* -- (post decrement) */
PRE_INC, /* ++ (pre increment) */
PRE_DEC /* -- (pre decrement) */
} ;
/* Enumeration to determine whether pIcode points to the high or low part
* of a long number */
enum hlFirst
{
HIGH_FIRST, /* High value is first */
LOW_FIRST /* Low value is first */
};
/* HIGH_LEVEL icodes opcodes */
enum hlIcode
{
HLI_INVALID=0,
HLI_ASSIGN, /* := */
HLI_CALL, /* Call procedure */
HLI_JCOND, /* Conditional jump */
HLI_RET, /* Return from procedure */
/* pseudo high-level icodes */
HLI_POP, /* Pop expression */
HLI_PUSH /* Push expression */
} ;
/* Type definitions used in the decompiled program */
enum hlType
{
TYPE_UNKNOWN = 0, /* unknown so far */
TYPE_BYTE_SIGN, /* signed byte (8 bits) */
TYPE_BYTE_UNSIGN, /* unsigned byte */
TYPE_WORD_SIGN, /* signed word (16 bits) */
TYPE_WORD_UNSIGN, /* unsigned word (16 bits) */
TYPE_LONG_SIGN, /* signed long (32 bits) */
TYPE_LONG_UNSIGN, /* unsigned long (32 bits) */
TYPE_RECORD, /* record structure */
TYPE_PTR, /* pointer (32 bit ptr) */
TYPE_STR, /* string */
TYPE_CONST, /* constant (any type) */
TYPE_FLOAT, /* floating point */
TYPE_DOUBLE /* double precision float */
};
/* Operand is defined, used or both flag */
enum operDu
{
eDEF=0x10, /* Operand is defined */
eUSE=0x100, /* Operand is used */
USE_DEF, /* Operand is used and defined */
NONE /* No operation is required on this operand */
};
/* LOW_LEVEL icode, DU flag bits */
enum eDuFlags
{
Cf=1,
Sf=2,
Zf=4,
Df=8
};

View File

@ -1,30 +0,0 @@
#pragma once
#include "ast.h"
#include "types.h"
#include "machine_x86.h"
struct GlobalVariable;
struct AstIdent;
struct IDENTTYPE
{
friend struct GlobalVariable;
friend struct Constant;
friend struct AstIdent;
protected:
condId idType;
public:
condId type() {return idType;}
void type(condId t) {idType=t;}
union _idNode {
int localIdx; /* idx into localId, LOCAL_VAR */
int paramIdx; /* idx into args symtab, PARAMS */
uint32_t strIdx; /* idx into image, for STRING */
int longIdx; /* idx into LOCAL_ID table, LONG_VAR*/
struct { /* for OTHER; tmp struct */
eReg seg; /* segment */
eReg regi; /* index mode */
int16_t off; /* offset */
} other;
} idNode;
IDENTTYPE() : idType(UNDEF)
{}
};

View File

@ -1,225 +0,0 @@
#pragma once
#include "BasicBlock.h"
#include "locident.h"
#include "state.h"
#include "icode.h"
#include "StackFrame.h"
#include "CallConvention.h"
#include <QtCore/QString>
#include <bitset>
#include <map>
class QIODevice;
class QTextStream;
/* PROCEDURE NODE */
struct CALL_GRAPH;
struct Expr;
struct Disassembler;
struct Function;
struct CALL_GRAPH;
struct PROG;
struct Function;
/* Procedure FLAGS */
enum PROC_FLAGS
{
PROC_BADINST=0x00000100,/* Proc contains invalid or 386 instruction */
PROC_IJMP =0x00000200,/* Proc incomplete due to indirect jmp */
PROC_ICALL =0x00000400, /* Proc incomplete due to indirect call */
PROC_HLL =0x00001000, /* Proc is likely to be from a HLL */
// CALL_PASCAL =0x00002000, /* Proc uses Pascal calling convention */
// CALL_C =0x00004000, /* Proc uses C calling convention */
// CALL_UNKNOWN=0x00008000, /* Proc uses unknown calling convention */
PROC_NEAR =0x00010000, /* Proc exits with near return */
PROC_FAR =0x00020000, /* Proc exits with far return */
GRAPH_IRRED =0x00100000, /* Proc generates an irreducible graph */
SI_REGVAR =0x00200000, /* SI is used as a stack variable */
DI_REGVAR =0x00400000, /* DI is used as a stack variable */
PROC_IS_FUNC=0x00800000, /* Proc is a function */
REG_ARGS =0x01000000, /* Proc has registers as arguments */
// PROC_VARARG =0x02000000, /* Proc has variable arguments */
PROC_OUTPUT =0x04000000, /* C for this proc has been output */
PROC_RUNTIME=0x08000000, /* Proc is part of the runtime support */
PROC_ISLIB =0x10000000, /* Proc is a library function */
PROC_ASM =0x20000000, /* Proc is an intrinsic assembler routine */
PROC_IS_HLL =0x40000000 /* Proc has HLL prolog code */
//#define CALL_MASK 0xFFFF9FFF /* Masks off CALL_C and CALL_PASCAL */
};
struct FunctionType
{
bool m_vararg=false;
bool isVarArg() const {return m_vararg;}
};
struct Assignment
{
Expr *lhs;
Expr *rhs;
};
struct JumpTable
{
uint32_t start;
uint32_t finish;
bool valid() {return start<finish;}
size_t size() { return (finish-start)/2;}
size_t entrySize() { return 2;}
void pruneEntries(uint16_t cs);
};
class FunctionCfg
{
std::list<BB*> m_listBB; /* Ptr. to BB list/CFG */
public:
typedef std::list<BB*>::iterator iterator;
iterator begin() {
return m_listBB.begin();
}
iterator end() {
return m_listBB.end();
}
BB * &front() { return m_listBB.front();}
void nodeSplitting()
{
/* Converts the irreducible graph G into an equivalent reducible one, by
* means of node splitting. */
fprintf(stderr,"Attempt to perform node splitting: NOT IMPLEMENTED\n");
}
void push_back(BB *v) { m_listBB.push_back(v);}
};
struct Function
{
typedef std::list<BB *> BasicBlockListType;
// BasicBlock iterators...
typedef BasicBlockListType::iterator iterator;
typedef BasicBlockListType::const_iterator const_iterator;
protected:
BasicBlockListType BasicBlocks; ///< The basic blocks
Function(FunctionType */*ty*/) : procEntry(0),depth(0),flg(0),cbParam(0),m_dfsLast(0),numBBs(0),
hasCase(false),liveAnal(0)
{
type = new FunctionType;
callingConv(CConv::eUnknown);
}
public:
FunctionType * type;
CConv * m_call_conv;
uint32_t procEntry; /* label number */
QString name; /* Meaningful name for this proc */
STATE state; /* Entry state */
int depth; /* Depth at which we found it - for printing */
uint32_t flg; /* Combination of Icode & Proc flags */
int16_t cbParam; /* Probable no. of bytes of parameters */
STKFRAME args; /* Array of arguments */
LOCAL_ID localId; /* Local identifiers */
ID retVal; /* Return value - identifier */
/* Icodes and control flow graph */
CIcodeRec Icode; /* Object with ICODE records */
FunctionCfg m_actual_cfg;
std::vector<BB*> m_dfsLast;
std::map<int,BB*> m_ip_to_bb;
// * (reverse postorder) order */
size_t numBBs; /* Number of BBs in the graph cfg */
bool hasCase; /* Procedure has a case node */
/* For interprocedural live analysis */
LivenessSet liveIn; /* Registers used before defined */
LivenessSet liveOut; /* Registers that may be used in successors */
bool liveAnal; /* Procedure has been analysed already */
virtual ~Function() {
delete type;
}
public:
static Function *Create(FunctionType *ty=0,int /*Linkage*/=0,const QString &nm="",void */*module*/=0)
{
Function *r=new Function(ty);
r->name = nm;
return r;
}
FunctionType *getFunctionType() const {
return type;
}
CConv *callingConv() const { return m_call_conv;}
void callingConv(CConv::Type v);
// bool anyFlagsSet(uint32_t t) { return (flg&t)!=0;}
bool hasRegArgs() const { return (flg & REG_ARGS)!=0;}
bool isLibrary() const { return (flg & PROC_ISLIB)!=0;}
void compoundCond();
void writeProcComments();
void lowLevelAnalysis();
void bindIcodeOff();
void dataFlow(LivenessSet &liveOut);
void compressCFG();
void highLevelGen();
void structure(derSeq *derivedG);
derSeq *checkReducibility();
void createCFG();
void markImpure();
void findImmedDom();
void FollowCtrl(CALL_GRAPH *pcallGraph, STATE *pstate);
void process_operands(ICODE &pIcode, STATE *pstate);
bool process_JMP(ICODE &pIcode, STATE *pstate, CALL_GRAPH *pcallGraph);
bool process_CALL(ICODE &pIcode, CALL_GRAPH *pcallGraph, STATE *pstate);
void freeCFG();
void codeGen(QIODevice & fs);
void mergeFallThrough(BB *pBB);
void structIfs();
void structLoops(derSeq *derivedG);
void buildCFG(Disassembler &ds);
void controlFlowAnalysis();
void newRegArg(iICODE picode, iICODE ticode);
void writeProcComments(QTextStream & ostr);
void displayCFG();
void displayStats();
void processHliCall(Expr *exp, iICODE picode);
void preprocessReturnDU(LivenessSet &_liveOut);
Expr * adjustActArgType(Expr *_exp, hlType forType);
QString writeCall(Function *tproc, STKFRAME &args, int *numLoc);
void processDosInt(STATE *pstate, PROG &prog, bool done);
ICODE *translate_DIV(LLInst *ll, ICODE &_Icode);
ICODE *translate_XCHG(LLInst *ll, ICODE &_Icode);
protected:
void extractJumpTableRange(ICODE& pIcode, STATE *pstate, JumpTable &table);
bool followAllTableEntries(JumpTable &table, uint32_t cs, ICODE &pIcode, CALL_GRAPH *pcallGraph, STATE *pstate);
bool removeInEdge_Flag_and_ProcessLatch(BB *pbb, BB *a, BB *b);
bool Case_X_and_Y(BB* pbb, BB* thenBB, BB* elseBB);
bool Case_X_or_Y(BB* pbb, BB* thenBB, BB* elseBB);
bool Case_notX_or_Y(BB* pbb, BB* thenBB, BB* elseBB);
bool Case_notX_and_Y(BB* pbb, BB* thenBB, BB* elseBB);
void replaceInEdge(BB* where, BB* which, BB* with);
void processExpPush(int &numHlIcodes, iICODE picode);
// TODO: replace those with friend visitor ?
void propLongReg(int loc_ident_idx, const ID &pLocId);
void propLongStk(int i, const ID &pLocId);
void propLongGlb(int i, const ID &pLocId);
void processTargetIcode(iICODE picode, int &numHlIcodes, iICODE ticode, bool isLong);
int findBackwarLongDefs(int loc_ident_idx, const ID &pLocId, iICODE iter);
int findForwardLongUses(int loc_ident_idx, const ID &pLocId, iICODE beg);
void structCases();
void findExps();
void genDU1();
void elimCondCodes();
void liveRegAnalysis(LivenessSet &in_liveOut);
void findIdioms();
void propLong();
void genLiveKtes();
bool findDerivedSeq(derSeq &derivedGi);
bool nextOrderGraph(derSeq &derivedGi);
void addOutEdgesForConditionalJump(BB* pBB, int next_ip, LLInst *ll);
private:
bool decodeIndirectJMP(ICODE &pIcode, STATE *pstate, CALL_GRAPH *pcallGraph);
bool decodeIndirectJMP2(ICODE &pIcode, STATE *pstate, CALL_GRAPH *pcallGraph);
};
typedef std::list<Function> FunctionListType;
typedef FunctionListType lFunction;
typedef lFunction::iterator ilFunction;

View File

@ -1,24 +0,0 @@
#pragma once
#include <vector>
#include <cstring>
#include "types.h"
#include "Enums.h"
#include "symtab.h"
struct STKFRAME : public SymbolTableCommon<STKSYM>
{
//std::vector<STKSYM> sym;
//STKSYM * sym; /* Symbols */
int16_t m_minOff; /* Initial offset in stack frame*/
int16_t maxOff; /* Maximum offset in stack frame*/
int cb; /* Number of bytes in arguments */
int numArgs; /* No. of arguments in the table*/
void adjustForArgType(size_t numArg_, hlType actType_);
STKFRAME() : m_minOff(0),maxOff(0),cb(0),numArgs(0)
{
}
size_t getLocVar(int off);
public:
void updateFrameOff(int16_t off, int size, uint16_t duFlag);
};

View File

@ -4,17 +4,9 @@
* Date: September 1993
* (C) Cristina Cifuentes
*/
#pragma once
#include "Enums.h"
#include "msvc_fixes.h"
#define operandSize 20
#include <boost/range/iterator_range.hpp>
#include <stdint.h>
#include <cstring>
#include <list>
static const int operandSize=20;
/* The following definitions and types define the Conditional Expression
* attributed syntax tree, as defined by the following EBNF:
CondExp ::= CondTerm AND CondTerm | CondTerm
@ -23,298 +15,123 @@ static const int operandSize=20;
Identifier ::= globalVar | register | localVar | parameter | constant
op ::= <= | < | = | != | > | >=
*/
/* Conditional Expression enumeration nodes and operators */
typedef enum {
BOOLEAN_OP, /* condOps */
NEGATION, /* not (2's complement) */
ADDRESSOF, /* addressOf (&) */
DEREFERENCE, /* contents of (*) */
IDENTIFIER, /* {register | local | param | constant | global} */
/* The following are only available to C programs */
POST_INC, /* ++ (post increment) */
POST_DEC, /* -- (post decrement) */
PRE_INC, /* ++ (pre increment) */
PRE_DEC, /* -- (pre decrement) */
} condNodeType;
typedef enum {
GLOB_VAR, /* global variable */
REGISTER, /* register */
LOCAL_VAR, /* negative disp */
PARAM, /* positive disp */
GLOB_VAR_IDX, /* indexed global variable *//*** should merge w/glob-var*/
CONSTANT, /* constant */
STRING, /* string */
LONG_VAR, /* long variable */
FUNCTION, /* function */
OTHER /* other **** tmp solution */
} condId;
typedef enum {
/* For conditional expressions */
LESS_EQUAL = 0, /* <= */
LESS, /* < */
EQUAL, /* == */
NOT_EQUAL, /* != */
GREATER, /* > */
GREATER_EQUAL, /* >= */
/* For general expressions */
AND, /* & */
OR, /* | */
XOR, /* ^ */
NOT, /* ~ */ /* 1's complement */
ADD, /* + */
SUB, /* - */
MUL, /* * */
DIV, /* / */
SHR, /* >> */
SHL, /* << */
MOD, /* % */
DBL_AND, /* && */
DBL_OR, /* || */
DUMMY, /* */
} condOp;
/* High-level BOOLEAN conditions for iJB..iJNS icodes */
static const condOp condOpJCond[12] = {LESS, LESS_EQUAL, GREATER_EQUAL, GREATER,
EQUAL, NOT_EQUAL, LESS, GREATER_EQUAL,
LESS_EQUAL, GREATER, GREATER_EQUAL, LESS};
struct AstIdent;
struct Function;
struct STKFRAME;
struct LOCAL_ID;
struct ICODE;
struct LLInst;
struct LLOperand;
struct ID;
typedef std::list<ICODE>::iterator iICODE;
typedef boost::iterator_range<iICODE> rICODE;
#include "IdentType.h"
static condOp condOpJCond[12] = {LESS, LESS_EQUAL, GREATER_EQUAL, GREATER,
EQUAL, NOT_EQUAL, LESS, GREATER_EQUAL,
LESS_EQUAL, GREATER, GREATER_EQUAL, LESS};
static condOp invCondOpJCond[12] = {GREATER_EQUAL, GREATER, LESS, LESS_EQUAL,
NOT_EQUAL, EQUAL, GREATER_EQUAL, LESS,
GREATER, LESS_EQUAL, LESS, GREATER_EQUAL};
/* Register types */
typedef enum {
BYTE_REG, WORD_REG
} regType;
typedef struct
{
condId idType;
regType regiType; /* for REGISTER only */
union _idNode {
Int regiIdx; /* index into localId, REGISTER */
Int globIdx; /* index into symtab for GLOB_VAR */
Int localIdx; /* idx into localId, LOCAL_VAR */
Int paramIdx; /* idx into args symtab, PARAMS */
Int idxGlbIdx; /* idx into localId, GLOB_VAR_IDX */
struct _kte { /* for CONSTANT only */
dword kte; /* value of the constant */
byte size; /* #bytes size constant */
} kte;
dword strIdx; /* idx into image, for STRING */
Int longIdx; /* idx into LOCAL_ID table, LONG_VAR*/
struct _call { /* for FUNCTION only */
struct _proc *proc;
struct _STKFRAME *args;
} call;
struct { /* for OTHER; tmp struct */
byte seg; /* segment */
byte regi; /* index mode */
int16 off; /* offset */
} other;
} idNode;
} IDENTTYPE;
/* Expression data type */
struct Expr
{
public:
condNodeType m_type; /* Conditional Expression Node Type */
public:
static bool insertSubTreeLongReg(Expr *exp, Expr *&tree, int longIdx);
static bool insertSubTreeReg(Expr *&tree, Expr *_expr, eReg regi, const LOCAL_ID *locsym);
static bool insertSubTreeReg(AstIdent *&tree, Expr *_expr, eReg regi, const LOCAL_ID *locsym);
public:
typedef struct _condExpr {
condNodeType type; /* Conditional Expression Node Type */
union _exprNode { /* Different cond expr nodes */
struct { /* for BOOLEAN_OP */
condOp op;
struct _condExpr *lhs;
struct _condExpr *rhs;
} boolExpr;
struct _condExpr *unaryExp; /* for NEGATION,ADDRESSOF,DEREFERENCE*/
IDENTTYPE ident; /* for IDENTIFIER */
} expr;
} COND_EXPR;
virtual Expr *clone() const=0; //!< Makes a deep copy of the given expression
Expr(condNodeType t=UNKNOWN_OP) : m_type(t)
{
/* Sequence of conditional expression data type */
/*** NOTE: not used at present ****/
typedef struct _condExpSeq {
COND_EXPR *expr;
struct _condExpSeq *next;
} SEQ_COND_EXPR;
}
/** Recursively deallocates the abstract syntax tree rooted at *exp */
virtual ~Expr() {}
public:
virtual QString walkCondExpr (Function * pProc, int* numLoc) const=0;
virtual Expr *inverse() const=0; // return new COND_EXPR that is inverse of this
virtual bool xClear(rICODE range_to_check, iICODE lastBBinst, const LOCAL_ID &locId)=0;
virtual Expr *insertSubTreeReg(Expr *_expr, eReg regi, const LOCAL_ID *locsym)=0;
virtual Expr *insertSubTreeLongReg(Expr *_expr, int longIdx)=0;
virtual hlType expType(Function *pproc) const=0;
virtual int hlTypeSize(Function *pproc) const=0;
virtual Expr * performLongRemoval(eReg regi, LOCAL_ID *locId) { return this; }
};
struct UnaryOperator : public Expr
{
UnaryOperator(condNodeType t=UNKNOWN_OP) : Expr(t),unaryExp(nullptr) {}
Expr *unaryExp;
virtual Expr *inverse() const
{
if (m_type == NEGATION) //TODO: memleak here
{
return unaryExp->clone();
}
return this->clone();
}
virtual Expr *clone() const
{
UnaryOperator *newExp = new UnaryOperator(*this);
newExp->unaryExp = unaryExp->clone();
return newExp;
}
virtual bool xClear(rICODE range_to_check, iICODE lastBBinst, const LOCAL_ID &locs);
static UnaryOperator *Create(condNodeType t, Expr *sub_expr)
{
UnaryOperator *newExp = new UnaryOperator();
newExp->m_type = t;
newExp->unaryExp = sub_expr;
return (newExp);
}
~UnaryOperator()
{
delete unaryExp;
unaryExp=nullptr;
}
public:
int hlTypeSize(Function *pproc) const;
virtual QString walkCondExpr(Function *pProc, int *numLoc) const;
virtual Expr *insertSubTreeReg(Expr *_expr, eReg regi, const LOCAL_ID *locsym);
virtual hlType expType(Function *pproc) const;
virtual Expr *insertSubTreeLongReg(Expr *_expr, int longIdx);
private:
QString wrapUnary(Function *pProc, int *numLoc, QChar op) const;
};
struct BinaryOperator : public Expr
{
condOp m_op;
Expr *m_lhs;
Expr *m_rhs;
BinaryOperator(condOp o) : Expr(BOOLEAN_OP)
{
m_op = o;
m_lhs=m_rhs=nullptr;
}
BinaryOperator(condOp o,Expr *l,Expr *r) : Expr(BOOLEAN_OP)
{
m_op = o;
m_lhs=l;
m_rhs=r;
}
~BinaryOperator()
{
assert(m_lhs!=m_rhs or m_lhs==nullptr);
delete m_lhs;
delete m_rhs;
m_lhs=m_rhs=nullptr;
}
static BinaryOperator *Create(condOp o,Expr *l,Expr *r)
{
BinaryOperator *res = new BinaryOperator(o);
res->m_lhs = l;
res->m_rhs = r;
return res;
}
static BinaryOperator *LogicAnd(Expr *l,Expr *r)
{
return Create(DBL_AND,l,r);
}
static BinaryOperator *createSHL(Expr *l,Expr *r)
{
return Create(SHL,l,r);
}
static BinaryOperator *And(Expr *l,Expr *r)
{
return Create(AND,l,r);
}
static BinaryOperator *Or(Expr *l,Expr *r)
{
return Create(OR,l,r);
}
static BinaryOperator *LogicOr(Expr *l,Expr *r)
{
return Create(DBL_OR,l,r);
}
static BinaryOperator *CreateAdd(Expr *l,Expr *r) {
return Create(ADD,l,r);
}
void changeBoolOp(condOp newOp);
virtual Expr *inverse() const;
virtual Expr *clone() const;
virtual bool xClear(rICODE range_to_check, iICODE lastBBinst, const LOCAL_ID &locs);
virtual Expr *insertSubTreeReg(Expr *_expr, eReg regi, const LOCAL_ID *locsym);
virtual Expr *insertSubTreeLongReg(Expr *_expr, int longIdx);
const Expr *lhs() const
{
return const_cast<const Expr *>(const_cast<BinaryOperator *>(this)->lhs());
}
const Expr *rhs() const
{
return const_cast<const Expr *>(const_cast<BinaryOperator *>(this)->rhs());
}
Expr *lhs()
{
assert(m_type==BOOLEAN_OP);
return m_lhs;
}
Expr *rhs()
{
assert(m_type==BOOLEAN_OP);
return m_rhs;
}
condOp op() const { return m_op;}
/* Changes the boolean conditional operator at the root of this expression */
void op(condOp o) { m_op=o;}
QString walkCondExpr(Function * pProc, int* numLoc) const;
public:
hlType expType(Function *pproc) const;
int hlTypeSize(Function *pproc) const;
};
struct AstIdent : public UnaryOperator
{
AstIdent() : UnaryOperator(IDENTIFIER)
{
}
IDENTTYPE ident; /* for IDENTIFIER */
static AstIdent * Loc(int off, LOCAL_ID *localId);
static AstIdent * LongIdx(int idx);
static AstIdent * String(uint32_t idx);
static AstIdent * Other(eReg seg, eReg regi, int16_t off);
static AstIdent * Param(int off, const STKFRAME *argSymtab);
static AstIdent * Long(LOCAL_ID *localId, opLoc sd, iICODE pIcode, hlFirst f, iICODE ix, operDu du, LLInst &atOffset);
static AstIdent * idID(const ID *retVal, LOCAL_ID *locsym, iICODE ix_);
static Expr * id(const LLInst &ll_insn, opLoc sd, Function *pProc, iICODE ix_, ICODE &duIcode, operDu du);
virtual Expr *clone() const
{
return new AstIdent(*this);
}
virtual int hlTypeSize(Function *pproc) const;
virtual hlType expType(Function *pproc) const;
virtual Expr * performLongRemoval(eReg regi, LOCAL_ID *locId);
virtual QString walkCondExpr(Function *pProc, int *numLoc) const;
virtual Expr *insertSubTreeReg(Expr *_expr, eReg regi, const LOCAL_ID *locsym);
virtual Expr *insertSubTreeLongReg(Expr *_expr, int longIdx);
virtual bool xClear(rICODE range_to_check, iICODE lastBBinst, const LOCAL_ID &locId);
};
struct GlobalVariable : public AstIdent
{
bool valid;
int globIdx;
virtual Expr *clone() const
{
return new GlobalVariable(*this);
}
GlobalVariable(int16_t segValue, int16_t off);
QString walkCondExpr(Function *pProc, int *numLoc) const;
int hlTypeSize(Function *pproc) const;
hlType expType(Function *pproc) const;
};
struct GlobalVariableIdx : public AstIdent
{
bool valid;
int idxGlbIdx; /* idx into localId, GLOB_VAR_IDX */
virtual Expr *clone() const
{
return new GlobalVariableIdx(*this);
}
GlobalVariableIdx(int16_t segValue, int16_t off, uint8_t regi, const LOCAL_ID *locSym);
QString walkCondExpr(Function *pProc, int *numLoc) const;
int hlTypeSize(Function *pproc) const;
hlType expType(Function *pproc) const;
};
struct Constant : public AstIdent
{
struct _kte
{ /* for CONSTANT only */
uint32_t kte; /* value of the constant */
uint8_t size; /* #bytes size constant */
} kte;
Constant(uint32_t _kte, uint8_t size)
{
ident.idType = CONSTANT;
kte.kte = _kte;
kte.size = size;
}
virtual Expr *clone() const
{
return new Constant(*this);
}
QString walkCondExpr(Function *pProc, int *numLoc) const;
int hlTypeSize(Function *pproc) const;
hlType expType(Function *pproc) const { return TYPE_CONST; }
};
struct FuncNode : public AstIdent
{
struct _call { /* for FUNCTION only */
Function *proc;
STKFRAME *args;
} call;
FuncNode(Function *pproc, STKFRAME *args)
{
call.proc = pproc;
call.args = args;
}
virtual Expr *clone() const
{
return new FuncNode(*this);
}
QString walkCondExpr(Function *pProc, int *numLoc) const;
int hlTypeSize(Function *pproc) const;
hlType expType(Function *pproc) const;
};
struct RegisterNode : public AstIdent
{
const LOCAL_ID *m_syms;
regType regiType; /* for REGISTER only */
int regiIdx; /* index into localId, REGISTER */
virtual Expr *insertSubTreeReg(Expr *_expr, eReg regi, const LOCAL_ID *locsym);
RegisterNode(int idx, regType reg_type,const LOCAL_ID *syms)
{
m_syms= syms;
ident.type(REGISTER);
regiType = reg_type;
regiIdx = idx;
}
RegisterNode(const LLOperand &, LOCAL_ID *locsym);
//RegisterNode(eReg regi, uint32_t icodeFlg, LOCAL_ID *locsym);
virtual Expr *clone() const
{
return new RegisterNode(*this);
}
QString walkCondExpr(Function *pProc, int *numLoc) const;
int hlTypeSize(Function *) const;
hlType expType(Function *pproc) const;
bool xClear(rICODE range_to_check, iICODE lastBBinst, const LOCAL_ID &locId);
};
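
The two jump-condition tables near the top of this header are meant to be index-aligned: entry i of invCondOpJCond should be the logical negation of entry i of condOpJCond. A minimal, self-contained sketch of that invariant follows. The enum is a trimmed stand-in for condOp, and `evalCond` and `tablesAreInverse` are hypothetical helpers; none of them is taken from dcc.

```cpp
#include <cassert>
#include <cstddef>

// Trimmed stand-in for dcc's condOp: only the relational operators that the
// jump tables use are listed (assumption: the ordering here is illustrative).
enum CondOp { LESS_EQUAL, LESS, EQUAL, NOT_EQUAL, GREATER, GREATER_EQUAL };

// Table contents copied from the header shown in this diff.
static const CondOp condOpJCond[12] = {LESS, LESS_EQUAL, GREATER_EQUAL, GREATER,
                                       EQUAL, NOT_EQUAL, LESS, GREATER_EQUAL,
                                       LESS_EQUAL, GREATER, GREATER_EQUAL, LESS};
static const CondOp invCondOpJCond[12] = {GREATER_EQUAL, GREATER, LESS, LESS_EQUAL,
                                          NOT_EQUAL, EQUAL, GREATER_EQUAL, LESS,
                                          GREATER, LESS_EQUAL, LESS, GREATER_EQUAL};

// Hypothetical helper: evaluate a relational operator on two integers.
static bool evalCond(CondOp op, int a, int b)
{
    switch (op) {
    case LESS_EQUAL:    return a <= b;
    case LESS:          return a <  b;
    case EQUAL:         return a == b;
    case NOT_EQUAL:     return a != b;
    case GREATER:       return a >  b;
    case GREATER_EQUAL: return a >= b;
    }
    return false;
}

// Returns true when every inverse entry negates its counterpart for (a, b).
static bool tablesAreInverse(int a, int b)
{
    for (std::size_t i = 0; i < 12; ++i)
        if (evalCond(condOpJCond[i], a, b) == evalCond(invCondOpJCond[i], a, b))
            return false;
    return true;
}
```

Calling `tablesAreInverse` with a less-than, a greater-than, and an equal pair covers every operator in both tables.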

@ -4,41 +4,28 @@
* Purpose: Module to handle the bundle type (array of pointers to strings).
* (C) Cristina Cifuentes
****************************************************************************/
#pragma once
#include <stdio.h>
#include <vector>
#include <QtCore/QString>
#include <QtCore/QIODevice>
struct strTable : std::vector<QString>
{
/* Returns the next available index into the table */
size_t nextIdx() {return size();}
public:
void addLabelBundle(int idx, int label);
};
typedef struct {
Int numLines; /* Number of lines in the table */
Int allocLines; /* Number of lines allocated in the table */
char **str; /* Table of strings */
} strTable;
struct bundle
{
public:
void appendCode(const char *format, ...);
void appendCode(const QString &s);
void appendDecl(const char *format, ...);
void appendDecl(const QString &);
void init()
{
decl.clear();
code.clear();
}
typedef struct {
strTable decl; /* Declarations */
strTable code; /* C code */
int current_indent;
};
} bundle;
extern bundle cCode;
#define lineSize 360 /* 3 lines in the mean time */
//void newBundle (bundle *procCode);
void writeBundle (QIODevice & ios, bundle procCode);
void newBundle (bundle *procCode);
void appendStrTab (strTable *strTab, char *format, ...);
Int nextBundleIdx (strTable *strTab);
void addLabelBundle (strTable *strTab, Int idx, Int label);
void writeBundle (FILE *fp, bundle procCode);
void freeBundle (bundle *procCode);

@ -2,15 +2,8 @@
* dcc project general header
* (C) Cristina Cifuentes, Mike van Emmerik
****************************************************************************/
#pragma once
//TODO: Remove boolT
#include <utility>
#include <algorithm>
#include <bitset>
#include <QtCore/QString>
#include "Enums.h"
#include "types.h"
#include "ast.h"
#include "icode.h"
@ -18,87 +11,314 @@
#include "error.h"
#include "graph.h"
#include "bundle.h"
#include "Procedure.h"
#include "BasicBlock.h"
class Project;
/* SYMBOL TABLE */
typedef struct {
char name[10]; /* New name for this variable */
dword label; /* physical address (20 bit) */
Int size; /* maximum size */
flags32 flg; /* SEG_IMMED, IMPURE, WORD_OFF */
hlType type; /* probable type */
word duVal; /* DEF, USE, VAL */
} SYM;
typedef SYM *PSYM;
typedef struct {
Int csym; /* No. of symbols in table */
Int alloc; /* Allocation */
PSYM sym; /* Symbols */
} SYMTAB;
typedef SYMTAB *PSYMTAB;
/* STACK FRAME */
typedef struct {
COND_EXPR *actual; /* Expression tree of actual parameter */
COND_EXPR *regs; /* For register arguments only */
int16 off; /* Immediate off from BP (+:args, -:params) */
byte regOff; /* Offset is a register (e.g. SI, DI) */
Int size; /* Size */
hlType type; /* Probable type */
word duVal; /* DEF, USE, VAL */
boolT hasMacro; /* This type needs a macro */
char macro[10]; /* Macro name */
char name[10]; /* Name for this symbol/argument */
boolT invalid; /* Boolean: invalid entry in formal arg list*/
} STKSYM;
typedef STKSYM *PSTKSYM;
typedef struct _STKFRAME {
Int csym; /* No. of symbols in table */
Int alloc; /* Allocation */
PSTKSYM sym; /* Symbols */
int16 minOff; /* Initial offset in stack frame*/
int16 maxOff; /* Maximum offset in stack frame*/
Int cb; /* Number of bytes in arguments */
Int numArgs; /* No. of arguments in the table*/
} STKFRAME;
typedef STKFRAME *PSTKFRAME;
/* PROCEDURE NODE */
typedef struct _proc {
dword procEntry; /* label number */
char name[SYMLEN]; /* Meaningful name for this proc */
STATE state; /* Entry state */
Int depth; /* Depth at which we found it - for printing */
flags32 flg; /* Combination of Icode & Proc flags */
int16 cbParam; /* Probable no. of bytes of parameters */
STKFRAME args; /* Array of arguments */
LOCAL_ID localId; /* Local identifiers */
ID retVal; /* Return value - identifier */
/* Icodes and control flow graph */
CIcodeRec Icode; /* Object with ICODE records */
PBB cfg; /* Ptr. to BB list/CFG */
PBB *dfsLast; /* Array of pointers to BBs in dfsLast
* (reverse postorder) order */
Int numBBs; /* Number of BBs in the graph cfg */
boolT hasCase; /* Procedure has a case node */
/* For interprocedural live analysis */
dword liveIn; /* Registers used before defined */
dword liveOut; /* Registers that may be used in successors */
boolT liveAnal; /* Procedure has been analysed already */
/* Double-linked list */
struct _proc *next;
struct _proc *prev;
} PROCEDURE;
typedef PROCEDURE *PPROC;
/* CALL GRAPH NODE */
typedef struct _callGraph {
PPROC proc; /* Pointer to procedure in pProcList */
Int numOutEdges; /* # of out edges (ie. # procs invoked) */
Int numAlloc; /* # of out edges allocated */
struct _callGraph **outEdges; /* array of out edges */
} CALL_GRAPH;
typedef CALL_GRAPH *PCALL_GRAPH;
#define NUM_PROCS_DELTA 5 /* delta # procs a proc invokes */
extern PPROC pProcList; /* Pointer to the head of the procedure list */
extern PPROC pLastProc; /* Pointer to last node of the proc list */
extern PCALL_GRAPH callGraph; /* Pointer to the head of the call graph */
extern bundle cCode; /* Output C procedure's declaration and code */
/* Procedure FLAGS */
#define PROC_BADINST 0x000100 /* Proc contains invalid or 386 instruction */
#define PROC_IJMP 0x000200 /* Proc incomplete due to indirect jmp */
#define PROC_ICALL 0x000400 /* Proc incomplete due to indirect call */
#define PROC_HLL 0x001000 /* Proc is likely to be from a HLL */
#define CALL_PASCAL 0x002000 /* Proc uses Pascal calling convention */
#define CALL_C 0x004000 /* Proc uses C calling convention */
#define CALL_UNKNOWN 0x008000 /* Proc uses unknown calling convention */
#define PROC_NEAR 0x010000 /* Proc exits with near return */
#define PROC_FAR 0x020000 /* Proc exits with far return */
#define GRAPH_IRRED 0x100000 /* Proc generates an irreducible graph */
#define SI_REGVAR 0x200000 /* SI is used as a stack variable */
#define DI_REGVAR 0x400000 /* DI is used as a stack variable */
#define PROC_IS_FUNC 0x800000 /* Proc is a function */
#define REG_ARGS 0x1000000 /* Proc has registers as arguments */
#define PROC_VARARG 0x2000000 /* Proc has variable arguments */
#define PROC_OUTPUT 0x4000000 /* C for this proc has been output */
#define PROC_RUNTIME 0x8000000 /* Proc is part of the runtime support */
#define PROC_ISLIB 0x10000000 /* Proc is a library function */
#define PROC_ASM 0x20000000 /* Proc is an intrinsic assembler routine */
#define PROC_IS_HLL 0x40000000 /* Proc has HLL prolog code */
#define CALL_MASK 0xFFFF9FFF /* Masks off CALL_C and CALL_PASCAL */
/* duVal FLAGS */
#define DEF 0x0010 /* Variable was first defined then used */
#define USE 0x0100 /* Variable was first used then defined */
#define VAL 0x1000 /* Variable has an initial value. 2 cases:
* 1. When variable is used first (ie. global)
* 2. When a value is moved into the variable
* for the first time. */
#define USEVAL 0x1100 /* Use and Val */
/**** Global variables ****/
extern QString asm1_name, asm2_name; /* Assembler output filenames */
extern char *asm1_name, *asm2_name; /* Assembler output filenames */
/** Command line option flags */
struct OPTION
{
bool verbose;
bool VeryVerbose;
bool asm1; /* Early disassembly listing */
bool asm2; /* Disassembly listing after restruct */
bool Map;
bool Stats;
bool Interact; /* Interactive mode */
bool Calls; /* Follow register indirect calls */
QString filename; /* The input filename */
uint32_t CustomEntryPoint;
};
typedef struct { /* Command line option flags */
unsigned verbose : 1;
unsigned VeryVerbose : 1;
unsigned asm1 : 1; /* Early disassembly listing */
unsigned asm2 : 1; /* Disassembly listing after restruct */
unsigned Map : 1;
unsigned Stats : 1;
unsigned Interact : 1; /* Interactive mode */
unsigned Calls : 1; /* Follow register indirect calls */
char filename[80]; /* The input filename */
} OPTION;
extern OPTION option; /* Command line options */
extern SYMTAB symtab; /* Global symbol table */
#include "BinaryImage.h"
typedef struct { /* Loaded program image parameters */
int16 initCS;
int16 initIP; /* These are initial load values */
int16 initSS; /* Probably not of great interest */
int16 initSP;
boolT fCOM; /* Flag set if COM program (else EXE)*/
Int cReloc; /* No. of relocation table entries */
dword *relocTable; /* Ptr. to relocation table */
byte *map; /* Memory bitmap ptr */
Int cProcs; /* Number of procedures so far */
Int offMain; /* The offset of the main() proc */
word segMain; /* The segment of the main() proc */
boolT bSigs; /* True if signatures loaded */
Int cbImage; /* Length of image in bytes */
byte *Image; /* Allocated by loader to hold entire
* program image */
} PROG;
extern PROG prog; /* Loaded program image parameters */
extern char condExp[200]; /* Conditional expression buffer */
extern char callBuf[100]; /* Function call buffer */
extern dword duReg[30]; /* def/use bits for registers */
extern dword maskDuReg[30]; /* masks off du bits for regs */
/* Registers used by icode instructions */
static char *allRegs[21] = {"ax", "cx", "dx", "bx", "sp", "bp",
"si", "di", "es", "cs", "ss", "ds",
"al", "cl", "dl", "bl", "ah", "ch", "dh", "bh",
"tmp"};
/* Memory map states */
enum eAreaType
{
BM_UNKNOWN = 0, /* Unscanned memory */
BM_DATA = 1, /* Data */
BM_CODE = 2, /* Code */
BM_IMPURE = 3 /* Used as Data and Code*/
};
#define BM_UNKNOWN 0 /* Unscanned memory */
#define BM_DATA 1 /* Data */
#define BM_CODE 2 /* Code */
#define BM_IMPURE 3 /* Used as Data and Code*/
/* Intermediate instructions statistics */
struct STATS
{
int numBBbef; /* number of basic blocks initially */
int numBBaft; /* number of basic blocks at the end */
int nOrder; /* n-th order */
int numLLIcode; /* number of low-level Icode instructions */
int numHLIcode; /* number of high-level Icode instructions */
int totalLL; /* total number of low-level Icode insts */
int totalHL; /* total number of high-level Icode insts */
};
typedef struct {
Int numBBbef; /* number of basic blocks initially */
Int numBBaft; /* number of basic blocks at the end */
Int nOrder; /* n-th order */
Int numLLIcode; /* number of low-level Icode instructions */
Int numHLIcode; /* number of high-level Icode instructions */
Int totalLL; /* total number of low-level Icode insts */
Int totalHL; /* total number of high-level Icode insts */
} STATS;
extern STATS stats; /* Icode statistics */
/**** Global function prototypes ****/
void FrontEnd(char *filename, PCALL_GRAPH *); /* frontend.c */
void *allocMem(Int cb); /* frontend.c */
void *reallocVar(void *p, Int newsize); /* frontend.c */
void udm(void); /* udm.c */
void freeCFG(BB * cfg); /* graph.c */
BB * newBB(BB *, int, int, uint8_t, int, Function *); /* graph.c */
void BackEnd(CALL_GRAPH *); /* backend.c */
extern char *cChar(uint8_t c); /* backend.c */
eErrorId scan(uint32_t ip, ICODE &p); /* scanner.c */
void parse (CALL_GRAPH * *); /* parser.c */
extern int strSize (const uint8_t *, char); /* parser.c */
//void disassem(int pass, Function * pProc); /* disassem.c */
void interactDis(Function *, int initIC); /* disassem.c */
bool JmpInst(llIcode opcode); /* idioms.c */
queue::iterator appendQueue(queue &Q, BB *node); /* reducible.c */
bool SetupLibCheck(void); /* chklib.c */
PBB createCFG(PPROC pProc); /* graph.c */
void compressCFG(PPROC pProc); /* graph.c */
void freeCFG(PBB cfg); /* graph.c */
PBB newBB(PBB, Int, Int, byte, Int, PPROC); /* graph.c */
void BackEnd(char *filename, PCALL_GRAPH); /* backend.c */
char *cChar(byte c); /* backend.c */
Int scan(dword ip, PICODE p); /* scanner.c */
void parse (PCALL_GRAPH *); /* parser.c */
boolT labelSrch(PICODE pIc, Int n, dword tg, Int *pIdx); /* parser.c */
void setState(PSTATE state, word reg, int16 value); /* parser.c */
Int strSize (byte *, char); /* parser.c */
void disassem(Int pass, PPROC pProc); /* disassem.c */
void interactDis(PPROC initProc, Int initIC); /* disassem.c */
void bindIcodeOff (PPROC); /* idioms.c */
void lowLevelAnalysis (PPROC pProc); /* idioms.c */
void propLong (PPROC pproc); /* proplong.c */
boolT JmpInst(llIcode opcode); /* idioms.c */
void checkReducibility(PPROC pProc, derSeq **derG); /* reducible.c */
queue *appendQueue(queue **Q, BB *node); /* reducible.c */
void freeDerivedSeq(derSeq *derivedG); /* reducible.c */
void displayDerivedSeq(derSeq *derG); /* reducible.c */
void structure(PPROC pProc, derSeq *derG); /* control.c */
void compoundCond (PPROC); /* control.c */
void dataFlow(PPROC pProc, dword liveOut); /* dataflow.c */
void writeIntComment (PICODE icode, char *s); /* comwrite.c */
void writeProcComments (PPROC pProc, strTable *sTab); /* comwrite.c */
void checkStartup(PSTATE pState); /* chklib.c */
void SetupLibCheck(void); /* chklib.c */
void CleanupLibCheck(void); /* chklib.c */
bool LibCheck(Function &p); /* chklib.c */
boolT LibCheck(PPROC p); /* chklib.c */
/* Exported functions from procs.c */
boolT insertCallGraph (PCALL_GRAPH, PPROC, PPROC);
void writeCallGraph (PCALL_GRAPH);
void newRegArg (PPROC, PICODE, PICODE);
boolT newStkArg (PICODE, COND_EXPR *, llIcode, PPROC);
void allocStkArgs (PICODE, Int);
void placeStkArg (PICODE, COND_EXPR *, Int);
void adjustActArgType (COND_EXPR *, hlType, PPROC);
void adjustForArgType (PSTKFRAME, Int, hlType);
/* Exported functions from ast.c */
COND_EXPR *boolCondExp (COND_EXPR *lhs, COND_EXPR *rhs, condOp op);
COND_EXPR *unaryCondExp (condNodeType, COND_EXPR *exp);
COND_EXPR *idCondExpGlob (int16 segValue, int16 off);
COND_EXPR *idCondExpReg (byte regi, flags32 flg, LOCAL_ID *);
COND_EXPR *idCondExpRegIdx (Int idx, regType);
COND_EXPR *idCondExpLoc (Int off, LOCAL_ID *);
COND_EXPR *idCondExpParam (Int off, PSTKFRAME argSymtab);
COND_EXPR *idCondExpKte (dword kte, byte);
COND_EXPR *idCondExpLong (LOCAL_ID *, opLoc, PICODE, hlFirst, Int idx, operDu,
Int);
COND_EXPR *idCondExpLongIdx (Int);
COND_EXPR *idCondExpFunc (PPROC, PSTKFRAME);
COND_EXPR *idCondExpOther (byte seg, byte regi, int16 off);
COND_EXPR *idCondExpID (ID *, LOCAL_ID *, Int);
COND_EXPR *idCondExp (PICODE, opLoc, PPROC, Int i, PICODE duIcode, operDu);
COND_EXPR *copyCondExp (COND_EXPR *);
void removeRegFromLong (byte, LOCAL_ID *, COND_EXPR *);
char *walkCondExpr (COND_EXPR *exp, PPROC pProc, Int *);
condId idType (PICODE pIcode, opLoc sd);
Int hlTypeSize (COND_EXPR *, PPROC);
hlType expType (COND_EXPR *, PPROC);
void setRegDU (PICODE, byte regi, operDu);
void copyDU (PICODE, PICODE, operDu, operDu);
void changeBoolCondExpOp (COND_EXPR *, condOp);
boolT insertSubTreeReg (COND_EXPR *, COND_EXPR **, byte, LOCAL_ID *);
boolT insertSubTreeLongReg (COND_EXPR *, COND_EXPR **, Int);
void freeCondExpr (COND_EXPR *exp);
COND_EXPR *concatExps (SEQ_COND_EXPR *, COND_EXPR *, condNodeType);
void initExpStk();
void pushExpStk (COND_EXPR *);
COND_EXPR *popExpStk();
Int numElemExpStk();
boolT emptyExpStk();
/* Exported functions from hlicode.c */
QString writeJcond(const HLTYPE &, Function *, int *);
QString writeJcondInv(HLTYPE, Function *, int *);
void newAsgnHlIcode (PICODE, COND_EXPR *, COND_EXPR *);
void newCallHlIcode (PICODE);
void newUnaryHlIcode (PICODE, hlIcode, COND_EXPR *);
void newJCondHlIcode (PICODE, COND_EXPR *);
void invalidateIcode (PICODE);
boolT removeDefRegi (byte, PICODE, Int, LOCAL_ID *);
void highLevelGen (PPROC);
char *writeCall (PPROC, PSTKFRAME, PPROC, Int *);
char *write1HlIcode (HLTYPE, PPROC, Int *);
char *writeJcond (HLTYPE, PPROC, Int *);
char *writeJcondInv (HLTYPE, PPROC, Int *);
Int power2 (Int);
void writeDU (PICODE, Int);
void inverseCondOp (COND_EXPR **);
/* Exported functions from locident.c */
Int newByteWordRegId (LOCAL_ID *, hlType t, byte regi);
Int newByteWordStkId (LOCAL_ID *, hlType t, Int off, byte regOff);
Int newIntIdxId (LOCAL_ID *, int16 seg, int16 off, byte regi, Int, hlType);
Int newLongRegId (LOCAL_ID *, hlType t, byte regH, byte regL, Int idx);
Int newLongStkId (LOCAL_ID *, hlType t, Int offH, Int offL);
Int newLongId (LOCAL_ID *, opLoc sd, PICODE, hlFirst, Int idx, operDu, Int);
boolT checkLongEq (LONG_STKID_TYPE, PICODE, Int, Int, PPROC, COND_EXPR **,
COND_EXPR **, Int);
boolT checkLongRegEq (LONGID_TYPE, PICODE, Int, Int, PPROC, COND_EXPR **,
COND_EXPR **, Int);
byte otherLongRegi (byte, Int, LOCAL_ID *);
void insertIdx (IDX_ARRAY *, Int);
void propLongId (LOCAL_ID *, byte, byte, char *);
/* Exported functions from locident.c */
bool checkLongEq(LONG_STKID_TYPE, iICODE, int, Function *, Assignment &asgn, LLInst &atOffset);
bool checkLongRegEq(LONGID_TYPE, iICODE, int, Function *, Assignment &asgn, LLInst &);
extern const char *indentStr(int level);

@ -1,24 +0,0 @@
#pragma once
#include "Procedure.h"
#include <QtCore/QObject>
#include <QtCore/QDir>
class IXmlTarget;
struct IDcc {
static IDcc *get();
virtual void BaseInit()=0;
virtual void Init(QObject *tgt)=0;
virtual lFunction::iterator GetFirstFuncHandle()=0;
virtual lFunction::iterator GetCurFuncHandle()=0;
virtual void analysis_Once()=0;
virtual void load(QString name)=0; // load and preprocess -> find entry point
virtual void prtout_asm(IXmlTarget *,int level=0)=0;
virtual void prtout_cpp(IXmlTarget *,int level=0)=0;
virtual size_t getFuncCount()=0;
virtual const lFunction &validFunctions() const =0;
virtual void SetCurFunc_by_Name(QString )=0;
virtual QDir installDir()=0;
virtual QDir dataDir(QString kind)=0;
};

@ -1,43 +1,30 @@
/*
***************************************************************************
dcc project disassembler header
(C) Mike van Emmerik
***************************************************************************
*/
#pragma once
#include "bundle.h"
/****************************************************************************
* dcc project disassembler header
* (C) Mike van Emmerik
****************************************************************************/
#include <fstream>
#include <vector>
#include <QString>
#include <QTextStream>
struct LLInst;
struct Function;
struct Disassembler
{
protected:
int pass;
int g_lab;
//bundle &cCode;
QIODevice *m_disassembly_target;
QTextStream m_fp;
std::vector<std::string> m_decls;
std::vector<std::string> m_code;
public:
Disassembler(int _p) : pass(_p)
{
g_lab=0;
}
public:
void disassem(Function *ppProc);
void disassem(Function *ppProc, int i);
void dis1Line(LLInst &inst, int loc_ip, int pass);
};
/* Definitions for extended keys (first key is zero) */
#define EXT 0x100 /* "Extended" flag */
#ifdef __MSDOS__
#define KEY_DOWN EXT+'P'
#define KEY_LEFT EXT+'K'
#define KEY_UP EXT+'H'
#define KEY_RIGHT EXT+'M'
#define KEY_NPAGE EXT+'Q'
#define KEY_PPAGE EXT+'I'
#endif
#ifdef _CONSOLE
#define KEY_DOWN 0x50 /* Same as keypad scancodes */
#define KEY_LEFT 0x4B
#define KEY_UP 0x48
#define KEY_RIGHT 0x4D
#define KEY_NPAGE 0x51
#define KEY_PPAGE 0x49
#endif
#ifdef __UNIX__
#define KEY_DOWN EXT+'B'
#define KEY_LEFT EXT+'D'

@ -1,41 +1,38 @@
/*
=**************************************************************************
/***************************************************************************
* File : dosdcc.h
* Purpose : include file for files decompiled by dcc.
* Copyright (c) Cristina Cifuentes - QUT - 1992
*************************************************************************
*/
**************************************************************************/
/* Type definitions for intel 80x86 architecture */
typedef unsigned int uint16_t; /* 16 bits */
typedef unsigned char uint8_t; /* 8 bits */
typedef unsigned int Word; /* 16 bits */
typedef unsigned char Byte; /* 8 bits */
typedef union {
unsigned long dW;
uint16_t wL, wH; /* 2 words */
Word wL, wH; /* 2 words */
} Dword; /* 32 bits */
/* Structure to access high and low bits of a uint8_t or uint16_t variable */
/* Structure to access high and low bits of a Byte or Word variable */
typedef struct {
/* low uint8_t */
uint16_t lowBitWord : 1;
uint16_t filler1 : 6;
uint16_t highBitByte : 1;
/* high uint8_t */
uint16_t lowBitByte : 1;
uint16_t filler2 : 6;
uint16_t highBitWord : 1;
/* low byte */
Word lowBitWord : 1;
Word filler1 : 6;
Word highBitByte : 1;
/* high byte */
Word lowBitByte : 1;
Word filler2 : 6;
Word highBitWord : 1;
} wordBits;
/* Low and high bits of a uint8_t or uint16_t variable */
/* Low and high bits of a Byte or Word variable */
#define lowBit(a) ((wordBits)(a).lowBitWord)
#define highBitByte(a) ((wordBits)(a).highBitByte)
#define lowBitByte(a) ((wordBits)(a).lowBitByte)
#define highBit(a) (sizeof(a) == sizeof(uint16_t) ? \
#define highBit(a) (sizeof(a) == sizeof(Word) ? \
((wordBits)(a).highBitWord):\
((wordBits)(a).highBitByte))
/* uint16_t register variables */
/* Word register variables */
#define ax regs.x.ax
#define bx regs.x.bx
#define cx regs.x.cx
@ -55,7 +52,7 @@ typedef struct {
#define carry regs.x.cflags
#define overF regs.x.flags /***** check *****/
/* uint8_t register variables */
/* Byte register variables */
#define ah regs.h.ah
#define al regs.h.al
#define bh regs.h.bh
@ -67,8 +64,8 @@ typedef struct {
/* High and low words of a Dword */
#define highWord(w) (*((uint16_t*)&(w) + 1))
#define lowWord(w) ((uint16_t)(w))
#define highWord(w) (*((Word*)&(w) + 1))
#define lowWord(w) ((Word)(w))
#define MAXByte 0xFF
#define MAXWord 0xFFFF
@ -77,4 +74,7 @@ typedef struct {
#define MAXSignWord 0x7FFF
#define MINSignWord 0x8001
/* Booleans */
#define TRUE 1
#define FALSE 0
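
The `highWord`/`lowWord` macros above rely on a pointer cast and on `unsigned int` being 16 bits wide, which holds for 16-bit DOS compilers but not for most current ones. A sketch of a byte-order-independent equivalent, assuming the `<cstdint>` fixed-width types are available; `highWordOf` and `lowWordOf` are hypothetical names and not part of dosdcc.h:

```cpp
#include <cassert>
#include <cstdint>

// Upper 16 bits of a 32-bit value, obtained by shifting rather than by
// aliasing memory, so the result does not depend on byte order.
constexpr std::uint16_t highWordOf(std::uint32_t dw)
{
    return static_cast<std::uint16_t>(dw >> 16);
}

// Lower 16 bits of a 32-bit value.
constexpr std::uint16_t lowWordOf(std::uint32_t dw)
{
    return static_cast<std::uint16_t>(dw & 0xFFFFu);
}

// Compile-time check of both helpers on one sample value.
static_assert(highWordOf(0x12345678u) == 0x1234, "high word mismatch");
static_assert(lowWordOf(0x12345678u) == 0x5678, "low word mismatch");
```

Unlike the macros, these helpers take a value rather than an lvalue, so they also accept temporaries and constants.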

@ -1,39 +1,33 @@
/*
****************************************************************************
/*****************************************************************************
* Error codes
* (C) Cristina Cifuentes
***************************************************************************
*/
#pragma once
****************************************************************************/
/* These definitions refer to errorMessage in error.c */
enum eErrorId
{
NO_ERR =0,
USAGE,
INVALID_ARG,
INVALID_OPCODE,
INVALID_386OP,
FUNNY_SEGOVR,
FUNNY_REP,
CANNOT_OPEN,
CANNOT_READ,
MALLOC_FAILED,
NEWEXE_FORMAT,
NO_BB,
INVALID_SYNTHETIC_BB,
INVALID_INT_BB,
IP_OUT_OF_RANGE,
DEF_NOT_FOUND,
JX_NOT_DEF,
NOT_DEF_USE,
REPEAT_FAIL,
WHILE_FAIL
};
#define USAGE 0
#define INVALID_ARG 1
#define INVALID_OPCODE 2
#define INVALID_386OP 3
#define FUNNY_SEGOVR 4
#define FUNNY_REP 5
#define CANNOT_OPEN 6
#define CANNOT_READ 7
#define MALLOC_FAILED 8
#define NEWEXE_FORMAT 9
#define NO_BB 10
#define INVALID_SYNTHETIC_BB 11
#define INVALID_INT_BB 12
#define IP_OUT_OF_RANGE 13
#define DEF_NOT_FOUND 14
#define JX_NOT_DEF 15
#define NOT_DEF_USE 16
#define REPEAT_FAIL 17
#define WHILE_FAIL 18
void fatalError(eErrorId errId, ...);
void reportError(eErrorId errId, ...);
void fatalError(Int errId, ...);
void reportError(Int errId, ...);
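
The comment in this header says the identifiers refer to `errorMessage` in error.c, which suggests that each id doubles as an index into a message array. A sketch of that lookup in the enum form, assuming a shortened id list and placeholder strings; the real messages live in error.c and are not shown in this diff, and `ErrorId`, `kMessages`, and `messageFor` are hypothetical names:

```cpp
#include <cassert>
#include <cstring>

// Shortened, hypothetical subset of eErrorId. Values start at 0 so that each
// id can index the message table directly.
enum ErrorId { NO_ERR = 0, USAGE, INVALID_ARG, ERROR_ID_COUNT };

// Placeholder messages, not the strings from error.c.
static const char *const kMessages[] = {
    "no error",
    "usage: <placeholder>",
    "invalid argument: <placeholder>",
};

// The table and the enum must stay the same length; check it at compile time.
static_assert(sizeof(kMessages) / sizeof(kMessages[0]) == ERROR_ID_COUNT,
              "message table must cover every error id");

// Returns the message for an id, or a fallback for out-of-range values.
static const char *messageFor(int id)
{
    if (id < 0 || id >= ERROR_ID_COUNT)
        return "unknown error";
    return kMessages[id];
}
```

One detail visible in this diff: the enum version starts with NO_ERR at 0, so USAGE becomes 1, whereas the `#define` version has USAGE at 0. Any table indexed by these ids has to shift by one entry between the two versions.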

@ -1,53 +1,38 @@
/*
****************************************************************************
/*****************************************************************************
* CFG, BB and interval related definitions
* ( C ) Cristina Cifuentes
****************************************************************************
*/
#pragma once
#include <stdint.h>
#include <list>
* (C) Cristina Cifuentes
****************************************************************************/
struct Function;
/* Types of basic block nodes */
/* Real basic blocks: type defined according to their out-edges */
enum eBBKind
{
ONE_BRANCH = 0, /* unconditional branch */
TWO_BRANCH = 1, /* conditional branch */
MULTI_BRANCH=2, /* case branch */
FALL_NODE=3, /* fall through */
RETURN_NODE=4, /* procedure/program return */
CALL_NODE=5, /* procedure call */
LOOP_NODE=6, /* loop instruction */
REP_NODE=7, /* repeat instruction */
INTERVAL_NODE=8, /* contains interval list */
#define ONE_BRANCH 0 /* unconditional branch */
#define TWO_BRANCH 1 /* conditional branch */
#define MULTI_BRANCH 2 /* case branch */
#define FALL_NODE 3 /* fall through */
#define RETURN_NODE 4 /* procedure/program return */
#define CALL_NODE 5 /* procedure call */
#define LOOP_NODE 6 /* loop instruction */
#define REP_NODE 7 /* repeat instruction */
#define INTERVAL_NODE 8 /* contains interval list */
TERMINATE_NODE=11, /* Exit to DOS */
NOWHERE_NODE=12 /* No outedges going anywhere */
};
#define TERMINATE_NODE 11 /* Exit to DOS */
#define NOWHERE_NODE 12 /* No outedges going anywhere */
/* Depth-first traversal constants */
enum eDFS
{
DFS_NONE,
DFS_DISP=1, /* Display graph pass */
DFS_MERGE=2, /* Merge nodes pass */
DFS_NUM=3, /* DFS numbering pass */
DFS_CASE=4, /* Case pass */
DFS_ALPHA=5, /* Alpha code generation*/
DFS_JMP=9 /* rmJMP pass - must be largest flag */
};
#define DFS_DISP 1 /* Display graph pass */
#define DFS_MERGE 2 /* Merge nodes pass */
#define DFS_NUM 3 /* DFS numbering pass */
#define DFS_CASE 4 /* Case pass */
#define DFS_ALPHA 5 /* Alpha code generation*/
#define DFS_JMP 9 /* rmJMP pass - must be largest flag */
/* Control flow analysis constants */
enum eNodeHeaderType
{
NO_TYPE=0, /* node is not a loop header*/
WHILE_TYPE=1, /* node is a while header */
REPEAT_TYPE=2, /* node is a repeat header */
ENDLESS_TYPE=3 /* endless loop header */
};
#define NO_TYPE 0 /* node is not a loop header*/
#define WHILE_TYPE 1 /* node is a while header */
#define REPEAT_TYPE 2 /* node is a repeat header */
#define ENDLESS_TYPE 3 /* endless loop header */
/* Uninitialized values for certain fields */
#define NO_NODE MAX /* node has no associated node */
@ -58,42 +43,90 @@ enum eNodeHeaderType
#define ELSE 1 /* else edge */
/* Basic Block (BB) flags */
#define INVALID_BB 0x0001 /* BB is not valid any more */
#define INVALID_BB 0x0001 /* BB is not valid any more */
#define IS_LATCH_NODE 0x0002 /* BB is the latching node of a loop */
struct BB;
/* Interval structure */
typedef std::list<BB *> queue;
struct interval
/* Interval structure */
typedef struct _queueNode {
struct _BB *node; /* Ptr to basic block */
struct _queueNode *next;
} queue;
typedef struct _intNode {
byte numInt; /* # of the interval */
byte numOutEdges; /* Number of out edges */
queue *nodes; /* Nodes of the interval*/
queue *currNode; /* Current node */
struct _intNode *next; /* Next interval */
} interval;
typedef union
{
uint8_t numInt=0; /* # of the interval */
uint8_t numOutEdges=0; /* Number of out edges */
queue nodes; /* Nodes of the interval*/
queue::iterator currNode; /* Current node */
interval * next=0; /* Next interval */
BB * firstOfInt();
interval() : currNode(nodes.end()){
}
void appendNodeInt(queue &pqH, BB *node);
};
dword ip; /* Out edge icode address */
struct _BB *BBptr; /* Out edge pointer to next BB */
interval *intPtr; /* Out edge ptr to next interval*/
} TYPEADR_TYPE;
/* Basic block (BB) node definition */
typedef struct _BB {
byte nodeType; /* Type of node */
Int traversed; /* Boolean: traversed yet? */
Int start; /* First instruction offset */
Int length; /* No. of instructions this BB */
Int numHlIcodes; /* No. of high-level icodes */
flags32 flg; /* BB flags */
/* In edges and out edges */
Int numInEdges; /* Number of in edges */
struct _BB **inEdges; /* Array of ptrs. to in edges */
Int numOutEdges; /* Number of out edges */
TYPEADR_TYPE *edges; /* Array of ptrs. to out edges */
/* For interval construction */
Int beenOnH; /* #times been on header list H */
Int inEdgeCount; /* #inEdges (to find intervals) */
struct _BB *reachingInt; /* Reaching interval header */
interval *inInterval; /* Node's interval */
/* For derived sequence construction */
interval *correspInt; /* Corresponding interval in
* derived graph Gi-1 */
/* For live register analysis
* LiveIn(b) = LiveUse(b) U (LiveOut(b) - Def(b)) */
dword liveUse; /* LiveUse(b) */
dword def; /* Def(b) */
dword liveIn; /* LiveIn(b) */
dword liveOut; /* LiveOut(b) */
/* For structuring analysis */
Int dfsFirstNum; /* DFS #: first visit of node */
Int dfsLastNum; /* DFS #: last visit of node */
Int immedDom; /* Immediate dominator (dfsLast
* index) */
Int ifFollow; /* node that ends the if */
Int loopType; /* Type of loop (if any) */
Int latchNode; /* latching node of the loop */
Int numBackEdges; /* # of back edges */
Int loopHead; /* most nested loop head to which
* this node belongs (dfsLast) */
Int loopFollow; /* node that follows the loop */
Int caseHead; /* most nested case to which this
node belongs (dfsLast) */
Int caseTail; /* tail node for the case */
Int index; /* Index, used in several ways */
struct _BB *next; /* Next (list link) */
} BB;
typedef BB *PBB;
/* Derived Sequence structure */
struct derSeq_Entry
{
BB * Gi=nullptr; /* Graph pointer */
std::list<interval *> m_intervals;
interval * Ii=nullptr; /* Interval list of Gi */
~derSeq_Entry();
public:
void findIntervals(Function *c);
};
class derSeq : public std::list<derSeq_Entry>
{
public:
void display();
};
void freeDerivedSeq(derSeq &derivedG); /* reducible.c */
typedef struct _derivedNode {
BB *Gi; /* Graph pointer */
interval *Ii; /* Interval list of Gi */
struct _derivedNode *next; /* Next derived graph */
} derSeq;
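The live register comment in the BB record above states LiveIn(b) = LiveUse(b) U (LiveOut(b) - Def(b)). A minimal sketch of that update over 32-bit register masks follows. `LiveSets`, `regBit` and `updateLiveIn` are hypothetical names used only for illustration, and the mapping of register number 1 to bit 0 is an assumption, not something this diff confirms.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the four liveness fields of the BB record.
struct LiveSets {
    std::uint32_t liveUse = 0;  // registers read before being written in the block
    std::uint32_t def     = 0;  // registers written in the block
    std::uint32_t liveIn  = 0;  // registers live on entry to the block
    std::uint32_t liveOut = 0;  // registers live on exit from the block
};

// Assumed encoding: register number 1 (rAX) occupies bit 0, and so on.
inline std::uint32_t regBit(int reg) {
    assert(reg >= 1 && reg <= 32);
    return 1u << (reg - 1);
}

// LiveIn(b) = LiveUse(b) U (LiveOut(b) - Def(b)).
// Returns true when liveIn changed, which lets a caller iterate to a fixed point.
inline bool updateLiveIn(LiveSets &b) {
    const std::uint32_t next = b.liveUse | (b.liveOut & ~b.def);
    const bool changed = (next != b.liveIn);
    b.liveIn = next;
    return changed;
}
```

A caller would typically recompute liveOut from the successors' liveIn and repeat until `updateLiveIn` returns false for every block.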

@ -1,4 +1,4 @@
/*
/*
* File: hlIcode.h
* Purpose: module definitions for high-level icodes
* Date: September 1993
@ -6,8 +6,16 @@
/* High level icodes opcodes - def in file icode.h */
struct HLICODE
{
/*typedef enum {
HLI_ASSIGN,
INC,
DEC,
HLI_JCOND,
} hlIcode; */
typedef struct {
hlIcode opcode; /* hlIcode opcode */
union { /* different operands */
struct {
@ -17,4 +25,12 @@ struct HLICODE
COND_EXPR *exp; /* for HLI_JCOND, INC, DEC */
} oper; /* operand */
boolT valid; /* has a valid hlIcode */
};
} HLICODE;
typedef struct {
Int numIcodes; /* No. of hlIcode records written */
Int numAlloc; /* No. of hlIcode records allocated */
HLICODE *hlIcode; /* Array of high-level icodes */
} HLICODEREC;

@ -2,621 +2,367 @@
* I-code related definitions
* (C) Cristina Cifuentes
****************************************************************************/
#pragma once
#include "msvc_fixes.h"
#include "BinaryImage.h"
#include "libdis.h"
#include "Enums.h"
/* LOW_LEVEL icode flags */
#define B 0x000001 /* Byte operands (value implicitly used) */
#define I 0x000002 /* Immed. source */
#define NOT_HLL 0x000004 /* Not HLL inst. */
#define FLOAT_OP 0x000008 /* ESC or WAIT */
#define SEG_IMMED 0x000010 /* Number is relocated segment value */
#define IMPURE 0x000020 /* Instruction modifies code */
#define WORD_OFF 0x000040 /* Inst has word offset ie.could be address */
#define TERMINATES 0x000080 /* Instruction terminates program */
#define CASE 0x000100 /* Label as case part of switch */
#define SWITCH 0x000200 /* Treat indirect JMP as switch stmt */
#define TARGET 0x000400 /* Jump target */
#define SYNTHETIC 0x000800 /* Synthetic jump instruction */
#define NO_LABEL 0x001000 /* Immed. jump cannot be linked to a label */
#define NO_CODE 0x002000 /* Hole in Icode array */
#define SYM_USE 0x004000 /* Instruction uses a symbol */
#define SYM_DEF 0x008000 /* Instruction defines a symbol */
#define NO_SRC 0x010000 /* Opcode takes no source */
#define NO_OPS 0x020000 /* Opcode takes no operands */
#define IM_OPS 0x040000 /* Opcode takes implicit operands */
#define SRC_B 0x080000 /* Source operand is byte (dest is word) */
#define NO_SRC_B 0xF7FFFF /* Masks off SRC_B */
#define HLL_LABEL 0x100000 /* Icode has a high level language label */
#define IM_DST 0x200000 /* Implicit DST for opcode (SIGNEX) */
#define IM_SRC 0x400000 /* Implicit SRC for opcode (dx:ax) */
#define IM_TMP_DST 0x800000 /* Implicit rTMP DST for opcode (DIV/IDIV) */
#define JMP_ICODE 0x1000000 /* Jmp dest immed.op converted to icode index */
#define JX_LOOP 0x2000000 /* Cond jump is part of loop conditional exp */
#define REST_STK 0x4000000 /* Stack needs to be restored after CALL */
/* Parser flags */
#define TO_REG 0x000100 /* rm is source */
#define S 0x000200 /* sign extend */
#define OP386 0x000400 /* 386 op-code */
#define NSP 0x000800 /* NOT_HLL if SP is src or dst */
#define ICODEMASK 0xFF00FF /* Masks off parser flags */
/* LOW_LEVEL icode, DU flag bits */
#define Cf 1
#define Sf 2
#define Zf 4
#define Df 8
/* Machine registers */
#define rAX 1 /* These are numbered relative to real 8086 */
#define rCX 2
#define rDX 3
#define rBX 4
#define rSP 5
#define rBP 6
#define rSI 7
#define rDI 8
#define rES 9
#define rCS 10
#define rSS 11
#define rDS 12
#define rAL 13
#define rCL 14
#define rDL 15
#define rBL 16
#define rAH 17
#define rCH 18
#define rDH 19
#define rBH 20
#define rTMP 21 /* temp register for DIV/IDIV/MOD */
#define INDEXBASE 22 /* Indexed modes go from INDEXBASE to
* INDEXBASE+7 */
/* Byte and Word registers */
static char *byteReg[9] = {"al", "cl", "dl", "bl",
"ah", "ch", "dh", "bh", "tmp" };
static char *wordReg[21] = {"ax", "cx", "dx", "bx", "sp", "bp",
"si", "di", "es", "cs", "ss", "ds",
"", "", "", "", "", "", "", "", "tmp"};
#include "state.h" // State depends on INDEXBASE, but later need STATE
#include "CallConvention.h"
#include <boost/range/iterator_range.hpp>
#include <QtCore/QString>
/* Types of icodes */
typedef enum {
NOT_SCANNED = 0, /* not even scanned yet */
LOW_LEVEL, /* low-level icode */
HIGH_LEVEL /* high-level icode */
} icodeType;
#include <memory>
#include <vector>
#include <list>
#include <bitset>
#include <set>
#include <algorithm>
#include <initializer_list>
//enum condId;
/* LOW_LEVEL icode opcodes */
typedef enum {
iCBW, /* 0 */
iAAA,
iAAD,
iAAM,
iAAS,
iADC,
iADD,
iAND,
iBOUND,
iCALL,
iCALLF, /* 10 */
iCLC,
iCLD,
iCLI,
iCMC,
iCMP,
iCMPS,
iREPNE_CMPS,
iREPE_CMPS,
iDAA,
iDAS, /* 20 */
iDEC,
iDIV,
iENTER,
iESC,
iHLT,
iIDIV,
iIMUL,
iIN,
iINC,
iINS, /* 30 */
iREP_INS,
iINT,
iIRET,
iJB,
iJBE,
iJAE,
iJA,
iJE,
iJNE,
iJL, /* 40 */
iJGE,
iJLE,
iJG,
iJS,
iJNS,
iJO,
iJNO,
iJP,
iJNP,
iJCXZ, /* 50 */
iJMP,
iJMPF,
iLAHF,
iLDS,
iLEA,
iLEAVE,
iLES,
iLOCK,
iLODS,
iREP_LODS, /* 60 */
iLOOP,
iLOOPE,
iLOOPNE,
iMOV, /* 64 */
iMOVS,
iREP_MOVS,
iMUL, /* 67 */
iNEG,
iNOT,
iOR, /* 70 */
iOUT,
iOUTS,
iREP_OUTS,
iPOP,
iPOPA,
iPOPF,
iPUSH,
iPUSHA,
iPUSHF,
iRCL, /* 80 */
iRCR,
iROL,
iROR,
iRET, /* 84 */
iRETF,
iSAHF,
iSAR,
iSHL,
iSHR,
iSBB, /* 90 */
iSCAS,
iREPNE_SCAS,
iREPE_SCAS,
iSIGNEX,
iSTC,
iSTD,
iSTI,
iSTOS,
iREP_STOS,
iSUB, /* 100 */
iTEST,
iWAIT,
iXCHG,
iXLAT,
iXOR,
iINTO,
iNOP,
iREPNE,
iREPE,
iMOD, /* 110 */
} llIcode;
struct LOCAL_ID;
struct BB;
struct Function;
struct STKFRAME;
class CIcodeRec;
struct ICODE;
struct bundle;
typedef std::list<ICODE>::iterator iICODE;
typedef std::list<ICODE>::reverse_iterator riICODE;
typedef boost::iterator_range<iICODE> rCODE;
struct LivenessSet
{
std::set<eReg> registers;
public:
LivenessSet(const std::initializer_list<eReg> &init) : registers(init) {}
LivenessSet() {}
LivenessSet(const LivenessSet &other) : registers(other.registers)
{
}
void reset()
{
registers.clear();
}
/* HIGH_LEVEL icodes opcodes */
typedef enum {
HLI_ASSIGN, /* := */
HLI_CALL, /* Call procedure */
HLI_JCOND, /* Conditional jump */
HLI_RET, /* Return from procedure */
/* pseudo high-level icodes */
HLI_POP, /* Pop expression */
HLI_PUSH, /* Push expression */
} hlIcode;
// LivenessSet(LivenessSet &&other) : LivenessSet()
// {
// swap(*this,other);
// }
LivenessSet &operator=(LivenessSet other)
{
swap(*this,other);
return *this;
}
friend void swap(LivenessSet& first, LivenessSet& second) // nothrow
{
std::swap(first.registers, second.registers);
}
LivenessSet &operator|=(const LivenessSet &other)
{
registers.insert(other.registers.begin(),other.registers.end());
return *this;
}
LivenessSet &operator&=(const LivenessSet &other)
{
std::set<eReg> res;
std::set_intersection(registers.begin(),registers.end(),
other.registers.begin(),other.registers.end(),
std::inserter(res, res.end()));
registers = res;
return *this;
}
LivenessSet &operator-=(const LivenessSet &other)
{
std::set<eReg> res;
std::set_difference(registers.begin(),registers.end(),
other.registers.begin(),other.registers.end(),
std::inserter(res, res.end()));
registers = res;
return *this;
}
LivenessSet operator-(const LivenessSet &other) const
{
return LivenessSet(*this) -= other;
}
LivenessSet operator+(const LivenessSet &other) const
{
return LivenessSet(*this) |= other;
}
LivenessSet operator &(const LivenessSet &other) const
{
return LivenessSet(*this) &= other;
}
bool any() const
{
return not registers.empty();
}
bool operator==(const LivenessSet &other) const
{
return registers==other.registers;
}
bool operator!=(const LivenessSet &other) const { return not(*this==other);}
LivenessSet &setReg(int r);
LivenessSet &addReg(int r);
bool testReg(int r) const
{
return registers.find(eReg(r))!=registers.end();
}
bool testRegAndSubregs(int r) const;
LivenessSet &clrReg(int r);
private:
void postProcessCompositeRegs();
};
/* Operand is defined, used or both flag */
typedef enum {
DEF, /* Operand is defined */
USE, /* Operand is used */
USE_DEF, /* Operand is used and defined */
NONE, /* No operation is required on this operand */
} operDu;
/* uint8_t and uint16_t registers */
// I can't believe these are necessary!
#define E_DEF (operDu)DEF
#define E_USE (operDu)USE
#define E_NONE (operDu)NONE
#define E_USE_DEF (operDu)USE_DEF
/* Def/use of flags - low 4 bits represent flags */
struct DU
{
uint8_t d;
uint8_t u;
};
typedef struct {
byte d;
byte u;
} DU;
typedef DU *PDU;
/* Def/Use of registers and stack variables */
typedef struct {
dword def; /* For Registers: position in dword is reg index*/
dword lastDefRegi;/* Bit set if last def of this register in BB */
dword use; /* For Registers: position in dword is reg index*/
}DU_ICODE;
/* Definition-use chain for level 1 (within a basic block) */
#define MAX_REGS_DEF 4 /* 2 regs def'd for long-reg vars */
#define MAX_REGS_DEF 2 /* 2 regs def'd for long-reg vars */
#define MAX_USES 5
typedef struct {
Int numRegsDef; /* # registers defined by this inst */
byte regi[MAX_REGS_DEF]; /* registers defined by this inst */
Int idx[MAX_REGS_DEF][MAX_USES]; /* inst that uses this def */
} DU1;
struct Expr;
struct AstIdent;
struct UnaryOperator;
struct HlTypeSupport
{
//hlIcode opcode; /* hlIcode opcode */
virtual bool removeRegFromLong(eReg regi, LOCAL_ID *locId)=0;
virtual QString writeOut(Function *pProc, int *numLoc) const=0;
protected:
Expr * performLongRemoval (eReg regi, LOCAL_ID *locId, Expr *tree);
};
struct CallType : public HlTypeSupport
{
//for HLI_CALL
Function * proc;
STKFRAME * args; // actual arguments
void allocStkArgs (int num);
bool newStkArg(Expr *exp, llIcode opcode, Function *pproc);
void placeStkArg(Expr *exp, int pos);
virtual Expr * toAst();
public:
bool removeRegFromLong(eReg /*regi*/, LOCAL_ID * /*locId*/)
{
printf("CallType : removeRegFromLong not supported\n");
return false;
}
QString writeOut(Function *pProc, int *numLoc) const;
};
struct AssignType : public HlTypeSupport
{
/* for HLI_ASSIGN */
protected:
public:
Expr * m_lhs;
Expr * m_rhs;
AssignType() {}
Expr *lhs() const {return m_lhs;}
void lhs(Expr *l);
bool removeRegFromLong(eReg regi, LOCAL_ID *locId);
QString writeOut(Function *pProc, int *numLoc) const;
};
struct ExpType : public HlTypeSupport
{
/* for HLI_JCOND, HLI_RET, HLI_PUSH, HLI_POP*/
Expr * v;
ExpType() : v(nullptr) {}
bool removeRegFromLong(eReg regi, LOCAL_ID *locId)
{
v=performLongRemoval(regi,locId,v);
return true;
}
QString writeOut(Function *pProc, int *numLoc) const;
};
struct HLTYPE
{
protected:
public:
ExpType exp; /* for HLI_JCOND, HLI_RET, HLI_PUSH, HLI_POP*/
hlIcode opcode; /* hlIcode opcode */
AssignType asgn;
CallType call;
HlTypeSupport *get();
const HlTypeSupport *get() const
{
return const_cast<const HlTypeSupport *>(const_cast<HLTYPE*>(this)->get());
}
void expr(Expr *e)
{
assert(e);
exp.v=e;
}
Expr *getMyExpr()
{
if(opcode==HLI_CALL)
return call.toAst();
return expr();
}
void replaceExpr(Expr *e);
Expr * expr() { return exp.v;}
const Expr * expr() const { return exp.v;}
void set(hlIcode i,Expr *e)
{
if(i!=HLI_RET)
assert(e);
assert(exp.v==0);
opcode=i;
exp.v=e;
}
void set(Expr *l,Expr *r);
void setCall(Function *proc);
HLTYPE(hlIcode op=HLI_INVALID) : opcode(op)
{}
// HLTYPE() // help valgrind find uninitialized HLTYPES
// {}
HLTYPE & operator=(const HLTYPE &l)
{
exp = l.exp;
opcode = l.opcode;
asgn = l.asgn;
call = l.call;
return *this;
}
public:
QString write1HlIcode(Function *pProc, int *numLoc) const;
void setAsgn(Expr *lhs, Expr *rhs);
} ;
/* LOW_LEVEL icode operand record */
struct LLOperand
typedef struct {
byte seg; /* CS, DS, ES, SS */
int16 segValue; /* Value of segment seg during analysis */
byte segOver; /* CS, DS, ES, SS if segment override */
byte regi; /* 0 < regs < INDEXBASE <= index modes */
int16 off; /* memory address offset */
} ICODEMEM;
typedef ICODEMEM *PMEM;
/* LOW_LEVEL operand location: source or destination */
typedef enum {
SRC, /* Source operand */
DST, /* Destination operand */
LHS_OP, /* Left-hand side operand (for HIGH_LEVEL) */
} opLoc;
typedef struct
{
eReg seg; /* CS, DS, ES, SS */
eReg segOver; /* CS, DS, ES, SS if segment override */
int16_t segValue; /* Value of segment seg during analysis */
eReg regi; /* 0 < regs < INDEXBASE <= index modes */
int16_t off; /* memory address offset */
uint32_t opz; /* idx of immed src op */
bool immed;
bool is_offset; // set by jumps
bool is_compound;
size_t width;
//union {/* Source operand if (flg & I) */
struct { /* Call & # actual arg bytes */
Function *proc; /* pointer to target proc (for CALL(F))*/
int cb; /* # actual arg bytes */
} proc;
LLOperand() : seg(rUNDEF),segOver(rUNDEF),segValue(0),regi(rUNDEF),off(0),
opz(0),immed(0),is_offset(false),is_compound(0),width(0)
{
proc.proc=0;
proc.cb=0;
}
LLOperand(eReg r,size_t w) : LLOperand()
{
regi=r;
width=w;
}
bool operator==(const LLOperand &with) const
{
return (seg==with.seg) and
(segOver==with.segOver) and
(segValue==with.segValue) and
(regi == with.regi) and
(off == with.off) and
(opz==with.opz) and
(proc.proc==with.proc.proc);
}
int64_t getImm2() const {return opz;}
void SetImmediateOp(uint32_t dw)
{
opz=dw;
}
eReg getReg2() const {return regi;}
bool isReg() const;
static LLOperand CreateImm2(int64_t Val,uint8_t wdth=2)
{
LLOperand Op;
Op.immed=true;
Op.opz = Val;
Op.width = wdth;
return Op;
}
static LLOperand CreateReg2(unsigned Val)
{
LLOperand Op;
Op.regi = (eReg)Val;
return Op;
}
bool isSet()
{
return not (*this == LLOperand());
}
void addProcInformation(int param_count, CConv::Type call_conv);
bool isImmediate() const { return immed;}
void setImmediate(bool x) { immed=x;}
bool compound() const {return is_compound;} // dx:ax pair
size_t byteWidth() const { assert(width<=4); return width;}
};
struct LLInst
hlIcode opcode; /* hlIcode opcode */
union { /* different operands */
struct { /* for HLI_ASSIGN */
COND_EXPR *lhs;
COND_EXPR *rhs;
} asgn;
COND_EXPR *exp; /* for HLI_JCOND, HLI_RET, HLI_PUSH, HLI_POP*/
struct { /* for HLI_CALL */
struct _proc *proc;
struct _STKFRAME *args; /* actual arguments */
} call;
} oper; /* operand */
} HLTYPE;
typedef struct
{
protected:
llIcode m_opcode; // Low level opcode identifier
uint32_t flg; /* icode flags */
LLOperand m_src; /* source operand */
public:
int codeIdx; /* Index into cCode.code */
uint8_t numBytes; /* Number of bytes this instr */
uint32_t label; /* offset in image (20-bit adr) */
LLOperand m_dst; /* destination operand */
DU flagDU; /* def/use of flags */
int caseEntry;
std::vector<uint32_t> caseTbl2;
int hllLabNum; /* label # for hll codegen */
llIcode opcode; /* llIcode instruction */
byte numBytes; /* Number of bytes this instr */
flags32 flg; /* icode flags */
dword label; /* offset in image (20-bit adr) */
ICODEMEM dst; /* destination operand */
ICODEMEM src; /* source operand */
union { /* Source operand if (flg & I) */
dword op; /* idx of immed src op */
struct { /* Call & # actual arg bytes */
struct _proc *proc; /* ^ target proc (for CALL(F))*/
Int cb; /* # actual arg bytes */
} proc;
} immed;
DU flagDU; /* def/use of flags */
struct { /* Case table if op==JMP && !I */
Int numEntries; /* # entries in case table */
dword *entries; /* array of offsets */
} caseTbl;
Int hllLabNum; /* label # for hll codegen */
} LLTYPE;
llIcode getOpcode() const { return m_opcode;}
void setOpcode(uint32_t op) { m_opcode=(llIcode)op; }
bool conditionalJump()
{
return (getOpcode() >= iJB) and (getOpcode() < iJCXZ);
}
bool testFlags(uint32_t x) const { return (flg & x)!=0;}
void setFlags(uint32_t flag) {flg |= flag;}
void clrFlags(uint32_t flag)
{
if(getOpcode()==iMOD)
{
assert(false);
}
flg &= ~flag;
}
uint32_t getFlag() const {return flg;}
uint32_t GetLlLabel() const { return label;}
void SetImmediateOp(uint32_t dw) {m_src.SetImmediateOp(dw);}
bool match(llIcode op)
{
return (getOpcode()==op);
}
bool matchWithRegDst(llIcode op)
{
return (getOpcode()==op) and m_dst.isReg();
}
bool match(llIcode op,eReg dest)
{
return (getOpcode()==op)&&m_dst.regi==dest;
}
bool match(llIcode op,eReg dest,uint32_t flgs)
{
return (getOpcode()==op) and (m_dst.regi==dest) and testFlags(flgs);
}
bool match(llIcode op,eReg dest,eReg src_reg)
{
return (getOpcode()==op) and (m_dst.regi==dest) and (m_src.regi==src_reg);
}
bool match(eReg dest,eReg src_reg)
{
return (m_dst.regi==dest) and (m_src.regi==src_reg);
}
bool match(eReg dest)
{
return (m_dst.regi==dest);
}
bool match(llIcode op,uint32_t flgs)
{
return (getOpcode()==op) and testFlags(flgs);
}
void set(llIcode op,uint32_t flags)
{
setOpcode(op);
flg =flags;
}
void set(llIcode op,uint32_t flags,eReg dst_reg)
{
setOpcode(op);
m_dst = LLOperand::CreateReg2(dst_reg);
flg =flags;
}
void set(llIcode op,uint32_t flags,eReg dst_reg,const LLOperand &src_op)
{
setOpcode(op);
m_dst = LLOperand::CreateReg2(dst_reg);
m_src = src_op;
flg =flags;
}
void emitGotoLabel(int indLevel);
void findJumpTargets(CIcodeRec &_pc);
void writeIntComment(QTextStream & s);
void dis1Line(int loc_ip, int pass);
QTextStream & strSrc(QTextStream & os, bool skip_comma=false);
void flops(QTextStream & out);
bool isJmpInst();
HLTYPE createCall();
LLInst(ICODE *container) : flg(0),codeIdx(0),numBytes(0),m_link(container)
{
setOpcode(0);
}
const LLOperand &src() const {return m_src;}
LLOperand &src() {return m_src;}
void replaceSrc(const LLOperand &with)
{
m_src = with;
}
void replaceSrc(eReg r)
{
m_src = LLOperand::CreateReg2(r);
}
void replaceSrc(int64_t r)
{
m_src = LLOperand::CreateImm2(r);
}
void replaceDst(const LLOperand &with)
{
m_dst = with;
}
// void replaceDst(eReg r)
// {
// dst = LLOperand::CreateReg2(r);
// }
ICODE *m_link;
condId idType(opLoc sd) const;
const LLOperand * get(opLoc sd) const { return (sd == SRC) ? &src() : &m_dst; }
LLOperand * get(opLoc sd) { return (sd == SRC) ? &src() : &m_dst; }
};
/* Icode definition: LOW_LEVEL and HIGH_LEVEL */
struct ICODE
{
// use llvm names at least
typedef BB MachineBasicBlock;
protected:
LLInst m_ll;
HLTYPE m_hl;
MachineBasicBlock * Parent; /* BB to which this icode belongs */
bool invalid; /* Has no HIGH_LEVEL equivalent */
public:
x86_insn_t insn;
template<int FLAG>
struct FlagFilter
{
bool operator()(ICODE *ic) {return ic->ll()->testFlags(FLAG);}
bool operator()(ICODE &ic) {return ic.ll()->testFlags(FLAG);}
};
template<int TYPE>
struct TypeFilter
{
bool operator()(ICODE *ic) {return ic->type==TYPE;}
bool operator()(ICODE &ic) {return ic.type==TYPE;}
};
template<int TYPE>
struct TypeAndValidFilter
{
bool operator()(ICODE *ic) {return (ic->type==TYPE) and (ic->valid());}
bool operator()(ICODE &ic) {return (ic.type==TYPE) and ic.valid();}
};
static TypeFilter<HIGH_LEVEL_ICODE> select_high_level;
static TypeAndValidFilter<HIGH_LEVEL_ICODE> select_valid_high_level;
/* Def/Use of registers and stack variables */
struct DU_ICODE
{
DU_ICODE()
{
def.reset();
use.reset();
lastDefRegi.reset();
}
LivenessSet def; // For Registers: position in bitset is reg index
LivenessSet use; // For Registers: position in uint32_t is reg index
LivenessSet lastDefRegi;// Bit set if last def of this register in BB
void addDefinedAndUsed(eReg r)
{
def.addReg(r);
use.addReg(r);
}
};
struct DU1
{
protected:
int numRegsDef; /* # registers defined by this inst */
typedef struct {
icodeType type; /* Icode type */
boolT invalid; /* Has no HIGH_LEVEL equivalent */
struct _BB *inBB; /* BB to which this icode belongs */
DU_ICODE du; /* Def/use regs/vars */
DU1 du1; /* du chain 1 */
Int codeIdx; /* Index into cCode.code */
struct { /* Different types of icodes */
LLTYPE ll;
HLTYPE hl; /* For HIGH_LEVEL icodes */
} ic; /* intermediate code */
} ICODE;
typedef ICODE* PICODE;
public:
struct Use
{
int Reg; // used register
std::vector<std::list<ICODE>::iterator> uses; // use locations [MAX_USES]
void removeUser(std::list<ICODE>::iterator us)
{
// ic is no longer a user
auto iter=std::find(uses.begin(),uses.end(),us);
if(iter==uses.end())
return;
uses.erase(iter);
assert("Same user more than once!" and uses.end()==std::find(uses.begin(),uses.end(),us));
}
};
uint8_t regi[MAX_REGS_DEF+1]; /* registers defined by this inst */
Use idx[MAX_REGS_DEF+1];
//int idx[MAX_REGS_DEF][MAX_USES]; /* inst that uses this def */
bool used(int regIdx)
{
return not idx[regIdx].uses.empty();
}
int numUses(int regIdx)
{
return idx[regIdx].uses.size();
}
void recordUse(int regIdx,std::list<ICODE>::iterator location)
{
idx[regIdx].uses.push_back(location);
}
void remove(int regIdx,int use_idx)
{
idx[regIdx].uses.erase(idx[regIdx].uses.begin()+use_idx);
}
void remove(int regIdx,std::list<ICODE>::iterator ic)
{
Use &u(idx[regIdx]);
u.removeUser(ic);
}
int getNumRegsDef() const {return numRegsDef;}
void clearAllDefs() {numRegsDef=0;}
DU1 &addDef(eReg r) {numRegsDef++; return *this;}
DU1 &setDef(eReg r) {numRegsDef=1; return *this;}
void removeDef(eReg r) {numRegsDef--;}
DU1() : numRegsDef(0)
{
}
};
icodeType type; /* Icode type */
DU_ICODE du; /* Def/use regs/vars */
DU1 du1; /* du chain 1 */
int loc_ip; // used by CICodeRec to number ICODEs
LLInst * ll() { return &m_ll;}
const LLInst * ll() const { return &m_ll;}
HLTYPE * hlU() {
// assert(type==HIGH_LEVEL);
// assert(m_hl.opcode!=HLI_INVALID);
return &m_hl;
}
const HLTYPE * hl() const {
// assert(type==HIGH_LEVEL);
// assert(m_hl.opcode!=HLI_INVALID);
return &m_hl;
}
void hl(const HLTYPE &v) { m_hl=v;}
void setRegDU(eReg regi, operDu du_in);
void invalidate();
void newCallHl();
void writeDU();
condId idType(opLoc sd);
// HLL setting functions
// set this icode to be an assign
void setAsgn(Expr *lhs, Expr *rhs)
{
type=HIGH_LEVEL_ICODE;
hlU()->setAsgn(lhs,rhs);
}
void setUnary(hlIcode op, Expr *_exp);
void setJCond(Expr *cexp);
void emitGotoLabel(int indLevel);
void copyDU(const ICODE &duIcode, operDu _du, operDu duDu);
bool valid() {return not invalid;}
void setParent(MachineBasicBlock *P) { Parent = P; }
public:
bool removeDefRegi(eReg regi, int thisDefIdx, LOCAL_ID *locId);
void checkHlCall();
bool newStkArg(Expr *exp, llIcode opcode, Function *pproc)
{
return hlU()->call.newStkArg(exp,opcode,pproc);
}
ICODE() : m_ll(this),Parent(0),invalid(false),type(NOT_SCANNED_ICODE),loc_ip(0)
{
}
public:
const MachineBasicBlock* getParent() const { return Parent; }
MachineBasicBlock* getParent() { return Parent; }
//unsigned getNumOperands() const { return (unsigned)Operands.size(); }
};
/** Map n low level instructions to m high level instructions
*/
//struct MappingLLtoML
//{
// typedef boost::iterator_range<iICODE> rSourceRange;
// typedef boost::iterator_range<InstListType::iterator> rTargetRange;
// rSourceRange m_low_level;
// rTargetRange m_middle_level;
//};
// This is the icode array object.
class CIcodeRec : public std::list<ICODE>
// The bulk of this could well be done with a class library
class CIcodeRec
{
public:
CIcodeRec(); // Constructor
CIcodeRec(); // Constructor
~CIcodeRec(); // Destructor
PICODE addIcode(PICODE pIcode);
PICODE GetFirstIcode();
// PICODE GetNextIcode(PICODE pCurIcode);
boolT IsValid(PICODE pCurIcode);
int GetNumIcodes();
void SetInBB(int start, int end, struct _BB* pnewBB);
void SetImmediateOp(int ip, dword dw);
void SetLlFlag(int ip, dword flag);
void ClearLlFlag(int ip, dword flag);
dword GetLlFlag(int ip);
void SetLlInvalid(int ip, boolT fInv);
dword GetLlLabel(int ip);
llIcode GetLlOpcode(int ip);
boolT labelSrch(dword target, Int *pIndex);
PICODE GetIcode(int ip);
protected:
Int numIcode; /* # icodes in use */
Int alloc; /* # icodes allocated */
PICODE icode; /* Array of icodes */
ICODE * addIcode(ICODE *pIcode);
void SetInBB(rCODE &rang, BB* pnewBB);
bool labelSrch(uint32_t target, uint32_t &pIndex);
iterator labelSrch(uint32_t target);
ICODE * GetIcode(size_t ip);
bool alreadyDecoded(uint32_t target);
};
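The LivenessSet operators in this header build their results with std::set_difference, std::set_intersection and range insert. The sketch below isolates the difference and union steps on a plain std::set. The `Reg` enum, `RegSet`, `difference` and `unite` are hypothetical stand-ins for illustration; the real eReg type is defined in a header not shown in this diff.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>

// Hypothetical register ids, for illustration only.
enum Reg { rAX = 1, rCX, rDX, rBX };

typedef std::set<Reg> RegSet;

// Elements of a that are not in b. std::set iterates in sorted order,
// which is the precondition std::set_difference requires.
inline RegSet difference(const RegSet &a, const RegSet &b) {
    RegSet out;
    std::set_difference(a.begin(), a.end(),
                        b.begin(), b.end(),
                        std::inserter(out, out.end()));
    return out;
}

// Elements in either set; duplicates are dropped by std::set itself.
inline RegSet unite(RegSet a, const RegSet &b) {
    a.insert(b.begin(), b.end());
    return a;
}
```

Compared with a 32-bit mask, a std::set is not limited to 32 registers, at the cost of allocation on every operation.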

@ -1,77 +0,0 @@
#pragma once
#include <vector>
#include "idiom.h"
#include "icode.h"
#include <deque>
struct Idiom5 : public Idiom
{
protected:
iICODE m_icodes[2];
public:
virtual ~Idiom5() {}
Idiom5(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 2;}
bool match(iICODE pIcode);
int action();
};
struct Idiom6 : public Idiom
{
protected:
iICODE m_icodes[2];
public:
virtual ~Idiom6() {}
Idiom6(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 2;}
bool match(iICODE pIcode);
int action();
};
struct Idiom18 : public Idiom
{
protected:
iICODE m_icodes[4];
bool m_is_dec;
/* type of variable: 1 = reg-var, 2 = local */
int m_idiom_type;
public:
Idiom18(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 4;}
bool match(iICODE picode);
int action();
};
struct Idiom19 : public Idiom
{
protected:
iICODE m_icodes[2];
bool m_is_dec;
public:
Idiom19(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 2;}
bool match(iICODE picode);
int action();
};
struct Idiom20 : public Idiom
{
protected:
iICODE m_icodes[4];
condNodeType m_is_dec;
public:
Idiom20(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 4;}
bool match(iICODE picode);
int action();
};

@ -1,42 +0,0 @@
#pragma once
#include <vector>
#include "idiom.h"
#include "icode.h"
#include <deque>
struct CallIdiom : public Idiom
{
protected:
int m_param_count;
public:
virtual ~CallIdiom() {}
CallIdiom(Function *f) : Idiom(f)
{
}
};
struct Idiom3 : public CallIdiom
{
protected:
iICODE m_icodes[2];
public:
virtual ~Idiom3() {}
Idiom3(Function *f) : CallIdiom(f)
{
}
uint8_t minimum_match_length() {return 2;}
bool match(iICODE pIcode);
int action();
};
struct Idiom17 : public CallIdiom
{
protected:
std::vector<iICODE> m_icodes;
public:
virtual ~Idiom17() {}
Idiom17(Function *f) : CallIdiom(f)
{
}
uint8_t minimum_match_length() {return 2;}
bool match(iICODE pIcode);
int action();
};

@ -1,39 +0,0 @@
#pragma once
#include "idiom.h"
#include "icode.h"
#include <deque>
struct EpilogIdiom : public Idiom
{
protected:
std::deque<iICODE> m_icodes; // deque to push_front optional icodes from popStkVars
void popStkVars (iICODE pIcode);
public:
virtual ~EpilogIdiom() {}
EpilogIdiom(Function *f) : Idiom(f)
{
}
};
struct Idiom2 : public EpilogIdiom
{
virtual ~Idiom2() {}
Idiom2(Function *f) : EpilogIdiom(f)
{
}
uint8_t minimum_match_length() {return 3;}
bool match(iICODE pIcode);
int action();
};
struct Idiom4 : public EpilogIdiom
{
protected:
int m_param_count;
public:
virtual ~Idiom4() {}
Idiom4(Function *f) : EpilogIdiom(f)
{
}
uint8_t minimum_match_length() {return 1;}
bool match(iICODE pIcode);
int action();
};

@ -1,22 +0,0 @@
#pragma once
#include "icode.h"
#include "Procedure.h"
struct Idiom
{
protected:
Function *m_func;
iICODE m_end;
public:
Idiom(Function *f) : m_func(f),m_end(f->Icode.end())
{
}
virtual uint8_t minimum_match_length()=0;
virtual bool match(iICODE at)=0;
virtual int action()=0;
int operator ()(iICODE at)
{
if(match(at))
return action();
return 1;
}
};
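The Idiom base class above exposes operator(), which returns action() on a match and 1 otherwise. A minimal sketch of how a scanner might drive such an object follows. It assumes the returned value is the number of instructions to step over, which this diff does not confirm. `Pattern`, `PairOfSevens` and `scan` are hypothetical names, and a list of ints stands in for the ICODE list.

```cpp
#include <cassert>
#include <list>

typedef std::list<int> Code;   // stand-in for the list of ICODE records
typedef Code::iterator It;

// Same shape as the Idiom interface: match, then act, else step by one.
struct Pattern {
    virtual ~Pattern() {}
    virtual bool match(It at) = 0;
    virtual int action() = 0;  // assumed: number of elements consumed
    int operator()(It at) { return match(at) ? action() : 1; }
};

// Toy pattern: two consecutive 7s.
struct PairOfSevens : Pattern {
    It end;
    int hits;
    explicit PairOfSevens(It e) : end(e), hits(0) {}
    bool match(It at) {
        if (at == end || *at != 7) return false;
        It next = at;
        ++next;
        return next != end && *next == 7;
    }
    int action() { ++hits; return 2; }
};

// Walk the list, advancing by whatever the pattern reports.
// Returns how many times the pattern was invoked.
inline int scan(Code &code, Pattern &p) {
    int steps = 0;
    It at = code.begin();
    while (at != code.end()) {
        const int n = p(at);
        assert(n >= 1);        // a zero step would loop forever
        for (int i = 0; i < n && at != code.end(); ++i) ++at;
        ++steps;
    }
    return steps;
}
```

Returning 1 on a failed match keeps the driver loop free of special cases: every call yields a positive step.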

@ -1,17 +0,0 @@
#pragma once
#include "idiom.h"
struct Idiom1 : public Idiom
{
protected:
std::vector<iICODE> m_icodes;
int m_min_off;
int checkStkVars (iICODE pIcode);
public:
Idiom1(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 1;}
bool match(iICODE picode);
int action();
size_t match_length() {return m_icodes.size();}
};

@ -1,36 +0,0 @@
#pragma once
#include <vector>
#include "idiom.h"
#include "icode.h"
#include <deque>
struct Idiom14 : public Idiom
{
protected:
iICODE m_icodes[2];
eReg m_regL;
eReg m_regH;
public:
virtual ~Idiom14() {}
Idiom14(Function *f) : Idiom(f),m_regL(rUNDEF),m_regH(rUNDEF)
{
}
uint8_t minimum_match_length() {return 2;}
bool match(iICODE pIcode);
int action();
};
struct Idiom13 : public Idiom
{
protected:
iICODE m_icodes[2];
eReg m_loaded_reg;
public:
virtual ~Idiom13() {}
Idiom13(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 2;}
bool match(iICODE pIcode);
int action();
};


@ -1,33 +0,0 @@
#pragma once
#include <vector>
#include "idiom.h"
#include "icode.h"
#include <deque>
struct Idiom11 : public Idiom
{
protected:
iICODE m_icodes[3];
public:
virtual ~Idiom11() {}
Idiom11(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 3;}
bool match(iICODE pIcode);
int action();
};
struct Idiom16 : public Idiom
{
protected:
iICODE m_icodes[3];
public:
virtual ~Idiom16() {}
Idiom16(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 3;}
bool match(iICODE pIcode);
int action();
};


@ -1,66 +0,0 @@
#pragma once
#include <vector>
#include "idiom.h"
#include "icode.h"
#include <deque>
struct Idiom8 : public Idiom
{
protected:
iICODE m_icodes[2];
uint8_t m_loaded_reg;
public:
virtual ~Idiom8() {}
Idiom8(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 2;}
bool match(iICODE pIcode);
int action();
};
struct Idiom15 : public Idiom
{
protected:
std::vector<iICODE> m_icodes;
public:
virtual ~Idiom15() {}
Idiom15(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 2;}
bool match(iICODE pIcode);
int action();
};
struct Idiom12 : public Idiom
{
protected:
iICODE m_icodes[2];
uint8_t m_loaded_reg;
public:
virtual ~Idiom12() {}
Idiom12(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 2;}
bool match(iICODE pIcode);
int action();
};
struct Idiom9 : public Idiom
{
protected:
iICODE m_icodes[2];
uint8_t m_loaded_reg;
public:
virtual ~Idiom9() {}
Idiom9(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 2;}
bool match(iICODE pIcode);
int action();
};


@ -1,46 +0,0 @@
#pragma once
#include <vector>
#include "idiom.h"
#include "icode.h"
#include <deque>
struct Idiom21 : public Idiom
{
protected:
iICODE m_icodes[2];
public:
virtual ~Idiom21() {}
Idiom21(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 2;}
bool match(iICODE pIcode);
int action();
};
struct Idiom7 : public Idiom
{
protected:
iICODE m_icode;
public:
virtual ~Idiom7() {}
Idiom7(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 1;}
bool match(iICODE pIcode);
int action();
};
struct Idiom10 : public Idiom
{
protected:
iICODE m_icodes[2];
public:
virtual ~Idiom10() {}
Idiom10(Function *f) : Idiom(f)
{
}
uint8_t minimum_match_length() {return 1;}
bool match(iICODE pIcode);
int action();
};


@ -1,11 +0,0 @@
#pragma once
class ILoader
{
};
class LoaderManger
{
};


@ -5,177 +5,106 @@
* (C) Cristina Cifuentes
*/
#pragma once
#include "msvc_fixes.h"
#include "types.h"
#include "Enums.h"
#include "machine_x86.h"
#include <QtCore/QString>
#include <stdint.h>
#include <vector>
#include <list>
#include <set>
#include <algorithm>
/* Type definition */
// this array has to stay in-order of addition i.e. not std::set<iICODE,std::less<iICODE> >
// TODO: why ?
struct Expr;
struct AstIdent;
struct ICODE;
struct LLInst;
typedef std::list<ICODE>::iterator iICODE;
struct IDX_ARRAY : public std::vector<iICODE>
{
bool inList(iICODE idx) const
{
return std::find(begin(),end(),idx)!=end();
}
};
typedef struct {
Int csym; /* # symbols used */
Int alloc; /* # symbols allocated */
Int *idx; /* Array of integer indexes */
} IDX_ARRAY;
enum frameType
{
STK_FRAME, /* For stack vars */
REG_FRAME, /* For register variables */
GLB_FRAME /* For globals */
};
/* Type definitions used in the decompiled program */
typedef enum {
TYPE_UNKNOWN = 0, /* unknown so far */
TYPE_BYTE_SIGN, /* signed byte (8 bits) */
TYPE_BYTE_UNSIGN, /* unsigned byte */
TYPE_WORD_SIGN, /* signed word (16 bits) */
TYPE_WORD_UNSIGN, /* unsigned word (16 bits) */
TYPE_LONG_SIGN, /* signed long (32 bits) */
TYPE_LONG_UNSIGN, /* unsigned long (32 bits) */
TYPE_RECORD, /* record structure */
TYPE_PTR, /* pointer (32 bit ptr) */
TYPE_STR, /* string */
TYPE_CONST, /* constant (any type) */
TYPE_FLOAT, /* floating point */
TYPE_DOUBLE, /* double precision float */
} hlType;
struct BWGLB_TYPE
static char *hlTypes[13] = {"", "char", "unsigned char", "int", "unsigned int",
"long", "unsigned long", "record", "int *", "char *",
"", "float", "double"};
typedef enum
{
int16_t seg; /* segment value */
int16_t off; /* offset */
eReg regi; /* optional indexed register */
} ;
STK_FRAME, /* For stack vars */
REG_FRAME, /* For register variables */
GLB_FRAME, /* For globals */
} frameType;
/* For TYPE_LONG_(UN)SIGN on the stack */
struct LONG_STKID_TYPE
{
int offH; /* high offset from BP */
int offL; /* low offset from BP */
LONG_STKID_TYPE(int h,int l) : offH(h),offL(l) {}
};
/* For TYPE_LONG_(UN)SIGN registers */
struct LONGID_TYPE
{
protected:
eReg m_h; /* high register */
eReg m_l; /* low register */
public:
void set(eReg highpart,eReg lowpart)
{
m_h = highpart;
m_l = lowpart;
}
eReg l() const { return m_l; }
eReg h() const { return m_h; }
bool srcDstRegMatch(iICODE a,iICODE b) const;
LONGID_TYPE() {} // uninitializing constructor to help valgrind catch uninit accesses
LONGID_TYPE(eReg h,eReg l) : m_h(h),m_l(l) {}
};
/* Enumeration to determine whether pIcode points to the high or low part
* of a long number */
typedef enum {
HIGH_FIRST, /* High value is first */
LOW_FIRST, /* Low value is first */
} hlFirst;
struct LONGGLB_TYPE /* For TYPE_LONG_(UN)SIGN globals */
typedef struct
{
int16_t seg; /* segment value */
int16_t offH; /* offset high */
int16_t offL; /* offset low */
uint8_t regi; /* optional indexed register */
LONGGLB_TYPE(int16_t _seg,int16_t _H,int16_t _L,int8_t _reg=0)
{
seg=_seg;
offH=_H;
offL=_L;
regi=_reg;
}
};
int16 seg; /* segment value */
int16 off; /* offset */
byte regi; /* optional indexed register */
} BWGLB_TYPE;
typedef struct
{ /* For TYPE_LONG_(UN)SIGN on the stack */
Int offH; /* high offset from BP */
Int offL; /* low offset from BP */
} LONG_STKID_TYPE;
typedef struct
{ /* For TYPE_LONG_(UN)SIGN registers */
byte h; /* high register */
byte l; /* low register */
} LONGID_TYPE;
/* ID, LOCAL_ID */
struct ID
{
protected:
LONGID_TYPE m_longId; /* For TYPE_LONG_(UN)SIGN registers */
public:
hlType type; /* Probable type */
IDX_ARRAY idx; /* Index into icode array (REG_FRAME only) */
frameType loc; /* Frame location */
bool illegal; /* Boolean: not a valid field any more */
bool hasMacro; /* Identifier requires a macro */
char macro[10]; /* Macro for this identifier */
QString name; /* Identifier's name */
union ID_UNION { /* Different types of identifiers */
friend struct ID;
protected:
LONG_STKID_TYPE longStkId; /* For TYPE_LONG_(UN)SIGN on the stack */
public:
eReg regi; /* For TYPE_BYTE(WORD)_(UN)SIGN registers */
struct { /* For TYPE_BYTE(WORD)_(UN)SIGN on the stack */
uint8_t regOff; /* register offset (if any) */
int off; /* offset from BP */
} bwId;
BWGLB_TYPE bwGlb; /* For TYPE_BYTE(uint16_t)_(UN)SIGN globals */
LONGGLB_TYPE longGlb;
struct { /* For TYPE_LONG_(UN)SIGN constants */
uint32_t h; /* high uint16_t */
uint32_t l; /* low uint16_t */
} longKte;
ID_UNION() { /*new (&longStkId) LONG_STKID_TYPE();*/}
} id;
typedef struct {
hlType type; /* Probable type */
boolT illegal;/* Boolean: not a valid field any more */
IDX_ARRAY idx; /* Index into icode array (REG_FRAME only) */
frameType loc; /* Frame location */
boolT hasMacro;/* Identifier requires a macro */
char macro[10];/* Macro for this identifier */
char name[20];/* Identifier's name */
union { /* Different types of identifiers */
byte regi; /* For TYPE_BYTE(WORD)_(UN)SIGN registers */
struct { /* For TYPE_BYTE(WORD)_(UN)SIGN on the stack */
byte regOff; /* register offset (if any) */
Int off; /* offset from BP */
} bwId;
BWGLB_TYPE bwGlb; /* For TYPE_BYTE(WORD)_(UN)SIGN globals */
LONGID_TYPE longId; /* For TYPE_LONG_(UN)SIGN registers */
LONG_STKID_TYPE longStkId;/* For TYPE_LONG_(UN)SIGN on the stack */
struct { /* For TYPE_LONG_(UN)SIGN globals */
int16 seg; /* segment value */
int16 offH; /* offset high */
int16 offL; /* offset low */
byte regi; /* optional indexed register */
} longGlb;
struct { /* For TYPE_LONG_(UN)SIGN constants */
dword h; /* high word */
dword l; /* low word */
} longKte;
} id;
} ID;
LONGID_TYPE & longId() {assert(isLong() and loc==REG_FRAME); return m_longId;}
const LONGID_TYPE & longId() const {assert(isLong() and loc==REG_FRAME); return m_longId;}
LONG_STKID_TYPE & longStkId() {assert(isLong() and loc==STK_FRAME); return id.longStkId;}
const LONG_STKID_TYPE & longStkId() const {assert(isLong() and loc==STK_FRAME); return id.longStkId;}
ID();
ID(hlType t, frameType f);
ID(hlType t, const LONGID_TYPE &s);
ID(hlType t, const LONG_STKID_TYPE &s);
ID(hlType t, const LONGGLB_TYPE &s);
bool isSigned() const { return (type==TYPE_BYTE_SIGN) or (type==TYPE_WORD_SIGN) or (type==TYPE_LONG_SIGN);}
uint16_t typeBitsize() const
{
return TypeContainer::typeSize(type)*8;
}
bool isLong() const { return (type==TYPE_LONG_UNSIGN) or (type==TYPE_LONG_SIGN); }
void setLocalName(int i)
{
char buf[32];
sprintf (buf, "loc%d", i);
name=buf;
}
bool isLongRegisterPair() const { return (loc == REG_FRAME) and isLong();}
eReg getPairedRegister(eReg first) const;
};
struct LOCAL_ID
{
std::vector<ID> id_arr;
protected:
int newLongIdx(int16_t seg, int16_t offH, int16_t offL, uint8_t regi, hlType t);
int newLongGlb(int16_t seg, int16_t offH, int16_t offL, hlType t);
int newLongStk(hlType t, int offH, int offL);
public:
LOCAL_ID()
{
id_arr.reserve(256);
}
// interface to allow range based iteration
std::vector<ID>::iterator begin() {return id_arr.begin();}
std::vector<ID>::iterator end() {return id_arr.end();}
int newByteWordReg(hlType t, eReg regi);
int newByteWordStk(hlType t, int off, uint8_t regOff);
int newIntIdx(int16_t seg, int16_t off, eReg regi, hlType t);
int newLongReg(hlType t, const LONGID_TYPE &longT, iICODE ix_);
int newLong(opLoc sd, iICODE pIcode, hlFirst f, iICODE ix, operDu du, int off);
int newLong(opLoc sd, iICODE pIcode, hlFirst f, iICODE ix, operDu du, LLInst &atOffset);
void newIdent(hlType t, frameType f);
void flagByteWordId(int off);
void propLongId(uint8_t regL, uint8_t regH, const QString & name);
size_t csym() const {return id_arr.size();}
void newRegArg(ICODE & picode, ICODE & ticode) const;
void processTargetIcode(ICODE & picode, int &numHlIcodes, ICODE & ticode, bool isLong) const;
void forwardSubs(Expr *lhs, Expr *rhs, ICODE & picode, ICODE & ticode, int &numHlIcodes) const;
AstIdent *createId(const ID *retVal, iICODE ix_);
eReg getPairedRegisterAt(int idx,eReg first) const;
};
typedef struct {
Int csym; /* No. of symbols in the table */
Int alloc; /* No. of symbols allocated */
ID *id; /* Identifier */
} LOCAL_ID;


@ -1,81 +0,0 @@
#pragma once
#include <QtCore/QString>
#include <stdint.h>
#include <bitset>
class QTextStream;
struct LivenessSet;
/* Machine registers */
enum eReg
{
rUNDEF = 0,
rAX = 1, /* These are numbered relative to real 8086 */
rCX = 2,
rDX = 3,
rBX = 4,
rSP = 5,
rBP = 6,
rSI = 7,
rDI = 8,
rES = 9,
rCS = 10,
rSS = 11,
rDS = 12,
rAL = 13,
rCL = 14,
rDL = 15,
rBL = 16,
rAH = 17,
rCH = 18,
rDH = 19,
rBH = 20,
rTMP= 21, /* temp register for DIV/IDIV/MOD */
rTMP2= 22, /* temp register for DIV/IDIV/MOD */
/* Indexed modes go from INDEXBASE to INDEXBASE+7 */
INDEX_BX_SI = 23, // "bx+si"
INDEX_BX_DI, // "bx+di"
INDEX_BP_SI, // "bp+si"
INDEX_BP_DI, // "bp+di"
INDEX_SI, // "si"
INDEX_DI, // "di"
INDEX_BP, // "bp"
INDEX_BX, // "bx"
LAST_REG
};
class SourceMachine
{
public:
virtual bool physicalReg(eReg r)=0;
};
//class Machine_X86_Disassembler
//{
// void formatRM(std::ostringstream &p, uint32_t flg, const LLOperand &pm);
//};
class Machine_X86 : public SourceMachine
{
public:
Machine_X86();
virtual ~Machine_X86() {}
static const QString & regName(eReg r);
static const QString & opcodeName(unsigned r);
static const QString & floatOpName(unsigned r);
bool physicalReg(eReg r);
/* Writes the registers that are set in the bitvector */
//TODO: move this into Machine_X86 ?
static void writeRegVector (QTextStream & ostr, const LivenessSet &regi);
static eReg subRegH(eReg reg);
static eReg subRegL(eReg reg);
static bool isMemOff(eReg r);
static bool isSubRegisterOf(eReg reg, eReg parent);
static bool hasSubregisters(eReg reg);
static bool isPartOfComposite(eReg reg);
static eReg compositeParent(eReg reg);
};


@ -1,3 +0,0 @@
#ifdef _MSC_VER
#include <iso646.h>
#endif

34
include/perfhlib.h Normal file

@ -0,0 +1,34 @@
/* Perfect hashing function library. Contains functions to generate perfect
hashing functions
* (C) Mike van Emmerik
*/
#define TRUE 1
#define FALSE 0
#define bool unsigned char
#define byte unsigned char
#define word unsigned short
/* Prototypes */
void hashParams(int NumEntry, int EntryLen, int SetSize, char SetMin,
int NumVert); /* Set the parameters for the hash table */
void hashCleanup(void); /* Frees memory allocated by hashParams() */
void map(void); /* Part 1 of creating the tables */
void assign(void); /* Part 2 of creating the tables */
int hash(byte *s); /* Hash the string to an int 0 .. NUMENTRY-1 */
word *readT1(void); /* Returns a pointer to the T1 table */
word *readT2(void); /* Returns a pointer to the T2 table */
word *readG(void); /* Returns a pointer to the g table */
/* The application must provide these functions: */
void getKey(int i, byte **pKeys);/* Set *keys to point to the i+1th key */
void dispKey(int i); /* Display the key */
/* Macro reads a LH word from the image regardless of host convention */
#ifndef LH
#define LH(p) ((int)((byte *)(p))[0] + ((int)((byte *)(p))[1] << 8))
#endif
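
The `LH` macro reads a 16-bit little-endian ("low byte first, high byte second") word byte by byte, so the result does not depend on the host's byte order. The same arithmetic is sketched below as a function; `readLH` is a hypothetical name used only for illustration.

```cpp
#include <cassert>

// Same arithmetic as LH(p): byte 0 is the low half, byte 1 the high half.
inline int readLH(const unsigned char *p)
{
    return static_cast<int>(p[0]) + (static_cast<int>(p[1]) << 8);
}
```

For the byte pair `00 C0` this yields `0xC000`, matching the "C000 is read into an Int as C000" remark that accompanies the macro elsewhere in the code base.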


@ -1,74 +0,0 @@
#pragma once
#include <string>
#include <stdint.h>
#include <cassert>
#include <list>
#include <boost/icl/interval.hpp>
#include <boost/icl/interval_map.hpp>
#include <boost/icl/split_interval_map.hpp>
#include <unordered_set>
#include <QtCore/QString>
#include "symtab.h"
#include "BinaryImage.h"
#include "Procedure.h"
class QString;
class SourceMachine;
struct CALL_GRAPH;
class IProject
{
virtual PROG *binary()=0;
virtual const QString & project_name() const =0;
virtual const QString & binary_path() const =0;
};
class Project : public IProject
{
static Project *s_instance;
QString m_fname;
QString m_project_name;
QString m_output_path;
public:
typedef std::list<Function> FunctionListType;
typedef FunctionListType lFunction;
typedef FunctionListType::iterator ilFunction;
SYMTAB symtab; /* Global symbol table */
FunctionListType pProcList;
CALL_GRAPH * callGraph; /* Pointer to the head of the call graph */
PROG prog; /* Loaded program image parameters */
// no copies
Project(const Project&) = delete;
const Project & operator=(const Project & l) =delete;
// only moves
Project(); // default constructor,
public:
void create(const QString &a);
bool load();
const QString & output_path() const {return m_output_path;}
const QString & project_name() const {return m_project_name;}
const QString & binary_path() const {return m_fname;}
QString output_name(const char *ext);
ilFunction funcIter(Function *to_find);
ilFunction findByEntry(uint32_t entry);
ilFunction createFunction(FunctionType *f, const QString & name);
bool valid(ilFunction iter);
int getSymIdxByAddr(uint32_t adr);
bool validSymIdx(size_t idx);
size_t symbolSize(size_t idx);
hlType symbolType(size_t idx);
const QString & symbolName(size_t idx);
const SYM & getSymByIdx(size_t idx) const;
static Project * get();
PROG * binary() {return &prog;}
SourceMachine *machine();
const FunctionListType &functions() const { return pProcList; }
FunctionListType &functions() { return pProcList; }
protected:
void initialize();
void writeGlobSymTable();
};
//extern Project g_proj;


@ -1,12 +1,38 @@
#pragma once
/* Scanner functions
* (C) Cristina Cifuentes, Jeff Ledermann
*/
#include <stdint.h>
#include "error.h"
#define REG(x) ((uint8_t)(x & 0x38) >> 3)
//#define LH(p) ((int)((uint8_t *)(p))[0] + ((int)((uint8_t *)(p))[1] << 8))
struct ICODE;
/* Extracts reg bits from middle of mod-reg-rm uint8_t */
extern eErrorId scan(uint32_t ip, ICODE &p);
#define LH(p) ((int)((byte *)(p))[0] + ((int)((byte *)(p))[1] << 8))
static void rm(Int i);
static void modrm(Int i);
static void segrm(Int i);
static void data1(Int i);
static void data2(Int i);
static void regop(Int i);
static void segop(Int i);
static void strop(Int i);
static void escop(Int i);
static void axImp(Int i);
static void alImp(Int i);
static void axSrcIm(Int i);
static void memImp(Int i);
static void memReg0(Int i);
static void memOnly(Int i);
static void dispM(Int i);
static void dispS(Int i);
static void dispN(Int i);
static void dispF(Int i);
static void prefix(Int i);
static void immed(Int i);
static void shift(Int i);
static void arith(Int i);
static void trans(Int i);
static void const1(Int i);
static void const3(Int i);
static void none1(Int i);
static void none2(Int i);
static void checkInt(Int i);
/* Extracts reg bits from middle of mod-reg-rm byte */
#define REG(x) ((byte)(x & 0x38) >> 3)
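
`REG(x)` masks bits 3..5 of an x86 mod-reg-rm byte and shifts them down. A small sketch of that field extraction follows. The function names are hypothetical, and `modField` and `rmField` are added only to show the two neighbouring fields; they are not part of this header.

```cpp
#include <cassert>
#include <cstdint>

// mod-reg-rm layout: bits 7..6 = mod, bits 5..3 = reg, bits 2..0 = rm.
inline unsigned modField(uint8_t modrm) { return modrm >> 6; }
inline unsigned regField(uint8_t modrm) { return (modrm & 0x38u) >> 3; } // same as REG(x)
inline unsigned rmField(uint8_t modrm)  { return modrm & 0x07u; }
```

Unlike the macro, the function form evaluates its argument once and is not affected by operator precedence in the caller's expression.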


@ -2,38 +2,18 @@
* dcc project header
* (C) Cristina Cifuentes, Mike van Emmerik
****************************************************************************/
#pragma once
#include <stdint.h>
#include <cstring>
#include "machine_x86.h"
/* STATE TABLE */
struct STATE
typedef struct
{
uint32_t IP; /* Offset into Image */
uint16_t r[INDEX_BX_SI]; /* Value of segs and AX */
bool f[INDEX_BX_SI]; /* True if r[.] has a value */
dword IP; /* Offset into Image */
int16 r[INDEXBASE]; /* Value of segs and AX */
byte f[INDEXBASE]; /* True if r[.] has a value */
struct
{ /* For case stmt indexed reg */
uint8_t regi; /* Last conditional jump */
int16_t immed; /* Contents of the previous register */
{ /* For case stmt indexed reg */
byte regi; /* Last conditional jump */
int16 immed; /* Contents of the previous register */
} JCond;
void setState(uint16_t reg, int16_t value);
void checkStartup();
bool isKnown(eReg v) {return f[v];}
void kill(eReg v) { f[v]=false;}
STATE() : IP(0)
{
JCond.regi=0;
JCond.immed=0;
memset(r,0,sizeof(int16_t)*INDEX_BX_SI); //TODO: move this to machine_x86
memset(f,0,sizeof(uint8_t)*INDEX_BX_SI);
}
void setMemoryByte(uint32_t addr,uint8_t val)
{
//TODO: make this into a full scale value tracking class !
}
};
} STATE;
typedef STATE *PSTATE;
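
`STATE` pairs a value array `r[]` with a flag array `f[]` that records whether each register currently holds a known value. A minimal sketch of that bookkeeping is shown below. `RegStateSketch` and `kNumRegs` are hypothetical names; the size 23 mirrors `INDEX_BX_SI` in the `eReg` enum, and `set()` only illustrates one plausible behaviour, since the body of `setState` is not part of this diff.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kNumRegs = 23; // assumption: same bound as INDEX_BX_SI

struct RegStateSketch
{
    std::array<uint16_t, kNumRegs> value{}; // plays the role of STATE::r
    std::array<bool, kNumRegs> known{};     // plays the role of STATE::f

    // Record a concrete value and mark the register as known.
    void set(int reg, uint16_t v) { value[reg] = v; known[reg] = true; }
    bool isKnown(int reg) const { return known[reg]; }
    // Forget the value, e.g. after an instruction with an unknown result.
    void kill(int reg) { known[reg] = false; }
};
```

Keeping the flags separate from the values lets zero remain a legitimate register value rather than a "not known" marker.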


@ -2,111 +2,42 @@
* Symbol table prototypes
* (C) Mike van Emmerik
*/
#pragma once
#include "Enums.h"
#include "types.h"
#include "msvc_fixes.h"
#include <QtCore/QString>
#include <string>
#include <stdint.h>
struct Expr;
struct AstIdent;
struct TypeContainer;
/* * * * * * * * * * * * * * * * * */
/* Symbol table structs and protos */
/* * * * * * * * * * * * * * * * * */
struct SymbolCommon
{
QString name; /* New name for this variable/symbol/argument */
int size; /* Size/maximum size */
hlType type; /* probable type */
eDuVal duVal; /* DEF, USE, VAL */
SymbolCommon() : size(0),type(TYPE_UNKNOWN)
{}
};
struct SYM : public SymbolCommon
{
typedef uint32_t tLabel;
SYM() : label(0),flg(0)
{
}
uint32_t label; /* physical address (20 bit) */
uint32_t flg; /* SEG_IMMED, IMPURE, WORD_OFF */
};
/* STACK FRAME */
struct STKSYM : public SymbolCommon
typedef struct
{
typedef int16_t tLabel;
Expr * actual=0; /* Expression tree of actual parameter */
AstIdent * regs=0; /* For register arguments only */
tLabel label=0; /* Immediate off from BP (+:args, -:params) */
uint8_t regOff=0; /* Offset is a register (e.g. SI, DI) */
bool hasMacro=false; /* This type needs a macro */
QString macro; /* Macro name */
bool invalid=false; /* Boolean: invalid entry in formal arg list*/
void setArgName(int i)
{
char buf[32];
sprintf (buf, "arg%d", i);
name = buf;
}
};
template<class T>
class SymbolTableCommon : public std::vector<T>
{
public:
typedef typename std::vector<T>::iterator iterator;
typedef typename std::vector<T>::const_iterator const_iterator;
iterator findByLabel(typename T::tLabel lab)
{
auto iter = std::find_if(this->begin(),this->end(),
[lab](T &s)->bool {return s.label==lab;});
return iter;
}
const_iterator findByLabel(typename T::tLabel lab) const
{
auto iter = std::find_if(this->begin(),this->end(),
[lab](const T &s)->bool {return s.label==lab;});
return iter;
}
char *pSymName; /* Ptr to symbolic name or comment */
dword symOff; /* Symbol image offset */
PPROC symProc; /* Procedure pointer */
word preHash; /* Hash value before the modulo */
word postHash; /* Hash value after the modulo */
word nextOvf; /* Next entry this hash bucket, or -1 */
word prevOvf; /* Back link in Ovf chain */
} SYMTABLE;
};
/* SYMBOL TABLE */
class SYMTAB : public SymbolTableCommon<SYM>
{
public:
void updateSymType(uint32_t symbol, const TypeContainer &tc);
SYM *updateGlobSym(uint32_t operand, int size, uint16_t duFlag, bool &inserted_new);
};
struct Function;
struct SYMTABLE
{
std::string pSymName; /* Ptr to symbolic name or comment */
uint32_t symOff; /* Symbol image offset */
Function *symProc; /* Procedure pointer */
SYMTABLE() : symOff(0),symProc(0) {}
SYMTABLE(uint32_t _sym,Function *_proc) : symOff(_sym),symProc(_proc)
{}
bool operator == (const SYMTABLE &other) const
{
// does not use pSymName, to ease finding by symOff/symProc combo
// in map<SYMTABLE,X>
return (symOff==other.symOff) and symProc==(other.symProc);
}
};
enum tableType /* The table types */
enum _tableType /* The table types */
{
Label=0, /* The label table */
Comment /* The comment table */
};
constexpr int NUM_TABLE_TYPES = int(Comment)+1; /* Number of entries: must be last */
Comment, /* The comment table */
NUM_TABLE_TYPES /* Number of entries: must be last */
};
typedef enum _tableType tableType; /* For convenience */
void createSymTables(void);
void destroySymTables(void);
bool readVal (QTextStream & symName, uint32_t symOff, Function *symProc);
void enterSym(char *symName, dword symOff, PPROC symProc, boolT bSymToo);
boolT readSym (char *symName, dword *pSymOff, PPROC *pSymProc);
boolT readVal (char *symName, dword symOff, PPROC symProc);
void deleteSym(char *symName);
void deleteVal(dword symOff, PPROC symProc, boolT bSymToo);
boolT findVal(dword symOff, PPROC symProc, word *pIndex);
word symHash(char *name, word *pre);
word valHash(dword off, PPROC proc, word *pre);
void selectTable(tableType); /* Select a particular table */
char *addStrTbl(char *pStr); /* Add string to string table */


@ -1,21 +1,34 @@
/*
***************************************************************************
/****************************************************************************
* dcc project general header
* (C) Cristina Cifuentes, Mike van Emmerik
***************************************************************************
*/
#pragma once
#include "Enums.h"
#include "msvc_fixes.h"
#include <cassert>
#include <stdint.h>
#include <stdlib.h>
****************************************************************************/
/**** Common definitions and macros ****/
#ifdef __MSDOS__ /* Intel: 16 bit integer */
typedef long Int; /* Int: 0x80000000..0x7FFFFFFF */
typedef unsigned long flags32; /* 32 bits */
typedef unsigned long dword; /* 32 bits */
#define MAX 0x7FFFFFFF
#else /* Unix: 32 bit integer */
typedef int Int; /* Int: 0x80000000..0x7FFFFFFF */
typedef unsigned int flags32; /* 32 bits */
typedef unsigned int dword; /* 32 bits */
#define MAX 0x7FFFFFFF
#endif
/* Type definitions used in the program */
typedef unsigned char byte; /* 8 bits */
typedef unsigned short word;/* 16 bits */
typedef short int16; /* 16 bits */
typedef unsigned char boolT; /* 8 bits */
#if defined(__MSDOS__) | defined(WIN32)
#define unlink _unlink // Compiler is picky about non Ansi names
#endif
#define TRUE 1
#define FALSE 0
#define SYNTHESIZED_MIN 0x100000 /* Synthesized labs use bits 21..32 */
@ -24,17 +37,19 @@
#define PATLEN 23 /* Length of proc patterns */
#define WILD 0xF4 /* The wild byte */
/* MACROS */
/****** MACROS *******/
// Macro reads a LH word from the image regardless of host convention
// Returns a 16 bit quantity, e.g. C000 is read into an Int as C000
/* Macro to allocate a node of size sizeof(structType). */
#define allocStruc(structType) (structType *)allocMem(sizeof(structType))
/* Macro reads a LH word from the image regardless of host convention */
/* Returns a 16 bit quantity, e.g. C000 is read into an Int as C000 */
//#define LH(p) ((int16)((byte *)(p))[0] + ((int16)((byte *)(p))[1] << 8))
#define LH(p) ((uint16_t)((uint8_t *)(p))[0] + ((uint16_t)((uint8_t *)(p))[1] << 8))
#define LH(p) ((word)((byte *)(p))[0] + ((word)((byte *)(p))[1] << 8))
/* Macro reads a LH word from the image regardless of host convention */
/* Returns a signed quantity, e.g. C000 is read into an Int as FFFFC000 */
#define LH_SIGNED(p) (((uint8_t *)(p))[0] + (((char *)(p))[1] << 8))
#define LHS(p) (((byte *)(p))[0] + (((char *)(p))[1] << 8))
/* Macro tests bit b for type t in prog.map */
#define BITMAP(b, t) (prog.map[(b) >> 2] & ((t) << (((b) & 3) << 1)))
@ -42,85 +57,3 @@
/* Macro to convert a segment, offset definition into a 20 bit address */
#define opAdr(seg,off) ((seg << 4) + off)
/* duVal FLAGS */
struct eDuVal
{
eDuVal()
{
def=use=val=0;
}
enum flgs
{
DEF=1,
USE=2,
VAL=4
};
uint8_t def :1; //!< Variable was first defined, then used
uint8_t use :1; //!< Variable was first used, then defined
uint8_t val :1; /* Variable has an initial value. 2 cases:
1. When variable is used first (ie. global)
2. When a value is moved into the variable
for the first time.
*/
void setFlags(uint16_t x)
{
def = x&DEF;
use = x&USE;
val = x&VAL;
}
bool isUSE_VAL() {return use and val;} //Use and Val
};
static constexpr const char * hlTypes[13] = {
"",
"char",
"unsigned char",
"int",
"unsigned int",
"long",
"unsigned long",
"record",
"int *",
"char *",
"",
"float",
"double"
};
struct TypeContainer
{
hlType m_type;
size_t m_size;
TypeContainer(hlType t,size_t sz) : m_type(t),m_size(sz)
{
}
static size_t typeSize(hlType t)
{
switch(t)
{
case TYPE_WORD_SIGN: case TYPE_WORD_UNSIGN:
return 2;
case TYPE_BYTE_SIGN: case TYPE_BYTE_UNSIGN:
return 1;
case TYPE_LONG_SIGN: case TYPE_LONG_UNSIGN:
return 4;
case TYPE_FLOAT:
return 4;
default:
return ~0;
}
return 0;
}
static hlType defaultTypeForSize(size_t x)
{
/* Type of the symbol according to the number of bytes it uses */
static hlType cbType[] = {TYPE_UNKNOWN, TYPE_BYTE_UNSIGN, TYPE_WORD_SIGN,
TYPE_UNKNOWN, TYPE_LONG_SIGN};
assert(x < sizeof(cbType)/sizeof(hlType));
return cbType[x];
}
static constexpr const char *typeName(hlType t)
{
return hlTypes[t];
}
};
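
Two macros in this header do address arithmetic worth spelling out: `LHS` reads a little-endian word as a signed quantity, and `opAdr` turns a segment:offset pair into a 20-bit physical address. The sketch below restates both as functions under hypothetical names. It sign-extends the high byte explicitly instead of relying on `char` being signed, which is a deliberate deviation from the macro.

```cpp
#include <cassert>
#include <cstdint>

// Signed little-endian read: the high byte carries the sign.
inline int readLHSigned(const unsigned char *p)
{
    int hi = p[1];
    if (hi >= 128)
        hi -= 256; // explicit sign extension of the high byte
    return hi * 256 + p[0];
}

// Real-mode address: segment shifted left by 4 bits, plus the offset.
inline uint32_t physAddr(uint16_t seg, uint16_t off)
{
    return (static_cast<uint32_t>(seg) << 4) + off;
}
```

With bytes `00 C0` the signed read gives -16384, i.e. `FFFFC000` in a 32-bit `Int`, as the macro's comment describes; `physAddr(0x1234, 0x0010)` gives `0x12350`.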


@ -14,10 +14,8 @@ def perform_test(exepath,filepath,outname,args)
filepath=path_local(filepath)
joined_args = args.join(' ')
printf("calling:" + "#{exepath} -a1 #{joined_args} -o#{output_path}.a1 #{filepath}\n")
STDERR << "Errors for : #{filepath}\n"
result = `#{exepath} -a 1 -o#{output_path}.a1 #{filepath}`
result = `#{exepath} -a 2 #{joined_args} -o#{output_path}.a2 #{filepath}`
result = `#{exepath} #{joined_args} -o#{output_path} #{filepath}`
result = `#{exepath} -a1 -o#{output_path}.a1 #{filepath}`
result = `#{exepath} -a2 #{joined_args} -o#{output_path}.a2 #{filepath}`
puts result
p $?
end


@ -1,439 +0,0 @@
#include "BasicBlock.h"
#include "msvc_fixes.h"
#include "Procedure.h"
#include "dcc.h"
#include "msvc_fixes.h"
#include <QtCore/QTextStream>
#include <cassert>
#include <string>
#include <boost/range/rbegin.hpp>
#include <boost/range/rend.hpp>
using namespace std;
using namespace boost;
BB *BB::Create(void */*ctx*/, const string &/*s*/, Function *parent, BB */*insertBefore*/)
{
BB *pnewBB = new BB;
pnewBB->Parent = parent;
return pnewBB;
}
/**
* @arg start - basic block starts here, might be parent->Icode.end()
* @arg fin - last of basic block's instructions
*/
BB *BB::Create(const rCODE &r,eBBKind _nodeType, Function *parent)
{
BB* pnewBB;
pnewBB = new BB;
pnewBB->nodeType = _nodeType; /* Initialise */
pnewBB->immedDom = NO_DOM;
pnewBB->loopHead = pnewBB->caseHead = pnewBB->caseTail =
pnewBB->latchNode= pnewBB->loopFollow = NO_NODE;
pnewBB->instructions = r;
/* Mark the basic block to which the icodes belong, but only for
* real code basic blocks (i.e. not interval bbs) */
if(parent)
{
int addr = pnewBB->begin()->loc_ip;
//setInBB should automatically handle if our range is empty
parent->Icode.SetInBB(pnewBB->instructions, pnewBB);
assert(parent->m_ip_to_bb.find(addr)==parent->m_ip_to_bb.end());
parent->m_ip_to_bb[addr] = pnewBB;
parent->m_actual_cfg.push_back(pnewBB);
pnewBB->Parent = parent;
if ( r.begin() != parent->Icode.end() ) /* Only for code BB's */
stats.numBBbef++;
}
return pnewBB;
}
BB *BB::CreateIntervalBB(Function *parent)
{
iICODE endOfParent = parent->Icode.end();
return Create(make_iterator_range(endOfParent,endOfParent),INTERVAL_NODE,nullptr);
}
static const char *const s_nodeType[] = {"branch", "if", "case", "fall", "return", "call",
"loop", "repeat", "interval", "cycleHead",
"caseHead", "terminate",
"nowhere" };
static const char *const s_loopType[] = {"noLoop", "while", "repeat", "loop", "for"};
void BB::display()
{
printf("\nnode type = %s, ", s_nodeType[nodeType]);
printf("start = %d, length = %zd, #out edges = %zd\n", begin()->loc_ip, size(), edges.size());
for (size_t i = 0; i < edges.size(); i++)
{
if(edges[i].BBptr==nullptr)
printf(" outEdge[%2zd] = Unlinked out edge to %d\n",i, edges[i].ip);
else
printf(" outEdge[%2zd] = %d\n",i, edges[i].BBptr->begin()->loc_ip);
}
}
/*****************************************************************************
* displayDfs - Displays the CFG using a depth first traversal
****************************************************************************/
void BB::displayDfs()
{
int i;
assert(this);
traversed = DFS_DISP;
printf("node type = %s, ", s_nodeType[nodeType]);
printf("start = %d, length = %zd, #in-edges = %zd, #out-edges = %zd\n",
begin()->loc_ip, size(), inEdges.size(), edges.size());
printf("dfsFirst = %d, dfsLast = %d, immed dom = %d\n",
dfsFirstNum, dfsLastNum,
immedDom == MAX ? -1 : immedDom);
printf("loopType = %s, loopHead = %d, latchNode = %d, follow = %d\n",
s_loopType[(int)loopType],
loopHead == MAX ? -1 : loopHead,
latchNode == MAX ? -1 : latchNode,
loopFollow == MAX ? -1 : loopFollow);
printf ("ifFollow = %d, caseHead = %d, caseTail = %d\n",
ifFollow == MAX ? -1 : ifFollow,
caseHead == MAX ? -1 : caseHead,
caseTail == MAX ? -1 : caseTail);
if (nodeType == INTERVAL_NODE)
printf("corresponding interval = %d\n", correspInt->numInt);
else
{
int edge_idx=0;
for(BB *node : inEdges)
{
printf (" inEdge[%d] = %d\n", edge_idx, node->begin()->loc_ip);
edge_idx++;
}
}
/* Display out edges information */
i=0;
for(TYPEADR_TYPE &edg : edges)
{
if (nodeType == INTERVAL_NODE)
printf(" outEdge[%d] = %d\n", i, edg.BBptr->correspInt->numInt);
else
printf(" outEdge[%d] = %d\n", i, edg.BBptr->begin()->loc_ip);
++i;
}
printf("----\n");
/* Recursive call on successors of current node */
for(TYPEADR_TYPE &pb : edges)
{
if (pb.BBptr->traversed != DFS_DISP)
pb.BBptr->displayDfs();
}
}
/** Recursive procedure that writes the code for the given procedure, pointed
to by pBB.
\param indLevel indentation level - used for formatting.
\param numLoc: last # assigned to local variables
*/
ICODE* BB::writeLoopHeader(int &indLevel, Function* pProc, int *numLoc, BB *&latch, bool &repCond)
{
if(loopType == eNodeHeaderType::NO_TYPE)
return nullptr;
latch = pProc->m_dfsLast[this->latchNode];
QString ostr_contents;
QTextStream ostr(&ostr_contents);
ICODE* picode;
switch (loopType)
{
case eNodeHeaderType::WHILE_TYPE:
picode = &this->back();
/* Check for error in while condition */
if (picode->hl()->opcode != HLI_JCOND)
reportError (WHILE_FAIL);
/* Check if condition is more than 1 HL instruction */
if (numHlIcodes > 1)
{
/* Write the code for this basic block */
writeBB(ostr,indLevel, pProc, numLoc);
repCond = true;
}
/* Condition needs to be inverted if the loop body is along
* the THEN path of the header node */
if (edges[ELSE].BBptr->dfsLastNum == loopFollow)
{
picode->hlU()->replaceExpr(picode->hl()->expr()->inverse());
}
{
QString e=picode->hl()->expr()->walkCondExpr (pProc, numLoc);
ostr << "\n"<<indentStr(indLevel)<<"while ("<<e<<") {\n";
}
picode->invalidate();
break;
case eNodeHeaderType::REPEAT_TYPE:
ostr << "\n"<<indentStr(indLevel)<<"do {\n";
picode = &latch->back();
picode->invalidate();
break;
case eNodeHeaderType::ENDLESS_TYPE:
ostr << "\n"<<indentStr(indLevel)<<"for (;;) {\n";
picode = &latch->back();
break;
}
ostr.flush();
cCode.appendCode(ostr_contents);
stats.numHLIcode += 1;
indLevel++;
return picode;
}
bool BB::isEndOfPath(int latch_node_idx) const
{
return nodeType == RETURN_NODE or nodeType == TERMINATE_NODE or
nodeType == NOWHERE_NODE or dfsLastNum == latch_node_idx;
}
/** Recursive procedure that writes the code for this basic block and the
blocks reachable from it.
\param indLevel indentation level - used for formatting.
\param pProc procedure being written.
\param numLoc last # assigned to local variables
\param _latchNode dfsLast index of the latching node of the enclosing loop
\param _ifFollow dfsLast index of the follow node of the enclosing if, or UN_INIT
*/
void BB::writeCode (int indLevel, Function * pProc , int *numLoc,int _latchNode, int _ifFollow)
{
int follow; /* ifFollow */
BB * succ, *latch; /* Successor and latching node */
ICODE * picode; /* Pointer to HLI_JCOND instruction */
QString l; /* Text of the HLI_JCOND expression */
bool emptyThen, /* THEN clause is empty */
repCond; /* Repeat condition for while() */
/* Check if this basic block should be analysed */
if ((_ifFollow != UN_INIT) and (this == pProc->m_dfsLast[_ifFollow]))
return;
if (wasTraversedAtLevel(DFS_ALPHA))
return;
traversed = DFS_ALPHA;
/* Check for start of loop */
repCond = false;
latch = nullptr;
picode=writeLoopHeader(indLevel, pProc, numLoc, latch, repCond);
/* Write the code for this basic block */
if (repCond == false)
{
QString ostr_contents;
QTextStream ostr(&ostr_contents);
writeBB(ostr,indLevel, pProc, numLoc);
ostr.flush();
cCode.appendCode(ostr_contents);
}
/* Check for end of path */
if (isEndOfPath(_latchNode))
return;
/* Check type of loop/node and process code */
if ( loopType!=eNodeHeaderType::NO_TYPE ) /* there is a loop */
{
assert(latch);
if (this != latch) /* loop is over several bbs */
{
if (loopType == eNodeHeaderType::WHILE_TYPE)
{
succ = edges[THEN].BBptr;
if (succ->dfsLastNum == loopFollow)
succ = edges[ELSE].BBptr;
}
else
succ = edges[0].BBptr;
if (succ->traversed != DFS_ALPHA)
succ->writeCode (indLevel, pProc, numLoc, latch->dfsLastNum,_ifFollow);
else /* has been traversed so we need a goto */
succ->front().ll()->emitGotoLabel (indLevel);
}
/* Loop epilogue: generate the loop trailer */
indLevel--;
if (loopType == eNodeHeaderType::WHILE_TYPE)
{
QString ostr_contents;
QTextStream ostr(&ostr_contents);
/* Check if there is need to repeat other statements involved
* in while condition, then, emit the loop trailer */
if (repCond)
{
writeBB(ostr,indLevel+1, pProc, numLoc);
}
ostr <<indentStr(indLevel)<< "} /* end of while */\n";
ostr.flush();
cCode.appendCode(ostr_contents);
}
else if (loopType == eNodeHeaderType::ENDLESS_TYPE)
cCode.appendCode( "%s} /* end of loop */\n",indentStr(indLevel));
else if (loopType == eNodeHeaderType::REPEAT_TYPE)
{
QString e = "//*failed*//";
if (picode->hl()->opcode != HLI_JCOND)
{
reportError (REPEAT_FAIL);
}
else
{
e=picode->hl()->expr()->walkCondExpr (pProc, numLoc);
}
cCode.appendCode( "%s} while (%s);\n", indentStr(indLevel),qPrintable(e));
}
/* Recurse on the loop follow */
if (loopFollow != MAX)
{
succ = pProc->m_dfsLast[loopFollow];
if (succ->traversed != DFS_ALPHA)
succ->writeCode (indLevel, pProc, numLoc, _latchNode, _ifFollow);
else /* has been traversed so we need a goto */
succ->front().ll()->emitGotoLabel (indLevel);
}
}
else /* no loop, process nodeType of the graph */
{
if (nodeType == TWO_BRANCH) /* if-then[-else] */
{
stats.numHLIcode++;
indLevel++;
emptyThen = false;
if (ifFollow != MAX) /* there is a follow */
{
/* process the THEN part */
follow = ifFollow;
succ = edges[THEN].BBptr;
if (succ->traversed != DFS_ALPHA) /* not visited */
{
if (succ->dfsLastNum != follow) /* THEN part */
{
l = writeJcond ( *back().hl(), pProc, numLoc);
cCode.appendCode( "\n%s%s", indentStr(indLevel-1), qPrintable(l));
succ->writeCode (indLevel, pProc, numLoc, _latchNode,follow);
}
else /* empty THEN part => negate ELSE part */
{
l = writeJcondInv ( *back().hl(), pProc, numLoc);
cCode.appendCode( "\n%s%s", indentStr(indLevel-1), qPrintable(l));
edges[ELSE].BBptr->writeCode (indLevel, pProc, numLoc, _latchNode, follow);
emptyThen = true;
}
}
else /* already visited => emit label */
succ->front().ll()->emitGotoLabel(indLevel);
/* process the ELSE part */
succ = edges[ELSE].BBptr;
if (succ->traversed != DFS_ALPHA) /* not visited */
{
if (succ->dfsLastNum != follow) /* ELSE part */
{
cCode.appendCode( "%s}\n%selse {\n",
indentStr(indLevel-1), indentStr(indLevel - 1));
succ->writeCode (indLevel, pProc, numLoc, _latchNode, follow);
}
/* else (empty ELSE part) */
}
else if (not emptyThen) /* already visited => emit label */
{
cCode.appendCode( "%s}\n%selse {\n",
indentStr(indLevel-1), indentStr(indLevel - 1));
succ->front().ll()->emitGotoLabel (indLevel);
}
cCode.appendCode( "%s}\n", indentStr(--indLevel));
/* Continue with the follow */
succ = pProc->m_dfsLast[follow];
if (succ->traversed != DFS_ALPHA)
succ->writeCode (indLevel, pProc, numLoc, _latchNode,_ifFollow);
}
else /* no follow => if..then..else */
{
l = writeJcond ( *back().hl(), pProc, numLoc);
cCode.appendCode( "%s%s", indentStr(indLevel-1), qPrintable(l));
edges[THEN].BBptr->writeCode (indLevel, pProc, numLoc, _latchNode, _ifFollow);
cCode.appendCode( "%s}\n%selse {\n", indentStr(indLevel-1), indentStr(indLevel - 1));
edges[ELSE].BBptr->writeCode (indLevel, pProc, numLoc, _latchNode, _ifFollow);
cCode.appendCode( "%s}\n", indentStr(--indLevel));
}
}
else /* fall, call, 1w */
{
succ = edges[0].BBptr; /* fall-through edge */
assert(succ->size()>0);
if (succ->traversed != DFS_ALPHA)
{
succ->writeCode (indLevel, pProc,numLoc, _latchNode,_ifFollow);
}
}
}
}
/* Writes the code for the current basic block.
* Args: ostr: stream that receives the generated code.
* lev: indentation level - used for formatting.
* pProc: procedure this basic block belongs to.
* numLoc: last # assigned to local variables. */
void BB::writeBB(QTextStream &ostr,int lev, Function * pProc, int *numLoc)
{
/* Save the index into the code table in case there is a later goto
* into this instruction (first instruction of the BB) */
front().ll()->codeIdx = cCode.code.nextIdx();
/* Generate code for each hlicode that is not a HLI_JCOND */
for(ICODE &pHli : instructions)
{
if ((pHli.type == HIGH_LEVEL_ICODE) and ( pHli.valid() )) //TODO: use filtering range here.
{
QString line = pHli.hl()->write1HlIcode(pProc, numLoc);
if (not line.isEmpty())
{
ostr<<indentStr(lev)<<line;
stats.numHLIcode++;
}
if (option.verbose)
pHli.writeDU();
}
}
}
iICODE BB::begin()
{
return instructions.begin();
}
iICODE BB::end() const
{
return instructions.end();
}
ICODE &BB::back()
{
return instructions.back();
}
size_t BB::size()
{
return distance(instructions.begin(),instructions.end());
}
ICODE &BB::front()
{
return instructions.front();
}
riICODE BB::rbegin()
{
return riICODE( instructions.end() );
}
riICODE BB::rend()
{
return riICODE( instructions.begin() );
}
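
The loop handling in writeLoopHeader() and writeCode() above pairs each loop type with a header string and a trailer string. A minimal standalone sketch of that pairing follows. It uses std::string rather than QString, and `LoopKind`, `indent`, `loopHeader` and `loopTrailer` are hypothetical names introduced only for illustration; the four-space indent is an assumption, not something this file defines.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for eNodeHeaderType; not the dcc definition.
enum class LoopKind { While, Repeat, Endless };

// Assumption: one indentation level is four spaces; level must be >= 0.
std::string indent(int level)
{
    return std::string(static_cast<std::string::size_type>(level) * 4, ' ');
}

// Text written when the loop is opened. Only a while loop needs its
// condition at this point; the other kinds ignore cond.
std::string loopHeader(LoopKind k, int level, const std::string &cond)
{
    switch (k) {
    case LoopKind::While:   return "\n" + indent(level) + "while (" + cond + ") {\n";
    case LoopKind::Repeat:  return "\n" + indent(level) + "do {\n";
    case LoopKind::Endless: return "\n" + indent(level) + "for (;;) {\n";
    }
    return std::string();
}

// Text written when the loop is closed. Only a repeat loop needs its
// condition here, because do { } while (cond); tests at the bottom.
std::string loopTrailer(LoopKind k, int level, const std::string &cond)
{
    switch (k) {
    case LoopKind::While:   return indent(level) + "} /* end of while */\n";
    case LoopKind::Repeat:  return indent(level) + "} while (" + cond + ");\n";
    case LoopKind::Endless: return indent(level) + "} /* end of loop */\n";
    }
    return std::string();
}
```

With these helpers, `loopHeader(LoopKind::While, 1, "x < 10")` yields a newline followed by `    while (x < 10) {`, and `loopTrailer(LoopKind::Repeat, 1, "x != 0")` yields `    } while (x != 0);`. The real code additionally inverts the while condition when the loop body lies on the THEN path, which this sketch leaves out.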

View File

@ -1,13 +0,0 @@
SET(dcc_test_SOURCES
tests/comwrite.cpp
tests/project.cpp
tests/loader.cpp
)
include_directories(${GMOCK_INCLUDE_DIRS} ${GMOCK_ROOT}/gtest/include)
add_executable(tester ${dcc_test_SOURCES})
ADD_DEPENDENCIES(tester dcc_lib)
target_link_libraries(tester dcc_lib disasm_s
${GMOCK_BOTH_LIBRARIES} ${REQ_LLVM_LIBRARIES})
add_test(dcc-tests tester)

View File

@ -1,37 +0,0 @@
#include <ostream>
#include <cassert>
#include "CallConvention.h"
#include <QtCore/QTextStream>
CConv *CConv::create(Type v)
{
static C_CallingConvention *c_call = nullptr;
static Pascal_CallingConvention *p_call = nullptr;
static Unknown_CallingConvention *u_call= nullptr;
if(nullptr==c_call)
c_call = new C_CallingConvention;
if(nullptr==p_call)
p_call = new Pascal_CallingConvention;
if(nullptr==u_call)
u_call = new Unknown_CallingConvention;
switch(v) {
case eUnknown: return u_call;
case eCdecl: return c_call;
case ePascal: return p_call;
}
assert(false);
return nullptr;
}
void C_CallingConvention::writeComments(QTextStream & ostr)
{
ostr << " * C calling convention.\n";
}
void Pascal_CallingConvention::writeComments(QTextStream & ostr)
{
ostr << " * Pascal calling convention.\n";
}
void Unknown_CallingConvention::writeComments(QTextStream & ostr)
{
ostr << " * Unknown calling convention.\n";
}
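
CConv::create() above hands out one shared object per calling convention. The sketch below shows the same idea with function-local static objects in place of the heap-allocated pointers, which is a swapped-in variation and not what the file does. `Conv`, `CdeclConv`, `PascalConv` and `UnknownConv` are hypothetical names, and `comment()` returns a std::string only to keep the example free of Qt.

```cpp
#include <cassert>
#include <string>

// Hypothetical, Qt-free stand-in for CConv.
struct Conv {
    enum Type { Unknown, Cdecl, Pascal };
    virtual ~Conv() = default;
    virtual std::string comment() const = 0;
    static Conv *create(Type v);
};

struct CdeclConv : Conv {
    std::string comment() const override { return " * C calling convention.\n"; }
};
struct PascalConv : Conv {
    std::string comment() const override { return " * Pascal calling convention.\n"; }
};
struct UnknownConv : Conv {
    std::string comment() const override { return " * Unknown calling convention.\n"; }
};

Conv *Conv::create(Type v)
{
    // Function-local statics are constructed the first time create() runs
    // and destroyed at program exit, so nothing is leaked.
    static CdeclConv c_call;
    static PascalConv p_call;
    static UnknownConv u_call;
    switch (v) {
    case Unknown: return &u_call;
    case Cdecl:   return &c_call;
    case Pascal:  return &p_call;
    }
    return nullptr; // unreachable for valid enum values
}
```

Repeated calls with the same enumerator return the same pointer, so callers can compare conventions by address, as with the original factory.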

View File

@ -1,535 +0,0 @@
#include "DccFrontend.h"
#include "dcc.h"
#include "msvc_fixes.h"
#include "project.h"
#include "disassem.h"
#include "CallGraph.h"
#include <QtCore/QFileInfo>
#include <QtCore/QDebug>
#include <cstdio>
#include <cstdlib> /* malloc */
#include <cstring> /* strcmp, memset */
class Loader
{
bool loadIntoProject(IProject *);
};
struct PSP { /* PSP structure */
uint16_t int20h; /* interrupt 20h */
uint16_t eof; /* segment, end of allocation block */
uint8_t res1; /* reserved */
uint8_t dosDisp[5]; /* far call to DOS function dispatcher */
uint8_t int22h[4]; /* vector for terminate routine */
uint8_t int23h[4]; /* vector for ctrl+break routine */
uint8_t int24h[4]; /* vector for error routine */
uint8_t res2[22]; /* reserved */
uint16_t segEnv; /* segment address of environment block */
uint8_t res3[34]; /* reserved */
uint8_t int21h[6]; /* opcode for int21h and far return */
uint8_t res4[6]; /* reserved */
uint8_t fcb1[16]; /* default file control block 1 */
uint8_t fcb2[16]; /* default file control block 2 */
uint8_t res5[4]; /* reserved */
uint8_t cmdTail[0x80]; /* command tail and disk transfer area */
};
static struct MZHeader { /* EXE file header */
uint8_t sigLo; /* .EXE signature: 0x4D 0x5A */
uint8_t sigHi;
uint16_t lastPageSize; /* Size of the last page */
uint16_t numPages; /* Number of pages in the file */
uint16_t numReloc; /* Number of relocation items */
uint16_t numParaHeader; /* # of paragraphs in the header */
uint16_t minAlloc; /* Minimum number of paragraphs */
uint16_t maxAlloc; /* Maximum number of paragraphs */
uint16_t initSS; /* Segment displacement of stack */
uint16_t initSP; /* Contents of SP at entry */
uint16_t checkSum; /* Complemented checksum */
uint16_t initIP; /* Contents of IP at entry */
uint16_t initCS; /* Segment displacement of code */
uint16_t relocTabOffset; /* Relocation table offset */
uint16_t overlayNum; /* Overlay number */
} header;
#define EXE_RELOCATION 0x10 /* EXE images relocated to above PSP */
//static void LoadImage(char *filename);
static void displayMemMap(void);
/****************************************************************************
* displayLoadInfo - Displays low level loader type info.
***************************************************************************/
void PROG::displayLoadInfo(void)
{
int i;
printf("File type is %s\n", (fCOM)?"COM":"EXE");
if (not fCOM) {
printf("Signature = %02X%02X\n", header.sigLo, header.sigHi);
printf("File size %% 512 = %04X\n", LH(&header.lastPageSize));
printf("File size / 512 = %04X pages\n", LH(&header.numPages));
printf("# relocation items = %04X\n", LH(&header.numReloc));
printf("Offset to load image = %04X paras\n", LH(&header.numParaHeader));
printf("Minimum allocation = %04X paras\n", LH(&header.minAlloc));
printf("Maximum allocation = %04X paras\n", LH(&header.maxAlloc));
}
printf("Load image size = %08lX\n", (unsigned long)(cbImage - sizeof(PSP)));
printf("Initial SS:SP = %04X:%04X\n", initSS, initSP);
printf("Initial CS:IP = %04X:%04X\n", initCS, initIP);
if (option.VeryVerbose and cReloc)
{
printf("\nRelocation Table\n");
for (i = 0; i < cReloc; i++)
{
printf("%06X -> [%04X]\n", relocTable[i],LH(image() + relocTable[i]));
}
}
printf("\n");
}
/*****************************************************************************
* fill - Fills line for displayMemMap()
****************************************************************************/
static void fill(int ip, char *bf)
{
PROG &prog(Project::get()->prog);
static uint8_t type[4] = {'.', 'd', 'c', 'x'};
uint8_t i;
for (i = 0; i < 16; i++, ip++)
{
*bf++ = ' ';
*bf++ = (ip < prog.cbImage)? type[(prog.map[ip >> 2] >> ((ip & 3) * 2)) & 3]: ' ';
}
*bf = '\0';
}
/*****************************************************************************
* displayMemMap - Displays the memory bitmap
****************************************************************************/
static void displayMemMap(void)
{
PROG &prog(Project::get()->prog);
char c, b1[33], b2[33], b3[33];
uint8_t i;
int ip = 0;
printf("\nMemory Map\n");
while (ip < prog.cbImage)
{
fill(ip, b1);
printf("%06X %s\n", ip, b1);
ip += 16;
for (i = 3, c = b1[1]; i < 32 and c == b1[i]; i += 2)
; /* Check if all same */
if (i > 32)
{
fill(ip, b2); /* Skip until next two are not same */
fill(ip+16, b3);
if (not (strcmp(b1, b2) || strcmp(b1, b3)))
{
printf(" :\n");
do
{
ip += 16;
fill(ip+16, b1);
} while (0==strcmp(b1, b2));
}
}
}
printf("\n");
}
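
fill() and displayMemMap() above read a bitmap that stores one 2-bit tag per image byte, four tags to a map byte. The sketch below isolates that packing. `makeMap`, `getTag`, `setTag` and `tagChar` are hypothetical helpers; `setTag` in particular does not appear in this file, and the map is zero-filled here on the assumption that tag 0 is the "unknown" state (the value of BM_UNKNOWN is not shown in this section).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One map byte covers four image bytes, so round the size up.
std::vector<std::uint8_t> makeMap(std::size_t imageSize)
{
    return std::vector<std::uint8_t>((imageSize + 3) / 4, 0);
}

// Same indexing as fill(): byte ip >> 2, bit position (ip & 3) * 2.
unsigned getTag(const std::vector<std::uint8_t> &map, std::size_t ip)
{
    return (map[ip >> 2] >> ((ip & 3) * 2)) & 3u;
}

// Hypothetical writer: clear the two bits for ip, then store the new tag.
void setTag(std::vector<std::uint8_t> &map, std::size_t ip, unsigned tag)
{
    const unsigned shift = static_cast<unsigned>((ip & 3) * 2);
    map[ip >> 2] = static_cast<std::uint8_t>(
        (map[ip >> 2] & ~(3u << shift)) | ((tag & 3u) << shift));
}

// Display characters taken from the type[] table in fill().
char tagChar(unsigned tag)
{
    static const char type[4] = {'.', 'd', 'c', 'x'};
    return type[tag & 3u];
}
```

For a 5-byte image the map needs 2 bytes. Storing tag 2 at address 0 and tag 1 at address 1 leaves the first map byte holding 0x06, and address 0 is then displayed as 'c'.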
DccFrontend::DccFrontend(QObject *parent) :
QObject(parent)
{
}
/*****************************************************************************
* FrontEnd - invokes the loader, parser, disassembler (if asm1), icode
* rewriter, and displays any useful information.
****************************************************************************/
bool DccFrontend::FrontEnd ()
{
/* Do depth first flow analysis building call graph and procedure list,
* and attaching the I-code to each procedure */
parse (*Project::get());
if (option.asm1)
{
qWarning() << "dcc: writing assembler file "<<asm1_name<<'\n';
}
/* Search through code looking for impure references and flag them */
Disassembler ds(1);
for(Function &f : Project::get()->pProcList)
{
f.markImpure();
if (option.asm1)
{
ds.disassem(&f);
}
}
if (option.Interact)
{
interactDis(&Project::get()->pProcList.front(), 0); /* Interactive disassembler */
}
/* Converts jump target addresses to icode offsets */
for(Function &f : Project::get()->pProcList)
{
f.bindIcodeOff();
}
/* Print memory bitmap */
if (option.Map)
displayMemMap();
return(true); // we no longer own proj !
}
struct DosLoader {
protected:
void prepareImage(PROG &prog,size_t sz,QFile &fp) {
/* Allocate a block of memory for the program. */
prog.cbImage = sz + sizeof(PSP);
prog.Imagez = new uint8_t [prog.cbImage];
prog.Imagez[0] = 0xCD; /* Fill in PSP int 20h location */
prog.Imagez[1] = 0x20; /* for termination checking */
/* Read in the image past where a PSP would go */
if (sz != fp.read((char *)prog.Imagez + sizeof(PSP),sz))
fatalError(CANNOT_READ, fp.fileName().toLocal8Bit().data());
}
};
struct ComLoader : public DosLoader {
bool canLoad(QFile &fp) {
fp.seek(0);
char sig[2];
if(2==fp.read(sig,2)) {
return not (sig[0] == 0x4D and sig[1] == 0x5A);
}
return false;
}
bool load(PROG &prog,QFile &fp) {
fp.seek(0);
/* COM file
* In this case the load module size is just the file length
*/
auto cb = fp.size();
/* COM programs start off with an ORG 100H (to leave room for a PSP)
* This is also the implied start address so if we load the image
* at offset 100H addresses should all line up properly again.
*/
prog.initCS = 0;
prog.initIP = 0x100;
prog.initSS = 0;
prog.initSP = 0xFFFE;
prog.cReloc = 0;
prepareImage(prog, cb, fp);
/* Set up memory map */
cb = (prog.cbImage + 3) / 4;
prog.map = (uint8_t *)malloc(cb);
memset(prog.map, BM_UNKNOWN, (size_t)cb);
return true;
}
};
#if 0
struct RomLoader {
bool canLoad(QFile &fp) {
fp.seek(0xFFF0);
uint8_t sig[1];
if(fp.read((char *)sig,1) == 1)
{
return (sig[0] == 0xEA);
}
return false;
}
bool load(PROG &prog,QFile &fp) {
printf("Loading ROM...\n");
fp.seek(0);
/* ROM file
* In this case the load module size is just the file length
*/
auto cb = fp.size();
fp.seek(cb - 0x10);
uint8_t buf[5];
printf("Going to get CS/IP...\n");
if(fp.read((char *)buf, 5) != 5)
{
return false;
}
fp.seek(0);
/* ROM file: hard to say where it is supposed to start, so try to trust the
*/
prog.initIP = (buf[2] << 8) | buf[1];
//prog.initCS = 0;
prog.initCS = (buf[4] << 8) | buf[3];
prog.initSS = 0;
prog.initSP = 0xFFFE;
prog.cReloc = 0;
prepareImage(prog, cb, fp);
/* Set up memory map */
cb = (prog.cbImage + 3) / 4;
prog.map = (uint8_t *)malloc(cb);
memset(prog.map, BM_UNKNOWN, (size_t)cb);
return true;
}
protected:
void prepareImage(PROG &prog, size_t sz, QFile &fp)
{
int32_t start = 0x100000 - sz;
/* Allocate a block of memory for the program. */
prog.cbImage = 1 * 1024 * 1024; /* Allocate the whole 1MB memory */
//prog.cbImage = 64 * 1024; /* Allocate the whole 1MB memory */
prog.Imagez = new uint8_t [prog.cbImage];
if (fp.read((char *)prog.Imagez + start, sz) != sz)
//if (fp.read((char *)prog.Imagez, sz) != sz)
{
fatalError(CANNOT_READ, fp.fileName().toLocal8Bit().data());
}
}
};
#else
struct RomLoader {
bool canLoad(QFile &fp) {
fp.seek(0xFFF0);
uint8_t sig[1];
if(fp.read((char *)sig,1) == 1)
{
return (sig[0] == 0xEA);
}
return false;
}
bool load(PROG &prog,QFile &fp) {
fp.seek(0);
/* ROM file
* In this case the load module size is just the file length
*/
auto cb = fp.size();
/* Unlike a COM program, the ROM image is loaded at offset 0 with no
* PSP in front of it, so the initial CS:IP is taken to be 0000:0000.
*/
prog.initCS = 0;
prog.initIP = 0x000;
prog.initSS = 0;
prog.initSP = 0xFFFE;
prog.cReloc = 0;
prepareImage(prog, cb, fp);
/* Set up memory map */
cb = (prog.cbImage + 3) / 4;
prog.map = (uint8_t *)malloc(cb);
memset(prog.map, BM_UNKNOWN, (size_t)cb);
return true;
}
protected:
void prepareImage(PROG &prog, size_t sz, QFile &fp)
{
/* Allocate a block of memory for the program. */
prog.cbImage = sz;
prog.Imagez = new uint8_t[prog.cbImage];
if (sz != fp.read((char *)prog.Imagez, sz))
fatalError(CANNOT_READ, fp.fileName().toLocal8Bit().data());
}
};
#endif
struct ExeLoader : public DosLoader {
bool canLoad(QFile &fp) {
if(fp.size()<sizeof(header))
return false;
MZHeader tmp_header;
fp.seek(0);
fp.read((char *)&tmp_header, sizeof(header));
if(not (tmp_header.sigLo == 0x4D and tmp_header.sigHi == 0x5A))
return false;
/* This is a typical DOS kludge! */
if (LH(&tmp_header.relocTabOffset) == 0x40) /* check the header just read, not the global one */
{
qDebug() << "Don't understand new EXE format";
return false;
}
return true;
}
bool load(PROG &prog,QFile &fp) {
/* Read rest of header */
fp.seek(0);
if (fp.read((char *)&header, sizeof(header)) != sizeof(header))
return false;
/* Calculate the load module size.
* This is the number of pages in the file
* less the length of the header and reloc table
* less the number of bytes unused on last page
*/
uint32_t cb = (uint32_t)LH(&header.numPages) * 512 - (uint32_t)LH(&header.numParaHeader) * 16;
if (header.lastPageSize)
{
cb -= 512 - LH(&header.lastPageSize);
}
/* We quietly ignore minAlloc and maxAlloc since for our
* purposes it doesn't really matter where in real memory
* the program would end up. EXE programs can't really rely on
* their load location so setting the PSP segment to 0 is fine.
* Certainly programs that prod around in DOS or BIOS are going
* to have to load DS from a constant so it'll be pretty
* obvious.
*/
prog.initCS = (int16_t)LH(&header.initCS) + EXE_RELOCATION;
prog.initIP = (int16_t)LH(&header.initIP);
prog.initSS = (int16_t)LH(&header.initSS) + EXE_RELOCATION;
prog.initSP = (int16_t)LH(&header.initSP);
prog.cReloc = (int16_t)LH(&header.numReloc);
/* Allocate the relocation table */
if (prog.cReloc)
{
prog.relocTable.resize(prog.cReloc);
fp.seek(LH(&header.relocTabOffset));
/* Read in seg:offset pairs and convert to Image ptrs */
uint8_t buf[4];
for (int i = 0; i < prog.cReloc; i++)
{
fp.read((char *)buf,4);
prog.relocTable[i] = LH(buf) + (((int)LH(buf+2) + EXE_RELOCATION)<<4);
}
}
/* Seek to start of image */
uint32_t start_of_image= LH(&header.numParaHeader) * 16;
fp.seek(start_of_image);
/* Allocate a block of memory for the program. */
prepareImage(prog,cb,fp);
/* Set up memory map */
cb = (prog.cbImage + 3) / 4;
prog.map = (uint8_t *)malloc(cb);
memset(prog.map, BM_UNKNOWN, (size_t)cb);
/* Relocate segment constants */
for(uint32_t v : prog.relocTable) {
uint8_t *p = &prog.Imagez[v];
uint16_t w = (uint16_t)LH(p) + EXE_RELOCATION;
*p++ = (uint8_t)(w & 0x00FF);
*p = (uint8_t)((w & 0xFF00) >> 8);
}
return true;
}
};
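
The arithmetic in ExeLoader::load() above can be checked in isolation. The sketch below reproduces the load-module size calculation and the conversion of a relocation entry to an image offset. `loadModuleSize` and `relocTarget` are hypothetical names, the arguments are assumed to be already decoded from the little-endian header, and no validation of inconsistent headers is attempted.

```cpp
#include <cassert>
#include <cstdint>

// Load-module size in bytes:
//   pages * 512, less the header (paragraphs * 16),
//   less the unused tail of the last page when lastPageSize is non-zero.
std::uint32_t loadModuleSize(std::uint16_t numPages,
                             std::uint16_t lastPageSize,
                             std::uint16_t numParaHeader)
{
    std::uint32_t cb = std::uint32_t(numPages) * 512
                     - std::uint32_t(numParaHeader) * 16;
    if (lastPageSize)
        cb -= 512u - lastPageSize;
    return cb;
}

// Image offset of a relocation entry given as offset:segment.
// The image sits above a 0x100-byte PSP, i.e. 0x10 paragraphs up,
// which is the EXE_RELOCATION constant in the loader.
std::uint32_t relocTarget(std::uint16_t off, std::uint16_t seg)
{
    const std::uint32_t kExeRelocation = 0x10;
    return off + ((std::uint32_t(seg) + kExeRelocation) << 4);
}
```

As a worked case, a file of 3 pages with 0x1A0 bytes on the last page and a 0x20-paragraph header gives 1536 - 512 - (512 - 416) = 928 bytes of load module. A relocation entry 0001:0012 maps to 0x12 + (0x11 << 4) = 0x122 in the image.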
/*****************************************************************************
* LoadImage
****************************************************************************/
bool Project::load()
{
// addTask(loaderSelection,PreCond(BinaryImage))
// addTask(applyLoader,PreCond(Loader))
const QByteArray fname_bytes = binary_path().toLocal8Bit(); /* keeps the buffer alive for fname */
const char *fname = fname_bytes.constData();
QFile finfo(binary_path());
/* Open the input file */
if(not finfo.open(QFile::ReadOnly)) {
fatalError(CANNOT_OPEN, fname);
}
/* The file must be longer than the 2-byte EXE signature */
if (finfo.size()<=2)
{
fatalError(CANNOT_READ, fname);
}
RomLoader rom_loader;
ComLoader com_loader;
ExeLoader exe_loader;
if(rom_loader.canLoad(finfo)) {
/* We have no relocation and the code should fit in 64K only,
* so let's treat it as a COM file
*/
prog.fCOM = true;
return rom_loader.load(prog,finfo);
}
if(exe_loader.canLoad(finfo)) {
prog.fCOM = false;
return exe_loader.load(prog,finfo);
}
if(com_loader.canLoad(finfo)) {
prog.fCOM = true;
return com_loader.load(prog,finfo);
}
return false;
}
uint32_t SynthLab;
/* Parses the program, builds the call graph, and returns the list of
* procedures found */
void DccFrontend::parse(Project &proj)
{
PROG &prog(proj.prog);
STATE state;
/* Set initial state */
state.setState(rES, 0); /* PSP segment */
state.setState(rDS, 0);
state.setState(rCS, prog.initCS);
state.setState(rSS, prog.initSS);
state.setState(rSP, prog.initSP);
state.IP = ((uint32_t)prog.initCS << 4) + prog.initIP;
SynthLab = SYNTHESIZED_MIN;
/* Check for special settings of initial state, based on idioms of the
startup code */
state.checkStartup();
ilFunction start_proc;
/* Make a struct for the initial procedure */
if (prog.offMain != -1)
{
start_proc = proj.createFunction(0,"main");
start_proc->retVal.loc = REG_FRAME;
start_proc->retVal.type = TYPE_WORD_SIGN;
start_proc->retVal.id.regi = rAX;
/* We know where main() is. Start the flow of control from there */
start_proc->procEntry = prog.offMain;
/* In medium and large models, the segment of main may (will?) not be
the same as the initial CS segment (of the startup code) */
state.setState(rCS, prog.segMain);
state.IP = prog.offMain;
}
else
{
start_proc = proj.createFunction(0,"start");
/* Create initial procedure at program start address */
start_proc->procEntry = (uint32_t)state.IP;
}
/* The state info is for the first procedure */
start_proc->state = state;
/* Set up call graph initial node */
proj.callGraph = new CALL_GRAPH;
proj.callGraph->proc = start_proc;
/* This proc needs to be called to set things up for LibCheck(), which
checks a proc to see if it is a known C (etc) library */
prog.bSigs = SetupLibCheck();
//BUG: proj and g_proj are 'live' at this point !
/* Recursively build entire procedure list */
start_proc->FollowCtrl(proj.callGraph, &state);
/* This proc needs to be called to clean things up from SetupLibCheck() */
CleanupLibCheck();
}

View File

@ -1,40 +0,0 @@
#include "Procedure.h"
#include "msvc_fixes.h"
#include "project.h"
#include "scanner.h"
//FunctionType *Function::getFunctionType() const
//{
// return &m_type;
//}
/* Does some heuristic pruning. Looks for ptrs. into the table
* and for addresses that don't appear to point to valid code.
*/
void JumpTable::pruneEntries(uint16_t cs)
{
PROG *prg(Project::get()->binary());
for (uint32_t i = start; i < finish; i += 2)
{
uint32_t target = cs + LH(&prg->image()[i]);
if (target < finish and target >= start)
finish = target;
else if (target >= (uint32_t)prg->cbImage)
finish = i;
}
ICODE _Icode; // used as scan input
for (uint32_t i = start; i < finish; i += 2)
{
uint32_t target = cs + LH(&prg->image()[i]);
/* Be wary of 00 00 as code - it's probably data */
if (not (prg->image()[target] or prg->image()[target+1]) or scan(target, _Icode))
finish = i;
}
}
void Function::callingConv(CConv::Type v) {
m_call_conv=CConv::create(v);
}
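
The first loop of JumpTable::pruneEntries() above shrinks the table whenever an entry points back into the table itself or past the end of the image. A minimal sketch of that pass on a plain byte vector follows. `readLE16` and `pruneTableEnd` are hypothetical names, and `readLE16` assumes that the LH macro reads a little-endian 16-bit word. The second pass, which depends on scan(), is not reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed behaviour of LH: little-endian 16-bit read.
std::uint16_t readLE16(const std::vector<std::uint8_t> &img, std::uint32_t at)
{
    return static_cast<std::uint16_t>(img[at] | (img[at + 1] << 8));
}

// Returns the pruned end of a table of 16-bit entries in [start, finish).
std::uint32_t pruneTableEnd(const std::vector<std::uint8_t> &img,
                            std::uint16_t cs,
                            std::uint32_t start, std::uint32_t finish)
{
    for (std::uint32_t i = start; i < finish; i += 2) {
        const std::uint32_t target = cs + readLE16(img, i);
        if (target < finish && target >= start)
            finish = target;   // entry points into the table: the table ends there
        else if (target >= img.size())
            finish = i;        // entry points outside the image: stop before it
    }
    return finish;
}
```

Given a 16-byte image whose first entries are 0x000A, 0x0006 and 0x000C, the second entry points at offset 6 inside the table, so a table initially assumed to span [0, 8) is pruned to [0, 6).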

View File

@ -1,122 +0,0 @@
#include "types.h"
#include "msvc_fixes.h"
#include "ast.h"
#include "bundle.h"
#include "machine_x86.h"
#include "project.h"
#include <stdint.h>
#include <string>
#include <sstream>
#include <iostream>
#include <cassert>
#include <boost/range/adaptor/filtered.hpp>
#include <boost/range.hpp>
//#include <boost/range/algorithm.hpp>
//#include <boost/assign.hpp>
using namespace std;
using namespace boost::adaptors;
RegisterNode::RegisterNode(const LLOperand &op, LOCAL_ID *locsym)
{
m_syms = locsym;
ident.type(REGISTER);
hlType type_sel;
regType reg_type;
if (op.byteWidth()==1)
{
type_sel = TYPE_BYTE_SIGN;
reg_type = BYTE_REG;
}
else /* uint16_t */
{
type_sel = TYPE_WORD_SIGN;
reg_type = WORD_REG;
}
regiIdx = locsym->newByteWordReg(type_sel, op.regi);
regiType = reg_type;
}
//RegisterNode::RegisterNode(eReg regi, uint32_t icodeFlg, LOCAL_ID *locsym)
//{
// ident.type(REGISTER);
// hlType type_sel;
// regType reg_type;
// if ((icodeFlg & B) or (icodeFlg & SRC_B))
// {
// type_sel = TYPE_BYTE_SIGN;
// reg_type = BYTE_REG;
// }
// else /* uint16_t */
// {
// type_sel = TYPE_WORD_SIGN;
// reg_type = WORD_REG;
// }
// regiIdx = locsym->newByteWordReg(type_sel, regi);
// regiType = reg_type;
//}
QString RegisterNode::walkCondExpr(Function *pProc, int *numLoc) const
{
QString codeOut;
QString o;
assert(&pProc->localId==m_syms);
ID *id = &pProc->localId.id_arr[regiIdx];
if (id->name[0] == '\0') /* no name */
{
id->setLocalName(++(*numLoc));
codeOut += QString("%1 %2; ").arg(TypeContainer::typeName(id->type)).arg(id->name);
codeOut += QString("/* %1 */\n").arg(Machine_X86::regName(id->id.regi));
}
if (id->hasMacro)
o += QString("%1(%2)").arg(id->macro).arg(id->name);
else
o += id->name;
cCode.appendDecl(codeOut);
return o;
}
int RegisterNode::hlTypeSize(Function *) const
{
if (regiType == BYTE_REG)
return 1;
else
return 2;
}
hlType RegisterNode::expType(Function *pproc) const
{
if (regiType == BYTE_REG)
return TYPE_BYTE_SIGN;
else
return TYPE_WORD_SIGN;
}
Expr *RegisterNode::insertSubTreeReg(Expr *_expr, eReg regi, const LOCAL_ID *locsym)
{
assert(locsym==m_syms);
eReg treeReg = locsym->id_arr[regiIdx].id.regi;
if (treeReg == regi) /* uint16_t reg */
{
return _expr;
}
else if(Machine_X86::isSubRegisterOf(treeReg,regi)) /* uint16_t/uint8_t reg */
{
return _expr;
}
return nullptr;
}
bool RegisterNode::xClear(rICODE range_to_check, iICODE lastBBinst, const LOCAL_ID &locId)
{
uint8_t regi = locId.id_arr[regiIdx].id.regi;
range_to_check.advance_begin(1);
auto all_valid_and_high_level_after_start = range_to_check | filtered(ICODE::select_valid_high_level);
for (ICODE &i : all_valid_and_high_level_after_start)
if (i.du.def.testRegAndSubregs(regi))
return false;
if (all_valid_and_high_level_after_start.end().base() != lastBBinst)
return true;
return false;
}

File diff suppressed because it is too large

View File

@ -4,54 +4,47 @@
* Purpose: Back-end module. Generates C code for each procedure.
* (C) Cristina Cifuentes
****************************************************************************/
#include "dcc.h"
#include "msvc_fixes.h"
#include "disassem.h"
#include "project.h"
#include "CallGraph.h"
#include <QtCore/QDir>
#include <QtCore/QFile>
#include <QtCore/QStringList>
#include <QtCore/QDebug>
#include <cassert>
#include <string>
#include <boost/range.hpp>
#include <boost/range/adaptor/filtered.hpp>
#include <boost/range/algorithm.hpp>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string.h>
#include <stdio.h>
using namespace boost;
using namespace boost::adaptors;
using namespace std;
bundle cCode; /* Procedure declaration and code */
/* Returns a unique index to the next label */
int getNextLabel()
{
static int labelIdx = 1; /* index of the next label */
return (labelIdx++);
/* Indentation buffer */
#define indSize 81 /* size of the indentation buffer. Each indentation
* is of 4 spaces => max. 20 indentation levels */
static char indentBuf[indSize] =
" ";
static char *indent (Int indLevel)
/* Indentation according to the depth of the statement */
{
return (&indentBuf[indSize-(indLevel*4)-1]);
}
static Int getNextLabel()
/* Returns a unique index to the next label */
{ static Int labelIdx = 1; /* index of the next label */
return (labelIdx++);
}
static void displayStats (PPROC pProc)
/* displays statistics on the subroutine */
void Function::displayStats ()
{
qDebug() << "\nStatistics - Subroutine" << name;
qDebug() << "Number of Icode instructions:";
qDebug() << " Low-level :" << stats.numLLIcode;
if (not (flg & PROC_ASM))
{
qDebug() << " High-level:"<<stats.numHLIcode;
qDebug() << QString(" Percentage reduction: %1%").arg(100.0 - (stats.numHLIcode *
100.0) / stats.numLLIcode,4,'f',2,QChar('0'));
}
printf("\nStatistics - Subroutine %s\n", pProc->name);
printf ("Number of Icode instructions:\n");
printf (" Low-level : %4d\n", stats.numLLIcode);
if (! (pProc->flg & PROC_ASM))
{
printf (" High-level: %4d\n", stats.numHLIcode);
printf (" Percentage reduction: %2.2f%%\n", 100.0 - (stats.numHLIcode *
100.0) / stats.numLLIcode);
}
}
@ -62,325 +55,619 @@ static void fixupLabels (PPROC pProc)
* a unique label number for it. This label is placed in the associated
* icode for the node (pProc->Icode). The procedure is done in sequential
* order of dfsLast numbering. */
{ int i; /* index into the dfsLast array */
PBB *dfsLast; /* pointer to the dfsLast array */
{ Int i; /* index into the dfsLast array */
PBB *dfsLast; /* pointer to the dfsLast array */
dfsLast = pProc->dfsLast;
for (i = 0; i < pProc->numBBs; i++)
if (dfsLast[i]->flg/* & BB_HAS_LABEL*/) {
pProc->Icode.icode[dfsLast[i]->start].ll()->flg |= HLL_LABEL;
pProc->Icode.icode[dfsLast[i]->start].ll()->hllLabNum = getNextLabel();
}
dfsLast = pProc->dfsLast;
for (i = 0; i < pProc->numBBs; i++)
if (dfsLast[i]->flg/* & BB_HAS_LABEL*/) {
pProc->Icode.icode[dfsLast[i]->start].ic.ll.flg |= HLL_LABEL;
pProc->Icode.icode[dfsLast[i]->start].ic.ll.hllLabNum = getNextLabel();
}
}
#endif
char *cChar (byte c)
/* Returns the corresponding C string for the given character c. Character
* constants such as carriage return and line feed, require 2 C characters. */
char *cChar (uint8_t c)
{
static char res[3];
{ static char res[3];
switch (c) {
case 0x8: /* backspace */
sprintf (res, "\\b");
break;
case 0x9: /* horizontal tab */
sprintf (res, "\\t");
break;
case 0x0A: /* new line */
sprintf (res, "\\n");
break;
case 0x0C: /* form feed */
sprintf (res, "\\f");
break;
case 0x0D: /* carriage return */
sprintf (res, "\\r");
break;
default: /* any other character*/
sprintf (res, "%c", c);
}
return (res);
switch (c) {
case 0x8: /* backspace */
sprintf (res, "\\b");
break;
case 0x9: /* horizontal tab */
sprintf (res, "\\t");
break;
case 0x0A: /* new line */
sprintf (res, "\\n");
break;
case 0x0C: /* form feed */
sprintf (res, "\\f");
break;
case 0x0D: /* carriage return */
sprintf (res, "\\r");
break;
default: /* any other character*/
sprintf (res, "%c", c);
}
return (res);
}
/* Prints the variable's name and initial contents on the file.
static void printGlobVar (PSYM psym)
/* Prints the variable's name and initial contents on the file.
* Note: to get to the value of the variable:
* com file: prog.Image[operand]
* exe file: prog.Image[operand+0x100] */
static void printGlobVar (QTextStream &ostr,SYM * psym)
{
int j;
PROG &prog(Project::get()->prog);
uint32_t relocOp = prog.fCOM ? psym->label : psym->label + 0x100;
{ Int j;
dword relocOp = prog.fCOM ? psym->label : psym->label + 0x100;
char *strContents; /* initial contents of variable */
switch (psym->size)
{
case 1:
ostr << "uint8_t\t"<<psym->name<<" = "<<prog.image()[relocOp]<<";\n";
break;
case 2:
ostr << "uint16_t\t"<<psym->name<<" = "<<LH(prog.image()+relocOp)<<";\n";
break;
case 4: if (psym->type == TYPE_PTR) /* pointer */
ostr << "uint16_t *\t"<<psym->name<<" = "<<LH(prog.image()+relocOp)<<";\n";
else /* char */
ostr << "char\t"<<psym->name<<"[4] = \""<<
(char)prog.image()[relocOp]<<(char)prog.image()[relocOp+1]<<
(char)prog.image()[relocOp+2]<<(char)prog.image()[relocOp+3]<<"\";\n";
break;
default:
{
QString strContents;
for (j=0; j < psym->size; j++)
strContents += cChar(prog.image()[relocOp + j]);
ostr << "char\t*"<<psym->name<<" = \""<<strContents<<"\";\n";
}
}
switch (psym->size) {
case 1: appendStrTab (&cCode.decl, "byte\t%s = %ld;\n",
psym->name, prog.Image[relocOp]);
break;
case 2: appendStrTab (&cCode.decl, "word\t%s = %ld;\n",
psym->name, LH(prog.Image+relocOp));
break;
case 4: if (psym->type == TYPE_PTR) /* pointer */
appendStrTab (&cCode.decl, "word\t*%s = %ld;\n",
psym->name, LH(prog.Image+relocOp));
else /* char */
appendStrTab (&cCode.decl,
"char\t%s[4] = \"%c%c%c%c\";\n",
psym->name, prog.Image[relocOp],
prog.Image[relocOp+1], prog.Image[relocOp+2],
prog.Image[relocOp+3]);
break;
default:strContents = (char *)allocMem((psym->size*2+1) *sizeof(char));
strContents[0] = '\0';
for (j=0; j < psym->size; j++)
strcat (strContents, cChar(prog.Image[relocOp + j]));
appendStrTab (&cCode.decl, "char\t*%s = \"%s\";\n",
psym->name, strContents);
}
}
// Note: Not called at present.
static void writeGlobSymTable()
/* Writes the contents of the symbol table, along with any variable
* initialization. */
void Project::writeGlobSymTable()
{
QString contents;
QTextStream ostr(&contents);
{ Int idx;
char type[10];
PSYM pSym;
if (symtab.empty())
return;
ostr<<"/* Global variables */\n";
for (SYM &sym : symtab)
{
if (sym.duVal.isUSE_VAL()) /* first used */
printGlobVar (ostr,&sym);
else { /* first defined */
switch (sym.size) {
case 1: ostr<<"uint8_t\t"; break;
case 2: ostr<<"int16_t\t"; break;
case 4: if (sym.type == TYPE_PTR)
ostr<<"int32_t\t*";
else
ostr<<"char\t*";
break;
default: ostr<<"char\t*";
}
ostr<<sym.name<<";\t/* size = "<<sym.size<<" */\n";
}
}
ostr<< "\n";
ostr.flush();
cCode.appendDecl( contents );
if (symtab.csym)
{
appendStrTab (&cCode.decl, "/* Global variables */\n");
for (idx = 0; idx < symtab.csym; idx++)
{
pSym = &symtab.sym[idx];
if (symtab.sym[idx].duVal & USEVAL) /* first used */
printGlobVar (&(symtab.sym[idx]));
else { /* first defined */
switch (pSym->size) {
case 1: strcpy (type, "byte\t"); break;
case 2: strcpy (type, "int\t"); break;
case 4: if (pSym->type == TYPE_PTR)
strcpy (type, "int\t*");
else
strcpy (type, "char\t*");
break;
default: strcpy (type, "char\t*");
}
appendStrTab (&cCode.decl, "%s%s;\t/* size = %ld */\n",
type, pSym->name, pSym->size);
}
}
appendStrTab (&cCode.decl, "\n");
}
}
static void writeHeader (FILE *fp, char *fileName)
/* Writes the header information and global variables to the output C file
* fp. */
static void writeHeader (QIODevice &_ios, const std::string &fileName)
{
PROG &prog(Project::get()->prog);
/* Write header information */
cCode.init();
cCode.appendDecl( "/*\n");
cCode.appendDecl( " * Input file\t: %s\n", fileName.c_str());
cCode.appendDecl( " * File type\t: %s\n", (prog.fCOM)?"COM":"EXE");
cCode.appendDecl( " */\n\n#include \"dcc.h\"\n\n");
{
/* Write header information */
newBundle (&cCode);
appendStrTab (&cCode.decl, "/*\n");
appendStrTab (&cCode.decl, " * Input file\t: %s\n", fileName);
appendStrTab (&cCode.decl, " * File type\t: %s\n", (prog.fCOM)?"COM":"EXE");
appendStrTab (&cCode.decl, " */\n\n#include \"dcc.h\"\n\n");
/* Write global symbol table */
/** writeGlobSymTable(); *** need to change them into locident fmt ***/
writeBundle (_ios, cCode);
freeBundle (&cCode);
/* Write global symbol table */
/** writeGlobSymTable(); *** need to change them into locident fmt ***/
writeBundle (fp, cCode);
freeBundle (&cCode);
}
static void writeBitVector (dword regi)
/* Writes the registers that are set in the bitvector */
{ Int j;
for (j = 0; j < INDEXBASE; j++)
{
if ((regi & power2(j)) != 0)
printf ("%s ", allRegs[j]);
}
}
static void emitGotoLabel (PICODE pt, Int indLevel)
/* Checks the given icode to determine whether it has a label associated
* to it. If so, a goto is emitted to this label; otherwise, a new label
* is created and a goto is also emitted.
* Note: this procedure is to be used when the label is to be backpatched
* onto code in cCode.code */
{
if (! (pt->ic.ll.flg & HLL_LABEL)) /* node hasn't got a lab */
{
/* Generate new label */
pt->ic.ll.hllLabNum = getNextLabel();
pt->ic.ll.flg |= HLL_LABEL;
/* Node has been traversed already, so backpatch this label into
* the code */
addLabelBundle (&cCode.code, pt->codeIdx, pt->ic.ll.hllLabNum);
}
appendStrTab (&cCode.code, "%sgoto l%ld;\n", indent(indLevel),
pt->ic.ll.hllLabNum);
stats.numHLIcode++;
}
// Note: Not currently called!
/** Checks the given icode to determine whether it has a label associated
static void emitFwdGotoLabel (PICODE pt, Int indLevel)
/* Checks the given icode to determine whether it has a label associated
* to it. If so, a goto is emitted to this label; otherwise, a new label
* is created and a goto is also emitted.
* is created and a goto is also emitted.
* Note: this procedure is to be used when the label is to be forward on
* the code; that is, the target code has not been traversed yet. */
#if 0
static void emitFwdGotoLabel (ICODE * pt, int indLevel)
{
if ( not pt->ll()->testFlags(HLL_LABEL)) /* node hasn't got a lab */
{
/* Generate new label */
pt->ll()->hllLabNum = getNextLabel();
pt->ll()->setFlags(HLL_LABEL);
}
cCode.appendCode( "%sgoto l%ld;\n", indentStr(indLevel), pt->ll()->hllLabNum);
if (! (pt->ic.ll.flg & HLL_LABEL)) /* node hasn't got a lab */
{
/* Generate new label */
pt->ic.ll.hllLabNum = getNextLabel();
pt->ic.ll.flg |= HLL_LABEL;
}
appendStrTab (&cCode.code, "%sgoto l%ld;\n", indent(indLevel),
pt->ic.ll.hllLabNum);
}
#endif
static void writeBB (PBB pBB, PICODE hli, Int lev, PPROC pProc, Int *numLoc)
/* Writes the code for the current basic block.
* Args: pBB: pointer to the current basic block.
* Icode: pointer to the array of icodes for current procedure.
* lev: indentation level - used for formatting. */
{ Int i, last;
char *line; /* Pointer to the HIGH-LEVEL line */
/* Save the index into the code table in case there is a later goto
* into this instruction (first instruction of the BB) */
hli[pBB->start].codeIdx = nextBundleIdx (&cCode.code);
/* Generate code for each hlicode that is not a HLI_JCOND */
for (i = pBB->start, last = i + pBB->length; i < last; i++)
if ((hli[i].type == HIGH_LEVEL) && (hli[i].invalid == FALSE))
{
line = write1HlIcode (hli[i].ic.hl, pProc, numLoc);
if (line[0] != '\0')
{
appendStrTab (&cCode.code, "%s%s", indent(lev), line);
stats.numHLIcode++;
}
if (option.verbose)
writeDU (&hli[i], i);
}
//if (hli[i].invalid)
//printf("Invalid icode: %d!\n", hli[i].invalid);
}
static void writeCode (PBB pBB, Int indLevel, PPROC pProc , Int *numLoc,
Int latchNode, Int ifFollow)
/* Recursive procedure that writes the code for the given procedure, pointed
* to by pBB.
* Parameters: pBB: pointer to the cfg.
* Icode: pointer to the Icode array for the cfg graph of the
* current procedure.
* indLevel: indentation level - used for formatting.
* numLoc: last # assigned to local variables */
{
Int follow, /* ifFollow */
loopType, /* Type of loop, if any */
nodeType; /* Type of node */
PBB succ, latch; /* Successor and latching node */
PICODE picode; /* Pointer to HLI_JCOND instruction */
char *l; /* Pointer to HLI_JCOND expression */
boolT emptyThen, /* THEN clause is empty */
repCond; /* Repeat condition for while() */
/* Check if this basic block should be analysed */
if (!pBB) return;
if ((ifFollow != UN_INIT) && (pBB == pProc->dfsLast[ifFollow]))
return;
if (pBB->traversed == DFS_ALPHA)
return;
pBB->traversed = DFS_ALPHA;
/* Check for start of loop */
repCond = FALSE;
latch = NULL;
loopType = pBB->loopType;
if (loopType)
{
latch = pProc->dfsLast[pBB->latchNode];
switch (loopType) {
case WHILE_TYPE:
picode = pProc->Icode.GetIcode(pBB->start + pBB->length - 1);
/* Check for error in while condition */
if (picode->ic.hl.opcode != HLI_JCOND)
reportError (WHILE_FAIL);
/* Check if condition is more than 1 HL instruction */
if (pBB->numHlIcodes > 1)
{
/* Write the code for this basic block */
writeBB(pBB, pProc->Icode.GetFirstIcode(), indLevel, pProc, numLoc);
repCond = TRUE;
}
/* Condition needs to be inverted if the loop body is along
* the THEN path of the header node */
if (pBB->edges[ELSE].BBptr->dfsLastNum == pBB->loopFollow)
inverseCondOp (&picode->ic.hl.oper.exp);
appendStrTab (&cCode.code, "\n%swhile (%s) {\n", indent(indLevel),
walkCondExpr (picode->ic.hl.oper.exp, pProc, numLoc));
invalidateIcode (picode);
break;
case REPEAT_TYPE:
appendStrTab (&cCode.code, "\n%sdo {\n", indent(indLevel));
picode = pProc->Icode.GetIcode(latch->start+latch->length-1);
invalidateIcode (picode);
break;
case ENDLESS_TYPE:
appendStrTab (&cCode.code, "\n%sfor (;;) {\n", indent(indLevel));
}
stats.numHLIcode += 1;
indLevel++;
}
/* Write the code for this basic block */
if (repCond == FALSE)
writeBB (pBB, pProc->Icode.GetFirstIcode(), indLevel, pProc, numLoc);
/* Check for end of path */
nodeType = pBB->nodeType;
if (nodeType == RETURN_NODE || nodeType == TERMINATE_NODE ||
nodeType == NOWHERE_NODE || (pBB->dfsLastNum == latchNode))
return;
/* Check type of loop/node and process code */
if (loopType) /* there is a loop */
{
if (pBB != latch) /* loop is over several bbs */
{
if (loopType == WHILE_TYPE)
{
succ = pBB->edges[THEN].BBptr;
if (succ->dfsLastNum == pBB->loopFollow)
succ = pBB->edges[ELSE].BBptr;
}
else
succ = pBB->edges[0].BBptr;
if (succ->traversed != DFS_ALPHA)
writeCode (succ, indLevel, pProc, numLoc, latch->dfsLastNum,
ifFollow);
else /* has been traversed so we need a goto */
emitGotoLabel (pProc->Icode.GetIcode(succ->start), indLevel);
}
/* Loop epilogue: generate the loop trailer */
indLevel--;
if (loopType == WHILE_TYPE)
{
/* Check if there is need to repeat other statements involved
* in while condition, then, emit the loop trailer */
if (repCond)
writeBB (pBB, pProc->Icode.GetFirstIcode(), indLevel+1, pProc, numLoc);
appendStrTab (&cCode.code, "%s} /* end of while */\n",
indent(indLevel));
}
else if (loopType == ENDLESS_TYPE)
appendStrTab (&cCode.code, "%s} /* end of loop */\n",
indent(indLevel));
else if (loopType == REPEAT_TYPE)
{
if (picode->ic.hl.opcode != HLI_JCOND)
reportError (REPEAT_FAIL);
appendStrTab (&cCode.code, "%s} while (%s);\n", indent(indLevel),
walkCondExpr (picode->ic.hl.oper.exp, pProc, numLoc));
}
/* Recurse on the loop follow */
if (pBB->loopFollow != MAX)
{
succ = pProc->dfsLast[pBB->loopFollow];
if (succ->traversed != DFS_ALPHA)
writeCode (succ, indLevel, pProc, numLoc, latchNode, ifFollow);
else /* has been traversed so we need a goto */
emitGotoLabel (pProc->Icode.GetIcode(succ->start), indLevel);
}
}
else /* no loop, process nodeType of the graph */
{
if (nodeType == TWO_BRANCH) /* if-then[-else] */
{
stats.numHLIcode++;
indLevel++;
emptyThen = FALSE;
if (pBB->ifFollow != MAX) /* there is a follow */
{
/* process the THEN part */
follow = pBB->ifFollow;
succ = pBB->edges[THEN].BBptr;
if (succ->traversed != DFS_ALPHA) /* not visited */
{
if (succ->dfsLastNum != follow) /* THEN part */
{
l = writeJcond (
pProc->Icode.GetIcode(pBB->start + pBB->length -1)->ic.hl,
pProc, numLoc);
appendStrTab (&cCode.code, "\n%s%s", indent(indLevel-1), l);
writeCode (succ, indLevel, pProc, numLoc, latchNode,
follow);
}
else /* empty THEN part => negate ELSE part */
{
l = writeJcondInv (
pProc->Icode.GetIcode(pBB->start + pBB->length -1)->ic.hl,
pProc, numLoc);
appendStrTab (&cCode.code, "\n%s%s", indent(indLevel-1), l);
writeCode (pBB->edges[ELSE].BBptr, indLevel, pProc, numLoc,
latchNode, follow);
emptyThen = TRUE;
}
}
else /* already visited => emit label */
emitGotoLabel (pProc->Icode.GetIcode(succ->start), indLevel);
/* process the ELSE part */
succ = pBB->edges[ELSE].BBptr;
if (succ->traversed != DFS_ALPHA) /* not visited */
{
if (succ->dfsLastNum != follow) /* ELSE part */
{
appendStrTab (&cCode.code, "%s}\n%selse {\n",
indent(indLevel-1), indent(indLevel - 1));
writeCode (succ, indLevel, pProc, numLoc, latchNode,
follow);
}
/* else (empty ELSE part) */
}
else if (! emptyThen) /* already visited => emit label */
{
appendStrTab (&cCode.code, "%s}\n%selse {\n",
indent(indLevel-1), indent(indLevel - 1));
emitGotoLabel (pProc->Icode.GetIcode(succ->start), indLevel);
}
appendStrTab (&cCode.code, "%s}\n", indent(--indLevel));
/* Continue with the follow */
succ = pProc->dfsLast[follow];
if (succ->traversed != DFS_ALPHA)
writeCode (succ, indLevel, pProc, numLoc, latchNode,
ifFollow);
}
else /* no follow => if..then..else */
{
l = writeJcond (
pProc->Icode.GetIcode(pBB->start + pBB->length -1)->ic.hl,
pProc, numLoc);
appendStrTab (&cCode.code, "%s%s", indent(indLevel-1), l);
writeCode (pBB->edges[THEN].BBptr, indLevel, pProc, numLoc,
latchNode, ifFollow);
appendStrTab (&cCode.code, "%s}\n%selse {\n", indent(indLevel-1),
indent(indLevel - 1));
writeCode (pBB->edges[ELSE].BBptr, indLevel, pProc, numLoc,
latchNode, ifFollow);
appendStrTab (&cCode.code, "%s}\n", indent(--indLevel));
}
}
else /* fall, call, 1w */
{
succ = pBB->edges[0].BBptr; /* fall-through edge */
if (succ->traversed != DFS_ALPHA)
writeCode (succ, indLevel, pProc,numLoc, latchNode,ifFollow);
}
}
}
static void codeGen (PPROC pProc, FILE *fp)
/* Writes the procedure's declaration (including arguments), local variables,
* and invokes the procedure that writes the code of the given record *hli */
void Function::codeGen (QIODevice &fs)
{
int numLoc;
QString ostr_contents;
QTextStream ostr(&ostr_contents);
//STKFRAME * args; /* Procedure arguments */
//char buf[200], /* Procedure's definition */
// arg[30]; /* One argument */
BB *pBB; /* Pointer to basic block */
{ Int i, numLoc;
PSTKFRAME args; /* Procedure arguments */
char buf[200], /* Procedure's definition */
arg[30]; /* One argument */
ID *locid; /* Pointer to one local identifier */
BB *pBB; /* Pointer to basic block */
/* Write procedure/function header */
cCode.init();
if (flg & PROC_IS_FUNC) /* Function */
ostr << QString("\n%1 %2 (").arg(TypeContainer::typeName(retVal.type)).arg(name);
newBundle (&cCode);
if (pProc->flg & PROC_IS_FUNC) /* Function */
appendStrTab (&cCode.decl, "\n%s %s (", hlTypes[pProc->retVal.type],
pProc->name);
else /* Procedure */
ostr << "\nvoid "+name+" (";
appendStrTab (&cCode.decl, "\nvoid %s (", pProc->name);
/* Write arguments */
struct validArg
args = &pProc->args;
memset (buf, 0, sizeof(buf));
for (i = 0; i < args->csym; i++)
{
bool operator()(STKSYM &s) { return s.invalid==false;}
};
QStringList parts;
for (STKSYM &arg : (args | filtered(validArg())))
{
parts << QString("%1 %2").arg(hlTypes[arg.type]).arg(arg.name);
if (args->sym[i].invalid == FALSE)
{
sprintf (arg,"%s %s",hlTypes[args->sym[i].type], args->sym[i].name);
strcat (buf, arg);
if (i < (args->numArgs - 1))
strcat (buf, ", ");
}
}
ostr << parts.join(", ")+")\n";
strcat (buf, ")\n");
appendStrTab (&cCode.decl, "%s", buf);
/* Write comments */
writeProcComments( ostr );
/* Write comments */
writeProcComments (pProc, &cCode.decl);
/* Write local variables */
if (not (flg & PROC_ASM))
{
numLoc = 0;
for (ID &refId : localId )
{
/* Output only non-invalidated entries */
if ( refId.illegal )
continue;
if (refId.loc == REG_FRAME)
{
/* Register variables are assigned to a local variable */
if (((flg & SI_REGVAR) and (refId.id.regi == rSI)) or
((flg & DI_REGVAR) and (refId.id.regi == rDI)))
{
refId.setLocalName(++numLoc);
ostr << "int "<<refId.name<<";\n";
}
/* Other registers are named when they are first used in
* the output C code, and appended to the proc decl. */
}
else if (refId.loc == STK_FRAME)
{
/* Name local variables and output appropriate type */
refId.setLocalName(++numLoc);
ostr << TypeContainer::typeName(refId.type)<<" "<<refId.name<<";\n";
}
}
}
ostr.flush();
fs.write(ostr_contents.toLatin1());
if (! (pProc->flg & PROC_ASM))
{
locid = &pProc->localId.id[0];
numLoc = 0;
for (i = 0; i < pProc->localId.csym; i++, locid++)
{
/* Output only non-invalidated entries */
if (locid->illegal == FALSE)
{
if (locid->loc == REG_FRAME)
{
/* Register variables are assigned to a local variable */
if (((pProc->flg & SI_REGVAR) && (locid->id.regi == rSI)) ||
((pProc->flg & DI_REGVAR) && (locid->id.regi == rDI)))
{
sprintf (locid->name, "loc%ld", ++numLoc);
appendStrTab (&cCode.decl, "int %s;\n", locid->name);
}
/* Other registers are named when they are first used in
* the output C code, and appended to the proc decl. */
}
else if (locid->loc == STK_FRAME)
{
/* Name local variables and output appropriate type */
sprintf (locid->name, "loc%ld", ++numLoc);
appendStrTab (&cCode.decl, "%s %s;\n",
hlTypes[locid->type], locid->name);
}
}
}
}
/* Write procedure's code */
if (flg & PROC_ASM) /* generate assembler */
{
Disassembler ds(3);
ds.disassem(this);
}
else /* generate C */
{
m_actual_cfg.front()->writeCode (1, this, &numLoc, MAX, UN_INIT);
}
if (pProc->flg & PROC_ASM) /* generate assembler */
disassem (3, pProc);
else /* generate C */
writeCode (pProc->cfg, 1, pProc, &numLoc, MAX, UN_INIT);
cCode.appendCode( "}\n\n");
writeBundle (fs, cCode);
freeBundle (&cCode);
appendStrTab (&cCode.code, "}\n\n");
writeBundle (fp, cCode);
freeBundle (&cCode);
/* Write Live register analysis information */
if (option.verbose) {
QString debug_contents;
QTextStream debug_stream(&debug_contents);
for (size_t i = 0; i < numBBs; i++)
{
pBB = m_dfsLast[i];
if (pBB->flg & INVALID_BB) continue; /* skip invalid BBs */
debug_stream << "BB "<<i<<"\n";
debug_stream << " Start = "<<pBB->begin()->loc_ip;
debug_stream << ", end = "<<pBB->begin()->loc_ip+pBB->size()<<"\n";
debug_stream << " LiveUse = ";
Machine_X86::writeRegVector(debug_stream,pBB->liveUse);
debug_stream << "\n Def = ";
Machine_X86::writeRegVector(debug_stream,pBB->def);
debug_stream << "\n LiveOut = ";
Machine_X86::writeRegVector(debug_stream,pBB->liveOut);
debug_stream << "\n LiveIn = ";
Machine_X86::writeRegVector(debug_stream,pBB->liveIn);
debug_stream <<"\n\n";
}
debug_stream.flush();
qDebug() << debug_contents.toLatin1();
}
if (option.verbose)
for (i = 0; i < pProc->numBBs; i++)
{
pBB = pProc->dfsLast[i];
if (pBB->flg & INVALID_BB) continue; /* skip invalid BBs */
printf ("BB %d\n", i);
printf (" Start = %d, end = %d\n", pBB->start, pBB->start +
pBB->length - 1);
printf (" LiveUse = ");
writeBitVector (pBB->liveUse);
printf ("\n Def = ");
writeBitVector (pBB->def);
printf ("\n LiveOut = ");
writeBitVector (pBB->liveOut);
printf ("\n LiveIn = ");
writeBitVector (pBB->liveIn);
printf ("\n\n");
}
}
static void backBackEnd (char *filename, PCALL_GRAPH pcallGraph, FILE *fp)
/* Recursive procedure. Displays the procedure's code in depth-first order
* of the call graph. */
static void backBackEnd (CALL_GRAPH * pcallGraph, QIODevice &_ios)
{
{ Int i;
// IFace.Yield(); /* This is a good place to yield to other apps */
// IFace.Yield(); /* This is a good place to yield to other apps */
/* Check if this procedure has been processed already */
if ((pcallGraph->proc->flg & PROC_OUTPUT) or
(pcallGraph->proc->flg & PROC_ISLIB))
return;
pcallGraph->proc->flg |= PROC_OUTPUT;
/* Check if this procedure has been processed already */
if ((pcallGraph->proc->flg & PROC_OUTPUT) ||
(pcallGraph->proc->flg & PROC_ISLIB))
return;
pcallGraph->proc->flg |= PROC_OUTPUT;
/* Dfs if this procedure has any successors */
for (auto & elem : pcallGraph->outEdges)
{
backBackEnd (elem, _ios);
}
/* Dfs if this procedure has any successors */
if (pcallGraph->numOutEdges > 0)
{
for (i = 0; i < pcallGraph->numOutEdges; i++)
{
backBackEnd (filename, pcallGraph->outEdges[i], fp);
}
}
/* Generate code for this procedure */
stats.numLLIcode = pcallGraph->proc->Icode.size();
stats.numHLIcode = 0;
pcallGraph->proc->codeGen (_ios);
/* Generate code for this procedure */
stats.numLLIcode = pcallGraph->proc->Icode.GetNumIcodes();
stats.numHLIcode = 0;
codeGen (pcallGraph->proc, fp);
/* Generate statistics */
if (option.Stats)
pcallGraph->proc->displayStats ();
if (not (pcallGraph->proc->flg & PROC_ASM))
{
stats.totalLL += stats.numLLIcode;
stats.totalHL += stats.numHLIcode;
}
/* Generate statistics */
if (option.Stats)
displayStats (pcallGraph->proc);
if (! (pcallGraph->proc->flg & PROC_ASM))
{
stats.totalLL += stats.numLLIcode;
stats.totalHL += stats.numHLIcode;
}
}
void BackEnd (char *fileName, PCALL_GRAPH pcallGraph)
/* Invokes the necessary routines to produce code one procedure at a time. */
void BackEnd(CALL_GRAPH * pcallGraph)
{
/* Get output file name */
QString outNam(Project::get()->output_name("b")); /* b for beta */
QFile fs(outNam); /* Output C file */
char* outName, *ext;
FILE* fp; /* Output C file */
/* Open output file */
if(not fs.open(QFile::WriteOnly|QFile::Text))
fatalError (CANNOT_OPEN, outNam.toStdString().c_str());
/* Get output file name */
outName = strcpy ((char*)allocMem(strlen(fileName)+1), fileName);
if ((ext = strrchr (outName, '.')) != NULL)
*ext = '\0';
strcat (outName, ".b"); /* b for beta */
qDebug()<<"dcc: Writing C beta file"<<outNam;
/* Open output file */
fp = fopen (outName, "wt");
if (!fp)
fatalError (CANNOT_OPEN, outName);
printf ("dcc: Writing C beta file %s\n", outName);
/* Header information */
writeHeader (fs, option.filename.toStdString());
/* Header information */
writeHeader (fp, fileName);
/* Initialize total Icode instructions statistics */
stats.totalLL = 0;
stats.totalHL = 0;
/* Initialize total Icode instructions statistics */
stats.totalLL = 0;
stats.totalHL = 0;
/* Process one procedure at a time */
backBackEnd (pcallGraph, fs);
/* Process one procedure at a time */
backBackEnd (fileName, pcallGraph, fp);
/* Close output file */
fs.close();
qDebug() << "dcc: Finished writing C beta file";
/* Close output file */
fclose (fp);
printf ("dcc: Finished writing C beta file\n");
}

View File

@ -6,89 +6,107 @@
#include "dcc.h"
#include <stdarg.h>
#include <iostream>
#include <memory.h>
#include <stdlib.h>
#include <string.h>
#include <QtCore/QIODevice>
#define deltaProcLines 20
using namespace std;
void newBundle (bundle *procCode)
/* Allocates memory for a new bundle and initializes it to zero. */
{
memset (&(procCode->decl), 0, sizeof(strTable));
memset (&(procCode->code), 0, sizeof(strTable));
}
static void incTableSize (strTable *strTab)
/* Increments the size of the table strTab by deltaProcLines and copies all
* the strings to the new table. */
{
strTab->allocLines += deltaProcLines;
strTab->str = (char**)reallocVar (strTab->str, strTab->allocLines*sizeof(char *));
memset (&strTab->str[strTab->allocLines - deltaProcLines], 0,
deltaProcLines * sizeof(char *));
}
void appendStrTab (strTable *strTab, char *format, ...)
/* Appends the new line (in printf style) to the string table strTab. */
{ va_list args;
va_start (args, format);
if (strTab->numLines == strTab->allocLines)
{
incTableSize (strTab);
}
strTab->str[strTab->numLines] = (char *)malloc(lineSize * sizeof(char));
if (strTab->str[strTab->numLines] == NULL)
{
fatalError(MALLOC_FAILED, lineSize * sizeof(char));
}
vsprintf (strTab->str[strTab->numLines], format, args);
strTab->numLines++;
va_end (args);
}
Int nextBundleIdx (strTable *strTab)
/* Returns the next available index into the table */
{
return (strTab->numLines);
}
void addLabelBundle (strTable *strTab, Int idx, Int label)
/* Adds the given label to the start of the line strTab[idx]. The first
* tab is removed and replaced by this label */
void strTable::addLabelBundle (int idx, int label)
{
QString &processedLine(at(idx));
QString s = QString("l%1: ").arg(label);
if(processedLine.size()<4)
processedLine = s;
else
processedLine = s+processedLine.mid(4);
{ char s[lineSize];
sprintf (s, "l%ld: %s", label, &strTab->str[idx][4]);
strcpy (strTab->str[idx], s);
}
static void writeStrTab (FILE *fp, strTable strTab)
/* Writes the contents of the string table on the file fp. */
static void writeStrTab (QIODevice &ios, strTable &strTab)
{
for (size_t i = 0; i < strTab.size(); i++)
ios.write(strTab[i].toLatin1());
{ Int i;
for (i = 0; i < strTab.numLines; i++)
fprintf (fp, "%s", strTab.str[i]);
}
void writeBundle (FILE *fp, bundle procCode)
/* Writes the contents of the bundle (procedure code and declaration) to
* a file. */
void writeBundle (QIODevice &ios, bundle procCode)
{
writeStrTab (ios, procCode.decl);
writeStrTab (ios, procCode.code);
writeStrTab (fp, procCode.decl);
if (procCode.decl.str[procCode.decl.numLines - 1][0] != ' ')
fprintf (fp, "\n");
writeStrTab (fp, procCode.code);
}
static void freeStrTab (strTable *strTab)
/* Frees the storage allocated by the string table. */
static void freeStrTab (strTable &strTab)
{
strTab.clear();
{ Int i;
if (strTab->allocLines > 0) {
for (i = 0; i < strTab->numLines; i++)
free (strTab->str[i]);
free (strTab->str);
memset (strTab, 0, sizeof(strTable));
}
}
/* Deallocates the space taken by the bundle procCode */
void freeBundle (bundle *procCode)
{
freeStrTab (procCode->decl);
freeStrTab (procCode->code);
}
void bundle::appendCode(const char *format,...)
{
va_list args;
char buf[lineSize]={0};
va_start (args, format);
vsnprintf (buf, sizeof(buf), format, args);
code.push_back(buf);
va_end (args);
}
void bundle::appendCode(const QString & s)
{
code.push_back(s);
}
void bundle::appendDecl(const char *format,...)
{
va_list args;
char buf[lineSize]={0};
va_start (args, format);
vsnprintf (buf, sizeof(buf), format, args);
decl.push_back(buf);
va_end (args);
}
void bundle::appendDecl(const QString &v)
{
decl.push_back(v);
/* Deallocates the space taken by the bundle procCode */
{
freeStrTab (&(procCode->decl));
freeStrTab (&(procCode->code));
}

File diff suppressed because it is too large

View File

@ -7,277 +7,258 @@
****************************************************************************/
#include "dcc.h"
#include "msvc_fixes.h"
#include "machine_x86.h"
#include <string.h>
#include <sstream>
#include <QTextStream>
using namespace std;
#define intSize 40
static const char *int21h[] =
{
"Terminate process",
"Character input with echo",
"Character output",
"Auxiliary input",
"Auxiliary output",
"Printer output",
"Direct console i/o",
"Unfiltered char i w/o echo",
"Character input without echo",
"Display string",
"Buffered keyboard input",
"Check input status",
"Flush input buffer and then input",
"Disk reset",
"Select disk",
"Open file",
"Close file",
"Find first file",
"Find next file",
"Delete file",
"Sequential read",
"Sequential write",
"Create file",
"Rename file",
"Reserved",
"Get current disk",
"Set DTA address",
"Get default drive data",
"Get drive data",
"Reserved",
"Reserved",
"Reserved",
"Reserved",
"Random read",
"Random write",
"Get file size",
"Set relative record number",
"Set interrupt vector",
"Create new PSP",
"Random block read",
"Random block write",
"Parse filename",
"Get date",
"Set date",
"Get time",
"Set time",
"Set verify flag",
"Get DTA address",
"Get MSDOS version number",
"Terminate and stay resident",
"Reserved",
"Get or set break flag",
"Reserved",
"Get interrupt vector",
"Get drive allocation info",
"Reserved",
"Get or set country info",
"Create directory",
"Delete directory",
"Set current directory",
"Create file",
"Open file",
"Close file",
"Read file or device",
"Write file or device",
"Delete file",
"Set file pointer",
"Get or set file attributes",
"IOCTL (i/o control)",
"Duplicate handle",
"Redirect handle",
"Get current directory",
"Allocate memory block",
"Release memory block",
"Resize memory block",
"Execute program (exec)",
"Terminate process with return code",
"Get return code",
"Find first file",
"Find next file",
"Reserved",
"Reserved",
"Reserved",
"Reserved",
"Get verify flag",
"Reserved",
"Rename file",
"Get or set file date & time",
"Get or set allocation strategy",
"Get extended error information",
"Create temporary file",
"Create new file",
"Lock or unlock file region",
"Reserved",
"Get machine name",
"Device redirection",
"Reserved",
"Reserved",
"Get PSP address",
"Get DBCS lead byte table",
"Reserved",
"Get extended country information",
"Get or set code page",
"Set handle count",
"Commit file",
"Reserved",
"Reserved",
"Reserved",
"Extended open file"
static char *int21h[] =
{"Terminate process",
"Character input with echo",
"Character output",
"Auxiliary input",
"Auxiliary output",
"Printer output",
"Direct console i/o",
"Unfiltered char i w/o echo",
"Character input without echo",
"Display string",
"Buffered keyboard input",
"Check input status",
"Flush input buffer and then input",
"Disk reset",
"Select disk",
"Open file",
"Close file",
"Find first file",
"Find next file",
"Delete file",
"Sequential read",
"Sequential write",
"Create file",
"Rename file",
"Reserved",
"Get current disk",
"Set DTA address",
"Get default drive data",
"Get drive data",
"Reserved",
"Reserved",
"Reserved",
"Reserved",
"Random read",
"Random write",
"Get file size",
"Set relative record number",
"Set interrupt vector",
"Create new PSP",
"Random block read",
"Random block write",
"Parse filename",
"Get date",
"Set date",
"Get time",
"Set time",
"Set verify flag",
"Get DTA address",
"Get MSDOS version number",
"Terminate and stay resident",
"Reserved",
"Get or set break flag",
"Reserved",
"Get interrupt vector",
"Get drive allocation info",
"Reserved",
"Get or set country info",
"Create directory",
"Delete directory",
"Set current directory",
"Create file",
"Open file",
"Close file",
"Read file or device",
"Write file or device",
"Delete file",
"Set file pointer",
"Get or set file attributes",
"IOCTL (i/o control)",
"Duplicate handle",
"Redirect handle",
"Get current directory",
"Allocate memory block",
"Release memory block",
"Resize memory block",
"Execute program (exec)",
"Terminate process with return code",
"Get return code",
"Find first file",
"Find next file",
"Reserved",
"Reserved",
"Reserved",
"Reserved",
"Get verify flag",
"Reserved",
"Rename file",
"Get or set file date & time",
"Get or set allocation strategy",
"Get extended error information",
"Create temporary file",
"Create new file",
"Lock or unlock file region",
"Reserved",
"Get machine name",
"Device redirection",
"Reserved",
"Reserved",
"Get PSP address",
"Get DBCS lead byte table",
"Reserved",
"Get extended country information",
"Get or set code page",
"Set handle count",
"Commit file",
"Reserved",
"Reserved",
"Reserved",
"Extended open file"
};
static const char *intOthers[] = {
"Exit", /* 0x20 */
"", /* other table */
"Terminate handler address", /* 0x22 */
"Ctrl-C handler address", /* 0x23 */
"Critical-error handler address", /* 0x24 */
"Absolute disk read", /* 0x25 */
"Absolute disk write", /* 0x26 */
"Terminate and stay resident", /* 0x27 */
"Reserved", /* 0x28 */
"Reserved", /* 0x29 */
"Reserved", /* 0x2A */
"Reserved", /* 0x2B */
"Reserved", /* 0x2C */
"Reserved", /* 0x2D */
"Reserved" /* 0x2E */
static char *intOthers[] = {
"Exit", /* 0x20 */
"", /* other table */
"Terminate handler address", /* 0x22 */
"Ctrl-C handler address", /* 0x23 */
"Critical-error handler address", /* 0x24 */
"Absolute disk read", /* 0x25 */
"Absolute disk write", /* 0x26 */
"Terminate and stay resident", /* 0x27 */
"Reserved", /* 0x28 */
"Reserved", /* 0x29 */
"Reserved", /* 0x2A */
"Reserved", /* 0x2B */
"Reserved", /* 0x2C */
"Reserved", /* 0x2D */
"Reserved" /* 0x2E */
};
/* Writes the description of the current interrupt. Appends it to the
void writeIntComment (PICODE icode, char *s)
/* Writes the description of the current interrupt. Appends it to the
* string s. */
void LLInst::writeIntComment (QTextStream &s)
{
uint32_t src_immed=src().getImm2();
s<<"\t/* ";
if (src_immed == 0x21)
{
s <<int21h[m_dst.off];
{ char *t;
t = (char *)allocMem(intSize * sizeof(char));
if (icode->ic.ll.immed.op == 0x21)
{ sprintf (t, "\t/* %s */\n", int21h[icode->ic.ll.dst.off]);
strcat (s, t);
}
else if (icode->ic.ll.immed.op > 0x1F && icode->ic.ll.immed.op < 0x2F)
{
sprintf (t, "\t/* %s */\n", intOthers[icode->ic.ll.immed.op - 0x20]);
strcat (s, t);
}
else if (src_immed > 0x1F and src_immed < 0x2F)
{
s <<intOthers[src_immed - 0x20];
}
else if (src_immed == 0x2F)
{
switch (m_dst.off)
{
case 0x01 :
s << "Print spooler";
break;
case 0x02:
s << "Assign";
break;
case 0x10:
s << "Share";
break;
case 0xB7:
s << "Append";
}
}
else
s<<"Unknown int";
s<<" */\n";
else if (icode->ic.ll.immed.op == 0x2F)
switch (icode->ic.ll.dst.off) {
case 0x01 : strcat (s, "\t/* Print spooler */\n");
break;
case 0x02: strcat (s, "\t/* Assign */\n");
break;
case 0x10: strcat (s, "\t/* Share */\n");
break;
case 0xB7: strcat (s, "\t/* Append */\n");
}
else
strcat (s, "\n");
}
//, &cCode.decl
void Function::writeProcComments()
{
QString dest_str;
{
QTextStream ostr(&dest_str);
writeProcComments(ostr);
}
cCode.appendDecl(dest_str);
void writeProcComments (PPROC p, strTable *strTab)
{ int i;
ID *id; /* Pointer to register argument identifier */
PSTKSYM psym; /* Pointer to register argument symbol */
/* About the parameters */
if (p->cbParam)
appendStrTab (strTab, "/* Takes %d bytes of parameters.\n",
p->cbParam);
else if (p->flg & REG_ARGS)
{
appendStrTab (strTab, "/* Uses register arguments:\n");
for (i = 0; i < p->args.numArgs; i++)
{
psym = &p->args.sym[i];
if (psym->regs->expr.ident.idType == REGISTER)
{
id = &p->localId.id[psym->regs->expr.ident.idNode.regiIdx];
if (psym->regs->expr.ident.regiType == WORD_REG)
appendStrTab (strTab, " * %s = %s.\n", psym->name,
wordReg[id->id.regi - rAX]);
else /* BYTE_REG */
appendStrTab (strTab, " * %s = %s.\n", psym->name,
byteReg[id->id.regi - rAL]);
}
else /* long register */
{
id = &p->localId.id[psym->regs->expr.ident.idNode.longIdx];
appendStrTab (strTab, " * %s = %s:%s.\n", psym->name,
wordReg[id->id.longId.h - rAX],
wordReg[id->id.longId.l - rAX]);
}
}
}
else
appendStrTab (strTab, "/* Takes no parameters.\n");
/* Type of procedure */
if (p->flg & PROC_RUNTIME)
appendStrTab (strTab," * Runtime support routine of the compiler.\n");
if (p->flg & PROC_IS_HLL)
appendStrTab (strTab," * High-level language prologue code.\n");
if (p->flg & PROC_ASM)
{
appendStrTab (strTab,
" * Untranslatable routine. Assembler provided.\n");
if (p->flg & PROC_IS_FUNC)
switch (p->retVal.type) {
case TYPE_BYTE_SIGN: case TYPE_BYTE_UNSIGN:
appendStrTab (strTab, " * Return value in register al.\n");
break;
case TYPE_WORD_SIGN: case TYPE_WORD_UNSIGN:
appendStrTab (strTab, " * Return value in register ax.\n");
break;
case TYPE_LONG_SIGN: case TYPE_LONG_UNSIGN:
appendStrTab (strTab, " * Return value in registers dx:ax.\n");
break;
} /* eos */
}
/* Calling convention */
if (p->flg & CALL_PASCAL)
appendStrTab (strTab, " * Pascal calling convention.\n");
else if (p->flg & CALL_C)
appendStrTab (strTab, " * C calling convention.\n");
else if (p->flg & CALL_UNKNOWN)
appendStrTab (strTab, " * Unknown calling convention.\n");
/* Other flags */
if (p->flg & (PROC_BADINST | PROC_IJMP))
appendStrTab (strTab, " * Incomplete due to an %s.\n",
(p->flg & PROC_BADINST)? "untranslated opcode":
"indirect JMP");
if (p->flg & PROC_ICALL)
appendStrTab (strTab, " * Indirect call procedure.\n");
if (p->flg & IMPURE)
appendStrTab (strTab, " * Contains impure code.\n");
if (p->flg & NOT_HLL)
appendStrTab (strTab,
" * Contains instructions not normally used by compilers.\n");
if (p->flg & FLOAT_OP)
appendStrTab (strTab," * Contains coprocessor instructions.\n");
/* Graph reducibility */
if (p->flg & GRAPH_IRRED)
appendStrTab (strTab," * Irreducible control flow graph.\n");
appendStrTab (strTab, " */\n{\n");
}
void Function::writeProcComments(QTextStream &ostr)
{
int i;
ID *id; /* Pointer to register argument identifier */
STKSYM * psym; /* Pointer to register argument symbol */
/* About the parameters */
if (this->cbParam)
ostr << "/* Takes "<<this->cbParam<<" bytes of parameters.\n";
else if (this->flg & REG_ARGS)
{
ostr << "/* Uses register arguments:\n";
for (i = 0; i < this->args.numArgs; i++)
{
psym = &this->args[i];
ostr << " * "<<psym->name<<" = ";
if (psym->regs->ident.type() == REGISTER)
{
id = &this->localId.id_arr[((RegisterNode *)psym->regs)->regiIdx];
ostr << Machine_X86::regName(id->id.regi);
}
else /* long register */
{
id = &this->localId.id_arr[psym->regs->ident.idNode.longIdx];
ostr << Machine_X86::regName(id->longId().h()) << ":";
ostr << Machine_X86::regName(id->longId().l());
}
ostr << ".\n";
}
}
else
ostr << "/* Takes no parameters.\n";
/* Type of procedure */
if (this->flg & PROC_RUNTIME)
ostr << " * Runtime support routine of the compiler.\n";
if (this->flg & PROC_IS_HLL)
ostr << " * High-level language prologue code.\n";
if (this->flg & PROC_ASM)
{
ostr << " * Untranslatable routine. Assembler provided.\n";
if (this->flg & PROC_IS_FUNC)
switch (this->retVal.type) { // TODO: Functions return value in various regs
case TYPE_BYTE_SIGN: case TYPE_BYTE_UNSIGN:
ostr << " * Return value in register al.\n";
break;
case TYPE_WORD_SIGN: case TYPE_WORD_UNSIGN:
ostr << " * Return value in register ax.\n";
break;
case TYPE_LONG_SIGN: case TYPE_LONG_UNSIGN:
ostr << " * Return value in registers dx:ax.\n";
break;
default:
fprintf(stderr,"Unknown retval type %d",this->retVal.type);
break;
} /* eos */
}
/* Calling convention */
callingConv()->writeComments(ostr);
/* Other flags */
if (this->flg & (PROC_BADINST | PROC_IJMP))
{
ostr << " * Incomplete due to an ";
if(this->flg & PROC_BADINST)
ostr << "untranslated opcode.\n";
else
ostr << "indirect JMP.\n";
}
if (this->flg & PROC_ICALL)
ostr << " * Indirect call procedure.\n";
if (this->flg & IMPURE)
ostr << " * Contains impure code.\n";
if (this->flg & NOT_HLL)
ostr << " * Contains instructions not normally used by compilers.\n";
if (this->flg & FLOAT_OP)
ostr << " * Contains coprocessor instructions.\n";
/* Graph reducibility */
if (this->flg & GRAPH_IRRED)
ostr << " * Irreducible control flow graph.\n";
ostr << " */\n{\n";
}
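The interrupt-comment logic in writeIntComment can be read in isolation as a table lookup plus a small switch. The following is a minimal sketch under stated assumptions, not the project's code: it uses std::ostringstream instead of QTextStream, `describeInterrupt` and `kIntOthers` are hypothetical names, the int 21h sub-function table is left out, and the entries for 0x20..0x23 are placeholders because they are not visible in this excerpt. The `default:` branch for int 2Fh is also an addition of the sketch; the listing above emits nothing for unlisted sub-functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Stand-in for intOthers[]: one entry per interrupt 0x20..0x2E.
// Entries 0x20..0x23 are placeholders, not the real strings.
static const char *const kIntOthers[] = {
    "(0x20: not shown)", "(0x21: not shown)",
    "(0x22: not shown)", "(0x23: not shown)",
    "Critical-error handler address",   /* 0x24 */
    "Absolute disk read",               /* 0x25 */
    "Absolute disk write",              /* 0x26 */
    "Terminate and stay resident",      /* 0x27 */
    "Reserved", "Reserved", "Reserved", "Reserved",
    "Reserved", "Reserved", "Reserved"  /* 0x28..0x2E */
};
static_assert(sizeof(kIntOthers) / sizeof(kIntOthers[0]) == 0x2F - 0x20,
              "one entry per interrupt in 0x20..0x2E");

// Builds the trailing comment for an INT instruction.
// intNo is the interrupt number, subFn the sub-function (int 2Fh only).
std::string describeInterrupt(uint32_t intNo, uint8_t subFn)
{
    std::ostringstream s;
    s << "\t/* ";
    if (intNo >= 0x20 && intNo <= 0x2E) {
        const std::size_t idx = intNo - 0x20;
        assert(idx < sizeof(kIntOthers) / sizeof(kIntOthers[0]));
        s << kIntOthers[idx];
    } else if (intNo == 0x2F) {
        switch (subFn) {
        case 0x01: s << "Print spooler"; break;
        case 0x02: s << "Assign";        break;
        case 0x10: s << "Share";         break;
        case 0xB7: s << "Append";        break;
        default:   s << "Unknown multiplex function"; break; // sketch only
        }
    } else {
        s << "Unknown int";
    }
    s << " */\n";
    return s.str();
}
```

Returning a string keeps the sketch testable without a stream object; the stream-based variant in the listing writes the same pieces directly to its output.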

File diff suppressed because it is too large

File diff suppressed because it is too large

View File

@ -5,141 +5,154 @@
****************************************************************************/
#include "dcc.h"
#include "msvc_fixes.h"
#include "project.h"
#include "CallGraph.h"
#include "DccFrontend.h"
#include <cstring>
#include <iostream>
#include <QtCore/QCoreApplication>
#include <QCommandLineParser>
#include <QtCore/QFile>
#include <string.h>
#ifdef __UNIX__
#include <unistd.h>
#else
#include <stdio.h>
#include <io.h> /* For unlink() */
#endif
/* Global variables - extern to other modules */
extern QString asm1_name, asm2_name; /* Assembler output filenames */
extern SYMTAB symtab; /* Global symbol table */
extern STATS stats; /* cfg statistics */
extern OPTION option; /* Command line options */
char *progname; /* argv[0] - for error msgs */
char *asm1_name, *asm2_name; /* Assembler output filenames */
SYMTAB symtab; /* Global symbol table */
STATS stats; /* cfg statistics */
PROG prog; /* programs fields */
OPTION option; /* Command line options */
PPROC pProcList; /* List of procedures, topologically sorted */
PPROC pLastProc; /* Pointer to last node in procedure list */
CALL_GRAPH *callGraph; /* Call graph of the program */
static char *initargs(int argc, char *argv[]);
static void displayTotalStats();
/****************************************************************************
* main
***************************************************************************/
void setupOptions(const QCoreApplication &app) {
//[-a1a2cmsi]
QCommandLineParser parser;
parser.setApplicationDescription("dcc");
parser.addHelpOption();
//parser.addVersionOption();
//QCommandLineOption showProgressOption("p", QCoreApplication::translate("main", "Show progress during copy"));
QCommandLineOption boolOpts[] {
QCommandLineOption {"v", QCoreApplication::translate("main", "verbose")},
QCommandLineOption {"V", QCoreApplication::translate("main", "very verbose")},
QCommandLineOption {"c", QCoreApplication::translate("main", "Follow register indirect calls")},
QCommandLineOption {"m", QCoreApplication::translate("main", "Print memory maps of program")},
QCommandLineOption {"s", QCoreApplication::translate("main", "Print stats")}
};
for(QCommandLineOption &o : boolOpts) {
parser.addOption(o);
}
QCommandLineOption assembly("a", QCoreApplication::translate("main", "Produce assembly"),"assembly_level");
QCommandLineOption targetFileOption(QStringList() << "o" << "output",
QCoreApplication::translate("main", "Place output into <file>."),
QCoreApplication::translate("main", "file"));
QCommandLineOption entryPointOption(QStringList() << "E",
QCoreApplication::translate("main", "Custom entry point as hex"),
QCoreApplication::translate("main", "offset"),
"0"
);
parser.addOption(targetFileOption);
parser.addOption(assembly);
parser.addOption(entryPointOption);
//parser.addOption(forceOption);
// Process the actual command line arguments given by the user
parser.addPositionalArgument("source", QCoreApplication::translate("main", "DOS executable file to decompile."));
parser.process(app);
const QStringList args = parser.positionalArguments();
if(args.empty()) {
parser.showHelp();
}
// The executable to decompile is args.at(0)
option.verbose = parser.isSet(boolOpts[0]);
option.VeryVerbose = parser.isSet(boolOpts[1]);
if(parser.isSet(assembly)) {
option.asm1 = parser.value(assembly).toInt()==1;
option.asm2 = parser.value(assembly).toInt()==2;
}
option.Map = parser.isSet(boolOpts[3]);
option.Stats = parser.isSet(boolOpts[4]);
option.Interact = false;
option.Calls = parser.isSet(boolOpts[2]);
option.filename = args.first();
option.CustomEntryPoint = parser.value(entryPointOption).toUInt(nullptr,16);
if(parser.isSet(targetFileOption))
asm1_name = asm2_name = parser.value(targetFileOption);
else if(option.asm1 or option.asm2) {
asm1_name = option.filename+".a1";
asm2_name = option.filename+".a2";
}
}
int main(int argc, char **argv)
int main(int argc, char *argv[])
{
QCoreApplication app(argc,argv);
QCoreApplication::setApplicationVersion("0.1");
setupOptions(app);
/* Extract switches and filename */
strcpy(option.filename, initargs(argc, argv));
/* Front end reads in EXE or COM file, parses it into I-code while
* building the call graph and attaching appropriate bits of code for
* each procedure.
*/
Project::get()->create(option.filename);
FrontEnd (option.filename, &callGraph);
DccFrontend fe(&app);
if(not Project::get()->load()) {
return -1;
}
if (option.verbose)
Project::get()->prog.displayLoadInfo();
if(false==fe.FrontEnd ())
return -1;
if(option.asm1)
return 0;
/* In the middle is a so called Universal Decompiling Machine.
* It processes the procedure list and I-code and attaches where it can
* to each procedure an optimised cfg and ud lists
*/
udm();
if(option.asm2)
return 0;
/* Back end converts each procedure into C using I-code, interval
* analysis, data flow etc. and outputs it to output file ready for
* re-compilation.
*/
BackEnd(Project::get()->callGraph);
BackEnd(option.filename, callGraph);
Project::get()->callGraph->write();
writeCallGraph (callGraph);
if (option.Stats)
displayTotalStats();
/*
freeDataStructures(pProcList);
*/
return 0;
}
/****************************************************************************
* initargs - Extract command line arguments
***************************************************************************/
static char *initargs(int argc, char *argv[])
{
char *pc;
progname = *argv; /* Save invocation name for error messages */
while (--argc > 0 && (*++argv)[0] == '-') {
for (pc = argv[0]+1; *pc; pc++)
switch (*pc) {
case 'a': /* Print assembler listing */
if (*(pc+1) == '2')
option.asm2 = TRUE;
else
option.asm1 = TRUE;
if (*(pc+1) == '1' || *(pc+1) == '2')
pc++;
break;
case 'c':
option.Calls = TRUE;
break;
case 'i':
option.Interact = TRUE;
break;
case 'm': /* Print memory map */
option.Map = TRUE;
break;
case 's': /* Print Stats */
option.Stats = TRUE;
break;
case 'V': /* Very verbose => verbose */
option.VeryVerbose = TRUE;
case 'v': /* Make everything verbose */
option.verbose = TRUE;
break;
case 'o': /* assembler output file */
if (*(pc+1)) {
asm1_name = asm2_name = pc+1;
goto NextArg;
}
else if (--argc > 0) {
asm1_name = asm2_name = *++argv;
goto NextArg;
}
default:
fatalError(INVALID_ARG, *pc);
return *argv;
}
NextArg:;
}
if (argc == 1)
{
if (option.asm1 || option.asm2)
{
if (! asm1_name)
{
asm1_name = strcpy((char*)allocMem(strlen(*argv)+4), *argv);
pc = strrchr(asm1_name, '.');
if (pc > strrchr(asm1_name, '/'))
{
*pc = '\0';
}
asm2_name = (char*)allocMem(strlen(asm1_name)+4) ;
strcat(strcpy(asm2_name, asm1_name), ".a2");
unlink(asm2_name);
strcat(asm1_name, ".a1");
}
unlink(asm1_name); /* Remove asm output files */
}
return *argv; /* filename of the program to decompile */
}
fatalError(USAGE);
return *argv;
}
static void
displayTotalStats ()
/* Displays final statistics for the complete program */
{
printf ("\nFinal Program Statistics\n");
printf (" Total number of low-level Icodes : %d\n", stats.totalLL);
printf (" Total number of high-level Icodes: %d\n", stats.totalHL);
printf (" Total number of low-level Icodes : %ld\n", stats.totalLL);
printf (" Total number of high-level Icodes: %ld\n", stats.totalHL);
printf (" Total reduction of instructions : %2.2f%%\n", 100.0 -
(stats.totalHL * 100.0) / stats.totalLL);
}
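The two argument-handling variants above derive the default assembler output names differently: initargs() strips the executable's extension before appending ".a1"/".a2", while setupOptions() appends the suffix to the full file name. A minimal sketch of both rules, assuming std::string in place of char* and QString; `strippedAsmNames` and `appendedAsmNames` are hypothetical names introduced for illustration:

```cpp
#include <cassert>
#include <string>
#include <utility>

// initargs()-style rule: drop the extension, but only when the last '.'
// comes after the last '/', i.e. it belongs to the file name itself.
std::pair<std::string, std::string> strippedAsmNames(const std::string &exe)
{
    assert(!exe.empty());
    std::string base = exe;
    const std::string::size_type dot = base.rfind('.');
    const std::string::size_type slash = base.rfind('/');
    if (dot != std::string::npos &&
        (slash == std::string::npos || dot > slash))
        base.erase(dot);
    return {base + ".a1", base + ".a2"};
}

// setupOptions()-style rule: keep the full name and append the suffix.
std::pair<std::string, std::string> appendedAsmNames(const std::string &exe)
{
    assert(!exe.empty());
    return {exe + ".a1", exe + ".a2"};
}
```

For an input such as "tests/fibo.exe" the first rule yields "tests/fibo.a1" and the second "tests/fibo.exe.a1", so listings produced by the two variants land in different files.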

View File

@ -1,70 +0,0 @@
#include "dcc_interface.h"
#include "dcc.h"
#include "project.h"
struct DccImpl : public IDcc {
ilFunction m_current_func;
// IDcc interface
public:
void BaseInit()
{
m_current_func = Project::get()->functions().end();
}
void Init(QObject *tgt)
{
}
ilFunction GetFirstFuncHandle()
{
return Project::get()->functions().begin();
}
ilFunction GetCurFuncHandle()
{
return m_current_func;
}
void analysis_Once()
{
}
void load(QString name)
{
option.filename = name;
Project::get()->create(name);
}
void prtout_asm(IXmlTarget *, int level)
{
}
void prtout_cpp(IXmlTarget *, int level)
{
}
size_t getFuncCount()
{
return Project::get()->functions().size();
}
const lFunction &validFunctions() const
{
return Project::get()->functions();
}
void SetCurFunc_by_Name(QString v)
{
lFunction & funcs(Project::get()->functions());
for(auto iter=funcs.begin(),fin=funcs.end(); iter!=fin; ++iter) {
if(iter->name==v) {
m_current_func = iter;
return;
}
}
}
QDir installDir() {
return QDir(".");
}
QDir dataDir(QString kind) { // return directory containing decompilation helper data -> signatures/includes/etc.
QDir res(installDir());
res.cd(kind);
return res;
}
};
IDcc* IDcc::get() {
static IDcc *v=0;
if(nullptr == v)
v = new DccImpl;
return v;
}

File diff suppressed because it is too large

View File

@ -3,59 +3,63 @@
* (C) Cristina Cifuentes
***************************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <map>
#include <string>
#include <stdarg.h>
#include "dcc.h"
static const std::map<eErrorId,std::string> errorMessage =
{
{INVALID_ARG ,"Invalid option -%c\n"},
{INVALID_OPCODE ,"Invalid instruction %02X at location %06lX\n"},
{INVALID_386OP ,"Don't understand 80386 instruction %02X at location %06lX\n"},
{FUNNY_SEGOVR ,"Segment override with no memory operand at location %06lX\n"},
{FUNNY_REP ,"REP prefix without a string instruction at location %06lX\n"},
{CANNOT_OPEN ,"Cannot open %s\n"},
{CANNOT_READ ,"Error while reading %s\n"},
{MALLOC_FAILED ,"malloc of %ld bytes failed\n"},
{NEWEXE_FORMAT ,"Don't understand new EXE format\n"},
{NO_BB ,"Failed to find a BB for jump to %ld in proc %s\n"},
{INVALID_SYNTHETIC_BB,"Basic Block is a synthetic jump\n"},
{INVALID_INT_BB ,"Failed to find a BB for interval\n"},
{IP_OUT_OF_RANGE ,"Instruction at location %06lX goes beyond loaded image\n"},
{DEF_NOT_FOUND ,"Definition not found for condition code usage at opcode %d\n"},
{JX_NOT_DEF ,"JX use, definition not supported at opcode #%d\n"},
{NOT_DEF_USE ,"%x: Def - use not supported. Def op = %d, use op = %d.\n"},
{REPEAT_FAIL ,"Failed to construct repeat..until() condition.\n"},
{WHILE_FAIL ,"Failed to construct while() condition.\n"},
#include <stdio.h>
#include <stdlib.h>
//#ifndef __UNIX__
#if 1
#include <stdarg.h>
#else
#include <varargs.h>
#endif
static char *errorMessage[] = {
"Invalid option -%c\n", /* INVALID_ARG */
"Invalid instruction %02X at location %06lX\n", /* INVALID_OPCODE */
"Don't understand 80386 instruction %02X at location %06lX\n",
/* INVALID_386OP */
"Segment override with no memory operand at location %06lX\n",
/* FUNNY_SEGOVR */
"REP prefix without a string instruction at location %06lX\n",/* FUNNY_REP */
"Cannot open %s\n", /* CANNOT_OPEN */
"Error while reading %s\n", /* CANNOT_READ */
"malloc of %ld bytes failed\n", /* MALLOC_FAILED */
"Don't understand new EXE format\n", /* NEWEXE_FORMAT */
"Failed to find a BB for jump to %ld in proc %s\n", /* NO_BB */
"Basic Block is a synthetic jump\n", /* INVALID_SYNTHETIC_BB */
"Failed to find a BB for interval\n", /* INVALID_INT_BB */
"Instruction at location %06lX goes beyond loaded image\n",
/* IP_OUT_OF_RANGE*/
"Definition not found for condition code usage at opcode %d\n",
/* DEF_NOT_FOUND */
"JX use, definition not supported at opcode #%d\n", /* JX_NOT_DEF */
"Def - use not supported. Def op = %d, use op = %d.\n", /* NOT_DEF_USE */
"Failed to construct repeat..until() condition.\n", /* REPEAT_FAIL */
"Failed to construct while() condition.\n", /* WHILE_FAIL */
};
/****************************************************************************
fatalError: displays error message and exits the program.
****************************************************************************/
void fatalError(eErrorId errId, ...)
{
va_list args;
//#ifdef __UNIX__ /* ultrix */
void fatalError(Int errId, ...)
{ va_list args;
//#ifdef __UNIX__ /* ultrix */
#if 0
int errId;
Int errId;
va_start(args);
errId = va_arg(args, int);
errId = va_arg(args, Int);
#else
va_start(args, errId);
#endif
if (errId == USAGE)
fprintf(stderr,"Usage: dcc [-a1a2cmpsvVi][-o asmfile] DOS_executable\n");
fprintf(stderr,"Usage: dcc [-a1a2cmpsvVi][-o asmfile] DOS_executable\n");
else {
auto msg_iter = errorMessage.find(errId);
assert(msg_iter!=errorMessage.end());
fprintf(stderr, "dcc: ");
vfprintf(stderr, msg_iter->second.c_str(), args);
vfprintf(stderr, errorMessage[errId - 1], args);
}
va_end(args);
exit((int)errId);
@ -65,21 +69,18 @@ void fatalError(eErrorId errId, ...)
/****************************************************************************
reportError: reports the warning/error and continues with the program.
****************************************************************************/
void reportError(eErrorId errId, ...)
{
va_list args;
//#ifdef __UNIX__ /* ultrix */
void reportError(Int errId, ...)
{ va_list args;
//#ifdef __UNIX__ /* ultrix */
#if 0
int errId;
Int errId;
va_start(args);
errId = va_arg(args, int);
errId = va_arg(args, Int);
#else /* msdos or windows*/
va_start(args, errId);
#endif
fprintf(stderr, "dcc: ");
auto msg_iter = errorMessage.find(errId);
assert(msg_iter!=errorMessage.end());
vfprintf(stderr, msg_iter->second.c_str(), args);
vfprintf(stderr, errorMessage[errId - 1], args);
va_end(args);
}
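The map-based variant of fatalError/reportError looks the format string up by error id instead of indexing an array with errId - 1. A minimal sketch of that lookup, under these assumptions: `formatError` is a hypothetical helper that returns the message rather than printing it, the enumerator values are illustrative (the real ones live in dcc.h), only two of the messages are reproduced, and an unknown id yields a fallback string where the listing uses assert. The function takes `int` rather than the enum type so that va_start is applied to a parameter that is unaffected by default argument promotions.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <map>
#include <string>

// Illustrative subset of the error ids; values are assumptions.
enum ErrorId { INVALID_ARG = 1, CANNOT_OPEN = 6 };

static const std::map<int, std::string> kMessages = {
    {INVALID_ARG, "Invalid option -%c\n"},
    {CANNOT_OPEN, "Cannot open %s\n"},
};

// Formats the message for `id` with printf-style arguments.
std::string formatError(int id, ...)
{
    const auto it = kMessages.find(id);
    if (it == kMessages.end())
        return "dcc: unknown error\n";   // sketch-only fallback

    char buf[256];
    va_list args;
    va_start(args, id);
    const int n = std::vsnprintf(buf, sizeof buf, it->second.c_str(), args);
    va_end(args);
    assert(n >= 0);                      // encoding errors are not expected
    (void)n;
    return std::string("dcc: ") + buf;
}
```

A lookup by key stays correct when enumerators are reordered or new ones are inserted, which the positional array cannot guarantee.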

View File

@ -9,32 +9,38 @@
* *
\* * * * * * * * * * * * */
#include "msvc_fixes.h"
#include <memory.h>
#include <stdint.h>
#ifndef PATLEN
#define PATLEN 23
#define WILD 0xF4
#endif
#ifndef bool
#define bool unsigned char
#define TRUE 1
#define FALSE 0
#define byte unsigned char
#endif
static int pc; /* Indexes into pat[] */
/* prototypes */
static bool ModRM(uint8_t pat[]); /* Handle the mod/rm uint8_t */
static bool TwoWild(uint8_t pat[]); /* Make the next 2 bytes wild */
static bool FourWild(uint8_t pat[]); /* Make the next 4 bytes wild */
void fixWildCards(uint8_t pat[]); /* Main routine */
static bool ModRM(byte pat[]); /* Handle the mod/rm byte */
static bool TwoWild(byte pat[]); /* Make the next 2 bytes wild */
static bool FourWild(byte pat[]); /* Make the next 4 bytes wild */
void fixWildCards(byte pat[]); /* Main routine */
/* Handle the mod/rm case. Returns true if pattern exhausted */
static bool ModRM(uint8_t pat[])
static bool
ModRM(byte pat[])
{
uint8_t op;
byte op;
/* A standard mod/rm uint8_t follows opcode */
op = pat[pc++]; /* The mod/rm uint8_t */
if (pc >= PATLEN) return true; /* Skip Mod/RM */
/* A standard mod/rm byte follows opcode */
op = pat[pc++]; /* The mod/rm byte */
if (pc >= PATLEN) return TRUE; /* Skip Mod/RM */
switch (op & 0xC0)
{
case 0x00: /* [reg] or [nnnn] */
@ -42,66 +48,66 @@ static bool ModRM(uint8_t pat[])
{
/* Uses [nnnn] address mode */
pat[pc++] = WILD;
if (pc >= PATLEN) return true;
if (pc >= PATLEN) return TRUE;
pat[pc++] = WILD;
if (pc >= PATLEN) return true;
if (pc >= PATLEN) return TRUE;
}
break;
case 0x40: /* [reg + nn] */
if ((pc+=1) >= PATLEN) return true;
if ((pc+=1) >= PATLEN) return TRUE;
break;
case 0x80: /* [reg + nnnn] */
/* Possibly just a long constant offset from a register,
but often will be an index from a variable */
pat[pc++] = WILD;
if (pc >= PATLEN) return true;
if (pc >= PATLEN) return TRUE;
pat[pc++] = WILD;
if (pc >= PATLEN) return true;
if (pc >= PATLEN) return TRUE;
break;
case 0xC0: /* reg */
break;
}
return false;
return FALSE;
}
/* Change the next two bytes to wild cards */
static bool
TwoWild(uint8_t pat[])
TwoWild(byte pat[])
{
pat[pc++] = WILD;
if (pc >= PATLEN) return true; /* Pattern exhausted */
if (pc >= PATLEN) return TRUE; /* Pattern exhausted */
pat[pc++] = WILD;
if (pc >= PATLEN) return true;
return false;
if (pc >= PATLEN) return TRUE;
return FALSE;
}
/* Change the next four bytes to wild cards */
static bool
FourWild(uint8_t pat[])
FourWild(byte pat[])
{
if(TwoWild(pat))
return true;
return TwoWild(pat);
TwoWild(pat);
return TwoWild(pat);
}
/* Chop from the current point by wiping with zeroes. Can't rely on anything
after this point */
static void
chop(uint8_t pat[])
chop(byte pat[])
{
if (pc >= PATLEN) return; /* Could go negative otherwise */
memset(&pat[pc], 0, PATLEN - pc);
}
static bool op0F(uint8_t pat[])
static bool
op0F(byte pat[])
{
/* The two uint8_t opcodes */
uint8_t op = pat[pc++];
/* The two byte opcodes */
byte op = pat[pc++];
switch (op & 0xF0)
{
case 0x00: /* 00 - 0F */
if (op >= 0x06) /* Clts, Invd, Wbinvd */
return false;
return FALSE;
else
{
/* Grp 6, Grp 7, LAR, LSL */
@ -111,10 +117,10 @@ static bool op0F(uint8_t pat[])
return ModRM(pat);
case 0x80:
pc += 2; /* uint16_t displacement cond jumps */
return false;
pc += 2; /* Word displacement cond jumps */
return FALSE;
case 0x90: /* uint8_t set on condition */
case 0x90: /* Byte set on condition */
return ModRM(pat);
case 0xA0:
@ -124,7 +130,7 @@ static bool op0F(uint8_t pat[])
case 0xA1: /* Pop FS */
case 0xA8: /* Push GS */
case 0xA9: /* Pop GS */
return false;
return FALSE;
case 0xA3: /* Bt Ev,Gv */
case 0xAB: /* Bts Ev,Gv */
@ -132,9 +138,9 @@ static bool op0F(uint8_t pat[])
case 0xA4: /* Shld EvGbIb */
case 0xAC: /* Shrd EvGbIb */
if (ModRM(pat)) return true;
if (ModRM(pat)) return TRUE;
pc++; /* The #num bits to shift */
return false;
return FALSE;
case 0xA5: /* Shld EvGb CL */
case 0xAD: /* Shrd EvGb CL */
@ -148,9 +154,9 @@ static bool op0F(uint8_t pat[])
if (op == 0xBA)
{
/* Grp 8: bt/bts/btr/btc Ev,#nn */
if (ModRM(pat)) return true;
if (ModRM(pat)) return TRUE;
pc++; /* The #num bits to shift */
return false;
return FALSE;
}
return ModRM(pat);
@ -161,10 +167,10 @@ static bool op0F(uint8_t pat[])
return ModRM(pat);
}
/* Else BSWAP */
return false;
return FALSE;
default:
return false; /* Treat as double uint8_t opcodes */
return FALSE; /* Treat as double byte opcodes */
}
@ -178,10 +184,11 @@ static bool op0F(uint8_t pat[])
processor is in 16 bit address mode (real mode).
PATLEN bytes are scanned.
*/
void fixWildCards(uint8_t pat[])
void
fixWildCards(byte pat[])
{
uint8_t op, quad, intArg;
byte op, quad, intArg;
pc=0;
@ -190,17 +197,17 @@ void fixWildCards(uint8_t pat[])
op = pat[pc++];
if (pc >= PATLEN) return;
quad = (uint8_t) (op & 0xC0); /* Quadrant of the opcode map */
quad = (byte) (op & 0xC0); /* Quadrant of the opcode map */
if (quad == 0)
{
/* Arithmetic group 00-3F */
if ((op & 0xE7) == 0x26) /* First check for the odds */
{
/* Segment prefix: treat as 1 uint8_t opcode */
/* Segment prefix: treat as 1 byte opcode */
continue;
}
if (op == 0x0F) /* 386 2 uint8_t opcodes */
if (op == 0x0F) /* 386 2 byte opcodes */
{
if (op0F(pat)) return;
continue;
@ -211,20 +218,20 @@ void fixWildCards(uint8_t pat[])
/* All these are constant. Work out the instr length */
if (op & 2)
{
/* Push, pop, other 1 uint8_t opcodes */
/* Push, pop, other 1 byte opcodes */
continue;
}
else
{
if (op & 1)
{
/* uint16_t immediate operands */
/* Word immediate operands */
pc += 2;
continue;
}
else
{
/* uint8_t immediate operands */
/* Byte immediate operands */
pc++;
continue;
}
@ -250,7 +257,7 @@ void fixWildCards(uint8_t pat[])
/* 0x60 - 0x70 */
if (op & 0x10)
{
/* 70-7F 2 uint8_t jump opcodes */
/* 70-7F 2 byte jump opcodes */
pc++;
continue;
}
@ -277,11 +284,11 @@ void fixWildCards(uint8_t pat[])
if (TwoWild(pat)) return;
continue;
case 0x68: /* Push uint8_t */
case 0x6A: /* Push uint8_t */
case 0x68: /* Push byte */
case 0x6A: /* Push byte */
case 0x6D: /* insb port */
case 0x6F: /* outsb port */
/* 2 uint8_t instr, no wilds */
/* 2 byte instr, no wilds */
pc++;
continue;
@ -295,14 +302,14 @@ void fixWildCards(uint8_t pat[])
switch (op & 0xF0)
{
case 0x80: /* 80 - 8F */
/* All have a mod/rm uint8_t */
/* All have a mod/rm byte */
if (ModRM(pat)) return;
/* These also have immediate values */
switch (op)
{
case 0x80:
case 0x83:
/* One uint8_t immediate */
/* One byte immediate */
pc++;
continue;
@ -321,7 +328,7 @@ void fixWildCards(uint8_t pat[])
if (FourWild(pat)) return;
continue;
}
/* All others are 1 uint8_t opcodes */
/* All others are 1 byte opcodes */
continue;
case 0xA0: /* A0 - AF */
if ((op & 0x0C) == 0)
@ -332,11 +339,11 @@ void fixWildCards(uint8_t pat[])
}
else if ((op & 0xFE) == 0xA8)
{
/* test al,#uint8_t or test ax,#uint16_t */
/* test al,#byte or test ax,#word */
if (op & 1) pc += 2;
else pc += 1;
continue;
}
case 0xB0: /* B0 - BF */
{
@ -361,10 +368,10 @@ void fixWildCards(uint8_t pat[])
/* In the last quadrant of the op code table */
switch (op)
{
case 0xC0: /* 386: Rotate group 2 ModRM, uint8_t, #uint8_t */
case 0xC1: /* 386: Rotate group 2 ModRM, uint16_t, #uint8_t */
case 0xC0: /* 386: Rotate group 2 ModRM, byte, #byte */
case 0xC1: /* 386: Rotate group 2 ModRM, word, #byte */
if (ModRM(pat)) return;
/* uint8_t immediate value follows ModRM */
/* Byte immediate value follows ModRM */
pc++;
continue;
@ -385,29 +392,29 @@ void fixWildCards(uint8_t pat[])
case 0xC6: /* Mov ModRM, #nn */
if (ModRM(pat)) return;
/* uint8_t immediate value follows ModRM */
/* Byte immediate value follows ModRM */
pc++;
continue;
case 0xC7: /* Mov ModRM, #nnnn */
if (ModRM(pat)) return;
/* uint16_t immediate value follows ModRM */
/* Word immediate value follows ModRM */
/* Immediate 16 bit values might be constant, but also
might be relocatable. For now, make them wild */
if (TwoWild(pat)) return;
continue;
case 0xC8: /* Enter Iw, Ib */
pc += 3; /* Constant uint16_t, uint8_t */
pc += 3; /* Constant word, byte */
continue;
case 0xC9: /* Leave */
continue;
case 0xCC: /* int 3 */
case 0xCC: /* Int 3 */
continue;
case 0xCD: /* int nn */
case 0xCD: /* Int nn */
intArg = pat[pc++];
if ((intArg >= 0x34) and (intArg <= 0x3B))
if ((intArg >= 0x34) && (intArg <= 0x3B))
{
/* Borland/Microsoft FP emulations */
if (ModRM(pat)) return;
@ -420,10 +427,10 @@ void fixWildCards(uint8_t pat[])
case 0xCF: /* Iret */
continue;
case 0xD0: /* Group 2 rotate, uint8_t, 1 bit */
case 0xD1: /* Group 2 rotate, uint16_t, 1 bit */
case 0xD2: /* Group 2 rotate, uint8_t, CL bits */
case 0xD3: /* Group 2 rotate, uint16_t, CL bits */
case 0xD0: /* Group 2 rotate, byte, 1 bit */
case 0xD1: /* Group 2 rotate, word, 1 bit */
case 0xD2: /* Group 2 rotate, byte, CL bits */
case 0xD3: /* Group 2 rotate, word, CL bits */
if (ModRM(pat)) return;
continue;
@ -495,8 +502,8 @@ void fixWildCards(uint8_t pat[])
case 0xFD: /* Std */
continue;
case 0xF6: /* Group 3 uint8_t test/not/mul/div */
case 0xF7: /* Group 3 uint16_t test/not/mul/div */
case 0xF6: /* Group 3 byte test/not/mul/div */
case 0xF7: /* Group 3 word test/not/mul/div */
case 0xFE: /* Inc/Dec group 4 */
if (ModRM(pat)) return;
continue;
@ -506,7 +513,7 @@ void fixWildCards(uint8_t pat[])
if (ModRM(pat)) return;
continue;
default: /* Rest are single uint8_t opcodes */
default: /* Rest are single byte opcodes */
continue;
}
}
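The ModRM handling above decides, from the top two bits of the mod/rm byte, which operand bytes may differ between otherwise identical routines and must therefore become wildcards. A minimal sketch of that decision for 16-bit addressing, with assumptions stated: `wildModRM` is a hypothetical name, the pattern is a std::vector with an explicit position instead of the file-static `pc`, and the mod=00, r/m=110 test for the direct [nnnn] form is taken from the x86 encoding rules because the corresponding condition falls in a gap between hunks here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint8_t kWild = 0xF4;   // same value as WILD in the listing

// Processes the mod/rm byte at pat[pos] and returns the index of the first
// byte after its operand, clamped to pat.size() when the pattern runs out.
std::size_t wildModRM(std::vector<uint8_t> &pat, std::size_t pos)
{
    assert(pos <= pat.size());
    if (pos >= pat.size())
        return pat.size();

    const uint8_t modrm = pat[pos++];
    std::size_t wildBytes = 0;    // displacement bytes to overwrite
    std::size_t skipBytes = 0;    // displacement bytes kept as they are

    switch (modrm & 0xC0) {
    case 0x00:                    // [reg], or [nnnn] when r/m == 110b
        if ((modrm & 0x07) == 0x06)
            wildBytes = 2;
        break;
    case 0x40:                    // [reg + nn]: short offset, kept
        skipBytes = 1;
        break;
    case 0x80:                    // [reg + nnnn]: often a variable address
        wildBytes = 2;
        break;
    default:                      // 0xC0: register operand, nothing follows
        break;
    }

    for (; wildBytes > 0 && pos < pat.size(); --wildBytes)
        pat[pos++] = kWild;
    pos += skipBytes;
    return pos < pat.size() ? pos : pat.size();
}
```

Sixteen-bit displacements are wiped because they usually hold addresses that change at link time, while 8-bit displacements are typically small frame offsets that stay stable across builds.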

View File

@ -1,353 +1,497 @@
/*****************************************************************************
* dcc project Front End module
* Loads a program into simulated main memory and builds the procedure list.
* (C) Cristina Cifuentes
****************************************************************************/
* dcc project Front End module
* Loads a program into simulated main memory and builds the procedure list.
* (C) Cristina Cifuentes
****************************************************************************/
#include "dcc.h"
#include "disassem.h"
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdlib.h>
#include <string.h>
#ifdef __BORLAND__
#include <alloc.h>
#else
#include <malloc.h> /* For malloc, free, realloc */
#include "project.h"
class Loader
{
bool loadIntoProject(IProject *);
};
#endif
typedef struct { /* PSP structure */
uint16_t int20h; /* interrupt 20h */
uint16_t eof; /* segment, end of allocation block */
uint8_t res1; /* reserved */
uint8_t dosDisp[5]; /* far call to DOS function dispatcher */
uint8_t int22h[4]; /* vector for terminate routine */
uint8_t int23h[4]; /* vector for ctrl+break routine */
uint8_t int24h[4]; /* vector for error routine */
uint8_t res2[22]; /* reserved */
uint16_t segEnv; /* segment address of environment block */
uint8_t res3[34]; /* reserved */
uint8_t int21h[6]; /* opcode for int21h and far return */
uint8_t res4[6]; /* reserved */
uint8_t fcb1[16]; /* default file control block 1 */
uint8_t fcb2[16]; /* default file control block 2 */
uint8_t res5[4]; /* reserved */
uint8_t cmdTail[0x80]; /* command tail and disk transfer area */
word int20h; /* interrupt 20h */
word eof; /* segment, end of allocation block */
byte res1; /* reserved */
byte dosDisp[5]; /* far call to DOS function dispatcher */
byte int22h[4]; /* vector for terminate routine */
byte int23h[4]; /* vector for ctrl+break routine */
byte int24h[4]; /* vector for error routine */
byte res2[22]; /* reserved */
word segEnv; /* segment address of environment block */
byte res3[34]; /* reserved */
byte int21h[6]; /* opcode for int21h and far return */
byte res4[6]; /* reserved */
byte fcb1[16]; /* default file control block 1 */
byte fcb2[16]; /* default file control block 2 */
byte res5[4]; /* reserved */
byte cmdTail[0x80]; /* command tail and disk transfer area */
} PSP;
static struct { /* EXE file header */
uint8_t sigLo; /* .EXE signature: 0x4D 0x5A */
uint8_t sigHi;
uint16_t lastPageSize; /* Size of the last page */
uint16_t numPages; /* Number of pages in the file */
uint16_t numReloc; /* Number of relocation items */
uint16_t numParaHeader; /* # of paragraphs in the header */
uint16_t minAlloc; /* Minimum number of paragraphs */
uint16_t maxAlloc; /* Maximum number of paragraphs */
uint16_t initSS; /* Segment displacement of stack */
uint16_t initSP; /* Contents of SP at entry */
uint16_t checkSum; /* Complemented checksum */
uint16_t initIP; /* Contents of IP at entry */
uint16_t initCS; /* Segment displacement of code */
uint16_t relocTabOffset; /* Relocation table offset */
uint16_t overlayNum; /* Overlay number */
byte sigLo; /* .EXE signature: 0x4D 0x5A */
byte sigHi;
word lastPageSize; /* Size of the last page */
word numPages; /* Number of pages in the file */
word numReloc; /* Number of relocation items */
word numParaHeader; /* # of paragraphs in the header */
word minAlloc; /* Minimum number of paragraphs */
word maxAlloc; /* Maximum number of paragraphs */
word initSS; /* Segment displacement of stack */
word initSP; /* Contents of SP at entry */
word checkSum; /* Complemented checksum */
word initIP; /* Contents of IP at entry */
word initCS; /* Segment displacement of code */
word relocTabOffset; /* Relocation table offset */
word overlayNum; /* Overlay number */
} header;
#define EXE_RELOCATION 0x10 /* EXE images relocated to above PSP */
//static void LoadImage(char *filename);
static void LoadImage(char *filename);
static void displayLoadInfo(void);
static void displayMemMap(void);
/*****************************************************************************
* FrontEnd - invokes the loader, parser, disassembler (if asm1), icode
* rewriter, and displays any useful information.
****************************************************************************/
bool DccFrontend::FrontEnd ()
* FrontEnd - invokes the loader, parser, disassembler (if asm1), icode
* rewriter, and displays any useful information.
****************************************************************************/
void FrontEnd (char *filename, PCALL_GRAPH *pcallGraph)
{
Project::get()->callGraph = nullptr;
Project::get()->create(m_fname);
PPROC pProc;
PSYM psym;
Int i, c;
/* Load program into memory */
LoadImage(filename);
/* Load program into memory */
LoadImage(*Project::get());
if (option.verbose)
{
displayLoadInfo();
}
if (option.verbose)
displayLoadInfo();
/* Do depth first flow analysis building call graph and procedure list,
* and attaching the I-code to each procedure */
parse (pcallGraph);
/* Do depth first flow analysis building call graph and procedure list,
* and attaching the I-code to each procedure */
parse (*Project::get());
if (option.asm1)
{
printf("dcc: writing assembler file %s\n", asm1_name);
}
if (option.asm1)
{
printf("dcc: writing assembler file %s\n", asm1_name.c_str());
}
/* Search through code looking for impure references and flag them */
for (pProc = pProcList; pProc; pProc = pProc->next)
{
for (i = 0; i < pProc->Icode.GetNumIcodes(); i++)
{
if (pProc->Icode.GetLlFlag(i) & (SYM_USE | SYM_DEF))
{
psym = &symtab.sym[pProc->Icode.GetIcode(i)->ic.ll.caseTbl.numEntries];
for (c = (Int)psym->label; c < (Int)psym->label+psym->size; c++)
{
if (BITMAP(c, BM_CODE))
{
pProc->Icode.SetLlFlag(i, IMPURE);
pProc->flg |= IMPURE;
break;
}
}
}
}
/* Print assembler listing */
if (option.asm1)
disassem(1, pProc);
}
/* Search through code looking for impure references and flag them */
Disassembler ds(1);
for(Function &f : Project::get()->pProcList)
{
f.markImpure();
if (option.asm1)
{
ds.disassem(&f);
}
}
if (option.Interact)
{
interactDis(&Project::get()->pProcList.front(), 0); /* Interactive disassembler */
}
if (option.Interact)
{
interactDis(pProcList, 0); /* Interactive disassembler */
}
/* Converts jump target addresses to icode offsets */
for(Function &f : Project::get()->pProcList)
{
f.bindIcodeOff();
}
/* Print memory bitmap */
if (option.Map)
displayMemMap();
return(true); // we no longer own proj !
/* Converts jump target addresses to icode offsets */
for (pProc = pProcList; pProc; pProc = pProc->next)
bindIcodeOff (pProc);
/* Print memory bitmap */
if (option.Map)
displayMemMap();
}
/****************************************************************************
* displayLoadInfo - Displays low level loader type info.
***************************************************************************/
* displayLoadInfo - Displays low level loader type info.
***************************************************************************/
static void displayLoadInfo(void)
{
PROG &prog(Project::get()->prog);
int i;
Int i;
printf("File type is %s\n", (prog.fCOM)?"COM":"EXE");
if (! prog.fCOM) {
printf("Signature = %02X%02X\n", header.sigLo, header.sigHi);
printf("File size %% 512 = %04X\n", LH(&header.lastPageSize));
printf("File size / 512 = %04X pages\n", LH(&header.numPages));
printf("# relocation items = %04X\n", LH(&header.numReloc));
printf("Offset to load image = %04X paras\n", LH(&header.numParaHeader));
printf("Minimum allocation = %04X paras\n", LH(&header.minAlloc));
printf("Maximum allocation = %04X paras\n", LH(&header.maxAlloc));
}
printf("Load image size = %04" PRIiPTR "\n", prog.cbImage - sizeof(PSP));
printf("Initial SS:SP = %04X:%04X\n", prog.initSS, prog.initSP);
printf("Initial CS:IP = %04X:%04X\n", prog.initCS, prog.initIP);
printf("File type is %s\n", (prog.fCOM)?"COM":"EXE");
if (! prog.fCOM) {
printf("Signature = %02X%02X\n", header.sigLo, header.sigHi);
printf("File size %% 512 = %04X\n", LH(&header.lastPageSize));
printf("File size / 512 = %04X pages\n", LH(&header.numPages));
printf("# relocation items = %04X\n", LH(&header.numReloc));
printf("Offset to load image = %04X paras\n", LH(&header.numParaHeader));
printf("Minimum allocation = %04X paras\n", LH(&header.minAlloc));
printf("Maximum allocation = %04X paras\n", LH(&header.maxAlloc));
}
printf("Load image size = %04X\n", prog.cbImage - sizeof(PSP));
printf("Initial SS:SP = %04X:%04X\n", prog.initSS, prog.initSP);
printf("Initial CS:IP = %04X:%04X\n", prog.initCS, prog.initIP);
if (option.VeryVerbose && prog.cReloc)
{
printf("\nRelocation Table\n");
for (i = 0; i < prog.cReloc; i++)
{
printf("%06X -> [%04X]\n", prog.relocTable[i],LH(prog.image() + prog.relocTable[i]));
}
}
printf("\n");
if (option.VeryVerbose && prog.cReloc)
{
printf("\nRelocation Table\n");
for (i = 0; i < prog.cReloc; i++)
{
printf("%06X -> [%04X]\n", prog.relocTable[i],
LH(prog.Image + prog.relocTable[i]));
}
}
printf("\n");
}
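The relocation table printed by displayLoadInfo() holds image offsets, not raw seg:offset pairs. The following is a minimal sketch of that conversion, assuming LH() reads a little-endian 16-bit value and that each table entry is stored as offset first, then segment. The names readLE16, relocEntryToImageOffset and kExeRelocation are hypothetical and do not appear in dcc.

```cpp
#include <cassert>
#include <cstdint>

// Assumed to mirror EXE_RELOCATION in the listing: the image is treated as
// loaded 0x10 paragraphs (256 bytes, room for a PSP) above segment 0.
constexpr uint32_t kExeRelocation = 0x10;

// Read a little-endian 16-bit value, as the LH() macro appears to do.
inline uint16_t readLE16(const uint8_t *p)
{
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

// Convert one 4-byte relocation entry (offset, then segment) into an
// offset into the in-memory image.
inline uint32_t relocEntryToImageOffset(const uint8_t *entry)
{
    assert(entry != nullptr);
    const uint32_t offset  = readLE16(entry);
    const uint32_t segment = readLE16(entry + 2);
    return offset + ((segment + kExeRelocation) << 4);
}
```

For an entry with bytes 34 12 02 00 (offset 0x1234, segment 0x0002) this yields 0x1234 + (0x12 << 4) = 0x1354.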
/*****************************************************************************
* fill - Fills line for displayMemMap()
****************************************************************************/
static void fill(int ip, char *bf)
* fill - Fills line for displayMemMap()
****************************************************************************/
static void fill(Int ip, char *bf)
{
PROG &prog(Project::get()->prog);
static uint8_t type[4] = {'.', 'd', 'c', 'x'};
uint8_t i;
static byte type[4] = {'.', 'd', 'c', 'x'};
byte i;
for (i = 0; i < 16; i++, ip++)
{
*bf++ = ' ';
*bf++ = (ip < prog.cbImage)? type[(prog.map[ip >> 2] >> ((ip & 3) * 2)) & 3]: ' ';
}
*bf = '\0';
for (i = 0; i < 16; i++, ip++)
{
*bf++ = ' ';
*bf++ = (ip < prog.cbImage)?
type[(prog.map[ip >> 2] >> ((ip & 3) * 2)) & 3]: ' ';
}
*bf = '\0';
}
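The indexing expression in fill() packs four image bytes into each map byte, two bits per image byte. A small sketch of the read and write sides of that layout is given here; getMapBits and setMapBits are hypothetical helper names, and the meaning of each 2-bit code is left to the BM_* constants defined elsewhere in dcc.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Return the 2-bit code stored for image offset ip.
// Same arithmetic as fill(): byte index ip >> 2, bit shift (ip & 3) * 2.
inline unsigned getMapBits(const std::vector<uint8_t> &map, unsigned ip)
{
    assert((ip >> 2) < map.size());
    return (map[ip >> 2] >> ((ip & 3) * 2)) & 3;
}

// Overwrite the 2-bit code for image offset ip, leaving its three
// neighbours in the same map byte untouched.
inline void setMapBits(std::vector<uint8_t> &map, unsigned ip, unsigned bits)
{
    assert((ip >> 2) < map.size());
    const unsigned shift = (ip & 3) * 2;
    const unsigned cleared = map[ip >> 2] & ~(3u << shift);
    map[ip >> 2] = static_cast<uint8_t>(cleared | ((bits & 3u) << shift));
}
```

This layout is why LoadImage() sizes the map as (cbImage + 3) / 4 bytes.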
/*****************************************************************************
* displayMemMap - Displays the memory bitmap
****************************************************************************/
* displayMemMap - Displays the memory bitmap
****************************************************************************/
static void displayMemMap(void)
{
PROG &prog(Project::get()->prog);
char c, b1[33], b2[33], b3[33];
byte i;
Int ip = 0;
char c, b1[33], b2[33], b3[33];
uint8_t i;
int ip = 0;
printf("\nMemory Map\n");
while (ip < prog.cbImage)
{
fill(ip, b1);
printf("%06X %s\n", ip, b1);
ip += 16;
for (i = 3, c = b1[1]; i < 32 && c == b1[i]; i += 2)
; /* Check if all same */
if (i > 32)
{
fill(ip, b2); /* Skip until next two are not same */
fill(ip+16, b3);
if (! (strcmp(b1, b2) || strcmp(b1, b3)))
{
printf(" :\n");
do
{
ip += 16;
fill(ip+16, b1);
} while (! strcmp(b1, b2));
}
}
}
printf("\n");
printf("\nMemory Map\n");
while (ip < prog.cbImage)
{
fill(ip, b1);
printf("%06X %s\n", ip, b1);
ip += 16;
for (i = 3, c = b1[1]; i < 32 && c == b1[i]; i += 2)
; /* Check if all same */
if (i > 32)
{
fill(ip, b2); /* Skip until next two are not same */
fill(ip+16, b3);
if (! (strcmp(b1, b2) || strcmp(b1, b3)))
{
printf(" :\n");
do
{
ip += 16;
fill(ip+16, b1);
} while (! strcmp(b1, b2));
}
}
}
printf("\n");
}
/*****************************************************************************
* LoadImage
****************************************************************************/
void DccFrontend::LoadImage(Project &proj)
* LoadImage
****************************************************************************/
static void LoadImage(char *filename)
{
PROG &prog(Project::get()->prog);
FILE *fp;
int i, cb;
uint8_t buf[4];
FILE *fp;
Int i, cb;
byte buf[4];
/* Open the input file */
if ((fp = fopen(proj.binary_path().c_str(), "rb")) == nullptr)
{
fatalError(CANNOT_OPEN, proj.binary_path().c_str());
}
/* Open the input file */
if ((fp = fopen(filename, "rb")) == NULL)
{
fatalError(CANNOT_OPEN, filename);
}
/* Read in first 2 bytes to check EXE signature */
if (fread(&header, 1, 2, fp) != 2)
{
fatalError(CANNOT_READ, proj.binary_path().c_str());
}
prog.fCOM = (header.sigLo != 0x4D || header.sigHi != 0x5A);
if (! prog.fCOM ) {
/* Read rest of header */
fseek(fp, 0, SEEK_SET);
if (fread(&header, sizeof(header), 1, fp) != 1)
{
fatalError(CANNOT_READ, proj.binary_path().c_str());
}
/* Read in first 2 bytes to check EXE signature */
if (fread(&header, 1, 2, fp) != 2)
{
fatalError(CANNOT_READ, filename);
}
/* This is a typical DOS kludge! */
if (LH(&header.relocTabOffset) == 0x40)
{
fatalError(NEWEXE_FORMAT);
}
if (! (prog.fCOM = (boolT)(header.sigLo != 0x4D || header.sigHi != 0x5A))) {
/* Read rest of header */
fseek(fp, 0, SEEK_SET);
if (fread(&header, sizeof(header), 1, fp) != 1)
{
fatalError(CANNOT_READ, filename);
}
/* Calculate the load module size.
* This is the number of pages in the file
* less the length of the header and reloc table
* less the number of bytes unused on last page
*/
cb = (uint32_t)LH(&header.numPages) * 512 - (uint32_t)LH(&header.numParaHeader) * 16;
if (header.lastPageSize)
{
cb -= 512 - LH(&header.lastPageSize);
}
/* This is a typical DOS kludge! */
if (LH(&header.relocTabOffset) == 0x40)
{
fatalError(NEWEXE_FORMAT);
}
/* We quietly ignore minAlloc and maxAlloc since for our
* purposes it doesn't really matter where in real memory
* the program would end up. EXE programs can't really rely on
* their load location so setting the PSP segment to 0 is fine.
* Certainly programs that prod around in DOS or BIOS are going
* to have to load DS from a constant so it'll be pretty
* obvious.
*/
prog.initCS = (int16_t)LH(&header.initCS) + EXE_RELOCATION;
prog.initIP = (int16_t)LH(&header.initIP);
prog.initSS = (int16_t)LH(&header.initSS) + EXE_RELOCATION;
prog.initSP = (int16_t)LH(&header.initSP);
prog.cReloc = (int16_t)LH(&header.numReloc);
/* Calculate the load module size.
* This is the number of pages in the file
* less the length of the header and reloc table
* less the number of bytes unused on last page
*/
cb = (dword)LH(&header.numPages) * 512 - (dword)LH(&header.numParaHeader) * 16;
if (header.lastPageSize)
{
cb -= 512 - LH(&header.lastPageSize);
}
/* We quietly ignore minAlloc and maxAlloc since for our
* purposes it doesn't really matter where in real memory
* the program would end up. EXE programs can't really rely on
* their load location so setting the PSP segment to 0 is fine.
* Certainly programs that prod around in DOS or BIOS are going
* to have to load DS from a constant so it'll be pretty
* obvious.
*/
prog.initCS = (int16)LH(&header.initCS) + EXE_RELOCATION;
prog.initIP = (int16)LH(&header.initIP);
prog.initSS = (int16)LH(&header.initSS) + EXE_RELOCATION;
prog.initSP = (int16)LH(&header.initSP);
prog.cReloc = (int16)LH(&header.numReloc);
/* Allocate the relocation table */
if (prog.cReloc)
{
prog.relocTable = new uint32_t [prog.cReloc];
fseek(fp, LH(&header.relocTabOffset), SEEK_SET);
/* Allocate the relocation table */
if (prog.cReloc)
{
prog.relocTable = (dword*)allocMem(prog.cReloc * sizeof(Int));
fseek(fp, LH(&header.relocTabOffset), SEEK_SET);
/* Read in seg:offset pairs and convert to Image ptrs */
for (i = 0; i < prog.cReloc; i++)
{
fread(buf, 1, 4, fp);
prog.relocTable[i] = LH(buf) +
(((int)LH(buf+2) + EXE_RELOCATION)<<4);
}
}
/* Seek to start of image */
uint32_t start_of_image= LH(&header.numParaHeader) * 16;
fseek(fp, start_of_image, SEEK_SET);
}
else
{ /* COM file
* In this case the load module size is just the file length
*/
fseek(fp, 0, SEEK_END);
cb = ftell(fp);
/* Read in seg:offset pairs and convert to Image ptrs */
for (i = 0; i < prog.cReloc; i++)
{
fread(buf, 1, 4, fp);
prog.relocTable[i] = LH(buf) +
(((Int)LH(buf+2) + EXE_RELOCATION)<<4);
}
}
/* Seek to start of image */
fseek(fp, (Int)LH(&header.numParaHeader) * 16, SEEK_SET);
}
else
{ /* COM file
* In this case the load module size is just the file length
*/
fseek(fp, 0, SEEK_END);
cb = ftell(fp);
/* COM programs start off with an ORG 100H (to leave room for a PSP)
* This is also the implied start address so if we load the image
* at offset 100H addresses should all line up properly again.
*/
prog.initCS = 0;
prog.initIP = 0x100;
prog.initSS = 0;
prog.initSP = 0xFFFE;
prog.cReloc = 0;
/* COM programs start off with an ORG 100H (to leave room for a PSP)
* This is also the implied start address so if we load the image
* at offset 100H addresses should all line up properly again.
*/
prog.initCS = 0;
prog.initIP = 0x100;
prog.initSS = 0;
prog.initSP = 0xFFFE;
prog.cReloc = 0;
fseek(fp, 0, SEEK_SET);
}
fseek(fp, 0, SEEK_SET);
}
/* Allocate a block of memory for the program. */
prog.cbImage = cb + sizeof(PSP);
prog.Imagez = new uint8_t [prog.cbImage];
prog.Imagez[0] = 0xCD; /* Fill in PSP int 20h location */
prog.Imagez[1] = 0x20; /* for termination checking */
/* Allocate a block of memory for the program. */
prog.cbImage = cb + sizeof(PSP);
prog.Image = (byte*)allocMem(prog.cbImage);
prog.Image[0] = 0xCD; /* Fill in PSP Int 20h location */
prog.Image[1] = 0x20; /* for termination checking */
/* Read in the image past where a PSP would go */
if (cb != (int)fread(prog.Imagez + sizeof(PSP), 1, (size_t)cb, fp))
{
fatalError(CANNOT_READ, proj.binary_path().c_str());
}
/* Read in the image past where a PSP would go */
#ifdef __DOSWIN__
if (cb > 0xFFFF)
{
printf("Image size of %ld bytes too large for fread!\n", cb);
fatalError(CANNOT_READ, filename);
}
#endif
if (cb != (Int)fread(prog.Image + sizeof(PSP), 1, (size_t)cb, fp))
{
fatalError(CANNOT_READ, filename);
}
/* Set up memory map */
cb = (prog.cbImage + 3) / 4;
prog.map = (uint8_t *)malloc(cb);
memset(prog.map, BM_UNKNOWN, (size_t)cb);
/* Set up memory map */
cb = (prog.cbImage + 3) / 4;
prog.map = (byte *)memset(allocMem(cb), BM_UNKNOWN, (size_t)cb);
/* Relocate segment constants */
if (prog.cReloc)
{
for (i = 0; i < prog.cReloc; i++)
{
uint8_t *p = &prog.Imagez[prog.relocTable[i]];
uint16_t w = (uint16_t)LH(p) + EXE_RELOCATION;
*p++ = (uint8_t)(w & 0x00FF);
*p = (uint8_t)((w & 0xFF00) >> 8);
}
}
/* Relocate segment constants */
if (prog.cReloc)
{
for (i = 0; i < prog.cReloc; i++)
{
byte *p = &prog.Image[prog.relocTable[i]];
word w = (word)LH(p) + EXE_RELOCATION;
*p++ = (byte)(w & 0x00FF);
*p = (byte)((w & 0xFF00) >> 8);
}
}
fclose(fp);
fclose(fp);
}
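The load module size computed in LoadImage() can be checked in isolation. The sketch below restates that arithmetic as a free function, assuming the three header fields have already been decoded from little-endian form; exeLoadModuleSize is a hypothetical name, not part of dcc.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of the load module of an MZ executable:
//   pages in the file * 512
//   less the header (including relocation table) in 16-byte paragraphs
//   less the unused tail of the last page, when lastPageSize is non-zero.
inline uint32_t exeLoadModuleSize(uint16_t numPages,
                                  uint16_t numParaHeader,
                                  uint16_t lastPageSize)
{
    const uint32_t fileBytes   = static_cast<uint32_t>(numPages) * 512;
    const uint32_t headerBytes = static_cast<uint32_t>(numParaHeader) * 16;
    assert(fileBytes >= headerBytes);   // a malformed header would underflow
    uint32_t cb = fileBytes - headerBytes;
    if (lastPageSize != 0)
        cb -= 512u - lastPageSize;      // last page is only partly used
    return cb;
}
```

With 3 pages, a 2-paragraph header and a last page of 100 bytes, the result is 1536 - 32 - 412 = 1092 bytes. The listing itself does not validate the header fields; the assertion here is an addition for the sketch.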
/*****************************************************************************
* allocMem - malloc with failure test
****************************************************************************/
void *allocMem(int cb)
* allocMem - malloc with failure test
****************************************************************************/
void *allocMem(Int cb)
{
uint8_t *p;
byte *p;
//printf("Attempt to allocMem %5ld bytes\n", cb);
//printf("Attempt to allocMem %5ld bytes\n", cb);
if (! (p = (uint8_t*)malloc((size_t)cb)))
/* if (! (p = (uint8_t*)calloc((size_t)cb, 1))) */
{
fatalError(MALLOC_FAILED, cb);
}
// printf("allocMem: %p\n", p);
return p;
#if 0 /* Microsoft specific heap debugging code */
switch (_heapset('Z'))
{
case _HEAPBADBEGIN: printf("aM: Bad heap begin\n"); break;
case _HEAPBADNODE: printf("aM: Bad heap node\n");
printf("Attempt to allocMem %5d bytes\n", cb);
{
_HEAPINFO hinfo;
int heapstatus;
boolT cont = TRUE;
hinfo._pentry = NULL;
while (cont)
{
switch (heapstatus = _heapwalk(&hinfo))
{
case _HEAPOK:
printf("%6s block at %Fp of size %4.4X\n",
(hinfo._useflag == _USEDENTRY ? "USED" : "FREE"),
hinfo._pentry, hinfo._size);
break;
case _HEAPBADBEGIN:
printf("Heap bad begin\n");
break;
case _HEAPBADNODE:
printf("BAD NODE %6s block at %Fp of size %4.4X\n",
(hinfo._useflag == _USEDENTRY ? "USED" : "FREE"),
hinfo._pentry, hinfo._size);
break;
case _HEAPEND:
cont = FALSE;
break;
case _HEAPBADPTR:
printf("INFO BAD %6s block at %Fp of size %4.4X\n",
(hinfo._useflag == _USEDENTRY ? "USED" : "FREE"),
hinfo._pentry, hinfo._size);
cont=FALSE;
}
}
}
getchar();
exit(1);
case _HEAPEMPTY: printf("aM: Heap empty\n"); getchar(); break;
case _HEAPOK:putchar('.');break;
}
#endif
if (! (p = (byte*)malloc((size_t)cb)))
/* if (! (p = (byte*)calloc((size_t)cb, 1))) */
{
fatalError(MALLOC_FAILED, cb);
}
/*printf("allocMem: %p\n", p);/**/
return p;
}
/*****************************************************************************
* reallocVar - reallocs extra variable space
****************************************************************************/
void *reallocVar(void *p, Int newsize)
{
/*printf("Attempt to reallocVar %5d bytes\n", newsize);/**/
#if 0
switch (_heapset('Z'))
{
case _HEAPBADBEGIN: printf("aV: Bad heap begin\n"); /*getchar()*/; break;
case _HEAPBADNODE: printf("aV: Bad heap node\n");
printf("Attempt to reallocVar %5d bytes at %p\n", newsize, p);/**/
{
_HEAPINFO hinfo;
int heapstatus;
boolT cont = TRUE;
hinfo._pentry = NULL;
while (cont)
{
switch (heapstatus = _heapwalk(&hinfo))
{
case _HEAPOK:
printf("%6s block at %Fp of size %4.4X\n",
(hinfo._useflag == _USEDENTRY ? "USED" : "FREE"),
hinfo._pentry, hinfo._size);
break;
case _HEAPBADBEGIN:
printf("Heap bad begin\n");
break;
case _HEAPBADNODE:
printf("BAD NODE %6s block at %Fp of size %4.4X\n",
(hinfo._useflag == _USEDENTRY ? "USED" : "FREE"),
hinfo._pentry, hinfo._size);
break;
case _HEAPEND:
cont = FALSE;
break;
case _HEAPBADPTR:
printf("INFO BAD %6s block at %Fp of size %4.4X\n",
(hinfo._useflag == _USEDENTRY ? "USED" : "FREE"),
hinfo._pentry, hinfo._size);
cont=FALSE;
}
}
}
getchar();
break;
case _HEAPEMPTY: printf("aV: Heap empty\n"); getchar(); break;
case _HEAPOK:putchar('!');break;
}
#endif
if (! (p = realloc((byte *)p, (size_t)newsize)))
{
fatalError(MALLOC_FAILED, newsize);
}
/*printf("reallocVar: %p\n", p);/**/
return p;
}
#if 0
void free(void *p)
{
_ffree(p);
switch (_heapset('Z'))
{
case _HEAPBADBEGIN: printf("f: Bad heap begin\n"); getchar(); break;
case _HEAPBADNODE: printf("f: Bad heap node\n"); getchar(); break;
case _HEAPEMPTY: printf("f: Heap empty\n"); getchar(); break;
case _HEAPOK:putchar('!');break;
}/**/
}
#endif


@ -3,395 +3,398 @@
* (C) Cristina Cifuentes
****************************************************************************/
#include "graph.h"
#include "msvc_fixes.h"
#include "dcc.h"
#include "project.h"
#include <boost/range/rbegin.hpp>
#include <boost/range/rend.hpp>
#include <string.h>
#if __BORLAND__
#include <alloc.h>
#else
#include <malloc.h> /* For free() */
#endif
using namespace std;
using namespace boost;
extern Project g_proj;
//static BB * rmJMP(Function * pProc, int marker, BB * pBB);
//static void mergeFallThrough(Function * pProc, BB * pBB);
//static void dfsNumbering(BB * pBB, std::vector<BB*> &dfsLast, int *first, int *last);
void Function::addOutEdgesForConditionalJump(BB * pBB,int next_ip, LLInst *ll)
{
pBB->addOutEdge(next_ip);
/* This is checking for jumps off into nowhere */
if ( not ll->testFlags(NO_LABEL) )
pBB->addOutEdge(ll->src().getImm2());
}
static PBB rmJMP(PPROC pProc, Int marker, PBB pBB);
static void mergeFallThrough(PPROC pProc, PBB pBB);
static void dfsNumbering(PBB pBB, PBB *dfsLast, Int *first, Int *last);
/*****************************************************************************
* createCFG - Create the basic control flow graph
****************************************************************************/
void Function::createCFG()
PBB createCFG(PPROC pProc)
{
/* Splits Icode associated with the procedure into Basic Blocks.
* The links between BBs represent the control flow graph of the
* procedure.
* A Basic Block is defined to end on one of the following instructions:
* 1) Conditional and unconditional jumps
* 2) CALL(F)
* 3) RET(F)
* 4) On the instruction before a join (a flagged TARGET)
* 5) Repeated string instructions
* 6) End of procedure
*/
/* Splits Icode associated with the procedure into Basic Blocks.
* The links between BBs represent the control flow graph of the
* procedure.
* A Basic Block is defined to end on one of the following instructions:
* 1) Conditional and unconditional jumps
* 2) CALL(F)
* 3) RET(F)
* 4) On the instruction before a join (a flagged TARGET)
* 5) Repeated string instructions
* 6) End of procedure
*/
Int i;
Int ip, start;
BB cfg;
PBB psBB;
PBB pBB = &cfg;
PICODE pIcode = pProc->Icode.GetFirstIcode();
BB * psBB;
BB * pBB;
iICODE pIcode = Icode.begin();
cfg.next = NULL;
stats.numBBbef = stats.numBBaft = 0;
for (ip = start = 0; pProc->Icode.IsValid(pIcode); ip++, pIcode++)
{
/* Stick a NOWHERE_NODE on the end if we terminate
* with anything other than a ret, jump or terminate */
if (ip + 1 == pProc->Icode.GetNumIcodes() &&
! (pIcode->ic.ll.flg & TERMINATES) &&
pIcode->ic.ll.opcode != iJMP && pIcode->ic.ll.opcode != iJMPF &&
pIcode->ic.ll.opcode != iRET && pIcode->ic.ll.opcode != iRETF)
newBB(pBB, start, ip, NOWHERE_NODE, 0, pProc);
stats.numBBbef = stats.numBBaft = 0;
rICODE current_range=make_iterator_range(pIcode,++iICODE(pIcode));
for (; pIcode!=Icode.end(); ++pIcode,current_range.advance_end(1))
{
iICODE nextIcode = ++iICODE(pIcode);
pBB = nullptr;
/* Only process icodes that have valid instructions */
else if ((pIcode->ic.ll.flg & NO_CODE) != NO_CODE)
{
switch (pIcode->ic.ll.opcode) {
case iJB: case iJBE: case iJAE: case iJA:
case iJL: case iJLE: case iJGE: case iJG:
case iJE: case iJNE: case iJS: case iJNS:
case iJO: case iJNO: case iJP: case iJNP:
case iJCXZ:
pBB = newBB(pBB, start, ip, TWO_BRANCH, 2, pProc);
CondJumps:
start = ip + 1;
pBB->edges[0].ip = (dword)start;
/* This is for jumps off into nowhere */
if (pIcode->ic.ll.flg & NO_LABEL)
pBB->numOutEdges--;
else
pBB->edges[1].ip = pIcode->ic.ll.immed.op;
break;
LLInst *ll = pIcode->ll();
/* Only process icodes that have valid instructions */
if(ll->testFlags(NO_CODE))
continue;
/* Stick a NOWHERE_NODE on the end if we terminate
* with anything other than a ret, jump or terminate */
if (nextIcode == Icode.end() and
(not ll->testFlags(TERMINATES)) and
(not ll->match(iJMP)) and (not ll->match(iJMPF)) and
(not ll->match(iRET)) and (not ll->match(iRETF)))
{
pBB=BB::Create(current_range, NOWHERE_NODE, this);
}
else
switch (ll->getOpcode()) {
case iJB: case iJBE: case iJAE: case iJA:
case iJL: case iJLE: case iJGE: case iJG:
case iJE: case iJNE: case iJS: case iJNS:
case iJO: case iJNO: case iJP: case iJNP:
case iJCXZ:
pBB = BB::Create(current_range, TWO_BRANCH, this);
addOutEdgesForConditionalJump(pBB,nextIcode->loc_ip, ll);
break;
case iLOOP: case iLOOPE: case iLOOPNE:
pBB = newBB(pBB, start, ip, LOOP_NODE, 2, pProc);
goto CondJumps;
case iLOOP: case iLOOPE: case iLOOPNE:
pBB = BB::Create(current_range, LOOP_NODE, this);
addOutEdgesForConditionalJump(pBB,nextIcode->loc_ip, ll);
break;
case iJMPF: case iJMP:
if (pIcode->ic.ll.flg & SWITCH)
{
pBB = newBB(pBB, start, ip, MULTI_BRANCH,
pIcode->ic.ll.caseTbl.numEntries, pProc);
for (i = 0; i < pIcode->ic.ll.caseTbl.numEntries; i++)
pBB->edges[i].ip = pIcode->ic.ll.caseTbl.entries[i];
pProc->hasCase = TRUE;
}
else if ((pIcode->ic.ll.flg & (I | NO_LABEL)) == I) {
pBB = newBB(pBB, start, ip, ONE_BRANCH, 1, pProc);
pBB->edges[0].ip = pIcode->ic.ll.immed.op;
}
else
newBB(pBB, start, ip, NOWHERE_NODE, 0, pProc);
start = ip + 1;
break;
case iJMPF: case iJMP:
if (ll->testFlags(SWITCH))
{
pBB = BB::Create(current_range, MULTI_BRANCH, this);
for (auto & elem : ll->caseTbl2)
pBB->addOutEdge(elem);
hasCase = true;
}
else if ((ll->getFlag() & (I | NO_LABEL)) == I) //TODO: WHY NO_LABEL TESTIT
{
pBB = BB::Create(current_range, ONE_BRANCH, this);
pBB->addOutEdge(ll->src().getImm2());
}
else
pBB = BB::Create(current_range, NOWHERE_NODE, this);
break;
case iCALLF: case iCALL:
{ PPROC p = pIcode->ic.ll.immed.proc.proc;
if (p)
i = ((p->flg) & TERMINATES) ? 0 : 1;
else
i = 1;
pBB = newBB(pBB, start, ip, CALL_NODE, i, pProc);
start = ip + 1;
if (i)
pBB->edges[0].ip = (dword)start;
}
break;
case iCALLF: case iCALL:
{
Function * p = ll->src().proc.proc;
pBB = BB::Create(current_range, CALL_NODE, this);
if (p and not ((p->flg) & TERMINATES) )
pBB->addOutEdge(nextIcode->loc_ip);
break;
}
case iRET: case iRETF:
newBB(pBB, start, ip, RETURN_NODE, 0, pProc);
start = ip + 1;
break;
case iRET: case iRETF:
pBB = BB::Create(current_range, RETURN_NODE, this);
break;
default:
/* Check for exit to DOS */
if (pIcode->ic.ll.flg & TERMINATES)
{
pBB = newBB(pBB, start, ip, TERMINATE_NODE, 0, pProc);
start = ip + 1;
}
/* Check for a fall through */
else if (pProc->Icode.GetFirstIcode()[ip + 1].ic.ll.flg & (TARGET | CASE))
{
pBB = newBB(pBB, start, ip, FALL_NODE, 1, pProc);
start = ip + 1;
pBB->edges[0].ip = (dword)start;
}
break;
}
}
}
default:
/* Check for exit to DOS */
if ( ll->testFlags(TERMINATES) )
{
pBB = BB::Create(current_range, TERMINATE_NODE, this);
}
/* Check for a fall through */
else if (nextIcode != Icode.end())
{
if (nextIcode->ll()->testFlags(TARGET | CASE))
{
pBB = BB::Create(current_range, FALL_NODE, this);
pBB->addOutEdge(nextIcode->loc_ip);
}
}
break;
}
if(pBB!=nullptr) // created a new Basic block
{
// restart the range
// end iterator will be updated by expression in for statement
current_range=make_iterator_range(nextIcode,nextIcode);
}
if (nextIcode == Icode.end())
break;
}
for (auto pr : m_ip_to_bb)
{
BB* pBB=pr.second;
for (auto & elem : pBB->edges)
{
int32_t ip = elem.ip;
if (ip >= SYNTHESIZED_MIN)
{
fatalError (INVALID_SYNTHETIC_BB);
return;
}
auto iter2=m_ip_to_bb.find(ip);
if(iter2==m_ip_to_bb.end())
fatalError(NO_BB, ip, qPrintable(name));
psBB = iter2->second;
elem.BBptr = psBB;
psBB->inEdges.push_back((BB *)nullptr);
}
}
/* Convert list of BBs into a graph */
for (pBB = cfg.next; pBB; pBB = pBB->next)
{
for (i = 0; i < pBB->numOutEdges; i++)
{
ip = pBB->edges[i].ip;
if (ip >= SYNTHESIZED_MIN)
fatalError (INVALID_SYNTHETIC_BB);
else
{
for (psBB = cfg.next; psBB; psBB = psBB->next)
if (psBB->start == ip)
{
pBB->edges[i].BBptr = psBB;
psBB->numInEdges++;
break;
}
if (! psBB)
fatalError(NO_BB, ip, pProc->name);
}
}
}
return cfg.next;
}
void Function::markImpure()
{
PROG &prog(Project::get()->prog);
for(ICODE &icod : Icode)
{
if ( not icod.ll()->testFlags(SYM_USE | SYM_DEF))
continue;
//assert that case tbl has fewer entries than symbol table ????
//WARNING: Case entries are held in symbol table !
assert(Project::get()->validSymIdx(icod.ll()->caseEntry));
const SYM &psym(Project::get()->getSymByIdx(icod.ll()->caseEntry));
for (int c = (int)psym.label; c < (int)psym.label+psym.size; c++)
{
if (BITMAP(c, BM_CODE))
{
icod.ll()->setFlags(IMPURE);
flg |= IMPURE;
break;
}
}
}
/*****************************************************************************
* newBB - Allocate new BB and link to end of list
*****************************************************************************/
PBB newBB (PBB pBB, Int start, Int ip, byte nodeType, Int numOutEdges,
PPROC pproc)
{
PBB pnewBB;
pnewBB = allocStruc(BB);
memset (pnewBB, 0, sizeof(BB));
pnewBB->nodeType = nodeType; /* Initialise */
pnewBB->start = start;
pnewBB->length = ip - start + 1;
pnewBB->numOutEdges = (byte)numOutEdges;
pnewBB->immedDom = NO_DOM;
pnewBB->loopHead = pnewBB->caseHead = pnewBB->caseTail =
pnewBB->latchNode= pnewBB->loopFollow = NO_NODE;
if (numOutEdges)
pnewBB->edges = (TYPEADR_TYPE*)allocMem(numOutEdges * sizeof(TYPEADR_TYPE));
/* Mark the basic block to which the icodes belong, but only for
* real code basic blocks (ie. not interval bbs) */
if (start >= 0)
pproc->Icode.SetInBB(start, ip, pnewBB);
while (pBB->next) /* Link */
pBB = pBB->next;
pBB->next = pnewBB;
if (start != -1) { /* Only for code BB's */
stats.numBBbef++;
}
return pnewBB;
}
/*****************************************************************************
* freeCFG - Deallocates a cfg
****************************************************************************/
void Function::freeCFG()
void freeCFG(PBB cfg)
{
for(auto p : m_ip_to_bb)
{
delete p.second;
}
m_ip_to_bb.clear();
PBB pBB;
for (pBB = cfg; pBB; pBB = cfg) {
if (pBB->inEdges)
free(pBB->inEdges);
if (pBB->edges)
free(pBB->edges);
cfg = pBB->next;
free(pBB);
}
}
/*****************************************************************************
* compressCFG - Remove redundancies and add in-edge information
****************************************************************************/
void Function::compressCFG()
{
BB *pNxt;
int ip, first=0, last;
void compressCFG(PPROC pProc)
{ PBB pBB, pNxt;
Int ip, first=0, last, i;
/* First pass over BB list removes redundant jumps of the form
* (Un)Conditional -> Unconditional jump */
for (BB *pBB : m_actual_cfg) //m_cfg
{
if(pBB->inEdges.empty() or (pBB->nodeType != ONE_BRANCH and pBB->nodeType != TWO_BRANCH))
continue;
for (TYPEADR_TYPE &edgeRef : pBB->edges)
{
ip = pBB->rbegin()->loc_ip;
pNxt = edgeRef.BBptr->rmJMP(ip, edgeRef.BBptr);
/* First pass over BB list removes redundant jumps of the form
* (Un)Conditional -> Unconditional jump */
for (pBB = pProc->cfg; pBB; pBB = pBB->next)
if (pBB->numInEdges != 0 && (pBB->nodeType == ONE_BRANCH ||
pBB->nodeType == TWO_BRANCH))
for (i = 0; i < pBB->numOutEdges; i++)
{
ip = pBB->start + pBB->length - 1;
pNxt = rmJMP(pProc, ip, pBB->edges[i].BBptr);
if (not pBB->edges.empty()) /* Might have been clobbered */
{
edgeRef.BBptr = pNxt;
assert(pBB->back().loc_ip==ip);
pBB->back().ll()->SetImmediateOp((uint32_t)pNxt->begin()->loc_ip);
//Icode[ip].SetImmediateOp((uint32_t)pNxt->begin());
}
}
}
if (pBB->numOutEdges) /* Might have been clobbered */
{
pBB->edges[i].BBptr = pNxt;
pProc->Icode.SetImmediateOp(ip, (dword)pNxt->start);
}
}
/* Next is a depth-first traversal merging any FALL_NODE or
* ONE_BRANCH that fall through to a node with that as their only
* in-edge. */
m_actual_cfg.front()->mergeFallThrough(Icode);
/* Next is a depth-first traversal merging any FALL_NODE or
* ONE_BRANCH that fall through to a node with that as their only
* in-edge. */
mergeFallThrough(pProc, pProc->cfg);
/* Remove redundant BBs created by the above compressions
* and allocate in-edge arrays as required. */
stats.numBBaft = stats.numBBbef;
bool entry_node=true;
for(BB *pBB : m_actual_cfg)
{
if (pBB->inEdges.empty())
{
if (entry_node) /* Init it misses out on */
pBB->index = UN_INIT;
else
{
delete pBB;
stats.numBBaft--;
}
}
else
{
pBB->inEdgeCount = pBB->inEdges.size();
}
entry_node=false;
}
/* Remove redundant BBs created by the above compressions
* and allocate in-edge arrays as required. */
stats.numBBaft = stats.numBBbef;
/* Allocate storage for dfsLast[] array */
numBBs = stats.numBBaft;
m_dfsLast.resize(numBBs,nullptr); // = (BB **)allocMem(numBBs * sizeof(BB *))
for (pBB = pProc->cfg; pBB; pBB = pNxt)
{
pNxt = pBB->next;
if (pBB->numInEdges == 0)
{
if (pBB == pProc->cfg) /* Init it misses out on */
pBB->index = UN_INIT;
else
{
if (pBB->numOutEdges)
free(pBB->edges);
free(pBB);
stats.numBBaft--;
}
}
else
{
pBB->inEdgeCount = pBB->numInEdges;
pBB->inEdges = (PBB*)allocMem(pBB->numInEdges * sizeof(PBB));
}
}
/* Now do a dfs numbering traversal and fill in the inEdges[] array */
last = numBBs - 1;
m_actual_cfg.front()->dfsNumbering(m_dfsLast, &first, &last);
/* Allocate storage for dfsLast[] array */
pProc->numBBs = stats.numBBaft;
pProc->dfsLast = (PBB*)allocMem(pProc->numBBs * sizeof(PBB));
/* Now do a dfs numbering traversal and fill in the inEdges[] array */
last = pProc->numBBs - 1;
dfsNumbering(pProc->cfg, pProc->dfsLast, &first, &last);
}
/****************************************************************************
* rmJMP - If BB addressed is just a JMP it is replaced with its target
***************************************************************************/
BB *BB::rmJMP(int marker, BB * pBB)
static PBB rmJMP(PPROC pProc, Int marker, PBB pBB)
{
marker += (int)DFS_JMP;
marker += DFS_JMP;
while (pBB->nodeType == ONE_BRANCH and pBB->size() == 1)
{
if (pBB->traversed != marker)
{
pBB->traversed = (eDFS)marker;
pBB->inEdges.pop_back();
if (not pBB->inEdges.empty())
{
pBB->edges[0].BBptr->inEdges.push_back((BB *)nullptr);
}
else
{
pBB->front().ll()->setFlags(NO_CODE);
pBB->front().invalidate(); //pProc->Icode.SetLlInvalid(pBB->begin(), true);
}
while (pBB->nodeType == ONE_BRANCH && pBB->length == 1) {
if (pBB->traversed != marker) {
pBB->traversed = marker;
if (--pBB->numInEdges)
pBB->edges[0].BBptr->numInEdges++;
else
{
pProc->Icode.SetLlFlag(pBB->start, NO_CODE);
pProc->Icode.SetLlInvalid(pBB->start, TRUE);
}
pBB = pBB->edges[0].BBptr;
}
else
{
/* We are going around in circles */
pBB->nodeType = NOWHERE_NODE;
pBB->front().ll()->replaceSrc(LLOperand::CreateImm2(pBB->front().loc_ip));
//pBB->front().ll()->src.immed.op = pBB->front().loc_ip;
do {
pBB = pBB->edges[0].BBptr;
pBB->inEdges.pop_back(); // was --numInedges
if (not pBB->inEdges.empty())
{
pBB->front().ll()->setFlags(NO_CODE);
pBB->front().invalidate();
// pProc->Icode.setFlags(pBB->start, NO_CODE);
// pProc->Icode.SetLlInvalid(pBB->start, true);
}
} while (pBB->nodeType != NOWHERE_NODE);
pBB = pBB->edges[0].BBptr;
}
else { /* We are going around in circles */
pBB->nodeType = NOWHERE_NODE;
pProc->Icode.GetIcode(pBB->start)->ic.ll.immed.op = (dword)pBB->start;
pProc->Icode.SetImmediateOp(pBB->start, (dword)pBB->start);
do {
pBB = pBB->edges[0].BBptr;
if (! --pBB->numInEdges)
{
pProc->Icode.SetLlFlag(pBB->start, NO_CODE);
pProc->Icode.SetLlInvalid(pBB->start, TRUE);
}
} while (pBB->nodeType != NOWHERE_NODE);
pBB->edges.clear();
}
}
return pBB;
free(pBB->edges);
pBB->numOutEdges = 0;
pBB->edges = NULL;
}
}
return pBB;
}
/*****************************************************************************
* mergeFallThrough
****************************************************************************/
void BB::mergeFallThrough( CIcodeRec &Icode)
static void mergeFallThrough(PPROC pProc, PBB pBB)
{
BB * pChild;
if (nullptr==this)
{
printf("mergeFallThrough on empty BB!\n");
}
while (nodeType == FALL_NODE or nodeType == ONE_BRANCH)
{
pChild = edges[0].BBptr;
/* Jump to next instruction can always be removed */
if (nodeType == ONE_BRANCH)
{
assert(Parent==pChild->Parent);
if(back().loc_ip>pChild->front().loc_ip) // back edge
break;
auto iter=std::find_if(this->end(),pChild->begin(),[](ICODE &c)
{return not c.ll()->testFlags(NO_CODE);});
PBB pChild;
Int i, ip;
if (iter != pChild->begin())
break;
back().ll()->setFlags(NO_CODE);
back().invalidate();
nodeType = FALL_NODE;
//instructions.advance_end(-1); //TODO: causes creation of empty BB
}
/* If there's no other edges into child can merge */
if (pChild->inEdges.size() != 1)
break;
if (pBB) {
while (pBB->nodeType == FALL_NODE || pBB->nodeType == ONE_BRANCH)
{
pChild = pBB->edges[0].BBptr;
/* Jump to next instruction can always be removed */
if (pBB->nodeType == ONE_BRANCH)
{
ip = pBB->start + pBB->length;
for (i = ip; i < pChild->start
&& (pProc->Icode.GetLlFlag(i) & NO_CODE); i++);
if (i != pChild->start)
break;
pProc->Icode.SetLlFlag(ip - 1, NO_CODE);
pProc->Icode.SetLlInvalid(ip - 1, TRUE);
pBB->nodeType = FALL_NODE;
pBB->length--;
nodeType = pChild->nodeType;
instructions = boost::make_iterator_range(begin(),pChild->end());
pChild->front().ll()->clrFlags(TARGET);
edges.swap(pChild->edges);
}
/* If there's no other edges into child can merge */
if (pChild->numInEdges != 1)
break;
pChild->inEdges.clear();
pChild->edges.clear();
}
traversed = DFS_MERGE;
pBB->nodeType = pChild->nodeType;
pBB->length = pChild->start + pChild->length - pBB->start;
pProc->Icode.ClearLlFlag(pChild->start, TARGET);
pBB->numOutEdges = pChild->numOutEdges;
free(pBB->edges);
pBB->edges = pChild->edges;
/* Process all out edges recursively */
for (auto & elem : edges)
{
if (elem.BBptr->traversed != DFS_MERGE)
elem.BBptr->mergeFallThrough(Icode);
}
pChild->numOutEdges = pChild->numInEdges = 0;
pChild->edges = NULL;
}
pBB->traversed = DFS_MERGE;
/* Process all out edges recursively */
for (i = 0; i < pBB->numOutEdges; i++)
if (pBB->edges[i].BBptr->traversed != DFS_MERGE)
mergeFallThrough(pProc, pBB->edges[i].BBptr);
}
}
/*****************************************************************************
* dfsNumbering - Numbers nodes during first and last visits and determine
* dfsNumbering - Numbers nodes during first and last visits and determine
* in-edges
****************************************************************************/
void BB::dfsNumbering(std::vector<BB *> &dfsLast, int *first, int *last)
static void dfsNumbering(PBB pBB, PBB *dfsLast, Int *first, Int *last)
{
BB * pChild;
traversed = DFS_NUM;
dfsFirstNum = (*first)++;
PBB pChild;
byte i;
/* index is being used as an index to inEdges[]. */
// for (i = 0; i < edges.size(); i++)
for(auto edge : edges)
{
pChild = edge.BBptr;
pChild->inEdges[pChild->index++] = this;
if (pBB)
{
pBB->traversed = DFS_NUM;
pBB->dfsFirstNum = (*first)++;
/* Is this the last visit? */
if (pChild->index == int(pChild->inEdges.size()))
pChild->index = UN_INIT;
/* index is being used as an index to inEdges[]. */
for (i = 0; i < pBB->numOutEdges; i++)
{
pChild = pBB->edges[i].BBptr;
pChild->inEdges[pChild->index++] = pBB;
if (pChild->traversed != DFS_NUM)
pChild->dfsNumbering(dfsLast, first, last);
}
dfsLastNum = *last;
dfsLast[(*last)--] = this;
/* Is this the last visit? */
if (pChild->index == pChild->numInEdges)
pChild->index = UN_INIT;
if (pChild->traversed != DFS_NUM)
dfsNumbering(pChild, dfsLast, first, last);
}
pBB->dfsLastNum = *last;
dfsLast[(*last)--] = pBB;
}
}
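The two versions of dfsNumbering above share one numbering scheme: a node gets dfsFirstNum on its first visit and dfsLastNum once all of its successors are done, with `last` counting down so that dfsLast[] ends up in reverse post-order. A minimal sketch of that scheme follows, assuming a simplified `Node` type; `Node`, `dfsNumber`, and `numberDiamond` are hypothetical names for illustration and are not part of dcc, and the in-edge bookkeeping is left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for dcc's basic block; only the fields needed here.
struct Node {
    std::vector<Node*> succ;   // out-edges
    bool visited = false;
    int dfsFirstNum = -1;      // assigned on the first visit (pre-order)
    int dfsLastNum = -1;       // assigned on the last visit (reverse post-order)
};

// Numbers `n` and everything reachable from it. `first` counts up from 0,
// `last` counts down from numNodes - 1, so dfsLast[] is filled back to front.
void dfsNumber(Node* n, std::vector<Node*>& dfsLast, int& first, int& last) {
    n->visited = true;
    n->dfsFirstNum = first++;
    for (Node* child : n->succ)
        if (!child->visited)
            dfsNumber(child, dfsLast, first, last);
    n->dfsLastNum = last;
    dfsLast[last--] = n;
}

// Builds the diamond a -> {b, c} -> d and returns dfsLastNum for a, b, c, d.
std::vector<int> numberDiamond() {
    Node a, b, c, d;
    a.succ = {&b, &c};
    b.succ = {&d};
    c.succ = {&d};
    std::vector<Node*> dfsLast(4, nullptr);
    int first = 0, last = 3;
    dfsNumber(&a, dfsLast, first, last);
    return {a.dfsLastNum, b.dfsLastNum, c.dfsLastNum, d.dfsLastNum};
}
```

In the diamond, the entry node receives the smallest dfsLastNum and the join node the largest, which is the ordering later passes can iterate over.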


View File

@ -1,25 +0,0 @@
#include "icode.h"
#include "ast.h"
void HLTYPE::replaceExpr(Expr *e)
{
assert(e);
delete exp.v;
exp.v=e;
}
HlTypeSupport *HLTYPE::get()
{
switch(opcode)
{
case HLI_ASSIGN: return &asgn;
case HLI_RET:
case HLI_POP:
case HLI_JCOND:
case HLI_PUSH: return &exp;
case HLI_CALL: return &call;
default:
return nullptr;
}
}

View File

@ -1,114 +1,143 @@
// Object oriented icode code for dcc
// (C) 1997 Mike Van Emmerik
#include "icode.h"
#include "msvc_fixes.h"
#include "dcc.h"
#include "types.h" // Common types like uint8_t, etc
#include "ast.h" // Some icode types depend on these
#include <stdlib.h>
#include <malloc.h>
#include <memory.h>
#include "types.h" // Common types like byte, etc
#include "ast.h" // Some icode types depend on these
#include "icode.h"
void *reallocVar(void *p, Int newsize); /* frontend.c !? */
#define ICODE_DELTA 25 // Amount to allocate for new chunk
ICODE::TypeFilter<HIGH_LEVEL_ICODE> ICODE::select_high_level;
ICODE::TypeAndValidFilter<HIGH_LEVEL_ICODE> ICODE::select_valid_high_level;
CIcodeRec::CIcodeRec()
{
numIcode = 0;
alloc = 0;
icode = 0; // Initialise the pointer to 0
}
CIcodeRec::~CIcodeRec()
{
if (icode)
{
free(icode);
}
}
PICODE CIcodeRec::addIcode(PICODE pIcode)
/* Copies the icode that is pointed to by pIcode to the icode array.
* If there is need to allocate extra memory, it is done so, and
* the alloc variable is adjusted. */
ICODE * CIcodeRec::addIcode(ICODE *pIcode)
{
push_back(*pIcode);
back().loc_ip = size()-1;
return &back();
}
PICODE resIcode;
void CIcodeRec::SetInBB(rCODE &rang, BB *pnewBB)
{
for(ICODE &ic : rang)
ic.setParent(pnewBB);
}
/* labelSrchRepl - Searches the icodes for instruction with label = target, and
replaces *pIndex with an icode index */
bool CIcodeRec::labelSrch(uint32_t target, uint32_t &pIndex)
{
iICODE location=labelSrch(target);
if(end()==location)
return false;
pIndex=location->loc_ip;
return true;
}
bool CIcodeRec::alreadyDecoded(uint32_t target)
{
iICODE location=labelSrch(target);
if(end()==location)
return false;
return true;
}
CIcodeRec::iterator CIcodeRec::labelSrch(uint32_t target)
{
return find_if(begin(),end(),[target](ICODE &l) -> bool {return l.ll()->label==target;});
}
ICODE * CIcodeRec::GetIcode(size_t ip)
{
assert(ip<size());
iICODE res=begin();
advance(res,ip);
return &(*res);
}
extern int getNextLabel();
extern bundle cCode;
/* Checks the given icode to determine whether it has a label associated
* to it. If so, a goto is emitted to this label; otherwise, a new label
* is created and a goto is also emitted.
* Note: this procedure is to be used when the label is to be backpatched
* onto code in cCode.code */
void LLInst::emitGotoLabel (int indLevel)
{
if ( not testFlags(HLL_LABEL) ) /* node hasn't got a lab */
if (numIcode == alloc)
{
/* Generate new label */
hllLabNum = getNextLabel();
setFlags(HLL_LABEL);
alloc += ICODE_DELTA;
icode = (PICODE)reallocVar(icode, alloc * sizeof(ICODE));
memset (&icode[numIcode], 0, ICODE_DELTA * sizeof(ICODE));
/* Node has been traversed already, so backpatch this label into
* the code */
cCode.code.addLabelBundle (codeIdx, hllLabNum);
}
cCode.appendCode( "%sgoto L%ld;\n", indentStr(indLevel), hllLabNum);
stats.numHLIcode++;
resIcode = (PICODE)memcpy (&icode[numIcode], pIcode,
sizeof(ICODE));
numIcode++;
return (resIcode);
}
PICODE CIcodeRec::GetFirstIcode()
{
return icode;
}
/* Don't need this; just pIcode++ since array is guaranteed to be contiguous
PICODE CIcodeRec::GetNextIcode(PICODE pCurIcode)
{
int idx = pCurIcode - icode; // Current index
ASSERT(idx+1 < numIcode);
return &icode[idx+1];
}
*/
boolT CIcodeRec::IsValid(PICODE pCurIcode)
{
int idx = pCurIcode - icode; // Current index
return idx < numIcode;
}
int CIcodeRec::GetNumIcodes()
{
return numIcode;
}
void CIcodeRec::SetInBB(int start, int end, struct _BB* pnewBB)
{
for (int i = start; i <= end; i++)
icode[i].inBB = pnewBB;
}
void CIcodeRec::SetImmediateOp(int ip, dword dw)
{
icode[ip].ic.ll.immed.op = dw;
}
void CIcodeRec::SetLlFlag(int ip, dword flag)
{
icode[ip].ic.ll.flg |= flag;
}
dword CIcodeRec::GetLlFlag(int ip)
{
return icode[ip].ic.ll.flg;
}
void CIcodeRec::ClearLlFlag(int ip, dword flag)
{
icode[ip].ic.ll.flg &= (~flag);
}
void CIcodeRec::SetLlInvalid(int ip, boolT fInv)
{
icode[ip].invalid = fInv;
}
dword CIcodeRec::GetLlLabel(int ip)
{
return icode[ip].ic.ll.label;
}
llIcode CIcodeRec::GetLlOpcode(int ip)
{
return icode[ip].ic.ll.opcode;
}
boolT CIcodeRec::labelSrch(dword target, Int *pIndex)
/* labelSrchRepl - Searches the icodes for instruction with label = target, and
replaces *pIndex with an icode index */
{
Int i;
for (i = 0; i < numIcode; i++)
{
if (icode[i].ic.ll.label == target)
{
*pIndex = i;
return TRUE;
}
}
return FALSE;
}
PICODE CIcodeRec::GetIcode(int ip)
{
return &icode[ip];
}
bool LLOperand::isReg() const
{
return (regi>=rAX) and (regi<=rTMP);
}
void LLOperand::addProcInformation(int param_count, CConv::Type call_conv)
{
proc.proc->cbParam = (int16_t)param_count;
proc.cb = param_count;
proc.proc->callingConv(call_conv);
}
void HLTYPE::setCall(Function *proc)
{
opcode = HLI_CALL;
call.proc = proc;
call.args = new STKFRAME;
}
bool AssignType::removeRegFromLong(eReg regi, LOCAL_ID *locId)
{
m_lhs=lhs()->performLongRemoval(regi,locId);
return true;
}
void AssignType::lhs(Expr *l)
{
assert(dynamic_cast<UnaryOperator *>(l));
m_lhs=l;
}
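The newer labelSrch above replaces the indexed loop over icode[] with a std::find_if over the container. A minimal sketch of that lookup, assuming a simplified instruction record; `Inst` and `labelSearch` are hypothetical names, not dcc's types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <list>

// Hypothetical, simplified icode record: a label and its position.
struct Inst {
    uint32_t label;
    int loc_ip;
};

// Searches for the instruction whose label equals `target`.
// On success stores its position in `index` and returns true;
// on failure leaves `index` untouched and returns false.
bool labelSearch(const std::list<Inst>& code, uint32_t target, uint32_t& index) {
    auto it = std::find_if(code.begin(), code.end(),
                           [target](const Inst& i) { return i.label == target; });
    if (it == code.end())
        return false;
    index = static_cast<uint32_t>(it->loc_ip);
    return true;
}
```

Leaving the out-parameter untouched on failure mirrors the older loop, which only wrote *pIndex when a match was found.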


View File

@ -1,333 +0,0 @@
#include "arith_idioms.h"
#include "dcc.h"
#include "msvc_fixes.h"
#include <QtCore/QDebug>
using namespace std;
/*****************************************************************************
* idiom5 - Long addition.
* ADD reg/stackOff, reg/stackOff
* ADC reg/stackOff, reg/stackOff
* Eg: ADD ax, [bp-4]
* ADC dx, [bp-2]
* => dx:ax = dx:ax + [bp-2]:[bp-4]
* Found in Borland Turbo C code.
* Commonly used idiom for long addition.
****************************************************************************/
bool Idiom5::match(iICODE pIcode)
{
if(distance(pIcode,m_end)<2)
return false;
m_icodes[0]=pIcode++;
m_icodes[1]=pIcode++;
if (m_icodes[1]->ll()->match(iADC))
return true;
return false;
}
int Idiom5::action()
{
AstIdent *rhs,*lhs;
Expr *expr;
lhs = AstIdent::Long (&m_func->localId, DST, m_icodes[0], LOW_FIRST, m_icodes[0], USE_DEF, *m_icodes[1]->ll());
rhs = AstIdent::Long (&m_func->localId, SRC, m_icodes[0], LOW_FIRST, m_icodes[0], eUSE, *m_icodes[1]->ll());
expr = new BinaryOperator(ADD,lhs, rhs);
m_icodes[0]->setAsgn(lhs, expr);
m_icodes[1]->invalidate();
return 2;
}
/*****************************************************************************
* idiom6 - Long subtraction.
* SUB reg/stackOff, reg/stackOff
* SBB reg/stackOff, reg/stackOff
* Eg: SUB ax, [bp-4]
* SBB dx, [bp-2]
* => dx:ax = dx:ax - [bp-2]:[bp-4]
* Found in Borland Turbo C code.
* Commonly used idiom for long subtraction.
****************************************************************************/
bool Idiom6::match(iICODE pIcode)
{
if(distance(pIcode,m_end)<2)
return false;
m_icodes[0]=pIcode++;
m_icodes[1]=pIcode++;
if (m_icodes[1]->ll()->match(iSBB))
return true;
return false;
}
int Idiom6::action()
{
AstIdent *rhs,*lhs;
Expr *expr;
lhs = AstIdent::Long (&m_func->localId, DST, m_icodes[0], LOW_FIRST, m_icodes[0], USE_DEF, *m_icodes[1]->ll());
rhs = AstIdent::Long (&m_func->localId, SRC, m_icodes[0], LOW_FIRST, m_icodes[0], eUSE, *m_icodes[1]->ll());
expr = new BinaryOperator(SUB,lhs, rhs);
m_icodes[0]->setAsgn(lhs, expr);
m_icodes[1]->invalidate();
return 2;
}
/*****************************************************************************
* idiom 18: Post-increment or post-decrement in a conditional jump
* Used
* 0 MOV reg, var (including register variables)
* 1 INC var or DEC var <------------------------- input point
* 2 CMP var, Y
* 3 JX label
* => HLI_JCOND (var++ X Y)
* Eg: MOV ax, si
* INC si
* CMP ax, 8
* JL labX
* => HLI_JCOND (si++ < 8)
* Found in Borland Turbo C. Intrinsic to C languages.
****************************************************************************/
bool Idiom18::match(iICODE picode)
{
if(picode==m_func->Icode.begin())
return false;
if(std::distance(picode,m_end)<3)
return false;
--picode; //
for(int i=0; i<4; ++i)
m_icodes[i] =picode++;
m_idiom_type=-1;
m_is_dec = m_icodes[1]->ll()->match(iDEC);
uint8_t regi; /* register of the MOV */
if(not m_icodes[0]->ll()->matchWithRegDst(iMOV) )
return false;
regi = m_icodes[0]->ll()->m_dst.regi;
if( not ( m_icodes[2]->ll()->match(iCMP,regi) and
m_icodes[3]->ll()->conditionalJump() ) )
return false;
// Simple matching finished, select appropriate matcher based on dst type
/* Get variable */
if (m_icodes[1]->ll()->m_dst.regi == 0) /* global variable */
{
/* not supported yet */
m_idiom_type = 0;
}
else if ( m_icodes[1]->ll()->m_dst.isReg() ) /* register */
{
m_idiom_type = 1;
// if ((m_icodes[1]->ll()->dst.regi == rSI) and (m_func->flg & SI_REGVAR))
// m_idiom_type = 1;
// else if ((m_icodes[1]->ll()->dst.regi == rDI) and (m_func->flg & DI_REGVAR))
// m_idiom_type = 1;
}
else if (m_icodes[1]->ll()->m_dst.off) /* local variable */
m_idiom_type = 2;
else /* indexed */
{
m_idiom_type=3;
/* not supported yet */
ICODE &ic(*picode);
const Function *my_proc(ic.getParent()->getParent());
qWarning() << "Unsupported idiom18 type at"<< QString::number(ic.loc_ip,16)
<< "in"<< my_proc->name <<':'<< QString::number(my_proc->procEntry,16) << "- indexed";
}
switch(m_idiom_type)
{
case 0: // global
printf("Unsupported idiom18 type at %x : global variable\n",picode->loc_ip);
break;
case 1: /* register variable */
/* Check previous instruction for a MOV */
if ( m_icodes[0]->ll()->src().regi == m_icodes[1]->ll()->m_dst.regi)
{
return true;
}
break;
case 2: /* local */
if (m_icodes[0]->ll()->src().off == m_icodes[1]->ll()->m_dst.off)
{
return true;
}
break;
case 3: // indexed
printf("Untested idiom18 type: indexed\n");
if ((m_icodes[0]->ll()->src() == m_icodes[1]->ll()->m_dst))
{
return true;
}
break;
}
return false;
}
int Idiom18::action() // action length
{
Expr *rhs,*lhs;/* Pointers to left and right hand side exps */
Expr *expr;
lhs = AstIdent::id (*m_icodes[0]->ll(), SRC, m_func, m_icodes[1], *m_icodes[1], eUSE);
lhs = UnaryOperator::Create(m_is_dec ? POST_DEC : POST_INC, lhs);
rhs = AstIdent::id (*m_icodes[2]->ll(), SRC, m_func, m_icodes[1], *m_icodes[3], eUSE);
expr = new BinaryOperator(condOpJCond[m_icodes[3]->ll()->getOpcode() - iJB],lhs, rhs);
m_icodes[3]->setJCond(expr);
m_icodes[0]->invalidate();
m_icodes[1]->invalidate();
m_icodes[2]->invalidate();
return 3;
}
/*****************************************************************************
* idiom 19: pre-increment or pre-decrement in conditional jump, comparing against 0.
* [INC | DEC] var (including register vars)
* JX lab JX lab
* => HLI_JCOND (++var X 0) or HLI_JCOND (--var X 0)
* Eg: INC [bp+4]
* JG lab2
* => HLI_JCOND (++[bp+4] > 0)
* Found in Borland Turbo C. Intrinsic to C language.
****************************************************************************/
bool Idiom19::match(iICODE picode)
{
if(std::distance(picode,m_end)<2)
return false;
ICODE &ic(*picode);
int type;
for(int i=0; i<2; ++i)
m_icodes[i] =picode++;
m_is_dec = m_icodes[0]->ll()->match(iDEC);
if ( not m_icodes[1]->ll()->conditionalJump() )
return false;
if (m_icodes[0]->ll()->m_dst.regi == 0) /* global variable */
/* not supported yet */ ;
else if ( m_icodes[0]->ll()->m_dst.isReg() ) /* register */
{
// if (((picode->ll()->dst.regi == rSI) and (pproc->flg & SI_REGVAR)) or
// ((picode->ll()->dst.regi == rDI) and (pproc->flg & DI_REGVAR)))
return true;
}
else if (m_icodes[0]->ll()->m_dst.off) /* stack variable */
{
return true;
}
else /* indexed */
{
fprintf(stderr,"idiom19 : Untested type [indexed]\n");
return true;
/* not supported yet */
}
return false;
}
int Idiom19::action()
{
Expr *lhs,*expr;
lhs = AstIdent::id (*m_icodes[0]->ll(), DST, m_func, m_icodes[0], *m_icodes[1], eUSE);
lhs = UnaryOperator::Create(m_is_dec ? PRE_DEC : PRE_INC, lhs);
expr = new BinaryOperator(condOpJCond[m_icodes[1]->ll()->getOpcode() - iJB],lhs, new Constant(0, 2));
m_icodes[1]->setJCond(expr);
m_icodes[0]->invalidate();
return 2;
}
/*****************************************************************************
* idiom20: Pre increment/decrement in conditional expression (compares
* against a register, variable or constant different than 0).
* INC var or DEC var (including register vars)
* MOV reg, var MOV reg, var
* CMP reg, Y CMP reg, Y
* JX lab JX lab
* => HLI_JCOND (++var X Y) or HLI_JCOND (--var X Y)
* Eg: INC si (si is a register variable)
* MOV ax, si
* CMP ax, 2
* JL lab4
* => HLI_JCOND (++si < 2)
* Found in Turbo C. Intrinsic to C language.
****************************************************************************/
bool Idiom20::match(iICODE picode)
{
uint8_t type = 0; /* type of variable: 1 = reg-var, 2 = local */
uint8_t regi; /* register of the MOV */
if(std::distance(picode,m_end)<4)
return false;
for(int i=0; i<4; ++i)
m_icodes[i] =picode++;
/* Check second instruction for a MOV */
if( not m_icodes[1]->ll()->matchWithRegDst(iMOV) )
return false;
m_is_dec = m_icodes[0]->ll()->match(iDEC) ? PRE_DEC : PRE_INC;
const LLOperand &ll_dest(m_icodes[0]->ll()->m_dst);
/* Get variable */
if (ll_dest.regi == 0) /* global variable */
{
/* not supported yet */ ;
}
else if ( ll_dest.isReg() ) /* register */
{
type = 1;
// if ((ll_dest.regi == rSI) and (m_func->flg & SI_REGVAR))
// type = 1;
// else if ((ll_dest.regi == rDI) and (m_func->flg & DI_REGVAR))
// type = 1;
}
else if (ll_dest.off) /* local variable */
type = 2;
else /* indexed */
{
printf("idiom20 : Untested type [indexed]\n");
type = 3;
/* not supported yet */ ;
}
regi = m_icodes[1]->ll()->m_dst.regi;
const LLOperand &mov_src(m_icodes[1]->ll()->src());
if (m_icodes[2]->ll()->match(iCMP,(eReg)regi) and m_icodes[3]->ll()->conditionalJump())
{
switch(type)
{
case 1: /* register variable */
if ((mov_src.regi == ll_dest.regi))
{
return true;
}
break;
case 2: // local
if ((mov_src.off == ll_dest.off))
{
return true;
}
break;
case 3:
fprintf(stderr,"Test 3 ");
if ((mov_src == ll_dest))
{
return true;
}
break;
}
}
return false;
}
int Idiom20::action()
{
Expr *lhs,*rhs,*expr;
lhs = AstIdent::id (*m_icodes[1]->ll(), SRC, m_func, m_icodes[0], *m_icodes[0], eUSE);
lhs = UnaryOperator::Create(m_is_dec, lhs);
rhs = AstIdent::id (*m_icodes[2]->ll(), SRC, m_func, m_icodes[0], *m_icodes[3], eUSE);
expr = new BinaryOperator(condOpJCond[m_icodes[3]->ll()->getOpcode() - iJB],lhs, rhs);
m_icodes[3]->setJCond(expr);
for(int i=0; i<3; ++i)
m_icodes[i]->invalidate();
return 4;
}
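Idiom 5 above folds an ADD/ADC pair into a single long addition on a register pair. The sketch below shows why that rewrite is sound: adding the low words, then adding the high words plus the carry, gives the same result as one 32-bit add. `RegPair`, `addAdc`, and `toLong` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Models a 16-bit register pair such as dx:ax (hi:lo).
struct RegPair {
    uint16_t hi;
    uint16_t lo;
};

// Emulates  ADD lo, src.lo  followed by  ADC hi, src.hi.
RegPair addAdc(RegPair dst, RegPair src) {
    uint32_t lowSum = static_cast<uint32_t>(dst.lo) + src.lo;   // ADD
    uint16_t carry = static_cast<uint16_t>(lowSum >> 16);       // carry flag
    dst.lo = static_cast<uint16_t>(lowSum);
    dst.hi = static_cast<uint16_t>(dst.hi + src.hi + carry);    // ADC
    return dst;
}

// Reads the pair as one 32-bit value.
uint32_t toLong(RegPair r) {
    return (static_cast<uint32_t>(r.hi) << 16) | r.lo;
}
```

The same argument applies to idiom 6 with SUB/SBB and a borrow in place of the carry.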

View File

@ -1,114 +0,0 @@
#include "call_idioms.h"
#include "dcc.h"
#include "msvc_fixes.h"
using namespace std;
/*****************************************************************************
* idiom3 - C calling convention.
* CALL(F) proc_X
* ADD SP, immed
* Eg: CALL proc_X
* ADD SP, 6
* => pProc->cbParam = immed
* Special case: when the call is at the end of the procedure,
* sometimes the stack gets restored by a MOV sp, bp.
* Need to flag the procedure in these cases.
* Used by compilers to restore the stack when invoking a procedure using
* the C calling convention.
****************************************************************************/
bool Idiom3::match(iICODE picode)
{
if(distance(picode,m_end)<2)
return false;
m_param_count=0;
/* Match ADD SP, immed */
for(int i=0; i<2; ++i)
m_icodes[i] = picode++;
if ( m_icodes[1]->ll()->testFlags(I) and m_icodes[1]->ll()->match(iADD,rSP))
{
m_param_count = m_icodes[1]->ll()->src().getImm2();
return true;
}
else if (m_icodes[1]->ll()->match(iMOV,rSP,rBP))
{
m_icodes[0]->ll()->setFlags(REST_STK);
return true;
}
return 0;
}
int Idiom3::action()
{
if (m_icodes[0]->ll()->testFlags(I) )
{
m_icodes[0]->ll()->src().addProcInformation(m_param_count,CConv::eCdecl);
}
else
{
printf("Indirect call at idiom3\n");
}
m_icodes[1]->invalidate();
return 2;
}
/*****************************************************************************
* idiom 17 - C calling convention.
* CALL(F) xxxx
* POP reg
* [POP reg] reg in {AX, BX, CX, DX}
* Eg: CALL proc_X
* POP cx
* POP cx (4 bytes of arguments)
* => pProc->cbParam = # pops * 2
* Found in Turbo C when restoring the stack for a procedure that uses the
* C calling convention. Used to restore the stack of 2 or 4 bytes args.
****************************************************************************/
bool Idiom17::match(iICODE picode)
{
if(distance(picode,m_end)<2)
return false;
m_param_count=0; /* Count on # pops */
m_icodes.clear();
/* Collect the CALL and the instruction that follows it */
for(int i=0; i<2; ++i)
m_icodes.push_back(picode++);
uint8_t regi;
/* Match POP reg */
if (m_icodes[1]->ll()->match(iPOP))
{
int i=0;
regi = m_icodes[1]->ll()->m_dst.regi;
if ((regi >= rAX) and (regi <= rBX))
i++;
while (picode != m_end and picode->ll()->match(iPOP))
{
if (picode->ll()->m_dst.regi != regi)
break;
i++;
m_icodes.push_back(picode++);
}
m_param_count = i*2;
}
return m_param_count!=0;
}
int Idiom17::action()
{
if (m_icodes[0]->ll()->testFlags(I))
{
m_icodes[0]->ll()->src().addProcInformation(m_param_count,CConv::eCdecl);
for(size_t idx=1; idx<m_icodes.size(); ++idx)
{
m_icodes[idx]->invalidate();
}
}
// TODO : it's a calculated call
else
{
printf("Indirect call at idiom17\n");
}
return m_icodes.size();
}
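Idiom 17 above derives the argument size of a C-convention call from the run of POP instructions that follows it (two bytes per POP into the same register). A simplified sketch of that counting step, assuming a toy instruction record; `Ins` and `popCleanupBytes` are hypothetical names, and the register-range check of the original matcher is deliberately omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy instruction: mnemonic plus a single operand.
struct Ins {
    std::string op;
    std::string reg;
};

// Given the index of a CALL, returns the number of argument bytes removed
// by the POPs that immediately follow it, or 0 if no POP follows.
int popCleanupBytes(const std::vector<Ins>& code, std::size_t callIdx) {
    std::size_t i = callIdx + 1;
    if (i >= code.size() || code[i].op != "POP")
        return 0;
    const std::string reg = code[i].reg;   // all POPs must target this register
    int pops = 0;
    while (i < code.size() && code[i].op == "POP" && code[i].reg == reg) {
        ++pops;
        ++i;
    }
    return pops * 2;                        // each POP discards one 16-bit word
}
```

For the sequence CALL proc_X, POP cx, POP cx from the idiom's comment, the sketch reports 4 bytes of arguments.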

View File

@ -1,155 +0,0 @@
#include "epilogue_idioms.h"
#include "dcc.h"
#include "msvc_fixes.h"
/*****************************************************************************
* popStkVars - checks for
* [POP DI]
* [POP SI]
* or [POP SI]
* [POP DI]
****************************************************************************/
void EpilogIdiom::popStkVars(iICODE pIcode)
{
// TODO : only process SI-DI DI-SI pairings, no SI-SI, DI-DI like it's now
/* Match [POP DI] */
if (pIcode->ll()->match(iPOP))
{
if ((m_func->flg & DI_REGVAR) and pIcode->ll()->match(rDI))
m_icodes.push_front(pIcode);
else if ((m_func->flg & SI_REGVAR) and pIcode->ll()->match(rSI))
m_icodes.push_front(pIcode);
}
++pIcode;
if(pIcode==m_end)
return;
/* Match [POP SI] */
if (pIcode->ll()->match(iPOP))
{
if ((m_func->flg & SI_REGVAR) and pIcode->ll()->match(rSI))
m_icodes.push_front(pIcode);
else if ((m_func->flg & DI_REGVAR) and pIcode->ll()->match(rDI))
m_icodes.push_front(pIcode);
}
}
/*****************************************************************************
* idiom2 - HLL procedure epilogue; Returns number of instructions matched.
* [POP DI]
* [POP SI]
* MOV SP, BP
* POP BP
* RET(F)
*****************************************************************************/
bool Idiom2::match(iICODE pIcode)
{
iICODE nicode;
if(pIcode==m_func->Icode.begin()) // pIcode->loc_ip == 0
return false;
if ( pIcode->ll()->testFlags(I) or (not pIcode->ll()->match(rSP,rBP)) )
return false;
if(distance(pIcode,m_end)<3)
return false;
/* Matched MOV SP, BP */
m_icodes.clear();
m_icodes.push_back(pIcode);
/* Get next icode, skip over holes in the icode array */
nicode = ++iICODE(pIcode);
while ((nicode != m_end) and nicode->ll()->testFlags(NO_CODE)) // check for end before dereferencing
{
nicode++;
}
if(nicode == m_end)
return false;
if (nicode->ll()->match(iPOP,rBP) and not (nicode->ll()->testFlags(I | TARGET | CASE)) )
{
m_icodes.push_back(nicode++); // Matched POP BP
/* Match RET(F) */
if ( nicode != m_end and
not (nicode->ll()->testFlags(I | TARGET | CASE)) and
(nicode->ll()->match(iRET) or nicode->ll()->match(iRETF))
)
{
m_icodes.push_back(nicode); // Matched RET
advance(pIcode,-2); // move back before our start
popStkVars (pIcode); // and add optional pop di/si to m_icodes
return true;
}
}
return false;
}
int Idiom2::action()
{
for(size_t idx=0; idx<m_icodes.size()-1; ++idx) // don't invalidate last entry
m_icodes[idx]->invalidate();
return 3;
}
/*****************************************************************************
* idiom4 - Pascal calling convention.
* RET(F) immed
* ==> pProc->cbParam = immed
* sets CALL_PASCAL flag
* - Second version: check for optional pop of stack vars
* [POP DI]
* [POP SI]
* POP BP
* RET(F) [immed]
* - Third version: pop stack vars
* [POP DI]
* [POP SI]
* RET(F) [immed]
****************************************************************************/
bool Idiom4::match(iICODE pIcode)
{
m_param_count = 0;
/* Check for [POP DI]
* [POP SI] */
if(distance(m_func->Icode.begin(),pIcode)>=3)
{
iICODE search_at(pIcode);
advance(search_at,-3);
popStkVars(search_at);
}
if(pIcode != m_func->Icode.begin())
{
iICODE prev1 = --iICODE(pIcode);
/* Check for POP BP */
if (prev1->ll()->match(iPOP,rBP) and not prev1->ll()->testFlags(I) )
m_icodes.push_back(prev1);
else if(prev1!=m_func->Icode.begin())
{
iICODE search_at(pIcode);
advance(search_at,-2);
popStkVars (search_at);
}
}
/* Check for RET(F) immed */
if (pIcode->ll()->testFlags(I) )
{
m_param_count = (int16_t)pIcode->ll()->src().getImm2();
return true;
}
return false;
}
int Idiom4::action()
{
if( not m_icodes.empty()) // if not an empty RET[F] N
{
for(size_t idx=0; idx<m_icodes.size()-1; ++idx) // don't invalidate last entry
m_icodes[idx]->invalidate();
}
if(m_param_count)
{
m_func->cbParam = (int16_t)m_param_count;
m_func->callingConv(CConv::ePascal);
}
return 1;
}

View File

@ -1,140 +0,0 @@
#include "idiom1.h"
#include "dcc.h"
#include "msvc_fixes.h"
/*****************************************************************************
* checkStkVars - Checks for PUSH SI
* [PUSH DI]
* or PUSH DI
* [PUSH SI]
* In which case, the stack variable flags are set
****************************************************************************/
int Idiom1::checkStkVars (iICODE pIcode)
{
/* Look for PUSH SI */
int si_matched=0;
int di_matched=0;
if(pIcode==m_end)
return 0;
if (pIcode->ll()->match(iPUSH,rSI))
{
si_matched = 1;
++pIcode;
if ((pIcode != m_end) and pIcode->ll()->match(iPUSH,rDI)) // Look for PUSH DI
di_matched = 1;
}
else if (pIcode->ll()->match(iPUSH,rDI))
{
di_matched = 1;
++pIcode;
if ((pIcode != m_end) and pIcode->ll()->match(iPUSH,rSI)) // Look for PUSH SI
si_matched = 1;
}
m_func->flg |= (si_matched ? SI_REGVAR : 0) | (di_matched ? DI_REGVAR : 0);
return si_matched+di_matched;
}
/*****************************************************************************
* idiom1 - HLL procedure prologue; Returns number of instructions matched.
* PUSH BP ==> ENTER immed, 0
* MOV BP, SP and sets PROC_HLL flag
* [SUB SP, immed]
* [PUSH SI]
* [PUSH DI]
* - Second version: Push stack variables and then save BP
* PUSH BP
* PUSH SI
* [PUSH DI]
* MOV BP, SP
* - Third version: Stack variables
* [PUSH SI]
* [PUSH DI]
****************************************************************************/
bool Idiom1::match(iICODE picode)
{
//uint8_t type = 0; /* type of variable: 1 = reg-var, 2 = local */
//uint8_t regi; /* register of the MOV */
if(m_func->flg & PROC_HLL)
return false;
if(picode==m_end)
return false;
//int n;
m_icodes.clear();
m_min_off = 0;
/* PUSH BP as first instruction of procedure */
if ( (not picode->ll()->testFlags(I)) and picode->ll()->src().regi == rBP)
{
m_icodes.push_back( picode++ ); // insert iPUSH
if(picode==m_end)
return false;
/* MOV BP, SP as next instruction */
if ( not picode->ll()->testFlags(I | TARGET | CASE) and picode->ll()->match(iMOV ,rBP,rSP) )
{
m_icodes.push_back( picode++ ); // insert iMOV
if(picode==m_end)
return false;
m_min_off = 2;
/* Look for SUB SP, immed */
if (
picode->ll()->testFlags(I | TARGET | CASE) and picode->ll()->match(iSUB,rSP)
)
{
m_icodes.push_back( picode++ ); // insert iSUB
int n = checkStkVars (picode); // find iPUSH si [iPUSH di]
for(int i=0; i<n; ++i)
m_icodes.push_back(picode++); // insert
}
}
/* PUSH SI
* [PUSH DI]
* MOV BP, SP */
else
{
int n = checkStkVars (picode);
if (n > 0)
{
for(int i=0; i<n; ++i)
m_icodes.push_back(picode++);
if(picode == m_end)
return false;
/* Look for MOV BP, SP */
if ( picode != m_end and
not picode->ll()->testFlags(I | TARGET | CASE) and
picode->ll()->match(iMOV,rBP,rSP))
{
m_icodes.push_back(picode);
m_min_off = 2 + (n * 2);
}
else
return false; // Cristina: check this please!
}
else
return false; // Cristina: check this please!
}
}
else // push di [push si] / push si [push di]
{
size_t n = checkStkVars (picode);
for(size_t i=0; i<n; ++i)
m_icodes.push_back(picode++);
}
return not m_icodes.empty();
}
int Idiom1::action()
{
for(iICODE ic : m_icodes)
{
ic->invalidate();
}
m_func->flg |= PROC_HLL;
if(0!=m_min_off)
{
m_func->args.m_minOff = m_min_off;
m_func->flg |= PROC_IS_HLL;
}
return m_icodes.size();
}

View File

@ -1,116 +0,0 @@
#include "mov_idioms.h"
#include "dcc.h"
#include "msvc_fixes.h"
using namespace std;
/*****************************************************************************
* idiom 14 - Long uint16_t assign
* MOV regL, mem/reg
* XOR regH, regH
* Eg: MOV ax, di
* XOR dx, dx
* => MOV dx:ax, di
* Note: only the following combinations are allowed:
* dx:ax
* cx:bx
* this is to remove the possibility of making errors in situations
* like this:
* MOV dx, offH
* MOV ax, offL
* XOR cx, cx
* Found in Borland Turbo C, used for division of unsigned integer
* operands.
****************************************************************************/
bool Idiom14::match(iICODE pIcode)
{
if(distance(pIcode,m_end)<2)
return false;
m_icodes[0]=pIcode++;
m_icodes[1]=pIcode++;
LLInst * matched [] {m_icodes[0]->ll(),m_icodes[1]->ll()};
/* Check for regL */
m_regL = matched[0]->m_dst.regi;
if (not matched[0]->testFlags(I) and ((m_regL == rAX) or (m_regL ==rBX)))
{
/* Check for XOR regH, regH */
if (matched[1]->match(iXOR) and not matched[1]->testFlags(I))
{
m_regH = matched[1]->m_dst.regi;
if (m_regH == matched[1]->src().getReg2())
{
if ((m_regL == rAX) and (m_regH == rDX))
return true;
if ((m_regL == rBX) and (m_regH == rCX))
return true;
}
}
}
return false;
}
int Idiom14::action()
{
int idx = m_func->localId.newLongReg (TYPE_LONG_SIGN, LONGID_TYPE(m_regH,m_regL), m_icodes[0]);
AstIdent *lhs = AstIdent::LongIdx (idx);
m_icodes[0]->setRegDU( m_regH, eDEF);
Expr *rhs = AstIdent::id (*m_icodes[0]->ll(), SRC, m_func, m_icodes[0], *m_icodes[0], NONE);
m_icodes[0]->setAsgn(lhs, rhs);
m_icodes[1]->invalidate();
return 2;
}
/*****************************************************************************
* idiom 13 - uint16_t assign
* MOV regL, mem
* MOV regH, 0
* Eg: MOV al, [bp-2]
* MOV ah, 0
* => MOV ax, [bp-2]
* Found in Borland Turbo C, used for multiplication and division of
* uint8_t operands (ie. they need to be extended to words).
****************************************************************************/
bool Idiom13::match(iICODE pIcode)
{
if(distance(pIcode,m_end)<2)
return false;
m_icodes[0]=pIcode++;
m_icodes[1]=pIcode++;
m_loaded_reg = rUNDEF;
eReg regi;
/* Check for regL */
regi = m_icodes[0]->ll()->m_dst.regi;
if (not m_icodes[0]->ll()->testFlags(I) and (regi >= rAL) and (regi <= rBH))
{
/* Check for MOV regH, 0 */
if (m_icodes[1]->ll()->match(iMOV,I) and (m_icodes[1]->ll()->src().getImm2() == 0))
{
if (m_icodes[1]->ll()->m_dst.regi == (regi + 4)) //WARNING: based on distance between AH-AL,BH-BL etc.
{
m_loaded_reg=(eReg)(regi - rAL + rAX);
return true;
}
}
}
return false;
}
int Idiom13::action()
{
AstIdent *lhs;
Expr *rhs;
eReg regi = m_icodes[0]->ll()->m_dst.regi;
m_icodes[0]->du1.removeDef(regi);
//m_icodes[0]->du1.numRegsDef--; /* prev uint8_t reg def */
lhs = new RegisterNode(LLOperand(m_loaded_reg, 0), &m_func->localId);
m_icodes[0]->setRegDU( m_loaded_reg, eDEF);
rhs = AstIdent::id (*m_icodes[0]->ll(), SRC, m_func, m_icodes[0], *m_icodes[0], NONE);
m_icodes[0]->setAsgn(lhs, rhs);
m_icodes[1]->invalidate();
return 2;
}
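Idiom 13 above depends on the layout of the register enumeration: the high byte register sits four entries after its low byte (regi + 4), and the word register is found with regi - rAL + rAX. A minimal sketch of that arithmetic, assuming a hypothetical enumeration with the same relative layout; the names and numeric values below are illustrative and are not dcc's eReg.

```cpp
#include <cassert>

// Hypothetical register numbering: word registers, then low bytes, then
// high bytes, each group in the order A, C, D, B.
enum Reg { NO_REG = -1, AX, CX, DX, BX, AL, CL, DL, BL, AH, CH, DH, BH };

// Given the low byte register that was loaded and the register that was
// zeroed, returns the word register the pair forms, or NO_REG if the two
// are not the low/high halves of the same word register.
Reg widenedReg(Reg low, Reg zeroed) {
    if (low < AL || low > BL)
        return NO_REG;                       // not a low byte register
    if (zeroed != static_cast<Reg>(low + 4))
        return NO_REG;                       // high half of a different register
    return static_cast<Reg>(low - AL + AX);  // AL -> AX, BL -> BX, ...
}
```

With this layout, MOV al, [bp-2] followed by MOV ah, 0 maps to a single load into AX, which is the rewrite the idiom performs.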

View File

@ -1,109 +0,0 @@
#include "neg_idioms.h"
#include "dcc.h"
#include "msvc_fixes.h"
using namespace std;
/*****************************************************************************
* idiom11 - Negate long integer
* NEG regH
* NEG regL
* SBB regH, 0
* Eg: NEG dx
* NEG ax
* SBB dx, 0
* => dx:ax = - dx:ax
* Found in Borland Turbo C.
****************************************************************************/
bool Idiom11::match (iICODE picode)
{
//const char *matchstring="(oNEG rH) (oNEG rL) (SBB \rH i0)";
condId type; /* type of argument */
if(distance(picode,m_end)<3)
return false;
for(int i=0; i<3; ++i)
m_icodes[i]=picode++;
type = m_icodes[0]->ll()->idType(DST);
if(type==CONSTANT or type == OTHER)
return false;
/* Check NEG reg/mem
* SBB reg/mem, 0*/
if (not m_icodes[1]->ll()->match(iNEG) or not m_icodes[2]->ll()->match(iSBB))
return false;
switch (type)
{
case GLOB_VAR:
if ((m_icodes[2]->ll()->m_dst.segValue == m_icodes[0]->ll()->m_dst.segValue) and
(m_icodes[2]->ll()->m_dst.off == m_icodes[0]->ll()->m_dst.off))
return true;
break;
case REGISTER:
if (m_icodes[2]->ll()->m_dst.regi == m_icodes[0]->ll()->m_dst.regi)
return true;
break;
case PARAM:
case LOCAL_VAR:
if (m_icodes[2]->ll()->m_dst.off == m_icodes[0]->ll()->m_dst.off)
return true;
break;
default:
fprintf(stderr,"Idiom11::match unhandled type %d\n",type);
}
return false;
}
int Idiom11::action()
{
AstIdent *lhs;
Expr *rhs;
lhs = AstIdent::Long (&m_func->localId, DST, m_icodes[0], HIGH_FIRST,m_icodes[0], USE_DEF, *m_icodes[1]->ll());
rhs = UnaryOperator::Create(NEGATION, lhs);
m_icodes[0]->setAsgn(lhs, rhs);
m_icodes[1]->invalidate();
m_icodes[2]->invalidate();
return 3;
}
/*****************************************************************************
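* idiom 16: Logical negation
* NEG reg
* SBB reg, reg
* INC reg
* => ASGN reg, !reg
* Eg: NEG ax
* SBB ax, ax
* INC ax
* => ax = !ax
* NEG sets the carry flag iff reg was non-zero, SBB reg, reg then yields
* 0 or -1, and INC turns that into 1 or 0, i.e. the logical (not bitwise)
* negation of reg.
* Found in Borland Turbo C when negating logically.
****************************************************************************/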
bool Idiom16::match (iICODE picode)
{
//const char *matchstring="(oNEG rR) (oSBB rR rR) (oINC rR)";
if(distance(picode,m_end)<3)
return false;
for(int i=0; i<3; ++i)
m_icodes[i]=picode++;
uint8_t regi = m_icodes[0]->ll()->m_dst.regi;
if ((regi >= rAX) and (regi < INDEX_BX_SI))
{
if (m_icodes[1]->ll()->match(iSBB) and m_icodes[2]->ll()->match(iINC))
{
if ((m_icodes[1]->ll()->m_dst.regi == (m_icodes[1]->ll()->src().getReg2())) and
m_icodes[1]->ll()->match((eReg)regi) and
m_icodes[2]->ll()->match((eReg)regi))
return true;
}
}
return false;
}
int Idiom16::action()
{
AstIdent *lhs;
Expr *rhs;
lhs = new RegisterNode(*m_icodes[0]->ll()->get(DST),&m_func->localId);
rhs = UnaryOperator::Create(NEGATION, lhs->clone());
m_icodes[0]->setAsgn(lhs, rhs);
m_icodes[1]->invalidate();
m_icodes[2]->invalidate();
return 3;
}
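
Both negation idioms depend on how NEG sets the carry flag, so a small model of that flag is enough to check the rewrites. The sketch below is independent of dcc: `Cpu`, `neg16`, `sbb16`, `negLong`, `logicalNot`, and `checkNegIdioms` are hypothetical names, and only the carry behaviour needed here is modelled (NEG sets CF when its operand is non-zero, SBB subtracts CF, INC leaves CF unchanged).

```cpp
#include <cassert>
#include <cstdint>

// Minimal model of the carry flag; every other flag is ignored.
struct Cpu { bool cf = false; };

// NEG: two's-complement negate; CF is set when the operand was non-zero.
inline uint16_t neg16(Cpu &c, uint16_t v) {
    c.cf = (v != 0);
    return static_cast<uint16_t>(0u - v);
}

// SBB: dst - src - CF (the borrow it produces is not needed by these sketches).
inline uint16_t sbb16(const Cpu &c, uint16_t dst, uint16_t src) {
    return static_cast<uint16_t>(dst - src - (c.cf ? 1u : 0u));
}

// Idiom 11: NEG hi ; NEG lo ; SBB hi, 0  ==>  hi:lo = -(hi:lo)
inline uint32_t negLong(uint16_t hi, uint16_t lo) {
    Cpu c;
    hi = neg16(c, hi);
    lo = neg16(c, lo);      // CF now records whether lo was non-zero
    hi = sbb16(c, hi, 0);   // propagate the borrow into the high word
    return (static_cast<uint32_t>(hi) << 16) | lo;
}

// Idiom 16: NEG r ; SBB r, r ; INC r  ==>  r = !r
inline uint16_t logicalNot(uint16_t r) {
    Cpu c;
    r = neg16(c, r);
    r = sbb16(c, r, r);                 // 0 if CF clear, 0xFFFF if CF set
    r = static_cast<uint16_t>(r + 1u);  // INC: 1 if r was zero, else 0
    return r;
}

// Spot-checks both sequences against the plain C++ operators.
inline void checkNegIdioms() {
    const uint32_t samples[] = {0u, 1u, 0x0000FFFFu, 0x00010000u, 0x12345678u, 0xFFFFFFFFu};
    for (uint32_t x : samples) {
        assert(negLong(static_cast<uint16_t>(x >> 16), static_cast<uint16_t>(x)) == (0u - x));
        assert(logicalNot(static_cast<uint16_t>(x)) == (static_cast<uint16_t>(x) == 0 ? 1u : 0u));
    }
}
```

The idiom 16 result is always 0 or 1, which is consistent with the `NEGATION` expression built in `Idiom16::action()` representing a logical rather than a bitwise operation.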

Some files were not shown because too many files have changed in this diff