Compare commits
15 Commits
loader_sep...v1.0.1-alp
| Author | SHA1 | Date |
|---|---|---|
| | 97f093feaa | |
| | 3561de6e12 | |
| | e84d09b97c | |
| | d8a4fe1c04 | |
| | e4e6ad6415 | |
| | 2543617930 | |
| | bc5784a8f2 | |
| | 842687726f | |
| | c5c9196561 | |
| | a697ad05c0 | |
| | d8c66e7791 | |
| | 337a6c44aa | |
| | cde4484821 | |
| | 36b063c183 | |
| | 3603877f42 | |
68
3rd_party/libdisasm/INTEL_BUGS
vendored
Normal file
@@ -0,0 +1,68 @@
PMOVMSKB
Gd, Pq1H
PMOVMSKB
(66)
Gd, Vdq1H

should be

PMOVMSKB
Gd, Qq1H
PMOVMSKB
(66)
Gd, Wdq1H

The instruction represented by this opcode expression does not support any
operand to be a memory location.

MASKMOVQ
Pq, Pq1H
MASKMOVDQU
(66)
Vdq, Vdq1H

should be

MASKMOVQ
Pq, Pq1H
MASKMOVDQU
(66)
Vdq, Wdq1H

MOVMSKPS
Gd, Vps1H
MOVMSKPD
(66)
Gd, Vpd1H

should be

MOVMSKPS
Gd, Wps1H
MOVMSKPD
(66)
Gd, Wpd1H

The opcode table entries for LFS, LGS, and LSS

L[FGS]S
Mp

should be

L[FGS]S
Gv,Mp

MOVHLPS
Vps, Vps

MOVLHPS
Vps, Vps

should be

MOVHLPS
Vps, Wps

MOVLHPS
Vps, Wps
137
3rd_party/libdisasm/LICENSE
vendored
Normal file
@@ -0,0 +1,137 @@
The "Clarified Artistic License"

Preamble

The intent of this document is to state the conditions under which a
Package may be copied, such that the Copyright Holder maintains some
semblance of artistic control over the development of the package,
while giving the users of the package the right to use and distribute
the Package in a more-or-less customary fashion, plus the right to make
reasonable modifications.

Definitions:

"Package" refers to the collection of files distributed by the
Copyright Holder, and derivatives of that collection of files
created through textual modification.

"Standard Version" refers to such a Package if it has not been
modified, or has been modified in accordance with the wishes
of the Copyright Holder as specified below.

"Copyright Holder" is whoever is named in the copyright or
copyrights for the package.

"You" is you, if you're thinking about copying or distributing
this Package.

"Distribution fee" is a fee you charge for providing a copy of this
Package to another party.

"Freely Available" means that no fee is charged for the right to use
the item, though there may be fees involved in handling the item.

1. You may make and give away verbatim copies of the source form of the
Standard Version of this Package without restriction, provided that you
duplicate all of the original copyright notices and associated disclaimers.

2. You may apply bug fixes, portability fixes and other modifications
derived from the Public Domain, or those made Freely Available, or from
the Copyright Holder. A Package modified in such a way shall still be
considered the Standard Version.

3. You may otherwise modify your copy of this Package in any way, provided
that you insert a prominent notice in each changed file stating how and
when you changed that file, and provided that you do at least ONE of the
following:

    a) place your modifications in the Public Domain or otherwise make them
    Freely Available, such as by posting said modifications to Usenet or
    an equivalent medium, or placing the modifications on a major archive
    site allowing unrestricted access to them, or by allowing the Copyright
    Holder to include your modifications in the Standard Version of the
    Package.

    b) use the modified Package only within your corporation or organization.

    c) rename any non-standard executables so the names do not conflict
    with standard executables, which must also be provided, and provide
    a separate manual page for each non-standard executable that clearly
    documents how it differs from the Standard Version.

    d) make other distribution arrangements with the Copyright Holder.

    e) permit and encourge anyone who receives a copy of the modified Package
    permission to make your modifications Freely Available in some specific
    way.

4. You may distribute the programs of this Package in object code or
executable form, provided that you do at least ONE of the following:

    a) distribute a Standard Version of the executables and library files,
    together with instructions (in the manual page or equivalent) on where
    to get the Standard Version.

    b) accompany the distribution with the machine-readable source of
    the Package with your modifications.

    c) give non-standard executables non-standard names, and clearly
    document the differences in manual pages (or equivalent), together
    with instructions on where to get the Standard Version.

    d) make other distribution arrangements with the Copyright Holder.

    e) offer the machine-readable source of the Package, with your
    modifications, by mail order.

5. You may charge a distribution fee for any distribution of this Package.
If you offer support for this Package, you may charge any fee you choose
for that support. You may not charge a license fee for the right to use
this Package itself. You may distribute this Package in aggregate with
other (possibly commercial and possibly nonfree) programs as part of a
larger (possibly commercial and possibly nonfree) software distribution,
and charge license fees for other parts of that software distribution,
provided that you do not advertise this Package as a product of your own.
If the Package includes an interpreter, You may embed this Package's
interpreter within an executable of yours (by linking); this shall be
construed as a mere form of aggregation, provided that the complete
Standard Version of the interpreter is so embedded.

6. The scripts and library files supplied as input to or produced as
output from the programs of this Package do not automatically fall
under the copyright of this Package, but belong to whoever generated
them, and may be sold commercially, and may be aggregated with this
Package. If such scripts or library files are aggregated with this
Package via the so-called "undump" or "unexec" methods of producing a
binary executable image, then distribution of such an image shall
neither be construed as a distribution of this Package nor shall it
fall under the restrictions of Paragraphs 3 and 4, provided that you do
not represent such an executable image as a Standard Version of this
Package.

7. C subroutines (or comparably compiled subroutines in other
languages) supplied by you and linked into this Package in order to
emulate subroutines and variables of the language defined by this
Package shall not be considered part of this Package, but are the
equivalent of input as in Paragraph 6, provided these subroutines do
not change the language in any way that would cause it to fail the
regression tests for the language.

8. Aggregation of the Standard Version of the Package with a commercial
distribution is always permitted provided that the use of this Package is
embedded; that is, when no overt attempt is made to make this Package's
interfaces visible to the end user of the commercial distribution.
Such use shall not be construed as a distribution of this Package.

9. The name of the Copyright Holder may not be used to endorse or promote
products derived from this software without specific prior written permission.

10. THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.

The End
12
3rd_party/libdisasm/NAMESPACE.TXT
vendored
Normal file
@@ -0,0 +1,12 @@

The rewritten libdisasm code uses the following namespaces:


Prefix    Namespace
----------------------------------------------------
x86_      Global 'libdisasm' namespace
ia32_     Internal IA32 ISA namespace
ia64_     Internal IA64 ISA namespace
ix64_     Internal X86-64 ISA namespace

Note that the 64-bit ISAs are not yet supported/written.
2
3rd_party/libdisasm/README
vendored
Normal file
@@ -0,0 +1,2 @@
This is a cut-up version of libdisasm, originally from the bastard project: http://bastard.sourceforge.net/

43
3rd_party/libdisasm/TODO
vendored
Normal file
@@ -0,0 +1,43 @@
x86_format.c
------------
intel: jmpf -> jmp, callf -> call
att: jmpf -> ljmp, callf -> lcall

opcode table
------------
finish typing instructions
fix flag clear/set/toggle types

ix64 stuff
----------
document output file formats in web page
features doc: register aliases, implicit operands, stack mods,
              ring0 flags, eflags, cpu model/isa

ia32_handle_* implementation

fix operand 0F C2
CMPPS

* sysenter, sysexit as CALL types -- preceded by MSR writes
* SYSENTER/SYSEXIT stack : overwrites SS, ESP
* stos, cmps, scas, movs, ins, outs, lods -> OP_PTR
* OP_SIZE in implicit operands
* use OP_SIZE to choose reg sizes!

DONE?? :
implicit operands: provide action ?
    e.g. add/inc for stack, write, etc
replace table numbers in opcodes.dat with
    #defines for table names

replace 0 with INSN_INVALID [or maybe FF for invalid and 00 for Not Applicable]
no wait that is only for prefix tables -- n/p

if ( prefx) only use if insn != invalid

these should cover all the wacky disasm exceptions

for the rep one we can cheat, match only a 0x90

todo: privilege | ring
36
3rd_party/libdisasm/ia32_fixup.cpp
vendored
Normal file
@@ -0,0 +1,36 @@
#include <stdio.h>

static const char * mem_fixup[256] = {
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 00 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 08 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 10 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 18 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 20 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 28 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 30 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 38 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 40 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 48 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 50 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 58 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 60 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 68 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 70 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 78 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 80 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 88 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 90 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* 98 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* A0 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* A8 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* B0 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* B8 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* C0 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* C8 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* D0 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* D8 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* E0 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* E8 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, /* F0 */
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL /* F8 */
};
3206
3rd_party/libdisasm/ia32_opcode.dat
vendored
Normal file
File diff suppressed because it is too large
49
3rd_party/libdisasm/libdisasm.def
vendored
Normal file
@@ -0,0 +1,49 @@
;libdisasm.def : Declares the module parameters

LIBRARY "libdisasm.dll"
DESCRIPTION "libdisasm exported functions"
EXPORTS
    x86_addr_size @1
    x86_cleanup @2
    x86_disasm @3
    x86_disasm_forward @4
    x86_disasm_range @5
    x86_endian @6
    x86_format_header @7
    x86_format_insn @8
    x86_format_mnemonic @9
    x86_format_operand @10
    x86_fp_reg @11
    x86_get_branch_target @12
    x86_get_imm @13
    x86_get_options @14
    x86_get_raw_imm @15
    x86_get_rel_offset @16
    x86_imm_signsized @17
    x86_imm_sized @18
    x86_init @19
    x86_insn_is_tagged @20
    x86_insn_is_valid @21
    x86_invariant_disasm @22
    x86_ip_reg @23
    x86_max_insn_size @24
    x86_op_size @25
    x86_operand_1st @26
    x86_operand_2nd @27
    x86_operand_3rd @28
    x86_operand_count @29
    x86_operand_foreach @30
    x86_operand_new @31
    x86_operand_size @32
    x86_oplist_free @33
    x86_reg_from_id @34
    x86_report_error @35
    x86_set_insn_addr @36
    x86_set_insn_block @37
    x86_set_insn_function @38
    x86_set_insn_offset @39
    x86_set_options @40
    x86_set_reporter @41
    x86_size_disasm @42
    x86_sp_reg @43
    x86_tag_insn @44
@@ -1,5 +1,8 @@
PROJECT(dcc_original)
CMAKE_MINIMUM_REQUIRED(VERSION 2.8)
cmake_minimum_required(VERSION 2.8.9)
set(CMAKE_INCLUDE_CURRENT_DIR ON)
set(CMAKE_AUTOMOC ON)
find_package(Qt5Core)

OPTION(dcc_build_tests "Enable unit tests." OFF)
#SET(LIBRARY_OUTPUT_PATH ${PROJECT_SOURCE_DIR})
@@ -9,31 +12,41 @@ IF(CMAKE_BUILD_TOOL MATCHES "(msdev|devenv|nmake)")
    ADD_DEFINITIONS(/W4)
ELSE()
    #-D_GLIBCXX_DEBUG
    SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall --std=c++0x")
    SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -std=c++11")
    SET(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} " ) #--coverage
ENDIF()

SET(CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/CMakeScripts;${CMAKE_MODULE_PATH})
SET(EXECUTABLE_OUTPUT_PATH ${PROJECT_SOURCE_DIR})
include(cotire)
FIND_PACKAGE(LLVM)
FIND_PACKAGE(Boost)
IF(dcc_build_tests)
    enable_testing()
    FIND_PACKAGE(GMock)
ENDIF()

ADD_SUBDIRECTORY(3rd_party)

llvm_map_components_to_libraries(REQ_LLVM_LIBRARIES jit native mc support tablegen)

find_package(LLVM REQUIRED CONFIG)
llvm_map_components_to_libnames(REQ_LLVM_LIBRARIES native mc support tablegen)

INCLUDE_DIRECTORIES(
    3rd_party/libdisasm
    include
    include/idioms
    common
    ${Boost_INCLUDE_DIRS}
    ${LLVM_INCLUDE_DIRS}
)


ADD_SUBDIRECTORY(3rd_party)
ADD_SUBDIRECTORY(common)
ADD_SUBDIRECTORY(tools)


set(dcc_LIB_SOURCES
    src/CallConvention.cpp
    src/ast.cpp
    src/backend.cpp
    src/bundle.cpp
@@ -42,9 +55,9 @@ set(dcc_LIB_SOURCES
    src/control.cpp
    src/dataflow.cpp
    src/disassem.cpp
    src/DccFrontend.cpp
    src/error.cpp
    src/fixwild.cpp
    src/frontend.cpp
    src/graph.cpp
    src/hlicode.cpp
    src/hltype.cpp
@@ -63,7 +76,6 @@ set(dcc_LIB_SOURCES
    src/locident.cpp
    src/liveness_set.cpp
    src/parser.cpp
    src/perfhlib.cpp
    src/procs.cpp
    src/project.cpp
    src/Procedure.cpp
@@ -73,7 +85,7 @@ set(dcc_LIB_SOURCES
    src/symtab.cpp
    src/udm.cpp
    src/BasicBlock.cpp
    src/CallConvention.cpp
    src/dcc_interface.cpp
)
set(dcc_SOURCES
    src/dcc.cpp
@@ -82,6 +94,7 @@ set(dcc_HEADERS
    include/ast.h
    include/bundle.h
    include/BinaryImage.h
    include/DccFrontend.h
    include/dcc.h
    include/disassem.h
    include/dosdcc.h
@@ -100,7 +113,7 @@ set(dcc_HEADERS
    include/idioms/shift_idioms.h
    include/idioms/xor_idioms.h
    include/locident.h
    include/perfhlib.h
    include/CallConvention.h
    include/project.h
    include/scanner.h
    include/state.h
@@ -109,20 +122,23 @@ set(dcc_HEADERS
    include/Procedure.h
    include/StackFrame.h
    include/BasicBlock.h
    include/CallConvention.h
    include/dcc_interface.h

)

SOURCE_GROUP(Source FILES ${dcc_SOURCES})
SOURCE_GROUP(Headers FILES ${dcc_HEADERS})

ADD_LIBRARY(dcc_lib STATIC ${dcc_LIB_SOURCES} ${dcc_HEADERS})
qt5_use_modules(dcc_lib Core)
#cotire(dcc_lib)

ADD_EXECUTABLE(dcc_original ${dcc_SOURCES} ${dcc_HEADERS})
ADD_DEPENDENCIES(dcc_original dcc_lib)
TARGET_LINK_LIBRARIES(dcc_original LLVMSupport dcc_lib disasm_s ${REQ_LLVM_LIBRARIES} LLVMSupport)
TARGET_LINK_LIBRARIES(dcc_original dcc_lib dcc_hash disasm_s ${REQ_LLVM_LIBRARIES} LLVMSupport)
qt5_use_modules(dcc_original Core)
#ADD_SUBDIRECTORY(gui)
if(dcc_build_tests)
    ADD_SUBDIRECTORY(src)
endif()

File diff suppressed because it is too large
127
Readme.md
Normal file
@@ -0,0 +1,127 @@
I've fixed many issues in this codebase, including the memory reallocation that occurred during decompilation.

To reflect those fixes, I've edited the original readme a bit.

* * *

dcc Distribution
================

The code provided in this distribution is (C) by its authors:
- Cristina Cifuentes (most of dcc code)
- Mike van Emmerik (signatures and prototype code)
- Jeff Ledermann (some disassembly code)

and is provided "as is". A list of additional contributors is available
[on GitHub](https://github.com/nemerle/dcc/graphs/contributors).

The following files are included in the dccoo.tar.gz distribution:
- dcc.zip (dcc.exe DOS program, 1995)
- dccsrc.zip (source code *.c, *.h for dcc, 1993-1994)
- dcc32.zip (dcc_oo.exe 32 bit console (Win95/Win-NT) program, 1997)
- dccsrcoo.zip (source code *.cpp, *.h for "oo" dcc, 1993-1997)
- dccbsig.zip (library signatures for Borland C compilers, 1994)
- dccmsig.zip (library signatures for Microsoft C compilers, 1994)
- dcctpsig.zip (library signatures for Turbo Pascal compilers, 1994)
- dcclibs.dat (prototype file for C headers, 1994)
- test.zip (sample test files: *.c *.exe *.b, 1993-1996)
- makedsig.zip (creates a .sig file from a .lib C file, 1994)
- makedstp.zip (creates a .sig file from a Pascal library file, 1994)
- readsig.zip (reads signatures in a .sig file, 1994)
- dispsrch.zip (displays the name of a function given a signature, 1994)
- parsehdr.zip (generates a prototype file (dcclibs.dat) from C *.h files, 1994)

Note that the dcc_oo.exe program (in dcc32.zip) is a 32 bit program,
so it won't work under Windows 3.1. Also, it is a console mode program,
meaning that it has to be run in the "Command Prompt" window (sometimes
known as the "Dos Box"). It is not a GUI program.

The following files are included in the test.zip file: fibo,
benchsho, benchlng, benchfn, benchmul, byteops, intops, longops,
max, testlong, matrixmu, strlen, dhamp.

The version of dcc included in this distribution (dccsrcoo.zip and
dcc32.exe) is a bit better than the first release, but it is still
broken in some cases, and we do not have the time to work on this
project at present so we cannot provide any changes.

Comments on individual files:
- fibo (fibonacci): the small model (fibos.exe) decompiles correctly;
  the large model (fibol.exe) expects an extra argument for
  `scanf()`. This argument is the segment and is not displayed.
- benchsho: the first `scanf()` takes loc0 as an argument. This is
  part of a long variable, but dcc does not have any clue at that
  stage that the stack offset pushed on the stack is to be used
  as a long variable rather than an integer variable.
- benchlng: as part of the `main()` code, `LO(loc1) | HI(loc1)` should
  be displayed instead of `loc3 | loc9`. These two integer variables
  are equivalent to the one long loc1 variable.
- benchfn: see benchsho.
- benchmul: see benchsho.
- byteops: decompiles correctly.
- intops: the du analysis for `DIV` and `MOD` is broken. dcc currently
  generates code for a long and an integer temporary register that
  were used as part of the analysis.
- longops: decompiles correctly.
- max: decompiles correctly.
- testlong: this example decompiles correctly given the algorithms
  implemented in dcc. However, it shows that when long variables
  are defined and used as integers (or long) without giving dcc
  any hint that this is happening, the variable will be treated as
  two integer variables. This is due to the fact that the assembly
  code is in terms of integer registers, and long registers are not
  available in 80286, so a long variable is equivalent to two integer
  registers. dcc only knows of this through idioms such as adding two
  long variables.
- matrixmu: decompiles correctly. Shows that arrays are not supported
  in dcc.
- strlen: decompiles correctly. Shows that pointers are partially
  supported by dcc.
- dhamp: this program has far more data types than what dcc recognizes
  at present.
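
The benchlng and testlong notes above depend on how a 16-bit target holds a 32-bit `long` in two integer words. A minimal sketch of that relationship, assuming the `LO`/`HI` notation means the low and high 16-bit halves. The helper names `lo_word`, `hi_word`, `make_long`, and `long_is_nonzero` are hypothetical and are not part of dcc:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not dcc code): split a 32-bit long into the two
// 16-bit words that an 80286 compiler keeps in separate registers or slots.
inline std::uint16_t lo_word(std::uint32_t value) {
    return static_cast<std::uint16_t>(value & 0xFFFFu);
}

inline std::uint16_t hi_word(std::uint32_t value) {
    return static_cast<std::uint16_t>(value >> 16);
}

// Rebuild the long from its two halves.
inline std::uint32_t make_long(std::uint16_t lo, std::uint16_t hi) {
    return (static_cast<std::uint32_t>(hi) << 16) | lo;
}

// On a 16-bit CPU, "is this long non-zero?" can be compiled as an OR of the
// two halves. The assert documents that this matches the 32-bit comparison.
inline bool long_is_nonzero(std::uint16_t lo, std::uint16_t hi) {
    const bool by_halves = (lo | hi) != 0;
    assert(by_halves == (make_long(lo, hi) != 0));
    return by_halves;
}
```

An expression such as `loc3 | loc9` in decompiled output is presumably this kind of two-word test, which is why the note above would prefer it to be shown in terms of the single long variable `loc1`.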

Our thanks to Gary Shaffstall for some debugging work. Current bugs
are:

- [ ] if the code generated for one line is too long, the (static)
  buffer used for that line is clobbered. Solution: make the buffer
  larger (currently 200 chars).
- [ ] the large memory model problem & `scanf()`
- [ ] dcc's error message shows a p option available which doesn't
  exist, and doesn't show an i option which exists.
- [x] there is a nasty problem whereby some arrays can get reallocated
  to a new address, and some pointers can become invalid. This mainly
  tends to happen to larger executable files. A major rewrite will
  probably be required to fix this.
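
The first bug in the list above is a fixed-size line buffer being overrun. One common way to avoid that class of bug, sketched here under the assumption that lines are produced with printf-style formatting, is to measure the output first and then allocate exactly that much. `format_line` is a hypothetical helper, not a function in dcc:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not dcc code): printf-style formatting into storage
// that is sized to fit, so no fixed 200-character buffer can be overrun.
std::string format_line(const char *fmt, ...) {
    assert(fmt != nullptr);

    va_list args;
    va_start(args, fmt);
    va_list args_copy;
    va_copy(args_copy, args);  // the argument list is consumed once per pass

    // First pass: a null buffer of size 0 makes vsnprintf report the length
    // the output would need, excluding the terminating null.
    const int needed = std::vsnprintf(nullptr, 0, fmt, args);
    va_end(args);
    if (needed < 0) {  // encoding error
        va_end(args_copy);
        return std::string();
    }

    // Second pass: format into a buffer of exactly the required size.
    std::vector<char> buf(static_cast<std::size_t>(needed) + 1);
    std::vsnprintf(buf.data(), buf.size(), fmt, args_copy);
    va_end(args_copy);
    return std::string(buf.data(), static_cast<std::size_t>(needed));
}
```

Simply enlarging the static buffer, as the bug entry suggests, only moves the limit. Sizing the buffer per line removes it, at the cost of one extra formatting pass.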

For more information refer to the thesis "Reverse Compilation
Techniques" by Cristina Cifuentes, Queensland University of
Technology, 1994, and the dcc home page:
http://www.it.uq.edu.au/groups/csm/dcc_readme.html

Please note that the executable version of dcc provided in this
distribution does not necessarily match the source code provided;
some changes were made without us keeping track of every change.

Using dcc
---------

Here is a very brief summary of switches for dcc:

* `a1`, `a2`: assembler output, before and after re-ordering of input code
* `c`: Attempt to follow control through indirect call instructions
* `i`: Enter interactive disassembler
* `m`: Memory map
* `s`: Statistics summary
* `v`, `V`: verbose (and Very verbose)
* `o` filename: Use filename as assembler output file

If dcc encounters illegal instructions, it will attempt to enter the so-called
interactive disassembler. The idea of this was to allow commands to fix the
problem so that dcc could continue, but no such changes are implemented
as yet. (Note: the Unix versions do not have the interactive disassembler). If
you get into this, you can get out of it by pressing `^X` (control-X). Once dcc
has entered the interactive disassembler, however, there is little chance that
it will recover and produce useful output.

If dcc loads the signature file `dccxxx.sig`, this means that it has not
recognised the compiler library used. You can place the signatures in a
different directory from where you are working if you set the DCC environment
variable to point to their path. Note that if dcc can't find its signature
files, it will be severely handicapped.
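
The lookup described above can be sketched as follows. This is an illustration only: `join_signature_path` and `signature_path` are hypothetical names, and falling back to the bare file name when `DCC` is unset is an assumption rather than dcc's documented behaviour.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not dcc code): prefix a signature file name with the
// directory taken from the DCC environment variable, if one is given.
std::string join_signature_path(const char *dcc_dir, const std::string &file_name) {
    assert(!file_name.empty());
    if (dcc_dir == nullptr || *dcc_dir == '\0') {
        return file_name;  // assumption: look in the working directory
    }
    std::string path(dcc_dir);
    const char last = path.back();
    if (last != '/' && last != '\\') {
        path += '/';  // add a separator only when the directory lacks one
    }
    return path + file_name;
}

// Reads the DCC environment variable named in the readme.
std::string signature_path(const std::string &file_name) {
    return join_signature_path(std::getenv("DCC"), file_name);
}
```

Keeping the environment read in a thin wrapper means the path logic can be checked without changing the process environment.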
@@ -2,5 +2,6 @@
#cd bld
#make -j5
#cd ..
mkdir -p tests/outputs
./test_use_base.sh
./regression_tester.rb ./dcc_original -s -c 2>stderr >stdout; diff -wB tests/prev/ tests/outputs/

7
common/CMakeLists.txt
Normal file
@@ -0,0 +1,7 @@
set(SRC
    perfhlib.cpp
    perfhlib.h
    PatternCollector.h

)
add_library(dcc_hash STATIC ${SRC})
82
common/PatternCollector.h
Normal file
@@ -0,0 +1,82 @@
#ifndef PATTERNCOLLECTOR
#define PATTERNCOLLECTOR
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <vector>

#define SYMLEN 16 /* Number of chars in the symbol name, incl null */
#define PATLEN 23 /* Number of bytes in the pattern part */

struct HASHENTRY
{
    char name[SYMLEN]; /* The symbol name */
    uint8_t pat [PATLEN]; /* The pattern */
    uint16_t offset; /* Offset (needed temporarily) */
};

struct PatternCollector {
    uint8_t buf[100], bufSave[7]; /* Temp buffer for reading the file */
    uint16_t readShort(FILE *f)
    {
        uint8_t b1, b2;

        if (fread(&b1, 1, 1, f) != 1)
        {
            printf("Could not read\n");
            exit(11);
        }
        if (fread(&b2, 1, 1, f) != 1)
        {
            printf("Could not read\n");
            exit(11);
        }
        return (b2 << 8) + b1;
    }

    void grab(FILE *f,int n)
    {
        if (fread(buf, 1, n, f) != (size_t)n)
        {
            printf("Could not read\n");
            exit(11);
        }
    }

    uint8_t readByte(FILE *f)
    {
        uint8_t b;

        if (fread(&b, 1, 1, f) != 1)
        {
            printf("Could not read\n");
            exit(11);
        }
        return b;
    }

    uint16_t readWord(FILE *fl)
    {
        uint8_t b1, b2;

        b1 = readByte(fl);
        b2 = readByte(fl);

        return b1 + (b2 << 8);
    }

    /* Called by map(). Return the i+1th key in *pKeys */
    uint8_t *getKey(int i)
    {
        return keys[i].pat;
    }
    /* Display key i */
    void dispKey(int i)
    {
        printf("%s", keys[i].name);
    }
    std::vector<HASHENTRY> keys; /* array of keys */
    virtual int readSyms(FILE *f)=0;
};
#endif // PATTERNCOLLECTOR
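
Both `readShort` and `readWord` in the header above build a 16-bit value from two bytes, low byte first (little-endian). The same decoding can be sketched against an in-memory buffer, which is easier to test than a `FILE *`. `read_le16` is a hypothetical helper and is not part of this header:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of PatternCollector.h): decode a
// little-endian 16-bit value from a byte buffer at the given offset.
inline std::uint16_t read_le16(const std::uint8_t *bytes, std::size_t size,
                               std::size_t offset) {
    assert(bytes != nullptr);
    assert(offset + 2 <= size);  // both bytes must lie inside the buffer
    // The low byte comes first, the high byte second, matching
    // `b1 + (b2 << 8)` in readWord.
    return static_cast<std::uint16_t>(bytes[offset] | (bytes[offset + 1] << 8));
}
```

For the bytes `0x34, 0x12` this yields `0x1234`, the same value `readWord` would return after reading those two bytes from a file.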
440
common/perfhlib.cpp
Normal file
@@ -0,0 +1,440 @@
/*
 *$Log: perfhlib.c,v $
 * Revision 1.5 93/09/29 14:45:02 emmerik
 * Oops, didn't do the casts last check in
 *
 * Revision 1.4 93/09/29 14:41:45 emmerik
 * Added casts to mod instructions to keep the SVR4 compiler happy
 *
 *
 * Perfect hashing function library. Contains functions to generate perfect
 * hashing functions
 */
#include "perfhlib.h"
#include "PatternCollector.h"

#include <stdio.h>
#include <cassert>
#include <stdlib.h>
#include <string.h>
/* Private data structures */

//static int NumEntry; /* Number of entries in the hash table (# keys) */
//static int EntryLen; /* Size (bytes) of each entry (size of keys) */
//static int SetSize; /* Size of the char set */
//static char SetMin; /* First char in the set */
//static int NumVert; /* c times NumEntry */

//static uint16_t *T1base, *T2base; /* Pointers to start of T1, T2 */
static uint16_t *T1, *T2; /* Pointers to T1[i], T2[i] */

static int *graphNode; /* The array of edges */
static int *graphNext; /* Linked list of edges */
static int *graphFirst;/* First edge at a vertex */


static int numEdges; /* An edge counter */
static bool *visited; /* Array of bools: whether visited */
static bool *deleted; /* Array of bools: whether deleted */

/* Private prototypes */
static void initGraph(void);
static void addToGraph(int e, int v1, int v2);
static bool isCycle(void);
static void duplicateKeys(int v1, int v2);

void PerfectHash::setHashParams(int _NumEntry, int _EntryLen, int _SetSize, char _SetMin,
                                int _NumVert)
{
    /* These parameters are stored in statics so as to obviate the need for
       passing all these (or dereferencing pointers) for every call to hash()
    */

    NumEntry = _NumEntry;
    EntryLen = _EntryLen;
    SetSize = _SetSize;
    SetMin = _SetMin;
    NumVert = _NumVert;

    /* Allocate the variable sized tables etc */
    if ((T1base = (uint16_t *)malloc(EntryLen * SetSize * sizeof(uint16_t))) == 0)
    {
        goto BadAlloc;
    }
    if ((T2base = (uint16_t *)malloc(EntryLen * SetSize * sizeof(uint16_t))) == 0)
    {
        goto BadAlloc;
    }

    if ((graphNode = (int *)malloc((NumEntry*2 + 1) * sizeof(int))) == 0)
    {
        goto BadAlloc;
    }
    if ((graphNext = (int *)malloc((NumEntry*2 + 1) * sizeof(int))) == 0)
    {
        goto BadAlloc;
    }
    if ((graphFirst = (int *)malloc((NumVert + 1) * sizeof(int))) == 0)
    {
        goto BadAlloc;
    }

    if ((g = (short *)malloc((NumVert+1) * sizeof(short))) == 0)
    {
        goto BadAlloc;
    }
    if ((visited = (bool *)malloc((NumVert+1) * sizeof(bool))) == 0)
    {
        goto BadAlloc;
    }
    if ((deleted = (bool *)malloc((NumEntry+1) * sizeof(bool))) == 0)
    {
        goto BadAlloc;
    }
    return;

BadAlloc:
    printf("Could not allocate memory\n");
    hashCleanup();
    exit(1);
}

void PerfectHash::hashCleanup(void)
{
    /* Free the storage for variable sized tables etc */
    if (T1base) free(T1base);
    if (T2base) free(T2base);
    if (graphNode) free(graphNode);
    if (graphNext) free(graphNext);
    if (graphFirst) free(graphFirst);
    if (g) free(g);
    if (visited) free(visited);
    if (deleted) free(deleted);
}

void PerfectHash::map(PatternCollector *collector)
{
    m_collector = collector;
    assert(nullptr!=collector);
    int i, j, c;
    uint16_t f1, f2;
    bool cycle;
    uint8_t *keys;

    c = 0;

    do
    {
        initGraph();
        cycle = false;

        /* Randomly generate T1 and T2 */
        for (i=0; i < SetSize*EntryLen; i++)
        {
            T1base[i] = rand() % NumVert;
            T2base[i] = rand() % NumVert;
        }

        for (i=0; i < NumEntry; i++)
        {
            f1 = 0; f2 = 0;
            keys = m_collector->getKey(i);
            for (j=0; j < EntryLen; j++)
            {
                T1 = T1base + j * SetSize;
                T2 = T2base + j * SetSize;
                f1 += T1[keys[j] - SetMin];
                f2 += T2[keys[j] - SetMin];
            }
            f1 %= (uint16_t)NumVert;
            f2 %= (uint16_t)NumVert;
            if (f1 == f2)
            {
                /* A self loop. Reject! */
                printf("Self loop on vertex %d!\n", f1);
                cycle = true;
                break;
            }
            addToGraph(numEdges++, f1, f2);
        }
        if (cycle || (cycle = isCycle())) /* OK - is there a cycle? */
        {
            printf("Iteration %d\n", ++c);
        }
        else
        {
            break;
        }
    }
    while (/* there is a cycle */ 1);

}

/* Initialise the graph */
void PerfectHash::initGraph()
{
    int i;

    for (i=1; i <= NumVert; i++)
    {
        graphFirst[i] = 0;
    }

    for (i= -NumEntry; i <= NumEntry; i++)
    {
        /* No need to init graphNode[] as they will all be filled by successive
|
||||
calls to addToGraph() */
|
||||
graphNext[NumEntry+i] = 0;
|
||||
}
|
||||
|
||||
numEdges = 0;
|
||||
}
|
||||
|
||||
/* Add an edge e between vertices v1 and v2 */
/* e, v1, v2 are 0 based */
void PerfectHash::addToGraph(int e, int v1, int v2)
{
    e++; v1++; v2++; /* So much more convenient */

    graphNode[NumEntry+e] = v2; /* Insert the edge information */
    graphNode[NumEntry-e] = v1;

    graphNext[NumEntry+e] = graphFirst[v1]; /* Insert v1 to list of alphas */
    graphFirst[v1]= e;
    graphNext[NumEntry-e] = graphFirst[v2]; /* Insert v2 to list of omegas */
    graphFirst[v2]= -e;

}
|
||||
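The signed-index layout used by addToGraph() is easier to follow with a small standalone model. The block below is a minimal sketch, not the dcc code: `EdgeGraph`, `add()` and `neighbours()` are hypothetical names, and std::vector stands in for the malloc'd graphNode/graphNext/graphFirst arrays. It assumes the same convention as above (edge e stored at NumEntry+e and NumEntry-e, 0 terminating a list).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the graphNode/graphNext/graphFirst arrays.
// Edge e (1-based) occupies two slots: index numEntry+e holds one endpoint
// and index numEntry-e holds the other, so a signed edge id (+e or -e)
// selects "the far end" as seen from either endpoint.
struct EdgeGraph {
    int numEntry;
    std::vector<int> node;   // size 2*numEntry+1, indexed by numEntry +/- e
    std::vector<int> next;   // same indexing; 0 terminates a list
    std::vector<int> first;  // size numVert+1, vertices are 1-based inside

    EdgeGraph(int entries, int verts)
        : numEntry(entries),
          node(2 * entries + 1, 0),
          next(2 * entries + 1, 0),
          first(verts + 1, 0) {}

    // e, v1 and v2 are 0-based, as in addToGraph().
    void add(int e, int v1, int v2) {
        e++; v1++; v2++;
        node[numEntry + e] = v2;
        node[numEntry - e] = v1;
        next[numEntry + e] = first[v1];
        first[v1] = e;
        next[numEntry - e] = first[v2];
        first[v2] = -e;
    }

    // 0-based neighbours of 0-based vertex v, most recently added edge first.
    std::vector<int> neighbours(int v) const {
        std::vector<int> out;
        for (int e = first[v + 1]; e != 0; e = next[numEntry + e])
            out.push_back(node[numEntry + e] - 1);
        return out;
    }
};
```

Walking a vertex list this way is the same loop shape that DFS() and traverse() use: start at the first edge of the vertex and follow the next links until the 0 terminator.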
|
||||
bool PerfectHash::DFS(int parentE, int v)
|
||||
{
|
||||
int e, w;
|
||||
|
||||
/* Depth first search of the graph, starting at vertex v, looking for
cycles. parentE and v are origin 1. Note parentE is an EDGE,
not a vertex */
|
||||
|
||||
visited[v] = true;
|
||||
|
||||
/* For each e incident with v .. */
|
||||
for (e = graphFirst[v]; e; e = graphNext[NumEntry+e])
|
||||
{
|
||||
uint8_t *key1;
|
||||
|
||||
if (deleted[abs(e)])
|
||||
{
|
||||
/* A deleted key. Just ignore it */
|
||||
continue;
|
||||
}
|
||||
key1 = m_collector->getKey(abs(e)-1);
|
||||
w = graphNode[NumEntry+e];
|
||||
if (visited[w])
|
||||
{
|
||||
/* Did we just come through this edge? If so, ignore it. */
|
||||
if (abs(e) != abs(parentE))
|
||||
{
|
||||
/* There is a cycle in the graph. There is some subtle code here
   to work around the distinct possibility that there may be
   duplicate keys. Duplicate keys will always cause unit
   cycles, since f1 and f2 (used to select v and w) will be the
   same for both. The edges (representing an index into the
   array of keys) are distinct, but the key values are not.
   The logic is as follows: for the candidate edge e, check to
   see if it terminates in the parent vertex. If so, we test
   the keys associated with e and the parent, and if they are
   the same, we can safely ignore e for the purposes of cycle
   detection, since edge e adds nothing to the cycle. Cycles
   involving v, w, and e0 will still be found. The parent
   edge was not similarly eliminated because at the time when
   it was a candidate, v was not yet visited.
   We still have to remove the key from further consideration,
   since each edge is visited twice, but with a different
   parent edge each time.
*/
|
||||
/* We save some stack space by calculating the parent vertex
|
||||
for these relatively few cases where it is needed */
|
||||
int parentV = graphNode[NumEntry-parentE];
|
||||
|
||||
if (w == parentV)
|
||||
{
|
||||
uint8_t *key2;
|
||||
|
||||
key2=m_collector->getKey(abs(parentE)-1);
|
||||
if (memcmp(key1, key2, EntryLen) == 0)
|
||||
{
|
||||
printf("Duplicate keys with edges %d and %d (",
|
||||
e, parentE);
|
||||
m_collector->dispKey(abs(e)-1);
|
||||
printf(" & ");
|
||||
m_collector->dispKey(abs(parentE)-1);
|
||||
printf(")\n");
|
||||
deleted[abs(e)] = true; /* Wipe the key */
|
||||
}
|
||||
else
|
||||
{
|
||||
/* A genuine (unit) cycle. */
|
||||
printf("There is a unit cycle involving vertex %d and edge %d\n", v, e);
|
||||
return true;
|
||||
}
|
||||
|
||||
}
|
||||
else
|
||||
{
|
||||
/* We have reached a previously visited vertex not the
|
||||
parent. Therefore, we have uncovered a genuine cycle */
|
||||
printf("There is a cycle involving vertex %d and edge %d\n", v, e);
|
||||
return true;
|
||||
|
||||
}
|
||||
}
|
||||
}
|
||||
else /* Not yet seen. Traverse it */
|
||||
{
|
||||
if (DFS(e, w))
|
||||
{
|
||||
/* Cycle found deeper down. Exit */
|
||||
return true;
|
||||
}
|
||||
}
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
bool PerfectHash::isCycle(void)
|
||||
{
|
||||
int v, e;
|
||||
|
||||
for (v=1; v <= NumVert; v++)
|
||||
{
|
||||
visited[v] = false;
|
||||
}
|
||||
for (e=1; e <= NumEntry; e++)
|
||||
{
|
||||
deleted[e] = false;
|
||||
}
|
||||
for (v=1; v <= NumVert; v++)
|
||||
{
|
||||
if (!visited[v])
|
||||
{
|
||||
if (DFS(-32767, v))
|
||||
{
|
||||
return true;
|
||||
}
|
||||
}
|
||||
}
|
||||
return false;
|
||||
}
|
||||
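isCycle() above decides whether the random T1/T2 tables must be regenerated. For comparison only, the same yes/no question can be answered with a union-find (disjoint-set) structure instead of a DFS. This is a swapped-in technique, not what the diff does: `hasCycle` is a hypothetical helper, it takes a plain edge list, and it does not reproduce the duplicate-key handling of DFS().

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Returns true if the undirected graph on numVert vertices (0-based)
// described by `edges` contains a cycle. Self loops and parallel edges
// both count as cycles here.
bool hasCycle(int numVert, const std::vector<std::pair<int, int>> &edges) {
    std::vector<int> parent(numVert);
    std::iota(parent.begin(), parent.end(), 0);   // every vertex is its own set

    // Find the representative of x, shortening the path as we go.
    auto find = [&parent](int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    };

    for (const auto &edge : edges) {
        int a = find(edge.first);
        int b = find(edge.second);
        if (a == b)
            return true;   // endpoints already connected: this edge closes a cycle
        parent[a] = b;     // merge the two components
    }
    return false;
}
```

Because parallel edges are reported as cycles, two identical keys would always force a retry in this sketch, which is exactly the case the DFS version goes out of its way to tolerate.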
|
||||
void PerfectHash::traverse(int u)
|
||||
{
|
||||
int w, e;
|
||||
|
||||
visited[u] = true;
|
||||
/* Find w, the neighbours of u, by searching the edges e associated with u */
|
||||
e = graphFirst[1+u];
|
||||
while (e)
|
||||
{
|
||||
w = graphNode[NumEntry+e]-1;
|
||||
if (!visited[w])
|
||||
{
|
||||
g[w] = (abs(e)-1 - g[u]) % NumEntry;
|
||||
if (g[w] < 0) g[w] += NumEntry; /* Keep these positive */
|
||||
traverse(w);
|
||||
}
|
||||
e = graphNext[NumEntry+e];
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
void PerfectHash::assign(void)
|
||||
{
|
||||
int v;
|
||||
|
||||
|
||||
for (v=0; v < NumVert; v++)
|
||||
{
|
||||
g[v] = 0; /* g is sparse; leave the gaps 0 */
|
||||
visited[v] = false;
|
||||
}
|
||||
|
||||
for (v=0; v < NumVert; v++)
|
||||
{
|
||||
if (!visited[v])
|
||||
{
|
||||
g[v] = 0;
|
||||
traverse(v);
|
||||
}
|
||||
}
|
||||
}
|
||||
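The labelling done by assign() and traverse() relies on the graph being acyclic: each tree root gets g = 0, and every newly reached vertex w gets the value that makes (g[u] + g[w]) % NumEntry equal to the index of the connecting edge. A minimal sketch of that idea follows, assuming a plain edge list and an explicit stack instead of recursion; `assignLabels` is a hypothetical name, not part of the diff.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Given an acyclic graph whose i-th edge joins edges[i].first and
// edges[i].second, label the vertices so that
//   (g[u] + g[v]) % numEntry == i   for every edge i = (u, v).
std::vector<int> assignLabels(int numVert,
                              const std::vector<std::pair<int, int>> &edges) {
    const int numEntry = static_cast<int>(edges.size());

    // Adjacency list of (neighbour, edge index) pairs.
    std::vector<std::vector<std::pair<int, int>>> adj(numVert);
    for (int i = 0; i < numEntry; i++) {
        adj[edges[i].first].push_back({edges[i].second, i});
        adj[edges[i].second].push_back({edges[i].first, i});
    }

    std::vector<int> g(numVert, 0);
    std::vector<bool> visited(numVert, false);
    std::vector<int> stack;
    for (int root = 0; root < numVert; root++) {
        if (visited[root]) continue;
        visited[root] = true;          // root of a new tree keeps g == 0
        stack.push_back(root);
        while (!stack.empty()) {
            int u = stack.back();
            stack.pop_back();
            for (const auto &p : adj[u]) {
                int w = p.first, e = p.second;
                if (visited[w]) continue;
                visited[w] = true;
                // Keep the label non-negative, as traverse() does.
                g[w] = ((e - g[u]) % numEntry + numEntry) % numEntry;
                stack.push_back(w);
            }
        }
    }
    return g;
}
```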
|
||||
int PerfectHash::hash(uint8_t *string)
{
    uint16_t u, v;
    int j;

    u = 0;
    for (j=0; j < EntryLen; j++)
    {
        T1 = T1base + j * SetSize;
        u += T1[string[j] - SetMin];
    }
    u %= NumVert;

    v = 0;
    for (j=0; j < EntryLen; j++)
    {
        T2 = T2base + j * SetSize;
        v += T2[string[j] - SetMin];
    }
    v %= NumVert;

    return (g[u] + g[v]) % NumEntry;
}
|
||||
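To see hash() produce a perfect hash end to end, here is a tiny worked instance. It is a sketch under stated assumptions: `HashTables` and `lookup` are hypothetical names, the tables are written by hand rather than generated by map() and assign(), and the two one-byte keys 'a' and 'b' are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical container for the finished tables.
struct HashTables {
    int entryLen, setSize, numVert, numEntry;
    char setMin;
    std::vector<uint16_t> t1, t2;   // entryLen * setSize entries each
    std::vector<int16_t> g;         // numVert entries
};

// Same arithmetic as PerfectHash::hash(): sum one table entry per key
// byte, reduce modulo numVert, then combine the two g[] labels.
int lookup(const HashTables &h, const uint8_t *key) {
    uint16_t u = 0, v = 0;
    for (int j = 0; j < h.entryLen; j++) {
        int c = key[j] - h.setMin;
        u += h.t1[j * h.setSize + c];
        v += h.t2[j * h.setSize + c];
    }
    u %= h.numVert;
    v %= h.numVert;
    return (h.g[u] + h.g[v]) % h.numEntry;
}
```

With T1 = {0, 1}, T2 = {1, 2} and g = {0, 0, 1}, key 'a' selects vertices (0, 1) and key 'b' selects (1, 2), so the two keys map to the distinct slots 0 and 1.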
|
||||
#if 0
|
||||
void dispRecord(int i);
|
||||
|
||||
void
|
||||
duplicateKeys(int v1, int v2)
|
||||
{
|
||||
int i, j;
|
||||
uint8_t *keys;
|
||||
int u, v;
|
||||
|
||||
v1--; v2--; /* These guys are origin 1 */
|
||||
|
||||
printf("Duplicate keys:\n");
|
||||
|
||||
for (i=0; i < NumEntry; i++)
|
||||
{
|
||||
getKey(i, &keys);
|
||||
u = 0;
|
||||
for (j=0; j < EntryLen; j++)
|
||||
{
|
||||
T1 = T1base + j * SetSize;
|
||||
u += T1[keys[j] - SetMin];
|
||||
}
|
||||
u %= NumVert;
|
||||
if ((u != v1) && (u != v2)) continue;
|
||||
|
||||
v = 0;
|
||||
for (j=0; j < EntryLen; j++)
|
||||
{
|
||||
T2 = T2base + j * SetSize;
|
||||
v += T2[keys[j] - SetMin];
|
||||
}
|
||||
v %= NumVert;
|
||||
|
||||
if ((v == v2) || (v == v1))
|
||||
{
|
||||
printf("Entry #%d key: ", i+1);
|
||||
for (j=0; j < EntryLen; j++) printf("%02X ", keys[j]);
|
||||
printf("\n");
|
||||
dispRecord(i+1);
|
||||
}
|
||||
}
|
||||
exit(1);
|
||||
|
||||
|
||||
}
|
||||
#endif
|
||||
37
common/perfhlib.h
Normal file
@@ -0,0 +1,37 @@
|
||||
#include <stdint.h>
|
||||
/** Perfect hashing function library. Contains functions to generate perfect
|
||||
hashing functions */
|
||||
struct PatternCollector;
|
||||
struct PerfectHash {
|
||||
uint16_t *T1base;
|
||||
uint16_t *T2base; /* Pointers to start of T1, T2 */
|
||||
short *g; /* g[] */
|
||||
|
||||
int NumEntry; /* Number of entries in the hash table (# keys) */
|
||||
int EntryLen; /* Size (bytes) of each entry (size of keys) */
|
||||
int SetSize; /* Size of the char set */
|
||||
char SetMin; /* First char in the set */
|
||||
int NumVert; /* c times NumEntry */
|
||||
/** Set the parameters for the hash table */
|
||||
void setHashParams(int _numEntry, int _entryLen, int _setSize, char _setMin, int _numVert);
|
||||
|
||||
public:
|
||||
void map(PatternCollector * collector); /* Part 1 of creating the tables */
|
||||
void hashCleanup(); /* Frees memory allocated by setHashParams() */
|
||||
void assign(); /* Part 2 of creating the tables */
|
||||
int hash(uint8_t *string); /* Hash the string to an int 0 .. NUMENTRY-1 */
|
||||
const uint16_t *readT1(void) const { return T1base; }
|
||||
const uint16_t *readT2(void) const { return T2base; }
|
||||
const uint16_t *readG(void) const { return (uint16_t *)g; }
|
||||
uint16_t *readT1(void){ return T1base; }
|
||||
uint16_t *readT2(void){ return T2base; }
|
||||
uint16_t *readG(void) { return (uint16_t *)g; }
|
||||
private:
|
||||
void initGraph();
|
||||
void addToGraph(int e, int v1, int v2);
|
||||
bool isCycle();
|
||||
bool DFS(int parentE, int v);
|
||||
void traverse(int u);
|
||||
PatternCollector *m_collector; /* used to retrieve the keys */
|
||||
|
||||
};
|
||||
@@ -1,3 +1,4 @@
|
||||
#!/bin/bash
|
||||
mkdir -p tests/outputs
|
||||
./test_use_all.sh
|
||||
./regression_tester.rb ./dcc_original -s -c 2>stderr >stdout; diff -wB tests/prev/ tests/outputs/
|
||||
|
||||
@@ -1,5 +1,6 @@
|
||||
#pragma once
|
||||
#include <stdint.h>
|
||||
#include <vector>
|
||||
struct PROG /* Loaded program image parameters */
|
||||
{
|
||||
int16_t initCS;
|
||||
@@ -8,15 +9,17 @@ struct PROG /* Loaded program image parameters */
|
||||
uint16_t initSP;
|
||||
bool fCOM; /* Flag set if COM program (else EXE)*/
|
||||
int cReloc; /* No. of relocation table entries */
|
||||
uint32_t * relocTable; /* Ptr. to relocation table */
|
||||
std::vector<uint32_t> relocTable; /* Relocation table */
|
||||
uint8_t * map; /* Memory bitmap ptr */
|
||||
int cProcs; /* Number of procedures so far */
|
||||
int offMain; /* The offset of the main() proc */
|
||||
uint16_t segMain; /* The segment of the main() proc */
|
||||
bool bSigs; /* True if signatures loaded */
|
||||
int cbImage; /* Length of image in bytes */
|
||||
const uint8_t *image() const {return Imagez;}
|
||||
uint8_t * Imagez; /* Allocated by loader to hold entire program image */
|
||||
int addressingMode;
|
||||
public:
|
||||
const uint8_t *image() const {return Imagez;}
|
||||
void displayLoadInfo();
|
||||
};
|
||||
|
||||
|
||||
19
include/CallGraph.h
Normal file
@@ -0,0 +1,19 @@
|
||||
#pragma once
|
||||
#include "Procedure.h"
|
||||
/* CALL GRAPH NODE */
|
||||
struct CALL_GRAPH
|
||||
{
|
||||
ilFunction proc; /* Pointer to procedure in pProcList */
|
||||
std::vector<CALL_GRAPH *> outEdges; /* array of out edges */
|
||||
public:
|
||||
void write();
|
||||
CALL_GRAPH()
|
||||
{
|
||||
}
|
||||
public:
|
||||
void writeNodeCallGraph(int indIdx);
|
||||
bool insertCallGraph(ilFunction caller, ilFunction callee);
|
||||
bool insertCallGraph(Function *caller, ilFunction callee);
|
||||
void insertArc(ilFunction newProc);
|
||||
};
|
||||
//extern CALL_GRAPH * callGraph; /* Pointer to the head of the call graph */
|
||||
17
include/DccFrontend.h
Normal file
@@ -0,0 +1,17 @@
|
||||
#pragma once
|
||||
#include <QObject>
|
||||
class Project;
|
||||
class DccFrontend : public QObject
|
||||
{
|
||||
Q_OBJECT
|
||||
void LoadImage();
|
||||
void parse(Project &proj);
|
||||
std::string m_fname;
|
||||
public:
|
||||
explicit DccFrontend(QObject *parent = 0);
|
||||
bool FrontEnd(); /* frontend.c */
|
||||
|
||||
signals:
|
||||
|
||||
public slots:
|
||||
};
|
||||
@@ -9,6 +9,7 @@
|
||||
#include <utility>
|
||||
#include <algorithm>
|
||||
#include <bitset>
|
||||
#include <QtCore/QString>
|
||||
|
||||
#include "Enums.h"
|
||||
#include "types.h"
|
||||
@@ -26,7 +27,7 @@ extern bundle cCode; /* Output C procedure's declaration and code */
|
||||
|
||||
/**** Global variables ****/
|
||||
|
||||
extern char *asm1_name, *asm2_name; /* Assembler output filenames */
|
||||
extern QString asm1_name, asm2_name; /* Assembler output filenames */
|
||||
|
||||
typedef struct { /* Command line option flags */
|
||||
unsigned verbose : 1;
|
||||
@@ -37,7 +38,7 @@ typedef struct { /* Command line option flags */
|
||||
unsigned Stats : 1;
|
||||
unsigned Interact : 1; /* Interactive mode */
|
||||
unsigned Calls : 1; /* Follow register indirect calls */
|
||||
char filename[80]; /* The input filename */
|
||||
QString filename; /* The input filename */
|
||||
} OPTION;
|
||||
|
||||
extern OPTION option; /* Command line options */
|
||||
@@ -71,22 +72,11 @@ extern STATS stats; /* Icode statistics */
|
||||
|
||||
|
||||
/**** Global function prototypes ****/
|
||||
class DccFrontend
|
||||
{
|
||||
void LoadImage(Project &proj);
|
||||
void parse(Project &proj);
|
||||
std::string m_fname;
|
||||
public:
|
||||
DccFrontend(const std::string &fname) : m_fname(fname)
|
||||
{
|
||||
}
|
||||
bool FrontEnd(); /* frontend.c */
|
||||
};
|
||||
|
||||
void udm(void); /* udm.c */
|
||||
void freeCFG(BB * cfg); /* graph.c */
|
||||
BB * newBB(BB *, int, int, uint8_t, int, Function *); /* graph.c */
|
||||
void BackEnd(char *filename, CALL_GRAPH *); /* backend.c */
|
||||
void BackEnd(CALL_GRAPH *); /* backend.c */
|
||||
extern char *cChar(uint8_t c); /* backend.c */
|
||||
eErrorId scan(uint32_t ip, ICODE &p); /* scanner.c */
|
||||
void parse (CALL_GRAPH * *); /* parser.c */
|
||||
|
||||
25
include/dcc_interface.h
Normal file
@@ -0,0 +1,25 @@
|
||||
#pragma once
|
||||
#include "Procedure.h"
|
||||
|
||||
#include <QtCore/QObject>
|
||||
#include <QtCore/QDir>
|
||||
#include <llvm/ADT/ilist.h>
|
||||
|
||||
class IXmlTarget;
|
||||
|
||||
struct IDcc {
|
||||
static IDcc *get();
|
||||
virtual void BaseInit()=0;
|
||||
virtual void Init(QObject *tgt)=0;
|
||||
virtual lFunction::iterator GetFirstFuncHandle()=0;
|
||||
virtual lFunction::iterator GetCurFuncHandle()=0;
|
||||
virtual void analysis_Once()=0;
|
||||
virtual void load(QString name)=0; // load and preprocess -> find entry point
|
||||
virtual void prtout_asm(IXmlTarget *,int level=0)=0;
|
||||
virtual void prtout_cpp(IXmlTarget *,int level=0)=0;
|
||||
virtual size_t getFuncCount()=0;
|
||||
virtual const lFunction &validFunctions() const =0;
|
||||
virtual void SetCurFunc_by_Name(QString )=0;
|
||||
virtual QDir installDir()=0;
|
||||
virtual QDir dataDir(QString kind)=0;
|
||||
};
|
||||
@@ -1,38 +0,0 @@
|
||||
#pragma once
|
||||
/* Perfect hashing function library. Contains functions to generate perfect
|
||||
hashing functions
|
||||
* (C) Mike van Emmerik
|
||||
*/
|
||||
#include <stdint.h>
|
||||
|
||||
/* Prototypes */
|
||||
void hashCleanup(void); /* Frees memory allocated by hashParams() */
|
||||
void map(void); /* Part 1 of creating the tables */
|
||||
|
||||
/* The application must provide these functions: */
|
||||
void getKey(int i, uint8_t **pKeys);/* Set *keys to point to the i+1th key */
|
||||
void dispKey(int i); /* Display the key */
|
||||
class PatternHasher
|
||||
{
|
||||
uint16_t *T1base, *T2base; /* Pointers to start of T1, T2 */
|
||||
int NumEntry; /* Number of entries in the hash table (# keys) */
|
||||
int EntryLen; /* Size (bytes) of each entry (size of keys) */
|
||||
int SetSize; /* Size of the char set */
|
||||
char SetMin; /* First char in the set */
|
||||
int NumVert; /* c times NumEntry */
|
||||
int *graphNode; /* The array of edges */
|
||||
int *graphNext; /* Linked list of edges */
|
||||
int *graphFirst;/* First edge at a vertex */
|
||||
public:
|
||||
uint16_t *readT1(void); /* Returns a pointer to the T1 table */
|
||||
uint16_t *readT2(void); /* Returns a pointer to the T2 table */
|
||||
uint16_t *readG(void); /* Returns a pointer to the g table */
|
||||
void init(int _NumEntry, int _EntryLen, int _SetSize, char _SetMin,int _NumVert); /* Set the parameters for the hash table */
|
||||
void cleanup();
|
||||
int hash(unsigned char *string); //!< Hash the string to an int 0 .. NUMENTRY-1
|
||||
};
|
||||
extern PatternHasher g_pattern_hasher;
|
||||
/* Macro reads a LH uint16_t from the image regardless of host convention */
|
||||
#ifndef LH
|
||||
#define LH(p) ((int)((uint8_t *)(p))[0] + ((int)((uint8_t *)(p))[1] << 8))
|
||||
#endif
|
||||
@@ -8,22 +8,25 @@
|
||||
#include <boost/icl/interval_map.hpp>
|
||||
#include <boost/icl/split_interval_map.hpp>
|
||||
#include <unordered_set>
|
||||
#include <QtCore/QString>
|
||||
#include "symtab.h"
|
||||
#include "BinaryImage.h"
|
||||
#include "Procedure.h"
|
||||
class QString;
|
||||
class SourceMachine;
|
||||
struct CALL_GRAPH;
|
||||
class IProject
|
||||
{
|
||||
virtual PROG *binary()=0;
|
||||
virtual const std::string & project_name() const =0;
|
||||
virtual const std::string & binary_path() const =0;
|
||||
virtual const QString & project_name() const =0;
|
||||
virtual const QString & binary_path() const =0;
|
||||
};
|
||||
class Project : public IProject
|
||||
{
|
||||
static Project *s_instance;
|
||||
std::string m_fname;
|
||||
std::string m_project_name;
|
||||
QString m_fname;
|
||||
QString m_project_name;
|
||||
QString m_output_path;
|
||||
public:
|
||||
|
||||
typedef llvm::iplist<Function> FunctionListType;
|
||||
@@ -41,9 +44,12 @@ typedef FunctionListType lFunction;
|
||||
Project(); // default constructor,
|
||||
|
||||
public:
|
||||
void create(const std::string & a);
|
||||
const std::string &project_name() const {return m_project_name;}
|
||||
const std::string &binary_path() const {return m_fname;}
|
||||
void create(const QString &a);
|
||||
bool load();
|
||||
const QString &output_path() const {return m_output_path;}
|
||||
const QString &project_name() const {return m_project_name;}
|
||||
const QString &binary_path() const {return m_fname;}
|
||||
QString output_name(const char *ext);
|
||||
ilFunction funcIter(Function *to_find);
|
||||
ilFunction findByEntry(uint32_t entry);
|
||||
ilFunction createFunction(FunctionType *f,const std::string &name);
|
||||
@@ -60,6 +66,7 @@ public:
|
||||
PROG * binary() {return &prog;}
|
||||
SourceMachine *machine();
|
||||
|
||||
const FunctionListType &functions() const { return pProcList; }
|
||||
protected:
|
||||
void initialize();
|
||||
void writeGlobSymTable();
|
||||
|
||||
@@ -36,21 +36,13 @@ struct SYM : public SymbolCommon
|
||||
struct STKSYM : public SymbolCommon
|
||||
{
|
||||
typedef int16_t tLabel;
|
||||
Expr *actual; /* Expression tree of actual parameter */
|
||||
AstIdent *regs; /* For register arguments only */
|
||||
tLabel label; /* Immediate off from BP (+:args, -:params) */
|
||||
uint8_t regOff; /* Offset is a register (e.g. SI, DI) */
|
||||
bool hasMacro; /* This type needs a macro */
|
||||
Expr *actual=0; /* Expression tree of actual parameter */
|
||||
AstIdent *regs=0; /* For register arguments only */
|
||||
tLabel label=0; /* Immediate off from BP (+:args, -:params) */
|
||||
uint8_t regOff=0; /* Offset is a register (e.g. SI, DI) */
|
||||
bool hasMacro=false; /* This type needs a macro */
|
||||
std::string macro; /* Macro name */
|
||||
bool invalid; /* Boolean: invalid entry in formal arg list*/
|
||||
STKSYM()
|
||||
{
|
||||
actual=0;
|
||||
regs=0;
|
||||
label=0;
|
||||
regOff=0;
|
||||
invalid=hasMacro = false;
|
||||
}
|
||||
bool invalid=false; /* Boolean: invalid entry in formal arg list*/
|
||||
void setArgName(int i)
|
||||
{
|
||||
char buf[32];
|
||||
|
||||
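The STKSYM hunk above replaces a hand-written constructor with default member initializers. A minimal sketch of the same pattern, using a hypothetical trimmed-down `StackSymbol` rather than the real STKSYM (which also carries Expr and AstIdent pointers and derives from SymbolCommon):

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, trimmed-down version of STKSYM. Each member carries its
// own default, so no user-written constructor is needed and a newly added
// member cannot be forgotten in the constructor body.
struct StackSymbol {
    int16_t     label    = 0;      // offset from BP
    uint8_t     regOff   = 0;      // offset register, if any
    bool        hasMacro = false;
    std::string macro;             // default-constructed to ""
    bool        invalid  = false;
};
```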
BIN
prototypes/dcclibs.dat
Normal file
Binary file not shown.
@@ -14,9 +14,9 @@ def perform_test(exepath,filepath,outname,args)
|
||||
filepath=path_local(filepath)
|
||||
joined_args = args.join(' ')
|
||||
printf("calling:" + "#{exepath} -a1 #{joined_args} -o#{output_path}.a1 #{filepath}\n")
|
||||
STDERR << "Errors for : #{filepath}"
|
||||
result = `#{exepath} -a1 -o#{output_path}.a1 #{filepath}`
|
||||
result = `#{exepath} -a2 #{joined_args} -o#{output_path}.a2 #{filepath}`
|
||||
STDERR << "Errors for : #{filepath}\n"
|
||||
result = `#{exepath} -a 1 -o#{output_path}.a1 #{filepath}`
|
||||
result = `#{exepath} -a 2 #{joined_args} -o#{output_path}.a2 #{filepath}`
|
||||
result = `#{exepath} #{joined_args} -o#{output_path} #{filepath}`
|
||||
puts result
|
||||
p $?
|
||||
|
||||
BIN
sigs/dccb2s.sig
Normal file
Binary file not shown.
@@ -28,11 +28,11 @@ BB *BB::Create(const rCODE &r,eBBKind _nodeType, Function *parent)
|
||||
pnewBB->loopHead = pnewBB->caseHead = pnewBB->caseTail =
|
||||
pnewBB->latchNode= pnewBB->loopFollow = NO_NODE;
|
||||
pnewBB->instructions = r;
|
||||
int addr = pnewBB->begin()->loc_ip;
|
||||
/* Mark the basic block to which the icodes belong, but only for
* real code basic blocks (i.e. not interval bbs) */
|
||||
if(parent)
|
||||
{
|
||||
int addr = pnewBB->begin()->loc_ip;
|
||||
//setInBB should automatically handle if our range is empty
|
||||
parent->Icode.SetInBB(pnewBB->instructions, pnewBB);
|
||||
|
||||
@@ -40,10 +40,10 @@ BB *BB::Create(const rCODE &r,eBBKind _nodeType, Function *parent)
|
||||
parent->m_ip_to_bb[addr] = pnewBB;
|
||||
parent->m_actual_cfg.push_back(pnewBB);
|
||||
pnewBB->Parent = parent;
|
||||
}
|
||||
|
||||
if ( r.begin() != parent->Icode.end() ) /* Only for code BB's */
|
||||
stats.numBBbef++;
|
||||
}
|
||||
return pnewBB;
|
||||
|
||||
}
|
||||
@@ -90,7 +90,7 @@ void BB::displayDfs()
|
||||
dfsFirstNum, dfsLastNum,
|
||||
immedDom == MAX ? -1 : immedDom);
|
||||
printf("loopType = %s, loopHead = %d, latchNode = %d, follow = %d\n",
|
||||
s_loopType[loopType],
|
||||
s_loopType[(int)loopType],
|
||||
loopHead == MAX ? -1 : loopHead,
|
||||
latchNode == MAX ? -1 : latchNode,
|
||||
loopFollow == MAX ? -1 : loopFollow);
|
||||
@@ -136,12 +136,14 @@ void BB::displayDfs()
|
||||
*/
|
||||
ICODE* BB::writeLoopHeader(int &indLevel, Function* pProc, int *numLoc, BB *&latch, bool &repCond)
|
||||
{
|
||||
if(loopType == eNodeHeaderType::NO_TYPE)
|
||||
return nullptr;
|
||||
latch = pProc->m_dfsLast[this->latchNode];
|
||||
std::ostringstream ostr;
|
||||
ICODE* picode;
|
||||
switch (loopType)
|
||||
{
|
||||
case WHILE_TYPE:
|
||||
case eNodeHeaderType::WHILE_TYPE:
|
||||
picode = &this->back();
|
||||
|
||||
/* Check for error in while condition */
|
||||
@@ -169,15 +171,16 @@ ICODE* BB::writeLoopHeader(int &indLevel, Function* pProc, int *numLoc, BB *&lat
|
||||
picode->invalidate();
|
||||
break;
|
||||
|
||||
case REPEAT_TYPE:
|
||||
case eNodeHeaderType::REPEAT_TYPE:
|
||||
ostr << "\n"<<indentStr(indLevel)<<"do {\n";
|
||||
picode = &latch->back();
|
||||
picode->invalidate();
|
||||
break;
|
||||
|
||||
case ENDLESS_TYPE:
|
||||
case eNodeHeaderType::ENDLESS_TYPE:
|
||||
ostr << "\n"<<indentStr(indLevel)<<"for (;;) {\n";
|
||||
picode = &latch->back();
|
||||
break;
|
||||
}
|
||||
cCode.appendCode(ostr.str());
|
||||
stats.numHLIcode += 1;
|
||||
@@ -209,10 +212,7 @@ void BB::writeCode (int indLevel, Function * pProc , int *numLoc,int _latchNode,
|
||||
/* Check for start of loop */
|
||||
repCond = false;
|
||||
latch = nullptr;
|
||||
if (loopType)
|
||||
{
|
||||
picode=writeLoopHeader(indLevel, pProc, numLoc, latch, repCond);
|
||||
}
|
||||
|
||||
/* Write the code for this basic block */
|
||||
if (repCond == false)
|
||||
@@ -227,12 +227,12 @@ void BB::writeCode (int indLevel, Function * pProc , int *numLoc,int _latchNode,
|
||||
return;
|
||||
|
||||
/* Check type of loop/node and process code */
|
||||
if ( loopType ) /* there is a loop */
|
||||
if ( loopType!=eNodeHeaderType::NO_TYPE ) /* there is a loop */
|
||||
{
|
||||
assert(latch);
|
||||
if (this != latch) /* loop is over several bbs */
|
||||
{
|
||||
if (loopType == WHILE_TYPE)
|
||||
if (loopType == eNodeHeaderType::WHILE_TYPE)
|
||||
{
|
||||
succ = edges[THEN].BBptr;
|
||||
if (succ->dfsLastNum == loopFollow)
|
||||
@@ -248,7 +248,7 @@ void BB::writeCode (int indLevel, Function * pProc , int *numLoc,int _latchNode,
|
||||
|
||||
/* Loop epilogue: generate the loop trailer */
|
||||
indLevel--;
|
||||
if (loopType == WHILE_TYPE)
|
||||
if (loopType == eNodeHeaderType::WHILE_TYPE)
|
||||
{
|
||||
std::ostringstream ostr;
|
||||
/* Check if there is need to repeat other statements involved
|
||||
@@ -260,9 +260,9 @@ void BB::writeCode (int indLevel, Function * pProc , int *numLoc,int _latchNode,
|
||||
ostr <<indentStr(indLevel)<< "} /* end of while */\n";
|
||||
cCode.appendCode(ostr.str());
|
||||
}
|
||||
else if (loopType == ENDLESS_TYPE)
|
||||
else if (loopType == eNodeHeaderType::ENDLESS_TYPE)
|
||||
cCode.appendCode( "%s} /* end of loop */\n",indentStr(indLevel));
|
||||
else if (loopType == REPEAT_TYPE)
|
||||
else if (loopType == eNodeHeaderType::REPEAT_TYPE)
|
||||
{
|
||||
string e = "//*failed*//";
|
||||
if (picode->hl()->opcode != HLI_JCOND)
|
||||
|
||||
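The BasicBlock hunks above add an explicit `(int)` cast when indexing s_loopType and replace `if (loopType)` with a comparison against eNodeHeaderType::NO_TYPE. That is what a switch from a plain enum to a scoped enum (enum class) requires, since scoped enums do not convert implicitly to int or bool. A minimal sketch with hypothetical names (`LoopKind`, `kLoopNames`, `loopName`, `isLoop`), assuming the enumerators are numbered from 0 in table order:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical scoped enum standing in for eNodeHeaderType.
enum class LoopKind { None, While, Repeat, Endless };

static const char *const kLoopNames[] = {"none", "while", "repeat", "endless"};

// A scoped enum must be cast explicitly before it can index an array.
inline const char *loopName(LoopKind k) {
    return kLoopNames[static_cast<int>(k)];
}

// `if (k)` no longer compiles; the test has to name the "no loop" value.
inline bool isLoop(LoopKind k) {
    return k != LoopKind::None;
}
```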
413
src/DccFrontend.cpp
Normal file
@@ -0,0 +1,413 @@
|
||||
#include "dcc.h"
|
||||
#include "DccFrontend.h"
|
||||
#include "project.h"
|
||||
#include "disassem.h"
|
||||
#include "CallGraph.h"
|
||||
|
||||
#include <QtCore/QFileInfo>
|
||||
#include <QtCore/QDebug>
|
||||
|
||||
#include <cstdio>
|
||||
|
||||
|
||||
class Loader
|
||||
{
|
||||
bool loadIntoProject(IProject *);
|
||||
};
|
||||
|
||||
struct PSP {                /* PSP structure */
    uint16_t int20h;        /* interrupt 20h */
    uint16_t eof;           /* segment, end of allocation block */
    uint8_t  res1;          /* reserved */
    uint8_t  dosDisp[5];    /* far call to DOS function dispatcher */
    uint8_t  int22h[4];     /* vector for terminate routine */
    uint8_t  int23h[4];     /* vector for ctrl+break routine */
    uint8_t  int24h[4];     /* vector for error routine */
    uint8_t  res2[22];      /* reserved */
    uint16_t segEnv;        /* segment address of environment block */
    uint8_t  res3[34];      /* reserved */
    uint8_t  int21h[6];     /* opcode for int21h and far return */
    uint8_t  res4[6];       /* reserved */
    uint8_t  fcb1[16];      /* default file control block 1 */
    uint8_t  fcb2[16];      /* default file control block 2 */
    uint8_t  res5[4];       /* reserved */
    uint8_t  cmdTail[0x80]; /* command tail and disk transfer area */
};
|
||||
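The loader uses sizeof(PSP) as the offset at which the program image starts, so the struct has to come out at exactly 256 bytes. The sketch below copies the field list into a hypothetical `PspLayout` and checks the size and a few well-known offsets at compile time. It assumes the compiler inserts no padding, which holds on common ABIs because every uint16_t field already sits at an even offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical copy of the PSP field list, used only to check the layout.
struct PspLayout {
    uint16_t int20h;
    uint16_t eof;
    uint8_t  res1;
    uint8_t  dosDisp[5];
    uint8_t  int22h[4];
    uint8_t  int23h[4];
    uint8_t  int24h[4];
    uint8_t  res2[22];
    uint16_t segEnv;
    uint8_t  res3[34];
    uint8_t  int21h[6];
    uint8_t  res4[6];
    uint8_t  fcb1[16];
    uint8_t  fcb2[16];
    uint8_t  res5[4];
    uint8_t  cmdTail[0x80];
};

// Compilation fails if padding ever changes the layout.
static_assert(sizeof(PspLayout) == 0x100, "a DOS PSP is 256 bytes");
static_assert(offsetof(PspLayout, segEnv) == 0x2C, "environment segment at 2Ch");
static_assert(offsetof(PspLayout, fcb1) == 0x5C, "first FCB at 5Ch");
static_assert(offsetof(PspLayout, cmdTail) == 0x80, "command tail at 80h");
```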
|
||||
static struct MZHeader {     /* EXE file header */
    uint8_t  sigLo;          /* .EXE signature: 0x4D 0x5A */
    uint8_t  sigHi;
    uint16_t lastPageSize;   /* Size of the last page */
    uint16_t numPages;       /* Number of pages in the file */
    uint16_t numReloc;       /* Number of relocation items */
    uint16_t numParaHeader;  /* # of paragraphs in the header */
    uint16_t minAlloc;       /* Minimum number of paragraphs */
    uint16_t maxAlloc;       /* Maximum number of paragraphs */
    uint16_t initSS;         /* Segment displacement of stack */
    uint16_t initSP;         /* Contents of SP at entry */
    uint16_t checkSum;       /* Complemented checksum */
    uint16_t initIP;         /* Contents of IP at entry */
    uint16_t initCS;         /* Segment displacement of code */
    uint16_t relocTabOffset; /* Relocation table offset */
    uint16_t overlayNum;     /* Overlay number */
} header;
|
||||
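Three of the header fields above determine how many bytes of the file are actual program image: pages are 512 bytes, header paragraphs are 16 bytes, and a non-zero lastPageSize means the final page is only partly used. The following is a sketch of that arithmetic, assuming a well-formed header; `loadModuleSize` is a hypothetical helper and is not taken from this diff.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of the load module of an MZ executable, computed from
// already byte-swapped header fields. Assumes numPages >= 1 and that the
// header is smaller than the file.
uint32_t loadModuleSize(uint16_t numPages, uint16_t lastPageSize,
                        uint16_t numParaHeader) {
    uint32_t fileSize = static_cast<uint32_t>(numPages) * 512;
    if (lastPageSize != 0)
        fileSize -= 512u - lastPageSize;   // last page is only partly used
    return fileSize - static_cast<uint32_t>(numParaHeader) * 16;
}
```

For example, a two-page file whose last page holds 0x100 bytes and whose header is two paragraphs long has a 736-byte load module.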
|
||||
#define EXE_RELOCATION 0x10 /* EXE images relocated to above PSP */
|
||||
|
||||
//static void LoadImage(char *filename);
|
||||
static void displayMemMap(void);
|
||||
/****************************************************************************
|
||||
* displayLoadInfo - Displays low level loader type info.
|
||||
***************************************************************************/
|
||||
void PROG::displayLoadInfo(void)
|
||||
{
|
||||
int i;
|
||||
|
||||
printf("File type is %s\n", (fCOM)?"COM":"EXE");
|
||||
if (! fCOM) {
|
||||
printf("Signature = %02X%02X\n", header.sigLo, header.sigHi);
|
||||
printf("File size %% 512 = %04X\n", LH(&header.lastPageSize));
|
||||
printf("File size / 512 = %04X pages\n", LH(&header.numPages));
|
||||
printf("# relocation items = %04X\n", LH(&header.numReloc));
|
||||
printf("Offset to load image = %04X paras\n", LH(&header.numParaHeader));
|
||||
printf("Minimum allocation = %04X paras\n", LH(&header.minAlloc));
|
||||
printf("Maximum allocation = %04X paras\n", LH(&header.maxAlloc));
|
||||
}
|
||||
printf("Load image size = %04zu\n", cbImage - sizeof(PSP));
|
||||
printf("Initial SS:SP = %04X:%04X\n", initSS, initSP);
|
||||
printf("Initial CS:IP = %04X:%04X\n", initCS, initIP);
|
||||
|
||||
if (option.VeryVerbose && cReloc)
|
||||
{
|
||||
printf("\nRelocation Table\n");
|
||||
for (i = 0; i < cReloc; i++)
|
||||
{
|
||||
printf("%06X -> [%04X]\n", relocTable[i],LH(image() + relocTable[i]));
|
||||
}
|
||||
}
|
||||
printf("\n");
|
||||
}
|
||||
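displayLoadInfo() reads every multi-byte header field through the LH macro so that the result does not depend on the host byte order. A minimal function equivalent is sketched below; `readLE16` is a hypothetical name and is not part of the diff.

```cpp
#include <cassert>
#include <cstdint>

// Read a little-endian 16-bit value from p, independent of host byte order:
// the low byte comes first in memory, the high byte second.
inline uint16_t readLE16(const uint8_t *p) {
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}
```

Applied to the first two bytes of an EXE file (0x4D, 0x5A), this yields 0x5A4D.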
|
||||
/*****************************************************************************
* fill - Fills line for displayMemMap()
****************************************************************************/
static void fill(int ip, char *bf)
{
    PROG &prog(Project::get()->prog);
    static uint8_t type[4] = {'.', 'd', 'c', 'x'};
    uint8_t i;

    for (i = 0; i < 16; i++, ip++)
    {
        *bf++ = ' ';
        *bf++ = (ip < prog.cbImage)? type[(prog.map[ip >> 2] >> ((ip & 3) * 2)) & 3]: ' ';
    }
    *bf = '\0';
}
|
||||
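fill() decodes a bitmap that stores two bits per image byte, four image bytes per map byte: `ip >> 2` picks the map byte and `(ip & 3) * 2` picks the bit pair inside it. The sketch below models that packing. `ByteMap`, `ByteKind` and the enumerator names are hypothetical; only the four display characters and the index arithmetic come from the code above, and the meaning attached to each value is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names for the four 2-bit states shown as '.', 'd', 'c', 'x'.
enum ByteKind : uint8_t { Unknown = 0, Data = 1, Code = 2, Both = 3 };

class ByteMap {
    std::vector<uint8_t> bits_;   // 2 bits per image byte, 4 bytes per entry
public:
    explicit ByteMap(std::size_t imageSize) : bits_((imageSize + 3) / 4, 0) {}

    ByteKind get(std::size_t ip) const {
        return static_cast<ByteKind>((bits_[ip >> 2] >> ((ip & 3) * 2)) & 3);
    }

    // OR the new kind into the existing pair, so Code plus Data reads as Both.
    void mark(std::size_t ip, ByteKind k) {
        bits_[ip >> 2] |= static_cast<uint8_t>(k << ((ip & 3) * 2));
    }

    // Same character table as fill().
    char symbol(std::size_t ip) const {
        static const char type[4] = {'.', 'd', 'c', 'x'};
        return type[get(ip)];
    }
};
```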
|
||||
|
||||
/*****************************************************************************
|
||||
* displayMemMap - Displays the memory bitmap
|
||||
****************************************************************************/
|
||||
static void displayMemMap(void)
|
||||
{
|
||||
PROG &prog(Project::get()->prog);
|
||||
|
||||
char c, b1[33], b2[33], b3[33];
|
||||
uint8_t i;
|
||||
int ip = 0;
|
||||
|
||||
printf("\nMemory Map\n");
|
||||
while (ip < prog.cbImage)
|
||||
{
|
||||
fill(ip, b1);
|
||||
printf("%06X %s\n", ip, b1);
|
||||
ip += 16;
|
||||
for (i = 3, c = b1[1]; i < 32 && c == b1[i]; i += 2)
|
||||
; /* Check if all same */
|
||||
if (i > 32)
|
||||
{
|
||||
fill(ip, b2); /* Skip until next two are not same */
|
||||
fill(ip+16, b3);
|
||||
if (! (strcmp(b1, b2) || strcmp(b1, b3)))
|
||||
{
|
||||
printf(" :\n");
|
||||
do
|
||||
{
|
||||
ip += 16;
|
||||
fill(ip+16, b1);
|
||||
} while (! strcmp(b1, b2));
|
||||
}
|
||||
}
|
||||
}
|
||||
printf("\n");
|
||||
}
|
||||
DccFrontend::DccFrontend(QObject *parent) :
|
||||
QObject(parent)
|
||||
{
|
||||
}
|
||||
|
||||
/*****************************************************************************
|
||||
* FrontEnd - invokes the loader, parser, disassembler (if asm1), icode
* rewriter, and displays any useful information.
|
||||
****************************************************************************/
|
||||
bool DccFrontend::FrontEnd ()
|
||||
{
|
||||
|
||||
/* Do depth first flow analysis building call graph and procedure list,
|
||||
* and attaching the I-code to each procedure */
|
||||
parse (*Project::get());
|
||||
|
||||
if (option.asm1)
|
||||
{
|
||||
qWarning() << "dcc: writing assembler file "<<asm1_name<<'\n';
|
||||
}
|
||||
|
||||
/* Search through code looking for impure references and flag them */
|
||||
Disassembler ds(1);
|
||||
for(Function &f : Project::get()->pProcList)
|
||||
{
|
||||
f.markImpure();
|
||||
if (option.asm1)
|
||||
{
|
||||
ds.disassem(&f);
|
||||
}
|
||||
}
|
||||
if (option.Interact)
|
||||
{
|
||||
interactDis(&Project::get()->pProcList.front(), 0); /* Interactive disassembler */
|
||||
}
|
||||
|
||||
/* Converts jump target addresses to icode offsets */
|
||||
for(Function &f : Project::get()->pProcList)
|
||||
{
|
||||
f.bindIcodeOff();
|
||||
}
|
||||
/* Print memory bitmap */
|
||||
if (option.Map)
|
||||
displayMemMap();
|
||||
return(true); // we no longer own proj !
|
||||
}
|
||||
struct DosLoader {
|
||||
protected:
|
||||
void prepareImage(PROG &prog,size_t sz,QFile &fp) {
|
||||
/* Allocate a block of memory for the program. */
|
||||
prog.cbImage = sz + sizeof(PSP);
|
||||
prog.Imagez = new uint8_t [prog.cbImage];
|
||||
prog.Imagez[0] = 0xCD; /* Fill in PSP int 20h location */
|
||||
prog.Imagez[1] = 0x20; /* for termination checking */
|
||||
/* Read in the image past where a PSP would go */
|
||||
if (sz != fp.read((char *)prog.Imagez + sizeof(PSP),sz))
|
||||
fatalError(CANNOT_READ, fp.fileName().toLocal8Bit().data());
|
||||
}
|
||||
};
|
||||
struct ComLoader : public DosLoader {
|
||||
bool canLoad(QFile &fp) {
|
||||
fp.seek(0);
|
||||
char sig[2];
|
||||
if(2==fp.read(sig,2)) {
|
||||
return not (sig[0] == 0x4D && sig[1] == 0x5A);
|
||||
}
|
||||
return false;
|
||||
}
|
||||
bool load(PROG &prog,QFile &fp) {
|
||||
fp.seek(0);
|
||||
/* COM file
|
||||
* In this case the load module size is just the file length
|
||||
*/
|
||||
auto cb = fp.size();
|
||||
|
||||
/* COM programs start off with an ORG 100H (to leave room for a PSP)
|
||||
* This is also the implied start address so if we load the image
|
||||
* at offset 100H addresses should all line up properly again.
|
||||
*/
|
||||
prog.initCS = 0;
|
||||
prog.initIP = 0x100;
|
||||
prog.initSS = 0;
|
||||
prog.initSP = 0xFFFE;
|
||||
prog.cReloc = 0;
|
||||
|
||||
prepareImage(prog,cb,fp);
|
||||
|
||||
|
||||
/* Set up memory map */
|
||||
cb = (prog.cbImage + 3) / 4;
|
||||
prog.map = (uint8_t *)malloc(cb);
|
||||
memset(prog.map, BM_UNKNOWN, (size_t)cb);
|
||||
return true;
|
||||
}
|
||||
};
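The COM path above places the file contents behind a PSP-sized prefix so that code assembled with ORG 100H lines up with its image offsets. A minimal sketch of that layout follows, under stated assumptions: `kPspSize` is assumed to equal `sizeof(PSP)` (0x100, the size of a DOS PSP), `buildComImage` and `mapBytesFor` are hypothetical helpers, and the sketch zero-fills the prefix whereas the loader above leaves it uninitialised apart from the first two bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: sizeof(PSP) in dcc matches the 256-byte DOS PSP.
constexpr std::size_t kPspSize = 0x100;

// Builds PSP prefix + file contents, mirroring prepareImage() for a COM file.
inline std::vector<std::uint8_t> buildComImage(const std::vector<std::uint8_t> &file) {
    // A COM image plus its PSP has to fit in one 64 KiB segment.
    assert(file.size() <= 0x10000 - kPspSize);
    std::vector<std::uint8_t> image(kPspSize + file.size(), 0);
    image[0] = 0xCD; // int 20h at PSP:0000, used for termination checking
    image[1] = 0x20;
    std::copy(file.begin(), file.end(),
              image.begin() + static_cast<std::ptrdiff_t>(kPspSize));
    return image;
}

// Size of the 2-bits-per-byte memory map for an image of the given size.
inline std::size_t mapBytesFor(std::size_t imageSize) {
    return (imageSize + 3) / 4;
}
```

With this layout the first file byte sits at image offset 0x100, which matches the `initIP = 0x100` set by the loader.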
|
||||
struct ExeLoader : public DosLoader {
|
||||
bool canLoad(QFile &fp) {
|
||||
if(fp.size()<sizeof(header))
|
||||
return false;
|
||||
MZHeader tmp_header;
|
||||
fp.seek(0);
|
||||
fp.read((char *)&tmp_header, sizeof(header));
|
||||
if(not (tmp_header.sigLo == 0x4D && tmp_header.sigHi == 0x5A))
|
||||
return false;
|
||||
|
||||
/* This is a typical DOS kludge! */
|
||||
if (LH(&tmp_header.relocTabOffset) == 0x40) /* check the header that was just read */
|
||||
{
|
||||
qDebug() << "Don't understand new EXE format";
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
bool load(PROG &prog,QFile &fp) {
|
||||
/* Read the complete header */
|
||||
fp.seek(0);
|
||||
if (fp.read((char *)&header, sizeof(header)) != sizeof(header))
|
||||
return false;
|
||||
|
||||
/* Calculate the load module size.
|
||||
* This is the number of pages in the file
|
||||
* less the length of the header and reloc table
|
||||
* less the number of bytes unused on last page
|
||||
*/
|
||||
uint32_t cb = (uint32_t)LH(&header.numPages) * 512 - (uint32_t)LH(&header.numParaHeader) * 16;
|
||||
if (header.lastPageSize)
|
||||
{
|
||||
cb -= 512 - LH(&header.lastPageSize);
|
||||
}
|
||||
|
||||
/* We quietly ignore minAlloc and maxAlloc since for our
|
||||
* purposes it doesn't really matter where in real memory
|
||||
* the program would end up. EXE programs can't really rely on
|
||||
* their load location so setting the PSP segment to 0 is fine.
|
||||
* Certainly programs that prod around in DOS or BIOS are going
|
||||
* to have to load DS from a constant so it'll be pretty
|
||||
* obvious.
|
||||
*/
|
||||
prog.initCS = (int16_t)LH(&header.initCS) + EXE_RELOCATION;
|
||||
prog.initIP = (int16_t)LH(&header.initIP);
|
||||
prog.initSS = (int16_t)LH(&header.initSS) + EXE_RELOCATION;
|
||||
prog.initSP = (int16_t)LH(&header.initSP);
|
||||
prog.cReloc = (int16_t)LH(&header.numReloc);
|
||||
|
||||
/* Allocate the relocation table */
|
||||
if (prog.cReloc)
|
||||
{
|
||||
prog.relocTable.resize(prog.cReloc);
|
||||
fp.seek(LH(&header.relocTabOffset));
|
||||
|
||||
/* Read in seg:offset pairs and convert to Image ptrs */
|
||||
uint8_t buf[4];
|
||||
for (int i = 0; i < prog.cReloc; i++)
|
||||
{
|
||||
fp.read((char *)buf,4);
|
||||
prog.relocTable[i] = LH(buf) + (((int)LH(buf+2) + EXE_RELOCATION)<<4);
|
||||
}
|
||||
}
|
||||
/* Seek to start of image */
|
||||
uint32_t start_of_image= LH(&header.numParaHeader) * 16;
|
||||
fp.seek(start_of_image);
|
||||
/* Allocate a block of memory for the program. */
|
||||
prepareImage(prog,cb,fp);
|
||||
|
||||
/* Set up memory map */
|
||||
cb = (prog.cbImage + 3) / 4;
|
||||
prog.map = (uint8_t *)malloc(cb);
|
||||
memset(prog.map, BM_UNKNOWN, (size_t)cb);
|
||||
|
||||
/* Relocate segment constants */
|
||||
for(uint32_t v : prog.relocTable) {
|
||||
uint8_t *p = &prog.Imagez[v];
|
||||
uint16_t w = (uint16_t)LH(p) + EXE_RELOCATION;
|
||||
*p++ = (uint8_t)(w & 0x00FF);
|
||||
*p = (uint8_t)((w & 0xFF00) >> 8);
|
||||
}
|
||||
return true;
|
||||
}
|
||||
};
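The arithmetic in ExeLoader::load can be checked in isolation. The sketch below restates the load module size formula and the seg:offset relocation handling as free functions. All names are hypothetical: `readLE16` stands in for the LH() macro, and `loadSeg` stands in for EXE_RELOCATION, whose value is not shown in this diff.

```cpp
#include <cassert>
#include <cstdint>

// Little-endian 16-bit read, standing in for the LH() macro.
inline std::uint16_t readLE16(const std::uint8_t *p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Pages are 512 bytes, header paragraphs are 16 bytes. A non-zero
// lastPageSize means the final page is only partly used.
inline std::uint32_t loadModuleSize(std::uint16_t numPages,
                                    std::uint16_t numParaHeader,
                                    std::uint16_t lastPageSize) {
    assert(lastPageSize <= 512);
    std::uint32_t cb = std::uint32_t(numPages) * 512u - std::uint32_t(numParaHeader) * 16u;
    if (lastPageSize != 0)
        cb -= 512u - lastPageSize;
    return cb;
}

// Converts one 4-byte relocation entry (offset, then segment) into an
// offset into the loaded image.
inline std::uint32_t relocTarget(const std::uint8_t *entry, std::uint16_t loadSeg) {
    const std::uint32_t off = readLE16(entry);
    const std::uint32_t seg = readLE16(entry + 2);
    return off + ((seg + loadSeg) << 4);
}

// Adds the load segment to the 16-bit segment constant stored at image[at].
inline void relocateWord(std::uint8_t *image, std::uint32_t at, std::uint16_t loadSeg) {
    const std::uint16_t w = static_cast<std::uint16_t>(readLE16(image + at) + loadSeg);
    image[at] = static_cast<std::uint8_t>(w & 0x00FF);
    image[at + 1] = static_cast<std::uint8_t>(w >> 8);
}
```

For example, a file of 3 pages with a 2-paragraph header and 100 bytes used on the last page gives 3 * 512 - 2 * 16 - (512 - 100) = 1092 bytes.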
|
||||
/*****************************************************************************
|
||||
* Project::load - opens the binary and hands it to the first loader (EXE, then COM) that accepts it
|
||||
****************************************************************************/
|
||||
bool Project::load()
|
||||
{
|
||||
// addTask(loaderSelection,PreCond(BinaryImage))
|
||||
// addTask(applyLoader,PreCond(Loader))
|
||||
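const QByteArray fname_bytes = binary_path().toLocal8Bit(); /* keep the bytes alive; fname points into them */
const char *fname = fname_bytes.constData();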
|
||||
QFile finfo(binary_path());
|
||||
/* Open the input file */
|
||||
if(!finfo.open(QFile::ReadOnly)) {
|
||||
fatalError(CANNOT_OPEN, fname);
|
||||
}
|
||||
/* The file must be longer than the 2-byte signature that the loaders check */
|
||||
if (finfo.size()<=2)
|
||||
{
|
||||
fatalError(CANNOT_READ, fname);
|
||||
}
|
||||
ComLoader com_loader;
|
||||
ExeLoader exe_loader;
|
||||
if(exe_loader.canLoad(finfo)) {
|
||||
prog.fCOM = false;
|
||||
return exe_loader.load(prog,finfo);
|
||||
}
|
||||
if(com_loader.canLoad(finfo)) {
|
||||
prog.fCOM = true;
|
||||
return com_loader.load(prog,finfo);
|
||||
}
|
||||
return false;
|
||||
}
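The selection order in Project::load (EXE first, then COM) can be sketched on a plain byte buffer. This is an illustration under assumptions, not dcc's code: `ImageKind`, `classifyImage` and `kMinHeaderSize` are hypothetical, and reading the relocation table offset at byte 0x18 follows the usual MZ header layout rather than the MZHeader definition, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ImageKind { Exe, Com, Unsupported };

// Assumption: the header read by the EXE loader is at least 0x1C bytes.
constexpr std::size_t kMinHeaderSize = 0x1C;

inline ImageKind classifyImage(const std::vector<std::uint8_t> &file) {
    if (file.size() <= 2)
        return ImageKind::Unsupported;        // Project::load reports CANNOT_READ
    const bool hasMZ = file[0] == 0x4D && file[1] == 0x5A;
    if (!hasMZ)
        return ImageKind::Com;                // anything without "MZ" is treated as COM
    if (file.size() < kMinHeaderSize)
        return ImageKind::Unsupported;        // "MZ" but too short for a header
    assert(file.size() > 0x19);
    const unsigned relocTabOffset =
        file[0x18] | (static_cast<unsigned>(file[0x19]) << 8);
    // A relocation table at 0x40 marks a new-format executable, which the
    // EXE loader rejects.
    return relocTabOffset == 0x40 ? ImageKind::Unsupported : ImageKind::Exe;
}
```

Note that a file that starts with "MZ" is never treated as COM, so a new-format executable falls through both loaders and load() returns false.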
|
||||
uint32_t SynthLab;
|
||||
/* Parses the program, builds the call graph, and records the procedures
* found in the project (nothing is returned) */
|
||||
void DccFrontend::parse(Project &proj)
|
||||
{
|
||||
PROG &prog(proj.prog);
|
||||
STATE state;
|
||||
|
||||
/* Set initial state */
|
||||
state.setState(rES, 0); /* PSP segment */
|
||||
state.setState(rDS, 0);
|
||||
state.setState(rCS, prog.initCS);
|
||||
state.setState(rSS, prog.initSS);
|
||||
state.setState(rSP, prog.initSP);
|
||||
state.IP = ((uint32_t)prog.initCS << 4) + prog.initIP;
|
||||
SynthLab = SYNTHESIZED_MIN;
|
||||
|
||||
/* Check for special settings of initial state, based on idioms of the
|
||||
startup code */
|
||||
state.checkStartup();
|
||||
Function *start_proc;
|
||||
/* Make a struct for the initial procedure */
|
||||
if (prog.offMain != -1)
|
||||
{
|
||||
start_proc = proj.createFunction(0,"main");
|
||||
start_proc->retVal.loc = REG_FRAME;
|
||||
start_proc->retVal.type = TYPE_WORD_SIGN;
|
||||
start_proc->retVal.id.regi = rAX;
|
||||
/* We know where main() is. Start the flow of control from there */
|
||||
start_proc->procEntry = prog.offMain;
|
||||
/* In medium and large models, the segment of main may (will?) not be
|
||||
the same as the initial CS segment (of the startup code) */
|
||||
state.setState(rCS, prog.segMain);
|
||||
state.IP = prog.offMain;
|
||||
}
|
||||
else
|
||||
{
|
||||
start_proc = proj.createFunction(0,"start");
|
||||
/* Create initial procedure at program start address */
|
||||
start_proc->procEntry = (uint32_t)state.IP;
|
||||
}
|
||||
|
||||
/* The state info is for the first procedure */
|
||||
start_proc->state = state;
|
||||
|
||||
/* Set up call graph initial node */
|
||||
proj.callGraph = new CALL_GRAPH;
|
||||
proj.callGraph->proc = start_proc;
|
||||
|
||||
/* This proc needs to be called to set things up for LibCheck(), which
|
||||
checks a proc to see if it is a known C (etc.) library */
|
||||
SetupLibCheck();
|
||||
//BUG: proj and g_proj are 'live' at this point !
|
||||
|
||||
/* Recursively build entire procedure list */
|
||||
start_proc->FollowCtrl(proj.callGraph, &state);
|
||||
|
||||
/* This proc needs to be called to clean things up from SetupLibCheck() */
|
||||
CleanupLibCheck();
|
||||
}
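The initial `state.IP` above is a real-mode linear address: the segment shifted left by four bits plus the offset. A small sketch of that calculation follows. The function names are hypothetical, and the signed variant only matters if the segment is held in a signed 16-bit variable, which the `(int16_t)` casts in the loader suggest but which is not confirmed here.

```cpp
#include <cassert>
#include <cstdint>

// segment * 16 + offset, as in ((uint32_t)prog.initCS << 4) + prog.initIP.
inline std::uint32_t linearAddress(std::uint16_t seg, std::uint16_t off) {
    const std::uint32_t result = (static_cast<std::uint32_t>(seg) << 4) + off;
    assert(result <= 0x10FFEFu); // FFFF:FFFF is the highest reachable address
    return result;
}

// If the segment is stored as int16_t, convert it to uint16_t first.
// Casting a negative int16_t straight to uint32_t would sign-extend it and
// produce an address far outside the 1 MiB range.
inline std::uint32_t linearAddressSigned(std::int16_t seg, std::uint16_t off) {
    return linearAddress(static_cast<std::uint16_t>(seg), off);
}
```

For a COM program (CS = 0, IP = 0x100) this gives 0x100, the first byte after the PSP.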
|
||||
@@ -4,6 +4,8 @@
|
||||
* Purpose: Back-end module. Generates C code for each procedure.
|
||||
* (C) Cristina Cifuentes
|
||||
****************************************************************************/
|
||||
#include <QDir>
|
||||
#include <QFile>
|
||||
#include <cassert>
|
||||
#include <string>
|
||||
#include <boost/range.hpp>
|
||||
@@ -167,13 +169,13 @@ void Project::writeGlobSymTable()
|
||||
|
||||
/* Writes the header information and global variables to the output C file
|
||||
* fp. */
|
||||
static void writeHeader (std::ostream &_ios, char *fileName)
|
||||
static void writeHeader (std::ostream &_ios, const std::string &fileName)
|
||||
{
|
||||
PROG &prog(Project::get()->prog);
|
||||
/* Write header information */
|
||||
cCode.init();
|
||||
cCode.appendDecl( "/*\n");
|
||||
cCode.appendDecl( " * Input file\t: %s\n", fileName);
|
||||
cCode.appendDecl( " * Input file\t: %s\n", fileName.c_str());
|
||||
cCode.appendDecl( " * File type\t: %s\n", (prog.fCOM)?"COM":"EXE");
|
||||
cCode.appendDecl( " */\n\n#include \"dcc.h\"\n\n");
|
||||
|
||||
@@ -341,22 +343,21 @@ static void backBackEnd (CALL_GRAPH * pcallGraph, std::ostream &_ios)
|
||||
|
||||
|
||||
/* Invokes the necessary routines to produce code one procedure at a time. */
|
||||
void BackEnd (char *fileName, CALL_GRAPH * pcallGraph)
|
||||
void BackEnd(CALL_GRAPH * pcallGraph)
|
||||
{
|
||||
std::ofstream fs; /* Output C file */
|
||||
|
||||
/* Get output file name */
|
||||
std::string outNam(fileName);
|
||||
outNam = outNam.substr(0,outNam.rfind("."))+".b"; /* b for beta */
|
||||
QString outNam(Project::get()->output_name("b")); /* b for beta */
|
||||
|
||||
/* Open output file */
|
||||
fs.open(outNam);
|
||||
fs.open(outNam.toStdString());
|
||||
if(!fs.is_open())
|
||||
fatalError (CANNOT_OPEN, outNam.c_str());
|
||||
printf ("dcc: Writing C beta file %s\n", outNam.c_str());
|
||||
fatalError (CANNOT_OPEN, outNam.toStdString().c_str());
|
||||
std::cout<<"dcc: Writing C beta file "<<outNam.toStdString()<<"\n";
|
||||
|
||||
/* Header information */
|
||||
writeHeader (fs, option.filename);
|
||||
writeHeader (fs, option.filename.toStdString());
|
||||
|
||||
/* Initialize total Icode instructions statistics */
|
||||
stats.totalLL = 0;
|
||||
@@ -367,7 +368,7 @@ void BackEnd (char *fileName, CALL_GRAPH * pcallGraph)
|
||||
|
||||
/* Close output file */
|
||||
fs.close();
|
||||
printf ("dcc: Finished writing C beta file\n");
|
||||
std::cout << "dcc: Finished writing C beta file\n";
|
||||
}
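The removed lines in this hunk derived the output name with `outNam.substr(0,outNam.rfind("."))`, which cuts at the last dot even when that dot belongs to a directory name. A minimal sketch of a stricter variant follows. `withExtension` is a hypothetical helper and is not how `Project::output_name` is implemented, since that function is not shown in this diff.

```cpp
#include <cassert>
#include <string>

// Replaces the extension of the last path component, or appends one if the
// last component has no dot.
inline std::string withExtension(const std::string &path, const std::string &ext) {
    assert(!ext.empty());
    const std::string::size_type slash = path.find_last_of("/\\");
    const std::string::size_type dot = path.rfind('.');
    // Only a dot after the last path separator starts an extension.
    const bool hasExt = dot != std::string::npos &&
                        (slash == std::string::npos || dot > slash);
    return (hasExt ? path.substr(0, dot) : path) + "." + ext;
}
```

The old initargs() code applied the same idea with `pc > strrchr(asm1_name, '/')` before truncating at the dot.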
|
||||
|
||||
|
||||
|
||||
@@ -5,18 +5,17 @@
|
||||
* (C) Mike van Emmerik
|
||||
*/
|
||||
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#ifdef __BORLAND__
|
||||
#include <mem.h>
|
||||
#else
|
||||
#include <memory.h>
|
||||
#endif
|
||||
#include <string.h>
|
||||
#include "dcc.h"
|
||||
#include "project.h"
|
||||
#include "perfhlib.h"
|
||||
#include "dcc_interface.h"
|
||||
|
||||
#include <QDir>
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <memory.h>
|
||||
#include <string.h>
|
||||
PerfectHash g_pattern_hasher;
|
||||
#define NIL -1 /* Used like NULL, but 0 is valid */
|
||||
|
||||
/* Hash table structure */
|
||||
@@ -68,7 +67,6 @@ void readFileSection(uint16_t* p, int len, FILE *_file);
|
||||
void cleanup(void);
|
||||
void checkStartup(STATE *state);
|
||||
void readProtoFile(void);
|
||||
void fixNewline(char *s);
|
||||
int searchPList(char *name);
|
||||
void checkHeap(char *msg); /* For debugging */
|
||||
|
||||
@@ -301,10 +299,11 @@ void SetupLibCheck(void)
|
||||
PROG &prog(Project::get()->prog);
|
||||
uint16_t w, len;
|
||||
int i;
|
||||
|
||||
if ((g_file = fopen(sSigName, "rb")) == nullptr)
|
||||
IDcc *dcc = IDcc::get();
|
||||
QString fpath = dcc->dataDir("sigs").absoluteFilePath(sSigName);
|
||||
if ((g_file = fopen(qPrintable(fpath), "rb")) == nullptr)
|
||||
{
|
||||
printf("Warning: cannot open signature file %s\n", sSigName);
|
||||
printf("Warning: cannot open signature file %s\n", qPrintable(fpath));
|
||||
return;
|
||||
}
|
||||
|
||||
@@ -332,7 +331,7 @@ void SetupLibCheck(void)
|
||||
|
||||
/* Initialise the perfhlib stuff. Also allocates T1, T2, g, etc */
|
||||
/* Set the parameters for the hash table */
|
||||
g_pattern_hasher.init(
|
||||
g_pattern_hasher.setHashParams(
|
||||
numKeys, /* The number of symbols */
|
||||
PatLen, /* The length of the pattern to be hashed */
|
||||
256, /* The character set of the pattern (0-FF) */
|
||||
@@ -639,7 +638,6 @@ void STATE::checkStartup()
|
||||
char chModel = 'x';
|
||||
char chVendor = 'x';
|
||||
char chVersion = 'x';
|
||||
char *pPath;
|
||||
char temp[4];
|
||||
|
||||
startOff = ((uint32_t)prog.initCS << 4) + prog.initIP;
|
||||
@@ -830,21 +828,6 @@ void STATE::checkStartup()
|
||||
|
||||
gotVendor:
|
||||
|
||||
/* Use the DCC environment variable to set where the .sig files will
|
||||
be found. Otherwise, assume current directory */
|
||||
pPath = getenv("DCC");
|
||||
if (pPath)
|
||||
{
|
||||
strcpy(sSigName, pPath); /* Use path given */
|
||||
if (sSigName[strlen(sSigName)-1] != '/')
|
||||
{
|
||||
strcat(sSigName, "/"); /* Append a slash if necessary */
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
strcpy(sSigName, "./"); /* Current directory */
|
||||
}
|
||||
strcat(sSigName, "dcc");
|
||||
temp[1] = '\0';
|
||||
temp[0] = chVendor;
|
||||
@@ -867,45 +850,29 @@ gotVendor:
|
||||
*/
|
||||
void readProtoFile(void)
|
||||
{
|
||||
IDcc *dcc = IDcc::get();
|
||||
QString szProFName = dcc->dataDir("prototypes").absoluteFilePath(DCCLIBS); /* Full name of dcclibs.lst */
|
||||
|
||||
FILE *fProto;
|
||||
char *pPath; /* Point to the environment string */
|
||||
char szProFName[81]; /* Full name of dclibs.lst */
|
||||
int i;
|
||||
|
||||
/* Use the DCC environment variable to set where the dcclibs.lst file will
|
||||
be found. Otherwise, assume current directory */
|
||||
pPath = getenv("DCC");
|
||||
if (pPath)
|
||||
if ((fProto = fopen(qPrintable(szProFName), "rb")) == nullptr)
|
||||
{
|
||||
strcpy(szProFName, pPath); /* Use path given */
|
||||
if (szProFName[strlen(szProFName)-1] != '/')
|
||||
{
|
||||
strcat(szProFName, "/"); /* Append a slash if necessary */
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
strcpy(szProFName, "./"); /* Current directory */
|
||||
}
|
||||
strcat(szProFName, DCCLIBS);
|
||||
|
||||
if ((fProto = fopen(szProFName, "rb")) == nullptr)
|
||||
{
|
||||
printf("Warning: cannot open library prototype data file %s\n", szProFName);
|
||||
printf("Warning: cannot open library prototype data file %s\n", qPrintable(szProFName));
|
||||
return;
|
||||
}
|
||||
|
||||
grab(4, fProto);
|
||||
if (strncmp(buf, "dccp", 4) != 0)
|
||||
{
|
||||
printf("%s is not a dcc prototype file\n", szProFName);
|
||||
printf("%s is not a dcc prototype file\n", qPrintable(szProFName));
|
||||
exit(1);
|
||||
}
|
||||
|
||||
grab(2, fProto);
|
||||
if (strncmp(buf, "FN", 2) != 0)
|
||||
{
|
||||
printf("FN (Function Name) subsection expected in %s\n", szProFName);
|
||||
printf("FN (Function Name) subsection expected in %s\n", qPrintable(szProFName));
|
||||
exit(2);
|
||||
}
|
||||
|
||||
@@ -932,7 +899,7 @@ void readProtoFile(void)
|
||||
grab(2, fProto);
|
||||
if (strncmp(buf, "PM", 2) != 0)
|
||||
{
|
||||
printf("PM (Parameter) subsection expected in %s\n", szProFName);
|
||||
printf("PM (Parameter) subsection expected in %s\n", qPrintable(szProFName));
|
||||
exit(2);
|
||||
}
|
||||
|
||||
|
||||
@@ -168,7 +168,7 @@ static void findNodesInLoop(BB * latchNode,BB * head,Function * pProc,queue &int
|
||||
(inList (loopNodes, head->edges[THEN].BBptr->dfsLastNum) &&
|
||||
inList (loopNodes, head->edges[ELSE].BBptr->dfsLastNum)))
|
||||
{
|
||||
head->loopType = REPEAT_TYPE;
|
||||
head->loopType = eNodeHeaderType::REPEAT_TYPE;
|
||||
if (latchNode->edges[0].BBptr == head)
|
||||
head->loopFollow = latchNode->edges[ELSE].BBptr->dfsLastNum;
|
||||
else
|
||||
@@ -177,7 +177,7 @@ static void findNodesInLoop(BB * latchNode,BB * head,Function * pProc,queue &int
|
||||
}
|
||||
else
|
||||
{
|
||||
head->loopType = WHILE_TYPE;
|
||||
head->loopType = eNodeHeaderType::WHILE_TYPE;
|
||||
if (inList (loopNodes, head->edges[THEN].BBptr->dfsLastNum))
|
||||
head->loopFollow = head->edges[ELSE].BBptr->dfsLastNum;
|
||||
else
|
||||
@@ -186,7 +186,7 @@ static void findNodesInLoop(BB * latchNode,BB * head,Function * pProc,queue &int
|
||||
}
|
||||
else /* head = anything besides 2-way, latch = 2-way */
|
||||
{
|
||||
head->loopType = REPEAT_TYPE;
|
||||
head->loopType = eNodeHeaderType::REPEAT_TYPE;
|
||||
if (latchNode->edges[THEN].BBptr == head)
|
||||
head->loopFollow = latchNode->edges[ELSE].BBptr->dfsLastNum;
|
||||
else
|
||||
@@ -196,12 +196,12 @@ static void findNodesInLoop(BB * latchNode,BB * head,Function * pProc,queue &int
|
||||
else /* latch = 1-way */
|
||||
if (latchNode->nodeType == LOOP_NODE)
|
||||
{
|
||||
head->loopType = REPEAT_TYPE;
|
||||
head->loopType = eNodeHeaderType::REPEAT_TYPE;
|
||||
head->loopFollow = latchNode->edges[0].BBptr->dfsLastNum;
|
||||
}
|
||||
else if (intNodeType == TWO_BRANCH)
|
||||
{
|
||||
head->loopType = WHILE_TYPE;
|
||||
head->loopType = eNodeHeaderType::WHILE_TYPE;
|
||||
pbb = latchNode;
|
||||
thenDfs = head->edges[THEN].BBptr->dfsLastNum;
|
||||
elseDfs = head->edges[ELSE].BBptr->dfsLastNum;
|
||||
@@ -222,7 +222,7 @@ static void findNodesInLoop(BB * latchNode,BB * head,Function * pProc,queue &int
|
||||
* loop, so it is safer to consider it an endless loop */
|
||||
if (pbb->dfsLastNum <= head->dfsLastNum)
|
||||
{
|
||||
head->loopType = ENDLESS_TYPE;
|
||||
head->loopType = eNodeHeaderType::ENDLESS_TYPE;
|
||||
findEndlessFollow (pProc, loopNodes, head);
|
||||
break;
|
||||
}
|
||||
@@ -234,7 +234,7 @@ static void findNodesInLoop(BB * latchNode,BB * head,Function * pProc,queue &int
|
||||
}
|
||||
else
|
||||
{
|
||||
head->loopType = ENDLESS_TYPE;
|
||||
head->loopType = eNodeHeaderType::ENDLESS_TYPE;
|
||||
findEndlessFollow (pProc, loopNodes, head);
|
||||
}
|
||||
|
||||
|
||||
@@ -143,7 +143,6 @@ void Function::elimCondCodes ()
|
||||
//auto reversed_instructions = pBB->range() | reversed;
|
||||
for (useAt = pBB->rbegin(); useAt != pBB->rend(); useAt++)
|
||||
{
|
||||
ICODE &useIcode(*useAt);
|
||||
llIcode useAtOp = llIcode(useAt->ll()->getOpcode());
|
||||
use = useAt->ll()->flagDU.u;
|
||||
if ((useAt->type != LOW_LEVEL) || ( ! useAt->valid() ) || ( 0 == use ))
|
||||
@@ -159,7 +158,6 @@ void Function::elimCondCodes ()
|
||||
continue;
|
||||
notSup = false;
|
||||
LLOperand *dest_ll = defIcode.ll()->get(DST);
|
||||
LLOperand *src_ll = defIcode.ll()->get(SRC);
|
||||
if ((useAtOp >= iJB) && (useAtOp <= iJNS))
|
||||
{
|
||||
iICODE befDefAt = (++riICODE(defAt)).base();
|
||||
|
||||
275
src/dcc.cpp
@@ -5,23 +5,10 @@
|
||||
****************************************************************************/
|
||||
|
||||
#include <cstring>
|
||||
#include "dcc.h"
|
||||
#include "project.h"
|
||||
|
||||
#include "CallGraph.h"
|
||||
/* Global variables - extern to other modules */
|
||||
extern char *asm1_name, *asm2_name; /* Assembler output filenames */
|
||||
extern SYMTAB symtab; /* Global symbol table */
|
||||
extern STATS stats; /* cfg statistics */
|
||||
//PROG prog; /* programs fields */
|
||||
extern OPTION option; /* Command line options */
|
||||
//Function * pProcList; /* List of procedures, topologically sort */
|
||||
//Function * pLastProc; /* Pointer to last node in procedure list */
|
||||
//FunctionListType pProcList;
|
||||
//CALL_GRAPH *callGraph; /* Call graph of the program */
|
||||
|
||||
static char *initargs(int argc, char *argv[]);
|
||||
static void displayTotalStats(void);
|
||||
#include <iostream>
|
||||
#include <QtCore/QCoreApplication>
|
||||
#include <QCommandLineParser>
|
||||
#ifdef LLVM_EXPERIMENTAL
|
||||
#include <llvm/Support/raw_os_ostream.h>
|
||||
#include <llvm/Support/CommandLine.h>
|
||||
#include <llvm/Support/TargetSelect.h>
|
||||
@@ -33,14 +20,29 @@ static void displayTotalStats(void);
|
||||
#include <llvm/Target/TargetInstrInfo.h>
|
||||
#include <llvm/MC/MCAsmInfo.h>
|
||||
#include <llvm/CodeGen/MachineInstrBuilder.h>
|
||||
|
||||
#include <llvm/TableGen/Main.h>
|
||||
#include <llvm/TableGen/TableGenBackend.h>
|
||||
#include <llvm/TableGen/Record.h>
|
||||
#endif
|
||||
#include <QtCore/QFile>
|
||||
|
||||
#include "dcc.h"
|
||||
#include "project.h"
|
||||
#include "CallGraph.h"
|
||||
#include "DccFrontend.h"
|
||||
|
||||
/* Global variables - extern to other modules */
|
||||
extern QString asm1_name, asm2_name; /* Assembler output filenames */
|
||||
extern SYMTAB symtab; /* Global symbol table */
|
||||
extern STATS stats; /* cfg statistics */
|
||||
extern OPTION option; /* Command line options */
|
||||
|
||||
static char *initargs(int argc, char *argv[]);
|
||||
static void displayTotalStats(void);
|
||||
/****************************************************************************
|
||||
* main
|
||||
***************************************************************************/
|
||||
#include <iostream>
|
||||
#ifdef LLVM_EXPERIMENTAL
|
||||
using namespace llvm;
|
||||
bool TVisitor(raw_ostream &OS, RecordKeeper &Records)
|
||||
{
|
||||
@@ -65,63 +67,128 @@ bool TVisitor(raw_ostream &OS, RecordKeeper &Records)
|
||||
// rec = Records.getDef("CCR");
|
||||
// if(rec)
|
||||
// rec->dump();
|
||||
for(auto val : Records.getDefs())
|
||||
{
|
||||
//std::cout<< "Def "<<val.first<<"\n";
|
||||
}
|
||||
// for(auto val : Records.getDefs())
|
||||
// {
|
||||
// //std::cout<< "Def "<<val.first<<"\n";
|
||||
// }
|
||||
return false;
|
||||
}
|
||||
int testTblGen(int argc, char **argv)
|
||||
{
|
||||
using namespace llvm;
|
||||
sys::PrintStackTraceOnErrorSignal();
|
||||
PrettyStackTraceProgram(argc,argv);
|
||||
cl::ParseCommandLineOptions(argc,argv);
|
||||
return llvm::TableGenMain(argv[0],TVisitor);
|
||||
InitializeNativeTarget();
|
||||
Triple TheTriple;
|
||||
std::string def = sys::getDefaultTargetTriple();
|
||||
std::string MCPU="i386";
|
||||
std::string MARCH="x86";
|
||||
InitializeAllTargetInfos();
|
||||
InitializeAllTargetMCs();
|
||||
InitializeAllAsmPrinters();
|
||||
InitializeAllAsmParsers();
|
||||
InitializeAllDisassemblers();
|
||||
std::string TargetTriple("i386-pc-linux-gnu");
|
||||
TheTriple = Triple(Triple::normalize(TargetTriple));
|
||||
MCOperand op=llvm::MCOperand::CreateImm(11);
|
||||
MCAsmInfo info;
|
||||
raw_os_ostream wrap(std::cerr);
|
||||
op.print(wrap,&info);
|
||||
wrap.flush();
|
||||
std::cerr<<"\n";
|
||||
std::string lookuperr;
|
||||
TargetRegistry::printRegisteredTargetsForVersion();
|
||||
const Target *t = TargetRegistry::lookupTarget(MARCH,TheTriple,lookuperr);
|
||||
TargetOptions opts;
|
||||
std::string Features;
|
||||
opts.PrintMachineCode=1;
|
||||
TargetMachine *tm = t->createTargetMachine(TheTriple.getTriple(),MCPU,Features,opts);
|
||||
std::cerr<<tm->getInstrInfo()->getName(97)<<"\n";
|
||||
const MCInstrDesc &ds(tm->getInstrInfo()->get(97));
|
||||
const MCOperandInfo *op1=ds.OpInfo;
|
||||
uint16_t impl_def = ds.getImplicitDefs()[0];
|
||||
std::cerr<<lookuperr<<"\n";
|
||||
// using namespace llvm;
|
||||
// sys::PrintStackTraceOnErrorSignal();
|
||||
// PrettyStackTraceProgram(argc,argv);
|
||||
// cl::ParseCommandLineOptions(argc,argv);
|
||||
// return llvm::TableGenMain(argv[0],TVisitor);
|
||||
// InitializeNativeTarget();
|
||||
// Triple TheTriple;
|
||||
// std::string def = sys::getDefaultTargetTriple();
|
||||
// std::string MCPU="i386";
|
||||
// std::string MARCH="x86";
|
||||
// InitializeAllTargetInfos();
|
||||
// InitializeAllTargetMCs();
|
||||
// InitializeAllAsmPrinters();
|
||||
// InitializeAllAsmParsers();
|
||||
// InitializeAllDisassemblers();
|
||||
// std::string TargetTriple("i386-pc-linux-gnu");
|
||||
// TheTriple = Triple(Triple::normalize(TargetTriple));
|
||||
// MCOperand op=llvm::MCOperand::CreateImm(11);
|
||||
// MCAsmInfo info;
|
||||
// raw_os_ostream wrap(std::cerr);
|
||||
// op.print(wrap,&info);
|
||||
// wrap.flush();
|
||||
// std::cerr<<"\n";
|
||||
// std::string lookuperr;
|
||||
// TargetRegistry::printRegisteredTargetsForVersion();
|
||||
// const Target *t = TargetRegistry::lookupTarget(MARCH,TheTriple,lookuperr);
|
||||
// TargetOptions opts;
|
||||
// std::string Features;
|
||||
// opts.PrintMachineCode=1;
|
||||
// TargetMachine *tm = t->createTargetMachine(TheTriple.getTriple(),MCPU,Features,opts);
|
||||
// std::cerr<<tm->getInstrInfo()->getName(97)<<"\n";
|
||||
// const MCInstrDesc &ds(tm->getInstrInfo()->get(97));
|
||||
// const MCOperandInfo *op1=ds.OpInfo;
|
||||
// uint16_t impl_def = ds.getImplicitDefs()[0];
|
||||
// std::cerr<<lookuperr<<"\n";
|
||||
|
||||
exit(0);
|
||||
// exit(0);
|
||||
|
||||
}
|
||||
#endif
|
||||
void setupOptions(QCoreApplication &app) {
|
||||
//[-a1a2cmsi]
|
||||
QCommandLineParser parser;
|
||||
parser.setApplicationDescription("dcc");
|
||||
parser.addHelpOption();
|
||||
//parser.addVersionOption();
|
||||
//QCommandLineOption showProgressOption("p", QCoreApplication::translate("main", "Show progress during copy"));
|
||||
QCommandLineOption boolOpts[] {
|
||||
QCommandLineOption {"v", QCoreApplication::translate("main", "verbose")},
|
||||
QCommandLineOption {"V", QCoreApplication::translate("main", "very verbose")},
|
||||
QCommandLineOption {"c", QCoreApplication::translate("main", "Follow register indirect calls")},
|
||||
QCommandLineOption {"m", QCoreApplication::translate("main", "Print memory maps of program")},
|
||||
QCommandLineOption {"s", QCoreApplication::translate("main", "Print stats")}
|
||||
};
|
||||
for(QCommandLineOption &o : boolOpts) {
|
||||
parser.addOption(o);
|
||||
}
|
||||
QCommandLineOption assembly("a", QCoreApplication::translate("main", "Produce assembly"),"assembly_level");
|
||||
// A boolean option with multiple names (-f, --force)
|
||||
//QCommandLineOption forceOption(QStringList() << "f" << "force", "Overwrite existing files.");
|
||||
// An option with a value
|
||||
QCommandLineOption targetFileOption(QStringList() << "o" << "output",
|
||||
QCoreApplication::translate("main", "Place output into <file>."),
|
||||
QCoreApplication::translate("main", "file"));
|
||||
parser.addOption(targetFileOption);
|
||||
parser.addOption(assembly);
|
||||
//parser.addOption(forceOption);
|
||||
// Process the actual command line arguments given by the user
|
||||
parser.addPositionalArgument("source", QCoreApplication::translate("main", "DOS executable file to decompile."));
|
||||
parser.process(app);
|
||||
|
||||
const QStringList args = parser.positionalArguments();
|
||||
if(args.empty()) {
|
||||
parser.showHelp();
|
||||
}
|
||||
// the source file is the first positional argument, args.at(0)
|
||||
option.verbose = parser.isSet(boolOpts[0]);
|
||||
option.VeryVerbose = parser.isSet(boolOpts[1]);
|
||||
if(parser.isSet(assembly)) {
|
||||
option.asm1 = parser.value(assembly).toInt()==1;
|
||||
option.asm2 = parser.value(assembly).toInt()==2;
|
||||
}
|
||||
option.Map = parser.isSet(boolOpts[3]);
|
||||
option.Stats = parser.isSet(boolOpts[4]);
|
||||
option.Interact = false;
|
||||
option.Calls = parser.isSet(boolOpts[2]);
|
||||
option.filename = args.first();
|
||||
if(parser.isSet(targetFileOption))
|
||||
asm1_name = asm2_name = parser.value(targetFileOption);
|
||||
else if(option.asm1 || option.asm2) {
|
||||
asm1_name = option.filename+".a1";
|
||||
asm2_name = option.filename+".a2";
|
||||
}
|
||||
|
||||
}
|
||||
int main(int argc, char **argv)
|
||||
{
|
||||
/* Extract switches and filename */
|
||||
strcpy(option.filename, initargs(argc, argv));
|
||||
QCoreApplication app(argc,argv);
|
||||
|
||||
QCoreApplication::setApplicationVersion("0.1");
|
||||
setupOptions(app);
|
||||
|
||||
/* Front end reads in EXE or COM file, parses it into I-code while
|
||||
* building the call graph and attaching appropriate bits of code for
|
||||
* each procedure.
|
||||
*/
|
||||
DccFrontend fe(option.filename);
|
||||
Project::get()->create(option.filename);
|
||||
|
||||
DccFrontend fe(&app);
|
||||
if(!Project::get()->load()) {
|
||||
return -1;
|
||||
}
|
||||
if (option.verbose)
|
||||
Project::get()->prog.displayLoadInfo();
|
||||
if(false==fe.FrontEnd ())
|
||||
return -1;
|
||||
if(option.asm1)
|
||||
@@ -138,98 +205,16 @@ int main(int argc, char **argv)
|
||||
* analysis, data flow etc. and outputs it to output file ready for
|
||||
* re-compilation.
|
||||
*/
|
||||
BackEnd(asm1_name ? asm1_name:option.filename, Project::get()->callGraph);
|
||||
BackEnd(Project::get()->callGraph);
|
||||
|
||||
Project::get()->callGraph->write();
|
||||
|
||||
if (option.Stats)
|
||||
displayTotalStats();
|
||||
|
||||
/*
|
||||
freeDataStructures(pProcList);
|
||||
*/
|
||||
return 0;
|
||||
}
|
||||
|
||||
/****************************************************************************
|
||||
* initargs - Extract command line arguments
|
||||
***************************************************************************/
|
||||
static char *initargs(int argc, char *argv[])
|
||||
{
|
||||
char *pc;
|
||||
|
||||
while (--argc > 0 && (*++argv)[0] == '-')
|
||||
{
|
||||
for (pc = argv[0]+1; *pc; pc++)
|
||||
switch (*pc)
|
||||
{
|
||||
case 'a': /* Print assembler listing */
|
||||
if (*(pc+1) == '2')
|
||||
option.asm2 = true;
|
||||
else
|
||||
option.asm1 = true;
|
||||
if (*(pc+1) == '1' || *(pc+1) == '2')
|
||||
pc++;
|
||||
break;
|
||||
case 'c':
|
||||
option.Calls = true;
|
||||
break;
|
||||
case 'i':
|
||||
option.Interact = true;
|
||||
break;
|
||||
case 'm': /* Print memory map */
|
||||
option.Map = true;
|
||||
break;
|
||||
case 's': /* Print Stats */
|
||||
option.Stats = true;
|
||||
break;
|
||||
case 'V': /* Very verbose => verbose */
|
||||
option.VeryVerbose = true;
|
||||
case 'v':
|
||||
option.verbose = true; /* Make everything verbose */
|
||||
break;
|
||||
case 'o': /* assembler output file */
|
||||
if (*(pc+1)) {
|
||||
asm1_name = asm2_name = pc+1;
|
||||
goto NextArg;
|
||||
}
|
||||
else if (--argc > 0) {
|
||||
asm1_name = asm2_name = *++argv;
|
||||
goto NextArg;
|
||||
}
|
||||
default:
|
||||
fatalError(INVALID_ARG, *pc);
|
||||
return *argv;
|
||||
}
|
||||
NextArg:;
|
||||
}
|
||||
|
||||
if (argc == 1)
|
||||
{
|
||||
if (option.asm1 || option.asm2)
|
||||
{
|
||||
if (! asm1_name)
|
||||
{
|
||||
asm1_name = strcpy((char*)malloc(strlen(*argv)+4), *argv);
|
||||
pc = strrchr(asm1_name, '.');
|
||||
if (pc > strrchr(asm1_name, '/'))
|
||||
{
|
||||
*pc = '\0';
|
||||
}
|
||||
asm2_name = (char*)malloc(strlen(asm1_name)+4) ;
|
||||
strcat(strcpy(asm2_name, asm1_name), ".a2");
|
||||
unlink(asm2_name);
|
||||
strcat(asm1_name, ".a1");
|
||||
}
|
||||
unlink(asm1_name); /* Remove asm output files */
|
||||
}
|
||||
return *argv; /* filename of the program to decompile */
|
||||
}
|
||||
|
||||
fatalError(USAGE);
|
||||
return *argv; // does not reach this.
|
||||
}
|
||||
|
||||
static void
|
||||
displayTotalStats ()
|
||||
/* Displays final statistics for the complete program */
|
||||
|
||||
61
src/dcc_interface.cpp
Normal file
@@ -0,0 +1,61 @@
|
||||
#include "dcc_interface.h"
|
||||
#include "dcc.h"
|
||||
#include "project.h"
|
||||
struct DccImpl : public IDcc{
|
||||
|
||||
|
||||
// IDcc interface
|
||||
public:
|
||||
void BaseInit()
|
||||
{
|
||||
}
|
||||
void Init(QObject *tgt)
|
||||
{
|
||||
}
|
||||
ilFunction GetFirstFuncHandle()
|
||||
{
|
||||
}
|
||||
ilFunction GetCurFuncHandle()
|
||||
{
|
||||
}
|
||||
void analysis_Once()
|
||||
{
|
||||
}
|
||||
void load(QString name)
|
||||
{
|
||||
option.filename = name;
|
||||
Project::get()->create(name);
|
||||
}
|
||||
void prtout_asm(IXmlTarget *, int level)
|
||||
{
|
||||
}
|
||||
void prtout_cpp(IXmlTarget *, int level)
|
||||
{
|
||||
}
|
||||
size_t getFuncCount()
|
||||
{
|
||||
}
|
||||
const lFunction &validFunctions() const
|
||||
{
|
||||
return Project::get()->functions();
|
||||
}
|
||||
void SetCurFunc_by_Name(QString)
|
||||
{
|
||||
}
|
||||
QDir installDir() {
|
||||
return QDir(".");
|
||||
}
|
||||
QDir dataDir(QString kind) { // return directory containing decompilation helper data -> signatures/includes/etc.
|
||||
QDir res(installDir());
|
||||
res.cd(kind);
|
||||
return res;
|
||||
}
|
||||
};
|
||||
|
||||
IDcc* IDcc::get() {
|
||||
static IDcc *v=0;
|
||||
if(!v)
|
||||
v = new DccImpl;
|
||||
|
||||
return v;
|
||||
}
|
||||
@@ -150,10 +150,10 @@ void Disassembler::disassem(Function * ppProc)
|
||||
if (pass != 3)
|
||||
{
|
||||
auto p = (pass == 1)? asm1_name: asm2_name;
|
||||
m_fp.open(p,ios_base::app);
|
||||
m_fp.open(p.toStdString(),ios_base::app);
|
||||
if (!m_fp.is_open())
|
||||
{
|
||||
fatalError(CANNOT_OPEN, p);
|
||||
fatalError(CANNOT_OPEN, p.toStdString().c_str());
|
||||
}
|
||||
}
|
||||
/* Create temporary code array */
|
||||
|
||||
@@ -82,7 +82,7 @@ bool DccFrontend::FrontEnd ()
|
||||
|
||||
if (option.asm1)
|
||||
{
|
||||
printf("dcc: writing assembler file %s\n", asm1_name);
|
||||
printf("dcc: writing assembler file %s\n", asm1_name.toStdString().c_str());
|
||||
}
|
||||
|
||||
/* Search through code looking for impure references and flag them */
|
||||
|
||||
@@ -3,7 +3,12 @@
|
||||
* (C) Cristina Cifuentes
|
||||
****************************************************************************/
|
||||
|
||||
#include <llvm/Support/PatternMatch.h>
|
||||
//#include <llvm/Config/llvm-config.h>
|
||||
//#if( (LLVM_VERSION_MAJOR==3 ) && (LLVM_VERSION_MINOR>3) )
|
||||
//#include <llvm/IR/PatternMatch.h>
|
||||
//#else
|
||||
//#include <llvm/Support/PatternMatch.h>
|
||||
//#endif
|
||||
#include <boost/iterator/filter_iterator.hpp>
|
||||
#include <cstring>
|
||||
#include <deque>
|
||||
|
||||
@@ -27,16 +27,16 @@ bool Idiom14::match(iICODE pIcode)
|
||||
return false;
|
||||
m_icodes[0]=pIcode++;
|
||||
m_icodes[1]=pIcode++;
|
||||
LLInst * matched [] = {m_icodes[0]->ll(),m_icodes[1]->ll()};
|
||||
LLInst * matched [] {m_icodes[0]->ll(),m_icodes[1]->ll()};
|
||||
/* Check for regL */
|
||||
m_regL = m_icodes[0]->ll()->m_dst.regi;
|
||||
if (not m_icodes[0]->ll()->testFlags(I) && ((m_regL == rAX) || (m_regL ==rBX)))
|
||||
m_regL = matched[0]->m_dst.regi;
|
||||
if (not matched[0]->testFlags(I) && ((m_regL == rAX) || (m_regL ==rBX)))
|
||||
{
|
||||
/* Check for XOR regH, regH */
|
||||
if (m_icodes[1]->ll()->match(iXOR) && not m_icodes[1]->ll()->testFlags(I))
|
||||
if (matched[1]->match(iXOR) && not matched[1]->testFlags(I))
|
||||
{
|
||||
m_regH = m_icodes[1]->ll()->m_dst.regi;
|
||||
if (m_regH == m_icodes[1]->ll()->src().getReg2())
|
||||
m_regH = matched[1]->m_dst.regi;
|
||||
if (m_regH == matched[1]->src().getReg2())
|
||||
{
|
||||
if ((m_regL == rAX) && (m_regH == rDX))
|
||||
return true;
|
||||
@@ -49,14 +49,11 @@ bool Idiom14::match(iICODE pIcode)
|
||||
}
|
||||
int Idiom14::action()
|
||||
{
|
||||
int idx;
|
||||
AstIdent *lhs;
|
||||
Expr *rhs;
|
||||
|
||||
idx = m_func->localId.newLongReg (TYPE_LONG_SIGN, LONGID_TYPE(m_regH,m_regL), m_icodes[0]);
|
||||
lhs = AstIdent::LongIdx (idx);
|
||||
int idx = m_func->localId.newLongReg (TYPE_LONG_SIGN, LONGID_TYPE(m_regH,m_regL), m_icodes[0]);
|
||||
AstIdent *lhs = AstIdent::LongIdx (idx);
|
||||
m_icodes[0]->setRegDU( m_regH, eDEF);
|
||||
rhs = AstIdent::id (*m_icodes[0]->ll(), SRC, m_func, m_icodes[0], *m_icodes[0], NONE);
|
||||
Expr *rhs = AstIdent::id (*m_icodes[0]->ll(), SRC, m_func, m_icodes[0], *m_icodes[0], NONE);
|
||||
m_icodes[0]->setAsgn(lhs, rhs);
|
||||
m_icodes[1]->invalidate();
|
||||
return 2;
|
||||
|
||||
@@ -20,70 +20,8 @@ static void setBits(int16_t type, uint32_t start, uint32_t len);
|
||||
static void process_MOV(LLInst &ll, STATE * pstate);
|
||||
static SYM * lookupAddr (LLOperand *pm, STATE * pstate, int size, uint16_t duFlag);
|
||||
void interactDis(Function * initProc, int ic);
|
||||
static uint32_t SynthLab;
|
||||
extern uint32_t SynthLab;
|
||||
|
||||
/* Parses the program, builds the call graph, and returns the list of
|
||||
* procedures found */
|
||||
void DccFrontend::parse(Project &proj)
|
||||
{
|
||||
PROG &prog(proj.prog);
|
||||
STATE state;
|
||||
|
||||
/* Set initial state */
|
||||
state.setState(rES, 0); /* PSP segment */
|
||||
state.setState(rDS, 0);
|
||||
state.setState(rCS, prog.initCS);
|
||||
state.setState(rSS, prog.initSS);
|
||||
state.setState(rSP, prog.initSP);
|
||||
state.IP = ((uint32_t)prog.initCS << 4) + prog.initIP;
|
||||
SynthLab = SYNTHESIZED_MIN;
|
||||
|
||||
// default-construct a Function object !
|
||||
/*auto func = */;
|
||||
|
||||
/* Check for special settings of initial state, based on idioms of the
|
||||
startup code */
|
||||
state.checkStartup();
|
||||
Function *start_proc;
|
||||
/* Make a struct for the initial procedure */
|
||||
if (prog.offMain != -1)
|
||||
{
|
||||
start_proc = proj.createFunction(0,"main");
|
||||
start_proc->retVal.loc = REG_FRAME;
|
||||
start_proc->retVal.type = TYPE_WORD_SIGN;
|
||||
start_proc->retVal.id.regi = rAX;
|
||||
/* We know where main() is. Start the flow of control from there */
|
||||
start_proc->procEntry = prog.offMain;
|
||||
/* In medium and large models, the segment of main may (will?) not be
|
||||
the same as the initial CS segment (of the startup code) */
|
||||
state.setState(rCS, prog.segMain);
|
||||
state.IP = prog.offMain;
|
||||
}
|
||||
else
|
||||
{
|
||||
start_proc = proj.createFunction(0,"start");
|
||||
/* Create initial procedure at program start address */
|
||||
start_proc->procEntry = (uint32_t)state.IP;
|
||||
}
|
||||
|
||||
/* The state info is for the first procedure */
|
||||
start_proc->state = state;
|
||||
|
||||
/* Set up call graph initial node */
|
||||
proj.callGraph = new CALL_GRAPH;
|
||||
proj.callGraph->proc = start_proc;
|
||||
|
||||
/* This proc needs to be called to set things up for LibCheck(), which
|
||||
checks a proc to see if it is a known C (etc) library */
|
||||
SetupLibCheck();
|
||||
//BUG: proj and g_proj are 'live' at this point !
|
||||
|
||||
/* Recursively build entire procedure list */
|
||||
start_proc->FollowCtrl(proj.callGraph, &state);
|
||||
|
||||
/* This proc needs to be called to clean things up from SetupLibCheck() */
|
||||
CleanupLibCheck();
|
||||
}
|
||||
|
||||
/* Returns the size of the string pointed by sym and delimited by delim.
|
||||
* Size includes delimiter. */
|
||||
|
||||
101
src/perfhlib.cpp
@@ -1,101 +0,0 @@
|
||||
/*
|
||||
* Perfect hashing function library. Contains functions to generate perfect
|
||||
* hashing functions
|
||||
* (C) Mike van Emmerik
|
||||
*/
|
||||
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include "perfhlib.h"
|
||||
|
||||
/* Private data structures */
|
||||
|
||||
static uint16_t *T1, *T2; /* Pointers to T1[i], T2[i] */
|
||||
static short *g; /* g[] */
|
||||
|
||||
//static int numEdges; /* An edge counter */
|
||||
//static bool *visited; /* Array of bools: whether visited */
|
||||
|
||||
/* Private prototypes */
|
||||
//static void initGraph(void);
|
||||
//static void addToGraph(int e, int v1, int v2);
|
||||
//static bool isCycle(void);
|
||||
//static void duplicateKeys(int v1, int v2);
|
||||
PatternHasher g_pattern_hasher;
|
||||
|
||||
void PatternHasher::init(int _NumEntry, int _EntryLen, int _SetSize, char _SetMin, int _NumVert)
|
||||
{
|
||||
/* These parameters are stored in statics so as to obviate the need for
|
||||
passing all these (or dereferencing pointers) for every call to hash()
|
||||
*/
|
||||
|
||||
NumEntry = _NumEntry;
|
||||
EntryLen = _EntryLen;
|
||||
SetSize = _SetSize;
|
||||
SetMin = _SetMin;
|
||||
NumVert = _NumVert;
|
||||
|
||||
/* Allocate the variable sized tables etc */
|
||||
T1base = new uint16_t [EntryLen * SetSize];
|
||||
T2base = new uint16_t [EntryLen * SetSize];
|
||||
graphNode = new int [NumEntry*2 + 1];
|
||||
graphNext = new int [NumEntry*2 + 1];
|
||||
graphFirst = new int [NumVert + 1];
|
||||
g = new short [NumVert + 1];
|
||||
// visited = new bool [NumVert + 1];
|
||||
return;
|
||||
|
||||
}
|
||||
|
||||
void PatternHasher::cleanup(void)
|
||||
{
|
||||
/* Free the storage for variable sized tables etc */
|
||||
delete [] T1base;
|
||||
delete [] T2base;
|
||||
delete [] graphNode;
|
||||
delete [] graphNext;
|
||||
delete [] graphFirst;
|
||||
delete [] g;
|
||||
// delete [] visited;
|
||||
}
|
||||
int PatternHasher::hash(uint8_t *string)
{
    uint16_t u, v;
    int j;

    u = 0;
    for (j=0; j < EntryLen; j++)
    {
        T1 = T1base + j * SetSize;
        u += T1[string[j] - SetMin];
    }
    u %= NumVert;

    v = 0;
    for (j=0; j < EntryLen; j++)
    {
        T2 = T2base + j * SetSize;
        v += T2[string[j] - SetMin];
    }
    v %= NumVert;

    return (g[u] + g[v]) % NumEntry;
}
|
||||
|
||||
uint16_t * PatternHasher::readT1(void)
|
||||
{
|
||||
return T1base;
|
||||
}
|
||||
|
||||
uint16_t *PatternHasher::readT2(void)
|
||||
{
|
||||
return T2base;
|
||||
}
|
||||
|
||||
uint16_t * PatternHasher::readG(void)
|
||||
{
|
||||
return (uint16_t *)g;
|
||||
}
|
||||
|
||||
@@ -27,7 +27,6 @@ const char *indentStr(int indLevel) // Indentation according to the depth of the
|
||||
* not exist. */
|
||||
void CALL_GRAPH::insertArc (ilFunction newProc)
|
||||
{
|
||||
CALL_GRAPH *pcg;
|
||||
|
||||
|
||||
/* Check if procedure already exists */
|
||||
@@ -35,7 +34,7 @@ void CALL_GRAPH::insertArc (ilFunction newProc)
|
||||
if(res!=outEdges.end())
|
||||
return;
|
||||
/* Include new arc */
|
||||
pcg = new CALL_GRAPH;
|
||||
CALL_GRAPH *pcg = new CALL_GRAPH;
|
||||
pcg->proc = newProc;
|
||||
outEdges.push_back(pcg);
|
||||
}
|
||||
@@ -49,13 +48,10 @@ bool CALL_GRAPH::insertCallGraph(ilFunction caller, ilFunction callee)
|
||||
insertArc (callee);
|
||||
return true;
|
||||
}
|
||||
else
|
||||
{
|
||||
for (CALL_GRAPH *edg : outEdges)
|
||||
if (edg->insertCallGraph (caller, callee))
|
||||
return true;
|
||||
return (false);
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
bool CALL_GRAPH::insertCallGraph(Function *caller, ilFunction callee)
|
||||
@@ -333,7 +329,6 @@ void STKFRAME::adjustForArgType(size_t numArg_, hlType actType_)
|
||||
{
|
||||
hlType forType;
|
||||
STKSYM * psym, * nsym;
|
||||
int off;
|
||||
/* If formal argument does not exist, do not create new ones, just
|
||||
* ignore actual argument
|
||||
*/
|
||||
@@ -341,7 +336,7 @@ void STKFRAME::adjustForArgType(size_t numArg_, hlType actType_)
|
||||
return;
|
||||
|
||||
/* Find stack offset for this argument */
|
||||
off = m_minOff;
|
||||
int off = m_minOff;
|
||||
size_t i=0;
|
||||
for(STKSYM &s : *this) // walk formal arguments up to numArg_
|
||||
{
|
||||
@@ -353,7 +348,6 @@ void STKFRAME::adjustForArgType(size_t numArg_, hlType actType_)
|
||||
|
||||
/* Find formal argument */
|
||||
//psym = &at(numArg_);
|
||||
//i = numArg_;
|
||||
//auto iter=std::find_if(sym.begin(),sym.end(),[off](STKSYM &s)->bool {s.off==off;});
|
||||
auto iter=std::find_if(begin()+numArg_,end(),[off](STKSYM &s)->bool {return s.label==off;});
|
||||
if(iter==end()) // symbol not found
|
||||
@@ -361,15 +355,16 @@ void STKFRAME::adjustForArgType(size_t numArg_, hlType actType_)
|
||||
psym = &(*iter);
|
||||
|
||||
forType = psym->type;
|
||||
if (forType != actType_)
|
||||
{
|
||||
if (forType == actType_)
|
||||
return;
|
||||
switch (actType_) {
|
||||
case TYPE_UNKNOWN: case TYPE_BYTE_SIGN:
|
||||
case TYPE_BYTE_UNSIGN: case TYPE_WORD_SIGN:
|
||||
case TYPE_WORD_UNSIGN: case TYPE_RECORD:
|
||||
break;
|
||||
|
||||
case TYPE_LONG_UNSIGN: case TYPE_LONG_SIGN:
|
||||
case TYPE_LONG_UNSIGN:
|
||||
case TYPE_LONG_SIGN:
|
||||
if ((forType == TYPE_WORD_UNSIGN) ||
|
||||
(forType == TYPE_WORD_SIGN) ||
|
||||
(forType == TYPE_UNKNOWN))
|
||||
@@ -395,6 +390,5 @@ void STKFRAME::adjustForArgType(size_t numArg_, hlType actType_)
|
||||
default:
|
||||
fprintf(stderr,"STKFRAME::adjustForArgType unhandled actType_ %d \n",actType_);
|
||||
} /* eos */
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -1,10 +1,13 @@
|
||||
#include <QtCore/QString>
|
||||
#include <QtCore/QDir>
|
||||
#include <utility>
|
||||
#include "dcc.h"
|
||||
#include "CallGraph.h"
|
||||
#include "project.h"
|
||||
#include "Procedure.h"
|
||||
using namespace std;
|
||||
//Project g_proj;
|
||||
char *asm1_name, *asm2_name; /* Assembler output filenames */
|
||||
QString asm1_name, asm2_name; /* Assembler output filenames */
|
||||
SYMTAB symtab; /* Global symbol table */
|
||||
STATS stats; /* cfg statistics */
|
||||
//PROG prog; /* programs fields */
|
||||
@@ -19,19 +22,17 @@ void Project::initialize()
|
||||
delete callGraph;
|
||||
callGraph = nullptr;
|
||||
}
|
||||
void Project::create(const string &a)
|
||||
void Project::create(const QString &a)
|
||||
{
|
||||
initialize();
|
||||
QFileInfo fi(a);
|
||||
m_fname=a;
|
||||
string::size_type ext_loc=a.find_last_of('.');
|
||||
string::size_type slash_loc=a.find_last_of('/',ext_loc);
|
||||
if(slash_loc==string::npos)
|
||||
slash_loc=0;
|
||||
else
|
||||
slash_loc++;
|
||||
if(ext_loc!=string::npos)
|
||||
m_project_name = a.substr(slash_loc,(ext_loc-slash_loc));
|
||||
else
|
||||
m_project_name = a.substr(slash_loc);
|
||||
m_project_name = fi.completeBaseName();
|
||||
m_output_path = fi.path();
|
||||
}
|
||||
|
||||
QString Project::output_name(const char *ext) {
|
||||
return m_output_path+QDir::separator()+m_project_name+"."+ext;
|
||||
}
|
||||
bool Project::valid(ilFunction iter)
|
||||
{
|
||||
|
||||
1
tools/CMakeLists.txt
Normal file
@@ -0,0 +1 @@
|
||||
add_subdirectory(makedsig)
|
||||
248
tools/dispsrch/dispsig.cpp
Normal file
@@ -0,0 +1,248 @@
|
||||
/* Quick program to copy a named signature to a small file */
|
||||
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <memory.h>
|
||||
#include <string.h>
|
||||
#include "perfhlib.h"
|
||||
|
||||
/* statics */
|
||||
byte buf[100];
|
||||
int numKeys; /* Number of hash table entries (keys) */
|
||||
int numVert; /* Number of vertices in the graph (also size of g[]) */
|
||||
int PatLen; /* Size of the keys (pattern length) */
|
||||
int SymLen; /* Max size of the symbols, including null */
|
||||
FILE *f; /* File being read */
|
||||
FILE *f2; /* File being written */
|
||||
|
||||
static word *T1base, *T2base; /* Pointers to start of T1, T2 */
|
||||
static word *g; /* g[] */
|
||||
|
||||
/* prototypes */
|
||||
void grab(int n);
|
||||
word readFileShort(void);
|
||||
void cleanup(void);
|
||||
|
||||
|
||||
#define SYMLEN 16
|
||||
#define PATLEN 23
|
||||
|
||||
/* Hash table structure */
|
||||
typedef struct HT_tag
|
||||
{
|
||||
char htSym[SYMLEN];
|
||||
byte htPat[PATLEN];
|
||||
} HT;
|
||||
|
||||
HT ht; /* One hash table entry */
|
||||
|
||||
int
|
||||
main(int argc, char *argv[])
|
||||
{
|
||||
word w, len;
|
||||
int i;
|
||||
|
||||
if (argc <= 3)
|
||||
{
|
||||
printf("Usage: dispsig <SigFilename> <FunctionName> <BinFileName>\n");
|
||||
printf("Example: dispsig dccm8s.sig printf printf.bin\n");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
if ((f = fopen(argv[1], "rb")) == NULL)
|
||||
{
|
||||
printf("Cannot open %s\n", argv[1]);
|
||||
exit(2);
|
||||
}
|
||||
|
||||
if ((f2 = fopen(argv[3], "wb")) == NULL)
|
||||
{
|
||||
printf("Cannot write to %s\n", argv[3]);
|
||||
exit(2);
|
||||
}
|
||||
|
||||
|
||||
/* Read the parameters */
|
||||
grab(4);
|
||||
if (memcmp("dccs", buf, 4) != 0)
|
||||
{
|
||||
printf("Not a dccs file!\n");
|
||||
exit(3);
|
||||
}
|
||||
numKeys = readFileShort();
|
||||
numVert = readFileShort();
|
||||
PatLen = readFileShort();
|
||||
SymLen = readFileShort();
|
||||
|
||||
/* Initialise the perfhlib stuff. Also allocates T1, T2, g, etc */
|
||||
hashParams( /* Set the parameters for the hash table */
|
||||
numKeys, /* The number of symbols */
|
||||
PatLen, /* The length of the pattern to be hashed */
|
||||
256, /* The character set of the pattern (0-FF) */
|
||||
0, /* Minimum pattern character value */
|
||||
numVert); /* Specifies C, the sparseness of the graph.
|
||||
See Czech, Havas and Majewski for details
|
||||
*/
|
||||
|
||||
T1base = readT1();
|
||||
T2base = readT2();
|
||||
g = readG();
|
||||
|
||||
/* Read T1 and T2 tables */
|
||||
grab(2);
|
||||
if (memcmp("T1", buf, 2) != 0)
|
||||
{
|
||||
printf("Expected 'T1'\n");
|
||||
exit(3);
|
||||
}
|
||||
len = PatLen * 256 * sizeof(word);
|
||||
w = readFileShort();
|
||||
if (w != len)
|
||||
{
|
||||
printf("Problem with size of T1: file %d, calc %d\n", w, len);
|
||||
exit(4);
|
||||
}
|
||||
if (fread(T1base, 1, len, f) != len)
|
||||
{
|
||||
printf("Could not read T1\n");
|
||||
exit(5);
|
||||
}
|
||||
|
||||
grab(2);
|
||||
if (memcmp("T2", buf, 2) != 0)
|
||||
{
|
||||
printf("Expected 'T2'\n");
|
||||
exit(3);
|
||||
}
|
||||
w = readFileShort();
|
||||
if (w != len)
|
||||
{
|
||||
printf("Problem with size of T2: file %d, calc %d\n", w, len);
|
||||
exit(4);
|
||||
}
|
||||
if (fread(T2base, 1, len, f) != len)
|
||||
{
|
||||
printf("Could not read T2\n");
|
||||
exit(5);
|
||||
}
|
||||
|
||||
/* Now read the function g[] */
|
||||
grab(2);
|
||||
if (memcmp("gg", buf, 2) != 0)
|
||||
{
|
||||
printf("Expected 'gg'\n");
|
||||
exit(3);
|
||||
}
|
||||
len = numVert * sizeof(word);
|
||||
w = readFileShort();
|
||||
if (w != len)
|
||||
{
|
||||
printf("Problem with size of g[]: file %d, calc %d\n", w, len);
|
||||
exit(4);
|
||||
}
|
||||
if (fread(g, 1, len, f) != len)
|
||||
{
|
||||
printf("Could not read g[]\n");
|
||||
exit(5);
|
||||
}
|
||||
|
||||
|
||||
/* This is now the hash table */
|
||||
grab(2);
|
||||
if (memcmp("ht", buf, 2) != 0)
|
||||
{
|
||||
printf("Expected 'ht'\n");
|
||||
exit(3);
|
||||
}
|
||||
w = readFileShort();
|
||||
if (w != numKeys * (SymLen + PatLen + sizeof(word)))
|
||||
{
|
||||
printf("Problem with size of hash table: file %d, calc %d\n", w, (int)(numKeys * (SymLen + PatLen + sizeof(word))));
|
||||
exit(6);
|
||||
}
|
||||
|
||||
|
||||
for (i=0; i < numKeys; i++)
|
||||
{
|
||||
if (fread(&ht, 1, SymLen + PatLen, f) != (size_t)(SymLen + PatLen))
|
||||
{
|
||||
printf("Could not read pattern %d from %s\n", i, argv[1]);
|
||||
exit(7);
|
||||
}
|
||||
if (stricmp(ht.htSym, argv[2]) == 0)
|
||||
{
|
||||
/* Found it! */
|
||||
break;
|
||||
}
|
||||
|
||||
}
|
||||
fclose(f);
|
||||
if (i == numKeys)
|
||||
{
|
||||
printf("Function %s not found!\n", argv[2]);
|
||||
exit(2);
|
||||
}
|
||||
|
||||
printf("Function %s index %d\n", ht.htSym, i);
|
||||
for (i=0; i < PatLen; i++)
|
||||
{
|
||||
printf("%02X ", ht.htPat[i]);
|
||||
}
|
||||
|
||||
fwrite(ht.htPat, 1, PatLen, f2);
|
||||
fclose(f2);
|
||||
|
||||
printf("\n");
|
||||
|
||||
|
||||
}
|
||||
|
||||
|
||||
void
|
||||
cleanup(void)
|
||||
{
|
||||
/* Free the storage for variable sized tables etc */
|
||||
if (T1base) free(T1base);
|
||||
if (T2base) free(T2base);
|
||||
if (g) free(g);
|
||||
}
|
||||
|
||||
void grab(int n)
|
||||
{
|
||||
if (fread(buf, 1, n, f) != (size_t)n)
|
||||
{
|
||||
printf("Could not read\n");
|
||||
exit(11);
|
||||
}
|
||||
}
|
||||
|
||||
word
|
||||
readFileShort(void)
|
||||
{
|
||||
byte b1, b2;
|
||||
|
||||
if (fread(&b1, 1, 1, f) != 1)
|
||||
{
|
||||
printf("Could not read\n");
|
||||
exit(11);
|
||||
}
|
||||
if (fread(&b2, 1, 1, f) != 1)
|
||||
{
|
||||
printf("Could not read\n");
|
||||
exit(11);
|
||||
}
|
||||
return (b2 << 8) + b1;
|
||||
}
|
||||
|
||||
/* Following two functions not needed unless creating tables */
|
||||
|
||||
void getKey(int i, byte **keys)
|
||||
{
|
||||
}
|
||||
|
||||
/* Display key i */
|
||||
void
|
||||
dispKey(int i)
|
||||
{
|
||||
}
|
||||
|
||||
11
tools/dispsrch/dispsig.mak
Normal file
@@ -0,0 +1,11 @@
|
||||
CFLAGS = -Zi -c -AL -W3 -D__MSDOS__
|
||||
|
||||
dispsig.exe: dispsig.obj perfhlib.obj
|
||||
link /CO dispsig perfhlib;
|
||||
|
||||
dispsig.obj: dispsig.c dcc.h perfhlib.h
|
||||
cl $(CFLAGS) $*.c
|
||||
|
||||
perfhlib.obj: perfhlib.c dcc.h perfhlib.h
|
||||
cl $(CFLAGS) $*.c
|
||||
|
||||
221
tools/dispsrch/dispsrch.txt
Normal file
@@ -0,0 +1,221 @@
DISPSIG and SRCHSIG
===================

1 What are DispSig and SrchSig?

2 How do I use DispSig?

3 How do I use SrchSig?

4 What can I do with the binary pattern file from DispSig?

5 How can I create a binary pattern file for SrchSig?



1 What are DispSig and SrchSig?
-------------------------------

SrchSig is a program to display the name of a function, given a
signature (pattern).
DispSig is a program to display a signature, given a function name.
DispSig also writes the signature to a binary file, so you can
disassemble it, or use it in SrchSig to see if some other signature
file has the same pattern.
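
Both programs start by reading a small header from the signature
file. As a rough sketch based on what dispsig.cpp does (the names
SigHeader, readLE16 and parseSigHeader are made up for this example,
and parsing from a memory buffer is an assumption; the real tools
read directly from a FILE*), the header could be parsed like this:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Header fields, in the order dispsig.cpp reads them after the "dccs" magic.
struct SigHeader {
    uint16_t numKeys;  // number of hash table entries (keys)
    uint16_t numVert;  // number of vertices in the graph (size of g[])
    uint16_t patLen;   // size of the keys (pattern length)
    uint16_t symLen;   // max size of the symbols, including null
};

// Combine two bytes, low byte first, the same way readFileShort() does.
inline uint16_t readLE16(const uint8_t *p) {
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

// Hypothetical helper: parse the 12-byte header from a buffer.
// Returns false if the buffer is too short or the magic is wrong.
bool parseSigHeader(const uint8_t *data, std::size_t size, SigHeader &out) {
    if (size < 12 || std::memcmp(data, "dccs", 4) != 0)
        return false;
    out.numKeys = readLE16(data + 4);
    out.numVert = readLE16(data + 6);
    out.patLen  = readLE16(data + 8);
    out.symLen  = readLE16(data + 10);
    return true;
}
```

A buffer that does not start with "dccs" is rejected, which
corresponds to the "Not a dccs file!" message printed by the tools.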


2 How do I use DispSig?
-----------------------
Just type
DispSig <SignatureFileName> <FunctionName> <BinaryFileName>

For example:

dispsig dccb2s.sig strcmp strcmp.bin
Function strcmp index 58
55 8B EC 56 57 8C D8 8E C0 FC 33 C0 8B D8 8B 7E 06 8B F7 32 C0 B9 F4

This tells us that the function was the 59th function in the
signature file (and that the signature above will hash to 58
(decimal)). We can see that it is a standard C function, since it
starts with "55 8B EC" (PUSH BP; MOV BP,SP), which is the standard C
function prologue. The rest of it is a bit hard to follow, but
fortunately we have also written the pattern to a binary file,
strcmp.bin. See section 4 on how to disassemble this pattern.
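
The index that DispSig reports is the value of the perfect hash
function for that pattern. The sketch below follows the arithmetic of
PatternHasher::hash in perfhlib.cpp: two table sums, each reduced
modulo the number of vertices, then combined through g[]. The struct
name PerfectHash and its field names are invented for illustration,
and the minimum character value (SetMin) is taken to be 0, which is
what the tools pass in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative container for the three tables stored in a signature file.
struct PerfectHash {
    int numEntry;              // number of keys in the hash table
    int entryLen;              // bytes per key (23 for dcc signatures)
    int setSize;               // alphabet size (256 for byte patterns)
    int numVert;               // number of graph vertices, i.e. size of g
    std::vector<uint16_t> t1;  // entryLen * setSize entries
    std::vector<uint16_t> t2;  // entryLen * setSize entries
    std::vector<int16_t> g;    // numVert entries

    int hash(const uint8_t *key) const {
        uint16_t u = 0, v = 0;
        for (int j = 0; j < entryLen; ++j) {
            // Each key position has its own row of setSize table entries.
            std::size_t idx = static_cast<std::size_t>(j) * setSize + key[j];
            u = static_cast<uint16_t>(u + t1[idx]);  // sums wrap at 16 bits
            v = static_cast<uint16_t>(v + t2[idx]);
        }
        u = static_cast<uint16_t>(u % numVert);
        v = static_cast<uint16_t>(v % numVert);
        return (g[u] + g[v]) % numEntry;
    }
};
```

With the real T1, T2 and g tables read from dccb2s.sig, calling
hash() on the 23 byte strcmp pattern above should give the index
that DispSig printed.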

If I type

dispsig dcct4p.sig writeln wl.bin

I get
Function writeln not found!

In fact, there is no one function that performs the writeln function;
there are functions like WriteString, WriteInt, CrLf (Carriage
return, linefeed), and so on. DispSig is case-insensitive, so:

dispsig dcct4p.sig writestring wl.bin
produces

Function WriteString index 53
55 8B EC C4 7E 0C E8 F4 F4 75 25 C5 76 08 8B 4E 06 FC AC F4 F4 2B C8


3 How do I use SrchSig?
-----------------------
Just type

srchsig <SignatureFileName> <BinaryFileName>

where BinaryFileName contains a pattern. See section 5 for how to
create one of these. For now, we can use the pattern file from the
first example:

srchsig dccb2s.sig strcmp.bin

Pattern:
55 8B EC 56 57 8C D8 8E C0 FC 33 C0 8B D8 8B 7E 06 8B F7 32 C0 B9 F4
Pattern hashed to 58 (0x3A), symbol strcmp
Pattern matched

Note that the pattern reported above need not be exactly the same as
the one we provided in <BinaryFileName>. The pattern displayed is the
wildcarded and chopped version of the pattern provided; it will have
F4s (wildcards) and possibly zeroes at the end; see the file
makedstp.txt for a simple explanation of wildcarding and chopping.
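
To get a feel for how much of a displayed pattern is wildcard, here
is a small sketch that prints a pattern in the same "%02X " style as
the tools and counts the F4 bytes. The names kWildcard,
formatPattern and countWildcards are invented for this example; the
actual wildcarding and chopping rules live in makedstp.txt and
fixwild.c and are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

const uint8_t kWildcard = 0xF4;  // wildcard value named in this document

// Render a pattern as upper-case hex bytes, each followed by a space.
std::string formatPattern(const std::vector<uint8_t> &pat) {
    std::string out;
    char buf[4];  // two hex digits, a space, and the terminating null
    for (std::size_t i = 0; i < pat.size(); ++i) {
        std::snprintf(buf, sizeof buf, "%02X ", static_cast<unsigned>(pat[i]));
        out += buf;
    }
    return out;
}

// Count bytes equal to the wildcard value.
std::size_t countWildcards(const std::vector<uint8_t> &pat) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < pat.size(); ++i)
        if (pat[i] == kWildcard)
            ++n;
    return n;
}
```

For the WriteString pattern shown in section 2 this counts four
wildcard bytes out of 23.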

If we type

srchsig dccb2s.sig ws.bin

we get

Pattern:
55 8B EC C4 7E 0C E8 F4 F4 75 25 C5 76 08 8B 4E 06 FC AC F4 F4 2B C8
Pattern hashed to 0 (0x0), symbol _IOERROR
Pattern mismatch: found following pattern
55 8B EC 56 8B 76 04 0B F6 7C 14 83 FE 58 76 03 BE F4 F4 89 36 F4 F4
300

The pattern often hashes to zero when the pattern is unknown, due to
the sparse nature of the tables used in the hash function. The first
pattern in dccb2s.sig happens to be _IOERROR, and its pattern is
completely different, apart from the first three bytes. The "300" at
the end is actually a running count of signatures searched linearly,
in case there is a problem with the hash function.
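
The check-then-fall-back behaviour described above can be sketched
as follows. This is only an assumption about how such a search might
be organised (the search routine of srchsig.cpp is not shown here),
and SigEntry and findSignature are made-up names: first compare the
pattern stored at the hashed index, and if that fails, walk the whole
table while counting the entries visited.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One hash table entry: symbol name plus its pattern bytes.
struct SigEntry {
    std::string name;
    std::vector<uint8_t> pattern;
};

// Returns the index of the matching entry, or -1 if none matches.
// *searched receives the number of entries visited by the linear scan
// (0 when the hashed slot matched straight away).
int findSignature(const std::vector<SigEntry> &table,
                  const std::vector<uint8_t> &pat,
                  std::size_t hashIndex,
                  std::size_t *searched) {
    *searched = 0;
    if (hashIndex < table.size() && table[hashIndex].pattern == pat)
        return static_cast<int>(hashIndex);
    for (std::size_t i = 0; i < table.size(); ++i) {
        ++*searched;
        if (table[i].pattern == pat)
            return static_cast<int>(i);
    }
    return -1;
}
```

Under this reading, a count equal to the number of keys in the file
means that every signature was compared and none of them matched.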



4 What can I do with the binary pattern file from DispSig?
----------------------------------------------------------

You can feed it into SrchSig; this might make sense if you wanted to
know if, e.g., the signature for printf was the same for version 2 as
it is for version 3. In this case, you would use DispSig on the
version 2 signature file, and SrchSig on the version 3 file.

You can also disassemble it, using debug (it comes with MS-DOS). For
example:
debug strcmp.bin
-u100 l 17

1754:0100 55 PUSH BP
1754:0101 8BEC MOV BP,SP
1754:0103 56 PUSH SI
1754:0104 57 PUSH DI
1754:0105 8CD8 MOV AX,DS
1754:0107 8EC0 MOV ES,AX
1754:0109 FC CLD
1754:010A 33C0 XOR AX,AX
1754:010C 8BD8 MOV BX,AX
1754:010E 8B7E06 MOV DI,[BP+06]
1754:0111 8BF7 MOV SI,DI
1754:0113 32C0 XOR AL,AL
1754:0115 B9F42B MOV CX,2BF4
-q

Note that the "2B" at the end is actually past the end of the
signature. (Signatures are 23 bytes (17 in hex) long, so only
addresses 100-116 are valid). Remember that most 16-bit operands will
be "wildcarded", so don't believe the resultant addresses.


5 How can I create a binary pattern file for SrchSig?
-----------------------------------------------------

Again, you can use debug. Suppose you have found an interesting piece
of code at address 05BE (this example comes from a hello world
program):

-u 5be
15FF:05BE 55 PUSH BP
15FF:05BF 8BEC MOV BP,SP
15FF:05C1 83EC08 SUB SP,+08
15FF:05C4 57 PUSH DI
15FF:05C5 56 PUSH SI
15FF:05C6 BE1E01 MOV SI,011E
15FF:05C9 8D4606 LEA AX,[BP+06]
15FF:05CC 8946FC MOV [BP-04],AX
15FF:05CF 56 PUSH SI
15FF:05D0 E8E901 CALL 07BC
15FF:05D3 83C402 ADD SP,+02
15FF:05D6 8BF8 MOV DI,AX
15FF:05D8 8D4606 LEA AX,[BP+06]
15FF:05DB 50 PUSH AX
15FF:05DC FF7604 PUSH [BP+04]
-mcs:5be l 17 cs:100
-u100 l 17
15FF:0100 55 PUSH BP
15FF:0101 8BEC MOV BP,SP
15FF:0103 83EC08 SUB SP,+08
15FF:0106 57 PUSH DI
15FF:0107 56 PUSH SI
15FF:0108 BE1E01 MOV SI,011E
15FF:010B 8D4606 LEA AX,[BP+06]
15FF:010E 8946FC MOV [BP-04],AX
15FF:0111 56 PUSH SI
15FF:0112 E8E901 CALL 02FE
15FF:0115 83C41F ADD SP,+1F
-nfoo.bin
-rcx
CX 268A
:17
-w
Writing 0017 bytes
-q
c>dir foo.bin
foo.bin 23 3-25-94 12:04
c>

The binary file has to be exactly 23 bytes long; that's why we
changed cx to the value 17 (hex 17 = decimal 23). If you are studying
a large file (> 64K) remember to set bx to 0 as well. The m (block
move) command moves the code of interest to cs:100, which is where
debug will write the file from. The "rcx" changes the length of the
save, and the "nfoo.bin" sets the name of the file to be saved. Now
we can feed this into srchsig:

srchsig dccb2s.sig foo.bin
Pattern:
55 8B EC 83 EC 08 57 56 BE F4 F4 8D 46 06 89 46 FC 56 E8 F4 F4 83 C4
Pattern hashed to 278 (0x116), symbol sleep
Pattern mismatch: found following pattern
55 8B EC 83 EC 04 56 57 8D 46 FC 50 E8 F4 F4 59 80 7E FE 5A 76 05 BF
300

Hmmm. Not a Borland C version 2 small model signature. Perhaps it's a
Microsoft Version 5 signature:

Pattern:
55 8B EC 83 EC 08 57 56 BE F4 F4 8D 46 06 89 46 FC 56 E8 F4 F4 83 C4
Pattern hashed to 31 (0x1F), symbol printf
Pattern matched

Yes, it was good old printf. Of course, there is no need for you to
guess; DCC will figure out the vendor, version number, and model for
you.
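
If debug is not to hand, the pattern file from section 5 can also be
produced with a few lines of code. The following is a minimal sketch
under stated assumptions: the program image is already loaded into
memory, the offset is a plain offset into that image (not a debug
segment:offset address), and the names kPatLen, extractPattern and
writePattern are invented for this example. The bytes are copied
raw, with no wildcarding applied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

const std::size_t kPatLen = 23;  // signatures are 23 (hex 17) bytes long

// Copy exactly kPatLen bytes starting at offset; fail if the image is too short.
bool extractPattern(const std::vector<uint8_t> &image, std::size_t offset,
                    std::vector<uint8_t> &out) {
    if (offset > image.size() || image.size() - offset < kPatLen)
        return false;
    std::vector<uint8_t>::const_iterator first =
        image.begin() + static_cast<std::ptrdiff_t>(offset);
    out.assign(first, first + static_cast<std::ptrdiff_t>(kPatLen));
    return true;
}

// Write the pattern as a raw binary file, like debug's "w" command does.
bool writePattern(const char *path, const std::vector<uint8_t> &pat) {
    std::FILE *fp = std::fopen(path, "wb");
    if (!fp)
        return false;
    bool ok = std::fwrite(pat.data(), 1, pat.size(), fp) == pat.size();
    return std::fclose(fp) == 0 && ok;
}
```

A file written this way is exactly 23 bytes long, which is the
length the text above says the binary file must have.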
287
tools/dispsrch/srchsig.cpp
Normal file
@@ -0,0 +1,287 @@
|
||||
/* Quick program to see if a pattern is in a sig file. Pattern is supplied
|
||||
in a small .bin or .com style file */
|
||||
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <memory.h>
|
||||
#include "perfhlib.h"
|
||||
|
||||
/* statics */
|
||||
byte buf[100];
|
||||
int numKeys; /* Number of hash table entries (keys) */
|
||||
int numVert; /* Number of vertices in the graph (also size of g[]) */
|
||||
int PatLen; /* Size of the keys (pattern length) */
|
||||
int SymLen; /* Max size of the symbols, including null */
|
||||
FILE *f; /* Sig file being read */
|
||||
FILE *fpat; /* Pattern file being read */
|
||||
|
||||
static word *T1base, *T2base; /* Pointers to start of T1, T2 */
|
||||
static word *g; /* g[] */
|
||||
|
||||
#define SYMLEN 16
|
||||
#define PATLEN 23
|
||||
|
||||
typedef struct HT_tag
|
||||
{
|
||||
/* Hash table structure */
|
||||
char htSym[SYMLEN];
|
||||
byte htPat[PATLEN];
|
||||
} HT;
|
||||
|
||||
HT *ht; /* Declare a pointer to a hash table */
|
||||
|
||||
/* prototypes */
|
||||
void grab(int n);
|
||||
word readFileShort(void);
|
||||
void cleanup(void);
|
||||
void fixWildCards(char *buf); /* In fixwild.c */
|
||||
void pattSearch(void);
|
||||
|
||||
|
||||
int
|
||||
main(int argc, char *argv[])
|
||||
{
|
||||
word w, len;
|
||||
int h, i;
|
||||
int patlen;
|
||||
|
||||
if (argc <= 2)
|
||||
{
|
||||
printf("Usage: srchsig <SigFilename> <PattFilename>\n");
|
||||
printf("Searches the signature file for the given pattern\n");
|
||||
printf("e.g. %s dccm8s.sig mypatt.bin\n", argv[0]);
|
||||
exit(1);
|
||||
}
|
||||
|
||||
if ((f = fopen(argv[1], "rb")) == NULL)
|
||||
{
|
||||
printf("Cannot open signature file %s\n", argv[1]);
|
||||
exit(2);
|
||||
}
|
||||
|
||||
if ((fpat = fopen(argv[2], "rb")) == NULL)
|
||||
{
|
||||
printf("Cannot open pattern file %s\n", argv[2]);
|
||||
exit(2);
|
||||
}
|
||||
|
||||
/* Read the parameters */
|
||||
grab(4);
|
||||
if (memcmp("dccs", buf, 4) != 0)
|
||||
{
|
||||
printf("Not a dccs file!\n");
|
||||
exit(3);
|
||||
}
|
||||
numKeys = readFileShort();
|
||||
numVert = readFileShort();
|
||||
PatLen = readFileShort();
|
||||
SymLen = readFileShort();
|
||||
|
||||
/* Initialise the perfhlib stuff. Also allocates T1, T2, g, etc */
|
||||
hashParams( /* Set the parameters for the hash table */
|
||||
numKeys, /* The number of symbols */
|
||||
PatLen, /* The length of the pattern to be hashed */
|
||||
256, /* The character set of the pattern (0-FF) */
|
||||
0, /* Minimum pattern character value */
|
||||
numVert); /* Specifies C, the sparseness of the graph.
|
||||
See Czech, Havas and Majewski for details
|
||||
*/
|
||||
|
||||
T1base = readT1();
|
||||
T2base = readT2();
|
||||
g = readG();
|
||||
|
||||
/* Read T1 and T2 tables */
|
||||
grab(2);
|
||||
if (memcmp("T1", buf, 2) != 0)
|
||||
{
|
||||
printf("Expected 'T1'\n");
|
||||
exit(3);
|
||||
}
|
||||
len = PatLen * 256 * sizeof(word);
|
||||
w = readFileShort();
|
||||
if (w != len)
|
||||
{
|
||||
printf("Problem with size of T1: file %d, calc %d\n", w, len);
|
||||
exit(4);
|
||||
}
|
||||
if (fread(T1base, 1, len, f) != len)
|
||||
{
|
||||
printf("Could not read T1\n");
|
||||
exit(5);
|
||||
}
|
||||
|
||||
grab(2);
|
||||
if (memcmp("T2", buf, 2) != 0)
|
||||
{
|
||||
printf("Expected 'T2'\n");
|
||||
exit(3);
|
||||
}
|
||||
w = readFileShort();
|
||||
if (w != len)
|
||||
{
|
||||
printf("Problem with size of T2: file %d, calc %d\n", w, len);
|
||||
exit(4);
|
||||
}
|
||||
if (fread(T2base, 1, len, f) != len)
|
||||
{
|
||||
printf("Could not read T2\n");
|
||||
exit(5);
|
||||
}
|
||||
|
||||
/* Now read the function g[] */
|
||||
grab(2);
|
||||
if (memcmp("gg", buf, 2) != 0)
|
||||
{
|
||||
printf("Expected 'gg'\n");
|
||||
exit(3);
|
||||
}
|
||||
len = numVert * sizeof(word);
|
||||
w = readFileShort();
|
||||
if (w != len)
|
||||
{
|
||||
printf("Problem with size of g[]: file %d, calc %d\n", w, len);
|
||||
exit(4);
|
||||
}
|
||||
if (fread(g, 1, len, f) != len)
|
||||
{
|
||||
printf("Could not read g[]\n");
|
||||
exit(5);
|
||||
}
|
||||
|
||||
|
||||
/* This is now the hash table */
|
||||
/* First allocate space for the table */
|
||||
if ((ht = (HT *)malloc(numKeys * sizeof(HT))) == 0)
|
||||
{
|
||||
printf("Could not allocate hash table\n");
|
||||
exit(1);
|
||||
}
|
||||
grab(2);
|
||||
if (memcmp("ht", buf, 2) != 0)
|
||||
{
|
||||
printf("Expected 'ht'\n");
|
||||
exit(3);
|
||||
}
|
||||
w = readFileShort();
|
||||
if (w != numKeys * (SymLen + PatLen + sizeof(word)))
|
||||
{
|
||||
printf("Problem with size of hash table: file %d, calc %d\n", w, (int)(numKeys * (SymLen + PatLen + sizeof(word))));
|
||||
exit(6);
|
||||
}
|
||||
|
||||
for (i=0; i < numKeys; i++)
|
||||
{
|
||||
if ((int)fread(&ht[i], 1, SymLen + PatLen, f) != SymLen + PatLen)
|
||||
{
|
||||
printf("Could not read\n");
|
||||
exit(11);
|
||||
}
|
||||
}
|
||||
|
||||
/* Read the pattern to buf */
|
||||
if ((patlen = fread(buf, 1, 100, fpat)) == 0)
|
||||
{
|
||||
printf("Could not read pattern\n");
|
||||
exit(11);
|
||||
}
|
||||
if (patlen != PATLEN)
|
||||
{
|
||||
printf("Error: pattern length is %d, should be %d\n", patlen, PATLEN);
|
||||
exit(12);
|
||||
}
|
||||
|
||||
/* Fix the wildcards */
|
||||
fixWildCards(buf);
|
||||
|
||||
printf("Pattern:\n");
|
||||
for (i=0; i < PATLEN; i++)
|
||||
printf("%02X ", buf[i]);
|
||||
printf("\n");
|
||||
|
||||
|
||||
h = hash(buf);
|
||||
printf("Pattern hashed to %d (0x%X), symbol %s\n", h, h, ht[h].htSym);
|
||||
if (memcmp(ht[h].htPat, buf, PATLEN) == 0)
|
||||
{
|
||||
printf("Pattern matched\n");
|
||||
}
|
||||
else
|
||||
{
|
||||
printf("Pattern mismatch: found following pattern\n");
|
||||
for (i=0; i < PATLEN; i++)
|
||||
printf("%02X ", ht[h].htPat[i]);
|
||||
printf("\n");
|
||||
pattSearch(); /* Look for it the hard way */
|
||||
}
|
||||
cleanup();
|
||||
free(ht);
|
||||
fclose(f);
|
||||
fclose(fpat);
|
||||
|
||||
}
|
||||
|
||||
void pattSearch(void)
|
||||
{
|
||||
int i;
|
||||
|
||||
for (i=0; i < numKeys; i++)
|
||||
{
|
||||
if ((i % 100) == 0) printf("\r%d ", i);
|
||||
if (memcmp(ht[i].htPat, buf, PATLEN) == 0)
|
||||
{
|
||||
printf("\nPattern matched offset %d (0x%X)\n", i, i);
|
||||
}
|
||||
}
|
||||
printf("\n");
|
||||
}
|
||||
|
||||
|
||||
void
|
||||
cleanup(void)
|
||||
{
|
||||
/* Free the storage for variable sized tables etc */
|
||||
if (T1base) free(T1base);
|
||||
if (T2base) free(T2base);
|
||||
if (g) free(g);
|
||||
}
|
||||
|
||||
void grab(int n)
|
||||
{
|
||||
if (fread(buf, 1, n, f) != (size_t)n)
|
||||
{
|
||||
printf("Could not read\n");
|
||||
exit(11);
|
||||
}
|
||||
}
|
||||
|
||||
word
|
||||
readFileShort(void)
|
||||
{
|
||||
byte b1, b2;
|
||||
|
||||
if (fread(&b1, 1, 1, f) != 1)
|
||||
{
|
||||
printf("Could not read\n");
|
||||
exit(11);
|
||||
}
|
||||
if (fread(&b2, 1, 1, f) != 1)
|
||||
{
|
||||
printf("Could not read\n");
|
||||
exit(11);
|
||||
}
|
||||
return (b2 << 8) + b1;
|
||||
}
|
||||
|
||||
/* Following two functions not needed unless creating tables */
|
||||
|
||||
void getKey(int i, byte **keys)
|
||||
{
|
||||
}
|
||||
|
||||
/* Display key i */
|
||||
void
|
||||
dispKey(int i)
|
||||
{
|
||||
}
|
||||
|
||||
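The header check at the top of srchsig.c (the "dccs" magic followed by four 16-bit words, each read low byte first by readFileShort()) can be sketched against an in-memory buffer. This is only an illustration under stated assumptions: `SigHeader`, `le16` and `parse_sig_header` are hypothetical names that do not appear in the source, and the sketch covers the 12-byte header only, not the T1/T2/g/ht sections.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the four header words that srchsig.c reads. */
typedef struct {
    uint16_t numKeys;   /* number of symbols */
    uint16_t numVert;   /* number of graph vertices */
    uint16_t patLen;    /* pattern length stored in the file */
    uint16_t symLen;    /* symbol name length stored in the file */
} SigHeader;

/* Decode a 16-bit value stored low byte first, as readFileShort() does. */
static uint16_t le16(const unsigned char *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the buffer is too short or the magic is wrong. */
static int parse_sig_header(const unsigned char *data, size_t size, SigHeader *out)
{
    assert(data != NULL && out != NULL);
    if (size < 12 || memcmp(data, "dccs", 4) != 0)
        return -1;
    out->numKeys = le16(data + 4);
    out->numVert = le16(data + 6);
    out->patLen  = le16(data + 8);
    out->symLen  = le16(data + 10);
    return 0;
}
```

Parsing from a buffer rather than a FILE keeps the sketch easy to exercise in isolation; the tool itself reads directly from the file and exits on the first error.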
14
tools/dispsrch/srchsig.mak
Normal file
@@ -0,0 +1,14 @@
|
||||
CFLAGS = -Zi -c -AL -W3 -D__MSDOS__
|
||||
|
||||
srchsig.exe: srchsig.obj perfhlib.obj fixwild.obj
|
||||
link /CO srchsig perfhlib fixwild;
|
||||
|
||||
srchsig.obj: srchsig.c dcc.h perfhlib.h
|
||||
cl $(CFLAGS) $*.c
|
||||
|
||||
perfhlib.obj: perfhlib.c dcc.h perfhlib.h
|
||||
cl $(CFLAGS) $*.c
|
||||
|
||||
fixwild.obj: fixwild.c dcc.h
|
||||
cl $(CFLAGS) $*.c
|
||||
|
||||
11
tools/makedsig/CMakeLists.txt
Normal file
@@ -0,0 +1,11 @@
|
||||
set(SRC
|
||||
makedsig.cpp
|
||||
fixwild.cpp
|
||||
LIB_PatternCollector.cpp
|
||||
LIB_PatternCollector.h
|
||||
TPL_PatternCollector.cpp
|
||||
TPL_PatternCollector.h
|
||||
)
|
||||
add_executable(makedsig ${SRC})
|
||||
target_link_libraries(makedsig dcc_hash)
|
||||
qt5_use_modules(makedsig Core)
|
||||
237
tools/makedsig/LIB_PatternCollector.cpp
Normal file
@@ -0,0 +1,237 @@
|
||||
#include "LIB_PatternCollector.h"
|
||||
#include <cstring>
|
||||
#include <algorithm>
|
||||
/** \note there is an untested assumption that the *first* segment definition
|
||||
with class CODE will be the one containing all useful functions in the
|
||||
LEDATA records. Functions such as _exit() have more than one segment
|
||||
declared with class CODE (MSC8 libraries) */
|
||||
|
||||
extern void fixWildCards(uint8_t pat[]);
|
||||
void readNN(int n, FILE *fl)
|
||||
{
|
||||
if (fseek(fl, (long)n, SEEK_CUR) != 0)
|
||||
{
|
||||
printf("Could not seek file\n");
|
||||
exit(2);
|
||||
}
|
||||
}
|
||||
|
||||
void LIB_PatternCollector::readString(FILE *fl)
|
||||
{
|
||||
uint8_t len;
|
||||
|
||||
len = readByte(fl);
|
||||
if (fread(buf, 1, len, fl) != len)
|
||||
{
|
||||
printf("Could not read string len %d\n", len);
|
||||
exit(2);
|
||||
}
|
||||
buf[len] = '\0';
|
||||
offset += len;
|
||||
}
|
||||
|
||||
int LIB_PatternCollector::readSyms(FILE *fl)
|
||||
{
|
||||
int i;
|
||||
int count = 0;
|
||||
int firstSym = 0; /* First symbol this module */
|
||||
uint8_t b, c, type;
|
||||
uint16_t w, len;
|
||||
|
||||
codeLNAMES = NONE; /* Invalidate indexes for code segment */
|
||||
codeSEGDEF = NONE; /* Else won't be assigned */
|
||||
|
||||
offset = 0; /* For diagnostics, really */
|
||||
|
||||
if ((leData = (uint8_t *)malloc(0xFF80)) == 0)
|
||||
{
|
||||
printf("Could not malloc 64k bytes for LEDATA\n");
|
||||
exit(10);
|
||||
}
|
||||
|
||||
while (!feof(fl))
|
||||
{
|
||||
type = readByte(fl);
|
||||
len = readWord(fl);
|
||||
/* Note: uncommenting the following generates a *lot* of output */
|
||||
/*printf("Offset %05lX: type %02X len %d\n", offset-3, type, len);//*/
|
||||
switch (type)
|
||||
{
|
||||
|
||||
case 0x96: /* LNAMES */
|
||||
while (len > 1)
|
||||
{
|
||||
readString(fl);
|
||||
++lnum;
|
||||
if (strcmp((char *)buf, "CODE") == 0)
|
||||
{
|
||||
/* This is the class name we're looking for */
|
||||
codeLNAMES= lnum;
|
||||
}
|
||||
len -= strlen((char *)buf)+1;
|
||||
}
|
||||
b = readByte(fl); /* Checksum */
|
||||
break;
|
||||
|
||||
case 0x98: /* Segment definition */
|
||||
b = readByte(fl); /* Segment attributes */
|
||||
if ((b & 0xE0) == 0)
|
||||
{
|
||||
/* Alignment field is zero. Frame and offset follow */
|
||||
readWord(fl);
|
||||
readByte(fl);
|
||||
}
|
||||
|
||||
w = readWord(fl); /* Segment length */
|
||||
|
||||
b = readByte(fl); /* Segment name index */
|
||||
++segnum;
|
||||
|
||||
b = readByte(fl); /* Class name index */
|
||||
if ((b == codeLNAMES) && (codeSEGDEF == NONE))
|
||||
{
|
||||
/* This is the segment defining the code class */
|
||||
codeSEGDEF = segnum;
|
||||
}
|
||||
|
||||
b = readByte(fl); /* Overlay index */
|
||||
b = readByte(fl); /* Checksum */
|
||||
break;
|
||||
|
||||
case 0x90: /* PUBDEF: public symbols */
|
||||
b = readByte(fl); /* Base group */
|
||||
c = readByte(fl); /* Base segment */
|
||||
len -= 2;
|
||||
if (c == 0)
|
||||
{
|
||||
w = readWord(fl);
|
||||
len -= 2;
|
||||
}
|
||||
while (len > 1)
|
||||
{
|
||||
readString(fl);
|
||||
w = readWord(fl); /* Offset */
|
||||
b = readByte(fl); /* Type index */
|
||||
if (c == codeSEGDEF)
|
||||
{
|
||||
char *p;
|
||||
HASHENTRY entry;
|
||||
p = (char *)buf;
|
||||
if (buf[0] == '_') /* Leading underscore? */
|
||||
{
|
||||
p++; /* Yes, remove it*/
|
||||
}
|
||||
i = std::min(size_t(SYMLEN-1), strlen(p));
|
||||
memcpy(entry.name, p, i);
|
||||
entry.name[i] = '\0';
|
||||
entry.offset = w;
|
||||
/*printf("%04X: %s is sym #%d\n", w, keys[count].name, count);//*/
|
||||
keys.push_back(entry);
|
||||
count++;
|
||||
}
|
||||
len -= strlen((char *)buf) + 1 + 2 + 1;
|
||||
}
|
||||
b = readByte(fl); /* Checksum */
|
||||
break;
|
||||
|
||||
|
||||
case 0xA0: /* LEDATA */
|
||||
{
|
||||
b = readByte(fl); /* Segment index */
|
||||
w = readWord(fl); /* Offset */
|
||||
len -= 3;
|
||||
/*printf("LEDATA seg %d off %02X len %Xh, looking for %d\n", b, w, len-1, codeSEGDEF);//*/
|
||||
|
||||
if (b != codeSEGDEF)
|
||||
{
|
||||
readNN(len,fl); /* Skip the data */
|
||||
break; /* Next record */
|
||||
}
|
||||
|
||||
|
||||
if (fread(&leData[w], 1, len-1, fl) != len-1)
|
||||
{
|
||||
printf("Could not read LEDATA length %d\n", len-1);
|
||||
exit(2);
|
||||
}
|
||||
offset += len-1;
|
||||
maxLeData = std::max<uint16_t>(maxLeData, w+len-1);
|
||||
|
||||
readByte(fl); /* Checksum */
|
||||
break;
|
||||
}
|
||||
|
||||
default:
|
||||
readNN(len,fl); /* Just skip the lot */
|
||||
|
||||
if (type == 0x8A) /* Mod end */
|
||||
{
|
||||
/* Now find all the patterns for public code symbols that
|
||||
we have found */
|
||||
for (i=firstSym; i < count; i++)
|
||||
{
|
||||
uint16_t off = keys[i].offset;
|
||||
if (off == (uint16_t)-1)
|
||||
{
|
||||
continue; /* Ignore if already done */
|
||||
}
|
||||
if (keys[i].offset > maxLeData)
|
||||
{
|
||||
printf(
|
||||
"Warning: no LEDATA for symbol #%d %s "
|
||||
"(offset %04X, max %04X)\n",
|
||||
i, keys[i].name, off, maxLeData);
|
||||
/* To make things consistent, we set the pattern for
|
||||
this symbol to nulls */
|
||||
memset(&keys[i].pat, 0, PATLEN);
|
||||
continue;
|
||||
}
|
||||
/* Copy to temp buffer so don't overrun later patterns.
|
||||
(e.g. when chopping a short pattern).
|
||||
Beware of short patterns! */
|
||||
if (off+PATLEN <= maxLeData)
|
||||
{
|
||||
/* Available pattern is >= PATLEN */
|
||||
memcpy(buf, &leData[off], PATLEN);
|
||||
}
|
||||
else
|
||||
{
|
||||
/* Short! Only copy what is available (and malloced!) */
|
||||
memcpy(buf, &leData[off], maxLeData-off);
|
||||
/* Set rest to zeroes */
|
||||
memset(&buf[maxLeData-off], 0, PATLEN-(maxLeData-off));
|
||||
}
|
||||
fixWildCards((uint8_t *)buf);
|
||||
/* Save into the hash entry. */
|
||||
memcpy(keys[i].pat, buf, PATLEN);
|
||||
keys[i].offset = (uint16_t)-1; // Flag it as done
|
||||
//printf("Saved pattern for %s\n", keys[i].name);
|
||||
}
|
||||
|
||||
|
||||
while (readByte(fl) == 0);
|
||||
readNN(-1,fl); /* Unget the last byte (= type) */
|
||||
lnum = 0; /* Reset index into lnames */
|
||||
segnum = 0; /* Reset index into snames */
|
||||
firstSym = count; /* Remember index of first sym this mod */
|
||||
codeLNAMES = NONE; /* Invalidate indexes for code segment */
|
||||
codeSEGDEF = NONE;
|
||||
memset(leData, 0, maxLeData); /* Clear out old junk */
|
||||
maxLeData = 0; /* No data read this module */
|
||||
}
|
||||
|
||||
else if (type == 0xF1)
|
||||
{
|
||||
/* Library end record */
|
||||
return count;
|
||||
}
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
free(leData);
|
||||
keys.clear();
|
||||
|
||||
return count;
|
||||
}
|
||||
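The record loop in readSyms() relies on every record in the .lib file having the same framing: a type byte, a 16-bit little-endian length, then that many bytes, the last of which is the checksum. A minimal sketch of that framing over an in-memory buffer follows. `OmfRecord` and `next_record` are hypothetical names introduced for illustration, and the sketch does not interpret record bodies or skip the zero padding that the source handles after a module end record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one record: header fields plus a pointer to its body. */
typedef struct {
    uint8_t type;               /* e.g. 0x96 LNAMES, 0x98 SEGDEF, 0x90 PUBDEF, 0xA0 LEDATA */
    uint16_t len;               /* bytes after the 3-byte header, checksum included */
    const unsigned char *body;  /* points at the first byte after the header */
} OmfRecord;

/* Decode the record at data[*pos] and advance *pos past it.
   Returns 1 on success, 0 if the buffer ends before the record does. */
static int next_record(const unsigned char *data, size_t size, size_t *pos, OmfRecord *rec)
{
    assert(data != NULL && pos != NULL && rec != NULL);
    if (*pos > size || size - *pos < 3)
        return 0;                               /* no room for type + length */
    rec->type = data[*pos];
    rec->len  = (uint16_t)(data[*pos + 1] | (data[*pos + 2] << 8));
    if (size - *pos - 3 < rec->len)
        return 0;                               /* body is truncated */
    rec->body = data + *pos + 3;
    *pos += 3u + rec->len;
    return 1;
}
```

Returning a failure code on truncation, instead of testing feof() at the top of the loop as the source does, is a design choice of this sketch and not a description of the tool.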
25
tools/makedsig/LIB_PatternCollector.h
Normal file
@@ -0,0 +1,25 @@
|
||||
#pragma once
|
||||
|
||||
#include "PatternCollector.h"
|
||||
|
||||
struct LIB_PatternCollector : public PatternCollector
|
||||
{
|
||||
protected:
|
||||
unsigned long offset;
|
||||
uint8_t lnum = 0; /* Count of LNAMES so far */
|
||||
uint8_t segnum = 0; /* Count of SEGDEFs so far */
|
||||
uint8_t codeLNAMES; /* Index of the LNAMES for "CODE" class */
|
||||
uint8_t codeSEGDEF; /* Index of the first SEGDEF that has class CODE */
|
||||
#define NONE 0xFF /* Improbable segment index */
|
||||
uint8_t *leData; /* Pointer to 64K of alloc'd data. Some .lib files
|
||||
have the symbols (PUBDEFs) *after* the data
|
||||
(LEDATA), so you need to keep the data here */
|
||||
uint16_t maxLeData = 0; /* How much data we have in there */
|
||||
/* read a length then string to buf[]; make it an asciiz string */
|
||||
void readString( FILE *fl);
|
||||
|
||||
public:
|
||||
/* Read the .lib file, and put the keys into the array *keys[]. Returns the count */
|
||||
int readSyms(FILE *fl);
|
||||
|
||||
};
|
||||
300
tools/makedsig/TPL_PatternCollector.cpp
Normal file
@@ -0,0 +1,300 @@
|
||||
#include "TPL_PatternCollector.h"
|
||||
#include <cstring>
|
||||
|
||||
/** \note Fundamental problem: there seems to be no information linking the names
|
||||
in the system unit ("V" category) with their routines, except trial and
|
||||
error. I have entered a few. There is no guarantee that the same pmap
|
||||
offset will map to the same routine in all versions of turbo.tpl. They
|
||||
seem to match so far in version 4 and 5.0 */
|
||||
|
||||
|
||||
#define roundUp(w) (((w) + 0x0F) & 0xFFF0)
|
||||
extern void fixWildCards(uint8_t pat[]);
|
||||
void TPL_PatternCollector::enterSym(FILE *f, const char *name, uint16_t pmapOffset)
|
||||
{
|
||||
uint16_t pm, cm, codeOffset, pcode;
|
||||
uint16_t j;
|
||||
|
||||
/* Enter a symbol with given name */
|
||||
allocSym(count);
|
||||
strcpy(keys[count].name, name);
|
||||
pm = pmap + pmapOffset; /* Pointer to the 4 byte pmap structure */
|
||||
fseek(f, unitBase+pm, SEEK_SET);/* Go there */
|
||||
cm = readShort(f); /* CSeg map offset */
|
||||
codeOffset = readShort(f); /* How far into the code segment is our rtn */
|
||||
j = cm / 8; /* Index into the cmap array */
|
||||
pcode = csegBase+csegoffs[j]+codeOffset;
|
||||
fseek(f, unitBase+pcode, SEEK_SET); /* Go there */
|
||||
grab(f,PATLEN); /* Grab the pattern to buf[] */
|
||||
fixWildCards(buf); /* Fix the wild cards */
|
||||
memcpy(keys[count].pat, buf, PATLEN); /* Copy to the key array */
|
||||
count++; /* Done one more */
|
||||
}
|
||||
|
||||
void TPL_PatternCollector::allocSym(int count)
|
||||
{
|
||||
keys.resize(count + 1); /* Make room for keys[count], which enterSym() writes next */
|
||||
}
|
||||
|
||||
void TPL_PatternCollector::readCmapOffsets(FILE *f)
|
||||
{
|
||||
uint16_t cumsize, csize;
|
||||
uint16_t i;
|
||||
|
||||
/* Read the cmap table to find the start address of each segment */
|
||||
fseek(f, unitBase+cmap, SEEK_SET);
|
||||
cumsize = 0;
|
||||
csegIdx = 0;
|
||||
for (i=cmap; i < pmap; i+=8)
|
||||
{
|
||||
readShort(f); /* Always 0 */
|
||||
csize = readShort(f);
|
||||
if (csize == 0xFFFF) continue; /* Ignore the first one... unit init */
|
||||
csegoffs[csegIdx++] = cumsize;
|
||||
cumsize += csize;
|
||||
grab(f,4);
|
||||
}
|
||||
}
|
||||
|
||||
void TPL_PatternCollector::enterSystemUnit(FILE *f)
|
||||
{
|
||||
/* The system unit is special. The association between keywords and
|
||||
pmap entries is not stored in the .tpl file (as far as I can tell).
|
||||
So we hope that they are constant pmap entries.
|
||||
*/
|
||||
|
||||
fseek(f, 0x0C, SEEK_SET);
|
||||
cmap = readShort(f);
|
||||
pmap = readShort(f);
|
||||
fseek(f, offStCseg, SEEK_SET);
|
||||
csegBase = roundUp(readShort(f)); /* Round up to next 16 bdry */
|
||||
printf("CMAP table at %04X\n", cmap);
|
||||
printf("PMAP table at %04X\n", pmap);
|
||||
printf("Code seg base %04X\n", csegBase);
|
||||
|
||||
readCmapOffsets(f);
|
||||
|
||||
enterSym(f,"INITIALISE", 0x04);
|
||||
enterSym(f,"UNKNOWN008", 0x08);
|
||||
enterSym(f,"EXIT", 0x0C);
|
||||
enterSym(f,"BlockMove", 0x10);
|
||||
unknown(f,0x14, 0xC8);
|
||||
enterSym(f,"PostIO", 0xC8);
|
||||
enterSym(f,"UNKNOWN0CC", 0xCC);
|
||||
enterSym(f,"STACKCHK", 0xD0);
|
||||
enterSym(f,"UNKNOWN0D4", 0xD4);
|
||||
enterSym(f,"WriteString", 0xD8);
|
||||
enterSym(f,"WriteInt", 0xDC);
|
||||
enterSym(f,"UNKNOWN0E0", 0xE0);
|
||||
enterSym(f,"UNKNOWN0E4", 0xE4);
|
||||
enterSym(f,"CRLF", 0xE8);
|
||||
enterSym(f,"UNKNOWN0EC", 0xEC);
|
||||
enterSym(f,"UNKNOWN0F0", 0xF0);
|
||||
enterSym(f,"UNKNOWN0F4", 0xF4);
|
||||
enterSym(f,"ReadEOL", 0xF8);
|
||||
enterSym(f,"Read", 0xFC);
|
||||
enterSym(f,"UNKNOWN100", 0x100);
|
||||
enterSym(f,"UNKNOWN104", 0x104);
|
||||
enterSym(f,"PostWrite", 0x108);
|
||||
enterSym(f,"UNKNOWN10C", 0x10C);
|
||||
enterSym(f,"Randomize", 0x110);
|
||||
unknown(f,0x114, 0x174);
|
||||
enterSym(f,"Random", 0x174);
|
||||
unknown(f,0x178, 0x1B8);
|
||||
enterSym(f,"FloatAdd", 0x1B8); /* A guess! */
|
||||
enterSym(f,"FloatSub", 0x1BC); /* disicx - dxbxax -> dxbxax*/
|
||||
enterSym(f,"FloatMult", 0x1C0); /* disicx * dxbxax -> dxbxax*/
|
||||
enterSym(f,"FloatDivide", 0x1C4); /* disicx / dxbxax -> dxbxax*/
|
||||
enterSym(f,"UNKNOWN1C8", 0x1C8);
|
||||
enterSym(f,"DoubleToFloat",0x1CC); /* dxax to dxbxax */
|
||||
enterSym(f,"UNKNOWN1D0", 0x1D0);
|
||||
enterSym(f,"WriteFloat", 0x1DC);
|
||||
unknown(f,0x1E0, 0x200);
|
||||
|
||||
}
|
||||
|
||||
void TPL_PatternCollector::readString(FILE *f)
|
||||
{
|
||||
uint8_t len;
|
||||
|
||||
len = readByte(f);
|
||||
grab(f,len);
|
||||
buf[len] = '\0';
|
||||
}
|
||||
|
||||
void TPL_PatternCollector::unknown(FILE *f, unsigned j, unsigned k)
|
||||
{
|
||||
/* Mark calls j to k (not inclusive) as unknown */
|
||||
unsigned i;
|
||||
|
||||
for (i=j; i < k; i+= 4)
|
||||
{
|
||||
sprintf((char *)buf, "UNKNOWN%03X", i);
|
||||
enterSym(f,(char *)buf, i);
|
||||
}
|
||||
}
|
||||
|
||||
void TPL_PatternCollector::nextUnit(FILE *f)
|
||||
{
|
||||
/* Find the start of the next unit */
|
||||
|
||||
uint16_t dsegBase, sizeSyms, sizeOther1, sizeOther2;
|
||||
|
||||
fseek(f, unitBase+offStCseg, SEEK_SET);
|
||||
dsegBase = roundUp(readShort(f));
|
||||
sizeSyms = roundUp(readShort(f));
|
||||
sizeOther1 = roundUp(readShort(f));
|
||||
sizeOther2 = roundUp(readShort(f));
|
||||
|
||||
unitBase += dsegBase + sizeSyms + sizeOther1 + sizeOther2;
|
||||
|
||||
fseek(f, unitBase, SEEK_SET);
|
||||
if (fread(buf, 1, 4, f) == 4)
|
||||
{
|
||||
buf[4]='\0';
|
||||
printf("Start of unit: found %s\n", buf);
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
void TPL_PatternCollector::setVersionSpecifics()
|
||||
{
|
||||
|
||||
version = buf[3]; /* The x of TPUx */
|
||||
|
||||
switch (version)
|
||||
{
|
||||
case '0': /* Version 4.0 */
|
||||
offStCseg = 0x14; /* Offset to the LL giving the Cseg start */
|
||||
charProc = 'T'; /* Indicates a proc in the dictionary */
|
||||
charFunc = 'U'; /* Indicates a function in the dictionary */
|
||||
skipPmap = 6; /* Bytes to skip after Func to get pmap offset */
|
||||
break;
|
||||
|
||||
|
||||
case '5': /* Version 5.0 */
|
||||
offStCseg = 0x18; /* Offset to the LL giving the Cseg start */
|
||||
charProc = 'T'; /* Indicates a proc in the dictionary */
|
||||
charFunc = 'U'; /* Indicates a function in the dictionary */
|
||||
skipPmap = 1; /* Bytes to skip after Func to get pmap offset */
|
||||
break;
|
||||
|
||||
default:
|
||||
printf("Unknown version %c!\n", version);
|
||||
exit(1);
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
void TPL_PatternCollector::savePos(FILE *f)
|
||||
{
|
||||
|
||||
if (positionStack.size() >= 20)
|
||||
{
|
||||
printf("Overflowed filePosn array\n");
|
||||
exit(1);
|
||||
}
|
||||
positionStack.push_back(ftell(f));
|
||||
}
|
||||
|
||||
void TPL_PatternCollector::restorePos(FILE *f)
|
||||
{
|
||||
if (positionStack.empty())
|
||||
{
|
||||
printf("Underflowed filePosn array\n");
|
||||
exit(1);
|
||||
}
|
||||
|
||||
fseek(f, positionStack.back(), SEEK_SET);
|
||||
positionStack.pop_back();
|
||||
}
|
||||
|
||||
void TPL_PatternCollector::enterUnitProcs(FILE *f)
|
||||
{
|
||||
|
||||
uint16_t i, LL;
|
||||
uint16_t hash, hsize, dhdr, pmapOff;
|
||||
char cat;
|
||||
char name[40];
|
||||
|
||||
fseek(f, unitBase+0x0C, SEEK_SET);
|
||||
cmap = readShort(f);
|
||||
pmap = readShort(f);
|
||||
fseek(f, unitBase+offStCseg, SEEK_SET);
|
||||
csegBase = roundUp(readShort(f)); /* Round up to next 16 bdry */
|
||||
printf("CMAP table at %04X\n", cmap);
|
||||
printf("PMAP table at %04X\n", pmap);
|
||||
printf("Code seg base %04X\n", csegBase);
|
||||
|
||||
readCmapOffsets(f);
|
||||
|
||||
fseek(f, unitBase+pmap, SEEK_SET); /* Go to first pmap entry */
|
||||
if (readShort(f) != 0xFFFF) /* FFFF means none */
|
||||
{
|
||||
sprintf(name, "UNIT_INIT_%d", ++unitNum);
|
||||
enterSym(f,name, 0); /* This is the unit init code */
|
||||
}
|
||||
|
||||
fseek(f, unitBase+0x0A, SEEK_SET);
|
||||
hash = readShort(f);
|
||||
//printf("Hash table at %04X\n", hash);
|
||||
fseek(f, unitBase+hash, SEEK_SET);
|
||||
hsize = readShort(f);
|
||||
//printf("Hash table size %04X\n", hsize);
|
||||
for (i=0; i <= hsize; i+= 2)
|
||||
{
|
||||
dhdr = readShort(f);
|
||||
if (dhdr)
|
||||
{
|
||||
savePos(f);
|
||||
fseek(f, unitBase+dhdr, SEEK_SET);
|
||||
do
|
||||
{
|
||||
LL = readShort(f);
|
||||
readString(f);
|
||||
strcpy(name, (char *)buf);
|
||||
cat = readByte(f);
|
||||
if ((cat == charProc) || (cat == charFunc))
|
||||
{
|
||||
grab(f,skipPmap); /* Skip to the pmap */
|
||||
pmapOff = readShort(f); /* pmap offset */
|
||||
printf("pmap offset for %13s: %04X\n", name, pmapOff);
|
||||
enterSym(f,name, pmapOff);
|
||||
}
|
||||
//printf("%13s %c ", name, cat);
|
||||
if (LL)
|
||||
{
|
||||
//printf("LL seek to %04X\n", LL);
|
||||
fseek(f, unitBase+LL, SEEK_SET);
|
||||
}
|
||||
} while (LL);
|
||||
restorePos(f);
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
int TPL_PatternCollector::readSyms(FILE *f)
|
||||
{
|
||||
grab(f,4);
|
||||
if ((strncmp((char *)buf, "TPU0", 4) != 0) && ((strncmp((char *)buf, "TPU5", 4) != 0)))
|
||||
{
|
||||
printf("Not a Turbo Pascal version 4 or 5 library file\n");
|
||||
fclose(f);
|
||||
exit(1);
|
||||
}
|
||||
|
||||
setVersionSpecifics();
|
||||
|
||||
unitBase = 0; /* Must be set first: enterSystemUnit() reads relative to unitBase */
|
||||
enterSystemUnit(f);
|
||||
do
|
||||
{
|
||||
nextUnit(f);
|
||||
if (feof(f)) break;
|
||||
enterUnitProcs(f);
|
||||
} while (1);
|
||||
|
||||
return count;
|
||||
}
|
||||
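nextUnit() finds the following unit by rounding four size words up to a 16-byte boundary and adding them to the current base. The arithmetic can be sketched as below. `round_up16` and `unit_span` are hypothetical names, and the 32-bit sum is an assumption of this sketch: the source accumulates into a 16-bit `unitBase`, which wraps for offsets beyond 64K.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round w up to the next multiple of 16, matching the roundUp() macro.
   Like the macro, the result is truncated to 16 bits, so values above
   0xFFF0 wrap to 0. */
static uint16_t round_up16(uint16_t w)
{
    return (uint16_t)((w + 0x0F) & 0xFFF0);
}

/* Sum of the four rounded size words that nextUnit() reads at offStCseg.
   A 32-bit total is used here so that the sum itself cannot wrap. */
static uint32_t unit_span(const uint16_t sizes[4])
{
    uint32_t total = 0;
    size_t i;

    assert(sizes != NULL);
    for (i = 0; i < 4; i++)
        total += round_up16(sizes[i]);
    return total;
}
```

With sizes of 0x123, 0x10, 0x01 and 0, the rounded values are 0x130, 0x10, 0x10 and 0, so the next unit would start 0x150 bytes after the current one.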
38
tools/makedsig/TPL_PatternCollector.h
Normal file
@@ -0,0 +1,38 @@
|
||||
#ifndef TPL_PATTERNCOLLECTOR_H
|
||||
#define TPL_PATTERNCOLLECTOR_H
|
||||
#include "PatternCollector.h"
|
||||
|
||||
#include <stdio.h>
|
||||
#include <stdint.h>
|
||||
#include <vector>
|
||||
|
||||
struct TPL_PatternCollector : public PatternCollector {
|
||||
protected:
|
||||
uint16_t cmap, pmap, csegBase, unitBase;
|
||||
uint16_t offStCseg, skipPmap;
|
||||
int count = 0;
|
||||
int cAllocSym = 0;
|
||||
int unitNum = 0;
|
||||
char version, charProc, charFunc;
|
||||
uint16_t csegoffs[100];
|
||||
uint16_t csegIdx;
|
||||
std::vector<long int> positionStack;
|
||||
|
||||
void enterSym(FILE *f,const char *name, uint16_t pmapOffset);
|
||||
void allocSym(int count);
|
||||
void readCmapOffsets(FILE *f);
|
||||
void enterSystemUnit(FILE *f);
|
||||
void readString(FILE *f);
|
||||
void unknown(FILE *f,unsigned j, unsigned k);
|
||||
void nextUnit(FILE *f);
|
||||
void setVersionSpecifics(void);
|
||||
void savePos(FILE *f);
|
||||
void restorePos(FILE *f);
|
||||
void enterUnitProcs(FILE *f);
|
||||
public:
|
||||
/* Read the .tpl file, and put the keys into the array *keys[]. Returns the count */
|
||||
int readSyms(FILE *f);
|
||||
};
|
||||
|
||||
#endif // TPL_PATTERNCOLLECTOR_H
|
||||
|
||||
525
tools/makedsig/fixwild.cpp
Normal file
@@ -0,0 +1,525 @@
|
||||
/*
|
||||
*$Log: fixwild.c,v $
|
||||
* Revision 1.10 93/10/28 11:10:10 emmerik
|
||||
* Addressing mode [reg+nnnn] is now wildcarded
|
||||
*
|
||||
* Revision 1.9 93/10/26 13:40:11 cifuente
|
||||
* op0F(byte pat[])
|
||||
*
|
||||
* Revision 1.8 93/10/26 13:01:29 emmerik
|
||||
* Completed the odd opcodes, like 0F XX and F7. Result: some library
|
||||
* functions that were not recognised before are recognised now.
|
||||
*
|
||||
* Revision 1.7 93/10/11 11:37:01 cifuente
|
||||
* First walk of HIGH_LEVEL icodes.
|
||||
*
|
||||
* Revision 1.6 93/10/01 14:36:21 emmerik
|
||||
* Added $ log, and made independant of dcc.h
|
||||
*
|
||||
*
|
||||
*/
|
||||
|
||||
/* * * * * * * * * * * * *\
|
||||
* *
|
||||
* Fix Wild Cards Code *
|
||||
* *
|
||||
\* * * * * * * * * * * * */
|
||||
|
||||
#include <string.h>
|
||||
#include <stdint.h>
|
||||
#ifndef PATLEN
|
||||
#define PATLEN 23
|
||||
#define WILD 0xF4
|
||||
#endif
|
||||
|
||||
static int pc; /* Indexes into pat[] */
|
||||
|
||||
/* prototypes */
|
||||
static bool ModRM(uint8_t pat[]); /* Handle the mod/rm byte */
|
||||
static bool TwoWild(uint8_t pat[]); /* Make the next 2 bytes wild */
|
||||
static bool FourWild(uint8_t pat[]); /* Make the next 4 bytes wild */
|
||||
void fixWildCards(uint8_t pat[]); /* Main routine */
|
||||
|
||||
|
||||
/* Handle the mod/rm case. Returns true if pattern exhausted */
|
||||
static bool ModRM(uint8_t pat[])
|
||||
{
|
||||
uint8_t op;
|
||||
|
||||
/* A standard mod/rm byte follows opcode */
|
||||
op = pat[pc++]; /* The mod/rm byte */
|
||||
if (pc >= PATLEN) return true; /* Skip Mod/RM */
|
||||
switch (op & 0xC0)
|
||||
{
|
||||
case 0x00: /* [reg] or [nnnn] */
|
||||
if ((op & 0xC7) == 6)
|
||||
{
|
||||
/* Uses [nnnn] address mode */
|
||||
pat[pc++] = WILD;
|
||||
if (pc >= PATLEN) return true;
|
||||
pat[pc++] = WILD;
|
||||
if (pc >= PATLEN) return true;
|
||||
}
|
||||
break;
|
||||
case 0x40: /* [reg + nn] */
|
||||
if ((pc+=1) >= PATLEN) return true;
|
||||
break;
|
||||
case 0x80: /* [reg + nnnn] */
|
||||
/* Possibly just a long constant offset from a register,
|
||||
but often will be an index from a variable */
|
||||
pat[pc++] = WILD;
|
||||
if (pc >= PATLEN) return true;
|
||||
pat[pc++] = WILD;
|
||||
if (pc >= PATLEN) return true;
|
||||
break;
|
||||
case 0xC0: /* reg */
|
||||
break;
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
/* Change the next two bytes to wild cards */
|
||||
static bool TwoWild(uint8_t pat[])
|
||||
{
|
||||
pat[pc++] = WILD;
|
||||
if (pc >= PATLEN) return true; /* Pattern exhausted */
|
||||
pat[pc++] = WILD;
|
||||
if (pc >= PATLEN) return true;
|
||||
return false;
|
||||
}
|
||||
|
||||
/* Change the next four bytes to wild cards */
|
||||
static bool FourWild(uint8_t pat[])
|
||||
{
|
||||
if (TwoWild(pat)) return true; /* Pattern exhausted; do not write past PATLEN */
|
||||
return TwoWild(pat);
|
||||
}
|
||||
|
||||
/* Chop from the current point by wiping with zeroes. Can't rely on anything
|
||||
after this point */
|
||||
static void chop(uint8_t pat[])
|
||||
{
|
||||
if (pc >= PATLEN) return; /* Could go negative otherwise */
|
||||
memset(&pat[pc], 0, PATLEN - pc);
|
||||
}
|
||||
|
||||
static bool op0F(uint8_t pat[])
|
||||
{
|
||||
/* The two byte opcodes */
|
||||
uint8_t op = pat[pc++];
|
||||
switch (op & 0xF0)
|
||||
{
|
||||
case 0x00: /* 00 - 0F */
|
||||
if (op >= 0x06) /* Clts, Invd, Wbinvd */
|
||||
return false;
|
||||
else
|
||||
{
|
||||
/* Grp 6, Grp 7, LAR, LSL */
|
||||
return ModRM(pat);
|
||||
}
|
||||
case 0x20: /* Various funnies, all with Mod/RM */
|
||||
return ModRM(pat);
|
||||
|
||||
case 0x80:
|
||||
pc += 2; /* Word displacement cond jumps */
|
||||
return false;
|
||||
|
||||
case 0x90: /* Byte set on condition */
|
||||
return ModRM(pat);
|
||||
|
||||
case 0xA0:
|
||||
switch (op)
|
||||
{
|
||||
case 0xA0: /* Push FS */
|
||||
case 0xA1: /* Pop FS */
|
||||
case 0xA8: /* Push GS */
|
||||
case 0xA9: /* Pop GS */
|
||||
return false;
|
||||
|
||||
case 0xA3: /* Bt Ev,Gv */
|
||||
case 0xAB: /* Bts Ev,Gv */
|
||||
return ModRM(pat);
|
||||
|
||||
case 0xA4: /* Shld EvGbIb */
|
||||
case 0xAC: /* Shrd EvGbIb */
|
||||
if (ModRM(pat)) return true;
|
||||
pc++; /* The #num bits to shift */
|
||||
return false;
|
||||
|
||||
case 0xA5: /* Shld EvGb CL */
|
||||
case 0xAD: /* Shrd EvGb CL */
|
||||
return ModRM(pat);
|
||||
|
||||
default: /* CmpXchg, Imul */
|
||||
return ModRM(pat);
|
||||
}
|
||||
|
||||
case 0xB0:
|
||||
if (op == 0xBA)
|
||||
{
|
||||
/* Grp 8: bt/bts/btr/btc Ev,#nn */
|
||||
if (ModRM(pat)) return true;
|
||||
pc++; /* The #num bits to shift */
|
||||
return false;
|
||||
}
|
||||
return ModRM(pat);
|
||||
|
||||
case 0xC0:
|
||||
if (op <= 0xC1)
|
||||
{
|
||||
/* Xadd */
|
||||
return ModRM(pat);
|
||||
}
|
||||
/* Else BSWAP */
|
||||
return false;
|
||||
|
||||
default:
|
||||
return false; /* Treat as double byte opcodes */
|
||||
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
/* Scan through the instructions in pat[], looking for opcodes that may
|
||||
have operands that vary with different instances. For example, load and
|
||||
store from statics, calls to other procs (even relative calls; they may
|
||||
call procs loaded in a different order, etc).
|
||||
Note that this procedure is architecture specific, and assumes the
|
||||
processor is in 16 bit address mode (real mode).
|
||||
PATLEN bytes are scanned.
|
||||
*/
|
||||
void fixWildCards(uint8_t pat[])
|
||||
{
|
||||
|
||||
uint8_t op, quad, intArg;
|
||||
|
||||
|
||||
pc=0;
|
||||
while (pc < PATLEN)
|
||||
{
|
||||
op = pat[pc++];
|
||||
if (pc >= PATLEN) return;
|
||||
|
||||
quad = op & 0xC0; /* Quadrant of the opcode map */
|
||||
if (quad == 0)
|
||||
{
|
||||
/* Arithmetic group 00-3F */
|
||||
|
||||
if ((op & 0xE7) == 0x26) /* First check for the odds */
|
||||
{
|
||||
/* Segment prefix: treat as 1 byte opcode */
|
||||
continue;
|
||||
}
|
||||
if (op == 0x0F) /* 386 2 byte opcodes */
|
||||
{
|
||||
if (op0F(pat)) return;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (op & 0x04)
|
||||
{
|
||||
/* All these are constant. Work out the instr length */
|
||||
if (op & 2)
|
||||
{
|
||||
/* Push, pop, other 1 byte opcodes */
|
||||
continue;
|
||||
}
|
||||
else
|
||||
{
|
||||
if (op & 1)
|
||||
{
|
||||
/* Word immediate operands */
|
||||
pc += 2;
|
||||
continue;
|
||||
}
|
||||
else
|
||||
{
|
||||
/* Byte immediate operands */
|
||||
pc++;
|
||||
continue;
|
||||
}
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
/* All these have mod/rm bytes */
|
||||
if (ModRM(pat)) return;
|
||||
continue;
|
||||
}
|
||||
}
|
||||
else if (quad == 0x40)
|
||||
{
|
||||
if ((op & 0x60) == 0x40)
|
||||
{
|
||||
/* 0x40 - 0x5F -- these are inc, dec, push, pop of general
|
||||
registers */
|
||||
continue;
|
||||
}
|
||||
else
|
||||
{
|
||||
/* 0x60 - 0x7F */
|
||||
if (op & 0x10)
|
||||
{
|
||||
/* 70-7F 2 byte jump opcodes */
|
||||
pc++;
|
||||
continue;
|
||||
}
|
||||
else
|
||||
{
|
||||
/* Odds and sods */
|
||||
switch (op)
|
||||
{
|
||||
case 0x60: /* pusha */
|
||||
case 0x61: /* popa */
|
||||
case 0x64: /* overrides */
|
||||
case 0x65:
|
||||
case 0x66:
|
||||
case 0x67:
|
||||
case 0x6C: /* insb DX */
|
||||
case 0x6E: /* outsb DX */
|
||||
continue;
|
||||
|
||||
case 0x62: /* bound */
|
||||
pc += 4;
|
||||
continue;
|
||||
|
||||
case 0x63: /* arpl */
|
||||
if (TwoWild(pat)) return;
|
||||
continue;
|
||||
|
||||
case 0x68: /* Push byte */
|
||||
case 0x6A: /* Push byte */
|
||||
case 0x6D: /* insb port */
|
||||
case 0x6F: /* outsb port */
|
||||
/* 2 byte instr, no wilds */
|
||||
pc++;
|
||||
continue;
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
}
        else if (quad == 0x80)
        {
            switch (op & 0xF0)
            {
                case 0x80:          /* 80 - 8F */
                    /* All have a mod/rm byte */
                    if (ModRM(pat)) return;
                    /* These also have immediate values */
                    switch (op)
                    {
                        case 0x80:
                        case 0x83:
                            /* One byte immediate */
                            pc++;
                            continue;

                        case 0x81:
                            /* Immediate 16 bit values might be constant, but
                               also might be relocatable. Have to make them
                               wild */
                            if (TwoWild(pat)) return;
                            continue;
                    }
                    continue;
                case 0x90:          /* 90 - 9F */
                    if (op == 0x9A)
                    {
                        /* far call */
                        if (FourWild(pat)) return;
                        continue;
                    }
                    /* All others are 1 byte opcodes */
                    continue;
                case 0xA0:          /* A0 - AF */
                    if ((op & 0x0C) == 0)
                    {
                        /* mov al/ax to/from [nnnn] */
                        if (TwoWild(pat)) return;
                        continue;
                    }
                    else if ((op & 0xFE) == 0xA8)
                    {
                        /* test al,#byte or test ax,#word */
                        if (op & 1) pc += 2;
                        else        pc += 1;
                        continue;

                    }
                    /* No continue here: A4-A7 and AA-AF fall through to the
                       B0 case below */
                case 0xB0:          /* B0 - BF */
                {
                    if (op & 8)
                    {
                        /* mov reg, #16 */
                        /* Immediate 16 bit values might be constant, but also
                           might be relocatable. For now, make them wild */
                        if (TwoWild(pat)) return;
                    }
                    else
                    {
                        /* mov reg, #8 */
                        pc++;
                    }
                    continue;
                }
            }
        }
        else
        {
            /* In the last quadrant of the op code table */
            switch (op)
            {
                case 0xC0:      /* 386: Rotate group 2 ModRM, byte, #byte */
                case 0xC1:      /* 386: Rotate group 2 ModRM, word, #byte */
                    if (ModRM(pat)) return;
                    /* Byte immediate value follows ModRM */
                    pc++;
                    continue;

                case 0xC3:      /* Return */
                case 0xCB:      /* Return far */
                    chop(pat);
                    return;
                case 0xC2:      /* Ret nnnn */
                case 0xCA:      /* Retf nnnn */
                    pc += 2;
                    chop(pat);
                    return;

                case 0xC4:      /* les Gv, Mp */
                case 0xC5:      /* lds Gv, Mp */
                    if (ModRM(pat)) return;
                    continue;

                case 0xC6:      /* Mov ModRM, #nn */
                    if (ModRM(pat)) return;
                    /* Byte immediate value follows ModRM */
                    pc++;
                    continue;
                case 0xC7:      /* Mov ModRM, #nnnn */
                    if (ModRM(pat)) return;
                    /* Word immediate value follows ModRM */
                    /* Immediate 16 bit values might be constant, but also
                       might be relocatable. For now, make them wild */
                    if (TwoWild(pat)) return;
                    continue;

                case 0xC8:      /* Enter Iw, Ib */
                    pc += 3;    /* Constant word, byte */
                    continue;
                case 0xC9:      /* Leave */
                    continue;

                case 0xCC:      /* Int 3 */
                    continue;

                case 0xCD:      /* Int nn */
                    intArg = pat[pc++];
                    if ((intArg >= 0x34) && (intArg <= 0x3B))
                    {
                        /* Borland/Microsoft FP emulations */
                        if (ModRM(pat)) return;
                    }
                    continue;

                case 0xCE:      /* Into */
                    continue;

                case 0xCF:      /* Iret */
                    continue;

                case 0xD0:      /* Group 2 rotate, byte, 1 bit */
                case 0xD1:      /* Group 2 rotate, word, 1 bit */
                case 0xD2:      /* Group 2 rotate, byte, CL bits */
                case 0xD3:      /* Group 2 rotate, word, CL bits */
                    if (ModRM(pat)) return;
                    continue;

                case 0xD4:      /* Aam */
                case 0xD5:      /* Aad */
                case 0xD7:      /* Xlat */
                    continue;

                case 0xD8:
                case 0xD9:
                case 0xDA:
                case 0xDB:      /* Esc opcodes */
                case 0xDC:      /* i.e. floating point */
                case 0xDD:      /* coprocessor calls */
                case 0xDE:
                case 0xDF:
                    if (ModRM(pat)) return;
                    continue;

                case 0xE0:      /* Loopne */
                case 0xE1:      /* Loope */
                case 0xE2:      /* Loop */
                case 0xE3:      /* Jcxz */
                    pc++;       /* Short jump offset */
                    continue;

                case 0xE4:      /* in al,nn */
                case 0xE6:      /* out nn,al */
                    pc++;
                    continue;

                case 0xE5:      /* in ax,nn */
                case 0xE7:      /* out nn,ax */
                    pc += 2;
                    continue;

                case 0xE8:      /* Call rel */
                    if (TwoWild(pat)) return;
                    continue;
                case 0xE9:      /* Jump rel, unconditional */
                    if (TwoWild(pat)) return;
                    chop(pat);
                    return;
                case 0xEA:      /* Jump abs */
                    if (FourWild(pat)) return;
                    chop(pat);
                    return;
                case 0xEB:      /* Jmp short unconditional */
                    pc++;
                    chop(pat);
                    return;

                case 0xEC:      /* In al,dx */
                case 0xED:      /* In ax,dx */
                case 0xEE:      /* Out dx,al */
                case 0xEF:      /* Out dx,ax */
                    continue;

                case 0xF0:      /* Lock */
                case 0xF2:      /* Repne */
                case 0xF3:      /* Rep/repe */
                case 0xF4:      /* Halt */
                case 0xF5:      /* Cmc */
                case 0xF8:      /* Clc */
                case 0xF9:      /* Stc */
                case 0xFA:      /* Cli */
                case 0xFB:      /* Sti */
                case 0xFC:      /* Cld */
                case 0xFD:      /* Std */
                    continue;

                case 0xF6:      /* Group 3 byte test/not/mul/div */
                case 0xF7:      /* Group 3 word test/not/mul/div */
                case 0xFE:      /* Inc/Dec group 4 */
                    if (ModRM(pat)) return;
                    continue;

                case 0xFF:      /* Group 5 Inc/Dec/Call/Jmp/Push */
                    /* Most are like standard ModRM */
                    if (ModRM(pat)) return;
                    continue;

                default:        /* Rest are single byte opcodes */
                    continue;
            }
        }
    }
}
175
tools/makedsig/makedsig.cpp
Normal file
@@ -0,0 +1,175 @@
/* Program for making the DCC signature file */

#include "LIB_PatternCollector.h"
#include "TPL_PatternCollector.h"
#include "perfhlib.h"           /* Symbol table prototypes */

#include <QtCore/QCoreApplication>
#include <QtCore/QStringList>

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>             /* uint8_t, uint16_t */
#include <malloc.h>
#include <memory.h>
#include <string.h>
#include <algorithm>

/* Symbol table constants */
#define C   2.2     /* Sparseness of graph. See Czech, Havas and Majewski for details */

/* prototypes */

void saveFile(FILE *fl, const PerfectHash &p_hash, PatternCollector *coll);   /* Save the info */

int     numKeys;                /* Number of useful codeview symbols */


static void printUsage(bool longusage) {
    if(longusage)
        printf(
            "This program is to make 'signatures' of known c and tpl library calls for the dcc program.\n"
            "It needs as the first arg the name of a library file, and as the second arg, the name "
            "of the signature file to be generated.\n"
            "Example: makedsig CL.LIB dccb3l.sig\n"
            "      or makedsig turbo.tpl dcct4p.sig\n"
            );
    else
        printf("Usage: makedsig <libname> <signame>\n"
               "or makedsig -h for help\n");
}
int main(int argc, char *argv[])
{
    QCoreApplication app(argc,argv);
    FILE *f2;           // output file
    FILE *srcfile;      // .lib file
    int  s = 0;         // seed; stays 0 if no number can be read
    if(app.arguments().size()<2) {
        printUsage(false);
        return 0;
    }
    QString arg2 = app.arguments()[1];
    if (arg2.startsWith("-h") || arg2.startsWith("-?"))
    {
        printUsage(true);
        return 0;
    }
    if(app.arguments().size()<3) {
        /* Both <libname> and <signame> are required */
        printUsage(false);
        return 1;
    }
    /* Pick the collector from the file extension. The match ignores case,
       so that CL.LIB works as well as cl.lib */
    PatternCollector *collector = nullptr;
    if(arg2.endsWith(".tpl", Qt::CaseInsensitive)) {
        collector = new TPL_PatternCollector;
    } else if(arg2.endsWith(".lib", Qt::CaseInsensitive)) {
        collector = new LIB_PatternCollector;
    } else {
        printf("Unrecognised library type (expected .lib or .tpl): %s\n", argv[1]);
        exit(2);
    }
    if ((srcfile = fopen(argv[1], "rb")) == NULL)
    {
        printf("Cannot read %s\n", argv[1]);
        exit(2);
    }

    if ((f2 = fopen(argv[2], "wb")) == NULL)
    {
        printf("Cannot write %s\n", argv[2]);
        exit(2);
    }

    fprintf(stderr, "Seed: ");
    scanf("%d", &s);
    srand(s);

    PerfectHash p_hash;
    numKeys = collector->readSyms(srcfile);     /* Read the keys (symbols) */

    printf("Num keys: %d; vertices: %d\n", numKeys, (int)(numKeys*C));
    /* Set the parameters for the hash table */
    p_hash.setHashParams(   numKeys,        /* The number of symbols */
                            PATLEN,         /* The length of the pattern to be hashed */
                            256,            /* The character set of the pattern (0-FF) */
                            0,              /* Minimum pattern character value */
                            numKeys*C);     /* C is the sparseness of the graph. See Czech,
                                               Havas and Majewski for details */

    /* The following two functions are in perfhlib.c */
    p_hash.map(collector);      /* Perform the mapping. This will call getKey() repeatedly */
    p_hash.assign();            /* Generate the function g */

    saveFile(f2,p_hash,collector);      /* Save the resultant information */

    fclose(srcfile);
    fclose(f2);

}

/*  *   *   *   *   *   *   *   *   *   *   *   *   *  *\
*                                                       *
*       S a v e   t h e   s i g   f i l e               *
*                                                       *
\*  *   *   *   *   *   *   *   *   *   *   *   *   *  */


void writeFile(FILE *fl,const char *buffer, int len)
{
    if ((int)fwrite(buffer, 1, len, fl) != len)
    {
        printf("Could not write to file\n");
        exit(1);
    }
}

void writeFileShort(FILE *fl,uint16_t w)
{
    uint8_t b;

    b = (uint8_t)(w & 0xFF);
    writeFile(fl,(char *)&b, 1);        /* Write a short little endian */
    b = (uint8_t)(w>>8);
    writeFile(fl,(char *)&b, 1);
}

void saveFile(FILE *fl, const PerfectHash &p_hash, PatternCollector *coll)
{
    int i, len;
    const uint16_t *pTable;

    writeFile(fl,"dccs", 4);                    /* Signature */
    writeFileShort(fl,numKeys);                 /* Number of keys */
    writeFileShort(fl,(short)(numKeys * C));    /* Number of vertices */
    writeFileShort(fl,PATLEN);                  /* Length of key part of entries */
    writeFileShort(fl,SYMLEN);                  /* Length of symbol part of entries */

    /* Write out the tables T1 and T2, with their sig and byte lengths in front */
    writeFile(fl,"T1", 2);                      /* "Signature" */
    pTable = p_hash.readT1();
    len = PATLEN * 256;
    writeFileShort(fl,len * sizeof(uint16_t));
    for (i=0; i < len; i++)
    {
        writeFileShort(fl,pTable[i]);
    }
    writeFile(fl,"T2", 2);
    pTable = p_hash.readT2();
    writeFileShort(fl,len * sizeof(uint16_t));
    for (i=0; i < len; i++)
    {
        writeFileShort(fl,pTable[i]);
    }

    /* Write out g[] */
    writeFile(fl,"gg", 2);                      /* "Signature" */
    pTable = p_hash.readG();
    len = (short)(numKeys * C);
    writeFileShort(fl,len * sizeof(uint16_t));
    for (i=0; i < len; i++)
    {
        writeFileShort(fl,pTable[i]);
    }

    /* Now the hash table itself */
    writeFile(fl,"ht ", 2);                     /* "Signature" */
    writeFileShort(fl,numKeys * (SYMLEN + PATLEN + sizeof(uint16_t)));  /* byte len */
    for (i=0; i < numKeys; i++)
    {
        writeFile(fl,(char *)&coll->keys[i], SYMLEN + PATLEN);
    }
}
188
tools/makedsig/makedsig.txt
Normal file
@@ -0,0 +1,188 @@
MAKEDSIG

1 What is MakeDsig?

2 How does it work?

3 How do I use MakeDsig?

4 What's in a signature file?

5 What other tools are useful for signature work?


1 What is MakeDsig?
-------------------

MakeDsig is a program that reads a library (.lib) file from a
compiler, and generates a signature file for use by DCC. Without
signature files, dcc cannot recognise library functions, and so will
attempt to decompile them, and cannot name them. This makes the
resultant decompiled code bulkier and difficult to understand.


2 How does it work?
-------------------

Library files contain complete functions, relocation information,
function names, and more. MakeDsig reads a library file, and for each
function found, it saves the name, and creates a signature. These
are stored in an array. When all functions are done, tables for the
perfect hashing function are generated. During this process,
duplicate keys (functions that produce identical signatures) may be
detected; if so, one of the keys will be zeroed.

The signature file contains information needed by dcc to hash the
signatures, as well as the symbols and signatures. Dcc reads the various
sections of the signature file to be able to hash signatures. The
signatures, not the symbols, are hashed, since dcc gets a signature
from the executable file, and needs to know quickly if there is a
symbolic name for it.
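
As a rough illustration of the lookup just described, here is a minimal
sketch of how a signature could be hashed with the tables from a signature
file. It assumes the usual Czech, Havas and Majewski form, where two table
sums select two graph vertices and the g values at those vertices are added.
The struct HashTables and the function names are hypothetical; the real code
is in perfhlib and may differ in detail.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the tables stored in a signature file.
struct HashTables {
    int numKeys;                // number of signatures
    int numVert;                // number of graph vertices (about numKeys * C)
    int patLen;                 // bytes per signature
    std::vector<uint16_t> T1;   // patLen * 256 entries
    std::vector<uint16_t> T2;   // patLen * 256 entries
    std::vector<uint16_t> g;    // numVert entries
};

// Sum one table entry per pattern byte, then reduce to a vertex number.
static int vertexOf(const std::vector<uint16_t> &T, const uint8_t *pat,
                    int patLen, int numVert)
{
    unsigned long sum = 0;
    for (int i = 0; i < patLen; i++)
        sum += T[static_cast<std::size_t>(i) * 256 + pat[i]];
    return static_cast<int>(sum % numVert);
}

// Index of the hash table slot that may hold this signature.
int hashSignature(const HashTables &h, const uint8_t *pat)
{
    int u = vertexOf(h.T1, pat, h.patLen, h.numVert);
    int v = vertexOf(h.T2, pat, h.patLen, h.numVert);
    return (h.g[u] + h.g[v]) % h.numKeys;
}
```

A signature that is not in the file still hashes to some slot, so a caller
would compare the pattern stored in that slot with the one it hashed before
trusting the name.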

3 How do I use MakeDsig?
------------------------

You can always find out by just executing it with no arguments, or
MakeDsig -h for more details.

Basically, you just give it the names of the files that it needs:
    MakeDsig <libname> <signame>

It will ask you for a seed; enter any number, e.g. 1.

You need the library file for the appropriate compiler. For example,
to analyse executable programs created from Turbo C 2.1 small model,
you need the cs.lib file that comes with that compiler.

You also need to know the correct name for the signature file, i.e.
<signame>. Dcc will detect certain compiler vendors and version
numbers, and will look for a signature file named like this:
    d c c <vendor> <version> <model> . s i g

Here are the current vendors:
Vendor                  Vendor letter
Microsoft C/C++         m
Borland C/C++           b
Logitech (Modula)       l
Turbo Pascal            t

Here are the model codes:
small/tiny              s
medium                  m
compact                 c
large                   l
Turbo Pascal            p

The version codes are fairly self explanatory:
Microsoft C 5.1         5
Microsoft C 8           8
Borland C 2.0           2
Borland C 3.0           3
Turbo Pascal 3.0        3       Note: currently no way to make dcct3p.sig
Turbo Pascal 4.0        4       Use Makedstp, not makedsig
Turbo Pascal 5.0        5       Use Makedstp, not makedsig

Some examples: the signature file for Borland C version 2.0, small
model, would be dccb2s.sig. To generate it, you would supply as the
library file cs.lib that came with that compiler. Suppose it was in
the \bc\lib directory. To generate the signature file required to
work with files produced by this compiler, you would type

    makedsig \bc\lib\cs.lib dccb2s.sig

This will create dccb2s.sig in the current directory. For dcc to use
this file, place it in the same directory as dcc itself, or point the
environment variable DCC to the directory containing it.
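
The naming rule can be sketched in a few lines. This is only an
illustration; sigFileName is a hypothetical helper, not a function from dcc,
and it simply joins the three one-character codes from the tables above.

```cpp
#include <string>

// Build "dcc<vendor><version><model>.sig" from the three one-character codes.
std::string sigFileName(char vendor, char version, char model)
{
    std::string name = "dcc";
    name += vendor;     // e.g. 'b' for Borland C/C++
    name += version;    // e.g. '2' for Borland C 2.0
    name += model;      // e.g. 's' for small/tiny
    name += ".sig";
    return name;
}
```

For example, sigFileName('b', '2', 's') gives the dccb2s.sig name used in the
Borland example.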

Another example: to make the signature file for Microsoft Visual
C/C++ (C 8.0), large model, and assuming the libraries are in
the directory \msvc\lib, you would type

    makedsig \msvc\lib\llibce.lib dccm8l.sig

Note that the signature files for Turbo Pascal from version 4 onwards
are generated by makedstp, not makedsig. Makedstp reads a
special file called turbo.tpl, as there are no normal .lib files for
Turbo Pascal. Dcc will recognise Turbo Pascal 3.0 files, and look
for dcct3p.sig. Because all the library routines are contained in
every Turbo Pascal 3.0 executable, there are no library files or even a
turbo.tpl file, so the signature file would have to be constructed by
guesswork. You can still use dcc on these files; just ignore the
warning about not finding the signature file.

For executables that dcc does not recognise, it will look for the
signature file dccxxx.sig. That way, if you have a new compiler, you
can at least have dcc detect library calls, even if it attempts to
decompile them all, and has not identified the main program.

Logitech Modula V1.0 files are recognised, and the signature file
dccl1x.sig is looked for. This was experimental in nature, and is not
recommended for serious analysis at this stage.



4 What's in a signature file?
-----------------------------

The details of a signature file are best documented in the source for
makedsig; see the function saveFile(). All two byte integers are
stored little endian (low byte first). Briefly:
1) a 4 byte pattern identifying the file as a signature file: "dccs".
2) a two byte integer containing the number of keys (signatures)
3) a two byte integer containing the number of vertices on the graph
   used to generate the hash table. See the source code and/or the
   Czech, Havas and Majewski articles for details
4) a two byte integer containing the pattern length
5) a two byte integer containing the symbolic name length

The next sections all have the following structure:
1) 2 char ID
2) a two byte integer containing the size of the body
3) the body.

There are 4 sections: "T1", "T2", "gg", and "ht". T1 and T2 are the
tables associated with the hash function. (The hash function is a
random function, meaning that it involves tables. T1 and T2 are the
tables used by the hash function). "gg" is another table associated
with the graph needed by the perfect hashing function algorithm.
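
The ID, size, body layout makes it possible to step from one section to the
next. The sketch below shows the idea; findSection is a hypothetical helper,
and it assumes that the size word of each section it skips matches the
number of body bytes actually written, which is worth checking against
saveFile() before relying on it.

```cpp
#include <cstddef>
#include <cstdint>

// Read a little endian 16 bit value.
static uint16_t readLE16(const uint8_t *p)
{
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

// Search from 'pos' (the first byte after the file header) for the section
// with the given 2 char ID. Returns the offset of its body and sets bodyLen,
// or returns -1 if no such section is found.
long findSection(const uint8_t *buf, std::size_t len, std::size_t pos,
                 const char *id, uint16_t &bodyLen)
{
    while (pos + 4 <= len) {
        uint16_t size = readLE16(buf + pos + 2);
        if (buf[pos] == static_cast<uint8_t>(id[0]) &&
            buf[pos + 1] == static_cast<uint8_t>(id[1])) {
            bodyLen = size;
            return static_cast<long>(pos + 4);
        }
        pos += 4 + static_cast<std::size_t>(size);  // skip ID, size and body
    }
    return -1;
}
```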

"ht" contains the actual hash table. The body of this section is an
array of records of this structure:
    typedef struct _hashEntry
    {
        char name[SYMLEN];      /* The symbol name */
        byte pat [PATLEN];      /* The pattern */
        word offset;            /* Offset (needed temporarily) */
    } HASHENTRY;

This part of the signature file can be browsed with a binary dump
program; a PATLEN length signature will follow the (null padded)
symbol name. There are tools for searching signature files, e.g.
srchsig, dispsig, and readsig. See below.
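
To show how one record splits into its two parts, here is a small sketch.
SigEntry and decodeEntry are hypothetical names. The sketch assumes a record
holds symLen bytes of null padded name followed by patLen bytes of pattern,
which is what saveFile() in makedsig writes for each entry.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical decoded form of one "ht" record.
struct SigEntry {
    std::string name;           // symbol name without the padding
    std::vector<uint8_t> pat;   // the signature bytes
};

SigEntry decodeEntry(const uint8_t *rec, std::size_t symLen, std::size_t patLen)
{
    SigEntry e;
    std::size_t n = 0;
    while (n < symLen && rec[n] != 0)   // stop at the first padding byte
        n++;
    e.name.assign(reinterpret_cast<const char *>(rec), n);
    e.pat.assign(rec + symLen, rec + symLen + patLen);
    return e;
}
```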


5 What other tools are useful for signature work?
-------------------------------------------------

Makedstp - makes signature files from turbo.tpl. Needed to make
signature files for Turbo Pascal version 4.0 and later.

SrchSig - tells you whether a given pattern exists in a signature
file, and gives its name. You need a binary file with the signature
in it, exactly the right length. This can most easily be done with
debug (comes with MS-DOS).

DispSig - given the name of a function, displays its signature, and
stores the signature into a binary file as well. (You can use this
file with srchsig on another signature file, if you want).

ReadSig - reads a signature file, checking for correct structure, and
displaying duplicate signatures. With the -a switch, it will display
all signatures, with their symbols.

The file perfhlib.c is used by various of these tools to do the work
of the perfect hashing functions. It could be used as part of other
tools that use signature files, or just perfect hashing functions for
that matter.
0
tools/parsehdr/CMakeLists.txt
Normal file
117
tools/parsehdr/locident.h
Normal file
@@ -0,0 +1,117 @@
/*$Log: locident.h,v $
 * Revision 1.6  94/02/22  15:20:23  cifuente
 * Code generation is done.
 *
 * Revision 1.5  93/12/10  09:38:20  cifuente
 * New high-level types
 *
 * Revision 1.4  93/11/10  17:30:51  cifuente
 * Procedure header, locals
 *
 * Revision 1.3  93/11/08  12:06:35  cifuente
 * du1 analysis finished.  Instantiates procedure arguments for user
 * declared procedures.
 *
 * Revision 1.2  93/10/25  11:01:00  cifuente
 * New SYNTHETIC instructions for d/u analysis
 *
 * Revision 1.1  93/10/11  11:47:39  cifuente
 * Initial revision
 *
 * File: locIdent.h
 * Purpose: High-level local identifier definitions
 * Date: October 1993
 */


/* Type definition */
typedef struct {
    Int     csym;       /* # symbols used           */
    Int     alloc;      /* # symbols allocated      */
    Int     *idx;       /* Array of integer indexes */
} IDX_ARRAY;

/* Type definitions used in the decompiled program */
typedef enum {
    TYPE_UNKNOWN = 0,   /* unknown so far               */
    TYPE_BYTE_SIGN,     /* signed byte (8 bits)         */
    TYPE_BYTE_UNSIGN,   /* unsigned byte                */
    TYPE_WORD_SIGN,     /* signed word (16 bits)        */
    TYPE_WORD_UNSIGN,   /* unsigned word (16 bits)      */
    TYPE_LONG_SIGN,     /* signed long (32 bits)        */
    TYPE_LONG_UNSIGN,   /* unsigned long (32 bits)      */
    TYPE_RECORD,        /* record structure             */
    TYPE_PTR,           /* pointer (32 bit ptr)         */
    TYPE_STR,           /* string                       */
    TYPE_CONST,         /* constant (any type)          */
    TYPE_FLOAT,         /* floating point               */
    TYPE_DOUBLE,        /* double precision float       */
} hlType;

static char *hlTypes[13] = {"", "char", "unsigned char", "int", "unsigned int",
                            "long", "unsigned long", "record", "int *", "char *",
                            "", "float", "double"};

typedef enum {
    STK_FRAME,          /* For stack vars               */
    REG_FRAME,          /* For register variables       */
    GLB_FRAME,          /* For globals                  */
} frameType;


/* Enumeration to determine whether pIcode points to the high or low part
 * of a long number */
typedef enum {
    HIGH_FIRST,         /* High value is first          */
    LOW_FIRST,          /* Low value is first           */
} hlFirst;


/* LOCAL_ID */
typedef struct {
    hlType      type;       /* Probable type                             */
    boolT       illegal;    /* Boolean: not a valid field any more       */
    IDX_ARRAY   idx;        /* Index into icode array (REG_FRAME only)   */
    frameType   loc;        /* Frame location                            */
    boolT       hasMacro;   /* Identifier requires a macro               */
    char        macro[10];  /* Macro for this identifier                 */
    char        name[20];   /* Identifier's name                         */
    union {                 /* Different types of identifiers            */
        byte        regi;       /* For TYPE_BYTE(WORD)_(UN)SIGN registers    */
        struct {                /* For TYPE_BYTE(WORD)_(UN)SIGN on the stack */
            byte    regOff;     /*    register offset (if any)               */
            Int     off;        /*    offset from BP                         */
        }           bwId;
        struct _bwGlb {         /* For TYPE_BYTE(WORD)_(UN)SIGN globals      */
            int16   seg;        /*    segment value                          */
            int16   off;        /*    offset                                 */
            byte    regi;       /*    optional indexed register              */
        }           bwGlb;
        struct _longId{         /* For TYPE_LONG_(UN)SIGN registers          */
            byte    h;          /*    high register                          */
            byte    l;          /*    low register                           */
        }           longId;
        struct _longStkId {     /* For TYPE_LONG_(UN)SIGN on the stack       */
            Int     offH;       /*    high offset from BP                    */
            Int     offL;       /*    low offset from BP                     */
        }           longStkId;
        struct {                /* For TYPE_LONG_(UN)SIGN globals            */
            int16   seg;        /*    segment value                          */
            int16   offH;       /*    offset high                            */
            int16   offL;       /*    offset low                             */
            byte    regi;       /*    optional indexed register              */
        }           longGlb;
        struct {                /* For TYPE_LONG_(UN)SIGN constants          */
            dword   h;          /*    high word                              */
            dword   l;          /*    low word                               */
        }           longKte;
    } id;
} ID;

typedef struct {
    Int     csym;       /* No. of symbols in the table  */
    Int     alloc;      /* No. of symbols allocated     */
    ID      *id;        /* Identifier                   */
} LOCAL_ID;
1538
tools/parsehdr/parsehdr.cpp
Normal file
File diff suppressed because it is too large
98
tools/parsehdr/parsehdr.h
Normal file
@@ -0,0 +1,98 @@
/*
 *$Log: parsehdr.h,v $
 */
/* Header file for parsehdr.c */

typedef unsigned long dword;    /* 32 bits  */
typedef unsigned char byte;     /* 8 bits   */
typedef unsigned short word;    /* 16 bits  */
typedef unsigned char boolT;    /* 8 bits   */

#define TRUE 1
#define FALSE 0

#define BUFF_SIZE 8192          /* Holds a declaration */
#define FBUF_SIZE 32700         /* Holds part of a header file */

#define NARGS       15
#define NAMES_L     160
#define TYPES_L     160
#define FUNC_L      160

#define ERRF    stdout

void phError(char *errmsg);
void phWarning(char *errmsg);

#define ERR(msg)        phError(msg)
#ifdef DEBUG
#define DBG(str) printf(str);
#else
#define DBG(str) ;
#endif
#define WARN(msg)       phWarning(msg)
#define OUT(str)        fprintf(outfile, str)

#define PH_PARAMS   32
#define PH_NAMESZ   15

#define SYMLEN  16                  /* Including the null */
#define Int     long                /* For locident.h */
#define int16   short int           /* For locident.h */
#include "locident.h"               /* For the hlType enum */
#define bool    unsigned char       /* For internal use */
#define TRUE    1
#define FALSE   0

typedef
struct ph_func_tag
{
    char    name[SYMLEN];           /* Name of function or arg */
    hlType  typ;                    /* Return type */
    int     numArg;                 /* Number of args */
    int     firstArg;               /* Index of first arg in chain */
    int     next;                   /* Index of next function in chain */
    bool    bVararg;                /* True if variable num args */
} PH_FUNC_STRUCT;

typedef
struct ph_arg_tag
{
    char    name[SYMLEN];           /* Name of function or arg */
    hlType  typ;                    /* Parameter type */
} PH_ARG_STRUCT;

#define DELTA_FUNC 32               /* Number to alloc at once */


#define PH_JUNK     0       /* LPSTR      buffer, nothing happened */
#define PH_PROTO    1       /* LPPH_FUNC  ret val, func name, args */
#define PH_FUNCTION 2       /* LPPH_FUNC  ret val, func name, args */
#define PH_TYPEDEF  3       /* LPPH_DEF   definer and definee      */
#define PH_DEFINE   4       /* LPPH_DEF   definer and definee      */
#define PH_ERROR    5       /* LPSTR      error string             */
#define PH_WARNING  6       /* LPSTR      warning string           */
#define PH_MPROTO   7       /* ?????      multi proto????          */
#define PH_VAR      8       /* ?????      var decl                 */

/* PROTOS */

boolT phData(char *buff, int ndata);
boolT phPost(void);
boolT phFree(void);
void checkHeap(char *msg);          /* For debugging only */

void phBuffToFunc(char *buff);

void phBuffToDef(char *buff);


#define TOK_TYPE    256     /* A type name (e.g. "int") */
#define TOK_NAME    257     /* A function or parameter name */
#define TOK_DOTS    258     /* "..." */
#define TOK_EOL     259     /* End of line */

typedef enum
{
    BT_INT, BT_CHAR, BT_FLOAT, BT_DOUBLE, BT_STRUCT, BT_VOID, BT_UNKWN
} baseType;
217
tools/parsehdr/parsehdr.txt
Normal file
@@ -0,0 +1,217 @@
PARSEHDR

1 What is ParseHdr?

2 What is dcclibs.dat?

3 How do I use ParseHdr?

4 What about languages other than C?

5 What is the structure of the dcclibs.dat file?

6 What are all these errors, and why do they happen?


1 What is ParseHdr?
-------------------

ParseHdr is a program that creates a special prototype file for DCC
from a set of include files (.h files). This allows DCC to be aware
of the type of library function arguments, and return types. The file
produced is called dcclibs.dat. ParseHdr is designed specifically for
C header files.

As an example, this is what allows DCC to recognise that printf has
(at least) a string argument, and so converts the first argument from
a numeric constant to a string. So you get
    printf("Hello world")
instead of
    printf(0x42).


2 What is dcclibs.dat?
----------------------

dcclibs.dat is the file created by the ParseHdr program. It contains
a list of function names and parameter and return types. See section
5 for details of the contents of the file.


3 How do I use ParseHdr?
------------------------

To use ParseHdr you need a file containing a list of header files,
like this:
\tc\include\alloc.h
\tc\include\assert.h
\tc\include\bios.h
...
\tc\include\time.h

There must be one file per line, no blank lines, and unless the
header files are in the current directory, a full path must be given.
The easiest way to create such a file is to redirect the output of a
dir command to a file, like this:
c>dir \tc\include\*.h > tcfiles.lst
and then edit the resultant file. Note that the path will not be
included in this, so you will have to add that manually. Remove
everything after the .h, such as file size, date, etc.

Once you have this file, you can run parsehdr:

    parsehdr <listfile>

For example,

    parsehdr tcfiles.lst

You will get some messages indicating which files are being
processed, but also some error messages. Just ignore the error
messages; see section 6 for why they occur.



4 What about languages other than C?
------------------------------------

ParseHdr will only work on C header files. It would be possible to
process files for other languages that contained type information, to
produce a dcclibs.dat file specific to that language. Ideally, DCC
should look for a different file for each language, but since only a
C version of dcclibs.dat has so far been created, this has not been
done.

Prototype information for Turbo Pascal exists in the file turbo.tpl,
at least for things like the graphics library, so it would be
possible for MakeDsTp to produce a dcclibs.dat file as well as the
signature file. However, the format of the turbo.tpl file is not
documented by Borland; for details see

W. L. Peavy, "Inside Turbo Pascal 6.0 Units", Public domain software
file tpu6doc.txt in tpu6.zip. Anonymous ftp from garbo.uwasa.fi and
mirrors, directory /pc/turbopas, 1991.




5 What is the structure of the dcclibs.dat file?
------------------------------------------------

The first 4 bytes are "dccp", identifying it as a DCC prototype file.
After this, there are two sections.

The first section begins with "FN", for Function Names. It is
followed by a two byte integer giving the number of function names
stored. The remainder of this section is an array of structures, one
per function name. Each has this structure:
    char Name[SYMLEN];      /* Name of the function, NULL terminated */
    int  type;              /* A 2 byte integer describing the return type */
    int  numArg;            /* The number of arguments */
    int  firstArg;          /* The index of the first arg, see below */
    char bVarArg;           /* 1 if variable arguments, 0 otherwise */

SYMLEN is 16, allowing 15 chars before the NULL. Therefore, the length
of this structure is 23 bytes (16 + 2 + 2 + 2 + 1).
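
As an illustration of this layout, the sketch below decodes one 23 byte
record from the "FN" section. FuncRecord, decodeFuncRecord and the constants
are hypothetical names. The sketch assumes that the two byte integers are
stored low byte first, as the strcmp dump later in this file suggests.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

constexpr std::size_t kSymLen = 16;                          // SYMLEN
constexpr std::size_t kFuncRecLen = kSymLen + 2 + 2 + 2 + 1; // name + 3 ints + flag
static_assert(kFuncRecLen == 23, "record length stated in the text");

// Hypothetical decoded form of one function record.
struct FuncRecord {
    std::string name;
    uint16_t type;      // return type (an hlType value)
    uint16_t numArg;    // number of arguments
    uint16_t firstArg;  // index into the parameter array
    bool varArg;        // true if variable arguments
};

// Read a little endian 16 bit value.
static uint16_t readLE16(const uint8_t *p)
{
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

FuncRecord decodeFuncRecord(const uint8_t *rec)
{
    FuncRecord f;
    std::size_t n = 0;
    while (n < kSymLen && rec[n] != 0)  // name is NULL terminated and padded
        n++;
    f.name.assign(reinterpret_cast<const char *>(rec), n);
    f.type     = readLE16(rec + kSymLen);
    f.numArg   = readLE16(rec + kSymLen + 2);
    f.firstArg = readLE16(rec + kSymLen + 4);
    f.varArg   = rec[kSymLen + 6] != 0;
    return f;
}
```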

The types are as defined in locident.h (actually a part of dcc), and
at present are as follows:
typedef enum {
    TYPE_UNKNOWN = 0,   /* unknown so far               00 */
    TYPE_BYTE_SIGN,     /* signed byte (8 bits)         01 */
    TYPE_BYTE_UNSIGN,   /* unsigned byte                02 */
    TYPE_WORD_SIGN,     /* signed word (16 bits)        03 */
    TYPE_WORD_UNSIGN,   /* unsigned word (16 bits)      04 */
    TYPE_LONG_SIGN,     /* signed long (32 bits)        05 */
    TYPE_LONG_UNSIGN,   /* unsigned long (32 bits)      06 */
    TYPE_RECORD,        /* record structure             07 */
    TYPE_PTR,           /* pointer (32 bit ptr)         08 */
    TYPE_STR,           /* string                       09 */
    TYPE_CONST,         /* constant (any type)          0A */
    TYPE_FLOAT,         /* floating point               0B */
    TYPE_DOUBLE,        /* double precision float       0C */
} hlType;

firstArg is an index into the array in the second section.

The second section begins with "PM" (for Parameters). It is followed
by a 2 byte integer giving the number of parameter records. After
this is the array of parameter structures. Initially, the names of the
parameters were being stored, but this has been removed at present.
The parameter structure is therefore now just a single 2 byte
integer, representing the type of that argument.
|
||||
|
||||
The way it all fits together is perhaps best described by an example.
Let's consider this entry in dcclibs.dat:

    73 74 72 63 6D 70 00        ; "strcmp"
    00 00 00 00 00 00 00 00 00  ; Padding to 16 bytes
    03 00                       ; Return type 3, TYPE_WORD_SIGN
    02 00                       ; 2 arguments
    15 02                       ; First arg is 0215
    00                          ; Not var args

If we now skip to the "PM" part of the file, skip the number of
arguments word, then skip 215*2 = 42A (hex) bytes, we find this:

    09 00 09 00 09 00 ...

The first 09 00 (TYPE_STR) refers to the type of the first parameter,
and the second to the second parameter. There are only 2 arguments,
so the third 09 00 refers to the first parameter of the next
function. So both parameters are strings, as is expected.

For functions with variable parameters, bVarArg is set to 01, and the
number of parameters reported is the number of fixed parameters. Here
is another example:

    66 70 72 69 6E 74 66 00     ; "fprintf"
    00 00 00 00 00 00 00 00     ; padding
    03 00                       ; return type 3, TYPE_WORD_SIGN
    02 00                       ; 2 fixed args
    81 01                       ; First arg at index 0181
    01                          ; Var args

and in the "PM" section at offset 181*2 = 0302 (hex), we find
08 00 09 00 03 00, meaning that the first parameter is a pointer (in
fact, we know it's a FILE *), and the second parameter is a string.
As in the previous example, the trailing 03 00 is not one of this
function's 2 fixed parameters.
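
The index arithmetic in the two examples can be sketched in C as
follows. This is only an illustration: pm_byte_offset and param_type
are hypothetical names, pmArray is assumed to point at the first
parameter record (just past "PM" and the 2 byte count), and the
integers are assumed to be stored low byte first.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of parameter record `index`, measured from the start of
   the parameter array. Each record is a single 2 byte type. */
size_t pm_byte_offset(uint16_t index)
{
    return (size_t)index * 2u;
}

/* Type of the k-th fixed parameter (k = 0, 1, ...) of a function whose
   FN record holds `firstArg`. Returns -1 if the record would lie
   outside the numRecords records in pmArray. */
int param_type(const unsigned char *pmArray, size_t numRecords,
               uint16_t firstArg, uint16_t k)
{
    size_t idx = (size_t)firstArg + k;
    const unsigned char *p;

    if (idx >= numRecords)
        return -1;
    p = pmArray + idx * 2u;
    return p[0] | (p[1] << 8);      /* low byte first */
}
```

With firstArg = 0215 (hex), pm_byte_offset gives 42A (hex), matching
the strcmp example; with 0181 it gives 0302. A caller would loop k
from 0 to numArg-1 and ignore any records beyond that, since those
belong to other functions.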



6 What are all these errors, and why do they happen?
----------------------------------------------------

When you run ParseHdr, as well as the progress statements like

    Processing \tc\include\alloc.h ...

you can get error messages. These errors can generally be ignored.
They occur for a variety of reasons, most of which are detailed below.

1)
    Expected type: got ) (29)
    void __emit__()
                  ^
This include file contained a non-ANSI prototype. This is rare, and
__emit__ is not a standard function anyway. If it really bothers you,
you could add the word "void" to the empty parentheses in your
include file.

2)
    Expected ',' between parameter defs: got ( (28)
    void _Cdecl ctrlbrk (int _Cdecl (*handler)(void))

Here "handler" is a pointer to a function. Being a basically simple
program, ParseHdr does not expand all typedef and #define statements,
so it cannot distinguish between types and user defined function
names. Therefore, it is not possible in general to parse any
prototypes containing pointers to functions, so at this stage, any
such prototypes will produce an error of some sort. DCC cannot
currently make use of this type information anyway, so this is no
real loss. There are typically half a dozen such errors.

3)
    Unknown type time_t

Types such as time_t, which are introduced by typedef or #define
(structures and pointers to structures, for example), are not handled
by ParseHdr, since typedef and #define statements are ignored. Again,
there are typically only about a dozen of these.

8
tools/parsehdr/parselib.mak
Normal file
@@ -0,0 +1,8 @@
CFLAGS = -Zi -c -AS -W3 -D__MSDOS__

parselib.exe: parselib.obj
	link /CO parselib;

parselib.obj: parselib.c
	cl $(CFLAGS) $*.c

24
tools/parsehdr/tcfiles.lst
Normal file
@@ -0,0 +1,24 @@
\tc\include\alloc.h
\tc\include\assert.h
\tc\include\bios.h
\tc\include\conio.h
\tc\include\ctype.h
\tc\include\dir.h
\tc\include\dos.h
\tc\include\errno.h
\tc\include\fcntl.h
\tc\include\float.h
\tc\include\io.h
\tc\include\limits.h
\tc\include\math.h
\tc\include\mem.h
\tc\include\process.h
\tc\include\setjmp.h
\tc\include\share.h
\tc\include\signal.h
\tc\include\stdarg.h
\tc\include\stddef.h
\tc\include\stdio.h
\tc\include\stdlib.h
\tc\include\string.h
\tc\include\time.h
0
tools/readsig/CMakeLists.txt
Normal file
239
tools/readsig/readsig.cpp
Normal file
@@ -0,0 +1,239 @@
/* Quick program to read the output from makedsig */

#include <stdio.h>
#include <io.h>
#include <stdlib.h>
#include <memory.h>
#include <string.h>
#include "perfhlib.h"

/* statics */
byte buf[100];
int numKeys;        /* Number of hash table entries (keys) */
int numVert;        /* Number of vertices in the graph (also size of g[]) */
int PatLen;         /* Size of the keys (pattern length) */
int SymLen;         /* Max size of the symbols, including null */
FILE *f;            /* File being read */

static word *T1base, *T2base;   /* Pointers to start of T1, T2 */
static word *g;                 /* g[] */

/* prototypes */
void grab(int n);
word readFileShort(void);
void cleanup(void);

static bool bDispAll = FALSE;

int
main(int argc, char *argv[])
{
    word w, len;
    int h, i, j;
    long filePos;

    if (argc <= 1)
    {
        printf("Usage: readsig [-a] <SigFilename>\n");
        printf("-a for all symbols (else just duplicates)\n");
        exit(1);
    }

    i = 1;

    if (strcmp(argv[i], "-a") == 0)
    {
        i++;
        bDispAll = TRUE;
    }
    if (i >= argc)      /* "-a" given without a file name */
    {
        printf("Usage: readsig [-a] <SigFilename>\n");
        exit(1);
    }
    if ((f = fopen(argv[i], "rb")) == NULL)
    {
        printf("Cannot open %s\n", argv[i]);
        exit(2);
    }

    /* Read the parameters */
    grab(4);
    if (memcmp("dccs", buf, 4) != 0)
    {
        printf("Not a dccs file!\n");
        exit(3);
    }
    numKeys = readFileShort();
    numVert = readFileShort();
    PatLen = readFileShort();
    SymLen = readFileShort();

    /* Initialise the perfhlib stuff. Also allocates T1, T2, g, etc */
    hashParams(             /* Set the parameters for the hash table */
        numKeys,            /* The number of symbols */
        PatLen,             /* The length of the pattern to be hashed */
        256,                /* The character set of the pattern (0-FF) */
        0,                  /* Minimum pattern character value */
        numVert);           /* Specifies C, the sparseness of the graph.
                               See Czech, Havas and Majewski for details
                            */

    T1base = readT1();
    T2base = readT2();
    g = readG();

    /* Read T1 and T2 tables */
    grab(2);
    if (memcmp("T1", buf, 2) != 0)
    {
        printf("Expected 'T1'\n");
        exit(3);
    }
    len = PatLen * 256 * sizeof(word);
    w = readFileShort();
    if (w != len)
    {
        printf("Problem with size of T1: file %d, calc %d\n", w, len);
        exit(4);
    }
    if (fread(T1base, 1, len, f) != len)
    {
        printf("Could not read T1\n");
        exit(5);
    }

    grab(2);
    if (memcmp("T2", buf, 2) != 0)
    {
        printf("Expected 'T2'\n");
        exit(3);
    }
    w = readFileShort();
    if (w != len)
    {
        printf("Problem with size of T2: file %d, calc %d\n", w, len);
        exit(4);
    }
    if (fread(T2base, 1, len, f) != len)
    {
        printf("Could not read T2\n");
        exit(5);
    }

    /* Now read the function g[] */
    grab(2);
    if (memcmp("gg", buf, 2) != 0)
    {
        printf("Expected 'gg'\n");
        exit(3);
    }
    len = numVert * sizeof(word);
    w = readFileShort();
    if (w != len)
    {
        printf("Problem with size of g[]: file %d, calc %d\n", w, len);
        exit(4);
    }
    if (fread(g, 1, len, f) != len)
    {
        printf("Could not read g[]\n");
        exit(5);
    }


    /* This is now the hash table */
    grab(2);
    if (memcmp("ht", buf, 2) != 0)
    {
        printf("Expected 'ht'\n");
        exit(3);
    }
    w = readFileShort();
    if (w != numKeys * (SymLen + PatLen + sizeof(word)))
    {
        /* Report the size actually calculated here, not the stale len */
        printf("Problem with size of hash table: file %d, calc %d\n", w,
            (int)(numKeys * (SymLen + PatLen + sizeof(word))));
        exit(6);
    }

    if (bDispAll)
    {
        fseek(f, 0, SEEK_CUR);      /* Needed due to bug in MS fread()! */
        filePos = _lseek(fileno(f), 0, SEEK_CUR);
        for (i=0; i < numKeys; i++)
        {
            grab(SymLen + PatLen);

            printf("%16s ", buf);
            for (j=0; j < PatLen; j++)
            {
                printf("%02X", buf[SymLen+j]);
                if ((j%4) == 3) printf(" ");
            }
            printf("\n");
        }
        printf("\n\n\n");
        fseek(f, filePos, SEEK_SET);
    }

    for (i=0; i < numKeys; i++)
    {
        grab(SymLen + PatLen);

        h = hash(&buf[SymLen]);
        if (h != i)
        {
            printf("Symbol %16s (index %3d) hashed to %d\n",
                buf, i, h);
        }
    }

    printf("Done!\n");
    fclose(f);

}


void
cleanup(void)
{
    /* Free the storage for variable sized tables etc */
    if (T1base) free(T1base);
    if (T2base) free(T2base);
    if (g) free(g);
}

void grab(int n)
{
    if (fread(buf, 1, n, f) != (size_t)n)
    {
        printf("Could not read\n");
        exit(11);
    }
}

word
readFileShort(void)
{
    byte b1, b2;

    if (fread(&b1, 1, 1, f) != 1)
    {
        printf("Could not read\n");
        exit(11);
    }
    if (fread(&b2, 1, 1, f) != 1)
    {
        printf("Could not read\n");
        exit(11);
    }
    return (b2 << 8) + b1;
}

/* Following two functions not needed unless creating tables */

void getKey(int i, byte **keys)
{
}

/* Display key i */
void
dispKey(int i)
{
}

11
tools/readsig/readsig.mak
Normal file
@@ -0,0 +1,11 @@
CFLAGS = -Zi -c -AL -W3 -D__MSDOS__

readsig.exe: readsig.obj perfhlib.obj
	link /CO readsig perfhlib;

readsig.obj: readsig.c dcc.h perfhlib.h
	cl $(CFLAGS) $*.c

perfhlib.obj: perfhlib.c dcc.h perfhlib.h
	cl $(CFLAGS) $*.c

97
tools/readsig/readsig.txt
Normal file
@@ -0,0 +1,97 @@
READSIG

1 What is ReadSig?

2 How do I use ReadSig?

3 What are duplicate signatures?

4 How can I make sense of the signatures?


1 What is ReadSig?
------------------

ReadSig is a quick and dirty program to read signatures from a DCC
signature file. It was originally written as an integrity checker for
signature files, but can now be used to see what's in a signature
file, and which functions have duplicate signatures.

2 How do I use ReadSig?
-----------------------

Just type

    readsig <sigfilename>

or

    readsig -a <sigfilename>

For example:

    readsig -a dcct2p.sig

Either way, you get a list of duplicate signatures, i.e. functions
whose first 23 bytes, after wildcarding and chopping (see section 3
for details), are identical.

With the -a switch, you also (before the above) get a list of all
symbolic names in the signature file, and the signatures themselves
in hex. This could be a dozen or more pages for large signature
files.

Currently, signatures are 23 bytes long, and the symbolic names are
truncated to 15 characters.
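
The listing produced by the -a switch can be pictured with the
following C sketch, which formats one entry the way readsig.cpp prints
it: the name right-justified in 16 columns, then the pattern in hex
with a space after every 4 bytes. The function name format_sig_entry
and the use of an output buffer are assumptions made for this example;
ReadSig itself prints directly with printf.

```c
#include <stdio.h>

/* Format one entry (symLen name bytes, NUL terminated, followed by
   patLen pattern bytes) into out. Returns the number of characters
   written, or -1 if out is too small. */
int format_sig_entry(const unsigned char *entry, int symLen, int patLen,
                     char *out, size_t outSize)
{
    size_t used;
    int j, n;

    n = snprintf(out, outSize, "%16s ", (const char *)entry);
    if (n < 0 || (size_t)n >= outSize)
        return -1;
    used = (size_t)n;

    for (j = 0; j < patLen; j++)
    {
        /* Two hex digits per byte, plus a space after every 4th byte */
        n = snprintf(out + used, outSize - used, "%02X%s",
                     entry[symLen + j], (j % 4) == 3 ? " " : "");
        if (n < 0 || (size_t)n >= outSize - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

For a name of "strcmp" and the 5 pattern bytes 55 8B EC F4 F4, this
yields the name padded to 16 columns followed by "558BECF4 F4".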


3 What are duplicate signatures?
--------------------------------

Duplicate signatures arise for 3 reasons. 1: length of the signature.
2: wildcards. 3: chopping of the signature.

1: Because signatures are only 23 bytes long, there is a chance that
two distinct functions have binary images that are identical in the
first 23 bytes, but diverge later.

2: Because part of the binary image of a function depends on where it
is loaded, parts of the signature are replaced with wildcards. It is
possible that two functions are distinct only in places that are
replaced by the wildcard byte (F4).

3: Signatures are "chopped" (cut short, and the remainder filled with
binary zeroes) after an unconditional branch or subroutine return.
This is to cope with functions shorter than the 23 byte size of
signatures, so unrelated functions are not included at the end of a
signature. (Otherwise, dcc would fail to recognise these short
functions whenever some other function happened to be loaded after
them.)

The effect of duplicate signatures is that only one of the functions
that has the same signature will be recognised. For example, suppose
that sin, cos, and tan were just one wildcarded instruction followed
by a jump to the same piece of code. Then all three would have the
same signature, and calls to sin, cos, or tan would all be reported
by dcc as just one of these, e.g. tan. If you suspect that this is
happening, then at least ReadSig can alert you to this problem.

In general, the number of duplicate signatures that would actually be
used in dcc is small, but it is possible that the above problem will
occur.
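
To make reasons 2 and 3 concrete, the following C sketch applies
wildcards and chopping to two 23 byte signatures and then compares
them. It is a simplified model, not dcc's code: the function names are
hypothetical, and the wildcard offsets and the chop position are
supplied by the caller, whereas the real tools derive them from the
instructions.

```c
#include <string.h>

#define PATLEN   23     /* signature length given in the text */
#define WILDCARD 0xF4   /* wildcard byte given in the text    */

/* Replace the bytes at the given offsets with the wildcard byte */
void wildcard_bytes(unsigned char sig[PATLEN], const int *offsets, int n)
{
    int i;
    for (i = 0; i < n; i++)
        if (offsets[i] >= 0 && offsets[i] < PATLEN)
            sig[offsets[i]] = WILDCARD;
}

/* Keep the first `keep` bytes and fill the remainder with zeroes */
void chop(unsigned char sig[PATLEN], int keep)
{
    if (keep < 0)
        keep = 0;
    if (keep < PATLEN)
        memset(sig + keep, 0, (size_t)(PATLEN - keep));
}

/* Two processed signatures are duplicates if all 23 bytes match */
int is_duplicate(const unsigned char a[PATLEN],
                 const unsigned char b[PATLEN])
{
    return memcmp(a, b, PATLEN) == 0;
}
```

For instance, two functions that each consist of "mov ax, <imm16>"
followed by "jmp <rel16>" (B8 xx xx E9 xx xx) differ only in bytes 1,
2, 4 and 5 and in whatever follows the jump. After wildcarding those
four offsets and chopping after byte 6, is_duplicate reports them as
identical, which is the sin/cos/tan situation described above.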


4 How can I make sense of the signatures?
-----------------------------------------

If you're one of those unfortunate individuals that can't decode hex
instructions in your head, you can always use DispSig to copy a
signature to a binary file, since you now know the name of the
function. Then you can use debug or some other debugger to
disassemble the binary file. Generally, most entries in signature
files will be executable code, so it should disassemble readily.

Be aware that signatures are wildcarded, so don't pay any attention
to the destination of jmp or call instructions (3 or 5 byte
jumps, anyway; 2 byte jumps are not wildcarded), or to 16 bit
immediate values. The latter will always be F4F4 (two wildcard bytes),
regardless of what they were in the original function.