128 lines
6.1 KiB
Markdown
128 lines
6.1 KiB
Markdown
I've fixed many issues in this codebase, among other things - memory reallocation during decompilation.
|
|
|
|
To reflect those fixes, I've edited the original readme a bit.
|
|
|
|
* * *
|
|
dcc Distribution
|
|
================
|
|
|
|
The code provided in this distribution is (C) by their authors:
|
|
- Cristina Cifuentes (most of dcc code)
|
|
- Mike van Emmerik (signatures and prototype code)
|
|
- Jeff Ledermann (some disassembly code)
|
|
|
|
and is provided "as is". Additional contributor list is available
|
|
[on GitHub](https://github.com/nemerle/dcc/graphs/contributors).
|
|
|
|
The following files are included in the dccoo.tar.gz distribution:
|
|
- dcc.zip (dcc.exe DOS program, 1995)
|
|
- dccsrc.zip (source code *.c, *.h for dcc, 1993-1994)
|
|
- dcc32.zip (dcc_oo.exe 32 bit console (Win95/Win-NT) program, 1997)
|
|
- dccsrcoo.zip (source code *.cpp, *.h for "oo" dcc, 1993-1997)
|
|
- dccbsig.zip (library signatures for Borland C compilers, 1994)
|
|
- dccmsig.zip (library signatures for Microsoft C compilers, 1994)
|
|
- dcctpsig.zip (library signatures for Turbo Pascal compilers, 1994)
|
|
- dcclibs.dat (prototype file for C headers, 1994)
|
|
- test.zip (sample test files: *.c *.exe *.b, 1993-1996)
|
|
- makedsig.zip (creates a .sig file from a .lib C file, 1994)
|
|
- makedstp.zip (creates a .sig file from a Pascal library file, 1994)
|
|
- readsig.zip (reads signatures in a .sig file, 1994)
|
|
- dispsrch.zip (displays the name of a function given a signature, 1994)
|
|
- parsehdr.zip (generates a prototype file (dcclibs.dat) from C *.h files, 1994)
|
|
|
|
Note that the dcc_oo.exe program (in dcc32.zip) is a 32 bit program,
|
|
so it won't work under Windows 3.1. Also, it is a console mode program,
|
|
meaning that it has to be run in the "Command Prompt" window (sometimes
|
|
known as the "Dos Box"). It is not a GUI program.
|
|
|
|
The following files are included in the test.zip file: fibo,
|
|
benchsho, benchlng, benchfn, benchmul, byteops, intops, longops,
|
|
max, testlong, matrixmu, strlen, dhamp.
|
|
The version of dcc included in this distribution (dccsrcoo.zip and
|
|
dcc32.exe) is a bit better than the first release, but it is still
|
|
broken in some cases, and we do not have the time to work in this
|
|
project at present so we cannot provide any changes.
|
|
Comments on individual files:
|
|
- fibo (fibonacci): the small model (fibos.exe) decompiles correctly,
|
|
the large model (fibol.exe) expects an extra argument for
|
|
`scanf()`. This argument is the segment and is not displayed.
|
|
- benchsho: the first `scanf()` takes loc0 as an argument. This is
|
|
part of a long variable, but dcc does not have any clue at that
|
|
stage that the stack offset pushed on the stack is to be used
|
|
as a long variable rather than an integer variable.
|
|
- benchlng: as part of the `main()` code, `LO(loc1) | HI(loc1)` should
|
|
be displayed instead of `loc3 | loc9`. These two integer variables
|
|
are equivalent to the one long loc1 variable.
|
|
- benchfn: see benchsho.
|
|
- benchmul: see benchsho.
|
|
- byteops: decompiles correctly.
|
|
- intops: the du analysis for `DIV` and `MOD` is broken. dcc currently
|
|
generates code for a long and an integer temporary register that
|
|
were used as part of the analysis.
|
|
- longops: decompiles correctly.
|
|
- max: decompiles correctly.
|
|
- testlong: this example decompiles correctly given the algorithms
|
|
implemented in dcc. However, it shows that when long variables
|
|
are defined and used as integers (or long) without giving dcc
|
|
any hint that this is happening, the variable will be treated as
|
|
two integer variables. This is due to the fact that the assembly
|
|
code is in terms of integer registers, and long registers are not
|
|
available in 80286, so a long variable is equivalent to two integer
|
|
registers. dcc only knows of this through idioms such as add two
|
|
long variables.
|
|
- matrixmu: decompiles correctly. Shows that arrays are not supported
|
|
in dcc.
|
|
- strlen: decompiles correctly. Shows that pointers are partially
|
|
supported by dcc.
|
|
- dhamp: this program has far more data types than what dcc recognizes
|
|
at present.
|
|
|
|
Our thanks to Gary Shaffstall for some debugging work. Current bugs
|
|
are:
|
|
- [ ] if the code generated in the one line is too long, the (static)
|
|
buffer used for that line is clobbered. Solution: make the buffer
|
|
larger (currently 200 chars).
|
|
- [ ] the large memory model problem & `scanf()`
|
|
- [ ] dcc's error message shows a p option available which doesn't
|
|
exist, and doesn't show an i option which exists.
|
|
- [x] there is a nasty problem whereby some arrays can get reallocated
|
|
to a new address, and some pointers can become invalid. This mainly
|
|
tends to happen to larger executable files. A major rewrite will
|
|
probably be required to fix this.
|
|
|
|
For more information refer to the thesis "Reverse Compilation
|
|
Techniques" by Cristina Cifuentes, Queensland University of
|
|
Technology, 1994, and the dcc home page:
|
|
http://www.it.uq.edu.au/groups/csm/dcc_readme.html
|
|
|
|
Please note that the executable version of dcc provided in this
|
|
distribution does not necessarily match the source code provided,
|
|
some changes were done without us keeping track of every change.
|
|
|
|
Using dcc
|
|
---------
|
|
|
|
Here is a very brief summary of switches for dcc:
|
|
|
|
* `a1`, `a2`: assembler output, before and after re-ordering of input code
|
|
* `c`: Attempt to follow control through indirect call instructions
|
|
* `i`: Enter interactive disassembler
|
|
* `m`: Memory map
|
|
* `s`: Statistics summary
|
|
* `v`, `V`: verbose (and Very verbose)
|
|
* `o` filename: Use filename as assembler output file
|
|
|
|
If dcc encounters illegal instructions, it will attempt to enter the so called
|
|
interactive disassembler. The idea of this was to allow commands to fix the
|
|
problem so that dcc could continue, but no such changes are implemented
|
|
as yet. (Note: the Unix versions do not have the interactive disassembler). If
|
|
you get into this, you can get out of it by pressing `^X` (control-X). Once dcc
|
|
has entered the interactive disassembler, however, there is little chance that
|
|
it will recover and produce useful output.
|
|
|
|
If dcc loads the signature file `dccxxx.sig`, this means that it has not
|
|
recognised the compiler library used. You can place the signatures in a
|
|
different direcory to where you are working if you set the DCC environment
|
|
variable to point to their path. Note that if dcc can't find its signature
|
|
files, it will be severely handicapped.
|