dcc/tools/makedsig/makedsig.txt
nemerle a697ad05c0 Add original dcc tools to repository
* makedsig has been integrated with makedstp, it should handle both LIB and TPL files
* other tools have not been modified
2015-02-10 17:28:50 +01:00


MAKEDSIG
1 What is MakeDsig?
2 How does it work?
3 How do I use MakeDsig?
4 What's in a signature file?
5 What other tools are useful for signature work?
1 What is MakeDsig?
-------------------
MakeDsig is a program that reads a library (.lib) file from a
compiler, and generates a signature file for use by dcc. Without
signature files, dcc cannot recognise library functions, so it
attempts to decompile them and cannot name them. This makes the
resulting decompiled code bulkier and harder to understand.
2 How does it work?
-------------------
Library files contain complete functions, relocation information,
function names, and more. MakeDsig reads a library file, and for each
function found, it saves the name, and creates a signature. These
are stored in an array. When all functions are done, tables for the
perfect hashing function are generated. During this process,
duplicate keys (functions that produce identical signatures) may be
detected; if so, one of the keys will be zeroed.
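The duplicate handling described above can be sketched as follows.
This is only an illustration, not the code used by MakeDsig: the real
tool detects duplicates while generating the hash tables, whereas this
sketch uses a plain pairwise comparison. The name zero_duplicates, the
pattern length DEMO_PATLEN, and the choice to zero the later of two
identical patterns are all assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_PATLEN 8   /* hypothetical pattern length, for illustration only */

/* Return 1 if the pattern consists entirely of zero bytes. */
static int is_zero(const unsigned char *p)
{
    for (size_t i = 0; i < DEMO_PATLEN; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/* Zero every pattern that repeats an earlier one (assumption: the later
   copy is the one that is zeroed). Returns the number of patterns zeroed. */
int zero_duplicates(unsigned char pats[][DEMO_PATLEN], size_t count)
{
    int zeroed = 0;
    for (size_t i = 0; i < count; i++) {
        if (is_zero(pats[i]))
            continue;               /* already zeroed, nothing to compare */
        for (size_t j = i + 1; j < count; j++) {
            if (memcmp(pats[i], pats[j], DEMO_PATLEN) == 0) {
                memset(pats[j], 0, DEMO_PATLEN);
                zeroed++;
            }
        }
    }
    return zeroed;
}
```

A zeroed key can no longer be matched against code from an executable,
so only one of the identically-signed functions keeps its name.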
The signature file contains the symbols and signatures, together with
the information dcc needs to hash signatures. The signatures, not the
symbols, are hashed, since dcc gets a signature from the executable
file, and needs to know quickly if there is a symbolic name for it.
3 How do I use MakeDsig?
------------------------
Running MakeDsig with no arguments shows how to use it; MakeDsig -h
gives more details.
Basically, you just give it the names of the files that it needs:
    MakeDsig <libname> <signame>
It will ask you for a seed; enter any number, e.g. 1.
You need the library file for the appropriate compiler. For example,
to analyse executable programs created from Turbo C 2.1 small model,
you need the cs.lib file that comes with that compiler.
You also need to know the correct name for the signature file, i.e.
<signame>. Dcc will detect certain compiler vendors and version
numbers, and will look for a signature file named like this:
    dcc<vendor><version><model>.sig
Here are the current vendors:
    Vendor               Vendor letter
    Microsoft C/C++      m
    Borland C/C++        b
    Logitech (Modula)    l
    Turbo Pascal         t
Here are the model codes:
    small/tiny           s
    medium               m
    compact              c
    large                l
    Turbo Pascal         p
The version codes are fairly self-explanatory:
    Microsoft C 5.1      5
    Microsoft C 8        8
    Borland C 2.0        2
    Borland C 3.0        3
    Turbo Pascal 3.0     3    Note: currently no way to make dcct3p.sig
    Turbo Pascal 4.0     4    Use Makedstp, not makedsig
    Turbo Pascal 5.0     5    Use Makedstp, not makedsig
Some examples: the signature file for Borland C version 2.0, small
model, is dccb2s.sig. It is generated from the cs.lib library file
that came with that compiler. Suppose that file is in the \bc\lib
directory. To generate the signature file required to work with files
produced by this compiler, you would type
    makedsig \bc\lib\cs.lib dccb2s.sig
This will create dccb2s.sig in the current directory. For dcc to use
this file, place it in the same directory as dcc itself, or point the
environment variable DCC to the directory containing it.
Another example: to make the signature file for Microsoft Visual
C/C++ (C 8.0), large model, and assuming the libraries are in
the directory \msvc\lib, you would type
makedsig \msvc\lib\llibce.lib dccm8l.sig
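The naming scheme described above can be sketched in C as follows.
This is an illustration only and is not taken from the dcc source; the
function name sig_file_name and its parameters are hypothetical. It
simply joins the vendor, version and model letters from the tables
above into a file name.

```c
#include <stdio.h>
#include <stddef.h>

/* Build "dcc<vendor><version><model>.sig" into buf.
   Returns 0 on success, -1 if the buffer is too small.
   (Hypothetical helper, not part of dcc.) */
int sig_file_name(char *buf, size_t size, char vendor, char version, char model)
{
    int n = snprintf(buf, size, "dcc%c%c%c.sig", vendor, version, model);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For example, sig_file_name(name, sizeof name, 'b', '2', 's') fills name
with dccb2s.sig, and the letters 'm', '8', 'l' give dccm8l.sig.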
Note that the signature files for Turbo Pascal from version 4 onwards
are generated by makedstp, not makedsig. Makedstp reads a special
file called turbo.tpl, as there are no normal .lib files for Turbo
Pascal. Dcc will recognise Turbo Pascal 3.0 files, and look for
dcct3p.sig. Because all the library routines are contained in every
Turbo Pascal 3.0 executable, there are no library files or even a
turbo.tpl file for that version, so the signature file would have to
be constructed by guesswork. You can still use dcc on these files;
just ignore the warning about not finding the signature file.
For executables that dcc does not recognise, it will look for the
signature file dccxxx.sig. That way, if you have a new compiler, you
can at least have dcc detect library calls, even if it attempts to
decompile them all, and has not identified the main program.
Logitech Modula V1.0 files are recognised, and the signature file
dccl1x.sig is looked for. This was experimental in nature, and is not
recommended for serious analysis at this stage.
4 What's in a signature file?
-----------------------------
The details of a signature file are best documented in the source for
makedsig; see the function saveFile(). Briefly:
1) a 4 byte pattern identifying the file as a signature file: "dccs".
2) a two byte integer containing the number of keys (signatures)
3) a two byte integer containing the number of vertices on the graph
used to generate the hash table. See the source code and/or the
Czech, Havas and Majewski articles for details
4) a two byte integer containing the pattern length
5) a two byte integer containing the symbolic name length
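The header fields listed above can be read with a few lines of C. The
sketch below is not the code from makedsig or dcc; the struct and
function names are hypothetical, and it assumes the two byte integers
are stored little-endian (low byte first), which this document does
not state. Check saveFile() in the source before relying on it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of the signature file header. */
struct SigHeader {
    unsigned numKeys;   /* number of keys (signatures)       */
    unsigned numVert;   /* number of vertices on the graph   */
    unsigned patLen;    /* pattern length                    */
    unsigned symLen;    /* symbolic name length              */
};

/* Assumption: two byte integers are little-endian. */
static unsigned read_u16le(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Parse the 12 byte header at the start of buf.
   Returns 0 on success, -1 if the data is too short or the
   "dccs" identifier is missing. */
int parse_sig_header(const unsigned char *buf, size_t len, struct SigHeader *out)
{
    if (len < 12 || memcmp(buf, "dccs", 4) != 0)
        return -1;
    out->numKeys = read_u16le(buf + 4);
    out->numVert = read_u16le(buf + 6);
    out->patLen  = read_u16le(buf + 8);
    out->symLen  = read_u16le(buf + 10);
    return 0;
}
```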
The next sections all have the following structure:
1) 2 char ID
2) a two byte integer containing the size of the body
3) the body.
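The ID / size / body layout above lends itself to a simple loop that
steps from one section to the next. The following is a sketch under
assumptions, not dcc's own reader: the names SigSection and
next_section are hypothetical, the size is assumed to be little-endian,
and it is assumed to count the bytes of the body only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical view of one section inside a loaded signature file. */
struct SigSection {
    char id[3];                  /* two character ID plus a terminating NUL */
    unsigned size;               /* size of the body (assumed: in bytes)    */
    const unsigned char *body;   /* points into the caller's buffer         */
};

/* Read the section starting at buf[*pos] and advance *pos past it.
   Returns 0 on success, -1 if the data is truncated (*pos is then
   left unchanged). */
int next_section(const unsigned char *buf, size_t len, size_t *pos,
                 struct SigSection *out)
{
    if (*pos > len || len - *pos < 4)
        return -1;
    const unsigned char *p = buf + *pos;
    unsigned size = (unsigned)p[2] | ((unsigned)p[3] << 8); /* assumed LE */
    if (len - *pos - 4 < size)
        return -1;
    out->id[0] = (char)p[0];
    out->id[1] = (char)p[1];
    out->id[2] = '\0';
    out->size = size;
    out->body = p + 4;
    *pos += 4 + (size_t)size;
    return 0;
}
```

Calling next_section repeatedly, starting just after the header, would
visit each section in turn until it returns -1.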
There are 4 sections: "T1", "T2", "gg", and "ht". T1 and T2 are the
tables used by the hash function. (The hash function is a random
function, meaning that it involves tables.) "gg" is another table
associated with the graph needed by the perfect hashing function
algorithm. "ht" contains the actual hash table. The body of this
section is an array of records of this structure:
typedef struct _hashEntry
{
    char name[SYMLEN];      /* The symbol name */
    byte pat [PATLEN];      /* The pattern */
    word offset;            /* Offset (needed temporarily) */
} HASHENTRY;
This part of the signature file can be browsed with a binary dump
program; a PATLEN length signature will follow the (null padded)
symbol name. There are tools for searching signature files, e.g.
srchsig, dispsig, and readsig. See below.
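To show how T1, T2, gg and ht fit together, here is a toy lookup in
the general style of the Czech, Havas and Majewski scheme. It is a
sketch under assumptions, not the code in perfhlib.c: the table
layout (one row per pattern position), the tiny sizes, the table
contents and the names sig_hash and sig_lookup are all invented for
the example. The tables were worked out by hand so that the two toy
patterns hash to slots 0 and 1.

```c
#include <stddef.h>
#include <string.h>

/* Toy sizes, chosen only to keep the example small (assumptions). */
enum { TOY_PATLEN = 2, TOY_SETSIZE = 4, TOY_NUMVERT = 5,
       TOY_NUMKEYS = 2, TOY_SYMLEN = 8 };

typedef struct {
    char          name[TOY_SYMLEN];   /* symbol name, null padded */
    unsigned char pat[TOY_PATLEN];    /* the pattern              */
} ToyEntry;

/* One row per pattern position, one column per byte value (0..3 here). */
static const unsigned T1[TOY_PATLEN][TOY_SETSIZE] = {{0,1,2,3},{0,2,4,1}};
static const unsigned T2[TOY_PATLEN][TOY_SETSIZE] = {{1,3,0,2},{2,0,1,4}};
static const unsigned gg[TOY_NUMVERT] = {0,0,0,1,0};
static const ToyEntry ht[TOY_NUMKEYS] = {
    {"funcA", {1,2}},
    {"funcB", {3,0}},
};

/* Hash a pattern: two table-driven sums select two graph vertices,
   and the sum of their gg values selects the hash table slot.
   Every byte of pat must be below TOY_SETSIZE in this toy version. */
unsigned sig_hash(const unsigned char *pat)
{
    unsigned u = 0, v = 0;
    for (size_t i = 0; i < TOY_PATLEN; i++) {
        u += T1[i][pat[i]];
        v += T2[i][pat[i]];
    }
    return (gg[u % TOY_NUMVERT] + gg[v % TOY_NUMVERT]) % TOY_NUMKEYS;
}

/* Return the symbol name for pat, or NULL if it is not in the table.
   The stored pattern must be compared, because a pattern that was never
   a key still hashes to some slot. */
const char *sig_lookup(const unsigned char *pat)
{
    const ToyEntry *e = &ht[sig_hash(pat)];
    return memcmp(e->pat, pat, TOY_PATLEN) == 0 ? e->name : NULL;
}
```

With these tables, the pattern {1,2} is found as funcA and {3,0} as
funcB, while {0,0} lands on funcB's slot but fails the comparison, so
no name is returned.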
5 What other tools are useful for signature work?
-------------------------------------------------
Makedstp - makes signature files from turbo.tpl. Needed to make
signature files for Turbo Pascal version 4.0 and later.
SrchSig - tells you whether a given pattern exists in a signature
file, and gives its name. You need a binary file with the signature
in it, exactly the right length. This can most easily be done with
debug (comes with MS-DOS).
DispSig - given the name of a function, displays its signature, and
stores the signature into a binary file as well. (You can use this
file with srchsig on another signature file, if you want).
ReadSig - reads a signature file, checking for correct structure, and
displaying duplicate signatures. With the -a switch, it will display
all signatures, with their symbols.
The file perfhlib.c is used by several of these tools to do the work
of the perfect hashing functions. It could be used as part of other
tools that use signature files, or anywhere else a perfect hashing
function is needed.