                              Linker Overview

-  The primary purpose of a linker is to produce a file which can be
   executed by an operating system.

-  In DOS, executable files come in three flavors:

                          "EXE" files
                          "COM" files
                          "SYS" files

-  The input to the linker comes from module(s) called translator
   modules.

-  Individual translator modules can be stored in "OBJ" files.  "OBJ"
   files are produced by language translators such as assemblers and
   compilers.

-  Libraries of translator modules can be stored in "LIB" files which
   are produced by a librarian from either "OBJ" files or other "LIB"
   files.

-  A linker can process translator modules from either "OBJ" or "LIB"
   files, but at least one module must have come from an "OBJ" file.

-  All translator modules from "OBJ" files will be included in the
   executable file, but a module from a "LIB" file is included only if
   it resolves an external.


                    Comparison of Executable Files

-  "EXE" files have a header which contains relocation and other
   information which permit "EXE" files to have multiple segments.
   This header sets "EXE" files apart from the other executable file
   types.  Programs are written starting at address 0 (relative to
   where the program is loaded in memory).

-  "COM" files have no header.  "COM" files are written starting at
   address 100H because the "COM" file is loaded immediately after the
   program segment prefix (PSP) which is 100H bytes long.  At the start
   of the program, the CS and DS segment registers point to the PSP.
   "EXE" files which have an empty relocation table can be converted to
   "COM" files by the "EXE2BIN" program.

-  "SYS" files are more closely related to "COM" files than "EXE"
   files.  "SYS" files are loaded once at DOS initialization time.
   Since there is no PSP, the programs are written starting at address
   0.  However, like "COM" files, "SYS" files must have an empty
   relocation table.



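                         Contents of Executable Files


    "EXE" files                                      "SYS" files
+------------------+                                +-----------+
|   "EXE" header   |                                |load module+--+
+------------------+                                +-----------+  |
| relocation table |                                               |
+------------------+                    +-----+                    |
|   load module    +--------------------+org 0+--------------------+
+------------------+                    +-----+



                        "COM" files
                       +-----------+                 +--------+
                       |load module+-----------------+org 100H|
                       +-----------+                 +--------+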


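-  Each of the above components is described in detail in the following
   slides.


                              "EXE" File Header
+------+--------------------------------------------------------------+
|Offset|                         Description                          |
+------+--------------------------------------------------------------+
|00-01 |The "EXE" file signature 4D5AH.                               |
+------+--------------------------------------------------------------+
|02-03 |Length of the "EXE" file modulo 512.                          |
+------+--------------------------------------------------------------+
|04-05 |Number of 512-byte pages.  If the last page is not full it is |
|      |still included in the count.                                  |
+------+--------------------------------------------------------------+
|06-07 |Number of relocation items.                                   |
+------+--------------------------------------------------------------+
|08-09 |Number of 16-byte paragraphs occupied by the "EXE" header and |
|      |relocation table.                                             |
+------+--------------------------------------------------------------+
|0A-0B |Number of paragraphs required immediately after load module.  |
|      |The linker computes the number of uninitialized bytes at the  |
|      |end of the load module.  Instead of writing these bytes to    |
|      |the "EXE" file, the linker sets this value to provide space   |
|      |for this data.                                                |
+------+--------------------------------------------------------------+
|0C-0D |Maximum number of paragraphs which may be required immediately|
|      |after the "EXE" file.  This value comes from the CPARMAXALLOC |
|      |switch.                                                       |
+------+--------------------------------------------------------------+
|0E-11 |Segment/Offset displacement into the load module of initial   |
|      |SS/SP (SS at 0E-0F, SP at 10-11).  The displacement is        |
|      |converted to an actual address by adding the base address of  |
|      |the load module.  Since there may be several stack segments   |
|      |in the translator modules, the linker uses the highest        |
|      |address of the largest segment with the stack attribute.      |
+------+--------------------------------------------------------------+
|12-13 |Word checksum computed as minus the sum of all the words in   |
|      |the file.  Overflows are ignored.  DOS will not validate the  |
|      |checksum if this value is set to zero.                        |
+------+--------------------------------------------------------------+
|14-17 |Offset/Segment displacement into the load module of the       |
|      |initial IP/CS.  The displacement is converted to an actual    |
|      |address by adding the base address of the load module.        |
+------+--------------------------------------------------------------+
|18-19 |Length of the "EXE" file header.  (Used to find the start of  |
|      |the relocation table.)                                        |
+------+--------------------------------------------------------------+
|1A-1B |Overlay number.  This is zero for the main program.           |
+------+--------------------------------------------------------------+
|1C-1D |Always 0001H.                                                 |
+------+--------------------------------------------------------------+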

Note:  "EXE" file header routines are in "EXECFILE.C" (handout page 9)
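
As a rough illustration of the header layout, the following C sketch
reads the fixed header fields from a byte buffer and derives the size
of the load module stored in the file.  The names used here
(exe_header, exe_parse_header, exe_file_length, exe_load_module_size)
are hypothetical and are not taken from "EXECFILE.C".  It assumes all
fields are little-endian words, with SS at 0E-0F, SP at 10-11, IP at
14-15 and CS at 16-17.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of the fixed part of an "EXE" header. */
struct exe_header {
    uint16_t length_mod_512;  /* 02-03 */
    uint16_t page_count;      /* 04-05 */
    uint16_t reloc_count;     /* 06-07 */
    uint16_t header_paras;    /* 08-09 */
    uint16_t min_alloc;       /* 0A-0B */
    uint16_t max_alloc;       /* 0C-0D */
    uint16_t initial_ss;      /* 0E-0F */
    uint16_t initial_sp;      /* 10-11 */
    uint16_t checksum;        /* 12-13 */
    uint16_t initial_ip;      /* 14-15 */
    uint16_t initial_cs;      /* 16-17 */
    uint16_t reloc_offset;    /* 18-19 */
    uint16_t overlay;         /* 1A-1B */
};

/* Reads one little-endian word. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if the buffer is too short or the
   signature bytes 4DH 5AH are missing. */
int exe_parse_header(const uint8_t *buf, size_t len, struct exe_header *h)
{
    assert(buf != NULL && h != NULL);
    if (len < 0x1C || buf[0] != 0x4D || buf[1] != 0x5A)
        return -1;
    h->length_mod_512 = rd16(buf + 0x02);
    h->page_count     = rd16(buf + 0x04);
    h->reloc_count    = rd16(buf + 0x06);
    h->header_paras   = rd16(buf + 0x08);
    h->min_alloc      = rd16(buf + 0x0A);
    h->max_alloc      = rd16(buf + 0x0C);
    h->initial_ss     = rd16(buf + 0x0E);
    h->initial_sp     = rd16(buf + 0x10);
    h->checksum       = rd16(buf + 0x12);
    h->initial_ip     = rd16(buf + 0x14);
    h->initial_cs     = rd16(buf + 0x16);
    h->reloc_offset   = rd16(buf + 0x18);
    h->overlay        = rd16(buf + 0x1A);
    return 0;
}

/* File length implied by the page count and the length modulo 512. */
uint32_t exe_file_length(const struct exe_header *h)
{
    uint32_t len = (uint32_t)h->page_count * 512u;
    if (h->length_mod_512 != 0 && h->page_count != 0)
        len -= 512u - h->length_mod_512;   /* last page is partly used */
    return len;
}

/* Bytes of load module in the file: everything after the header
   and relocation table. */
uint32_t exe_load_module_size(const struct exe_header *h)
{
    return exe_file_length(h) - (uint32_t)h->header_paras * 16u;
}
```

For a file of 3 pages whose length modulo 512 is 16 and whose header
occupies 2 paragraphs, this gives a file length of 1040 bytes and a
load module of 1008 bytes.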


                               Relocation Table

The number of relocation items is specified at offset 06-07 in the
"EXE" file header.  The offset of the relocation table is located at
offset 18-19 in the "EXE" file header.  Usually this value is 001EH,
but can be larger if the "EXE" file header has been extended.

The relocation table can be viewed as an array of Offset/Segment
displacements into the load module.  For each address, the segment
portion of the base address of the load module is added to the word at
that displacement.  The only time a relocation item is needed is when a
fixup involves a segment.  As we will see later, this can only be
caused by base and pointer type fixups.

As an example of how a relocation item is generated, consider the
following:

                           foo      db      'a'
                           bar      dd      foo

Note that the contents of "bar" is the address of "foo", but the actual
address of "foo" is not known until the program is loaded in memory.
All that is known at link time is how far (i.e., the displacement) into
the load module "foo" and "bar" are located.  Only the displacement of
"foo" is stored at link time.  But, the linker makes a relocation entry
showing that "bar"+2 must be relocated.
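
The load-time half of this process can be sketched in C as follows.
This is only an illustration of the rule stated above, not the code
DOS actually uses.  The function name apply_relocations is
hypothetical, and each relocation item is assumed to be stored as a
little-endian offset word followed by a little-endian segment word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds base_seg to the word addressed by each relocation item.
   image/image_len describe the load module as loaded in memory.
   Returns the number of items applied, or -1 if an item points
   outside the image. */
int apply_relocations(uint8_t *image, size_t image_len,
                      const uint8_t *table, unsigned count,
                      uint16_t base_seg)
{
    assert(image != NULL && table != NULL);
    for (unsigned i = 0; i < count; i++) {
        const uint8_t *item = table + 4u * i;
        uint16_t off = (uint16_t)(item[0] | (item[1] << 8));
        uint16_t seg = (uint16_t)(item[2] | (item[3] << 8));
        size_t pos = (size_t)seg * 16u + off;  /* displacement in image */
        if (pos + 1 >= image_len)
            return -1;
        uint16_t word = (uint16_t)(image[pos] | (image[pos + 1] << 8));
        word = (uint16_t)(word + base_seg);    /* wraps modulo 65536 */
        image[pos]     = (uint8_t)(word & 0xFF);
        image[pos + 1] = (uint8_t)(word >> 8);
    }
    return (int)count;
}
```

With "foo" at displacement 0 and "bar" at displacement 1, the single
relocation item points at displacement 3 ("bar"+2).  Loading at base
segment 1000H turns the segment word of "bar" from 0000H into 1000H
and leaves the offset word alone.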



                             Load Module

By far, the generation of the load module is the trickiest part of
producing an executable file.  The contents of the load module are
generated from the contents of translator modules stored in "OBJ" and
"LIB" files.  Much of the structure of translator modules comes from
the assembler directives.  In turn, those directives are related to the
architecture of the 80x86 family of micros.

Here are some of the important directives:

SEGMENT/ENDS is used to give a logical name along with grouping and
     combining information for the segments.  The linker is responsible
     for arranging and combining these logical segments into the
     physical segments which comprise a load module.

EXTRN/PUBLIC is used to access data across translator modules.

A detailed description of the format of translator modules follows.


                      Terminology and Abbreviations

MAS - Memory Address Space:  The memory capable of being addressed by
    the hardware architecture.

T-MODULE - Translator Module:  This is a unit of object code produced
    by a language translator.  They may be stored individually in "OBJ"
    files or collections may be stored in "LIB" files.

FRAME:  A contiguous 64K chunk of MAS.

FRAME NUMBER:  Paragraph number where a FRAME begins.

CANONIC FRAME:  For the 8086, each byte of memory is encompassed by up
    to 4096 FRAMEs.  The CANONIC FRAME is the FRAME with the highest
    FRAME NUMBER which encompasses that byte, so the byte lies within
    the first 16 bytes of its CANONIC FRAME.

LSEG - Logical Segment:  Data and code between SEGMENT - ENDS
    directives.

PSEG - Physical Segment:  A collection of one or more LSEGs placed into
    a load module.



                            Fixup Overview

Not all references to MAS can be resolved at translation time.  This
can happen when one T-MODULE must access data located in another
T-MODULE.  When this happens, the language translator places an entry
in the T-MODULE so that the linker can complete the address reference.
Such entries in a T-MODULE are called "Fixups".

In order for the linker to complete the address reference, it needs
five pieces of information:

       -  The LOCATION in memory where the reference occurs.  This is
          the address which must be fixed up.

       -  The type of LOCATION in memory where the reference occurs.

       -  Whether the fixup is relative to the IP or not.  This is
          referred to as the fixup MODE.

       -  The TARGET address which LOCATION is referencing.

       -  The FRAME NUMBER in the segment register used to reference
          the TARGET address.


                            LOCATION Types

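There are five types of LOCATIONs.  They are POINTER, BASE, OFFSET,
HIBYTE, and LOBYTE.  The relative position and length of each LOCATION
type from a LOCATION X are given below:

             +---------+---------+---------+---------+
             |   X+0   |   X+1   |   X+2   |   X+3   |
             +---------+---------+---------+---------+

             |<LOBYTE->|

                       |<HIBYTE->|

             |<-----OFFSET------>|

                                 |<------BASE------->|

             |<--------------POINTER---------------->|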


                             Fixup Modes

There are two fixup modes.  Both modes stem from the architecture of
the 80x86 microprocessor family.  The TARGET may be addressed directly
via the offset/segment mechanism.  This fixup mode is called
"segment-relative".

The other way a TARGET may be addressed is relative to the IP.  For
example, the TARGETs of the 80x86 conditional jump instructions (e.g.,
JE, JC, JA, etc.) are all relative to the IP.  This fixup mode is
called "self-relative".  Note that the self-relative mode is relative
to the value of the IP when the instruction executes.  In all cases,
the IP points to the first byte of the instruction following the one
being executed.


                                TARGET

The TARGET is the location in MAS being referenced by LOCATION.  There
are four basic ways of specifying the TARGET:

               -  TARGET is specified relative to an LSEG.
               -  TARGET is specified relative to a group.
               -  TARGET is specified relative to an external symbol.
               -  TARGET is specified relative to an absolute FRAME.

Each of these has a primary method, which specifies a displacement, and
a secondary method, which does not (because the displacement is 0).

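                            Primary TARGET Methods
+------+-------------------------+-------------------------------------+
|Method|        Notation         |             Description             |
+------+-------------------------+-------------------------------------+
|  T0  |SI(segment),displacement |The TARGET is at the specified       |
|      |                         |displacement in the LSEG segment.    |
+------+-------------------------+-------------------------------------+
|  T1  |GI(group),displacement   |The TARGET is at the specified       |
|      |                         |displacement in the group.           |
+------+-------------------------+-------------------------------------+
|  T2  |EI(external),displacement|The TARGET is at the specified       |
|      |                         |displacement past the external.      |
+------+-------------------------+-------------------------------------+
|  T3  |FR(frame),displacement   |The TARGET is at the specified       |
|      |                         |displacement past FRAME NUMBER frame.|
+------+-------------------------+-------------------------------------+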


Example:

SI(foo),4 means the TARGET is 4 bytes into the LSEG foo.  Several
T-MODULES could have an LSEG named foo which may be combined into a
PSEG foo.  So, the final displacement in the PSEG foo may not be 4.
The linker must take this into consideration.

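                           Secondary TARGET Methods

     +------+------------+---------------------------------------------+
     |Method|  Notation  |                 Description                 |
     +------+------------+---------------------------------------------+
     |  T4  |SI(segment) |The TARGET is the base of the LSEG segment.  |
     +------+------------+---------------------------------------------+
     |  T5  |GI(group)   |The TARGET is the base of the group.         |
     +------+------------+---------------------------------------------+
     |  T6  |EI(external)|The TARGET is the specified external.        |
     +------+------------+---------------------------------------------+
     |  T7  |FR(frame)   |The TARGET is the specified frame.           |
     +------+------------+---------------------------------------------+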


                                  FRAME
 
The FRAME portion of a fixup specifies the FRAME NUMBER that will be
used as the frame of reference for LOCATION's reference to TARGET.
Typically, this frame of reference is one of the segment registers.
The FRAME NUMBER in the segment register is specified via the assembler
"ASSUME" directive.

Even if the fixup is self-relative, the TARGET must still be in the
FRAME given by the FRAME NUMBER in the segment register.  So, a FRAME
is required for both segment-relative fixups and self-relative fixups.

There are seven methods of specifying a FRAME:

                             A segment
                             A group
                             An external
                             An absolute FRAME NUMBER
                             LOCATION's FRAME
                             TARGET's FRAME
                             No FRAME specified


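                              FRAME Methods
+------+---------------------------------------------------------------+
|Method|                          Description                          |
+------+---------------------------------------------------------------+
|  F0  |The FRAME for the fixup is the CANONIC FRAME of the PSEG       |
|      |containing the LSEG.  (Since the fixup is generated at         |
|      |translation time, an LSEG is specified.)                       |
|      |Notation:  SI(segment)                                         |
+------+---------------------------------------------------------------+
|  F1  |The FRAME for the fixup is the CANONIC FRAME of the PSEG       |
|      |in the group located lowest in MAS.  A group is specified.     |
|      |Notation:  GI(group)                                           |
+------+---------------------------------------------------------------+
|  F2  |The FRAME for the fixup is specified by an external.           |
|      |Typically, the external is located in a T-MODULE different     |
|      |from the T-MODULE generating the fixup.                        |
|      |Notation:  EI(external)                                        |
+------+---------------------------------------------------------------+
|  F3  |The absolute FRAME NUMBER is specified.                        |
|      |Notation:  FR(FRAME)                                           |
+------+---------------------------------------------------------------+
|  F4  |The FRAME is the CANONIC FRAME of the PSEG containing LOCATION.|
|      |Notation:  LOCATION                                            |
+------+---------------------------------------------------------------+
|  F5  |The FRAME is determined by the TARGET.  Notation:  TARGET      |
+------+---------------------------------------------------------------+
|  F6  |No frame of reference specified.  Notation:  NONE              |
+------+---------------------------------------------------------------+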



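                          FRAME Method F2 cases

When FRAME method F2 is specified (FRAME is specified by an external),
there are three cases depending on how the external is defined:

+------+---------------------------------------------------------------+
|Method|                          Description                          |
+------+---------------------------------------------------------------+
|  F2a |If the external is defined relative to an LSEG which is not in |
|      |a group, then the FRAME is the CANONIC FRAME of the PSEG       |
|      |containing the LSEG.                                           |
+------+---------------------------------------------------------------+
|  F2b |If the external is defined absolutely and not in a group, then |
|      |the FRAME is the CANONIC FRAME of the external.                |
+------+---------------------------------------------------------------+
|  F2c |If the external is associated with a group, the FRAME is the   |
|      |CANONIC FRAME of the PSEG in the group located lowest in MAS.  |
+------+---------------------------------------------------------------+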



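                          FRAME Method F5 cases

When FRAME method F5 is specified (FRAME is specified by the TARGET),
there are four cases depending on how the TARGET was specified:

+------+---------------------------------------------------------------+
|Method|                          Description                          |
+------+---------------------------------------------------------------+
|  F5a |If the TARGET method is T0 or T4, then FRAME is the CANONIC    |
|      |FRAME of the PSEG containing TARGET.                           |
+------+---------------------------------------------------------------+
|  F5b |If the TARGET method is T1 or T5, then FRAME is the CANONIC    |
|      |FRAME of the PSEG in the same group as TARGET which is located |
|      |lowest in MAS.                                                 |
+------+---------------------------------------------------------------+
|  F5c |If the TARGET method is T2 or T6, then the FRAME is determined |
|      |by the rules given in FRAME method F2.                         |
+------+---------------------------------------------------------------+
|  F5d |If the TARGET method is T3 or T7, then the FRAME is the FRAME  |
|      |NUMBER specified by the TARGET.                                |
+------+---------------------------------------------------------------+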


                           Performing a Fixup

Regardless of the fixup mode (segment-relative or self-relative), the
first step in performing a fixup is to ensure that the TARGET is
addressable given the FRAME of reference.  That is, the TARGET must lie
between FRAME and FRAME+65535 inclusive.

                    FRAME <= TARGET <= FRAME+65535

If this is not the case, a warning is given.

After verifying that TARGET can be addressed by FRAME, how a fixup is
performed depends on the fixup mode.


                          Self-Relative Fixups

Self-relative fixups are permitted for LOBYTE and OFFSET type LOCATIONs
only.  If the LOCATION type is HIBYTE, BASE, or POINTER, no fixup is
performed and an error message is given.

For self-relative fixups, the value of the PC (CS:IP) when the
instruction is executed must be determined.  Then, the DISTANCE to the
TARGET from the PC is computed.  The formulae for DISTANCE and PC are
given below:

                 PC = LOCATION + 1 (If LOCATION type is LOBYTE)
                 PC = LOCATION + 2 (If LOCATION type is OFFSET)
                 DISTANCE = TARGET - PC

For LOBYTE locations, an error is issued when DISTANCE falls outside of
-128 <= DISTANCE <= 127.

The fixup is performed by adding DISTANCE to LOCATION.  For LOBYTE
LOCATIONs, DISTANCE is added modulo 256.  For OFFSET LOCATIONs,
DISTANCE is added modulo 65536.
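
The self-relative rules can be sketched in C as shown below.  This is
a minimal illustration under some assumptions:  the function name
fixup_self_relative and the enum loc_type are hypothetical (the enum
values are arbitrary, not the codes stored in a T-MODULE), and
LOCATION and TARGET are both given as byte displacements into the
same load module image.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical tags for the five LOCATION types. */
enum loc_type { LOC_LOBYTE, LOC_HIBYTE, LOC_OFFSET, LOC_BASE, LOC_POINTER };

/* Performs a self-relative fixup in place.
   Returns 0 on success, -1 on error (LOCATION type not permitted,
   or a LOBYTE DISTANCE outside -128..127). */
int fixup_self_relative(uint8_t *image, uint32_t location,
                        uint32_t target, enum loc_type type)
{
    assert(image != NULL);
    if (type == LOC_LOBYTE) {
        /* PC = LOCATION + 1 */
        int32_t distance = (int32_t)target - (int32_t)(location + 1);
        if (distance < -128 || distance > 127)
            return -1;
        image[location] = (uint8_t)(image[location] + distance);
        return 0;
    }
    if (type == LOC_OFFSET) {
        /* PC = LOCATION + 2 */
        int32_t distance = (int32_t)target - (int32_t)(location + 2);
        uint16_t word = (uint16_t)(image[location] |
                                   (image[location + 1] << 8));
        word = (uint16_t)(word + distance);     /* modulo 65536 */
        image[location]     = (uint8_t)(word & 0xFF);
        image[location + 1] = (uint8_t)(word >> 8);
        return 0;
    }
    return -1;   /* HIBYTE, BASE and POINTER are not permitted */
}
```

For example, a LOBYTE LOCATION at displacement 5 referring to a TARGET
at displacement 10 gives PC = 6 and DISTANCE = 4.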

                         Segment-Relative Fixups

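For segment-relative fixups, DISTANCE = TARGET - FRAME.  If DISTANCE
falls outside of 0 <= DISTANCE <= 65535, a warning is issued.  The
following table shows how to perform the fixup depending on LOCATION
type:
+--------+-------------------------------------------------------------+
|LOCATION|                           Action                            |
|  type  |                                                             |
+--------+-------------------------------------------------------------+
|LOBYTE  |DISTANCE is added (modulo 256) to low order byte at LOCATION.|
+--------+-------------------------------------------------------------+
|HIBYTE  |DISTANCE is added (modulo 256) to high order byte at LOCATION|
+--------+-------------------------------------------------------------+
|OFFSET  |DISTANCE is added (modulo 65536) to the word at LOCATION.    |
+--------+-------------------------------------------------------------+
|BASE    |FRAME is added (modulo 65536) to the word at LOCATION.       |
|        |A relocation item for the word at LOCATION is created.       |
+--------+-------------------------------------------------------------+
|POINTER |DISTANCE is added (modulo 65536) to the low order word at    |
|        |LOCATION.  FRAME is added (modulo 65536) to the high order   |
|        |word of the DWORD at LOCATION.  A relocation item for the    |
|        |high order word of the DWORD at LOCATION is created.         |
+--------+-------------------------------------------------------------+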
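
A C sketch of the word-sized cases (OFFSET, BASE and POINTER) follows.
It is an illustration only, with several assumptions:  the names
fixup_segment_relative, loc_type and NO_RELOC are hypothetical, TARGET
is a byte displacement into the load module, FRAME is passed as a
FRAME NUMBER (so its address is the FRAME NUMBER times 16), and a
relocation item is recorded only for words that receive a FRAME
NUMBER, as the relocation table discussion describes.  LOBYTE and
HIBYTE are left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_RELOC UINT32_MAX   /* "no relocation item needed" marker */

/* Hypothetical tags for the five LOCATION types. */
enum loc_type { LOC_LOBYTE, LOC_HIBYTE, LOC_OFFSET, LOC_BASE, LOC_POINTER };

/* Adds v (modulo 65536) to the little-endian word at p. */
static void add16(uint8_t *p, uint32_t v)
{
    uint16_t w = (uint16_t)(p[0] | (p[1] << 8));
    w = (uint16_t)(w + v);
    p[0] = (uint8_t)(w & 0xFF);
    p[1] = (uint8_t)(w >> 8);
}

/* Performs the fixup in place.  *reloc_at receives the displacement
   of the word needing a relocation item, or NO_RELOC.
   Returns 0 normally, 1 if DISTANCE was outside 0..65535 (warning),
   -1 for a LOCATION type this sketch does not handle. */
int fixup_segment_relative(uint8_t *image, uint32_t location,
                           uint32_t target, uint16_t frame_number,
                           enum loc_type type, uint32_t *reloc_at)
{
    assert(image != NULL && reloc_at != NULL);
    int64_t distance = (int64_t)target - (int64_t)frame_number * 16;
    int warn = (distance < 0 || distance > 65535);
    uint32_t d = (uint32_t)distance;

    *reloc_at = NO_RELOC;
    switch (type) {
    case LOC_OFFSET:
        add16(image + location, d);
        break;
    case LOC_BASE:
        add16(image + location, frame_number);
        *reloc_at = location;
        break;
    case LOC_POINTER:
        add16(image + location, d);                 /* offset word  */
        add16(image + location + 2, frame_number);  /* segment word */
        *reloc_at = location + 2;
        break;
    default:
        return -1;
    }
    return warn;
}
```

With a TARGET at displacement 25H and FRAME NUMBER 0002H, DISTANCE is
5.  A POINTER fixup then stores 0005H in the low word, 0002H in the
high word, and reports the high word as needing a relocation item.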



                      T-MODULE Record Format Basics

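All T-MODULE records have the following basic format:

                       +------+------+--------------------+-----+
                       |Record|Record|Information Specific|Check|
  Field Names -------->| Type |Length|   to Record Type   | Sum |
                       +------+------+--------------------+-----+
  Field Lengths ------>|  1   |  2   | Record Length - 1  |  1  |
     (bytes)           +------+------+--------------------+-----+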

Record Type --
  This one byte field identifies the type of T-MODULE record.

Record Length -- 
  This word contains the number of bytes in all following fields
  (including the checksum).

Information Specific to Record Type --
  This field contains the data for the specified Record Type.
 
Check Sum --
  This byte contains the negative of the sum of all the preceding bytes
  in the record.  Therefore, the sum of all the bytes in the record
  will be 0 (modulo 256).
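
The check sum rule can be sketched in C as follows.  The function
names record_checksum_ok and record_checksum are hypothetical.  Both
take a complete record starting at the Record Type byte, and all
arithmetic is done modulo 256.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if all bytes of a complete record (type, length,
   contents and check sum) add up to 0 modulo 256, else 0. */
int record_checksum_ok(const uint8_t *rec, size_t len)
{
    uint8_t sum = 0;
    assert(rec != NULL);
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + rec[i]);
    return sum == 0;
}

/* Computes the Check Sum byte for a record whose first
   len_without_checksum bytes are already filled in. */
uint8_t record_checksum(const uint8_t *rec, size_t len_without_checksum)
{
    uint8_t sum = 0;
    assert(rec != NULL);
    for (size_t i = 0; i < len_without_checksum; i++)
        sum = (uint8_t)(sum + rec[i]);
    return (uint8_t)(0u - sum);   /* negative of the sum, modulo 256 */
}
```

For the six bytes 80H 04H 00H 02H 'A' 'B' the sum is 09H (modulo 256),
so the Check Sum byte is F7H.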


                  T-MODULE Record Format -- Bit Fields

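Bit fields are denoted as follows:

                               +-----+-----+-----+
                               | Bit | Bit | Bit |
        Field Name ----------->|Field|Field|Field|
                               |  1  |  2  |  n  |
                               +-----+-----+-----+
        Field Length --------->|  4  |  1  |  3  |
           (bits)              +-----+-----+-----+
                               |      byte       |
                               +-----------------+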


                     T-MODULE Record Format -- INDEX Fields

There are three special kinds of fields in a T-MODULE record:  INDEX,
NAME, and VALUE fields.

"INDEX" fields:        +-----+
                       |INDEX|
                       +-----+
                       | 1-2 |
                       +-----+

An INDEX field is one or two bytes long.  If the high order bit of the
first byte of the INDEX is 0, then the INDEX is one byte long and the
value is the remaining 7 bits (0 - 127).  Otherwise, the INDEX is two
bytes long and the value of the INDEX is the low order 7 bits of the
first byte * 256 plus the second byte.
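
A small C sketch of this decoding rule is given below.  The function
name index_decode is hypothetical.  It returns the number of bytes the
INDEX occupied so that the caller can step to the next field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes the INDEX starting at p, where avail bytes remain in the
   record.  Stores the value in *value and returns the number of
   bytes consumed (1 or 2), or 0 if the INDEX is truncated. */
size_t index_decode(const uint8_t *p, size_t avail, uint16_t *value)
{
    assert(p != NULL && value != NULL);
    if (avail < 1)
        return 0;
    if ((p[0] & 0x80) == 0) {          /* high bit clear: one byte */
        *value = p[0];
        return 1;
    }
    if (avail < 2)
        return 0;
    /* high bit set: low 7 bits of first byte * 256 + second byte */
    *value = (uint16_t)(((p[0] & 0x7F) << 8) | p[1]);
    return 2;
}
```

For example, the single byte 05H decodes to 5, and the two bytes
81H 02H decode to 1 * 256 + 2 = 258.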


                  T-MODULE Record Format -- NAME Fields

The format of a "NAME" field is:
                              +------+-----------+
                              | NAME |   NAME    |
                              |Length|           |
                              +------+-----------+
                              |  1   |NAME Length|
                              +------+-----------+

Note:  NAMEs of 0 bytes are permitted.  The NAME is not NULL
terminated.


                 T-MODULE Record Format -- VALUE Fields

The format of "VALUE" fields is:
                               +----+------+
                               |Code|Number|
                               +----+------+
                               | 1  | 0-4  |
                               +----+------+

When 0 <= Code <= 128, the Number field is omitted and the VALUE is
Code.

When Code = 129, the Number field is 2 bytes long, and the VALUE is
Number.

When Code = 132, the Number field is 3 bytes long, and the VALUE is
Number.

When Code = 136, the Number field is 4 bytes long, and the VALUE is
Number.
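
The VALUE rules (Code of 0 to 128 inclusive stands for itself; Codes
129, 132 and 136 announce a Number of 2, 3 or 4 bytes) can be sketched
in C as shown below.  The function name value_decode is hypothetical,
and the Number field is assumed to be stored low byte first, which the
text above does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes the VALUE starting at p, where avail bytes remain.
   Stores the result in *value and returns the number of bytes
   consumed, or 0 if the Code is unknown or the field is truncated. */
size_t value_decode(const uint8_t *p, size_t avail, uint32_t *value)
{
    size_t n;

    assert(p != NULL && value != NULL);
    if (avail < 1)
        return 0;
    if (p[0] <= 128) {            /* Number omitted, VALUE is Code */
        *value = p[0];
        return 1;
    }
    switch (p[0]) {
    case 129: n = 2; break;
    case 132: n = 3; break;
    case 136: n = 4; break;
    default:  return 0;
    }
    if (avail < 1 + n)
        return 0;
    *value = 0;
    for (size_t i = 0; i < n; i++)            /* assumed little-endian */
        *value |= (uint32_t)p[1 + i] << (8 * i);
    return 1 + n;
}
```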


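                T-MODULE Record Format -- T-MODULE Header

                        +---+------+-------------+-----+
                        |   |Record|T-MODULE NAME|Check|
                        |80H|Length|             | Sum |
                        +---+------+-------------+-----+
                        | 1 |  2   |    NAME     |  1  |
                        +---+------+-------------+-----+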

This record type must be the first record in the T-MODULE, and it names
the T-MODULE.  Frequently, the T-MODULE NAME is the name of the source
file to the language translator.


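            T-MODULE Record Format -- List of NAMEs (LNAMEs)

                        +---+------+------------+-----+
                        |   |Record|            |Check|
                        |96H|Length|Logical NAME| Sum |
                        +---+------+------------+-----+
                        | 1 |  2   |    NAME    |  1  |
                        +---+------+------------+-----+
                                   +--repeated--+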

Each Logical NAME is entered into a "List of NAMES" (LNAME) in the
order they are encountered in type 96H records.  The list index starts
at 1 (0 means not specified).  There may be more than one type 96H
record in a T-MODULE.  When this occurs, append the Logical NAMEs to
the list.  The Logical NAME field is repeated.  The number of
repetitions is determined by the Record Length.
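
Building the LNAME list can be sketched in C as follows.  The names
lname_list, lnames_append, MAX_LNAMES and MAX_NAME are hypothetical,
and the fixed-size storage is only there to keep the example short.
The function is handed just the repeated Logical NAME fields of one
type 96H record (the bytes between the Record Length and the Check
Sum), and it may be called once per record to keep appending.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_LNAMES 16    /* arbitrary limit for this sketch */
#define MAX_NAME   255   /* a NAME Length byte cannot exceed 255 */

struct lname_list {
    unsigned count;      /* valid indexes are 1..count; 0 is unused */
    char name[MAX_LNAMES + 1][MAX_NAME + 1];
};

/* Appends every NAME in body[0..len) to the list, in order.
   Returns 0 on success, -1 on a truncated NAME or a full list. */
int lnames_append(struct lname_list *list, const uint8_t *body, size_t len)
{
    size_t pos = 0;

    assert(list != NULL && body != NULL);
    while (pos < len) {
        size_t n = body[pos++];               /* NAME Length */
        if (pos + n > len || list->count >= MAX_LNAMES)
            return -1;
        list->count++;
        memcpy(list->name[list->count], body + pos, n);
        list->name[list->count][n] = '\0';    /* terminate our copy */
        pos += n;
    }
    return 0;
}
```

The NAME in the record is not NULL terminated.  The terminator is
added only to the stored copy so that it can be used as a C string.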


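           T-MODULE Record Format -- LSEG Definition (SEGDEF)

           +---+------+---------+-------+-------+-----+-------+-----+
           |   |Record| Segment |Segment|Segment|Class|Overlay|Check|
           |98H|Length|Attribute|Length | INDEX |INDEX| INDEX | Sum |
           +---+------+---------+-------+-------+-----+-------+-----+
           | 1 |  2   |   1-4   |   2   | INDEX |INDEX| INDEX |  1  |
           +---+------+---------+-------+-------+-----+-------+-----+

Segment Attribute:  +----+------------+------+  ACBP:  +-+-+-+-+
                    |ACBP|FRAME NUMBER|Offset|         |A|C|B|P|
                    +----+------------+------+         +-+-+-+-+
                    | 1  |     2      |  1   |         |3|3|1|1|
                    +----+------------+------+         +-+-+-+-+
                    |    |    conditional    |         | byte  |
                    +----+-------------------+         +-------+
                                                Note:  ACBP.P must be 0.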

There is one SEGDEF record for each LSEG in a T-MODULE.  Like the LNAME
list, this forms a list of LSEGs (indexed from 1).
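
Splitting the ACBP byte into its bit fields can be sketched in C as
shown below.  The names acbp, acbp_decode and segment_attribute_size
are hypothetical.  The sketch assumes the fields are laid out from the
high order bit down in the order shown in the diagram (A in bits 7-5,
C in bits 4-2, B in bit 1, P in bit 0), and that the conditional FRAME
NUMBER and Offset fields are present only for an absolute segment
(A = 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct acbp {
    unsigned a;   /* alignment (3 bits)   */
    unsigned c;   /* combination (3 bits) */
    unsigned b;   /* big (1 bit)          */
    unsigned p;   /* must be 0 (1 bit)    */
};

/* Splits an ACBP byte into its four fields. */
void acbp_decode(uint8_t byte, struct acbp *out)
{
    assert(out != NULL);
    out->a = (byte >> 5) & 0x7;
    out->c = (byte >> 2) & 0x7;
    out->b = (byte >> 1) & 0x1;
    out->p = byte & 0x1;
}

/* Size in bytes of the whole Segment Attribute field: the ACBP byte
   alone, or ACBP + FRAME NUMBER (2) + Offset (1) when A is 0. */
size_t segment_attribute_size(uint8_t acbp_byte)
{
    struct acbp f;
    acbp_decode(acbp_byte, &f);
    return f.a == 0 ? 4 : 1;
}
```

For example, the byte 48H decodes to A = 2, C = 2, B = 0, P = 0, and
its Segment Attribute field is one byte long.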


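                 T-MODULE Record Format -- SEGDEF.ACBP.A

The following table gives the meaning of the A field of the ACBP field:

+-+--------------------------------------------------------------------+
|A|                            Description                             |
+-+--------------------------------------------------------------------+
|0|This is an absolute segment.  The FRAME NUMBER and Offset fields of |
| |the Segment Attribute field will be present.                        |
+-+--------------------------------------------------------------------+
|1|This is a relocatable, byte-aligned LSEG.                           |
+-+--------------------------------------------------------------------+
|2|This is a relocatable, word-aligned LSEG.                           |
+-+--------------------------------------------------------------------+
|3|This is a relocatable, paragraph-aligned LSEG.                      |
+-+--------------------------------------------------------------------+
|4|This is a relocatable, page-aligned LSEG.                           |
+-+--------------------------------------------------------------------+
|5|This is a relocatable, DWORD-aligned LSEG.                          |
+-+--------------------------------------------------------------------+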



                     T-MODULE Record Format -- SEGDEF.ACBP.C

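The following table gives the meaning of the C field of the ACBP field:
+-+--------------------------------------------------------------------+
|C|                            Description                             |
+-+--------------------------------------------------------------------+
|0|The LSEG is private and may not be combined.                        |
+-+--------------------------------------------------------------------+
|1|Undefined.                                                          |
+-+--------------------------------------------------------------------+
|2|The LSEG is public and may be combined with other LSEGs of the same |
| |name.                                                               |
+-+--------------------------------------------------------------------+
|3|Undefined.                                                          |
+-+--------------------------------------------------------------------+
|4|The LSEG is public and may be combined with other LSEGs of the same |
| |name.                                                               |
+-+--------------------------------------------------------------------+
|5|The LSEG is a stack segment and may be combined with other LSEGs of |
| |the same name.                                                      |
+-+--------------------------------------------------------------------+
|6|The LSEG is a common segment and must be combined with other LSEGs  |
| |of the same name.                                                   |
+-+--------------------------------------------------------------------+
|7|The LSEG is public and may be combined with other LSEGs of the same |
| |name.                                                               |
+-+--------------------------------------------------------------------+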



               T-MODULE Record Format -- Notes on Combining

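LSEGs which can be combined and are not common are combined as follows:

                      +---------------------------------+
                      |LSEG data for first T-MODULE     |
                      +---------------------------------+
                      |Alignment Gap for second T-MODULE|
                      +---------------------------------+
                      |LSEG data for second T-MODULE    |
                      +---------------------------------+
                      |               ///               |
                      +---------------------------------+
                      |Alignment Gap for last T-MODULE  |
                      +---------------------------------+
                      |LSEG data for last T-MODULE      |
                      +---------------------------------+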

The resultant PSEG is the combined length of the above.

For common LSEGs, the length of the PSEG is the length of the largest
LSEG.  Therefore, the length of the PSEG cannot be determined until all
the T-MODULEs are processed.  For this reason, data cannot be loaded
into common segments until fixup time.
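
The two length rules can be sketched in C as follows.  The names lseg,
combine_lsegs and combine_common are hypothetical, and each LSEG's
alignment is assumed to be given directly in bytes (1 for byte, 2 for
word, 16 for paragraph, and so on) rather than as an ACBP.A code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct lseg {
    uint32_t length;   /* bytes of data in this LSEG         */
    uint32_t align;    /* required alignment in bytes (>= 1) */
};

/* Concatenates LSEGs in order, inserting an alignment gap before
   each one where needed.  Stores each LSEG's offset within the PSEG
   in offsets[] and returns the length of the resulting PSEG. */
uint32_t combine_lsegs(const struct lseg *lsegs, size_t n,
                       uint32_t *offsets)
{
    uint32_t end = 0;

    assert(lsegs != NULL && offsets != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t a = lsegs[i].align ? lsegs[i].align : 1;
        uint32_t start = (end + a - 1) / a * a;   /* round up */
        offsets[i] = start;
        end = start + lsegs[i].length;
    }
    return end;
}

/* For common LSEGs, all copies overlay each other, so the PSEG is
   as long as the largest LSEG. */
uint32_t combine_common(const struct lseg *lsegs, size_t n)
{
    uint32_t longest = 0;

    assert(lsegs != NULL);
    for (size_t i = 0; i < n; i++)
        if (lsegs[i].length > longest)
            longest = lsegs[i].length;
    return longest;
}
```

Three LSEGs of 5 bytes (byte-aligned), 3 bytes (word-aligned) and 10
bytes (paragraph-aligned) land at offsets 0, 6 and 16, giving a PSEG
of 26 bytes.  Treated as common LSEGs, the PSEG would be 10 bytes.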


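               T-MODULE Record Format -- LSEG Definition (SEGDEF)

           +---+------+---------+-------+-------+-----+-------+-----+
           |   |Record| Segment |Segment|Segment|Class|Overlay|Check|
           |98H|Length|Attribute|Length | INDEX |INDEX| INDEX | Sum |
           +---+------+---------+-------+-------+-----+-------+-----+
           | 1 |  2   |   1-4   |   2   | INDEX |INDEX| INDEX |  1  |
           +---+------+---------+-------+-------+-----+-------+-----+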

Segment Length --
  If ACBP.B is 0, then this is the length of the LSEG.  If ACBP.B is 1,
  then the Segment Length must be 0 and the length of the LSEG is
  65536 bytes.

Segment INDEX --
  This is an INDEX into the LNAME list.  LNAME[Segment INDEX] is the
  name of the LSEG.

Class INDEX --
  This is an INDEX into the LNAME list.  LNAME[Class INDEX] is the name
  of the class of the LSEG.

Overlay INDEX --
  This is an INDEX into the LNAME list.  LNAME[Overlay INDEX] is the
  name of the overlay of the LSEG.


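               T-MODULE Record Format -- Group Definition (GRPDEF)

                    +---+------+----------+---+-----+-----+
                    |   |Record|Group NAME|   |LSEG |Check|
                    |9AH|Length|  INDEX   |FFH|INDEX| Sum |
                    +---+------+----------+---+-----+-----+
                    | 1 |  2   |  INDEX   | 1 |INDEX|  1  |
                    +---+------+----------+---+-----+-----+
                                          +repeated-+

Like the LNAME list and SEGDEF list, the GRPDEFs form a list (indexed
relative to 1) of groups.  There is one GRPDEF record for each group in
the T-MODULE.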

Group NAME INDEX --
  LNAME[Group NAME INDEX] is the name of the group.

LSEG INDEX --
  This field is repeated once for each LSEG in the group.  The LSEG
  INDEX is the INDEX into the LSEG list which corresponds to the LSEG
  in the group.


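              T-MODULE Record Format -- Public Definition (PUBDEF)

           +---+------+-----+-------+------+------+------+-----+-----+
           |   |Record|Group|Segment|FRAME |Public|Public|Type |Check|
           |90H|Length|INDEX| INDEX |NUMBER| NAME |Offset|INDEX| Sum |
           +---+------+-----+-------+------+------+------+-----+-----+
           | 1 |  2   |INDEX| INDEX | 0-2  | NAME |  2   |INDEX|  1  |
           +---+------+-----+-------+------+------+------+-----+-----+
                                           +------repeated-----+
Group INDEX --
  If the public(s) are defined in an LSEG which is part of a group,
  then this is the index into the group list.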

Segment INDEX --
  If the public(s) are defined in an LSEG, this is the index into the
  LSEG list.

FRAME NUMBER --
  This field is only present if the public(s) are absolute (indicated
  by both Group INDEX and Segment INDEX being zero).  When present,
  this is the FRAME NUMBER used to reference the public(s).


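              T-MODULE Record Format -- Public Definition (PUBDEF)

           +---+------+-----+-------+------+------+------+-----+-----+
           |   |Record|Group|Segment|FRAME |Public|Public|Type |Check|
           |90H|Length|INDEX| INDEX |NUMBER| NAME |Offset|INDEX| Sum |
           +---+------+-----+-------+------+------+------+-----+-----+
           | 1 |  2   |INDEX| INDEX | 0-2  | NAME |  2   |INDEX|  1  |
           +---+------+-----+-------+------+------+------+-----+-----+
                                           +------repeated-----+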
Public NAME --
  This is the name of the public.

Public Offset --
  This is the distance of the start of the public from the group, LSEG
  or FRAME.

Type INDEX --
  This is ignored.


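             T-MODULE Record Format -- External Definition (EXTDEF)

                        +---+------+--------+-----+-----+
                        |   |Record|External|Type |Check|
                        |8CH|Length|  NAME  |INDEX| Sum |
                        +---+------+--------+-----+-----+
                        | 1 |  2   |  NAME  |INDEX|  1  |
                        +---+------+--------+-----+-----+
                                   +---repeated---+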

Like the LNAMEs, SEGDEFs, and GRPDEFs, the EXTDEFs form a list (indexed
relative to 1) of the external names used in this T-MODULE.

External NAME --
  This is the name of the external public symbol.

Type INDEX --
  This is ignored.


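           T-MODULE Record Format -- Logical Enumerated Data (LEDATA)

                   +---+------+-------+------+--------+-----+
                   |   |Record|Segment|      |        |Check|
                   |A0H|Length| INDEX |Offset|  Data  | Sum |
                   +---+------+-------+------+--------+-----+
                   | 1 |  2   | INDEX |  2   |   1    |  1  |
                   +---+------+-------+------+--------+-----+
                                             +repeated+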

Segment INDEX --
  This data is to be loaded into the LSEG corresponding to SEGDEF list
  entry SEGDEF[Segment INDEX].

Offset --
  This data is to be loaded starting at this offset in the LSEG.

Data --
  The byte(s) to be loaded.  No more than 1024 can be loaded by an
  LEDATA record.


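            T-MODULE Record Format -- Logical Iterated Data (LIDATA)

                  +---+------+-------+------+----------+-----+
                  |   |Record|Segment|      | Iterated |Check|
                  |A2H|Length| INDEX |Offset|Data Block| Sum |
                  +---+------+-------+------+----------+-----+
                  | 1 |  2   | INDEX |  2   | variable |  1  |
                  +---+------+-------+------+----------+-----+
                                            +-repeated-+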

Segment INDEX --
  This data is to be loaded into the LSEG corresponding to SEGDEF list
  entry SEGDEF[Segment INDEX].

Offset --
  This data is to be loaded starting at this offset in the LSEG.

Iterated Data Block --
  This is (recursively) defined later, but the total size cannot exceed
  1024 bytes.


            T-MODULE Record Format -- Logical Iterated Data (LIDATA)


Iterated Data Block:  +------+-----+--------+
                      |Repeat|Block|        |
                      |Count |Count|Content |
                      +------+-----+--------+
                      |  2   |  2  |variable|
                      +------+-----+--------+

Repeat Count --
  If Block Count is zero, then Content is interpreted as a string of
  bytes of length Repeat Count.  If Block Count is non-zero, then this
  is the number of times the Content field is repeated.

Block Count --
  If this is zero, then the Content field is interpreted as a string of
  bytes of length Repeat Count.  If this is non-zero, then the Content
  field contains a string of Block Count Iterated Data Blocks.

Content --
  This is either a string of bytes as described above or it is a string
  of Iterated Data Blocks.


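                 T-MODULE Record Format -- Fixup Record (FIXUPP)

                           +---+------+--------+-----+
                           |   |Record| Thread |Check|
                           |9CH|Length|or Fixup| Sum |
                           +---+------+--------+-----+
                           | 1 |  2   |variable|  1  |
                           +---+------+--------+-----+
                                      +repeated+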

Thread or Fixup --
  This field can be either a Thread (high order bit is 0) or a Fixup
  (high order bit is 1).  A Thread stores a default TARGET or FRAME
  method; there are four TARGET threads and four FRAME threads.  A
  Fixup field specifies the five pieces of information (discussed
  earlier) necessary to perform a fixup.



                        T-MODULE Record Format -- Thread

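Thread:  +--------+---------+  Thread Data:  +---+---+---+--------+-------+
         | Thread | Thread  |                | 0 | D | 0 | Method | Thred |
         |  Data  |  INDEX  |                +---+---+---+--------+-------+
         +--------+---------+                | 1 | 1 | 1 |   3    |   2   |
         |   1    | 0-INDEX |                +---+---+---+--------+-------+
         +--------+---------+                |<---------- byte ---------->|
                                             +----------------------------+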

D --
  If D is zero then a TARGET thread is being specified, otherwise a
  FRAME thread is being specified.

Method --
  This is the TARGET or FRAME method.  For TARGET threads, only the
  four primary methods are specified.  All seven FRAME methods can be
  specified.

Thred --
  This is the TARGET or FRAME thread number being specified.

Thread INDEX --
  This is not present when F4, F5 or F6 is being specified.  In all
  other cases, this is either a Segment, Group or External index
  depending on the Method.
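
As a rough illustration of the Thread Data layout above, the following
Python sketch splits the byte into its D, Method and Thred fields.  The
function name and the dictionary keys are hypothetical, and the
leftmost field of the diagram is assumed to be the high order bit.

```python
def decode_thread_data(byte):
    """Split a Thread Data byte into its D, Method and Thred fields."""
    if byte & 0x80:
        raise ValueError("high order bit is 1: this is a Fixup, not a Thread")
    d = (byte >> 6) & 0x01       # 0 = TARGET thread, 1 = FRAME thread
    method = (byte >> 2) & 0x07  # 3-bit TARGET or FRAME method
    thred = byte & 0x03          # 2-bit thread number (0..3)
    # Thread INDEX is absent when FRAME method F4, F5 or F6 is specified.
    has_index = not (d == 1 and method in (4, 5, 6))
    return {
        "kind": "FRAME" if d else "TARGET",
        "method": method,
        "thred": thred,
        "has_index": has_index,
    }
```

For example, decode_thread_data(0x54) describes FRAME thread 0 using
method F5, for which no Thread INDEX follows.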


                         T-MODULE Record Format -- Fixup

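                    +-------+---------+---------+---------+--------+
                    |       |  Fixup  |  FRAME  | TARGET  | TARGET |
                    | LOCAT | Methods |  INDEX  |  INDEX  | Offset |
                    +-------+---------+---------+---------+--------+
                    |   2   |    1    | 0-INDEX | 0-INDEX |  0-2   |
                    +-------+---------+---------+---------+--------+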

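LOCAT:  +---+------+---+----------+--------------+
        |   |      |   | LOCATION |  LE/LIDATA   |
        | 1 | Mode | 0 |   Type   |    Offset    |
        +---+------+---+----------+--------------+
        | 1 |  1   | 1 |    3     |      10      |
        +---+------+---+----------+--+-----------+
        |          low byte          | high byte |  Note:  Low and high bytes
        +----------------------------+-----------+         are swapped.
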
Mode --
  If Mode is 0 then this is a self-relative fixup, otherwise it is a
  segment-relative fixup.  Self-relative fixups on LIDATA are not
  permitted.


                         T-MODULE Record Format -- Fixup

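LOCAT:  +---+------+---+----------+--------------+
        |   |      |   | LOCATION |  LE/LIDATA   |
        | 1 | Mode | 0 |   Type   |    Offset    |
        +---+------+---+----------+--------------+
        | 1 |  1   | 1 |    3     |      10      |
        +---+------+---+----------+--+-----------+
        |          low byte          | high byte |
        +----------------------------+-----------+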

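LOCATION Type:  +---------------+-------------+
                | LOCATION Type | Description |
                +---------------+-------------+
                |       0       | LOBYTE      |
                +---------------+-------------+
                |       1       | OFFSET      |
                +---------------+-------------+
                |       2       | BASE        |
                +---------------+-------------+
                |       3       | POINTER     |
                +---------------+-------------+
                |       4       | HIBYTE      |
                +---------------+-------------+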


                         T-MODULE Record Format -- Fixup

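LOCAT:  +---+------+---+----------+--------------+
        |   |      |   | LOCATION |  LE/LIDATA   |
        | 1 | Mode | 0 |   Type   |    Offset    |
        +---+------+---+----------+--------------+
        | 1 |  1   | 1 |    3     |      10      |
        +---+------+---+----------+--+-----------+
        |          low byte          | high byte |
        +----------------------------+-----------+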

LE/LIDATA Offset --
  This field is used to determine the LOCATION information for the
  fixup.  This offset is actually an offset into the last LEDATA or
  LIDATA record.  The LEDATA or LIDATA contains the base LSEG and
  offset information.  Note that for LIDATA records, each time the data
  at LE/LIDATA Offset is repeated, the fixup must occur.
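
The LOCAT layout can be sketched in Python as follows.  This is only an
illustration: the function and key names are hypothetical, and the byte
the diagram labels "low byte" is assumed to be the first of the two
bytes in the record and to carry the flag bits.

```python
LOCATION_TYPES = {0: "LOBYTE", 1: "OFFSET", 2: "BASE", 3: "POINTER", 4: "HIBYTE"}

def decode_locat(first, second):
    """Decode the two LOCAT bytes of a Fixup field."""
    # The bytes are swapped, so the first byte holds the most
    # significant bits of the 16-bit value.
    word = (first << 8) | second
    if not word & 0x8000:
        raise ValueError("high order bit is 0: this is a Thread, not a Fixup")
    mode = (word >> 14) & 0x01      # 0 = self-relative, 1 = segment-relative
    loc_type = (word >> 10) & 0x07  # 3-bit LOCATION Type
    offset = word & 0x03FF          # 10-bit offset into the LE/LIDATA data
    return {
        "segment_relative": bool(mode),
        "location": LOCATION_TYPES.get(loc_type, loc_type),
        "data_offset": offset,
    }
```

For example, the byte pair C4H 10H decodes as a segment-relative OFFSET
fixup at offset 16 of the preceding LEDATA or LIDATA record.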


                         T-MODULE Record Format -- Fixup

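                    +-------+---------+---------+---------+--------+
                    |       |  Fixup  |  FRAME  | TARGET  | TARGET |
                    | LOCAT | Methods |  INDEX  |  INDEX  | Offset |
                    +-------+---------+---------+---------+--------+
                    |   2   |    1    | 0-INDEX | 0-INDEX |  0-2   |
                    +-------+---------+---------+---------+--------+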

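Fixup Methods:  +---+-------+---+---+--------+
                | F | FRAME | T | P | TARGET |
                +---+-------+---+---+--------+
                | 1 |   3   | 1 | 1 |   2    |
                +---+-------+---+---+--------+
                |<---------- byte ---------->|
                +----------------------------+

F --
  If F is 1 then FRAME is a thread, else FRAME is the FRAME method.
T --
  If T is 1 then TARGET is a thread, else TARGET is the TARGET method.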
FRAME --
  This is either the FRAME method (F=0) or a FRAME thread (F=1).
TARGET --
  This is either the TARGET method (T=0) or a TARGET thread (T=1).
P --
  If P=0 then the primary TARGET methods are used and the TARGET Offset
  field will be present.


                         T-MODULE Record Format -- Fixup

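                    +-------+---------+---------+---------+--------+
                    |       |  Fixup  |  FRAME  | TARGET  | TARGET |
                    | LOCAT | Methods |  INDEX  |  INDEX  | Offset |
                    +-------+---------+---------+---------+--------+
                    |   2   |    1    | 0-INDEX | 0-INDEX |  0-2   |
                    +-------+---------+---------+---------+--------+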

FRAME INDEX --
  Depending on the FRAME method, this is either a Segment, Group, or
  External INDEX.  This will be present only when a FRAME thread is not
  used (F=0).

TARGET INDEX --
  Depending on the TARGET method, this is either a Segment, Group, or
  External INDEX.  This will be present only when a TARGET thread is
  not used (T=0).

TARGET Offset --
  The TARGET is TARGET Offset bytes from the Segment, Group, or
  External given by TARGET INDEX.
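
The following Python sketch shows one way to decode the Fixup Methods
byte and decide which of the optional fields follow it.  The names are
hypothetical; the presence rules are just the F=0, T=0 and P=0
conditions stated above.

```python
def decode_fixup_methods(byte):
    """Decode a Fixup Methods byte (F, FRAME, T, P, TARGET)."""
    f = (byte >> 7) & 0x01      # 1 = FRAME comes from a thread
    frame = (byte >> 4) & 0x07  # FRAME method, or FRAME thread number
    t = (byte >> 3) & 0x01      # 1 = TARGET comes from a thread
    p = (byte >> 2) & 0x01      # 0 = primary TARGET method
    target = byte & 0x03        # TARGET method, or TARGET thread number
    return {
        "frame_is_thread": bool(f),
        "frame": frame,
        "target_is_thread": bool(t),
        "target": target,
        "has_frame_index": f == 0,    # FRAME INDEX only when F=0
        "has_target_index": t == 0,   # TARGET INDEX only when T=0
        "has_target_offset": p == 0,  # TARGET Offset only when P=0
    }
```

A byte of 8CH (F=1, T=1, P=1) therefore describes a fixup with none of
the three optional fields present.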


                 T-MODULE Record Format -- T-MODULE End (MODEND)

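                        +-----+--------+------+----------+-------+
                        |     | Record | End  |  Start   | Check |
                        | 8AH | Length | Type | Address  |  Sum  |
                        +-----+--------+------+----------+-------+
                        |  1  |   2    |  1   | variable |   1   |
                        +-----+--------+------+----------+-------+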

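End Type:                         Attribute:
       +-----------+---+---+      +-----------+----------------------------+
       | Attribute | 0 | 1 |      | Attribute | Description                |
       +-----------+---+---+      +-----------+----------------------------+
       |     2     | 5 | 1 |      |     0     | Non-main, no Start Address |
       +-----------+---+---+      +-----------+----------------------------+
       |<----- byte ------>|      |     1     | Non-main, Start Address    |
       +-------------------+      +-----------+----------------------------+
                                  |     2     | Main, no Start Address     |
                                  +-----------+----------------------------+
                                  |     3     | Main, Start Address        |
                                  +-----------+----------------------------+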


                 T-MODULE Record Format -- T-MODULE End (MODEND)


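    Start Address:  +---------+---------+---------+--------+
                    |  Fixup  |  FRAME  | TARGET  | TARGET |
                    | Methods |  INDEX  |  INDEX  | Offset |
                    +---------+---------+---------+--------+
                    |    1    | 0-INDEX | 0-INDEX |  0-2   |
                    +---------+---------+---------+--------+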

The above fields work exactly the same as they do in a FIXUPP record.
The start address is computed from the FRAME and TARGET specified
above.  The initial CS is the FRAME NUMBER of the FRAME, and the
initial IP is TARGET - FRAME.
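
A minimal Python sketch of the End Type decoding and the start address
computation is given below.  The function names are hypothetical, and
the sketch assumes that FRAME and TARGET have already been resolved to
absolute byte addresses, with the FRAME on a paragraph (16-byte)
boundary so that its FRAME NUMBER is the address divided by 16.

```python
ATTRIBUTES = {
    0: (False, False),  # Non-main, no Start Address
    1: (False, True),   # Non-main, Start Address
    2: (True, False),   # Main, no Start Address
    3: (True, True),    # Main, Start Address
}

def decode_end_type(byte):
    """Return the Attribute stored in the top two bits of End Type."""
    is_main, has_start = ATTRIBUTES[(byte >> 6) & 0x03]
    return {"main": is_main, "start_address": has_start}

def start_cs_ip(frame_address, target_address):
    """Initial CS is the FRAME NUMBER; initial IP is TARGET - FRAME."""
    if frame_address % 16:
        raise ValueError("FRAME is assumed to be paragraph aligned")
    return frame_address >> 4, target_address - frame_address
```

For example, a FRAME at address 12340H and a TARGET at 12350H give an
initial CS:IP of 1234H:0010H.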


                T-MODULE Record Format -- Comment Record (COMENT)

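                      +-----+--------+---------+----------+-------+
                      |     | Record | Comment |          | Check |
                      | 88H | Length |  Type   | Comment  |  Sum  |
                      +-----+--------+---------+----------+-------+
                      |  1  |   2    |    2    | variable |   1   |
                      +-----+--------+---------+----------+-------+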

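Comment Type:  +-------+------+---+-------+
               | Purge | List | 0 | Class |
               +-------+------+---+-------+
               |   1   |  1   | 6 |   8   |
               +-------+------+---+-------+
               |<--------- word --------->|
               +--------------------------+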

Purge --
  If Purge=1, the COMENT record should not be deleted by utilities
  which can delete comments.

List --
  If List=1, the COMENT record should not be listed by utilities which
  can list comments.


                T-MODULE Record Format -- Comment Record (COMENT)

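+-------+-------------------------------------------------------------+
| Class | Description                                                 |
+-------+-------------------------------------------------------------+
|  129  | Do not do a default library search.                         |
+-------+-------------------------------------------------------------+
|  157  | Memory model information is in the Comment field.           |
+-------+-------------------------------------------------------------+
|  158  | Use the "DOSSEG" ordering.                                  |
+-------+-------------------------------------------------------------+
|  159  | Library name is in the Comment field.                       |
+-------+-------------------------------------------------------------+
|  161  | Codeview (registered trademark of Microsoft) information is |
|       | present.                                                    |
+-------+-------------------------------------------------------------+
|  162  | Pass 1 of linker can stop processing T-MODULE here.         |
+-------+-------------------------------------------------------------+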

             T-MODULE Record Format -- Communal Definition (COMDEF)

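            +-----+--------+--------+-------+-------------+----------+-------+
            |     |        | Symbol | Type  | Far or Near | Communal | Check |
            | B0H | Length |  NAME  | INDEX |  Communal   |   Size   |  Sum  |
            +-----+--------+--------+-------+-------------+----------+-------+
            |  1  |   2    |  NAME  | INDEX |      1      | variable |   1   |
            +-----+--------+--------+-------+-------------+----------+-------+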

Symbol NAME --
  This is the name of the communal symbol.  It is placed in the list of
  external symbols just as if it were an EXTDEF record.  If a PUBDEF
  record with the same Symbol NAME is encountered, it overrides the
  COMDEF.

Type INDEX --
  This is ignored.

Far or Near Communal --
  Far communals have a 61H coded here.  Near communals have a 62H coded
  here.

The final size of a communal is the largest size given for it by any
COMDEF record.


             T-MODULE Record Format -- Communal Definition (COMDEF)

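For near communals, Communal Size is:  +-------+
                                       | SIZE  |
                                       | VALUE |
                                       +-------+
                                       | VALUE |
                                       +-------+

For far communals, Communal Size is:   +-------+-------+
                                       | COUNT | SIZE  |
    The size of the communal is:       | VALUE | VALUE |
                                       +-------+-------+
      COUNT VALUE * SIZE VALUE         | VALUE | VALUE |
                                       +-------+-------+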

Near communals go in DGROUP.  Far communals go in HUGE_BSS and are
packed as compactly as possible into PSEGs of no more than 64K.
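
The size rules for communals can be sketched in Python as shown below.
This is an illustration only: the function names and the tuple layout
are hypothetical, and the SIZE VALUE and COUNT VALUE fields are assumed
to have been decoded into integers already.

```python
FAR_COMMUNAL = 0x61
NEAR_COMMUNAL = 0x62

def communal_size(kind, values):
    """Size in bytes of one COMDEF entry."""
    if kind == NEAR_COMMUNAL:
        (size,) = values          # near: SIZE VALUE only
        return size
    if kind == FAR_COMMUNAL:
        count, size = values      # far: COUNT VALUE * SIZE VALUE
        return count * size
    raise ValueError("unknown communal type: %02XH" % kind)

def merge_communals(definitions):
    """Keep, for each symbol name, the largest size seen so far."""
    sizes = {}
    for name, kind, values in definitions:
        size = communal_size(kind, values)
        sizes[name] = max(size, sizes.get(name, 0))
    return sizes
```

Two modules that declare the near communal "buf" with sizes 10 and 40
end up sharing a single 40 byte communal.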


           T-MODULE Record Format -- Forward Reference Fixups (FORREF)

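                 +-----+--------+---------+------+--------+----------+-------+
                 |     |        | Segment |      |        |  Fixup   | Check |
                 | B2H | Length |  INDEX  | Size | Offset |   Data   |  Sum  |
                 +-----+--------+---------+------+--------+----------+-------+
                 |  1  |   2    |  INDEX  |  1   |   2    | variable |   1   |
                 +-----+--------+---------+------+--------+----------+-------+
                                                  <----repeated----->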

Segment INDEX --
  The Forward Reference Fixup is to be applied to the LSEG element
  whose index into the SEGDEF list is Segment INDEX.
Size --
  This specifies the size of the Fixup Data fields.  The Fixup Data
  fields are a byte when Size = 0, a word when Size = 1, or a DWORD
  when Size = 2.
Offset --
  This is the Offset into the LSEG specified by Segment Index where the
  fixup is applied.
Fixup Data --
  This value is added at the specified Offset.

Note:  The FORREF record may occur before the LE/LIDATA records which
       load data into the LSEG.  Therefore, FORREFs must be applied at
       fixup time.
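
Applying one FORREF entry at fixup time might look like the Python
sketch below.  The function name is hypothetical.  The sketch assumes
that the LSEG contents are held in a bytearray, that values are stored
low byte first, and that the sum wraps around at the field size; none
of these details are stated above.

```python
SIZE_TO_BYTES = {0: 1, 1: 2, 2: 4}  # byte, word, DWORD

def apply_forref(lseg, size, offset, fixup_data):
    """Add fixup_data to the value stored at offset in the LSEG."""
    width = SIZE_TO_BYTES[size]
    current = int.from_bytes(lseg[offset:offset + width], "little")
    # Assumption: overflow beyond the field width is discarded.
    total = (current + fixup_data) % (1 << (8 * width))
    lseg[offset:offset + width] = total.to_bytes(width, "little")
```

For example, adding 1 to a word holding 00FFH at offset 1 leaves 0100H
there and does not disturb the neighbouring bytes.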


          T-MODULE Record Format -- Local External Definition (MODEXT)

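                        +-----+--------+----------+-------+-------+
                        |     | Record | External | Type  | Check |
                        | B4H | Length |   NAME   | INDEX |  Sum  |
                        +-----+--------+----------+-------+-------+
                        |  1  |   2    |   NAME   | INDEX |   1   |
                        +-----+--------+----------+-------+-------+
                                        <----repeated---->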


The fields of the MODEXT record function just like the EXTDEF record
except that the external is local to this T-MODULE only.  The External
NAME is included in the list of externals.


           T-MODULE Record Format -- Local Public Definition (MODPUB)

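 +-----+--------+-------+---------+--------+--------+--------+-------+-------+
 |     | Record | Group | Segment | FRAME  | Public | Public | Type  | Check |
 | B6H | Length | INDEX |  INDEX  | NUMBER |  NAME  | Offset | INDEX |  Sum  |
 +-----+--------+-------+---------+--------+--------+--------+-------+-------+
 |  1  |   2    | INDEX |  INDEX  |  0-2   |  NAME  |   2    | INDEX |   1   |
 +-----+--------+-------+---------+--------+--------+--------+-------+-------+
                                            <-------repeated-------->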

The fields of the MODPUB record function just like the PUBDEF record
except that the public symbol is local to this T-MODULE only.


                 T-MODULE Record Format -- Line Number (LINNUM)

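                        +-----+--------+----------+-------+
                        |     | Record |          | Check |
                        | 94H | Length |   Data   |  Sum  |
                        +-----+--------+----------+-------+
                        |  1  |   2    |    1     |   1   |
                        +-----+--------+----------+-------+
                                        <repeated>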

The LINNUM record is ignored by the linker.


               T-MODULE Record Format -- Type Definition (TYPDEF)

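                        +-----+--------+----------+-------+
                        |     | Record |          | Check |
                        | 8EH | Length |   Data   |  Sum  |
                        +-----+--------+----------+-------+
                        |  1  |   2    |    1     |   1   |
                        +-----+--------+----------+-------+
                                        <repeated>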

The TYPDEF record is ignored by the linker.


                     T-MODULE Record Format -- Record Order

Object modules are parsed via recursive descent as defined below:

   t_module::     THEADR seg_grp {component} modtail

   seg_grp::      {LNAMES | SEGDEF | EXTDEF} {TYPDEF | EXTDEF | GRPDEF}

   component::    data | debug_record

   data::         content_def | thread_def | COMDEF | TYPDEF | PUBDEF |
                  EXTDEF | FORREF | MODPUB | MODEXT

   debug_record:: LINNUM

   content_def::  data_record {FIXUPP}

   thread_def::   FIXUPP  (containing only thread fields)

   data_record::  LIDATA | LEDATA

   modtail::      MODEND
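
The grammar above can be checked against a list of record names with a
small Python sketch such as the one below.  The function name is
hypothetical.  Because the sketch looks only at record names, it cannot
tell a FIXUPP that contains only thread fields from one that contains
fixups, so it accepts a FIXUPP anywhere a component is allowed.

```python
SEG_GRP_FIRST = {"LNAMES", "SEGDEF", "EXTDEF"}
SEG_GRP_SECOND = {"TYPDEF", "EXTDEF", "GRPDEF"}
COMPONENT = {
    "LIDATA", "LEDATA", "FIXUPP", "COMDEF", "TYPDEF", "PUBDEF",
    "EXTDEF", "FORREF", "MODPUB", "MODEXT", "LINNUM",
}

def is_valid_order(records):
    """Check a list of record names against the t_module grammar."""
    count = len(records)
    if count == 0 or records[0] != "THEADR":
        return False
    pos = 1
    # Each {...} group in the grammar is consumed greedily, in order.
    for allowed in (SEG_GRP_FIRST, SEG_GRP_SECOND, COMPONENT):
        while pos < count and records[pos] in allowed:
            pos += 1
    # Exactly one record, the MODEND, must remain.
    return pos == count - 1 and records[pos] == "MODEND"
```

A module in which a SEGDEF follows a GRPDEF is rejected, since SEGDEF
may appear only in the first group of seg_grp.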


                         Primary Internal Data Structure

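+------------+           +---------+            +------------------+
| Segment #1 |---------->| LSEG #1 |----------->| LSEG #1 Contents |
+------------+           +---------+            +------------------+
      |                       |
      |                       v
      |                  +---------+            +------------------+
      |                  | LSEG #2 |----------->| LSEG #2 Contents |
      |                  +---------+            +------------------+
      |                       |
      |                       v
      |                      ///
      v
+------------+           +---------+            +------------------+
| Segment #2 |---------->| LSEG #1 |----------->| LSEG #1 Contents |
+------------+           +---------+            +------------------+
      |                       |
      |                       v
      |                  +---------+            +------------------+
      v                  | LSEG #2 |----------->| LSEG #2 Contents |
     ///                 +---------+            +------------------+
                              |
                              v
                             ///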



                              Temp File

The linker employs a temp file to save information which can only be
processed after all the T-MODULEs have been processed.  The information
which must be saved is:

                              Fixups
                              LE/LIDATA for common LSEGS
                              FORREF records

The temp file is deleted when processing is complete.


                               Library File Format

 library_file::  header_page {t_modules} trailer_page {directory_pages}

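 header_page ::    +-----+--------+-----------+-----------+----------+---+
                   |     | Record | Directory | Directory |          |   |
                   | F0H | Length |  Offset   |   Pages   |   Pad    | 0 |
                   +-----+--------+-----------+-----------+----------+---+
                   |  1  |   2    |     4     |     2     |    1     | 1 |
                   +-----+--------+-----------+-----------+----------+---+
                                                 (prime)   <repeated>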

 t_modules ::    The t_modules are as described above except a pad is
                 added after the MODEND record to make the t_module
                 occupy a full page.  The page size is the header_page
                 Record Length + 3.

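 trailer_page ::   +-----+--------+----------+---+
                   |     | Record |          |   |
                   | F1H | Length |   Pad    | 0 |
                   +-----+--------+----------+---+
                   |  1  |   2    |    1     | 1 |
                   +-----+--------+----------+---+
                                   <repeated>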
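
The page arithmetic for t_modules can be sketched in Python as follows.
The function names are hypothetical; the only rule taken from the text
is that the page size is the header_page Record Length plus 3 and that
each t_module is padded to a full page.

```python
def page_size(header_record_length):
    """Page size is the header_page Record Length + 3."""
    return header_record_length + 3

def pad_length(module_bytes, size):
    """Number of pad bytes needed after MODEND to fill the last page."""
    return -module_bytes % size
```

With a header Record Length of 13 the page size is 16, so a 100 byte
t_module is followed by 12 pad bytes.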


                     Library File Format -- Directory

  directory_pages ::  public_pointer_array {public_entry} pad

Notes:  A directory page is always 512 bytes.  A directory page can
        contain up to 37 public entries.

public_pointer_array --
  This is a 38 byte array which is used to point into the public_entry
  fields.  To determine where public i is located in the directory
  page, take the ith byte of the public_pointer_array (relative 0) and
  multiply it by 2.  The result is the byte offset, within the
  directory page, of the public_entry for the ith public.  The 38th
  byte points to the beginning of the free space in the directory
  page.

public_entry ::  +--------+----------+
                 | Public | Starting |
                 |  NAME  |   Page   |
                 +--------+----------+
                 |  NAME  |    2     |
                 +--------+----------+
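
Reading one public_entry out of a directory page might be sketched in
Python as shown below.  The function name is hypothetical.  The sketch
assumes that a NAME is a length byte followed by that many characters,
that Starting Page is stored low byte first, and that a zero byte in
the public_pointer_array marks an unused slot.

```python
def read_public_entry(page, i):
    """Return (name, starting_page) for public i, or None if unused."""
    if len(page) != 512:
        raise ValueError("a directory page is always 512 bytes")
    if not 0 <= i < 37:
        raise IndexError("a directory page holds at most 37 publics")
    pointer = page[i]
    if pointer == 0:
        return None
    start = 2 * pointer                 # byte offset of the public_entry
    length = page[start]                # NAME length byte
    name = bytes(page[start + 1:start + 1 + length]).decode("ascii")
    pos = start + 1 + length
    starting_page = int.from_bytes(page[pos:pos + 2], "little")
    return name, starting_page
```

A pointer byte of 19, for instance, places the entry at offset 38,
immediately after the public_pointer_array.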


                     Library File Format -- Finding a Public

The library directory employs a two-tiered hashing scheme to store
public names in its directory.  A detailed description of the algorithm
is given later, but for now the following general aspects of the
algorithm are useful.  To start the search, you need to know at which
directory page to start searching and, if the symbol is not found in
that page, which directory page to search next.  Once in a directory
page, you have to know which entry to use to begin the search and which
entry to search next if the symbol was not found.

We will call the four required values STARTING_PAGE, DELTA_PAGE,
STARTING_ENTRY, and DELTA_ENTRY.  The detail on how to compute these
values is given later.

Start with directory page STARTING_PAGE.  On that page, examine
public_entry STARTING_ENTRY.  There are three cases.  This could be the
public symbol you desire, in which case you are done.  The
public_pointer_array byte for this entry could be zero, in which case
the symbol is not in the library.  Or, the public symbol at
STARTING_ENTRY could be some other public symbol.  In this case, add
DELTA_ENTRY (modulo 37) to STARTING_ENTRY and examine that public
entry.  Since there are at most 37 entries in any directory page,
examine no more than 37 entries in any given page.  If you have tried
all entries on a page, proceed to the next page by adding DELTA_PAGE
(modulo Directory_Pages) to STARTING_PAGE and continue the process.
When you move to a new page, continue processing the public entries
where you left off.
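
The search just described can be sketched in Python as follows.  This
is a simplified model, not the linker's own code: the function name is
hypothetical, each directory page is represented as a list of 37 slots
holding either None (pointer byte of zero) or a (name, starting_page)
pair, and the search gives up after visiting as many pages as the
directory contains, which is an added assumption.

```python
ENTRIES_PER_PAGE = 37

def find_public(pages, name, starting_page, delta_page,
                starting_entry, delta_entry):
    """Return the Starting Page for name, or None if it is not found."""
    page = starting_page % len(pages)
    entry = starting_entry % ENTRIES_PER_PAGE
    for _ in range(len(pages)):            # assumed limit on pages visited
        for _ in range(ENTRIES_PER_PAGE):  # at most 37 probes per page
            slot = pages[page][entry]
            if slot is None:
                return None                # zero pointer: not in library
            if slot[0] == name:
                return slot[1]
            entry = (entry + delta_entry) % ENTRIES_PER_PAGE
        # Page exhausted: move on, keeping the current entry position.
        page = (page + delta_page) % len(pages)
    return None
```

Because 37 is prime, any DELTA_ENTRY from 1 to 36 visits every entry of
a page exactly once before the search moves to the next page.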


To compute the STARTING_PAGE, DELTA_PAGE, STARTING_ENTRY, and
DELTA_ENTRY, view a NAME field as if it were an array of bytes
containing the public name:

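                        +------+----+----+----+--//--+------+
            NAME ------>|Length|byte|byte|byte|      | byte |
                        +------+----+----+----+--//--+------+
                        |  1   | 1  | 1  | 1  |      |  1   |
                        +------+----+----+----+--//--+------+
            index ----->|  0   | 1  | 2  | 3  |      |Length|
                        +------+----+----+----+--//--+------+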

Then, the following code defines the values:

STARTING_PAGE, DELTA_PAGE, STARTING_ENTRY, DELTA_ENTRY := 0;
for i := 0 .. Length-1
 STARTING_PAGE := STARTING_PAGE+(NAME[i] or 20H) xor (<<STARTING_PAGE);
 DELTA_PAGE := DELTA_PAGE+(NAME[Length-i] or 20H) xor (<<DELTA_PAGE);
 STARTING_ENTRY:= STARTING_ENTRY+(NAME[Length-i] or 20H)
                                     xor (>>STARTING_ENTRY);
 DELTA_ENTRY := DELTA_ENTRY+(NAME[i] or 20H) xor (>>DELTA_ENTRY);
end for;
if DELTA_ENTRY = 0 then DELTA_ENTRY := 1;
if DELTA_PAGE = 0 then DELTA_PAGE := 1;
Note:  << is circular shift left twice and
       >> is circular shift right twice.
