7. EXE?

I've probably been moving pretty quickly, for some. Maybe even way too quickly. If it is only a little bit too fast, perhaps just take a break and let things soak in. Otherwise, look at some other web pages, now that you know some 'keywords' to try out in a search. I may, at some other point in time, try to expand the previous explanations but not for now. So regardless, I'm going to proceed to .EXE type programs under DOS -- with apologies to those I may leave behind.

DOS .EXE Programs

Let's take the second program, the one that printed a short message, and turn it into a DOS .EXE program. We can then see the necessary differences in the source code. There aren't many.

Here's the original source code:

                    .model tiny

                    .code

                    .startup
                    mov      dx, OFFSET msg
                    mov      ah, 9
                    int      21h
                    .exit

                    .data

    msg             DB      "Hello out there!", 13, 10, '$'

                    END

You can see some differences from what I entered with the COPY command, earlier. The differences are in spacing, not words. I've reformatted it a little here because I think it might be a little easier to see the pieces this way. At this point, it's probably a good idea to start using an editor of some kind for your assembly source code. You can find a good editor for free in the MASM32 installation mentioned on my PC Tools web page, called QEDITOR. It works just fine under Windows and it's pretty easy to use. Its menus include features for writing Windows programs, but it's still a fine program for writing DOS assembly programs. Or just use whatever text editor you are used to using.

By the way, you can also see that it often doesn't matter what case you use in writing your programs. 'END' is the same as 'end' and so on. For the most part, your choice of case is your own. An obvious exception is your literal ASCII text, of course, where the case you use is the case you will see displayed. Another exception is labels, such as 'msg'. The linker needs to match up some of these labels when it is busy linking a final program together. If the linker is told to treat upper case and lower case labels as different (this is usually a switch option for the linker), then it will simply not match up labels that differ only in their case.

So let's see what the source code would look like when writing the same program, but for making a .EXE instead of a .COM program.

                    .model small

                    .code

                    .startup
                    mov      dx, OFFSET msg
                    mov      ah, 9
                    int      21h
                    .exit

                    .data

    msg             DB      "Hello out there!", 13, 10, '$'

                    .stack

                    END

Can you find the differences? They are pretty small, to be sure -- no pun intended. The .model directive has been changed to specify a small model program. These are always .EXE programs. The other change is to add a .stack directive at the bottom.

Normally, .COM programs are entirely contained within a single memory segment. This means that the PSP, the code for the program, its constants, data and variables, and its stack are all in the same memory segment -- all fitting within 65536 bytes of memory. The size of the stack for .COM programs is simply whatever is in between the end of the code and data and the end of the memory segment. For example, if the code and data for the .COM program takes up 1000 bytes of memory and the PSP takes (by definition) 256 bytes, then the stack has assigned to it the difference, or simply 65536 - 1000 - 256 = 64280 bytes. In other words, one doesn't really need to specify a stack or its size for .COM programs. The stack gets the left-overs, by default.
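The arithmetic above is simple enough to write out as a tiny sketch. Python is used here only as a calculator, and the name com_stack_bytes is made up for this illustration. Like the paragraph above, it treats everything left over in the segment as stack.

```python
SEGMENT_BYTES = 65536   # one 64K real-mode segment
PSP_BYTES = 256         # the PSP is 256 bytes by definition

def com_stack_bytes(code_and_data_bytes):
    """Bytes left over for the stack in a single-segment .COM program."""
    left = SEGMENT_BYTES - PSP_BYTES - code_and_data_bytes
    if left <= 0:
        raise ValueError("program does not fit in one segment")
    return left

print(com_stack_bytes(1000))   # 64280
```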

In the case of an .EXE program, the stack needs to exist and you need to tell the assembler to allocate some or else the linker will fail to do so and will probably complain about it, as well. You can just use a .stack directive without specifying a size or you can add a size to that directive, as in ".stack 100" to allocate 100 bytes of stack space. Part of the reason for all this is that DOS tries to allocate only the memory actually needed by an .EXE program, unlike the case with the .COM program where DOS usually allocates everything available. So DOS needs to know how much stack you really need. And to tell DOS, the linker needs to know, too. And to tell the linker, you need to tell the assembler. So there it is.

The registers that DOS sets up for .EXE programs are mentioned here, below. The EXE header structure is mentioned in these entries and documented at the bottom of this page.

CS: This segment register is loaded with the starting segment address of the code, as indicated in the EXE header structure and then adjusted by DOS once it selects an available memory segment.
IP: The starting offset address of the EXE program, taken from the EXE header structure without modification.
SS: This segment register is loaded with the segment address of the stack, as indicated in the EXE header structure and then adjusted by DOS once it selects an available memory segment.
SP: The starting offset of the stack pointer within the stack segment, taken from the EXE header structure without modification.
DS, ES: These two segment registers are initialized by DOS (when used with an offset of 0) to point to the beginning of the Program Segment Prefix (PSP.)
BX:CX: This register pair is usually set to the size of the starting segment for the .EXE program, treating BX as the upper 16-bits of a 32-bit value. I don't think these values are of much use, though.
AX, DX, SI, DI: These registers are set to 0, when the .EXE program starts. I wouldn't rely on this behavior, though. Just set them as you need them and don't count on them being zero when DOS starts the .EXE program.
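The adjustments described for CS and SS can be sketched in a few lines. This is only a model of the bookkeeping, not DOS's actual loader code. It assumes the usual case where the program image is loaded directly after the 256-byte PSP, and the function name and the sample segment value 1234h are invented for the illustration.

```python
PSP_PARAGRAPHS = 0x10   # the 256-byte PSP occupies 16 paragraphs

def initial_registers(psp_segment, init_cs, init_ip, init_ss, init_sp):
    """Return the register values an .EXE would start with.

    psp_segment stands for the segment DOS picked for the program
    (a made-up input here); the other four come from the EXE header.
    """
    start_segment = (psp_segment + PSP_PARAGRAPHS) & 0xFFFF
    return {
        "CS": (start_segment + init_cs) & 0xFFFF,  # adjusted by the load segment
        "IP": init_ip,                             # used as-is
        "SS": (start_segment + init_ss) & 0xFFFF,  # adjusted by the load segment
        "SP": init_sp,                             # used as-is
        "DS": psp_segment,                         # both point at the PSP
        "ES": psp_segment,
    }

regs = initial_registers(0x1234, init_cs=0x0000, init_ip=0x0010,
                         init_ss=0x0002, init_sp=0x0400)
for name, value in regs.items():
    print(f"{name} = {value:04X}")
```

With these sample numbers, CS comes out 10h paragraphs above the PSP segment, while DS and ES still point at the PSP itself. That gap is part of why an .EXE program has to set up DS on its own before it touches its data.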

Oh, a final note. The directive, .startup, will actually generate some code this time. If you get a chance, compare the listing files for this third lesson and the second one. It'll be interesting to note, if not entirely clear.

How are .EXE Programs Different?

Well, let's do a quick test. Enter the new program shown above using an editor or, if you want, just use the COPY command. Name it lesson03.asm. Now, enter the following command:

C>ml /Fl /Sa lesson03.asm «
Microsoft (R) Macro Assembler Version 6.15.8803
Copyright (C) Microsoft Corp 1981-2000.  All rights reserved.

 Assembling: lesson03.asm

Microsoft (R) Segmented Executable Linker  Version 5.60.339 Dec  5 1994
Copyright (C) Microsoft Corp 1984-1993.  All rights reserved.

Object Modules [.obj]: lesson03.obj
Run File [lesson03.exe]: "lesson03.exe"
List File [nul.map]: NUL
Libraries [.lib]:
Definitions File [nul.def]:

C>

Notice that the "/t" option is now missing from the linker's "Object Modules" prompt. This is because ML now knows that this isn't a .COM program, so it drops that switch option. The result should be a new file called lesson03.exe, among some others. This is the new executable program. But its format is definitely different. Notice the size?

.EXE programs are usually bigger than .COM programs, even when they do exactly the same thing. Part of the reason is that an .EXE program has a special header section -- the first part of the program file is reserved for some information that DOS will need once it tries to run the program. DOS will use this information and, in some cases (most of them, actually), DOS will also modify the program code after it loads it into memory. This header takes up some space. Some of the early linker programs from Microsoft would attempt to keep this header to a minimum size, but their more modern linkers will always set aside 512 bytes for the header. After this, the code and data follow. This is why .EXE programs are almost always larger than 512 bytes in size, regardless of how short the actual program is.

You can test out the new .EXE program by running it. Be absolutely sure that there is no .COM file sitting in your directory with the same name, though. DOS will prefer to execute the .COM file over the .EXE file, if both have the same name.
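The preference just described can be modeled as a short lookup. This is only a sketch of the order in which the command interpreter tries extensions within a single directory (.COM, then .EXE, then .BAT). The function name pick_program and the file list are invented for the example, and it ignores the PATH search entirely.

```python
SEARCH_ORDER = (".COM", ".EXE", ".BAT")   # extensions tried, in this order

def pick_program(name, directory_listing):
    """Return the file that would run for a bare command name, or None."""
    present = {filename.upper() for filename in directory_listing}
    for extension in SEARCH_ORDER:
        candidate = name.upper() + extension
        if candidate in present:
            return candidate
    return None

files = ["lesson03.asm", "lesson03.exe", "lesson03.com"]
print(pick_program("lesson03", files))   # LESSON03.COM
```

Delete or rename the stray .COM file and the same lookup lands on the .EXE instead.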

By the way, here are the .EXE header details. I won't go into a deep explanation of them all, but this list will give you an idea of what DOS needs when it tries to run a .EXE program.

EXE Header Fields Table
	Name	Title	Description
	exeSignature	EXE Header Signature	This value is set to the two initials of an MS-DOS developer, 'MZ'. Read as a word, the value is 0x5A4D, since this is a little-endian machine. This is just a "magic" value that is placed at the beginning of every .EXE file. If the file isn't identified with these two bytes, then it probably isn't an .EXE file and DOS will not load it (hopefully.)
	exeExtraBytes	Last Page Byte Count	Each disk block (or page) of the EXE file is exactly 512 bytes in size. EXE programs are not, however, neatly divisible by 512. They might be 100 bytes or 10,000 bytes long, but rarely do they work out to an exact multiple of 512 bytes. This value specifies how many bytes in the last block (or page) are valid, if the value is other than zero. If zero, then the entire last block is considered valid.
	exePages	Page Count of EXE	This specifies how many blocks (pages) are used by the entire EXE program. This value includes the size of the header, itself. This should be equal to: FLOOR( (exefilesize+511) / 512 ).
	exeRelocItems	Pointer Count in Relocation Table	This is the number of entries in the relocation table, provided elsewhere in the EXE file.
	exeHeaderSize	Header Size	This value is the size, in paragraphs (16-byte "chunks"), of the EXE header. Even though the fixed size part of the header is 28 bytes, this value allows the EXE file to include other information after the 28-byte header, but before the beginning of the program, itself. For example, the relocation entries may be located directly after the 28-byte header.
	exeMinAlloc	Minimum Memory Allocation	This is the minimum number of memory paragraphs needed beyond the amount required to actually load the program. Often, this value is 0. DOS will not load the program if there isn't enough memory available for both the actual program size and this additional amount.
	exeMaxAlloc	Maximum Memory Allocation	This is the maximum number of additional memory paragraphs to allocate for the program. DOS will allocate this much, if available. If less is available, DOS allocates what it can, so long as the minimum allocation is still met. This value helps accommodate stack and heap memory space desired by the program.
	exeInitSS	Initial SS Value	This is the initial value of the SS segment register. The DOS loader will adjust this value by the base segment value of the memory allocated to run the program.
	exeInitSP	Initial SP Value	This is the initial value of the SP stack pointer. This value isn't changed by the DOS loader.
	exeChecksum	Checksum	This may have originally been intended to provide DOS with a further check on the validity of a program, before trying to run it. However, it was never implemented. Any value may be placed here, including zero.
	exeInitIP	Initial IP Value	This is the initial value of the IP register. Basically, this sets the starting point for an EXE program. This value isn't changed by the DOS loader.
	exeInitCS	Initial CS Value	This is the initial value of the CS segment register. The DOS loader will adjust this value by the base segment value of the memory allocated to run the program.
	exeRelocTable	Relocation Table Offset	This is the byte position, within the EXE file, of the relocation table. Usually, set this to the offset just past the end of the 28-byte header, even if the relocation table is empty.
	exeOverlay	Overlay Number	This is usually 0, for resident programs. This value isn't always included in descriptions of EXE header structures and isn't used by the DOS program loader, I believe.
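
If you'd like to look at these fields in a real file, here is a minimal sketch of how the 28-byte fixed header might be unpacked. It is written in Python rather than assembly just so it can run on a modern machine. It assumes the fields sit in the file in the same order as the table above, each as a little-endian 16-bit word. The function names are my own, and the sample header values are made up purely to exercise the code.

```python
import struct

FIELDS = (
    "exeSignature", "exeExtraBytes", "exePages", "exeRelocItems",
    "exeHeaderSize", "exeMinAlloc", "exeMaxAlloc", "exeInitSS",
    "exeInitSP", "exeChecksum", "exeInitIP", "exeInitCS",
    "exeRelocTable", "exeOverlay",
)

def parse_exe_header(data):
    """Unpack the 28-byte fixed header: fourteen little-endian words."""
    if len(data) < 28:
        raise ValueError("need at least 28 bytes")
    header = dict(zip(FIELDS, struct.unpack("<14H", data[:28])))
    if header["exeSignature"] != 0x5A4D:      # the bytes 'M', 'Z'
        raise ValueError("missing 'MZ' signature")
    return header

def exe_size_in_file(header):
    """Bytes covered by the EXE, from exePages and exeExtraBytes."""
    size = header["exePages"] * 512
    if header["exeExtraBytes"]:               # last page only partly used
        size -= 512 - header["exeExtraBytes"]
    return size

# Made-up header values, not taken from any real program.
sample = struct.pack("<14H", 0x5A4D, 40, 2, 1, 32, 0, 0xFFFF,
                     2, 0x400, 0, 0, 0, 0x1C, 0)
header = parse_exe_header(sample)
print(header["exeHeaderSize"] * 16)   # 512
print(exe_size_in_file(header))       # 552
```

To try it on a real program, read the first 28 bytes of the file in binary mode and pass them to parse_exe_header. Note that 552 also satisfies the page formula from the table: FLOOR( (552+511) / 512 ) is 2.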

Last updated: Tuesday, January 18, 2005 02:14
