Topspeed assembler

This is a reproduction from an old newsgroup post by Carl Sumner.
Dave mentioned it in the comments of his latest zd_fast.a release and I thought it might be nice to archive it here also!

TOPSPEED ASSEMBLER

  • The lexical structure is derived from Modula-2. In particular, the Assembler is case sensitive, comments are marked with (* and *) and hexadecimal constants must be in upper case.
  • A semi-colon (;) is used to separate multiple statements on a line. In fact, a newline and a semi-colon are regarded as identical by the Assembler.
  • There is only one kind of label and there are no data types. Instead of data types, TopSpeed Assembler uses specifiers. Thus, the specifiers near, far, byte and word can be used to define the data type of the operands of an instruction.
  • Memory operands and segment overrides must always be explicitly specified.

Program Layout

A TopSpeed Assembler program has the following general layout:

 module <name> (* Names the module *)
 segment <name>.....(* Define segments *)
 <code and data>....
 section (* Another section - Optional *)
 ....<and so on>
 end (* Marks the end of the module *)

Keywords

The TopSpeed Assembler uses a number of keywords:

 byte db dd dw dword
 end extrn far fixup group
 include log2 module near org
 power2 public qword section select
 seg segment st tbyte word

Operators

The following operators are defined in TopSpeed Assembler. The operators are listed in order of precedence (highest precedence first):

 power2 power of two
 log2 log base 2 (truncated to integer)
 / % * division, modulus, multiplication
 + - addition, subtraction
 : segment reference
 ~ bitwise not
 & bitwise and
 | bitwise or 
 seg segment of address

Operators of equal precedence associate with the expression to their left.
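
The two keyword operators can be modelled outside the assembler. The Python sketch below is only an illustration: it assumes that power2 n yields 2 raised to the power n and that log2 n yields the base-2 logarithm truncated to an integer, as the table above suggests. The function names are hypothetical and are not part of TopSpeed.

```python
def power2(n: int) -> int:
    """Model of the power2 operator: assumed to yield 2**n for n >= 0."""
    if n < 0:
        raise ValueError("power2 is modelled for non-negative operands only")
    return 1 << n


def log2(n: int) -> int:
    """Model of the log2 operator: base-2 logarithm truncated to an integer."""
    if n <= 0:
        raise ValueError("log2 is modelled for positive operands only")
    # bit_length() - 1 is the position of the highest set bit,
    # which equals floor(log2(n)) for positive integers.
    return n.bit_length() - 1
```

Under these assumptions, log2 (power2 n) gives back n, while log2 of a value that is not a power of two rounds down.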

Assembly Language Considerations

This section groups together a number of points of interest to assembly-level programmers who are used to other assemblers. It should be used in conjunction with Appendix D, which contains full details of the 8086 and 8087 instruction sets. You may find it useful to use the TopSpeed Disassembler TSDA to generate examples of TopSpeed Assembler from high-level language object files.

The Symbol Table

The TopSpeed Assembler discards the current symbol table every time a new section is started, except for any global equates which occur before the module keyword in the source file. This means that symbols may be redefined. If you redefine a public symbol, the linker issues a warning message. If you wish to reference external objects, the extrn keyword should be used, for example:

 extrn ReturnOne
 ...
 call far ReturnOne
 ...

To make an object accessible from another module, the name must be preceded by the public keyword.

Operand Sizes

In most cases, the operands to an instruction have an implied size (byte, word, etc). This size can be calculated by the Assembler and requires no intervention on your part. However, when there is a chance of ambiguity, you must use a specifier to indicate the size of the operand:

 inc word [bp] [-6]
 mov byte es:[bx], 1
 mov es:[bx], ax (* no specifier needed *)
 (* - ax implies word *)
 push es:[bx] (* push always uses a word *)

Jumps and Calls

Jumps and calls default to the smallest possible case. Thus, unless a label has been defined before use, the Assembler assumes that it is a label in the same segment as the statement referencing it. A jump to such a label must span between -128 bytes and +127 bytes in the output object code. If a jump exceeds this range, a specifier must be added.

For example:

 jmp near _loop
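
To judge by hand whether a jump needs the near specifier, the range rule can be checked with a few lines of Python. This is a rough sketch, not something the Assembler provides: the function name and the offsets passed to it are hypothetical. It relies on the 8086 encoding of a short jump as two bytes (opcode plus an 8-bit signed displacement), with the displacement measured from the address of the following instruction.

```python
SHORT_JMP_SIZE = 2  # one opcode byte plus one 8-bit displacement byte


def fits_short_jump(jump_offset: int, target_offset: int) -> bool:
    """Return True if a short jump located at jump_offset can reach target_offset.

    Both offsets are byte offsets within the same segment.
    """
    # The CPU adds the displacement to the address of the next instruction.
    displacement = target_offset - (jump_offset + SHORT_JMP_SIZE)
    return -128 <= displacement <= 127
```

When the function returns False for a given pair of offsets, the jump is a candidate for the near form shown above.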

Strings

Single character string constants are treated as numbers by the TopSpeed Assembler. The Assembler syntax does not provide a segment override for string instructions. If you wish to do this, it must be done using the db keyword.

For example:

db 2EH; (*cs:*) stosb
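
The byte 2EH in the example is the 8086 segment override prefix for cs. The Python sketch below lists the four 8086 prefix bytes and formats a line in the style of the example; the helper name override_line is hypothetical and exists only for illustration.

```python
# 8086 segment override prefix bytes.
SEGMENT_PREFIX = {"es": 0x26, "cs": 0x2E, "ss": 0x36, "ds": 0x3E}


def override_line(segment: str, instruction: str) -> str:
    """Format a db-prefixed string instruction in the style used above."""
    prefix = SEGMENT_PREFIX[segment]
    # Hexadecimal constants must be upper case in TopSpeed Assembler.
    return f"db {prefix:02X}H; (*{segment}:*) {instruction}"


print(override_line("cs", "stosb"))
```

All four prefix values begin with a decimal digit, so the generated constants need no leading zero.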

References

Within TopSpeed Assembler, forward references are assumed for labels that have not yet been defined. The TopSpeed Assembler assumes that such labels are in the currently selected segment within the current section. Within a single section, equated symbols may be redefined; labels, on the other hand, may not be redefined.

Non-8086 Instructions

The TopSpeed Assembler supports only 8086, 8088, 8087 and 80287 instructions. If you need to use the opcodes specific to the 80286, 80386 or 80387, they must be simulated with the db directive, as described in Strings above.

Floating-point Options

The foption mnemonic is used to control floating point emulation and carries a single argument, as follows:

foption 0 causes floating point emulation.
foption 1 suppresses the generation of emulation fixups.
foption 2 suppresses the generation of fixups and wait instructions.

Unless foption 2 is used, the Assembler generates a wait instruction between successive floating-point instructions in order to synchronize the 8086 with the 8087.

Variables and Data

Variables can be defined, and uninitialized data areas reserved, using the following directives:

db initialize storage as bytes (expects byte or string operands).
dw initialize storage as words (expects word operands).
org reserve some uninitialized memory of a size equal to the value of the operand.

File Inclusion

The TopSpeed Assembler has an include directive which allows a header file to be included. This is used in the library to define commonly used identifiers:

include "corelib.inc"

Conditional Assembly

The TopSpeed Assembler has the ability to conditionally assemble sections of code based on the value of an identifier. There are three special directives that mark which blocks to include/exclude. These are:

(*%T <identifier> *)

means process this section only when <identifier>=1

(*%F <identifier> *)

means process this section only when <identifier>=0

(*%E *) means end of conditional section

These directives are commonly used with the return code for a procedure that is memory model independent:

 (*%T _fcall *)
 ret 0
 (*%E *)
 (*%F _fcall *)
 ret far 0
 (*%E *)

Conditional sections can be nested.
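
The selection rules, including nesting, can be modelled in a few lines of Python. This is a sketch of the behaviour described above, not the Assembler's actual implementation. It assumes that each directive sits on a line of its own and that an identifier missing from the symbol table counts as 0; the function name filter_conditional is hypothetical.

```python
import re

# Matches (*%T name *), (*%F name *) and (*%E *) on a line of their own.
DIRECTIVE = re.compile(r"^\s*\(\*%([TFE])\s*(\w*)\s*\*\)\s*$")


def filter_conditional(lines, symbols):
    """Return the lines whose enclosing conditional sections are all active."""
    active = [True]  # one entry per open nesting level; outermost is always on
    kept = []
    for line in lines:
        match = DIRECTIVE.match(line)
        if match is None:
            if active[-1]:
                kept.append(line)
            continue
        kind, name = match.groups()
        if kind == "E":
            if len(active) == 1:
                raise ValueError("(*%E *) without an open conditional section")
            active.pop()
        else:
            wanted = 1 if kind == "T" else 0
            # Assumption: an undefined identifier is treated as 0.
            # A nested section is active only if its parent is active too.
            active.append(active[-1] and symbols.get(name, 0) == wanted)
    if len(active) != 1:
        raise ValueError("conditional section not closed with (*%E *)")
    return kept
```

Feeding the six lines of the example above through filter_conditional with _fcall set to 1 keeps only the line inside the %T section; with _fcall set to 0 it keeps only the line inside the %F section.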

For further examples of using conditional assembly, please refer to the file COREGRAP.A in the \TS\SRC directory. This file is also an excellent example of assembly language procedures providing memory model and register passed parameter support.

Predefined Identifiers

The Assembler pre-defines a symbol for each #pragma define(<symbol>=><value>) that is active in the project file at the point where the #compile occurs. This allows conditional assembly according to which model has been selected. The value is 1 if the project value is on, otherwise the value is 0.

Calling Conventions

Your assembly language functions need to follow the TopSpeed calling conventions if they are to successfully be called from a high-level language. In particular the bp register must be preserved. Which other registers need to be preserved will depend on the calling convention being used, which can be modified with pragmas. In the JPI calling convention, all registers except those used to pass parameters and return function results must be preserved.

Miscellaneous Points

Other assemblers allow instruction formats such as:

mov ax, es:[bx+6]

In TopSpeed Assembler, these must be expressed as:

mov ax, es:[bx] [6]
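
When porting source written for another assembler, this rewrite is mechanical for the simple case of one register plus or minus a decimal displacement. The Python sketch below is a hypothetical helper covering only that case; operands with two registers, symbolic displacements or hexadecimal constants are left untouched.

```python
import re

# One two-letter register, a sign and a decimal displacement,
# for example [bx+6] or [bp-6].
BRACKET = re.compile(r"\[\s*([a-z]{2})\s*([+-])\s*(\d+)\s*\]")


def to_topspeed(statement: str) -> str:
    """Rewrite [reg+disp] and [reg-disp] into the [reg] [disp] form."""
    def rewrite(match):
        reg, sign, disp = match.groups()
        signed = disp if sign == "+" else "-" + disp
        return f"[{reg}] [{signed}]"

    return BRACKET.sub(rewrite, statement)


print(to_topspeed("mov ax, es:[bx+6]"))
```

A negative displacement keeps its sign inside the second pair of brackets, matching the inc word [bp] [-6] form shown under Operand Sizes.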

The instructions rep, repne and lock are regarded by the Assembler as separate instructions. They must be followed by a semi-colon, for example:

rep; movsb (* NOTE the semi-colon *)