Module AsmSupport

brahn · January 20, 2016, 10:48pm

Here is another one from Carl that I found on the newsgroups:

Carl W. Sumner
Re: Assembler in Clarion

The C compiler syntax is very standard; any book on C will do. There are no header files, though.
The assembler is harder, but to get the opcodes I just opened the assembler exe in the bin subdir in a hex editor and they jumped out at me.

Here are the notes I have built and collected over the years.

Note:
These directives are the equivalent of #ifdef _WIN32 / #endif. The data segment is only there because the code was copied and pasted from a larger body of code.

module AsmSupport

%T    -    ifdef    -    true
%F    -    ifndef    -    false
%E    -    endif
 
(* Segment Attributes *)
 
use16             = 00H
use32             = 01H
 
abs_align         = 00H
byte_align        = 20H
word_align        = 40H
para_align        = 60H
page_align        = 80H
dword_align       = 0A0H
 
dont_combine      = 00H
memory_combine    = 04H
public_combine    = 08H
stack_combine     = 14H
common_combine    = 18H
 
(* %T _WIN32 *)
 
segment _DATA(DATA, use32 + dword_align + public_combine)
segment _TEXT(CODE, use32 + dword_align + public_combine)
 
select _TEXT
 
section
 
public _DynPolygon :
 
extrn Cla$POLYGON
 
  call Cla$POLYGON
  ret 0
 
(* %E *)
 
end

module Call
 
(* Segment Attributes *)
 
use16             = 00H
use32             = 01H
 
abs_align         = 00H
byte_align        = 20H
word_align        = 40H
para_align        = 60H
page_align        = 80H
dword_align       = 0A0H
 
dont_combine      = 00H
memory_combine    = 04H
public_combine    = 08H
stack_combine     = 14H
common_combine    = 18H
 
(* %T _WIN32 *)
 
segment _DATA(DATA, use32 + dword_align + public_combine)
segment _TEXT(CODE, use32 + dword_align + public_combine)
 
select _TEXT
 
section
 
public _DebugBreak :
 
  db 204
  ret 0
 
(* %E *)
 
end

(1) Changes/incompatibilities

Some memory operands need to be given an explicit type where none was
needed previously, e.g.

    push word [bp][-6]
    pop  word [bp][-6]

Some floating-point instructions need to have the size specifier moved to just in front of the memory operand, e.g.

    fld st(0), qword [si]

rather than

    fld qword st(0), [si]

There are many new opcodes. All opcodes are now reserved and cannot be used as labels, so definitions such as the following are no longer accepted:

    loop = 2    (loop instruction)
    str  = -6   (store task register instruction)
    ....

(3) New Features

Out-of-range jumps are automatically corrected, using 386 jump opcodes or extra jumps where necessary.
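To illustrate what "corrected" can mean here, the following is a rough Python sketch of how an assembler in general might choose a form for a conditional jump. It is not taken from the TopSpeed assembler; the function names, the strategy labels and the `allow_386` flag are all made up for the example. The only facts it relies on are that a short jump carries a signed 8-bit displacement and that the 386 added near conditional jumps.

```python
# Hypothetical sketch: general x86 practice, not the TopSpeed assembler's algorithm.

# Each condition paired with its logical inverse (canonical names from the notes).
INVERSE = {
    "je": "jne", "jne": "je",
    "jb": "jae", "jae": "jb",
    "jbe": "ja", "ja": "jbe",
    "jl": "jge", "jge": "jl",
    "jle": "jg", "jg": "jle",
}

def fits_rel8(displacement):
    """A short jump encodes a signed 8-bit displacement."""
    return -128 <= displacement <= 127

def plan_jump(opcode, displacement, allow_386):
    """Return a tuple describing the chosen strategy (labels are illustrative)."""
    if fits_rel8(displacement):
        return ("short", opcode)
    if allow_386:
        # The 386 added conditional jumps with a 16/32-bit displacement.
        return ("near386", opcode)
    # Pre-386: jump over an unconditional jmp using the inverted condition.
    return ("skip", INVERSE[opcode], "jmp")
```

For example, `plan_jump("je", 300, False)` yields `("skip", "jne", "jmp")`, i.e. the "extra jumps" case: a `jne` over an unconditional `jmp` to the real target.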

All 8086/8087 … 386/387/486 opcodes are supported. The correct segment override, data and address prefixes are automatically generated.

The size of operands is specified using byte/word/dword/fword/qword/tbyte/ptr/near/far.

fword means 48-bits, made up of a 16-bit segment + 32-bit offset.

ptr means a 16-bit segment + 16-bit offset.

dword means a 32-bit offset.

near means a word or dword depending on whether the current segment
is 16 or 32 bit.

far means ptr or fword depending on whether the current segment
is 16 or 32 bit.

e.g.

    call near xxx
    call near [edi]
    and dword [di], 1
    fstp st(0), qword [esp]
    call ptr far_proc_in_16_bit_seg
    call fword far_proc_in_32_bit_seg
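The size rules above can be summarised in a small Python lookup. This is only a sketch for checking one's understanding: `resolve_size` and `SIZE_BITS` are hypothetical names, and the widths for byte, word, qword and tbyte are the standard x86 ones rather than something the notes state.

```python
# Hypothetical helper; widths in bits.
SIZE_BITS = {
    "byte": 8,     # standard x86 widths (not spelled out in the notes)
    "word": 16,
    "qword": 64,
    "tbyte": 80,
    "dword": 32,   # from the notes: 32-bit offset
    "ptr": 32,     # from the notes: 16-bit segment + 16-bit offset
    "fword": 48,   # from the notes: 16-bit segment + 32-bit offset
}

def resolve_size(keyword, segment_bits):
    """Map a size keyword to (concrete keyword, width) for a 16- or 32-bit segment."""
    if segment_bits not in (16, 32):
        raise ValueError("segment must be 16 or 32 bit")
    if keyword == "near":
        keyword = "word" if segment_bits == 16 else "dword"
    elif keyword == "far":
        keyword = "ptr" if segment_bits == 16 else "fword"
    return keyword, SIZE_BITS[keyword]
```

So `resolve_size("far", 32)` gives `("fword", 48)`, while `resolve_size("far", 16)` gives `("ptr", 32)`.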

32-bit segments are specified by the least significant bit of the segment attribute. Equates suitable for specifying segment attributes are as follows:

    (* segment attributes *)
    use16 = 0H
    use32 = 1H
 
    abs_align   =  00H
    byte_align  =  20H
    word_align  =  40H
    para_align  =  60H
    page_align  =  80H
    dword_align = 0A0H
 
    dont_combine   = 00H
    memory_combine = 04H
    public_combine = 08H
    stack_combine  = 14H
    common_combine = 18H

A typical usage is then

    segment _TEXT('CODE',use16+byte_align+public_combine)
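As a quick check of the arithmetic, here is a small Python sketch that adds the equates together the same way the segment directive does. The dictionary and function names are made up for the example; the values are the equates listed above.

```python
# Equate values copied from the notes; the names of these helpers are hypothetical.
USE16, USE32 = 0x00, 0x01
ALIGN = {"abs": 0x00, "byte": 0x20, "word": 0x40,
         "para": 0x60, "page": 0x80, "dword": 0xA0}
COMBINE = {"dont": 0x00, "memory": 0x04, "public": 0x08,
           "stack": 0x14, "common": 0x18}

def segment_attribute(use32, align, combine):
    """Sum the three equates, as in use16+byte_align+public_combine."""
    return (USE32 if use32 else USE16) + ALIGN[align] + COMBINE[combine]

def is_32bit(attribute):
    """The least significant bit marks a 32-bit segment."""
    return bool(attribute & 0x01)

if __name__ == "__main__":
    print(hex(segment_attribute(False, "byte", "public")))
    print(hex(segment_attribute(True, "dword", "public")))
```

The 16-bit example above works out to 28H, and the `use32 + dword_align + public_combine` combination used in the AsmSupport module works out to 0A9H.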

Equated symbols may now be given any expression which can be used as an operand for an instruction, rather than just a numeric constant.
For example:

    v6 = word [bp][-6]
    ...
    mov ax, v6

Simple (parameter-less) macros can be defined.
For example:

    macro movsb "movs byte [si], byte es:[di]"

Note that no processing is performed during macro definition except to look for the delimiter. In particular, there is no conditional processing or macro expansion. The delimiter may be any non-blank character and may be repeated. The delimiter(s) are removed when the macro is invoked.
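One way to picture the delimiter rule is the Python sketch below. It is a guess at the behaviour, not the assembler's code: it assumes the first non-blank character after the macro name is the delimiter, and it reads "may be repeated" as meaning that runs of that character at either end of the body are stripped. `parse_macro` and `expand` are hypothetical names.

```python
def parse_macro(line):
    """Split 'macro name <delim>body<delim>' into (name, body).

    Assumption: the delimiter is the first non-blank character after the
    name, and repeated delimiters at either end are all removed.
    """
    keyword, name, rest = line.strip().split(None, 2)
    if keyword != "macro":
        raise ValueError("not a macro definition")
    delimiter = rest[0]
    return name, rest.strip(delimiter)

def expand(line, macros):
    """Replace each whitespace-separated token that names a macro (one pass only)."""
    return " ".join(macros.get(token, token) for token in line.split())

if __name__ == "__main__":
    name, body = parse_macro('macro movsb "movs byte [si], byte es:[di]"')
    print(expand("rep movsb", {name: body}))
```

Note that the body is stored untouched at definition time, which matches the statement that nothing is processed until the macro is invoked.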

A symbol can be purged (deleted) from the symbol table using purge, e.g.

    purge fred

(4) The reserved words are as follows:

      byte
      word
      dword
      fword
      qword
      tbyte
      db
      dw
      dd
      df
      log2
      power2
      ptr
      seg
      vdisp
      far
      near
      end
      extrn
      group
      include
      macro
      module
      org
      public
      purge
      section
      segment
      select
      st

The register names are:

    es  cs  ss  ds  fs  gs
    ax  cx  dx  bx  sp  bp  si  di
    eax ecx edx ebx esp ebp esi edi
    al  cl  dl  bl  ah  ch  dh  bh
    cr0 cr1 cr2 cr3 cr4 cr5 cr6 cr7
    dr0 dr1 dr2 dr3 dr4 dr5 dr6 dr7
    tr0 tr1 tr2 tr3 tr4 tr5 tr6 tr7
    st(0) st(1) st(2) st(3) st(4) st(5) st(6) st(7)

The opcode names (also reserved) are:

    aaa    aad    aam    aas    adc    add    and    arpl
    bound  bsf    bsr    bswap  bt     btc    btr    bts
    call   clc    cld    cli    clts   cmc    cmp    cmps
    cmpxchg daa   das    dec    div    enter  esc    halt
    idiv   imul   in     inc    ins    int    into   invd
    invlpg jmp
    jb     jae    je     jne    jbe    ja     jp     jpo
    jl     jge    jle    jg     jo     jno    js     jns
    lahf   lar    lds    lea    leave  les    lfs    lgdt
    lgs    lidt   lldt   lmsw   lods
    lsl    lss    ltr    mov    movs   movsx  movzx  mul
    neg    nop    not    or     out    outs   pop    push
    rcl    rcr    rol    ror    sahf   sar    sbb    scas
    retf   retn
    setb   setae  sete   setne  setbe  seta   setp   setpo
    setl   setge  setle  setg   seto   setno  sets   setns
    sgdt   shl    shld   shr    shrd
    sidt   sldt   smsw   stc    std    sti    stos   str
    sub    test   verr   verw   wbinvd fwait  xadd   xchg
    xlats  xor
 
    f2xm1  fabs   fadd   faddp  fbld   fbstp  fchs   fclex
    fcom   fcomp  fcompp fdecstp fdiv   fdivp  fdivr  fdivrp
    ffree  fiadd  ficom  ficomp fidiv  fidivr fild   fimul
    fincstp finit  fist   fistp  fisub  fisubr fld    fld1
    fldcw  fldenv fldl2e fldl2t fldlg2 fldln2 fldpi  fldz
    fmul   fmulp  fnop   fpatan fprem  fptan  frndint frstor
    fsave  fscale fsetpm fsqrt  fst    fstcw  fstenv fstp
    fstsw  fsub   fsubp  fsubr  fsubrp ftst   fxam   fxch
    fxtract fyl2x  fyl2xp1 fsin   fcos fsincos fprem1 fucom
    fucomp fucompp fdisi  feni
 
    lock   rep    repne
 
    cbw    cwd    iret   pusha  pushf  popa popf
    jcxz   loop   loope  loopne
 
    cwde   cdq    iretd   pushad pushfd popad popfd
    jecxz  loopd   looped loopned
 
    ret

(5) Things to watch out for:

The meaning of an instruction is derived from the opcode or operands, and prefixes are generated as necessary, e.g.

    cwde
    cdq
    iretd
    jecxz lab
    loopd lab
    pushad
    pushfd
    popad
    popfd
 
    push dword 0
    push dword [si]

Note that loop/loope/loopne always mean “loop on cx”. To loop on ecx, loopd/looped/loopned must be used. (This is not the same as Microsoft, where the meaning of loop depends on whether the current segment is 32-bit.)

However, the default size of the offset of a label depends on whether the current segment is 16-bit or 32-bit.

  extrn fred
  inc byte es:[fred]

The offset size can be specified explicitly, e.g.

  extrn fred
  inc byte es:[dword fred] (* 32-bit offset *)
  inc byte es:[word fred]  (* 16-bit offset *)

(6) The assembler requires string instructions to have operands.

The old short forms may be expressed as macros:

macro lodsb "lods byte [si]"
macro lodsw "lods word [si]"
macro stosb "stos byte es:[di]"
macro stosw "stos word es:[di]"
macro scasb "scas byte es:[di]"
macro scasw "scas word es:[di]"
macro movsb "movs byte [si], byte es:[di]"
macro movsw "movs word [si], word es:[di]"
macro cmpsb "cmps byte [si], byte es:[di]"
macro cmpsw "cmps word [si], word es:[di]"
macro xlat  "xlats byte [bx]"

However, for readability I suggest using the long forms.
32-bit addressing is achieved by using esi/edi.
The source segment may be overridden (e.g. lodsb byte ss:[si]).
Beware of doing this in conjunction with rep on 8086s, since multiple prefixes are not saved when an interrupt occurs.

Other opcodes which are defined as macros are as follows:

macro repe  "rep"
macro repz  "rep"
macro repnz "repne"
 
macro jc      "jb"
macro jnc     "jnb"
macro jnae    "jb"
macro jnb     "jae"
macro jz      "je"
macro jnz     "jne"
macro jna     "jbe"
macro jnbe    "ja"
macro jpe     "jp"
macro jnp     "jpo"
macro jnge    "jl"
macro jnl     "jge"
macro jng     "jle"
macro jnle    "jg"
 
macro setnae  "setb"
macro setnb   "setae"
macro setz    "sete"
macro setnz   "setne"
macro setna   "setbe"
macro setnbe  "seta"
macro setpe   "setp"
macro setnp   "setpo"
macro setnge  "setl"
macro setnl   "setge"
macro setng   "setle"
macro setnle  "setg"
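One thing worth noticing in the list above is that jnc is defined as "jnb", which is itself a macro for "jae". Whether the assembler rescans a macro body when it is invoked is not stated in the notes, so the Python sketch below simply assumes it does and follows the chain until it reaches a name that is not an alias. The table is a small excerpt of the macros above, and `resolve` is a hypothetical helper.

```python
# Excerpt of the alias macros listed above (alias -> body).
ALIASES = {
    "jc": "jb",
    "jnc": "jnb",
    "jnb": "jae",
    "jz": "je",
    "jnz": "jne",
    "repz": "rep",
    "repnz": "repne",
    "setz": "sete",
}

def resolve(opcode, aliases=ALIASES, limit=8):
    """Follow alias definitions until a non-alias name is reached.

    Assumption: macro bodies are rescanned on invocation, so chains work.
    The limit guards against an accidental cycle in the table.
    """
    steps = 0
    while opcode in aliases:
        opcode = aliases[opcode]
        steps += 1
        if steps > limit:
            raise ValueError("alias cycle")
    return opcode
```

Under that assumption, `resolve("jnc")` ends up at `"jae"`, and a name that is not an alias, such as `"mov"`, comes back unchanged.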