Thinkage Ltd.
85 McIntyre Drive
Kitchener, Ontario
Canada N2R 1H6
Copyright © 2007 by Thinkage Ltd.
The YAA assembler is a portable assembler, in the sense that it is not strongly tied to a particular machine or operating system. Its structure and its pseudo-ops are intended to support assembler programs for a variety of hardware types. Of course, the actual assembler operands will be those of a particular machine, but program structure is intended to be as system-independent as possible.
This version of the manual describes YAA as implemented for GCOS-8 running on DPS-8 machines. The output of YAA on this system is object code in a format recognized by the LD Link Editor (created by Thinkage Ltd.). LD can translate this object code into a format suitable for loading in the multi-segment GCOS-8 environment (i.e. into an OM) or into a format suitable for loading in the single-segment GCOS-8 environment (i.e. into a Bstar file). Note that some assembler instructions are unique to the multi-segment environment and may not be used in single-segment programs. The LD Link Editor is explained in "expl ld".
This chapter discusses the basic concepts for programming with YAA. We begin by describing the lexical elements of YAA, i.e. the pieces that go together to make up a YAA program. These pieces are often known as the tokens of the language.
Tokens are the smallest meaningful pieces of input that the assembler recognizes. Tokens can be identifiers like
x lda .data
constants like
14 2.56 'ab' 4E-2
and string literals like
"hello there"
Punctuation characters and operators like
* = ++ ! ( ) [ ]
are each considered tokens as well.
Identifiers serve as names for items in a YAA program (e.g. data objects, opcodes, macros, etc.). Identifiers may be formed from the upper case letters 'A'-'Z', the lower case letters 'a'-'z', the digits '0'-'9', the underscore character '_', the dollar sign '$', and the dot '.'. The first character of an identifier may not be a digit.
Identifiers may be arbitrarily long. The case of letters in an identifier is significant; thus NAME, name, and Name are all different identifiers.
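The identifier rules above can be checked mechanically. The following Python sketch is only a model of the rules as stated in this section; the names IDENTIFIER_RE and is_identifier are hypothetical and are not part of YAA.

```python
import re

# Character set taken from the text: letters, digits, '_', '$' and '.'.
# The first character may not be a digit.
IDENTIFIER_RE = re.compile(r"[A-Za-z_$.][A-Za-z0-9_$.]*")


def is_identifier(text: str) -> bool:
    """Return True if text is a well-formed identifier under the rules above."""
    return IDENTIFIER_RE.fullmatch(text) is not None


print(is_identifier(".data"))   # True
print(is_identifier("9lives"))  # False (starts with a digit)
```

Because case is significant, NAME, name, and Name all pass the check but remain three distinct identifiers.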
There are several types of integer constants: decimal, octal, hexadecimal, binary, BCD, and ASCII. YAA's internal arithmetic is always performed with the same precision as the longest integer format on a particular machine. On the DPS-8, this means that integer arithmetic uses 36-bit integers. Of course, integer data objects in an assembled program may be given any format recognized by the machine.
A decimal integer is written as a sequence of digits with no leading zero, as in
2400 5 87 1340932
The minus sign ('-') may be used to create negative integers, but it is actually considered an operator on the integer, not part of the constant itself.
An octal integer is written as a sequence of digits with at least one leading zero. Only octal digits may be used (i.e. the digits '0' through '7'), as in
01 007 000400 0777
Each octal digit represents three bits.
A hexadecimal integer is written as a 0x or 0X followed by a sequence of hexadecimal digits. The hexadecimal digits are '0' through '9', plus the letters 'A' through 'F' standing for the (decimal) values from 10 to 15. Digits represented by letters may appear in either upper or lower case, as in
0X10 0x10 0xc0BB 0XFFFF 0xFf
Each hexadecimal digit represents four bits.
A binary integer is written as a 0b or 0B followed by a sequence of ones and zeroes, as in
0b0000 0B1111 0B101010
Each binary digit represents one bit.
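As a rough illustration of how the four numeric notations above are told apart by their prefixes, here is a small Python sketch. It is a model under the rules stated in this section, not YAA's actual scanner; parse_integer is a hypothetical name, and the sketch does not enforce the 36-bit range.

```python
import re

# (pattern, base) pairs, tried in order. A lone "0" is read as octal zero,
# since octal constants are the ones that start with a leading zero.
_FORMS = [
    (re.compile(r"0[xX]([0-9A-Fa-f]+)"), 16),  # hexadecimal: 0x / 0X prefix
    (re.compile(r"0[bB]([01]+)"), 2),          # binary: 0b / 0B prefix
    (re.compile(r"(0[0-7]*)"), 8),             # octal: at least one leading zero
    (re.compile(r"([1-9][0-9]*)"), 10),        # decimal: no leading zero
]


def parse_integer(text: str) -> int:
    """Return the value of a decimal, octal, hexadecimal or binary constant."""
    for pattern, base in _FORMS:
        match = pattern.fullmatch(text)
        if match:
            return int(match.group(1), base)
    raise ValueError(f"not an integer constant: {text!r}")


print(parse_integer("0777"))      # 511
print(parse_integer("0B101010"))  # 42
```

Input such as 08 is rejected, because a leading zero selects octal and 8 is not an octal digit.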
An ASCII character constant consists of one or more ASCII characters enclosed in single quotes, as in
'a' '0' 'ab' '.'
ASCII character constants are integer constants. They occupy the same amount of memory as other integer constants. Therefore the maximum number of characters that may be specified in an ASCII character constant is the maximum number of ASCII characters that may be stored in an integer. On the DPS-8, this is four ASCII characters.
When an ASCII character constant contains fewer characters than can be stored in an integer, the characters that are specified are right-justified within the integer and padded on the left with 0-bits.
A BCD character constant consists of one or more BCD characters enclosed in grave accents, as in
`a` `ab` `012345` `......`
BCD character constants are integer constants and occupy the same amount of memory as other integer constants. Therefore the maximum number of characters that may be specified in a BCD character constant is the maximum number of characters that may be stored in an integer. On the DPS-8, this is six BCD characters.
When a BCD character constant contains fewer characters than can be stored in an integer, the characters that are specified are right-justified within the integer and padded on the left with 0-bits.
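The right-justification rule for ASCII and BCD character constants can be sketched in Python as follows. The field widths are assumptions derived from the limits given above (36 bits / 4 ASCII characters = 9 bits, 36 bits / 6 BCD characters = 6 bits), and the sketch assumes that the first character written ends up in the more significant position. The function names are hypothetical. BCD character codes are not listed in this section, so the BCD case takes numeric codes directly.

```python
WORD_BITS = 36  # integer size on the DPS-8, as stated in the text


def pack_right_justified(codes, bits_per_char):
    """Pack character codes into one integer, padding on the left with 0-bits."""
    capacity = WORD_BITS // bits_per_char
    if not 1 <= len(codes) <= capacity:
        raise ValueError(f"need between 1 and {capacity} characters")
    word = 0
    for code in codes:
        if not 0 <= code < (1 << bits_per_char):
            raise ValueError(f"code {code} does not fit in {bits_per_char} bits")
        # Shift earlier characters left and place the new one at the bottom.
        word = (word << bits_per_char) | code
    return word


def ascii_constant(chars: str) -> int:
    """Value of an ASCII character constant such as 'ab' (assumed 9-bit fields)."""
    return pack_right_justified([ord(c) for c in chars], 9)


print(ascii_constant("a"))   # 97
print(ascii_constant("ab"))  # 49762, i.e. (97 << 9) | 98
```

Under these assumptions, a six-character BCD constant would be packed with pack_right_justified(codes, 6), where codes are the six BCD character codes.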
A floating point constant represents a number that may have a fractional part or exponent.
A floating point constant is written as a sequence of digits, optionally followed by a decimal point, optionally followed by more digits, optionally followed by an exponent consisting of E or e followed by a signed decimal integer. If the decimal integer in the exponent is positive, the '+' sign may be omitted. Either the decimal point or the exponent (or both) must be present for a constant to be interpreted as a floating point number. Here are some examples of floating point constants.
2.3 1. 3E5 4.E-2 0.34e7
Notice that the first character of a floating point constant must be a digit; input like .5 is not a valid floating point constant.
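The floating point syntax above translates fairly directly into a regular expression. The Python sketch below models the stated rules only; FLOAT_RE and is_float_constant are hypothetical names.

```python
import re

# digits, optional "." with optional digits, optional exponent (E or e,
# optional sign, decimal digits).
FLOAT_RE = re.compile(r"[0-9]+(\.[0-9]*)?([eE][+-]?[0-9]+)?")


def is_float_constant(text: str) -> bool:
    """True if text is a floating point constant under the rules above."""
    match = FLOAT_RE.fullmatch(text)
    if match is None:
        return False
    # A decimal point or an exponent (or both) must be present;
    # otherwise the input is an integer constant.
    return match.group(1) is not None or match.group(2) is not None


for sample in ["2.3", "1.", "3E5", "4.E-2", "0.34e7", ".5", "14"]:
    print(sample, is_float_constant(sample))
```

The first five samples are accepted. The input .5 is rejected because it does not start with a digit, and 14 is rejected because it has neither a decimal point nor an exponent.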
Internally, YAA uses the floating point format with the longest precision available on a particular machine. On the DPS-8, this means double precision. Of course, floating point data objects in an assembled program may be given any format recognized by the machine.
If you attempt to store a floating point constant in a memory area that is not long enough to hold a floating point value, the rightmost bits of the constant will be discarded in order to truncate the constant to the required size.
An ASCII string consists of zero or more ASCII characters enclosed in double quotes, as in
"This is a string." "" "The above is a null string."
The string consists only of the characters specified. NOTE to C programmers: there is no special character added to mark the end of the string (so no '\0').
Special input sequences may be used in strings and character constants to represent unusual characters. Such sequences are known as escape sequences.
An escape sequence consists of a backslash (\) followed by one or more other characters. In an ASCII string or character constant, an escape sequence represents a single ASCII character. In a BCD character constant, an escape sequence represents a single BCD character. Below we list the escape sequences recognized by YAA. Some of these are only appropriate as ASCII characters, while others can be used for both ASCII and BCD.
"He said, \"Hello there.\"\n"
"The backslash character is \\"
Notice that escape sequences are written with several characters but only represent a single character. Therefore a character constant like
'\n'
is a single character, even though it is written with two characters.
For the most part, YAA source code is "free-format". It differs from older assemblers in the following respects.
In the sections to come, we will describe the general format of YAA source code.
YAA uses parentheses ( ) and square brackets [ ] to enclose various constructs. These will be grouped together under the name "Brackets" in this manual.
A Bracket-balanced token sequence is a sequence of tokens in which every opening Bracket has an appropriate closing Bracket of the same type, and Bracket pairs are properly nested. This rules out such constructs as
A ( B        -- opening (, but no closing
A (B [ C) ]  -- not properly nested
A Bracket-balanced token sequence may not contain a new-line character. Code blocks are described in Chapter 8.
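The Bracket-balance rule is the familiar stack check. The following Python sketch illustrates it; is_bracket_balanced is a hypothetical name, and for simplicity the sketch works on raw characters and ignores the possibility of Brackets inside string or character constants.

```python
PAIRS = {")": "(", "]": "["}  # closing Bracket -> matching opening Bracket


def is_bracket_balanced(text: str) -> bool:
    """Check that ( ) and [ ] are paired and properly nested on one line."""
    if "\n" in text:
        return False  # a Bracket-balanced sequence may not span lines
    stack = []
    for ch in text:
        if ch in "([":
            stack.append(ch)
        elif ch in PAIRS:
            # A closer must match the most recently opened Bracket.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack  # anything left on the stack was never closed


print(is_bracket_balanced("A ( B"))        # False
print(is_bracket_balanced("A (B [ C) ]"))  # False
print(is_bracket_balanced("A (B [ C ])"))  # True
```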
The term white space refers to any sequence of blank or horizontal tab characters. Blanks and horizontal tabs are called white space characters.
Two consecutive tokens must be separated by white space if they would form a single token when joined together. For example, suppose your code had an identifier A followed by the floating point constant 1.0. They must be separated by at least one white space character, as in
A 1.0
Without the white space character, they would read A1.0 which is a valid identifier token.
Tokens may always be separated by one or more white space characters, even if the white space is not necessary. Therefore an expression like
A+B
could also be written
A + B
if desired.
A statement consists of three fields: the label, the opcode, and the operand list.
The statement label field consists of a Bracket-balanced token sequence, followed by a colon. The most common form of the statement label is
identifier :
where the identifier is a normal YAA identifier. You may also have
[ identifier,identifier, ... ] :
to put more than one label on the same statement. The labels are separated by commas and enclosed in square brackets.
The label field may also be an expression, provided that the result of the expression is a token sequence containing one or more comma-separated identifiers. For example,
(X >> 0) ? [A] : [B] :
is a conditional expression, as explained in Chapter 5. The value of this label is [A] if X is greater than 0, and [B] otherwise.
Statement labels are optional. A statement may consist only of a label, as in
name:
This associates the given name with the current location (i.e. the value of the instruction counter).
Input of the form
name1: name2: name3: ...
is invalid. To put several labels on the same statement, give a list enclosed in square brackets (as shown previously).
The opcode field consists of the shortest possible Bracket-balanced token sequence following the statement label (if the statement has a label). In general, this will be a machine opcode or a YAA pseudo-op, as in
lda .null
However, it could also be an expression, as in
(.unquote("lda"))
Notice that the above expression had to be enclosed in parentheses. If we had just written
.unquote("lda")
the opcode would have been taken to be .unquote, since this is the shortest possible Bracket-balanced token sequence.
Anything following the opcode field is taken to be part of a list of operands for the operation. Operands in the list are Bracket-balanced token sequences that are separated by commas. For example,
A+B,NAME,[tok1,tok2]
is a list consisting of three operands:
A+B
NAME
[tok1,tok2]
The comma inside the square brackets does not count as an operand separator because operands must be Bracket-balanced.
Remember that any amount of white space may be used in an operand list. Thus the above operand list could have been written as
A + B, NAME, [tok1, tok2]
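To make the operand-separation rule concrete, here is a Python sketch that splits an operand list on commas that are not enclosed in Brackets. The function name split_operands is hypothetical, and the sketch does not handle commas or Brackets inside string constants.

```python
def split_operands(operand_list: str) -> list:
    """Split on commas that are not inside ( ) or [ ]."""
    operands = []
    current = []
    depth = 0  # current Bracket nesting depth
    for ch in operand_list:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        if ch == "," and depth == 0:
            # Top-level comma: the current operand is complete.
            operands.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    operands.append("".join(current).strip())
    return operands


print(split_operands("A+B,NAME,[tok1,tok2]"))
# ['A+B', 'NAME', '[tok1,tok2]']
```

White space around each operand is trimmed, so the spaced-out form of the same list yields the same three operands (with the interior spacing of each operand preserved).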
The end of an operand list can be marked in several ways.
With a semi-colon. For example,
tsx1 subroutine,modifier; tra error_rtn
shows a line that contains two separate instructions. The semi-colon marks the end of the operand list of the first instruction, and therefore the end of the instruction itself. The second instruction starts after the semi-colon.
With the end of the line. In
tra error_rtn
the operand list (and therefore the instruction itself) ends at the end of the line.
Some types of statements require an operand list, others allow an operand list to be present or omitted, while still others result in errors if an operand list is present.
An empty statement is a statement without opcode or operand list fields. For example, in
tsx1 subroutine, modifier; ; tra error_rtn
there is an empty statement between the two other instructions. Such statements have no effect on generated code.
The empty statement form in the example above is not very useful. It is much more common to use empty statements which are simply blank lines between other instructions, as in
tsx1 subroutine, modifier

tra error_rtn
Such empty statements can be used to divide instructions into logical groups, making the program source code easier to read.
A YAA comment consists of a number sign (#), followed by zero or more other characters, followed by a new-line. In other words, a YAA comment begins at a '#' and extends to the end of the source code line. For example, you might have
jump: tsx1 subroutine,modifier   # Call subroutine
      tra error_rtn              # Return location
                                 # if error
Notice that the final line of the above example does not have an opcode or operand list. Comments may occupy a line all on their own.
Number signs inside ASCII string or character constants do not count as the beginning of a comment. For example, in
Str: .data "This song is in C#" #Comment
the first number sign is part of the string. The comment does not begin until the second number sign.
Number signs inside token sequences may be confused with the beginning of a comment. In this case, put a backslash in front of the number sign, as in \#. This always indicates a literal number sign, not the start of a comment.
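The comment rules above (a '#' starts a comment unless it is inside a string or character constant, or is written as \#) can be modelled with a small scanner. This Python sketch is illustrative only: strip_comment is a hypothetical name, and the sketch leaves any \# sequence in the returned text unchanged rather than translating it.

```python
def strip_comment(line: str) -> str:
    """Return the part of a source line that precedes any comment."""
    quote = None  # the quote character we are inside, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it protects
            continue
        if quote is not None:
            if ch == quote:
                quote = None  # end of string or character constant
        elif ch in "\"'":
            quote = ch  # start of string or character constant
        elif ch == "#":
            return line[:i]  # comment runs to the end of the line
        i += 1
    return line


print(strip_comment('Str: .data "This song is in C#" #Comment'))
# Str: .data "This song is in C#"   (followed by one trailing blank)
```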
Statements may be split over several lines of source input. If a line of source input ends in a backslash, YAA will discard the backslash, the new-line character at the end of the source line, and any white space characters at the beginning of the next source line. For example,
Str: .data \
     "A string"
is changed to
Str: .data "A string"
Similarly,
Str: .data "A \
     string"
is changed to
Str: .data "A string"
as before. This shows how ASCII strings may be broken over more than one line of input.
In this kind of construction, the backslash doesn't have to be the last character on the line. It can be followed by any number of blanks or tabs. It can also be followed by at least one blank or tab, followed by a comment, as in
Str: .data \ #Comment
     "A string"
This is equivalent to
Str: .data "A string"
The continuation goes "around" the comment. However, you could not say
Str: .data "A \ #Comment
     string"
because '#' is not recognized as the beginning of a comment when it is inside a string.
In this sort of construction, there must always be at least one blank or tab between the backslash and the '#'. Remember that the sequence \# is always interpreted as a literal number sign, not the beginning of a comment.
Notice that white space at the beginning of the second line is discarded. If you do not want YAA to discard white space at the beginning of a continued line, put a backslash in front of the first white space character that you want to be significant. For example,
Str: .data "A \
     \ string"
is equivalent to
Str: .data "A  string"
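The basic continuation rule can be sketched in Python as shown below. This is a simplified model with a hypothetical function name: it handles a trailing backslash optionally followed by blanks or tabs, but it does not handle a comment after the backslash, nor the backslash that protects leading white space on the continued line.

```python
import re

# A backslash followed only by blanks or tabs at the end of the line.
_CONTINUATION = re.compile(r"\\[ \t]*$")


def join_continuations(lines):
    """Join source lines that end in a backslash with the line that follows."""
    joined = []
    pending = None  # text carried over from a continued line
    for line in lines:
        if pending is not None:
            # Leading white space on the continued line is discarded.
            line = pending + line.lstrip(" \t")
        match = _CONTINUATION.search(line)
        if match:
            pending = line[:match.start()]  # drop the backslash and what follows
        else:
            joined.append(line)
            pending = None
    if pending is not None:
        joined.append(pending)  # last line ended in a backslash
    return joined


print(join_continuations(['Str: .data \\', '     "A string"']))
# ['Str: .data "A string"']
```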
High level programming languages usually have reserved words, i.e. symbols (identifiers) which can only be used for purposes defined by the language. For example, in C, the keyword if may only be used to begin a statement; it may not be used as a variable name.
YAA is not quite this restrictive. Symbols may be reserved in some contexts, but unreserved in others. Most importantly, the machine opcodes are reserved words when they appear in the opcode field, but not when they are used in the label or operand fields. As an example, suppose the machine has an opcode named lda. This is a reserved word in the opcode field; you could not create an opcode macro with the same name. However, it is not reserved in the label or operand fields; you could create a variable with the same name, provided that you didn't intend to use the variable to stand for an opcode. This feature makes sure that code doesn't "break" if a new opcode is added to your machine's instruction set, conflicting with a variable in some existing program.
As a second example, the symbol du is a reserved word when it appears in an operand list as an instruction tag field on the DPS-8 machine. However, in other contexts, it is not reserved.
YAA is defined so that all contextually reserved words must be typed in lower case in source code. This means that all opcodes, pseudo-ops, special operands, etc. must appear in lower case. Users who prefer to write such symbols in upper or mixed case may create appropriate macros for the symbols. For example, you might create a macro named LDA which stands for the lda instruction. Macro definition is described in Chapter 8.
YAA's machine-independent reserved words (i.e. pseudo-ops and special symbols used by the assembler) all start with the dot (.) character. To avoid conflicts with such symbols, users should avoid creating names that begin with a dot. Note that other reserved words (e.g. names of opcodes on a particular machine) do not necessarily begin with a dot.
The YAA assembler accepts a command line option of the form
INITialization=file
where file is the name of a file. If this is present, YAA will read in the contents of the given file and assemble them before any other code. A typical initialization file may contain definitions for macros (see Chapter 8), options controlling listing format (see Chapter 9), and .search directives setting up search rules (see Chapter 8).
If there is no explicit initialization file specified on the command line, YAA will use a default initialization file. If you do not want to use this default, specify
init=
on the command line (without any file name after the '='). See "expl yaa" for details.
The output produced by YAA is organized into sections. A section is a block of assembled material that should be considered as a unit for the purpose of linking. For example, the executable code of each function in a program might be considered to be a section. Similarly, each external data object is a section all on its own.
A section is either a data section or a code section. Data sections contain data objects and code sections contain executable code. Some operations (e.g. the .align pseudo-op described in Chapter 8) behave differently in data sections and code sections.
Sections may contain other sections. For example, a section may contain several external data objects which are themselves sections. Any level of nesting is allowed. If section A contains section B, A is said to be the parent section of B.
Sections may or may not have names. For example, a section that just contains an external data object has the name of the object.
For the most part, the symbols created in your program are simply for the convenience of assembly -- once the program has been assembled, they are forgotten. An external symbol is one whose name is retained until the program is linked. Special directives are needed to inform YAA that a particular symbol is external; these are explained in Chapter 8.
There are two types of external symbols.
External symbols are always the names of sections or of fixed offsets within sections.
When a name is associated with a location in a section, the location is represented by an offset from the beginning of the section. Thus a location name has two associated quantities: an identifier indicating the section that contains the location; and an offset. This offset may be expressed in bits, bytes, or words.
At the time a section is created, YAA must decide whether subsequent location names should represent offsets in bits, bytes, or words. This means choosing the offset mode for the section. Possible offset modes are represented by the keywords bit, byte, and word. The default offset mode for GCOS-8 is word, meaning that subsequent location names are associated with word offsets from the beginning of the section. Different offset modes can be specified on the statement that creates the section.
The offset mode for a section can be changed part way through the section using the .usage pseudo-op (described in Chapter 8). After the offset mode has been changed, subsequent location names will be associated with offsets in the new units. However, the offsets will still be from the beginning of the section.
When a program defines a symbol referring to a location, the symbol is associated with an integer representing the offset of the location from the beginning of the section. The offset is measured in the units dictated by the section's current offset mode.
For example, suppose the current offset mode of a section is byte and that X is used to name a location in that section. X will be associated with an integer giving the byte offset of the location in the section. Now, what happens if you try an instruction like
lda X
X is just a number (giving a byte offset in a section). But the lda instruction expects a word address. To resolve the conflict here, you must convert the byte offset X into words, as in
lda X/4
In one sense then, a symbol associated with a location in a section is just an integer. In another sense, however, the symbol has its own "offset mode" inherited from the section's offset mode -- the symbol represents an offset in specific units.
YAA keeps track of each symbol's offset mode. This is considered to be the type of the symbol. For example, if a symbol is defined in a section that has a byte offset mode, that symbol is taken to have the byte type.
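The arithmetic behind conversions such as X/4 can be written out explicitly. The Python sketch below assumes 36-bit words (stated earlier for the DPS-8) and 9-bit bytes (implied by four bytes per word); the names BITS_PER_UNIT and convert_offset are hypothetical, and truncating division for offsets that do not fall on a unit boundary is an assumption of the sketch.

```python
# Assumed unit sizes on the DPS-8: 36-bit words, four 9-bit bytes per word.
BITS_PER_UNIT = {"bit": 1, "byte": 9, "word": 36}


def convert_offset(value: int, from_mode: str, to_mode: str) -> int:
    """Re-express an offset from the start of a section in different units."""
    bits = value * BITS_PER_UNIT[from_mode]
    return bits // BITS_PER_UNIT[to_mode]  # truncates partial units


print(convert_offset(8, "byte", "word"))  # 2, the same result as 8/4
print(convert_offset(8, "word", "byte"))  # 32
```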
YAA will warn you if you try to create expressions which mix symbols of different types. For example, consider
.usage word
X: .data 0
.usage byte
Y: .data 0
lda X+Y,du
In this example, X has the word type and Y has the byte type. YAA will therefore warn you about mixing types in the lda instruction.
Note that YAA does not warn you about using the wrong type of operand inside a machine instruction. For example, you can write
lda Y,du
even though Y is a byte offset and lda expects a word offset. In this case, you would have to use an explicit type cast, as discussed in the next section.
You can cast symbols to different types using an expression of the form
type :: symbol
For example,
.usage byte
Y: .data 0
lda (word::Y),du
shows how to change a byte offset into a word offset when necessary.
You may also use the :: operator to cast the type of an expression. For example,
word::(4+4)
represents an offset of eight words.
Cast operators associate from right to left, so that
byte :: word :: (4+4)
is equivalent to
byte :: (word::(4+4))
This is evaluated in two steps. word::(4+4) represents a word offset of 8. The byte:: then converts this to a byte offset, so the final result is a byte offset of 32 (which is equivalent to a word offset of 8).
The reserved word none represents values with no type. For example, a constant expression would have the none type; such expressions are just numbers, not offsets. If you use such a value in a context where an offset is expected, YAA assumes that the numeric value represents an offset of the correct type. For example, consider
lda 3+2,du
The result of 3+2 is 5 and its type is none. Since lda expects this operand to be a word offset, the expression denotes a word offset of 5.
You may use the none type in :: cast operations. For example,
none::X
just stands for the value of X; the type of the result is none. To see what effect this has, consider
byte :: none :: word :: 8
The expression word::8 stands for a word offset of 8; using none:: results in just the number 8; finally, applying byte:: gives a byte offset of 8. Contrast this with
byte :: word :: 8
where word::8 is a word offset of 8 and byte:: converts this to a byte offset of 32.
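The behaviour of the :: operator described above, including the effect of none, can be modelled in a few lines of Python. This is a sketch of the semantics as described in this section, not of YAA's implementation. The names Value and cast are hypothetical, and 9-bit bytes are assumed (four bytes per 36-bit word).

```python
from dataclasses import dataclass

BITS_PER_UNIT = {"bit": 1, "byte": 9, "word": 36}  # assumed DPS-8 unit sizes


@dataclass(frozen=True)
class Value:
    number: int
    kind: str  # "bit", "byte", "word" or "none"


def cast(target: str, value: Value) -> Value:
    """Model of target :: value."""
    if target == "none" or value.kind == "none":
        # No units to convert: the number is kept and simply relabelled.
        return Value(value.number, target)
    bits = value.number * BITS_PER_UNIT[value.kind]
    return Value(bits // BITS_PER_UNIT[target], target)


eight = Value(8, "none")

# byte :: word :: 8 (casts associate from right to left)
print(cast("byte", cast("word", eight)))                # Value(number=32, kind='byte')

# byte :: none :: word :: 8
print(cast("byte", cast("none", cast("word", eight))))  # Value(number=8, kind='byte')
```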
As statements are assembled, the generated code is stored in a section. A value called the instruction counter or IC measures how much code has been placed into the section. (Note that YAA's instruction counter is an artificial construct that is local to a section, and has no direct relation to the hardware's instruction counter.) The offset mode of the section determines the units in which the IC measures quantity of code (bits, bytes, or words).
If the offset mode of a section is changed, the IC will measure code in the new units. For example, suppose the offset mode starts out at word. The IC indicates the present location in the section, expressed as a word offset from the beginning of the section. If the offset mode changes to byte, the IC will then indicate the present location in the section, expressed as a byte offset from the beginning of the section.
The value of the IC may be obtained via the .ic function, described in Chapter 7. This may be used anywhere in the program to represent a location in a section (represented as an integer offset into the section). The asterisk '*' can also be used to refer to the value of the IC. For example, the TRA (transfer) instruction
tra *+2
jumps to the location two units beyond the current value of the IC. The units used are given by the offset mode of the section. For example, if the offset mode is word, *+2 is two words beyond the current value of the IC.
TECHNICAL NOTE: internally, the IC is always kept as a bit offset. The value of the IC is converted to words or bytes (as appropriate) whenever it is used in source code.
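The technical note suggests a simple model: keep the counter in bits and convert on every read. The Python sketch below illustrates that idea only. The Section class and its members are hypothetical, and 9-bit bytes are assumed.

```python
BITS_PER_UNIT = {"bit": 1, "byte": 9, "word": 36}  # assumed DPS-8 unit sizes


class Section:
    """Toy model of a section whose IC is stored internally as a bit offset."""

    def __init__(self, mode: str = "word"):
        self.mode = mode   # current offset mode: "bit", "byte" or "word"
        self._ic_bits = 0  # the IC is always kept in bits

    def emit(self, n_bits: int) -> None:
        """Record that n_bits of generated material were added to the section."""
        self._ic_bits += n_bits

    @property
    def ic(self) -> int:
        """The IC expressed in the units of the current offset mode."""
        return self._ic_bits // BITS_PER_UNIT[self.mode]


section = Section("word")
section.emit(2 * 36)   # two words of code
print(section.ic)      # 2
section.mode = "byte"  # models changing the offset mode part way through
print(section.ic)      # 8
```

In both cases the IC still measures from the beginning of the section; only the units change.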
All instructions automatically force the alignment they require. For example, the RPD instruction on GCOS-8 requires odd-word alignment. YAA will automatically generate NO-OP instructions to align RPD instructions in this way. For more on instruction alignment, see Chapter 8.
The link editing process groups all the sections of a program into one or more segments. On DPS-8 architecture machines, the segment is the basic organizing unit for memory. Each segment is referred to using a 12-bit quantity called a SEGID. A location in a program is completely specified by determining the SEGID of the segment that contains the location and the offset of the location from the beginning of that segment.
Since sections are not collected into segments until link editing, addresses cannot be fully determined at assembly time -- YAA cannot calculate either the SEGID of a location or the offset of that location from the beginning of the segment. Therefore in addition to assembled machine code, YAA outputs "pseudo-addresses" and relocation instructions telling the link editor how to resolve these pseudo-addresses into real address formats.
The format YAA uses for pseudo-addresses is transparent to the programmer -- the link editor does all the work of conversion. However, the programmer should be aware of the relocation instructions recognized by the link editor. Each of these is identified by a keyword.
All of the above relocation instructions can be explicitly requested by the programmer. There is another type of relocation instruction that cannot be explicitly requested. When you use an offset from an ar register, YAA generates a pseudo-address that the link editor will turn into a 15-bit word offset from whatever value is stored in the ar register. This offset will be stored in the bottom 15 bits of the upper half of a machine word. Up to 18 bits can be allocated for the offset, but only 15 bits will be used.
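The placement of the 15-bit offset described above can be illustrated with plain bit arithmetic. The Python sketch below makes several assumptions that are not stated in the text: bits are counted from the least significant end of a 36-bit word, the upper half is the high-order 18 bits, and the offset is treated as unsigned. The function name store_ar_offset is hypothetical.

```python
WORD_MASK = (1 << 36) - 1    # 36-bit machine word
FIELD_SHIFT = 18             # the upper half starts 18 bits up (assumed numbering)
FIELD_MASK = 0x7FFF << FIELD_SHIFT  # bottom 15 bits of the upper half


def store_ar_offset(word: int, offset: int) -> int:
    """Return word with the 15-bit offset placed in the bottom of its upper half."""
    if not 0 <= offset < (1 << 15):
        raise ValueError("offset does not fit in 15 bits")
    cleared = word & ~FIELD_MASK & WORD_MASK  # clear only the 15-bit field
    return cleared | (offset << FIELD_SHIFT)


print(oct(store_ar_offset(0, 1)))  # 0o1000000
```

The top three bits of the upper half and the whole lower half are left untouched, which matches the statement that only 15 of the 18 available bits are used.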
In general, you do not have to specify a specific relocation type for a relocatable value -- YAA uses default types which are usually what you want. The default is dictated by the context in which the relocatable value is used. For example, the GCOS-8 lda instruction expects a word offset as an argument; therefore YAA will generate an (upper) word relocation instruction for lda's argument.
YAA expressions may be used as labels, as opcodes, or as operands. While most expressions are likely to be simple (e.g. just the name of an opcode or a data object), YAA does allow the use of complicated expressions.
Every expression has a type. The type of an expression depends on the type of values used in the expression and the operations performed on those values. The possible types are described in the sections that follow.
An integer expression yields an integer value that does not stand for a location. Remember that this value will be expressed using the longest integer format available on the machine for which the program is assembled. On the DPS-8, this is a 36-bit integer.
There are a large number of expressions that yield an integer value: expressions that perform arithmetic with integer values; comparison expressions (e.g. A>>B which yields 1 if A is greater than B and 0 otherwise); logical and bit manipulation operations; and a number of special operations (e.g. an operation that determines the length of a string). All of these expressions will be discussed later in the chapter.
The result of a location expression represents a memory location. A location value may be absolute (in which case it is just an integer standing for a bit, byte, or word offset from some other location) or relocatable (in which case it consists of an integer offset and one or more linkable symbols, i.e. symbols whose locations will be determined by the link editor). The most general form for relocatable values is
I + LS1 - LS2 + LS3 ...
where I is an integer indicating a constant positive or negative offset, and LS1, LS2, LS3, etc. are all linkable symbols.
Notice that there is only one integer in a relocatable value. All other parts are linkable symbols. When an expression combines several relocatable values in any way, all the integer parts will be gathered together to yield a single integer offset. For example, if you subtract one location value from another, YAA will subtract the two integer parts first, and then calculate the remaining "linkable" part.
When performing arithmetic on the integer offsets of relocatable values, YAA pays no attention to the units of the offsets -- they're just integers. Thus if an expression contains offsets in different units, you must do the conversions yourself.
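The general form I + LS1 - LS2 + LS3 ... suggests representing a relocatable value as one integer plus a signed collection of linkable symbols. The Python sketch below models addition and subtraction on that representation. The class Reloc and the helper symbol are hypothetical, and the cancelling of a symbol that is added and then subtracted is an assumption of the sketch. Units are ignored, just as described above.

```python
class Reloc:
    """Toy relocatable value: an integer offset plus signed linkable symbols."""

    def __init__(self, offset=0, symbols=None):
        self.offset = offset
        # Map symbol name -> signed count; entries that cancel out are dropped.
        self.symbols = {n: c for n, c in (symbols or {}).items() if c != 0}

    def _combine(self, other, sign):
        if isinstance(other, int):
            other = Reloc(other)
        merged = dict(self.symbols)
        for name, count in other.symbols.items():
            merged[name] = merged.get(name, 0) + sign * count
        # All integer parts are gathered into a single offset.
        return Reloc(self.offset + sign * other.offset, merged)

    def __add__(self, other):
        return self._combine(other, +1)

    def __sub__(self, other):
        return self._combine(other, -1)


def symbol(name):
    """A linkable symbol with no integer offset."""
    return Reloc(0, {name: 1})


a = symbol("LS1") + 5
b = symbol("LS1") + 3
difference = a - b
print(difference.offset, difference.symbols)  # 2 {}
```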
Operations with relocatable values may be restricted by the link editor's ability to resolve such operations.
A floating point expression yields a floating point value as its result. In this version of YAA, floating point expressions may contain addition, subtraction, multiplication, division, and the .max and .min functions (described in Chapter 7).
A string expression yields a string as its result. For example,
.concat("abc","def")
is a string expression whose result is the concatenation of the two strings (i.e. the string "abcdef"). Other expressions with string results are described later in this chapter.
A token sequence expression yields a sequence of tokens as its result. This sequence consists of zero or more tokens. The way in which these tokens are used depends on their context in the program being assembled.
As an example of a token sequence expression, the .unquote operator takes a string expression as its argument and returns a token sequence consisting of the tokens that appear in the string. This means that
.unquote("1+2")
yields the token sequence
1 + 2
This could then be re-evaluated as an expression using the .eval function (described in Chapter 7); the final result would be the value 3.
An immediate expression is one that can be evaluated as soon as YAA encounters it in source code (on YAA's "first pass"). In general, this means that the operands in an immediate expression can only be constants, YAA variables (as described in Chapter 8), and symbols whose value has been previously determined. An immediate expression cannot refer to symbols whose value cannot yet be determined (e.g. symbols defined later in the program or outside the program).
An immediate expression can have any type (integer, string, location, etc.). Some opcodes and pseudo-ops require that their arguments be immediate expressions. In addition, some of the expressions described in this chapter require that sub-expressions be immediate.
Some expressions can be evaluated immediately, simply because of their forms. For example,
A * 0
is always zero, regardless of the value of A. Similarly,
A - A
is always zero. YAA doesn't even bother to determine the value of A in cases like these, since the value of A is not relevant to the result. In fact, the expressions will be evaluated properly even if A is never defined.
When an expression consists of several sub-expressions, the operations in the expression are evaluated according to a fixed order of precedence. For example, multiplication operations take place before addition operations (as in conventional arithmetic). The standard order of evaluation may be changed using parentheses in the usual way.
Some operations share the same precedence (e.g. addition and subtraction). The set of all operations with a given precedence forms a precedence class. An operator X is said to have a higher precedence than an operator Y if operation X is performed before Y.
Each precedence class has its own binding. The binding tells the order in which operations of the class are performed. A class may bind right-to-left or left-to-right. For example, the addition/subtraction class binds left to right, which means that in the expression
A - B + C
the left operation (subtraction) is performed before the right operation (addition).
In the sections to come, we will describe all the operations of YAA in order of precedence, from highest to lowest. Operations of equal precedence will be described as subsections of a common section.
Primary expressions have the highest precedence of evaluation. They are evaluated from left to right. Primary expressions have one of the following forms.
identifier
integer_constant
floating_point_constant
string_constant
( expression )
[ token_sequence ]
expression [ int_expression ]
function ( expression )
The value of an identifier in an expression depends on the definition of the identifier, as described in later chapters. The value of an integer or floating point constant is just the constant's numeric value. The value of a string constant depends on the operation in which the string appears, as described in later sections. The value of a parenthesized expression is the value of the expression inside the parentheses. The values of the other expressions listed above are described in the subsections that follow.
A sequence of tokens enclosed in square brackets is a token sequence expression whose value is the enclosed sequence of tokens. For example, the value of
[ tok1, tok2 ]
is the sequence of three tokens
tok1 , tok2
Notice that the comma is a separate token; it is not a delimiter separating the other tokens.
The token sequence operator may not contain a new-line character or a semicolon as one of the tokens in the sequence.
In the rest of this manual, we will usually write token sequence values by enclosing the token sequence in square brackets.
The first type of subscripting operation has the form
string_expression [ int_expression ]
where string_expression is an expression yielding a string result and int_expression is an expression yielding an integer result. To evaluate this expression, YAA first evaluates the integer expression to get an integer. (This integer is called the subscript.) YAA then obtains the corresponding character from the string. The character at the beginning of a string has a subscript of 0, the next has a subscript of 1, and so on.
The result of subscripting a string is the character obtained from the string. This character is expressed as an integer value, and therefore subscripting a string is an integer expression. As an example, the result of
"abc"[0]
is the integer value 'a'. The result of
"abc"[2]
is the integer value 'c'.
Note that any string expression may precede the square brackets. For example,
( .concat("abc","def") )[4]
has the value 'e', since the result of the .concat operator is the concatenated string "abcdef".
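The string subscripting rules above can be mimicked in Python as a rough sketch. The helper `subscript_string` is hypothetical (it is not a YAA function), and the sketch assumes ASCII character codes; Python string concatenation stands in for the .concat operator.

```python
def subscript_string(s: str, i: int) -> int:
    """Return the character at zero-based position i as an integer code."""
    return ord(s[i])


# The three examples from the text.
first = subscript_string("abc", 0)            # models "abc"[0]
last = subscript_string("abc", 2)             # models "abc"[2]
joined = subscript_string("abc" + "def", 4)   # models (.concat("abc","def"))[4]
```

In this model the result is always an integer, which matches the statement that subscripting a string is an integer expression.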
The second form of the subscripting operation is
token_sequence_expr [ int_expression ]
The expression before the square brackets is one that yields a token sequence. This token sequence should consist of a list of Bracket-balanced subsequences, separated by commas. YAA splits this into Bracket-balanced token sequences at the commas. For example, the token sequence
[a(b,c),d[(e)],f]
would be split into the subsequences
a(b,c)
d[(e)]
f
Notice that commas inside Brackets do not count as list separators.
The result of the subscripting operation is the Ith subsequence from the token sequence, where I is the value of the subscript. The subsequence at the beginning of the sequence has a subscript of 0, the next subsequence has a subscript of 1, and so on. As an example, the value of the expression
[lda y,ldq x][0]
is the token sequence
[lda y]
The value of
[ldx0,ldx1,ldx2,ldx3][2]
is ldx2. As a more complicated example,
[ [a,b],[c,d] ] [1]
gives the token sequence we could write as
[ [ c , d ] ]
This sequence consists of five tokens:
[ c , d ]
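The splitting rule for token sequence subscripts can be sketched in Python as follows. This is only an illustration, not YAA's implementation: `split_subsequences` is a hypothetical helper, tokens are represented as a list of strings, and only parentheses and square brackets are treated as Brackets.

```python
OPENERS = {"(", "["}
CLOSERS = {")", "]"}


def split_subsequences(tokens):
    """Split a token list at commas that are not nested inside Brackets."""
    parts, current, depth = [], [], 0
    for tok in tokens:
        if tok in OPENERS:
            depth += 1
        elif tok in CLOSERS:
            depth -= 1
        if tok == "," and depth == 0:
            # A top-level comma ends the current subsequence.
            parts.append(current)
            current = []
        else:
            current.append(tok)
    parts.append(current)
    return parts


# [a(b,c),d[(e)],f] written out token by token
first_example = ["a", "(", "b", ",", "c", ")", ",",
                 "d", "[", "(", "e", ")", "]", ",", "f"]
pieces = split_subsequences(first_example)

# [ [a,b],[c,d] ] [1] written out token by token
nested = ["[", "a", ",", "b", "]", ",", "[", "c", ",", "d", "]"]
second = split_subsequences(nested)[1]
```

Tracking a nesting depth is what keeps the commas inside a(b,c) and [c,d] from acting as list separators.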
An expression of the form
function_name ( expression, expression, ...)
is a function call. The expressions inside the parentheses are the arguments of the function. Arguments are separated by commas and must be Bracket-balanced. The functions recognized by YAA are described in Chapter 7.
Unary operators are evaluated after primary expressions. They are evaluated from right to left. Recognized unary operators are
+ numeric_expression
- numeric_expression
! int_expression
~ int_expression
++ int_variable
-- int_variable
int_variable ++
int_variable --
The unary plus (+) and minus (-) operators perform the usual operations on integer, floating point, and location expressions. For example, the result of -i is the value with the same magnitude as i but the opposite sign.
The logical negation operator '!' may be applied to any integer expression. The result of !I is 0 if I is non-zero, and 1 if I is zero.
The bitwise complement operator '~' (tilde) may be applied to any integer expression. The result of ~I is an integer that has a 1-bit wherever I has a 0-bit, and a 0-bit wherever I has a 1-bit. For example, ~0777000777000 is 0000777000777.
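As a quick check of the complement example, here is a small Python sketch. It assumes a 36-bit integer width (taken from the "modulo 2**36" statements elsewhere in this chapter); `complement36` is a hypothetical name used only for illustration.

```python
WORD_BITS = 36                    # assumed integer width
MASK = (1 << WORD_BITS) - 1       # 36 one-bits


def complement36(value: int) -> int:
    """Flip every bit of a 36-bit value."""
    return ~value & MASK


result = complement36(0o777000777000)
```

The mask is needed because Python integers are unbounded, so a bare ~ would yield a negative number rather than a 36-bit pattern.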
++ is called the auto-increment operator. It may only be applied to YAA variables (described in Chapter 8).
To evaluate the expression
++int_variable
YAA first adds 1 to the current value of the given integer variable. The result of the expression is the resulting value of the variable.
The result of the expression
int_variable++
is the current value of the given integer variable. After this value is obtained, YAA adds one to the variable. In other words, when the ++ appears before its argument, the variable is incremented before the result of the expression is obtained; when the ++ appears after its argument, the variable is incremented after the result of the expression is obtained.
For example, suppose the variable X currently has a value of 3. The result of the expression
++X
is 4 and after the expression is evaluated, X will also have a value of 4. If Y currently has a value of 10, the result of
Y++
is 10, but the value of Y after the expression has been evaluated will be 11.
The auto-decrement operator (--) works like the auto-increment operator, except that a value of 1 is subtracted from the argument instead of added.
--int_variable
subtracts 1 from the given variable and returns the resulting value as its result.
int_variable--
obtains the current value of the given variable as the result of the expression, then subtracts one from the variable's value.
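The difference between the prefix and postfix forms can be modelled in Python, which has no such operators of its own. The class `IntVariable` and its method names are hypothetical stand-ins for a YAA integer variable, introduced only for this sketch.

```python
class IntVariable:
    """Hypothetical stand-in for a YAA integer variable."""

    def __init__(self, value: int) -> None:
        self.value = value

    def pre_increment(self) -> int:
        # ++var: change the variable first, then yield the new value.
        self.value += 1
        return self.value

    def post_increment(self) -> int:
        # var++: yield the old value, then change the variable.
        old = self.value
        self.value += 1
        return old

    def pre_decrement(self) -> int:
        # --var: subtract first, then yield the new value.
        self.value -= 1
        return self.value

    def post_decrement(self) -> int:
        # var--: yield the old value, then subtract.
        old = self.value
        self.value -= 1
        return old


x = IntVariable(3)
result_x = x.pre_increment()    # models ++X
y = IntVariable(10)
result_y = y.post_increment()   # models Y++
```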
The multiplicative operators follow the unary operators in order of precedence. They are evaluated from left to right. The multiplicative operators are
expression * expression
expression / expression
int_expression % int_expression
The * operator represents normal multiplication. For example, the result of 5*2 is 10. The arguments must be integer expressions. YAA does not take note of arithmetic overflow -- for example, if the true result of an integer multiplication cannot be represented in the long integer format, the actual result returned will be reduced modulo 2**36.
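The overflow behaviour can be illustrated with a short Python sketch. It models only the "reduced modulo 2**36" rule and returns the non-negative residue; how YAA presents that bit pattern (signed or unsigned) is not stated here, so that part is an assumption. `multiply_wrapped` is a hypothetical name.

```python
WORD_MODULUS = 2 ** 36   # integers are reduced modulo 2**36


def multiply_wrapped(a: int, b: int) -> int:
    """Multiply two integers and keep only the low 36 bits of the product."""
    return (a * b) % WORD_MODULUS


small = multiply_wrapped(5, 2)           # fits easily, no reduction needed
wrapped = multiply_wrapped(2 ** 35, 2)   # true product is 2**36, which wraps
```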
YAA allows the special construction of multiplying an integer times a location expression, as in
4*A
This is equivalent to adding the location expression the given number of times. For example, the above is equivalent to
A + A + A + A
Multiplying by a negative integer is equivalent to subtracting the location the given number of times. For example, the following are equivalent.
-3*A
- A - A - A
In expressions of this form, the integer must be in the range from -10 to 10 (inclusive).
The / operator represents normal division. If either operand is floating point, both operands will be converted to floating point and floating point division will be used. For example, the result of 3.0/2 is 1.5. If both operands are integers, integer division will be used. For example, the result of 14/7 is 2. If division of integers is not exact, the result is truncated towards zero. For example, the result of 5/3 is 1, while the result of -5/3 is -1.
The % operator represents the integer remainder operation, also known as the "modulo" operation. The arguments must be integer expressions. If A and B are positive, A%B is the remainder obtained when A is divided by B (A modulo B). More generally, A%B is defined so that
((A/B) * B) + (A%B)
is equal to A.
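Python's own // and % operators round toward negative infinity, so they do not match the rules above for negative operands. The sketch below implements truncation toward zero and derives the remainder from the identity just given; `trunc_div` and `trunc_rem` are hypothetical helper names.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division with the quotient truncated toward zero."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def trunc_rem(a: int, b: int) -> int:
    """Remainder chosen so that ((a/b) * b) + (a%b) equals a."""
    return a - trunc_div(a, b) * b


exact = trunc_div(14, 7)
positive = trunc_div(5, 3)
negative = trunc_div(-5, 3)
negative_rem = trunc_rem(-5, 3)
```

Under this definition the remainder takes the sign of the left operand, which is a direct consequence of truncating the quotient toward zero.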
The additive operators follow the multiplicative operators in order of precedence. They are evaluated from left to right. The additive operators are
expression + expression
expression - expression
The binary + operator is used to indicate a variety of operations, depending on the types of its two arguments.
When both arguments are integer expressions, the result is the integer sum of the two arguments. If the addition results in overflow (greater than the largest representable integer or smaller than the most negative such integer), the result is reduced modulo 2**36.
If either of the arguments is a floating point value, both values will be converted to floating point and the result will be a floating point value.
Integer expressions may be added to location expressions. The integer will be taken to represent an offset in the same units as the offset mode of the location expression. For example, if you add 3 to a word offset, the 3 is taken to mean three words.
The binary - operator denotes subtraction. Both arguments may be integer expressions, in which case the result is the arithmetic difference of the two expressions. If the operation overflows, the result of the operation will be the true result of the subtraction, reduced modulo 2**36.
If either of the arguments is a floating point value, both values will be converted to floating point and the result will be a floating point value.
Subtraction operations may also have one simple integer argument and one location argument. The rules for this operation are similar to the rules for adding a simple integer and a location.
Finally, two location values representing offsets may be subtracted from one another.
The shift operators follow the additive operators in order of precedence. They are evaluated from left to right. The shift operations are
int_expression << int_expression
int_expression >> int_expression
The arguments of the << operator must be integer expressions. The result of A<<B is the value of A with its bits shifted B positions to the left. If the right argument is negative or greater than the number of bits in an integer value, the result of the operation is undefined.
5.8.2 Right Bit Shifts
The arguments of the >> operator must be integer expressions. The result of A>>B is the value of A with its bits shifted B positions to the right. Vacated bits are filled with zeros (i.e. the shift is always performed logically). For example, 070>>3 is 007.
If the right argument is negative or greater than the number of bits in an integer value, the result of the operation is undefined.
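The logical right shift described above can be sketched in Python. The 36-bit width is an assumption carried over from the "modulo 2**36" statements, and `shift_right` is a hypothetical helper that raises an error for the shift counts the text calls undefined.

```python
WORD_BITS = 36                    # assumed integer width
MASK = (1 << WORD_BITS) - 1


def shift_right(a: int, b: int) -> int:
    """Logical right shift of a 36-bit value; vacated bits become zeros."""
    if b < 0 or b > WORD_BITS:
        raise ValueError("shift count outside the defined range")
    return (a & MASK) >> b


octal_example = shift_right(0o070, 3)
top_bit = shift_right(0o400000000000, 35)   # the high bit is not replicated
```

Masking the left operand first treats it as an unsigned bit pattern, so no sign bit is copied into the vacated positions.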
The relational operators follow the shift operators in order of precedence. They are evaluated from left to right, but this is seldom useful. The relational operations are
expression < expression
expression > expression
expression <= expression
expression >= expression
< stands for "less than". > stands for "greater than". <= stands for "less than or equal". >= stands for "greater than or equal".
The result of every relational operation is an integer value: 1 if the relation is true and 0 if it is false. For example, the result of A>B is 1 if A is greater than B and 0 otherwise.
The arguments of any relational operator must be numbers (floating point or integer) or else have the same type. Numbers are compared with other numbers in the usual way. Strings are compared to other strings using the ASCII collating sequence; for example, the string "abc" is less than the string "abd".
Relocatable values may be compared to relocatable values if they are in the same section, in which case one value is greater than another if it has a greater integer offset from the beginning of the section. For example, this lets you compare two statement labels.
YAA does not let you compare token sequences.
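Note that expressions like A<B<C are evaluated from left to right, as (A<B)<C, so the 0 or 1 result of the first comparison is what gets compared with C.
5.10 Equality Operators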
The equality operators follow the relational operators in order of precedence. They are evaluated from left to right. The equality operations are
expression == expression expression != expression
== stands for "is equal to". != stands for "is not equal to". Like the relational operators, the equality operators return the integer 1 if the relation is true and 0 if it is false.
Any two expressions of the same type can be compared for equality or inequality. In addition, you may compare any number to any other number (floating point or integer). You may also compare relocatable expressions to integers, but they are always considered to be unequal.
If you compare two SYMREFs with different names, they will always be considered to be unequal, even if the linking process eventually puts the two SYMREFs at the same memory location. The same is true for two locations expressed as offsets from the beginning of different sections.
The binary & operator is used to "AND" together bits in integers. The result of
int_expression & int_expression
is an integer value that has a 1-bit wherever both arguments have a 1-bit, and that has a 0-bit everywhere else. For example, the result of 0101&0011 is 0001.
The binary ^ (caret or circumflex) operator is used to obtain the exclusive "OR" of the bits in two integers. The result of
int_expression ^ int_expression
is an integer value that has a 0-bit wherever both arguments have 1-bits or 0-bits, and that has a 1-bit everywhere else. For example, the result of 0707^0077 is 0770.
The binary | (or-bar) operator is used to obtain the inclusive "OR" of the bits in two integers. The result of
int_expression | int_expression
is an integer value that has a 0-bit wherever both arguments have 0-bits, and that has a 1-bit everywhere else. For example, the result of 0101|0011 is 0111.
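The three bitwise examples above can be verified in Python, whose bitwise AND, exclusive OR, and inclusive OR behave the same way on non-negative integers. The constants are written with Python's 0o prefix for octal; the variable names are only illustrative.

```python
# Bitwise AND: a 1-bit only where both operands have a 1-bit.
and_result = 0o101 & 0o011

# Exclusive OR: a 1-bit only where the operands differ.
xor_result = 0o707 ^ 0o077

# Inclusive OR: a 0-bit only where both operands have a 0-bit.
or_result = 0o101 | 0o011
```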
The binary && operator is used to obtain the logical "AND" of two integers.
int_expression && int_expression
has the value 1 if both arguments are non-zero, and the value 0 otherwise. The first operand (before the &&) must be an immediate expression.
In evaluating the arguments of a logical AND expression, the first argument is always evaluated before the second. If the first argument proves to be zero, YAA knows that the result of the entire && expression will be zero, so the second argument is not evaluated.
The binary || operator is used to obtain the logical "OR" of two integers.
int_expression || int_expression
has the value 0 if both arguments are 0, and 1 otherwise. The first operand (before the ||) must be an immediate expression.
In evaluating the arguments of a logical OR expression, the first argument is always evaluated before the second. If the first argument proves to be non-zero, YAA knows that the result of the entire || expression will be 1, so the second argument is not evaluated.
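The evaluation order of the logical AND and logical OR operations can be modelled in Python by passing the second operand as a function that is called only when needed. The helpers `logical_and`, `logical_or`, and `noisy` are hypothetical and exist only to make the skipped evaluation visible.

```python
def logical_and(first: int, second) -> int:
    """Logical AND; `second` is a zero-argument function evaluated lazily."""
    if first == 0:
        return 0                      # result is known, skip the second operand
    return 1 if second() != 0 else 0


def logical_or(first: int, second) -> int:
    """Logical OR; `second` is a zero-argument function evaluated lazily."""
    if first != 0:
        return 1                      # result is known, skip the second operand
    return 1 if second() != 0 else 0


calls = []                            # records which second operands were evaluated


def noisy(value: int):
    """Wrap a value so that evaluating it is recorded in `calls`."""
    def thunk() -> int:
        calls.append(value)
        return value
    return thunk


and_skipped = logical_and(0, noisy(5))
or_skipped = logical_or(7, noisy(0))
and_full = logical_and(3, noisy(4))
or_full = logical_or(0, noisy(0))
```

After these four calls, `calls` holds only the operands from the last two expressions, since the first two were decided by their first operand alone.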
Conditional operations have the form
int_expression ? expression : expression
The last two expressions must have the same type, but any type is allowed. The first expression (before the ?) must be an immediate expression.
To evaluate this expression, YAA first finds the value of the integer expression before the ?. If this value is non-zero, the result of the entire conditional expression will be the value of the expression before the colon (:). If the value is zero, the result of the entire conditional expression will be the value of the expression after the colon. For example, the value of
(A > B) ? A : B
is the value of A if A is greater than B; otherwise, it is the value of B. In other words, the value of the above expression is the maximum of A and B.
Assignment operators are evaluated after all other operations. They may only be applied to YAA variables, as described in Chapter 8. The assignment operations are
variable = expression
variable += expression
variable -= expression
variable *= expression
variable /= expression
variable %= expression
variable >>= expression
variable <<= expression
variable &= expression
variable ^= expression
variable |= expression
The = operator represents simple assignment. The value of the expression on the right is assigned to the variable on the left. The right hand expression may have any type. The left operand must be a single YAA variable.
The other assignment operators are called compound assignment operators because they combine assignment with another operation. For example, += combines assignment with addition.
A += B
is precisely equivalent to
A = A + B
(C programmers should note that this is slightly different from C: in C, the variable A would only be evaluated once in A+=B, while in YAA, it is evaluated twice.)
The types of arguments in a compound assignment must be appropriate to the actions being performed. For example, in
A >>= B
A must be an integer variable and B must be an integer expression.
The arguments of a compound assignment must have types that are compatible with the desired operation. For example, in
A += B
A and B must have types that can be added together.
Assignment operations are expressions, and as such they have values. The value of an assignment expression is the value that is assigned to the left hand variable. Thus the value of the expression
X = "xyz"
is the string "xyz", while the value of
A += 2
is the value of A+2.
Assignment expressions bind from right to left. This means that expressions like
X = Y = Z = 0;
are valid. First, 0 is assigned to Z. The result of this assignment (the value 0) is then assigned to Y, and so on.
The machine instructions available on the DPS-8 family of machines are described in various Bull HN hardware manuals (e.g. DPS90 Assembly Instructions, Bull HN document DX20). These hardware manuals describe the nature of the machine's instructions and how they are used.
We will not attempt to duplicate this information in this manual. However, understanding the instruction set is not the same as understanding how to code those instructions in a YAA program. Thus this section provides information on how to code machine instructions in YAA.
(Note: The Bull HN hardware manuals occasionally show instruction examples written using the Bull HN GMAP assembler. Appendix A discusses ways in which GMAP differs from YAA.)
An instruction consists of one or more machine words, possibly followed by additional words giving arguments for the instruction. We will call the first word of an instruction the instruction word, since this dictates what kind of instruction we are dealing with. In most cases, the instruction word is the entire instruction; the exceptions are vector instructions and so-called register-register instructions. See the hardware manuals for more details.
The instruction word has the following format:
Note that this description is slightly different from the description in the hardware manuals. The hardware manuals say that the opcode field is contained in bits 18-26, with bit 27 used to distinguish EIS (Extended Instruction Set) instructions from other instructions. In our opinion, it is more natural to regard bit 27 as another bit in the opcode field rather than a separate flag bit.
YAA recognizes the standard DPS-8 opcodes. All opcodes are reserved words when they appear in the opcode field. They must be entered in lower case.
Most machine instructions involve hardware registers. Instructions refer to registers by mnemonics; these must be in lower case (for example, p1 must be used for pointer register 1).
The hardware manuals describe the use and characteristics of all the hardware registers. We will not duplicate that information here. However, below we list the symbolic names used to refer to various registers in machine instructions. This is not the complete list of registers; it only discusses those registers which may be named in instructions.
The above symbol names may be used as arguments to machine instructions that require register operands. For example, the following piece of code comes from the standard GCOS-8 function call sequence.
ldp p1,func
ldx x0,STBUMP,du
eppr p0,*+3,$
tra .call+S_BIAS,,p3
This shows the use of the symbols p1 (pointer register 1), x0 (index register 0), p0 (pointer register 0), and p3 (pointer register 3).
Register names are only reserved in positions where a register value is allowed or expected.
Note: Examples in the Bull HN hardware manuals are written using the GMAP assembler rather than YAA. GMAP lets you use numbers to indicate registers instead of requiring symbolic names. For example,
lda 3,1,1 # GMAP
could be written in place of
lda 3,x1,ar1 # YAA
In YAA, you must use the full symbolic name. For this reason, some of the examples given in the hardware manual will not work with YAA. See Appendix A for more information on differences between GMAP and YAA.
Some instructions incorporate a register number directly in their opcode. For example, ldp1 loads a pointer value into register p1. If you prefer, you can write the register as the first argument rather than as part of the opcode itself. For example, the following instructions are equivalent
ldp1 func
ldp p1,func
The second format can be more useful if you are writing macros or using synonyms (described in Chapter 8). For example, you might create a synonym with
p_entr: .synonym p1
and then write
ldp p_entr,func
This would be more difficult if you used the form of the opcode that incorporates the register number right in the opcode.
This trick can be used with any opcode that incorporates a register number into the opcode. For example, the following instructions are equivalent
ldx1 y
ldx x1,y
Note that you have to give the full name of the register; you can't say
ldx 1,y # Incorrect!
Also note that the trick of splitting off the register from the opcode only works with register numbers, not register names. For example, you cannot use
ldx a,value # Incorrect!
instead of
lda value
Addresses in non-EIS machine instructions may be specified in a number of different ways. The tag field in the machine instruction indicates which approach a particular instruction is using.
There are four types of addressing in non-EIS instructions. These are:
Register (R)
Register then indirect (RI)
Indirect then register (IR)
Indirect then tally (IT)
The type of addressing is indicated by the first two bits of the tag field. The other four bits provide further information, as described in the Bull HN hardware manuals.
The sections that follow are not intended to duplicate the addressing information in the hardware manuals, but to provide enough information to write up YAA instructions once you understand the hardware addressing modes.
In R addressing, the address of an operand is calculated using the address field and the contents of a register. The registers which can be used are the index registers (x0-x7), half-words in the aq (au, al, qu, ql), or the instruction counter ic. The contents of the address field are added to the contents of the register, giving the address of the instruction's operand.
For example,
lda 2,x1
generates its operand address by adding 2 and the contents of the x1 register. If x1 holds the address of the beginning of an array iarray, the above instruction loads the A register with the value of iarray[2]. If the value of the constant is zero, it can be omitted as in
lda ,ic
This takes the current value of the ic as the operand address, and loads the value at that address into the A register. The result is that it loads the lda instruction itself into the A register.
If you omit a register, the address field is assumed to contain the operand address itself. For example,
lda name
assumes that name gives the address of the actual operand. The system will find the value in that address and load it into the A register. This type of instruction can also be written by adding a tag of n after the address, as in
lda name,n
This form of the instruction is equivalent to the previous one.
R addressing also allows immediate constants. In this case, the address field is treated as the actual value of the operand rather than a value used in calculating the operand's address. A tag of dl after the constant in the YAA instruction indicates that the constant represents the lower half of a word. For example,
lda 2,dl
loads the constant 2 into the A register. A tag of du after the constant indicates that the constant represents the upper half of a word. For example,
lda 2,du
loads the constant 2 into the upper half of the A register (and puts zeroes in the other half).
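The effect of the dl and du tags on an immediate constant can be sketched in Python. This models only what the text states for small non-negative constants: an 18-bit value placed in the lower or upper half of a 36-bit word, with zeros in the other half. The function `immediate_operand` is hypothetical.

```python
HALF_MASK = 0o777777   # 18 bits, one half of a 36-bit word


def immediate_operand(constant: int, tag: str) -> int:
    """Return the 36-bit operand value produced by a dl or du immediate."""
    half = constant & HALF_MASK
    if tag == "dl":
        return half            # constant occupies the lower half
    if tag == "du":
        return half << 18      # constant occupies the upper half
    raise ValueError("only the dl and du tags are modelled here")


lower = immediate_operand(2, "dl")   # models lda 2,dl
upper = immediate_operand(2, "du")   # models lda 2,du
```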
Indirect addressing specifies the address of an operand in a two-stage process. First, you construct a memory address; then you use the contents of that address to find the address of the true operand (directly or indirectly).
The Register Then Indirect (RI) method of addressing uses a register (R) address format to calculate the first memory address. This is said to be the address of the indirect word.
The contents of the indirect word have the same format as a machine instruction: an address field in the top 18 bits and a tag field in the bottom 6 bits. If the tag field specifies R addressing, the address of the true operand is calculated from the indirect word using normal R addressing. If the tag field is some other type of addressing, a new indirect word is calculated from the old one, as if the old indirect word were part of a machine instruction. This process is repeated until an indirect word with R addressing is found.
To write an instruction that uses RI addressing, put an asterisk immediately after the register operand in an R addressing construct. For example,
lda 3,x5*
obtains the indirect word address by taking the contents of x5 and adding 3. The instruction will use the indirect word to find the address of the true operand.
The du and dl forms of register addressing are not valid with an RI tag. However, they may be used in the R address that ends the indirect word chain.
In a previous section, we mentioned a form of R addressing where no register was actually used. For example,
lda Z
takes the value of Z as the operand address. The corresponding RI instruction is written
lda Z,*
or
lda Z,n*
This will follow around a chain of indirect words beginning with the address given by Z.
The arg pseudo-op is useful for constructing indirect words. arg is described in a later section of this chapter.
Indirect then Register (IR) is similar to RI addressing, but there are a number of important differences. IR addressing is indicated by an asterisk * preceding the register involved in the addressing. For further information on IR addressing, see the hardware manuals.
As an example of an instruction which uses IR addressing, consider
lda Y,*x1
The value of x1 is cached away, and Y is used as the address of the indirect word. If this word has the format
arg N,x6
the final operand address is N plus the cached away contents of x1. In an IR addressing chain, the hardware ignores any register specified with the R instruction that ends the chain. The address field of the R instruction is added to the cached register value.
As another example, consider
lda S,*du
S: arg T,x3*
The first instruction uses S as the address of the indirect word. The word at S is in RI format. The new indirect word will be calculated as T plus the contents of x3. If the new indirect word is
arg U,ql
then the final operand address is effectively
arg U,du
since the du from the original IR instruction was cached away for use at this time.
If an IR instruction has no register for modification, you must specify *n, as in
lda name,*n
This starts an IR indirect word chain, but no register value is cached. You cannot omit the n, since
lda name,*
is interpreted as
lda name,n*
(RI addressing).
The Indirect then Tally (IT) addressing modification combines indirection with an automatic increment or decrement of fields in the indirect word. The indirect word is broken into the following fields:
Bits 0-17: Address field.
Bits 18-29: Tally field.
Bits 30-35: Tag field.
Indirect words for IT instructions are usually created using one of the opcodes tally, tallyb, tallyc, or tallyd. These simplify the creation of the three fields of a tallying indirect word. They will be discussed in a later section of this chapter.
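The three-field layout of a tallying indirect word can be illustrated with a Python sketch that packs and unpacks the fields. Bit 0 is taken to be the most significant bit of the 36-bit word, as the field positions imply; `pack_tally_word` and `unpack_tally_word` are hypothetical helpers, not YAA opcodes.

```python
def pack_tally_word(address: int, tally: int, tag: int) -> int:
    """Build a 36-bit word from an 18-bit address, 12-bit tally, 6-bit tag."""
    return ((address & 0o777777) << 18) | ((tally & 0o7777) << 6) | (tag & 0o77)


def unpack_tally_word(word: int):
    """Split a 36-bit word into its (address, tally, tag) fields."""
    address = (word >> 18) & 0o777777   # bits 0-17
    tally = (word >> 6) & 0o7777        # bits 18-29
    tag = word & 0o77                   # bits 30-35
    return address, tally, tag


word = pack_tally_word(0o1234, 10, 0)
fields = unpack_tally_word(word)
```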
There are many ways in which tallying instructions and indirect words can be set up. These are described in the appropriate hardware manual and will not be repeated here.
Instructions using IT addressing are created by special symbols in the tag field of the instructions. The following symbols are recognized:
All of these tag symbols must be in lower case. The meanings of all these tags are explained in the appropriate hardware manual.
Below are a few instructions using IT addressing. We will not explain them here, but simply provide them as examples of the format.
lda Z,id
lda A,sc
sta Y,idc
The asterisk stands for the current instruction counter. For example, consider
ldq X,dl # load X into Q
cmpq min,dl # is X less than min?
tpl *+2 # if so...
ldq min,dl # ...use min
# and on we go
The operand of the tpl instruction is *+2. Since the instruction counter is currently pointing to the tpl instruction itself, this operand stands for the second word after the tpl instruction. In other words, the tpl instruction will jump around the ldq instruction that immediately follows it if X is not less than min.
A pseudo-op is a construct which can be used in place of a normal machine instruction. A macro is a construct that expands into one or more YAA instructions (either machine instructions or pseudo-ops).
The pseudo-ops and macros of YAA divide into two classes: ones that perform DPS-8 specific operations, and ones that perform machine-independent operations. The rest of this chapter describes DPS-8 specific pseudo-ops and macros. Chapter 8 describes the machine-independent pseudo-ops.
The adsc pseudo-ops create arguments which describe alphanumeric strings for EIS instructions. Note that these arguments are not the strings themselves; the strings are located elsewhere in memory and the arguments tell where the strings are.
adsc4 creates arguments describing strings with characters four bits long; adsc6 creates arguments describing strings with characters six bits long; and adsc9 creates arguments describing strings with characters nine bits long.
The format of an adsc pseudo-op is one of
adsc4 address,charnum,length,ar_reg
adsc6 address,charnum,length,ar_reg
adsc9 address,charnum,length,ar_reg
where
The ar_reg field of a descriptor may be omitted. Other fields may also be omitted, but you must put in extra commas to indicate fields that are missing. Trailing commas may be omitted. For example,
adsc9 X,,24
demonstrates that you must add commas for interior missing fields, but not for one on the end.
For an example of how adsc pseudo-ops are used, see the section on EIS instructions later in this chapter.
The arg pseudo-op is designed for constructing indirect words for IR and RI addressing. It creates a machine word with a given address field and tag field. arg has the format
arg address,tag,ar_reg
where:
For example,
arg Z,x1
creates a machine word with an address field whose value is Z and a tag field with R addressing using x1. Suppose you have the instruction
lda 0,x2*
This calculates the indirect word as the address contained in x2. At this address, you could have
arg Y,x3*
which is also in RI format. This calculates a new indirect word by adding Y and the value in x3. At this new indirect word, you could have
arg T,qu
which calculates yet another address by adding T and the upper half of the Q register. Since this indirect word has R addressing, the resulting value is the address of the true operand. Thus the original lda instruction loads the A register from the location whose address is T plus the contents of qu.
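The chain of indirect words just described can be traced with a small Python model. This is a simplified sketch of RI addressing only: memory is a dictionary mapping an address to an (address field, register, indirect flag) triple, registers live in a dictionary, and addresses wrap at 18 bits. The function `resolve_ri` and all the numeric values are hypothetical.

```python
ADDRESS_MASK = 0o777777   # addresses are 18-bit quantities


def resolve_ri(address, register, indirect, regs, memory):
    """Follow RI indirect words until a word with plain R addressing ends the chain."""
    def modified(addr, reg):
        # R addressing: address field plus register contents (if a register is named).
        return (addr + (regs[reg] if reg else 0)) & ADDRESS_MASK

    effective = modified(address, register)
    while indirect:
        # The computed address holds an indirect word; decode it and repeat.
        address, register, indirect = memory[effective]
        effective = modified(address, register)
    return effective


Y, T = 0o2000, 0o3000                      # made-up values for the symbols Y and T
regs = {"x2": 0o1000, "x3": 5, "qu": 7}    # made-up register contents
memory = {
    0o1000: (Y, "x3", True),               # arg Y,x3*
    Y + 5: (T, "qu", False),               # arg T,qu
}

operand_address = resolve_ri(0, "x2", True, regs, memory)   # models lda 0,x2*
```

With these values the operand address comes out as T plus the contents of qu, matching the conclusion of the example.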
As another example,
arg 5,,ar3
creates an argument word with 3 in bits 0-2, 5 in bits 3-17, and a tag field indicating no address modification.
bdsc is similar to the adsc pseudo-ops, in that it creates an argument for an EIS instruction. adsc describes an alphanumeric string argument, while bdsc describes a bit string. The format of bdsc is
bdsc address,length,byte,bit,ar_reg
where
The ar_reg field of a descriptor may be omitted. Other fields may also be omitted, but you must put in extra commas to indicate fields that are missing. Trailing commas may be omitted. For example,
bdsc X,,3
demonstrates that you must add commas for interior missing fields, but not for one on the end.
More information on using EIS instructions is given in a later section of this chapter.
Packed decimal data can be created with the edec4 pseudo-op. It has the format
edec4 int_expression,string_exp
The int_expression gives the number of packed decimal digits in the number. The string_exp gives the value of the number, in string form. For example,
PX: edec4 10,"-8.3e-9"
creates a packed decimal number with ten digits and the value -8.3e-9. The string_exp can give the value in any of the following formats.
There is no provision for left-justification of the values.
The edec9 pseudo-op creates ASCII decimal data. It has the form
edec9 int_expression,string_exp
and behaves like edec4. The instruction
edec9 10,"1234"
creates the ASCII string
"0000001234\0\0"
when generating code in word offset mode (see Chapter 8 for more on offset modes). This is ten ASCII characters, plus two bytes of 0-bits to pad out to the next word boundary. Note that the zeros before the 1234 are ASCII zero characters, not binary zeros.
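The padding rule in this example can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the assembler's algorithm: the name `edec9_model` is hypothetical, only plain unsigned digit strings are handled, and a word is assumed to hold four 9-bit bytes (which is what the two padding bytes in the example imply).

```python
BYTES_PER_WORD = 4  # assumption: four 9-bit bytes per 36-bit word


def edec9_model(digits, value):
    """Return the characters laid down for an unsigned decimal string.

    The value is right-justified in a field of `digits` ASCII characters,
    filled on the left with ASCII '0', then padded with zero bytes up to
    the next word boundary (word offset mode).
    """
    if not value.isdigit():
        raise ValueError("this sketch only handles unsigned digit strings")
    if len(value) > digits:
        raise ValueError("value does not fit in the requested number of digits")
    text = value.rjust(digits, "0")
    padding = -len(text) % BYTES_PER_WORD
    return text + "\0" * padding


print(repr(edec9_model(10, "1234")))
```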
The iontp pseudo-op is used to create data control words for hardware I/O processing. iontp stands for "I/O NonTransmit and Proceed". The pseudo-op has the form
iontp X,Y
where X is an address that can be represented by an 18-bit quantity, and Y is a word count of data to be transferred per block (in the range 1 through 4096). The result of iontp is a machine word containing the octal value
XXXXXX03YYYY
where XXXXXX is the value of X and YYYY is the value of Y.
The iotd pseudo-op is used to create data control words for hardware I/O processing. iotd stands for "I/O Transmit and Disconnect". The pseudo-op has the form
iotd X,Y
where X is an address that can be represented by an 18-bit quantity, and Y is a word count of data to be transferred per block (in the range 1 through 4096). The result of iotd is a machine word containing the octal value
XXXXXX00YYYY
where XXXXXX is the value of X and YYYY is the value of Y.
The iotp pseudo-op is used to create data control words for hardware I/O processing. iotp stands for "I/O Transmit and Proceed". The pseudo-op has the form
iotp X,Y
where X is an address that can be represented by an 18-bit quantity, and Y is a word count of data to be transferred per block (in the range 1 through 4096). The result of iotp is a machine word containing the octal value
XXXXXX01YYYY
where XXXXXX is the value of X and YYYY is the value of Y.
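The three data control word layouts above differ only in the two octal digits between the address and the count. The following Python sketch packs such a word; the function name is hypothetical, and storing a count of 4096 as zero in the 12-bit count field is an assumption made here because 4096 does not fit in four octal digits.

```python
# Two-digit octal codes taken from the word layouts shown above.
IONTP, IOTD, IOTP = 0o03, 0o00, 0o01


def data_control_word(address, count, code):
    """Pack XXXXXX cc YYYY (octal) into one 36-bit word."""
    if not 0 <= address < (1 << 18):
        raise ValueError("address must fit in 18 bits")
    if not 1 <= count <= 4096:
        raise ValueError("count must be in the range 1 through 4096")
    # Assumption: a count of 4096 is stored as 0 in the 12-bit field.
    return (address << 18) | (code << 12) | (count % 4096)


# Hypothetical operands: address 001000 (octal), count 20 (octal) words.
print(format(data_control_word(0o1000, 0o20, IOTP), "012o"))
```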
The lptr pseudo-op loads a pointer into a register. It has the format
lptr register,reference
where register is the register where you want to put the pointer and reference is the address that you want to put into the register.
lptr does its work by issuing a special directive to the LD loader. When LD sees this directive, it will generate an eppr instruction to load the address if the address can be directly indexed off a pointer register. Otherwise, LD generates a literal containing the address and an ldp instruction to load the address into register.
Since the eppr form only uses one machine word while the ldp uses two (one for the instruction and one for the literal), it is better to use eppr whenever possible. By using lptr, you can have the loader decide whether a particular address is "within range" of an eppr or whether an ldp is necessary.
The ndsc pseudo-ops are similar to the adsc pseudo-ops, in that they create an argument for an EIS instruction. adsc describes an alphanumeric string argument, while ndsc describes a numeric string. The ndsc pseudo-ops have the form
ndsc4 address,charnum,length,type,scale,ar_reg
ndsc9 address,charnum,length,type,scale,ar_reg
where
The ar_reg field of a descriptor may be omitted. Other fields may also be omitted, but you must put in extra commas to indicate fields that are missing. Trailing commas may be omitted. For example,
ndsc9 X,,24
demonstrates that you must add commas for interior missing fields, but not for one on the end.
EIS instructions are discussed in more detail later in this chapter.
The pointer pseudo-op creates a (relocatable) DPS-8 pointer, much like
.data pointer::X
does. (.data is described in Chapter 8).
The format of pointer is
pointer address,char_offset,bit_offset,segid
The last three fields are optional. address gives a (word) address, char_offset a byte number within that word, bit_offset a bit offset within the byte, and segid a SEGID. If char_offset and/or bit_offset are omitted, an offset of zero is assumed. If the segid is omitted, YAA assumes that the address is in the same segment as that of the address. pointer generates a machine pointer in the assembled code. Thus
pointer X
is equivalent to
.data pointer::X
The segid field may contain an expression yielding a value with the segid relocation instruction to override the default SEGID. In our experience, however, the segid field is seldom specified at all. More commonly, the only operand of pointer will be a symbol name from the program or a SYMREF.
tally generates a tally word that can be used in Indirect then Tally (IT) address modification. It is used in connection with 6-bit character strings using ci, sc or scr modification, and with word arrays using i, id, and di modification.
The format of tally is
tally address,count,offset
where
The tally word generated by tally will have a zero in bit 30, indicating that the word references 6-bit characters.
tallyb is almost exactly like tally, except that it generates tally words used in connection with 9-bit character strings using ci, sc, and scr address modification. It has the form
tallyb address,count,offset
address and count have the same meaning as in tally. offset is a byte offset which can be positive, negative, or a relocatable value.
The tally word generated by tallyb will have a one in bit 30, indicating that the word references 9-bit characters.
tallyc also generates a tally word for Indirect then Tally addressing. It is used in connection with indirect references to tag fields using idc and dic address modification.
tallyc has the form
tallyc address,count,modifier
where address and count are the same as for tally. The modifier is used for normal tag field references, and may have any of the symbolic values associated with IT tags:
i id di sc scr ci ad sd f idc dic
The modifier may also have any of the IR address modification forms, or n or n*.
The following sequence of code shows a simple use of tallyc.
    lda X,idc
X:  tallyc B,10,i
B:  arg U
    arg V
    arg W
    # etc.
The lda contains an idc reference to X. At X, there is a tallyc pseudo-op which generates the corresponding tally word. This points to B. At B, there is a sequence of indirect words; addressing will run down this sequence until the tally runs out.
tallyd is similar to the other tally pseudo-ops. It generates a tally word for use in connection with indirect references using the i, ad, sd, id, or di address modifications.
tallyd has the format
tallyd address,count,delta
where address and count are the same as for other tally pseudo-ops. delta is an optional integer expression whose value is used in the delta field of the generated tally word by ad and sd tags. The delta value will be hashed into a value in the range 0 through 64. If delta is omitted, the default is zero.
The tdcw pseudo-op is used to create data control words for hardware I/O processing. tdcw stands for "Transfer to Data Control Word". The pseudo-op has the form
tdcw X
where X is an address that can be represented by an 18-bit quantity. The result of tdcw is a machine word containing the value of X in its upper half; in its lower half it has the octal value
020000
The zero pseudo-op generates a machine word made up of two 18-bit fields. It has the form
zero upper,lower
where upper is a value to be put into the upper half of the word and lower is a value to be put into the lower half.
Either of the arguments may be omitted, as in
zero A
zero ,B
In this case, zero puts the specified argument in the appropriate half of the word and puts 0-bits in the other half.
If both arguments are omitted, as in
zero
YAA creates a machine word that contains only 0-bits.
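As a rough Python model of the word that zero builds (and of the tdcw word described earlier, which has the same two-half shape), consider the sketch below. The function names are hypothetical, and masking each operand to 18 bits is an assumption; the manual does not say how out-of-range values are treated.

```python
HALF_MASK = (1 << 18) - 1  # one 18-bit half of a 36-bit word


def zero_word(upper=0, lower=0):
    """Model of: zero upper,lower (an omitted operand contributes 0-bits)."""
    # Assumption: operands are simply truncated to 18 bits.
    return ((upper & HALF_MASK) << 18) | (lower & HALF_MASK)


def tdcw_word(address):
    """Model of: tdcw X (X in the upper half, octal 020000 in the lower)."""
    return zero_word(address, 0o020000)


print(format(zero_word(0o123), "012o"))        # zero A
print(format(zero_word(lower=0o456), "012o"))  # zero ,B
print(format(zero_word(), "012o"))             # zero
```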
The .bias pseudo-op is provided for backwards compatibility with GMAPV. In our opinion, it is an outmoded concept and should not be used in new programs. The pseudo-op has the form
name: .bias expression,ar_reg
The label name gives the name of a segment or template. If it is omitted, the current segment or template is used.
.bias indicates that the value of the expression should be added to the value of every symbol in the given segment or template, any time one of these symbols is used.
The ar_reg argument is optional. If it appears, it should be the name of an address register. This address register will be added to any references to symbols in the biased segment, unless the reference explicitly uses a different address register.
By default, segments and templates will have a bias of 0, and there is no ar_reg. To turn off an existing bias, use the instruction
name: .bias 0
This turns off the use of the address register as well as setting the bias to zero.
If there are several .bias statements for the same segment or template, the last one appearing in the source code is the one that is used when code is actually generated.
The sdsc pseudo-op is used in conjunction with mtr and rtm instructions. The pseudo-op has the general form
sdsc word,byte,code,register
where word and byte give a memory location and register specifies one of the ar registers to be involved in the operation. The code argument is one of the following letters:
s    load a signed value
z    load an unsigned value
For example, the following could be used with an mtr instruction to load a signed value from memory into AR7.
sdsc addr,0,s,ar7
EIS (Extended Instruction Set) instructions perform multiword alphanumeric operations. For example, mlr is an EIS instruction that copies a number of characters from one region of memory to another. The actual mlr instruction specifies options for the operation; after this instruction come two adsc4, adsc6, or adsc9 descriptors describing the two memory regions and their contents. In the rest of this section, we will discuss the mlr instruction, with the understanding that other EIS instructions behave in a similar way.
The general format of an mlr instruction is
mlr token_sequence,token_sequence,modifiers
adscn address,charnum,length,modifier
adscn address,charnum,length,modifier
As shown, the first two arguments of mlr are Bracket-balanced token sequence expressions. These token sequences are called MF's (modifier fields). The first MF provides options that are used in interpreting the first adsc descriptor (describing the memory region from which data will be copied). The second MF provides options for interpreting the second adsc descriptor (describing the memory region to which data will be copied). The modifiers on the end of the mlr instruction give additional options for the operation.
The MF arguments for mlr consist of zero or more option keywords separated by commas (and enclosed in square brackets to make a token sequence). Order of keywords is unimportant. Recognized options are:
The options may appear in any order. For example, you might have
mlr [ar,x2], [rl]
adsc9 0,1,8,ar2
adsc9 ABC,0,x5
The first MF says that address register modification is required for the first descriptor, and that index register x2 contains a byte offset. Since the address field of the first descriptor is 0, the result is that the address of the first word of the memory region is whatever value is in ar2, plus the byte offset in x2, plus one more byte (the charnum value). The length of the memory region is 8 characters.
The second MF says that the length field of the corresponding descriptor refers to an index register that holds the actual length. Looking at the corresponding descriptor we see that the memory region begins at the location of symbol ABC, and character 0 in that word. The length of the region is found in x5.
Unlike in GMAP, the second descriptor cannot be written
adsc9 ABC,0,5
You must write x5 instead of just 5. It is never correct to refer to registers just by number, even though (in this case) the rl option of the mlr instruction suggests that the descriptor should specify a register in the length field.
Earlier versions of YAA insisted that adsc instructions had to have at least three fields. In this version, fields may be omitted; unnecessary trailing commas may also be omitted. In general, if a trailing value may be omitted, the comma may be too.
Several modifiers may be added after the two MF arguments in an EIS instruction. These are
Many other EIS instructions use address modification options specified in similar forms. The options may be specified in any order.
If you specify a modifier that is not one of the characters listed above, it is interpreted as a fill character or a BOLR field, depending on the instruction. For BOLR fields, the value should be an expression that evaluates to an integer; the assembler uses the bottom four bits of this integer as the BOLR field.
Several YAA pseudo-ops create shrink vectors for DPS-8 hardware operations. These are described in the sections that follow.
All of the pseudo-ops described here are actually implemented with macros obtained with the instruction
.include "climb.a"
For more on macros and the .include statement, see Chapter 8.
cvec generates a copy vector for use in descriptor operations. It has the format
cvec segid
where segid is used to fill in the SEGID field in the generated copy vector (i.e. the low-order 12 bits). This argument must be supplied.
cvec generates a two-word copy vector beginning on the next double-word boundary.
fvec and fvecb generate data frame vectors for use by instructions like ldd3. They have the format
fvec size,[attributes]
fvecb size,[attributes]
where
r - read
nr - no read
w - write
nw - no write
s - save
ns - no save
c - cache
nc - cache bypass
x - extended (EI mode)
nx - not extended (ES or NS mode)
e - execute
ne - no execute
p - privileged
np - not privileged
b - bounded (non-zero size)
nb - not bounded (null)
a - accessible
na - not accessible (missing)
For example,
fvec 16,[nw]
is used for generating a read-only frame that is 16 words long.
If no attributes are specified, the defaults are r, w, s, c, x, e, p, b, and a (everything on).
The vec and vecb pseudo-ops generate shrink vectors. Each shrink vector is two words long and double-word aligned. They have the format
vec segid,offset,size,[attributes]
vecb segid,offset,size,[attributes]
where
YAA's functions are primary expressions that accept zero or more arguments. All function calls consist of a function name (beginning with a dot) followed by a list of zero or more arguments enclosed in parentheses. Arguments are separated by commas and must be Bracket-balanced.
The result of a function is an expression of one of the types described in Chapter 5: integer, floating point, string, token sequence, or location. This result may be used inside other expressions. The result of a function is an immediate expression if the argument(s) are all immediate expressions.
The functions recognized by YAA are listed below.
.concat( expression, ... )
.defer( expression )
.div( int_expression, int_expression)
.eval( token_sequence_expr )
.exists( token_sequence_expr )
.highest()
.highest( reloc_expression )
.ic()
.ic( reloc_expression )
.length( expression )
.list()
.lowest()
.lowest( expression )
.max( int_expression, ... )
.min( int_expression, ... )
.mod( int_expression, int_expression)
.quote( expression )
.sshift( int_expression,int_expression )
.substr(string_expression, int_expression, int_expression)
.substr(token_seq_expr, int_expression, int_expression)
.system()
.system( string_expression )
.tagval( tag_expression )
.time()
.udiv( int_expression,int_expression )
.unquote( expression )
.upto( int_expression )
.upto( int_expression,reloc_expression )
All functions but .defer are evaluated on the first pass through the source code (provided that their arguments can be evaluated on the first pass). This means, for example, that the .highest function refers to the highest point in a section at the time the function is encountered. This may not be the absolute highpoint, if more code is added to the section later in the assembly. If you do not want a function evaluated on the first pass, use .defer (described later in Section 7.21).
.concat(seq1,seq2,...)
.concat(string1,string2,...)
.concat concatenates one or more token sequences or strings. For example,
.concat([a b c],[d e f])
results in the token sequence
[a b c d e f]
and
.concat("abc","def")
results in the string
"abcdef"
.div(A,B)
The .div function performs integer division in a slightly different way than the / operator.
.div(A,B)
is the quotient of the integer A divided by B. If the division is inexact, the result of .div is always the largest integer less than the true quotient. In other words, truncation always goes towards negative infinity. Contrast this with A/B, where truncation is always towards zero. Thus we have
.div(7,-3) == -3
7/(-3) == -2
.mod(A,B)
The .mod function returns the remainder from a .div operation. The relationship
A == B*(.div(A,B)) + .mod(A,B)
is always true. Given the way .div rounds, this means that .mod returns a non-negative result whenever B is positive.
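The difference between the two kinds of division, and the identity that ties .mod to .div, can be checked with a short Python sketch. The function names are hypothetical stand-ins for the YAA functions; the code relies only on the rounding rules stated above (Python's `//` operator also rounds towards negative infinity).

```python
def floor_div(a, b):
    """Model of .div(A,B): the quotient is rounded towards negative infinity."""
    return a // b


def trunc_div(a, b):
    """Model of A/B: the quotient is truncated towards zero."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def floor_mod(a, b):
    """Model of .mod(A,B), defined so that A == B*.div(A,B) + .mod(A,B)."""
    return a - b * floor_div(a, b)


print(floor_div(7, -3), trunc_div(7, -3))  # the two quotients differ
print(floor_mod(-7, 3))                    # remainder for a positive divisor
```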
.eval(tokseq)
The .eval operator evaluates the tokens in the given token sequence as if the tokens formed a single expression. The result of .eval is the result of this expression. For example,
.eval( [1 + 2] )
yields the integer result 3.
.eval( .concat([1 +],[2]) )
also yields the integer result 3. On the other hand,
.eval( ["a"] )
yields the string result "a".
.exists(token)
The .exists function returns the integer 1 if the given token has been defined or declared up to this point in the program; otherwise, .exists returns 0. For example,
.exists([A])
determines if the symbol A has been defined or declared up to this point in the program.
.highest()
.highest(reloc_exp)
The .highest function returns a relocatable value representing the largest value of the instruction counter inside a section. Without arguments,
.highest()
returns the largest value of the instruction counter inside the current section.
.highest ( reloc_exp )
returns the largest value of the instruction counter inside the section that contains the location given by the relocatable expression. The result of .highest is an immediate expression.
Note that .highest returns the greatest value of the instruction counter at the time that .highest is called. More material may be added to the section later in the program.
.lowest()
.lowest(reloc_exp)
The .lowest function returns a relocatable value representing the beginning of a section.
.lowest()
returns the beginning of the current section.
.lowest(reloc_expression)
returns the beginning of the section that contains the given relocatable value. The value of .highest minus .lowest is the length of the section (expressed in units dictated by the section's offset mode).
.ic()
.ic(reloc_exp)
The .ic function determines the current value of the instruction counter inside a section. Without arguments,
.ic()
returns a relocatable value representing the current instruction counter value of the current section.
.ic( reloc_exp )
returns the current instruction counter value of the section that contains the given relocatable expression. The result of .ic is an immediate expression. The result is expressed as an offset in the units given by the section's current offset mode.
.length(string)
.length(tokexp)
When applied to a string, .length returns the number of characters in the string. When applied to a token sequence, .length returns the number of Bracket-balanced token subsequences (separated by commas) in the token sequence. For example,
.length("abc")
yields the integer 3, while
.length([tok1,tok2])
yields the integer 2.
The .list function returns an integer whose bits describe the listing options that are currently in effect. For more information about listings, see Chapter 9.
.max(arg1,arg2,...)
The .max function evaluates each of the arguments and returns the largest of the results.
.min(arg1,arg2,...)
The .min function evaluates each of the arguments and returns the smallest of the results.
.sshift(value,offset)
The .sshift function performs a signed right shift. Vacated bits are filled with the high-order (sign) bit of the given value (which means that the shift operation is performed arithmetically). For example,
.sshift(0700777000777,3)
has the result
0770077700077
If you want to perform a logical right shift (filling vacated bits with zeroes), use the >>>> operator described in Chapter 5.
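The arithmetic shift in the example above can be reproduced with a short Python model of a 36-bit word. The function names are hypothetical; the only assumption is the 36-bit word size, which is the DPS-8 word length.

```python
WORD_BITS = 36
WORD_MASK = (1 << WORD_BITS) - 1


def sshift(value, offset):
    """Arithmetic right shift of a 36-bit word: vacated bits copy the sign bit."""
    value &= WORD_MASK
    if value >> (WORD_BITS - 1):   # sign bit set: reinterpret as negative
        value -= 1 << WORD_BITS
    return (value >> offset) & WORD_MASK


def logical_shift(value, offset):
    """Logical right shift of a 36-bit word: vacated bits are zero."""
    return (value & WORD_MASK) >> offset


print(format(sshift(0o700777000777, 3), "012o"))
print(format(logical_shift(0o700777000777, 3), "012o"))
```

The first line printed corresponds to the .sshift result given above; the second shows what a zero-filling shift of the same word would produce.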
.substr(string,start,end)
.substr(tokseq,start,end)
The .substr function returns a substring of a string, or a subsequence of a token sequence. For example,
.substr("abcdef",0,3)
has the value "abcd" (positions 0 through 3). If end is greater than or equal to the number of characters in string, .substr goes up to the end of string and stops. Thus
.substr("abcdef",2,.length("abcdef"))
has the value "cdef".
As an example of taking a subsequence of a token sequence expression,
.substr([a,b,[c,d]],1,2)
has the token sequence value
[b,[c,d]]
If the first argument is a string, the result of .substr is a string. If the first argument is a token sequence, the result of .substr is a token sequence.
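Because the end position is inclusive, .substr does not map directly onto the half-open slices used in many languages. A minimal Python sketch of the behaviour described above follows; it models a token sequence as a Python list and assumes non-negative positions (negative positions are not discussed here and are not handled).

```python
def substr(sequence, start, end):
    """Return positions start through end (inclusive) of a string or list.

    An end position past the last element simply stops at the end,
    which is also how Python slices behave.
    """
    return sequence[start:end + 1]


print(substr("abcdef", 0, 3))
print(substr("abcdef", 2, len("abcdef")))
print(substr(["a", "b", ["c", "d"]], 1, 2))  # models .substr([a,b,[c,d]],1,2)
```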
.system()
.system(string)
The .system function has two forms.
.system()
returns a string indicating the system for which source code is being assembled. Possible values for system are currently
"GCOS8_NS" -- GCOS-8 multi-segment environment
"GCOS8_SS" -- GCOS-8 single segment environment
"MARKIII" -- MARKIII operating system
"PORT" -- PORT operating system on the PC
"DOS" -- DOS operating system on the PC
You may specify the option System=name on the assembler command line to specify a different system name.
The other form of the .system function is
.system ( string )
This compares the value of the string to the string that would be returned by .system(). If the two strings are identical, .system returns the integer 1; otherwise, it returns a 0. For example, the value of
.system("GCOS8_NS")
is 1 if the code is being assembled for the GCOS-8 NS mode environment, and 0 otherwise.
.tagval(tagexp)
The .tagval function returns the bit pattern associated with tagexp when it is used as a tag in a simple instruction. For example,
eaa lclvar,,p.fram
ada .dr0+.tagval([p.fram])
can be used to build a pointer to a local variable.
.time()
The .time function returns a string that gives the current date and time. This string has the form
"Wed Apr 20 15:32:40 1993" or "Thu Nov  7 03:02:01 1994"
This will always have the same number of characters. Notice that a blank is used to pad out the day of the month if it only has one digit.
.udiv(A,B)
The .udiv function performs unsigned integer division. The result is an integer expression equal to A divided by B in unsigned integer arithmetic.
.unquote(string)
The .unquote function returns a token sequence whose contents are the (tokenized) contents of the "string" argument. For example,
.unquote("a,b,c")
yields the token sequence
[a , b , c]
.quote(exp)
The .quote function returns a string whose contents match the value of the "exp" argument. More precisely, .quote is defined so that
.unquote( .quote(X) ) == X
for any argument X. Below we give some examples:
.quote("abc") == "\"abc\""
.quote([lda b,c]) == "[lda b , c]"
.quote(1+2) == "3"
.quote(.concat([lda],[b,c])) == "[lda b , c]"
Notice that when .quote is applied to a token sequence, the resulting string contains opening and closing square brackets. Each token inside the brackets is separated by a single space character.
.upto(boundary)
.upto(boundary,section)
.upto(36) determines the distance between the current location in the current section and the next word boundary.
The .upto function returns the number of bits between the current position and an alignment boundary. The form
.upto(number)
returns an integer representing the number of bits between the current location in the current section and the next alignment boundary, as determined by the given "number". For example,
.data .upto(36):0
fills in the rest of the current word with zero bits. (See Chapter 8 for a description of the .data statement.)
The form
.upto(number,section)
returns the number of bits between the current location in the specified section (as given by .ic) and the specified alignment boundary. For example,
.upto(36,X)
returns the number of bits between the current location in section X and the next word boundary.
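The distance that .upto reports can be modelled as simple modular arithmetic on a bit position. In this Python sketch the function takes the current position explicitly (the assembler gets it from the section's instruction counter), and it is an assumption that a position already on the boundary yields 0 rather than a full boundary's worth of bits.

```python
def upto(boundary, position_bits):
    """Number of bits from position_bits to the next multiple of boundary.

    Assumption: a position that is already aligned gives 0.
    """
    return -position_bits % boundary


# Hypothetical positions, in bits from the start of a section.
print(upto(36, 9))   # one 9-bit byte into a word
print(upto(36, 40))  # four bits into the second word
print(upto(36, 72))  # already on a word boundary
```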
.defer(exp)
The .defer function delays the evaluation of the given expression until the assembler's second pass through the source code; most other expressions are evaluated (or partly evaluated) during the first pass. As an example, the result of
.highest()
is the highest location in the current section, up to this point in the assembly.
.defer( .highest() )
evaluates the .highest function in the second pass, at which point the assembler knows the highest location that was ever reached.
If the opcode field of a statement is not empty, the field must contain a token sequence expression whose result is a single token. There are two acceptable token types.
We assume that you are familiar with the DPS-8 hardware opcodes. The pseudo-ops of YAA are described later in this chapter.
All pseudo-ops and hardware opcodes must be written in lower case in source code. Pseudo-ops and hardware opcodes are reserved words when they appear in the opcode field of a statement.
Before we describe the pseudo-ops of YAA, it will be helpful to cover some general principles of code generation.
Output code is generated by hardware opcodes and by some pseudo-ops (e.g., .data). Every time YAA has to output code, YAA first checks to see if it is at the proper alignment boundary as dictated by the current section's offset mode. For example, if the current offset mode is word, YAA checks to see if the output code is currently at a word boundary.
If the output code is not at an appropriate boundary, YAA outputs filler to get up to the required boundary. In a data section, YAA uses 0-bits for filler. In a code section, YAA tries to use NO-OP instructions; if there isn't enough room to put a complete NO-OP instruction, 0-bits are used instead.
Alignment checks and adjustment take place before YAA finds out what the next instruction is. This has side effects, as we will note in Section 8.4.2.
Some types of statements require additional alignment manipulation. For example, the GCOS-8 rpd instruction must be aligned on an odd-word boundary. When YAA sees such an instruction code, it will react, possibly by outputting more filler to reach such a boundary. The filler will be 0-bits in data sections, NO-OP instructions in code sections. Note that this means there is a difference between
label: .null rpd
and
label: rpd
The first goes to the next word boundary, then may go on to an odd-word boundary for the rpd instruction. The second goes to an odd-word boundary immediately, puts down the rpd instruction, and associates label with that instruction. In the second form, label is always associated with the rpd instruction; in the first, it may be associated with the word before the rpd instruction.
Some pseudo-op statements give a special meaning to the statement label field. For example, the .macro definition pseudo-op uses the statement label field to give the name of the macro.
If a statement does not interpret the label field in a special way, the statement label is taken as a label for the current location in the section. This will be the location after all alignment manipulation has taken place. For example, a label on an rpd instruction will refer to the location of the rpd after it has been properly aligned.
Since YAA aligns to the proper offset mode boundary for every statement, location label names are always associated with whole multiples of the current offset mode. For example, if a section currently has the word offset mode, new locations named in that section will always refer to an exact number of words. Consider the code fragment
X: .data 1:0 #a single bit
Y: .data 35:0 #35 bits
This example shows a single bit of data allocated at the location labelled X, followed by 35 bits of data at the location associated with Y. Even though these two pieces of data could fit in a single word, YAA aligns each data object on the boundary given by the section's offset mode. In a section with the word offset mode, X would be associated with one word and Y with the next word. If you want the two data objects to share the same word, you could change the section's offset mode to bit just before the definition of X. This is done with the .usage pseudo-op (described in a later section).
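The effect of the offset mode on label placement can be sketched in Python. This is a simplified model, not the assembler's algorithm: `lay_out` is a hypothetical name, positions are counted in bits from the start of the section, and the only rule modelled is that every statement first aligns to the offset mode boundary.

```python
def lay_out(items, unit_bits):
    """Return {label: bit position} for (label, size_in_bits) items.

    Before each item is placed, the position is rounded up to the next
    multiple of unit_bits (36 for word offset mode, 1 for bit offset mode).
    """
    position = 0
    labels = {}
    for name, size in items:
        position += -position % unit_bits  # filler up to the boundary
        labels[name] = position
        position += size
    return labels


data = [("X", 1), ("Y", 35)]
print(lay_out(data, 36))  # word offset mode: X and Y land in different words
print(lay_out(data, 1))   # bit offset mode: X and Y share one word
```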
YAA offers a convenient way to create literals. If the operand field of an instruction contains a sequence of one or more statements enclosed in brace brackets, the statements are assembled into an unnamed section that can serve as a literal. In addition, the bracketed code is regarded as a relocatable expression which can be used in other statements. The value of this expression is a relocatable value referring to the start of the unnamed section.
As a simple example, consider the instruction
lda {.data "abcdefg"},du
An unnamed section is created to hold the assembled output of the .data statement (i.e. the literal string). The code contained in the brace brackets is then replaced with a single relocatable value referring to the beginning of the newly created section. The effect of this instruction is therefore to create a literal string in an unnamed section, then load a pointer to the first character of this string into the A register.
If the link editor discovers two literals with identical contents, the two will be merged ("folded") so that code is not duplicated. Thus literals should be considered non- writable.
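To see why folded literals must be treated as non-writable, consider this Python sketch of a pool that stores each distinct literal once. It is purely illustrative: the class and method names are hypothetical and nothing here describes how LD actually implements folding.

```python
class LiteralPool:
    """Stores each distinct literal content once and hands back its index."""

    def __init__(self):
        self._index_by_content = {}
        self.literals = []

    def add(self, content):
        # Identical contents are folded onto the same stored literal.
        if content not in self._index_by_content:
            self._index_by_content[content] = len(self.literals)
            self.literals.append(content)
        return self._index_by_content[content]


pool = LiteralPool()
first = pool.add(b"abcdefg")
second = pool.add(b"abcdefg")  # same contents, so same literal
other = pool.add(b"xyz")
print(first == second, len(pool.literals))
```

Since two references can end up sharing one stored copy, a program that modified the literal through one reference would silently change what the other reference sees.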
The brace bracket construct can also be used in the opcode field of an instruction, to create a code block. In this case, YAA makes note of the current section when it encounters the opening brace, and goes back to that section after the closing brace that ends the code block. For example, you might write
# Instructions 1
{
.template
x: .data 0
y: .data 0
}
# Instructions 2
in order to define a template in the middle of other instructions. YAA takes note of the current section before entering the code block and goes back to that section after the code block is finished. This means that the second set of instructions immediately follow the first set in the original section. If you just wrote
# Instructions 1
.template
x: .data 0
y: .data 0
# Instructions 2
the second set of instructions would be considered part of the template, not the section that was being assembled before the .template instruction.
When YAA goes back to a section after a code block, YAA restores the offset mode of the section (if necessary). For example, the code
#current offset mode is "word"
{
.usage byte #now "byte" offset mode
A: .data 9:'a'
B: .data 9:'b'
C: .data 9:'c'
D: .data 9:'d'
}
#current offset mode back to "word"
goes to the byte offset mode within the brace brackets so that it can lay down data in byte quantities. After the closing brace, the section is returned to word offset mode.
When YAA goes back to a section after a code block, YAA does not restore the IC to what it was before the braces. Thus in
lda 1; {lda 2; lda 3} lda 4
YAA makes note of the current section when the opening brace is encountered and restores the section at the closing brace. However, the instructions inside the braces do not change sections. Therefore the material inside the braces is assembled as normal. When the section is restored after the closing brace, the IC will have the value it had after the lda 3 instruction. Therefore the above line is equivalent to
lda 1; lda 2; lda 3; lda 4
name: .null
[name,name,...]: .null
.null
The .null statement just associates the given name(s) with the current location in the current section. If necessary, .null forces alignment to the next boundary as dictated by the section's offset mode.
For example, suppose that the offset mode of the current section is word.
A: .null
[B,C]: .null
moves up to the next word boundary (if necessary) and associates A with the resulting location. .null does not change the instruction counter, so the next .null associates B and C with the same location.
If the .null has no label, it just forces alignment to an appropriate boundary (if necessary).
YAA has two pseudo-ops that reserve storage for data objects: .space and .data. The .align pseudo-op also contributes to the way that memory is allocated.
.align boundary
.align boundary,offset
.align 2
moves to the next double-word boundary.
.align 2,1
goes to a double-word boundary and then one word more (i.e. an odd-word boundary).
With the format
.align boundary
.align outputs "filler" (if necessary) until it reaches a memory location that is an even multiple of "boundary" times the offset mode of the current section. For example, if the offset mode of the current section is word,
.align 2
outputs filler (if necessary) until reaching a double-word. The filler consists of 0-bits in a data section, and NO-OP instructions in a code section.
The form
.align boundary,offset
works much the same way, filling to the next location that is at the given offset from the specified type of boundary.
If necessary, you can specify offset types explicitly using the format used by .highest, .lowest, and .ic. For example,
.align byte::1
aligns to the next byte boundary.
.align bit::9
also aligns to the next byte boundary. If an offset argument does not have an explicit offset type but the boundary argument does, the offset takes its type from the boundary argument. For example,
.align bit::18,9
aligns to nine bits beyond a half-word boundary.
.align does not accept a label field.
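The arithmetic behind .align can be modeled in a few lines of Python. This is only a sketch of the rule described above, not YAA's implementation: the function name align is hypothetical, locations are plain integers counted in the units of the offset type, and the restriction that the offset be smaller than the boundary is an assumption.

```python
def align(location, boundary, offset=0):
    """Return the first location >= `location` that lies `offset` units
    past a multiple of `boundary`. Units are words, bytes, or bits,
    depending on the offset type being modeled."""
    # Assumption: the offset is expected to be smaller than the boundary.
    if boundary <= 0 or not 0 <= offset < boundary:
        raise ValueError("expected boundary > 0 and 0 <= offset < boundary")
    # Python's % always yields a non-negative result for a positive
    # modulus, so this is the amount of filler that would be needed.
    return location + (offset - location) % boundary


# Word offsets, mirroring ".align 2" and ".align 2,1".
print(align(5, 2))       # 6: next double-word boundary
print(align(6, 2))       # 6: already aligned, so no filler
print(align(6, 2, 1))    # 7: one word past a double-word boundary
# Bit offsets, mirroring ".align bit::18,9".
print(align(20, 18, 9))  # 27: nine bits past the half-word boundary at 18
```

The difference between the returned value and the starting location is the amount of filler (0-bits or NO-OP instructions) that would be emitted.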
.space size
The .space pseudo-op reserves a block of memory but does not specify the contents of that block. For example, in a section with the word offset mode,
.space 4
reserves four words of space.
Different units may be specified for the reserved space using the notation type::value. For example,
.space byte::2
reserves two bytes of space. Similarly,
.space bit::1
reserves the next available bit.
There is a special consideration in situations where you are using .space to allocate space for a symbol that already has an associated type. In this case, you can omit the size argument for .space, and the assembler will automatically allocate the amount of space required to hold a value of the given type. For example, in
X: .object type=>>.double
X: .space
.space automatically allocates enough space to hold a double object.
When .space reserves memory, it does not initialize the contents of the storage.
.data value
.data value,value,...
.data bitlength:value
.data bitlength:value,bitlength:value,...
The .data pseudo-op reserves storage for a data object and places a value in that piece of storage.
With the format
.data expression
YAA moves to an appropriate alignment boundary as dictated by the current offset mode, before looking at the .data statement. Then, if necessary, YAA adds additional 0-bits to fill out to an alignment boundary which is appropriate to the type of the given expression. The value of the expression is then stored in this memory location, taking up the amount of space appropriate to the type of the expression. For example, in
.data 3
the value is an integer (requiring one word of storage on a word boundary). Therefore YAA goes to the next word boundary and reserves one word to hold the value 3.
.data "abcdefgh"
reserves the amount of memory needed to hold eight characters, aligned appropriately for a character string (a byte boundary). C users should remember that YAA does not automatically put a '\0' on the end of this string.
Notice the difference between
.data "a"
which reserves space for a single character and only requires byte alignment, and
.data 'a'
which reserves space for an integer constant containing the ASCII character 'a' and therefore needs word alignment.
If the argument of .data refers to a location, YAA allocates a word for the value. Then YAA represents the location value with a pseudo-address and a relocation instruction based on the offset mode of the location. For example, suppose X is a location in a program and X has the word offset mode.
.data X
reserves a word for the location value and outputs a word relocation instruction. This will be a lower word relocation, since the data area is an entire word. The link editor eventually fills in the data area with a value indicating the word offset of X from the beginning of the segment containing X.
You can request different relocation instructions by preceding the location value with the desired relocation type followed by two colons. For example,
.data pointer::X
allocates a word for the location value and generates a relocation instruction that tells the link editor to fill in this value with a machine pointer to the location X.
For more complex kinds of data, you may use the form
.data bitlength:value
YAA moves to the next word boundary, then reserves bitlength bits to hold the given value. For example,
.data 18:5
moves to the next word boundary and then stores the simple integer value 5 in the next 18 bits of storage.
If the expression after the colon has a location value, YAA generates an appropriate relocation instruction for the value. For example, consider
.data 18: segid::X
The location value is the 12-bit SEGID of the segment containing X. This is stored in the 18 bits reserved by the .data statement. If the alignment of these 18 bits is not suitable to hold a SEGID, YAA issues an error message. YAA generates an upper or lower segid relocation instruction, depending on whether the 18 bits reserved for the SEGID are in the upper or lower half of a machine word.
When a .data statement contains a bitlength, the type of the accompanying value determines how the data is stored in the given location. Integer values are right-justified and sign-extended to the entire width of the data. For example,
.data 100*36:-1
extends the sign of the -1 to 100 36-bit words (filling all the words with 1-bits). String values are left-justified and padded with zero bits on the end. Floating point values are also left-justified and padded with zero bits.
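The justification rules for integers and strings can be illustrated with a small Python model that treats a bit field as an unsigned integer. The helper names int_field and string_field are hypothetical, the 9-bit byte size is an assumption suggested by the 9:'a' examples earlier, and floating point values are not modeled.

```python
def int_field(value, bitlength):
    """Right-justify an integer in `bitlength` bits with sign extension.
    Masking a negative Python int yields its two's-complement pattern."""
    return value & ((1 << bitlength) - 1)


def string_field(text, bitlength, byte_bits=9):
    """Left-justify a string in `bitlength` bits and pad with zero bits.
    byte_bits=9 is an assumption about the character size."""
    used = len(text) * byte_bits
    if used > bitlength:
        raise ValueError("string does not fit in the field")
    bits = 0
    for ch in text:
        bits = (bits << byte_bits) | ord(ch)
    # Shift left so the unused low-order bits are the zero padding.
    return bits << (bitlength - used)


# ".data 18:5": the value 5 sits in the low-order bits of the field.
assert int_field(5, 18) == 5
# ".data 100*36:-1": every one of the 3600 bits is a 1-bit.
assert int_field(-1, 100 * 36) == (1 << 3600) - 1
# A one-character string in an 18-bit field occupies the upper 9 bits.
assert string_field("a", 18) == ord("a") << 9
```

Treating the field as one large integer keeps the model short; a real assembler would of course lay the bits down word by word.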
A single .data statement may initialize several consecutive memory locations. Initialization values are specified with a list of Bracket-balanced operands, separated by commas. For example,
.data 1,2,3
initializes three consecutive words to the given values.
.data 18:word::X,18:word::Y
initializes the upper half of a word to the relocatable word offset of X and the lower half to the relocatable word offset of Y. Two relocation instructions would be generated: an upper word relocation for X and a lower word relocation for Y. The two relocatable offsets would be put together into a single word.
Note that this is different from
.data 18:word::X
.data 18:word::Y
Since each .data statement automatically goes to the next word boundary before it reserves storage, the above code allocates two consecutive words. Each of these words would have a relocatable offset in its upper half.
A YAA variable is a variable created for use while assembling a YAA program. The variable is not directly associated with any memory locations in the assembled program.
name: .var expression
[name,name,...]: .var expression
The .var pseudo-op creates and initializes one or more YAA variables. The given names become the names of YAA variables, each of which has the value of the given expression. For example,
[A,B,C]: .var 0
creates three variables and initializes them to zero.
Names created as YAA variables may not have been used for any other purpose earlier in the program. For example, you can't use a name as the name of a memory location, then use it as a variable name too.
YAA variables must be created by a .var statement before they can be used (which means that the .var statement must precede any references to the variables in the source code).
Once you have created a variable, you can use it in other statements. For example, in
A: .var "Hello!"
String: .data A
the token A is replaced by the current value of the YAA variable A. This means that the above statement is equivalent to
String: .data "Hello!"
A .var statement may assign a new value to an existing YAA variable, as in
A: .var 0
...
A: .var A+1
A .var statement does not change the current alignment of the output code.
name: .set expression
[name,name,...]: .set expression
The .set pseudo-op lets you change the value of YAA variables or assign values to other symbols. For example, in
A: .var 0
...
A: .set A+1
the variable A is set to zero, then later incremented. In
[X,Y,Z]: .var 'a'
...
[X,Y,Z]: .set 'b'
three variables are initialized to 'a', then changed to 'b'.
You may use .set to assign values to symbols which are not YAA variables. The format of the statement is the same. However, such symbols can only be .set to a value once. YAA variables (created with an earlier .var statement) can have their values changed as often as you like.
YAA variables do not have a fixed type. Every time a variable is given a value by a .set statement, the type of the variable changes to the type of the expression in the statement. For example, you could say
X: .var 5
int: .data X
X: .set 7.8
float: .data X
In the first .data statement, X is an integer; in the second, it is a floating point value. If a variable is assigned a location value, the variable takes the same offset mode as the location value.
YAA variables may be associated with the addresses of literals, as in
A: .set {.data 01010101}
This format is better than
A: .data 01010101
if you are creating a constant (i.e. a data object whose value you won't want to change). Remember that if a program has several identical literals, they will be merged into a single literal by the linker. Therefore using literals instead of explicit data objects can save memory.
The previous sections may have given the impression that a YAA variable is very different from other symbols used in a program. This is not really the case. There are only a few differences between YAA variables and other user-defined symbols.
X: .set Y
...
Y: .set 0
This is valid if X and Y are not YAA variables. (Note that this is one way in which YAA differs from GMAP.) However, it is not valid if either X or Y is a YAA variable, since X could not be given a value using a forward reference and Y could not be used in the forward reference.
Any symbol may be assigned a value with .set, even if it has not been declared as a YAA variable. However, once a normal symbol has been given a value in this way, the value cannot be changed.
Earlier we gave the example of the statement
A: .set A+1
This is valid if A has already been assigned a value. However, if this is the first reference to A, the statement is what we call a circular definition, and it is invalid. If you try an instruction like
lda A
the A will be replaced with A+1, leading to (A+1)+1, leading to ((A+1)+1)+1 and so on. Eventually YAA will stop the expansion with the message
Expression too complex
The same sort of problem occurs if you define A in terms of B and B in terms of A, or use a longer circular chain.
Manifests and macros are symbols used to edit the text of your YAA program. They are similar to symbols created with the #define preprocessor directive in C, but there are several important differences. Manifests and macros are created with the .define pseudo-op.
.define manifest,text
.define macro(parm1,parm2,...),text
The .define pseudo-op defines both manifests and text macros. The following sections explain these in detail.
A manifest is a symbol whose value is a piece of source code text. Manifests are created with
.define manifest,text
For example,
.define SIZE,30
associates the text 30 with the name SIZE.
Once a manifest has been created with a .define statement, it can be used anywhere in your source code, except in ASCII string and character constants. When YAA recognizes the manifest, the assembler replaces the name with the associated text. For example,
.space SIZE
is changed to
.space 30
The difference between manifests and YAA variables is that a manifest is strictly textual. For example, suppose you have
X: .set 2+2
.define Y,2+2
A: .data X*X
B: .data Y*Y
When X is initialized, the expression 2+2 is evaluated and X is assigned the result of 4. When Y is defined, it is associated with the text 2+2. Thus the definitions of A and B become
A: .data 4*4
B: .data 2+2*2+2
Notice that the data with the label A is given the value 16. However, the data with the label B is given the value 8, because the multiplication operation takes place before the addition.
Because of the textual nature of manifests, it is a good idea to parenthesize the text values when they are defined. For example, you might write
.define Y,(2+2)
With this definition,
B: .data Y*Y
results in the expected value of 16. Because of such complications, it is better to use symbols or YAA variables whenever possible, rather than defining manifests.
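The contrast between an evaluated variable and a textual manifest can be reproduced in Python, whose + and * operators have the same relative precedence as in the example above. This is only an illustrative model: expand_manifests is a hypothetical helper, the identifier pattern is simplified, and substituted text is not rescanned for further manifests.

```python
import re

# Simplified identifier pattern; YAA identifiers may also contain "$" and ".".
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def expand_manifests(text, manifests):
    """Replace each identifier that names a manifest with its text."""
    return IDENT.sub(lambda m: manifests.get(m.group(0), m.group(0)), text)


# A variable is evaluated when it is set, so X holds the number 4.
X = 2 + 2
print(X * X)                                  # 16

# A manifest is pasted in as text before the expression is evaluated.
unsafe = expand_manifests("Y*Y", {"Y": "2+2"})
print(unsafe, "=", eval(unsafe))              # 2+2*2+2 = 8

# Parenthesizing the manifest text restores the intended grouping.
safe = expand_manifests("Y*Y", {"Y": "(2+2)"})
print(safe, "=", eval(safe))                  # (2+2)*(2+2) = 16
```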
A text macro is similar to a manifest. It is created with a .define statement of the form
.define macro(parm,parm,...),text
The macro name and parm values are normal YAA identifiers. The parm symbols are called the parameters of the macro. The text can be any source code text. For example, you might have
.define plus(A,B),A+B
Once a text macro has been defined in this way, it may be used in any location in the program's source code. To use a macro, you specify the macro's name followed by a parenthesized list of macro arguments. This is known as a macro call. Each macro argument is a Bracket-balanced sequence of tokens. Macro arguments are separated by commas. The number of arguments must be equal to the number of macro parameters given in the original macro definition.
When YAA encounters a macro call in source code, it replaces the call with the text given in the original macro definition. Wherever that text contains a token equal to one of the macro parameters, the token is replaced by the macro argument that corresponds to the parameter. For example, if your program has
.data plus(3,2)
the macro call plus(3,2) is replaced by the text associated with plus. The macro argument 3 replaces all occurrences of the parameter A in the macro text; similarly, the argument 2 replaces all occurrences of the parameter B. Therefore, the above .data statement becomes
.data 3+2
As with manifests, this kind of macro substitution is strictly textual. For example, in
.define times(X,Y),X*Y
.data times(1+2,3+4)
the .data statement becomes
.data 1+2*3+4
For this reason, it is usually a good idea to parenthesize all appearances of macro parameters in the macro definition, as in
.define times(X,Y),(X)*(Y)
With this definition, the .data statement becomes
.data (1+2)*(3+4)
Macro parameters are only recognized in macro text when they are separate tokens. In particular, they are not recognized when they appear as part of string constants. For example, if you define
.define HOWMANY(WHO,N),"A WHO has N lives"
the parameters WHO and N will not be replaced inside the string constant. You could get around this problem by defining
.define HOWMANY(WHO,N),\
    .concat("A ",.quote([WHO]),\
    " has ",.quote([N])," lives")
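The parameter substitution rules described above (token-by-token replacement, with string constants left alone) can be modeled with a short Python sketch. The tokenizer is deliberately simplified and the name expand_call is hypothetical; it is not a description of how YAA scans its input.

```python
import re

# A string constant is one token, so parameter names inside it are
# never seen as separate tokens. Everything else is an identifier,
# a run of white space, or a single character.
TOKEN = re.compile(r'"[^"]*"|[A-Za-z_][A-Za-z0-9_]*|\s+|.')


def expand_call(params, body, args):
    """Substitute macro arguments for parameter tokens in the macro text."""
    if len(args) != len(params):
        raise ValueError("argument count must match parameter count")
    bindings = dict(zip(params, args))
    return "".join(bindings.get(tok, tok) for tok in TOKEN.findall(body))


print(expand_call(["A", "B"], "A+B", ["3", "2"]))            # 3+2
print(expand_call(["X", "Y"], "X*Y", ["1+2", "3+4"]))        # 1+2*3+4
print(expand_call(["X", "Y"], "(X)*(Y)", ["1+2", "3+4"]))    # (1+2)*(3+4)
# Parameters inside a string constant are not replaced.
print(expand_call(["WHO", "N"], '"A WHO has N lives"', ["cat", "9"]))
```

The last call prints the string constant unchanged, which is why the .concat and .quote form is needed to build a string from macro arguments.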
A text macro definition is a single statement, and therefore extends to the first semi-colon or to the end of the line. However, long text macros may be created by continuing the macro definition onto additional lines in the usual way (putting a backslash at the end of each continued line).
You should be careful of the way that null token sequences affect macros. For example, consider
.define sample(X),( ([X]==[]) ? 0 : X )
This looks like the value of sample should be X unless X is null. However, if X is null, the macro expands to
( ([] == []) ? 0 : )
which is syntactically incorrect, so YAA reports an error. One way to solve the problem is to change the definition to
.define sample(X), @(([X]==[]) ? [0] : [X])
If X is non-null, this evaluates to @[X] or just X. If X is null, it evaluates to @[0], or just the integer 0. (The @ operator is described below.)
The @ operator is related to manifests and text macros in that it is used to modify source code text. The general form of the operation is
@ token_sequence_expression
where token_sequence_expression is the shortest possible Bracket-balanced token sequence following the @. When YAA encounters such a construct in source code, the construct is replaced by the sequence of tokens that are the result of the token sequence expression. As a simple example,
x: .var [1,x0]
lda @x
turns into
lda 1,x0
Here x is a YAA variable that has been given a token sequence value. The construct @x is therefore replaced by the tokens associated with x.
As with manifests and text macros, the replacement is purely textual: the source code is altered. The token sequence expression after @ must be an immediate expression.
YAA also lets you create a second kind of macro, called an opcode macro. Creating an opcode macro is like creating a new type of statement. In source code, it appears to be a single statement. However, when the program is assembled, the opcode macro statement is replaced by the text associated with the macro, and that text may actually be made up of a number of statements.
macro_name: .macro parm,parm,...
    # macro definition
.endmacro
The .macro pseudo-op starts the definition of an opcode macro. It must be on a line on its own; it cannot come before or after other statements on the same source code line.
The .macro statement is followed by a sequence of zero or more other statements called the body of the macro.
The end of the macro definition is indicated by an .endmacro statement as shown above. If the .endmacro statement has a label, the label must be the same as the name on the .macro statement that began the macro definition.
Below we give an example of a simple macro definition
VECTOR5: .macro A,B,C,D,E
    .data A
    .data A*B
    .data A*B*C
    .data A*B*C*D
    .data A*B*C*D*E
.endmacro
Once the macro has been defined, it can be used as a statement. For example, you might write
label: VECTOR5 1,2,3,4,5
The opcode field is the name associated with the opcode macro. The operand field gives a list of Bracket-balanced token sequences which serve as the arguments of the macro.
When YAA recognizes an opcode macro name in the opcode field, it inserts the body of the macro in the program where the opcode macro statement appears. Occurrences of the macro parameters in the macro body are replaced by the corresponding arguments in the operand field of the opcode macro statement. Thus the above statement turns into
label: .null
.data 1
.data 1*2
.data 1*2*3
.data 1*2*3*4
.data 1*2*3*4*5
Notice that the label of the opcode macro statement is attached to a .null directive, which associates the label with the value of the IC before the macro expansion begins.
The number of arguments specified when you invoke the macro cannot be greater than the number of parameters given in the macro definition. As with text macros, arguments are strictly textual.
When invoking an opcode macro, any or all of the (positional) arguments may be omitted. For example, if a macro is defined with
mac: .macro A,B,C
you could call it with
mac 1,2 #omits C
mac ,2,3 #omits A
mac 1,,3 #omits B
mac ,,3 #omits A,B
mac 1 #omits B,C
mac #omits all arguments
# and so on
As shown above, you do not have to put trailing commas on the argument list if you leave off trailing arguments. However, you can still do it if you want, as in
mac 1,,
When you omit an argument value, the macro is passed "emptiness", i.e. no text. This can be useful. For example,
.if [A] == []
tests whether a value was passed for parameter A. (The .if directive is discussed later in this chapter.)
In addition to the normal parameters explained above, macros may be defined to accept keyword parameters. Keyword parameters are specified on the .macro statement that begins a macro definition. They appear in the parameter list of the macro definition, after all the normal parameters have been given. A keyword parameter has the form
keyword=>>default_value
where keyword is a normal identifier and default_value consists of one or more Bracket-balanced token sequences. Keyword parameter definitions are separated by commas. The keyword parameter list ends at a semicolon or the end of the source code line.
When a program uses an opcode macro that was defined with one or more keyword parameters, YAA checks the operand list for keyword arguments. A keyword argument has the form
keyword=>>value
where keyword is the same as in one of the keyword parameters specified in the macro definition. If there is a keyword argument matching a particular keyword parameter, the keyword is replaced by the specified argument value wherever the keyword appears in the body of the macro. If there is no such keyword argument, the keyword is replaced by the default_value given when the macro was defined.
As an example, suppose you define
GOSUB: .macro location,modifier,SIZE=>>30
    tsx1 location,modifier
    .space SIZE
.endmacro
If you call this macro with
GOSUB rtn,x3,SIZE=>>20
the macro is expanded to
tsx1 rtn,x3
.space 20
If you omit the keyword argument, as in
GOSUB rtn,x3
you get the default value of SIZE as specified in the macro definition:
tsx1 rtn,x3
.space 30
In a call to an opcode macro that has both keyword and normal parameters, the normal arguments must precede the keyword ones. The normal arguments must appear in the same order as the corresponding parameters in the macro definition, but the keyword arguments may be given in any order.
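The binding of positional and keyword arguments can be modeled with the following Python sketch, which reproduces the GOSUB expansions shown above. It is an illustration only: the name expand_opcode_macro is hypothetical, operands are split on plain commas rather than as Bracket-balanced token sequences, and the label handling and ordering checks of a real opcode macro call are left out.

```python
import re

KEYWORD_MARK = "=>>"   # keyword argument marker as written in this manual
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def expand_opcode_macro(positional, keyword_defaults, body, operands):
    """Return the macro body with parameters replaced by arguments.

    positional       -- list of normal parameter names
    keyword_defaults -- dict mapping keyword names to default text
    body             -- list of source lines making up the macro body
    operands         -- operand field of the macro call, as one string
    """
    args = [a.strip() for a in operands.split(",")] if operands.strip() else []
    bindings = dict(keyword_defaults)          # start from the defaults
    plain = []
    for arg in args:
        name, mark, value = arg.partition(KEYWORD_MARK)
        if mark and name.strip() in keyword_defaults:
            bindings[name.strip()] = value.strip()
        else:
            plain.append(arg)
    if len(plain) > len(positional):
        raise ValueError("too many arguments")
    # An omitted positional argument is passed as "emptiness" (no text).
    for i, param in enumerate(positional):
        bindings[param] = plain[i] if i < len(plain) else ""
    return [IDENT.sub(lambda m: bindings.get(m.group(0), m.group(0)), line)
            for line in body]


gosub_body = ["tsx1 location,modifier", ".space SIZE"]
print(expand_opcode_macro(["location", "modifier"], {"SIZE": "30"},
                          gosub_body, "rtn,x3,SIZE=>>20"))
print(expand_opcode_macro(["location", "modifier"], {"SIZE": "30"},
                          gosub_body, "rtn,x3"))
```

The first call yields the lines tsx1 rtn,x3 and .space 20; the second falls back to the default and yields .space 30.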
.undefine name
The .undefine pseudo-op gets rid of the current definition of the symbol with the given name. For example,
.define SIZE,10
...
.undefine SIZE
gets rid of the SIZE manifest.
Once you have "undefined" a macro or manifest, the name no longer has its special meaning. It is interpreted as a normal identifier for the rest of the program.
You can use .undefine to discard any symbol or variable. However, there are some situations in which this gets you in trouble. If YAA encounters a forward reference to an unknown symbol on the first pass through the source code, YAA simply copies that reference to the intermediate working file. The assumption is that the symbol will be defined later in the source, so that the forward reference can be resolved on the second pass. However, if you define the symbol then .undefine it again, the symbol is not defined on the second pass either and you get an error. The result is that you cannot .undefine anything that is referenced through forward references.
You must have a separate .undefine statement for each symbol you undefine.
macro_name: .macro parm,parm,...
    .label placeholder
    #statements
    placeholder: #statement
When a statement containing an opcode macro call has a label field, the label is assigned the value of the instruction counter before the beginning of the macro expansion. At times, however, a macro may prefer to have the statement label associated with a statement inside the macro expansion (i.e. not the first statement of the expansion). To do this, you use a statement of the form
.label placeholder_name
where placeholder_name is a normal YAA identifier. If the .label statement itself has a label, the label must be the name of the macro, as given on the .macro statement.
The .label pseudo-op states that the given placeholder_name will stand for any label that is supplied when the macro is called. When the macro is expanded, any appearance of the placeholder name within the macro body will be replaced by the macro call's statement label field, expressed as a token sequence. For example, suppose you have
f: .macro
    .label LNAME
    statement1
    LNAME: statement2
.endmacro
If you call this macro with the statement
here: f
the macro is expanded to
statement1
[here]: statement2
In other words, the appearance of LNAME is replaced by the label here that was specified when the macro was called. Notice that the label is converted to the token sequence
[here]
even though it was only specified as a symbol name (without the square brackets).
If the label field of a macro statement contains a list of labels, the placeholder given on the .label statement is associated with the entire list. Using our above example,
[A,B,C]: f
expands to
statement1
[A,B,C]: statement2
The placeholder name specified in a .label statement does not have to be used in a label field inside the macro. It can be used anywhere, e.g. in an operand field, as in
tra LNAME
If a macro definition contains a .label statement, but the macro call does not have a label, the placeholder name is associated with an empty token sequence. In the above example, if you just use f without a label, the macro expands to
statement1
[]: statement2
This is a syntax error, since a statement can't have this kind of null label.
macro_name: .macro parm,parm,...
    .local var1,var2,...
    # statements
.endmacro
The .local pseudo-op declares a set of variables which are local to a macro definition. If the .local statement has a label, the label must be the same as the name of the macro being defined.
The variables declared with .local are like normal YAA variables, except that they can only be used within the macro definition itself and they disappear at the end of each invocation of the macro. Using local variables can avoid several problems that might arise if you used normal .var variables in the macro:
Local variables avoid these problems. The following macro provides a simple example of the use of such variables.
dbl_word: .macro A,B
    .local MAX,MIN
    MAX: .set ((A)>(B))?(A):(B)
    MIN: .set ((A)<(B))?(A):(B)
    .data MAX
    .data MIN
.endmacro
This macro takes two arguments. The local variable MAX is set to the greater of the two arguments and the local variable MIN is set to the lesser. The macro then generates two .data statements with the first having the greater value and the second the lesser. Notice that we parenthesized every appearance of A and B to avoid problems if the arguments are expressions.
The .local statement should appear after the .macro statement that begins the macro definition, and before the first use of any of the local variables. Generally, .local statements should appear immediately after the .macro statement so that their declarations can be easily found.
A macro definition may have as many local variables and as many .local statements as required.
A macro definition may contain the definition of another macro. This process is called "nesting" macro definitions.
Any .local and .label statements are associated with the macro that began at the most recent .macro statement. For example, in
f: .macro A,B,C
    .local X,Y,Z
    .label LNAME
    ...
    g: .macro D,E,F
        .local R,S,T
        .label LNAME2
        ...
    .endmacro # end of g
    ...
.endmacro # end of f
the variables X, Y, and Z are local to f, while R, S, and T are local to g. X, Y, and Z can be used inside both f and g; R, S, and T can only be used inside g.
In the above example, the macro f could only be called successfully once. If f were expanded a second time, YAA would try to redefine g and this would conflict with the definition of g given the first time f was called. Of course, the program could use .if to skip the definition of g on subsequent calls to f, but this can clutter up your code.
An alternative approach would be to declare g as local to f. This could be done with
f: .macro
    .local g
    g: .macro
        ...
    .endmacro
    ...
    g
    ...
.endmacro
In this instance, g may only be used inside the definition of f. Each time f is invoked, g is defined from scratch. g can be used by f; at the end of the expansion of f, however, YAA behaves as if g has been automatically undefined. In this way, f may be called as often as you like, and the definitions of g will not conflict with each other.
Defining a macro inside a macro can use up a lot of memory, since a new internal macro is defined every time the external one is used. For this reason, the technique should only be used in circumstances where no other method of implementation will work.
In Chapter 3, we showed how the backslash could be used to tell YAA to ignore a new-line character. In general, putting a backslash before a code token tells YAA to treat that token literally, without any special meaning it may usually have. This applies to keywords, manifests, text and opcode macros, and macro parameters. For example, in
.define one,1
lda one # becomes lda 1
lda \one # remains lda one
the backslash in front of the manifest tells YAA not to replace it with the associated text.
As an example, suppose you have an opcode macro named fred and it takes a keyword argument SIZE. Suppose also that you want to define a macro named george that also has a keyword SIZE. Finally, suppose that george does some work, then calls fred with the passed arguments. If you wrote
george: .macro SIZE=>>1
    blah: #stuff here
    fred SIZE=>>SIZE
.endmacro
the call to fred would be changed to
fred 1=>>1
(or whatever the value of SIZE turned out to be). Obviously, this is not what you want (it's a syntax error). Instead, you must write
george: .macro SIZE=>>1
    blah: #stuff here
    fred \SIZE=>>SIZE
.endmacro
Now the call to fred will be expanded to
fred SIZE=>>1
(or whatever the value of SIZE is).
Because the backslash has this special meaning, you must type two of them if you want an actual backslash character (e.g. in a string literal).
YAA provides a number of pseudo-ops for manipulating the code produced by the assembler. These perform such operations as obtaining source code from another file, skipping sections of source code, and repeating sections of source code.
.include filename
The .include statement obtains source code from another file. YAA replaces the .include statement with the contents of the given file. For example,
.include "user/cat/file"
obtains the contents of the specified file. Presumably this file contains YAA source code, e.g. macro and symbol definitions that are shared by many different source modules.
YAA remembers each string expression filename specified in an .include instruction. If an .include instruction later in the source code contains the same string expression as an earlier .include, the second inclusion will not take place. This lets you avoid including the same file twice. YAA only compares the string expressions; it does not compare actual file names resulting from those expressions. Thus
.include "jdortmunder/incfile"
.include "/incfile"
would both be executed, even if the two statements happen to refer to the same file.
No message is printed when YAA ignores a .include statement.
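The duplicate check can be pictured as a set of previously seen string expressions. The Python sketch below models only that comparison; the class name IncludeTracker is hypothetical, and the sketch does not read any files or model what happens when an earlier inclusion failed.

```python
class IncludeTracker:
    """Remember the string expressions given to .include statements."""

    def __init__(self):
        self._seen = set()

    def should_include(self, filename_expr):
        """Return True the first time a string is seen, False afterwards.
        Only the strings are compared, never the files they resolve to."""
        if filename_expr in self._seen:
            return False          # silently skipped, with no message
        self._seen.add(filename_expr)
        return True


tracker = IncludeTracker()
print(tracker.should_include("jdortmunder/incfile"))  # True
print(tracker.should_include("/incfile"))             # True: different string
print(tracker.should_include("/incfile"))             # False: same string again
```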
If YAA cannot find the file that you want to include, YAA normally aborts the assembly. However, you can avoid this by putting a label on the .include statement, as in
X: .include "file"
In this case, the label is treated like a variable. The variable is assigned a value of 0 if the file is read successfully, and assigned a non-zero value if the file cannot be read. (For C programmers, the non-zero value is the same value that would be assigned to "errno" in this situation.) With this format, YAA does not abort the assembly if the file cannot be read. This gives the source code a chance to examine the value of the label X and to figure out what should be done next.
If YAA reads an include file successfully, YAA completely processes the file before going on to the statement that follows the .include statement. Thus if you test the label X and it indicates a successful read, the include file has already been included at that point.
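As a rough analogy, the labelled form behaves like a file read that reports a status instead of failing. The Python sketch below is an illustration only: the function name `include_with_label` is hypothetical, and the errno values are those of the host running Python, not those of GCOS-8.

```python
import errno
import os
import tempfile


def include_with_label(path):
    """Return (label_value, text); label_value is 0 on success, else errno."""
    try:
        with open(path, encoding="utf-8") as source:
            return 0, source.read()
    except OSError as exc:
        # Hypothetical stand-in for the value YAA assigns to the label.
        return exc.errno, ""


with tempfile.TemporaryDirectory() as catalog:
    good = os.path.join(catalog, "file")
    with open(good, "w", encoding="utf-8") as out:
        out.write("# shared definitions\n")
    ok_value, ok_text = include_with_label(good)
    bad_value, bad_text = include_with_label(os.path.join(catalog, "missing"))

print(ok_value, bad_value == errno.ENOENT)
```

In the successful case the whole file has been read by the time the status is available, which mirrors the rule that the include file is fully processed before the next statement.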
An .include statement may give either a relative or an absolute pathname for a file. If the given pathname is relative, as in
.include "myfile"
YAA must look for the file under some catalog.
When you assemble YAA source code, you may specify
Include=catalog
options on the command line. These options indicate catalogs that YAA should search through when attempting to find a relative .include file.
.search catname
The .search pseudo-op lets you specify catalogs where YAA should search for files named in .include statements. For example, with
.search "user/cat"
.include "file"
YAA tries to satisfy the .include statement by looking for user/cat/file. More specifically, YAA searches through catalogs in this order:
You can have any number of .search statements in your code. .search directives are particularly useful inside initialization files, where they can set up search rules for all of the .include statements in a program.
.if int_expression1
# first set of statements
.elseif int_expression2
# second set of statements
.elseif int_expression3
# third set of statements
...
.else
# final set of statements
.endif
The .if statement tells YAA to skip zero or more source code statements if a certain condition is false. The simplest way to use .if is with a block of code of the form
.if int_expression
# statements here
.endif
When YAA encounters an .if statement, it evaluates the given integer expression. If the value of the expression is non-zero, YAA proceeds normally. If, however, the value of the expression is zero (false), YAA skips all the statements that follow until it finds the .endif statement. YAA begins assembling statements again following the .endif. None of the statements between the .if and the .endif cause any actions. YAA just skips through them looking for the .endif.
A second way to use .if is with a block of code of the form
.if int_expression
# first set of statements
.else
# second set of statements
.endif
Again, YAA evaluates the integer expression. If the value of the expression is non-zero, YAA assembles the first set of statements, but skips past the second set (between the .else and the .endif). If the value of the expression is zero, YAA skips the first set of statements, and assembles the second set.
The final way to use .if is with a block of code of the form
.if int_expression1
# first set of statements
.elseif int_expression2
# second set of statements
.elseif int_expression3
# third set of statements
...
.else
# final set of statements
.endif
If the first integer expression is non-zero, YAA assembles the first set of statements and skips the rest. If the first expression is zero but the second is non-zero, YAA assembles the second set of statements and skips the other sets. If all of the integer expressions on the .elseif statements turn out to be zero, YAA assembles the final set of statements following the .else.
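The selection rule for an .if/.elseif/.else chain can be summarized with a short Python sketch. It is a model of the rule described above, under the assumption that each condition has already been evaluated to an integer; the function name `select_branch` is hypothetical.

```python
def select_branch(branches, else_body=None):
    """Pick the statements to assemble from an .if/.elseif/.else chain.

    branches  -- list of (int_value, statements) for .if and each .elseif
    else_body -- statements following .else, or None if there is no .else
    """
    for value, body in branches:
        if value != 0:       # first non-zero expression wins
            return body      # every other set is skipped
    return else_body if else_body is not None else []


chain = [(0, ["first set"]), (5, ["second set"]), (1, ["third set"])]
print(select_branch(chain, ["final set"]))  # ['second set']
```

Note that the third expression is also non-zero, but its statements are skipped because an earlier branch was already chosen.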
The .exists expression is particularly useful in .if statements. For example,
.if .exists([DEBUG])
...
.endif
assembles the contained code if a symbol named DEBUG has been defined. Such code could contain instructions helpful during the debugging process. If you put
DEBUG: .var 1
at the beginning of your program, YAA assembles the debugging code after the .if. Once the program has been debugged, you can remove the definition and YAA skips the debugging code.
The condition in an .if statement must be an expression that can be evaluated on the first pass through the code. For example, suppose A and B are location expressions. You can use
.if (A == A)
since YAA can immediately tell that this is true. However, you cannot use
.if (A == B)
unless YAA can immediately tell if A and B refer to the same memory location (e.g. A and B are defined at the same offset from the beginning of a section that has already been defined).
.while condition
#statements
.endwhile
The .while loop repeats the assembly of a block of statements. It is similar to a "while" loop in higher level programming languages, except that it works on the text of the program.
To execute a .while loop, YAA first evaluates the condition of the loop as an integer expression. If the condition expression is non-zero, the statements between the .while and .endwhile are assembled. Once they have been assembled, YAA returns to the .while statement, and evaluates the condition again to see if it is still non-zero. The statements are repeatedly assembled until the condition expression is found to be zero. If the condition has the value zero the first time the .while statement is encountered, YAA does not assemble the enclosed statements at all.
As a simple example, consider
I: .var 4
.while I>>0
.data I
I: .set I-1
.endwhile
The .while loop will produce the statements
.data 4
.data 3
.data 2
.data 1
One .while loop may be nested inside another .while loop. Each .while statement must have a corresponding .endwhile statement. An .endwhile statement is always associated with the most recent .while statement.
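The repetition performed by .while can be pictured as a loop that emits text. The Python sketch below models the example above; the names `expand_while` and `emit_and_decrement` are hypothetical, and the condition is modelled as "I is greater than zero", which is an assumption about how the condition in the example is written.

```python
def expand_while(variables, condition, body):
    """Repeat body while condition(variables) is non-zero; return emitted lines."""
    emitted = []
    while condition(variables) != 0:   # condition is re-evaluated on every pass
        body(variables, emitted)
    return emitted


def emit_and_decrement(variables, emitted):
    emitted.append(f".data {variables['I']}")  # models: .data I
    variables["I"] -= 1                        # models: I: .set I-1


variables = {"I": 4}                           # models: I: .var 4
lines = expand_while(variables, lambda v: int(v["I"] > 0), emit_and_decrement)
print(lines)  # ['.data 4', '.data 3', '.data 2', '.data 1']
```

If the variable starts at zero, the condition is zero on the first evaluation and nothing is emitted.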
name: .foreach tokseq
#statements
name: .endforeach
The easiest way to understand the .foreach construct is to begin with an example.
x: .foreach [du,dl,qu]
lda 1,x
x: .endforeach
generates the following instructions.
lda 1,du
lda 1,dl
lda 1,qu
In other words, the statement that follows the .foreach is repeated once for each item in the token sequence on the .foreach line. On each repetition, the symbol x (the label on the .foreach line) is replaced by the token sequence item wherever x appears.
The name used to label a .foreach statement is used as a variable inside the .foreach loop. If this hasn't already been declared as a variable, it is automatically created as such.
The enclosed statements are repeated once for each item in the token sequence. Each time through these statements, the YAA variable name is replaced by the appropriate item from the token sequence. The value of the variable is a token sequence containing exactly one token from the token sequence specified on the .foreach statement.
Individual items in the token sequence must be bracket-balanced and separated with commas. The .foreach statement must have a label, and this label must be able to serve as a variable. The .endforeach statement must have the same label.
As another example of .foreach, consider the following macro definition.
list: .macro
.label x
y: .foreach x
.data .quote( y )
y: .endforeach
.endmacro
A statement like
[A,B,C]: list
would expand to
y: .foreach [A,B,C]
.data .quote( y )
y: .endforeach
which in turn becomes
.data .quote([A])
.data .quote([B])
.data .quote([C])
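The item-by-item substitution performed by .foreach can be sketched in Python. This is a simplified model, not YAA's tokenizer: it treats the loop variable as plain text to be replaced, splits the token sequence only at top-level commas, and uses the hypothetical names `split_items` and `expand_foreach`. It reproduces the first example (the lda instructions), not the token-sequence values seen by .quote.

```python
import re

# Identifier characters as described for YAA: letters, digits, '_', '$', '.'
IDENT = re.compile(r"(?<![A-Za-z0-9_.$])[A-Za-z_.$][A-Za-z0-9_.$]*")


def split_items(tokseq):
    """Split '[a,b,(c,d)]' into top-level, bracket-balanced items."""
    inner = tokseq.strip()[1:-1]
    items, depth, current = [], 0, ""
    for ch in inner:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        if ch == "," and depth == 0:
            items.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        items.append(current.strip())
    return items


def expand_foreach(name, tokseq, body_lines):
    """Repeat body_lines once per item, replacing the identifier `name`."""
    out = []
    for item in split_items(tokseq):
        for line in body_lines:
            out.append(
                IDENT.sub(lambda m: item if m.group(0) == name else m.group(0), line)
            )
    return out


print(expand_foreach("x", "[du,dl,qu]", ["lda 1,x"]))
# ['lda 1,du', 'lda 1,dl', 'lda 1,qu']
```

Because commas inside nested brackets do not split items, a sequence such as [A,(B,C),D] yields three items in this model.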
YAA allows nesting of .if, .while, and .foreach constructs with opcode macro definitions. There are several rules that control this nesting.
A nestable block is a set of statements beginning with an .if and ending at the corresponding .endif, or beginning at a .while and ending at the corresponding .endwhile, or beginning at a .foreach and ending at the corresponding .endforeach, or beginning at a .macro and ending at the corresponding .endmacro. The first and last statements of a nestable block may be given the same label, as in
A: .if something
...
A: .endif
If both the first and last statements of a nestable block have labels, and the labels are not the same, an error message will be printed. In this way, you can use labels in your source code to show where each nestable block begins and ends.
One nestable block can be contained in another nestable block. For example, an opcode macro definition can contain an .if-.endif block, or vice versa. However, you cannot have two blocks partly overlap, as in
A: .if something
B: .macro
.endif
.endmacro
One block must be wholly contained by the other.
.label and .local statements must be inside an opcode macro definition, but cannot be inside .if or .while nestable blocks. For example, you cannot say
M: .macro
.if something
.local X,Y,Z
...
When an .if or .while block appears in a macro definition, the blocks are not evaluated until the macro is expanded. For example, in
M: .macro
.while A>>10
...
YAA does not check the value of A at the time the macro is defined. The .while statement is just taken literally. The .while statement is not executed until the macro is actually used, at which time YAA will check the current value of A.
The last statement of a nestable block may use the "generic" pseudo-op .end instead of .endmacro, .endif, .endwhile, or .endforeach. For example, you might have
A: .macro
.if something
.while something
...
.end
.end
.end
An .end statement at the end of a nestable block may have the same label as the beginning of the nestable block, as in
A: .macro
...
A: .end
We recommend against using the generic .end statement--it can make debugging very difficult. If you use .endmacro, .endif, and so on, the assembler itself will detect nesting errors for you.
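The nesting rules amount to a stack discipline, which the following Python sketch illustrates. It is a simplified model under stated assumptions: statements are given as (label, opcode) pairs, only the block-opening and block-closing pseudo-ops are examined, and the function name `check_nesting` and the message wording are hypothetical.

```python
OPENERS = {
    ".if": ".endif",
    ".while": ".endwhile",
    ".foreach": ".endforeach",
    ".macro": ".endmacro",
}
CLOSERS = set(OPENERS.values()) | {".end"}


def check_nesting(statements):
    """statements: list of (label_or_None, opcode). Returns a list of errors."""
    errors, stack = [], []
    for lineno, (label, op) in enumerate(statements, start=1):
        if op in OPENERS:
            stack.append((label, op))
        elif op in CLOSERS:
            if not stack:
                errors.append(f"line {lineno}: {op} without an open block")
                continue
            open_label, open_op = stack.pop()
            # The generic .end closes any block, so it cannot reveal a mismatch.
            if op != ".end" and op != OPENERS[open_op]:
                errors.append(f"line {lineno}: {op} closes {open_op}")
            if label and open_label and label != open_label:
                errors.append(f"line {lineno}: label {label} does not match {open_label}")
    for _, open_op in stack:
        errors.append(f"unclosed {open_op}")
    return errors


overlapping = [("A", ".if"), ("B", ".macro"), (None, ".endif"), (None, ".endmacro")]
print(check_nesting(overlapping))
```

Running the same check with every closer written as .end reports nothing for the overlapping example, which illustrates why the specific closing pseudo-ops catch more nesting errors.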
There are several pseudo-ops dealing with external symbols: .symdef creates symbol definitions, .symref creates symbol references, and .options specifies loader options for a list of symbols.
[name,name,...]: .symdef options
A .symdef statement indicates that the names given in the label field are SYMDEFs. The location of each defined symbol will be established by other statements in the program.
If a symbol is a SYMDEF, an appropriate .symdef statement may appear anywhere in the program. However, good programming style suggests that .symdef statements should appear either at the beginning of the program, or shortly before the first use of the external symbol.
A symbol may be mentioned in more than one .symdef and/or .symref statement in the same program.
[name,name,...]: .symref options
A .symref statement creates SYMREFs for the names given in the label field. The possible options are
If a symbol is a SYMREF, an appropriate .symref statement must appear before the first use of that symbol. However, good programming style suggests that .symref statements should appear either at the beginning of the program, or shortly before the first use of the external symbol.
A symbol may be mentioned in more than one .symdef and/or .symref statement in the same program.
If a relocation type is not specified for a SYMREF, word is assumed.
[name,name,...]: .options opts
The .options pseudo-op specifies loader options to be applied when the assembled code is link-edited by LD. The label field gives the names of one or more symbols that are declared as SYMDEFs with .symdef statements somewhere in the program. The .options statement may come before or after the .symdef statement(s). The opts string expression gives a string that contains the options to be passed to the loader. For example,
.setu.: .symdef
.setu.: .options "+entdef"
declares .setu. to be a SYMDEF and associates the +entdef LD option with the symbol. A list of possible options is given in Appendix D of the LD Reference Manual.
The .options directive may be applied to SYMREFs and local sections. However, this has limited use on this system.
name: .section options
A .section statement marks the beginning of a section in YAA code. The name of the section is given as the statement label. This name is passed on to the link editor.
If you omit the section name, YAA begins an unnamed section. Such sections are treated like named sections, but cannot be referenced by external object modules.
The operand field of the .section statement provides additional information describing the section.
If the keyword argument data appears, the section is assumed to be a data segment (for the purposes of .align). If the keyword argument code appears, the section is assumed to be a code segment. The default is code.
Common sections (as in Fortran) can be created by specifying the common option, as in
BLOCK: .section common,data
Such sections must be given a name by specifying a label on the .section instruction.
The option
parent=>>name
gives the name of the section's parent. This is the name of the section that immediately contains the section being created. The parent name must evaluate to a relocatable expression. This can be either a SYMREF or a location defined in some section of the program. If the location is not the name of an actual section, the parent section is the section that contains the given location.
The parent option can be omitted. In this case, the location of the section is determined by the linker.
The option
align=>>number
lets you specify the alignment of a section. The number gives the alignment as a number of bits. For example,
X: .section align=>>36
aligns the section on a full word boundary. If an alignment is not specified, the default is the offset mode of the section.
By default, the .section pseudo-op sets the instruction counter to zero. Thus all instruction offset values for the code that follows will be relative to the start of the section at offset zero. However, you can specify
origin=>>expression
in the operand field of .section. In this case, the IC is set to the value of the given expression. The values of .lowest() and .highest() will also have this value (although .highest() will change as soon as you assemble code in the section). The value of the origin cannot be negative; it can be positive or zero.
In some types of section, you want to create a SYMDEF or secondary SYMDEF for each label defined in the section. To do this, specify one of the options
label=>>symdef
label=>>secondary
for the .section pseudo-op. If you specify the first, the assembler automatically generates a SYMDEF for every label in the section. Similarly, if you specify the second, the assembler automatically generates a secondary SYMDEF for every label in the section.
The default is
label=>>local
In this case, the labels are considered local to the section, unless you explicitly use .symdef or .secondary pseudo-ops to create appropriate SYMDEFs.
The .section pseudo-op can also specify one of the keywords bit, byte, or word to indicate the beginning offset mode for the section. For example,
BYTE_SECT: .section byte
gives the byte offset mode to the section. If no offset mode keyword is specified, the default is word.
If a section is to be identified with a SYMDEF, the .symdef statement should precede the .section statement. If the .section comes first, the section will have no external name; instead, the specified SYMDEF is created at offset 0 in the (unnamed) section. While this has almost the same effect as naming the section itself, it is usually not what you want.
.usage bit
.usage byte
.usage word
The .usage pseudo-op changes the offset mode of the current section to the given type. A .usage statement cannot have a label.
You may specify the offset mode of a section by specifying an alignment as part of a .usage pseudo-op, as in
.usage align=>>36
The alignment is given as a number of bits.
In some types of section, you want to create a SYMDEF or secondary SYMDEF for each label defined in the section. To do this, you may specify one of the options
label=>>symdef
label=>>secondary
on the .usage pseudo-op. If you specify the first, the assembler automatically generates a SYMDEF for every label in the section. Similarly, if you specify the second, the assembler automatically generates a secondary SYMDEF for every label in the section.
The default is
label=>>local
In this case, the labels are considered local to the section, unless you explicitly use .symdef or .secondary pseudo-ops to create appropriate SYMDEFs.
.origin where
The .origin instruction moves to the section that contains the location given by the where operand and sets the instruction counter to the value it had at that location.
The .origin statement marks the beginning of a block of code that extends to the next .section or .origin statement (if there is one). The .origin statement tells YAA that this block of code should be placed beginning at the relocatable position given by the specified relocatable expression. For example,
X: .section
# statements
.origin X
# code
states that the given code should be positioned at the beginning of the section called X. This overwrites any code already generated at the beginning of X, with undefined results. In practice, .origin is usually used to set the origin to a location where no code has yet been generated.
It is common to set origins at offsets from the current IC value, as in
.origin *+10
By default, YAA assumes the offset has the units indicated by the section's offset mode. For example, if the section has the word offset mode, the above instruction moves ahead ten words. Explicit offset modes may also be specified, as in
.origin byte::( .ic(byte::*) + 2 )
which sets the origin ahead by two bytes. Note that we had to use
.ic(byte::*)
to obtain the current IC as a byte offset, and use byte:: as the first part of the operand field to indicate that we were setting a byte origin.
If you leave a section and come back with .origin, the section has the offset mode that it had when you left. This is true, no matter where you set the origin within that section.
name: .template options
The .template statement is similar to the .section statement. It marks the beginning of a section and by default, it sets the IC to zero. However, a template is different from a normal section in several respects.
Template sections are intended to be useful when defining the layout of data structures. The value of the instruction counter gives an absolute offset from the beginning of a structure.
The statement
.template offset_mode
begins a template section and sets the offset mode for the template. For example,
.template byte
begins a template with the byte offset mode. The offset mode of a template can be changed with .usage. If no offset mode is specified, the default is word.
As with sections, you may use .origin to change your origin to an existing template. Thus you can build templates in stages, as in
A: .template
B: .data 0
x: .template
y: .data 0
.origin .ic(a) # back to A
C: .data 0 # third position in A
Unlike sections, you may use .origin to set your current location to a negative offset from the beginning of the template. For example,
x: .template
.origin *-10
y: .data 0
defines x at location 0 of the template and y at location -10. You may not use .origin to move to a negative offset inside a section.
Negative offsets may also be set using the origin=>>expression option. This is similar to the option for .section, except that the given expression may have a negative value as well as a positive one or zero.
In some types of template, you want to create a SYMDEF or secondary SYMDEF for each label defined in the template. To do this, specify one of the options
label=>>symdef
label=>>secondary
for the .template pseudo-op. If you specify the first, the assembler automatically generates a SYMDEF for every label in the template. Similarly, if you specify the second, the assembler automatically generates a secondary SYMDEF for every label in the template.
The default is
label=>>local
In this case, the labels are considered local to the template, unless you explicitly use .symdef or .secondary pseudo-ops to create appropriate SYMDEFs.
The .lowest and .highest functions may be applied to templates. Note that .lowest may return a negative number if you use .origin to define items at negative offsets. The return value of .lowest can never be greater than zero.
Default alignments within the section can be specified with align=>>number on the .template pseudo-op. The alignment is given as a number of bits.
.pool
.pool parent=>>section
The sections and templates of an assembly program form trees. A parent section may have a number of subsections as its branches and each of those can have other branches.
Each such tree has a root: a section without a parent section. Each root has an associated literal pool which can contain literals created in the sections of that tree. When the source code refers to a string literal, YAA automatically looks for a literal pool where the data of that literal can be stored. It begins this search with the current section and proceeds up the tree until it finds a section that has an associated literal pool. YAA stores the data of the string in that pool.
The .pool statement associates a literal pool with the current section. Any subsequent literals in the code for this section or any of its branches are stored in this new pool.
YAA puts literals into literal pools as soon as they are encountered in code. For example, suppose a section has the form
code
.pool
more code
If the code before the .pool statement contains literals, YAA immediately looks backward through the tree to find a suitable literal pool; the literals are stored in the pool associated with some parent section. If the code after the .pool statement contains literals, YAA stores those literals in the pool associated with the current section; the current section doesn't have its own pool until YAA finds the .pool statement. As a result, you should usually put the .pool statement at the beginning of a section.
A .pool statement may take the option
parent=>>section
where section is the name of some other section. This says that the pool should be associated with the specified section instead of the current one.
GMAP programmers should note that .pool is the exact reverse of the GMAP .LIT pseudo-op. .LIT refers to all of the literals that appear before the statement; .pool refers to all of the literals that appear after.
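The search for a literal pool can be modelled as a walk up the section tree. The Python sketch below illustrates the rule described above; the class `Section` and its methods are hypothetical names, and the model assumes that every root section already has a pool.

```python
class Section:
    """Hypothetical model of a section and its optional literal pool."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # A root (no parent) always has a pool; other sections start without one.
        self.pool = [] if parent is None else None

    def add_pool(self):
        """Model a .pool statement in this section."""
        if self.pool is None:
            self.pool = []

    def store_literal(self, text):
        """Store a literal in the nearest pool, searching up the tree."""
        node = self
        while node.pool is None:
            node = node.parent
        node.pool.append(text)
        return node.name


root = Section("root")
child = Section("child", parent=root)

before = child.store_literal("seen before .pool")  # goes to the root's pool
child.add_pool()
after = child.store_literal("seen after .pool")    # goes to the child's own pool
print(before, after)  # root child
```

The two results differ only because of where the .pool statement falls, which is why placing .pool at the beginning of a section gives the most predictable result.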
As noted in the discussion of literal pools, the sections of a program form trees.
The LD link editor decides where sections are placed in memory. These sections go together to form segments.
The first thing in a segment is the root of a tree. Inside the segment, parent sections are followed by their children, in the order that .section statements for the children appeared in the code. Of course, child sections may also be parents with children of their own. As an example, here's a simple segment layout:
Root
    Child A
        1st child of A
        2nd child of A
        3rd child of A
        Literal pool for A
    Child B
        1st child of B
        2nd child of B
            1st child of 2nd child of B
            2nd child of 2nd child of B
        3rd child of B
    Child C
    Main literal pool
In other words, LD lays down the tree branch by branch. On each branch, sections are laid down in the order in which they are defined in the source code.
Literal pools are always laid down at the end of the relevant branch. This is shown in the above layout diagram. Child A has a literal pool of its own, so the literal pool is the last section laid down in the branch associated with Child A. Similarly, the main literal pool associated with the root is the last thing that is laid down in the segment.
LD may put several section trees into the same segment. For example, sections without parents are usually all put into the same segment. Trees are laid down in an unspecified order (effectively random). If you want to control the order in which sections are laid down in a segment, you have to use the parent mechanism.
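The layout order within one tree corresponds to a depth-first walk in which each literal pool is emitted after all the children of its section. The Python sketch below models this for the example layout; it is not LD's algorithm, and the tuple representation and the name `lay_out` are assumptions made for illustration.

```python
def lay_out(node, depth=0):
    """Return the layout order for a section tree.

    node is (name, pool_label_or_None, children); children appear in the
    order of their .section statements.
    """
    name, pool_label, children = node
    lines = ["    " * depth + name]
    for child in children:
        lines.extend(lay_out(child, depth + 1))
    if pool_label is not None:
        # A literal pool is the last thing laid down on its branch.
        lines.append("    " * (depth + 1) + pool_label)
    return lines


def leaf(name):
    return (name, None, [])


tree = ("Root", "Main literal pool", [
    ("Child A", "Literal pool for A",
     [leaf("1st child of A"), leaf("2nd child of A"), leaf("3rd child of A")]),
    ("Child B", None, [
        leaf("1st child of B"),
        ("2nd child of B", None,
         [leaf("1st child of 2nd child of B"), leaf("2nd child of 2nd child of B")]),
        leaf("3rd child of B"),
    ]),
    leaf("Child C"),
])

layout = lay_out(tree)
print("\n".join(layout))
```

The sketch covers a single tree only; the order in which separate trees are placed in a segment is unspecified, so it is not modelled.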
YAA offers a variety of pseudo-ops that specify important facts about your source code. These facts are also written into the object files produced by YAA if the system's object file format allows the recording of this information.
.title title_string
.title code,title_string
0 assigns the title to the assembly as a whole. This is the default if you don't specify a code.
1 uses the title as a heading for your listing.
2 uses the title as a subheading for your listing.
3 uses the title as a sub-subheading for your listing.
The .title statement lets you specify a title (or capsule description) of your source module. For example, you could use
.title "sqrt"
to label a square root routine. YAA places no restriction on the length of the title string, although the linker may restrict this length.
You can also use .title to specify a
heading, subheading, or sub-subheading for the listing of
an assembly. In this case, the title is not applied
to the source module itself; it just appears in the listing.
For more about listings, see Chapter 9 of the YAA
Reference Manual.
The .module statement lets you specify a
name for the object module produced as a result of the
assembly. YAA places no restriction on the length of the
module name, although the linker may restrict this length.
The .revision statement lets you specify
a revision name or number for the assembled source code, in
situations where you may have several versions of the same
code. YAA places no restriction on the length of the
revision string, although the linker may restrict this
length.
The .copyright statement lets you place
a copyright notice in the assembled source code. YAA places
no restriction on the length of the copyright string,
although the linker may restrict this length.
YAA has several pseudo-ops that let you issue diagnostic
messages.
The .error statement causes an assembly
error. The given msg string is printed as part
of the diagnostic message associated with the error. For
example, you might have
If the value of SYMBOL is not one of the
recognized ones, YAA executes the .error
statement.
If YAA encounters an .error statement,
the assembly is marked as erroneous and does not produce a
useful object file. However, YAA continues to process the
source code, and may report additional errors in the
assembly. If you requested a listing, YAA continues to
produce the listing after .error.
The .abort statement stops assembly
immediately after writing out the specified diagnostic
message. Because .abort makes the
assembler stop immediately, YAA does not produce a listing,
no matter what options you may have specified for the
assembly.
The .warning statement displays the
msg string as part of a diagnostic message. YAA
then continues assembling source code normally.
The .comment statement displays the
msg as part of a diagnostic message, on the
terminal and in any listing that might be produced. YAA
then continues assembling source code normally.
The format of the output message is
A comment is similar to a warning (generated by
.warning) but simply provides information.
It does not warn of a possible problem.
The .print statement evaluates
expr, then displays the value on the terminal and
in any listing that might be produced. This evaluation
takes place on the assembler's first pass.
The format of the output message is
If the given expr has a string value, the
value is displayed inside double quotes. This is different
from what's done with similar statements like
.warning and .comment,
where string values are displayed without double quotes.
Note that .print works on the
assembler's first pass through the source code. This means
that assembler variables will be evaluated with whatever
value they have at the time. There is also a
.print2 statement which works on the second
pass through the source code; at this point, assembler
variables will have whatever values they had at the end of
the assembler's first pass.
The .print2 statement evaluates
expr, then displays the value on the terminal and
in any listing that might be produced. Evaluation takes
place on the assembler's second pass.
The format of the output message is
If the given expr has a string value, the
value is displayed inside double quotes. This is different
from what's done with similar statements like
.warning and .comment,
where string values are displayed without double quotes.
Note that .print2 works on the
assembler's second pass through the source code. This means
that assembler variables will have whatever values they had
at the end of the assembler's first pass. There is also a
.print statement which works on the first
pass through the source code; at this point, assembler
variables will have whatever value they have during the
pass.
The .wrapup statement lets you "clean
up" after an assembly. After all code has been assembled,
YAA invokes the specified opcode macro without arguments, in
order to perform any clean-up operations you want.
A program may have any number of .wrapup
statements. Macros specified in .wrapup
statements are invoked in the reverse of the order in which
they were specified (last in, first out).
If there are several .wrapup statements
specifying the same macro name, only the first is
significant. The rest are quietly ignored.
The opcode macro specified in a .wrapup
statement need not be defined at the time the
.wrapup statement is encountered; YAA
simply records the macro's name. The macro must be defined
at the time the assembly terminates. All expressions in the
macro must be immediate expressions (relative to the end of
the assembly).
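The ordering rules for .wrapup can be summarized with a small Python sketch. It models only the order of invocation (duplicates ignored after the first request, last in first out); the function name `wrapup_order` and the macro names are hypothetical.

```python
def wrapup_order(requests):
    """Return the order in which wrap-up macros would be invoked.

    requests -- macro names in the order their .wrapup statements appear
    """
    registered = []
    for name in requests:
        if name not in registered:   # only the first request for a name counts
            registered.append(name)
    return list(reversed(registered))  # last in, first out


print(wrapup_order(["close_tables", "emit_summary", "close_tables", "check_sizes"]))
# ['check_sizes', 'emit_summary', 'close_tables']
```

The repeated request for the first macro does not move it later in the registration order, so it is still invoked last.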
Every section has an associated inhibit value.
This is set by the .inhibit pseudo-op. For
example,
The interpretation of an inhibit value is system-dependent.
For example, on the DPS-8 an inhibit value of 1
turns on the inhibit bit in a hardware instruction, while an
inhibit value of 0 turns it off.
When a section is created, its default inhibit value is 0
(implying no inhibit).
If you leave a section then return, the inhibit value of
the section is the same as when you last left it. To change
the inhibit value, you must issue an explicit
.inhibit statement.
If a .inhibit statement has a label
field, the symbol in the label field is assigned the
previous value of the hardware inhibit. For example,
Names on a given system may not match the rules governing
names in YAA. For example, a system may allow symbols to
have names that contain a "*" character, even though YAA
does not.
The .equate directive gets around this
problem by letting you associate the valid YAA name
asmname for the invalid name realname.
For example,
The .equate statement must appear before
the first occurrence of the asmname it defines.
The realname in an .equate
statement may not contain '\0' characters (octal
000, the ASCII NUL).
Note that the above .equate statement is
not the same as
The .equate directive can also be used
in cases where a symbol name might conflict with a YAA
reserved word. This sort of conflict should seldom happen,
because YAA's keywords are only reserved in restricted
contexts.
The .synonym pseudo-op defines
asmname as a synonym for a (contextual) keyword.
For example, consider:
The first lets you enter the keyword du in
either upper or lower case. The second lets you use the
symbolic name P.RET in place of p4
(for pointer register 4).
.synonym may also be used to define
synonyms for normal opcodes (e.g., lda,
tra), but not for YAA pseudo-ops (e.g.,
.data, .if).
The difference between .synonym and
.define is that .synonym
only does the replacement within contexts where the keyword
is reserved. For example, suppose you have a label named
DU. With
.synonym has a feature designed to
support extensions to the hardware. For example, suppose
that new hardware comes out with opcodes that are not
currently recognized by YAA. Ultimately, YAA will be
updated to recognize the new codes; in the meantime,
however, you can work around the problem using
.synonym.
To use .synonym for this purpose, you
use the format
Once you have specified a new opcode in this way, you can
use it in source code. For example, the following declares
a new opcode that is similar to a tra
instruction, then uses that opcode in an assembler
statement:
You should note the limitations of this operation. For
example, suppose new hardware contains a new vector opcode.
If the new opcode follows the same model as existing vector
opcodes, you will have no problem defining it (in terms of
one of those existing codes). However, if the new opcode
follows a different model, the assembler will not know how
to handle the arguments of the new opcode.
You can use the same technique for defining new tags:
Alters are instructions which change lines found
elsewhere in the source code. They are usually used to
record patches made after software has been released.
For example, suppose you make an emergency bug fix to
code that is already in production use. You may want to
keep the original code intact as a record of the "official"
release, then supply alters that put in the bug fix.
Typically, you would put all the alters into a single file,
then use .include statements to apply those
alterations to the relevant source files. For example,
suppose that the file test.y contains
YAA only makes alterations if the alter pseudo-ops appear
before the text they are supposed to alter. As a result,
you should .include a file containing
alters as one of the first instructions in the source code.
It's important to note the difference between alters in
YAA and patches made to machine code. For example, if you
are using patches to delete machine code instructions, you
usually replace the instructions with an appropriate number
of no-op instructions, or else put in an instruction to jump
around the code you want to get rid of; the size of the
program doesn't actually change. However, if you use an
alter to delete source code, the code is actually removed
from the text that YAA assembles, so the program actually
does get smaller.
The sections that follow describe the constructs that can
be used to make alterations.
The .alter_append construct tells YAA to
append one or more statements after line
number in file. For example,
The file name may be omitted from the
.alter_append directive. When YAA finds a
construct of the form
For example, if test.y includes
alts.y and alts.y contains an
.alter_append of this form, the alterations
are made in test.y. If the construct is in a
file that was not included (e.g. if the construct is found
right in test.y) the alterations are applied in
the file that contains the .alter_append
instruction.
The .alter_delete directive deletes
lines from a specified file, beginning at line number
startline and ending with line number
endline. For example,
The file name may be omitted from
.alter_delete. If so, the situation is
similar to that of .alter_append. YAA
deletes the line(s) from the file that includes the file
that contains the .alter_delete
instruction. If the instruction appears in a source file
that has not been included by another file, YAA deletes the
line from the current file.
The .alter_change construct substitutes
statements in the specified file, beginning at line number
startline and ending at line number endline.
For example,
If only one line number is supplied, the given
statements are put in place of that line.
You may omit the file name on the
.alter_change instruction. In this case,
the change is made in the file that includes the file that
contains the .alter_change. If the current
file was not included by another file, the change is made in
the current file.
YAA has sophisticated facilities for producing listings
of source input. In this chapter, we examine these
facilities.
In order to produce a listing, you must specify the
option
The command line option
The command line option
Bear in mind that all of the above are command line
options, specified when you assemble your source code. They
do not appear in the source code itself.
The .list pseudo-op controls the format
of output listings. Source code may contain any number of
.list pseudo-ops, to change listing options
in the middle of assembly. For example, if you only want a
listing for one section of a source module, you can use a
.list pseudo-op to start up a listing at
the beginning of the section, then use another
.list pseudo-op to turn off the listing at
the end of the section. When the listing is produced, it
will display the code that was assembled between the two
pseudo-ops.
NOTE: As we mentioned in the last section, the YAA
command line must contain a List=file option to obtain a
listing. .list pseudo-ops have no effect
if source code is being assembled without a
List=file option.
The .list pseudo-op has the form
We will discuss the various listing options shortly. For
the moment, we will discuss a few more features of
.list. If a label is specified for the
.list pseudo-op, as in
If the form of the statement is
You should take care that the expression for
.list does not contain any errors. For
example, a typo in entering a listing option could create an
undefined variable name, which is automatically given the
value 0. The result of this will often be to
turn off all listing options and kill the rest of your
listing.
The .list function returns the current
listing code value. For example,
In the sections to come, we will give examples of how the
.list function may be used productively.
Listing options are represented by bits in an integer
code. Each bit may be referenced by a predefined YAA
variable whose name begins with .l_. For
example, the variable .l_macro represents the bit
that controls whether or not the listing will show the
expansions of opcode macros.
If options indicate that any part of an input line should
be printed, the whole line is printed. For example, if
there are several commands on a single line and one of those
commands is a type that should be printed, the entire line
is printed.
As noted in an earlier chapter, a statement may be
continued from one line to the next by putting a backslash
on the end of the line. If any line in such a continued
statement is printed in the listing, the whole continued
statement is printed.
You may find that listings become hard to read if you put
more than one statement on a line, especially when this is
combined with continued statements. This is because of the
interplay between the various options that determine what is
and isn't listed. For the greatest readability of listings,
avoid multiple statements on the same line.
Note that you may still run into some odd situations when
a line contains a literal. A statement inside the literal
may cause the whole line to be printed, even though the
statement that contains the literal is not one of the types
that is currently being printed.
In the descriptions that follow, we give the name
associated with each code bit. Since bit positions may
change from one release to the next, the actual bit
positions are not documented. You should always refer to
code bits using their symbolic names, not their actual
values.
Listing options break up naturally into several classes.
We will describe these classes separately.
The first class consists of options related to the origin
of source code.
Note that the .list statement above
contained the notation other_options. As you
will see in the next section, you need to turn on some
"statement use" options as well as some "listing source"
options; otherwise, you get no output.
When YAA is deciding whether or not to print a source
line, it checks the appropriate listing source option first.
If the source option is off, the input line is not printed,
no matter what other options might be relevant.
The second class of options controls the kind of
statements that appear in the listing.
Note that there are several options that relate to opcode
macros. .l_show_def shows the definition of a
macro. .l_macro_src records the macro
definition. .l_invoke shows statements that
invoke macros. .l_macro shows lines generated by
the invocation of a macro; for it to work,
.l_macro_src must have been turned on when the macro
was defined. For example, if both .l_invoke and
.l_macro are on, you see the statement that
invokes an opcode macro, followed by the code that was
generated when the macro was expanded.
The third class of options controls the format of the
listing.
Cross-reference information takes up a great deal of
memory. If the assembler runs out of memory because of
the presence of cross-reference information, YAA turns
off the option and discards all cross-reference
information that has been collected up to this point.
(YAA outputs a warning message if it is forced to do
this.)
Absolute line numbers are only assigned to lines
that would be printed because of the listing source
options. For example, if you are not listing the
contents of .include files, the
contents are not assigned absolute line numbers. On
the other hand, lines that are not displayed because of
other kinds of options are still assigned
absolute line numbers. For example, even if comments
are not being printed, they are still assigned absolute
line numbers. In this way, a jump in the absolute line
numbers gives a hint that the input contained lines
that are not shown in the listing.
The fourth class of options controls how alters appear in
the listing.
By default, the following options are turned on.
The .forcelist pseudo-op overrides the
behavior of .list. It is for very
specialized debugging purposes and should be avoided in
everyday use.
To understand the use of .forcelist,
consider a macro that contains its own
.list statements (e.g. to suppress the
display of items created in the macro expansion). Debugging
such a macro would be difficult, since the action of the
macro itself affects the output listing.
This is where .forcelist comes in handy.
.forcelist does not change the current
listing code as used by the .list pseudo-op
and returned by the .list function, so all
such statements continue working as normal. However, when a
listing is actually produced, the listing options will be
those set up by .forcelist rather than
those dictated by .list.
As an example of .forcelist, consider
The .eject statement tells YAA to start
a new page in the listing (if a listing is being produced at
the time). For example, you might put an
.eject at the beginning of each function so
that each function starts on a new page.
.eject does not start a new page if it
appears in a file that is not being listed (e.g. an
.include file). If there is an
.eject in a macro, invoking the macro
causes the .eject even if the code
generated by the macro is not being listed.
In Chapter 8, we discussed the
.title pseudo-op and how it could be used
to record a title in the object output. It also has an
effect on the listing. If you say,
Like .eject, .title has
no effect if it appears in a file that is not being listed.
If a .title statement appears in an
include file and the include file is being listed, the new
title is used for the remainder of the include file. When
the assembler returns to the original source file, it
reverts to the previous title.
YAA lets you create subtitles with
.title. This is done by putting a number
before the string, as in
8.10.2 Naming the Object Module:
.module
Use:
.module name
Where:
Description:
8.10.3 Revision Names or
Numbers: .revision
Use:
.revision name
Where:
Description:
8.10.4 Copyright Notices:
.copyright
Use:
.copyright notice
Where:
Description:
8.11 Issuing Messages
8.11.1 Error Messages:
.error
Use:
.error msg
Where:
Description:
.if SYMBOL=1
# statements
.elseif SYMBOL=2
# statements
.elseif SYMBOL=3
# statements
.else
.error "Invalid value for SYMBOL"
.endif
8.11.2 Abort Messages:
.abort
Use:
.abort msg
Where:
Description:
8.11.3 Warning Messages:
.warning
Use:
.warning msg
Where:
Description:
8.11.4 Other Diagnostics:
.comment
Use:
.comment msg
label: .comment msg
Where:
Description:
filename,linenumber: label: msg
if a label is specified. Otherwise, the format
is
filename,linenumber: comment: msg
8.11.5 Displaying Values (First
Pass): .print
Use:
.print expr
label: .print expr
Where:
Description:
filename,linenumber: label: expr
if a label is specified. Otherwise, the format
is
filename,linenumber: print: expr
8.11.6 Displaying Values (Second
Pass): .print2
Use:
.print2 expr
label: .print2 expr
Where:
Description:
filename,linenumber: label: expr
if a label is specified. Otherwise, the format
is
filename,linenumber: print: expr
8.12 Assembly Clean-Up:
.wrapup
Use:
.wrapup macroname
Where:
Description:
8.13 Hardware Inhibits:
.inhibit
Use:
oldvalue: .inhibit newvalue
Where:
Description:
.inhibit 1
sets the inhibit value of the current section to 1.
X: .inhibit 1
assigns X the old inhibit value.
8.14 Naming Difficulties:
.equate
Use:
asmname: .equate realname
Where:
Description:
_name: .equate "*name"
states that your YAA source code uses _name to
stand for the actual symbol *name. When YAA
generates linking information, it uses *name as
required.
.define _name,*name
The .define pseudo-op works strictly
textually, so _name would immediately turn into
*name and an error message would be generated.
8.15 Keyword Synonyms and Hardware
Extensions: .synonym
Use:
asmname: .synonym keyword
Where:
Description:
DU: .synonym du
P.RET: .synonym p4
DU: .synonym du
the label is not changed because du is
only reserved in the operand field. On the other hand,
.define DU,du
replaces all occurrences of DU with
du, regardless of where the DU occurs.
Hardware Extensions
newop: .synonym oldop,bitpattern
where newop is the new opcode, oldop
is an old opcode that has a similar format, and
bitpattern is a word-length expression giving the
pattern of bits in an instruction containing the new opcode.
Note that the bit pattern should look like a complete
instruction; YAA will extract the relevant bits of the
opcode from this instruction.
tnew: .synonym tra,bitpattern
tnew address
newtag: .synonym oldtag,bitpattern
This says that the new tag is patterned on an existing tag,
but has the given bit pattern. In general, you can then use
the new tag in any context you could use the old tag. There
are, unfortunately, some exceptions. In particular, YAA
does validity checking on certain EIS instructions and only
recognizes old tags as valid. Even if you define a new tag
with a .synonym pseudo-op, the new tag will
not pass YAA's EIS validity checking.
8.16 Alters
Use:
.alter_append number,"file"
#statements
.endalter
.alter_delete startline,endline,"file"
.alter_change startline,endline,"file"
#statements
.endalter
Where:
Description:
.include "alts.y"
ldi 0,dl
adx2 2,du
tra 0,xl
stq 0,ic
and that the file alts.y contains
.alter_change 3,4,"test.y"
adx2 4,du
.endalter
.alter_append 3,"blah.y"
adx2 5,du
.endalter
When YAA assembles test.y it will find the
.include statement and read code from
alts.y. It will collect the alterations for
test.y from alts.y and use those
alterations to change the source code of test.y
before processing. YAA ignores any alterations that don't
apply to test.y (specifically, the
.alter_append that applies to the file
blah.y). Thus the same file can contain
alterations to many different source files.
Appending New Instructions
.alter_append 4,"test.y"
ldi 0,dl
.endalter
appends the given ldi instruction after line 4 in
test.y. Any number of statements may be supplied
between the .alter_append and
.endalter.
.alter_append number
statements
.endalter
YAA assumes that the statements should be
appended after the given line number in the file that
included the file that contained the construct.
Deleting Existing Instructions
.alter_delete 3,10,"test.y"
deletes lines 3 through 10 from test.y. If only
one line number is supplied, as in
.alter_delete 7,"test.y"
YAA only deletes that line.
Changing Existing Instructions
.alter_change 3,10,"test.y"
adx2 4,du
stq 0,ic
.endalter
replaces lines 3 through 10 in test.y with the
two given instructions.
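The effect of the three alter constructs can be modelled in a few lines of Python. This is only an illustrative sketch, not YAA's implementation: the function name apply_alters and the tuple encoding of alters are invented here, and the sketch assumes that every line number refers to the original, unaltered file and that alters do not overlap. It reuses the test.y and alts.y example given in this section.

```python
def apply_alters(lines, alters):
    """Return a new list of source lines with the alters applied.

    lines  -- original source lines (line 1 is lines[0])
    alters -- hypothetical encoding of the alter constructs:
              ("append", n, new_lines)
              ("delete", start, end)
              ("change", start, end, new_lines)
    """
    replaced = {}  # original line number -> replacement lines (may be empty)
    appended = {}  # original line number -> lines to add after it
    for alter in alters:
        kind = alter[0]
        if kind == "append":
            _, n, new = alter
            appended.setdefault(n, []).extend(new)
        elif kind == "delete":
            _, start, end = alter
            for number in range(start, end + 1):
                replaced[number] = []
        elif kind == "change":
            _, start, end, new = alter
            for number in range(start, end + 1):
                replaced[number] = []
            replaced[start] = list(new)  # new text takes the place of the range
    out = []
    for number, text in enumerate(lines, start=1):
        out.extend(replaced.get(number, [text]))
        out.extend(appended.get(number, []))
    return out


test_y = ['.include "alts.y"', "ldi 0,dl", "adx2 2,du", "tra 0,xl", "stq 0,ic"]
# alts.y: .alter_change 3,4,"test.y" with the single statement adx2 4,du
result = apply_alters(test_y, [("change", 3, 4, ["adx2 4,du"])])
print(result)
```

The altered text has four lines where the original had five, which illustrates the point made earlier: unlike a machine code patch, an alter that removes source lines really does make the assembled program smaller.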
9. Using YAA to Produce
Listings
9.1 Command Line Options
List=file
on the command line that invokes the YAA assembler. This
tells the assembler to write the listing to the specified
file. If this option is not present on the command line,
all other listing features will be inoperative.
PageSize=number
lets you control the number of lines per page in the
listing. The given number must be an integer greater than
or equal to 20.
LineSize=number
lets you control the number of characters per line in the
listing. The given number must be an integer greater than
or equal to 72. If your source code is using 80-character
lines, the LineSize of the listing should be at least 130;
otherwise, listing lines will have to be "folded" (split in
the middle). Folding makes output much harder to read.
9.2 Controlling Listings:
.list
Use:
.list code
Where:
Description:
.list code
The code is an integer value whose bits represent
listing options: if a bit in code is on, the
corresponding listing option is turned on; if a bit in
code is off, the corresponding listing option is
turned off.
X: .list value
the label is treated as a variable and assigned a code
integer that represents the previous listing options. For
example, in
old: .list new
...
.list old
the first statement saves the current options in
old and then sets the options to new.
The final statement restores the old listing options using
the old code.
label: .list
without any code, the current code value is assigned to
label. The listing options are not changed.
The .list Function
Use:
.list()
Description:
X: .set .list()
assigns the current listing code to X.
Listing Options
.list (.list() | .l_macro)
uses the .list function to obtain the
current listing code and then uses a bitwise OR operation
(|) to turn on the .l_macro bit. The
effect of the above instruction is to turn on the printing
of macro expansions, and to leave all the other options the
same.
.list (.list() && (~.l_macro))
turns off the printing of macro expansions and leaves all
the other options the same. It does this by using the
bitwise complement operator (~) to produce a
value that has every bit on except the .l_macro
bit, then bitwise ANDing this value with the current option
code.
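The bit arithmetic behind these two statements can be checked with a short Python sketch. The bit values below are hypothetical (the real YAA bit positions are deliberately undocumented), and the helper names turn_on and turn_off are invented for illustration; only the OR, AND and complement logic corresponds to the text.

```python
# Hypothetical bit assignments, for illustration only.
L_SOURCE = 0b001
L_OPCODE = 0b010
L_MACRO = 0b100


def turn_on(code, bit):
    """OR the bit into the code, leaving every other option unchanged."""
    return code | bit


def turn_off(code, bit):
    """AND the code with the complement of the bit, clearing only that bit."""
    return code & ~bit


current = L_SOURCE | L_OPCODE          # stands in for the value of .list()
with_macro = turn_on(current, L_MACRO)
without_macro = turn_off(with_macro, L_MACRO)
print(bin(current), bin(with_macro), bin(without_macro))
```

Turning a bit on and then off again returns the original code, which is why these idioms are safe to use without knowing which other options are currently set.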
Listing Source Options
For example, with
.list (.l_source | .l_loop) | other_options
the listing shows all statements that appeared in the
primary source file and all statements generated by looping
constructs. If the primary source file contains an
.include statement, the listing shows the
statement itself (because that was in the primary source
file) but does not show the source code obtained from the
include file (because .l_include is not on).
Statement Use Options
For example, suppose you have
.list (.l_source | .l_opcode | .l_invoke)
The listing shows all statements from the primary source
file that have true opcodes or macro invocations. It will
not show comments or pseudo-ops.
Output Format Options
.data 1,2,3
producing three words of data. With
.l_detail off, the listing only shows the
contents of the first data word produced; with
.l_detail on, the listing shows all data
produced. If .l_detail is off, YAA still
tries to show as much data as possible; for example, if
a line that produces many words of data is followed by
a blank line or a comment, YAA puts the first word of
data on the original statement, and displays more words
of data on the blank lines or comments that follow.
Thus in some circumstances, you see more than just one
word of data.
Alter Options
##alter: "name",number
where name is the name of the file that
contained the pseudo-op and number is the
line number where the pseudo-op appeared. The command
after the alterations has the form
##end alter
Defaults
.l_source .l_macro_src .l_show_def
.l_comment .l_if .l_list
.l_opcode .l_var .l_invoke
.l_pseudo .l_continuation .l_crossref
.l_relative .l_absolute .l_toc
.l_alter
Other options are turned off.
9.3 Overriding List Options:
.forcelist
Use:
.forcelist ANDlist,ORlist
Where:
Description:
.forcelist -1,.l_macro | .l_macro_src
The -1 is ANDed with the current settings, leaving
everything the same; the .l_macro and .l_macro_src bits are ORed
with the current settings, turning those bits on. Thus the
above .forcelist statement turns on the
.l_macro bit so that statements generated in
macro expansions will be printed. Even if a macro uses
.list to turn off the printing of macro
expansions, the expansion is still listed, because the
effects of .forcelist override
.list.
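The relationship between the two codes can be sketched in Python as follows. This is a model under stated assumptions, not YAA's actual logic: the bit values and the function name effective_options are hypothetical, and the sketch assumes the AND mask is applied before the OR mask, as the description of the example suggests.

```python
# Hypothetical bit assignments, for illustration only.
L_SOURCE = 0b0001
L_OPCODE = 0b0010
L_MACRO = 0b0100
L_MACRO_SRC = 0b1000


def effective_options(list_code, and_list, or_list):
    """Options used when the listing is produced.

    list_code itself is not modified, so .list statements and the
    .list function keep seeing the value they set.
    """
    return (list_code & and_list) | or_list


# A macro has used .list to switch off the printing of macro expansions.
list_code = L_SOURCE | L_OPCODE
forced = effective_options(list_code, -1, L_MACRO | L_MACRO_SRC)
print(bin(list_code), bin(forced))
```

An AND mask of -1 has every bit set, so it preserves whatever .list selected; the OR mask then adds the forced bits on top.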
9.4 Starting a New Page:
.eject
Use:
.eject
Description:
9.5 The .title Pseudo-Op
Revisited
.title "string"
the listing starts a new page and prints "string" as the
title at the top of that page. Each subsequent page has the
same string as title, until a new .title
statement is encountered.
.title 1,"heading"
.title 2,"sub-heading"
.title 3,"sub-subheading"
Numbers cannot be greater than 3. See the description of
.title in Chapter 8
for further details.
The main part of the listing displays source code and the data that it generates. It is divided into a left part and a right part.
The right hand side of the listing shows input code, as dictated by the listing options. Typically, lines may have both a relative and an absolute line number.
The right part of the listing has tab stops set every four spaces.
The left hand side of the listing indicates the material that is generated by the corresponding statements on the right hand side.
The left hand side begins with a field giving the current offset within the current section. This is basically the value that would be returned by .ic at this point.
The next field shows the first word of object code generated for the given statement. Underscores are used to break this up into logical pieces. For example, a generated machine instruction might be divided into one piece for the actual opcode and separate pieces for the various operands. Generated data is broken up into pieces according to a variety of heuristic rules aimed at figuring out the most intelligent method of division. If YAA cannot decide on a good method of division, the generated object output is not split up.
After this comes a list of relocations that may be applied to the generated object. Relocations have the form
Xnn
where X is an uppercase letter and nn is an integer.
Some generated code may have more than one associated relocation. The listing will only show the first four relocations, regardless of how many additional relocations there may be.
If the value that would normally be printed on the left part of the listing is too long to fit, it just won't be printed. For example, this will happen with long string constants.
If the .l_detail listing option is turned on, the left hand part of the listing will show all the code generated for the statements on the right hand part.
If the .l_detail listing option is turned off, the left hand part of the listing will only show the first word of output generated.
If no code is generated (e.g. by a .set pseudo-op that assigns a value to a YAA variable), the listing will try to show some intelligible value. For example, with .set, it shows the value assigned to the YAA variable. If you want to display some other value, you can use the .display pseudo-op (described in Section 9.6.4).
There is a special case when a statement generates more than one word of output and the statement is followed by one or more comment lines. In this case, the additional words of output are shown on the left hand side of the comment lines.
The YAA listing facilities can "backtrack" a maximum of three lines. For example, consider input of the form
macro(arg1,\
      arg2,\
      arg3)
When YAA reaches the end of the macro call, it can backtrack to the line where the macro started and put the appropriate left hand side on that line. However, if the macro call was
macro(arg1,\
      arg2,\
      arg3,\
      arg4)
YAA would read down to the end of the macro call before generating any code. When it figured out what should go on the left hand side of the listing, YAA could only backtrack three lines. Thus the left hand side output for the macro would begin on the arg2 line.
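The two cases can be expressed as a small calculation. The Python sketch below is inferred from the two examples in the text rather than taken from YAA itself: the function name left_side_line is hypothetical, and the sketch assumes that the three-line limit counts the line on which the statement ends.

```python
BACKTRACK_LIMIT = 3  # maximum number of lines the listing can backtrack


def left_side_line(first_line, last_line, limit=BACKTRACK_LIMIT):
    """Line of a continued statement on which the left hand side appears."""
    earliest_reachable = last_line - (limit - 1)
    return max(first_line, earliest_reachable)


print(left_side_line(1, 3))  # three-line call: first line (the macro line)
print(left_side_line(1, 4))  # four-line call: second line (the arg2 line)
```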
The value printed on the left of an .if directive is the value of the expression on the .if. An .if or .elseif that has been skipped will not have a value on the left; by checking which do and don't have values on the left, you can follow the flow of control.
If a macro invocation actually generates code or data, the left side of the listing will show the first code or data generated. If no code was generated, YAA normally does not put anything on the left side of the macro. However, the .display pseudo-op (described below) can be used to give YAA specific instructions on what should be displayed on the left hand side.
.display expression
The .display pseudo-op gives you some control over what is displayed on the left hand side of the listing. The pseudo-op tells YAA that you would like the value of the given expression displayed on the left hand side of the listing.
In situations where YAA has to decide what to display on the left hand side of the listing, .display takes priority over other values that YAA might display. For example, YAA normally uses the left hand side of the listing to show the value of an expression on an .if statement; however, if you specify a .display statement, this overrides the usual behavior.
.display is useful inside macros. If a macro actually generates code or data, the left side of the listing normally shows the generated code or data; but if the macro doesn't generate code or data, there is usually nothing on the left side of the listing. By putting a .display inside the macro, you can display a particular value beside the macro invocation.
After displaying the source code, the listing provides several information sections summarizing features of the source code. Information sections will all be put on one page if possible.
If you use .title statements to give titles to your listings, the information section page(s) will use the first heading created by a .title statement within the primary source file. Sub-headings and sub-subheadings are not used.
At present, the following information sections are provided:
The cross-reference index appears as the last part of the listing. It contains data that was collected during periods when the .l_crossref option was turned on.
An entry in the cross-reference index describes all the places where a particular symbol appeared. Entries are collected into groups:
Within each group, symbol names are sorted according to the collating order of the underlying character set. Using the ASCII character set, this means that names beginning with uppercase letters will precede names beginning with lowercase letters.
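The ASCII ordering can be seen with a quick Python check, since Python compares strings by character code. The symbol names here are made up for the illustration.

```python
# Hypothetical symbol names, chosen to show where each kind of
# leading character falls in ASCII order.
names = ["loop", "Buffer", "alpha", "Zed", "_tmp", "$buf"]
ordered = sorted(names)
print(ordered)
```

The dollar sign sorts first, then uppercase letters, then the underscore, then lowercase letters.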
Each entry describes the symbol's type. Possibilities include:
argument -- argument in macro definition
keyword -- opcode or standard symbol
local -- local variable in macro
symbol -- YAA variable or other symbol
built-in -- built-in symbol
Following the symbol's type, the listing gives the value of the symbol at the end of the assembly. This value will not be given if the symbol was undefined or out of scope at the end of the assembly. Some symbols may have a value of the form
<X>+Y
This means that the symbol refers to the location that is an offset of Y from the symbol with reference number X. The External Names Section gives the reference number of some symbols. Negative reference numbers refer to templates. Other reference numbers may be generated artificially (e.g. to refer to literals).
Each entry also lists the lines where the symbol appeared. If absolute line numbers are being used, absolute line numbers will be given. If relative line numbers are being used, the cross-reference will show the file name and the relative line number. If both types of line numbers are being used, the cross-reference will show both.
Each line number reference is followed by a character indicating how the symbol was used on that line:
If none of the above usages can be determined, a blank character is used.
If a name is a synonym for another symbol, the name will appear in the group that is appropriate to the original symbol. If A is a synonym of B which is a synonym of C, YAA will follow through the synonym "chain" until it finds the original name. All synonyms will be listed with this original name.
The local variables used in macros must be given unique names for each invocation of the macro. The reason for this is that it is possible for macros to "export" local variable names in various ways (so that the names become visible outside of the macro). Thus, every local variable is given a name of the form
.MACRONAME_N_I
where MACRONAME is the name of the macro that contains the variable, N says that this is the Nth local variable defined within the macro, and I says that this is the Ith invocation of the macro.
Names of this form will often appear in the cross-reference index. Various option listings may also make these names visible.
Error messages are always written to the terminal as the YAA assembler runs. They will also be written to the listing if a listing is being generated.
Errors are written to the listing at the point where the error is detected. This may not be the point at which the error actually occurred. A mistake may not become apparent until several statements after the mistake actually takes place.
YAA makes two passes through input: once to parse the source code and once to generate appropriate object code. If you are not creating a listing, YAA will stop assembling after 20 errors have been detected. If you are creating a listing, YAA can handle up to 200 errors on each pass. If you have more than 200 errors on the first pass, the listing will probably not be valid. If you have more than 200 errors on the second pass, the listing will simply stop when the error limit is exceeded; this means that you will not get a table of contents, cross-reference listing, etc.
When you compile a program with the system's C compiler, the compiler can store debugging directives in the object code. These directives can be examined by a symbolic debugger to obtain information about your program. Debugging directives record information that would otherwise not be present in the compiled code.
Unlike a compiler, the YAA assembler does not automatically generate debugging directives for your code. However, you can put explicit statements in your code that generate such directives. For example, you can use YAA statements to create named types that the debugger can later interpret as C data types. You might use this if you are using YAA to write a subprogram that will be called from a C program; by giving the YAA code the same kind of debugging directives that the C code has, you can analyze the YAA code with the same debugging tools that you use with C.
Debugging directives are generated using YAA pseudo-ops. The actual output created has the format of LD object code directives, explained in The LD Object Format Reference Manual. The rest of this chapter assumes that the reader is familiar with the LD object format as described in that manual.
name: .type op op op ... known_type
The .type pseudo-op defines a data type by generating an LD_DEFTYPE directive. This defines a data type in terms of a type that is already known. A known type is one that has been defined by a previous .type statement, or else one of the following keywords:
.struct -- structure type
.union -- union type
.enum -- enumerated class
.void -- the C void type
.char -- C char (signed) type
.lchar -- C long char type
.short -- C short int type
.int -- C int type
.long -- C long type
.uchar -- C unsigned char type
.ulchar -- C unsigned long char
.ushort -- C unsigned short type
.uint -- C unsigned int type
.ulong -- C unsigned long type
.float -- C float type
.double -- C double type
.ldouble -- C long double type
.label -- statement label
.block -- block label
As an example,
X: .type .int
defines a type with the name X and the type int.
A .type pseudo-op may define several names for the same simple type, as in
[NAME1,NAME2,...]: .type .int
However, if the type is a typedef, a structure, a union, or an enum class, there can only be one name.
Type operators are used in .type pseudo-op statements to create derived types. For example, the ptr operator creates pointer types.
IPTR: .type ptr .int
declares IPTR as a type which is a pointer to int values. This is similar to the C statement
typedef int *IPTR;
All .type statements are similar in purpose to C typedef statements. However, if you want to generate a debugging directive that corresponds to a true C typedef declaration, you have to use the typedef operator (discussed below).
The following operators are currently supported:
ptr -- pointer
const -- C const modifier
volatile -- C volatile modifier
far -- C far modifier
near -- C near modifier
huge -- C huge modifier
func -- function (see below)
field -- bit field (see below)
typedef -- C typedef
incomplete -- incomplete type
"array" -- see below
Note that "array" is not an operator; the form of an array operator is described later on.
Function types in .type pseudo-ops may be specified by the operator func followed by a list of argument types, in parentheses, followed by the type of the return value:
NAME: .type func (argtypes) result_type
For example,
FUNC: .type func (.int,.double) .int
says that the FUNC type of function takes an int argument and a double argument, and returns an int value. Argument and return value types take the usual form: op op op ... known_type, as in
IPTR: .type ptr .int
IFUNC: .type func (IPTR,.int) IPTR
The IFUNC function returns a pointer to an integer. It takes two arguments: one a pointer to an integer and one a normal integer. This corresponds to the C declaration
typedef int *IFUNC(int *,int);
If a function takes no arguments, just use empty parentheses, as in
X: .type func () .int
which takes no arguments and returns an int value. If the argument types are unspecified or unknown, omit the parentheses, as in
Y: .type func .int
which returns an int value and has unspecified arguments.
The "..." notation of ANSI C can be used for functions which take a list of arguments with indeterminate numbers or types. For example,
PRINTF_TYPE: .type func (ptr .char,...) .int
defines a function type that takes a char pointer plus an indeterminate number of other arguments, and returns an int. The format
func (...)
is allowed, when there are no fixed arguments.
The two formats for defining a bit field type are
field (expr) .int
field (expr) .uint
where expr is any expression yielding an integer value. This gives the number of bits in the bit field. Only int and unsigned int bit fields are supported.
If the expr giving the length of the bit field is a single integer constant, the parentheses may be omitted.
Array operators may have three forms. The first form is:
[number]
This indicates an array with the given number of elements. The lowest subscript is assumed to be zero. For example,
IARRAY: .type [10] .int
defines a type named IARRAY. This type corresponds to an array of ten integers, numbered 0 through 9.
The second array operator form is
[lower : upper]
where lower is an expression giving the lower bound of array subscripts and upper is an expression giving the upper bound of array subscripts. For example,
FORTRAN: .type [1:10] .int
defines a type named FORTRAN. This type corresponds to an array of ten integers, numbered 1 through 10.
The final array operator form is
[lower : upper : stride]
where lower is an expression giving the lower bound of array subscripts, upper is an expression giving the upper bound of array subscripts, and stride is an expression indicating the distance between the starts of two adjacent elements. For example, in a Pascal packed array of char, characters are placed in adjacent bytes, while in an unpacked array of char, characters are distributed one per machine word. Using a stride also lets you specify a "slice" of a multi-dimensional array (e.g. a column from a matrix stored in column major order).
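As a rough illustration of what the three values mean, the following Python sketch computes the offset of each element from the start of an array. It is not part of YAA. It assumes that the stride is measured in bits and that a packed character occupies 9 bits while a word is 36 bits; the manual does not state the unit of the stride, and the name element_offsets is made up for this example.

```python
def element_offsets(lower, upper, stride):
    """Map each subscript to its offset from the start of the array."""
    if upper < lower:
        raise ValueError("upper bound must not be less than lower bound")
    return {i: (i - lower) * stride for i in range(lower, upper + 1)}


# Assumed units: bits. A packed array of char would use a stride of 9,
# an unpacked array (one character per word) a stride of 36.
print(element_offsets(1, 4, 9))    # {1: 0, 2: 9, 3: 18, 4: 27}
print(element_offsets(1, 4, 36))   # {1: 0, 2: 36, 3: 72, 4: 108}
```

The number of elements is always upper - lower + 1, whatever the stride.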
The operators in a .type pseudo-op are parsed strictly left to right. For example,
X: .type [10] ptr .char
defines a type which is an array of 10 pointers to char values. This corresponds to the C declaration
typedef char *(X[10]);
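The left-to-right rule can be pictured with a small Python sketch that reads a .type operand one token at a time and builds an English description. It handles only a few operators, assumes array bounds are written without embedded spaces, and the function name describe is invented for this illustration.

```python
# Phrases for a few of the operators; the rest are omitted for brevity.
PHRASES = {
    "ptr": "pointer to",
    "const": "const",
    "volatile": "volatile",
}


def describe(type_expr):
    """Read a .type operand left to right and describe it in English."""
    words = []
    for token in type_expr.split():
        if token.startswith("[") and token.endswith("]"):
            words.append(f"array {token} of")      # array operator
        elif token in PHRASES:
            words.append(PHRASES[token])           # ordinary operator
        else:
            words.append(token.lstrip("."))        # the known type
    return " ".join(words)


print(describe("[10] ptr .char"))   # array [10] of pointer to char
print(describe("ptr [10] .char"))   # pointer to array [10] of char
```

Swapping the order of the operators changes the meaning of the type, which is why the order in which they are written matters.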
To define a type as a structure, union, or enumerated class, you simply use
NAME: .type .struct
NAME: .type .union
NAME: .type .enum
This treats the specified name as if it were the tag of a structure, union, or enumerated class. Note that no modifiers are allowed before the keywords .struct, .union, or .enum. Later in this chapter, we will show how to define the contents of a structure, union, or enumerated class.
To create a typedef type, create the type first and then create the typedef itself. For example, to do the equivalent of
typedef int *IPTR;
use
tmp: .type ptr .int
IPTR: .type typedef tmp
or equivalently
IPTR: .type typedef ptr .int
Several functions can be used to work with named types: .sizeof, .alignof, and .typecompare.
.sizeof(type)
.sizeof(.int)
.sizeof(IPTR)
.sizeof(ptr .int)
The .sizeof function returns an unsigned integer giving the size of the given type in bits. Note that this is different from the C sizeof operator, which gives size in bytes.
.alignof(type)
.alignof(.int)
.alignof(IPTR)
.alignof(ptr .int)
The .alignof function returns an unsigned integer giving the alignment of the given type in bits. For example, if a type must be aligned on a double-word boundary on DPS-8 machines, the result of .alignof will be 72.
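Because .sizeof and .alignof report bits, results usually have to be converted to words before they are used to reserve space. The Python sketch below shows the arithmetic, assuming the 36-bit word implied by the double-word figure of 72 above. The helper name bits_to_words is hypothetical, and the sketch rounds up to whole words, which is a choice made here rather than something the manual prescribes.

```python
BITS_PER_WORD = 36   # assumed from "double-word boundary ... 72"


def bits_to_words(bits):
    """Number of whole words needed to hold the given number of bits."""
    return -(-bits // BITS_PER_WORD)   # ceiling division


print(bits_to_words(72))   # 2
print(bits_to_words(36))   # 1
print(bits_to_words(9))    # 1
```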
.typecompare(type1,type2)
The .typecompare function compares two types to see if they are equal. The function returns a 1 if the types are equal and 0 if they are not. For the purposes of this function, two types are equal if they are constructed from the same sequence of operators and built-in types. Thus, in
X: .type .int
XPTR: .type ptr X
IPTR: .type ptr .int
XPTR and IPTR would be equivalent for the purposes of .typecompare, since they can be worked back to the same sequence of operators and built-in types. A typedef is not considered equivalent to the underlying type. Thus
X: .type .int
XD: .type typedef .int
are considered different types for the purposes of .typecompare.
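The comparison rule can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not YAA's implementation: named types are expanded until only operators and built-in types remain, and typedef is kept as an ordinary operator so that it never compares equal to the underlying type. The table named and the helper expand are invented for the example, and only three built-in types are listed.

```python
BUILTINS = {".int", ".char", ".double"}   # small subset, for illustration


def expand(type_expr, named):
    """Reduce a type expression to a tuple of operators and a built-in type."""
    *ops, base = type_expr.split()
    if base in BUILTINS:
        return tuple(ops) + (base,)
    # Otherwise base is a named type: substitute its definition.
    return tuple(ops) + expand(named[base], named)


def typecompare(type1, type2, named):
    """Return 1 if both types expand to the same sequence, else 0."""
    return 1 if expand(type1, named) == expand(type2, named) else 0


named = {
    "X": ".int",
    "XPTR": "ptr X",
    "IPTR": "ptr .int",
    "XD": "typedef .int",
}
print(typecompare("XPTR", "IPTR", named))   # 1
print(typecompare("X", "XD", named))        # 0
```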
name: .scope parent,options
[X,Y,Z]: .scope parent,options
.extern stands for the external scope. This is the scope that contains all external data objects and functions. In C, this corresponds to extern data objects and non-static functions.
root_scope=>>yes which turns on the LF_ROOT_SCOPE flag,
root_scope=>>no which turns off the LF_ROOT_SCOPE flag,
same_frame=>>yes which turns on the LF_SAME_FRAME flag, and
same_frame=>>no which turns off the LF_SAME_FRAME flag.
The .scope pseudo-op defines a new scope in the source code. YAA generates appropriate LD_DEFVLIST and LD_SCOPEFLAGS directives to store this information in the object code.
The name space of scopes is different from named types and normal variables. Thus you can have named types, normal variables, and scopes that all have the same name.
Defining a structure type automatically makes the structure into a scope. For example, the instruction
X: .type .struct
creates a scope for X.
name: .object options
type=>>name which specifies a type for the object. name can be a built-in type name, a named type defined in a previous .type pseudo-op statement, or a type expression. If this option is not specified, YAA checks whether there is a named type with the same name as this object and, if so, uses that type. Otherwise, a default type of .void is used.
class=>>keyword specifies a storage class for the object. Possible keywords are:
extern -- external
static -- static
auto -- auto
register -- register
arg -- function argument
s_elem -- structure element
u_elem -- union element
e_elem -- enumerated class element
display -- display
If no storage class option is specified, YAA chooses intelligent defaults depending on the context. We will discuss this in greater detail shortly.
name=>>"string" specifies a name to be put into the LD_DEFVAR directive. If this is different from the name that labels the .object pseudo-op, the debugging directive is given the name specified with name=>> and the YAA source code uses the name given in the label. If no name=>> option is specified, YAA uses the name given in the label.
scope=>>parent specifies the name of the scope that contains the object. This can be a scope named in a .scope directive, or the built-in scope .extern. If this option is not given, YAA uses the scope of the name given in the label, if the label is associated with a location that has a scope.
The .object pseudo-op describes the functions and variables of a program. It generates an LD_DEFVAR directive.
.object only generates an appropriate debugging directive. It does not generate space for the object being described. Thus you usually need a .space directive for an object as well as an .object directive generating the debugging directive.
Objects have a different name space from named types, scopes, and normal variables. In fact, there are good reasons to define objects with the same name as a named type, as we will see shortly.
When a program issues an .object statement for a symbol that already exists in the program, the .object statement effectively associates attributes with the existing symbol. For example, consider
.align .alignof(.double)
X: .space .sizeof(.double)
X: .object type=>>.double
In this, the symbol X is given the alignment and space of a double object. The .object statement gives X the double type attribute, and generates an appropriate LD_DEFVAR directive for X.
Note that X could be used as a double value even if we didn't use .object to give it an explicit double type. However, the .object directive serves two purposes: it generates a debugging directive, and it provides type-checking information.
In a similar way, .object statements can associate other attributes with existing symbols: storage class, scope, names, etc.
If a storage class is not specified in the .object statement, YAA supplies a reasonable default. The default depends on the context. For example, consider the C declaration
struct complex { float X; float Y; } Z1;
You could create the same sort of data object with these declarations:
complex: .template
complex: .type .struct
complex: .object
X: .type .float
X: .object
X: .space .sizeof(X)/36
Y: .type .float
Y: .object
Y: .space .sizeof(Y)/36
.section
Z1: .type complex
Z1: .space .sizeof(complex)/36
Z1: .object
Note that each structure element is defined with a .type, .object, and .space statement. We divided the .sizeof results by 36, since .sizeof returns sizes in bits but we wanted to reserve space in terms of words. Since the .object statements do not explicitly specify a type, YAA looks for a type with the same name as the object. As a variation, here is the same thing, with types specified explicitly in the structure element definitions.
complex: .template
complex: .object type=>>.struct
X: .object type=>>.float
X: .space .sizeof(X)/36
Y: .object type=>>.float
Y: .space .sizeof(Y)/36
.section
Z1: .space .sizeof(complex)/36
Z1: .object type=>>complex
.space directives are required to reserve space for the structure elements.
Because the definitions of X and Y occur inside a template with a struct type, they are automatically given the "structure element" storage class. If they were in a normal section and the storage class was not specified, YAA would give them the storage class extern if there is a SYMDEF or SYMREF for the symbol, and static otherwise.
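The default just described amounts to a two-step decision, sketched here in Python. Only the cases stated above are covered (a template with a struct type, and a normal section); the function and parameter names are made up for this illustration, and other contexts such as unions are deliberately left out.

```python
def default_storage_class(in_struct_template, has_symdef_or_symref):
    """Default storage class when the .object statement gives no class option."""
    if in_struct_template:
        return "s_elem"                  # structure element
    # Normal section: external only if the symbol has a SYMDEF or SYMREF.
    return "extern" if has_symdef_or_symref else "static"


print(default_storage_class(True, False))    # s_elem
print(default_storage_class(False, True))    # extern
print(default_storage_class(False, False))   # static
```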
If a .space statement has a label that matches the label of a previous .object or .type statement, the default space reserved has the size and alignment of the given object. Thus we could have written
X: .object type=>>.float
X: .space
since the .space statement automatically gives X the size and alignment of a float object.
When you omit the size in a .space statement, the type of the object must be known at the time of the .space statement, since .space needs to know how much space to reserve. Otherwise, you have freedom to arrange .type, .object, and .space statements in whatever order you choose.
Code of the form
NAME: .template
NAME: .object type=>>.struct
creates an implicit scope variable named NAME that holds the implicit scope created for the structure. The parent of the scope variable NAME is .extern by default. You can specify a different scope with a scope=>> option on the .object pseudo-op.
Note that the scope of the symbol NAME is the parent of the implicit scope created for the structure.
As another example, consider the code
T: .template
T: .scope .extern
A: .object type=>>.int
A: .space
The scope of the symbol A is not defined, so YAA looks for a scope variable named A. In the above code, there isn't one. YAA next looks to see if the section or template in which A is defined has a scope variable. This is the variable T. The scope of A will therefore be the parent of T. If you want the scope of A to be T itself, you must specify this explicitly with a scope=>>T option on the .object pseudo-op.
If the enclosing section or template does not have a scope variable either, the default scope of A is just .extern.
We emphasize that the scope of a symbol like A is the parent of the scope created by the .scope directive for the enclosing section or template.
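The lookup order described above can be summarized with a short Python model. It is a sketch of the rules as stated in this section, not of YAA's internals: the dictionary scope_parent (mapping each scope variable to its parent scope) and the scope name "outer" are hypothetical.

```python
def default_scope(symbol, explicit, scope_parent, enclosing):
    """Work out the scope that an .object statement gives to a symbol.

    symbol       -- the label on the .object statement
    explicit     -- the value of the scope option, or None if omitted
    scope_parent -- maps each scope variable name to its parent scope
    enclosing    -- name of the enclosing section or template, or None
    """
    if explicit is not None:
        return explicit                      # scope option wins
    if symbol in scope_parent:
        return scope_parent[symbol]          # parent of the symbol's own scope
    if enclosing in scope_parent:
        return scope_parent[enclosing]       # parent of the enclosing scope
    return ".extern"                         # nothing found


scopes = {"T": "outer"}                      # as if: T: .scope outer
print(default_scope("A", None, scopes, "T"))   # outer
print(default_scope("A", "T", scopes, "T"))    # T
print(default_scope("A", None, {}, "T"))       # .extern
```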
The previous section showed a sample structure declaration. In this section, we show a sample union declaration. We will use the union
union type { char c; int i; double x; };
The corresponding YAA code is
type: .template word
type: .object type=>>.union
c: .object type=>>.char
c: .space
.origin type
i: .object type=>>.int
i: .space
.origin type
x: .object type=>>.double
x: .space
Note that we used .origin statements to go back to the beginning of the type template section.
As mentioned earlier, the .object statement normally generates LD_DEFVAR directives to define a variable. However, if YAA finds a scope variable defined with the same name as that on the .object statement, YAA assumes this must be a scope owned by the object and issues an LD_SCOPEVAR instead of LD_DEFVAR. For example,
function: .scope .extern
function: .object type=>>(.int,.int) .int
is the usual way of declaring a function. It shows that the function has external scope and the prototype
int function(int,int);
Since the scope of function is not defined on the .object statement, YAA looks for a scope variable of the same name. Since there is one, YAA issues an LD_SCOPEVAR directive for the .object statement instead of an LD_DEFVAR. The scope function refers to the scope owned by the symbol function, and the scope of the symbol function is the parent of the scope created by the .scope directive.
name: .element class_name,value
The .element pseudo-op defines an element of an enumerated class. For example, the C declaration
enum sample { elem1, elem2 = 10 };
would correspond to the YAA statements
sample: .template
sample: .object type=>>.enum
elem1: .element sample
elem1: .object
elem2: .element sample,10
elem2: .object
Notice that the elements have both an .element directive to tell which class they belong to, and an .object directive to generate an LD_DEFVAR.
.line line_no,stat_type
The .line pseudo-op shows how source code is broken into text lines. It generates an LD_LINETAB directive.
The possible values for the stat_type argument are:
expr -- expression
assign -- assignment
break -- break statement
continue -- continue statement
goto -- goto statement
if -- if statement
endif -- end of if
else -- else clause
endelse -- end of else-if
while -- while statement
endwhile -- end of while
repeat -- Pascal repeat statement
do -- do of do-while
dowhile -- while of do-while
call -- function call
until -- Pascal until
forinit -- initialization of for loop
fortest -- test of for loop
forincr -- increment of for loop
endfor -- end of for
switch -- switch statement
endswitch -- end of switch
with -- Pascal with
endwith -- end of with
func -- beginning of function def
funcend -- end of function definition
return -- return without expr
returnexp -- return with expression
filename -- file name
innerscope -- beginning of inner scope
write -- Pascal write or writeln
read -- Pascal read or readln
misc -- miscellaneous
restore -- restore to previous file
eofline -- line number of end of file
endstat -- marks end of statement, if ambiguous
pushfile -- pushes the current source file onto a stack
.file "name"
The .file pseudo-op specifies the name of the source file being compiled. The pseudo-op generates an LD_FILENAME directive.
Existing GCOS-8 assembly programs will probably be written in GMAP. The fundamental tool for converting GMAP to YAA source is the FRED buffer program called gtoa (distributed as part of the YAA package). To execute the buffer, type
fred gtoa gfile >>yfile
where gfile is a file containing GMAP source and yfile is a file that can receive the equivalent YAA source as output. GTOA can only convert one source file at a time.
In this appendix, we describe the steps that the GTOA program takes in converting GMAP code to YAA. This serves two purposes: to document how GTOA behaves; and to describe what programmers must do if they cannot use GTOA.
YAA code is free format, meaning that instruction fields do not have to begin at any particular column on the line. However, we have found that you get good looking code by setting tab stops every four columns. This is the format used by GTOA.
GMAP uses the etc pseudo-op to allow long instructions to be broken into more than one line. Such tricks are not needed in YAA, since input lines can have any length and the backslash (\) can be used to continue a source code statement onto a new input line. If GTOA finds instructions that are longer than one line, it concatenates all the parts of the instruction into a single (long) line.
GMAP accepts symbols that begin with numeric characters, while YAA does not. Therefore GTOA changes the name of any such GMAP symbol by inserting underscores at the beginning, until the name is six characters long. For example, 1A becomes ____1A. If a name is longer than six characters, a single underscore will be added.
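For programmers converting code by hand, the renaming rule can be sketched in Python as follows. The function name rename_symbol is invented for this example. The text does not say what happens to a name of exactly six characters that begins with a digit; the sketch assumes it is treated like a longer name and given a single underscore.

```python
def rename_symbol(name):
    """Rename a GMAP symbol so that it is acceptable as a YAA identifier."""
    if not name[:1].isdigit():
        return name                    # already a valid first character
    if len(name) < 6:
        return name.rjust(6, "_")      # pad on the left to six characters
    return "_" + name                  # assumption: six or more characters


print(rename_symbol("1A"))         # ____1A
print(rename_symbol("12345678"))   # _12345678
print(rename_symbol("LOOP"))       # LOOP
```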
Every section in converted GMAP code has the word offset mode, and word-addressing is used in all the code.
Since YAA demands that all registers be referenced with identifiers instead of numbers, instructions must be changed to use the proper names. For example, GTOA converts a GMAP instruction like
lda 3,1,1
into
lda 3,x1,ar1
In YAA, all integers that begin with a leading 0 are considered to be written in octal. In GMAP, all integers are assumed to be in decimal unless prefixed by =o. Therefore GTOA must strip off any leading zeros on integers, and must replace the =o construct with a leading zero.
The exceptions to this rule are STCA, STCQ, STBA, and STBQ. In GMAP, the second field of such instructions is always assumed to be in octal. Therefore, the field must be given a leading zero in YAA code.
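The general rule for integers can be sketched in Python as below. This is an illustration of the conversion described above, not GTOA's actual code: the function name convert_integer is hypothetical, it works on a single token, and it does not handle the STCA, STCQ, STBA, and STBQ exception.

```python
import re


def convert_integer(token):
    """Convert one GMAP integer token to the equivalent YAA notation."""
    if token.lower().startswith("=o"):
        digits = token[2:]
        if not re.fullmatch(r"[0-7]+", digits):
            raise ValueError(f"not an octal constant: {token}")
        return "0" + digits                # octal: =o becomes a leading zero
    if re.fullmatch(r"[0-9]+", token):
        return token.lstrip("0") or "0"    # decimal: drop leading zeros
    return token                           # anything else is left alone


print(convert_integer("0012"))    # 12
print(convert_integer("=o777"))   # 0777
print(convert_integer("0"))       # 0
```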
Constructs of the form
=ddd,dl
=ddd,du
are converted by removing the '=' character. All other constructs of the form =ddd are handled by declaring an unnamed literal with the construct
{.data ddd}
In GMAP, instructions may force a particular alignment by putting an 'e', 'o', or '8' marker in column 8. These are replaced with the pseudo-ops
.align 2
.align 2,1
.align 8
respectively.
GMAP has several kinds of comments:
YAA only has one kind of comment: text following a '#' (outside of string or character constants). Thus GTOA must convert all GMAP comments into YAA comments.
The GMAP cpr pseudo-ops allow a copyright to be specified in source code. These are converted to equivalent .copyright instructions, placed at the beginning of the YAA source.
The GMAP lbl pseudo-op is replaced by appropriate .module and .title pseudo-ops.
The first GMAP ttl pseudo-op is replaced by an appropriate .revision pseudo-op. GTOA simply prints diagnostic messages for any additional ttl pseudo-ops that are found.
The GMAP ttldat pseudo-op is not converted by GTOA -- the conversion program simply prints a diagnostic message about the use of the pseudo-op. Programmers may convert ttldat operations into appropriate YAA instructions using YAA's .time function.
The GMAP DUP statement is replaced by an equivalent .while construction, provided that the argument of DUP is a constant expression. This loop uses a YAA variable named __dup_ to control repetition of instructions.
If the argument of DUP is not a constant expression, GTOA cannot figure out what the corresponding .while construction should be. Therefore it leaves the DUP unchanged.
The GTOA program only converts some of GMAP's IF statements. Specifically, it converts the statements
ife ifg ifl ine
provided they have numeric arguments (i.e. arguments that do not contain the single quote character). These are converted to appropriate .if statements in YAA. If the original GMAP IF statements are not properly nested, the resulting YAA statements will not have the correct nesting either. As a result, the YAA code will not have the same behaviour as the GMAP code.
The GMAP ascii, uasci, and bci statements are all converted to appropriate YAA .data statements. For example,
ascii 3,abcdefghijkl
becomes
.data "abcdefghijkl"
GMAP VFD instructions are converted to various types of .data statements. In the simplest case,
vfd bits/value
becomes
.data bits:value
When the bits field is preceded by an 'o' (indicating an octal value), the form of the value must be changed to an octal constant by adding a leading zero. If the value is represented by an expression, the GMAP operators
* + - /
must be replaced by the YAA (bitwise) operators
&& | ^ ~
respectively.
When the bits field of a VFD instruction is preceded by 'a' or 'u', the conversion process is similar to that for converting ascii and uasci statements. GTOA expects that the number of bits specified in the VFD statement is a round number of ASCII characters, i.e. a multiple of 9. If it is not a multiple of 9, GTOA rounds up to the next multiple of 9 and assumes that many characters follow the '/' in the VFD.
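The rounding that GTOA applies to the bit count can be written as a one-line calculation. The Python sketch below shows it, using a helper name (vfd_ascii_char_count) that is invented for this example.

```python
BITS_PER_CHAR = 9   # one ASCII character, per the text above


def vfd_ascii_char_count(bits):
    """Number of characters GTOA expects after the '/' in the VFD."""
    return -(-bits // BITS_PER_CHAR)   # round up to a whole character


print(vfd_ascii_char_count(27))   # 3
print(vfd_ascii_char_count(30))   # 4
```

So a VFD that specifies 30 bits is treated as if it specified 36 bits, that is, four characters.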
When the bits field of a VFD instruction is preceded by 'r' or 'h', the instruction specifies a BCD string. The steps for converting this to an appropriate .data statement are similar to the steps for converting a VFD specifying ASCII data.
GTOA converts MICROP instructions into special macros that perform equivalent operations. Macro definitions are obtained with
.include "mop.a"
GTOA automatically adds this at the beginning of your source code if you use any MICROPs.
GMAP uses the constructs ** and *-* to represent operands that are not available at assembly time. These will presumably be filled in with values at execution time. GTOA converts such constructs to 0-0. This has the same effect.
The stuffing idiom *** in the opcode field is replaced by a zero pseudo-op.
In GMAP, the format of SYMDEF and SYMREF declarations is
symdef name,name,...
symref name,name,...
This is converted to the YAA form
[name,name,...]: symdef
[name,name,...]: symref
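The rearrangement is mechanical, as the following Python sketch suggests. It is only an illustration: the function name convert_symbol_decl is made up, and the sketch assumes the statement carries no label or trailing comment.

```python
def convert_symbol_decl(statement):
    """Turn 'symdef a,b' or 'symref a,b' into the YAA label form."""
    op, operands = statement.split(None, 1)
    op = op.lower()
    if op not in ("symdef", "symref"):
        raise ValueError(f"not a symdef or symref statement: {statement}")
    names = [name.strip() for name in operands.split(",")]
    return f"[{','.join(names)}]: {op}"


print(convert_symbol_decl("symdef alpha,beta"))   # [alpha,beta]: symdef
print(convert_symbol_decl("symref gamma"))        # [gamma]: symref
```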
The GMAP end and privit pseudo-ops are not needed in YAA. They are simply removed.
The GMAP use pseudo-op is converted to an appropriate .section pseudo-op, or to an .origin pseudo-op that sets an origin at the end of an existing section (as given by the .highest function). By convention, the first instructions in the source code are considered to be part of a section called _blank_. Since there is no SYMDEF for _blank_, there is no conflict if several modules of the same program are converted with GTOA.
In order to implement the
use previous
statement, GTOA maintains a YAA variable which is always associated with the highest point in the most recently defined section.
GMAP USE statements basically break up the program into chunks the same way that sections do. The difference is that USE statements are resolved by GMAP and can therefore be used to specify the order in which program chunks are arranged in memory. By default, YAA sections are ordered by the linker and therefore appear in no particular order.
By specifying parent sections judiciously, you can control the order of sections when necessary. Often, this means that you must define extra parent sections.
The .LIT pseudo-op tells GMAP to position a literal pool at this point in the code being generated. The .pool pseudo-op in YAA is similar, but it is not quite the same. To see the difference, consider this situation:
code
.pool
more code
If the .pool above was a GMAP .LIT, the literal pool would immediately be placed in between the two chunks of code. With .pool, however, the literal pool is created as a child section of the current parent. This will not be placed between the two chunks of code, but will be relocatable within the parent section.
In addition, .pool only covers literals that occur after the .pool directive; .LIT only covers literals that occur before.
To get the effect of the GMAP .LIT, you must make both code chunks into separate sections inside the same parent as the literal pool. You can then arrange these child sections into the right order.
Because of this complication, GTOA does not convert GMAP .LIT pseudo-ops. It leaves these to be converted by hand.
GTOA converts GMAP common data areas into sections with the common type.
The GMAP pseudo-ops setb and set manipulate objects that can be represented by YAA variables. GTOA therefore generates .var pseudo-ops to declare such symbols as YAA variables.
GMAP bool statements are implemented in almost the same way. However, a symbol defined with bool cannot be assigned new values, so it will not be declared with a .var statement. This prevents the symbol from being given a new value once it has been .set.
In both bool and setb, the GMAP operators
* + - /
must be converted to the YAA operators
&& | ^ ~
GMAP's CALL and RETURN pseudo-ops are not converted. Appropriate diagnostic messages are printed.
YAA has no equivalent of GMAP's ERLK instruction. On the other hand, ERLK is usually used in constructions of the form
erlk
org *-2
These are simply deleted -- they are not necessary in YAA code. Other ERLK instructions are not converted.
EIS instructions are complicated to convert. The most important point to note is that YAA does not remember the options field of an EIS instruction when analyzing the descriptors for the instruction. Thus GTOA must incorporate options into descriptors when necessary. In particular, GTOA must figure out when the length field of a descriptor should be a number and when it should be written as the name of an X register.
Note also that YAA uses a very different format to specify options in EIS instructions. See the description of mlr in Section 6.7 for further details.
GMAP has a number of instructions that control listings. These include
list on
list off
list save
list save,on
list save,off
list restore
and similar instructions with list replaced by pmc, pcc, and ref. If source code contains any of these, GTOA puts
.include "gtoa.a"
at the beginning of your source file. This defines opcode macros list, pmc, pcc, and ref. In this way, a GMAP instruction like
list off
becomes a call to the YAA macro list.
The macros reproduce the effects of the GMAP instructions by turning listing options on or off. The list below shows which options each macro controls.
list -- .l_source
pmc -- .l_macro
pcc -- .l_list
ref -- .l_crossref
All of the macros assume that you begin with the standard listing options. If you add .list pseudo-ops of your own to manipulate listing options, the macros may not work as expected. In addition, you should bear in mind that the macros are macros and will be expanded in the listing when the appropriate options are set.
The file gtoa.a mentioned in the previous section also includes the definition of a macro named inhib. This macro properly handles the GMAP instructions
inhib on
inhib off
...and so on
by generating a corresponding YAA .inhibit pseudo-op.
GMAP's oct and dec statements can be converted to .data statements in a straightforward way.
The GMAP date statement is replaced by
.data .time()
Note that the time string format created by .time will be different from the GMAP date format.
The GMAP block statement is replaced by
.section common,data,word
An instruction of the form
X: bfs 10
becomes
.space 10
X: .null
while
X: bss
just becomes
X: .space
The following conversions are also made.
GMAP      YAA
null      .null
org       .origin
even      .align 2
odd       .align 2,1
eight     .align 8
GTOA attempts to convert the standard B environment macros to YAA constructs that have the same effect. This includes the following GMAP macros:
advnce aentry argdef aschar auto backup bend bentry bmac bretrn cheap chmac ifalse iftrue incall scall sentry switch
In the output file produced by GTOA, lines that could not be converted are marked with '<<' in column 1. These will cause errors if put through the YAA assembler. Programmers should look at each of these lines by hand to determine if and how they should be converted.