The B Environment Manual: Contents

1. Introduction
    1.1  How to Use this Document
    1.2  Descriptive Shorthand
        1.2.1  Addressing: "Above" & "Below"
        1.2.2  Bytes: "Upper" & "Lower"
        1.2.3  Stack Pointers: "Previous" & "Current"
    1.3  Run-time Environment GMAP Macros: BMAC
    1.4  Further Information
2.  Initialization: B Set-up and Starting Execution
    2.1  Structuring the Run-Time Memory
        2.1.1  Growing to Accommodate Overlays
        2.1.2  Creating the Unallocated Memory Pool
    2.2  Allocating and Initializing the Main B Stack
    2.3  Setting Fault Vectors
    2.4  Parsing the Command Line
    2.5  Calling MAIN
3.  Stack-Oriented Languages
    3.1  Local Storage Requirements
        3.1.1  The FCV/Stack Descriptor
        3.1.2  Location of the Stack
        3.1.3  Stack Overflow Checking
    3.2  The Stack Pointer
    3.3  The Stack Frame
        3.3.1  Components of the Stack Frame
4.  Function Call, Entry, and Return
    4.1  Passing Function Arguments
    4.2 The Call/Return Linkage
        4.2.1  Function Trap Linkage
        4.2.2  "Cheap" Local Storage
    4.3  Function Call, Entry, and Return: Detail
        4.3.1  In the Caller
        4.3.2  In the Callee
        4.3.3  In .ENTRY
        4.3.4  In .RETRN
        4.3.5  Alternate Function Entry Routines
5.  Memory Allocation and Management
    5.1  External Storage
    5.2  Auto Storage
    5.3  Dynamic Storage
    5.4  Memory Management: Which to Use
    5.5  Memory-Image of a B Program
6.  Input and Output
    6.1  I/O Data Structure
    6.2  Set-up and Tear-down
    6.3  Terminal I/O
        6.3.1  The New-Line Character
        6.3.2  Flushing Output
        6.3.3  Terminal Input
        6.3.4  VIP Terminals
        6.3.5  Raw Terminal I/O Mode
7.  Register Conventions and Usage
Appendix A: Variable Number of Arguments: .VNTRY
    A.1  Variable with a Maximum
    A.2  Variable with No Maximum
    A.3  Description of a .VNTRY Sequence
        A.3.1  In the Caller
        A.3.2  In the Callee
        A.3.3  In .VNTRY
Appendix B: Run-Time Memory
    B.1  The Pool of Run-Time Memory
        B.1.1  Growing the Pool
        B.1.2  Shrinking the Pool
    B.2  RLSEVEC: Making Memory Unallocated
    B.3  GETVEC: Allocating Memory
    B.4  Obtaining More Run-Time Memory
    B.5  Wreaking Havoc with Run-Time Memory
Appendix C: Function Traps and Gotos
    C.1  Setting a Function Trap
    C.2  The Trap Handler
    C.3  Multiple Traps
    C.4  Non-Local GOTOs
        C.4.1  Finding the Destination
        C.4.2  Triggering Traps
        C.4.3  B Library Support
Appendix D: Debug Tables
    D.1  Loader External Symbol Tables
    D.2  Local Debug Tables
        D.2.1  Line Number Table
        D.2.2  Local Symbol Tables

1. Introduction

This document describes the internal workings of programs that have been compiled by GCOS8 compilers maintained by Thinkage Ltd. This includes the following compilers.

The B Compiler
The C Compiler
The Pascal 66 Compiler

There are very few differences between the run-time environments used by the three compilers. The differences will be described where appropriate; for the most part, you should assume that the environments are identical.

This manual describes programs in terms of their overall structure and in terms of the equivalent GMAP assembler instructions generated for some of those structures. The information should prove helpful if you are trying to write or understand functions written in GMAP for the run-time environment. It will also help you understand the overhead associated with function call and entry, and memory allocation and management. Such "inside knowledge" can also help greatly in the efficient use of the interactive debuggers BOFF and COFF.

1.1 How to Use this Document

Because many of the topics in this manual are inter-related, most people will have difficulty assimilating all the material in a single reading. We suggest that you skim the document once or twice to get a general feel for the most important points, and then go for a detailed reading.

1.2 Descriptive Shorthand

Before we get into the actual description of the run-time environment, it will be helpful to ensure that the reader understands some of the terminology that will be used in the rest of this document.

1.2.1 Addressing: "Above" & "Below"

In visualizing the run-time environment, you should think of a program's address space originating at a base of zero and extending "upward" to higher addresses. Memory location A is "above" location B if A's address is higher than B's. The "bottom" of a block of memory or a structure is its lowest address. The "top" is its highest address.

We shall use the terms "below" to mean "lower in address" and "above" to mean "higher in address" throughout this document. Our diagrams have been drawn with this visualization in mind. Where possible, higher addresses are displayed above lower addresses. This convention must be violated in diagrams containing GMAP assembler statements, since assembler code is written "down" the page towards "higher" addresses.

1.2.2 Bytes: "Upper" & "Lower"

The terms "upper" and "lower" refer to the left 18 bits (most significant half) and the right 18 bits (least significant half) of the machine word. Where two values are stored in two halves of the same word, we will write "upper" and "lower" to indicate where each is.

1.2.3 Stack Pointers: "Previous" & "Current"

When we say each function has its "own" stack pointer, we mean that the single hardware register used as the stack pointer takes on a specific value for each function invocation. Each function invocation has its "own" value of the stack pointer.

In the same way, "the stack pointer of a function" means the value that the stack pointer has while the function is executing.

In the description of stack frames used by function calls, phrases like "the previous stack pointer" do not mean that there is more than one hardware stack pointer. This shorthand just means "the value of the stack pointer previous to the value it has now". When a function returns to its caller and "the previous stack pointer" is restored, we mean "the stack pointer is given the value it had previous to the current value".

The stack pointer is often referred to as if it had an actual memory address or location, e.g. a word lies "below the stack pointer". This sort of thing always refers to the address that the stack pointer points to. For example, saying that the linkage section of a stack frame lies "below its stack pointer" means that the locations that make up the linkage section lie below the location addressed by the function's stack pointer.

"Moving" the stack pointer means changing the value of the stack pointer, i.e. "moving" the location addressed by the stack pointer. When the stack pointer is "moved up" to the next stack frame, the address in the stack pointer register is replaced by the address of the next stack frame.

1.3 Run-time Environment GMAP Macros: BMAC

If you intend to work at the assembler level, you should look at the B system GMAP macro file. This file contains the correct sequences of GMAP assembler instructions for such things as function entry, return, defining auto storage, and bumping the stack. "expl b bmac" explains the use of these macros. The macros may be loaded with your assembler program, enabling you to write compatible GMAP functions easily and accurately. The macros also provide more detail on the exact code used to manipulate the structures described in this manual.

The information supplied in this document is subject to change, and is often abbreviated or slightly distorted for the sake of clarity. Please refer to the above macro file for the current implementation of the features described here.

1.4 Further Information

More detailed explanations of the use and syntax of library functions described in this document may be obtained with "expl b lib". Note however that many internal library functions are not documented in an effort to prevent their use and abuse by normal users. Note also that the C and Pascal packages do not contain some of the functions that are distributed with B; therefore the corresponding explanations may not be available if your system does not have B.

2. Initialization: B Set-up and Starting Execution

There are several procedures which can or must be performed in every program before the function MAIN is finally invoked. The internal library routine .SETU. performs most of these procedures. They may be grouped as follows:

2.1 Structuring the Run-Time Memory

The loader brings your program into memory and does little else. The block of memory occupied by your program is often called the "memory hole" or (an anachronism) the "core hole". B, C, and Pascal programs require a structured memory hole to permit dynamic allocation and de-allocation of memory.

2.1.1 Growing to Accommodate Overlays

.SETU. determines the maximum memory size needed by the overlays that may be called into memory. The library function ADDMEM may be called to increase the memory-image size to accommodate the largest overlay. The memory obtained by this call to ADDMEM is kept reserved for use by incoming overlays. It cannot be used for run-time memory allocation, since the overlays must be called into this space and would overwrite any vectors dynamically allocated in this region of memory.

2.1.2 Creating the Unallocated Memory Pool

The B library function RLSEVEC is called to de-allocate all unused memory above the user's program, or above the highest address needed to accommodate the largest overlay, up to the next "K" boundary. This memory is now available to your program at run-time through calls to the B library function GETVEC. (See Appendix B: Run-Time Memory for details on GETVEC and RLSEVEC.)

2.2 Allocating and Initializing the Main B Stack

Associated with the main stack is a standard stack descriptor, or Function Control Vector (FCV). The stacks allocated for co-routines also have FCVs (see "expl b coroutines fcv"). The main stack is a vector whose length is specified by the "Stack=" option of the command that invokes the compiler. This vector is allocated by the B library function GETVEC. The absolute high and low addresses of the stack are placed in the main stack descriptor located in .SETU.

An empty stack frame is constructed on the bottom of the new stack and Address Register 7, the stack pointer (sp), is initialized to point to the stack address of this empty stack frame. Subsequent function calls will now build their stack frames on the main stack.

2.3 Setting Fault Vectors

Overflow faults are normally masked out. .SETU. calls the internal library function B.FALT to initialize the break and wrapup vectors. Other fault vectors are left untouched.

2.4 Parsing the Command Line

The command line that invoked the program is read and processed by the B library function .BSET. Once it has finished parsing the command line, .BSET releases itself into the pool of unallocated memory (using RLSEVEC). You can prevent .BSET from releasing itself by initializing an external variable named .KEEP to a non-zero value.

2.5 Calling MAIN

Finally, MAIN is called from the initialization routine. Two arguments are supplied, courtesy of .BSET:

main( argc, argv );

"argc" is the number of arguments (separated by blanks or tabs) given on the command line. "argv" is a vector of pointers to those arguments (see "expl b lib .bset".)

When MAIN returns, it returns to .SETU., the initialization routine. .SETU. terminates your program.

3. Stack-Oriented Languages

B, C, and Pascal use a stack to store local variables and linkage information for each function. The main stack is allocated dynamically by .SETU. when the load module begins execution. The size of the stack is specified at compile time using the "Stack=" option of the compiler command. The default is 500 words. 300 words is the minimum stack needed to get through the initialization routines in .SETU., so you should not specify a value less than 300.

The memory for the stack is obtained from the pool of unallocated memory in your program (see Appendix B: Run-Time Memory for more detail on the available memory pool).

3.1 Local Storage Requirements

Each time a function is called, a segment of the stack is reserved for that function's local storage. When the function returns, the stack space is made available again. Each function call uses four words of stack space for the call/return linkage, plus one word for each function argument and one word for each auto variable. Each auto variable defined as a vector takes an additional "size of the vector plus one" words of stack space.

Unless special stack-monitoring routines are used, it is quite possible to "walk off the end" of the stack and overwrite other memory locations above the stack. (See Appendix B: Run-Time Memory for an example of this.) This generally leads to immediate or eventual disaster for your program; hence, it is wise to know what data gets stored on the stack, and to avoid the use of a lot of local storage when it is not needed.

3.1.1 The FCV/Stack Descriptor

The entry routine .SETU. contains a template FCV (Function Control Vector or stack descriptor) describing the main B stack. (See "expl b coroutines fcv" for a full description of the layout and contents of this descriptor.) In a regular B program, the coroutine pointer (cp=x6) points to .SETU.-6. This is the value of the FCV pointer for a regular B program, and the value returned by the B library function WMI.

The FCV for the main stack contains the absolute stack addresses in the upper and lower halves of .SETU.-9 (FCV[-3]). The current high water mark stack usage is in the upper half of .SETU.-8 (FCV[-2],upper). If the stack is severely overwritten or the registers altered, reference may be made to these locations in .SETU. to determine the actual location in memory of the main stack.

3.1.2 Location of the Stack

The main stack is allocated dynamically from available memory. The stack start is rounded up to an odd boundary and the absolute high and low addresses of the stack are placed in the upper and lower halves of FCV[-3].

3.1.3 Stack Overflow Checking

When the stack monitoring routines are loaded with your program, the internal library routine .ENTRY performs a stack bound check before entering each function. These routines are loaded automatically unless the "-Tables" option of the B command is used, in which case "USe=.bound" will load the checking routines explicitly.

When the stack overflow checking code is being used, the stack high water mark is monitored with respect to the absolute stack addresses. If the high water mark ever runs off the top of the stack, an error message is printed and your program aborts.

If the monitoring routines have been loaded and the stack descriptor (FCV) and coroutine pointer (cp=x6) are intact after an abort, BOFF and COFF are able to interpret and print the stack bounds information from the FCV.

3.2 The Stack Pointer

The stack is indexed using Address Register 7 (ar7), hereafter referred to as the stack pointer. Set-up initializes the stack pointer to point to the bottom of the stack, and the stack will grow upward toward higher memory from this point.

3.3 The Stack Frame

Since local storage is unique to each particular call or invocation of each function, each function call attempts to obtain a fresh segment of the stack for its use. The stack space associated with a particular function invocation is called a stack frame. When a function is called, a new stack frame is allocated for that function, above the stack frame used by the calling function. When the function returns, the new stack frame is abandoned and the stack pointer is once again aimed at the stack frame of the caller.

In practice, a stack frame is implemented by moving (changing the address in) the Stack Pointer (sp=ar7), and by addressing all local storage on the stack relative to this pointer. The address in the stack pointer determines the location of a stack frame. When a function is called and a new stack frame is needed, the address in the stack pointer is advanced to point a safe distance up the stack past the previous stack frame. When a function returns and its stack frame is abandoned, the address in the stack pointer is moved back down the stack to point to the previous stack frame.

3.3.1 Components of the Stack Frame

The stack frame is the name given to the segment of the stack reserved for the current function invocation (call). It must be large enough to contain all the local storage needed for the function, as well as some space for function call and return linkage. In addition, it must be situated on the stack in such a way that it does not overwrite or interfere with any previous stack frames of functions still in execution. (These would be functions that called functions that called the current function.)

The components of the stack frame, in order of increasing address, are as follows:

The Linkage Section: two words of call/return linkage (detailed later).

One word to store the value of the caller's stack pointer (the stack pointer for the current stack frame points to this word).

The Local Storage Section:

N words of arguments supplied to the function.
N words of auto variables used by the function.

The "Temporaries" Section: N words used for compiler "temporaries" and for setting up the arguments to functions called by this function.

In keeping with the visualization of the B address space that places low addresses on the bottom, you should invert the above list so that you envision the call/return linkage as lying at the very bottom of the stack frame, "below" the stack pointer.

Note that the stack pointer points to the third word above the bottom of the stack frame, i.e. to the word containing a copy of the value of the caller's stack pointer. The first argument to the function is the first word above the stack pointer. The two words of linkage are actually below the stack pointer. There may be no arguments, autos, or temporaries in the stack frame if the corresponding function does not need such things.

4. Function Call, Entry, and Return

In B, C, and Pascal, it is the responsibility of the caller to save any registers it needs across a function call. Most other GCOS-8 programming languages do NOT work this way.

Most of the overhead associated with a function call is to generate and organize a complete new stack frame for the function being called (the callee). This stack frame must be situated safely above the stack frame of the calling function (the caller). The creation of this new stack frame, with linkage, arguments, and space for local storage, is a shared responsibility between the caller and the callee.

Below we list the steps of this process and whether the operation is carried out by the caller or the callee.

In the "temporaries" section of the current stack frame, organize an ordered list of arguments to the function being called (caller's responsibility).

Initialize a word with the current size of the caller's local storage ("stbump") in the upper half, and the number of arguments being passed in the list of arguments to the callee ("nargs") in the lower half. (Caller's responsibility. For B, C, and Pascal functions, this is done automatically by the compiler.)

Transfer to the callee (caller's responsibility).

Save the call/return linkage information about the caller, so the callee knows where to return to (callee's responsibility).

Adjust the stack pointer to skip the caller's local storage and to point to the completed callee stack frame (callee's responsibility).

You will note that the caller stores arguments in the "temporaries" section of its own stack frame, but when the function call is complete the arguments must be situated immediately above the stack pointer in the callee's stack frame. In fact, the stack frames associated with each function call overlap.

The temporaries section of the stack frame contains temporary values needed to hold intermediate results while evaluating non-trivial expressions. A function call may be considered a special case of a non-trivial expression, where the arguments to the function are just intermediate results copied by the caller into the "temporaries" section.

When the stack pointer is advanced for the function call, it is only moved far enough to clear the local storage section but not the "temporaries" section of the caller. This places the stack pointer above the local storage section, but below the "temporaries" section into which the arguments have been copied. The caller's "temporaries" section containing the list of arguments becomes the start of the stack frame of the callee. In this way, the "temporaries" section is both the end of the caller's stack frame and the beginning of the callee's.

4.1 Passing Function Arguments

Not all arguments are stored on the stack for a function call. The first argument is passed in the A Register and the second in the Q Register; only arguments three and up are placed ahead in the "temporaries" section of the caller's stack frame, pending the function call.

Every function knows how many words are needed for its own local storage. This means that each function knows where it is safe to start storing temporaries and/or arguments for function calls. The callee is able to access this same information via the call/return linkage section, and it is this information that enables the callee to adjust the stack pointer the right number of words to point to the new stack frame being created.

4.2 The Call/Return Linkage

This is a two-word area immediately below the stack pointer of the currently executing function. (This makes it the rock-bottom of the current stack frame.) It is not used until the current function does a function call. At this point, the callee places the call/return linkage information it needs to return to the caller into the caller's linkage section. This means that the call/return linkage for the callee is actually stored in the stack frame of the calling function, i.e. in the linkage section of the caller. Thus, the linkage section of any function's stack frame is only used by the callees of that function (possibly none).

When used normally, the linkage section contains the return address to the caller saved in the upper half of both linkage words. One of these words will be used by .RETRN to do the function return; the other is saved for a fixed reference to the first.

4.2.1 Function Trap Linkage

When a function trap is set, the return address in the lower linkage word is altered to point to the trap handler, which will ultimately call the desired trap routine. The trap is not actually executed until the function does its return transfer. Since this return is always done through the linkage, the return actually transfers into the trap handler, where additional trap processing takes place in preparation for the trap function call (see Appendix C: Function Traps and Gotos).

The return address in the upper linkage word is never altered. It is used by the BOFF and COFF debuggers (to obtain function call tracebacks), and by the B library function NARGS. Since it is possible to compare the two linkage words, one can tell if a trap is set for the associated function. If the two return addresses differ, the address in the lower linkage word actually points to the trap handler. If the addresses are the same, no trap is currently set.

4.2.2 "Cheap" Local Storage

Since it is the responsibility of the callee to store any return linkage that it needs, and also to bump the stack, it is possible for the callee to forego these responsiblities and use the caller's linkage for other things.

One of the things it can do is to use the caller's linkage as two words of local storage, either for arguments, auto variables, or one of each. This is done in GMAP functions that have such small local storage requirements that the overhead of a full stack-bumping function call is not needed. Only two words of local storage are available by this method and, if used, regular call and entry procedures cannot be used.

To summarize, here are the bottom three words of the stack frame, showing the two linkage words. (The pointer to the "framsi" word will be discussed later.)

Word  Contents
  0   saved stack pointer of caller
 -1   upper: return address (never changed)
      lower: pointer to "framsi" word in callee
 -2   upper: return address, or trap handler address
      lower: trap block address, if trap set

4.3 Function Call, Entry, and Return: Detail

The next sections give more detail about the steps of function calling, entry, and return. After each list of actions is a "Notes" section giving more detail about the actions.

4.3.1 In the Caller

Calculate and store each function argument on the stack in the "temporaries" section of the caller's stack frame.

Call the function:

tsx1    func
zero    stbump,nargs

If the called function returns a value, via return(<value>), it will be in the Q Register. (In C and Pascal, the E, A, and Q registers may be used.)

Notes:

"stbump" is the stack-bump parameter. It is the size of the local storage requirements of the caller (arguments plus auto variables), as defined up to the point where the function call takes place. The callee will arrange to move the stack pointer up at least this many words to protect the stack frame of the caller. To ensure that double-word indirection works properly, the value of "stbump" is always kept even. The stack pointer itself will always be odd.

As will be noted later, there is enough extra space allowed in the "stbump" parameter to permit both the caller's stack pointer and the callee's call/return linkage to be stored safely below the stack pointer after it is advanced. This will put these items above the caller's local storage, and below the newly-constructed function argument list.

"nargs" is the number of words of arguments used in this function call. This may not be the same as the number of arguments if arguments take more than one word each (as may happen in C or Pascal). The B library routine NARGS returns exactly this value.

If you define auto variables in the middle of a function, the "stbump" parameter will be larger for function calls compiled after the auto definition, since there will be more local storage defined by the auto definition. This can lead to unhappy results if the function calls occurring before the misplaced auto definition overwrite an area of the stack not reserved until after the auto definition. (The first function calls will only bump the stack sufficiently to cover the auto variables defined up to that point.) This problem only exists if there is some backward transfer of control occurring after the misplaced auto definition that would cause the functions preceding the auto definition to overwrite the unprotected area of the stack.

4.3.2 In the Callee

Transfer to the start of the function (this is the same for all functions), and call the standard function entry subroutine .ENTRY:

func    tra     func+1
        tsx0    .entry
        zero    framsi,dbgptr

This will save the call/return linkage into the caller's linkage area, and then adjust the stack pointer to point to the new stack frame (see the description of .ENTRY below).

There are alternate function entry subroutines to .ENTRY; these are outlined in a following section and described fully in an appendix.

Save the A and Q Registers, if arguments have been passed in them, onto the new stack. This puts them there with the other function arguments, if any.

Continue with the body of the function.

To return to the caller, transfer to the standard function exit subroutine, .RETRN:

tsx0   .retrn

This will extract the return address from the caller's linkage and transfer through it (see the description of .RETRN below).

Notes:

B functions always start with an instruction that transfers to the actual body of the function. This means that the value of a variable which is a function name, is this transfer instruction. This allows the value of a function name to be copied, and the copy used in the same way as the function. A function call using the copied name transfers to the word containing the copy, which in turn transfers to the actual function body.

funny;
main(){
    extrn funny, putstr;
    funny = putstr;
    /* the value of funny is now "tra putstr+1" */
    funny("hi*n");
    /* tsx1 funny transfers to tra putstr+1 */
}

"framsi" is the stack frame size parameter. It is the absolute high-water-mark size of the stack frame needed by this function, and should be determined at the time the function is compiled or assembled. It is the sum of the space used by function arguments, by auto variables and by temporaries, plus a few extra words to allow for the linkage section of its own callee(s). In the event of an asynchronous function call, as might occur during an interrupt, break, or fault, this value can be used by the asynchronous function to establish a safe place to construct its own stack frame, above the stack frame of the function in which the fault occurred.

"dbgptr" is a pointer to local debug tables (tables specific to the function), if they exist. Local debug tables describe auto variables and provide more detail on the line numbers of statements. If the "-Tables" option was used when compiling the program, these tables are not generated and "dbgptr" is either zero or points to the end of the function. (This applies only to B programs. C and Pascal programs have different approaches to debugging.)

4.3.3 In .ENTRY

At this point, the stack pointer still points to the caller's stack frame, and all stack references here are within it.

Save the return address into the top half of both the caller's linkage words. As mentioned earlier in the section dealing with the linkage, the return address in the lower linkage word may be modified by a trap set in the callee; the return address in the upper linkage word is permanent and can be used to determine the real caller, as might happen in a debugging traceback.

Save a pointer to the word containing the callee's stack frame size ("framsi", described earlier), into the bottom half of the upper linkage word.

Save the value of the caller's stack pointer into a word located exactly "stbump" words above the caller's stack pointer. ("stbump" was described earlier, in the section "In the caller".) This is where the stack pointer for the new stack frame will point. It is just below the newly-constructed function argument list, near the bottom of the "temporaries" section of the caller's stack frame.

Finally, bump the stack pointer by the amount "stbump" (described earlier). This points the stack pointer to the word where the caller's stack pointer was just saved. The argument list is now located correctly immediately above the stack pointer. There is a two-word space reserved immediately below the stack pointer for call/return linkage for the next function call(s) (thanks to the extra space allowed in "stbump"). The callee's stack frame is now complete.

If "USe=.bound" was specified when the program was compiled, either explicitly or through a B command option or default, do a stack overflow check.

Leave .ENTRY and return control to the callee.

Notes:

The lower half of the upper linkage word points to the

zero   framsi,dbgptr

word of the callee. An asynchronous function can look in the linkage for the pointer to "framsi" and use it to establish the absolute top end of the callee's stack frame (see "In the callee").

This pointer is also the only backward link between this stack frame and the address of the associated function, since it points back to the address of the "framsi" word in the callee. Debugging programs use this feature to locate which function is associated with each stack frame on the stack.

4.3.4 In .RETRN

Remember that in every stack frame, the current stack pointer points to the address at which the caller's stack pointer was saved; from this word, restore the caller's stack pointer.

Using the caller's stack pointer now, get the return address from the linkage section of the caller. (Note: if a trap is set in the callee, this return address will return control to the trap handler, not to the caller. See Appendix C: Function Traps and Gotos.)

Transfer using the return address just obtained.

4.3.5 Alternate Function Entry Routines

There are several alternate subroutines that enter a function:

.VNTRY: For functions that take a variable number of arguments. A special entry is needed, since the called function cannot know how much space to allow for arguments in its own stack frame. .VNTRY does some special processing to get around this problem (see Appendix A: Variable Number of Arguments: .VNTRY).
.ZNTRY: (Currently used only by Pascal) Performs stack overflow checking, and also initializes each auto variable in the stack frame of the callee to the magic number 0377740377740. This number usually causes a memory fault or similar abort if an auto variable is used as a pointer, an index, or an vector name without being initialized.

In addition, for functions which must not be interrupted by the BREAK key, there is protection which may be obtained through the use of the INHIB keyword in the macro call for .ENTRY, .VNTRY, or .ZNTRY. (See 1.3 Run-time Environment GMAP Macros: BMAC.)

5. Memory Allocation and Management

B has exactly two data structures: "word" and "vector of words". These may be allocated in any of three ways: as externals, as auto variables, or dynamically from the pool of unallocated memory.

5.1 External Storage

When the program is loaded, the memory is set aside and (optionally) initialized. This form of storage generates entries in the loader symbol and debug tables, since these are external symbols. These locations may be addressed by name in the BOFF debugger.

5.2 Auto Storage

Located on the stack as part of the current stack frame, auto variables should be initialized locally by the function that uses them. (Otherwise, you get whatever value just happened to be on the stack where the variable is allocated.) Auto variables and their values "vanish" when the function returns and the stack frame disappears. They must be re-initialized on entry into the function again.

The upper bound on stack space is determined at compile time; if too much auto storage is used, the stack may "overflow" and data above the stack may get overwritten. If the stack monitoring routines have been loaded with your program, the message "Stack Overflow" will be printed before your program aborts. If the monitoring routines are not present, you don't find out about the problem until some critical piece of memory is overwritten.

The names of auto variables are defined in the local debug tables, which are not kept if the "-Tables" option of the B command has been used. Since they are offsets from the current value of the stack pointer (sp=ar7), they have no external (loader symbol table) names.

5.3 Dynamic Storage

Vectors of memory may be allocated dynamically, from a pool of unallocated memory in your program. B supplies the run-time functions GETVEC and RLSEVEC to allocate and free vectors of words in this pool of unallocated memory. Dynamic memory is the only convenient way to allocate a two-dimensional matrix (see "expl b lib getmatrix"). For more detail on dynamic memory allocation and management, see Appendix B: Run-Time Memory.

5.4 Memory Management: Which to Use

In choosing how to allocate words or vectors, the following considerations should be kept in mind.

As a useful rule of thumb, vectors of length 64 words or less should be allocated as auto variables on the stack. Longer vectors, or many such vectors, should be allocated dynamically using GETVEC, to prevent stack overflow.
Only external storage can be automatically initialized right from the time the program is loaded. Auto and dynamic storage must always be explicitly initialized, since the memory contains garbage otherwise. (.ZNTRY tries to make it easier to catch uninitialized auto storage by setting all auto variables to a specific garbage value; see the detail section under "In the callee".)
If the maximum size of a vector cannot be determined until run-time, it should be allocated dynamically using GETVEC. If auto or external storage is used, the vector will have a fixed upper bound, building an unnecessary arbitrary limit into your program.
Because dynamic vectors are not identified to the loader at load time, finding their location and contents after a serious abort may be difficult, especially if the stack is overwritten (as is all too often the case).

5.5 Memory-Image of a B Program

nn7777  exact "K" boundary, top of memory hole
  ..
  ..    pool of unallocated memory
  ..
<nnnn>
  ..
  ..    Main stack (allocated dynamically
  ..    from unallocated memory)
  ..
<nnnn>
  ..
  ..    run-time library functions
  ..
<nnnn>
  ..
  ..    user program functions
  ..
000144  (000100 in batch)
  ..
  ..    slave prefix (program preface)
  ..    fault vectors, program size, etc...
  ..
000000  zero, bottom of memory; start of slave prefix

If the command line parsing routine .BSET releases itself back into unallocated memory after parsing the command line, it will leave a large pool of unallocated memory somewhere in the section labeled "run-time library functions".

6. Input and Output

The I/O routines in the B library are designed in accordance with the strong "sequential character stream" orientation of I/O in the language. Other design goals were speed and size. To this end, almost all routines are written in assembler; the B compiler, unhappily, makes no serious attempt at code optimization. Limited facilities are provided for subverting the normal flow of a character stream (e.g. backspacing, rewinding, record and binary I/O), but these are not allowed to compromise the efficiency of the sequential character stream.

C and Pascal I/O routines are built on top of the B I/O structures. A typical C I/O routine will just set up some arguments and then call a B routine to do the actual work. Because of this, C and Pascal I/O behave in the same way as B I/O.

6.1 I/O Data Structure

Underlying the whole I/O structure is a vector IO.UNITS which has range [-5:49]. Each element of this vector defines an I/O unit (the unit number being the offset into the vector) by containing either a pointer to an I/O control vector (henceforth abbreviated IOV), or 0 for an undefined unit.

6.2 Set-up and Tear-down

The control vectors for the "system" and the teletype in TSS and the console in batch are initialized at load time, with only the line buffers being allocated dynamically in TSS (by .SETU.). Set-up of all other IOV's (files and sysout) is accomplished via OPEN which checks its arguments then allocates a buffer for the IOV and block image.

If the unit is being opened for reading, OPEN then calls .READ which maintains two variables: RD.UN., the current read unit, and RD.IOV, the current read IOV. .READ also maintains Address Register 4 (rd=ar4) pointing the same place as RD.IOV.

For units opened for writing, .WRITE is like .READ but maintains WR.UN., WR.IOV and Address Register 5 (wr=ar5).

For details on the structure of an IOV and more detail on media handling and character stream input and output, see "expl b ioplm".

6.3 Terminal I/O

There are several oddities related to reading and writing on terminals that may be relevant to programmers who are interested in the internal workings of their programs. This information is particularly important if you want to understand the sleight of hand that is associated when dealing with the new-line character. Therefore, we will begin with a discussion of new-lines.

6.3.1 The New-Line Character

When we speak precisely about the new-line character, we are referring to the ASCII linefeed character (octal 012). In B, this is written as "*n". In C and Pascal, it is written "\n", but we will use the B version in this manual.

The linefeed character is used to separate lines of text in text files. However, to separate lines on the terminal, you usually need two characters: a carriage return (to get the cursor to the beginning of the line) and then a linefeed. The carriage return is written as "*r" in B, and "\r" in C and Pascal. Because "*r*n" is used to separate terminal lines, this sequence is also called a new-line when speaking loosely. In the sections that follow, however, "new-line" will only refer to the ASCII linefeed.

6.3.2 Flushing Output

Output written to the terminal is always buffered. The library functions only write out the buffer when it becomes necessary. Writing the buffer becomes necessary under any of the following conditions:

The buffer becomes full.
The program explicitly flushes the buffer or closes the terminal output unit.
The program issues a read operation on the terminal input unit.

As output is written into the buffer, "*n" characters are converted to "*r*n" sequences. This ensures that when the buffer is finally written to the terminal, lines will be printed properly.

The routine that flushes the buffer to the terminal checks to see if the last character in the buffer is "*n". If this is the case, the "*n" is not written to the terminal immediately.

The reason for this is that it is sometimes not appropriate to write a linefeed at the end of a line of output. For example, consider what happens if the program terminates after the line of output is written. When a command terminates in build mode, TSS automatically outputs a linefeed before issuing the "*" prompt for another command. If the linefeed on the last line of output was also printed out, the result would be a blank line between the last line of output and the "*" prompt.

To avoid this problem, the library functions take special action if the last character written from a terminal output buffer is a "*n". The library outputs a carriage return "*r" and then sets a flag that indicates there is a linefeed character waiting for output on the terminal. Normally, the next I/O operation on the terminal will print out that linefeed and then carry on to read or write (as appropriate). However, if the program terminates (or if certain other special conditions occur), the pending linefeed will not be written out.

6.3.3 Terminal Input

With normal CRTs, a program only receives a line of terminal input when the user presses ENTER (because of the way GCOS-8 does front end processing). Thus, all input is terminated by a "*r" character.

When a library routine reads the "*r" at the end of the input line, it sets the linefeed pending flag. As in the case with terminal output, this flag means that the library will usually output a linefeed to go to a new line on the terminal. This linefeed can be stopped if the program explicitly suppresses it using the NO.NL function (see "expl b lib no.nl").

Special things can happen if the terminal input is a null line (i.e., only a "*r"). In this case, the library sets a conditional pending linefeed flag. This flag indicates that a linefeed should be printed out by the next terminal I/O operation, unless all of the following are true.

The program just prompted for input with a linefeed.

The user typed nothing but a "*r".

The next terminal I/O operation is a write. Under these conditions, the pending new-line is not printed out. This mechanism is provided to handle the paging of output (as with the P command). If the user only types a carriage return, the library makes sure that output resumes on the line immediately following the last line written, without extra blank lines inserted when the user is asked for input.

Users can simulate an end of file condition on terminal output by entering a line that consists only of a File Separator character (CTRL-\) followed immediately by a carriage return. In CTRL-\ appears in any other way (e.g. in the middle of a longer line), the end of file condition is not induced. Note that this way of signaling end of file means that users cannot enter a line that consists only of CTRL-\, since the library will return an EOF indicator instead of the actual CTRL-\ value.

6.3.4 VIP Terminals

TSS treats VIP terminals unlike normal CRTs. This can be a nuisance to programs that expect to dealing with normal terminals, so the library attempts to make VIPs seem like normal terminals as much as possible.

Whenever a VIP opens for input, TSS automatically sends a linefeed character. Thus, the run-time library never sends out pending linefeed characters when opening the terminal for input.

VIP terminals submit input using the TRANSMIT key instead of a carriage return. To split the screen image into lines in the usual way, the library always puts a "*r" in the terminal output buffer after input is received.

VIPs can transmit multiple lines in one input operation. Each of these lines will be separated with "*r*n". The library input routines automatically convert all embedded "*r*n" sequences into normal "*n" delimiters.

6.3.5 Raw Terminal I/O Mode

If the user program wants to suppress all of the special handling that the library does with new-lines, it can execute the TTY.IO function on the terminal output and/or input units. This puts the terminal in "raw" mode.

If TTY.IO is applied to the terminal output unit, output to the terminal will not be processed in any way. For example, "*n" characters are not changed to "*r*n". In addition, tab characters are not expanded and the library does not keep track of what output column it is in.

If TTY.IO is applied to the terminal input unit, input is passed on to the program "raw". Carriage returns are not converted to "*n". CTRL-\ never induces the end of file condition. For VIPs, "*r*n" input sequences are not compressed to "*n", missing carriage returns are not echoed, and so on.

7. Register Conventions and Usage

x1        Holds return address across call
x0,x2,    User scratch not preserved across call
x3,x4
x5        Save before use, restore before return
x6=cp     Pointer to the current co-routine header
x7        Holds Pascal display pointer across call

ar0,ar1,  User scratch not preserved across call
ar2,ar3
ar4=rd    Pointer to current read IO vector
          Use, but change only by call to .READ or ..RD
ar5=wr    Pointer to current write IO vector
          Use, but change only by call to .WRITE or ..WR
ar6       Reserved for co-routine package
ar7=sp    Stack pointer changed by entry/exit code
A         A Reg: first function argument
Q         Q Reg: second argument; function return value

Register x1 is used to hold the return address. If the routine uses the standard entry sequence, this is saved on the stack and x1 may be used as scratch in the program.

In Pascal environments, x7 may contain a display pointer. If so, the display must be copied to a safe place before x7 is used as scratch.

Note that the standard function entry sequence destroys x0, x1, and x2. On function entry, the AQ contains the first two words of the argument vector. On function return, the Q Register contains the function's return value, if any, as supplied by "return(value)". (In C and Pascal, the E, A, and Q registers may contain this return value.)

Appendix A: Variable Number of Arguments: .VNTRY

A function which accepts a variable number of arguments may be written in one of two ways.

A.1 Variable with a Maximum

Some maximum number of arguments to the function may be decided upon, and the function can assume that this maximum will not be exceeded. In this way, the maximum size of the function's stack frame (which is partially determined by the number of arguments the function is designed to accept) can be determined at compile (or assemble) time. Unused arguments are just so much unused space on the stack.

In execution, the function can call NARGS to determine exactly the number of arguments supplied by the caller. The remaining arguments, those which are not supplied, can then be initialized to default values by your program.

Since the number of arguments to this style of function is really fixed, even if not all are supplied, the standard .ENTRY function sequence may be used for this type of variable-number-of-arguments function.

A.2 Variable with No Maximum

The second case occurs when there can be no real specification of a maximum number of arguments, e.g. a call to CONCAT or PRINTF. One could conceivably define each function to accept 1000 arguments and only process those that were supplied, but this would waste a lot of stack space. The problem here is that the maximum size of the callee's stack frame cannot be determined at compile time, if one has to take into account the maximum number of arguments which might be supplied to the callee and which must be stored in the callee's stack frame.

The way around this is to keep the list of arguments out of the callee's stack frame, and to make the function accept and use a pointer to the list of arguments instead. When building the argument list, the arguments can be stored, as usual, in the "temporaries" section of the caller's stack frame. (Note that the caller knows how many arguments there are in the function call, and has made sure it has allocated enough "temporary" space for itself.) In the normal stack-bump procedure, this "temporaries" area becomes the beginning of the new callee's stack frame, so we need a special function entry subroutine to "double-bump" the stack frame of the callee past both the local storage of the caller (as usual) and also past the segment of the "temporaries" area where the argument list is stored. This function is .VNTRY.

A.3 Description of a .VNTRY Sequence

Below we give the parts of a variable entry sequence.

A.3.1 In the Caller

The caller sets up its list of function arguments in the "temporaries" area of its own stack frame, as usual. In fact, the caller is completely unaware that the function which it is about to call takes a variable number of arguments.

The function is called, using the standard "tsx1 func" sequence.

A.3.2 In the Callee

(This would be a function written in GMAP to process a variable number of arguments.)

Transfer to the start of the function. (This is the same for all functions.)

Call the special function entry subroutine .VNTRY to create the call/return linkage and adjust the stack pointer:

tsx0    .vntry
zero    framsi,dbgptr

A.3.3 In .VNTRY

Perform the standard linkage instructions to save the return address.

Instead of "single-bumping" the stack pointer, as in .ENTRY, determine (from the "nargs" word in the caller) the actual number of arguments used in this function call, and "double-bump" the stack pointer past both

the caller's local storage (as usual), and
the entire list of arguments stored in the caller's stack frame.

The callee's stack frame will then be located safely above this argument list.

The arguments to the callee are located in a never-never land above the caller's local storage and below the start of this "double-bumped" callee stack frame. No arguments are present within the stack frame of the callee. In the A register, place a pointer back to the start of the argument list.

Leave .VNTRY and return control to the callee.

The called function will have been written to accept and use this pointer to its argument list, rather than using the conventional arguments-in-callee-stack-frame approach. In this way, the arguments do not reside in the callee's stack frame and the size of the callee's stack frame can be determined at compile (assemble) time, without worrying about how many arguments will actually be supplied to the function when it is called.

The B library function NARGS can be used, as usual, to determine from the caller the actual number of arguments supplied in the function call.

It should be noted that the B compiler, and the BMAC macros compute frame size values ("framsi" in the entry sequence) with sufficient extra space to permit safe double bumping in this .VNTRY sequence.

Appendix B: Run-Time Memory

Both GETVEC and RLSEVEC make use of the caller's linkage area to store local variables. Neither function needs to call .ENTRY or bump the stack, making both functions very efficient.

B.1 The Pool of Run-Time Memory

B supplies the run-time functions GETVEC and RLSEVEC to allocate and free vectors of words in the pool of run-time memory above the end of your program.

B.1.1 Growing the Pool

If you use GETVEC to request more memory than can be satisfied in this pool of run-time memory, B attempts to grow your program by obtaining additional memory from the operating system. The memory is always obtained in segments of 1K (1024 words) or more. This additional memory increases the memory-image size of your executing program correspondingly. There is an upper limit on the size of a TSS program imposed by TSS; this limit will be different at different sites.

B.1.2 Shrinking the Pool

If you use RLSEVEC to free one or more entire 1K segments of memory at the top of your program's memory-image, you may call RELMEM to return this memory to the operating system and reduce the size of your memory-image again. RLSEVEC does not call this function itself, no matter how much of the run-time memory pool you de-allocate. RELMEM must be called explicitly, and you must ensure that nothing in the segment(s) you are releasing will ever be accessed from your program again. Attempts to access this released memory will result in a memory fault.

B.2 RLSEVEC: Making Memory Unallocated

The pool of unallocated memory above the last word of your program is managed as a linked list of unallocated blocks of words. Initially, during set-up when your program begins execution, this entire pool is made unallocated as one block by calling RLSEVEC. RLSEVEC initializes the external word VEC.CH to contain:

in the upper half, a pointer to the start of the next unallocated block of memory (after initialization, this pointer is to the entire pool of run-time memory), and
in the lower half, the size of the unallocated block pointed to by the upper half.

As memory is alternately allocated and de-allocated, the first word of each new block of unallocated memory is used in a similar manner to VEC.CH, as a pointer to the next unallocated block. The pointers all have the same format as VEC.CH.

Note that the size field in the pointer word in a block of unallocated memory does not refer to the size of the block it is imbedded in, but to the size of the block pointed to in the upper half of the word, i.e. the next unallocated block. In other words, the size of any given unallocated block of memory is contained in the pointer word of the previous block.

This chaining of pointers creates a forward-linked list of blocks of unallocated memory with an originating pointer in VEC.CH. RLSEVEC sees to it that when contiguous blocks of words are freed, they are merged into one larger block. It also ensures that newly freed blocks are inserted into the chain in such a way as to keep the linking pointers in ascending order, from low to high addresses. The last block of unallocated memory, which cannot point to anything, has its pointer set to 0777777777777.

B.3 GETVEC: Allocating Memory

GETVEC allocates a vector in run-time memory by scanning down the linked list of unallocated blocks, and looking for the first block of a size large enough to satisfy its request. This scan is done by one RPL assembler instruction, and is very fast. The requested size of vector is "broken off" the block of unallocated memory and the link pointer for the block is adjusted to reflect the allocation. GETVEC returns a pointer to the allocated vector.

B.4 Obtaining More Run-Time Memory

If GETVEC fails to find a block of a size that will satisfy its request, the scan goes right to the pointer in the last block of memory with the value 0777777777777. This last pointer is actually interpreted by the RPL instruction as a 256K segment of memory (0777777) starting at location 256K (0777777). Even if the RPL scan does not find a large enough block of unallocated memory in the body of the linked list, the scan will stop here on seeing the 256K segment. This special pointer is an indication that memory will have to be obtained from the operating system to grant the GETVEC request. Failure to obtain this memory results in a "0K" abort in batch, and the message "not enough core to run job" in TSS.

B.5 Wreaking Havoc with Run-Time Memory

Since the linked list of blocks of unallocated memory is maintained solely by the single chain of pointers and sizes running through the first word of every unallocated block, any program which tampers with or overwrites any of this chain, whether intentionally or by "walking off the end" of an allocated block of memory, is almost certainly doomed to failure the next time GETVEC attempts to scan this chain.

A favorite way of effecting this manner of program suicide is walking off the end of the main stack, which is itself one of the first vectors dynamically allocated by set-up. In a similar manner, you might write data past the end of a vector, onto the first word of some block of unallocated memory (e.g. a=getvec(10); a[11]="overwrite!";). More difficult to detect perhaps is the use of a pointer to a vector after you have freed the vector, as in

a = getvec(10);
rlsevec(a,10);
a[0] = "overwrite!";

In this case, even staying within the bounds of the original vector leads to trouble when you overwrite the first word of the free block and destroy the link pointer put there by RLSEVEC.

When the link-following scan used by GETVEC (a GMAP RPL instruction) runs across the overwritten pointer, it may either cause an immediate memory fault or, possibly worse, it may allocate the vector at the random address formed by the upper half of the overwritten word. In the BOFF debugger, you can follow the chain of pointers starting at the external VEC.CH to determine the integrity of your run-time memory pool.

Appendix C: Function Traps and Gotos

Normally, every function either returns to its caller or calls EXIT to terminate the program. Traps let you temporarily save the current return address of a function and replace it with a new one. When the trapped function does its return transfer, the new return address sends control into a "trap handler" that can perform user routines. In the B run-time library, traps are used to arrange that some special routine be performed at the future ending of a particular function invocation. This may best be illustrated by an example.

The B library function ALLOCATE sets up a temporary dynamic vector that must be released when the function that called ALLOCATE returns. The releasing of this vector is assured by the trap which ALLOCATE sets in its caller. When the trapped function does its return transfer, the trap handler gets control and makes sure the dynamic vector is released. It does not matter where the function return takes place, at the end of the function or at any point before; the trap is still invoked by the return process itself.

Traps apply to function calls because they are associated with the call/return linkage of a specific stack frame. The current invocation of a function might have a trap set in its return linkage, but a previous invocation might not. More precisely, the trap is set by replacing the lower of the two call/return linkage words currently being used by a function. The original word containing the function's return address is saved; the replacement word contains a new return address that points to the trap handler. Once control passes to the trap handler, the handler removes the trap by restoring the saved bottom linkage word back into the linkage.

Traps are temporary. The first thing the trap handler does on gaining control when a function returns is to delete the trap by restoring the bottom linkage word (containing the pre-trap return address of the function) back into the function's linkage. It then calls a user-specified function which may call other functions, free allocated memory, or print debug information before returning to the trap handler. The trap handler then does a normal return transfer using the return address in the restored bottom linkage word. If this was the only trap set on this function, this return transfer is to the function's caller, as usual.

The bottom linkage word saved and replaced at the time a trap is set may even be one which has already been saved and replaced by a previously set trap. In this case, the return transfer at the completion of the current trap will use the bottom linkage word which was replaced by the previously set trap. This sends control through the trap handler again into this previous trap, instead of into the function's caller. Similarly, a virtually endless series of traps can be chained together for the same function invocation. In effect, the traps are "stacked" and will trigger in reverse order to the order they were set, i.e. the last trap set is the first to be invoked. The first trap set is the one that saved the function's real return address from the linkage. The return transfer out of this first trap returns to the function's caller.

C.1 Setting a Function Trap

A trap is set using the internal library routine:

trap_blk = .trap(user_function, trap_blk_size, stack_frame);

(You are cautioned not to use this routine yourself.)

"trap_blk" is a pointer to a dynamically allocated block of memory, the size of which is requested by trap_blk_size. This trap_blk is associated uniquely with this trap. "user_function" is the name of the user-supplied function to be called once the trap handler gains control. "trap_blk_size" is the size of the trap_blk. "stack_frame" is the stack pointer address of the stack frame to set the trap in.

When .TRAP is called, the trap_blk is dynamically allocated and initialized as follows:

Word  Comment
<nn>  - <nn> is trap_blk_size
  .
  .   - user requested memory space
  .
  0   - this address returned to the user by .TRAP
 -1   upper: pointer to stack frame where trap set
      lower: trap_blk_size
 -2   upper: address of user_function
      lower: backward pointer to 3 words above the
             location where the trap is set
 -3   copy of the function's lower linkage word

Note that the actual size of the trap_blk is three words greater than the given trap_blk_size, because three words of internal information are needed. Also remember that references to the call/return linkage of any function actually reference the pair of words stored in the caller's stack frame. The trap_blk now contains all the information needed by the trap handler.

The trap is set by replacing the bottom linkage word in the specified stack_frame by a word pointing to the trap handler routine (upper half) and to the local trap_blk (lower half). When the return transfer out of the trapped function takes place, the bottom linkage word is accessed by .RETRN. The transfer takes place through the return address in this word; therefore it goes to the trap handler instead of whatever return address was in the original linkage word (normally, the return address of the caller).

C.2 The Trap Handler

This is the sequence of steps taken when control passes into the trap handler:

Remember that in the standard function return sequence, as executed by .RETRN, the stack pointer gets reset to the caller before the return transfer takes place. The trap handler subverts this process by setting the stack pointer back up again, effectively "appropriating" the trapped function's old stack frame.

Save the E, A, Q Registers. This preserves any return value which may be passed in these registers. (B returns values in the Q register alone; C and Pascal may use all three.)

Save the value of trap_blk[-2] and replace the user_function in trap_blk[-2] with a null function. This prevents traps from possible looping when the user_function is called.

Call the user_function. The trap handler calls this user_function with a pointer to the trap_blk in the A Register, and a "1" in the Q Register. The "1" indicates that this trap was invoked normally, in the course of a regular trapped function return sequence.

Remove all traces of this particular trap by copying the saved bottom linkage word, preserved in trap_blk[-3], back into the function's linkage.

When the user_function returns to the trap handler, release the entire trap_blk back to unallocated memory.

The trap handler now restores the E, A, and Q Registers, saved earlier, and uses .RETRN to do a standard function return. Since the trap handler has been using the appropriated stack frame of the trapped function, this return transfer uses the same linkage words as previously used by the trapped function. The trap handler has already restored the bottom word of this linkage using the word which it had saved before the trap was set, so this transfer is through the return address present before the trap was set.

C.3 Multiple Traps

If the bottom linkage word saved in the current trap_blk at the time the trap was set was itself saved and replaced by a trap set earlier, then the return transfer out of the trap handler is an effective "sideways bounce" into the previous trap:

.RETRN sets the stack pointer down to the caller.

.RETRN gets the return address from the linkage. This return address has just been restored by the trap which is currently finishing; but, having also been set by a trap, this return address leads into the trap handler again.

.RETRN transfers from the trap handler to the trap handler. The trap handler then sets the stack pointer back up again.

This does not become an infinite loop because each trap has its own unique trap_blk allocated for it. Each invocation of the trap handler restores the bottom linkage word from its own trap_blk, processes its own user_function, and releases its trap_blk. Ultimately, the linkage word which is restored has to be the original return address, which comes from the first trap_blk allocated by the first trap set. The return address in this linkage word is not to the trap handler, and there is no associated trap_blk. This last time, .RETRN will transfer back into the caller.

C.4 Non-Local GOTOs

Non-local gotos are transfers from an arbitrary location in one function to an arbitrary location in another. In practice, this means from one existing function invocation to another existing function invocation. The process presents two special problems for the run-time environment:

Locating the specific destination stack frame and setting the stack pointer there so that local storage references work correctly.
Locating on the stack, identifying, and triggering any and all traps which have been set between the originating and destination functions.

It is necessary to trigger the traps, since the traps were intended to execute when their respective functions ended. A non-local goto effectively ends several functions all at once, and it would not do to have the tidying-up features of the traps bypassed. As an example: ALLOCATE uses the trap feature to free the dynamic vectors it has allocated. If the goto bypassed these de-allocation traps, useful memory would be lost.

C.4.1 Finding the Destination

Since non-local gotos depend on being able to find an existing stack frame as a destination, it is only possible to go to function invocations still in existence, i.e. still having stack frames on the stack. These function invocations are all represented by stack frames below the one for the function originating the goto. Since each stack frame is centred on a word containing the saved stack pointer of the previous stack frame, it is easy to "walk backwards" down this chain of stack pointers, looking for a particular stack frame. When a match is found, the goto succeeds. If no match is found, disaster ensues when the walk finds the zero fence which marks the rock bottom of the chain of stack pointers. An error message is generated and the program aborts.

C.4.2 Triggering Traps

During the backward walk down the stack, traps must be identified and invoked. The fact that the return address of every function has been saved twice into both words in the call/return linkage for that function, lets us tell if one return address has been altered to point to the trap handler. If both return addresses are the same, no trap exists. If they differ, the bottom linkage word has been replaced by a trap and the trap can be invoked.

Referring back to the notes on the trap handler, remember that when the trap handler called the user_function it supplied a pointer to the trap_blk in the A Register, and the value "1" in the Q Register. The "1" indicated that the trap was being sprung normally, by a function returning. When a non-local goto is performed, the trap handler is not used to invoke the trap. Instead, the goto mechanism itself invokes the trap, and calls the user_function with a "0" in the Q Register. This indicates to the user_function that the stack is merely being "unwound", and that this is not a normal trap invocation. The user_function may switch on this value, to perform slightly different processing than during a regular trap sequence.

C.4.3 B Library Support

The B library functions SETEXIT and RESET perform non-local gotos by using the internal library routine:

 .goto(stack_frame, IC [, Q_Reg ]);

(You are cautioned not to use this routine yourself.)

"stack_frame" is the stack pointer address of the stack frame to transfer into. "IC" is the address at which to restart execution. The optional "Q_Reg" is a value to place in the Q Register before restarting execution.

.GOTO performs the stack unwinding housekeeping described above, invoking traps along the way until an exact match to the stack frame address is found. The "Q_Reg" value is loaded, and execution continues at location "IC".

Appendix D: Debug Tables

Debug tables provide a method of keeping around the names you have given to your variables and memory locations, so that you can recognize and verify them using an appropriate debugging program (e.g. BOFF). There are three types of debugging tables generated in the course of compiling and loading a B program:

Loader external symbol table: This table is always generated and kept in the load module. It is returned to unallocated memory during B set-up, when your program starts execution.
Auto and label variable table: This table is suppressed if the "-Tables" option of the B command was used. It keeps the names of your local auto variables and labels in each function. This table is a permanent part of your load module, if present.
Line number and statement-type table: This table is suppressed if the "-Tables" option of the B command was used. It associates memory addresses and statement types with line numbers in the input file(s). This table is a permanent part of your load module, if present.

You may wonder how the loader debug tables can be of any use in debugging an aborted program if the tables are released by B set-up as soon as the program begins execution. The answer lies in the BOFF debugger's ability to use the tables from your original load module to examine and debug the program in the abrt file. See "expl boff" for details.

D.1 Loader External Symbol Tables

When released at B set-up, the address of the loader symbol table is:

ltable = *(&.setu.-4) & 0777777;

Its format consists of a series of two-word entries containing the BCD symbol name and the address of the symbol:

Word  Contents
<nn>  Fence value -1
  .
  .
  .
  4   (etc.)
  3   upper: address of symbol
      lower: not used
  2   BCD symbol name
  1   upper: address of symbol
      lower: not used
  0   BCD symbol name
 -1   upper: size (words) of loader debug table
      lower: not used

D.2 Local Debug Tables

The local tables are pointed to by the "dbgptr" half of the word following the standard "tsx0 .entry" instruction in a function:

func   tra     func+1
       tsx0    .entry
       zero    framsi,dbgptr

A function's local debug tables come after the actual code of the function (in higher addresses). There are two types of local table: the line number table, and the local symbol table. They follow each other:

Word  Contents
<nn>  Fence value -1
  .
  .   <local symbol table>
  .
  .   Fence value -1
  .
  .   <line number debug table>
  .
  0   upper: backpointer to word containing dbgptr
      lower: line no. of func definition

In some cases, the part before the first fence (i.e. the line number table and the back pointer word) is missing. This is the case for scopes that are not associated with functions. For example, in C, each source file is a separate scope (file scope) independent of the functions inside the source file. Static objects declared outside of all functions are global in the sense that they can be referenced by any function in the file itself, but are local in the sense that they cannot be referenced by functions outside the file. Such a scope would have a local symbol table and two fences, but not a line number table.

D.2.1 Line Number Table

Each entry in the line number table is a single word containing a line number in the bottom 13 bits, the type of expression identified at this line number in the next 5 bits, and the address which corresponds to the start of this expression in the upper half word. In the rare case where the line number is too large to fit in 13 bits (over 8192 lines), the line number is placed in the full word following this entry and the bottom 13 bits of the first word (which would normally contain the line number) are zeroed.

The types of expressions chosen to be recognized and listed in the debug table are: expr, break, goto, next, return(<value>), if, switch, while, repeat, else, for, do, and return. Pascal has a wider range of statement types than B; the table for Pascal is correspondingly richer in identified expressions. The table for C is about the same as the table for B, since the languages are very similar.

D.2.2 Local Symbol Tables

Each entry in the symbol table consists of a two-word ASCII name followed by one word containing:

In the upper half of the word, a code value indicating the general type of the symbol. The following values are used.

 1  -- auto variable in stack
 2  -- B label
 3  -- C static object
 4  -- static function
 5  -- "static parent"

(A static parent entry is used when one scope is enclosed in another scope. For example, one Pascal subprogram may be declared inside another; the enclosing scope is the static parent of the enclosed scope. As another example, the C file scope is the static parent of all function scopes inside the source file.)

In the lower half of the word, a value whose meaning depends on the symbol type.

For auto variables, the lower half word holds the offset of the variable from the stack pointer.
For B labels, the lower half word holds the address associated with the label.
For static objects and functions, the lower half word holds the starting address of the object/function.
For a static parent entry, the lower half word contains a pointer to the debug tables for the static parent.

The last entry in the local symbol table may be the dummy variable "#temp". It indicates the end of the local storage in the stack frame and the start of the "temporaries" section. The "#temp" entry is not used with C programs.

The ordering of the local symbol table appears random; it is actually a copy of the hashing order chosen by the compiler.