\       Lesson 6 - Strings
\       The Forth Course
\       by Richard E. Haskell
\          Dept. of Computer Science and Engineering
\          Oakland University, Rochester, MI 48309
comment:


                                Lesson 6

                                STRINGS


                6.1  STRING INPUT

                6.2  ASCII - BINARY CONVERSIONS

                6.3  NUMBER OUTPUT CONVERSIONS

                6.4  SCREEN OUTPUT

6.1  STRING INPUT

        To get a string from the terminal and put it in a buffer at "addr"
        you can use the word

        EXPECT  ( addr len -- )

        This word has limited editing capabilities (you can backspace, for
        example) and will continue storing the ASCII codes of the keys you
        type until "len" characters are entered or you press <Enter>.  The
        number of characters entered is stored in the variable SPAN.
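
        As a minimal sketch of EXPECT and SPAN in use, the following
        definitions read a line into a buffer and echo it back.  The names
        NAME.BUF and get.name and the buffer size of 40 are assumptions
        made for this example; TYPE ( addr len -- ) is described in
        Section 6.4.

```forth
        CREATE NAME.BUF 40 ALLOT        \ hypothetical 40-byte buffer

        : get.name      ( -- )
                        CR ." Name? "
                        NAME.BUF 40 EXPECT      \ read at most 40 chars
                        CR NAME.BUF SPAN @      \ addr len
                        TYPE ;                  \ print what was entered
```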

        The address of the Terminal Input Buffer is stored in the variable
        'TIB.  The word

        TIB     ( -- addr )

        puts this address on top of the stack.

        The word QUERY will get a string from the keyboard and store it in
        the Terminal Input Buffer.  It could be defined as follows:

        : QUERY         ( -- )
                        TIB 80 EXPECT
                        SPAN @ #TIB !
                        >IN OFF ;

        The variable #TIB contains the number of characters in the TIB.
        The variable >IN holds the offset, from the start of the TIB, of
        the next character to be parsed.  QUERY resets it to zero with the
        word OFF, which stores a zero at the address on the stack.

        For example, suppose in response to QUERY you type the following
        characters plus <Enter>.

                3.1415,2.789

        The ASCII codes for each of these characters would be stored in the
        Terminal Input Buffer beginning at the address TIB.  The value 12
        would be stored in the variable #TIB.

        Now suppose you want to parse this input stream and extract the two
        numbers that are separated by a comma.  You can do this with the word

        WORD    ( char -- addr )

        which will parse the input stream for "char" and leave a "counted"
        string at "addr" (which is really the address HERE).  Thus, if the
        program executes the phrase

                ASCII , WORD

        the words ASCII , will put the ASCII code for a comma (hex 2C) on
        the stack and then WORD will result in the following bytes being
        stored at HERE.

                                |-------|
                       HERE --> |   6   |
                                |-------|
                                |  '3'  |
                                |-------|
                                |  '.'  |
                                |-------|
                                |  '1'  |
                                |-------|
                                |  '4'  |
                                |-------|
                                |  '1'  |
                                |-------|
                                |  '5'  |
                                |-------|
                                | blank |
                                |-------|


        Note that the first byte of a "counted" string contains the number
        of bytes in the string (6 in this case).  Also note that the word
        WORD will append a blank character (ASCII 20 hex) to the end of the
        string.

        At this point the variable >IN will be pointing to the character
        following the comma (in this case the 2).  The next time the phrase

                ASCII , WORD

        is executed the counted string "2.789" will be stored at HERE.  Even
        though there is no comma at the end of this string the word WORD will
        parse to the end of the string if it does not find the delimiting
        character "char."
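
        The two-step parse described above can be sketched as follows.
        The name two.fields is made up for this example.  Each field is
        printed before WORD is called again because both counted strings
        are built at HERE, so the second one overwrites the first.  The
        phrase COUNT TYPE, which prints a counted string, is described in
        Section 6.4.

```forth
        : two.fields    ( -- )
                        QUERY                   \ get a line of input
                        ASCII , WORD            \ first field at HERE
                        CR COUNT TYPE           \ print it
                        ASCII , WORD            \ second field at HERE
                        CR COUNT TYPE ;         \ print it
```

        Typing 3.1415,2.789 in response to two.fields should display the
        two fields on separate lines.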

6.2  ASCII - BINARY CONVERSIONS

        Suppose you enter the number 3.1415.  We saw in Lesson 5 that if you
        do this in the interpretive mode, the value 31415 will be stored as
        a double number on the stack and the number of places to the right
        of the decimal point (4) will be stored in the variable DPL.

        How can you do the same thing from within your program?  For
        example, you may want to ask the user to enter a number and have
        that number end up on the stack.  The Forth word NUMBER will convert
        an ASCII string to a binary number.  Its stack picture looks like
        this.

        NUMBER  ( addr -- d )

        This word will convert a counted string at "addr" and store the
        result as a double number on the stack.  The string can represent
        a real, signed number with a radix point in the current BASE.  The
        number of digits after the radix point is stored in the variable DPL.
        If there is no radix point, the value of DPL is -1.  The number
        string must end with a blank.  This is just the situation that is
        created by WORD.

        If we want to enter a single number (16 bits) from the keyboard,
        we can define the following word:
comment;

        : enter.number          ( -- n )
                        QUERY
                        BL WORD
                        NUMBER DROP ;

comment:
        In this definition BL is the ASCII code for a blank (ASCII 20 hex).
        Thus, WORD will parse the input string until a blank or the end of
        the string is reached.  NUMBER will convert this input string to
        a double number and DROP will drop the high word and leave the
        value of the single number on the stack.  Note that the value must
        be in the range -32768 to 32767 for the low word alone to
        represent it; the discarded high word is then just the sign
        extension (0 for a non-negative number, -1 for a negative one).
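
        If the full double number is wanted, the DROP can simply be left
        out.  The following sketch (the names enter.dnumber and
        show.places are hypothetical) reads a number and then displays it
        together with the value NUMBER left in DPL.  The word D., which
        prints a signed double number, is described in Section 6.4.

```forth
        : enter.dnumber ( -- d )
                        QUERY
                        BL WORD                 \ counted string at HERE
                        NUMBER ;                \ d, with DPL set

        : show.places   ( -- )
                        enter.dnumber
                        CR D.                   \ print the double number
                        ." DPL = " DPL @ . ;    \ digits after the point
```

        Entering 3.1415 in response to show.places should display 31415
        and a DPL value of 4.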

6.3  NUMBER OUTPUT CONVERSIONS

        To print out the number 1234 on the screen we must

                1) Divide the number by the base.

                2) Convert the remainder to ASCII and store
                   backwards as a number string

                Repeat 1) and 2) until the quotient = 0

        Example:                                        |------|
                        1234/10 = 123  Rem = 4 -|       |  31  |
                                                |       |------|
                        123/10 = 12    Rem = 3  | |---> |  32  |
                                                | |     |------|
                        12/10 = 1      Rem = 2 -|-|     |  33  |
                                                |       |------|
                        1/10 = 0       Rem = 1  |-----> |  34  |
                                                        |------|
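
        To see why the string has to be built backwards, consider this
        sketch, which prints each remainder as soon as it is computed.
        The name .backwards is made up for the example, and it assumes a
        non-negative single number printed in decimal.  It uses the word
        /MOD ( n1 n2 -- rem quot ).

```forth
        : .backwards    ( n -- )
                        BEGIN
                           10 /MOD              \ rem quot
                           SWAP ASCII 0 +       \ quot char
                           EMIT                 \ quot
                           DUP 0=               \ until quotient = 0
                        UNTIL
                        DROP ;
```

        Typing 1234 .backwards prints 4321: the digits come out last
        digit first, so they must be stored from the end of the string
        toward the beginning.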

        The following Forth words are used to perform this conversion
        and print the result on the screen:

        PAD is a temporary buffer 80 bytes above HERE.

        : PAD           ( -- addr )
                        HERE 80 + ;

        HLD is a VARIABLE that points to the last character stored in
        the number string.

        <# starts the number conversion which will store the number string
        in the memory bytes below PAD.

        : <#            ( -- )
                        PAD HLD ! ;

        HOLD will insert the character "char" in the output string.

        : HOLD          ( char -- )
                        -1 HLD +!
                        HLD @ C! ;

        The Forth word +! ( n addr -- ) adds n to the value at addr.  Thus,
        in the definition of HOLD, the value in HLD is decremented by 1
        and then the ASCII code "char" is stored in the byte at HLD.

        The word # ("sharp") converts the next digit by performing steps
        1) and 2) above.  The dividend must be a double number.

        : #             ( d1 -- d2 )
                        BASE @ MU/MOD           \ rem d2
                        ROT 9 OVER              \ d2 rem 9 rem
                        <
                        IF                      \ if 9 < rem
                           7 +                  \    add 7 to rem
                        THEN
                        ASCII 0 +               \ conv. rem to ASCII
                        HOLD ;                  \ insert in string

        The word #S ("sharp-S") converts the rest of a double number and
        leaves a double zero on the stack.

        : #S            ( d -- 0 0 )
                        BEGIN
                           #                    \ convert next digit
                           2DUP OR 0=           \ continue until
                        UNTIL ;                 \ quotient = 0

        The word #> completes the conversion by dropping the double zero
        left by #S and then computing the length of the string by
        subtracting the address of the first character (now in HLD) from
        the address following the last char (PAD).  This string length
        is left on the stack above the string address (in HLD).

        : #>            ( d -- addr len )
                        2DROP                   \ drop 0 0
                        HLD @                   \ addr
                        PAD OVER                \ addr pad addr
                        - ;                     \ addr len


        The Forth word SIGN is used to insert a minus-sign (-) in an
        output string if the value on top of the stack is negative.

        : SIGN          ( n -- )
                        0<
                        IF
                           ASCII - HOLD
                        THEN ;
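
        Because each digit is converted separately, HOLD can be used to
        place other characters between the digits.  As a sketch, the
        hypothetical word .dollars below assumes the double number on the
        stack is a non-negative amount in cents.  TYPE and SPACE are
        described in Section 6.4.

```forth
        : .dollars      ( d -- )
                        <# # #                  \ two digits for cents
                           ASCII . HOLD         \ decimal point
                           #S                   \ the dollars digits
                           ASCII $ HOLD         \ leading dollar sign
                        #>                      \ addr len
                        TYPE SPACE ;
```

        For example, 12345. .dollars should display $123.45.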


        These words will be used in the next section to display number
        values on the screen.

6.4  SCREEN OUTPUT

        The word TYPE prints a string whose address and length are on the
        stack.

        : TYPE          ( addr len -- )
                        0 ?DO                   \ addr
                           DUP C@               \ addr char
                           EMIT 1+              \ next.addr
                        LOOP
                        DROP ;

        F-PC actually uses a somewhat different definition of TYPE that
        allows you to TYPE a string that is stored in any segment by
        storing the segment address of the string in the variable TYPESEG.

        Strings are usually specified in one of three ways:

                (1) Counted strings in which the first byte contains
                    the number of characters in the string.  The string
                    is specified by the address of the count byte ( addr -- ).

                (2) The address of the first character of the string and
                    the length of the string are specified: ( addr len -- ).

                (3) An ASCIIZ string is a string specified by the address
                    of the first character ( addr -- ).  The string is
                    terminated by a NUL character (a zero byte).

        A counted string (1) can be converted to an address-length string (2)
        by using the Forth word COUNT.  Thus, the word

        COUNT   ( addr -- addr+1 len )

        takes the address of a counted string (addr) and leaves the address
        of the first character of the string (addr+1) and the length of the
        string (which it got from the byte at addr).

        Since TYPE requires the address and length of the string to be on
        the stack, to print a counted string you would use

                COUNT TYPE
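
        As a sketch of what COUNT does, it could be written as shown
        below under the made-up name my.count.  GREETING is a
        hypothetical counted string built one byte at a time with C, so
        that the count byte and the characters are both visible.

```forth
        : my.count      ( addr -- addr+1 len )
                        DUP 1+                  \ addr addr+1
                        SWAP C@ ;               \ addr+1 len

        CREATE GREETING 5 C,                    \ count byte
                ASCII H C, ASCII E C, ASCII L C,
                ASCII L C, ASCII O C,           \ the five characters

        : .greeting     ( -- )
                        GREETING my.count TYPE ;
```

        Executing .greeting should display HELLO.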

        Example:  The following word will echo on the screen whatever you
                  type, up to the first comma (or the whole line if it
                  contains no comma).
comment;

        : echo          ( -- )
                        QUERY                   \ get a string
                        ASCII , WORD
                        CR COUNT TYPE ;

comment:
        The use of the number output conversion words in Section 6.3 can be
        illustrated by showing how the various number output words are
        defined.

        The word (U.) converts an unsigned single number and leaves the
        address and length of the converted string on the stack.

        : (U.)          ( u -- addr len )
                        0 <# #S #> ;

        The word U. prints this string on the screen followed by a blank.

        : U.            ( u -- )
                        (U.) TYPE SPACE ;

        The word SPACE prints one blank.  It is just

        : SPACE         ( -- )
                        BL EMIT ;

        where BL is the CONSTANT 32, the ASCII code of a blank.
        The Forth word SPACES ( n -- ) prints n spaces.

        When printing numbers in columns on the screen it is necessary to
        print the number right-justified in a field of width "wid".
        This can be done for an unsigned number with the Forth word U.R.

        : U.R           ( u wid -- )
                        >R (U.)                 \ addr len
                        R>                      \ addr len wid
                        OVER - SPACES           \ addr len
                        TYPE ;

        For example, 8 U.R will print an unsigned number, right-justified
        in a field of width 8.
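
        A short sketch of column output follows.  The name .column and
        the three values are chosen only for illustration.

```forth
        : .column       ( -- )
                        CR    7 8 U.R
                        CR   42 8 U.R
                        CR 1234 8 U.R ;
```

        Each number is preceded by enough blanks to fill a field of 8
        characters, so the three numbers line up on their last digit.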

        To print a signed number we need to insert a minus-sign at the
        beginning of the string if the number is negative.  The word (.)
        will do this.

        : (.)           ( n -- addr len )
                        DUP ABS                 \ n u
                        0 <# #S                 \ n 0 0
                             ROT SIGN #> ;

        The word dot (.) is then defined as

        : .             ( n -- )
                        (.) TYPE SPACE ;
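
        To follow how (.) handles a negative number, the sketch below
        performs the same steps on the value -123, with the stack and the
        string built so far shown in the comments.  The name sign.demo is
        hypothetical.

```forth
        : sign.demo     ( -- )
                        -123                    \ n
                        DUP ABS                 \ n 123
                        0 <# #S                 \ n 0 0     "123" stored
                        ROT SIGN                \ 0 0       "-123" stored
                        #>                      \ addr 4
                        CR TYPE ;
```

        The original signed value is kept under the number being
        converted so that ROT can bring it to the top for SIGN after all
        the digits have been stored.  Executing sign.demo should display
        -123.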

        The word .R can be used to print a signed number, right-justified
        in a field of width "wid".

        : .R            ( n wid -- )
                        >R (.)                  \ addr len
                        R>                      \ addr len wid
                        OVER - SPACES           \ addr len
                        TYPE ;

        Similar words are defined to print unsigned and signed double
        numbers.

        : (UD.)         ( ud -- addr len )
                        <# #S #> ;

        : UD.           ( ud -- )
                        (UD.) TYPE SPACE ;

        : UD.R          ( ud wid -- )
                        >R (UD.)                \ addr len
                        R>                      \ addr len wid
                        OVER - SPACES           \ addr len
                        TYPE ;

        : (D.)          ( d -- addr len )
                        TUCK DABS               \ dH ud
                        <# #S ROT SIGN #> ;

        : D.            ( d -- )
                        (D.) TYPE SPACE ;

        : D.R           ( d wid -- )
                        >R (D.)                 \ addr len
                        R>                      \ addr len wid
                        OVER - SPACES           \ addr len
                        TYPE ;
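
        As a sketch, the hypothetical word .dcolumn below prints three
        signed double numbers right-justified in a field of width 10.  It
        assumes that a number written with a decimal point inside a colon
        definition is compiled as a double number, in the same way that
        it is entered as a double number in the interpretive mode
        (Lesson 5).

```forth
        : .dcolumn      ( -- )
                        CR  100000. 10 D.R
                        CR     -25. 10 D.R
                        CR       7. 10 D.R ;
```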

        To clear the screen, use the word

        DARK    ( -- )

        To position the cursor at column "col" and row "row", use the word

        AT      ( col row -- )

        For example, the following word Example_6.4 will clear the screen
        and print the message "Message starting at col 20, row 10".
comment;

        : Example_6.4   ( -- )
                        DARK
                        20 10 AT
                        ." Message starting at col 20, row 10"
                        CR ;