\       Lesson 6 - Strings
\       The Forth Course
\       by Richard E. Haskell
\          Dept. of Computer Science and Engineering
\          Oakland University, Rochester, MI 48309

comment:

                                Lesson 6

                                STRINGS


                6.1  STRING INPUT

                6.2  ASCII - BINARY CONVERSIONS

                6.3  NUMBER OUTPUT CONVERSIONS

                6.4  SCREEN OUTPUT


6.1  STRING INPUT

        To get a string from the terminal and put it in a buffer at "addr"
        you can use the word

                EXPECT  ( addr len -- )

        This word has limited editing capabilities (you can backspace, for
        example) and will continue storing the ASCII codes of the keys you
        type until "len" characters are entered or you press <Enter>.  The
        number of characters entered is stored in the variable SPAN.

        The address of the Terminal Input Buffer is stored in the variable
        'TIB.  The word

                TIB     ( -- addr )

        puts this address on top of the stack.

        The word QUERY will get a string from the keyboard and store it in
        the Terminal Input Buffer.  It could be defined as follows:

        : QUERY         ( -- )
                        TIB 80 EXPECT
                        SPAN @ #TIB !
                        >IN OFF ;

        The variable #TIB contains the number of characters in the TIB.
        The variable >IN is a pointer to the characters in the TIB.  It is
        initially set to zero by the word OFF which stores a zero at the
        address on the stack.

        For example, suppose in response to QUERY you type the following
        characters plus <Enter>.

                3.1415,2.789

        The ASCII codes for each of these characters would be stored in the
        Terminal Input Buffer beginning at the address TIB.  The value 12
        would be stored in the variable #TIB.

        Now suppose you want to parse this input stream and extract the two
        numbers that are separated by a comma.  You can do this with the
        word

                WORD    ( char -- addr )

        which will parse the input stream for "char" and leave a "counted"
        string at "addr" (which is really the address HERE).  Thus, if the
        program executes the phrase

                ASCII , WORD

        the words ASCII , will put the ASCII code for a comma (hex 2C) on
        the stack and then WORD will result in the following bytes being
        stored at HERE.

                        |-------|
               HERE --> |   6   |
                        |-------|
                        |  '3'  |
                        |-------|
                        |  '.'  |
                        |-------|
                        |  '1'  |
                        |-------|
                        |  '4'  |
                        |-------|
                        |  '1'  |
                        |-------|
                        |  '5'  |
                        |-------|
                        | blank |
                        |-------|

        Note that the first byte of a "counted" string contains the number
        of bytes in the string (6 in this case).  Also note that the word
        WORD will append a blank character (ASCII 20 hex) to the end of the
        string.

        At this point the variable >IN will be pointing to the character
        following the comma (in this case the 2).  The next time the phrase

                ASCII , WORD

        is executed the counted string "2.789" will be stored at HERE.
        Even though there is no comma at the end of this string, the word
        WORD will parse to the end of the string if it does not find the
        delimiting character "char".


6.2  ASCII - BINARY CONVERSIONS

        Suppose you enter the number 3.1415.  We saw in Lesson 5 that if
        you do this in the interpretive mode, the value 31415 will be
        stored as a double number on the stack and the number of places to
        the right of the decimal point (4) will be stored in the variable
        DPL.

        How can you have the same thing done as part of your program?  For
        example, you may want to ask the user to enter a number and have
        that number end up on the stack.  The Forth word NUMBER will
        convert an ASCII string to a binary number.  Its stack picture
        looks like this.

                NUMBER  ( addr -- d )

        This word will convert a counted string at "addr" and store the
        result as a double number on the stack.  The string can represent
        a real, signed number with a radix point in the current BASE.  The
        number of digits after the radix point is stored in the variable
        DPL.  If there is no radix point, the value of DPL is -1.  The
        number string must end with a blank.  This is just the situation
        that is created by WORD.

        If we want to enter a single number (16-bits) from the keyboard,
        we can define the following word:

comment;

: enter.number  ( -- n )
                QUERY BL WORD NUMBER DROP ;

comment:
        In this definition BL is the ASCII code for a blank (ASCII 20 hex).
        Thus, WORD will parse the input string until a blank or the end of
        the string is reached.  NUMBER will convert this input string to a
        double number and DROP will drop the high word and leave the value
        of the single number on the stack.  Note that its value must be in
        the range -32768 to 32767 so that the high word of the double
        number is only a sign extension (0 for a non-negative number, -1
        for a negative number) and dropping it loses no information.


6.3  NUMBER OUTPUT CONVERSIONS

        To print out the number 1234 on the screen we must

        1) Divide the number by the base.
        2) Convert the remainder to ASCII and store backwards
           as a number string.

        Repeat 1) and 2) until the quotient = 0.

        Example:

                1234/10 = 123   Rem = 4   -->  stored as 34 hex
                 123/10 =  12   Rem = 3   -->  stored as 33 hex
                  12/10 =   1   Rem = 2   -->  stored as 32 hex
                   1/10 =   0   Rem = 1   -->  stored as 31 hex

        Resulting number string (hex ASCII codes, lowest address first):

                        |------|
                        |  31  |
                        |------|
                        |  32  |
                        |------|
                        |  33  |
                        |------|
                        |  34  |
                        |------|

        The following Forth words are used to perform this conversion and
        print the result on the screen:

        PAD is a temporary buffer 80 bytes above HERE.

        : PAD           ( -- addr )
                        HERE 80 + ;

        HLD is a VARIABLE that points to the last character stored in the
        number string.

        <# starts the number conversion which will store the number string
        in the memory bytes below PAD.

        : <#            ( -- )
                        PAD HLD ! ;

        HOLD will insert the character "char" in the output string.

        : HOLD          ( char -- )
                        -1 HLD +!
                        HLD @ C! ;

        The Forth word +! ( n addr -- ) adds n to the value at addr.  Thus,
        in the definition of HOLD, the value in HLD is decremented by 1 and
        then the ASCII code "char" is stored in the byte at HLD.

        The word # ("sharp") converts the next digit by performing steps 1)
        and 2) above.  The dividend must be a double number.

        : #             ( d1 -- d2 )
                        BASE @ MU/MOD           \ rem d2
                        ROT 9 OVER              \ d2 rem 9 rem
                        <
                        IF                      \ if 9 < rem
                           7 +                  \ add 7 to rem
                        THEN
                        ASCII 0 +               \ conv. rem to ASCII
                        HOLD ;                  \ insert in string

        The word #S ("sharp-S") converts the rest of a double number and
        leaves a double zero on the stack.

        : #S            ( d -- 0 0 )
                        BEGIN
                           #                    \ convert next digit
                           2DUP OR 0=           \ continue until
                        UNTIL ;                 \ quotient = 0

        The word #> completes the conversion by dropping the double zero
        left by #S and then computing the length of the string by
        subtracting the address of the first character (now in HLD) from
        the address following the last char (PAD).  This string length is
        left on the stack above the string address (in HLD).

        : #>            ( d -- addr len )
                        2DROP                   \ drop 0 0
                        HLD @                   \ addr
                        PAD OVER                \ addr pad addr
                        - ;                     \ addr len

        The Forth word SIGN is used to insert a minus-sign (-) in an output
        string if the value on top of the stack is negative.

        : SIGN          ( n -- )
                        0<
                        IF
                           ASCII - HOLD
                        THEN ;

        These words will be used in the next section to display number
        values on the screen.


6.4  SCREEN OUTPUT

        The word TYPE prints a string whose address and length are on the
        stack.

        : TYPE          ( addr len -- )
                        0 ?DO                   \ addr
                           DUP C@               \ addr char
                           EMIT 1+              \ next.addr
                        LOOP
                        DROP ;

        F-PC actually uses a somewhat different definition of TYPE that
        allows you to TYPE a string that is stored in any segment by
        storing the segment address of the string in the variable TYPESEG.

        Strings are usually specified in one of three ways:

        (1) Counted strings in which the first byte contains the number of
            characters in the string.  The string is specified by the
            address of the count byte ( addr -- ).

        (2) The address of the first character of the string and the length
            of the string are specified: ( addr len -- ).

        (3) An ASCIIZ string is a string specified by the address of the
            first character ( addr -- ).  The string is terminated by a nul
            character (a zero byte).

        A counted string (1) can be converted to an address-length string
        (2) by using the Forth word COUNT.  Thus, the word

                COUNT   ( addr -- addr+1 len )

        takes the address of a counted string (addr) and leaves the address
        of the first character of the string (addr+1) and the length of the
        string (which it got from the byte at addr).
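
        The stack picture of COUNT suggests how such a word could be
        written.  The following is only a minimal sketch and is not F-PC's
        own definition.  The name count.sketch is hypothetical, chosen so
        that the existing word COUNT is not redefined.

comment;

: count.sketch  ( addr -- addr+1 len )
                DUP 1+                  \ addr addr+1
                SWAP C@ ;               \ addr+1 len

comment: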

        Since TYPE requires the address and length of the string to be on
        the stack, to print a counted string you would use

                COUNT TYPE

        Example:  The following word will echo whatever you type on the
        screen (up to the first comma, if there is one, because the input
        is parsed with a comma as the delimiter).

comment;

: echo          ( -- )
                QUERY                   \ get a string
                ASCII , WORD
                CR COUNT TYPE ;

comment:

        The use of the number output conversion words in Section 6.3 can be
        illustrated by showing how the various number output words are
        defined.

        The word (U.) converts an unsigned single number and leaves the
        address and length of the converted string on the stack.

        : (U.)          ( u -- addr len )
                        0 <# #S #> ;

        The word U. prints this string on the screen followed by a blank.

        : U.            ( u -- )
                        (U.) TYPE SPACE ;

        The word SPACE prints one blank.  It is just

        : SPACE         ( -- )
                        BL EMIT ;

        where BL is the CONSTANT 32, the ASCII code of a blank.

        The Forth word SPACES ( n -- ) prints n spaces.

        When printing numbers in columns on the screen it is necessary to
        print the number right-justified in a field of width "wid".  This
        can be done for an unsigned number with the Forth word U.R.

        : U.R           ( u wid -- )
                        >R (U.)                 \ addr len
                        R>                      \ addr len wid
                        OVER - SPACES           \ addr len
                        TYPE ;

        For example, 8 U.R will print an unsigned number, right-justified
        in a field of width 8.

        To print a signed number we need to insert a minus-sign at the
        beginning of the string if the number is negative.  The word (.)
        will do this.

        : (.)           ( n -- addr len )
                        DUP ABS                 \ n u
                        0 <# #S                 \ n 0 0
                        ROT SIGN #> ;

        The word dot (.) is then defined as

        : .             ( n -- )
                        (.) TYPE SPACE ;

        The word .R can be used to print a signed number, right-justified
        in a field of width "wid".

        : .R            ( n wid -- )
                        >R (.)                  \ addr len
                        R>                      \ addr len wid
                        OVER - SPACES           \ addr len
                        TYPE ;

        Similar words are defined to print unsigned and signed double
        numbers.

        : (UD.)         ( ud -- addr len )
                        <# #S #> ;

        : UD.           ( ud -- )
                        (UD.) TYPE SPACE ;

        : UD.R          ( ud wid -- )
                        >R (UD.)                \ addr len
                        R>                      \ addr len wid
                        OVER - SPACES           \ addr len
                        TYPE ;

        : (D.)          ( d -- addr len )
                        TUCK DABS               \ dH ud
                        <# #S ROT SIGN #> ;

        : D.            ( d -- )
                        (D.)
                        TYPE SPACE ;

        : D.R           ( d wid -- )
                        >R (D.)                 \ addr len
                        R>                      \ addr len wid
                        OVER - SPACES           \ addr len
                        TYPE ;

        To clear the screen, use the word

                DARK    ( -- )

        To set the cursor at the x,y coordinate col,row use the word

                AT      ( col row -- )

        For example, the following word Example_6.4 will clear the screen
        and print the message "Message starting at col 20, row 10"
        beginning at column 20 of row 10.

comment;

: Example_6.4   ( -- )
                DARK
                20 10 AT
                ." Message starting at col 20, row 10"
                CR ;
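
comment:

        The number output conversion words of Section 6.3 can also be
        combined directly in your own definitions.  The following sketch
        prints a signed double number with two digits after the decimal
        point.  It follows the pattern of (D.) and uses only words
        described in this lesson.  The name .fixed2 is hypothetical and is
        not an F-PC word.  Under these assumptions, with the double number
        31415 on the stack, .fixed2 should display 314.15 followed by a
        blank.

comment;

: .fixed2       ( d -- )
                TUCK DABS               \ dH ud
                <# # #                  \ convert two fraction digits
                ASCII . HOLD            \ insert the decimal point
                #S                      \ dH 0 0  (remaining digits)
                ROT SIGN #>             \ addr len
                TYPE SPACE ;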