Words, words, words...

When a REXX program deals with words, it always means a list of blank-delimited tokens in a string.  Here's an example of such a list:

   ABC12PRM DBB33PT TEST14 TEST14A ZBMTXMP

Tokens separated by blanks... each token a word.  (For the examples below, we will presume that variable 'string' contains the list of tokens above.)  Usually the requirement is to process each token in succession.  There are several ways to skin this cat.  Here are just two:

   Method #1:
      do ii = 1 to Words(string)
         thisword = Word(string,ii)  /* isolate one word */
         /* ...do something with 'thisword'... */
      end

The mantra "Words(string)" returns the number of words in 'string', in this case, five.  "do ii = 1 to..." causes variable 'ii' to successively take on the values 1, 2, 3, 4, and 5, and the code from 'do' to 'end' is repeated five times, once for each value of 'ii'.  "Word(string,ii)" returns the ii-th word of 'string', so on each pass, the next word in the string is delivered as 'thisword'.

   Method #2:
      do while string \= "" /* while string not empty */
         parse var string thisword string
         /* ...do something with 'thisword'... */
      end

'Parse' is the heart and soul of REXX.  It slices, it dices, it juliennes.  Unfortunately, it exists in a half-dozen unique incarnations, each of which is worth considerable explanation.  For the moment, the parse above says:  take variable 'string' and split it; put the first word into 'thisword' and the remainder into 'string'.  So, on each iteration 'string' gets shorter, much the way salami gets shorter in the deli department:  zip, zip, zip, zip,...  On each iteration, 'thisword' gets a fresh slice of salami.  Eventually, the last slice is sliced and the salami proper is no more; 'string' is empty, and the loop stops.
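
To watch the salami get shorter, a minimal sketch along these lines may help.  The variable name 'rest' is my own invention, and slicing a copy rather than 'string' itself is an assumption about what you will usually want, since the loop empties whatever variable it parses.

```rexx
/* REXX: trace Method #2, slicing a copy so that 'string' survives */
string = "ABC12PRM DBB33PT TEST14 TEST14A ZBMTXMP"
rest   = string                      /* 'rest' is a hypothetical work copy */
do while rest \= ""
   parse var rest thisword rest      /* first word off, remainder back  */
   say "slice:" thisword "   left:" rest
end
say "original still holds" Words(string) "words"
```

Each pass prints the word just sliced off and what remains; on the last pass the remainder is empty, which is what ends the loop.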

Who *Was* That Masked Man.... ?

Most times when we are dealing with a list of words (as above), the list contains several instances of the same type of thing, whether it be member names or dataset names or userids, and the requirement is to apply some process to each in turn.  That's a pretty straightforward request, and we should expect that even a novice programmer would, with a little book-work, knock that task off forthwith regardless of the language.  But what if the requirement is to identify which elements of the list look like something else ?

Here's that list again:

   ABC12PRM DBB33PT TEST14 TEST14A ZBMTXMP

and now the question is "which items look like ***T* ?"  (with "*" being a 'wildcard').  Can you even imagine a solution in COBOL ?  Well, it's not exactly 'intuitively obvious' in REXX, either, but it can be done in a fairly straightforward manner once you understand the problem.  This is the problem:  characters which are NOT wildcards must match exactly; wildcards match anything.  Obviously, we don't want to do a character-by-character match.

The solution is dependent on REXX's ability to deal with data at the bit-level.  First, we need a copy of the mask in which all the wildcards have been converted to "0" bits:

   string    = "ABC12PRM DBB33PT TEST14 TEST14A ZBMTXMP"
   srchfor   = "***T*"
   lomask    = translate(srchfor,'00'X,"*")
(This says: convert each asterisk in <srchfor> to eight off-bits.)

Next, we need a copy of the mask in which all the wildcards have been converted to "1" bits:

   himask    = translate(srchfor,'ff'X,"*")
(This says: convert each asterisk in <srchfor> to eight on-bits.)

...and we need to know how long the mask is:
   masklen   = Length(srchfor)
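
Because the masks now contain unprintable characters, it can help to look at them in hex.  The following is only a sketch for checking your own setup; the hex values in the comments depend on the character set in use, as noted.

```rexx
/* REXX: display the two masks in hexadecimal */
srchfor = "***T*"
lomask  = translate(srchfor,'00'X,"*")   /* wildcards -> all bits off */
himask  = translate(srchfor,'ff'X,"*")   /* wildcards -> all bits on  */
say "lomask =" c2x(lomask)   /* 000000E300 in EBCDIC, 0000005400 in ASCII */
say "himask =" c2x(himask)   /* FFFFFFE3FF in EBCDIC, FFFFFF54FF in ASCII */
```

In both masks the "T" is untouched; only the wildcard positions differ.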

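Now, do a Boolean compare between each word and the <himask> and again between each word and the <lomask>, but only for the length of the mask:

   (1) isolate the leftmost n characters of the word, where n is the length of the mask;
   (2) AND that piece with <himask>;
   (3) OR that piece with <lomask>;
   (4) if the two results are identical, the word matches.

Why does this work ?  In a wildcard position, both operations simply hand back the word's own character, so they always agree.  In any other position, the AND and the OR of two characters can only be equal if the two characters are themselves equal.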
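
To see the two Boolean operations at work on individual words, here is a minimal sketch.  The names 'candidates', 'piece', 'anded' and 'ored' are mine, chosen for illustration, and the pair of words is picked from the list so that one matches and one doesn't.

```rexx
/* REXX: the AND/OR test applied to one matching and one non-matching word */
srchfor = "***T*"
lomask  = translate(srchfor,'00'X,"*")
himask  = translate(srchfor,'ff'X,"*")
masklen = Length(srchfor)
candidates = "TEST14 ABC12PRM"        /* hypothetical pair for illustration */
do ii = 1 to Words(candidates)
   piece = Left(Word(candidates,ii),masklen)
   anded = bitand(himask,piece)       /* wildcard positions keep the word's byte */
   ored  = bitor(lomask,piece)        /* likewise; fixed positions merge bits    */
   say piece "AND:" c2x(anded) "OR:" c2x(ored) "match:" (anded == ored)
end
```

For "TEST1" the two hex strings come out identical and the last column shows 1; for "ABC12" they differ in the fourth byte and the last column shows 0.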

The rest of it is a piece-o'-cake:

   do zz = 1 to words(string)
      check = word(string,zz)   /* isolate one word as "check" */

      say c2x(bitand(himask,Left(check,masklen)))  /* trace: AND result in hex */
      if  bitand(himask,Left(check,masklen)),
            == bitor(lomask,Left(check,masklen)) then do  /* byte-for-byte equal */
         say srchfor "=" check
      end
   end                                 /* zz    */
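
If the test is needed in more than one place, it could be packaged as an internal function.  What follows is one possible sketch, not the only way to arrange it: 'LooksLike' and 'hits' are hypothetical names, not REXX built-ins.

```rexx
/* REXX: the wildcard test packaged as an internal function */
string = "ABC12PRM DBB33PT TEST14 TEST14A ZBMTXMP"
hits   = ""
do zz = 1 to Words(string)
   if LooksLike(Word(string,zz),"***T*") then
      hits = hits Word(string,zz)
end
say "matches:" Strip(hits)      /* TEST14 TEST14A ZBMTXMP */
exit

/* LooksLike is a hypothetical helper name, not a REXX built-in */
LooksLike: procedure
   parse arg candidate, mask
   lomask = translate(mask,'00'X,"*")      /* wildcards -> all bits off */
   himask = translate(mask,'ff'X,"*")      /* wildcards -> all bits on  */
   piece  = Left(candidate,Length(mask))   /* pads with blanks if short */
   return bitand(himask,piece) == bitor(lomask,piece)
```

Two things to keep in mind:  only the leftmost characters are compared, so "TEST14A" matches a five-character mask; and because Left() pads a short word with blanks, a word shorter than the mask will still match if the missing positions are all wildcards.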

Frank Clarke