Serious First Steps In UserTalk Scripting


13. String Parsing and Substitution

Create a new script in the workspace table, called BuildSite. Our routine takes no parameters so get rid of the automatically generated BuildSite handler; all our commands can be at surface level.

The whole routine requires that we know the pathname of the AmelioWeb folder. We assume it contains two things: the template, called "template.html", and a folder called "Source Files" containing our email text files. We'll start by learning the pathname of the folder and saving this as a variable called "folder".

    local (folder)
    if not file.getFolderDialog ("Where is the AmelioWeb folder?", @folder) {return}

We need next to read the textfile "template.html" into a string, which we'll call "templatetext". We can derive the pathname of "template.html" by concatenating "template.html" onto the folder pathname we have just obtained (a file called "myFile" in a folder whose pathname is "myFolder:" has a pathname "myFolder:myFile"). We learn from ALittleHelp, under "external textfiles", that we can read a whole file with suites.toys.readWholeFile. We investigate this by jumping to it with command-double-click; the core verb of the routine is clearly file.read, so we control-double-click this to study it with DocServer. Here we find we have a problem; file.read returns a binary, whereas we want a string. So we need to coerce the result to a string:

    local (templatetext = string(toys.readWholeFile (folder + "template.html")))

We will need to be able to talk about the "Source Files" folder in which the email textfiles live; we can define a local "infolder" holding its pathname, built from the AmelioWeb folder's pathname just as we did for the file "template.html". Notice, though, that a folder's pathname ends in a colon.

    local (infolder = folder + "Source Files:")
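
To make the concatenation concrete, here is a minimal sketch using a made-up folder pathname ("Macintosh HD:AmelioWeb:" is only an assumption for illustration; yours will be whatever the dialog returns). Because the folder pathname already ends in a colon, appending a filename, or a foldername plus a colon, gives a valid pathname:

    local (demofolder = "Macintosh HD:AmelioWeb:") « hypothetical pathname, for illustration only
    msg (demofolder + "template.html") « shows Macintosh HD:AmelioWeb:template.html
    msg (demofolder + "Source Files:") « shows Macintosh HD:AmelioWeb:Source Files: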

Now we're ready for our "fileloop". For each file in "infolder", which we will call "infile", we must read it into a variable; since we called the template's variable "templatetext", let's call the file's variable "filetext".

    local (infile, filetext)
    fileloop (infile in infolder, 1)
        filetext = string (toys.readWholeFile (infile))

Let's stop and think now. We want to extract the title, author, and subtitle material from "filetext". How will we do this? We know that "filetext" starts this way: the first paragraph is the title; then there's a blank paragraph; the next paragraph is the author; then there's a blank paragraph; the next paragraph is the thing we're calling the subtitle; then there's a blank paragraph; and everything after that is the bodytext. What we need is a verb that means: tell me the contents of the first paragraph of "filetext", and delete that first paragraph from "filetext" as well. Clearly if we call this six times in a row, assigning each of the pieces we receive to a variable or just throwing it away, as necessary, we will have extracted all the desired items, and reduced "filetext" to just its bodytext as well.

Imagine for a moment that we have such a verb: call it "popline". For each line we aren't actually interested in, we can just assign the result to a variable "dummy" that we don't care about. Then it suffices to say:

        title = popline ()
        dummy = popline ()
        dummy = popline ()
        dummy = popline ()
        subtitle = popline ()
        dummy = popline ()

This extracts the title and subtitle, and throws away the blank lines that we don't care about, also throwing away the author line which we aren't going to use for anything. (The truth is, we're going to pretend, for our demo, that these letters are by Gil Amelio, so we're throwing away Chris Gulker's name entirely! We have Chris' permission to do this.) We can write the above a little more concisely, though; instead of assigning to "dummy" the lines we don't want, we can just ignore the result that comes back from "popline", by calling it without capturing that result. Condensing still further by using semicolon notation, this gives:

        title = popline (); popline (); popline (); popline ()
        subtitle = popline (); popline ()

Now we'd better go back and write the verb "popline". It need only be a local handler; in fact, this will make life considerably easier, because of the matter of variable scope. If we define "popline" in a place that is already inside the scope where "filetext" is defined, "popline" will be able to see and alter "filetext" directly; we won't have to keep handing "filetext" back and forth. In other words, we intend (revising our list of locals a bit) to say something like this:

    local (infile, filetext, title, subtitle)
    on popline()
        local (s)
        ???? extract a line and delete a line from filetext
        return s
    fileloop (infile in infolder, 1)
        filetext = string (toys.readWholeFile (infile))
        title = popline (); popline (); popline (); popline ()
        subtitle = popline (); popline ()

But what goes where the "????" line is? Well, we have to obtain the first line of "filetext". But we already know how to do this; we did it when FileLister was reading all the filenames into a string and then decomposing that string into individual filenames again. We just say:

        s = string.NthField (filetext, cr, 1)

What about deleting that first line from "filetext"? Is there a verb string.deleteNthField? No, unfortunately not; but from ALittleHelp we find (under "language basics: strings, lists, and records: other string operations: insert, delete, substring, etc.") that there's a verb string.delete. Control-double-clicking this to look it up in DocServer, we learn that it takes three parameters: the string, the number of the character where the deletion is to begin, and the number of characters to be deleted. We know the first two: the string is "filetext", and the deletion is to begin at character 1. But to get the third parameter we need to know how long the first line is. Well, the first line consists of the same number of characters as "s", because "s" was obtained by extracting that line - except that "s" doesn't include the return-character, whereas we want to delete the return-character from "filetext" as well. We see from ALittleHelp that the length of a string is obtained by a verb "sizeOf". So the formula is:

        filetext = string.delete (filetext, 1, sizeOf(s) + 1)
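
It may help to trace these two lines by hand on a tiny made-up string (the sample text is an assumption, not one of the real email files). The second pair of lines shows what happens with a blank paragraph: the first field is empty, so only the return-character is deleted. This sketch assumes that string.nthField returns an empty string for an empty field.

    local (demotext = "My Title" + cr + cr + "An Author") « hypothetical sample text
    local (s)
    s = string.nthField (demotext, cr, 1) « s is "My Title", 8 characters
    demotext = string.delete (demotext, 1, sizeOf (s) + 1) « 9 characters go; demotext now starts with the second cr
    s = string.nthField (demotext, cr, 1) « s is the empty string: the blank paragraph
    demotext = string.delete (demotext, 1, sizeOf (s) + 1) « 1 character goes; demotext is now "An Author"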

Summing up, then, our script so far goes like this:

    local (folder)
    if not file.getFolderDialog ("Where is the AmelioWeb folder?", @folder) {return}
    local (templatetext = string(toys.readWholeFile (folder + "template.html")))
    local (infolder = folder + "Source Files:")
    local (infile, filetext, title, subtitle)
    on popline() 
        local (s)
        s = string.NthField (filetext, cr, 1)
        filetext = string.delete (filetext, 1, sizeOf(s) + 1)
        return s
    fileloop (infile in infolder, 1)
        filetext = string (toys.readWholeFile (infile))
        title = popline (); popline (); popline (); popline ()
        subtitle = popline (); popline ()

Time for a reality check! Modify this by adding to the end of the "fileloop" this line:

        msg (title + " - " + subtitle); clock.waitseconds (1)

and run the script, choosing the AmelioWeb folder when the dialog comes up. Sure enough, it looks like we're seeing the title and subtitle of each file in turn. Time for a cup of coffee; you've earned it.

The task now is to substitute the title, subtitle, and bodytext we've extracted for their markers in "templatetext". Consulting ALittleHelp, we find that find and replace in strings is done with verbs called string.patternMatch, string.replace, and string.replaceAll. Looking these up in DocServer, it is clear that we may as well use string.replaceAll, which replaces every occurrence of a given substring with another; the "<<title>>" marker is the only one that occurs multiple times, but it can do no harm to be simple and general. We should make a copy of "templatetext" (let's call it "s"), because we're going to be performing this substitution for each "infile". So what we want to say is:

        local (s = templatetext)
        s = string.replaceAll (s, "<<title>>", title)
        s = string.replaceAll (s, "<<subtitle>>", subtitle)
        s = string.replaceAll (s, "<<bodytext>>", filetext)

The above lines should go after the line where we set the value of "subtitle", replacing the "msg" line which was put there only for testing purposes.
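
To see why string.replaceAll is the convenient choice for a marker that occurs more than once, here is a small sketch on a made-up one-line template (the sample string is an assumption; the real template is longer). You can run it on its own as a separate test script:

    local (demo = "<title><<title>></title><h1><<title>></h1>") « hypothetical miniature template
    demo = string.replaceAll (demo, "<<title>>", "Hello")
    msg (demo) « shows <title>Hello</title><h1>Hello</h1>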

We are almost done creating the text to go into our new HTML file. The only thing left to do is to call html.processMacros to process "s": this will do things like substitute <p> wherever two return-characters in a row appear. So add a further line:

        s = html.processMacros (s)

One other thing. The behaviour of html.processMacros is going to depend on our preferences as set in user.html.prefs, so go there and make sure that autoParagraphs is set to "true" (without the quotes). You might as well set activeURLs to "true" too, for good measure.
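
If you prefer, the same two preferences can be set from a script instead of by hand. This is a minimal sketch, assuming both entries already exist in user.html.prefs as ordinary boolean values; run it once, separately from BuildSite:

    user.html.prefs.autoParagraphs = true
    user.html.prefs.activeURLs = true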

All that's left is to write out "s" as a new textfile. That, however, is something of a separate challenge, so let's finish by checking our work so far. We'll write "s" into a wptext instead, and just look it over to make sure that everything has come out reasonably. Add another couple of lines, like this:

        new (wptextType, @workspace.tempText)
        edit (@workspace.tempText)
        wp.setText(s)

We don't really want to do this for every single "infile"; it's enough to check our results for one. So we'll use Debug mode to stop when we've processed just the first file. Command-click the triangle on the "fileloop" line, to turn it into a breakpoint. Hit the Debug button, and then the Go button. The dialog appears; tell it where the AmelioWeb folder is. The process reaches the breakpoint at the "fileloop", before reading the first file, and stops; hit the Go button so that we do one cycle of the "fileloop". Our new wptext, workspace.tempText, opens; we are then instantly taken back to workspace.BuildSite's window, because we've paused at "fileloop". Hit the Kill button to abort execution.

Examine workspace.tempText. It looks like it's worked! Our title appears between the title tags; then, down in the middle of the page, we see our title and our subtitle, each surrounded by the appropriate formatting tags, followed by our bodytext with <p> tags inserted.

That's enough for now. In the next chapter we'll tackle writing our HTML text out as a file, and some other miscellaneous file-related issues we ignored up to this point.


Text © Matt Neuburg 1997 ALL RIGHTS RESERVED
This Web document scripted with Frontier. Last build at 4/18/97; 10:49:44 PM.