CSCI 5161: Introduction to Compilers
Spring 2023, University of Minnesota
Using Standard ML of New Jersey

You will be using a recent version of Standard ML of New Jersey, hereafter referred to as SML, in this course. The version installed on the CS machine that I use to test code for the course before releasing it in homeworks is 110.71. The version available as a package on Ubuntu seems to be 110.79, and the most recent stable version seems to be 110.99. To use the system on our local cluster, you have to load the module soft/sml. If you have not encountered the module command before and don't know how to use it yet, try the command

    module help
This is only for the CSE Labs cluster; you won't need it on your personal machine.

SML, like Scheme and Prolog, is an interactive language. This means that starting up SML gives rise to an interaction loop. In this interaction loop you can define functions and ask for expressions that use these functions to be evaluated.

On the CSE cluster, once you have loaded the right module, you can start up the SML system by typing sml at the command line. SML will then print some introductory text and present you with a prompt:

gopalan@rishabh (~/.www/courses/5161) % sml
Standard ML of New Jersey v110.71 [built: Fri Feb  5 14:23:17 2010]
The prompt, a single - character, signals that SML is ready to evaluate the expressions that you type in. Perhaps the simplest kind of expression that you may type in is an arithmetic one. An example interaction may thus be the following:
- 2 + 3;
val it = 5 : int
What has happened here is that the user has typed in the expression 2 + 3, SML has calculated the value of the expression as 5 and displayed it, and it has then become ready for another expression to be input. There are a few fine points to note here. First, the user typically needs to signify the end of the expression to be evaluated. This is done by typing a semicolon at the end; this is why 2 + 3 is followed by a semicolon. Second, when SML evaluates an expression, it sets an identifier called it (for item) to this value. When it presents back a result, it is really telling you that this is the value of it. Finally, notice that SML is also telling you the type of the result; here it happens to be an integer. SML is a statically typed language, unlike Prolog and Scheme and like Java, but it also has the characteristic that the types of many expressions need not be provided by the user since these can be inferred. The inference of the type integer for the expression is not very dramatic in this example, but we will see another one shortly where this is much more significant.

Note that after evaluating a given expression, SML presents its prompt for input to you again. You can type in another expression to be evaluated and then another and so on. This is exactly like the Read-Eval-Print loop of Scheme.
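
As a small illustration of the identifier it, here is a sketch of a sequence of inputs in which a later expression refers to the value of an earlier one. The name fifty is hypothetical and is introduced only for this example.

```sml
(* Each top-level expression rebinds the identifier `it` to its value. *)
2 + 3;            (* `it` now stands for 5 *)
it * 10;          (* uses the previous `it`; `it` now stands for 50 *)

(* `it` can be used like any other identifier, for example to save a result
   under a name of your own before the next expression overwrites it. *)
val fifty = it;
```

Because it is rebound after every expression, a result that you want to keep around should be given its own name in this way.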

In any significant programming task, you will generally want to develop a set of functions independently that you store in a file; the idea is that these function definitions can then be used as many times as you want and they don't necessarily get lost once you end an interaction session. You can do this in SML and it is useful to understand how.

The first thing is, of course, to create a file that contains the necessary definitions. For this you will have to use a text editor such as emacs or vi. I am going to assume that if you are taking this course then you already know all about these editors and can use one of them with facility and that you can create a file containing an SML program. Of course, you need to know what these programs look like. I will say a little about this in the first few lectures in class and we will study SML in greater detail later in the course. However, I urge you also to take a look at the examples in Chapter 2 of the report Introduction to Standard ML by Bob Harper so that you start becoming familiar with the language right away.

Now, you want to be able to use the functions whose definitions you have stored in a file in creating expressions to be evaluated. To do this you can use the use function in the interaction mode with SML. This function takes a string, the name of the file, as its argument. The result it produces is irrelevant; its main purpose is the side effect of loading the function definitions in the file.

To give this discussion some concreteness, let us suppose you have put the following lines in a file called app.sml in the current directory:

fun app [] L = L 
  | app (X :: L1) L2 = (X :: (app L1 L2))
The function defined by these two lines is called app and it serves to append two lists. We will discuss this definition a little in class, but here is a further explanation. Each of the two lines in the definition handles one of two independent but exhaustive cases in the definition. The first line says that the result of appending the empty list (represented in SML by []) to any list is that list itself, and the second line says that the result of appending a list with head X and tail L1 to L2 is a list whose head is X and whose tail is the result of appending L1 and L2. Note that the representation in SML of a list with head X and tail L is (X :: L) and that the separation of cases in the definition of a function is done by the symbol |. Again, look at the mentioned part of Harper's report to understand these aspects of SML syntax and also to see many small but interesting examples of ML usage.
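
To see how the two cases work together, it may help to unfold one application by hand. The sketch below repeats the definition and records the unfolding in a comment; the names traced and result are hypothetical and serve only to compare the hand-computed list with what app computes.

```sml
fun app [] L = L
  | app (X :: L1) L2 = (X :: (app L1 L2))

(* Unfolding app [1,2] [3] one clause at a time:
     app [1,2] [3]
   = 1 :: (app [2] [3])          by the second clause, X = 1, L1 = [2]
   = 1 :: (2 :: (app [] [3]))    by the second clause, X = 2, L1 = []
   = 1 :: (2 :: [3])             by the first clause
   which is the list [1,2,3]. *)
val traced = 1 :: (2 :: [3])
val result = app [1,2] [3]
```

Notice that only the first argument is taken apart; the second list is reused unchanged as the tail of the result.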

As mentioned already, the function app can be made available within an SML session by typing the following expression:

- use "app.sml";
[opening app.sml]
val app = fn : 'a list -> 'a list -> 'a list
val it = () : unit
Note that strings in SML are enclosed within double quotes. The SML system checks all the definitions in a file when loading the file in. If there are errors in these definitions, then it indicates this and the loading is not successful. If there are no errors, as in the case under consideration, then SML compiles the definitions it has just read in, tells you the names and types of all the functions (and other identifiers) that have been defined in the file, and gets ready for the next expression to be evaluated. In the particular case in question, SML tells you that it has successfully loaded and compiled a function called app. You may now use this function in computations such as the following:
- app [1,2] [3,4];
val it = [1,2,3,4] : int list
Here SML has been asked to evaluate an expression that corresponds to appending the two lists [1,2] and [3,4] and it has done this using the definition of app it has been provided with.

An interesting thing to note with regard to the last example is that SML infers a type for the app function that it has loaded. This is a nontrivial task and we will discuss the way in which SML does this later in the course. For the moment let us focus on what has been inferred. If you look carefully at the type that is displayed, SML is telling you that app is a function that takes a list as an argument and produces another function that takes another list as an argument and yields a list. Notice that there is a constraint in the types of the various lists: they must all have the same kind of elements. This is signalled by the use of the expression 'a in conjunction with all the lists.

The type attributed to app may be of a kind that you are not quite used to, so here are some clarifying remarks. First, you may be used to thinking of an append function as something that takes two lists as argument and produces a (third) list as a result. In other words, you may have expected a type such as

  (('a list) * ('a list)) -> ('a list)
as the type for app. This kind of typing is actually a little less flexible. In particular, it is not possible to use app with this typing unless both argument lists have been provided. With the type inferred for app by SML from the definition we have provided it, it is possible to apply app to only one list, as exemplified in the expression
(app [1,2])
The result of this expression is itself a function that takes an integer list and produces a new list that has the integers 1 and 2 at the front. In fact, the expression
app [1,2] [3,4]
that we had earlier asked SML to evaluate can be thought of as two successive function applications:
((app [1,2]) [3,4])
You can rewrite the definition of the append function so that it has to be given both argument lists at the same time or not at all, and I suggest that you try to do this. (If you cannot do it after some trying, post a query to Piazza.) However, note that one of the elegances of the present definition is that it gives you two different append functions instead of just one.
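
Here is a minimal sketch of what applying app to just one list buys you. The names prefix12, a, b and shifted are hypothetical; map is the standard list function that applies a function to every element of a list.

```sml
fun app [] L = L
  | app (X :: L1) L2 = (X :: (app L1 L2))

(* Supplying only the first list yields a function of type
   int list -> int list that puts 1 and 2 at the front of its argument. *)
val prefix12 = app [1,2]

val a = prefix12 [3,4]     (* [1,2,3,4] *)
val b = prefix12 []        (* [1,2] *)

(* A partially applied app can be passed directly to another function. *)
val shifted = map (app [0]) [[1], [2,3]]     (* [[0,1],[0,2,3]] *)
```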

The second aspect that is unusual about the typing of app is the use of the expression 'a in conjunction with lists. You might have seen integer lists or lists of strings or even lists of lists, but what kind of beast is a ('a list)? The thing to note is that the definition of app is independent of the choice of element for the lists to be appended; these elements could be integers or strings or lists, and app would work the same way with all of them. SML is capable of recognizing this and it also allows you, the user, to use app at all these types. It tells you this by using the symbol 'a in the type it displays. Read symbols that begin with a quote of this kind as standing for 'any type that you want to fill in'. Of course, you have to fill in this symbol in the same way in the entire expression. Thus, app can be used to append two integer lists as in the examples discussed above or to append two string lists as in the example below:

- app ["abc", "def"] ["foo", "bar"];
val it = ["abc","def","foo","bar"] : string list
However, it cannot be used to append an integer list and a string list:
- app [1] ["bcd"];
stdIn:3.1-3.16 Error: operator and operand don't agree [literal]
  operator domain: int list
  operand:         string list
  in expression:
    (app (1 :: nil)) ("bcd" :: nil)
If you look at what SML tells you carefully, it is essentially complaining about the mismatch in the types of the two lists.
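
The same definition can be used at still other element types, as long as 'a is filled in the same way for both lists. A small sketch, in which the names nested and flags are hypothetical:

```sml
fun app [] L = L
  | app (X :: L1) L2 = (X :: (app L1 L2))

(* Here 'a is filled in as int list, so both arguments are lists of lists. *)
val nested = app [[1], [2,3]] [[4]]       (* [[1],[2,3],[4]] : int list list *)

(* Here 'a is filled in as bool. *)
val flags = app [true] [false, true]      (* [true,false,true] : bool list *)
```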

When you are finished with your SML session, you will want to exit. Under Unix, you do this by typing the end-of-file character, which is ^D (Control-D).

Created by ngopalan atsign umn dot edu. Last updated on January 8, 2023.

The views and opinions expressed in this page are strictly those of the page author(s). The contents of this page have not been reviewed or approved by the University of Minnesota.