Overview

Lesson plan

Course theme – first part

Course theme

“A career in computer science is a commitment to a lifetime of learning. You will not be taught every detail you will need in your career while you are a student. The goal of a computer science education is to give you the tools you need so you can teach yourself new languages, frameworks, and architectures as they come along. The creativity encouraged by a lifetime of learning makes computer science one of the most exciting fields today” (Kent D. Lee, ).

Course theme – second part

“Education should prepare young people for jobs that do not yet exist, using technologies that have not yet been invented, to solve problems of which we are not yet aware” (Richard Riley, former Secretary of Education).

What will we study?

Course theme

Tell me

And I will forget

Show me

And I may remember

Involve me

And I will understand

Why study programming languages?

History of programming languages

Early history

Structured programming, module systems, and the \(\lambda\)-calculus

Object orientation, dynamic languages, and the Internet

Classification of languages

Imperative languages

Imperative Languages, C

while (b > 0) {
  a = a \* 2;
  b = b - 1;
}

Object-oriented languages

Classes are complex data types grouped with operations (methods) for creating, examining, and modifying elements (objects); subclasses include (inherit) the objects and methods from superclasses

Object-oriented languages, C++

class Square {
  public:
    int x,y,h,l;
    Square() { x = y = h = l = 0; }
};

Applicative/Functional Languages

Programs as functions that take arguments and return values; arguments and returned values may be functions

Applicative/Functional Languages, OCaml

# let twice f x = f(f(x));;
# let inc x = x + 1;;
# inc 5;;
6
# twice inc 5;;
7

Logic/rule-based languages

- human(socrates).
- mortal(X) :- human(X).
-? mortal(Who).
Who = socrates

Domain-specific languages (DSLs)

Themes in language development

Language evaluation

Evaluation criteria

Readability

Writability

Reliability

Cost

Goals can conflict…

Compilation vs. interpretation

Language implementation

<a name='figure1.1'>Figure 1.1</a>. With the von Neuman computer architecture, the central processing unit reads the instructions and the data from memory,  writes the results of the operations in memory, and interacts with the input and output devices.
Figure 1.1. With the von Neuman computer architecture, the central processing unit reads the instructions and the data from memory, writes the results of the operations in memory, and interacts with the input and output devices.

Virtual Machines

<a name='figure1.2'>Figure 1.2</a>. The operating system and the compilers/interpreters create virtual machines on which the programs can run.
Figure 1.2. The operating system and the compilers/interpreters create virtual machines on which the programs can run.

Compilation

What is a compiler?

A compiler from language \(L_1\) to language \(L_2\) is a program that takes an \(L_1\) program and for each piece of code in \(L_1\) generates a piece of code in \(L_2\) with the same meaning

There are different ways to compile a program:

<a name='figure1.a'>Figure 1.a</a>. With traditional compilation the compiler translates the source program into an executable that runs on the target machine, the target program. The latter can then run and converts the input into the output.
Figure 1.a. With traditional compilation the compiler translates the source program into an executable that runs on the target machine, the target program. The latter can then run and converts the input into the output.

<a name='figure1.b'>Figure 1.b</a>. If the source program imports any libraries, the compiler first translates it into an incomplete machine language. The linker then combines it with the compiled library routines to produce an executable program in machine language.
Figure 1.b. If the source program imports any libraries, the compiler first translates it into an incomplete machine language. The linker then combines it with the compiled library routines to produce an executable program in machine language.

<a name='figure1.c'>Figure 1.c</a>. An alternative is for the compiler to translate the source program into assembly language, and then for the assembler to convert it into machine language.
Figure 1.c. An alternative is for the compiler to translate the source program into assembly language, and then for the assembler to convert it into machine language.

<a name='figure1.d'>Figure 1.d</a>. Traditionally, C code is first passed through the preprocessor which replaces the macros with their larger constructs, after which the compiler translates it into assembly language.
Figure 1.d. Traditionally, C code is first passed through the preprocessor which replaces the macros with their larger constructs, after which the compiler translates it into assembly language.

<a name='figure1.e'>Figure 1.e</a>. The C++ compiler takes the source code after it was expanded by the preprocessor, and converts into C code. This is then passed through the C compiler, which converts it into assembly language.
Figure 1.e. The C++ compiler takes the source code after it was expanded by the preprocessor, and converts into C code. This is then passed through the C compiler, which converts it into assembly language.

The compilation steps are shown in Figure 1.3, from Concepts of Programming Languages, by Robert W Sebesta.

<a name='figure1.3'>Figure 1.3</a>. The compilation process. The source code goes through the scanner (lexical analyzer), the parser (syntax analyzer), and the intermediate code generator (semantic analyzer). Then the code is optimized and converted to machine language.
Figure 1.3. The compilation process. The source code goes through the scanner (lexical analyzer), the parser (syntax analyzer), and the intermediate code generator (semantic analyzer). Then the code is optimized and converted to machine language.

Interpretation

What is an interpreter?

An interpreter of \(L_1\) in \(L_2\) is an \(L_2\) program that executes the meaning of a given \(L_1\) program

Interpreters and compilers must know about both the syntax of the language and its semantics, and must preserve the semantics (the meaning)

Different types of interpretation are shown below:

<a name='figure1.f'>Figure 1.f</a>. With traditional interpretation, the interpreter is the locus of control during the code execution. It takes the source program and the user input to generate the output.
Figure 1.f. With traditional interpretation, the interpreter is the locus of control during the code execution. It takes the source program and the user input to generate the output.

<a name='figure1.g'>Figure 1.g</a>. Some programming languages translate the source code into intermediate code that is then executed on a virtual machine.
Figure 1.g. Some programming languages translate the source code into intermediate code that is then executed on a virtual machine.

<a name='figure1.h'>Figure 1.h</a>. The Pascal compiler, written in Pascal, generates output in P-code. A P-code interpreter, written in Pascal, then translates this code into machine language.
Figure 1.h. The Pascal compiler, written in Pascal, generates output in P-code. A P-code interpreter, written in Pascal, then translates this code into machine language.

The interpretation steps are shown in Figure 1.5, from Concepts of Programming Languages, by Robert W Sebesta.

<a name='figure1.5'>Figure 1.5</a>. The interpretation process. The source code goes through the scanner (lexical analyzer), the parser (syntax analyzer), and the intermediate code generator (semantic analyzer). The interpreter then runs the intermediate code, gets the user input, and generates the results.
Figure 1.5. The interpretation process. The source code goes through the scanner (lexical analyzer), the parser (syntax analyzer), and the intermediate code generator (semantic analyzer). The interpreter then runs the intermediate code, gets the user input, and generates the results.

Major compiler phases

The compiler phases are shown in Figure 1.i, from Programming Language Pragmatics, by Michael L. Scott.

<a name='figure1.i'>Figure 1.i</a>. Compilation has two major components: the front-end (scanner, parser, and semantic analysis), and the back-end (optimization, code generation, and machine-specific optimization).
Figure 1.i. Compilation has two major components: the front-end (scanner, parser, and semantic analysis), and the back-end (optimization, code generation, and machine-specific optimization).

Compiler front-end

GCD program, C-like language

int main() {
  int i = getint(), j = getint();
  while (i != j) {
    if (i > j) i = i - j;
    else j = j - i;
  }
  puint(i);
}

Lexemes

int	main	(	)	{	int	i	=
getint	(	)	,	j	=	getint	(
)	;	while	(	i	!=	j	)
{	if	(	i	>	j	)	i
=	i	-	j	;	else	j	=
j	-	i	;	}	putint	(	i
)	;	}

The parser takes the lexemes generated by the scanner and generates a parse tree, as shown in Figure 1.j and Figure 1.k, and then reduces it to an abstract syntax tree, as shown in Figure 1.j (figures from Programming Language Pragmatics, by Michael L. Scott).

<a name='figure1.j'>Figure 1.j</a>. Part of the parse tree generated from lexemes.
Figure 1.j. Part of the parse tree generated from lexemes.

<a name='figure1.k'>Figure 1.k</a>. Part of the parse tree generated from lexemes.
Figure 1.k. Part of the parse tree generated from lexemes.

<a name='figure1.j'>Figure 1.j</a>. Most of the parse tree is removed to generate an abstract syntax tree.
Figure 1.j. Most of the parse tree is removed to generate an abstract syntax tree.

Compiler back-end

Sample IR

Source:

X = A + B + C;

Three address code:

%tmp.1 = A + B;
X = %tmp.1 + C;

Sample optimization

Source:

while (X != 0) {
  B = 5;
  C = X + B;
  X = ...
}

Target:

if (X != 0) B = 5;
while (X != 0) {
  C = X + B;
  X = ...
}

For further information

Summary

Summary

Next time: Python programming language, part 1 of 2
Reminders:


Back to course home