Sungwon Chung

Dynamic Type and Static Type

Runtime vs Compile time
Slow vs fast
flexible vs inflexible

Strong Typing

Sound type system: guarantees that a no type errors can occur at runtime
cons
- Some checks can only be done dynamically (Array bounds checking)
- Such checking can cause serious performance degradation
- Even so-called strongly typed languages often have “holes" in the type system (e.g. variant records in Pascal)

Type Equivalence

If types are required to match, the question of what types are considered to be “equivalent" arises
Structural equivalence requires equivalent structures
- Identical basic types
- Same type constructor applied to equivalent types
Name equivalence considered types to be equal only if the same names are used
C uses structural equivalence except for records

Type Signatures

The signature of a procedure is a description of the types of its arguments and type of its result.
i.e) sqrt : real -> real, mod: integer X integer -> integer

Polymorphic Procedures

A procedure that has multiple type signatures is called polymorphic

Failure Of Type Checking Due To Aliasing, Variant Records, Subroutine Calls

Citation:

Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
UT Austin CS375: Compilers by G.Novak.

Posted on May 11th, 2015

Record Structure Declarations and References

What Information Is Stored In Symbol Table From Declaration

A record declaration has a form such as : record $field_1$, …, $field_n$ : $type_1$ ; … end
1. Initialize offset within the record to be 0.
1. For each entry group,
2. (a) Find the symbol table entry for the type
3. (b) Allocate storage within the record using the storage allocation algorithm
4. (c) Make a symbol table entry for each field, filling in its print name, type, offset in the record, and size
5. (d) Link the entries for the fields to an entry for the record
1. The size of the record is the total size given by the storage allocation algorithm, rounded up to whole words, (e.g. multiple of 8)
1. Variant records simply restart the storage allocation at the place where the variant part begins. Total size is the maximum size of the variants

Generation Of Intermediate Code For References To Records

type date = record mo : 1..12;
day : 1..31;
year : integer end;
person = record name : alfa;
ss : integer;
birth : date end;

var people : array[1..100] of person;

people[i].birth.day
people[i] : (aref people (i-1)32 ) = (aref people (+ -32 ( i 32) ) )

References Using Pointers

Records With Variant Parts

Citation:

Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
UT Austin CS375: Compilers by G.Novak.

Posted on May 11th, 2015

Methods For Storing Arrays

Effect of Array Order and Loop Order On Efficiency

Computation Of Effective Address(Formula)

Why Optimization Of Array References Is Important

Citation:

Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
UT Austin CS375: Compilers by G.Novak.

Posted on May 11th, 2015

Topics In Parsing Programming Language Constructs:

Current Status Variable

Type Coercion

When a binary operator is reduced, it is necessary to check the types of the arguments, possibly to coerce an argument to the correct type, and to infer the result type.
char < int < float < double
For most operators, right-most type takes precedence
For assignment, left-most type takes precedence

Conversion of Names With Constant Compile-Time Values To Constants

Intermediate Code for Non-Arithmetic Statements

Citation:

Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
UT Austin CS375: Compilers by G.Novak.

Posted on May 11th, 2015

Example Questions For Final Exam

Adderessing

From Midterm!

Show how an operator precedence parser would parse the string (ex. A-(B/C-D)/E+F)

Show the contents of the stacks at each step; produce a tree as output

Give one advantage and disadvantage of hashing as a method of symbol table organization

Consider the regular expression (a|b)b+b. What is the simplest regular expression that denotes the same language?

Give the allowable form of productions for a Regular grammar.

Consider the following grammar :

S -> a S
S -> S b
S -> b
What kind of grammar is this in the Chomsky hierarchy?
What kind of language does it denote?
Is there a simpler kind of grammar that denotes the same language? If so, give the grammar, if not, explain why not.

Briefly and clearly define the following terms : vocab

Other Questions

A robot moves on a square grid. The robot can go forward, turn left or turn right. Give a grammar to describe the language of all sequences of moves that leave the robot pointing in the same direction as when it started

What kind of grammar is the above(in the Chomsky hierarchy)?

Describe a kind of local optimization

Describe what it means for a subexpression to be (a) available (b) busy (c) killed

Describe sources of extra run-time overhead in (a) time and (b)space in an OOP language

Draw boxes around the following code to show the basic blocks:

n := k*2;
if n < j then write(‘less’) else begin k := k - 1; write(‘more’) end;
writeln(k);

Number the blocks and draw a flow graph; give the matrix form of the flow graph and its transitive closure

Give an advantage and a disadvantage for (a) call by reference (b) call by value

What are the most important things to optimize in a scientific program? Why?

Give three examples of computer architecture innovations that require compiler innovations to be effective

Briefly and clearly define the following terms: … 20 terms chosen from the vocabulary list on the study guide

Citation:

Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
UT Austin CS375: Compilers by G.Novak.

Posted on May 11th, 2015

Other Concepts

Synthesized Translation

The term synthesized translation includes two ideas:

A complex expression is constructed from subexpressions
The translation of a complex expression is constructed in a mechanical way form the translations of its subexpressions
if x > max then max := x
is constructed of (x > max) and (max := x), therefore, ( if (> x max) (:= max x) )

Controllability and Observability

These are central concepts from control theory. We will define them as :

Controllability: the ability to chance the behavior of a system by changing its parameters.
Observability: the ability to observe the behavior of a system well enough to control it.

In order to control a system, both controllability and observability are required.

The implications For Large Software Systems

Aspects of software that cannot easily be observed will never be debugged
All Large Software Systems must have observability built in
Observability is a requirement, not a luxury. The time spent building in observability will be well repaid.
In-process traces can be turned on by setting bits in a bit vector.

Comparison of Parsing Techniques

Recursive Descent and Operator Precedence

Advantages:

easy to control, understand
Good error messages can be produced

Disadvantages:

More code to write
Grammar not as clear

Parser Generator (yacc)

Advantages

Less code to write
Grammar is clear and easy to change

Disadvantages:

The programmer may get stuck on grammar errors that are hard to understand
Error messages during compilation may be unhelpful and far from the actual error

Lexical Scoping

Scope is the region of program text over which a symbol can be referenced.

With lexical scoping, a name is defined when it can be looked up in the lexicon at the point of use.

Structure References

Every expression has type
Types from a tree structure (a graph when pointers to types are included, but still treated as a tree)
The structure references in source code specify a traversal down the type tree
A reference to a part of a structure depends on the type of the structure; the results of the reference are:
- An address expression for the substructure
- A new type for the substructure

Citation:

Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
UT Austin CS375: Compilers by G.Novak.

Posted on March 12th, 2015

Address Calculations

For Array References (Know the Fomulars)

A simple array declaration has a form such as:
* array[ $low_1$ .. $high_1$] of type

Find the symbol table entry for type
Make a symbol table entry for the array type. The total size of the array is: ($high_1$ - $low_1$ + 1) * size(type)

Address Calculations for Access to Fields of a Record

Determining Types of Accesses To Parts of a Structure

Be Able to Calculate the Size of An Array Records

Be Able to Calculate the Address Offsets For Access to an Array Element and Record Field

Citation:

Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
UT Austin CS375: Compilers by G.Novak.

Posted on March 12th, 2015

Data Area and Base-Offset Addressing

Storage Allocation Algorithm

Data area is contiguous region of storage specified by its base address and size.
Two kinds of data are are arrays and records.

Allocation of Data in Records

Allocation of storage is done as an offset to a base address, which is associated with a block of storage. Assignment of storage locations is done sequentially by a simple algorithm:

next = 0

offset = next;
next = next + n;
return offset;

Finally, next gives the total size of the block.

Word Alignment and Padding

storage alignment: Certain items must be allocated at restricted locations.
In such cases, next is advanced to next available boundary if needed, and the intervening(사이에 오는) storage is wasted; this is called padding

wordaddress(next, m) = ( (next + m - 1) / m) *m

Variant Records

A record declaration has a form such as : record $field_1$, …, $field_n$ : $type_1$ ; .. end *
Such a declaration is processed as follows :

Initialize offset within the record to be 0.
For each entry group,
- find the symbol table entry for the type
- allocate storage within the record using the storage allocation algorithm and size of type
- make a symbol table entry for each field, filling in its print name, type, offset in the record, and size.
- link the entries for the field to an entry for the record
The size of the record is the total size given by the storage allocation algorithm, rounded up to whole words.
Variant records simply restart the storage allocation at the place where the variant part begins. Total size is the maximum size of the variants.

Citation:

Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
UT Austin CS375: Compilers by G.Novak.

Posted on March 12th, 2015

Symbol Tables

Definition

Symbol table is a data structure that associates names with information about the objects that are denoted by the names.
Must be well organized:

For fast lookup of symbols
To reflect the organization of the program (block structure)
쉽게 생각하면, Lexicon = dictionary = symbol table.

Operations : insert / search / search-insert

Method of Organization

Think of key-value dictionary.
Linear, unsorted: Symbols are stored in an array; new symbols are inserted at the end.
Linear, sorted: Symbols are stored in an array that is sorted.
Hash table: typical hash table :: do you remember collision and rehash?
Indexed Buckets: hashed with buckets / alphabetized buckets
Tree of symbol table
Stack symbol table
Tree and stack formats both have concepts of scoping: inheritance.

Speed

Linear, unsorted: Insert O(1) Search O(n)
Linear, sorted: Insert: O(n) Search(log n) binary search
Hash table: Insert: O(1) Search(1)

Advantages

Binary tree symbol tables have log n search time, so build AVL trees.
Hash tables are super fast, don’t have to sort!

Disadvantages

AVL tree is bit complicated.
Hash Table is more complex, and hard to find hashing function, also complexity of rehashing.

Comparison

Be Able to select a method for a particular application.

Search is More Frequent than Insertion

Symbol Tables for Block-Structured Languages

Tree, Stack structure

What Things Are Put in Symbol Tables?

Symbols: statement labels, variables, constants, subprogram names

Symbolic constants
Enumerated types
Subrange types
Scalar types
Variables
Arrays
Records
Functions and procedures
Labels
Objects

Examples of Symbols in a Compiler

Deletion of Symbols

When it is needed

Relation to Table Organization Methods

Methods of Strong Identifier Names

User of Indirection or Auxiliary Table Structures

Citation:

Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
UT Austin CS375: Compilers by G.Novak.

Posted on March 12th, 2015

Shift-Reduce Parser

Use of Stack

Use stack to save statements

Four Actions of Shift-Reduce Parser

Shift : Shift the current input onto the stack and go to the specified next state s.
Reduce : Given the production r : A -> b, remove enough items from the stack to match b, produce output structure from those items, and put the resulting structure A back on the stack. The next state is found from the goto table.
Accept : Parsing is completed! return top of stack as result.
Error : The input cannot be parsed according to the grammar; call an error recovery routine. Error entries are denoted by blanks in the parsing table.

Relationship to Canonical Derivation

x

How Parse Tree is Constructed Using Shift-Reduce Parser

Conflicts

Reduce/reduce: resolved by choosing the first listed production
Shift/reduce: resolved in favor of shift
Reduce if the precedence of the production is greater than that of the terminal or if they are equal and the associativity of the production is left.
reduce의 결과가 터미널의 결과보다 우선순위가 높을 때.
만약 그 결과들이 같으면 associativity는 왼 쪽.
Shift otherwise.

Citation:

Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
UT Austin CS375: Compilers by G.Novak.

Posted on March 12th, 2015

Sungwon Chung@sungpi

Showing all posts tagged "Compiler"

Dynamic Type and Static Type

Strong Typing

Type Equivalence

Type Signatures

Polymorphic Procedures

Failure Of Type Checking Due To Aliasing, Variant Records, Subroutine Calls

Citation:

What Information Is Stored In Symbol Table From Declaration

Generation Of Intermediate Code For References To Records

References Using Pointers

Records With Variant Parts

Citation:

Methods For Storing Arrays

Effect of Array Order and Loop Order On Efficiency

Computation Of Effective Address(Formula)

Why Optimization Of Array References Is Important

Citation:

Current Status Variable

Type Coercion

Conversion of Names With Constant Compile-Time Values To Constants

Intermediate Code for Non-Arithmetic Statements

Citation:

Adderessing

From Midterm!

Show how an operator precedence parser would parse the string (ex. A-(B/C-D)/E+F)

Give one advantage and disadvantage of hashing as a method of symbol table organization

Consider the regular expression (a|b)b+b. What is the simplest regular expression that denotes the same language?

Give the allowable form of productions for a Regular grammar.

Consider the following grammar :

Briefly and clearly define the following terms : vocab

Other Questions

A robot moves on a square grid. The robot can go forward, turn left or turn right. Give a grammar to describe the language of all sequences of moves that leave the robot pointing in the same direction as when it started

What kind of grammar is the above(in the Chomsky hierarchy)?

Describe a kind of local optimization

Describe what it means for a subexpression to be (a) available (b) busy (c) killed

Describe sources of extra run-time overhead in (a) time and (b)space in an OOP language

Draw boxes around the following code to show the basic blocks:

Give an advantage and a disadvantage for (a) call by reference (b) call by value

What are the most important things to optimize in a scientific program? Why?

Give three examples of computer architecture innovations that require compiler innovations to be effective

Briefly and clearly define the following terms: … 20 terms chosen from the vocabulary list on the study guide

Citation:

Synthesized Translation

Controllability and Observability

The implications For Large Software Systems

Comparison of Parsing Techniques

Recursive Descent and Operator Precedence

Advantages:

Disadvantages:

Parser Generator (yacc)

Advantages

Disadvantages:

Lexical Scoping

Structure References

Citation:

For Array References (Know the Fomulars)

Address Calculations for Access to Fields of a Record

Determining Types of Accesses To Parts of a Structure

Be Able to Calculate the Size of An Array Records

Be Able to Calculate the Address Offsets For Access to an Array Element and Record Field

Citation:

Storage Allocation Algorithm

Allocation of Data in Records

Word Alignment and Padding

Variant Records

Citation:

Definition

Method of Organization

Speed

Advantages

Disadvantages

Comparison

Be Able to select a method for a particular application.

Search is More Frequent than Insertion

Symbol Tables for Block-Structured Languages

What Things Are Put in Symbol Tables?

Examples of Symbols in a Compiler

Deletion of Symbols