Showing all posts tagged "Compiler"
Type Systems & Type Checking
Dynamic Type and Static Type
- Runtime vs Compile time
- Slow vs fast
- flexible vs inflexible
Strong Typing
- Sound type system: guarantees that a no type errors can occur at runtime
- cons
- Some checks can only be done dynamically (Array bounds checking)
- Such checking can cause serious performance degradation
- Even so-called strongly typed languages often have “holes" in the type system (e.g. variant records in Pascal)
Type Equivalence
- If types are required to match, the question of what types are considered to be “equivalent" arises
- Structural equivalence requires equivalent structures
- Identical basic types
- Same type constructor applied to equivalent types
- Name equivalence considered types to be equal only if the same names are used
- C uses structural equivalence except for records
Type Signatures
- The signature of a procedure is a description of the types of its arguments and type of its result.
- i.e) sqrt : real -> real, mod: integer X integer -> integer
Polymorphic Procedures
- A procedure that has multiple type signatures is called polymorphic
Failure Of Type Checking Due To Aliasing, Variant Records, Subroutine Calls
Citation:
-
Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
-
UT Austin CS375: Compilers by G.Novak.
Posted on May 11th, 2015
Record Structure Declarations and References
What Information Is Stored In Symbol Table From Declaration
- A record declaration has a form such as : record $field_1$, …, $field_n$ : $type_1$ ; … end
-
- Initialize offset within the record to be 0.
-
- For each entry group,
- (a) Find the symbol table entry for the type
- (b) Allocate storage within the record using the storage allocation algorithm
- (c) Make a symbol table entry for each field, filling in its print name, type, offset in the record, and size
- (d) Link the entries for the fields to an entry for the record
-
- The size of the record is the total size given by the storage allocation algorithm, rounded up to whole words, (e.g. multiple of 8)
-
- Variant records simply restart the storage allocation at the place where the variant part begins. Total size is the maximum size of the variants
Generation Of Intermediate Code For References To Records
type date = record mo : 1..12;
day : 1..31;
year : integer end;
person = record name : alfa;
ss : integer;
birth : date end;
var people : array[1..100] of person;
- people[i].birth.day
- people[i] : (aref people (i-1)32 ) = (aref people (+ -32 ( i 32) ) )
References Using Pointers
Records With Variant Parts
Citation:
-
Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
-
UT Austin CS375: Compilers by G.Novak.
Posted on May 11th, 2015
Array References:
Methods For Storing Arrays
Effect of Array Order and Loop Order On Efficiency
Computation Of Effective Address(Formula)
Why Optimization Of Array References Is Important
Citation:
-
Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
-
UT Austin CS375: Compilers by G.Novak.
Posted on May 11th, 2015
Topics In Parsing Programming Language Constructs:
Current Status Variable
Type Coercion
- When a binary operator is reduced, it is necessary to check the types of the arguments, possibly to coerce an argument to the correct type, and to infer the result type.
- char < int < float < double
- For most operators, right-most type takes precedence
- For assignment, left-most type takes precedence
Conversion of Names With Constant Compile-Time Values To Constants
Intermediate Code for Non-Arithmetic Statements
Citation:
-
Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
-
UT Austin CS375: Compilers by G.Novak.
Posted on May 11th, 2015
Example Questions For Final Exam
Adderessing
From Midterm!
Show how an operator precedence parser would parse the string (ex. A-(B/C-D)/E+F)
- Show the contents of the stacks at each step; produce a tree as output
Give one advantage and disadvantage of hashing as a method of symbol table organization
Consider the regular expression (a|b)b+b. What is the simplest regular expression that denotes the same language?
Give the allowable form of productions for a Regular grammar.
Consider the following grammar :
- S -> a S
- S -> S b
- S -> b
- What kind of grammar is this in the Chomsky hierarchy?
- What kind of language does it denote?
- Is there a simpler kind of grammar that denotes the same language? If so, give the grammar, if not, explain why not.
Briefly and clearly define the following terms : vocab
Other Questions
A robot moves on a square grid. The robot can go forward, turn left or turn right. Give a grammar to describe the language of all sequences of moves that leave the robot pointing in the same direction as when it started
What kind of grammar is the above(in the Chomsky hierarchy)?
Describe a kind of local optimization
Describe what it means for a subexpression to be (a) available (b) busy (c) killed
Describe sources of extra run-time overhead in (a) time and (b)space in an OOP language
Draw boxes around the following code to show the basic blocks:
n := k*2;
if n < j then write(‘less’) else begin k := k - 1; write(‘more’) end;
writeln(k);
- Number the blocks and draw a flow graph; give the matrix form of the flow graph and its transitive closure
Give an advantage and a disadvantage for (a) call by reference (b) call by value
What are the most important things to optimize in a scientific program? Why?
Give three examples of computer architecture innovations that require compiler innovations to be effective
Briefly and clearly define the following terms: … 20 terms chosen from the vocabulary list on the study guide
Citation:
-
Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
-
UT Austin CS375: Compilers by G.Novak.
Posted on May 11th, 2015
Other Concepts
Synthesized Translation
The term synthesized translation includes two ideas:
- A complex expression is constructed from subexpressions
- The translation of a complex expression is constructed in a mechanical way form the translations of its subexpressions
if x > max then max := x
is constructed of (x > max) and (max := x), therefore, ( if (> x max) (:= max x) )
Controllability and Observability
These are central concepts from control theory. We will define them as :
- Controllability: the ability to chance the behavior of a system by changing its parameters.
- Observability: the ability to observe the behavior of a system well enough to control it.
In order to control a system, both controllability and observability are required.
The implications For Large Software Systems
- Aspects of software that cannot easily be observed will never be debugged
- All Large Software Systems must have observability built in
- Observability is a requirement, not a luxury. The time spent building in observability will be well repaid.
- In-process traces can be turned on by setting bits in a bit vector.
Comparison of Parsing Techniques
Recursive Descent and Operator Precedence
Advantages:
- easy to control, understand
- Good error messages can be produced
Disadvantages:
- More code to write
- Grammar not as clear
Parser Generator (yacc)
Advantages
- Less code to write
- Grammar is clear and easy to change
Disadvantages:
- The programmer may get stuck on grammar errors that are hard to understand
- Error messages during compilation may be unhelpful and far from the actual error
Lexical Scoping
Scope is the region of program text over which a symbol can be referenced.
With lexical scoping, a name is defined when it can be looked up in the lexicon at the point of use.
Structure References
- Every expression has type
- Types from a tree structure (a graph when pointers to types are included, but still treated as a tree)
- The structure references in source code specify a traversal down the type tree
- A reference to a part of a structure depends on the type of the structure; the results of the reference are:
- An address expression for the substructure
- A new type for the substructure
Citation:
- Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
- UT Austin CS375: Compilers by G.Novak.
Posted on March 12th, 2015
Address Calculations
For Array References (Know the Fomulars)
A simple array declaration has a form such as:
* array[ $low_1$ .. $high_1$] of type
- Find the symbol table entry for type
- Make a symbol table entry for the array type. The total size of the array is: ($high_1$ - $low_1$ + 1) * size(type)
Address Calculations for Access to Fields of a Record
Determining Types of Accesses To Parts of a Structure
Be Able to Calculate the Size of An Array Records
Be Able to Calculate the Address Offsets For Access to an Array Element and Record Field
Citation:
- Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
- UT Austin CS375: Compilers by G.Novak.
Posted on March 12th, 2015
Data Area and Base-Offset Addressing
Storage Allocation Algorithm
Data area is contiguous region of storage specified by its base address and size.
Two kinds of data are are arrays and records.
Allocation of Data in Records
Allocation of storage is done as an offset to a base address, which is associated with a block of storage. Assignment of storage locations is done sequentially by a simple algorithm:
- next = 0
offset = next;
next = next + n;
return offset;
- Finally, next gives the total size of the block.
Word Alignment and Padding
storage alignment: Certain items must be allocated at restricted locations.
In such cases, next is advanced to next available boundary if needed, and the intervening(사이에 오는) storage is wasted; this is called padding
wordaddress(next, m) = ( (next + m - 1) / m) *m
Variant Records
A record declaration has a form such as : record $field_1$, …, $field_n$ : $type_1$ ; .. end *
Such a declaration is processed as follows :
- Initialize offset within the record to be 0.
- For each entry group,
- find the symbol table entry for the type
- allocate storage within the record using the storage allocation algorithm and size of type
- make a symbol table entry for each field, filling in its print name, type, offset in the record, and size.
- link the entries for the field to an entry for the record
- The size of the record is the total size given by the storage allocation algorithm, rounded up to whole words.
- Variant records simply restart the storage allocation at the place where the variant part begins. Total size is the maximum size of the variants.
Citation:
- Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
- UT Austin CS375: Compilers by G.Novak.
Posted on March 12th, 2015
Symbol Tables
Definition
Symbol table is a data structure that associates names with information about the objects that are denoted by the names.
Must be well organized:
- For fast lookup of symbols
- To reflect the organization of the program (block structure)
쉽게 생각하면, Lexicon = dictionary = symbol table.
Operations : insert / search / search-insert
Method of Organization
- Think of key-value dictionary.
- Linear, unsorted: Symbols are stored in an array; new symbols are inserted at the end.
- Linear, sorted: Symbols are stored in an array that is sorted.
- Hash table: typical hash table :: do you remember collision and rehash?
- Indexed Buckets: hashed with buckets / alphabetized buckets
- Tree of symbol table
- Stack symbol table
Tree and stack formats both have concepts of scoping: inheritance.
Speed
- Linear, unsorted: Insert O(1) Search O(n)
- Linear, sorted: Insert: O(n) Search(log n) binary search
- Hash table: Insert: O(1) Search(1)
Advantages
- Binary tree symbol tables have log n search time, so build AVL trees.
- Hash tables are super fast, don’t have to sort!
Disadvantages
- AVL tree is bit complicated.
- Hash Table is more complex, and hard to find hashing function, also complexity of rehashing.
Comparison
Be Able to select a method for a particular application.
Search is More Frequent than Insertion
Symbol Tables for Block-Structured Languages
Tree, Stack structure
What Things Are Put in Symbol Tables?
Symbols: statement labels, variables, constants, subprogram names
- Symbolic constants
- Enumerated types
- Subrange types
- Scalar types
- Variables
- Arrays
- Records
- Functions and procedures
- Labels
- Objects
Examples of Symbols in a Compiler
|variable|description|
|:———:|:—————:|
|link|link to next symbol|
|name string|name of symbol|
Deletion of Symbols
When it is needed
Relation to Table Organization Methods
Methods of Strong Identifier Names
User of Indirection or Auxiliary Table Structures
Citation:
- Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
- UT Austin CS375: Compilers by G.Novak.
Posted on March 12th, 2015
Shift-Reduce Parser
Use of Stack
Use stack to save statements
Four Actions of Shift-Reduce Parser
- Shift : Shift the current input onto the stack and go to the specified next state s.
- Reduce : Given the production r : A -> b, remove enough items from the stack to match b, produce output structure from those items, and put the resulting structure A back on the stack. The next state is found from the goto table.
- Accept : Parsing is completed! return top of stack as result.
- Error : The input cannot be parsed according to the grammar; call an error recovery routine. Error entries are denoted by blanks in the parsing table.
Relationship to Canonical Derivation
x
How Parse Tree is Constructed Using Shift-Reduce Parser
Conflicts
- Reduce/reduce: resolved by choosing the first listed production
- Shift/reduce: resolved in favor of shift
Reduce if the precedence of the production is greater than that of the terminal or if they are equal and the associativity of the production is left. - reduce의 결과가 터미널의 결과보다 우선순위가 높을 때.
- 만약 그 결과들이 같으면 associativity는 왼 쪽.
Shift otherwise.
Citation:
- Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
- UT Austin CS375: Compilers by G.Novak.
Posted on March 12th, 2015