Sungwon Chung

Sungwon Chung@sungpi

Context-free Languages

Grammar Definition

A - a
A $\in$ N
a $\in$ V*

Since left-hand-side of each production is a single nonterminal, every derivation is a tree.

Memory Needed for Parsing.

Recursive program = a stack for temporary storage

Notations for defining the syntax of a language :

BNF
Pascal-style transition diagrams.

Ambiguous grammars VS unambiguous grammars

Ambiguous : grammar for which some sentence has more than one parse tree is ambiguous. There’s two way to deal with this. Re-write the grammar and User disambiguating rules to guide the parser.

Augmented transition network;Yacc

  • Arbitrary tests : can be added to the arcs.
  • Structure-building actions : can be added to the arcs. These actions may save information in registers to be used later by the parser, or to build the representation of the meaning of the sentence. Transformations, can also be handled.
  • Phrase names : as well as part-of-speech names, may appear on arcs. This allows a grammar to be called as a subroutine.

What kinds of language constructs are context-free?

Arithmetic expressions, noun pharses.
Any language that requires counting must be at least context free.

Citation:

  • Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
  • UT Austin CS375: Compilers by G.Novak.

Floating Point

[http://steve.hollasch.net/cgindex/coding/ieeefloat.html]

Lexical Analyzer

What constructs are parsed

Type of grammar used

Use regular expressions.

Need for speed?

NFA to DFA, and converted to tables to control a table-driven parser.

lex

Citation:

  • Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
  • UT Austin CS375: Compilers by G.Novak.

Regular Expression

Definition

Regular expresssions are a more convenient way(than a regular grammar) to specify a regular language.

Construct a Regular Expression for a Language

in lex rule.
[chars] : Any number of the set of characters chars
[c1-c2] : Any character from c1 through c2
[^chars] : not chars
{category} : An instance of a previously named category
“string" : Exactly the specified string.
s1 | s2 : s1 or s2
a : zero or more a
a+ : one or more a
a? : zero or one a
* a{2,5} : 2 to 5 a

Construct the Language Denoted by a Regular Expression

Follow the rule.

Relation Between Regular Expression and Finite Automata

Always can change into DFA.

Automatic Construction of Finite Automata from Regular Expressions

Citation:

  • Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
  • UT Austin CS375: Compilers by G.Novak.

Regular Languages 정규 언어

Grammar?

Formal Definition of a Grammar

Grammar specifies the legal syntax of a language. A grammar specifies a set of productions; non-terminal symbols. A grammar describes the structure of the sentences of a language in terms of components or phrases. The mathematical description of phrase structure grammars is due to Chomsky.
문법은 언어의 syntax(the rules by which legitimate statements can be constructed)를 나타내는 규칙이다. 문법은 production으로 이루어져 있다.

Formally, a Grammar is a four-tuple G = (T, N, S, P) where:

  • T is the set of terminal symbols or words of the language.
  • N is the set of nonterminal symbols or phrase names that are used in specifying the grammar. We say V = T $\cup$ N is the vocabulary of the grammar.
  • S is a distinguished element of N called the start symbol.
  • P is a set of productions, P $\subseteq V^NV^ \texttimes V^*$. We write productions in the form a $\textrightarrow$ b where a is a string of symbols from V containing at lest one nonterminal and b is any string of symbols from V.
  • V = Vocabulrary
    문법은 언어의 구조가 올바른지를 판단하는 거다.

Chart about Grammar Classes.

  • General Phrase-structure
  • Context Sensitive
  • Context Free pushdown stack O($n^3$)
  • Regular finite storage O(n)
    We’ll focus on Context-Free and Regular Language.

Regular Languages

Grammar definitions

A - xB.
A - x.
A, B $\in$ N
x $\in T^*$
Only one nonterminal can appear in any derived string, and it must appear at the rightmost.

Uses

In compilers, we use to parse the stream of characters into tokens. Cause parser never has to back up or do search. O(n); Linear parsing time. Indentifiers, numbers, word forms.

Local Ambiguity

Examples

A - xB
B - x

Way to Handle

Make test cases to pre-process it.

Relationship of finite automata and regular grammars

Equivalent to a deterministic finite automaton

Linear nature of a derivation from a regular grammar

Since only one nonterminal can appear in any derived string, and the rightmost end, parse tree has only one step to evaluate it, produces right-biased tree.

Citation:

  • Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
  • UT Austin CS375: Compilers by G.Novak.

Major Phases of a Compiler 컴파일러의 주요 단계들

What the phases do 각 단계는 무엇을 하나?

Lexical Analyzer

or scanning
Reads the stream of characters and group the characters into meaningful sequences called lexemes.
글자들을 읽어서 글자들에 의미를 부여하는 단계; 문장에서 단어를 잘라준다고 생각하면 편하다.
ex a = 1 + 2 는 a,=,1,+,2 5개의 token이 된다.

Syntax Analyzer

or parsing
The parser uses the first components of the tokens produced by the lexical analyzer to produce a tree-like intermediate representation that describes the grammatical structure of the token stream.
Parser는 lexer가 만들어준 토큰을 가지고, 문장에 의미를 부여하기 시작한다. (컴퓨터 과학에서는 tree로 token을 엮어낸다.)

Semantic Analyzer

Semantic analyzer uses the syntax tree and the information in the symbol table to check the source program for semantic consistency with the language definition.
Parser가 만들어준 트리로, 이 것이 과연 문법적으로 맞는 놈인지 문법을 검사한다.

Intermediate Code Generator
Machine-independent Code optimizer
Code Generator
Machine-dependent Code Optimizer

Data flow between phases 각 단계별 데이터의 흐름

  • Character Stream : Before lexical analysis ; lexer는 character들을 받아서
  • Token : Before syntax analysis ; token을 생성하고
  • Syntax Tree : before intermediate code generation ; parser는 syntax tree를 만들며
  • Intermediate representation : before code generation ; 코드 제네레이터는 중간 코드를 생성하고
  • Target Machine Code : after code generation ; 코드 제네레이터는 컴퓨터가 읽을수 있는 형태로 뽑아준다.

Citation:

  • Aho, Lam, Sethi, & Ullman, Compilers: Principles, Techniques, and Tools
  • UT Austin CS375: Compilers by G.Novak.

Lexical Analysis

Lexical Analysis
3.1 The Role of the Lexical Analyzer
  1. To read the input characters of the source program

  2. Group them into lexemes

  3. Produce as output a sequence of tokens for each lexeme in the source program

Scanning and Lexical analysis

3.1.1 Lexical Analysis Versus Parsing

Why do we separate lexer and parser?

  1. Simplicity of design is the most important consideration

  2. Compiler efficiency is improved

  3. Compiler portability is enhanced
    (only lexer has to deal with device-input)

3.1.2 Tokens, Patterns, and Lexemes

ex printf(“Hello World!");

printf and score are lexemes matching the pattern for token id.

미국의 해커톤

MLB ( Major League Hacking) 에 속해있는 HackDFW를 다녀와서.
  1. 수상한 팀들은 거의 모두가 기계 장비와 관련 있었다.
  2. 자기 아이디어를 설명하는게 더 중요한 것 같다.
  3. 똑똑한 친구들 정말 많다.

DB Scheme for Who’s Hungry?

There’s two strong entity, User, Restaurant.
Any other tables are weak entities.

Abstract

User table hold data of user, including it’s Facebook id, name, profile picture, contact number. Device table belongs to User table, and saves User’s device data in case user has many devices, mainly exists for push notification. Session table exists for security issue, belongs to User table. Location table exists for saving user’s location, reason this table is separated is, for encrypting issue. Group table exists to re-use the group that people made. Crew table exists holding data between User and Group, for N to N relationship. Vote table is for every single information of votes besides information of restaurant. Choice table holds data between Restaurant and Vote table, for counting issue. (to be updated)

User
Device
Restaurant
Group
Vote
Location
Auth
Subscription
Crew
Rsvp
Chat
Choice
Rating

User

  • id : PK
  • fb_id : string indexed Facebook_id*
  • name : string Name of the user
  • picture : string URL of the user’s profile picture
  • contact : string indexed Phone number of the user*
  • created_at : datetime
  • updated_at : datetime

Device

Belongs to User

  • id : PK id of device in our server
  • os_type : string {ios, android, web} default : web
  • push_id : string push_id
  • (user_id)
  • created_at
  • updated_at

Restaurant

  • id : PK id of restaurant in our server, not Google Place API
  • place_id : string indexed id of restaurant in Google Place API
  • lat : float Latitude of the restaurant location
  • lng : float Longitude of the restaurant location
  • name : string Name of the restaurant
  • rating : float Rating of the restaurant
  • created_at
  • updated_at

Group

  • id : PK
  • lead_user_id : user_id Leader of the group
  • name : string NULL
  • created_at
  • updated_at

Vote

  • id : PK
  • (group_id) :
  • type : string {lunch, dinner, cafe, bar}
  • name : string Name of the vote
  • expires_in : integer Expires in # minute
  • winner: restaurant_id FK to restaurant(id)
  • expires_at : datetime
  • created_at
  • updated_at

Location

  • id : PK
  • (user_id)
  • lat : float
  • lng : float

Auth

Belongs to User

  • user_id
  • login_type
  • token
  • created_at
  • updated_at

Subscription (User-User)

Use to look up mutual friends, invitation, etc..
Recursive relations between User

  • id : PK
  • friend_from : user_id indexed
  • friend_to : user_id

Crew (User- Group)

Belongs to User and Group

  • id
  • user_id
  • group_id

Rsvp (User-Vote)

Belongs to User and Vote

  • id
  • user_id
  • vote_id
  • rsvp : integer {1: going, -1: not going, 0: Dunno}

Chat (User-Vote)

Belongs to User and Vote

user_id
vote_id
chat : text
created_at
*updated_at

Choice (Vote-Restaurant)

Belongs to User and Restaurant

  • restaurant_id
  • user_id
  • count

Rating (User-Restaurant)

Belongs to User and Restaurant

  • restaurant_id
  • user_id
  • rating

Development Environment for Who’s Hungry?

1. Setting up Ruby

visit

http://rvm.io

and install RVM through curl.
‘’'bash

$ \curl -sSL https://get.rvm.io | bash -s stable

‘''
Then, you enable bash_profile for later install.

$ source .bash_profile

We’re currently using Ruby 2.2.0 version, therefore

$ rvm install 2.2.0

2. Setting up Rails

Install rails 4.2.0 , with command (with out documentations, and two “-“s in front of no)
‘’’bash
gem install rails —no-ri —nordoc
‘''

3. Install PostgreSQL

brew install postgresql

To have launchd start postgresql at login:

ln -sfv /usr/local/opt/postgresql/*.plist ~/Library/LaunchAgents

Then to load postgresql now:

launchctl load ~/Library/LaunchAgents/homebrew.mxcl.postgresql.plist

Or, if you don't want/need launchctl, you can just run:

postgres -D /usr/local/var/postgres