IDE Development Course
Andrew Vasilyev
Syntax Directed Translation (SDT) is a method of analyzing code or translating it from one language to another, for example to machine code, by attaching a specific action to each grammar rule. When the parser applies a rule, its action is carried out.
This can be done during parsing itself, but it is easier to first build an Abstract Syntax Tree (AST) and then traverse it, performing the attached action for each node based on its type.
Converting a Concrete Syntax Tree (CST) to an AST uses the same method.
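As a rough illustration of actions attached to grammar rules, here is a minimal sketch of a recursive-descent parser that evaluates an expression while parsing it. The grammar, the `ActionParser` class and the `evaluate` helper are assumptions made for this example and are not part of the course code.

```kotlin
// Hypothetical grammar used only for this sketch:
//   expr -> term ('+' term)*
//   term -> NUMBER ('*' NUMBER)*
// Each rule's action runs as soon as the parser applies that rule.
class ActionParser(private val tokens: List<String>) {
    private var pos = 0

    // Action for `expr`: add up the values produced by `term`.
    fun expr(): Int {
        var value = term()
        while (pos < tokens.size && tokens[pos] == "+") {
            pos++            // consume '+'
            value += term()  // semantic action: addition
        }
        return value
    }

    // Action for `term`: multiply the numbers it matches.
    private fun term(): Int {
        var value = number()
        while (pos < tokens.size && tokens[pos] == "*") {
            pos++            // consume '*'
            value *= number()
        }
        return value
    }

    private fun number(): Int = tokens[pos++].toInt()
}

// Tokens are assumed to be separated by single spaces.
fun evaluate(source: String): Int = ActionParser(source.split(" ")).expr()
```

Calling `evaluate("3 + 4 * 5")` yields 23 without ever building a tree. Building an AST first, as the later slides do, keeps parsing and actions separate.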
Attributes are the properties assigned to the nodes of the syntax tree, used in Syntax Directed Translation for semantic analysis.
Annotated AST = AST + Attributes
Inherited attributes are computed from the attribute values of their parent and/or sibling nodes. (top and siblings -> bottom)
int x = 10;
{
int y = x + 5;
}
%%{ init: { 'theme': 'base', 'themeVariables': { 'fontSize': '24px', 'darkmode': true, 'lineColor': '#F8B229' } } }%%
graph TD;
A["Program (Global)"]
B["Declaration: int x = 10"]
C["Block (Block 1)"]
D["Declaration: int y = x + 5"]
E["Expression: x + 5"]
F["Variable: x"]
G["Constant: 5"]
A --> B
%% Inherited Attributes
B -->|Scope: Global| C
C -->|Scope: Block 1| D
D -->|Scope: Block 1| E
E -->|Scope: Block 1| F
E -->|Scope: Block 1| G
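The scope example can be sketched in Kotlin as a function that passes the current scope down to each child. The `Ast`, `Block` and `Decl` types and the `collectScopes` function are hypothetical names introduced only for this sketch.

```kotlin
// Hypothetical node types for this sketch only.
sealed class Ast
data class Block(val name: String, val children: List<Ast>) : Ast()
data class Decl(val variable: String) : Ast()

// `scope` is the inherited attribute: every node receives it from its parent.
fun collectScopes(
    node: Ast,
    scope: String,
    out: MutableList<String> = mutableListOf()
): List<String> {
    when (node) {
        is Decl -> out.add("${node.variable} in $scope")
        // A block replaces the inherited scope with its own name for its children.
        is Block -> node.children.forEach { collectScopes(it, node.name, out) }
    }
    return out
}

// Mirrors the example: x is declared globally, y inside a nested block.
val program = Block("Global", listOf(Decl("x"), Block("Block 1", listOf(Decl("y")))))
```

`collectScopes(program, "none")` produces `["x in Global", "y in Block 1"]`. Each declaration learns its scope from above, never from its own children.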
Synthesized attributes are computed from the attribute values of their child nodes (bottom -> top).
3 + 4 * 5
%%{ init: { 'theme': 'base', 'themeVariables': { 'fontSize': '24px', 'darkmode': true, 'lineColor': '#F8B229' } } }%%
graph TD;
A["Expression"]
B["Addition"]
C["3 (int)"]
D["Multiplication"]
E["4 (int)"]
F["5 (int)"]
%% Synthesized Attributes
C -->|"Type: int"| B
E -->|"Type: int"| D
F -->|"Type: int"| D
D -->|"Type: int"| B
B -->|"Type: int"| A
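A possible sketch of a synthesized attribute: the type of an expression is computed from the types of its children. The `Expr` hierarchy, the `Ty` enum and the widening rule (int combined with double gives double) are assumptions for this example.

```kotlin
// Hypothetical expression nodes for this sketch only.
sealed class Expr
data class IntLit(val value: Int) : Expr()
data class DoubleLit(val value: Double) : Expr()
data class Binary(val op: Char, val left: Expr, val right: Expr) : Expr()

enum class Ty { INT, DOUBLE }

// The type of a node is synthesized from the types of its children.
fun typeOf(expr: Expr): Ty = when (expr) {
    is IntLit -> Ty.INT
    is DoubleLit -> Ty.DOUBLE
    is Binary -> {
        val left = typeOf(expr.left)
        val right = typeOf(expr.right)
        // Assumed rule: the result is int only if both operands are int.
        if (left == Ty.INT && right == Ty.INT) Ty.INT else Ty.DOUBLE
    }
}
```

For `3 + 4 * 5` every leaf is `INT`, so the multiplication and then the addition are `INT` as well, matching the diagram.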
Basically, SDT is a depth-first traversal of the AST.
Calculate and save inherited attributes while moving down the tree.
Calculate and save synthesized attributes while moving back up.
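Both directions fit into one depth-first pass. In the sketch below, `depth` stands in for an inherited attribute and `height` for a synthesized one. `TreeNode` and `annotate` are hypothetical names chosen for this example.

```kotlin
// Hypothetical annotated tree node: attributes are stored on the node itself.
class TreeNode(val label: String, val children: List<TreeNode> = emptyList()) {
    var depth = 0   // inherited: set on the way down
    var height = 0  // synthesized: set on the way back up
}

fun annotate(node: TreeNode, depth: Int = 0): Int {
    // Moving down: the value comes from the parent.
    node.depth = depth
    val childHeights = node.children.map { annotate(it, depth + 1) }
    // Moving up: the value is computed from the children (a leaf has height 0).
    node.height = (childHeights.maxOrNull() ?: -1) + 1
    return node.height
}
```

For the tree of `3 + 4 * 5`, the root gets depth 0 and height 2, while the leaf `5` gets depth 2 and height 0.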
The Visitor Pattern separates an algorithm from an object structure on which it operates, allowing for the execution of operations on objects without changing their classes.
By implementing a visitor, you can define new operations on the AST nodes, which is crucial for SDT. Each node in the AST can accept a visitor, which carries out the operation defined for that node's type.
sealed class Node {
abstract fun accept(visitor: Visitor): Int
}
data class IntegerNode(val value: Int) : Node() {
override fun accept(visitor: Visitor): Int {
return visitor.visit(this)
}
}
data class AdditionNode(val left: Node, val right: Node) : Node() {
override fun accept(visitor: Visitor): Int {
return visitor.visit(this)
}
}
data class MultiplicationNode(val left: Node, val right: Node) : Node() {
override fun accept(visitor: Visitor): Int {
return visitor.visit(this)
}
}
interface Visitor {
fun visit(node: IntegerNode): Int
fun visit(node: AdditionNode): Int
fun visit(node: MultiplicationNode): Int
}
class EvaluationVisitor : Visitor {
override fun visit(node: IntegerNode): Int {
return node.value
}
override fun visit(node: AdditionNode): Int {
return node.left.accept(this) + node.right.accept(this)
}
override fun visit(node: MultiplicationNode): Int {
return node.left.accept(this) * node.right.accept(this)
}
}
fun main() {
val expression = AdditionNode(
IntegerNode(3),
MultiplicationNode(
IntegerNode(4),
IntegerNode(5)
)
)
val visitor = EvaluationVisitor()
val result = expression.accept(visitor)
println(result) // Output: 23
}
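To show what "new operations without changing the classes" can look like, here is a sketch of a second visitor that prints an expression instead of evaluating it. It uses a generic `ExprVisitor<R>` so that visitors may return something other than `Int`. That generic interface and the shortened node names are assumptions for this example, not the course's definitions.

```kotlin
// Hypothetical generic visitor: R is the result type of the operation.
interface ExprVisitor<R> {
    fun visit(node: Num): R
    fun visit(node: Add): R
    fun visit(node: Mul): R
}

sealed class Expr {
    abstract fun <R> accept(visitor: ExprVisitor<R>): R
}

data class Num(val value: Int) : Expr() {
    override fun <R> accept(visitor: ExprVisitor<R>): R = visitor.visit(this)
}

data class Add(val left: Expr, val right: Expr) : Expr() {
    override fun <R> accept(visitor: ExprVisitor<R>): R = visitor.visit(this)
}

data class Mul(val left: Expr, val right: Expr) : Expr() {
    override fun <R> accept(visitor: ExprVisitor<R>): R = visitor.visit(this)
}

// A new operation, added without touching the node classes above.
class PrintVisitor : ExprVisitor<String> {
    override fun visit(node: Num): String = node.value.toString()

    override fun visit(node: Add): String =
        "(${node.left.accept(this)} + ${node.right.accept(this)})"

    override fun visit(node: Mul): String =
        "(${node.left.accept(this)} * ${node.right.accept(this)})"
}
```

`Add(Num(3), Mul(Num(4), Num(5))).accept(PrintVisitor())` returns `"(3 + (4 * 5))"`. An evaluator would be another `ExprVisitor<Int>` over the same nodes.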
data class SymbolTable(val table: MutableMap<String, Int> = mutableMapOf(), val parent: SymbolTable? = null) {
fun nest(): SymbolTable {
return SymbolTable(table.toMutableMap(), this)
}
fun set(name: String, value: Int) {
table[name] = value
}
operator fun get(name: String): Int? {
return table[name] ?: parent?.get(name)
}
}
sealed class Node {
abstract fun accept(visitor: Visitor, symbolTable: SymbolTable): Int
}
data class BlockNode(val statements: List<Node>) : Node() {
override fun accept(visitor: Visitor, symbolTable: SymbolTable): Int {
return visitor.visit(this, symbolTable)
}
}
data class IntegerNode(val value: Int) : Node() {
override fun accept(visitor: Visitor, symbolTable: SymbolTable): Int {
return visitor.visit(this, symbolTable)
}
}
data class VariableNode(val name: String) : Node() {
override fun accept(visitor: Visitor, symbolTable: SymbolTable): Int {
return visitor.visit(this, symbolTable)
}
}
data class DeclarationNode(val name: String, val expression: Node) : Node() {
override fun accept(visitor: Visitor, symbolTable: SymbolTable): Int {
return visitor.visit(this, symbolTable)
}
}
data class AdditionNode(val left: Node, val right: Node) : Node() {
override fun accept(visitor: Visitor, symbolTable: SymbolTable): Int {
return visitor.visit(this, symbolTable)
}
}
data class MultiplicationNode(val left: Node, val right: Node) : Node() {
override fun accept(visitor: Visitor, symbolTable: SymbolTable): Int {
return visitor.visit(this, symbolTable)
}
}
interface Visitor {
fun visit(node: BlockNode, symbolTable: SymbolTable): Int
fun visit(node: IntegerNode, symbolTable: SymbolTable): Int
fun visit(node: VariableNode, symbolTable: SymbolTable): Int
fun visit(node: DeclarationNode, symbolTable: SymbolTable): Int
fun visit(node: AdditionNode, symbolTable: SymbolTable): Int
fun visit(node: MultiplicationNode, symbolTable: SymbolTable): Int
}
class EvaluationVisitor : Visitor {
override fun visit(node: BlockNode, symbolTable: SymbolTable): Int {
var lastResult = 0
val nestedSymbolTable = symbolTable.nest()
for (statement in node.statements) {
lastResult = statement.accept(this, nestedSymbolTable)
}
return lastResult
}
override fun visit(node: IntegerNode, symbolTable: SymbolTable): Int {
return node.value
}
override fun visit(node: VariableNode, symbolTable: SymbolTable): Int {
return symbolTable[node.name] ?: error("Undefined variable: ${node.name}")
}
override fun visit(node: DeclarationNode, symbolTable: SymbolTable): Int {
val value = node.expression.accept(this, symbolTable)
symbolTable.set(node.name, value)
return value
}
override fun visit(node: AdditionNode, symbolTable: SymbolTable): Int {
return node.left.accept(this, symbolTable) + node.right.accept(this, symbolTable)
}
override fun visit(node: MultiplicationNode, symbolTable: SymbolTable): Int {
return node.left.accept(this, symbolTable) * node.right.accept(this, symbolTable)
}
}
fun main() {
val program = BlockNode(
listOf(
DeclarationNode("x", IntegerNode(3)),
BlockNode(
listOf(
DeclarationNode("y", IntegerNode(4)),
BlockNode(
listOf(
DeclarationNode("z", AdditionNode(VariableNode("x"), VariableNode("y")))
)
)
)
)
)
)
val visitor = EvaluationVisitor()
val result = program.accept(visitor, SymbolTable())
println(result) // Output: 7
}
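`SymbolTable.nest()` above copies the parent's entries into the new table. An alternative sketch, assuming a hypothetical `Scope` class, keeps only local declarations and relies on the parent link for everything else, which makes shadowing easy to observe.

```kotlin
// Hypothetical scope chain: each scope stores only its own declarations.
class Scope(private val parent: Scope? = null) {
    private val values = mutableMapOf<String, Int>()

    fun declare(name: String, value: Int) {
        values[name] = value
    }

    // Look in the current scope first, then walk outward through the parents.
    fun lookup(name: String): Int? = values[name] ?: parent?.lookup(name)
}

fun shadowingDemo(): List<Int?> {
    val global = Scope()
    global.declare("x", 1)
    val inner = Scope(parent = global)
    inner.declare("x", 2)  // shadows the outer x
    inner.declare("y", 3)  // visible only inside the inner scope
    return listOf(inner.lookup("x"), global.lookup("x"), global.lookup("y"))
}
```

`shadowingDemo()` returns `[2, 1, null]`: the inner `x` hides the outer one, and `y` does not leak into the global scope.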
A type is:
A set of values
A set of operations defined on these values
A type error occurs when an operation is performed on a value for which that operation is not defined.
A type system is a collection of rules that assign types to values and expressions and determine which operations are allowed on them. It plays a pivotal role in ensuring that the program behaves as expected by restricting the operations that can be performed on different types of data.
Type checking is the procedure by which the type system verifies the correctness of the program, checking whether the operations performed on values are allowed based on their types. When performed statically, it catches type errors before the program is run, which aids in debugging and ensuring the program's reliability.
Type safety is a characteristic of a programming language that ensures type errors are either prevented or detected, providing a layer of reliability and predictability in the code. It ensures that operations performed are semantically correct according to the type system rules.
Memory safety is a feature that prevents programs from accessing memory outside of their allocated space, which could lead to unpredictable behavior or security vulnerabilities. A strong type system can aid in achieving memory safety by enforcing strict rules on data operations.
Polymorphism allows values to be treated as instances of multiple types. Through polymorphism, different types can share the same interface, enabling a unified way of accessing a variety of data types, which can simplify code and promote reusability and flexibility.
Typeless: Types are not checked.
Static Typing: Types are checked at compile time.
Dynamic Typing: Types are checked at runtime.
Aspect | Typeless | Static Typing | Dynamic Typing |
---|---|---|---|
Error Detection | No error detection | Early detection at compile-time | Late detection at run-time |
Debugging | Very hard | Easy | Harder |
Performance | Best | Better | May be slower due to runtime checks |
Code Verbosity | Depends | More verbose | Less verbose |
Static analysis | Very hard | Easy | Very hard |
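The difference between compile-time and run-time checks can be sketched in Kotlin itself. `staticLength` and `dynamicLength` are hypothetical helpers written for this example.

```kotlin
// Static typing: the parameter type is known to the compiler, so a call such as
// staticLength(42) is rejected before the program ever runs.
fun staticLength(s: String): Int = s.length

// Dynamic-style checking: the declared type is Any, so the check on the actual
// value can only happen while the program is running.
fun dynamicLength(value: Any): Int = when (value) {
    is String -> value.length
    else -> throw IllegalArgumentException("length is not defined for value $value")
}
```

`dynamicLength("abc")` returns 3, while `dynamicLength(42)` compiles and fails only when executed. This is the "late detection" column of the table.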
Type coercion is the process of converting a value from one type to another.
var a = "42";
var b = a * 1; // "42" is coerced to 42
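Kotlin does not coerce strings to numbers implicitly, so an interpreter for a weakly typed language has to spell the conversion rules out. The sketch below is loosely modeled on the JavaScript example and is not an exact copy of JavaScript's rules. `toNumber` and `multiply` are hypothetical helpers.

```kotlin
// Assumed coercion rules for this sketch: numbers pass through, numeric strings
// are parsed, booleans become 0 or 1, and everything else becomes NaN.
fun toNumber(value: Any): Double = when (value) {
    is Number -> value.toDouble()
    is String -> value.trim().toDoubleOrNull() ?: Double.NaN
    is Boolean -> if (value) 1.0 else 0.0
    else -> Double.NaN
}

// The operator coerces both operands before doing the arithmetic.
fun multiply(a: Any, b: Any): Double = toNumber(a) * toNumber(b)
```

With these rules `multiply("42", 1)` gives `42.0`, and `multiply("abc", 1)` gives `NaN` instead of raising a type error.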
Type punning is reinterpreting the underlying bit representation of a value as a value of a different type.
union {
float f;
int i;
} pun;
pun.f = 3.14f;
int pi_approx = pun.i; // Type punning
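For comparison, Kotlin exposes the same reinterpretation through the standard library functions `Float.toRawBits()` and `Float.fromBits()`, with no union involved. The wrapper names `floatBits` and `bitsToFloat` are made up for this sketch.

```kotlin
// Reinterpret the IEEE 754 bit pattern of a Float as an Int, and back again.
fun floatBits(f: Float): Int = f.toRawBits()

fun bitsToFloat(bits: Int): Float = Float.fromBits(bits)
```

For example, `floatBits(1.0f)` is `0x3F800000` and `floatBits(3.14f)` is `0x4048F5C3`. No numeric conversion takes place, only a different reading of the same 32 bits.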
Strong and weak typing refer to how strictly types are enforced in a programming language.
Strong Typing:
Types are enforced strictly.
Operations on mismatched types are rejected instead of being silently converted.
Examples: Java, C++, Rust.
Weak Typing:
Types are enforced more loosely.
Type coercion allows operations between mismatched types.
Examples: JavaScript, PHP.
Static and dynamic type checking refer to when types are checked.
Static Type Checking:
Performed at compile-time.
Catches type errors before the program runs.
Dynamic Type Checking:
Performed as the program executes.
Type errors are detected at runtime and can lead to runtime exceptions.
Null pointers and null references are common concepts in programming, representing the absence of a value or reference. However, they can lead to runtime errors if not handled properly. Tony Hoare, who introduced null references, later called them his "billion-dollar mistake" because of the myriad bugs they led to.
What are the semantics of this "value"?
Why do different types allow the assignment of the same "value"?
What operations are applicable to this "value"?
How can we statically check whether an operation is applicable to a "value"?
Null Object Pattern: Utilize a null object that encapsulates the absence of a value or object, yet still conforms to the expected interface, thereby avoiding null reference errors.
Optional Types (Optional<>): Introduce optional types that clearly indicate the possibility of absence of a value, making the code more self-explanatory and safe.
Nullable Reference Types (as seen in C# and Kotlin): Employ nullable reference types that must be explicitly declared, making it clear when a reference could be null and requiring developers to handle the null case.
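Two of these approaches can be sketched together in Kotlin: a Null Object that stands in for a missing collaborator, and a nullable type that the compiler forces the caller to handle. `Logger`, `NoOpLogger`, `displayName` and `greet` are hypothetical names for this example.

```kotlin
interface Logger {
    fun log(message: String)
}

// Null Object: a do-nothing implementation used instead of a null reference.
object NoOpLogger : Logger {
    override fun log(message: String) {
        // Intentionally empty.
    }
}

// Nullable reference type: `String?` must be handled before it can be used.
fun displayName(name: String?): String = name?.trim() ?: "anonymous"

// Callers that have no logger simply omit it; no null check is needed here.
fun greet(name: String?, logger: Logger = NoOpLogger): String {
    val greeting = "Hello, ${displayName(name)}"
    logger.log(greeting)
    return greeting
}
```

`greet(null)` returns `"Hello, anonymous"`. Writing `name.trim()` on a `String?` without the safe call would not compile.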
Type-safe systems help prevent type errors, making code more reliable and easier to maintain. They enforce rules about how different types of data can be used, catching mistakes before they cause problems.
sealed class Type
object IntType : Type()
object StringType : Type()
data class SymbolTable(
val table: MutableMap<String, Pair<Type, Any>> = mutableMapOf(),
val parent: SymbolTable? = null
) {
fun nest(): SymbolTable {
return SymbolTable(table.toMutableMap(), this)
}
fun set(name: String, type: Type, value: Any) {
table[name] = Pair(type, value)
}
operator fun get(name: String): Pair<Type, Any>? {
return table[name] ?: parent?.get(name)
}
}
sealed class Node {
abstract fun accept(visitor: Visitor, symbolTable: SymbolTable): Any
}
data class BlockNode(val statements: List<Node>) : Node() {
override fun accept(visitor: Visitor, symbolTable: SymbolTable): Any {
return visitor.visit(this, symbolTable)
}
}
data class IntegerNode(val value: Int) : Node() {
override fun accept(visitor: Visitor, symbolTable: SymbolTable): Any {
return visitor.visit(this, symbolTable)
}
}
data class StringNode(val value: String) : Node() {
override fun accept(visitor: Visitor, symbolTable: SymbolTable): Any {
return visitor.visit(this, symbolTable)
}
}
data class VariableNode(val name: String) : Node() {
override fun accept(visitor: Visitor, symbolTable: SymbolTable): Any {
return visitor.visit(this, symbolTable)
}
}
data class DeclarationNode(val name: String, val expression: Node, val type: Type) : Node() {
override fun accept(visitor: Visitor, symbolTable: SymbolTable): Any {
return visitor.visit(this, symbolTable)
}
}
data class AdditionNode(val left: Node, val right: Node) : Node() {
override fun accept(visitor: Visitor, symbolTable: SymbolTable): Any {
return visitor.visit(this, symbolTable)
}
}
interface Visitor {
fun visit(node: BlockNode, symbolTable: SymbolTable): Any
fun visit(node: IntegerNode, symbolTable: SymbolTable): Any
fun visit(node: StringNode, symbolTable: SymbolTable): Any
fun visit(node: VariableNode, symbolTable: SymbolTable): Any
fun visit(node: DeclarationNode, symbolTable: SymbolTable): Any
fun visit(node: AdditionNode, symbolTable: SymbolTable): Any
}
class TypeCheckingVisitor : Visitor {
override fun visit(node: BlockNode, symbolTable: SymbolTable): Any {
// Holds the (value, type) pair of the last statement; an empty block yields (0, IntType).
var result: Pair<Any, Type> = Pair(0, IntType)
val nestedSymbolTable = symbolTable.nest()
for (statement in node.statements) {
// Kotlin cannot destructure into existing variables, so keep the whole pair.
result = statement.accept(this, nestedSymbolTable) as Pair<Any, Type>
}
return result
}
override fun visit(node: IntegerNode, symbolTable: SymbolTable): Any {
return Pair(node.value, IntType)
}
override fun visit(node: StringNode, symbolTable: SymbolTable): Any {
return Pair(node.value, StringType)
}
override fun visit(node: VariableNode, symbolTable: SymbolTable): Any {
val (type, value) = symbolTable[node.name] ?: error("Undefined variable: ${node.name}")
return Pair(value, type)
}
override fun visit(node: DeclarationNode, symbolTable: SymbolTable): Any {
val (value, type) = node.expression.accept(this, symbolTable) as Pair<Any, Type>
// The type of the initializer must match the declared type.
if (type != node.type) error("Type mismatch: expected ${node.type::class.simpleName}, got ${type::class.simpleName}")
symbolTable.set(node.name, node.type, value)
return Pair(value, type)
}
override fun visit(node: AdditionNode, symbolTable: SymbolTable): Any {
val (leftValue, leftType) = node.left.accept(this, symbolTable) as Pair<Any, Type>
val (rightValue, rightType) = node.right.accept(this, symbolTable) as Pair<Any, Type>
// Addition is defined only for two integers or two strings.
return when {
leftType is IntType && rightType is IntType -> Pair((leftValue as Int) + (rightValue as Int), IntType)
leftType is StringType && rightType is StringType -> Pair((leftValue as String) + (rightValue as String), StringType)
else -> error("Type mismatch: cannot add $leftValue and $rightValue")
}
}
}
fun main() {
val program = BlockNode(
listOf(
DeclarationNode("x", IntegerNode(3), IntType),
DeclarationNode("y", StringNode("hello"), StringType),
BlockNode(
listOf(
DeclarationNode("z", AdditionNode(VariableNode("x"), IntegerNode(5)), IntType),
DeclarationNode("w", AdditionNode(VariableNode("y"), StringNode(" world")), StringType)
)
)
)
)
val typeChecker = TypeCheckingVisitor()
val (result, _) = program.accept(typeChecker, SymbolTable()) as Pair<Any, Type>
println(result) // Output: hello world
}
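The visitor above evaluates and checks at the same time. A checker can also work on types alone, without computing any values. The sketch below assumes a hypothetical `inferType` function, a `TypeError` exception and simplified node names. It is one possible shape, not the course's implementation.

```kotlin
sealed class Ty
object IntTy : Ty()
object StrTy : Ty()

sealed class Expr
data class IntLit(val value: Int) : Expr()
data class StrLit(val value: String) : Expr()
data class Var(val name: String) : Expr()
data class Add(val left: Expr, val right: Expr) : Expr()

class TypeError(message: String) : Exception(message)

// Computes the type of an expression without evaluating it.
// `env` maps variable names to their declared types.
fun inferType(expr: Expr, env: Map<String, Ty>): Ty = when (expr) {
    is IntLit -> IntTy
    is StrLit -> StrTy
    is Var -> env[expr.name] ?: throw TypeError("Undefined variable: ${expr.name}")
    is Add -> {
        val left = inferType(expr.left, env)
        val right = inferType(expr.right, env)
        // Assumed rule: both operands of + must have the same type.
        if (left != right) throw TypeError("Cannot add operands of different types")
        left
    }
}
```

With `x` bound to `IntTy` and `y` to `StrTy`, `inferType(Add(Var("x"), IntLit(5)), env)` is `IntTy`, while `Add(Var("x"), Var("y"))` is rejected with a `TypeError` before any value exists.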
%%{ init: { 'theme': 'base', 'themeVariables': { 'fontSize': '30px', 'darkmode': true, 'lineColor': '#F8B229' } } }%%
graph LR
A[Code] --> B[Lexical Analysis]
B --> C[Tokens]
C --> D[Syntax Analysis]
D --> E[AST]
E --> F[Semantic Analysis]
F --> G[Annotated and checked AST]
G --> I[???]
I --> M[PSI]
Dive into advanced topics to further enhance the robustness and correctness of your code:
Types and Programming Languages by Benjamin C. Pierce
Thank you for your attention!
I'm now open to any questions you might have.