
Compilers-aka-Uniwa/Compiler-Uni-C


UNIWA

UNIVERSITY OF WEST ATTICA
SCHOOL OF ENGINEERING
DEPARTMENT OF COMPUTER ENGINEERING AND INFORMATICS


Compilers

Design and Implementation of a Compiler for Uni-C

Vasileios Evangelos Athanasiou
Student ID: 19390005

GitHub · LinkedIn

Georgios Theocharis
Student ID: 19390283

GitHub

Ioannis Iliou
Student ID: 19390066

GitHub · LinkedIn

Pantelis Tatsis
Student ID: 20390226

GitHub · LinkedIn

Vasileios Dominaris
Student ID: 21390055

GitHub


Supervision

Supervisor: Christos Troussas, Assistant Professor

UNIWA Profile · LinkedIn

Co-supervisor: Michalis Iordanakis, Academic Scholar

UNIWA Profile · Scholar


Athens, May 2024



README

Design and Implementation of a Compiler for Uni-C

This project develops a compiler for Uni-C, a subset of the C programming language. The implementation was completed in three phases that cover the front-end stages of compiler construction:

  1. Finite State Machine (FSM) Encoding
    Design and simulation of automata for recognizing lexical units.

  2. Lexical Analysis (FLEX)
    Development of a lexical analyzer that identifies tokens using regular expressions.

  3. Syntactic Analysis (BISON)
    Construction of a parser that validates program structure based on predefined grammar rules.


Table of Contents

| Section | Folder | Description |
|---|---|---|
| 1 | A-FLEX/ | Lexical analysis phase using Finite State Machines and FLEX |
| 1.1 | A-FLEX/A2-FSM/ | FSM design and implementation for Uni-C tokens |
| 1.1.1 | A-FLEX/A2-FSM/docs/ | FSM theory notes, transition tables, and documentation (PDF/XLSX) |
| 1.1.2 | A-FLEX/A2-FSM/src/ | FSM source files for identifiers, strings, numbers, comments, and whitespace |
| 1.2 | A-FLEX/A3-FLEX/ | FLEX-based lexical analyzer implementation |
| 1.2.1 | A-FLEX/A3-FLEX/docs/ | FLEX code documentation |
| 1.2.2 | A-FLEX/A3-FLEX/src/ | FLEX source code, Makefile, input/output samples |
| 1.3 | A-FLEX/assign/ | Assignment descriptions for Part A (FSM & FLEX) |
| 2 | B-BISON/ | Syntax analysis phase using BISON |
| 2.1 | B-BISON/assign/ | Assignment descriptions for Part B (BISON) |
| 2.2 | B-BISON/B2-FLEX-BISON/ | Combined FLEX & BISON parser implementation |
| 2.2.1 | B-BISON/B2-FLEX-BISON/src/ | Integrated lexer/parser source code and build files |
| 2.3 | B-BISON/B3-COMPILER/ | Final compiler stage |
| 2.3.1 | B-BISON/B3-COMPILER/docs/ | BISON grammar documentation |
| 2.3.2 | B-BISON/B3-COMPILER/src/ | Final Uni-C compiler source code |
| 3 | Uni-C/ | Language specification and usage guide for Uni-C |
| 4 | README.md | Project documentation |
| 5 | INSTALL.md | Usage instructions |

1. Lexical Analysis (Tokens)

The compiler recognizes the following categories of tokens:

  • Identifiers
    Names for variables and functions

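    • Pattern: [a-zA-Z_][a-zA-Z0-9_]{0,31} (a letter or underscore followed by up to 31 letters, digits, or underscores, so at most 32 characters)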
  • Keywords
    Reserved words such as:

    • if, else, while, int, return, func
  • Constants
    Supported constant types include:

    • Integers (decimal, octal, hexadecimal)
    • Floating-point numbers
    • Strings
  • Operators

    • Arithmetic: +, -, *, /
    • Relational: >, <, ==
    • Logical: &&, ||
  • Delimiters

    • Characters such as ; that terminate or separate statements
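
To make the identifier and keyword rules above concrete, here is a minimal sketch in plain C. It is not taken from the project's FLEX source: the names `LexClass`, `is_identifier`, and `classify_lexeme` are hypothetical, and the keyword table contains only the six words listed above (the full Uni-C set may be larger).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification result, for illustration only. */
typedef enum { LEX_KEYWORD, LEX_IDENTIFIER, LEX_INVALID } LexClass;

/* Only the keywords named in this README; assumed to be incomplete. */
static const char *const KEYWORDS[] = {
    "if", "else", "while", "int", "return", "func"
};

static int is_letter_or_underscore(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

static int is_digit_char(char c)
{
    return c >= '0' && c <= '9';
}

/* Returns 1 if s matches [a-zA-Z_][a-zA-Z0-9_]{0,31}, otherwise 0. */
static int is_identifier(const char *s)
{
    size_t len = strlen(s);
    if (len < 1 || len > 32)
        return 0;
    if (!is_letter_or_underscore(s[0]))
        return 0;
    for (size_t i = 1; i < len; i++) {
        if (!is_letter_or_underscore(s[i]) && !is_digit_char(s[i]))
            return 0;
    }
    return 1;
}

/* Keywords are checked first because every keyword also matches
   the identifier pattern. */
static LexClass classify_lexeme(const char *s)
{
    size_t count = sizeof KEYWORDS / sizeof KEYWORDS[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(s, KEYWORDS[i]) == 0)
            return LEX_KEYWORD;
    }
    return is_identifier(s) ? LEX_IDENTIFIER : LEX_INVALID;
}
```

With this sketch, `classify_lexeme("while")` yields `LEX_KEYWORD`, `classify_lexeme("_count1")` yields `LEX_IDENTIFIER`, and `classify_lexeme("9lives")` yields `LEX_INVALID`. A FLEX scanner typically gets the same effect by listing keyword rules before the identifier rule.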

2. Finite State Machine (FSM)

A Finite State Machine (FSM) was designed for each token category.

Example – Identifiers:

  • Starts at an initial state (SZ)
  • Transitions to a middle-character state (SMCH) upon receiving a letter or underscore
  • Reaches a GOOD exit state upon encountering a newline, provided the identifier is valid
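
The identifier automaton described above can be simulated with a small transition function. The sketch below is an illustration in C, not the contents of the project's `.fsm` file. The state names `SZ`, `SMCH`, and `GOOD` come from the description; the `BAD` reject state, the rule that `SMCH` loops on letters, digits, and underscores, and the function names are assumptions. The 32-character limit is not enforced here.

```c
/* SZ, SMCH and GOOD follow the description above; BAD is an
   assumed reject state added for this sketch. */
typedef enum { SZ, SMCH, GOOD, BAD } State;

static int fsm_is_start_char(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

static int fsm_is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* One transition of the automaton. */
static State step(State s, char c)
{
    switch (s) {
    case SZ:
        /* The first character must be a letter or underscore. */
        return fsm_is_start_char(c) ? SMCH : BAD;
    case SMCH:
        if (c == '\n')
            return GOOD;  /* newline ends a valid identifier */
        /* Assumed: letters, digits and underscores keep the
           machine in SMCH. */
        return (fsm_is_start_char(c) || fsm_is_digit(c)) ? SMCH : BAD;
    default:
        /* GOOD and BAD accept no further input. */
        return BAD;
    }
}

/* Returns 1 only if the whole input drives the machine to GOOD. */
static int accepts_identifier(const char *input)
{
    State s = SZ;
    for (const char *p = input; *p != '\0'; p++) {
        s = step(s, *p);
        if (s == BAD)
            return 0;
    }
    return s == GOOD;
}
```

For example, `accepts_identifier("total_1\n")` returns 1, while `accepts_identifier("1abc\n")` and `accepts_identifier("abc")` (no terminating newline) return 0.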

3. Syntactic Analysis (BISON)

The BISON parser generator is used to define and enforce grammar rules for Uni-C programs:

  • Variable Declarations

    • Support for simple data types and arrays
  • Functions

    • Recognition of both built-in and user-defined functions
  • Expressions

    • Handling of simple and compound expressions
  • Error Handling

    • Detection and reporting of syntax errors
    • Handling of invalid tokens (TOKEN ERROR)
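
To illustrate what validating an expression against grammar rules means, the sketch below checks arithmetic expressions with a hand-written recursive-descent parser in C. This is a stand-in technique chosen for a self-contained example; it is not BISON output and not the project's grammar. The assumed grammar is `expr → term (('+' | '-') term)*`, `term → factor (('*' | '/') factor)*`, `factor → NUMBER | IDENTIFIER | '(' expr ')'`, and all names are hypothetical.

```c
#include <ctype.h>

/* Hypothetical parser state: a cursor and a "no error so far" flag. */
typedef struct {
    const char *p;
    int ok;
} Parser;

static void skip_ws(Parser *ps)
{
    while (*ps->p == ' ' || *ps->p == '\t')
        ps->p++;
}

static void parse_expr(Parser *ps);

/* factor -> NUMBER | IDENTIFIER | '(' expr ')' */
static void parse_factor(Parser *ps)
{
    skip_ws(ps);
    if (isdigit((unsigned char)*ps->p)) {
        while (isdigit((unsigned char)*ps->p))
            ps->p++;
    } else if (isalpha((unsigned char)*ps->p) || *ps->p == '_') {
        while (isalnum((unsigned char)*ps->p) || *ps->p == '_')
            ps->p++;
    } else if (*ps->p == '(') {
        ps->p++;
        parse_expr(ps);
        skip_ws(ps);
        if (*ps->p == ')')
            ps->p++;
        else
            ps->ok = 0;  /* missing closing parenthesis */
    } else {
        ps->ok = 0;      /* unexpected character or end of input */
    }
}

/* term -> factor (('*' | '/') factor)* */
static void parse_term(Parser *ps)
{
    parse_factor(ps);
    skip_ws(ps);
    while (ps->ok && (*ps->p == '*' || *ps->p == '/')) {
        ps->p++;
        parse_factor(ps);
        skip_ws(ps);
    }
}

/* expr -> term (('+' | '-') term)* */
static void parse_expr(Parser *ps)
{
    parse_term(ps);
    while (ps->ok && (*ps->p == '+' || *ps->p == '-')) {
        ps->p++;
        parse_term(ps);
    }
}

/* Returns 1 if the whole string is a well-formed expression. */
static int is_valid_expression(const char *text)
{
    Parser ps = { text, 1 };
    parse_expr(&ps);
    skip_ws(&ps);
    return ps.ok && *ps.p == '\0';
}
```

Here `is_valid_expression("a + b * (c - 2)")` returns 1, while `is_valid_expression("a + * b")` and `is_valid_expression("(a + b")` return 0. In a generated parser, the rejecting cases correspond to inputs that would be reported as syntax errors.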

4. Project Files

  • 1_identifiers.fsm
    FSM encoding for identifier recognition

  • simple-flex-code.l
    FLEX source file containing regular expressions and token definitions

  • token.h
    Header file defining numeric constants for tokens

  • simple-bison-code.y
    BISON source file containing grammar and syntax rules
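
As a rough idea of what a header of numeric token constants can look like, here is a sketch in C. The actual contents of `token.h` are not shown in this README, so every name and value below is hypothetical; only the token categories are taken from the lexical analysis section.

```c
/* Hypothetical token codes. The real names and values in token.h
   may differ. */
#define TK_IDENTIFIER 1
#define TK_KEYWORD    2
#define TK_INTEGER    3
#define TK_FLOAT      4
#define TK_STRING     5
#define TK_OPERATOR   6
#define TK_DELIMITER  7
#define TK_ERROR      8

/* Maps a token code to a printable name, e.g. for lexer output. */
static const char *token_name(int code)
{
    switch (code) {
    case TK_IDENTIFIER: return "IDENTIFIER";
    case TK_KEYWORD:    return "KEYWORD";
    case TK_INTEGER:    return "INTEGER";
    case TK_FLOAT:      return "FLOAT";
    case TK_STRING:     return "STRING";
    case TK_OPERATOR:   return "OPERATOR";
    case TK_DELIMITER:  return "DELIMITER";
    case TK_ERROR:      return "TOKEN ERROR";
    default:            return "UNKNOWN";
    }
}
```

A lexer can return one of these codes per recognized lexeme, and a driver can call `token_name` to print a readable label next to each one.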

About

Academic project implementing a Uni-C compiler in C using FSMs, FLEX, and BISON, covering lexical and syntactic analysis with a complete compiler pipeline for a subset of the C language. Includes FSM design, token recognition, grammar parsing, and executable testing (Compilers, UNIWA).
