Scanner-for-lexical-analyzer-in-cpp

A simple c++ program that takes a file as input and after tokenizing the file it finds what each token name is(identifier, keyword, separator, operator, literal or comment).

To see what a compiler's lexical analyzer is click here

Prerequisites

You need c++11 or higher to run the code. To compile the code do:

sudo apt-get install g++

Using

To use it, first you just need to clone it:

git@github.com:amirhakimnejad/Scanner-for-lexical-analyzer-in-cpp.git

Change the testfile (program.txt) as your desire:

// This is the c test file

int x ;
x = x + 5 ;
/*
    Multi line Comment
*/

// Single line Comment
for(int i = 0, i <= 4; i++){
    x += i;
}
    x++;

54dsa
_sda = "gdsdg";

Compile with c++11:

g++ main.cpp scanner.cpp -std=c++11

Run it:

./a.out

This will be your output:

[(comment, //), 
(keyword, int), 
(identifier, x), 
(separator, ;), 
(identifier, x), 
(operator, =), 
(identifier, x), 
(operator, +), 
(literal, 5), 
(separator, ;), 
(comment, /*), 
(comment, //), 
(statement, for), 
(separator, (), 
(keyword, int), 
(identifier, i), 
(operator, =), 
(literal, 0), 
(separator, ,), 
(identifier, i), 
unknown
, 
(operator, =), 
(literal, 4), 
(separator, ;), 
(identifier, i), 
(operator, ++), 
(separator, )), 
(separator, {), 
(identifier, x), 
(operator, +=), 
(identifier, i), 
(separator, ;), 
(separator, }), 
(identifier, x), 
(operator, ++), 
(separator, ;), 
unknown
, 
(identifier, _sda), 
(operator, =), 
(literal, "gdsdg"), 
(separator, ;), 
]

As you can see there are few keywords or statements in my program. You can manually add any other values to their vector.

const vector<std::string> keywords{"int", "float", "auto", "double", "do", "switch", "return"};
const vector<std::string> statements{"for", "while"};
const vector<std::string> operators{"*", "+", "-", "/", "=", "-=", "*=", "+=", "/=", "++", "--", "=="};
const vector<std::string> Separators{"{", "}", ",", "(", ")", ";"};

How each function works

bool isID(const std::string &str);
bool isComment(const std::string &str);
bool isDigit(const std::string &str);
bool isString(const std::string &str);
bool isBool(const std::string &str);
bool isLiteral(const std::string &str);
bool isKeyword(const std::string &str);
bool isStatement(const std::string &str);
bool isOperator(const std::string &str);
bool isSeparator(const std::string &str);
bool isNotLegal(const std::string &str);
void printRoleOfToken(const vector<std::string>& tokens);
void lexicalAnalyze(const std::string &nameOfFile);

Most of the functions above don't need any explanation. Generally lexicalAnalyze() function tokenizes the given file to a vector of strings with ignoring whitespaces, newlines and the contents of comments with using the help of isOperator() isNotLegal() and isComment() because these are the only things that comes between main things of our program.

for(i=5;     i<= 10; i++){}

After making tokens vector lexicalAnalyze() passes it to printRoleOfToken() function then the functions simply uses all other functions to print the name of each token. Known tokens:

identifier: names the programmer chooses;
keyword: names already in the programming language;
separator (also known as punctuators): punctuation characters and paired-delimiters;
operator: symbols that operate on arguments and produce results;
literal: numeric, logical, textual, reference literals;
comment: line, block.

Feel free to ask questions, find bugs(:D) or anything else.

Authors

Amirhossein Hakimnejad - Initial work - amirhakimnejad
Henry - Contributor - henry-bugfree

License

This project is licensed under the MIT License

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
src		src
README.md		README.md
program.txt		program.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Scanner-for-lexical-analyzer-in-cpp

Prerequisites

Using

How each function works

Authors

License

About

Contributors 2

Languages

amirhakimnejad/Scanner-for-lexical-analyzer-in-cpp

Folders and files

Latest commit

History

Repository files navigation

Scanner-for-lexical-analyzer-in-cpp

Prerequisites

Using

How each function works

Authors

License

About

Resources

Stars

Watchers

Forks

Contributors 2

Languages