# In-Class Exercise 4: Getting Started with LLVM

Write an LLVM-based compiler for Brainfuck. At first, only implement non-loop operations (i.e., `<>+-.,`). Once these work, you might look into implementing loops.

## Initialization

- Create a new basic block in `main` as the program entry.
- Create an `llvm::IRBuilder<>` and position it at the end of the entry block.
- In the entry block, place an alloca (`CreateAlloca`) and zero-initialize the memory (`CreateMemSet`).
- End the main function with a return (`CreateRet`).

Your program should now compile and emit valid LLVM-IR. Verify this by piping the output to `lli`:

`./llvm-bf "+++++." | lli`

## Non-Loop Operations

- Keep track of the SSA value representing the current pointer, initializing it with the result of the `alloca`.
- Implement the operations `<>+-.,` using appropriate LLVM-IR instructions.

This command should now work and print `3434`:

`./llvm-bf "+++++++++++++++++++++++++++++++++++++++++++++++++++>++++++++++++++++++++++++++++++++++++++++++++++++++++<.>.<.>." | lli`

## Loops

### Simple: alloca

- Create a stack slot using `alloca` that holds the current pointer. Use `load`/`store` to fetch/update the pointer when encountering loop instructions.
- Keep track of loop nesting using a stack of basic block pairs (condition block and continuation block).
- For `[`, create a new basic block for the condition. Also create a new basic block for the loop body and update the `IRBuilder` insert point to point to the body block.
- For `]`, *un*conditionally branch back to the condition block, which you get from the loop nesting stack. Update the `IRBuilder` insert point to point to the continuation block.

This command should print `Hello World!`:

`./llvm-bf "++++++++[>++++[>++>+++>+++>+<<<<-]>+>->+>>+[<]<-]>>.>>---.+++++++..+++.>.<<-.>.+++.------.--------.>+.>++." | lli`

### Advanced: PHI nodes

- Propagate the current pointer using PHI nodes; do not use alloca/load/store for this.
- Keep track of loop nesting using a stack of tuples of two basic blocks and one PHI node.
- For `[`, create a new basic block for the condition with a PHI node for merging the value of the current pointer. Also create a new basic block for the body and update the `IRBuilder` insert point.
- For `]`, *un*conditionally branch back to the condition block and add the updated pointer value to the PHI node. Update the `IRBuilder` insert point to point to the continuation block.

## Optimizations

You might have noticed that the generated LLVM-IR is not exactly efficient. Pipe it through the optimizer (`... | opt -O3 -S`). How much does the code quality improve?

## Links

You might find these links helpful (as always, only parts of each page are actually relevant):

- https://llvm.org/docs/LangRef.html
- https://www.llvm.org/docs/ProgrammersManual.html (this also contains documentation on LLVM's custom data structures, which you might find useful, too)
- https://llvm.org/doxygen/classllvm_1_1IRBuilder.html (and other Doxygen pages)

## Appendix: Template

```
// c++ -o llvm-bf llvm-bf.cc $(llvm-config --cppflags --ldflags --libs)
#include <cstdlib>
#include <memory>

#include <llvm/IR/IRBuilder.h>
#include <llvm/IR/LLVMContext.h>
#include <llvm/IR/Module.h>
#include <llvm/IR/Verifier.h>
#include <llvm/Support/raw_ostream.h>

int main(int argc, char** argv) {
  if (argc < 2)
    abort();
  const char* program = argv[1];

  llvm::LLVMContext ctx;
  auto modUP = std::make_unique<llvm::Module>("mod", ctx);
  llvm::Type* voidTy = llvm::Type::getVoidTy(ctx);
  llvm::Type* i8Ty = llvm::Type::getInt8Ty(ctx);
  llvm::Type* ptrTy = llvm::PointerType::get(ctx, 0);

  // Declare putchar
  llvm::FunctionType* putTy = llvm::FunctionType::get(voidTy, {i8Ty}, false);
  llvm::Function* put = llvm::Function::Create(putTy,
      llvm::GlobalValue::ExternalLinkage, "putchar", modUP.get());

  // Declare getchar
  llvm::FunctionType* getTy = llvm::FunctionType::get(i8Ty, {}, false);
  llvm::Function* get = llvm::Function::Create(getTy,
      llvm::GlobalValue::ExternalLinkage, "getchar", modUP.get());

  // Prepare function main
  llvm::FunctionType* fnTy = llvm::FunctionType::get(i8Ty, {}, false);
  llvm::Function* fn = llvm::Function::Create(fnTy,
      llvm::GlobalValue::ExternalLinkage, "main", modUP.get());

  // TODO: implement Brainfuck compiler here

  // Print module to stdout and run verifier
  modUP->print(llvm::outs(), nullptr);
  return !!llvm::verifyModule(*modUP, &llvm::errs());
}
```

## Solution

```
base64 -d <