Computer Science Department Thesis Defense - Md Anaytul Islam

Event Date: Wednesday, January 10, 2024 - 10:30am to 12:00pm EST
Event Location: Online
Event Contact Name: Rachael Wang
Event Contact E-mail: 

Please join the Computer Science Department for the upcoming thesis defense:

Presenter: Md Anaytul Islam

Thesis title: Supporting the Executability of R Markdown Files

Abstract: R Markdown files are literate programming documents that combine R code with results and explanations. Such dynamic documents are designed to be easy to execute so that study results can be reproduced. However, little is known about the executability of R Markdown files, and files that fail to run can frustrate users who intend to reuse them. This thesis aims to understand the executability of R Markdown files and to improve support for executing them.

To this end, a large-scale study was conducted on the executability of R Markdown files collected from GitHub repositories. The results show that a majority of the R Markdown files (64.95%) are not executable, even after best efforts. To better understand the challenges, the exceptions encountered while executing the files are grouped into categories, and a classifier is developed to predict which R Markdown files are likely to be executable. Search engines can use such a classifier in their ranking to help developers find literate programming documents as learning resources.

To support the executability of R Markdown files, a command-line tool is developed that finds issues preventing a file from executing. Given an R Markdown file as input, the tool generates an intuitive list of outputs that helps developers identify the areas that require attention to make the file executable. The tool combines static analysis of source code with a carefully crafted knowledge base of package dependencies, which it uses to generate version constraints for the packages involved, and an SMT solver (Z3) to identify compatible versions of those packages. Findings from this research can help developers reuse R Markdown files more easily, thus improving their productivity.
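For readers unfamiliar with the approach, the two ideas in the abstract (static analysis of the code chunks, then a search for mutually compatible package versions) can be illustrated with a minimal Python sketch. This is not the tool presented in the thesis: the regular expressions are a rough stand-in for real static analysis, the knowledge base and the package names `pkgA` and `pkgB` are hypothetical, and a brute-force enumeration is used in place of the Z3 SMT solver.

```python
import re
from itertools import product

# Opening fence of an R chunk such as {r} or {r setup, echo=FALSE}.
CHUNK_OPEN = re.compile(r"^```\s*\{r(?:[\s,][^}]*)?\}\s*$")
# library(pkg) / require("pkg") calls and pkg::function references.
PKG_CALL = re.compile(r"\b(?:library|require)\(\s*[\"']?([A-Za-z][A-Za-z0-9.]*)")
PKG_NS = re.compile(r"\b([A-Za-z][A-Za-z0-9.]*)::")


def r_chunks(rmd_text):
    """Yield the source of each closed R code chunk in an R Markdown document."""
    lines, inside = [], False
    for line in rmd_text.splitlines():
        stripped = line.strip()
        if not inside and CHUNK_OPEN.match(stripped):
            inside, lines = True, []
        elif inside and stripped == "`" * 3:
            inside = False
            yield "\n".join(lines)
        elif inside:
            lines.append(line)


def referenced_packages(rmd_text):
    """Return the sorted package names referenced inside R chunks."""
    pkgs = set()
    for chunk in r_chunks(rmd_text):
        pkgs.update(PKG_CALL.findall(chunk))
        pkgs.update(PKG_NS.findall(chunk))
    return sorted(pkgs)


# Hypothetical knowledge base (an assumption for illustration):
# package -> version -> {dependency: set of allowed dependency versions}.
KNOWLEDGE_BASE = {
    "pkgA": {"1.0": {"pkgB": {"1.0"}}, "2.0": {"pkgB": {"2.0", "2.1"}}},
    "pkgB": {"1.0": {}, "2.0": {}, "2.1": {}},
}


def compatible_assignments(packages, kb):
    """Enumerate version choices that satisfy every dependency constraint.

    Brute force over all combinations; an SMT solver would search this
    space far more efficiently. Every package must be present in kb.
    """
    names = sorted(packages)
    for versions in product(*(sorted(kb[name]) for name in names)):
        choice = dict(zip(names, versions))
        if all(
            choice[dep] in allowed
            for name, ver in choice.items()
            for dep, allowed in kb[name][ver].items()
            if dep in choice
        ):
            yield choice
```

Calling `referenced_packages` on the text of an R Markdown file lists the packages its chunks load, and `compatible_assignments(["pkgA", "pkgB"], KNOWLEDGE_BASE)` yields each combination of versions that the toy knowledge base allows to be installed together.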

Committee Members:
Dr. Muhammad Asaduzzaman (supervisor, committee chair), Dr. Garima Bajwa, Dr. Zubair Fadlullah (Western University)

Please contact grad.compsci@lakeheadu.ca for the Zoom link. Everyone is welcome.