Defense Date


Document Type


Degree Name

Master of Science


Computer Science

First Advisor

Dr. Kostadin Damevski


Stack Overflow is a question-and-answer site for programming questions. It has become one of the most widely used resources for programmers, many of whom access the site multiple times per day. A threat to the continued success of Stack Overflow is the difficulty of searching the site efficiently. Existing research suggests that the inability to find certain questions results in unanswered questions, long delays in answering questions, or questions that future visitors to the site cannot find. Further research suggests that questions with poor tag quality are particularly vulnerable to these issues.

In this thesis, two approaches are considered for improving tag quality and search efficiency: automatic tag recommendations for question authors, and organizing the existing set of tags in a hierarchy from general to specific for Stack Overflow readers. A hierarchical organization is proposed for its ability to assist exploratory searches of the site.

L2H, a hierarchical tag topic model, is a particularly interesting solution because it can address both approaches with the same model. L2H is evaluated in detail against several proposed criteria to gauge its fitness for addressing these search challenges on Stack Overflow.


© The Author

Is Part Of

VCU University Archives

Is Part Of

VCU Theses and Dissertations

Date of Submission