Theses and Dissertations

PyZoBot: A Platform for Conversational Information Extraction and Synthesis from Curated Zotero Reference Libraries through Advanced Retrieval-Augmented Generation.

Suad Alshammari

DOI

https://doi.org/10.25772/WWY7-DR92

Author ORCID Identifier

https://orcid.org/0000-0002-5339-4619

Defense Date

2025

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Pharmaceutical Sciences

First Advisor

Dayanjan S Wijesinghe

Abstract

This dissertation presents a systematic evaluation of PyZoBot, an AI-powered platform for literature-based question answering, using the Retrieval-Augmented Generation Assessment Scores (RAGAS) framework. The study focuses on a subset of 49 cardiology-related questions extracted from the BioASQ benchmark dataset. PyZoBot's performance was assessed across 32 configurations, including standard Retrieval-Augmented Generation (RAG) and GraphRAG pipelines, implemented with both OpenAI-based models (GPT-3.5-Turbo, GPT-4o) and open-source models (LLaMA 3.1, Mistral).

To establish a comparative benchmark, responses generated by PyZoBot were evaluated alongside answers manually written by six PhD students and recent graduates from the pharmacotherapy field, using a curated Zotero library containing BioASQ-referenced documents. The evaluation applied four key RAGAS metrics—faithfulness, answer relevancy, context recall, and context precision—along with a composite harmonic score to determine overall performance.
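As an illustration of how a composite harmonic score can combine the four metrics, the following is a minimal sketch. It assumes the composite is the plain (unweighted) harmonic mean of the four metric values; the abstract does not spell out the exact formula. The function name and the sample values are hypothetical and are not results from the dissertation.

```python
from statistics import harmonic_mean


def ragas_harmonic_score(faithfulness, answer_relevancy,
                         context_recall, context_precision):
    """Unweighted harmonic mean of four metric values in [0, 1].

    Assumption: the composite score is a plain harmonic mean. The
    dissertation itself may weight or aggregate the metrics differently.
    """
    return harmonic_mean(
        [faithfulness, answer_relevancy, context_recall, context_precision]
    )


# Hypothetical metric values chosen only for illustration.
score = ragas_harmonic_score(0.8, 0.8, 0.4, 0.8)
print(round(score, 4))  # prints 0.64
```

With these illustrative inputs the arithmetic mean would be 0.70, while the harmonic mean is 0.64: a single weak metric pulls the harmonic score down more strongly, which is the usual reason for choosing it as a composite.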

The findings reveal that 22 PyZoBot configurations surpassed the highest-performing human participant, with the top pipeline (GPT-3.5-Turbo + layout-aware chunking, k=10) achieving a harmonic RAGAS score of 0.6944. Statistical analysis using Kruskal-Wallis and Dunn’s post hoc tests confirmed significant differences across all metrics, especially in faithfulness and time efficiency.
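For readers unfamiliar with the Kruskal-Wallis test, the sketch below computes its H statistic in plain Python. It is a generic textbook formulation, not the analysis code used in the dissertation. The function name and the sample data are hypothetical. It stops at the H statistic and does not compute a p-value or Dunn's post hoc comparisons.

```python
from collections import defaultdict


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples.

    Note: raises ZeroDivisionError if every observation is identical,
    because the tie correction factor is then zero.
    """
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Average ranks: tied values share the mean of their positions.
    positions = defaultdict(list)
    for position, value in enumerate(pooled, start=1):
        positions[value].append(position)
    rank = {value: sum(p) / len(p) for value, p in positions.items()}

    # H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
    rank_sum_term = sum(
        sum(rank[value] for value in group) ** 2 / len(group)
        for group in groups
    )
    h = 12.0 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    tie_sum = sum(len(p) ** 3 - len(p) for p in positions.values())
    correction = 1 - tie_sum / (n_total ** 3 - n_total)
    return h / correction


# Hypothetical scores for three configurations, for illustration only.
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h_stat, 4))  # prints 7.2
```

Under the null hypothesis, H is compared against a chi-square distribution with (number of groups - 1) degrees of freedom; a significant result is what motivates pairwise follow-up tests such as Dunn's.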

These results validate PyZoBot’s ability to support high-quality biomedical information synthesis and demonstrate the system’s potential to meet or exceed human performance in complex, evidence-based academic tasks.

Is Part Of

VCU University Archives

Is Part Of

VCU Theses and Dissertations

Date of Submission

4-24-2025

Download

Available for download on Saturday, April 24, 2027

Included in

Other Pharmacy and Pharmaceutical Sciences Commons
