Author ORCID Identifier
https://orcid.org/0009-0009-1224-8939
Defense Date
2026
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Computer Science
First Advisor
Changqing Luo
Second Advisor
Kemal Akkaya
Abstract
Deep reinforcement learning (DRL), which combines reinforcement learning with high-capacity function approximators such as deep neural networks (DNNs), is a powerful approach to solving complex sequential decision-making problems. However, due to the complex solution spaces of sequential decision-making problems and the inefficient design of DRL algorithms, DRL algorithms usually require a prohibitively large number of data samples to train effective strategies. Consequently, it is difficult to apply these algorithms to complex real-world problems in which collecting a large volume of data samples is costly. This dissertation proposes new mechanisms to address this sample-inefficiency issue, realizing sample-efficient DRL algorithms. Specifically, this dissertation first presents a FeedbAck-based Decision-mAking mechanism (FADA) that utilizes feedback from the critic for decision calibration to improve the sample efficiency of off-policy actor-critic DRL algorithms. It then presents a Quality-Aware Experience Exploitation scheme (QA2E), which selectively exploits simulated experiences based on their varying quality, to enhance the sample efficiency of model-based policy learning. Finally, it presents a Value-guided Search-to-Imitation framework (VSI), which performs value-guided, imitation-based policy improvement to enhance the sample efficiency of off-policy actor-critic DRL. Extensive experiments have been conducted to evaluate the proposed mechanisms on a set of continuous control tasks in the DeepMind Control Suite, and the experimental results demonstrate their effectiveness in improving the sample efficiency of DRL algorithms.
Rights
© The Author
Is Part Of
VCU University Archives
Is Part Of
VCU Theses and Dissertations
Date of Submission
5-6-2026