Electronic Theses and Dissertations (PhDs)

Permanent URI for this collection: https://hdl.handle.net/10539/38005

Towards Lifelong Reinforcement Learning through Temporal Logics and Zero-Shot Composition
(2024-10) Tasse, Geraud Nangue; Rosman, Benjamin; James, Steven
This thesis addresses the fundamental challenge of creating agents capable of solving a wide range of tasks in their environments, akin to human capabilities. For such agents to be truly useful and capable of assisting humans in our day-to-day lives, we identify three key abilities that general purpose agents should have: Flexibility, Instructability, and Reliability (FIRe). Flexibility refers to the ability of agents to adapt to various tasks with minimal learning; instructability involves the capacity for agents to understand and execute task specifications provided by humans in a comprehensible manner; and reliability entails agents’ ability to solve tasks safely and effectively with theoretical guarantees on their behavior. To build such agents, reinforcement learning (RL) is the framework of choice given that it is the only one that models the agent-environment interaction. It is also particularly promising since it has shown remarkable success in recent years in various domains, including gaming, scientific research, and robotic control. However, prevailing RL methods often fall short of the FIRe desiderata. They typically exhibit poor sample efficiency, demanding millions of environment interactions to learn optimal behaviors. Task specification relies heavily on hand-designed reward functions, posing challenges for non-experts in defining tasks. Moreover, these methods tend to specialize in single tasks, lacking guarantees on the broader adaptability and behavior robustness desired for lifelong agents that need to solve multiple tasks. Clearly, the regular RL framework is not enough, and does not capture important aspects of what makes humans so general, such as the use of language to specify and understand tasks.
To address these shortcomings, we propose a principled framework for the logical composition of arbitrary tasks in an environment, and introduce a novel knowledge representation called World Value Functions (WVFs) that will enable agents to solve arbitrary tasks specified using language. The use of logical composition is inspired by the fact that all formal languages are built upon the rules of propositional logics. Hence, if we want agents that understand tasks specified in any formal language, we must define what it means to apply the usual logic operators (conjunction, disjunction, and negation) over tasks. The introduction of WVFs is inspired by the fact that humans seem to always seek general knowledge about how to achieve a variety of goals in their environment, irrespective of the specific task they are learning. Our main contributions include: (i) Instructable agents: We formalize the logical composition of arbitrary tasks in potentially stochastic environments, and ensure that task compositions lead to rewards minimising undesired behaviors. (ii) Flexible agents: We introduce WVFs as a new objective for RL agents, enabling them to solve a variety of tasks in their environment. Additionally, we demonstrate zero-shot skill composition and lifelong sample efficiency. (iii) Reliable agents: We develop methods for agents to understand and execute both natural and formal language instructions, ensuring correctness and safety in task execution, particularly in real-world scenarios. By addressing these challenges, our framework represents a significant step towards achieving the FIRe desiderata in AI agents, thereby enhancing their utility and safety in a lifelong learning setting like the real world.
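As a rough illustration of what applying conjunction, disjunction, and negation over tasks could look like, the sketch below composes per-goal value estimates element-wise. The abstract does not define the operators, so everything here is an assumption made for illustration: the min/max/reflection definitions, the function names, and the toy values are hypothetical and are not taken from the thesis.

```python
# Hypothetical sketch: each "task" is a list of value estimates, one per goal.
# The operator definitions below are assumed, not quoted from the thesis.

def conjunction(q1, q2):
    """Assumed AND: keep the lower value for each goal."""
    return [min(a, b) for a, b in zip(q1, q2)]


def disjunction(q1, q2):
    """Assumed OR: keep the higher value for each goal."""
    return [max(a, b) for a, b in zip(q1, q2)]


def negation(q, q_max, q_min):
    """Assumed NOT: reflect each value between the upper and lower bounds."""
    return [(hi + lo) - v for v, hi, lo in zip(q, q_max, q_min)]


# Toy values over four goals (1.0 = goal satisfies the task, 0.0 = it does not).
q_a = [1.0, 0.0, 1.0, 0.0]
q_b = [1.0, 1.0, 0.0, 0.0]
upper = [1.0] * 4
lower = [0.0] * 4

# "A and not B" composed without any further learning.
a_and_not_b = conjunction(q_a, negation(q_b, upper, lower))
print(a_and_not_b)
```

In this toy setting the composed values single out the one goal that satisfies A but not B, which is the kind of zero-shot behaviour the abstract describes.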
3D Human pose estimation using geometric self-supervision with temporal methods
(University of the Witwatersrand, Johannesburg, 2024-09) Bau, Nandi; Klein, Richard
This dissertation explores the enhancement of 3D human pose estimation (HPE) through self-supervised learning methods that reduce reliance on heavily annotated datasets. Recognising the limitations of data acquired in controlled lab settings, the research investigates the potential of geometric self-supervision combined with temporal information to improve model performance in real-world scenarios. A Temporal Dilated Convolutional Network (TDCN) model, employing Kalman filter post-processing, is proposed and evaluated on both ground-truth and in-the-wild data from the Human3.6M dataset. The results demonstrate a competitive Mean Per Joint Position Error (MPJPE) of 62.09 mm on unseen data, indicating a promising direction for self-supervised learning in 3D HPE and suggesting a viable pathway towards reducing the gap with fully supervised methods. This study underscores the value of self-supervised temporal dynamics in advancing pose estimation techniques, potentially making them more accessible and broadly applicable in real-world applications.
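For readers unfamiliar with the metric quoted above, MPJPE is conventionally the Euclidean distance between predicted and reference joint positions, averaged over all joints and frames. The sketch below shows that standard definition only; the function name, the nested-list input layout, and the toy coordinates are assumptions for illustration, and any alignment or normalisation step used in the dissertation is not reproduced.

```python
import math


def mpjpe(pred, target):
    """Mean per joint position error.

    Both inputs are nested sequences shaped [frames][joints][3],
    in the same units (e.g. millimetres).
    """
    errors = [
        math.dist(p, t)  # Euclidean distance for one joint in one frame
        for pred_frame, target_frame in zip(pred, target)
        for p, t in zip(pred_frame, target_frame)
    ]
    if not errors:
        raise ValueError("no joints to compare")
    return sum(errors) / len(errors)


# Toy example: one frame, two joints (made-up coordinates in mm).
target = [[(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]]
pred = [[(3.0, 4.0, 0.0), (100.0, 0.0, 0.0)]]
print(mpjpe(pred, target))  # 2.5 (errors of 5 mm and 0 mm, averaged)
```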
Two-dimensional turbulent classical and momentumless thermal wakes
(University of the Witwatersrand, Johannesburg, 2023-07) Mubai, Erick; Mason, David Paul
The two-dimensional classical turbulent thermal wake and the two-dimensional momentumless turbulent thermal wake are studied. The governing partial differential equations result from Reynolds averaging the Navier-Stokes, continuity, and energy balance equations. The averaged Navier-Stokes and energy balance equations are closed using the Boussinesq hypothesis and an analogy of Fourier’s law of heat conduction. They are further simplified using the boundary layer approximation. This leads to one momentum equation with the continuity equation for an incompressible fluid and one thermal energy equation. The partial differential equations are written in terms of a stream function for the mean velocity deficit, which identically satisfies the continuity equation, and the mean temperature difference, which vanishes on the boundary of the wake. The mixing length model and a model that assumes that the eddy viscosity and eddy thermal conductivity depend on spatial variables only are analysed. We extend the von Kármán similarity hypothesis to thermal wakes and derive a new thermal mixing length. It is shown that the kinematic viscosity and thermal conductivity play an important role in the mathematical analysis of turbulent thermal wakes. We obtain and use conservation laws and associated Lie point symmetries to reduce the governing partial differential equations to ordinary differential equations. As a result, we find new analytical solutions for the two-dimensional turbulent thermal classical wake and momentumless wake. When the ordinary differential equations cannot be solved analytically, we use a numerical shooting method that uses the two conserved quantities as the targets.