Please use this identifier when citing or linking to this work: http://hdl.handle.net/1946/25590
Generally intelligent robots and systems should be evaluated on their ability to learn new tasks across a wide range of domains. Few, if any, of the available evaluation methods for artificial intelligence (AI) systems address this need, and most leave out important aspects of intelligence, such as a system's ability to learn. As a result, ad-hoc evaluation methods are commonly used, and no standardized evaluation method has been widely accepted. Furthermore, the evaluation of controllers in physically realistic task-environments has been left mostly unaddressed. In short, there is vast room for improvement in how AI systems are evaluated. Since not all AI systems are alike or created equal, a toolkit with which developers could easily construct appropriate tasks for evaluating and comparing their systems across a variety of tasks would help address this. To be generally applicable, such a toolkit should answer questions about the efficiency, in both time and energy, of various control systems, so that they can be ranked by their practical utility in the most general way possible.
In this thesis we present a prototype framework for the modular construction of task-environments rooted in physics, along with its early-stage implementation, the Framework for Modular Task-Environment Construction (FraMoTEC). Simulation is used to evaluate control systems' performance in terms of expended time and energy. In our approach, tasks are dissected into dimensions to be controlled by the system under evaluation: simpler tasks contain only a few dimensions that can be controlled sequentially, while more complex tasks have a large number of dimensions, some of which must be controlled simultaneously to achieve the task. FraMoTEC's inherent modularity allows components to be flexibly modified and exchanged, so that control systems can be evaluated on a single task, on multiple tasks, or on a whole family of tasks.
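To make the idea of dimension-based task construction and time/energy accounting concrete, the Python sketch below models a minimal task-environment in this spirit: each dimension is a 1-D point mass that a controller must drive to a target position, and the environment accumulates the simulated time and a simple proxy for mechanical energy. All class, method, and parameter names here are illustrative assumptions, not FraMoTEC's actual API.

```python
class Dimension:
    """One controllable degree of freedom: a 1-D point mass that must be
    driven to a target position. Names are illustrative, not FraMoTEC's API."""
    def __init__(self, target, tolerance=0.05, mass=1.0):
        self.position = 0.0
        self.velocity = 0.0
        self.target = target
        self.tolerance = tolerance
        self.mass = mass

    def step(self, force, dt):
        # Euler-integrate F = m*a and return the energy spent this step
        # (|power| * dt, a crude mechanical-work proxy).
        self.velocity += (force / self.mass) * dt
        self.position += self.velocity * dt
        return abs(force * self.velocity) * dt

    @property
    def solved(self):
        return abs(self.position - self.target) < self.tolerance


class TaskEnvironment:
    """Composes dimensions into a task and tracks total time and energy."""
    def __init__(self, dimensions, dt=0.01):
        self.dimensions = dimensions
        self.dt = dt
        self.time = 0.0
        self.energy = 0.0

    def step(self, forces):
        # Apply one force per dimension; the task is achieved only when
        # every dimension is at its target simultaneously.
        for dim, force in zip(self.dimensions, forces):
            self.energy += dim.step(force, self.dt)
        self.time += self.dt
        return all(d.solved for d in self.dimensions)


def solve_pd(env, kp=10.0, kd=5.0, max_steps=100_000):
    """Drive the environment with a PD controller; True if solved in time."""
    for _ in range(max_steps):
        forces = [kp * (d.target - d.position) - kd * d.velocity
                  for d in env.dimensions]
        if env.step(forces):
            return True
    return False
```

A two-dimensional task is then simply `TaskEnvironment([Dimension(1.0), Dimension(-2.0)])`, with the same controller acting on both dimensions at once; after a run, `env.time` and `env.energy` provide the cost metrics on which different controllers can be compared.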
The utility of FraMoTEC as an AI evaluation framework was demonstrated by evaluating the performance of various controllers (such as SARSA reinforcement learners) on a collection of task-environments, comprising both simple tasks and a suite of scalable N-dimensional tasks. The results show that FraMoTEC allows both simple and complex state-of-the-art controllers to be flexibly evaluated on a family of physical tasks in a straightforward manner. Evaluation can proceed along the dimensions of efficiency (time, energy, or both), failures, learning rate, and so on, in any combination. Further theoretical analysis of the N-dimensional tasks indicates that the approach can scale to suit advanced controllers with higher levels of intelligence and learning capacity, making it a promising direction to pursue for AI and robotics evaluation.
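As a rough illustration of the learning-rate measurements mentioned above, the sketch below runs a tabular SARSA learner on a heavily simplified, discrete stand-in for a one-dimensional task: a chain the agent must traverse, where every step incurs a cost (a proxy for time and energy). Recording steps-to-completion per episode yields a learning curve. This is the standard SARSA algorithm on a toy task, not the thesis's actual experimental setup.

```python
import random

def sarsa_on_chain(n_states=10, episodes=200, alpha=0.1, gamma=0.95,
                   epsilon=0.1, max_steps=1000, seed=0):
    """Tabular SARSA on a 1-D chain: start in state 0, reach state n_states-1.
    Every non-terminal step earns -1 reward, so the learner is pushed toward
    the shortest path. Returns the number of steps taken in each episode."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right

    def policy(s):
        # Epsilon-greedy action selection over the two actions.
        if rng.random() < epsilon:
            return rng.randrange(2)
        return 0 if Q[s][0] >= Q[s][1] else 1

    steps_per_episode = []
    for _ in range(episodes):
        s, a, steps = 0, policy(0), 0
        while s != n_states - 1 and steps < max_steps:
            s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
            r = 0.0 if s2 == n_states - 1 else -1.0
            a2 = policy(s2)
            # On-policy TD update: the target uses the action actually
            # taken next, which is what distinguishes SARSA from Q-learning.
            Q[s][a] += alpha * (r + gamma * Q[s2][a2] - Q[s][a])
            s, a = s2, a2
            steps += 1
        steps_per_episode.append(steps)
    return steps_per_episode
```

Averaging or plotting the returned step counts gives a learning curve; in a physically grounded setting such as FraMoTEC's, the analogous curves would be measured in simulated time and energy rather than abstract steps.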