This assignment involves implementing some simple Learning Automata (LA) and Reinforcement Learning (RL) strategies.

COMP 4106 – Artificial Intelligence

Winter 2020

Assignment #3

Due date: March 30, 2020

Assignment Involving Elementary Reinforcement Learning

Introduction

The goal of this assignment is to have you implement some simple Learning Automata (LA)

and Reinforcement Learning (RL) strategies. The application domain is quite straightforward,

but it is typical of the domains where LA and RL can be used.

Problem Statement

The assignment consists of using LA to solve a simplified version of the Elevator Problem. A

building has an elevator that stops at k floors as in Figure 1. People can request the elevator

from floor i, and get off the elevator on floor j, where $i = 1, \ldots, k$ and $i \neq j$. At any time instant, the

probability that the elevator is requested from a floor, and the probability that the person gets

off the elevator on another floor, are given by the matrices E and L respectively (not shown),

whose components are the probabilities of a person entering and leaving the floors respectively.

After each trip, the elevator can move to rest at one of the k different floors so as to minimize

the waiting time for the next passenger.

[Figure 1 image: an algorithm box (label partly lost in extraction) asking "Where shall I park?", beside floors labelled Floor K, ..., Floor 2, Floor 1.]
Figure 1: A simple model of an Elevator system, where the car can be made to wait at one of
k possible floors.
The distributions E and L are unknown to the LA/RL scheme. However, the unknown
waiting time can be quantified as follows. Every index $i \in \{1, 2, \ldots, k\}$ is mapped to a unique number
$G(i) \in \{1, 2, \ldots, k\}$ by a one-to-one and onto mapping. In this assignment, you can assume that
$k = 6$. Then, the waiting time $f_i$ (which is unknown to the LA/RL system) obeys the equation:
\[ f_i = 0.8\,G(i) + 0.4\,\mathrm{CEIL}\!\left[\frac{G(i)}{2}\right] + h, \qquad i \in \{1, 2, \ldots, 6\}, \]
where the random waiting times are affected by a noise, $h$, that has a Gaussian distribution
with a mean of 0 and a variance given by an input parameter, $\sigma^2$, and where CEIL[·] is the "Ceiling"
function of its argument. In other words, at any time $t$, $h$ is a random number generated
from a Gaussian distribution that represents the random delays at the respective floors. In
this assignment, you can use a Gaussian random number generator from a standard existing
library available in the language that you are using.
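As a rough illustration of the waiting-time model above, the following Python sketch draws a random mapping G and samples noisy waiting times. The function names (`make_mapping`, `mean_wait`, `sample_wait`) are hypothetical and not part of the assignment. Note that `random.gauss` takes a standard deviation, so `sigma` here is assumed to be the square root of the variance input parameter.

```python
import math
import random

K = 6  # number of floors, as stated in the assignment


def make_mapping(k=K, rng=random):
    """Return a random one-to-one and onto mapping G as a dict {i: G(i)}."""
    values = list(range(1, k + 1))
    rng.shuffle(values)
    return {i: g for i, g in zip(range(1, k + 1), values)}


def mean_wait(g_i):
    """Noise-free part of the waiting time: 0.8*G(i) + 0.4*CEIL[G(i)/2]."""
    return 0.8 * g_i + 0.4 * math.ceil(g_i / 2)


def sample_wait(i, mapping, sigma, rng=random):
    """Draw one noisy waiting time f_i for parking floor i.

    ASSUMPTION: sigma is the standard deviation of the noise h,
    i.e. the square root of the variance parameter.
    """
    return mean_wait(mapping[i]) + rng.gauss(0.0, sigma)
```

For example, `sample_wait(3, make_mapping(), sigma=0.5)` returns one noisy observation for floor 3. The learning scheme only ever sees such samples, never the mapping itself.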
Assignment Objectives
These are the tasks you have to do:
- Learn the best floor for the car to wait at using the Tsetlin, Krinsky, Krylov and $L_{RI}$ schemes.
- In each case, use a suitable value for the "memory" and learning parameter.
- In each case, plot the ensemble average of the waiting time for an ensemble of 100 experiments.
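The tasks above can be sketched in Python for two of the four schemes. This is a minimal illustration under stated assumptions, not a reference solution. The assignment does not say how a waiting time becomes a reward or penalty, so `is_reward` (reward with probability `1 - wait / w_max`, with `w_max = 7.0`) is purely an assumed rule. The class names, the 0-based floor indexing, and the parameter values (`memory=5`, `lam=0.05`, `sigma=0.5`, 500 steps) are likewise hypothetical choices.

```python
import math
import random

K = 6  # number of floors (actions), as in the assignment


def sample_wait(g_i, sigma, rng):
    """Noisy waiting time for a floor whose mapped value is G(i) = g_i."""
    return 0.8 * g_i + 0.4 * math.ceil(g_i / 2) + rng.gauss(0.0, sigma)


def is_reward(wait, rng, w_max=7.0):
    """ASSUMPTION: reward with probability 1 - wait / w_max, clipped to [0, 1]."""
    return rng.random() < min(1.0, max(0.0, 1.0 - wait / w_max))


class Tsetlin:
    """Fixed-structure automaton with `memory` states per action (1 = innermost)."""

    def __init__(self, memory, rng):
        self.memory = memory
        self.action = rng.randrange(K)
        self.depth = memory  # start in the boundary state of a random action

    def choose(self, rng):
        return self.action

    def update(self, rewarded):
        if rewarded:
            self.depth = max(1, self.depth - 1)  # move one state inwards
        elif self.depth < self.memory:
            self.depth += 1  # move one state towards the boundary
        else:
            self.action = (self.action + 1) % K  # switch, entering at the boundary


class LRI:
    """Linear Reward-Inaction scheme with learning parameter `lam`."""

    def __init__(self, lam):
        self.lam = lam
        self.p = [1.0 / K] * K
        self.last = 0

    def choose(self, rng):
        self.last = rng.choices(range(K), weights=self.p)[0]
        return self.last

    def update(self, rewarded):
        if rewarded:  # on a penalty the probabilities are left unchanged
            self.p = [(1.0 - self.lam) * pj for pj in self.p]
            self.p[self.last] += self.lam


def run(automaton, mapping, sigma, steps, rng):
    """One experiment; returns the waiting time observed at each step."""
    waits = []
    for _ in range(steps):
        floor = automaton.choose(rng)
        wait = sample_wait(mapping[floor], sigma, rng)
        waits.append(wait)
        automaton.update(is_reward(wait, rng))
    return waits


def ensemble_average(runs):
    """Average several waiting-time curves step by step."""
    return [sum(col) / len(col) for col in zip(*runs)]


rng = random.Random(0)
mapping = rng.sample(range(1, K + 1), K)  # G as a list: 0-based floor -> G(i)
lri_curve = ensemble_average(
    [run(LRI(lam=0.05), mapping, 0.5, 500, rng) for _ in range(100)])
tsetlin_curve = ensemble_average(
    [run(Tsetlin(memory=5, rng=rng), mapping, 0.5, 500, rng) for _ in range(100)])
```

Each resulting curve is a list of 500 ensemble-averaged waiting times that can be passed to any plotting library. The Krinsky and Krylov schemes would differ from `Tsetlin` only in their `update` rule.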
Questions
During the demo you should be prepared to discuss the following questions:
- Explain how you chose the parameters for each scheme.
- Can you rank the schemes in terms of their speed/accuracy?
