Fourth Workshop on the
Evaluation of Adaptive Systems

in conjunction with UM'05

Edinburgh, UK, July 24 - 30, 2005
 
   
 
Thanks to all participants!

The workshop's proceedings are available as a single file, and as individual papers (the presentations delivered during the workshop are now also available along with their corresponding papers).

The normative reference for these proceedings is:

S. Weibelzahl, A. Paramythis, and J. Masthoff (Eds.) Proceedings of the Fourth Workshop on the Evaluation of Adaptive Systems, held in conjunction with the 10th International Conference on User Modeling (UM'05), Edinburgh, UK, July 24th to 30th, 2005.

 

Single file

Download the full proceedings of the workshop (PDF, 1 MB).

 

Individual papers

Using the list below, you can also access paper abstracts, individual papers and, where available, the presentations delivered at the workshop.

Main Track

Evaluation Challenge Track

 


Potentials of Eye-Movement Tracking in Adaptive Systems
R. Bednarik
pp. 1-8

Abstract. Eye-movement tracking proved its potentials in many areas of human-computer interaction. Resting on a hypothesis that eye-direction and mind are linked, some of the HCI researchers have employed eye-movement trackers to investigate the visual attention focus of the participants completing their tasks. Others have used the eye-movement tracking in real-time applications, either as a direct interaction device or as an input to gaze-aware interfaces. Inspired by the previous HCI applications, we propose to utilize eye-movement trackers in adaptive systems research and development in two ways. First, the evaluations of adaptive systems could get an access to the information otherwise unavailable, as for instance to how the visual attention and cognitive processing are influenced by an adaptivity implemented into the evaluated system. Second, we propose to employ the eye-movement tracking technologies for a real-time registration of users’ loci of visual attention, therefore increasing the awareness of the adaptive systems about their current users. We discuss possible potentials, difficulties and pitfalls of eye-movement tracking when applied to adaptive systems. We argue that a methodological framework of applying eye-tracking into adaptive systems shall be developed.

Downloads. Paper (PDF, 289 KB), Presentation (PDF, 1.4 MB)

The Evaluation of in-vehicle Adaptive Systems
Talia Lavie, Joachim Meyer, Klaus Bengler, and Joseph F. Coughlin
pp. 9-18

Abstract. Although research on adaptive systems has begun only recently, studies have shown the benefits of using adaptive systems. However most of those studies have examined systems with and without adaptive qualities, disregarding additional factors that may influence the interaction. This study presents a first step towards a more comprehensive evaluation of adaptive systems. We assert that adaptive systems should be examined with regard to different types of tasks, different situations and using various users to be able to determine the conditions in which adaptivity will be beneficial. A preliminary study evaluated adaptivity when performing routine and infrequent tasks. The study showed that adaptivity is beneficial for routine tasks, and that adaptivity impairs performance of infrequent tasks. The study proposes a method to calculate the point at which adaptivity ceases to be beneficial as a function of the relative frequencies of different tasks and provides a starting point for a more comprehensive understanding of the subject.

Downloads. Paper (PDF, 298 KB), Presentation (PDF, 659 KB)

Is the ACT-value a Valid Estimate for Knowledge? An Empirical Evaluation of the Inference Mechanism of an Adaptive Help System
D. Iglezakis
pp. 19-26

Abstract. This paper reports the results of an empirical study that evaluates the inference mechanism of an adaptive help system for web-based applications. The help system adapts to a measure for the procedural knowledge that is computed from activity logfiles according to the ACT-theory of Anderson and Lebière [1]. The results of the study show that the ACT-value of procedural knowledge correlates with subjective and objective measures of performance and proves itself as a better estimate of the procedural knowledge than general computer knowledge, a measure often used by other adaptive help systems.

Downloads. Paper (PDF, 294 KB), Presentation (PDF, 108 KB)

Impacts of User Modeling on Personalization of Information Retrieval: An Evaluation with Human Intelligence Analysts
E. Santos Jr., Q. Zhao, H. Nguyen, and H. Wang
pp. 27-36

Abstract. User modeling is the key element in assisting intelligence analysts to meet the challenge of gathering relevant information from the massive amounts of available data. We have developed a dynamic user model to predict the analyst’s intent and help the information retrieval application better serve the analyst’s information needs. In order to justify the effectiveness of our user modeling approach, we have conducted a user evaluation study with actual end users, three working intelligence analysts, and compared our user model enhanced information retrieval system with a commercial off-the-shelf system, the Verity Query Language. We describe our experimental setup and the specific metrics essential to evaluate user modeling for information retrieval. The results show that our user modeling approach tracked individual’s interests, adapted to their individual searching strategies, and helped retrieve more relevant documents than the Verity Query Language system.

Downloads. Paper (PDF, 286 KB), Presentation (PDF, 133 KB)

Evaluating Scrutable Adaptive Hypertext
M. Czarkowski
pp. 37-46

Abstract. Adaptive hypertext systems personalise documents to meet the individual’s particular preferences, knowledge and goals. There is a debate over how much control should be given to the user as well as how much transparency there should be to the inner workings of the system. Some adaptive systems make the user model available to the user. We propose transparency and control should extend beyond this by involving the user in the personalisation process and granting them power to change it. Our previous evaluations of scrutable systems have revealed users have difficulty understanding and controlling personalisation. We have developed SASY with a focus on improving scrutability support tools. This paper describes our design for the evaluation of SASY.

Downloads. Paper (PDF, 416 KB), Presentation (PDF, 351 KB)

Layered Evaluation of Topic-Based Adaptation to Student Knowledge
S. Sosnovsky and P. Brusilovsky
pp. 47-56

Abstract. A user modeling server is an important part of modern distributed E-Learning architectures. The user modeling server CUMULATE has two main levels: the event storage and multiple inference agents. To evaluate adaptive systems functioning as components of the common distributed architecture and using CUMULATE as the central user modeling server we need to evaluate the adaptation provided by those agents. Unfortunately, there are no commonly accepted approaches to the evaluation of the universal user modeling server. This paper describes the results of layered evaluation of our recent topic-based adaptation engine based on the activity students performed using the system QuizGuide. User modeling and adaptation processes are evaluated separately. While previous evaluation experiments of QuizGuide based on the traditional “with-and-without” approach showed that students like the system and benefit from it, this paper provides evidence of unfitness of large topics as knowledge assessment units used for adaptation, which challenges the reasonableness of the entire adaptation performed by the system.

Downloads. Paper (PDF, 370 KB), Presentation (PDF, 545 KB)

Problems and Pitfalls in Evaluating Adaptive Systems
S. Weibelzahl
pp. 57-66

Abstract. Empirical studies with adaptive systems offer many advantages and opportunities. Nevertheless, there is still a lack of evaluation studies. This paper lists several problems and pitfalls that arise when evaluating an adaptive system and provides guidelines and recommendations for workarounds or even avoidance of these problems. Among other things the following issues are covered: relating evaluation studies to the development cycle; saving resources; specifying control conditions, sample and criteria; asking users for adaptivity effects; reporting results. An overview of existing evaluation frameworks shows which of these problems have been addressed in which way.

Downloads. Paper (PDF, 289 KB), Presentation (PDF, 536 KB)

Introduction to the First Adaptive System Evaluation Challenge
pp. 65-68

Abstract. The Fourth Workshop on Evaluation of Adaptive Systems launched the first “evaluation challenge”. The challenge concerns an adaptive system, which recommends sequences of music clips to groups of users. Focusing on a real world problem, the challenge aimed to foster the development of innovative evaluation designs as well as encourage controversial discussion during the workshop. Participation in the challenge entailed proposing an empirical evaluation design purposely created to answer specific design questions regarding the system’s modeling component. This chapter describes the task and participation requirements given to the participants. The following two chapters contain the two submissions received.

Downloads. Paper (PDF, 271 KB), Presentation (PDF, 1.1 MB)

Evaluating an Adaptive Music-Clip Recommender System
T. Zhu and R. Greiner
pp. 69-73

Abstract. In this paper, we propose an experiment design to address three evaluation goals of the “First Adaptive System Evaluation Challenge”, and demonstrate how to achieve each of these goals.

Downloads. Paper (PDF, 297 KB), Presentation (PDF, 316 KB)

Addressing Problems in the First Adaptive System Evaluation Challenge
D. Chin
pp. 74-78

Abstract. The adaptive evaluation challenge system has several problems with assumptions including ignoring other emotions, ignoring other influences on happiness and assuming ratings of 5.5 have neutral happiness impact. The recommended evaluation combines established affective response to music surveys and heart rate monitoring to measure happiness. First, ratings are recalibrated by asking users whether music clips affect their emotions positively, neutrally, or negatively. Clips that affect emotions other than happiness are filtered out and a pilot study is used to determine how many positive/negative clips are needed to maximize/minimize Happiness. Finally a custom series of clips are designed for each user in the study following standard DJ music selection practices. The measured/reported happiness of the users is compared to the proposed Happiness models to determine the best model. Ethics of the study are also discussed.

Downloads. Paper (PDF, 272 KB), Presentation (not available yet)