<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">

<head>

<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">

<meta name="Generator" content="Microsoft Word 15 (filtered medium)">

<style><!--

/* Font Definitions */

@font-face

        {font-family:"Cambria Math";

        panose-1:2 4 5 3 5 4 6 3 2 4;}

@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}

/* Style Definitions */

p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0cm;

        margin-bottom:.0001pt;

        font-size:11.0pt;

        font-family:"Calibri",sans-serif;

        mso-fareast-language:EN-US;}

a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:#0563C1;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {mso-style-priority:99;

        color:#954F72;

        text-decoration:underline;}

p.MsoPlainText, li.MsoPlainText, div.MsoPlainText

        {mso-style-priority:99;

        mso-style-link:"Plain Text Char";

        margin:0cm;

        margin-bottom:.0001pt;

        font-size:11.0pt;

        font-family:"Calibri",sans-serif;

        mso-fareast-language:EN-US;}

span.PlainTextChar

        {mso-style-name:"Plain Text Char";

        mso-style-priority:99;

        mso-style-link:"Plain Text";

        font-family:"Calibri",sans-serif;}

.MsoChpDefault

        {mso-style-type:export-only;

        font-family:"Calibri",sans-serif;

        mso-fareast-language:EN-US;}

@page WordSection1

        {size:612.0pt 792.0pt;

        margin:72.0pt 72.0pt 72.0pt 72.0pt;}

div.WordSection1

        {page:WordSection1;}

--></style><!--[if gte mso 9]><xml>

<o:shapedefaults v:ext="edit" spidmax="1026" />

</xml><![endif]--><!--[if gte mso 9]><xml>

<o:shapelayout v:ext="edit">

<o:idmap v:ext="edit" data="1" />

</o:shapelayout></xml><![endif]-->

</head>

<body lang="EN-GB" link="#0563C1" vlink="#954F72">

<div class="WordSection1">

<p class="MsoPlainText"><b>Call for Participation: Workshop on Human and Computer Models of Video Understanding<o:p></o:p></b></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText">Workshop on <b>15 May 2024</b>, University of Surrey.<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText">Dear Colleagues, <o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText">We invite participants to a workshop on “Human and Computer Models of Video Understanding”. We aim to bring together human and computer vision scientists to share the latest knowledge and collaborate.<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText">Participants can submit an abstract of approximately 500 words describing the work they would like to present, stating their preference for an oral or poster presentation. Because time is limited, we may not be able to accommodate every request for an oral presentation; we will review the submitted abstracts and assign some to oral presentation slots and others to posters.<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText">Please visit the workshop’s website to submit an abstract and/or register:<o:p></o:p></p>

<p class="MsoPlainText"><a href="https://www.ias.surrey.ac.uk/event/human-and-computer-models-of-video-understanding/">https://www.ias.surrey.ac.uk/event/human-and-computer-models-of-video-understanding/</a><o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText">Attendance is free, but there will be an optional evening dinner at an additional cost (details and a payment link will be sent nearer the date).<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText"><b>Key Dates:<o:p></o:p></b></p>

<p class="MsoPlainText">Abstract submission deadline: 29 March<o:p></o:p></p>

<p class="MsoPlainText">Notification: 5 April<o:p></o:p></p>

<p class="MsoPlainText">Workshop date: 15 May 2024, University of Surrey.<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText"><b>Invited Speakers:<o:p></o:p></b></p>

<p class="MsoPlainText">Professor Shaogang Gong, Queen Mary University of London, Queen Mary Computer Vision Laboratory<o:p></o:p></p>

<p class="MsoPlainText">Professor Frank Pollick, University of Glasgow, School of Psychology and Neuroscience<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText"><b>Call for Research Contributions<o:p></o:p></b></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText">We invite participants to a multi-disciplinary workshop on “Human and Computer Models of Video Understanding”. The core research question is: how does the human brain understand people's activities in video so much better than existing computer systems do? We invite participants from human vision science (psychology or brain sciences) and from computer vision whose work focuses on understanding activities from video.<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText">To give some concrete examples: humans can very quickly make accurate judgements about the activity in a video even when the video quality is poor or the observed motions are ambiguous, for example discriminating hugging from fighting, or smoking from eating finger food. Computers cannot match human performance on these tasks, which are critical for applications such as surveillance, monitoring safety and welfare in care settings, and removing inappropriate videos from social media. We do not yet fully understand how humans perform these feats, nor how to make computer vision systems match human performance.<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText">We invite participants from human and computer vision science to present the latest advances in their fields in language accessible to a multi-disciplinary audience. We intend to foster a cross-fertilisation of ideas between the different scientific communities, so that each can see ways to incorporate insights and techniques from the other field into its own models. Moreover, we intend to act as a “matchmaker” for new cross-disciplinary partnerships on projects that combine the techniques of the separate communities. We plan a future special issue of the journal IEEE Transactions on Cognitive and Developmental Systems, titled “Special issue on Vision Sciences for Video Understanding in Cognitive Systems”, in which multi-disciplinary teams emerging from this workshop can publish papers written for a multi-disciplinary audience.<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText">We welcome participation from both junior and senior researchers from academia and industry.<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText">Potential topics include, but are not limited to: <o:p></o:p></p>

<p class="MsoPlainText">Eye Tracking in Video, <o:p></o:p></p>

<p class="MsoPlainText">Visual Attention and Salient Features in Video, <o:p></o:p></p>

<p class="MsoPlainText">Cognitive Models for Video Understanding, <o:p></o:p></p>

<p class="MsoPlainText">Perceptual Quality in Video Streaming, <o:p></o:p></p>

<p class="MsoPlainText">Multimodal Perception in Videos, <o:p></o:p></p>

<p class="MsoPlainText">Emotion and Affect in Video Viewing, <o:p></o:p></p>

<p class="MsoPlainText">Neuroscience and Brain Imaging in Video Perception, <o:p></o:p></p>

<p class="MsoPlainText">Visual Cognition in Virtual Reality (VR) and Augmented Reality (AR), <o:p></o:p></p>

<p class="MsoPlainText">Attentional Shifts and Change Detection in Video, <o:p></o:p></p>

<p class="MsoPlainText">Video Activity Recognition, <o:p></o:p></p>

<p class="MsoPlainText">Video Object Detection and Tracking, <o:p></o:p></p>

<p class="MsoPlainText">Video Segmentation and Scene Understanding, <o:p></o:p></p>

<p class="MsoPlainText">Event Detection and Recognition in Videos, <o:p></o:p></p>

<p class="MsoPlainText">Video-Based Surveillance and Security, <o:p></o:p></p>

<p class="MsoPlainText">Spatiotemporal Action Localization, <o:p></o:p></p>

<p class="MsoPlainText">Human Vision-Inspired Video Models<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText"><b>Organisers:<o:p></o:p></b></p>

<p class="MsoPlainText">Dr Frank Guerin, University of Surrey<o:p></o:p></p>

<p class="MsoPlainText">Dr Andrew Gilbert, University of Surrey<o:p></o:p></p>

<p class="MsoPlainText">Dr Quoc Vuong, Newcastle University<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText">===================================<o:p></o:p></p>

<p class="MsoPlainText">Biosciences Institute<o:p></o:p></p>

<p class="MsoPlainText">Newcastle University<o:p></o:p></p>

<p class="MsoPlainText">Newcastle upon Tyne, UK, NE2 4HH<o:p></o:p></p>

<p class="MsoPlainText">+44 (0)191 208 6183<o:p></o:p></p>

<p class="MsoPlainText">https://www.staff.ncl.ac.uk/q.c.vuong/<o:p></o:p></p>

<p class="MsoPlainText">===================================<o:p></o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

<p class="MsoPlainText"><o:p> </o:p></p>

</div>

</body>

</html>