JOHOR BAHRU, 28th March – An online structured course, Computer Vision: A Discussion on its Opportunities and Challenges, was organized to share trends and issues in computer vision. The course ran from 10 am to 12 pm on the Cisco WebEx online conferencing platform. It was organized by the Postgraduate Student Society School of Computing (PGSS-SC), with Muhamad Farhin Harun as the moderator. The session was held successfully with the help of PGSS-SC committee members Muhammad Zafran Muhammad Zaly Shah, Asraful Syifaa’ Ahmad and Muhammad Anwar Ahmad.
The course received 683 online registrations, and 141 of those registrants attended the event. Of the participants, 47.5% were at Ph.D. level, 11.3% at master’s level, and 13.4% at bachelor’s level. UTM staff also took part. Additionally, 75.8% of the participants were from UTM, and the rest were non-UTM participants and alumni.
The honorable speaker for the program was Dr. Md. Sah Bin Hj. Salam. Dr. Md. Sah is currently a senior lecturer at the School of Computing, Universiti Teknologi Malaysia, and a senior researcher at the Vicubelab UTM Research Group. His work focuses on multidisciplinary research in artificial intelligence, speech and pattern recognition, emotion detection, speech and signal processing, and image processing.
He received his Bachelor of Science (Computer Science) from the University of Pittsburgh, USA. He later pursued his master’s and PhD studies in Computer Science, majoring in Speech Processing and AI, at UTM. According to the UTM Scholars website, Dr. Md. Sah has 38 publications and 18 students under his supervision.
The course started at 10 am with the moderator briefly introducing Dr. Md. Sah’s background. The floor was then given to the speaker, who elaborated on his background, including his past achievements, before opening the talk with an introduction to computer vision. He began with a slide of definitions, which in general describe computer vision as a field of Artificial Intelligence (AI) that enables computers and systems to derive meaningful information from images and videos and to take suitable actions or make recommendations.
He then moved on to a brief history of computer vision, presented as a timeline. One notable event was the first computer vision project at MIT in 1960. Since then, there have been many achievements in the field. He went on to show more major milestones, including object detection with deep convolutional neural networks (CNNs). Examples of such neural network models include R-CNN, Mask R-CNN, YOLOv3 and YOLOv4. Next, he showed a slide depicting how classification error has decreased over the years as neural network models have advanced.
Afterwards, he touched on the image processing hierarchy in computer vision, which is divided into low level (basic operations such as image sharpening and deblurring), mid level (feature extraction), and high level (analysis and interpretation of the image). He then showed videos of computer vision applications, such as a camera recognizing people moving in a room, a robotic arm that detects moving objects on a conveyor belt and sorts them, self-driving cars, crop management, cashless shopping, and attendance recording. He also discussed the advantages and issues raised by each video. Before moving on to the next topic, he briefly talked about the growing computer vision industry, which is currently worth about $11 billion and is expected to grow to $25 billion in the next five to six years.
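As a rough illustration of the low-level operations mentioned in the talk, the sketch below sharpens a small grayscale image with a 3x3 kernel. It is not taken from the speaker's slides: the function name `sharpen`, the specific kernel, and the choice to leave border pixels unchanged are all assumptions made for this example.

```python
def sharpen(image):
    """Sharpen a grayscale image given as a list of rows of 0-255 integers.

    Assumption: a common 3x3 sharpening kernel is used, and border pixels
    are copied through unchanged to keep the example short.
    """
    kernel = [
        [0, -1, 0],
        [-1, 5, -1],
        [0, -1, 0],
    ]
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]  # start from a copy of the input
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = 0
            # Weighted sum of the pixel and its 3x3 neighbourhood.
            for ky in range(3):
                for kx in range(3):
                    total += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            # Clamp to the valid 8-bit intensity range.
            result[y][x] = max(0, min(255, total))
    return result


# A uniform patch has no edges, so sharpening leaves it unchanged.
flat = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
# A slightly brighter centre pixel is pushed further from its neighbours.
spot = [[100, 100, 100], [100, 120, 100], [100, 100, 100]]
sharpened_flat = sharpen(flat)
sharpened_spot = sharpen(spot)
```

In this sketch the uniform patch stays the same, while the centre of the second patch moves from 120 to 200, which is the contrast boost that sharpening is meant to produce.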
The next topic covered the tasks usually performed in computer vision. These include image classification, image classification with localization, semantic segmentation, and instance segmentation. He then went over these tasks at a more technical level, describing the algorithms and methods used to perform them. Facial recognition served as the use case, and he showed another video of a real-life application: Apple’s Face ID system on its devices.
Before ending the talk, he briefly touched on image segmentation, an important task in computer vision. Image segmentation subdivides an image into its constituent parts or objects by assigning a label to each pixel in the image. He then showed some applications of image segmentation, including medical imaging, surface-crack detection in industrial operations, photo editing and autonomous driving. Finally, he concluded the talk with a summary of the topics discussed.
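The idea of assigning a label to each pixel can be sketched with a very simple approach: thresholding followed by connected-component labeling. This is only a toy illustration, not the method presented in the talk (which discussed neural network models). The function name `segment`, the threshold value, and the 4-neighbour connectivity are assumptions chosen for this example.

```python
from collections import deque


def segment(image, threshold):
    """Label each pixel of a grayscale image (list of rows of integers).

    Pixels below `threshold` get label 0 (background). Each connected group
    of brighter pixels (4-neighbour connectivity, an assumption here) gets
    its own label 1, 2, ... Returns (labels, number_of_regions).
    """
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    region_count = 0
    for y in range(height):
        for x in range(width):
            if image[y][x] < threshold or labels[y][x] != 0:
                continue  # background pixel, or already labeled
            region_count += 1
            labels[y][x] = region_count
            queue = deque([(y, x)])
            # Breadth-first flood fill over the connected foreground pixels.
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and image[ny][nx] >= threshold
                            and labels[ny][nx] == 0):
                        labels[ny][nx] = region_count
                        queue.append((ny, nx))
    return labels, region_count


# Hypothetical 3x4 image with two separate bright objects.
image = [
    [200, 200, 10, 10],
    [200, 10, 10, 180],
    [10, 10, 10, 180],
]
labels, count = segment(image, threshold=128)
```

Here every pixel ends up with a label: 0 for the dark background, 1 for the bright object in the top-left corner, and 2 for the one on the right edge.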
The speaker’s session ended around 11:40 am. Afterwards, a Q&A session was held until the end of the structured course at 12 pm. There were many excellent questions, and the speaker managed to answer all of them. To wrap up the course, a photography session was held, during which the moderator took screenshots of all the participants. Overall, the structured course ran without any major issues. There was a slight interruption of around one minute from a participant’s unmuted microphone, but it was resolved. The speaker also had a microphone issue, but the moderator notified him and it was quickly resolved as well.
To conclude, the participants appreciated all the efforts of SPS UTM and PGSS-SC as the organizers. Additionally, 131 of the 141 participants rated this structured course 4 or 5 stars overall. According to the feedback, the participants were very satisfied with Dr. Md. Sah’s explanation of and insight into the topic. This outcome will motivate PGSS-SC to organize more workshops like this.