On January 31st, 2021, an online structured course entitled “Python For Data Analysis” was organized to provide an active, hands-on tutorial on data analysis using the Pandas Python library. The course ran from 9.00 am to 12.15 pm on the Cisco Webex online conferencing platform. It was organized by the Postgraduate Student Society School of Computing (PGSS-SC), with Asraful Syifaa’ Ahmad as the moderator. The session was held successfully with the help of PGSS-SC committee members Muhammad Zafran Muhammad Zaly Shah, Muhammad Luqman Mohd Shafie, and Muhamad Farhin Harun.
This course received 278 online registrations, and 116 participants turned up for the event. However, out of the 116, only 105 participants registered during the structured course. Of these, 46.67% were PhD students, 20.0% were Master's degree students, 11.43% were Bachelor's students, and the rest were unidentified. We also received participation from UTM staff. Additionally, 86.67% were from UTM, and the remainder were from outside UTM.
The honorable speaker, Dr. Haslina Binti Hashim, is a senior lecturer at the School of Computing, Universiti Teknologi Malaysia. She received her Bachelor of Computer Science (Artificial Intelligence) from Universiti Malaya in 2003, her M.Sc. in Bioinformatics from Chalmers University, Sweden, in 2007, and her PhD in Bioinformatics from the University of Manchester, United Kingdom, in 2016. Her expertise is in Data Mining, Machine Learning, Business Intelligence, and Bioinformatics. She also has an industrial collaboration with TERAJU Johor. For this course, Dr. Haslina used a module based on the IBM Python Workshop, which she attended a few years ago. The course was divided into five topics:
1) Topic 1: Importing Dataset
2) Topic 2: Data Wrangling
3) Topic 3: Exploratory Data Analysis
4) Topic 4: Model Development
5) Topic 5: Model Evaluation.
Along with each topic, Dr. Haslina provided hands-on activities using a Jupyter notebook hosted in an online Python workspace by Google that already includes the libraries needed for all the topics. A Jupyter Notebook template was also shared with the participants so that they could carry out the hands-on activities together with Dr. Haslina. The first topic showed how to import a dataset into the Jupyter environment and how to display the dataset once it has been imported. Data wrangling, or data cleaning, was covered next to show the process of cleaning and unifying messy and complicated datasets for easy access and analysis. This is an essential step because machine learning algorithms cannot work directly with complex structured data, and clean data is also easier for analysts to analyze. Participants were taught how to transform complex label data into the simplest form of representation, which significantly helped them understand the concept of a data structure. The next topic was exploratory data analysis (EDA), a process of analyzing and investigating datasets and summarizing their main characteristics. This topic covered the analysis of the data structure to ensure that the data is correctly distributed. Dr. Haslina also showed how to handle abnormalities in the dataset.
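The three topics covered in the session can be illustrated with a short Pandas sketch. It is not the workshop's own notebook: the small car-price table, the column names, the use of `?` as a missing-value marker, the one-hot encoding of labels, and the 1.5 × IQR rule for spotting abnormal values are all assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical data standing in for the workshop dataset, which the report does not name.
RAW_CSV = """make,fuel_type,horsepower,price
audi,gas,102,13950
bmw,gas,?,16430
toyota,diesel,56,7898
toyota,gas,62,
honda,gas,76,6529
jaguar,gas,262,36000
"""

# Topic 1: import the dataset and display its first rows.
# "?" is assumed to mark missing values in this made-up file.
df = pd.read_csv(io.StringIO(RAW_CSV), na_values="?")
print(df.head())

# Topic 2: data wrangling.
# Drop rows without a price, then fill missing horsepower with the column mean.
clean = df.dropna(subset=["price"]).copy()
clean["horsepower"] = clean["horsepower"].fillna(clean["horsepower"].mean())

# Turn the text label column into simple 0/1 indicator columns (one-hot encoding).
clean = pd.get_dummies(clean, columns=["fuel_type"], dtype=int)

# Topic 3: exploratory data analysis.
# Summarize the numeric columns (count, mean, spread, quartiles).
summary = clean[["horsepower", "price"]].describe()
print(summary)

# Flag prices that fall outside 1.5 * IQR of the quartiles as possible abnormalities.
q1, q3 = clean["price"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (clean["price"] < q1 - 1.5 * iqr) | (clean["price"] > q3 + 1.5 * iqr)
outliers = clean[is_outlier]
print(outliers[["make", "price"]])
```

Whether a flagged row should be removed, corrected, or kept depends on the dataset; the sketch only identifies candidates for a closer look.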
In conclusion, artificial intelligence, or machine learning, has been one of the most in-demand industry skills in recent years. However, data analysis is also essential because, without proper data analysis skills, data cannot be well-defined at the client level, and some critical information may be mistranslated or missing. Nowadays, data is considered the ‘new oil’, which shows its importance in the industry. Only the first three topics were covered during the course because of time constraints. However, some participants were interested in the course and requested that PGSS-SC extend it with a second session to cover the remaining topics. Many questions were also asked during the talk. Most of the participants were satisfied with the choice of speaker, and the topic gave them much insight. The participants also appreciated all the efforts of SPS UTM and PGSS-SC as the organizers. Additionally, 105 participants rated this structured course 4 or 5 stars overall. The outcome of this course is a kick-start for PGSS-SC to organize more workshops like this.