Computer Vision

Computer Vision was the second module I took, alongside Natural Language Processing, as part of the AI specialisation track. I had already been exposed to Computer Vision through internships and projects, so I was excited to do more advanced coding in the area and to explore more kinds of models and techniques.

To my surprise, the course focused more on foundational aspects, such as the properties of images and the approaches that existed before the advent of neural networks. The first half covered imaging in the spatial and frequency domains and edge processing methods such as Canny edge detection. The second half covered region processing methods such as the Otsu algorithm, imaging geometry (including conversions between the image and physical frames), stereo vision, parallax and point matching, and object recognition.

I was especially challenged by the frequency domain imaging topics. I could not pick up the concepts taught in class and needed further help from YouTube videos and papers on the internet. Researching the topic helped me understand it a little, but not enough to be confident in it.

The course was highly theoretical, with very few practical assignments. Assessment was based primarily on labs and a quiz for the first half, and a project for the second half. The labs dealt with concepts taught in the lectures, such as edge detection and frequency domain imaging. The quiz was much harder than the lecture material, and I did not perform well in it.

The project in the second half was to implement a custom Otsu algorithm in Python and improve it so that the text in two given sample images would be accurately recognised. The project was done in pairs, and I teamed up with an acquaintance who was taking the course with me. We both needed to do well in the project, since the quiz had been difficult for both of us. We worked well together and achieved a good result.
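For reference, the core of the basic Otsu algorithm can be sketched as below. This is an illustrative version only, not the project submission: it assumes an 8-bit grayscale image held in a NumPy array, the function name otsu_threshold is hypothetical, and it leaves out the improvements made for the sample images.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the grey level t that maximises the between-class variance.

    Assumes `gray` is an integer array with values in 0..255.
    Pixels <= t form one class and pixels > t form the other.
    """
    # Normalised histogram: probability of each grey level.
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    # Cumulative class weight and cumulative mean for every candidate t.
    omega0 = np.cumsum(prob)
    mu_cum = np.cumsum(prob * levels)
    mu_total = mu_cum[-1]

    # Between-class variance:
    # (mu_total * omega0 - mu_cum)^2 / (omega0 * (1 - omega0)).
    numer = (mu_total * omega0 - mu_cum) ** 2
    denom = omega0 * (1.0 - omega0)

    # Thresholds that leave one class empty keep a variance of zero.
    sigma_b = np.zeros(256)
    valid = denom > 1e-12
    sigma_b[valid] = numer[valid] / denom[valid]

    return int(np.argmax(sigma_b))


if __name__ == "__main__":
    # Synthetic bimodal image: dark pixels at 50, bright pixels at 200.
    img = np.array([[50] * 4 + [200] * 4] * 4, dtype=np.uint8)
    t = otsu_threshold(img)
    binary = img > t  # True for the bright class
    print(t, int(binary.sum()))
```

The sketch computes one global threshold from the histogram. An unevenly lit image would probably need something more, such as applying the threshold per region, which is the kind of limitation the project improvements were meant to address.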

Overall, this module was very different from what I had expected, as there was no coding with CNN models, which is the area I am more interested in. The content was not only unfamiliar but also challenging, and I needed to put in a lot of effort to understand the topics well enough for the labs and the quiz.

Not every battle results in a victory. As I have had similar experiences before, I was not fazed by the average grade I received in this module and focused on working towards doing better in the upcoming and final semester.

Keywords
  • Histogram Analysis
  • Fourier Domain Imaging
  • Otsu Algorithm
  • Object Recognition