Document Type



This item is available under a Creative Commons License for non-commercial use only



Publication Details

A dissertation submitted in partial fulfilment of the requirements of Technological University Dublin for the degree of M.Sc. in Computing (Data Analytics), 2018.


This thesis investigates the different approaches to video object segmentation and the current state of the art in the discipline, focusing on the deep learning techniques used to solve the problem. The primary contribution of the thesis is an investigation of the usefulness of Exponential Linear Units (ELU) as activation functions for deep convolutional neural architectures trained to perform semi-supervised object segmentation in videos. Mask R-CNN was chosen as the base convolutional neural architecture, with a view to extending the image segmentation algorithm to videos. Two models were created, one with Rectified Linear Units (ReLU) and the other with Exponential Linear Units as the respective activation functions. For each sequence in the test dataset, the models were instantiated and fine-tuned on the first frame before predicting segmentations, so that they focused on the principal object in the video. Mean Jaccard index was the metric chosen to evaluate the performance of the models. No significant difference was found between the performance of the two models on the test dataset. A qualitative analysis of the performance of the model with ReLU activation functions was conducted to understand its strengths and weaknesses. The thesis concludes with an overview of the work, a discussion of its limitations, and recommendations for future work that could extend it.
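
For readers unfamiliar with the terms used in the abstract, the following is a minimal sketch of the two activation functions being compared and of the mean Jaccard index used as the evaluation metric. It is an illustration under stated assumptions, not the code used in the thesis: the function names, the default `alpha=1.0` for ELU, and the convention of scoring two empty masks as 1.0 are all choices made here for the example.

```python
import numpy as np


def relu(x):
    """Rectified Linear Unit: max(x, 0), applied element-wise."""
    return np.maximum(x, 0.0)


def elu(x, alpha=1.0):
    """Exponential Linear Unit: x if x > 0, else alpha * (exp(x) - 1).

    alpha=1.0 is an assumed default, not a value taken from the thesis.
    """
    x = np.asarray(x, dtype=float)
    # Clamp to <= 0 before expm1 so large positive inputs cannot overflow.
    return np.where(x > 0, x, alpha * np.expm1(np.minimum(x, 0.0)))


def jaccard_index(pred, truth):
    """Intersection over union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return float(np.logical_and(pred, truth).sum() / union)


def mean_jaccard(preds, truths):
    """Average the per-frame Jaccard index over paired masks."""
    return float(np.mean([jaccard_index(p, t) for p, t in zip(preds, truths)]))
```

Unlike ReLU, ELU returns small negative values for negative inputs instead of zero, which is the behavioural difference the two models in the thesis are meant to isolate.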