Seminars and Defenses

All are welcome and encouraged to attend the seminars and defenses.

Sep 13, 9:30AM, ENG460
Neural Embedding-based Metrics for Pre-Retrieval Query Performance Prediction
Negar Arabzadehghahyazi • MASC ORAL EXAM
Query Performance Prediction (QPP) is concerned with estimating the effectiveness of a query within the context of a retrieval model. This is important as it allows for the efficient execution of operations such as query routing and segmentation, leading to improved retrieval performance. Pre-retrieval QPP methods are oblivious to the performance of the retrieval model as they predict query difficulty prior to observing the set of documents retrieved for the query. Among pre-retrieval query performance predictors, specificity-based metrics investigate how corpus, query and corpus-query level statistics can be used to predict the performance of the query. In this thesis, we explore how neural embeddings can be utilized to define corpus-independent and semantics-aware specificity metrics. In particular, we propose to measure term specificity based on the distribution of terms in the neighborhood of the given term in the embedding space. Our metrics are based on the intuition that a term that is closely surrounded by other terms in the embedding space is more likely to be specific, while a term surrounded by less closely related terms is more likely to be generic. On this basis, we leverage geometric properties between embedded terms to define four groups of metrics: (1) neighborhood-based, (2) graph-based, (3) cluster-based and (4) vector-based metrics. Moreover, we employ learning-to-rank techniques to analyze the importance of individual specificity metrics. To evaluate the proposed metrics, we have curated and publicly shared a test collection of term specificity measurements defined based on the Wikipedia category hierarchy and the DMOZ taxonomy. We report on our extensive experiments on the effectiveness of our metrics through metric comparison, ablation study and comparison against state-of-the-art baselines.
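As a rough illustration of the neighborhood-based intuition, the following sketch scores a term by the mean cosine similarity of its k nearest neighbours in the embedding space; the toy vectors, the function name, and the exact scoring rule are illustrative assumptions, not the thesis's actual metrics:

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense term vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def neighborhood_specificity(term, embeddings, k=3):
    """Score a term by the mean similarity of its k nearest neighbours:
    a tightly clustered neighbourhood suggests a more specific term,
    a loose one suggests a more generic term."""
    sims = sorted(
        (cosine(embeddings[term], vec)
         for other, vec in embeddings.items() if other != term),
        reverse=True,
    )
    return sum(sims[:k]) / k

# Toy embeddings: 'neurosurgery' sits in a tight cluster of related
# terms, while 'thing' is surrounded by loosely related ones.
toy = {
    "neurosurgery": [0.9, 0.1, 0.0],
    "craniotomy":   [0.88, 0.12, 0.02],
    "osteotomy":    [0.85, 0.15, 0.05],
    "thing":        [0.3, 0.4, 0.5],
    "stuff":        [0.1, 0.8, 0.2],
}
assert neighborhood_specificity("neurosurgery", toy) > \
       neighborhood_specificity("thing", toy)
```

A real predictor would compute such scores over pre-trained embeddings (e.g. word2vec vectors) and aggregate them across query terms.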
We have shown that our proposed set of pre-retrieval QPP metrics based on the properties of pre-trained neural embeddings is more effective for performance prediction than state-of-the-art methods. We report our findings based on the Robust04, ClueWeb09 and Gov2 corpora and their associated TREC topics.
Sep 6, 9AM, ENG471
Fast and Efficient Edge Fusing Network Architectures for Accurate Single Image Super-Resolution
Debjoy Chowdhury • MASC ORAL EXAM
Recovering a High-Resolution (HR) image from a Low-Resolution (LR) image is the main concept of image Super-Resolution (SR). Convolutional Neural Networks (CNNs) are becoming widely adopted in many applications, including the generation of HR images from LR images. Although CNNs are widely used with great performance improvements, there is still much room for improvement. There has always been a trade-off between the number of parameters and performance enhancement. This thesis presents a novel convolutional neural network architecture for high-scale image SR inspired by the DenseNet and ResNet architectures. In particular, modifications can be made to the convolutional layers in the network: stacking the features and reusing the weight layers to increase the receptive field. It is shown how this method can be used to expand the receptive field and improve the performance of super-resolution networks without increasing the number of trainable parameters or sacrificing computation time. These modifications can easily be integrated into any convolutional neural network to improve accuracy by efficient high-level feature extraction while reducing training time and parameter count. The proposed methods are especially effective for challenging high-scale SR due to edge and texture recovery through the expanded network receptive field. Experimental results show that the proposed model outperforms state-of-the-art methods.
Sep 5, 10AM, ENG471
Efficient Pre-Processing and Automated Post-Processing in EEG/MEG Brain Source Analysis
Younes Sadat Nejad • MASC ORAL EXAM
Analyzing Electroencephalography (EEG)/Magnetoencephalography (MEG) brain source signals allows us to understand and diagnose various brain-related activities and injuries. Due to the high complexity of these measurements and their low spatial resolution, different techniques have been employed to enhance the quality of the obtained results. The objective of this work is to employ state-of-the-art approaches and develop algorithms with higher analysis efficiency. It utilizes subspace denoising and artifact removal approaches such as Independent Component Analysis (ICA) and Signal Space Projection (SSP), and provides a method that automates and improves the estimation of the Number of Components (NoC) for artifacts such as Eye Blinking (EB). Using synthetic EEG and real MEG data, it is shown that the proposed method is faster and more efficient than the conventional manual method in estimating the NoC. The thesis is also devoted to improving source localization techniques, which aim to estimate the location of the source within the brain given the time-series measurements. In this context, after obtaining practical insight into the performance of the popular L2-regularization based approaches, a post-processing thresholding method is introduced. The proposed method improves the spatial resolution of L2-regularization inverse solutions, especially for Standard Low-Resolution Electromagnetic Tomography (sLORETA), a well-known and widely used inverse solution. The proposed method for noise variance estimation combines the kurtosis statistical parameter and data entropy. This new noise variance estimation technique is superior to the existing ones and results in more efficient post-processing thresholding. The algorithm is evaluated using synthetic EEG data and well-established evaluation metrics.
It is shown that our proposed solution improves the resolution of conventional methods in the process of thresholding/denoising without loss of any critical information.
Sep 5, 2PM, ENG471
Autonomous PEV Charging Scheduling Using Deep-Q Network and Dyna-Q Reinforcement Learning
This thesis proposes a demand response method that aims to reduce the long-term charging cost of a plug-in electric vehicle (PEV) while overcoming obstacles such as the stochastic nature of the user's driving behaviour, traffic conditions, energy usage, and energy prices. The problem is formulated as a Markov Decision Process (MDP) with unknown transition probabilities and solved using deep reinforcement learning (RL) techniques. Existing methods using machine learning either require initial user behaviour data or converge far too slowly. This method does not require any initial data on the PEV owner's driving behaviour and shows improvement in learning speed. A combination of both model-based and model-free learning, called the Dyna-Q algorithm, is utilized. Every time a real experience is obtained, the model is updated and the RL agent learns from both the real data set and "imagined" experience from the model. Due to the vast state space, a table-lookup method is impractical, and a value approximation method using deep neural networks is employed to estimate the long-term expected reward of all state-action pairs. An average of historical prices is used to predict future prices. Simulation results demonstrate the effectiveness of this approach and its ability to reach an optimal policy more quickly, while avoiding state of charge (SOC) depletion during trips, when compared to existing PEV charging schemes.
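The Dyna-Q loop described above (a direct update from real experience, a model update, then several "imagined" planning updates from the model) can be sketched in tabular form; the toy charging chain, the hyperparameters, and the tabular Q (in place of the deep network used in the thesis) are all simplifying assumptions:

```python
import random

def dyna_q(env_step, n_states, n_actions, episodes=200,
           planning_steps=10, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Dyna-Q: after every real transition the agent performs
    (1) a direct Q-learning update and (2) `planning_steps` replayed
    updates from a learned deterministic model."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    model = {}  # (state, action) -> (reward, next_state, done)
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = (random.randrange(n_actions) if random.random() < eps
                 else max(range(n_actions), key=lambda x: Q[s][x]))
            r, s2, done = env_step(s, a)
            # direct RL update from the real experience
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) * (not done) - Q[s][a])
            # model learning: remember what this state-action pair did
            model[(s, a)] = (r, s2, done)
            # planning: learn from imagined experience drawn from the model
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2, pdone) = random.choice(list(model.items()))
                Q[ps][pa] += alpha * (pr + gamma * max(Q[ps2]) * (not pdone)
                                      - Q[ps][pa])
            s = s2
    return Q

# Hypothetical 'charging' chain: action 1 (charge) moves right toward a
# reward at the last state; action 0 (idle) stays put at a small cost.
def env_step(s, a, n=5):
    if a == 1:
        s2 = s + 1
        return (1.0, s2, True) if s2 == n - 1 else (-0.1, s2, False)
    return -0.1, s, False

random.seed(0)
Q = dyna_q(env_step, n_states=5, n_actions=2)
# Charging should dominate idling in the start state.
assert Q[0][1] > Q[0][0]
```

The planning loop is what distinguishes Dyna-Q from plain Q-learning: each real transition is amortized over many cheap simulated updates, which is the source of the faster learning the abstract describes.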
Sep 5, 2:30PM, ENG460
Multimodal Information Fusion for Human Action Recognition
This thesis presents three frameworks for human action recognition to facilitate better recognition performance. The first framework aims to fuse handcrafted features from four different modalities including RGB, depth, skeleton, and accelerometer data. Moreover, a new skeleton descriptor is proposed which provides a discriminative representation of the poses of the action. Aiming to find a more discriminative subspace, two fusion techniques are proposed in the first framework: Bimodal Hybrid Centroid Canonical Correlation Analysis (BHCCCA) for two sets of features or modalities, and Multimodal Hybrid Centroid Canonical Correlation Analysis (MHCCCA) for three or more. The second framework fuses handcrafted and deep learning features from three modalities including RGB, depth, and skeleton. In this framework, a new depth representation is introduced which extracts the final representation using a deep ConvNet. The proposed fusion techniques form the backbone of the framework: Biset Globality Locality Preserving Canonical Correlation Analysis (BGLPCCA) for two sets of features or modalities and Multiset Globality Locality Preserving Canonical Correlation Analysis (MGLPCCA) for three or more. BGLPCCA/MGLPCCA aim to preserve the local and global structures of the data while maximizing the correlation among different modalities or sets. The third framework uses deep learning techniques to improve long-term temporal modelling through two proposed techniques: a Temporal Relational Network (TRN) and a Temporal Second Order Pooling Based Network (T-SOPN). Also, a Global-Local Network (GLN) and a Fuse-Inception Network (FIN) are proposed to encourage the network to learn complementary information about the action and the scene itself.
Qualitative and quantitative experiments are conducted on nine different datasets and the experimental results demonstrate the effectiveness of the proposed frameworks over state-of-the-art techniques.
Sep 5, 10AM, ENG460
Towards improved medical image segmentation using deep learning
Nabila Abraham • MASC ORAL EXAM
Convolutional neural networks have been asserted to be fast and precise frameworks with great potential in image segmentation. Within the medical domain, image segmentation is a precursor to several applications including surgical simulations, treatment planning and patient prognosis. In this thesis, we explore common limitations of current segmentation practices and propose two modular solutions. One of the major challenges in utilizing deep networks arises when the dataset presents unbalanced classes. This is often the case with medical imaging, as the regions of interest are typically significantly smaller in volume than the background class. In this thesis, we propose an improvement to the current gold standard cost function to boost the focus of the network on the smaller classes. Another problem within medical imaging is the variation in both anatomy and pathology across patients. Utilizing multiple imaging modalities provides complementary, segmentation-specific information and is commonly employed by radiologists when contouring data. We propose an image fusion strategy for multi-modal data that uses the variation in modality-specific features to guide the task-specific learning. Together, our contributions form a framework that maximizes the representational power of the dataset using models with less complexity and higher generalizability. Our contributions consistently outperform baseline models for multi-class segmentation and are modular enough to be scaled up to deeper networks.
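One well-known way to re-weight a Dice-style segmentation loss toward a small foreground class is the Tversky loss, which penalizes false negatives and false positives asymmetrically. The sketch below is a hedged, flattened-pixel illustration of that general idea, not necessarily the thesis's exact cost function:

```python
def tversky_loss(pred, target, alpha=0.7, beta=0.3, eps=1e-6):
    """Tversky loss over flattened soft predictions: generalizes the
    Dice loss by weighting false negatives (alpha) and false positives
    (beta) separately, so the penalty can be tilted toward missing a
    small foreground class. alpha = beta = 0.5 recovers Dice."""
    tp = sum(p * t for p, t in zip(pred, target))        # true positives
    fn = sum((1 - p) * t for p, t in zip(pred, target))  # missed foreground
    fp = sum(p * (1 - t) for p, t in zip(pred, target))  # false alarms
    return 1.0 - (tp + eps) / (tp + alpha * fn + beta * fp + eps)

# A small lesion (two foreground pixels) that the model half-misses:
target = [0, 0, 0, 0, 0, 0, 1, 1]
pred   = [0.0, 0.0, 0.1, 0.0, 0.0, 0.0, 0.9, 0.1]
# Weighting false negatives more heavily (alpha > beta) makes the
# missed foreground pixel cost more than the symmetric setting does.
assert tversky_loss(pred, target, alpha=0.7, beta=0.3) > \
       tversky_loss(pred, target, alpha=0.3, beta=0.7)
```

In a real network this would be computed over tensors per batch, but the asymmetry between the two error types is the whole mechanism.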
Aug 30, 10AM, ENG460
Modelling and Current-Mode Control of a Modular Multilevel DC-DC Converter
Sandeep Kaler • MASC ORAL EXAM
The visions of multi-terminal direct-current (MTDC) grids, DC distribution systems for densely populated urban areas, and DC microgrids for more straightforward integration of distributed energy resources (including renewable energies, electric vehicles, and energy storage devices) have sparked a great deal of research and development in the recent past. An enabling technology towards the fulfilment of these visions is the class of efficient, highly controllable, and fault-tolerant AC-DC and DC-DC electronic power converters capable of interfacing networks that operate at different voltage levels. This thesis thus presents the results of an in-depth investigation into the operation and control of a particular class of DC-DC converters. The DC-DC converter studied in this thesis is based upon the so-called modular multilevel converter (MMC) configuration, employing half-bridge submodules and no galvanic isolation. The thesis first presents the governing dynamic and steady-state equations for the converter. Then, based on the developed mathematical model, it identifies suitable variables, strategies, and feedback loops for the regulation of the submodule DC voltages as well as the converter power throughput. In particular, two current-control loops are proposed that, in coordination with one another, not only enable the control of the power flow within the converter, but also promise protection against overloads and terminal shorts. The validity of the mathematical model and the effectiveness of the proposed control are verified through off-line simulation of a detailed circuit model as well as experiments conducted on a 1-kW experimental setup. The results of this exercise motivate the extension of the proposed control method to more compact designs with galvanic isolation and enhanced power handling capabilities.
Aug 27, 10AM, ENG460
Athlete Health Prediction Using Machine Learning Methods
Md Raihan Sharif • MASC ORAL EXAM
Due to growing activity in sports, the prediction of athletes' health (AH) has recently become an important research topic. However, it is a challenging task to predict AH due to the nature of the data and the limitations of predictive models. The main objective of this work is to develop appropriate models that can forecast AH using historical data. This work will enable sports organizations to monitor the well-being of their athletes.
In this thesis, we investigate various machine learning (ML) methods for predicting AH. Traditional ML methods do not perform well for class-imbalanced data, as these methods are biased towards the majority class. In this work, we propose ensemble-based methods which utilize sampling, bootstrap, and boosting techniques to improve classification performance. Various metrics are used to evaluate and compare model performance. Our results show the superiority of ensemble-based methods over traditional approaches. The random forest and RUSBoost classifier models in particular are found to produce the best performance in handling imbalanced classes.
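The random-undersampling step that methods such as RUSBoost apply before each boosting round can be sketched as follows; the helper name and the binary-class toy data are assumptions for illustration:

```python
import random
from collections import Counter

def random_undersample(X, y, seed=0):
    """Balance a dataset by randomly dropping samples from every class
    until each class matches the size of the smallest one (the 'RUS'
    step RUSBoost performs before fitting each weak learner)."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    n_min = min(len(samples) for samples in by_class.values())
    Xb, yb = [], []
    for label, samples in by_class.items():
        for xi in rng.sample(samples, n_min):  # drop the surplus at random
            Xb.append(xi)
            yb.append(label)
    return Xb, yb

X = list(range(100))
y = [0] * 90 + [1] * 10          # a 9:1 class imbalance
Xb, yb = random_undersample(X, y)
assert Counter(yb) == Counter({0: 10, 1: 10})
```

Undersampling before each boosting round keeps every weak learner balanced while the boosting weights still let the ensemble see (different subsets of) the full majority class over time.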
Aug 16, 11AM, ENG460
With the growing popularity of smart applications that contain computing-intensive tasks, the provision of radio and computing services with high quality is becoming more and more challenging. Therefore, supporting network scalability is crucial to accommodate the anticipated massive numbers of connected devices in future computing networks. To this end, heterogeneous cloud radio access networks (H-CRANs), software-defined networking (SDN), cooperative cloud-edge computing paradigms, and non-orthogonal multiple access (NOMA) have emerged as promising solutions to improve network scalability. In addition to scalability, energy consumption is considered one of the major challenges facing future mobile networks due to the ever-increasing number of connected devices. Therefore, it is necessary to adopt dynamic energy saving strategies to establish energy-efficient networks while maintaining the desired quality requirements. In this thesis, we present effective energy saving strategies that consider the utilization of network elements such as base stations and virtual machines, and implement on/off mechanisms taking into account the quality of service (QoS) required by mobile users. Moreover, we investigate the performance of a NOMA-based resource allocation scheme in the context of the Internet of Things, aiming to improve network scalability and reduce the energy consumption of mobile users. The system model is mainly built upon the M/M/k queueing system that has been widely accepted in most relevant works. First, the energy saving mechanism is formulated as a 0-1 knapsack problem where the weight and value of each small base station are determined by the utilization and the proportion of computing tasks at that base station, respectively. The problem is then solved using a dynamic programming approach, which shows significant energy saving performance while maintaining the cloud response time at desired levels.
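The 0-1 knapsack formulation above can be solved with the standard dynamic-programming recurrence; the weights, values and capacity below are hypothetical stand-ins for base-station utilizations, proportions of computing tasks, and the overall load budget:

```python
def knapsack(weights, values, capacity):
    """0-1 knapsack via dynamic programming: dp[c] is the best total
    value achievable with capacity c using the items seen so far."""
    dp = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # iterate capacities in reverse so each item is used at most once
        for c in range(capacity, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]

# Hypothetical small base stations: weight ~ utilization, value ~
# proportion of computing tasks served; capacity is the load budget.
weights = [3, 4, 5, 9]
values  = [4, 5, 6, 11]
assert knapsack(weights, values, capacity=12) == 15  # e.g. items 1-3
```

The DP runs in O(n x capacity) time, which is what makes an exact solution practical for a modest number of base stations.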
Afterwards, the energy saving mechanism is applied to edge computing to reduce the number of under-utilized virtual machines in edge devices. Herein, the square-root staffing rule and the Halfin-Whitt function are used to determine the minimum number of virtual machines required to maintain the queueing probability below a threshold value. On the user level, reducing energy consumption can be achieved by maximizing data rate provision to reduce the task completion time and, hence, the transmission energy. Herein, a NOMA-based scheme is introduced, particularly the sparse code multiple access (SCMA) technique, which allows subcarriers to be shared by multiple users. Not only does SCMA help provide higher data rates, but it also increases the number of accommodated users. In this context, power optimization and codebook allocation problems are formulated and solved using the water-filling and heuristic approaches, respectively. Results show that SCMA can significantly improve data rate provision and accommodate more mobile users with better user satisfaction.
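The square-root staffing rule mentioned above has a compact form: provision roughly k = R + beta * sqrt(R) servers for an offered load R = lambda/mu, with beta set by the target queueing probability (via the Halfin-Whitt function). A minimal sketch with illustrative arrival and service rates:

```python
import math

def square_root_staffing(arrival_rate, service_rate, beta=1.0):
    """Square-root staffing for an M/M/k queue: provision
    k = ceil(R + beta * sqrt(R)) servers, where R is the offered load.
    A larger beta trades extra (possibly idle) virtual machines for a
    lower probability that an arriving task has to wait in the queue."""
    R = arrival_rate / service_rate       # offered load in Erlangs
    return math.ceil(R + beta * math.sqrt(R))

# e.g. 100 tasks/s arriving, each VM serving 4 tasks/s -> R = 25,
# so one "square root" of headroom adds 5 VMs on top of the load.
assert square_root_staffing(100, 4, beta=1.0) == 30
```

In the thesis's setting, beta would be chosen so the Halfin-Whitt delay probability stays below the desired threshold; the rates here are made up for illustration.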
Aug 29, 1PM, ENG460
Yashodhan Athavale • PHD FINAL ORAL EXAM
The Internet of Things (IoT) framework is a trending model in the wake of recent advancements in wireless communications, cloud services, ubiquitous sensors, and smart devices. Today, the IoT model is rapidly being deployed in communications, infrastructure, transportation and healthcare services. The Internet of Medical Things (IoMT) is a subset of the telehealth framework and provides a layered architecture for connecting individuals with mobile devices and wearables, such that their vital physiological data can be captured and analyzed non-invasively using smart sensors embedded within these devices. Currently available wearables have embedded sensing modules for measuring movement, direction, light and pressure. Actigraphs are one such type of wearable, which exclusively employ accelerometers for capturing human movement-based vibration data. The main intention of this research work is the analysis of unstructured, non-stationary actigraphy signals. Specifically, this study has been conducted to meet the following key objectives: (i) enabling compression and denoising of actigraphy data during acquisition at the source; (ii) extracting regions of interest from the actigraphy data, highlighting specific movements; and (iii) deriving actigraphy-specific features which could be used to improve the classification accuracy of a simple machine learning tool. In order to achieve these objectives, three key contributions have been proposed and developed in this research work, namely: (A) a signal encoding framework for data compression and denoising, (B) two novel adaptive segmentation schemes which help in extracting specific movement information from actigraphy data, and (C) two key actigraphy-specific features which quantify limb movements and hence provide better classification accuracy using machine learning algorithms.
The outcome of this research work takes the form of an IoMT-based, device-independent actigraphy analysis system for identifying types of daily activity, markers for neuromuscular diseases, physical disabilities and joint disorders. In order to test the efficiency of the proposed system, five different actigraphy datasets from wake and sleep states have been used. From the various experiments conducted in this study, it was found that, in comparison to conventional signal filtering and analysis methods based on manufacturer specifications, employing the proposed set of algorithms ensured a 20-90% increase in SNR (signal-to-noise ratio), a 50-80% reduction in bit rate, 50-90% data compression, and an increase in activity recognition accuracy of 5-20%. In addition, this research work has been validated with ground truth information and machine learning approaches. Results from this systematic investigation indicate that data analysis right at the acquisition source optimizes signal denoising, memory and power savings, and activity recognition, thereby promoting an edge computing approach to physiological signal analysis using wearables in a low-resource environment.
Aug 22, 8AM, ENG460
Augmented Reality and Human Factors for the Neurosurgical Operating Room
The virtual overlay of patient-specific anatomies onto a surgical site through Augmented Reality (AR) technologies has been thought to be a potentially ideal neuronavigational system for use in neurosurgery. Although impressive and futuristic, there are many design considerations that must be taken into account, including surgeon reception, perceived utility, intuitive control and manipulation design, and overall system accuracy during surgery. To implement AR into the neurosurgical Operating Room (OR), a gradual approach of evolutionary design to ensure widespread adoption may be considered. This thesis presents a potential pathway for the introduction of AR technologies into the neurosurgical OR.
The thesis is divided into three parts: incorporation of AR features into existing platforms for improved functionality and introduction of AR concepts to surgical environments; observation and evaluation of surgeon perception of AR overlays and AR headsets to inform display methods and designs; and quantification of virtual object placement accuracy in a clinical environment. The findings presented show that AR-integrated systems improve OR workflow when conventional tracked tools are unavailable, that user preference for AR overlays onto the surgical site changes depending on operator experience level, and that the placement accuracy of state-of-the-art AR head-mounted displays is suitable for presurgical planning and very close to the accuracy needed for surgical guidance. These three elements are key to developing a pathway for adoption of AR technologies in the OR, and help to inform designs for future headsets to assist surgeons and improve patient care.
Aug 22, 10AM, ENG460
Penscriptive Depth-Controlled Robotic Laser Osteotomy
Bone cutting in surgery is currently done using unintelligent tools that depend on the proficiency of the surgeon to prevent damage to underlying critical structures. As one can imagine, damage isn't always prevented. Iatrogenic damage to the dura and sub-dural neural structures during osteotomy procedures such as a craniotomy can result in increased patient morbidity.
This dissertation proposes the development of a robot-guided laser osteotome (bone cutter) with the use of inline optical coherence tomography (OCT) to precisely control the cutting depth in real-time. The all-fiber system design integrates a high peak-power pulsed Yb-doped fiber laser (1064nm) coupled directly into the sample arm of a swept-source OCT system (λc = 1310nm) with a fourth-order power disparity between the OCT system and fiber laser. Sub-millimeter accuracy was achieved in percussion drilling of phantom and porcine bone.
Through the use of optical topographic imaging (OTI), this work presents a novel method for the surgeon to identify arbitrary trajectories for desired cuts. A surgical pencil is used to demarcate cutting trajectories for the robot to follow directly onto the bony surface. OTI combined with a novel algorithm developed through this work allows the penscribed line to be isolated and translated into spatial attitude information along which the robot guides the end-effector-mounted laser. Sub-millimeter trajectory-following accuracy was achieved. This work also demonstrates the first use of OCT in continuous, real-time refocusing of the optical end-effector in order to maintain cut quality. The laser focus was maintained within the Rayleigh length of the focused Gaussian beam for linear feed rates up to 1mm/s at a 45° surface incline. Finally, optimization of bone ablation is explored in this dissertation. The use of graphite as a high-absorption topical chromophore and the use of nitrogen as an assist gas in the form of a coaxial jet are both analyzed to determine how to achieve the highest etch rate in bone. The results in this dissertation show that the topical application of graphite significantly reduced the mean and variance of etching performance; an improvement of at least two orders of magnitude in the time to 0.5mm etch depth is demonstrated. It is also demonstrated that the etch rate during ablation can be optimized for coaxial nitrogen flow (30SCFH out of a nozzle with a 3mm output diameter); higher and lower flow rates showed slower etch rates.
It is hypothesized that a system such as the one developed in this dissertation will increase the precision of bone cutting, decrease the amount of time needed to make cuts into sensitive structures and also address certain issues of unsuccessful uptake of lasers in modern medicine.
Aug 26, 11AM, ENG460
Robust Discriminative Analysis Framework For Gaze And Head-Pose Estimation
Salahaldeen Rabba • PHD FINAL ORAL EXAM
Head movements, combined with gaze, play a fundamental role in predicting a person's action and intention. In non-constrained head movement settings, the process is complex, and performance can degrade significantly in the presence of variation in head-pose, gaze position, occlusion and ambient illumination. In this thesis, a framework is therefore proposed to fuse and combine head-pose and gaze information to obtain more robust and accurate gaze estimation.
Specific contributions include: the development of a new graph-based model for pupil localization and accurate estimation of the pupil center; the proposal of a novel iris region descriptor feature using quadtree decomposition, which works together with pupil localization for gaze estimation; the proposal of kernel-based extensions and enhancements to a fusion mechanism known as Discriminative Multiple Canonical Correlation Analysis (DMCCA) for fusing features (proposed and traditional) together to generate a refined, high-quality feature set for classification; and the development of head-pose features based on quadtree decompositions and geometrical moments, to better integrate roll, yaw, pitch and jawline into the overall estimation framework.
The experimental results demonstrate that the proposed framework is robust against variations in illumination, occlusion and head-pose, and is calibration-free. The framework is validated by achieving accurate gaze and head-pose estimation of 4.5° on the MPII, 4.4° on the Cave, 4.8° on the EYEDIAP, 5.0° on the ACS, 4.1° on the OSLO and 4.5° on the UULM datasets.
Aug 12, 1PM, ENG460
Safely Caching HOG Pyramid Feature Levels to Speed Up Facial Landmark Detection
Gareth Adam Higgins • MASC ORAL EXAM
This thesis presents an algorithm for improving the execution time of existing Histogram of Oriented Gradients (HOG) pyramid analysis based facial landmark detection. It extends the work of [1] to video data. A Bayesian Network (Bayes Net) is used as a policy network to determine when previously calculated features can be safely reused. This avoids the problem of recalculating expensive features every frame. The algorithm leverages a set of lightweight features to minimize additional overhead. Additionally, it takes advantage of the widespread adoption of H.264 encoding in consumer-grade recording devices to acquire cheap motion vectors. Experimental results on a difficult real-world data set show that the policy network is effective, and that the error introduced to the system remains relatively low. A large performance benefit is realized due to the use of the cached features.
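The reuse-or-recompute idea can be sketched with a fixed motion threshold standing in for the learned Bayes Net policy; the function names, cache layout, and threshold are assumptions for illustration only:

```python
def get_features(frame_id, motion_magnitude, compute, cache,
                 threshold=2.0):
    """Reuse-or-recompute policy sketch: if the cheap motion signal
    (e.g. summarized H.264 motion vectors) says the scene barely moved,
    return the cached HOG features instead of recomputing them. The
    thesis replaces this fixed threshold with a Bayes Net decision."""
    if cache and motion_magnitude < threshold:
        cache["hits"] += 1
        return cache["features"]        # safe to reuse: cheap path
    cache["features"] = compute(frame_id)  # expensive HOG pyramid
    cache["hits"] = cache.get("hits", 0)
    return cache["features"]

calls = []
def expensive_hog(frame_id):
    calls.append(frame_id)              # track how often we pay full cost
    return f"hog-{frame_id}"

cache = {}
get_features(0, 10.0, expensive_hog, cache)  # first frame: must compute
get_features(1, 0.5, expensive_hog, cache)   # tiny motion: reuse cache
get_features(2, 9.0, expensive_hog, cache)   # large motion: recompute
assert calls == [0, 2] and cache["hits"] == 1
```

The speed-up comes entirely from how often the policy can answer "reuse" without pushing landmark error above an acceptable level, which is what the thesis evaluates.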