LAMA-WeST Seminars
Upcoming seminars
No upcoming seminars at this time.
Past seminars
The LAMA-WeST Seminar Series - Unsupervised learning to cluster events labeled as “other” in the NSIR-RT incident learning database
Purpose: In radiotherapy, clinical staff are encouraged to report incidents that occur in an incident learning system (ILS). An investigator is then assigned to follow up on each incident and label it according to radiation oncology-specific taxonomies. However, as new incidents occur, existing labels may become insufficient to categorize all of them correctly, or investigators may be unsure which label is most appropriate and choose the catch-all “other” label. As a result, many incidents are labeled as “other” in the ILS, limiting the opportunity for learning and quality improvement they would otherwise provide. In this project, we aimed to automatically relabel some of these “other” incidents with already existing labels, based on a closer inspection of their narrative texts using NLP and unsupervised ML techniques.
Method: Over 6,000 incident reports were gathered from the Canadian National System for Incident Reporting-Radiation Treatment (NSIR-RT) as well as our local ILS, which uses the NSIR-RT taxonomy. Incident descriptions from these reports were processed using various NLP techniques to obtain vectorized representations. The processed data with expert-generated labels, excluding the 2,618 incidents labeled as “other”, were clustered with the k-means algorithm based on their incident descriptions. Each cluster was automatically assigned a label based on the frequency of expert-generated labels within the cluster. Incidents labeled as “other” were then projected into the latent space to check whether they fell within the range of an existing cluster. If they did, the corresponding cluster label was used to relabel the “other” incident; if they did not, we classified them as new incident types that will require a new label.
Results: Out of the 2,618 incidents labeled as “other”, our pipeline relabeled 1,928 with existing labels and flagged 690 as new incident types. Future work will attempt to validate the relabeling of the “other” incidents and will cluster the events flagged as new incident types to identify groupings of events that may give rise to new labels.
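To make the described pipeline concrete, here is a minimal sketch of the relabeling logic, assuming TF-IDF vectors, scikit-learn's KMeans, and a per-cluster radius threshold. These are illustrative choices, not the exact vectorization or thresholds used in the project.

```python
# Sketch of the "other"-incident relabeling idea (illustrative, not the authors' exact pipeline).
import numpy as np
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def relabel_other_incidents(labeled_texts, labels, other_texts, n_clusters=20):
    vectorizer = TfidfVectorizer(max_features=5000, stop_words="english")
    X_labeled = vectorizer.fit_transform(labeled_texts)

    km = KMeans(n_clusters=n_clusters, random_state=0, n_init=10)
    cluster_ids = km.fit_predict(X_labeled)

    # Assign each cluster the most frequent expert-generated label it contains.
    cluster_label = {
        c: Counter(l for l, cid in zip(labels, cluster_ids) if cid == c).most_common(1)[0][0]
        for c in range(n_clusters)
    }

    # Per-cluster radius: largest distance from a member to its own centroid.
    dists = km.transform(X_labeled)  # distance of each labeled incident to every centroid
    radius = {c: dists[cluster_ids == c, c].max() for c in range(n_clusters)}

    # Project "other" incidents and relabel those that fall inside an existing cluster.
    other_dists = km.transform(vectorizer.transform(other_texts))
    new_labels = []
    for row in other_dists:
        c = int(np.argmin(row))
        new_labels.append(cluster_label[c] if row[c] <= radius[c] else "new incident type")
    return new_labels
```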
The LAMA-WeST Seminar Series - Applying Deep Learning for Avionic Security: An Experiment Design Study
In this presentation, we will explore the use of deep learning in the development of an intrusion detection system for avionics. The focus will be on the experiment design and the strategies used to create a state-of-the-art system that can detect security threats in real time. The presentation will also provide an opportunity for open discussion to validate the methodology and to share ideas and insights on how to further improve the approach.
The LAMA-WeST Seminar Series - Entity Typing with Natural Language Inference for Fine-grained Named Entity Recognition.
Fine-grained Named Entity Recognition (FgNER) consists of detecting named entity mentions (mention detection) and typing them (entity typing, ET) with types from a relatively large set. Typing mentions becomes harder as the number of types grows. Moreover, traditional NER architectures rely on a fixed-size classifier, which cannot be adapted to a larger type set. Recent prompt-based ET systems achieve good performance in few-shot learning over large type sets without any fixed classifier. Integrating these ET systems with various FgNER techniques might improve the performance achieved on FgNER.
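As an illustration of casting entity typing as natural language inference, the following sketch scores type hypotheses of the form "<mention> is a <type>." against the sentence containing the mention, using Hugging Face's zero-shot NLI pipeline. The model, mention, and candidate types are placeholders; the system presented in the talk may be formulated differently.

```python
# Illustrative sketch: entity typing via zero-shot NLI (not the specific system in the talk).
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

sentence = "Barack Obama visited Montreal last week."
mention = "Barack Obama"
candidate_types = ["politician", "athlete", "company", "city", "musician"]  # toy fine-grained types

# Premise = the sentence containing the mention; hypothesis = "<mention> is a <type>."
result = classifier(
    sentence,
    candidate_labels=candidate_types,
    hypothesis_template=f"{mention} is a {{}}.",
    multi_label=True,  # a mention may carry several fine-grained types
)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")
```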
The LAMA-WeST Seminar Series - A Copy Mechanism for Handling Knowledge Base Elements in SPARQL Neural Machine Translation
Neural Machine Translation (NMT) models from English to SPARQL are a promising development for SPARQL query generation. However, current architectures are unable to integrate the knowledge base (KB) schema and handle questions about knowledge resources, classes, and properties unseen during training, rendering them unusable outside the scope of topics covered in the training set. Inspired by the performance gains in natural language processing tasks, we propose to integrate a copy mechanism for neural SPARQL query generation as a way to tackle this issue. We illustrate our proposal by adding a copy layer and a dynamic knowledge base vocabulary to two Seq2Seq architectures (CNNs and Transformers). This layer lets the models copy KB elements directly from the questions instead of generating them. We evaluate our approach on state-of-the-art datasets, including datasets referencing unknown KB elements, and measure the accuracy of the copy-augmented architectures. Our results show a considerable increase in performance on all datasets compared to non-copy architectures.
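The sketch below shows a generic pointer/copy layer in PyTorch that mixes a generation distribution with a copy distribution over source tokens, to illustrate the idea of copying KB elements from the question. Shapes, names, and the gating scheme are illustrative assumptions; the actual copy layer and dynamic KB vocabulary handling in the paper may differ.

```python
# Generic pointer/copy layer sketch (illustrative, not the paper's exact architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CopyLayer(nn.Module):
    def __init__(self, hidden_size, vocab_size):
        super().__init__()
        self.gen_proj = nn.Linear(hidden_size, vocab_size)  # generation distribution
        self.copy_gate = nn.Linear(hidden_size, 1)          # probability of copying

    def forward(self, decoder_state, attn_weights, src_token_ids, vocab_size_ext):
        # decoder_state: (batch, hidden); attn_weights: (batch, src_len)
        # src_token_ids: (batch, src_len), ids in the extended (dynamic KB) vocabulary
        p_copy = torch.sigmoid(self.copy_gate(decoder_state))          # (batch, 1)
        p_gen_vocab = F.softmax(self.gen_proj(decoder_state), dim=-1)  # (batch, vocab)

        # Pad the generation distribution up to the extended vocabulary size.
        batch = decoder_state.size(0)
        extra = vocab_size_ext - p_gen_vocab.size(1)
        pad = torch.zeros(batch, extra, device=decoder_state.device)
        dist = torch.cat([p_gen_vocab, pad], dim=-1) * (1 - p_copy)

        # Scatter attention mass onto the source token ids (copy distribution).
        dist = dist.scatter_add(1, src_token_ids, attn_weights * p_copy)
        return dist  # (batch, vocab_size_ext): mixture of generating and copying
```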
The LAMA-WeST Seminar Series - Structural Embeddings with BERT Matcher: A Representation Learning Approach for Schema Matching
The schema matching task consists of finding different types of relations between two ontologies. Algorithms that find these relations often need a combination of semantic, structural, and lexical inputs coming from the ontologies. Structural Embeddings with BERT Matcher (SEBMatcher) is a system that leverages all of these inputs, with random walks as its foundation. It employs a two-step approach: unsupervised pretraining of a masked language modeling BERT over random walks, followed by supervised training of a BERT classifier over positive and negative mappings. During its participation in the Ontology Alignment Evaluation Initiative (OAEI), SEBMatcher obtained promising results in the tracks it took part in.
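The following toy sketch shows the random-walk step that underlies this kind of matcher: walks over an ontology graph become token sequences that could feed BERT MLM pretraining, before concept-pair sequences are scored by a classifier. The graph structure and walk format here are illustrative assumptions, not SEBMatcher's actual serialization.

```python
# Toy random-walk generator over an ontology graph (illustrative sketch only).
import random

def random_walks(graph, walks_per_node=5, walk_length=4, seed=0):
    """graph: {concept_label: [neighbour concept labels]} derived from an ontology."""
    rng = random.Random(seed)
    walks = []
    for start in graph:
        for _ in range(walks_per_node):
            walk = [start]
            for _ in range(walk_length - 1):
                neighbours = graph.get(walk[-1], [])
                if not neighbours:
                    break
                walk.append(rng.choice(neighbours))
            walks.append(" ".join(walk))  # one "sentence" per walk for MLM pretraining
    return walks

toy_ontology = {
    "Conference": ["Event", "Track"],
    "Event": ["Thing"],
    "Track": ["Paper"],
    "Paper": ["Document"],
}
print(random_walks(toy_ontology, walks_per_node=2))
```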
The LAMA-WeST Seminar Series - Digital Twinning to predict radiotherapy replanning for head and neck cancer patients
Head and neck cancer patients undergoing radiotherapy often experience weight loss over the course of treatment due to the effects of radiation. This weight loss can result in significant anatomical changes that require the patient’s treatment to be replanned to ensure that an acceptable dose of radiation is delivered to the tumour and the nearby radiosensitive organs. Unfortunately, the decision to replan a patient is typically made on short notice to the planning team, which can significantly disrupt the workflow and consequently affect the timeline of other patient treatments. Our goal is therefore to pre-emptively determine if and when a patient will need replanning by predicting how a patient’s anatomy will change over the course of treatment. The proposed project will be carried out in three main steps. First, a variational autoencoder will be trained on patients’ cone-beam CT (CBCT) scans taken throughout treatment to learn latent space representations of the data. Next, the trajectory of each patient’s change in CBCT scans will be mapped in latent space such that a new patient’s trajectory can be predicted based on past patient trends that neighbour them in latent space. Finally, we aim to incorporate a digital twin framework whereby patient trajectories will be dynamically updated based on new data collected over the course of treatment.
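A minimal sketch of the latent-trajectory idea is given below: given the latent codes of a new patient's early CBCTs (assumed to come from an already trained VAE encoder), the remaining trajectory is predicted from the most similar past patients. The nearest-neighbour averaging, distance metric, and variable names are illustrative assumptions, not the project's final method.

```python
# Sketch of nearest-neighbour latent trajectory prediction (illustrative assumptions).
import numpy as np

def predict_trajectory(past_trajectories, new_prefix, k=3):
    """past_trajectories: list of (n_fractions, latent_dim) arrays, one per past patient.
    new_prefix: (n_observed, latent_dim) latent codes of the new patient so far."""
    n_obs = new_prefix.shape[0]
    # Distance between the new patient's observed prefix and each past trajectory's prefix.
    dists = [np.linalg.norm(traj[:n_obs] - new_prefix) for traj in past_trajectories]
    nearest = np.argsort(dists)[:k]
    # Predict the remaining trajectory as the mean of the k nearest neighbours' futures.
    futures = [past_trajectories[i][n_obs:] for i in nearest]
    min_len = min(f.shape[0] for f in futures)
    return np.mean([f[:min_len] for f in futures], axis=0)  # (remaining_fractions, latent_dim)
```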
The LAMA-WeST Seminar Series - Language Understanding and the Ubiquity of Local Structure
Recent research has shown that neural language models are surprisingly insensitive to text perturbations such as shuffling the order of words. If word order is unnecessary for natural language understanding on many tasks, what is? We empirically demonstrate that neural language models consistently rely on local structure to build understanding, while global structure is often unused. These results hold for over 400 different languages. We use this property of neural language models to automatically detect which of those 400 languages are not well understood by the current crop of pretrained cross-lingual models, thus providing visibility into where our efforts as a research community should go.
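To illustrate the local-versus-global distinction, the sketch below contrasts two simple perturbations: a full word shuffle destroys both local and global structure, while shuffling intact n-grams preserves local structure but breaks global order. These perturbation functions are a simplified illustration, not the paper's exact experimental setup.

```python
# Illustrative perturbations for probing local vs. global structure (simplified sketch).
import random

def global_shuffle(tokens, seed=0):
    """Shuffle all words: destroys both local and global structure."""
    rng = random.Random(seed)
    out = tokens[:]
    rng.shuffle(out)
    return out

def shuffle_ngrams(tokens, n=3, seed=0):
    """Shuffle the order of n-grams while keeping each n-gram intact:
    local structure is preserved, global order is destroyed."""
    rng = random.Random(seed)
    chunks = [tokens[i:i + n] for i in range(0, len(tokens), n)]
    rng.shuffle(chunks)
    return [tok for chunk in chunks for tok in chunk]

sentence = "the quick brown fox jumps over the lazy dog".split()
print(" ".join(global_shuffle(sentence)))
print(" ".join(shuffle_ngrams(sentence, n=3)))
```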