LAMA-WeST Seminars
Next seminars
No next seminars
Past seminars
The LAMA-WeST Seminar Series - GeoCoder: Solving Geometry Problems with Multimodal Large Language Models
This seminar is open only to LAMA-WeST members. Large language models (LLMs) have demonstrated exceptional prowess in mathematical reasoning tasks that rely solely on textual input. However, many mathematical problems incorporate both textual and visual data. With the increasing prominence of multimodal large language models (MLLMs), recent works have used these models on geometric mathematical problems. In this seminar, we will discuss our current work on this task.
The LAMA-WeST Seminar Series - Ontology Embeddings with Pretrained Language Models and Schema Information for Ontology-related Tasks
Master's thesis defence. This thesis explores novel ways of mapping ontologies to a latent space through robust, scalable, and generalized representation learning. Following the recent advances in language models, we focus on how ontology embeddings can leverage them to build more accurate vector representations. Our work is divided into three papers. In our first paper, we describe SEBMatcher, an ontology alignment system that relies on two BERT networks and a context-enhanced input to produce alignments. In our second paper, we present SORBET, an ontology embedding model inspired by SEBMatcher, that leverages a distance-based regression loss and a pre-trained SentenceBERT to produce high-quality ontology embeddings. Finally, we describe SORBETMatcher, a schema matching and subsumption prediction system whose primary objective is to showcase the potential of SORBET embeddings in ontology-related tasks. The obtained results show a clear improvement over the state of the art. In ontology embedding, SORBET outperformed other models on the evaluated tasks and datasets by a large margin. In ontology alignment and subsumption prediction, SORBETMatcher achieved top performance while remaining unmatched in robustness, scalability, and generalization. Our work’s contribution also includes an open-source, flexible, and modular framework for ontology embedding, ontology alignment, and subsumption prediction.
The LAMA-WeST Seminar Series - Training Neural Networks to Perform Structured Prediction Tasks
Master's thesis defence. Despite their numerous successes on various challenging tasks, deep neural networks still struggle to learn combinatorial structure, where multiple discrete outputs have interconnected relationships governed by constraints, especially when there is not enough data for the model to learn the output structure. Constraint programming, a type of non-learning algorithm, focuses on structure. It has a long and successful history of recognizing combinatorial structures that frequently recur, and of developing advanced algorithms to extract information from these structures. In particular, we are interested in the relative frequency of a given variable-value assignment in that combinatorial structure. The constraint programming with belief propagation framework generalizes this model by propagating these relative frequencies through a constraint programming model to approximate the marginal probability mass function of each variable. These estimated marginal probabilities are used as penalties within the loss function, improving the neural network’s learning and its sample efficiency. In this thesis, we propose to train a neural network to generate output that aligns with a combinatorial structure expressed as a constraint programming model. This is achieved by calculating a loss function that includes marginals determined by a constraint programming with belief propagation solver. We argue that this model offers a more natural integration of constraint programming and neural networks. We offer empirical evidence that training the model using this approach significantly enhances its performance, especially when only a limited amount of data is available. Our results on the Partial Latin Square problem indicate consistent improvement in the accuracy of the model over existing methods.
The LAMA-WeST Seminar Series - Clinical Note Summarization using Large Language Models
The manual process of summarizing electronic health records (EHR) is extremely time-consuming for physicians, leading to burnout and potential errors. Large language models (LLMs) offer a promising avenue to automate this laborious task. This study explores the use of LLMs for summarizing clinical notes from datasets like MIMIC-III and MIMIC-CXR. We review prior work on summarization approaches along with evaluation metrics tailored to the medical domain. An empirical study is conducted using LLMs on the task of radiology report summarization. Results demonstrate reasonable performance, but highlight the challenge of hallucinations where models generate inconsistencies or non-factual statements. We investigate techniques to mitigate hallucinations across the model design, training, generation, and evaluation stages. Additionally, we propose a novel information extraction approach automatically generating structured clinical note summaries and showing significant gains in extracting information over normal generation. Despite these promising results, open challenges remain around adapting summaries for different medical specialties and developing robust evaluation metrics for clinical summarization.
The LAMA-WeST Seminar Series - Tag-Debias: Entity and Concept Typing for Social Bias Mitigation in PLMs
Pre-trained language models exhibit noticeable stereotypical biases in various downstream tasks. Consequently, it is imperative to explore methods aimed at addressing or mitigating social biases in these models. In this research, we propose novel gender tagging strategies to achieve a higher level of abstraction for sensitive attributes in the corpus. Subsequently, we fine-tune BERT-family models on this tagged corpus. Our method shows an improvement in fairness when compared to both the initial and the scrubbed model. Finally, we applied our proposed tagged model to the ranking of candidates' CVs, revealing a 15% improvement in ranking fairness compared to the initial model and 10% compared to state-of-the-art models.
The LAMA-WeST Seminar Series - SORBET: a Siamese Network for Ontology Embeddings using a Distance-based Regression Loss and BERT
Ontology embedding methods have been popular in recent years, especially when it comes to representation learning algorithms for solving ontology-related tasks. Despite the impact of large language models on knowledge graph-related tasks, there has been less focus on adapting these models to construct ontology embeddings that are both semantically relevant and faithful to the ontological structure. In this paper, we present a novel ontology embedding method that encodes ontology classes into a pre-trained SBERT through random walks and then fine-tunes the embeddings using a distance-based regression loss. We benchmark our algorithm on four different datasets across two tasks and show the impact of transfer learning and our distance-based loss on the quality of the embeddings. Our results show that SORBET outperforms state-of-the-art ontology embedding techniques for the performed tasks.
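As a rough illustration of what a distance-based regression loss can look like, the sketch below regresses the cosine similarity of two class embeddings toward a target that shrinks as the classes get further apart in the ontology graph. This is not SORBET's actual loss: the target function `1 / (1 + hops)`, the function names, and the toy vectors are all assumptions made for the example.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def target_similarity(hops):
    """Assumed target: 1.0 for the same class, decaying with graph distance."""
    return 1.0 / (1.0 + hops)

def distance_regression_loss(pairs):
    """Mean squared error between embedding similarity and the distance target.

    `pairs` is a list of (embedding_a, embedding_b, hops) tuples, where `hops`
    is the number of edges separating the two classes in the ontology.
    """
    errors = [(cosine(a, b) - target_similarity(h)) ** 2 for a, b, h in pairs]
    return sum(errors) / len(errors)

# Toy pairs: identical embeddings at distance 0, orthogonal ones at distance 1.
toy_pairs = [
    ((1.0, 0.0), (1.0, 0.0), 0),
    ((1.0, 0.0), (0.0, 1.0), 1),
]
print(distance_regression_loss(toy_pairs))
```

In a real training loop the embeddings would come from the fine-tuned encoder and the loss would be minimized by gradient descent; the sketch only shows how the regression target ties similarity to graph distance.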
The LAMA-WeST Seminar Series - TwiRGCN: Temporally Weighted Graph Convolution for Question Answering over Temporal Knowledge Graphs
Recent years have witnessed interest in Temporal Question Answering over Knowledge Graphs (TKGQA), resulting in the development of multiple methods. However, these are highly engineered, thereby limiting their generalizability, and they do not automatically discover relevant parts of the KG during multi-hop reasoning. Relational graph convolutional networks (RGCN) provide an opportunity to address both these challenges; we explore this direction in the talk. Specifically, we propose a novel, intuitive and interpretable scheme to modulate the messages passed through a KG edge during convolution based on the relevance of its associated period to the question. We also introduce a gating device to predict if the answer to a complex temporal question is likely to be a KG entity or time and use this prediction to guide our scoring mechanism. We evaluate the resulting system, which we call TwiRGCN, on a recent challenging dataset for multi-hop complex temporal QA called TimeQuestions. We show that TwiRGCN significantly outperforms state-of-the-art models on this dataset across diverse question types. Interestingly, TwiRGCN improves accuracy by 9–10 percentage points for the most difficult ordinal and implicit question types.
The LAMA-WeST Seminar Series - Natural Language to SPARQL Query Generation: A Comprehensive Evaluation of the Copy Mechanism and its Generalization Capabilities
In recent years, the field of neural machine translation (NMT) for SPARQL query generation has witnessed significant growth. Recently, the addition of the copy mechanism to traditional encoder-decoder architectures and the use of pre-trained models have set new performance benchmarks. These state-of-the-art models have reached almost perfect query generation for simple datasets. However, such progress raises the question of the ability of these models to generalize and deal with unseen questions and entities. This work presents a large variety of experiments that replicate and expand upon recent NMT-based SPARQL generation studies, comparing pre-trained and non-pre-trained models, question annotation formats, and the use of a copy mechanism for non-pre-trained and pre-trained models. This work then evaluates the ability of several models to handle unknown question-query pairs and out-of-vocabulary URIs.
The LAMA-WeST Seminar Series - Unsupervised learning to cluster events labeled as “other” in the NSIR-RT incident learning database
Purpose: In radiotherapy, clinical staff are encouraged to report incidents that may occur in an incident learning system (ILS). An investigator is then assigned to follow up on the incident and label it according to radiation oncology-specific taxonomies. However, as new incidents occur, existing labels may become insufficient to correctly categorize all of them, or investigators may be unsure which label is most appropriate and choose the catch-all “other” label. As a result, many incidents get labeled as “other” in the ILS, limiting the opportunity for learning and quality improvement they would otherwise provide. In this project, we aimed to automatically relabel some of these “other” incidents using already existing labels based on closer inspection of their narrative texts using NLP and unsupervised ML techniques. Method: Over 6,000 incident reports were gathered from the Canadian National System for Incident Reporting-Radiation Treatment (NSIR-RT) as well as our local ILS, which uses the NSIR-RT taxonomy. Incident descriptions from these reports were processed using various NLP techniques to obtain their vectorized representations. The processed incidents carrying expert-generated labels, excluding the 2,618 incidents labeled “other”, were clustered using the k-means clustering algorithm based on their incident description data. Each cluster was automatically assigned a label based on the frequency of expert-generated labels within the cluster. Incidents labeled as “other” were then introduced to the latent space to check if they fell within the range of an existing cluster. If they did, the corresponding cluster label was used to relabel the “other” incident. If they did not, we classified them as new incident types that will require a new label. Results: Out of 2,618 incidents labeled as “other”, our pipeline relabeled 1,928 incidents with existing labels, and 690 were labeled as new incident types.
Future work will attempt to validate the relabeling of the “other” incidents and will cluster the events labeled as new incident types to identify groupings of events that may give rise to new labels.
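The pipeline described in this abstract can be sketched in a few lines. The version below is only an illustration under stated assumptions: incident descriptions are taken to be already vectorized, k-means is a plain Lloyd's loop with caller-supplied initial centroids, and "within the range of a cluster" is interpreted as lying no further from the centroid than the cluster's most distant member. The abstract does not specify that criterion, and all names and toy data here are hypothetical.

```python
from collections import Counter
from math import dist

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))

def kmeans(points, centroids, iters=50):
    """Plain Lloyd's algorithm starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iters):
        assign = [nearest(p, centroids) for p in points]
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assign) if a == k]
            updated.append(
                tuple(sum(x) / len(members) for x in zip(*members)) if members else old
            )
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(p, centroids) for p in points]

def fit_clusters(points, labels, init_centroids):
    """Cluster labeled incidents; give each cluster its most frequent label."""
    centroids, assign = kmeans(points, init_centroids)
    clusters = []
    for k, c in enumerate(centroids):
        idx = [i for i, a in enumerate(assign) if a == k]
        if not idx:
            continue
        majority = Counter(labels[i] for i in idx).most_common(1)[0][0]
        radius = max(dist(points[i], c) for i in idx)  # assumed "range" of the cluster
        clusters.append((c, radius, majority))
    return clusters

def relabel(vector, clusters):
    """Reuse the nearest cluster's label if the vector falls within its radius."""
    c, radius, label = min(clusters, key=lambda cl: dist(vector, cl[0]))
    return label if dist(vector, c) <= radius else "new incident type"

# Toy 2-D "incident vectors" with expert labels (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = ["A", "A", "B", "C", "C", "C"]
clusters = fit_clusters(points, labels, init_centroids=[(0, 0), (10, 10)])
print(relabel((0.5, 0.5), clusters))
print(relabel((5, 5), clusters))
```

A point near an existing cluster inherits that cluster's majority label, while a point far from every cluster is flagged as a candidate new incident type.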
The LAMA-WeST Seminar Series - Applying Deep Learning for Avionic Security: An Experiment Design Study
In this presentation, we will explore the use of deep learning in the development of an intrusion detection system for avionics. The focus will be on the experiment design and strategies used to create a state-of-the-art system that can detect security threats in real time. This presentation will also provide the opportunity for an open discussion to validate the methodology and to share ideas and insights on how to further improve the approach.
The LAMA-WeST Seminar Series - Entity Typing with Natural Language Inference for Fine-grained Named Entity Recognition
Fine-grained Named Entity Recognition (FgNER) consists of detecting named entity mentions (mention detection) and typing them (entity typing, ET) with types from a relatively large set. Typing mentions becomes harder as the number of types increases. Also, traditional NER architectures rely on a fixed-size classifier, which cannot be adapted to a bigger type set. Recent prompt-based ET systems achieve good performance in few-shot learning over large type sets without any fixed classifier. Integrating these ET systems with various techniques for FgNER might improve the performance achieved on FgNER.
The LAMA-WeST Seminar Series - A Copy Mechanism for Handling Knowledge Base Elements in SPARQL Neural Machine Translation
Neural Machine Translation (NMT) models from English to SPARQL are a promising development for SPARQL query generation. However, current architectures are unable to integrate the knowledge base (KB) schema and handle questions on knowledge resources, classes, and properties unseen during training, rendering them unusable outside the scope of topics covered in the training set. Inspired by the performance gains in natural language processing tasks, we propose to integrate a copy mechanism for neural SPARQL query generation as a way to tackle this issue. We illustrate our proposal by adding a copy layer and a dynamic knowledge base vocabulary to two Seq2Seq architectures (CNNs and Transformers). This layer makes the models copy KB elements directly from the questions, instead of generating them. We evaluate our approach on state-of-the-art datasets, including datasets referencing unknown KB elements, and measure the accuracy of the copy-augmented architectures. Our results show a considerable increase in performance on all datasets compared to non-copy architectures.
The LAMA-WeST Seminar Series - Structural Embeddings with BERT Matcher: A Representation Learning Approach for Schema Matching
The schema matching task consists of finding different types of relations between two ontologies. Algorithms finding these relations often need a combination of semantic, structural, and lexical inputs coming from the ontologies. Structural Embeddings with BERT Matcher (SEBMatcher) is a system that leverages all of these inputs by having random walks as its foundation. It employs a two-step approach: an unsupervised pretraining of a Masked Language Modeling BERT on random walks, followed by a supervised training of a BERT classifier on positive and negative mappings. During its participation in the Ontology Alignment Evaluation Initiative (OAEI), SEBMatcher obtained promising results in the tracks it participated in.
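To make the idea of random walks over an ontology concrete, here is a minimal sketch that samples a walk as an alternating sequence of classes and relations, which could then be serialized as text for a language model. It is not SEBMatcher's sampling procedure: the toy ontology, the relation names, and the uniform choice of the next edge are assumptions for illustration.

```python
import random

# Hypothetical toy ontology: class -> list of (relation, neighbouring class).
ONTOLOGY = {
    "PhDStudent": [("subClassOf", "Student"), ("enrolledIn", "Program")],
    "Student": [("subClassOf", "Person")],
    "Person": [],
    "Program": [("offeredBy", "University")],
}

def random_walk(graph, start, max_hops, rng):
    """Sample a walk [class, relation, class, ...] of at most max_hops edges."""
    walk = [start]
    node = start
    for _ in range(max_hops):
        edges = graph.get(node, [])
        if not edges:  # dead end: no outgoing relation to follow
            break
        relation, node = rng.choice(edges)  # uniform choice is an assumption
        walk += [relation, node]
    return walk

rng = random.Random(0)  # seeded so that runs are reproducible
print(" ".join(random_walk(ONTOLOGY, "PhDStudent", max_hops=3, rng=rng)))
```

Each walk mixes lexical information (class and relation names) with structural information (which classes are connected), which is the kind of combined input the abstract refers to.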
The LAMA-WeST Seminar Series - Digital Twinning to predict radiotherapy replanning for head and neck cancer patients
Head and neck cancer patients undergoing radiotherapy often experience weight loss over the course of treatment due to the effects of radiation. This weight loss can result in significant anatomical changes that require the patient’s treatment to be replanned to ensure that an acceptable dose of radiation is being delivered to the tumour and the nearby radiosensitive organs. Unfortunately, the decision to replan a patient is typically made at short notice to the planning team, which can significantly disrupt the workflow and consequently affect the timeline of other patients' treatments. Our goal is therefore to pre-emptively determine if and when a patient will need replanning by predicting how a patient’s anatomy will change over the course of treatment. The proposed project will be carried out in three main steps. First, a variational autoencoder will be trained on patients’ cone-beam CT (CBCT) scans that are taken throughout treatment to learn latent space representations of the data. Next, the trajectory of each patient’s change in CBCT scans will be mapped in latent space such that a new patient’s trajectory can be predicted based on the trends of past patients who neighbour them in latent space. Finally, we aim to incorporate a digital twin framework whereby patient trajectories will be dynamically updated based on new data collected over the course of treatment.
The LAMA-WeST Seminar Series - Language Understanding and the Ubiquity of Local Structure
Recent research has shown that neural language models are surprisingly insensitive to text perturbation, such as shuffling the order of words. If the order of words is unnecessary to perform natural language understanding on many tasks, what is? We empirically demonstrate that local structure is always relied upon by neural language models to build understanding, and global structure is often unused. These results hold for over 400 different languages. We use this property of neural language models to automatically detect which of those languages are not yet well understood by the current crop of pretrained cross-lingual models, thus providing visibility into where our efforts should go as a research community.