Publications

Here is a list of my current publications. Be sure to check out my Google Scholar page for a (probably) more up-to-date version.

Accelerating Hierarchical Associative Memory: A Deep Equilibrium Approach (2023)
NeurIPS 2023 - AMHN workshop  [ArXiv]  [OpenReview]
Cédric Goemaere, Johannes Deleu, Thomas Demeester

We show that (Hierarchical) Associative Memory can be cast as a Deep Equilibrium Model. Moreover, we identify and resolve a redundancy in the synchronous updates of HAMs, and show that our solution boils down to parallelizing asynchronous updates.
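
For readers unfamiliar with Deep Equilibrium Models: a DEQ defines its output implicitly as the fixed point of an update function, rather than as the result of a fixed stack of layers. The sketch below only illustrates that fixed-point view with naive iteration on a toy contractive map; it is not the solver or energy function used in the paper.

```python
import numpy as np

def solve_fixed_point(update_fn, x, z_init, tol=1e-6, max_iter=1000):
    """Naive fixed-point iteration: find z such that z = update_fn(z, x).

    Illustrative only; DEQs in practice often use accelerated solvers
    (e.g. Anderson acceleration) and implicit differentiation.
    """
    z = z_init
    for _ in range(max_iter):
        z_next = update_fn(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

# Toy example: a contractive map, so the iteration is guaranteed to converge.
W = 0.5 * np.eye(3)
z_star = solve_fixed_point(lambda z, x: np.tanh(W @ z + x),
                           x=np.ones(3), z_init=np.zeros(3))
```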

Exploring the Temperature-Dependent Phase Transition in Modern Hopfield Networks (2023)
NeurIPS 2023 - AMHN workshop  [ArXiv]  [OpenReview]
Felix Koulischer, Cédric Goemaere, Tom Van Der Meersch, Johannes Deleu, Thomas Demeester

We investigate the role of the temperature parameter $\beta$ in Modern Hopfield Networks, and identify two behavioral regimes, with a phase transition determined by a critical temperature $\beta_c$.
This work stems from Felix’s Master’s thesis, which I supervised together with Thomas.
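
As background (this is the standard update rule from Ramsauer et al.'s "Hopfield Networks is All You Need", not a result of our paper), a Modern Hopfield Network retrieves a pattern by iterating

$$\xi^{\text{new}} = X \, \mathrm{softmax}\!\left(\beta \, X^{\top} \xi\right),$$

where the columns of $X$ are the stored patterns and $\xi$ is the network state; the paper studies how this retrieval behavior changes as $\beta$ crosses the critical value $\beta_c$.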

Efficient Keyword Generation using Pretrained Language Models (2022)
Master's thesis  [University Library Ghent]
Cédric Goemaere
BNAIC/BeNeLearn 2022 - Thesis abstract  [Proceedings]
Cédric Goemaere, Thomas Demeester, Tim Verbelen, Bart Dhoedt, Cedric De Boom

My Master’s thesis looks into the problem of keyword generation: given the start of a sentence, generate a fluent completion that contains a pre-specified target keyword.
In contrast to prior work on fine-tuning (parts of) a pretrained LM, I show that keyword generation can be achieved more efficiently by working directly on the logits of the unmodified LM (a minimal sketch of this idea follows the example below). I propose four simple, interpretable models: one based on a target-specific prior, one based on the FastText similarity between the target and (every token in) the vocabulary, and two that combine them.

Example: complete the sentence “As I was walking across the” such that it contains the keyword “analysis”. One correct solution would be “As I was walking across the lab, I noticed the computer had already finished its analysis of the substrate.”
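
To make the logit-biasing idea concrete, here is a toy sketch, not one of the four models from the thesis: the model name, similarity measure, and bias strength below are all placeholder assumptions. The only point it illustrates is that the LM's weights are never touched; only its output distribution is reshaped at decoding time.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Illustrative setup; the thesis's actual models use a target-specific prior
# and FastText similarity rather than this naive keyword-token boost.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def keyword_bias(keyword: str, strength: float = 4.0) -> torch.Tensor:
    """Toy bias vector over the vocabulary: boost the keyword's own tokens."""
    bias = torch.zeros(model.config.vocab_size)
    for tok_id in tokenizer.encode(" " + keyword):
        bias[tok_id] = strength
    return bias

@torch.no_grad()
def generate_with_keyword(prompt: str, keyword: str, max_new_tokens: int = 30) -> str:
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    bias = keyword_bias(keyword)
    for _ in range(max_new_tokens):
        logits = model(input_ids).logits[0, -1]   # frozen LM's next-token logits
        next_id = torch.argmax(logits + bias)     # bias the logits, decode greedily
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)
    return tokenizer.decode(input_ids[0])

print(generate_with_keyword("As I was walking across the", "analysis"))
```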

Fun fact: the second model contains only 65 trainable parameters and demonstrates zero-shot generalization.

Afterthought: the simple training procedure I used is mathematically equivalent to using DPO where the reward $r$ is zero if the sentence $s$ contains the keyword $k$, and $-\infty$ otherwise (i.e., $r=\log(\mathbb{1}(k \in s))$).
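
For reference, the standard DPO objective (Rafailov et al., 2023; quoted here as background, not taken from the thesis) is

$$\mathcal{L}_{\text{DPO}}(\theta) = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}\right)\right],$$

so a reward of $-\infty$ for sentences missing the keyword means that any completion containing the keyword is preferred over any completion without it.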