Two Papers at ICLR

jkh6 — Sat, 16 Mar 2024 19:27:56 +0000

Two DSP group papers have been accepted by the International Conference on Learning Representations (ICLR) 2024 in Vienna, Austria:

"Self-Consuming Generative Models Go MAD" by S. Alemohammad, J. Casco-Rodriguez, L. Luzi, A. I. Humayun, H. Babaei, D. LeJeune, A. Siahkoohi, and R. G. Baraniuk
"Implicit Neural Representations and the Algebra of Complex Wavelets" by M. Roddenberry, V. Saragadam, M. de Hoop, and R. G. Baraniuk

Self-Consuming Generative Models Go MAD

jkh6 — Wed, 12 Jul 2023 14:51:57 +0000

Self-Consuming Generative Models Go MAD
http://arxiv.org/abs/2307.01850

To Appear at ICLR 2024

Sina Alemohammad, Josue Casco-Rodriguez, Lorenzo Luzi, Ahmed Imtiaz Humayun,
Hossein Babaei, Daniel LeJeune, Ali Siahkoohi, Richard G. Baraniuk

Abstract: Seismic advances in generative AI algorithms for imagery, text, and other data types have led to the temptation to use synthetic data to train next-generation models. Repeating this process creates an autophagous ("self-consuming") loop whose properties are poorly understood. We conduct a thorough analytical and empirical analysis using state-of-the-art generative image models of three families of autophagous loops that differ in how fixed or fresh real training data is available through the generations of training and in whether the samples from previous-generation models have been biased to trade off data quality versus diversity. Our primary conclusion across all scenarios is that without enough fresh real data in each generation of an autophagous loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease. We term this condition Model Autophagy Disorder (MAD), making analogy to mad cow disease.
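
As a rough illustration of the autophagous loop described in the abstract, the toy sketch below refits a one-dimensional Gaussian to its own samples, generation after generation. It is not the paper's experimental setup, which uses state-of-the-art generative image models. The function name `autophagous_loop`, the sample sizes, and the choice of N(0, 1) as the "real" distribution are all assumptions made for illustration.

```python
import numpy as np

def autophagous_loop(n_generations, n_synthetic, n_fresh=0, seed=0):
    """Toy loop (illustrative assumption, not the paper's method).

    Each generation fits a 1-D Gaussian to samples drawn from the previous
    generation's fit, optionally mixed with fresh samples from the real
    distribution N(0, 1). Returns the fitted std for every generation.
    """
    rng = np.random.default_rng(seed)
    mu, sigma = 0.0, 1.0  # generation 0 matches the real distribution
    stds = [sigma]
    for _ in range(n_generations):
        synthetic = rng.normal(mu, sigma, size=n_synthetic)  # model's own output
        fresh = rng.normal(0.0, 1.0, size=n_fresh)           # new real data
        data = np.concatenate([synthetic, fresh])
        mu, sigma = data.mean(), data.std(ddof=1)            # refit the "model"
        stds.append(sigma)
    return np.array(stds)

synthetic_only = autophagous_loop(1000, n_synthetic=10)
with_fresh = autophagous_loop(1000, n_synthetic=10, n_fresh=10)
print(f"fully synthetic loop, final std: {synthetic_only[-1]:.3e}")
print(f"with fresh real data, final std: {with_fresh[-1]:.3e}")
```

In this toy setting the fitted standard deviation, a crude stand-in for diversity, tends to shrink toward zero when no fresh data is added and stays on the order of one when fresh real samples are mixed in at every generation.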

In the news:

"Generative AI Goes 'MAD' When Trained on AI-Created Data Over Five Times," Tom's Hardware, 12 July 2023
"AI Loses Its Mind After Being Trained on AI-Generated Data," Futurism, 12 July 2023
"Why AI Generated Data Can Cause a Generative AI Nightmare," 12 July 2023
"Scientists make AI go crazy by feeding it AI-generated content," TweakTown, 13 July 2023
"AI models trained on AI-generated data experience Model Autophagy Disorder (MAD) after approximately five training cycles," Multiplatform.AI, 13 July 2023
"AIs trained on AI-generated images produce glitches and blurs," New Scientist, 18 July 2023
"Training AI With Outputs of Generative AI Is Mad," CDOtrends, 19 July 2023
"When AI Is Trained on AI-Generated Data, Strange Things Start to Happen," Futurism, 1 August 2023
"AI's 'mad cow disease' problem tramples into earnings season," Yahoo Finance, 12 April 2024

Free textbooks and other open educational resources gain popularity

jkh6 — Tue, 11 Jul 2023 14:54:38 +0000

"Free textbooks and other open educational resources gain popularity," Physics Today 76 (7), 18–21 (2023)

"The prices of college textbooks have skyrocketed: From 2011 to 2018, they went up by 40.6% in the US, according to the Bureau of Labor Statistics’ Consumer Price Index. That can add up to as much as $1000 for a single semester. So it’s no surprise that freely available, openly licensed textbooks, lectures, simulations, problem sets, and more—known collectively as open educational resources (OERs)—are having a moment."

Plenary Talk at IEEE ICASSP 2023

jkh6 — Fri, 16 Jun 2023 22:21:37 +0000

Richard Baraniuk presented the plenary talk "The Local Geometry of Deep Learning" at the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) in Rhodes, Greece, in June 2023.

30 Students in 30 Years

jkh6 — Thu, 13 Apr 2023 20:39:28 +0000

Two Papers at CVPR

jkh6 — Tue, 21 Mar 2023 21:47:52 +0000

Two DSP group papers have been accepted by the IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR) 2023 in Vancouver, Canada:

"SplineCam: Exact Visualization of Deep Neural Network Geometry and Decision Boundaries" by Ahmed Imtiaz Humayun, Randall Balestriero, Guha Balakrishnan, and Richard Baraniuk (Highlight paper, 2.5% of all submissions)
"WIRE: Wavelet Implicit Neural Representations," by Vishwa Saragadam, Daniel LeJeune, Jasper Tan, Guha Balakrishnan, Ashok Veeraraghavan, and Richard Baraniuk

AMS Josiah Willard Gibbs Lecture

jkh6 — Fri, 27 Jan 2023 15:05:31 +0000

Richard Baraniuk presented the 2023 AMS Josiah Willard Gibbs Lecture – entitled "The Mathematics of Deep Learning" – at the Joint Mathematics Meetings in Boston, Massachusetts, in January 2023. The first AMS Josiah Willard Gibbs Lecture was given in 1923. This public lecture is one of the signature events in the Society’s calendar. Previous speakers have included Albert Einstein, Vannevar Bush, John von Neumann, Norbert Wiener, Kurt Gödel, Hermann Weyl, Eugene Wigner, Donald Knuth, Herb Simon, David Mumford, Ingrid Daubechies, and Claude Shannon.

Two “Notable” Papers at ICLR 2023

jkh6 — Sun, 22 Jan 2023 01:26:58 +0000

Two DSP group papers have been accepted as "Notable - Top 25%" papers for the International Conference on Learning Representations (ICLR) 2023 in Kigali, Rwanda:

"A Primal-Dual Framework for Transformers and Neural Networks," by T. M. Nguyen, T. Nguyen, N. Ho, A. L. Bertozzi, R. G. Baraniuk, and S. Osher
"Retrieval-based Controllable Molecule Generation," by Jack Wang, W. Nie, Z. Qiao, C. Xiao, R. G. Baraniuk, and A. Anandkumar

Abstracts below.

Retrieval-based Controllable Molecule Generation

Generating new molecules with specified chemical and biological properties via generative models has emerged as a promising direction for drug discovery. However, existing methods require extensive training/fine-tuning with a large dataset, often unavailable in real-world generation tasks. In this work, we propose a new retrieval-based framework for controllable molecule generation. We use a small set of exemplar molecules, i.e., those that (partially) satisfy the design criteria, to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria. We design a retrieval mechanism that retrieves and fuses the exemplar molecules with the input molecule, which is trained by a new self-supervised objective that predicts the nearest neighbor of the input molecule. We also propose an iterative refinement process to dynamically update the generated molecules and retrieval database for better generalization. Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning. On various tasks ranging from simple design criteria to a challenging real-world scenario for designing lead compounds that bind to the SARS-CoV-2 main protease, we demonstrate our approach extrapolates well beyond the retrieval database, and achieves better performance and wider applicability than previous methods.

A Primal-Dual Framework for Transformers and Neural Networks

Self-attention is key to the remarkable success of transformers in sequence modeling tasks, including many applications in natural language processing and computer vision. Like neural network layers, these attention mechanisms are often developed by heuristics and experience. To provide a principled framework for constructing attention layers in transformers, we show that the self-attention corresponds to the support vector expansion derived from a support vector regression problem, whose primal formulation has the form of a neural network layer. Using our framework, we derive popular attention layers used in practice and propose two new attentions: 1) the Batch Normalized Attention (Attention-BN) derived from the batch normalization layer and 2) the Attention with Scaled Head (Attention-SH) derived from using less training data to fit the SVR model. We empirically demonstrate the advantages of the Attention-BN and Attention-SH in reducing head redundancy, increasing the model’s accuracy, and improving the model’s efficiency in a variety of practical applications including image and time-series classification.
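
To make the phrase "expansion" in the abstract a little more concrete, the sketch below checks numerically that standard scaled dot-product softmax attention can be written as a normalized kernel-weighted sum over the value vectors, with an exponential kernel between the query and each key. This only illustrates the general form of such an expansion. It does not reproduce the paper's support vector regression derivation, and the function names `softmax_attention` and `kernel_expansion` are hypothetical.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Standard scaled dot-product attention in matrix form."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def kernel_expansion(Q, K, V):
    """Same output written per query as sum_j c_j(q) * v_j, where
    c_j(q) = k(q, k_j) / sum_l k(q, k_l) and k(q, k_j) = exp(q . k_j / sqrt(d))."""
    d = Q.shape[-1]
    out = np.zeros((Q.shape[0], V.shape[1]))
    for i, q in enumerate(Q):
        kernel_vals = np.exp(K @ q / np.sqrt(d))  # one kernel value per key
        coeffs = kernel_vals / kernel_vals.sum()  # normalized coefficients
        out[i] = coeffs @ V                       # weighted sum of values
    return out

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))  # 3 queries of dimension 4
K = rng.normal(size=(5, 4))  # 5 keys
V = rng.normal(size=(5, 2))  # 5 values of dimension 2
print(np.allclose(softmax_attention(Q, K, V), kernel_expansion(Q, K, V)))
```

The two functions should agree to floating-point precision, since the second is just the first written out one query at a time.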

Machine Learning Privacy Work to Appear at AISTATS 2023

jkh6 — Sat, 21 Jan 2023 22:48:35 +0000

"A Blessing of Dimensionality in Membership Inference through Regularization" by DSP group members Jasper Tan, Daniel LeJeune, Blake Mason, Hamid Javadi, and Richard Baraniuk has been accepted for the International Conference on Artificial Intelligence and Statistics (AISTATS) in Valencia, Spain, April 2023.

IEEE SPS Norbert Wiener Society Award

jkh6 — Tue, 13 Dec 2022 20:53:59 +0000

Richard Baraniuk has been selected for the 2022 IEEE SPS Norbert Wiener Society Award "for fundamental contributions to sparsity-based signal processing and pioneering broad dissemination of open educational resources". The Society Award honors outstanding technical contributions in a field within the scope of the IEEE Signal Processing Society and outstanding leadership in that field.