Volume 1, Issue 2, June 2024
research-article
Open Access
Identification and Semiparametric Efficiency Theory of Nonignorable Missing Data with a Shadow Variable
Article No.: 5, Pages 1–23, https://doi.org/10.1145/3592389

We consider identification and estimation with an outcome missing not at random (MNAR). We study an identification strategy based on a so-called shadow variable. A shadow variable is assumed to be correlated with the outcome but independent of the ...

Highlights

Problem statement

Missingness not at random (MNAR) arises in many empirical studies in biomedical, socioeconomic, and epidemiological research. A fundamental problem with MNAR is identification, that is, the parameter of interest ...
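For orientation only, the shadow-variable conditions are commonly formalized as below; the notation (outcome Y, fully observed covariates X, missingness indicator R with R = 1 when Y is observed, shadow variable Z) is assumed here and may differ from the article's own statement.

```latex
% Illustrative statement of the usual shadow-variable conditions (assumed notation):
% (i)  the shadow variable carries no information about missingness beyond (X, Y);
% (ii) it remains associated with the outcome among the observed cases.
\[
  \text{(i)}\ \ Z \perp R \mid (X, Y),
  \qquad
  \text{(ii)}\ \ Z \not\perp Y \mid (X, R = 1).
\]
```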

research-article
Open Access
Optimistic Rates: A Unifying Theory for Interpolation Learning and Regularization in Linear Regression
Article No.: 6, Pages 1–51, https://doi.org/10.1145/3594234

We study a localized notion of uniform convergence known as an “optimistic rate” [34, 39] for linear regression with Gaussian data. Our refined analysis avoids the hidden constant and logarithmic factor in existing results, which are known to be crucial ...

Highlights

Problem Statement

Generalization theory proposes to explain the ability of machine learning models to generalize to fresh examples by bounding the gap between the test error (error on new examples) and training error (error on the data they ...
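For context, a minimal sketch of the quantities involved, using standard definitions rather than the article's refined bound: the test risk and empirical risk of a linear predictor w on n samples.

```latex
% Standard definitions (illustrative; not the paper's specific result):
\[
  L(w) = \mathbb{E}_{(x,y)}\big[(\langle w, x \rangle - y)^2\big],
  \qquad
  \hat{L}(w) = \frac{1}{n}\sum_{i=1}^{n}\big(\langle w, x_i \rangle - y_i\big)^2 .
\]
% A uniform-convergence guarantee of the ``optimistic rate'' type controls
% L(w) in terms of \hat{L}(w) plus a complexity term, simultaneously over w.
```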

research-article
Open Access
Language Models in the Loop: Incorporating Prompting into Weak Supervision
Article No.: 7, Pages 1–30, https://doi.org/10.1145/3617130

We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited. Rather than apply the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions ...

Highlights

Problem statement

The goal of this paper is to use large language models to create smaller, specialized models. These specialized models can be better suited to specific tasks because they are tuned for them and are less expensive to serve in ...
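As a hedged sketch of the general workflow (hypothetical names; not the authors' code), a prompted language model can serve as a labeling function whose noisy votes are later aggregated to train a smaller end model:

```python
from typing import Callable, Dict

ABSTAIN = -1  # conventional "no vote" value in programmatic weak supervision

def make_prompted_lf(ask_llm: Callable[[str], str],
                     prompt_template: str,
                     label_map: Dict[str, int]):
    """Turn a prompt into a labeling function: text -> class id, or ABSTAIN."""
    def labeling_function(text: str) -> int:
        answer = ask_llm(prompt_template.format(text=text)).strip().lower()
        return label_map.get(answer, ABSTAIN)
    return labeling_function

# Toy stand-in for a real completion API so the sketch runs end to end.
def toy_llm(prompt: str) -> str:
    return "yes" if "free money" in prompt.lower() else "no"

spam_lf = make_prompted_lf(
    toy_llm,
    "Does the following message look like spam? Answer yes or no.\n{text}",
    {"yes": 1, "no": 0},
)

print(spam_lf("Claim your FREE MONEY now!"))   # -> 1
print(spam_lf("Meeting moved to 3pm."))        # -> 0
```

Votes from several such prompted labeling functions would then be combined by a label model into probabilistic labels used to train the smaller, cheaper-to-serve end model the abstract describes.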

research-article
Open Access
Principal Component Networks: Utilizing Low-Rank Activation Structure to Reduce Parameters Early in Training
Article No.: 8, Pages 1–27, https://doi.org/10.1145/3617778

Recent works show that overparameterized neural networks contain small subnetworks that exhibit comparable accuracy to the full model when trained in isolation. These results highlight the potential to reduce the computational costs of deep neural network ...

Highlights

Problem Statement

Many recent results show that large neural networks can lead to improved generalization. Yet, training these large models comes with increased computational costs. In an effort to address this issue, several works have shown ...
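As a hedged illustration of the underlying idea (NumPy sketch with assumed shapes; not the authors' implementation), a dense layer can be re-parameterized in the basis of its input activations' leading principal components:

```python
import numpy as np

rng = np.random.default_rng(0)

# Nearly low-rank activations: (batch, features) with an intrinsic rank of ~32.
latent = rng.normal(size=(1024, 32))
mix = rng.normal(size=(32, 512))
acts = latent @ mix + 0.01 * rng.normal(size=(1024, 512))

W = rng.normal(size=(512, 256))            # weights of the following dense layer

# Principal directions of the (centered) activations.
centered = acts - acts.mean(axis=0, keepdims=True)
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
k = 64
P = Vt[:k].T                               # (512, k): projection onto top-k PCs

# Re-parameterized layer: project inputs, then apply a much smaller weight matrix.
W_small = P.T @ W                          # (k, 256) -- far fewer parameters
out_full = acts @ W
out_compressed = (acts @ P) @ W_small      # close to out_full when activations
                                           # concentrate in the top-k subspace
rel_err = np.linalg.norm(out_full - out_compressed) / np.linalg.norm(out_full)
print(f"relative error of compressed layer: {rel_err:.4f}")
```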
