Publications by authors named "Kosuke Imai"

Support vector machine (SVM) is one of the most popular classification algorithms in the machine learning literature. We demonstrate that SVM can be used to balance covariates and estimate average causal effects under the unconfoundedness assumption. Specifically, we adapt the SVM classifier as a kernel-based weighting procedure that minimizes the maximum mean discrepancy between the treatment and control groups while simultaneously maximizing effective sample size.

Article Synopsis
  • A significant amount of research highlights the unique challenges that first-generation, low-income (FGLI) students face as "hidden minorities" in elite college environments.
  • Existing studies indicate that brief psychological interventions can help address some of these challenges, leading universities to invest in more comprehensive programs aimed at both changing mindsets and reducing structural disadvantages in academic preparation for FGLI students.
  • A randomized trial of a summer bridge program showed positive outcomes, including increased enrollment in nonintroductory courses and a shift toward taking classes for a grade, demonstrating the program's effectiveness in integrating FGLI students into selective academic communities, despite no significant changes in first-year GPAs or withdrawal rates.

The U.S. Census Bureau faces a difficult trade-off between the accuracy of Census statistics and the protection of individual information.

The Pd-catalyzed stereoselective construction of decalins with one-carbon units bearing heteroatoms at the ring junction is described. The Pd-catalyzed cyclization of silyl enol ether resulted in exclusive formation of the isomer (89%, >100/1 /). On the contrary, Pd-catalyzed carboiodination and carboborylation (with oxidative workup) provided products in 56% yield (1/>100 /) and 69% yield (1/11 /), respectively.

Congressional district lines in many US states are drawn by partisan actors, raising concerns about gerrymandering. To separate the partisan effects of redistricting from the effects of other factors including geography and redistricting rules, we compare possible party compositions of the US House under the enacted plan to those under a set of alternative simulated plans that serve as a nonpartisan baseline. We find that partisan gerrymandering is widespread in the 2020 redistricting cycle, but most of the electoral bias it creates cancels at the national level, giving Republicans two additional seats on average.

We provide the largest compiled publicly available dictionaries of first, middle, and surnames for the purpose of imputing race and ethnicity using, for example, Bayesian Improved Surname Geocoding (BISG). The dictionaries are based on the voter files of six U.S.

Prediction of individuals' race and ethnicity plays an important role in studies of racial disparity. Bayesian Improved Surname Geocoding (BISG), which relies on detailed census information, has emerged as a leading methodology for this prediction task. Unfortunately, BISG suffers from two data problems.
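
As a rough illustration of the prediction task, the sketch below applies the commonly cited BISG form of Bayes' rule, in which the posterior probability of each group is proportional to Pr(group | surname) times Pr(location | group). It assumes surname and location are conditionally independent given group. The function name and all numbers are hypothetical and are not taken from the article.

```python
def bisg_posterior(p_group_given_surname, p_geo_given_group):
    """Combine surname and geography evidence with Bayes' rule.

    Assumes surname and location are conditionally independent given group,
    so the posterior is proportional to
    Pr(group | surname) * Pr(location | group).
    """
    unnormalized = {
        group: p_group_given_surname[group] * p_geo_given_group[group]
        for group in p_group_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all groups have zero probability; cannot normalize")
    return {group: value / total for group, value in unnormalized.items()}


# Hypothetical inputs, for illustration only.
surname_prior = {"white": 0.6, "black": 0.3, "other": 0.1}
geo_likelihood = {"white": 0.001, "black": 0.004, "other": 0.002}

posterior = bisg_posterior(surname_prior, geo_likelihood)
for group, prob in posterior.items():
    print(f"{group}: {prob:.3f}")
```

In this toy example the location evidence shifts weight toward the group that is more concentrated in that area, even though the surname alone pointed elsewhere.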

This article introduces the 50STATESIMULATIONS, a collection of simulated congressional districting plans and underlying code developed by the Algorithm-Assisted Redistricting Methodology (ALARM) Project. The 50STATESIMULATIONS allow for the evaluation of enacted and other congressional redistricting plans in the United States. While the use of redistricting simulation algorithms has become standard in academic research and court cases, any simulation analysis requires non-trivial efforts to combine multiple data sets, identify state-specific redistricting criteria, implement complex simulation algorithms, and summarize and visualize simulation outputs.

Two-stage randomized experiments are becoming an increasingly popular experimental design for causal inference when the outcome of one unit may be affected by the treatment assignments of other units in the same cluster. In this paper, we provide a methodological framework for general tools of statistical inference and power analysis for two-stage randomized experiments. Under the randomization-based framework, we consider the estimation of a new direct effect of interest as well as the average direct and spillover effects studied in the literature.

Census statistics play a key role in public policy decisions and social science research. However, given the risk of revealing individual information, many statistical agencies are considering disclosure control methods based on differential privacy, which add noise to tabulated data. Unlike other applications of differential privacy, however, census statistics must be postprocessed after noise injection to be usable.

The matched-pairs design enables researchers to efficiently infer causal effects from randomized experiments. In this paper, we exploit the key feature of the matched-pairs design and develop a sensitivity analysis for missing outcomes due to truncation by death, in which the outcomes of interest (e.g.

Mediation analysis has been extensively applied in psychological and other social science research. A number of methodologists have recently developed a formal theoretical framework for mediation analysis from a modern causal inference perspective. In Imai, Keele, and Tingley (2010), we have offered such an approach to causal mediation analysis that formalizes identification, estimation, and sensitivity analysis in a single framework.

Background: Wiskott-Aldrich syndrome (WAS) is a rare X-linked immunodeficiency caused by defects of the WAS protein (WASP) gene. Patients with WAS typically demonstrate micro-thrombocytopenia.

Procedures: The report describes seven male infants with WAS who initially presented with leukocytosis, monocytosis, and myeloid and erythroid precursors in the peripheral blood (PB) and dysplasia in the bone marrow (BM), a presentation that was initially indistinguishable from juvenile myelomonocytic leukaemia (JMML).

A Japanese patient presented with lymphedema, severe Varicella zoster, and Salmonella infection, recurrent respiratory infections, panniculitis, monocytopenia, B- and NK-cell lymphopenia, and myelodysplasia. The phenotype was a mixture of the monocytopenia and mycobacterial infection (MonoMAC) and Emberger syndromes. Sequencing of the GATA-2 cDNA revealed the heterozygous missense mutation 1187 G > A.

In this commentary, we demonstrate how the potential outcomes framework can help understand the key identification assumptions underlying causal mediation analysis. We show that this framework can lead to the development of alternative research design and statistical analysis strategies applicable to the longitudinal data settings considered by Maxwell, Cole, and Mitchell (2011).

Traditionally in the social sciences, causal mediation analysis has been formulated, understood, and implemented within the framework of linear structural equation models. We argue and demonstrate that this is problematic for 3 reasons: the lack of a general definition of causal mediation effects independent of a particular statistical model, the inability to specify the key identification assumption, and the difficulty of extending the framework to nonlinear models. In this article, we propose an alternative approach that overcomes these limitations.

Background: We assessed aspects of Seguro Popular, a programme aimed to deliver health insurance, regular and preventive medical care, medicines, and health facilities to 50 million uninsured Mexicans.

Methods: We randomly assigned treatment within 74 matched pairs of health clusters (ie, health facility catchment areas) representing 118 569 households in seven Mexican states, and measured outcomes in a 2005 baseline survey (August, 2005, to September, 2005) and follow-up survey 10 months later (July, 2006, to August, 2006) in 50 pairs (n=32 515). The treatment consisted of encouragement to enrol in a health-insurance programme and upgraded medical facilities.

In his 1923 landmark article, Neyman introduced randomization-based inference to estimate average treatment effects from experiments under the completely randomized design. Under this framework, Neyman considered the statistical estimation of the sample average treatment effect and derived the variance of the standard estimator using the treatment assignment mechanism as the sole basis of inference. In this paper, I extend Neyman's analysis to randomized experiments under the matched-pair design where experimental units are paired based on their pre-treatment characteristics and the randomization of treatment is subsequently conducted within each matched pair.
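
To make the matched-pair setting concrete, the sketch below computes the textbook within-pair difference estimator of the average treatment effect, together with the usual variance estimate based on the spread of the pair differences. This is an illustrative assumption about what a "standard estimator" looks like in this design, not a reproduction of the paper's own derivation; the function name and data are hypothetical.

```python
def matched_pair_estimate(pairs):
    """Estimate the average treatment effect from matched pairs.

    `pairs` is a list of (treated_outcome, control_outcome) tuples, one per
    matched pair. Returns the mean within-pair difference and the usual
    variance estimate sum((d_k - mean)^2) / (n * (n - 1)).
    """
    diffs = [treated - control for treated, control in pairs]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs to estimate the variance")
    tau_hat = sum(diffs) / n
    var_hat = sum((d - tau_hat) ** 2 for d in diffs) / (n * (n - 1))
    return tau_hat, var_hat


# Hypothetical outcomes for three matched pairs, for illustration only.
example_pairs = [(5.0, 3.0), (4.0, 4.0), (6.0, 3.0)]
tau_hat, var_hat = matched_pair_estimate(example_pairs)
print(f"estimate = {tau_hat:.3f}, variance estimate = {var_hat:.3f}")
```

Because treatment is randomized within each pair, only the pair-level differences enter the calculation, which is why pairing on strong pre-treatment predictors can reduce the variance.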

A rapidly aging population, such as that of the United States today, is characterized by the increased prevalence of chronic impairment. Robust estimation of disability-free life expectancy (DFLE), or healthy life expectancy, is essential for examining whether additional years of life are spent in good health and whether life expectancy is increasing faster than the decline of disability rates. More than 30 years after its publication, Sullivan's method remains the most widely used method to estimate DFLE.
