Publications by authors named "J Fukuyama"

Article Synopsis
  • Somatic hypermutation (SHM) is crucial for antibody diversity and requires probabilistic models for analyzing mutations and understanding affinity maturation.
  • The authors create more efficient "thrifty" models using convolutions on 3-mer embeddings, resulting in fewer parameters yet a wider contextual understanding of SHM compared to traditional 5-mer models.
  • They discover that a per-site effect isn't necessary to explain SHM patterns, and note discrepancies between two current methods for fitting SHM models, which do not improve performance when combined.
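
As a rough illustration of the second point above, the sketch below embeds the 3-mer centred on each site and runs a single convolution filter over those embeddings, so a kernel spanning `kernel_size` sites sees `kernel_size + 2` bases. It is not the authors' architecture: the single linear filter, the zero padding at sequence ends, the function names, and the example sizes are all assumptions chosen for illustration, and the 5-mer baseline is assumed to be a plain lookup table with one value per 5-mer.

```python
BASES = "ACGT"

def kmer_index(kmer):
    """Map a k-mer over ACGT to an integer in [0, 4**k)."""
    idx = 0
    for base in kmer:
        idx = idx * 4 + BASES.index(base)
    return idx

def site_scores(seq, embedding, kernel, bias=0.0):
    """Return one unnormalized score per site of seq.

    embedding: 64 rows of length d, one row per 3-mer (hypothetical weights)
    kernel:    kernel_size rows of length d, a single filter (odd kernel_size assumed)
    """
    d = len(embedding[0])
    ksize = len(kernel)
    pad = ksize // 2
    zero = [0.0] * d
    # Embed the 3-mer centred on each site; the two end sites have no full 3-mer.
    embedded = []
    for i in range(len(seq)):
        if 1 <= i < len(seq) - 1:
            embedded.append(embedding[kmer_index(seq[i - 1:i + 2])])
        else:
            embedded.append(zero)
    padded = [zero] * pad + embedded + [zero] * pad
    # Slide the filter along the sequence: a dot product over each window.
    scores = []
    for i in range(len(seq)):
        total = bias
        for j in range(ksize):
            total += sum(w * x for w, x in zip(kernel[j], padded[i + j]))
        scores.append(total)
    return scores

def parameter_counts(embed_dim, kernel_size):
    """Compare this sketch's parameter count with a 5-mer lookup table."""
    thrifty = 64 * embed_dim + kernel_size * embed_dim + 1  # embeddings + filter + bias
    five_mer = 4 ** 5                                       # one value per 5-mer
    context_width = kernel_size + 2                         # bases seen per site
    return thrifty, five_mer, context_width
```

With the assumed sizes `embed_dim=7` and `kernel_size=11`, `parameter_counts` reports 526 parameters covering a 13-base context, against 1024 values for a 5-base lookup table.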

The goal of dimension reduction tools is to construct a low-dimensional representation of high-dimensional data. These tools are employed for a variety of reasons, such as noise reduction, visualization, and lowering computational costs. However, a fundamental issue that is widely discussed in other modeling problems is often overlooked in dimension reduction: overfitting.

Article Synopsis
  • Long-term monitoring of landfill leachate (LFL) and landfill gas (LFG) is essential until municipal solid waste is stabilized and post-closure care ends.
  • Methane emissions from a marine landfill were found to be about 30% lower than IPCC model estimates over 30 years, with LFL contributing minimally to carbon emissions.
  • Changes in the CO₂/CH₄ ratio indicate that methane oxidation in soil might explain the differences between observed and estimated emissions.

k-mer-based distances are often used to describe the differences between communities in metagenome sequencing studies because of their computational convenience and history of effectiveness. Although k-mer-based distances do not use information about taxon abundances, we show that one class of k-mer distances between metagenomes (the Euclidean distance between k-mer spectra, or EKS distances) is very closely related to a class of phylogenetically informed β-diversity measures that do explicitly use both the taxon abundances and information about the phylogenetic relationships among the taxa. Furthermore, we show that both of these distances can be interpreted as using certain features of the taxon abundances that are related to the phylogenetic tree.
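
The EKS distance described above can be sketched in a few lines. This is a minimal illustration and not the authors' code: the function names are hypothetical, and normalizing each spectrum to relative frequencies is an assumption, since the abstract does not say whether raw counts or frequencies are compared.

```python
from collections import Counter
from math import sqrt

def kmer_spectrum(reads, k):
    """Count every k-mer in a collection of reads and return relative frequencies.

    Normalizing by the total count is an assumption made for this sketch.
    """
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: c / total for kmer, c in counts.items()}

def eks_distance(spec_a, spec_b):
    """Euclidean distance between two k-mer spectra stored as dicts."""
    kmers = set(spec_a) | set(spec_b)  # k-mers absent from a sample count as 0
    return sqrt(sum((spec_a.get(m, 0.0) - spec_b.get(m, 0.0)) ** 2 for m in kmers))

# Two toy "metagenomes", each a list of reads.
sample_a = kmer_spectrum(["ACGT"], k=2)
sample_b = kmer_spectrum(["ACAC"], k=2)
distance = eks_distance(sample_a, sample_b)
```

Note that no taxon labels or abundances enter the computation; only the sequences themselves are used, which is what makes the link to phylogenetically informed measures notable.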


Topic modeling is a popular method used to describe biological count data. With topic models, the user must specify the number of topics $K$. Since there is no definitive way to choose $K$ and since a true value might not exist, we develop a method, which we call topic alignment, to study the relationships across models with different $K$.
