Infrared molecular fingerprinting of blood offers a scalable, minimally invasive window into human physiology, but limited follow-up, imbalanced cohorts, and restricted access to diverse phenotypes constrain systematic studies. Here, we introduce a conditional deep generative framework for synthesizing blood-based infrared spectra that preserves individual-level structure while allowing controlled manipulation of demographic and anthropometric covariates. Using 25,308 spectra from 5,863 ostensibly healthy participants in the longitudinal Health4Hungary – Hungary4Health cohort, we train a Conditional Variational Autoencoder and a Conditional Boundary Equilibrium GAN to generate blood-based infrared spectra conditioned on age, sex, and body mass index. We show that the generated spectra closely match held-out real data across multiple levels and faithfully encode demographic and anthropometric information. We further demonstrate two in-silico applications: modeling of individualized healthy aging trajectories that follow cohort-level aging manifolds while retaining subject-specific characteristics, and targeted augmentation of underrepresented body mass index categories. Together, these results establish conditional generative modeling of blood-based infrared spectra as a viable approach for virtual cohort construction, cohort balancing, and controlled in-silico phenotyping, paving the way toward more comprehensive and data-efficient studies in precision health.
BibTeXKey: LHS+26