Background: The identification of gene-by-environment interactions is important for understanding the genetic basis of chronic obstructive pulmonary disease (COPD). Many COPD genetic association analyses assume a linear relationship between pack-years of smoking exposure and forced expiratory volume in 1 s (FEV1); however, this assumption has not been evaluated empirically in cohorts with a wide spectrum of COPD severity.
Methods: The relationship between FEV1 and pack-years of smoking exposure was examined in four large cohorts assembled for the purpose of identifying genetic associations with COPD. Using data from the Alpha-1 Antitrypsin Genetic Modifiers Study, the accuracy and power of two different approaches to model smoking were compared by performing a simulation study of a genetic variant with a range of gene-by-smoking interaction effects.
Results: Non-linear relationships between smoking and FEV1 were identified in the four cohorts. It was found that, in most situations where the relationship between pack-years and FEV1 is non-linear, a piecewise linear approach to model smoking and gene-by-smoking interactions is preferable to the commonly used total pack-years approach. The piecewise linear approach was applied to a genetic association analysis of the PI*Z allele in the Norway Case-Control cohort and a potential PI*Z-by-smoking interaction was identified (p=0.03 for FEV1 analysis, p=0.01 for COPD susceptibility analysis).
Conclusion: In study samples of subjects with a wide range of COPD severity, a non-linear relationship between pack-years of smoking and FEV1 is likely. In this setting, approaches that account for this non-linearity can be more powerful and less biased than the more common approach of using total pack-years to model the smoking effect.
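As an illustration of what a piecewise linear model of smoking with gene-by-smoking interaction terms can look like, the following is a minimal sketch and not the specification used in the study. It assumes a single knot in pack-years (the value `KNOT = 20.0` is hypothetical), a binary genotype indicator, and ordinary least squares; the function names, the synthetic data, and all effect sizes are invented for demonstration only.

```python
import numpy as np

def piecewise_design(pack_years, genotype, knot):
    """Design matrix for a one-knot linear spline in pack-years,
    plus genotype main effect and genotype-by-segment interactions."""
    py = np.asarray(pack_years, dtype=float)
    g = np.asarray(genotype, dtype=float)
    low = np.minimum(py, knot)          # pack-years accumulated below the knot
    high = np.maximum(py - knot, 0.0)   # pack-years accumulated above the knot
    return np.column_stack([np.ones_like(py), low, high, g, g * low, g * high])

def fit_ols(X, y):
    """Ordinary least-squares coefficients for y ~ X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic data only: every number below is a hypothetical choice.
rng = np.random.default_rng(0)
n = 2000
pack_years = rng.uniform(0, 80, n)
genotype = rng.binomial(1, 0.3, n)      # 1 = carrier of the variant
KNOT = 20.0                             # assumed knot location

# Columns: intercept, slope below knot, slope above knot,
# genotype, genotype x slope below knot, genotype x slope above knot
true_coef = np.array([3.5, -0.04, -0.005, -0.1, -0.01, 0.0])

X = piecewise_design(pack_years, genotype, KNOT)
fev1 = X @ true_coef + rng.normal(0, 0.3, n)   # simulated FEV1 in litres

coef = fit_ols(X, fev1)
print(np.round(coef, 3))
```

The total pack-years approach corresponds to forcing the two segment slopes (and the two interaction slopes) to be equal; letting them differ is what allows the model to follow a non-linear smoking effect.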