A Data Mining Approach to Investigate Food Groups related to Incidence of
Bladder Cancer in the BLadder cancer Epidemiology and Nutritional Determinants
International Study.
Yu EYW(1), Wesselius A(1), Sinhart C(2), Wolk A(3), Stern MC(4), Jiang X(4),
Tang L(5), Marshall J(5), Kellen E(6), van den Brandt P(7), Lu CM(8),
Pohlabeln H(9), Steineck G(10), Allam MF(11), Karagas MR(12), La Vecchia C(13),
Porru S(14)(15), Carta A(15)(16), Golka K(17), Johnson KC(18), Benhamou S(19),
Zhang ZF(20), Bosetti C(21), Taylor JA(22), Weiderpass E(23), Grant EJ(24),
White E(25), Polesel J(26), Zeegers MPA(27)(28).
Author information:
(1)Department of Complex
Genetics and Epidemiology, School of Nutrition and Translational Research in
Metabolism, Maastricht University, Maastricht, The Netherlands.
(2)DKE Scientific staff, Data
Science & Knowledge Engineering, Faculty of Science and Engineering.
(3)Division of Nutritional
Epidemiology, Institute of Environmental Medicine, Karolinska
Institute, Stockholm, Sweden.
(4)Department of Preventive
Medicine, University of Southern California, Los Angeles, CA, USA.
(5)Department of Cancer
Prevention and Control, Roswell Park Cancer Institute, Buffalo, NY, USA.
(6)Leuven University Centre
for Cancer Prevention (LUCK), Leuven, Belgium.
(7)Department of Epidemiology,
Schools for Oncology and Developmental Biology and Public Health and Primary
Care, Maastricht University Medical Centre, Maastricht, The Netherlands.
(8)Department of Urology,
Buddhist Dalin Tzu Chi General Hospital, Dalin Township 62247, Chiayi County,
Taiwan.
(9)Leibniz Institute for
Prevention Research and Epidemiology-BIPS, Bremen, Germany.
(10)Department of Oncology and
Pathology, Division of Clinical Cancer Epidemiology, Karolinska Hospital,
Stockholm, Sweden.
(11)Department of Preventive
Medicine and Public Health, Faculty of Medicine, University of Cordoba,
Cordoba, Spain.
(12)Department of
Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
(13)Department of Clinical
Medicine and Community Health, University of Milan, Milan, Italy.
(14)Department of Diagnostics
and Public Health, Section of Occupational Health, University of Verona, Italy.
(15)University Research Center
"Integrated Models for Prevention and Protection in Environmental and
Occupational Health" MISTRAL, University of Verona, Milano Bicocca and
Brescia, Italy.
(16)Department of Medical and
Surgical Specialties, Radiological Sciences and Public Health, University of
Brescia, Italy.
(17)Leibniz Research Centre
for Working Environment and Human Factors at TU Dortmund, Dortmund, Germany.
(18)Department of Epidemiology
and Community Medicine, University of Ottawa, Ottawa, ON, Canada.
(19)INSERM U946, Variabilite
Genetique et Maladies Humaines, Fondation Jean Dausset/CEPH, Paris, France.
(20)Departments of
Epidemiology, UCLA Center for Environmental Genomics, Fielding School of Public
Health, University of California, Los Angeles (UCLA), Los Angeles, CA, USA.
(21)Department of Oncology,
Istituto di Ricerche Farmacologiche Mario Negri-IRCCS, Milan, Italy.
(22)Epidemiology Branch, and
Epigenetic and Stem Cell Biology Laboratory, National Institute of
Environmental Health Sciences, NIH, Research Triangle Park, NC, USA.
(23)International Agency for
Research on Cancer (IARC), World Health Organization, Lyon, France.
(24)Department of Epidemiology,
Radiation Effects Research Foundation, Hiroshima, Japan.
(25)Fred Hutchinson Cancer
Research Center, Seattle, WA, USA.
(26)Unit of Cancer
Epidemiology, Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, Italy.
(27)CAPHRI School for Public
Health and Primary Care, University of Maastricht, Maastricht, The Netherlands.
(28)School of Cancer Sciences,
University of Birmingham, Birmingham, UK.
British Journal of Nutrition 2020 Apr 23:1-28. doi: 10.1017/S0007114520001439.
[Epub ahead of print]
At present, the analysis of diet and bladder cancer (BC) is mostly based on the
intake of individual foods. Examining food combinations offers a way to deal
with the complexity and unpredictability of the diet and aims to overcome the
limitations of studying nutrients and foods in isolation. This article aims to
demonstrate the usability of supervised data mining methods for extracting the
food groups related to BC. To derive key food groups associated with BC risk,
we applied the data mining technique C5.0 with 10-fold cross-validation in the
BLadder cancer Epidemiology and Nutritional Determinants (BLEND) study,
including data from 18 case-control studies and 1 nested case-cohort study,
comprising 8,320 BC cases out of 31,551 participants. Dietary data on the 11
main food groups of the Eurocode 2 Core classification codebook and relevant
non-diet data (i.e. sex, age and smoking status) were available. Primarily,
five key food groups associated with BC risk were extracted; in order of
importance: beverages (non-milk); grains and grain products; vegetables and
vegetable products; fats, oils and their products; and meats and meat products.
Since these food groups correspond with previously proposed BC-related dietary
factors, data mining seems to be a promising technique in the field of
nutritional epidemiology and deserves further examination.
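As a rough illustration of how a decision-tree learner can rank predictors "in order of importance", the sketch below computes the gain ratio, an entropy-based splitting criterion used in the C4.5 family of algorithms to which C5.0 belongs. It is not the authors' implementation and does not reproduce C5.0 itself. The variables `case`, `high_beverage` and `high_grain` are hypothetical toy data, not BLEND data.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain from splitting on a categorical feature,
    normalised by the entropy of the split itself (split information)."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels within each feature level.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: 1 = BC case, 0 = control (illustration only).
case = [1, 1, 1, 1, 0, 0, 0, 0]
high_beverage = ["high", "high", "high", "low", "low", "low", "low", "high"]
high_grain = ["high", "low", "high", "low", "high", "low", "high", "low"]

features = {"beverages": high_beverage, "grains": high_grain}
ranking = sorted(features, key=lambda name: gain_ratio(features[name], case),
                 reverse=True)
print(ranking)
```

In this toy setting the feature that separates cases from controls better receives the higher gain ratio and is ranked first. A real analysis would also need cross-validation and adjustment for non-diet variables such as sex, age and smoking status.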
DOI: 10.1017/S0007114520001439
PMID: 32321598