


  • Large-scale cancer epidemiology cohorts (CECs) have successfully collected, analyzed, and shared patient-reported data for years. CECs increasingly need to make their data more findable, accessible, interoperable, and reusable (FAIR). This 2020 publication describes how the California Teachers Study created a scalable infrastructure that provides the security, authorization, data model, metadata, and analytic tools needed to manage, share, and analyze study data in ways that are consistent with the NCI's Cancer Research Data Commons Framework. Read more here.

  • In a 2019 publication, researchers implemented a new method to predict disease among cohorts by weighting cohort subjects, and tested this method by creating ovarian cancer prediction models using data from the California Teachers Study. Read more here.

  • This 2020 publication describes how the California Teachers Study created and distributed its sixth study questionnaire (Questionnaire 6) using email marketing automation software and a web- and mobile-enabled questionnaire platform. This integrated platform captured data on recruitment activities that may influence overall response, including the date and time questionnaire invitations and reminders were emailed and the date and time questionnaires were started and submitted. Read more here.

  • The COronavirus Pandemic Epidemiology (COPE) consortium was established to facilitate rapid research on the coronavirus disease 2019 (COVID-19) pandemic. This publication describes the deployment of the COVID Symptom Study (previously known as the COVID Symptom Tracker) mobile application as a common data collection tool for epidemiologic cohort studies with active study participants. Read more here.

  • Cancer Informatics for Cancer Centers (CI4CC) is a nonprofit organization that provides a focused national forum for engagement of senior cancer informatics leaders. This publication summarizes highlights from the 2019 CI4CC meeting, including lessons learned from the California Teachers Study's approach to modernizing an existing epidemiology cohort study to be compatible with the NCI’s Cancer Research Data Commons (CRDC) approach and the principles of FAIR data. Read more here.

  • The incidence of advanced breast cancer in premenopausal women has increased in recent decades, unlike rates in postmenopausal women. The Premenopausal Breast Cancer Collaboration is a cooperative group of 20 cohort studies that aims to identify contributors to these rates and reduce them. This 2017 publication describes the rationale for the Premenopausal Breast Cancer Collaboration. Read more here.

  • Researchers evaluated whether address data from LexisNexis, a commercial credit reporting company, could be used to reconstruct residential history for CTS participants. Using LexisNexis address data, researchers doubled the proportion of the study population for whom they had an address of residence during childbearing years, an important period of susceptibility for breast cancer risk. Read more here.

  • This 2014 study used California Teachers Study data to demonstrate how covariates ascertained at cohort entry (when participants completed the baseline questionnaire in 1995) could be used to assign the probability of adverse outcomes within a risk model with a two-stage sampling strategy. Read more here.

  • A 2015 study found that nail clippings, combined with OmniPlex whole-genome amplification, could be used as an alternative to whole blood for DNA analyses. Read more here.

  • This 2013 study applied the Rosner-Colditz breast cancer incidence model to the CTS population. Researchers affirmed that the incidence of breast cancer was statistically significantly higher in the CTS than in the Nurses’ Health Study (NHS), and that the model worked consistently well when applied to an independent data set. Read more here.

  • The Epidemiology of Endometrial Cancer Consortium (E2C2) pools data from more than 30 studies of endometrial cancer worldwide. Study cooperation is managed by an executive committee and an advisory committee, allowing coordinated research on endometrial cancer risk factors, such as genetics and diet, and on the causes of rare tumor subtypes. Read more here.

  • This 2007 assessment found that a food-frequency questionnaire (FFQ) used in the CTS was valid and reproducible. Reproducibility correlations for nutrient data ranged from 0.60 to 0.87, and validity correlations were reasonably high (range: 0.55 to 0.85), with a few exceptions. Read more here.

  • Pooled studies have become more common alongside the increase in epidemiological publications. This 2006 publication describes the methods used in the Pooling Project of Prospective Studies of Diet and Cancer. Read more here.

  • Sensitivity (the ability to identify those with a disease) and specificity (the ability to identify those without a disease) were calculated for self-reported cancers. Sensitivity varied greatly by cancer site, from >90% for breast and thyroid cancers to <70% for cervical, endometrial, and skin cancers. Researchers found that in situ diagnosis and older age were associated with increased false-positive reporting. Moreover, older age and non-White race were associated with greater false-negative reporting. Findings from this 2003 study suggest that the feasibility of using self-reported data without validation may depend on cancer site. Read more here.

  • Researchers compared the accuracy and completeness of self-reported hospitalization data and hospital discharge data. Self-reported information was found to be most accurate regarding recent treatments, scheduled admissions, less severe disorders, and longer lengths of stay; however, mental health and infectious disease were not well reported. This 2003 publication also found that self-reported information was generally more comprehensive, as outpatient treatment is common, while hospital discharge data were more specific. Read more here.
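
For readers who want to see how the sensitivity and specificity measures mentioned in the self-reported cancer summary above are calculated, here is a minimal Python sketch. It is illustrative only and is not the method or code used in the 2003 study. The function name and the counts are hypothetical, chosen to make the arithmetic easy to follow, and the sketch assumes each self-report has already been compared against a reference source such as a cancer registry.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from validation counts.

    true_pos:  reported the cancer, and the reference source confirms it
    false_neg: did not report the cancer, but the reference source shows it
    true_neg:  did not report the cancer, and the reference source shows none
    false_pos: reported the cancer, but the reference source shows none
    """
    with_disease = true_pos + false_neg
    without_disease = true_neg + false_pos
    if with_disease == 0 or without_disease == 0:
        raise ValueError("need at least one person with and one without the disease")
    # Sensitivity: share of people WITH the disease who were correctly identified.
    sensitivity = true_pos / with_disease
    # Specificity: share of people WITHOUT the disease who were correctly identified.
    specificity = true_neg / without_disease
    return sensitivity, specificity


# Hypothetical counts for illustration; these are NOT study results.
sens, spec = sensitivity_specificity(true_pos=90, false_neg=10,
                                     true_neg=950, false_pos=50)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

In this made-up example, 90 of 100 people with the disease reported it (sensitivity of 90%), and 950 of 1,000 people without the disease correctly reported not having it (specificity of 95%). False-negative reports lower sensitivity, and false-positive reports lower specificity.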
