Overview:
In this project, you will take on the role of secondary health science researchers to demonstrate your
understanding of the statistical techniques studied in class. As secondary researchers, you will analyze
data collected on cancer statistics in Canada in 2023 and draw some conclusions based on this data.
• Your project must be completed using Microsoft Excel.
• Each question should be on its own worksheet, labelled by part and question number
(e.g. Part 1-2, Part 2-3).
• All graphs should be properly titled and fully labelled.
• All statistical values/probabilities should be clearly labelled and calculated using Excel formulas.
o Statistical values/probabilities that are manually typed in will not receive any marks.
• Your work should be laid out clearly and neatly, with appropriate headings, titles, and labels, and your conclusions
should be presented in a way that is easy to understand.
Go to the website www.cancer.ca/statistics and download the Excel data file underneath the heading
“2023 Resources” called “Source Data File”.
Open a new workbook and name it “MATH 20025 Data Project – GROUPNUM”, where “GROUPNUM” is
replaced by your group number.
use the normal distribution to calculate the probability of a randomly-selected individual with that
type of cancer being:
o Under 40 years old
o Between 50 and 74 years old
o 65 years old and over
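The three probabilities above can be cross-checked outside Excel with a short sketch like the one below. It is only a checking aid, since the project itself must be completed with Excel formulas (for example NORM.DIST with the cumulative argument set to TRUE). The mean and standard deviation shown are hypothetical placeholders, not values from the data file, and the sketch treats age as continuous with 74 as the upper bound exactly as written.

```python
from statistics import NormalDist

# Hypothetical placeholders: replace with the mean and standard deviation
# you calculate from the data for your chosen cancer type.
mean_age = 66.0
sd_age = 12.0

age = NormalDist(mu=mean_age, sigma=sd_age)

# P(X < 40): area to the left of 40
p_under_40 = age.cdf(40)

# P(50 <= X <= 74): area between the two ages
p_50_to_74 = age.cdf(74) - age.cdf(50)

# P(X >= 65): area to the right of 65
p_65_plus = 1 - age.cdf(65)

print(f"Under 40:      {p_under_40:.4f}")
print(f"Between 50-74: {p_50_to_74:.4f}")
print(f"65 and over:   {p_65_plus:.4f}")
```

Each line mirrors one Excel calculation: a single cumulative probability, a difference of two cumulative probabilities, and one minus a cumulative probability.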
construct a confidence interval for the true mean age of those diagnosed with the type of cancer
you chose, at ONE of the following confidence levels:
o 90%
o 95%
o 99%
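As a rough cross-check of the interval, the sketch below computes a z-based confidence interval for a mean. It assumes the normal (z) method is the one expected here; if your class uses the t distribution instead, the critical value will differ slightly. The function name and the sample mean, standard deviation, and sample size are hypothetical placeholders. In Excel, CONFIDENCE.NORM returns the same margin of error.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, confidence):
    """Return (lower, upper) for a z-based confidence interval of the mean."""
    alpha = 1 - confidence                      # e.g. 0.05 for 95% confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)     # two-sided critical value
    margin = z * sd / sqrt(n)                   # margin of error
    return mean - margin, mean + margin


# Hypothetical placeholders, not values from the data file.
sample_mean = 66.0
sample_sd = 12.0
sample_size = 400

for level in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(sample_mean, sample_sd, sample_size, level)
    print(f"{level:.0%} confidence interval: ({low:.2f}, {high:.2f})")
```

Note that the interval widens as the confidence level rises, which is worth mentioning when you interpret the level you chose.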
Part 2-B and discuss what conclusions you can draw about the distribution of the data based on
these results. Be sure to support your claims by referencing at least one other idea we learned in
class when discussing the shape and distribution of data.
the average annual percent change (AAPC) in age-standardized incidence rates (ASIR) for
selected cancers, and provides a 95% confidence interval for the true AAPC based on the data.
Find a specific cancer type (do not use “all cancers”) that satisfies each of the following:
o Statistically significant increase in AAPC from 1984 to 2019 for males
o Statistically significant decrease in AAPC from 1984 to 2019 for females
o No statistically significant change in AAPC from 1984 to 2019 for both sexes
Be sure to justify your answers for each selection.
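One common way to justify each selection is to check whether the 95% confidence interval for the AAPC contains zero: an interval entirely above zero suggests a significant increase, one entirely below zero suggests a significant decrease, and one that contains zero suggests no significant change at the 5% level. The sketch below illustrates that rule only; the function name and the interval values are hypothetical and are not taken from the data file.

```python
def classify_aapc(ci_lower, ci_upper):
    """Classify a trend from the 95% confidence interval of its AAPC."""
    if ci_lower > 0:
        return "statistically significant increase"
    if ci_upper < 0:
        return "statistically significant decrease"
    return "no statistically significant change"


# Hypothetical intervals (percent per year), for illustration only.
examples = {
    "interval above zero": (0.8, 1.6),
    "interval below zero": (-2.1, -0.4),
    "interval containing zero": (-0.3, 0.5),
}

for label, (lower, upper) in examples.items():
    print(f"{label}: {classify_aapc(lower, upper)}")
```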
scatter diagram between age and ASIR for males. Add a linear trendline to the data, display
its equation and R^2 value, and calculate the correlation coefficient.
Hint: you will need to use the midpoint of each age class in order to do a regression analysis.
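The hint can be illustrated with a small sketch: each age class is replaced by its midpoint, and the least-squares line, correlation coefficient, and R^2 are then computed from the midpoint and ASIR pairs. The age classes and ASIR values below are invented for illustration and are not from the data file, and the helper name is hypothetical. In Excel, SLOPE, INTERCEPT, CORREL, and RSQ give the corresponding values.

```python
from math import sqrt


def linear_fit(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)       # Pearson correlation coefficient
    return slope, intercept, r


# Hypothetical age classes (lower, upper) and ASIR values, for illustration only.
age_classes = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69)]
asir_values = [12.0, 25.0, 61.0, 130.0, 210.0]

# Midpoint of each class, e.g. (20 + 29) / 2 = 24.5
midpoints = [(lower + upper) / 2 for lower, upper in age_classes]

slope, intercept, r = linear_fit(midpoints, asir_values)
print(f"ASIR = {slope:.3f} * age + {intercept:.3f}")
print(f"r = {r:.4f}, R^2 = {r ** 2:.4f}")
```

For a simple linear regression, R^2 is the square of the correlation coefficient, so the two Excel results can be checked against each other.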
and use them to estimate the ASIR for a person in your group (indicate the appropriate sex* and
age for the person you have selected to perform your estimate for). *Please note these studies
refer to biological sex as opposed to gender identity. However, the selected student is welcome to
use their preferred gender identity in lieu of their biological sex, if they are not the same.
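An estimate from a fitted line is obtained by substituting the person's age into the regression equation. The sketch below shows only that substitution; the function name, slope, intercept, and age are hypothetical placeholders and should be replaced with your own results. Estimates for ages outside the range of the data should be interpreted with caution.

```python
def predict_asir(age, slope, intercept):
    """Estimate ASIR from the regression line: ASIR = slope * age + intercept."""
    return slope * age + intercept


# Hypothetical placeholders: use the slope and intercept from your own trendline.
slope = 4.5
intercept = -110.0
person_age = 45

estimate = predict_asir(person_age, slope, intercept)
print(f"Estimated ASIR at age {person_age}: {estimate:.1f}")
```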