Q1. I didn’t quite understand it. Can you please explain what this assignment is asking?

A. Assignment 2 includes a few quantitative data analysis exercises that require you to analyze the data ( _A2.xlsx) in order to answer some basic questions. Before you start on Assignment 2, please first:

- Read through all course materials on “practice data management and analysis in Microsoft Excel”, namely, lecture slides P03 (including the online tutorial links in the slides) and the additional online tutorials listed on syllabus. There are also many YouTube videos on simple functions, e.g., AVERAGE: teaching video, nested IF: teaching video, pivot table: teaching video.

Complete the exercises in HLTH2411_P03_Excel_Exercises.xlsx – use the 3_Excel_Exercise_Sample_Answer.xlsx for reference.

Please contact me with specific (troubleshooting) questions while you work through the above readings and exercises. After you complete these two steps, you will have learnt all the Excel functions and formulas as well as pivot tables and pivot chart needed to answer Assignment 2 questions. Then we may have more relevant discussion from there.

Once you complete these exercises, I would suggest you work on the Assignment 2 questions one question at a time and in sequence.

Q2.a While calculating the total PHQ scores in Exercise 2.1 Q2.a, how do I exclude those not completing all 9 PHQ questions?

A. For Q2.a, you should easily identify that you will need to use the function SUM to calculate total PHQ-A score in the “Total” column (Column P in the “PHQ- A_Pre_Counselling” worksheet). For instance, in the P2 cell: =SUM(F2:N2). Now, before applying the SUM function to the entire Column P, you may apply filter on “# Q answered” column (Column O) to hide those not completing 9 questions (by selecting only “9” to be visible). Btw, when you apply a filter, the Excel screen (PC version at least) left bottom should display the number of rows in view, which incidentally answers the Q2 question. After applying the filter on Column O, you may apply the SUM function across the whole Total_Score column (Column P) and get the values for all students who have answer 9 PHQ-A questions. Then you may use functions AVERAGE and MEDIUM to answer Q2.a.

Q2.b How to get the type of depression?

A. Right after Q2.a operations above, you should have the filter on Column O and the “Total” values in Column P, then you may proceed to Q2.b. Similar to the Nested IF / IFS function in HLTH2411_P03_Excel_Exercise_Sample_Answer.xlsx (Worksheet “BP_pre- trial_cleaned” Column H or Worksheet “GAD_pre-counsel” Column J), you may calculate Depression profile in the ‘PHQ-A_Pre_Counselling’ worksheet ‘Depression’ column (Column Q), e.g., for Q2 cell: =IF(P2<5,”1. No Symptom”,IF(P2<10,”2. Mild Symptom”,IF(P2<15,”3. Moderate Symptom”,IF(P2<20,”4. Moderately severe Symptom”,”5. Severe Symptom”)))). Once you have values in Column Q, you may use pivot table then pivot chart to present the depression profile in a chart.

a. N.B. remember to clear the filter on Column O after Q2.a and Q2.b, because Q3 should use different filters.

Q3. How to get self-harm risks?

A. Again, before moving on to Q3, you must clear the filter on Column O, and you might apply other filters, e.g., on age (12-19 in Column C), gender (male, female in Column B), and Q9 (not blank in Column N). And similar to Q2.b above, you may use the nested IF / IFS to calculate Column R (self-harm risk) according to Q9 values.

Q3.a Is t-test appropriate for this?

A. No, two independent samples t-test is for comparing the means of a normally distributed interval dependent variable between two independent groups. So t-test would not be appropriate for Q3, as the PHQ-A Q9 values of 0 and 1 both indicate low risk, which suggest you must use the categorical variable of Self-harm risk (Column R) instead of the numerical variable of Q9 (Column N). And the categorical variable comparison requires chi-squared test. For more support on how to choose a statistical test, I always use this guide.

Q3.b How to calculate the p-value?

A. Once you have values in Column R, you may use pivot table to summarize the data into the “observed table”, e.g.:

Row Labels | 1. Low | 2. Medium | 3. High | Grand Total |

Female | 83 | 24 | 23 | 130 |

Male | 32 | 7 | 16 | 55 |

Grand Total | 115 | 31 | 39 | 185 |

Then calculate your “expected table”, which looks very similar to the above “observed table” without the ‘actual range’ 6 cells (from 83 to 16), as per below table:

Row Labels | 1. Low | 2. Medium | 3. High | Grand Total |

Female | 130 | |||

Male | 55 | |||

Grand Total | 115 | 31 | 39 | 185 |

The ‘expected range’ 6 cells in the above expected table are calculated by a simple formula: Row Total * Column Total / Grand Total. For example, the first cell would be

=130*115/185.

Once you have filled up the expected table 6 cells using the above formula, you can easily calculate the p value, by a simple function CHITEST().

The Q3.b question is very similar to two exercises in HLTH2411_P03_Excel_Exercises.xlsx – the “Survival_Dataset” Q and “antihypertensive_trial_analysis” Q3. See P03 lecture slides (Slide 10) and HLTH2411_P03_Excel_Exercise_Sample_Answer.xlsx for tutorials/examples. A well- paced demonstration of these steps is here.

Q4. For PHQ-A questions, does 0 count as an answer or not?

A. The “0” answers to 9 PHQ-A questions are valid answers. Please read the assignment guideline carefully, including Table 1 on Page 1 which explains what “0” means in PHQ- A answers.

Q5. For Exercise 2.2, the “PHQ-A_Pre-counseling” worksheet has 251 rows but the “PHQ- A_Post_Counselling” only goes to 248 rows. As they are misaligned, how to copy pre- counselling information into post-counselling sheet?

A. The ‘missing’ rows in the “PHQ-A_Post_Counselling” worksheet are by design. I.e., not every participant completed the follow-up questionnaire after counselling. So you can

_not_ simply use copy&paste to get pre-counselling data into the post-counselling worksheet. Instead, use the function VLOOKUP. A tutorial on simple VLOOKUP is available here. And I will explain more below.

Q6. For Exercise 2.2, how to get the “# Q answered – Pre” and “# Q answered – Post” values in the ‘PHQ-A_Post_Counselling’ worksheet?

A. You may use VLOOKUP function for the former and COUNTA function for the latter. For example, for the K2 cell: =VLOOKUP(A2,’PHQ-A_Pre_Counselling’!A:O,15,FALSE). For L2 cell: =COUNTA(B2:J2).

Once you have values in the K and L columns, you may use filters on both columns to answer the Exercise 2.2 Q1. Then with the two filters on, you may calculate the pre- counselling total score (Column M, e.g., using VLOOKUP function) and post-counselling total score (Column N, e.g., using SUM function) for Q2. The Exercise 2.2 is very similar to the exercises in HLTH2411_P03_Excel_Exercises.xlsx – “antihypertensive_trial_analysis” Q5 and “Anxiety_counselling_analysis” Q3. See HLTH2411_P03_Excel_Exercise_Sample_Answer.xlsx for calculation examples.

Q7. Why does the BB preview of my Excel file look skewed?

A. Web browser has its limitation. Therefore, for grading I will download your last submission by the deadline and open in Excel. I would strongly suggest you always check on BB after submitting an assignment by downloading it again. The button next to your submission (which looks like a downward-pointing arrow) allows you to download the file. You should always check your submission by downloading it then opening it up locally again. To help me credit assignment grade to the right student, you must include your name in your assignment file name, as required in the assignment guideline – for Assignment 2: A2_ Firstname_Lastname.xlsx.

Now, there are multiple solutions to Assignment 2, some students might prefer pivot tables to filters. All solutions are acceptable as long as you analyze the data correctly. To score all points, please analyze your data carefully in order to answer each question correctly. And submit your Excel file on BB before the deadline