Linear and Logistic Regression Assignment

SECTION A - Case-control study

The dataset "data_assessment2_ccstudy.dta" provides data from 560 patients admitted to hospital (in a region with malaria) who are part of a hypothetical nested case-control study. There are 140 patients who died within 1 year of hospital admission (cases) and 420 controls. The cases and controls were selected from a larger cohort study of 13000 patients, of whom 140 died within 1 year of follow-up. Sex and age were routinely recorded in the hospital admission records. For the case-control study, further information (haemoglobin level and malaria infection status on admission) was extracted from laboratory data records.

The variables in this dataset are:

Variable name    Description
id               Unique identifier
dead             Died within 1 year of hospital admission (0 = control, 1 = case)
age              Age at baseline (years)
haemoglobin      Haemoglobin level at baseline (g/dL)
malaria          Malaria at baseline (0 = no malaria, 1 = malaria)
male             Sex of patient (0 = female, 1 = male)

We will use multivariable logistic regression to investigate the evidence for an association between haemoglobin and death, controlling for the possibility that this association is confounded by other exposure variables that appear in the dataset.

1. Description of study sample

a) Present histograms for age and haemoglobin and describe the distribution of these variables in terms of approximate normality and appropriate measures of centrality and spread.

b) Provide a table that summarises the distribution of age, sex, haemoglobin, and malaria with separate columns for those who died (cases) and the controls (remember this is a case-control study).

c) Using the information regarding the numbers of patients who died in the 1 year follow-up period in the cohort study, estimate the odds of death for a patient in the cohort study.

d) Calculate the estimated odds of death in the case-control study. Why isn't this estimate equal to the odds of death in the cohort study (calculated in 1c)?
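The steps in question 1 could be approached along the following lines. This is only a sketch: it assumes the dataset sits in the current working directory, the graph names (hist_age, hist_hb) are illustrative, and the choice of summary statistics is one reasonable option rather than a required one. The two display lines use only the counts stated in the study description.

```stata
use "data_assessment2_ccstudy.dta", clear

* 1a) Histograms with a normal curve overlaid, plus detailed summaries
histogram age, normal name(hist_age, replace)
histogram haemoglobin, normal name(hist_hb, replace)
summarize age haemoglobin, detail

* 1b) Numerical variables summarised separately for controls and cases
tabstat age haemoglobin, by(dead) statistics(n mean sd p50 p25 p75)
* Binary variables by case status, with column percentages
tabulate male dead, column
tabulate malaria dead, column

* 1c) Odds of death in the cohort: 140 deaths among 13000 patients
display 140 / (13000 - 140)

* 1d) Odds of being a case in the case-control sample: 140 cases, 420 controls
display 140 / 420
```

The second figure is fixed by the 1:3 sampling ratio of cases to controls chosen by the investigators, which is worth keeping in mind when answering 1d).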

2. Univariable logistic regression models

Consider the two univariable logistic regression models of the outcome dead on the variables male and malaria.

a) Present in a table the estimated odds ratios, 95% confidence intervals for the population odds ratio and p-values for the two separate simple logistic regressions.

b) Interpret the odds ratio for the univariable logistic regression of death on malaria (an interpretation of the p-value and confidence interval is not required). Since the study population for this case-control study is hospital in-patients, what further information might you want regarding the patients without malaria infection on admission?
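A minimal sketch of the two univariable models in question 2, assuming the case-control dataset is already loaded. The logistic command reports odds ratios with 95% confidence intervals and p-values directly; the i. prefix treats each binary exposure as a factor variable with 0 as the reference level.

```stata
* Odds ratio of death for males versus females
logistic dead i.male

* Odds ratio of death for patients with versus without malaria on admission
logistic dead i.malaria
```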

3. Linear association between exposure & outcome

We must decide whether it is reasonable to assume a linear association between the numerical exposure variables, age and haemoglobin, and the log odds of death.

a) Create a new variable in the dataset containing quintiles of age using the xtile command:

xtile age_q5=age, nq(5)

Use Stata to plot the log odds of death versus age_q5.

Note: please use the following Stata command and options:

ciplot yscale(log) yscale(range(0.5 2)) ylabel(0.25 0.5 0.75 1 1.5 2)

[Note: for earlier versions of Stata you may need to replace "ciplot" above with "graph".]

Briefly summarise the plot, by describing whether the association looks linear.

b) Using the variable age_q5, fit separate simple logistic regression models with age_q5 as a categorical variable and as a continuously valued variable. Compare the models using the likelihood ratio test and comment on whether the association between log odds of death and age is linear.

c) Repeat parts 3a) & 3b) to investigate whether the association between haemoglobin and the log odds of death is linear. Briefly comment on whether the association is linear and state the null hypothesis being tested here.
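The model comparison in 3b) and 3c) might be set up as below. This sketch assumes age_q5 has already been created with the xtile command given above; hb_q5 and the stored estimate names (age_cat, age_lin, hb_cat, hb_lin) are illustrative names, not part of the assignment.

```stata
* Age quintiles as categories: a separate odds ratio for each fifth of age
logistic dead i.age_q5
estimates store age_cat

* Age quintile as a single linear term: one odds ratio per quintile step
logistic dead age_q5
estimates store age_lin

* Likelihood ratio test of the linear model (nested) against the categorical one
lrtest age_cat age_lin

* Same comparison for haemoglobin (hb_q5 is a hypothetical variable name)
xtile hb_q5 = haemoglobin, nq(5)
logistic dead i.hb_q5
estimates store hb_cat
logistic dead hb_q5
estimates store hb_lin
lrtest hb_cat hb_lin
```

In each test the null hypothesis is that the linear term describes the association adequately, so a small p-value is evidence of departure from linearity.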

4. Multivariable logistic regression models - Confounding

Now use univariable logistic regression to estimate the unadjusted odds ratios of death for haemoglobin and all three potential confounders (age, sex, and malaria). Then use multivariable logistic regression (including all four variables) to estimate the adjusted odds ratios. Include haemoglobin and age as categorical variables with the following groupings - age (< 3 & ≥ 3 years, with ≥ 3 years as the reference group) and haemoglobin (<9 (low), 9-14 (normal), >14 (high) g/dL, with the haemoglobin group 9-14 g/dL set as the reference group) [Hint - use 'gen' and 'replace' commands to create new variables].
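One way to build the grouped variables with gen and replace, as the hint suggests. The names age_grp and hb_grp are illustrative, and the reference groups are coded 0 so that Stata's factor-variable notation picks them as the base level by default. How to treat a haemoglobin value of exactly 9 or 14 g/dL follows the stated grouping (9-14 is the reference).

```stata
* Age group: 0 = 3 years or older (reference), 1 = under 3 years
gen age_grp = 0 if !missing(age)
replace age_grp = 1 if age < 3

* Haemoglobin group: 0 = 9-14 g/dL (reference), 1 = low (<9), 2 = high (>14)
gen hb_grp = 0 if !missing(haemoglobin)
replace hb_grp = 1 if haemoglobin < 9
replace hb_grp = 2 if haemoglobin > 14 & !missing(haemoglobin)

* Unadjusted odds ratios, one exposure at a time
logistic dead i.hb_grp
logistic dead i.age_grp
logistic dead i.male
logistic dead i.malaria

* Adjusted odds ratios from the model with all four variables
logistic dead i.hb_grp i.age_grp i.male i.malaria
```

The extra !missing() condition on the high group matters because Stata treats missing values as larger than any number.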

a) Present in a table two columns - the unadjusted Odds Ratios (95% Confidence Intervals) and the adjusted Odds Ratios (95% Confidence Intervals) for the association between haemoglobin, age, sex, and malaria and the odds of death.

b) Comment on any confounding observed by considering any changes in the odds ratio of haemoglobin (categorical version) from the univariable to the multivariable logistic regression.

c) Investigate the confounding by exploring any univariable associations (in the controls only) between haemoglobin and the potential confounders.

d) Comment on the associations between the potential confounders and the outcome (after adjusting for the exposure of interest, haemoglobin). Together with what you found in 4c, comment on which variables are confounding the association between haemoglobin and death.
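For 4c), the associations between haemoglobin and each potential confounder can be examined in the controls only by adding if dead == 0. The sketch below uses haemoglobin and age on their original continuous scales, which is an assumption; the grouped versions could be cross-tabulated instead.

```stata
* Mean haemoglobin by malaria status and by sex, controls only
ttest haemoglobin if dead == 0, by(malaria)
ttest haemoglobin if dead == 0, by(male)

* Correlation between haemoglobin and age, controls only
pwcorr haemoglobin age if dead == 0, sig
```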

5. Final presentation of results and Stata do file

a) Please write a summary (abstract) based on the analyses you performed in the previous questions to answer the research question "Is there an association between haemoglobin and death?" (maximum word count of 200). Your summary should have the headings: Aim, Study Design, Statistical Methods, Results.

b) Please provide a copy of your Stata do-file for performing the statistical analyses required for questions 1 to 5. Do not upload a second file when submitting your assignment; instead, copy and paste the Stata do-file into your Word document.

SECTION B

The dataset "data_assessment2_lupus.dta" provides cross-sectional data from 60 women who have Systemic Lupus Erythematosus (SLE), a chronic, multisystem autoimmune disease. The treatment for SLE often involves steroid therapy. The clinical researcher is particularly interested in bone loss in SLE and the impact of steroid usage. She is seeking your assistance in analysing a dataset she has compiled consisting of bone mineral density at one location (left hip), whether steroids had ever been prescribed or not, and the patient's age and smoking history (ever/never).

The variables in the dataset are:

Variable name    Description
patid            Patient identification number
hipbmd           Bone mineral density measurement at the left hip (mg/cm²)
ster_evr         Steroid usage (0 = never, 1 = ever)
age              Age (years)
smoker           Smoking history (0 = never smoker, 1 = ever smoker)

The research question of interest is:

Is the relationship between steroid usage and bone mineral density modified by smoking and age?

6. Linear association between age & hipbmd

a) Assess both visually and statistically (by including an additional squared term of age in the model) whether it is reasonable to assume a linear association between hipbmd and age.
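A possible sketch for question 6, assuming the lupus dataset is in the working directory. The lowess smoother is one choice of visual check, not the only one; c.age##c.age is factor-variable shorthand that fits both age and age squared.

```stata
use "data_assessment2_lupus.dta", clear

* Visual check: scatter plot with a lowess smoother overlaid
twoway (scatter hipbmd age) (lowess hipbmd age)

* Statistical check: the coefficient on c.age#c.age is the squared term
regress hipbmd c.age##c.age
```

The p-value for the c.age#c.age coefficient tests the null hypothesis that the squared term is zero, that is, that a straight line is adequate.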

7. Univariable and multivariable linear regression

Perform univariable linear regression to obtain the unadjusted associations between the outcome hipbmd and steroid usage, age and smoking. Following this, perform multivariable linear regression including all three covariates.
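The four models described here could be fitted as follows. This sketch assumes the lupus dataset is loaded and that age enters as a linear term, which depends on the conclusion reached in question 6.

```stata
* Unadjusted associations, one covariate at a time
regress hipbmd i.ster_evr
regress hipbmd age
regress hipbmd i.smoker

* Adjusted associations: all three covariates together
regress hipbmd i.ster_evr age i.smoker
```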

a) Present in a single table the estimates, 95% confidence intervals and p-values of the univariable and multivariable linear regression analyses with separate columns for the unadjusted and adjusted estimates.

b) Interpret the adjusted association between steroid usage and bone mineral density.

c) Investigate if the association between steroid usage and bone mineral density is modified by smoking, after controlling for age.

d) Investigate if the association between steroid usage and bone mineral density is modified by age, after controlling for smoking.
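Effect modification in 7c) and 7d) is commonly assessed by adding an interaction term. The sketch below assumes the lupus dataset is loaded; the ## operator includes both main effects and their product.

```stata
* 7c) Steroid-by-smoking interaction, controlling for age
regress hipbmd i.ster_evr##i.smoker age

* 7d) Steroid-by-age interaction, controlling for smoking
regress hipbmd i.ster_evr##c.age i.smoker
```

In each model the coefficient on the interaction term estimates how much the steroid effect differs across levels of the modifier, and its p-value tests the null hypothesis of no effect modification.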

8. (Concluding statement; 5 marks) Describe for the clinician in a single paragraph the results of your statistical analyses, in particular, addressing her research question (maximum 100 words).

Assignment link - https://www.dropbox.com/s/gs0ksj8lmmmgcpi/Assignment.zip?dl=0.
