# STATS 762: Regression for Data Science

## Science

### Course Prescription

Application of the generalised linear model to fit data arising from a wide range of sources, including multiple linear regression models, Poisson regression, and logistic regression models. The graphical exploration of data. Model building for prediction and for causal inference. Other regression models such as quantile regression. A basic understanding of vector spaces, matrix algebra and calculus will be assumed.

### Course Overview

The main emphasis of this course is on analysing data using the regression methods introduced in STATS 201/208 and their extensions. The practical aspects of fitting linear and generalised linear models are reviewed including estimation, diagnostics and inference. The geometric interpretation of the linear model is presented to enhance the understanding of these models. Simulation-based procedures, including bootstrapping and cross-validation, are introduced as a means to provide robust inference and to investigate the consequences of assumption violations. The two main uses of regression models, prediction and explanation, are discussed. Issues related to predictive models, including ways to estimate prediction error, model selection criteria, model building, and ethics of predictive modelling are discussed. Issues related to explanatory models including confounding, the model choice for causal inference and causal graphs are explored. Other modern regression methods such as lasso and quantile regression are introduced.

Emphasis is on practical application, providing students with a versatile statistical toolbox useful for a range of fields in both academia and industry, including applied statistics, data science, and almost all subjects in Business and Economics, along with any experimental or social science.

### Course Requirements

Prerequisite: STATS 707 or 210 or 225, and 15 points from STATS 201, 207, 208 or a B+ or higher in BIOSCI 209

Restriction: STATS 330

### Capabilities Developed in this Course

• Capability 1: Disciplinary Knowledge and Practice
• Capability 2: Critical Thinking
• Capability 3: Solution Seeking
• Capability 4: Communication and Engagement
• Capability 5: Independence and Integrity

### Learning Outcomes

By the end of this course, students will be able to:
1. Describe the components of the generalised linear model and identify suitable applications for different types of GLMs. (Capability 1 and 4)
2. Apply standard diagnostic procedures to identify issues with a fitted statistical model and identify appropriate remedial measures. (Capability 1 and 2)
3. Use cross-validation to select a suitable predictive model for a given data set and evaluate the precision of the model. (Capability 1 and 2)
4. Create a causal graph for a given scenario and use the graph to identify confounding among the explanatory variables. (Capability 1, 2 and 3)
5. Use bootstrapping to create sampling distributions for the parameters of a generalised linear model. (Capability 1, 2 and 3)
6. Use a fitted regression model to answer specific questions about the relationship between the explanatory variables and the response. (Capability 1, 2 and 4)
7. Explore relationships between variables using appropriate graphical techniques. (Capability 1 and 2)
8. Explain the ethical risks of both accurate and inaccurate predictive modelling in society. (Capability 1, 2 and 5)

### Assessments

| Assessment Type | Percentage | Classification |
| --- | --- | --- |
| Final Exam | 50% | Individual Examination |
| Test | 20% | Individual Test |
| Assignments | 30% | Individual Coursework |

### Tuākana

Tuākana Science is a multi-faceted programme for Māori and Pacific students providing topic specific tutorials, one-on-one sessions, test and exam preparation and more. Explore your options at
https://www.auckland.ac.nz/en/science/study-with-us/pacific-in-our-faculty.html
https://www.auckland.ac.nz/en/science/study-with-us/maori-in-our-faculty.html

### Key Topics

• Generalised Linear Models
• Model Selection
• Simulation
• Bootstrapping
• Predictive models
• Explanatory models
• Regression with penalty
• Classification and regression trees
• Splines
• Gaussian process

### Workload Expectations

This course is a standard 15-point course, and students are expected to spend 10 hours per week on each 15-point course in which they are enrolled.

For this course, a typical weekly workload includes:

• 3 hours of lectures
• 1-hour tutorial
• 3 hours of reviewing the course content
• 3 hours of work on assignments and/or test preparation

### Delivery Mode

#### Campus Experience

Attendance is expected at scheduled activities including tutorials.
Lectures will be available as recordings. Other learning activities (tutorials) will not be available as recordings.
The course will not include live online events.
Attendance on campus is required for the test and the exam.
The activities for the course are scheduled as a standard weekly timetable.

### Learning Resources

Course materials are made available in a learning and collaboration tool called Canvas, which also includes reading lists and lecture recordings (where available).

Please remember that the recording of any class on a personal device requires the permission of the instructor.

Course Materials:
• Lecture slides and tutorial sheets are made available.

### Student Feedback

During the course, Class Representatives in each class can take feedback to the staff responsible for the course and to staff-student consultative committees.

At the end of the course, students will be invited to give feedback on the course and teaching through a tool called SET or Qualtrics. The lecturers and course co-ordinators will consider all feedback.

Your feedback helps to improve the course and its delivery for all students.

### Academic Integrity

The University of Auckland will not tolerate cheating, or assisting others to cheat, and views cheating in coursework as a serious academic offence. The work that a student submits for grading must be the student's own work, reflecting their learning. Where work from other sources is used, it must be properly acknowledged and referenced. This requirement also applies to sources on the internet. A student's assessed work may be reviewed against online source material using computerised detection mechanisms.

### Copyright

The content and delivery of content in this course are protected by copyright. Material belonging to others may have been used in this course and copied by and solely for the educational purposes of the University under license.

You may copy the course content for the purposes of private study or research, but you may not upload onto any third party site, make a further copy or sell, alter or further reproduce or distribute any part of the course content to another person.

### Inclusive Learning

All students are asked to discuss any impairment-related requirements privately, face to face and/or in written form, with the course coordinator, lecturer or tutor.

Student Disability Services also provides support for students with a wide range of impairments, both visible and invisible, to succeed and excel at the University. For more information and contact details, please visit the Student Disability Services website.

### Special Circumstances

If your ability to complete assessed coursework is affected by illness or other personal circumstances outside of your control, contact a member of teaching staff as soon as possible before the assessment is due.

If your personal circumstances significantly affect your performance in, or preparation for, an exam or eligible written test, refer to the University’s aegrotat or compassionate consideration page.

This should be done as soon as possible and no later than seven days after the affected test or exam date.

### Learning Continuity

In the event of an unexpected disruption, we undertake to maintain the continuity and standard of teaching and learning in all your courses throughout the year. If there are unexpected disruptions the University has contingency plans to ensure that access to your course continues and course assessment continues to meet the principles of the University’s assessment policy. Some adjustments may need to be made in emergencies. You will be kept fully informed by your course co-ordinator/director, and if disruption occurs you should refer to the university website for information about how to proceed.

The delivery mode may change depending on COVID restrictions. Any changes will be communicated through Canvas.

### Student Charter and Responsibilities

The Student Charter assumes and acknowledges that students are active participants in the learning process and that they have responsibilities to the institution and the international community of scholars. The University expects that students will act at all times in a way that demonstrates respect for the rights of other students and staff so that the learning environment is both safe and productive. For further information, visit the Student Charter.

### Disclaimer

Elements of this outline may be subject to change. The latest information about the course will be available for enrolled students in Canvas.

In this course, students may be asked to submit coursework assessments digitally. The University reserves the right to conduct scheduled tests and examinations for this course online or through the use of computers or other electronic devices. Where tests or examinations are conducted online, remote invigilation arrangements may be used. In exceptional circumstances, changes to elements of this course may be necessary at short notice. Students enrolled in this course will be informed of any such changes, and the reasons for them, as soon as possible through Canvas.

Published on 28/10/2022 11:28 a.m.