STATS 383 : The Science and Craft of Data Management

Science

2024 Semester Two (1245) (15 POINTS)

Course Prescription

A structured introduction to the science and craft of data management, including: data representations and their advantages and disadvantages; workflow and data governance; combining and splitting data sets; data cleaning; the creation of non-trivial summary variables; and the handling of missing data. These will be illustrated by data sets of varying size and complexity, and students will implement data processing steps in at least two software systems.

Course Overview

Applied statisticians and data scientists spend much of their time making raw datasets fit for downstream, high-level analyses. These skills are a critical component of the statistician’s and data scientist’s toolkit, and they are also valuable for those who collect data. Data management requires a combination of statistical and programming knowledge, but the principles are independent of the specific software tools; we use both R and SAS in this course to illustrate the different approaches they represent.

Course Requirements

Prerequisite: ENGSCI 314 or STATS 201 or 208, and COMPSCI 101 or ENGSCI 233 or STATS 220

Capabilities Developed in this Course

Capability 3: Knowledge and Practice
Capability 4: Critical Thinking
Capability 5: Solution Seeking
Capability 6: Communication
Capability 8: Ethics and Professionalism
Graduate Profile: Bachelor of Science

Learning Outcomes

By the end of this course, students will be able to:
  1. Demonstrate an understanding of, and appraise, the different data management models used to standardise the way datasets are organised. (Capability 3, 4, 5 and 6)
  2. Plan, prepare and implement a workflow for the reproducible curation and summarisation of data. (Capability 3, 4, 5 and 6)
  3. Explain and evaluate the ethical and regulatory considerations integral to the development of a data management workflow. (Capability 3, 4, 5, 6 and 8)
  4. Develop and demonstrate a good understanding of the methods and best practices for bulk data transformations. (Capability 3, 4, 5 and 6)
  5. Identify, describe and demonstrate an understanding of different data cleaning processes. (Capability 3, 4, 5 and 6)
  6. Develop an understanding of how to construct new variables from those in a cleaned dataset so that they are suitable for analysis. (Capability 3, 4, 5 and 6)
  7. Demonstrate simple transformations of text data and explain how these become more complicated in a multilingual and multicultural context. (Capability 3, 4, 5 and 6)

Assessments

Assessment Type    Percentage    Classification
Laboratories       40%           Individual Coursework
Online test        15%           Individual Coursework
Final Exam         45%           Individual Examination
Assessment Type Learning Outcome Addressed
1 2 3 4 5 6 7
Laboratories
Online test
Final Exam

Labs are marked and count towards your final grade but attendance is not mandatory.
Attendance in person will be required for any on-campus test and exam.
A minimum of 50% in the coursework and 50% in the exam is required to pass.
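
The pass requirement above can be sketched as a small check. This is an illustration only, not the official grading procedure: it assumes that "coursework" means the laboratories and the online test combined, weighted by their percentages, and the function name `meets_minimums` is hypothetical.

```python
# Assessment weights from the outline (percent of the final grade).
WEIGHTS = {"labs": 40.0, "test": 15.0, "exam": 45.0}


def meets_minimums(labs: float, test: float, exam: float) -> bool:
    """Check the 50% coursework and 50% exam minimums.

    Each argument is the fraction earned in that component (0.0 to 1.0).
    Assumption: coursework is labs plus online test, weighted 40:15.
    """
    coursework_weight = WEIGHTS["labs"] + WEIGHTS["test"]
    coursework = (WEIGHTS["labs"] * labs + WEIGHTS["test"] * test) / coursework_weight
    return coursework >= 0.5 and exam >= 0.5
```

Under these assumptions, a student with strong coursework but 49% in the exam would not meet the requirement, and neither would a student with a strong exam but less than 50% across the coursework.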

Tuākana

Tuākana Science is a multi-faceted programme for Māori and Pacific students providing topic specific tutorials, one-on-one sessions, test and exam preparation and more. Explore your options at
https://www.auckland.ac.nz/en/science/study-with-us/pacific-in-our-faculty.html
https://www.auckland.ac.nz/en/science/study-with-us/maori-in-our-faculty.html

Special Requirements

Tests may be conducted outside of standard hours.

Workload Expectations

This is a standard 15-point course. Students are expected to spend 150 hours per semester on each 15-point course in which they are enrolled.

For this course, a typical weekly workload includes:

  • 3 hours of lectures
  • A 2-hour lab
  • 2 hours of reviewing the course content
  • 5.5 hours of work on lab exercises, online quiz and/or test preparation
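
As a quick arithmetic check, the weekly hours listed above can be totalled and scaled to a semester. The sketch assumes 12 teaching weeks, which is not stated in this section; the variable names are illustrative.

```python
# Weekly hours as listed in the outline.
weekly_hours = {
    "lectures": 3.0,
    "lab": 2.0,
    "reviewing course content": 2.0,
    "lab exercises, quiz and test preparation": 5.5,
}

TEACHING_WEEKS = 12  # assumption: a 12-week teaching semester

weekly_total = sum(weekly_hours.values())
semester_total = weekly_total * TEACHING_WEEKS

print(f"{weekly_total} hours per week, {semester_total} hours per semester")
```

Under that assumption the weekly total of 12.5 hours matches the 150 hours expected for a 15-point course.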

Delivery Mode

Campus Experience

Attendance is expected at scheduled activities including labs to receive credit for this component of the course.
Lectures will be available as recordings. Other learning activities including labs will not be available as recordings.
The course will not include live online events.
The activities for the course are scheduled as a standard weekly timetable.
Attendance in person will be required for any on-campus test and exam.

Learning Resources

Course materials are made available in a learning and collaboration tool called Canvas which also includes reading lists and lecture recordings (where available).

Please remember that the recording of any class on a personal device requires the permission of the instructor.

Course Materials:

  • Lecture notes, lab exercises, and quizzes will be available on Canvas.

Student Feedback

During the course, Class Representatives in each class can take feedback to the staff responsible for the course and to staff-student consultative committees.

At the end of the course students will be invited to give feedback on the course and teaching through a tool called SET or Qualtrics. The lecturers and course co-ordinators will consider all feedback.

Your feedback helps to improve the course and its delivery for all students.

In 2024, a 2-hour Introduction to SAS workshop may replace the week 1 lab to assist students who are new to SAS in becoming familiar with the software.

Other Information

Assignments consist of lab exercises and 10 quizzes, which together account for 40% of the final grade. Only the top 8 quiz scores out of the 10 count towards the overall lab grade. Each of these top 8 lab marks contributes up to 5% of the final grade: 1% for in-person lab attendance and 4% for the quiz. Students are normally expected to attend labs in person; only in exceptional circumstances will a student be excused from in-person attendance without forfeiting the 1% for attendance.
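
The lab grade calculation described above could be sketched as follows. This is a minimal illustration, not the official marking scheme. It assumes that labs are ranked by quiz score (ties broken in favour of attended labs) and that the attendance mark is taken from the same labs as the top 8 quiz scores; the function name `lab_grade` and its parameters are hypothetical.

```python
def lab_grade(quiz_scores, attended, best_n=8, quiz_pct=4.0, attend_pct=1.0):
    """Return the lab component as a percentage of the final grade (0 to 40).

    quiz_scores: one fraction (0.0 to 1.0) per quiz.
    attended:    one bool per lab, True if attended in person.
    Assumption: the best_n labs are chosen by quiz score, and attendance
    counts only for those labs.
    """
    if len(quiz_scores) != len(attended):
        raise ValueError("need one attendance flag per quiz")
    # Sort labs from highest to lowest quiz score; ties favour attended labs.
    ranked = sorted(zip(quiz_scores, attended), reverse=True)
    best = ranked[:best_n]
    # Each counted lab is worth up to quiz_pct + attend_pct of the final grade.
    return sum(quiz_pct * q + attend_pct * (1.0 if a else 0.0) for q, a in best)
```

For example, under these assumptions a student who attends every lab and scores full marks on any 8 of the 10 quizzes would earn the full 40%.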

Lab exercises will be distributed at the beginning of weeks 1-5 and weeks 7-12. The weekly quiz questions will be based on the lab exercises of the current week. Quiz questions will become accessible to students at the start of their 2-hour lab session and will close at the end of the session, ensuring that quiz questions are completed during the lab hours.

Academic Integrity

The University of Auckland will not tolerate cheating, or assisting others to cheat, and views cheating in coursework as a serious academic offence. The work that a student submits for grading must be the student's own work, reflecting their learning. Where work from other sources is used, it must be properly acknowledged and referenced. This requirement also applies to sources on the internet. A student's assessed work may be reviewed for potential plagiarism or other forms of academic misconduct, using computerised detection mechanisms.

Class Representatives

Class representatives are students tasked with representing student issues to departments, faculties, and the wider university. If you have a complaint about this course, please contact your class rep who will know how to raise it in the right channels. See your departmental noticeboard for contact details for your class reps.

Copyright

The content and delivery of content in this course are protected by copyright. Material belonging to others may have been used in this course and copied by and solely for the educational purposes of the University under license.

You may copy the course content for the purposes of private study or research, but you may not upload onto any third party site, make a further copy or sell, alter or further reproduce or distribute any part of the course content to another person.

Inclusive Learning

All students are asked to discuss any impairment-related requirements privately, face to face and/or in written form, with the course coordinator, lecturer or tutor.

Student Disability Services also provides support for students with a wide range of impairments, both visible and invisible, to succeed and excel at the University. For more information and contact details, please visit the Student Disability Services’ website http://disability.auckland.ac.nz

Special Circumstances

If your ability to complete assessed coursework is affected by illness or other personal circumstances outside of your control, contact a member of teaching staff as soon as possible before the assessment is due.

If your personal circumstances significantly affect your performance, or preparation, for an exam or eligible written test, refer to the University’s aegrotat or compassionate consideration page https://www.auckland.ac.nz/en/students/academic-information/exams-and-final-results/during-exams/aegrotat-and-compassionate-consideration.html.

This should be done as soon as possible and no later than seven days after the affected test or exam date.

Learning Continuity

In the event of an unexpected disruption, we undertake to maintain the continuity and standard of teaching and learning in all your courses throughout the year. If there are unexpected disruptions the University has contingency plans to ensure that access to your course continues and course assessment continues to meet the principles of the University’s assessment policy. Some adjustments may need to be made in emergencies. You will be kept fully informed by your course co-ordinator/director, and if disruption occurs you should refer to the university website for information about how to proceed.

The delivery mode may change depending on COVID restrictions. Any changes will be communicated through Canvas.

Student Charter and Responsibilities

The Student Charter assumes and acknowledges that students are active participants in the learning process and that they have responsibilities to the institution and the international community of scholars. The University expects that students will act at all times in a way that demonstrates respect for the rights of other students and staff so that the learning environment is both safe and productive. For further information visit Student Charter https://www.auckland.ac.nz/en/students/forms-policies-and-guidelines/student-policies-and-guidelines/student-charter.html.

Disclaimer

Elements of this outline may be subject to change. The latest information about the course will be available for enrolled students in Canvas.

In this course, students may be asked to submit coursework assessments digitally. The University reserves the right to conduct scheduled tests and examinations for this course online or through the use of computers or other electronic devices. Where tests or examinations are conducted online, remote invigilation arrangements may be used. In exceptional circumstances, changes to elements of this course may be necessary at short notice. Students enrolled in this course will be informed of any such changes, and the reasons for them, as soon as possible through Canvas.