PUBH 7006 Data Management and Programming for Epidemiology
Credit Points 10
Legacy Code 401179
Coordinator Sandro Martins Sperandei
Description Modern epidemiology deals with ever-increasing volumes of data and complexity of analysis. This course equips students with effective practices for managing data and program code and for ensuring the security of their data. Students will be taught the fundamentals of managing code and data in a revision control system, as well as good programming practices and techniques that can form the basis of a robust, repeatable and test-driven research methodology. Programming instruction and exercises will use the SAS and R languages and SQL databases.
School Medicine
Discipline Epidemiology
Student Contribution Band HECS Band 2 10cp
Level Postgraduate Coursework Level 7 subject
Co-requisite(s) HLTH 7008
Restrictions
Students must be enrolled in a postgraduate program.
Assumed Knowledge
High school mathematics (arithmetic, formulas and algebra, reading graphs). Basic computer competency and basic programming skills.
Learning Outcomes
On successful completion of this subject, students should be able to:
- Explain key concepts and rationale for revision control systems
- Use basic Git commands with local and remote repositories as part of personal and collaborative workflows
- Explain key concepts of data types, variables, working memory and storage
- Write simple programs that use loops and conditional statements, and simple functions and/or macros that accept parameters
- Explain key concepts of relational databases
- Write simple SQL queries to select and subset data, join two tables, and use indexes to improve efficiency
- Read data into R from text, CSV and spreadsheet files, from an SQL database, and from online data sources
- Describe strategies and techniques for data checking and cleaning, and demonstrate ability to use some of these techniques in simple data manipulation tasks
- Describe strategies and techniques for detecting logic and other programming errors, and demonstrate ability to use these techniques in the context of a small data preparation and analysis project
- Explain key concepts of information security, describe strategies for ensuring that data is stored and transmitted securely, and demonstrate ability to use basic encryption technologies safely
Subject Content
- Introduction to basic computing concepts and methods
- Database principles and good practices
- Essential programming review 1: data types, storage, loops, conditionals
- Essential programming review 2: subroutines (and macros) and functions
- Essential database review: tables, rows and columns, SELECT, WHERE, indexes, basic JOIN
- Reading data in: from files, spreadsheets, databases, the web
- Data cleaning and preparation: tools and strategies
- Robust techniques to protect against (and detect) programming logic errors when manipulating data
- Introduction to information security: principles and good practices for keeping data safe and preserving confidentiality
Assessment
The following table summarises the standard assessment tasks for this subject. Please note this is a guide only. Assessment tasks are regularly updated; where there is a difference, your Learning Guide takes precedence.
Type | Length | Percent | Threshold | Individual/Group Task |
---|---|---|---|---|
Essay | 1,000 words | 20 | N | Individual |
Applied Project | 12 hours | 30 | N | Individual |
Applied Project | 18 hours | 50 | N | Individual |
Teaching Periods
Autumn (2022)
Online
Subject Contact Sandro Martins Sperandei
Parramatta - Victoria Rd
Day
Subject Contact Sandro Martins Sperandei
Autumn (2023)
Online
Subject Contact Sandro Martins Sperandei
Parramatta - Victoria Rd
On-site
Subject Contact Sandro Martins Sperandei