Availabilities:

Location     Domestic       International
Gold Coast
Melbourne    N/A            Session 1, 2
Online       Session 1, 2   Session 1, 2
Perth        N/A            Session 1, 2
Sydney       N/A            Session 1, 2

Unit Summary

Unit type

PG Coursework Unit

Credit points

12

Unit aim

Equips students with the tools to build contemporary Big Data processing and analysis systems. Students learn to develop each stage of the machine learning pipeline, from acquiring and cleaning data to analysing and visualising insights obtained from datasets, including natural language datasets.

Unit content

Topic 1: Introduction to Big Data

Topic 2: Big Data pre-processing

Topic 3: Big Data technologies: Hadoop, Scala and Spark

Topic 4: Using Spark through Python

Topic 5: Dimension reduction

Topic 6: Data visualisation

Topic 7: Managing time-series data

Topic 8: Advanced TensorFlow, NumPy and Pandas functionalities

Topic 9: Introduction to natural language processing and text mining

Topic 10: Processing raw text

Topic 11: Classifying text

Topic 12: Recent advances in Big Data processing

Learning outcomes

Unit Learning Outcomes express learning achievement in terms of what a student should know, understand and be able to do on completion of a unit. These outcomes are aligned with the graduate attributes. The unit learning outcomes and graduate attributes are also the basis of evaluating prior learning.

On completion of this unit, students should be able to:

  1. identify, manipulate and apply Big Data storage and processing technologies
  2. acquire and clean large data sets
  3. extract features and model large data sets
  4. design algorithms and architectures for processing large data sets to extract patterns and information
  5. develop natural language processing algorithms for text mining

Prescribed texts

  • Bird, S, Klein, E & Loper, E, 2009, Natural language processing with Python: Analyzing text with the natural language toolkit, O’Reilly Media.
  • McKinney, W, 2014, Python for data analysis: Data wrangling with Pandas, NumPy, and IPython, 1st edn, 3rd Release, O’Reilly Media.

Prescribed texts may change in future teaching periods.