Availabilities:

Not currently available in 2019

Unit Summary

Unit type

PG Coursework Unit

Credit points

12

AQF level

9

Level of learning

Advanced

Unit aim

Equips students with the tools to build contemporary Big Data processing and analysis systems. Students learn how to design and implement each stage of the machine learning pipeline, from acquiring and cleaning data to analysing datasets (including natural language datasets) and visualising the insights obtained.

Unit content

Topic 1: Introduction to Big Data

Topic 2: Big Data pre-processing

Topic 3: Big Data technologies: Hadoop, Scala and Spark

Topic 4: Using Spark through Python

Topic 5: Dimension reduction

Topic 6: Data visualisation

Topic 7: Managing time-series data

Topic 8: Advanced TensorFlow, NumPy and Pandas functionalities

Topic 9: Introduction to natural language processing and text mining

Topic 10: Processing raw text

Topic 11: Classifying text

Topic 12: Recent advances in Big Data processing

Learning outcomes

Unit Learning Outcomes express learning achievement in terms of what a student should know, understand and be able to do on completion of a unit. These outcomes are aligned with the graduate attributes. The unit learning outcomes and graduate attributes are also the basis of evaluating prior learning.

On completion of this unit, students should be able to:

  1. identify, manipulate and apply Big Data storage and processing technologies
    • GA1
    • GA4
  2. acquire and clean large data sets
    • GA4
  3. extract features and model large data sets
    • GA1
    • GA4
  4. design algorithms and architectures for processing large data sets to extract patterns and information
    • GA1
    • GA2
  5. develop natural language processing algorithms for text mining
    • GA1
    • GA2
    • GA4