Data mining is the process of discovering new knowledge, in the form of patterns and relationships, in large data sets. It helps uncover knowledge that was previously impossible to obtain from a data set with traditional methods. Modern data mining is well equipped to discover useful knowledge or patterns from unstructured data such as web traffic, emails and social media content. To do so, it uses a range of machine learning algorithms and modern statistical techniques.
This unit will introduce the theoretical foundation of data mining and a range of data mining processes and techniques. The unit will also provide hands-on experience in developing data mining applications using an appropriate programming language or data mining tool.
Topics included in this unit are: data mining terminology; the scope of data mining, including classification, regression and clustering methods and techniques; association pattern mining; mining time series data; and mining text data.
On successful completion of this unit, students will appreciate the theoretical and technical concepts of data mining and its techniques and processes, and will have gained hands-on experience in implementing data mining techniques using a programming language such as Python or R, or a tool such as Weka, KNIME or Excel.
As a result, students will develop skills such as communication literacy, critical thinking, analysis, reasoning and interpretation, which are crucial for gaining employment and developing academic competence.
It is assumed that students will have some knowledge of data analytics and machine learning, or will have completed Unit 12: Data Analytics and Unit 26: Machine Learning.
By the end of this unit students will be able to:
LO1. Discuss the historical and theoretical foundation of data mining, its scope, techniques, and processes.
LO2. Investigate a range of data mining techniques to discover patterns and relationships in large data sets.
LO3. Illustrate how a data mining algorithm performs text mining to identify relationships within text.
LO4. Evaluate a range of graph data mining techniques that recognise patterns and relationships in graph-based technologies.