City University of Hong Kong Information on a Course [PDF]

City University of Hong Kong Information on a Course offered by Department of Computer Science with effect from Semester A in 2014 / 2015

Part I

Course Title: Fundamentals of Data Science
Course Code: CS3481
Course Duration: One Semester
No. of Credit Units: 3
Level: B3
Medium of Instruction: English
Prerequisites: (Course Code and Title) CS2204 Fundamentals of Internet Applications Development
Precursors: (Course Code and Title) Nil
Equivalent Courses: (Course Code and Title) Nil
Exclusive Courses: (Course Code and Title) CS4483 Data Warehousing and Data Mining

Part II

1. Course Aims: This course aims to explore the important field of data science. The syllabus covers the main techniques in statistical data modelling, and algorithms in data science, which include predictive modelling, cluster analysis, association rule mining and text mining. In addition, different applications of data science techniques in the real world such as web mining, business analytics and health informatics will be discussed.

2. Course Intended Learning Outcomes (CILOs) (state what the student is expected to be able to do at the end of the course according to a given standard of performance)

Upon successful completion of this course, students should be able to:

No. | CILOs | Weighting (if applicable)

1. identify the main characteristics of different techniques in data science through observation of their operations;

2. perform a critical assessment of current techniques in data science;

3. implement the main algorithms in data science in a computationally efficient way;

4. propose new solutions for real world information analytics problems by improving and combining current data science techniques.

3. Teaching and learning Activities (TLAs) (designed to facilitate students’ achievement of the CILOs)

Teaching pattern: Suggested lecture/tutorial/laboratory mix: 2 hrs. lecture; 1 hr. tutorial

ILO No. | TLAs | Hours/week (if applicable)

CILO 1, CILO 2: Lecture – This course will focus on introducing the fundamental and state-of-the-art techniques in data science. The exact set of topics to be covered may vary from year to year, depending on the trend and direction of this emerging field. For each topic, the instructor will introduce background information and fundamental techniques. Students are then required to gain a comprehensive understanding of these techniques by reading related literature and implementing a selected set of algorithms for experiment and performance comparison.

CILO 1, CILO 2: Assignment – This assignment gives students an opportunity to identify the characteristics of different techniques in data science. Students are also required to search the literature and perform a critical assessment of recent data science methodologies. This activity supports Course ILOs #1, 2.

CILO 3, CILO 4: Project – There will be two projects. The first project gives students an opportunity to implement existing algorithms in data science in a computationally efficient way. The second project allows students to create new designs for information analytics systems. This activity supports Course ILOs #3, 4.

4. Assessment Tasks/Activities (designed to assess how well the students achieve the CILOs)

ILO No. | Type of assessment tasks/activities | Weighting (if applicable) | Remarks

CILO 1
Coursework: The quality of students’ assignments will be used to assess this ILO.
Examination: The final examination will include questions to assess the students’ understanding of various topics of data science.

CILO 2
Coursework: Students are required to perform a critical assessment of different data science techniques in the assignment. The quality and relevance of their critiques will be a measure of this ILO.
Examination: The final examination will include questions to test the students’ capability to perform critical assessment of data science methods.

CILO 3
Coursework: In one of the projects, students will implement existing algorithms for specific problems in information analytics. The quality of the implementation will be the measure of this ILO.

CILO 4
Coursework: In one of the projects, students will create new designs for information analytics systems. The creativity and the effectiveness of the proposed designs will be the measures of this ILO.
Examination: The final examination will include case studies to evaluate the students’ capability in proposing new solutions for challenging problems in information analytics.



5. Grading of Student Achievement: Refer to Grading of Courses in the Academic Regulations and to the Explanatory Notes.

Examination duration: 2 hours
Percentage of coursework, examination, etc.: 50% CW; 50% Exam
Grading pattern: Standard (A+, A, A-, …, F)
For a student to pass the course, at least 30% of the maximum mark for the examination must be obtained.

Assessment will be made on the students’ ability to apply suitable knowledge and techniques of data science to different situations. The equal weighting of coursework and examination assessment is due to the emphasis on both the theory and practical applications of data science.

Part III

Keyword Syllabus
Data pre-processing, statistical data modelling, predictive modelling, classifier evaluation, cluster analysis, association rule mining, data stream mining, text mining.

Syllabus

1. Knowledge discovery process
Introduction of the knowledge discovery process in three stages: data preprocessing, data mining, and knowledge representation. Basic data preprocessing techniques including data cleaning, selection, integration, transformation and reduction will be discussed.
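As an illustration of two of the preprocessing steps named above, the following minimal sketch shows one possible form of data cleaning (mean imputation of missing values) and data transformation (min-max scaling). The function names and the sample data are hypothetical and are not taken from the course materials; other cleaning and scaling strategies are equally valid.

```python
def fill_missing_with_mean(values):
    """Data cleaning: replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Data transformation: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical attribute with one missing entry.
ages = [20, None, 40, 60]
cleaned = fill_missing_with_mean(ages)
scaled = min_max_normalize(cleaned)
print(cleaned)  # [20, 40.0, 40, 60]
print(scaled)   # [0.0, 0.5, 0.5, 1.0]
```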

2. Statistical data modelling
Introduction of fundamental concepts of statistical data modelling, which include random variables, probability distribution functions, probability density functions, covariance matrix, correlation coefficient, linear regression, sampling, statistical inference and multivariate statistical analysis.
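To make three of these concepts concrete, the sketch below computes a sample covariance, the Pearson correlation coefficient, and a one-variable least-squares regression line. It assumes the n - 1 (sample) denominator for covariance; the helper names and the data are hypothetical and only serve as an illustration.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance, using the n - 1 denominator."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


def fit_line(xs, ys):
    """Simple linear regression: slope = cov(x, y) / var(x), intercept from the means."""
    slope = covariance(xs, ys) / covariance(xs, xs)
    intercept = mean(ys) - slope * mean(xs)
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(hours, scores)
r = correlation(hours, scores)
print(slope, intercept, r)
```

Because the sample points are perfectly linear, the fitted slope and intercept should be close to 2 and 1, and the correlation close to 1, up to floating-point rounding.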

3. Predictive modelling
Introduction of the main predictive modelling techniques for data science, which include decision tree, nearest neighbour classifier and naïve Bayes classifier. In addition, the issues of classification performance evaluation and model selection will be discussed.
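The nearest neighbour classifier and a basic form of performance evaluation can be sketched as follows. This is a minimal illustration that assumes Euclidean distance, majority voting among the k nearest training points, and accuracy on a held-out set as the evaluation measure; the data set and function names are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k training points closest to query.

    train is a list of (point, label) pairs; distance is Euclidean.
    """
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(train, held_out, k=3):
    """Fraction of held-out points whose predicted label matches the true label."""
    correct = sum(knn_predict(train, point, k) == label for point, label in held_out)
    return correct / len(held_out)


# Hypothetical two-class training set with well-separated groups.
train = [((1.0, 1.0), "a"), ((1.0, 2.0), "a"), ((2.0, 1.0), "a"),
         ((6.0, 6.0), "b"), ((6.0, 7.0), "b"), ((7.0, 6.0), "b")]
held_out = [((1.5, 1.5), "a"), ((6.5, 6.5), "b")]

print(knn_predict(train, (1.5, 1.5)))  # a
print(accuracy(train, held_out))       # 1.0
```

Model selection in this setting could mean choosing k by comparing held-out accuracy for several candidate values.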

4. Cluster analysis
Introduction of the main clustering techniques: partitional, hierarchical, and density-based clustering. Important algorithms such as k-means, agglomerative hierarchical clustering, and DBSCAN will be discussed. Related issues in outlier analysis and detection will be introduced.
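As an example of partitional clustering, the sketch below implements the basic k-means iteration (often called Lloyd's algorithm): assign each point to its nearest centroid, then move each centroid to the mean of its cluster, and repeat until nothing changes. The initial centroids are passed in explicitly to keep the run deterministic; this choice, the function name, and the data are assumptions made for illustration.

```python
from math import dist


def k_means(points, centroids, max_iter=100):
    """Basic k-means on tuples of floats, starting from the given centroids."""
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # Converged: centroids no longer move.
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = k_means(points, [points[0], points[3]])
print(centroids)
print(clusters)
```

In practice the result depends on the initial centroids, so implementations commonly run several random initialisations and keep the best one.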

5. Association rule mining
Introduction of the Apriori algorithm for frequent pattern mining and association rule mining, and the comparison of different measures for evaluating the association patterns. Mining of frequent patterns in data streams will also be discussed.
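The level-wise idea behind Apriori can be sketched as below: frequent k-itemsets are built only from frequent (k-1)-itemsets, and any candidate with an infrequent subset is pruned before counting. This is a simplified, unoptimised illustration that uses absolute support counts and confidence as the evaluation measure; the transaction data and function names are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for all itemsets with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count how many transactions contain each candidate, keep the frequent ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    items = {item for t in transactions for item in t}
    frequent = count({frozenset([i]) for i in items})
    result = dict(frequent)
    k = 2
    while frequent:
        # Join step: unions of frequent (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent for s in combinations(c, k - 1))}
        frequent = count(candidates)
        result.update(frequent)
        k += 1
    return result


def confidence(itemsets, antecedent, consequent):
    """confidence(A -> B) = support(A and B together) / support(A)."""
    a = frozenset(antecedent)
    return itemsets[a | frozenset(consequent)] / itemsets[a]


# Hypothetical market-basket transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
itemsets = apriori(baskets, min_support=3)
print(itemsets[frozenset({"bread", "milk"})])      # 3
print(confidence(itemsets, {"bread"}, {"milk"}))   # 0.75
```

Other evaluation measures, such as lift, can be derived from the same support counts.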

6. Text mining
Introduction of the vector space model for document representation, the term frequency-inverse document frequency (tf-idf) approach for term weighting, and proximity measures such as cosine similarity for document comparison. Different algorithms in text mining such as document clustering and text classification will also be discussed.
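The vector space model, tf-idf weighting and cosine similarity can be combined in a few lines, as in the sketch below. It assumes one common tf-idf variant (raw term count multiplied by log(N / df)) and whitespace tokenisation; several other weighting schemes exist, and the documents and function names here are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def tf_idf_vectors(documents):
    """Return the sorted vocabulary and one tf-idf vector per document.

    Assumes tf = raw term count and idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for doc in tokenized for term in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vectors.append([tf[term] * log(n_docs / df[term]) for term in vocabulary])
    return vocabulary, vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical mini-corpus: the first two documents share two terms.
docs = ["data mining finds patterns", "text mining finds topics", "cats chase mice"]
vocab, vecs = tf_idf_vectors(docs)
sim_related = cosine_similarity(vecs[0], vecs[1])
sim_unrelated = cosine_similarity(vecs[0], vecs[2])
print(sim_related, sim_unrelated)
```

The first two documents share the terms "mining" and "finds", so their similarity is positive, while the third document shares no terms with the first and scores zero. The same vectors could be fed to a clustering or classification routine.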

Recommended Reading:

Text(s):

Essential Text
Tan P. N., Steinbach M. and Kumar V. Introduction to Data Mining. 2nd Ed., Addison Wesley (2014)

Supplementary Reading
Bramer M. Principles of Data Mining. 2nd Ed., Springer (2013)
Han J. and Kamber M. Data Mining: Concepts and Techniques. 3rd Ed., Morgan Kaufmann (2011)
Witten I., Frank E. and Hall M. Data Mining: Practical Machine Learning Tools and Techniques. 3rd Ed., Morgan Kaufmann (2011)
