Information for local students:
|Spring 2008||6331||CSCI 682-01||DIS||TR||1230-0145||OCNL 239|
|Fall 2006||5900||CSCI 598-02||DIS||TR||1100-1215||MLIB 031|
* Archived webcast available through HorizonLive! as part of the CHICO Computer Science Program.
"Data mining, also known as knowledge-discovery in databases (KDD), is the practice of automatically searching large stores of data for patterns. To do this, data mining uses computational techniques from statistics and pattern recognition." - From Wikipedia at http://en.wikipedia.org/wiki/Data_mining
Data mining is "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data." - From W. Frawley, G. Piatetsky-Shapiro, and C. Matheus, "Knowledge discovery in databases: An overview." AI Magazine, Fall 1992, pages 213-228.
Check out these ACM Computing Reviews 2008 articles: Adversarial Information Retrieval, Computational Intelligence in Human Genetics
Data mining is "the science of extracting useful information from large data sets or databases." - From D. Hand, H. Mannila, and P. Smyth, Principles of Data Mining. MIT Press, Cambridge, MA, 2001. ISBN 0-262-08290-X.
This course introduces the student to basic concepts, tasks, methods, and techniques in data mining; in particular, the course focuses on practical machine learning tools and techniques used in data mining. Students will develop an understanding of the data mining process and issues, learn various techniques for data mining, and apply the techniques in solving data mining problems using data mining tools and systems.
Students from departments such as Statistics, Biology, Mathematics, and Electrical & Computer Engineering who are working in interdisciplinary research (e.g., bioinformatics, modeling, data analysis) are especially encouraged to take this course.
Data Mining: Practical Machine Learning Tools and Techniques, 2/e
Ian Witten and Eibe Frank, 2005.
Elsevier Inc. Burlington, Massachusetts.
Also available: Companion website for the textbook.
|Students will be required to open and maintain a Chico State Connection (CSC Portal) account.|
|Students are responsible for regularly checking their WebCT Vista account (automatically generated through the CSC Portal) to access an up-to-date on-line calendar of events, current scores, on-line quizzes, etc.|
|Students are expected to use the WEKA open source data mining software in Java.|
|60%||Written homework/assignment; laboratory (WEKA) projects|
|35%||(Individual) Research paper|
|5%||Class participation (local students)|
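As a rough illustration of how the weights above combine, here is a minimal Python sketch. The component names and the sample scores are hypothetical, and it assumes each component is already expressed as a percentage from 0 to 100. It applies only to local students, since the table lists the participation weight for them alone; how the score is actually computed is up to the professor.

```python
# Weights copied from the grading table; the component names are illustrative only.
WEIGHTS = {
    "homework_and_labs": 0.60,   # written homework/assignments and WEKA projects
    "research_paper": 0.35,      # individual research paper
    "participation": 0.05,       # class participation (local students)
}


def weighted_score(scores):
    """Combine per-component percentages (0-100) into one course percentage."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores for one student.
example = weighted_score(
    {"homework_and_labs": 90.0, "research_paper": 80.0, "participation": 100.0}
)
print(round(example, 2))
```

With these sample scores the components contribute 54, 28, and 5 points, for a course percentage of 87.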
Students are expected to turn in all course requirements assigned by the professor; otherwise, the professor reserves the right to assign a lower final grade than that normally calculated by the student.
|[88.75, 92.50)||B+||Very Good Work|
|[77.50, 81.25)||C+||Adequate Work|
|[66, 70)||D+||Minimally Acceptable Work|
|[0, 60)||F||Unacceptable Work|
Note: It is Dr. J's policy not to assign a final grade of D or D+ to graduate students. Hence, graduate students with a class standing below C- (70%) earn a final grade of F.
Please note that these policies are designed specifically for all Dr. J's on-site courses; not all policies may apply to this course, particularly if you are registered through the Center for Regional and Continuing Education (RCE) as a remote student. You must contact Dr. J if you have any questions or concerns regarding the applicability of a policy to this course.