
Project Option I

In this project option, you are expected to conduct a comprehensive literature search and survey, select and study a specific topic in one subject area of data mining and KDD, and write a technical paper on the selected topic entirely by yourself. The technical paper can be either a detailed, comprehensive survey of a specific topic or a report of original research that you have carried out yourself.

Requirements and Instructions for the Technical Paper:

1. The paper should clearly state its subject, scope, domain, and the goals to be achieved.

2. The paper should address the important advanced and critical issues in a specific area of data mining and KDD. Your research paper should emphasize not only breadth of coverage, but also depth of coverage in the specific area.

3. The research paper should give measurable conclusions and future research directions (this is your contribution).

4. It might be beneficial to review or browse through about 10 to 15 relevant technical articles before you decide on the topic of the research project.

5. The research paper should reflect the quality expected at an academic research level.

6. The paper should be at least 25 to 30 pages (double-spaced) or 2500-3000 words in length.

7. The paper should include an adequate abstract or introduction, and a reference list.

8. Please write the paper in your own words, and cite the references and sources of any statements you take from other articles.

9. From the systematic study point of view, you may want to read a list of technical papers from relevant magazines, journals, conference proceedings and theses in the area of the topic you choose.

10. For the format and style of your research paper, please refer to the GSCIS Dissertation Guide, or to IEEE or ACM journal articles.

Suggested Topics for KDD Research (but not limited to these)

Theory and Fundamental Issues in KDD:

Data and knowledge representation for KDD
Database models for knowledge discovery and data mining
Definitions, formalisms, and theoretical issues in KDD
Fundamental advances in search, retrieval, and discovery methods
Modeling of structured, unstructured and multimedia data for KDD
Metrics for evaluation of KDD results
Probabilistic modeling and uncertainty management in KDD

Data Mining Methods and Algorithms:

Algorithms for learning classification rules, characteristic rules, association rules
Algorithms for association rule mining
Algorithms for clustering, prediction, etc.
Algorithmic complexity, efficiency and scalability issues in KDD
High dimensional datasets and data preprocessing
Parallel and distributed data mining techniques
Probabilistic and statistical models and methods in KDD
Supervised and unsupervised discovery and predictive modeling
Using prior domain knowledge and re-use of discovered knowledge
Measurement of rule interestingness and quality

KDD Process and Human Interaction:

Models of the KDD process

Methods for evaluating subjective relevance and utility
Data and knowledge visualization
Interactive data exploration and discovery
Privacy-preserving data mining and security

Applications:

Application of KDD in business, science, medicine and engineering
Application of KDD methods for mining knowledge in text, image, audio, sensor, numeric, categorical or mixed format data, semi-structured data
Big-data mining and data analytics
Mining multimedia, hyper-text, spatial, temporal databases
Mining bioinformatics data
Applications of KDD for semantic query optimization
Knowledge discovery and data mining tools
Resource and knowledge discovery using the Internet

Others:

Active databases
Application of genetic algorithm to KDD
Fuzzy and Prolog databases
Genetic algorithms
Neural Networks
Regression Methods
Rough set model for relational databases
Support Vector Machine

Suggested Check List for Written Report

Your written report should try to include the following items:

a. Introduction and objectives of the research.
b. Current state of the art and existing methodologies in the specific area.
c. Barriers, issues, and open problems in the area.
d. Existing, expected, or proposed solutions, methods, and algorithms, if any, as of the project due date.
e. Detailed, step-by-step examples to illustrate concepts, principles, theories, algorithms, methodologies, etc.
f. Research results, if any, as of the report due date.
g. Analysis and comparison of research methods, algorithms, and expected results, if any, as of the report due date.
h. Conclusions and future research directions, if any, as of the report due date.
i. Reference list

Project Option II

In this project option, you are expected to implement some existing data mining algorithms or improvements to existing data mining algorithms.

The algorithms can be any of those listed below, but are not limited to them:

Association rule mining algorithms (various data mining algorithms)
Classification algorithms (for example, ID3, C4.5, C5.0, CART, etc.)
Clustering algorithms (various algorithms)
Fuzzy data mining and rough set data mining
K-Nearest Neighbor
Genetic algorithms
Neural networks
Support vector machine

Requirements for the project deliverable:

1. Description of the algorithm in pseudo code with proper explanation and documentation.
2. Detailed illustrative examples of the algorithm
3. Analysis of the algorithm in terms of performance and time complexity
4. Description of the benchmark and test data used to support the experiment design
5. Analysis of experiment data
6. Readme file to include all the details of how to run the program
7. Live demo of the implementation with explanation.
8. Reference list

Specific Topics for Research and/or Implementation

Supervised Learning Methods:

Classification Methods:

Regression Methods
Multiple Linear Regression
Logistic Regression
Ordered Logistic and Ordered Probit Regression Models
Multinomial Logistic Regression Model
Poisson and Negative Binomial Regression Models

Bayesian Classification
Naïve Bayes Method
k Nearest Neighbors

Decision Trees
ID3 (Iterative Dichotomiser 3)
C4.5 and C5.0
CART (Classification and Regression Trees)
Scalable Decision Tree Techniques
AdaBoost Algorithm
Ensemble Methods

Neural Network-Based Methods
Back Propagation
Neural Network Supervised Learning
Deep Learning
Bayes Belief Network

Rule-Based Methods
Generating Rules from a Decision Tree
Generating Rules from a Neural Net
Generating Rules without Decision Tree or Neural Net
Support Vector Machine
Fuzzy Set and Rough Set Methods

Unsupervised Learning Methods:

Clustering Methods:

Partition Based Methods
Squared Error Clustering
K-Means Clustering (Centroid-Based Technique)
K-Medoids Method (Partition Around Medoids, Representative Object-Based Technique)
Bond Energy

Hierarchical Methods
Agglomerative vs. Divisive Hierarchical Clustering
BIRCH (Balanced Iterative Reducing and Clustering Using Hierarchies)
Chameleon (Hierarchical Clustering using Dynamic Modeling)
CLARANS (Clustering Large Applications Based Upon Randomized Search)
CURE (Clustering Using REpresentatives)

Density Based Methods
DBSCAN (Density Based Spatial Clustering of Applications with Noise, Density Based Clustering Based on Connected Regions with High Density)
OPTICS (Ordering Points to Identify the Clustering Structure)
DENCLUE (DENsity Based CLUstEring, Clustering Based on Density Distribution Functions)

Grid-Based Methods
STING (Statistical Information Grid)
CLIQUE (Clustering In QUEst, An Apriori-like Subspace Clustering Method)

Probabilistic Model Based Clustering
Clustering Graph and Network Data (For Example, Social Networks)
Self-Organized Map Technique
Evaluation and Performance Measurement of Clustering Methods
Assessing Clustering Tendency
Determining the Number of Clusters
Measuring Clustering Quality

Association Rule Mining

Evolution Based Methods:

Genetic Algorithms

Applications:
Data Mining Applications for Business Intelligence and Analytics
Text Mining
Spatial Mining
Temporal Mining
Web Mining
Recommender Systems

Others:

Overfitting and underfitting issues
Outliers
Performance Evaluation and Measurement
Confusion Matrix
ROC (Receiver Operating Characteristic)
AUC (Area Under the Curve)

Data Mining Tools: XLMiner, RapidMiner, TensorFlow, Weka, NodeXL

Sample Format of Project Report

1. Title Page

In general, the title of the report should be no more than 10 words if possible. The title page should have your name, email, contact information, and date below the title.

2. Abstract
The abstract page should summarize the highlights of your project to tell the audience what has been done in the research project.

3. Table of Contents
The TOC part should list all titles of sections and subsections with page numbers.

4. Introduction
This part gives the audience the background information needed to follow the subject of your research project.

5. Background and Literature Review

6. Statement of the Proposed Research or Study
Building on the discussion in the Background and Literature Review, state the proposed research or study, possibly in the form of a Problem Statement that indicates what is to be studied, investigated, researched, and/or achieved in this project.

7. Methodology
Based on the Problem Statement and the objective to be achieved, elaborate on the underlying methodology used to fulfill the research task and achieve the goal of the research/study. If possible, explain the rationale in both depth and breadth. Illustrative examples are a good way to explain the methodologies used and to show your understanding.

8. Experiment Design and Result Analysis
Provide the details of how the experiments are designed and conducted, and the observations from them. Analyze the experimental results based on your observations, understanding, and interpretation, using appropriate performance analysis methods.

9. Conclusion
Summarize your research/study by drawing conclusions from the project; you may also provide future research/study directions and discuss their potential.

10. Reference List

11. Appendix (if necessary)

For style, please refer to the APA Manual or to ACM or IEEE publications.
