Ask Management Information System Expert

Task steps:

1. Create an author-to-author tweet edge file from the original data set, stocktwit_graph_input.csv.

We just need two columns to create a graph: the source (Vertex 1) and the target (Vertex 2) of each edge. Select all rows (tweets) for column K, "from_person", and column M, "to_person" (or columns J and L for the numerical author IDs), and save the result as "stocktwit_from_to" or another name you prefer.
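For anyone who prefers scripting this step over copying columns by hand, here is a minimal Python sketch using only the standard library. It assumes the header names "from_person" and "to_person" appear in the first row of the CSV, writes the columns out as "Source" and "Target" (the headers Gephi usually expects for an edge table), and skips rows with an empty target. The skipping rule and the function names are assumptions of this sketch, not part of the task, and the sample rows are made up.

```python
import csv
import io


def extract_edges(csv_file, source_col="from_person", target_col="to_person"):
    """Return (source, target) pairs from a tweet-level CSV.

    Rows with an empty source or target are skipped (an assumption of
    this sketch: such rows do not describe an edge).
    """
    reader = csv.DictReader(csv_file)
    edges = []
    for row in reader:
        source = (row.get(source_col) or "").strip()
        target = (row.get(target_col) or "").strip()
        if source and target:
            edges.append((source, target))
    return edges


def write_edges(edges, out_file):
    """Write the edge list with Source/Target headers."""
    writer = csv.writer(out_file)
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)


# Made-up sample standing in for stocktwit_graph_input.csv.
sample = io.StringIO(
    "from_person,to_person,suggested\n"
    "alice,bob,1\n"
    "bob,carol,0\n"
    "carol,,0\n"
)
edges = extract_edges(sample)
out = io.StringIO()
write_edges(edges, out)
print(out.getvalue())
```

With the real file, replace the `io.StringIO` objects with `open("stocktwit_graph_input.csv", newline="")` and an output file opened for writing.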

2. Use Gephi to generate and save author (node) metrics. Select the metrics you would like to explore and use for building models later. Include at least 5 different metrics.

a. Which three authors have the highest betweenness centrality?

b. Which three authors have the highest total degree?

c. Which three authors have the highest closeness?
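Gephi reports these metrics directly, but a quick cross-check of question (b) can be sketched in Python. The snippet below counts total degree (in-degree plus out-degree) from an edge list and picks the top three authors. It counts every row as one edge, whereas Gephi may merge repeated edges between the same pair of authors, so the numbers can differ. The edge list is made up and the function names are hypothetical.

```python
from collections import Counter


def total_degree(edges):
    """Count in-degree plus out-degree per node, one count per edge row."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1  # contributes to the source's out-degree
        degree[target] += 1  # contributes to the target's in-degree
    return degree


def top_n(scores, n=3):
    """Return the n highest-scoring (node, score) pairs, ties broken by name."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Made-up edges for illustration only.
sample_edges = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "carol"),
    ("dave", "alice"),
    ("carol", "alice"),
]
top_three = top_n(total_degree(sample_edges))
print(top_three)
```

Betweenness and closeness require shortest-path computations and are better left to Gephi, as the task suggests.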

3. Build the Node Table for Prediction

(1). Open the stocktwit_node.csv file in Excel and create a new variable, Expert (i.e., "suggested"). It is the target variable we aim to classify or predict.

(2). Do not close the stocktwit_node.csv file. Open the stocktwit_graph_input.csv file, and then go back to stocktwit_node.csv.

(3). Note that the unit in the stocktwit_node.csv file is a node (i.e., an individual author), while the unit in the stocktwit_graph_input.csv file is a tweet (i.e., a single message). So, to transfer the value of "suggested" from the stocktwit_graph_input table to the stocktwit_node table, we need to transform the data.

For Expert, we need to assign exactly one value to each author (i.e., whether they are an expert or not: 1 stands for yes; 0 stands for no).

Use the VLOOKUP function to assign the value of "suggested" from the table of stocktwit_graph_input to the column, "Expert", in stocktwit_node table. The function for the first row should be like this:

= VLOOKUP(A2, stocktwit_graph_input.csv!$K$1:$AB$38200,18,FALSE),

where "A2" is the node name; "stocktwit_graph_input.csv!$K$1:$AB$38200" is the table range we look up; 18 is the number of the column within that range whose value is returned (column AB, counting from column K); and "FALSE" requests an exact match.

(4). Save the stocktwit_node.csv file. You can delete the rows that have a missing value in Expert: these nodes appear only in the "to_person" column, so they have no tweets of their own.

Use the filter function in Excel to remove the #N/A rows.
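The lookup-and-filter logic of this step can also be expressed in a few lines of Python, which may help clarify what the VLOOKUP is doing. Like VLOOKUP with an exact match, the sketch takes the first "suggested" value found for each author, and nodes with no match (the #N/A rows) are dropped. The column names follow the task description; the function names and the sample rows are made up for illustration.

```python
def first_match_lookup(tweet_rows, key="from_person", value="suggested"):
    """Map each author to the value on their first tweet row.

    This mirrors an exact-match VLOOKUP, which returns the first match.
    """
    lookup = {}
    for row in tweet_rows:
        lookup.setdefault(row[key], row[value])
    return lookup


def label_nodes(node_ids, lookup):
    """Attach the Expert label to each node, dropping nodes with no match."""
    labeled = {}
    for node in node_ids:
        if node in lookup:  # unmatched nodes correspond to #N/A in Excel
            labeled[node] = int(lookup[node])
    return labeled


# Made-up tweet rows; carol only ever appears as a recipient.
tweets = [
    {"from_person": "alice", "to_person": "bob", "suggested": "1"},
    {"from_person": "bob", "to_person": "carol", "suggested": "0"},
    {"from_person": "alice", "to_person": "carol", "suggested": "1"},
]
nodes = ["alice", "bob", "carol"]
expert = label_nodes(nodes, first_match_lookup(tweets))
print(expert)
```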

4. In R, build and evaluate a classification model that uses the metrics in stocktwit_node_yourname.csv from step 2 as features to classify each author as an "expert" Stocktwit author ("suggested" = 1) or not ("suggested" = 0). "suggested" is the target label variable.

(1). Using a seed of 100, randomly select 60% of the rows into training (e.g. called traindata). Divide the other 40% of the rows evenly into two holdout test/validation sets (e.g., called testdata1 and testdata2).
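The task asks for this split in R, but the arithmetic of a 60/20/20 partition is language-independent. As a rough illustration, here is a Python sketch: shuffle the rows with a fixed seed, take the first 60% for training, and cut the remainder in half. Note that Python's random generator differs from R's, so a seed of 100 here will not reproduce the partition that R produces. The function name is hypothetical and the rows are placeholder integers.

```python
import random


def split_60_20_20(rows, seed=100):
    """Shuffle rows reproducibly and split them 60% / 20% / 20%."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(0.6 * len(shuffled))
    n_test1 = (len(shuffled) - n_train) // 2  # half of the remaining 40%
    train = shuffled[:n_train]
    test1 = shuffled[n_train:n_train + n_test1]
    test2 = shuffled[n_train + n_test1:]
    return train, test1, test2


# Placeholder data: 50 row indices standing in for the node table.
rows = list(range(50))
traindata, testdata1, testdata2 = split_60_20_20(rows)
print(len(traindata), len(testdata1), len(testdata2))
```

When the number of remaining rows is odd, this sketch puts the extra row in the second test set.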

(2). Build the tree using the C5.0 function (from the C50 package) with default settings.

(3). Generate predictions (i.e. estimations) of the values of the target variable for the testing instances.

Generate a confusion matrix that shows the counts of true-positive, true-negative, false-positive and false-negative predictions for both testdata1 and testdata2. Treat 1 as the positive class.

Generate seven performance metrics: accuracy (the percentage of all testing instances that are classified correctly), plus precision (the percentage of instances predicted to belong to a class that actually belong to it), recall (also called the true positive rate) and F-measure (also called F-score) for each of the two classes of Expert.
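To make the seven metrics concrete, here is a small Python sketch that derives them from the four confusion-matrix counts. Precision, recall and F-measure for class 0 are obtained by swapping the roles of the two classes. Returning 0.0 when a denominator is zero is a convention chosen for this sketch, and the label vectors are made up; in the assignment the same quantities would be computed in R from the model's predictions.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, tn, fn) with `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Divide, returning 0.0 for a zero denominator (sketch convention)."""
    return numerator / denominator if denominator else 0.0


def seven_metrics(tp, fp, tn, fn):
    """Accuracy plus precision, recall and F-measure for classes 1 and 0."""
    p1, r1 = safe_div(tp, tp + fp), safe_div(tp, tp + fn)
    p0, r0 = safe_div(tn, tn + fn), safe_div(tn, tn + fp)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision_1": p1,
        "recall_1": r1,
        "f_measure_1": safe_div(2 * p1 * r1, p1 + r1),
        "precision_0": p0,
        "recall_0": r0,
        "f_measure_0": safe_div(2 * p0 * r0, p0 + r0),
    }


# Made-up labels for illustration only.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
counts = confusion_counts(actual, predicted)
metrics = seven_metrics(*counts)
print(counts, metrics)
```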

(4). Would you recommend using the features from network analysis to identify experts in the Stocktwit community? Why or why not?

Attachment:- stocktwit_graph_input.rar
