Data Technologies

The point of this lab is to get started using R and to practice reading text file data into R and calculating simple summaries from data.

Your answer will consist of a file containing R code; you can submit either a plain text file containing R code or a plain text file containing R markdown code. Please DO NOT submit anything other than a plain text file (e.g., DO NOT submit a Word document or a PDF document or an HTML document).

We will work with three CSV files called trump-tweets-num-2010.csv, trump-tweets-num-2011.csv, and trump-tweets-num-2012.csv that contain data on tweets from the Twitter account of Donald Trump (from 2010 to 2012).

Within these files, every row provides information for one of Donald Trump's tweets, mostly about when the tweet was sent (wday is day of the week, min is minutes, and sec is seconds), but also how many times the tweet was retweeted. The first few rows of the file trump-tweets-num-2010.csv are shown in Figure 1.

The data files are available on Canvas.

retweet_count,month,day,wday,hour,min,sec
144,11,30,4,21,42,1
109,11,23,4,16,26,18
112,11,16,4,14,30,23
250,11,14,2,20,55,30
12,11,13,1,16,42,27
14,11,13,1,16,39,7
24,11,13,1,16,30,47
44,11,10,5,14,42,15
55,11,9,4,20,2,3
24,11,2,4,15,32,49
31,10,29,1,15,52,46
69,10,24,3,18,41,32
32,10,24,3,17,20,54
19,10,24,3,15,53,23
26,10,22,1,17,22,23
21,10,18,4,17,11,35
27,10,18,4,15,45,35
34,10,15,1,19,42,7
28,10,11,4,15,20,8

Figure 1: The first few lines of the file trump-tweets-num-2010.csv.

NOTE: You should submit a file containing R code that assigns values to the appropriate symbols. I will run the code in your file and then check the values that have been assigned to the symbols.

NOTE: Your file should ONLY contain valid R code, properly indented, and with comments. You should be able to copy-and-paste your entire file of R code into R and get no errors.

NOTE: You should submit your answers on Canvas.

1. Write an R expression that reads the file trump-tweets-num-2010.csv and assigns the result to the symbol tweets2010.
NOTE: your code can assume that the data file is in the current working directory. The symbol tweets2010 should print like this:
> head(tweets2010)
  retweet_count month day wday hour min sec
1           144    11  30    4   21  42   1
2           109    11  23    4   16  26  18
3           112    11  16    4   14  30  23
4           250    11  14    2   20  55  30
5            12    11  13    1   16  42  27
6            14    11  13    1   16  39   7

> dim(tweets2010)
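
As a hedged illustration of the reading step, the sketch below writes the first three data rows from Figure 1 to a temporary file and reads them back with read.csv(). The names demoFile and tweetsDemo are made up for this example; with the real data in the working directory the call would simply take "trump-tweets-num-2010.csv" as the file name.

```r
# Sketch only: recreate a small CSV from the first rows shown in Figure 1
csvLines <- c("retweet_count,month,day,wday,hour,min,sec",
              "144,11,30,4,21,42,1",
              "109,11,23,4,16,26,18",
              "112,11,16,4,14,30,23")

# demoFile is a hypothetical stand-in for the real data file
demoFile <- file.path(tempdir(), "trump-tweets-num-demo.csv")
writeLines(csvLines, demoFile)

# read.csv() uses the first line as column names by default
tweetsDemo <- read.csv(demoFile)

# Inspect the first rows and the dimensions (rows, columns)
head(tweetsDemo)
dim(tweetsDemo)
```

Because every column holds whole numbers, read.csv() should store each one as an integer column, which is what makes the later numeric summaries straightforward.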

2. Write an R expression that calculates the maximum value from the file trump-tweets-num-2010.csv and assigns the result to the symbol maxRetweet2010.

The symbol maxRetweet2010 should print like this:

[1] 3813

Some things to think about:
  • Why can we just calculate the maximum value for the whole file, rather than having to focus just on the retweet_count column?
  • Is this calculation inefficient? Does it matter?
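
The two questions above can be explored with a small sketch. It uses the first four rows of Figure 1 typed in by hand (tweetsDemo is a made-up name, not the real tweets2010) and compares the maximum of the whole data frame with the maximum of the retweet_count column alone.

```r
# Sketch only: first four rows of Figure 1 entered by hand
tweetsDemo <- data.frame(retweet_count = c(144, 109, 112, 250),
                         month = c(11, 11, 11, 11),
                         day   = c(30, 23, 16, 14),
                         wday  = c(4, 4, 4, 2),
                         hour  = c(21, 16, 14, 20),
                         min   = c(42, 26, 30, 55),
                         sec   = c(1, 18, 23, 30))

# max() on an all-numeric data frame looks at every value in every column
maxWhole <- max(tweetsDemo)

# max() on a single column looks only at the retweet counts
maxColumn <- max(tweetsDemo$retweet_count)

# The two agree here because the date and time columns cannot exceed 59,
# while the largest retweet count is much bigger than that
maxWhole == maxColumn
```

The whole-file version inspects seven times as many values as it needs to, which is unlikely to matter for files of this size.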

3. Write R code to calculate the largest number of retweets across all three files.
Assign your answer to the symbol maxRetweet. You should get a result that prints like this:
> maxRetweet

[1] 141644

Some things to think about:

  • How unusual is this retweet value?
  • How would you find out how unusual it is?
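
One possible approach, sketched under the assumption that each file has already been read into its own data frame, is shown below. The data frames demoA, demoB, and demoC are placeholders built from retweet counts in Figure 1; they are not the real 2010, 2011, and 2012 files.

```r
# Sketch only: three placeholder data frames standing in for the three files
demoA <- data.frame(retweet_count = c(144, 109, 112))
demoB <- data.frame(retweet_count = c(250, 12, 14))
demoC <- data.frame(retweet_count = c(24, 44, 55))

# Option 1: the maximum of the three per-file maxima
maxRetweetDemo <- max(max(demoA$retweet_count),
                      max(demoB$retweet_count),
                      max(demoC$retweet_count))

# Option 2: combine the retweet counts first, then summarise once
allRetweets <- c(demoA$retweet_count,
                 demoB$retweet_count,
                 demoC$retweet_count)
maxRetweetDemo2 <- max(allRetweets)

# One way to judge how unusual the maximum is: compare it with the
# typical value and with the upper quantiles of all retweet counts
medianRetweets <- median(allRetweets)
quantile(allRetweets, c(0.5, 0.9, 0.99))
```

Keeping the combined vector around (Option 2) makes the follow-up questions easier, since the same values feed the median and the quantiles.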

4. Write R code to calculate the latest time (before midnight), in seconds, that Donald Trump sent out a tweet.

Assign your answer to the symbol maxTweetTime. You should get a result that prints like this:
> maxTweetTime

[1] 86290

Some things to think about:

  • Why did I specify "before midnight"?
  • How would you convert this value into hours, minutes, and seconds?
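
The arithmetic can be sketched in both directions. The first part turns the hour, min, and sec columns into seconds since midnight for four rows from Figure 1 (tweetsDemo is a made-up name); the second part converts the expected answer, 86290, back into hours, minutes, and seconds using integer division and remainder.

```r
# Sketch only: time columns from the first four rows of Figure 1
tweetsDemo <- data.frame(hour = c(21, 16, 14, 20),
                         min  = c(42, 26, 30, 55),
                         sec  = c(1, 18, 23, 30))

# Seconds since midnight for every tweet, then the largest of them
tweetSeconds <- tweetsDemo$hour * 3600 + tweetsDemo$min * 60 + tweetsDemo$sec
maxTimeDemo <- max(tweetSeconds)

# Going the other way: split a number of seconds into h, m, s
totalSeconds <- 86290
hh <- totalSeconds %/% 3600            # whole hours
mm <- (totalSeconds %% 3600) %/% 60    # whole minutes left over
ss <- totalSeconds %% 60               # seconds left over
c(hh, mm, ss)
```

A tweet sent just after midnight has a small seconds-since-midnight value, so taking the maximum finds the latest tweet of the evening rather than the latest tweet of a late night.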

 [EXTRA for EXPERTS - NO MARKS]

Write R code that shows the complete row of data for the latest (before-midnight) tweet ...

   retweet_count month day wday hour min sec
86            25     5   5    6   23  33  42

... and write code to produce a message that states the latest time (before midnight), including the date, that Donald Trump sent out that tweet ...

Donald's latest (pre-midnight) tweet was at 23:33:42 on Wednesday 05 May
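
A minimal sketch of both steps, using four rows from Figure 1 rather than the real data, might look like the following. The names tweetsDemo, latestRow, and latestMessage are made up for this example, and the weekday name is left out because the text does not say how wday values map to day names.

```r
# Sketch only: first four rows of Figure 1 entered by hand
tweetsDemo <- data.frame(retweet_count = c(144, 109, 112, 250),
                         month = c(11, 11, 11, 11),
                         day   = c(30, 23, 16, 14),
                         wday  = c(4, 4, 4, 2),
                         hour  = c(21, 16, 14, 20),
                         min   = c(42, 26, 30, 55),
                         sec   = c(1, 18, 23, 30))

# Seconds since midnight for each tweet
tweetSeconds <- with(tweetsDemo, hour * 3600 + min * 60 + sec)

# which.max() gives the position of the largest value;
# using it as a row index returns the complete row
latestRow <- tweetsDemo[which.max(tweetSeconds), ]
latestRow

# Build the message; %02d pads to two digits, month.name is built into R
latestMessage <- sprintf("Latest tweet in the demo rows was at %02d:%02d:%02d on %02d %s",
                         latestRow$hour, latestRow$min, latestRow$sec,
                         latestRow$day, month.name[latestRow$month])
latestMessage
```

Adding the day name would need a lookup vector indexed by wday, once the coding of that column has been confirmed from the data.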
