Assignment 1 -

1. Basic histogram. 

a. Import the data in Table 1, Data Set 10.1.1 - Basic Histogram Data, to Tableau and make a basic histogram from it.  To make a histogram, you can follow the directions here.  For this data, use a bin size of 2.  (Your graph may vary a bit from the example below.)  (Online help at http://www.tableau.com/learn/tutorials/on-demand/histograms)

b. Change the y-axis on your histogram to reflect percentages (you can do this in the Quick Table Calculation pull-down from your ROWS variable.)

c. Create a different histogram with the exact same data, with a bin size of 8. 

d. Create a dashboard with the two histograms on it, and submit a screenshot of this dashboard in your TURN IN TEMPLATE.

e. Write a few sentences - what is the difference between the two histograms? Which one would you use under which circumstances?
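To see why the bin size matters before building the charts, here is a minimal Python sketch of the same binning idea. It is only an illustration: the `sample` values are made up (they are not Data Set 10.1.1), the function names are hypothetical, and each bin is labeled by its lower edge as an assumption about the layout.

```python
from collections import Counter

def histogram(values, bin_size):
    """Count values per bin; each bin is labeled by its lower edge."""
    counts = Counter((v // bin_size) * bin_size for v in values)
    return dict(sorted(counts.items()))

def as_percentages(counts):
    """Convert bin counts to percent of total (the percentage y-axis)."""
    total = sum(counts.values())
    return {edge: 100 * n / total for edge, n in counts.items()}

# Hypothetical sample values, NOT the assignment's data set.
sample = [1, 2, 3, 3, 4, 5, 7, 8, 9, 12, 13, 15]

narrow = histogram(sample, 2)   # many small bins: more detail, more noise
wide = histogram(sample, 8)     # few large bins: smoother, less detail
print(narrow)
print(wide)
print(as_percentages(wide))
```

With a bin size of 2 the sample spreads over seven bins, while a bin size of 8 collapses it into two, which is the kind of trade-off part e asks you to describe.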

2. Histogram with a parameter slider for the bin size.

a. Follow the directions in the histogram video and implement an interactive parameter for the user for the bin size.  Set it so it's a slider.  To do this, you may have to right click on the "choose a bin size" parameter and make sure it's set to slider. 

b. Take two or three screenshots of your sliding bin, with two different bin sizes, and submit them in the TURN IN TEMPLATE.

c. Answer this question: how can the sliding bin size parameter help you as a data analyst?

3. Histograms across Warehouses.  Import the data given in Table 2  Data Set 10.1.2 - Warehouse Histogram Data.

a. Create a histogram of this data with a bin size of 3.  Take a screenshot.

b. Use Tableau to incorporate the additional warehouse information.  We are looking for something like the stacked histogram shown below.  (Your data may vary slightly.)  Here's what I got:

c. Look at how these histograms differ from the overall histogram you created in step a. 

d. Submit a screenshot of the overall histogram and a screenshot of your split by warehouse in the TURN IN TEMPLATE.  (Just put the two screenshots next to each other in the same box.)

e. Answer the following questions: 

f. Question 1:  If you only had the overall histogram, how would you narrate the order delivery time? 

g. Question 2:  If you then saw the by-warehouse split, how would you narrate the order delivery time?  Use the phrases "skewed left" and "skewed right" wherever applicable, and make sure you get them correct (look them up if you need to - the Internet is a great place to start.)

h. Question 3:  You have 30 seconds of the CEO's attention.  What single business action would you recommend to her based on your histograms here?

4. Boxplots. We are going to make some boxplots in Tableau! Many of you have seen boxplots before; this week, we emphasize the statistical knowledge that can be pulled from a boxplot.

a. Import the data shown in Table 3  Data Set 10.1.3- Sales Data by Time Zone for Boxplots above.  Make a boxplot of this data (online help:  http://onlinehelp.tableau.com/current/pro/desktop/en-us/help.htm#buildexamples_boxplot.html )

b. To get it to work, you want to make sure your measures aren't aggregated.   To get mine to work, I started with the Time Zone in the Columns, the Sales Volume in the Rows, and then had it make me a bar chart and then I switched the mark type to circles.  (To get a bunch of little circles, make sure the Analysis -> Aggregate Measures box is not checked.)

c. Hover over one of your boxplots and it will show you the actual data.  Here, I'm hovering over the Pacific data. In the TURN IN TEMPLATE, submit a screen print of yourself hovering over one of your boxplots.

d. The boxplots show you at a glance not only the median value (line in the middle) but also the spread and any outliers.  An outlier is something above the top "whisker" or below the bottom "whisker;" on the example chart above, there's an outlier in the Central sales (way at the bottom, where the sales are very close to 0).   Write a sentence or two describing what you see here.  In particular, do you see any differences in median, spread, or quartiles between the regions?  If you wanted to boost sales in one region, which would you pick and why?
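The quantities a boxplot encodes can also be sketched in Python. This is an illustration only: the sample numbers are invented (not Data Set 10.1.3), the function name is hypothetical, and the outlier fences use the common 1.5 × IQR rule, which is an assumption here. Tableau's own quartile calculation may differ slightly.

```python
import statistics

def box_summary(values):
    """Return quartiles, IQR, and points outside the 1.5 * IQR fences."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr    # assumed whisker rule
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr,
            "outliers": outliers}

# Hypothetical sales volumes for one time zone, with one very low value.
sales = [2, 40, 42, 45, 47, 50, 52, 55, 58]
summary = box_summary(sales)
print(summary)
```

Comparing the `median` and `iqr` of each region side by side is the numeric counterpart of comparing the boxes by eye.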

5. Heatmaps.  A heat map conveys numeric information, using colors (or "heat") to show one of the dimensions.  We're going to make one covering those three very important pieces of data about the success metrics for all of us:  IQ, shoe size, and salary.  You can find directions from Tableau here:  http://onlinehelp.tableausoftware.com/v8.0/pro/online/en-us/buildexamples_heatmap.html

Here's how I did it:

a. Import the data in Table 4 Data Set 10.1.5  Success Metrics for Everybody into Tableau.

b. If necessary, move IQ and Shoe Size from Measures to Dimensions

c. Bin IQ into something reasonable (here I chose bin sizes of 10)

d. Make a graph with IQ on one axis and Shoe Size on the other, and use Squares as the marks:

e. Convert Annual Salary to a Continuous Dimension

f. Use Annual Salary as the color and change its Measure to Average:

g. Make your own heatmap from the data here.  Use what you know about colors and graphs to ensure that the color scheme is highlighting what you want to highlight.  Consider the divergent color schemes to bring out interesting data.  Submit a copy of your heat map in the TURN IN TEMPLATE.

h. Write a few sentences - what can you conclude from your heat map?  What, if any, relationships did you find between IQ, shoe size, and salary in this data set?
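Steps c through f amount to grouping the rows by (IQ bin, shoe size) and averaging salary within each group. A minimal Python sketch of that aggregation follows; the records are invented for illustration (not Data Set 10.1.5) and the function name is hypothetical.

```python
from collections import defaultdict

def heatmap_cells(records, iq_bin_size=10):
    """Average salary for each (IQ bin lower edge, shoe size) cell."""
    sums = defaultdict(lambda: [0.0, 0])   # cell -> [salary total, row count]
    for iq, shoe_size, salary in records:
        cell = ((iq // iq_bin_size) * iq_bin_size, shoe_size)
        sums[cell][0] += salary
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}

# Hypothetical (IQ, shoe size, annual salary) rows.
records = [(95, 8, 40000), (98, 8, 50000), (104, 8, 60000), (112, 10, 90000)]
cells = heatmap_cells(records)
print(cells)
```

Each value in `cells` is what the color of one square in the heat map would represent.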

We're now going to practice on some larger data sets.  Use the following data sets from Tableau to answer the following questions.  Answer the questions using Tableau visualizations.  Please try to answer each question with one and only one Tableau graph (but if you absolutely need more than one, go ahead and be sure to justify why you need it.)

Do not use Excel or other methods to determine your answers.

You can download the data sets from here:  https://public.tableau.com/s/resources?qt-overview_resources=1

6. Millennial vs Baby Boomer Employment Data set.  Use the "National, 5-digit" sheet.    For Baby Boomers, for 2013, what were the three biggest job titles ("Occupation" field) and how many total Baby Boomers worked in those three fields?  Please submit the Tableau screenshot(s) you used to determine this, and give a little narration.  Paste your answers in the TURN IN TEMPLATE.

7. Millennial vs Baby Boomer Employment Data set.  Use the "States" sheet.   In which state was the total Job Change the most negative (i.e. in which state were the most jobs lost between 2007 and 2013?)  How many of those were Boomer jobs vs. Millennial jobs?  Paste your answers in the TURN IN TEMPLATE.

8. Global Sport Finances data sheet, Top Athlete Salaries data.  After you connect to the data, you will need to scrub it up a bit before Tableau can work meaningfully with it.  In particular, I had to do this:

a. After I loaded the "Top Athlete Salaries" sheet, I had to split the salary data.  It comes with a "M" appended at the end of the salary, but that means Tableau will view the entire field as a text field.  We want to do number analytics on it, so I want to tell it the salaries need to be numeric.  First step:  remove the M.

b. Then, I had to convert the split field to a Measure:

c. Next, I had to continue to convince it to treat salary as a number:

d. Scrub your input data as per above.  Make a graph to answer this question:  which sport has the highest *average* pay for 2014?  What was the average pay?  Paste your graph and your answers in the TURN IN TEMPLATE.
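The scrubbing in steps a through c is a string-to-number conversion followed by a per-sport average. A minimal Python sketch of that logic is below. The exact text format of the salary field is an assumption (values such as "10 M", "20M", or "$7.5 M"), and the sport names and figures are placeholders, not values from the Tableau data set.

```python
from collections import defaultdict

def parse_salary_millions(text):
    """Strip a trailing 'M' (and an optional '$' or commas), return a float."""
    cleaned = text.strip().rstrip("Mm").strip().lstrip("$").replace(",", "")
    return float(cleaned)

def average_by_sport(rows):
    """Average the parsed salaries for each sport."""
    by_sport = defaultdict(list)
    for sport, salary_text in rows:
        by_sport[sport].append(parse_salary_millions(salary_text))
    return {sport: sum(vals) / len(vals) for sport, vals in by_sport.items()}

# Placeholder rows; the real field layout may differ.
rows = [("Sport A", "10 M"), ("Sport A", "20M"), ("Sport B", "$7.5 M")]
averages = average_by_sport(rows)
print(averages)
```

Until the "M" is removed, the field can only be treated as text, which is why the split has to come before any averaging.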

9. Global Sport Finances data sheet, Top Athlete Salaries data.  You notice that Basketball, Cricket, and Soccer all have very similar average earnings for their players in this list, all about $30 M for 2014.  If you were told you would be given the annual salary for one athlete chosen at random from these three categories from this data set, would you choose to be given the salary of a randomly chosen basketball player, a randomly chosen cricket player, or a randomly chosen soccer player?  Why?  Make one graph to answer this question, and paste it in the TURN IN TEMPLATE.

Assignment 2 -

Take the following Python code that stores a string:

str = 'X-DSPAM-Confidence: 0.8475'

Use find and string slicing to extract the portion of the string after the colon character and then use the float function to convert the extracted string into a floating point number.

The objective here is to correctly isolate the numeric portion of the given string before applying the float() function to turn it into a floating point number. We can see here that the numeric part to be extracted appears at the end of the string. So what we need to do is to extract the part of the string from one character position after the colon, up to the index position that represents the end of the string. The position of the colon within the string can be found using the "find" function as shown. The index position that represents the end of the string can be found using the "len" function on the whole string. Remember that index values begin at 0! Once we have stored the isolated string with the numeric part into the variable strNum, we just remove any blank space around it using "strip". If the extraction has been done correctly, this step is redundant, but it is good practice to strip the string before we turn it into a Python floating point number. The output from the program is shown below:
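The steps described above can be sketched as follows. This is one possible version, not necessarily the original listing; the variable is named `text` here (an assumption) so that Python's built-in `str` is not shadowed, while `strNum` follows the name used in the explanation.

```python
text = 'X-DSPAM-Confidence: 0.8475'

# find() returns the index of the first colon (indexes start at 0).
colon_pos = text.find(':')

# Slice from one position after the colon up to the end of the string.
strNum = text[colon_pos + 1:len(text)]

# strip() removes the leading blank; float() converts the text to a number.
value = float(strNum.strip())
print(value)
```

The slice could also be written as `text[colon_pos + 1:]`, since omitting the end index means "up to the end of the string".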

 TURN IN #1:  Severance Chapter 7 - Exercise 2 - ASSIGNMENT

Write a program to prompt for a file name, and then read through the file and look for lines of the form:

X-DSPAM-Confidence: 0.8475

When you encounter a line that starts with "X-DSPAM-Confidence:" pull apart the line to extract the floating-point number on the line. Count these lines and then compute the total of the spam confidence values from these lines. When you reach the end of the file, print out the average spam confidence.

Enter the file name: mbox.txt

Average spam confidence: 0.894128046745

Enter the file name: mbox-short.txt

Average spam confidence: 0.750718518519

Test your file on the mbox.txt and mbox-short.txt files.

HINT:

1. Download the two text files: mbox.txt and mbox-short.txt from http://www.pythonlearn.com/code3/ to your local machine. For ease, ensure these files reside in the same folder as the .py file for this assignment.

2. Begin writing your code by prompting the user for the file name. Use a try-except block to exit with a user-friendly error message if there is an error opening the file name specified.

3. Once the file is opened, use an iterative loop (e.g. "for" or "while") to traverse each line of the file.

4. (A quick manual exploration of mbox text files reveals that the number representing spam confidence is found at the end of the line.) In each line, find the pattern "X-DSPAM-Confidence:". If this is found, extract the portion of the line after this pattern until the end of the line. "find", string extraction and "strip" functions are useful here.

5. Convert the numeric part extracted from the line into a float.

6. When the program has finished traversing each line in the specified file, total the number of lines that had the pattern and compute average spam confidence.

7. Note: In your calculation for average spam confidence, do NOT count the lines that did not contain the pattern. Be sure to comment your program adequately!

8. Good programmers test all cases, so you should make sure you test for erroneous inputs and also test on the mbox-short.txt and the mbox.txt files.
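The hints above can be combined into a rough sketch like the one below. It is one possible structure under stated assumptions, not a reference solution: the function names, the error message wording, and the in-memory `sample_lines` are all made up for illustration, and the match uses `startswith` to follow the "starts with" wording of the exercise.

```python
PREFIX = "X-DSPAM-Confidence:"

def average_spam_confidence(lines):
    """Return the mean of the values on lines starting with PREFIX, or None."""
    count = 0
    total = 0.0
    for line in lines:
        if not line.startswith(PREFIX):
            continue  # lines without the pattern are not counted
        colon_pos = line.find(":")
        total += float(line[colon_pos + 1:].strip())
        count += 1
    return total / count if count else None

def main():
    """Prompt for a file name and print the average, with a friendly error."""
    file_name = input("Enter the file name: ")
    try:
        handle = open(file_name)
    except OSError:
        print("File cannot be opened:", file_name)
        return
    with handle:
        average = average_spam_confidence(handle)
    if average is None:
        print("No X-DSPAM-Confidence lines found")
    else:
        print("Average spam confidence:", average)

# Quick check on made-up lines; call main() to run against a real file.
sample_lines = [
    "From: someone@example.com\n",
    "X-DSPAM-Confidence: 0.75\n",
    "X-DSPAM-Confidence: 0.25\n",
]
demo_average = average_spam_confidence(sample_lines)
print(demo_average)
```

Keeping the averaging in its own function makes it easy to test on a short list of lines before running it on mbox.txt and mbox-short.txt.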

When you are ready to check your DSPAM code, open the Assignment 10.2 DSPAM Grading.  It will give you a new mbox file to download. Run your DSPAM code on the new file, and enter the average spam confidence.

Attachment:- Assignment Files.rar

Computer Engineering, Engineering

  • Category:- Computer Engineering
  • Reference No.:- M92260189
  • Price:- $90
