Exercise detail

For our first project, we're going to take a look at SAT scores around the United States. We'll be exploring this data to see what we can learn using the descriptive statistics skills covered this week. Your client, the College Board, is expecting some pretty graphs to add to their presentations this year, so don't let them down!

Goal: A Jupyter notebook that describes your data with visualizations & statistical analysis.

Goal: A five to seven minute presentation targeted to your hypothetical client that highlights your findings.

Requirements

Your work must:

Describe your data
Perform methods of exploratory data analysis, including:
Use Matplotlib to create visualizations
Use NumPy to apply basic summary statistics: mean, median, mode
Determine if the dataset appears to follow a normal distribution
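
The summary statistics named above can be sketched as follows. This is a minimal illustration, assuming NumPy is installed; the sample holds the Rate values of the first seven rows of the DATA table (CT through PA). NumPy has no dedicated mode function, so the sketch counts occurrences with `np.unique`, which is one possible approach and not necessarily the one the course intends.

```python
import numpy as np

# Rate values for CT, NJ, MA, NY, NH, RI, PA from the DATA table.
rates = np.array([82, 81, 79, 77, 72, 71, 71])

mean_rate = np.mean(rates)
median_rate = np.median(rates)

# NumPy has no mode(); count each distinct value and take the most frequent.
values, counts = np.unique(rates, return_counts=True)
mode_rate = values[np.argmax(counts)]  # ties resolve to the smallest value

print("mean:", mean_rate)
print("median:", median_rate)
print("mode:", mode_rate)
```

Because `np.argmax` returns the first maximum, a column with several equally frequent values reports only the smallest of them.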
Bonus:

Recreate all of your Matplotlib graphs in Seaborn!
Use Tableau (public) to create visualizations!
Create a blog post of at least 500 words (and 1-2 graphics!) describing your data, analysis, and approach. Link to it in your Jupyter notebook.
Using existing features, engineer a new feature
Necessary Deliverables / Submission

Materials must be submitted in a clearly commented Jupyter notebook.
Notebook must be submitted via a GitHub pull request to the instructor's repo (the same way you submit labs).
Presentation must be submitted via Slack (for a PowerPoint file) or shared via a Google Slides link.
Materials must be submitted by 9:00 AM on Friday, June 30.
Starter code

For this project we will be using a Jupyter notebook. This notebook will use matplotlib for plotting and visualizing our data. This type of visualization is handy for prototyping and quick data analysis. We will discuss more advanced data visualizations for disseminating your work.

Open the starter code instructions in a Jupyter notebook.

Dataset

Dataset: SAT Scores
This data, taken from the College Board, gives the mean SAT math(s) and verbal scores, and the participation rate for each state and the District of Columbia for the year 2001.

Suggested Ways to Get Started

Read in your dataset.
Try out a few NumPy commands to describe your data.
Write pseudocode before you write actual code. Thinking through the logic of something helps.
Read the docs for whatever technologies you use. Most of the time, there is a tutorial that you can follow, but not always, and learning to read documentation is crucial to your success!
Document everything.
Useful Resources

How to find the data you need
How to give a good lightning talk
Presentation Structure

5-7 minutes long.
Use PowerPoint or some other visual aid.
Consider the audience. Assume you are presenting to non-technical executives with the College Board (the organization that administers the SATs).
Start with the guiding question/big idea.
Talk about your procedure/methodology (high level, no need to show code unless you found a useful method to share).
Talk about your findings/answers to prompts (include visuals).
Conclude - highlight any next steps, further questions, what you would do with more time, additional data that would be useful.
Be sure to rehearse and time your presentation before class.

Project Feedback + Evaluation

Your instructors will score you using the scale below:

Score | Expectations
----- | ------------
**0** | _Incomplete._
**1** | _Does not meet expectations._
**2** | _Meets expectations, good job!_
**3** | _Exceeds expectations, you wonderful creature, you!_
This will serve as a helpful overall gauge of whether you met the project goals!

STEP 1 STARTER CODE INSTRUCTIONS

Step 1: Open the sat_scores.csv file. Investigate the data, and answer the questions below.

1. What does the data describe?

In [ ]:
## your answer here
2. Does the data look complete? Are there any obvious issues with the observations?

In [ ]:
## your answer here
3. Describe in words what each variable (column) is.

In [ ]:
## your answer here
Step 2: Load the data.

4. Load the data into a list of lists

In [ ]:
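One way to approach item 4 is the standard-library `csv` module. This is a sketch only: it assumes sat_scores.csv is comma-separated with a header row, which the exercise does not confirm, and it reads from an in-memory stand-in so it runs without the file. The helper name `load_rows` is hypothetical.

```python
import csv
import io

def load_rows(file_obj):
    """Read every CSV row (header included) into a list of lists of strings."""
    return [row for row in csv.reader(file_obj)]

# Stand-in for open("sat_scores.csv", newline=""); the comma-separated
# layout is an assumption about the real file.
sample = io.StringIO("State,Rate,Verbal,Math\nCT,82,509,510\nNJ,81,499,513\n")
data = load_rows(sample)
print(data)
```

With the real file, the same helper could be called inside `with open("sat_scores.csv", newline="") as f:`.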

5. Print the data

In [ ]:

6. Extract a list of the labels from the data, and remove them from the data.

In [ ]:

7. Create a list of State names extracted from the data. (Hint: use the list of labels to index on the State column)

In [ ]:
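Items 6 and 7 could be handled with slicing and a list comprehension. The sketch below assumes the data is already a list of lists whose first row holds the labels; the three sample rows are taken from the DATA table.

```python
data = [
    ["State", "Rate", "Verbal", "Math"],
    ["CT", "82", "509", "510"],
    ["NJ", "81", "499", "513"],
    ["MA", "79", "511", "515"],
]

# Item 6: split the label row off from the observations.
labels = data[0]
rows = data[1:]

# Item 7: look up the State column by name instead of hard-coding index 0.
state_idx = labels.index("State")
states = [row[state_idx] for row in rows]
print(states)
```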

8. Print the types of each column

In [ ]:

9. Do any types need to be reassigned? If so, go ahead and do it.

In [ ]:
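Values read from a text file arrive as strings, so the numeric columns usually need converting. A minimal sketch for items 8 and 9, assuming every column except State holds whole numbers (true of the DATA table shown in this exercise):

```python
labels = ["State", "Rate", "Verbal", "Math"]
rows = [["CT", "82", "509", "510"], ["NJ", "81", "499", "513"]]

# Item 8: inspect the type of each column using the first row.
print([type(value).__name__ for value in rows[0]])

# Item 9: convert every column except State to int.
numeric_cols = [i for i, name in enumerate(labels) if name != "State"]
typed_rows = [
    [int(value) if i in numeric_cols else value for i, value in enumerate(row)]
    for row in rows
]
print(typed_rows)
```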

10. Create a dictionary for each column mapping the State to its respective value for that column.

In [ ]:

11. Create a dictionary with the values for each of the numeric columns

In [ ]:
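Items 10 and 11 both fit dictionary comprehensions. This is one possible reading of the two prompts, not the only one: `by_state` maps each column name to a state-to-value dictionary, and `columns` maps each column name to a plain list of values. Both names are hypothetical, and the sample rows come from the DATA table.

```python
labels = ["State", "Rate", "Verbal", "Math"]
rows = [["CT", 82, 509, 510], ["NJ", 81, 499, 513], ["MA", 79, 511, 515]]

# Item 10: column name -> {state: value for that column}.
by_state = {
    name: {row[0]: row[i] for row in rows}
    for i, name in enumerate(labels)
    if name != "State"
}

# Item 11: column name -> list of values for that numeric column.
columns = {
    name: [row[i] for row in rows]
    for i, name in enumerate(labels)
    if name != "State"
}

print(by_state["Rate"])
print(columns["Math"])
```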

Step 3: Describe the data

12. Print the min and max of each column

In [ ]:
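For item 12, the built-in `min` and `max` are enough once the numeric columns are lists of numbers. A short sketch, assuming a `columns` dictionary (a hypothetical name) shaped like the one item 11 asks for:

```python
columns = {
    "Rate": [82, 81, 79],
    "Verbal": [509, 499, 511],
    "Math": [510, 513, 515],
}

# Pair each column name with its (min, max).
extremes = {name: (min(vals), max(vals)) for name, vals in columns.items()}
for name, (low, high) in extremes.items():
    print(f"{name}: min={low}, max={high}")
```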

13. Write a function using only list comprehensions, no loops, to compute Standard Deviation. Print the Standard Deviation of each numeric column.

In [ ]:
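A sketch of item 13 using only a list comprehension. It computes the population standard deviation (dividing by n), which matches the default of `np.std`; if the course expects the sample version, the divisor would be n - 1 instead. The function name `std_dev` is hypothetical.

```python
def std_dev(values):
    """Population standard deviation using a list comprehension, no loops."""
    mean = sum(values) / len(values)
    return (sum([(v - mean) ** 2 for v in values]) / len(values)) ** 0.5

columns = {
    "Rate": [82, 81, 79],
    "Verbal": [509, 499, 511],
    "Math": [510, 513, 515],
}
print({name: std_dev(vals) for name, vals in columns.items()})
```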

Step 4: Visualize the data

14. Using Matplotlib's pyplot module, plot the distribution of the Rate using histograms.

In [ ]:
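A minimal histogram sketch for item 14, assuming Matplotlib is installed. The Rate values are the first ten rows of the DATA table, and the bin count of 5 is an arbitrary choice for illustration. The `Agg` backend line only makes the sketch runnable outside a notebook and can be dropped in Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

# Rate values for the first ten rows of the DATA table (CT through VA).
rates = [82, 81, 79, 77, 72, 71, 71, 69, 69, 68]

fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(rates, bins=5)
ax.set_xlabel("Participation rate (%)")
ax.set_ylabel("Number of states")
ax.set_title("Distribution of Rate")
```

In a notebook, ending the cell with `plt.show()` renders the figure. The same pattern applies to the Math and Verbal columns in items 15 and 16.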

15. Plot the Math(s) distribution

In [ ]:

16. Plot the Verbal distribution

In [ ]:

17. What is the typical assumption for data distribution?

In [ ]:

18. Does that distribution hold true for our data?

In [ ]:
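Alongside the histograms, one rough numeric check for item 18 is skewness: a normal distribution is symmetric, so its skewness is zero. The sketch below computes the population skewness in plain Python. A value near zero does not prove normality, so this complements the plots and does not replace them. The function name `skewness` is hypothetical.

```python
def skewness(values):
    """Population skewness: the mean cubed deviation divided by sd cubed."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)

# Rate values for the first ten rows of the DATA table (CT through VA).
rates = [82, 81, 79, 77, 72, 71, 71, 69, 69, 68]
print("skewness of sample:", skewness(rates))
```

A positive result suggests a longer right tail and a negative result a longer left tail. Comparing the mean with the median gives a similar quick signal.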

19. Plot some scatterplots. BONUS: Use a PyPlot figure to present multiple plots at once.

In [ ]:
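For item 19 and its bonus, `plt.subplots` can place several scatterplots in one figure. This is a sketch under the assumption that Matplotlib is installed; the five sample rows are CT, NJ, MA, NY and NH from the DATA table, and the choice of column pairs is illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

columns = {
    "Rate": [82, 81, 79, 77, 72],
    "Verbal": [509, 499, 511, 495, 520],
    "Math": [510, 513, 515, 505, 516],
}

# One row of three axes, one scatterplot per pair of columns.
pairs = [("Rate", "Verbal"), ("Rate", "Math"), ("Verbal", "Math")]
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, (x_name, y_name) in zip(axes, pairs):
    ax.scatter(columns[x_name], columns[y_name])
    ax.set_xlabel(x_name)
    ax.set_ylabel(y_name)
fig.tight_layout()
```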

20. Are there any interesting relationships to note?

In [ ]:

21. Create box plots for each variable.

In [ ]:
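A box plot sketch for item 21, again assuming Matplotlib is installed. Rate is a percentage while Verbal and Math are scores in the hundreds, so the sketch gives each variable its own axes and avoids a shared scale. The sample rows are CT, NJ, MA, NY and NH from the DATA table.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt

columns = {
    "Rate": [82, 81, 79, 77, 72],
    "Verbal": [509, 499, 511, 495, 520],
    "Math": [510, 513, 515, 505, 516],
}

# One box plot per variable, each on its own axes.
fig, axes = plt.subplots(1, 3, figsize=(9, 4))
for ax, (name, values) in zip(axes, columns.items()):
    ax.boxplot(values)
    ax.set_title(name)
fig.tight_layout()
```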

BONUS: Using Tableau, create a heat map for each variable using a map of the US.

In [ ]:

DATA

State Rate Verbal Math
CT 82 509 510
NJ 81 499 513
MA 79 511 515
NY 77 495 505
NH 72 520 516
RI 71 501 499
PA 71 500 499
VT 69 511 506
ME 69 506 500
VA 68 510 501
DE 67 501 499
MD 65 508 510
NC 65 493 499
GA 63 491 489
IN 60 499 501
SC 57 486 488
DC 56 482 474
OR 55 526 526
FL 54 498 499
WA 53 527 527
TX 53 493 499
HI 52 485 515
AK 51 514 510
CA 51 498 517
AZ 34 523 525
NV 33 509 515
CO 31 539 542
OH 26 534 439
MT 23 539 539
WV 18 527 512
ID 17 543 542
TN 13 562 553
NM 13 551 542
IL 12 576 589
KY 12 550 550
WY 11 547 545
MI 11 561 572
MN 9 580 589
KS 9 577 580
AL 9 559 554
NE 8 562 568
OK 8 567 561
MO 8 577 577
LA 7 564 562
WI 6 584 596
AR 6 562 550
UT 5 575 570
IA 5 593 603
SD 4 577 582
ND 4 592 599
MS 4 566 551
All 45 506 514
