Ask Python Expert

Assignment

In this exercise you will create some data "assets" for use by the XYZ company in direct marketing campaigns. You will download data from the SSCC server to your local computer, create relational database tables that you'll store locally, create a "flat" file with selected customers and variables, and report on purchasing by gender. You'll document your work by providing your commented code. You'll save the new assets you created for future use and for sharing with others.

To do this assignment you'll be using the SSCC and a Postgres server running on it, Python and the pandas package, and the sqlite database. You'll use Python and pandas for data manipulation and reporting.

Getting Your Data from the SSCC

XYZ's data are in a Postgres DB server on the SSCC. To get them, log in to the dornick server. You'll need to do this using a VPN client if you are off campus. Then, connect to the Postgres database server.

The Working with the SSCC pdf document and the SSCC Cheat Sheet pdf on Canvas provide some information about how to connect to the SSCC and to get your data from the Postgres DB.

You'll find three (3) tables in the Postgres DB pilot schema that are named item, mail, and customer. Export each table as a csv file with a header record, using a temporary view to do each one. Using psql is the easiest way to do this. After you are done, delete each temporary view.

Download your three csv files to your computer so that you can work with the data in them using Python. Don't forget to log off from dornick.

You'll find documentation about XYZ's data (attached). Note that like most real world documentation, it's not "perfect." But it is the real thing.

Do These Things

Once you have your csv files on your computer, do the following five (5) things, most of which have "subthings:"

#1) Import each of the csv files you downloaded from the SSCC into a pandas DataFrame.

(a) Provide the code you used to do this.
(b) Print out the column names of your item DataFrame and the first four (4) records in it.
(c) Decribe the data types of the columns in the DataFrame.

Include your commented code for each of the above.

#2) Write each of you pandas DataFrames to a local SQLite DB named xyz.db. Include only data for active buyers in these tables. Verify that you have written the tables to your SQLite DB correctly.

(Commented code, of course.)

#3) Now, using the same data as you used for 2, above, create a new table called custSum that you also write to xyz.db, and that has the following characteristics. This table should have one row per customer record.

(a) Include on each customer's record a binary, Y or N, indicator of whether the customer is a 'heavy buyer,' where the definition of a 'heavy buyer' is a customer whose YTD purchasing in 2009 is greater than 90% of the 2009 YTD purchasing of all customers who are active buyers. Verify your coding of this new variable by crosstabulating it with an indicator of whether their 2009 YTD purchasing is at at least the 90th percentile of all 2009 YTD purchasing.

(b) Add to each customer's record whether the customer has the following credit cards: AMEX, Discover, VISA, and Mastercard, with each credit card variable codes as a Y or a N for yes or no, respectively. Document your creation of these codes by showing how they are related to the code values in the data

(c) Add to each customer's record their estimated HH income, and the genders of adults "1" and "2."

(d) Add to each customer's record their ZIP code and ZIP+4 code.

(e) Be sure to include the account number on each record in the SQL tables you create so that the tables can be related to each other, later.

(f) Provide a count of the number of records in each table.

(g) Verify that you have written this table to your SQLite DB correctly.

(Don't forget to comment your code so that a reader can understand what it is supposed to do.)

#4) Create a new pandas DataFrame of data that will be used for target maketing and write it out to a headered csv file.

(a) This DataFrame should have one row per customer. The customers included should be active buyers or lapsed buyers.

(b) The row for each customer should include the customer's account identifier, and an indicator variable (Y/N, or 1/0) for each product category the customer has made at least one purchase in.

(c) Include for each customer their buyer status, and the total dollar amount of the purchases they have made from XYZ using all data available for him or her.

(d) Write your DataFrame to a csv file, and also store it in a shelve database.

(e) Verify that the files you wrote your customer DataFrame to were written correctly.

(Commented code, of course.)

#5) Report the six (6) most frequently purchased product categories by the gender of "adult 1" using the data for the active customers. Include for these categories the total spend in dollars on each category, the total number of products purchased in these categories, and the number of adults in each gender category.

(Be sure to comment your code.)

Your Deliverables

Provide the above in up to six pages, but in no more than 7 pages, in a pdf file. Be sure that everything is readable. Address each of the five above parts in turn. Do 1 by providing your commented code and results. Then do 2 providing code + results, and so on.

Do not provide a list of code for all of the above items in a block, followed by the results of your code in a block. An assignment organized in this way will be returned ungraded. Be sure all of your code is syntactically correct, and that it approximates good Python coding style.

Python, Programming

  • Category:- Python
  • Reference No.:- M92373234
  • Price:- $55

Priced at Now at $55, Verified Solution

Have any Question?


Related Questions in Python

Part i the assignment filesone of the most important

Part I: The Assignment Files One of the most important outcomes of this assignment is that you understand the importance of testing. This assignment will follow an iterative development cycle. That means you will write a ...

Homework -this homework will have both a short written and

Homework - This homework will have, both a short written and coding assignment. The problems that are supposed to be written are clearly marked. 1) (Written) Make heuristics Describe two heuristics for the slide problem ...

Tasksdemonstrate data scraping of a social network of

Tasks Demonstrate data scraping of a social network of choice. Develop technical documentation, including the development of the code & detailing the results. Provide a report on the findings, that includes research into ...

Assignment1 utilising python 3 build the following

Assignment 1. Utilising Python 3 Build the following regression models: - Decision Tree - Gradient Boosted Tree - Linear regression 2. Select a dataset (other than the example dataset given in section 3) and apply the De ...

Python programming assignment -you first need an abstract

Python Programming Assignment - You first need an abstract base class, called, Account which has the following attributes and methods: accountID: This attribute holds the ID assigned the account , if not provided set to ...

Learning outcomes lo3 - research develop and document a

Learning Outcomes LO3 - Research, develop, and document a basic security policy, and analyse, record, and resolve all security incidents LO4 - Identify and assess the threats to, and vulnerabilities of networks Assessmen ...

Question research pythons dictionary data type dictdiscuss

Question : Research Python's dictionary data type (dict). Discuss its interface and usage. Include examples. Discuss practical applications of dictionaries.

Questionwhat is a python development frameworkgive 3

Question What is a python development framework? Give 3 examples python development framework used today. and explain which development framework is used in which industry.

Below zero - ice cream storethe local ice-cream store needs

Below Zero - ice cream store The local ice-cream store needs a new ordering system to improve customer service by streamlining the ordering process. The manager of the store has found that many orders are incorrect and s ...

The second task in this assignment is to create a python

The second task in this assignment is to create a Python program called pancakes.py that will determine the final order of a stack of pancakes after a series of flips.(PYTHON 3) Problem Task In this problem, your input w ...

  • 4,153,160 Questions Asked
  • 13,132 Experts
  • 2,558,936 Questions Answered

Ask Experts for help!!

Looking for Assignment Help?

Start excelling in your Courses, Get help with Assignment

Write us your full requirement for evaluation and you will receive response within 20 minutes turnaround time.

Ask Now Help with Problems, Get a Best Answer

Why might a bank avoid the use of interest rate swaps even

Why might a bank avoid the use of interest rate swaps, even when the institution is exposed to significant interest rate

Describe the difference between zero coupon bonds and

Describe the difference between zero coupon bonds and coupon bonds. Under what conditions will a coupon bond sell at a p

Compute the present value of an annuity of 880 per year

Compute the present value of an annuity of $ 880 per year for 16 years, given a discount rate of 6 percent per annum. As

Compute the present value of an 1150 payment made in ten

Compute the present value of an $1,150 payment made in ten years when the discount rate is 12 percent. (Do not round int

Compute the present value of an annuity of 699 per year

Compute the present value of an annuity of $ 699 per year for 19 years, given a discount rate of 6 percent per annum. As