Comparing Software Development Workloads

Estimating the cost of developing software in terms of work load is difficult since it is a challenge to quantify the size and complexity of a software system. The article Analysis of Size Metrics and Effort Performance Criterion in Software Cost Estimation provides an overview of different metrics used to assess size and complexity (Malathi & Sridhar, 2012). The metrics include counts of lines of code, function point counts, and operation counts. Function point counts are often utilized because they can be estimated based on project design specifications.

The dataset pointworkload.csv contains data collected from 104 programming projects at AT&T between 1986 and 1991 (Matson & Huguenard, 2005). This dataset includes the number of work hours for each project, the function point count for each project, and identifiers for the operating system, data management system, and programming language utilized. In this application, you will investigate whether operating system, data management system, and programming language impact the number of work hours per function point for a project.

Open the dataset pointworkload.csv in Excel. Create a new column that calculates the number of work hours per function point for each project. Save the file with this new data column.
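For readers who would rather script this step than do it in Excel, the derived column can be sketched in Python as follows. This is a minimal sketch using only the standard library. The column names `Hours` and `FP` and the two sample rows are assumptions invented for illustration; they are not taken from pointworkload.csv and should be changed to match the header row of the actual file.

```python
import csv
import io


def add_hours_per_fp(rows, hours_col="Hours", fp_col="FP"):
    """Return copies of the row dicts with an added HoursPerFP field.

    hours_col and fp_col are hypothetical column names; adjust them to
    match the real header row.
    """
    out = []
    for row in rows:
        new_row = dict(row)
        new_row["HoursPerFP"] = float(row[hours_col]) / float(row[fp_col])
        out.append(new_row)
    return out


# Invented sample rows, used only to show the calculation.
sample_csv = "Hours,FP,OS\n1000,200,0\n300,150,1\n"
rows = list(csv.DictReader(io.StringIO(sample_csv)))
result = add_hours_per_fp(rows)
print([r["HoursPerFP"] for r in result])  # [5.0, 2.0]
```

To use it on the real file, replace `io.StringIO(sample_csv)` with an opened file handle and write the rows back out with `csv.DictWriter`.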

Next, you will want to look at the distribution of work hours per function point in a frequency diagram. Doing so in Excel requires either binning and counting the data yourself or installing the Analysis ToolPak add-in. However, even with the add-in, simply getting a histogram requires multiple steps. Excel is designed for data presentation, not for significant statistical analysis. It is capable of statistical analysis, but only with add-ins, macros, or programming. Instead of taking these steps, you will now switch to a software tool designed for statistical analysis, SPSS.
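To make "binning and counting the data yourself" concrete, here is a minimal Python sketch of the idea. The bin width and the sample values are assumptions chosen for illustration, not values from the dataset.

```python
import math


def bin_counts(values, bin_width):
    """Count values falling in equal-width bins [k*w, (k+1)*w).

    Returns a dict mapping each non-empty bin's lower edge to its count,
    ordered by lower edge.
    """
    counts = {}
    for v in values:
        k = math.floor(v / bin_width)
        counts[k] = counts.get(k, 0) + 1
    return {k * bin_width: counts[k] for k in sorted(counts)}


# Invented work-hours-per-function-point values, for illustration only.
ratios = [1.5, 2.0, 2.5, 7.0, 12.0, 3.9]
hist = bin_counts(ratios, 2)

# Print a simple text histogram, one row per non-empty bin.
for start, n in hist.items():
    print(f"{start:>4} to {start + 2:<4} {'*' * n}")
```

A value sitting far from the other bins, like the last row here, is the kind of point the later outlier question asks about.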

Go to the Resources section for Unit 4, and download the document IBM_SPSS_Installation_and_Registration_Instructions. This will guide you through the process of installing the statistical analysis platform SPSS which you will utilize for the remainder of this assignment.

  1. Import the file you revised in Excel to include work hours per function point into SPSS (be sure to indicate that variable names are included at the top of your file) and take a screenshot showing your successful installation and import. This screenshot should be pasted into your overall document.
  2. In the top toolbar, select Analyze, Descriptive Statistics, Frequencies. Put the work hours per function point variable you created in the Variable(s) column. Click Charts and select Histogram. Then, click Continue and OK. SPSS will now run the requested analysis. In the Output, scroll down to the histogram and copy-paste it into your overall document. Describe the distribution of the data. Does it appear to be normally distributed? What are the average and standard deviation? Are there any outliers?
    Now, you are ready to determine whether operating system, data management system, or language impacts the work hours per function point. To do this, you will utilize two different statistical tools: the t-test for a difference in means between two independent samples, and the analysis of variance.
  3. There are two different operating systems utilized. A 0 indicates UNIX, and a 1 indicates MVS. The t-test will allow you to assess the null hypothesis that the two operating systems give the same average work load per function point. Select Analyze, Compare Means, Independent-Samples T-Test. Your test variable is work hours per function point. Your grouping variable is OS. You will need to click Define Groups and make Group 1 = 0 (UNIX) and Group 2 = 1 (MVS). With these defined, click Continue and OK to get both the group statistics and the t-test results. Use the group statistics to calculate the t-value. Show all of your work for the calculation. What is the p-value for the hypothesis test, and how does it compare with α = 0.05? Based on this result, draw a conclusion as to whether or not the different operating systems result in a significant difference in work load per function point.
  4. By examining the t-test results from the previous question, you can see that both the t-statistic and the p-value are calculated there. You will be running several tests to determine if programming language impacts work load per function point, and you should draw your data from these tables rather than calculating by hand. Go back to your Independent-Samples T-Test and change the Grouping Variable to Language. Define the groups as 1 (Cobol) and 2 (PLI). Copy the t-test results to your overall document. Repeat this process for groups 1 (Cobol) and 3 (C), groups 1 (Cobol) and 4 (Other), groups 2 (PLI) and 3 (C), groups 2 (PLI) and 4 (Other), and groups 3 (C) and 4 (Other). Copy all six t-test results to your overall document. Based on these results, draw a conclusion as to whether or not the different programming languages result in a significant difference in work load per function point. Be sure to state the different null hypotheses considered and which are rejected and which are not rejected at α = 0.05.
  5. Running six different t-tests certainly answers the question of whether or not programming language affects work load per function point, but it is relatively time-consuming to run and assess each of these results separately. Analysis of variance (ANOVA) allows this multiple-group comparison in a single test. Go to Analyze, Compare Means, One-Way ANOVA. Select work hours per function point as your dependent variable and Language as the factor, then click OK. Copy the ANOVA table to your overall document. Explain what the ANOVA table tells you and what conclusions can be drawn.
  6. ANOVA has the downside that it only tells you whether some group is significantly different from some other group; it does not identify those groups. You can obtain that information by adding a post hoc test to compare means. Go back to the One-Way ANOVA and click on Post Hoc. You will see numerous options. These are all different methods for comparing the groups, and each approaches the comparison differently. You will utilize the Tukey comparison here. Select Tukey, then click Continue and OK. You will see both a comparison table and a table of homogeneous subsets. From this data you should be able to conclude that there is a significant difference between 1 (Cobol) and 2 (PLI). Copy these tables to your overall document and explain how that conclusion may be drawn. How does this compare to your t-test conclusions?
  7. Utilize t-test and/or ANOVA to determine the impact of database management system on work load per function point. The values are 1 (IDMS), 2 (IMS), 3 (INFORMIX), 4 (INGRESS), and 5 (Other). You should present your data, draw conclusions, and explain those conclusions.
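The hand calculation in step 3 and the ANOVA table in step 5 can be cross-checked with a short Python sketch. This is an illustration under stated assumptions, not the course's required method: it uses the pooled-variance (equal variances assumed) form of the t-statistic, which is only one of the two rows SPSS reports, and the sample values are invented rather than drawn from the dataset. P-values are not computed here because they require the t and F distributions, which the standard library does not provide; read those from the SPSS output.

```python
import math
from statistics import mean, stdev


def pooled_t(m1, s1, n1, m2, s2, n2):
    """t statistic and degrees of freedom, equal variances assumed.

    Inputs are the group means, standard deviations, and sizes, i.e. the
    numbers shown in a group statistics table.
    """
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def one_way_f(groups):
    """F statistic with between-group and within-group degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_b = len(groups) - 1
    df_w = len(all_values) - len(groups)
    return (ss_between / df_b) / (ss_within / df_w), df_b, df_w


# Invented values standing in for two groups of work hours per function point.
a = [2.0, 4.0, 6.0]
b = [5.0, 7.0, 9.0]

t, df = pooled_t(mean(a), stdev(a), len(a), mean(b), stdev(b), len(b))
f, df_b, df_w = one_way_f([a, b])
print(t, df)
print(f, df_b, df_w)
```

With exactly two groups, the ANOVA F statistic equals the square of the pooled t-statistic, which is a useful sanity check. With more than two groups, `one_way_f` takes one list per group, for example one list per programming language.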

Malathi, S. & Sridhar, S. (2012). Analysis of size metrics and effort performance criterion in software cost estimation. Indian Journal of Computer Science and Engineering, 3(1), pp. 24-31. Retrieved from http://www.ijcse.com/docs/INDJCSE12-03-01-101.pdf

Matson, J. E. & Huguenard, B. R. (2005). Evaluating aptness of a regression model. Journal of Statistics Education Data Archive. Retrieved from http://www.amstat.org/publications/jse/jse_data_archive.htm
