Program - Neural Networks

For the last programming project of the semester, you will use off-the-shelf neural network software to investigate airline lateness statistics. You are given a data file (.csv format) containing data on delays from various causes at the 29 largest airports in the United States. We are interested in finding out whether the pattern of causes of delays is sufficient to identify the airport.

THE SOFTWARE
The WEKA package is available in all Flarsheim labs (and is also a free download if you want to install it on your own system). It has modules for many types of AI methods, including Bayesian networks and neural networks, the focus of this assignment. WEKA allows you to build and train a network by specifying the configuration and the data file; there is no need to write your own artificial-neuron or backpropagation code.

THE DATA FILE
The file, airlines.csv, is taken from http://think.cs.vt.edu/corgis/csv/airlines/airlines.html. The data dictionary, describing the format and meaning of each field, is also on that page. In short, the file contains, for each airport, the number of delays:
- due to the airline;
- due to late aircraft;
- due to issues with the aviation system itself (congestion, air traffic control, etc.);
- due to security concerns; and
- due to weather.
In addition, it lists the number of flights canceled, delayed, or diverted. For delays, it lists the total minutes of delay for each cause.

The file also contains the number of carriers, total number of flights, and number of on-time flights per airport. The 3-letter airport code is also given; this is the output (dependent) variable.

The data on number of carriers per airport should be screened out of your input data and not used as input for your network. This is because in some cases this is enough to uniquely identify the airport; we do not want our network to bypass the bulk of the data. Likewise, the name of the airport should not be used as input.

INPUT TO YOUR NETWORK
Use the numeric data for number and amount of delays, diversions, etc., for each cause. You will want to normalize this data, either by number of delays or number of flights.

Scaling the data: You may need to adjust the scale of your data (e.g. record delays in hours rather than minutes) so that all inputs are of approximately the same magnitude. If inputs vary over several orders of magnitude (as this data does), the network requires much more training, and we only have so much data. Therefore, converting raw counts to proportions (floats between 0.0 and 1.0) can provide more efficient learning from the same data. Another option is to code each variable separately as a z-score: the number of standard deviations that item lies above or below the mean for its variable. (Z-scores below the mean are negative and those above the mean are positive; thus a z-score of -0.27 means an item is 0.27 standard deviations below the average for that variable, and a z-score of 1.12 means it is 1.12 standard deviations above.) The advantage is that all data items end up on the same scale (mean of 0, standard deviation of 1), even if some variables have a characteristic range of 0.01-0.10 and others a range of 1,000-100,000.
Exclude from input: Name of airport, month, year, month name, year/month code, airport code.
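
The two scaling options described above (proportions and z-scores) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample numbers are made up for the example and are not taken from airlines.csv, and WEKA's own preprocessing filters may be a more convenient way to do the same thing.

```python
from statistics import mean, pstdev


def to_proportions(counts, total):
    """Divide each raw count by a total (e.g. total flights) to get a proportion."""
    if total == 0:
        return [0.0 for _ in counts]
    return [c / total for c in counts]


def z_scores(values):
    """Express each value as standard deviations above or below the column mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the column
    if sigma == 0:
        # A constant column carries no information; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical delay counts for one row of the file, and its total flights.
delay_counts = [120, 300, 450, 5, 25]
print(to_proportions(delay_counts, 9000))

# Hypothetical column of delay minutes taken across several rows.
delay_minutes = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
print(z_scores(delay_minutes))
```

Note that proportions are computed within a row (each count divided by that row's total), while z-scores are computed down a column (each value compared with the mean of that variable over all rows).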

OUTPUT FROM YOUR NETWORK

Your network should have 29 output neurons, 1 for each airport. Take the airport whose output neuron has the largest value as the network's response.
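
Picking the response from the output neurons amounts to an argmax over their values. The sketch below shows the idea with only three outputs for brevity; the function name and the activation values are hypothetical, and with WEKA you would normally read the predicted class from its output rather than compute this yourself.

```python
def predict_airport(outputs, airport_codes):
    """Return the airport code whose output neuron has the largest value."""
    if len(outputs) != len(airport_codes):
        raise ValueError("need exactly one output value per airport code")
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return airport_codes[best]


# Illustrative subset of codes and made-up output activations;
# the real network would have 29 of each.
codes = ["ATL", "ORD", "DFW"]
activations = [0.10, 0.70, 0.20]
print(predict_airport(activations, codes))
```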

NEURAL NETWORK CONFIGURATION

This is your playground! The general approach is to start with the input neurons and a single neuron in the hidden layer. Randomly select a subset of the data (say, 5% of it) to withhold for testing (WEKA can do this automatically), train the network on the rest, then test it on the withheld data. At first, it'll probably be terrible. Then add a second neuron to the hidden layer, select a new subset of the data, retrain the network from scratch, and check the results. Continue adding neurons to the hidden layer until the network consistently predicts all withheld data, or until adding more neurons decreases performance on the test set.
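
For anyone who wants to see what withholding a random 5% involves, here is a minimal sketch of a hold-out split. The function name, the `seed` parameter, and the stand-in rows are assumptions made for the example; WEKA performs an equivalent split for you, so this is for understanding rather than something the assignment requires.

```python
import random


def holdout_split(rows, test_fraction=0.05, seed=None):
    """Randomly withhold a fraction of the rows for testing.

    Returns (train_rows, test_rows); every row lands in exactly one of them.
    """
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_test = max(1, round(len(rows) * test_fraction))
    test_idx = set(indices[:n_test])
    train = [row for i, row in enumerate(rows) if i not in test_idx]
    test = [row for i, row in enumerate(rows) if i in test_idx]
    return train, test


# Stand-in data: 100 integers playing the role of rows from the file.
rows = list(range(100))
train, test = holdout_split(rows, test_fraction=0.05, seed=1)
print(len(train), len(test))
```

Calling the function again with a different seed gives a new withheld subset, which is what each step of the procedure above asks for before retraining from scratch.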

That uses one hidden layer. Each hidden neuron computes a weighted combination of the inputs and passes it through an activation function, and each output neuron does the same with the values of the hidden neurons. You can have multiple layers of hidden neurons. It's probably best not to get too carried away; this data probably won't support more than 2 hidden layers. (The more hidden layers, the more training data needed.) And there's no requirement of multiple hidden layers; in general, a network should be as complex as needed to perform well, and no more.
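
To make the single-hidden-layer structure concrete, the forward pass can be sketched as below. This assumes each neuron applies a sigmoid to its weighted sum, which is an assumption made for illustration (check the activation your WEKA configuration actually uses); the weights shown are arbitrary, and WEKA handles all of this internally.

```python
import math


def sigmoid(x):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """Compute one layer; weights[j][i] connects input i to neuron j."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Inputs -> one hidden layer -> output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)


# Arbitrary example: 2 inputs, 1 hidden neuron, 3 output neurons.
outputs = forward(
    inputs=[0.2, 0.8],
    hidden_w=[[0.5, -0.3]],
    hidden_b=[0.1],
    output_w=[[1.0], [-1.0], [0.5]],
    output_b=[0.0, 0.0, 0.0],
)
print(outputs)
```

Adding a neuron to the hidden layer corresponds to adding one row to `hidden_w`, one entry to `hidden_b`, and one extra weight to each row of `output_w`.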

So try some different configurations if you like. The key point is that anytime the network configuration is changed, the entire network must be re-initialized and trained from the beginning, particularly if different data items are selected for testing. (Otherwise the network is partly trained on test data, which invalidates any test results.)
