Program - Neural Networks

For the last programming project of the semester, you will use off-the-shelf neural network software to investigate airline lateness statistics. You are given a data file (.csv format) containing data on delays from various causes from the 29 largest airports in the United States. We are interested in finding out if the pattern of causes of delays is sufficient to identify the airport.

THE SOFTWARE
The WEKA package is available in all Flarsheim labs (and is also a free download if you want to install it on your own system). It has modules for many kinds of AI techniques, including Bayesian networks and neural networks, the focus of this assignment. WEKA allows you to build and train a network by specifying the configuration and the data file; there is no need to write your own artificial-neuron or backpropagation code.

THE DATA FILE
The file, airlines.csv, is taken from http://think.cs.vt.edu/corgis/csv/airlines/airlines.html. The data dictionary, describing the format and meaning of each field, is also on that page. In short, the file contains, for each airport, the number of delays:
- due to the airline;
- due to late aircraft;
- due to issues with the aviation system itself (congestion, air traffic control, etc.);
- due to security concerns; and
- due to weather.
In addition, it lists the number of flights canceled, delayed, or diverted. For delays, it also lists the total minutes of delay for each cause.

The file also contains the number of carriers, total number of flights, and number of on-time flights per airport. The 3-letter airport code is also given; this is the output (dependent) variable.

The data on number of carriers per airport should be screened out of your input data and not used as input for your network. This is because in some cases this is enough to uniquely identify the airport; we do not want our network to bypass the bulk of the data. Likewise, the name of the airport should not be used as input.

INPUT TO YOUR NETWORK
Use the numeric data for number and amount of delays, diversions, etc., for each cause. You will want to normalize this data, either by number of delays or number of flights.
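As a rough sketch of the normalization step, the snippet below divides each raw count by the total number of flights for that row. It is written in Python purely for illustration (the same arithmetic can be done in a spreadsheet), and the column names (`flights_total`, `delays_weather`, `delays_carrier`) are hypothetical stand-ins; take the real headers from the data dictionary.

```python
def to_proportions(row, count_fields, total_field="flights_total"):
    """Divide each listed count in `row` by that row's total number of flights.

    `row` maps column names to values as read from the CSV (strings or numbers).
    All column names here are hypothetical, not the real airlines.csv headers.
    """
    total = float(row[total_field])
    if total <= 0:
        raise ValueError("total number of flights must be positive")
    return {name: float(row[name]) / total for name in count_fields}


# One made-up record: 2,000 flights, 50 weather delays, 300 carrier delays.
row = {"flights_total": "2000", "delays_weather": "50", "delays_carrier": "300"}
props = to_proportions(row, ["delays_weather", "delays_carrier"])
print(props)
```

Normalizing by number of delays instead would only require passing a different `total_field`.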

Scaling the data: You may need to adjust the scale of your data (e.g. record delays in hours rather than minutes) so that all inputs are of approximately the same magnitude. If inputs vary over several orders of magnitude (as this data does), the network requires much more training, and we only have so much data. Therefore, adjusting the data so that numbers are proportions (floats in [0.0, 1.0)) rather than raw counts can provide more efficient learning from the same data. Another option is to code each variable separately as a z-score: the number of standard deviations above or below the mean that item is. (z-scores below the mean are negative and those above the mean are positive; thus a z-score of -0.27 means an item is 0.27 standard deviations below the average for that variable, and a z-score of 1.12 means it is 1.12 standard deviations above the mean.) The advantage of this is that all data items are on the same scale (mean of 0, standard deviation of 1) even if some variables have a characteristic range of 0.01-0.10 and others have a range of 1,000-100,000.
Exclude from input: Name of airport, month, year, month name, year/month code, airport code.
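The z-score coding and the exclusion list above could be sketched as follows. This is only an illustration in Python: the column names in `EXCLUDED` are hypothetical placeholders for the real headers in the data dictionary, and the use of the population standard deviation (rather than the sample one) is an assumption; either is defensible as long as it is applied consistently.

```python
import statistics

# Hypothetical names for the columns that must not be fed to the network:
# airport name and code, the date fields, and the number of carriers.
EXCLUDED = {
    "airport_name", "airport_code", "month", "month_name",
    "year", "year_month_code", "carriers_total",
}


def feature_columns(header):
    """Return the column names that remain after removing excluded ones."""
    return [name for name in header if name not in EXCLUDED]


def z_scores(values):
    """Code one variable as z-scores: (value - mean) / standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation (assumption)
    if sd == 0:
        # A constant column carries no information; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


header = ["airport_code", "year", "carriers_total", "delays_weather", "flights_total"]
print(feature_columns(header))

# Made-up values for one variable: mean is 5 and standard deviation is 2.
scores = z_scores([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(scores)
```

Each input variable is scored separately, so `z_scores` would be called once per feature column.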

OUTPUT FROM YOUR NETWORK:

Your network should have 29 output neurons, 1 for each airport. Take the airport whose output neuron has the largest value as the network's response.
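Picking the response from the output layer amounts to an argmax over the output neurons. A minimal sketch in Python, shown with three illustrative airport codes instead of the full 29 (the function name and the example values are made up for this illustration):

```python
def predict_airport(outputs, airport_codes):
    """Return the airport code whose output neuron has the largest value."""
    if len(outputs) != len(airport_codes):
        raise ValueError("need exactly one output value per airport code")
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return airport_codes[best]


codes = ["ATL", "BOS", "ORD"]   # illustrative subset of the airport codes
outputs = [0.10, 0.70, 0.20]    # made-up activations of the output neurons
answer = predict_airport(outputs, codes)
print(answer)  # prints BOS
```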

NEURAL NETWORK CONFIGURATION

This is your playground! The general approach is to start with the input neurons and a single neuron in the hidden layer. Randomly select a subset of the data (say, 5% of it) to withhold for testing (WEKA can do this automatically) and train the network, then test it on the withheld data. At first, it'll probably be terrible. Then add a second neuron to the hidden layer, select a new subset of the data, retrain the network from scratch, and check the results. Continue adding neurons to the hidden layer until the network can consistently predict all withheld data, or until adding more neurons leads to a decrease in performance on the test set.
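WEKA handles the split and the training for you, but the procedure described above can be sketched in Python to make the control flow explicit. Everything here is an assumption made for illustration: `train_and_score` is a hypothetical callback that builds a brand-new network with the given number of hidden neurons, trains it on the training rows, and returns its accuracy on the withheld rows; `patience` (how many non-improving sizes to tolerate before stopping) is a choice of this sketch, not part of the assignment.

```python
import random


def holdout_split(rows, test_fraction=0.05, seed=None):
    """Randomly withhold a fraction of the rows for testing."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_test = max(1, round(len(rows) * test_fraction))
    test_idx = set(indices[:n_test])
    train = [r for i, r in enumerate(rows) if i not in test_idx]
    test = [r for i, r in enumerate(rows) if i in test_idx]
    return train, test


def grow_hidden_layer(rows, train_and_score, max_neurons=20, patience=2):
    """Add hidden neurons one at a time, retraining from scratch each time."""
    best_size, best_acc, worse = 0, -1.0, 0
    for n_hidden in range(1, max_neurons + 1):
        # New withheld subset for every configuration.
        train, test = holdout_split(rows, seed=n_hidden)
        # The callback must start from a freshly initialized network.
        acc = train_and_score(n_hidden, train, test)
        if acc > best_acc:
            best_size, best_acc, worse = n_hidden, acc, 0
        else:
            worse += 1
        # Stop on perfect test accuracy or after repeated declines.
        if acc >= 1.0 or worse >= patience:
            break
    return best_size, best_acc


# Stand-in scorer with made-up accuracies, used only to exercise the loop.
fake_scores = {1: 0.2, 2: 0.5, 3: 0.8, 4: 0.7, 5: 0.6}


def fake_train_and_score(n_hidden, train, test):
    return fake_scores.get(n_hidden, 0.0)


result = grow_hidden_layer(list(range(200)), fake_train_and_score)
print(result)  # prints (3, 0.8)
```

Because each size is scored on a different random subset, a single dip in accuracy may be noise; that is the reason for tolerating more than one decline before stopping.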

That uses one hidden layer. Each hidden neuron computes a weighted sum of the inputs and passes it through a nonlinear activation function; each output neuron does the same with the values of the hidden neurons. You can have multiple layers of hidden neurons. It's probably best not to get too carried away; this data probably won't support more than 2 hidden layers. (The more hidden layers, the more training data needed.) And there's no requirement of multiple hidden layers; in general, a network should be as complex as needed to perform well, and no more.
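To make the one-hidden-layer computation concrete, here is a minimal forward pass in Python. It is a sketch, not WEKA's implementation: the sigmoid activation on both layers is an assumption, and the weights below are made-up numbers rather than trained values.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def layer(inputs, weights, biases):
    """One layer: weighted sum of the inputs plus a bias, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Inputs -> hidden layer -> output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)


# Two inputs, two hidden neurons, three output neurons (made-up weights).
hidden_w = [[0.5, -0.3], [0.1, 0.8]]
hidden_b = [0.0, -0.1]
out_w = [[1.0, -1.0], [0.2, 0.4], [-0.6, 0.9]]
out_b = [0.0, 0.1, -0.2]
outputs = forward([0.25, -1.2], hidden_w, hidden_b, out_w, out_b)
print(outputs)
```

For this assignment the output layer would have 29 neurons, one per airport, and a second hidden layer would simply be one more call to `layer` between the two shown.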

So try some different configurations if you like. The key point is that anytime the network configuration is changed, the entire network must be re-initialized and trained from the beginning, particularly if different data items are selected for testing. (Otherwise the network is partly trained on test data, which invalidates any test results.)
