Program - Neural Networks

For the last programming project of the semester, you will use off-the-shelf neural network software to investigate airline lateness statistics. You are given a data file (.csv format) containing data on delays from various causes at the 29 largest airports in the United States. We are interested in finding out whether the pattern of causes of delays is sufficient to identify the airport.

THE SOFTWARE

The WEKA package is available in all Flarsheim labs (and is also a free download if you want to install it on your own system). It has modules for many types of AI techniques, including Bayesian networks and neural networks, the focus of this assignment. WEKA allows you to build and train a network by specifying the configuration and the data file; there is no need to write your own artificial-neuron or backpropagation code.

THE DATA FILE

The file, airlines.csv, is taken from http://think.cs.vt.edu/corgis/csv/airlines/airlines.html. The data dictionary, describing the format and meaning of each field, is also on that page. In short, the file contains, for each airport, the number of delays:

  • due to the airline;
  • due to late aircraft;
  • due to issues with the aviation system itself (congestion, air traffic control, etc.);
  • due to security concerns; and
  • due to weather.

In addition, it lists the number of flights canceled, delayed, or diverted. For delays, it lists the total minutes of delay for each cause.

The file also contains the number of carriers, total number of flights, and number of on-time flights per airport. The 3-letter airport code is also given; this is the output (dependent) variable.

The data on number of carriers per airport should be screened out of your input data and not used as input for your network. This is because in some cases this is enough to uniquely identify the airport; we do not want our network to bypass the bulk of the data. Likewise, the name of the airport should not be used as input.

INPUT TO YOUR NETWORK

Use the numeric data for number and amount of delays, diversions, etc., for each cause. You will want to normalize this data, either by number of delays or number of flights.

Scaling the data: You may need to adjust the scale of your data (e.g. record delays in hours rather than minutes) so that all inputs are of approximately the same magnitude. If inputs vary over several orders of magnitude (as this data does), the network requires much more training, and we only have so much data. Therefore, adjusting data so that numbers are proportions (floats in [0.0, 1.0)) rather than raw counts can provide more efficient learning from the same data. Another option is to code each variable separately as a z-score, that is, as the number of standard deviations above or below the mean that item is. (Z-scores below the mean are negative, those above the mean positive; thus a z-score of -0.27 means an item is 0.27 standard deviations below the mean for that variable, and a z-score of 1.12 means it is 1.12 standard deviations above the mean.) The advantage of this is that all data items are on the same scale (mean of 0, standard deviation of 1) even if some variables have a characteristic range of 0.01 to 0.10 and others have a range of 1,000 to 100,000.
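The two scaling options can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of WEKA; the function names and the sample numbers are made up for the example, and the z-score here uses the population standard deviation, which is an assumption (the sample standard deviation would work equally well).

```python
import statistics

def to_proportions(counts, total):
    """Divide each raw count by a total (e.g. number of flights)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return [c / total for c in counts]

def z_scores(values):
    """Standardize one variable: (x - mean) / standard deviation."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)  # population standard deviation (assumption)
    if stdev == 0:
        # A constant column has no spread; map every item to 0.
        return [0.0 for _ in values]
    return [(v - mean) / stdev for v in values]

# Invented numbers, for illustration only (not taken from airlines.csv).
weather_delays = [10, 20, 30, 40]
print(z_scores(weather_delays))             # values below the mean come out negative
print(to_proportions([10, 30], total=200))  # counts expressed as fractions of a total
```

Each variable is standardized on its own, so a column of small fractions and a column of large counts both end up with mean 0 and standard deviation 1.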

Exclude from input: Name of airport, month, year, month name, year/month code, airport code.
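If you prepare the file outside WEKA, the screening step might look like the sketch below. The column names (`airport_code`, `carriers_total`, and so on) are hypothetical stand-ins; check the data dictionary for the real headers. The single sample row is invented for illustration.

```python
import csv
import io

# Hypothetical column names; replace with the headers from the data dictionary.
EXCLUDED = {"airport_name", "month", "year", "month_name", "time_label", "carriers_total"}
TARGET = "airport_code"  # output (dependent) variable, never an input

def split_inputs_and_target(rows):
    """Return (inputs, targets), dropping excluded columns from the inputs."""
    inputs, targets = [], []
    for row in rows:
        targets.append(row[TARGET])
        inputs.append({
            name: float(value)
            for name, value in row.items()
            if name != TARGET and name not in EXCLUDED
        })
    return inputs, targets

# Invented sample row standing in for airlines.csv.
sample = io.StringIO(
    "airport_code,airport_name,month,year,carriers_total,weather_delays,flights_total\n"
    "ATL,Atlanta,6,2003,11,328,30060\n"
)
inputs, targets = split_inputs_and_target(csv.DictReader(sample))
print(inputs, targets)
```

The carrier count is dropped along with the identifying fields, so only the delay and flight statistics remain as inputs.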

OUTPUT FROM YOUR NETWORK

Your network should have 29 output neurons, one for each airport. Take the airport whose output neuron has the maximum value as the network's response.
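Picking the response is an argmax over the output neurons. A minimal sketch, assuming the outputs arrive as a list in the same order as a list of airport codes (the function name and the three-airport example are hypothetical, shortened from 29 for readability):

```python
def predict_airport(outputs, airport_codes):
    """Return the code whose output neuron has the largest activation."""
    if len(outputs) != len(airport_codes):
        raise ValueError("need exactly one output per airport")
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return airport_codes[best]

# Shortened example with three airports instead of 29.
print(predict_airport([0.1, 0.7, 0.2], ["ATL", "BOS", "ORD"]))
```

WEKA does this selection for you when the class attribute is nominal; the sketch only shows what the rule means.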

NEURAL NETWORK CONFIGURATION

This is your playground! The general approach is to start with the input neurons and a single neuron in the hidden layer. Randomly select a subset of the data (say, 5% of it) to withhold for testing (WEKA can do this automatically) and train the network, then test it on the withheld data. At first, it'll probably be terrible. Then add a second neuron to the hidden layer, select a new subset of the data, retrain the network from scratch, and check results. Continue adding neurons to the hidden layer until the network can consistently predict all withheld data, or until adding more neurons leads to a decrease in performance on the test set.
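The procedure above can be sketched as a loop. This is an outline under stated assumptions, not WEKA code: `train_and_score` is a hypothetical callable that builds a fresh network with the given number of hidden neurons, trains it on the training rows, and returns accuracy on the withheld rows. The `patience` parameter (how many non-improving sizes to tolerate before stopping) is also an assumption added to make "a decrease in performance" concrete.

```python
import random

def holdout_split(rows, test_fraction=0.05, seed=None):
    """Randomly withhold a fraction of the rows for testing."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

def grow_hidden_layer(rows, train_and_score, max_hidden=50, patience=3):
    """Add hidden neurons one at a time until accuracy stops improving."""
    best_size, best_acc, since_best = None, -1.0, 0
    for hidden in range(1, max_hidden + 1):
        # New random subset and a freshly initialized network each round.
        train, test = holdout_split(rows, seed=hidden)
        acc = train_and_score(hidden, train, test)
        if acc > best_acc:
            best_size, best_acc, since_best = hidden, acc, 0
        else:
            since_best += 1
        if best_acc >= 1.0 or since_best >= patience:
            break
    return best_size, best_acc

# Stand-in scorer for illustration: pretends accuracy rises with size.
fake_scorer = lambda hidden, train, test: min(1.0, hidden / 4)
print(grow_hidden_layer(list(range(100)), fake_scorer))
```

Because each round draws a different test subset, small differences in accuracy between sizes are noisy; repeating each size a few times before comparing is a reasonable refinement.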

That uses one hidden layer. Each hidden neuron computes a weighted (linear) combination of the inputs, typically passed through a nonlinear activation function such as a sigmoid, and each output neuron does the same with the values of the hidden neurons. You can have multiple layers of hidden neurons. It's probably best not to get too carried away; this data probably won't support more than 2 hidden layers. (The more hidden layers, the more training data needed.) And there's no requirement of multiple hidden layers; in general, a network should be as complex as needed to perform well, and no more.
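For intuition only, here is a minimal sketch of the forward pass through one hidden layer. It assumes a sigmoid activation on every neuron, which is an assumption of this example; the weights and the tiny layer sizes are invented, and in the assignment WEKA performs this computation (and the training) for you.

```python
import math

def sigmoid(x):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of inputs plus a bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Inputs -> hidden layer -> output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)

# Invented weights: 2 inputs, 1 hidden neuron, 2 output neurons.
outputs = forward([0.0, 0.0], [[1.0, 1.0]], [0.0], [[1.0], [-1.0]], [0.0, 0.0])
print(outputs)
```

Adding a hidden neuron means adding one row to `hidden_w`, one entry to `hidden_b`, and one weight to each row of `out_w`, which is why the network must be retrained from scratch after each change.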

So try some different configurations if you like. The key point is that anytime the network configuration is changed, the entire network must be re-initialized and trained from the beginning, particularly if different data items are selected for testing. (Otherwise the network is partly trained on test data, which invalidates any test results.)

Submit a short report describing the final network configuration and the WEKA settings needed to produce it. The report should also describe how you designed your network, how it was tested and validated, and how well it was able to classify the results.

