September challenge


Welcome, Data Scientist!

You have recently been hired by the US Department of Transportation (DOT) to analyze data from multiple airline carriers in the United States. The DOT wants to help airline carriers reduce the number of flight cancellations and improve travelers’ experiences. Your job is to help the DOT predict whether or not a flight will be canceled based on the data provided.

The challenge is yours, if you wish to accept it!


Evaluation

$$
\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
$$
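
As a rough illustration of the accuracy formula above, the sketch below counts true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) from two lists of labels. It assumes that 1 means "cancelled" and 0 means "not cancelled"; that encoding, the function name `accuracy`, and the sample labels are illustrative assumptions, not part of the challenge specification.

```python
def accuracy(y_true, y_pred):
    """Return (TP + TN) / (TP + TN + FP + FN) for binary labels.

    Assumption: labels are encoded as 1 (cancelled) and 0 (not cancelled).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cancelled, predicted cancelled
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not cancelled, predicted not cancelled
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not cancelled, predicted cancelled
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cancelled, predicted not cancelled
    return (tp + tn) / (tp + tn + fp + fn)


# Made-up labels, for illustration only.
true_labels = [0, 0, 1, 1, 0]
predicted_labels = [0, 1, 1, 0, 0]
print(accuracy(true_labels, predicted_labels))  # 3 correct out of 5
```

Here three of the five predictions match the true labels (TP = 1, TN = 2, FP = 1, FN = 1), so the score is 3/5.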

Understanding the Dataset

Each column in the dataset is labeled and explained in more detail below.

YEAR: Year in which the flight was scheduled to take place
MONTH: Month in which the flight was scheduled to take place
DAY: Day of the month the flight was scheduled to take place
DAY_OF_WEEK: Day of the week the flight was scheduled to take place
AIRLINE: Initials of the airline that was scheduled to carry out the flight
FLIGHT_NUMBER: Number identifying the flight
TAIL_NUMBER: Tail Number of the plane that was scheduled to carry out the flight
ORIGIN_AIRPORT: Location of the airport that the flight was scheduled to depart from
DESTINATION_AIRPORT: Location of the airport that the flight was scheduled to arrive at
SCHEDULED_DEPARTURE: Scheduled departure time of the flight
SCHEDULED_TIME: Amount of time the flight was scheduled to take
DISTANCE: Distance between ORIGIN_AIRPORT and DESTINATION_AIRPORT
SCHEDULED_ARRIVAL: Flight’s scheduled time of arrival
CANCELLED: Flight’s cancellation status

Dataset Files

public_flights.csv – Dataset to train and analyze
pred_flights.csv – Dataset to predict flights’ cancellation status
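
Before modeling, it can help to check how often flights are cancelled in the training file, since accuracy can look high on imbalanced data simply by always predicting the more common outcome. The sketch below is one possible way to do that with Python's standard `csv` module. It assumes the file has a header row with a `CANCELLED` column whose values are `0` or `1`; that encoding, the helper name `cancellation_rate`, and the sample rows are assumptions made for illustration.

```python
import csv
import io


def cancellation_rate(csv_file):
    """Return the fraction of rows whose CANCELLED value is 1.

    Assumption: CANCELLED is stored as "0"/"1" (or "0.0"/"1.0").
    """
    reader = csv.DictReader(csv_file)
    total = 0
    cancelled = 0
    for row in reader:
        total += 1
        if row["CANCELLED"].strip() in ("1", "1.0"):
            cancelled += 1
    if total == 0:
        raise ValueError("the file contains no data rows")
    return cancelled / total


# Made-up rows standing in for public_flights.csv, for illustration only.
sample = io.StringIO(
    "AIRLINE,FLIGHT_NUMBER,CANCELLED\n"
    "AA,100,0\n"
    "DL,200,1\n"
    "UA,300,0\n"
    "WN,400,0\n"
)
rate = cancellation_rate(sample)
print(rate)  # one cancelled row out of four
```

With the real data, the same function could be called as `cancellation_rate(open("public_flights.csv", newline=""))`, ideally inside a `with` block so the file is closed afterwards.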

Submission Format

All submissions should be sent through email to challenges@superdatascience.com. The file should contain predictions made on the pred_flights.csv file, and it should have the following format:

[Image: example of the required submission file format]

Acknowledgments

The flight cancellation data was collected and published by the DOT’s Bureau of Transportation Statistics.