February Challenge
Problem Statement
Welcome, Data Scientist, to the 6th SDS Club Monthly Challenge! In this month’s challenge you will help hospitals and medical centers determine whether a patient will show up for their appointment. Close to 20% of patients worldwide miss their appointments, which costs medical providers millions of dollars annually. Your mission is to help predict whether a given patient will show up for their appointment.
Evaluation
Understanding the Dataset
Each column in the dataset is labeled and explained in more detail below.
PatientId – the patient’s id
AppointmentId – the appointment’s id
Gender – patient’s gender
ScheduledDay – day the appointment was scheduled (should be before AppointmentDay)
AppointmentDay – day of the scheduled appointment
Age – patient’s age
Neighbourhood – neighbourhood where the appointment will take place
Scholarship – whether the patient is receiving welfare or not
Hypertension – whether the patient has hypertension
Diabetes – whether the patient has diabetes
Alcoholism – whether the patient suffers from alcoholism
Handicap – whether the patient is handicapped
SMS_received – whether the patient was sent a text message notifying them of their appointment
No-show – whether the patient was a no-show (True -> patient didn’t show up, False -> patient showed up)
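As a rough illustration of how these columns might be read, the sketch below parses a few rows, computes the number of days between ScheduledDay and AppointmentDay, and flags rows where the appointment falls before the scheduling date. It rests on several assumptions that the description does not confirm: that the date columns hold ISO-style timestamps, that No-show is stored as the strings "True" and "False", and that the helper names (parse_day, lead_days, is_no_show) are hypothetical. The sample rows are made up and only stand in for public_appointments.csv.

```python
import csv
import io
from datetime import datetime


def parse_day(value):
    """Parse an ISO-style timestamp (assumed format) and keep only the date."""
    return datetime.fromisoformat(value.strip().rstrip("Z")).date()


def lead_days(row):
    """Days from scheduling to the appointment; a negative value breaks the
    'ScheduledDay should be before AppointmentDay' rule."""
    return (parse_day(row["AppointmentDay"]) - parse_day(row["ScheduledDay"])).days


def is_no_show(row):
    """Map the No-show column to a bool (assumes the strings 'True'/'False')."""
    return row["No-show"].strip().lower() == "true"


# Made-up sample rows standing in for public_appointments.csv.
SAMPLE = """PatientId,AppointmentId,ScheduledDay,AppointmentDay,No-show
1,100,2016-04-27T08:00:00Z,2016-04-29T00:00:00Z,False
2,101,2016-04-29T09:30:00Z,2016-04-29T00:00:00Z,True
3,102,2016-05-02T10:00:00Z,2016-04-29T00:00:00Z,False
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
leads = [lead_days(r) for r in rows]
labels = [is_no_show(r) for r in rows]

# Appointments dated before they were scheduled deserve a closer look.
suspicious = [r["AppointmentId"] for r, d in zip(rows, leads) if d < 0]
```

To run this against the real file, replace io.StringIO(SAMPLE) with an opened file handle and adjust the parsing if the actual date or label formats differ.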
Dataset Files
public_appointments.csv – Dataset to train and analyze
pred_appointments.csv – Dataset to predict whether the patient showed up for their appointment
Submission
All submissions should be sent by email to challenges@superdatascience.com. The submitted file should contain the predictions made on the pred_appointments.csv file, and it should have the following format:
0 1 1 0 0 1
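One way to produce a line in this shape is sketched below. It assumes the predictions are expected as 0/1 values separated by single spaces on one line, as the example suggests; the exact separator and which class 1 stands for are not stated here, so confirm both before submitting. The function name format_predictions is hypothetical.

```python
def format_predictions(predictions):
    """Return predictions as one line of space-separated 0/1 values."""
    cleaned = []
    for p in predictions:
        value = int(p)  # also accepts booleans: True -> 1, False -> 0
        if value not in (0, 1):
            raise ValueError(f"prediction must be 0 or 1, got {p!r}")
        cleaned.append(str(value))
    return " ".join(cleaned)


line = format_predictions([0, 1, 1, 0, 0, 1])
print(line)  # prints: 0 1 1 0 0 1
```

The validation step is there so that a stray probability or label string fails loudly instead of ending up in the submitted file.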
Acknowledgements
The data was collected by Joni Hoppen and Aquarela Advanced Analytics.