October Challenge

Welcome, Data Scientist, to the 2nd SDS Club Monthly Challenge! This month, you have been hired by a new car trading company to help sell its used cars. You will be analyzing used cars from multiple manufacturers and of different models. Your job is to help the company determine the price of its used cars.


MSE = {\frac{1}{n}\sum_{i=1}^{n}(y_{i} - \hat{y}_{i})^{2}}
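The formula above can be computed directly. The sketch below assumes that y_i is the actual price of car i, that \hat{y}_i is the predicted price, and that n is the number of cars; the text does not define these symbols, so treat that reading as an assumption. The function name mean_squared_error and the sample prices are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Return the mean of the squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one value is required")
    n = len(y_true)
    # Sum of (y_i - y_hat_i)^2 over all i, divided by n.
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


# Made-up prices in USD, for illustration only (not taken from the dataset).
actual = [10000.0, 5000.0, 7500.0]
predicted = [9000.0, 5500.0, 7500.0]
print(mean_squared_error(actual, predicted))
```

Because the errors are squared, a single badly mispriced car contributes far more to the score than several small misses.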

Understanding the Dataset

Each column in the dataset is labeled and explained in more detail below.

manufacturer_name: the name of the car manufacturer
model_name: the name of the car model
transmission: the type of transmission the car has
color: the body color of the car
odometer_value: odometer reading in kilometers
year_produced: the year the car was produced
engine_fuel: the fuel type of the engine of the car
engine_has_gas: whether or not the car has a propane tank with tubing
engine_type: the engine type of the car
engine_capacity: capacity of the engine in liters
body_type: the type of body the car has
has_warranty: whether the car has a warranty
state: the state of the car (new, owned, etc.)
drivetrain: type of drivetrain (front, rear, all)
feature_1 – feature_9: these features are boolean values about properties of the car
duration_listed: the number of days the car has been listed in the catalog
price_usd: price of the car in USD
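Before any analysis, it can help to confirm that a file actually contains the columns described above. The following is a minimal sketch using only the Python standard library. The helper missing_columns is hypothetical, and the feature_1 to feature_9 names are taken literally from the description, so the real file may differ.

```python
import csv
import io

# Column names as listed in the description. feature_1..feature_9 is taken
# literally from the text and may not match the actual file.
EXPECTED_COLUMNS = [
    "manufacturer_name", "model_name", "transmission", "color",
    "odometer_value", "year_produced", "engine_fuel", "engine_has_gas",
    "engine_type", "engine_capacity", "body_type", "has_warranty",
    "state", "drivetrain",
    *[f"feature_{i}" for i in range(1, 10)],
    "duration_listed", "price_usd",
]


def missing_columns(csv_file):
    """Return the expected columns that are absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    present = set(header)
    return [name for name in EXPECTED_COLUMNS if name not in present]


# Tiny in-memory stand-in for a CSV file (hypothetical header, not real data).
sample = io.StringIO("manufacturer_name,model_name,price_usd\n")
print(missing_columns(sample))
```

With the real data, the same function could be called on an open file handle, for example one obtained from open("public_cars.csv", newline="").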

Dataset Files

public_cars.csv – Dataset to train and analyze
pred_cars.csv – Dataset of cars whose prices you will predict
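One way to use the two files together is to fit on the first and predict on the second. The sketch below is a deliberately naive baseline, not the challenge's recommended method: it predicts the mean price_usd per manufacturer_name and falls back to the overall mean for unseen manufacturers. The function names and the sample rows are hypothetical, and it assumes rows are dictionaries of strings such as those produced by csv.DictReader.

```python
from collections import defaultdict


def fit_group_means(train_rows, key="manufacturer_name", target="price_usd"):
    """Compute the mean target per group, plus the overall mean as a fallback."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in train_rows:
        totals[row[key]] += float(row[target])  # CSV values arrive as strings
        counts[row[key]] += 1
    group_means = {k: totals[k] / counts[k] for k in totals}
    overall_mean = sum(totals.values()) / sum(counts.values())
    return group_means, overall_mean


def predict(rows, group_means, overall_mean, key="manufacturer_name"):
    """Predict the group mean, or the overall mean for groups not seen in training."""
    return [group_means.get(row[key], overall_mean) for row in rows]


# Made-up rows standing in for public_cars.csv and pred_cars.csv.
train = [
    {"manufacturer_name": "A", "price_usd": "4000"},
    {"manufacturer_name": "A", "price_usd": "6000"},
    {"manufacturer_name": "B", "price_usd": "9000"},
]
to_predict = [{"manufacturer_name": "A"}, {"manufacturer_name": "C"}]
means, fallback = fit_group_means(train)
print(predict(to_predict, means, fallback))
```

A baseline like this mainly gives a reference score that a more careful model should beat.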


All submissions should be sent through email to challenges@superdatascience.com. When submitting, the file should contain predictions made on the pred_cars.csv file, and it should have the following format: