Analysis of Second Hand Car Data from CarWale.com

CODES

Preprocessing Stage - I

Note: The datasets are stored here due to size constraints. Together, the raw and processed data amount to around 6 GB.



OVERVIEW

The objective of this project is to explore and analyse the second-hand car market and to predict the prices of second-hand cars.

The data for the analysis was extracted from the CarWale website.


CHALLENGES

Data Scraping and Extraction.

  • Extracting the complete data took 2 continuous days.

  • 2 laptops were used to extract the data in batches.

  • Extraction was interrupted by power cuts, internet disruptions and denied requests.
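
A batched extraction that has to survive interruptions and denied requests could be sketched as below. This is only a minimal illustration, not the project's actual scraper: the function names `fetch_with_retry` and `scrape_batch`, the `fetch` callable, the retry count and the delays are all assumptions introduced here. The idea is to retry a failed request with increasing delays, and to record finished URLs so that a batch can be resumed after a power cut.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Set


def fetch_with_retry(fetch: Callable[[str], str], url: str,
                     max_attempts: int = 4, base_delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Call fetch(url); on failure wait 1x, 2x, 4x ... base_delay and try again.

    `fetch` is a hypothetical callable that returns the page text or raises.
    Returns None if every attempt fails.
    """
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except Exception:
            # Broad catch for the sketch: covers timeouts, refused requests, etc.
            if attempt == max_attempts - 1:
                return None
            sleep(base_delay * 2 ** attempt)
    return None


def scrape_batch(fetch: Callable[[str], str], urls: Iterable[str],
                 done: Set[str], **retry_kwargs) -> Dict[str, str]:
    """Fetch every URL that is not already in `done`.

    Successful URLs are added to `done`, so re-running the same batch after
    an interruption skips the pages that were already extracted.
    """
    results: Dict[str, str] = {}
    for url in urls:
        if url in done:
            continue
        page = fetch_with_retry(fetch, url, **retry_kwargs)
        if page is not None:
            results[url] = page
            done.add(url)
    return results
```

In practice `done` would be saved to disk between runs, and each laptop would be given its own list of URLs; both details are left out of the sketch.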


Pre-processing and Cleaning.

  • Unlike practice datasets, this real-world data required extensive cleaning and correction.


PROJECT WORKFLOW


METHODOLOGY

  • This project employs exploratory and predictive data analysis.


GITHUB REPOSITORY

https://github.com/UniNash/carwale_analysis.git


The GitHub repository contains:

  1. Documentation

  2. Python code


TOOLS IN PLAY

  1. Python - data cleaning and predictive analysis.

  2. GitHub Repository


DATA PROCESS, STORAGE AND MANAGEMENT


INSIGHTS

  • Maruti Suzuki has the most cars listed for sale in the second-hand market, followed by Hyundai and then Honda.

  • All other brands, including high-end ones, together account for around 60% of the second-hand market.

  • The largest number of cars listed for sale were manufactured in 2018.

  • This might hint that many people sell their old car about every 5 years to buy a new one.

  • However, we do not have enough evidence to confirm this.

  • Maharashtra has the most cars listed of any state, followed by Karnataka, Delhi, Uttar Pradesh and Tamil Nadu.

  • Around 22% of all cars listed are from Maharashtra.

  • Mumbai is the city with the highest number of cars listed.

  • Most of the cars listed for sale are petrol cars and have had a single owner.

  • Most of the cars listed have manual transmission.

  • Many of the cars being sold are registered as non-commercial vehicles.

  • The average engine size of the cars listed is 1364 cc. Because the distribution is right-skewed, the median is used as the measure of average.

  • Most cars listed for sale have a moderate engine size.

  • The average max power is 89 bhp, which is considered low.

  • Many cars being sold have low bhp.

  • The average mileage is 18.9 kmpl, with a deviation of around ±5 kmpl.

  • It can be seen that the distribution of Price is heavily skewed to the right (positive).

  • A large number of outliers can be seen, indicating that some cars have very high prices.

  • The average price of the cars is Rs. 6,25,000 (6.25 lakhs) when cars valued at 50 lakhs and above are treated as outliers.

  • Below is the breakdown of the distribution of prices omitting the outliers.

  • Most of the cars listed fall in the range 3.9 lakhs to 10 lakhs.

  • The box plot suggests that very few cars fall in the category beyond 50 lakhs.

  • Most of the cars beyond 50 lakh rupees are luxury brands.

  • The Aston Martin Vantage V8 F1 Edition has the highest price, whereas the Fiat Palio 1.2 EL has the lowest.

  • Maharashtra offers the widest range of car brands to choose from, followed by Tamil Nadu, Karnataka and Kerala.
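
Two of the points above, using the median as the average for a right-skewed distribution and treating prices of 50 lakhs and above as outliers, can be illustrated with a small sketch. It uses only the Python standard library. The function names and the sample prices are made up for illustration and are not taken from the CarWale dataset; only the 50 lakh cut-off comes from the insights above.

```python
from statistics import mean, median

LAKH = 100_000
OUTLIER_CUTOFF = 50 * LAKH  # cut-off used in the insights: 50 lakhs and above


def split_outliers(prices, cutoff=OUTLIER_CUTOFF):
    """Separate prices below the cut-off from those at or above it."""
    kept = [p for p in prices if p < cutoff]
    outliers = [p for p in prices if p >= cutoff]
    return kept, outliers


def summarise(prices):
    """Compare mean and median, with and without the high-price outliers."""
    kept, outliers = split_outliers(prices)
    return {
        "mean_all": mean(prices),
        "median_all": median(prices),
        "mean_kept": mean(kept),
        "median_kept": median(kept),
        "n_outliers": len(outliers),
    }


# Made-up, right-skewed sample: six ordinary cars and one luxury car.
sample_prices = [200_000, 300_000, 500_000, 600_000,
                 900_000, 1_200_000, 8_000_000]
summary = summarise(sample_prices)
print(summary)
```

In this sample the single luxury car pulls the mean well above the median, while the median barely moves when that car is removed. That is the reason a median is the safer summary for skewed columns such as price or engine size.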


About Me
