Student Habits Analysis

A Python and pandas project comparing student habits with academic performance.

About the Project

This notebook analyzes a Kaggle dataset with 1,000 student records and 16 variables, including study time, sleep, attendance, internet quality, and exam scores. The analysis focuses on how study habits relate to performance.

Featured Code

Cleaning Missing Values

The parental_education_level column contained missing values, so the rows with no value in that column were dropped before any results were calculated.

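# Drop rows with no recorded parental education level
df.dropna(subset=['parental_education_level'], inplace=True)
# Count the missing values left in each column
df.isnull().sum()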

Remaining missing values after cleaning: 0
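
As a minimal sketch of this cleaning step, the toy frame below shows how dropna(subset=...) removes only the rows that lack a value in the named column, and how the result can be checked. The values and the names toy, cleaned, rows_dropped, and remaining_missing are invented for illustration; they are not taken from the notebook or the Kaggle dataset.

```python
import numpy as np
import pandas as pd

# Toy data: invented values for illustration, not rows from the real dataset.
toy = pd.DataFrame({
    "parental_education_level": ["Bachelor", np.nan, "High School", None],
    "exam_score": [88.0, 72.5, 64.0, 91.0],
})

rows_before = len(toy)

# Drop only the rows whose parental education value is missing.
cleaned = toy.dropna(subset=["parental_education_level"])

rows_dropped = rows_before - len(cleaned)

# Sum the per-column missing counts into one total for the whole frame.
remaining_missing = int(cleaned.isnull().sum().sum())

print(f"Rows dropped: {rows_dropped}")
print(f"Remaining missing values: {remaining_missing}")
```

Assigning the result to a new name instead of using inplace=True keeps the original frame available, which makes it easy to report how many rows the cleaning step removed.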

Vectorized Filtering

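A boolean comparison on the study-hours column is evaluated for every row at once, and summing the resulting True/False values counts the students who match. The percentage divides that count by total_students, which appears to be the row count after cleaning rather than the original 1,000; that would explain why 40 students come to 4.40% rather than 4.00%.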

total_students = len(df)  # assumption: rows remaining after cleaning
count_over_6 = (df['study_hours_per_day'] > 6).sum()
percentage_more_than_6_hours = (count_over_6 / total_students) * 100

Students studying more than 6 hours/day: 40

Percentage studying more than 6 hours/day: 4.40%
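
The counting pattern works because comparing a Series to a number yields a boolean Series, and True counts as 1 when summed. The sketch below illustrates this with made-up study hours; the hours values and the names mask and pct_over_6 are assumptions for illustration only.

```python
import pandas as pd

# Invented study hours for five hypothetical students.
hours = pd.Series([2.5, 6.5, 4.0, 7.0, 6.0], name="study_hours_per_day")

# The comparison runs on the whole column and returns one boolean per student.
mask = hours > 6

# True counts as 1, so the sum is the number of students above the threshold.
count_over_6 = int(mask.sum())

# The mean of a boolean Series is the share of True values.
pct_over_6 = mask.mean() * 100

print(f"Students studying more than 6 hours/day: {count_over_6}")
print(f"Percentage studying more than 6 hours/day: {pct_over_6:.2f}%")
```

Taking the mean of the mask gives the percentage without a separate total, as long as the Series contains exactly the rows that should be in the denominator. Note that the student at exactly 6.0 hours is not counted, because the comparison is strict.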

Score Comparison

This code uses vectorized pandas filtering to compare average exam scores between students who study more than five hours per day and those who study five hours or less.

average_exam_score_more_than_5 = df[df['study_hours_per_day'] > 5]['exam_score'].mean()
average_exam_score_5_or_less = df[df['study_hours_per_day'] <= 5]['exam_score'].mean()

print(f"Average exam score for students studying more than 5 hours/day: {average_exam_score_more_than_5:.2f}")
print(f"Average exam score for students studying 5 hours/day or less: {average_exam_score_5_or_less:.2f}")
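
As an alternative sketch, not the notebook's own method, both group averages can be computed in one pass by grouping on the threshold condition. The toy values and the group labels "more_than_5" and "5_or_less" are invented for illustration.

```python
import pandas as pd

# Toy data: invented values, not rows from the real dataset.
toy = pd.DataFrame({
    "study_hours_per_day": [1.0, 3.0, 5.0, 5.5, 7.0],
    "exam_score": [50.0, 60.0, 70.0, 90.0, 100.0],
})

# Label each student by which side of the 5-hour threshold they fall on.
group = (toy["study_hours_per_day"] > 5).map(
    {True: "more_than_5", False: "5_or_less"}
)

# One groupby call returns the mean exam score for both groups.
mean_by_group = toy.groupby(group)["exam_score"].mean()

score_gap = mean_by_group["more_than_5"] - mean_by_group["5_or_less"]

print(mean_by_group.round(2))
print(f"Difference in average exam score: {score_gap:.2f}")
```

Grouping avoids repeating the filter expression for each side of the threshold and extends naturally to more than two study-time bands.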

Key Takeaway

More Study Time Was Linked to Higher Scores

Students who studied more than five hours per day scored 25.45 points higher on average than students who studied five hours or less.

More than 5 hours/day: 91.12

5 hours/day or less: 65.67
