Loan Default Survival Analysis Dataset Instructions
Dataset Overview
This dataset contains loan information with time-to-event data for survival analysis. The dataset includes 5,000 loans with information about borrowers, loan characteristics, and survival outcomes (default or censored).
Variables Description
Identification
- loan_id: Unique identifier for each loan (1-5000)
Loan Characteristics
- loan_amount: Loan amount in USD (integer)
- interest_rate: Annual interest rate percentage (decimal)
- term_months: Loan term in months (36 or 60)
- loan_grade: Loan grade assigned by lender (A, B, C, D, E, F, G)
- loan_purpose: Purpose of the loan (debt_consolidation, credit_card, home_improvement, major_purchase, other)
Borrower Characteristics
- credit_score: Borrower’s credit score (300-850)
- annual_income: Borrower’s annual income in USD (integer)
- employment_length: Length of employment (<1 year, 1-3 years, 3-5 years, 5-10 years, >10 years)
- home_ownership: Home ownership status (RENT, OWN, MORTGAGE)
Survival Analysis Variables
- time_to_default: Time until default or censoring (months, 1 to term_months)
- defaulted: Event indicator (1 = defaulted, 0 = censored/completed)
- censored: Censoring indicator (1 = censored, 0 = defaulted)
Survival Analysis Notes
- Event: Loan default (defaulted = 1)
- Time: Months from loan origination to default or end of observation period
- Censoring: Loans that completed their term without defaulting are censored (censored = 1, defaulted = 0)
- Censored observations: Loans still active at the end of their term (completed successfully)
Data Quality
- No missing values
- All variables are properly formatted
- Time-to-event data is complete and valid
- Event indicators are binary (0/1)
Use Cases
This dataset is suitable for:
- Calculating survival probabilities (non-default rates) over time
- Identifying risk factors for loan default
- Comparing default risk across borrower segments
- Creating Kaplan-Meier survival curves
- Analyzing the impact of loan characteristics on time to default
Excel Analysis Tips
- Use PivotTables to calculate survival probabilities by time periods
- Create charts to visualize survival curves
- Use conditional formulas (COUNTIFS, SUMIFS) to analyze default rates by segment
- Compare survival across different loan grades, purposes, or borrower characteristics