Research Datasets

Curated datasets for AI research in healthcare and business

Dataset Name                  Domain      Size                   Type              Source
Diabetes Health Indicators    Healthcare  253,680 records        Tabular           Kaggle/CDC
  BRFSS survey data for diabetes prediction
ChestX-ray14                  Healthcare  112,120 images         Image             NIH
  Chest X-ray images with disease labels
MIMIC-III                     Healthcare  58,000+ ICU stays      Clinical Records  MIT
  Critical care database (requires approval)
Heart Disease UCI             Healthcare  303 records            Tabular           UCI ML Repo
  Cardiovascular disease indicators
E-Commerce Customer Behavior  Business    500,000+ transactions  Tabular           Kaggle
  Purchase patterns and user analytics
Credit Card Fraud Detection   Business    284,807 transactions   Tabular           Kaggle
  Anonymized credit card transactions
Customer Churn Prediction     Business    7,043 customers        Tabular           Kaggle
  Telecom customer retention data
Stock Market Historical Data  Business    20+ years              Time Series       Yahoo Finance
  S&P 500 companies time series

Data Acquisition

  • Check licensing and usage terms
  • Request IRB approval when the research involves human-subject data
  • Document data provenance
  • Keep dataset versions under version control
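
One way to act on the provenance and versioning items above is to store a small record next to each downloaded file. The following is a minimal Python sketch, not a prescribed workflow. It assumes the dataset is a single local file; the function names (sha256_of_file, build_provenance_record, write_provenance) and the record fields are hypothetical, chosen for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large datasets do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_provenance_record(path, source, license_name, version):
    """Collect the facts needed to identify exactly which file was used.

    The field names are illustrative assumptions, not a standard schema.
    """
    path = Path(path)
    return {
        "file": path.name,
        "size_bytes": path.stat().st_size,
        "sha256": sha256_of_file(path),
        "source": source,
        "license": license_name,
        "version": version,
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
    }


def write_provenance(record, out_path):
    """Write the record as JSON so it can be committed alongside the code."""
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2, sort_keys=True)
```

Committing the resulting JSON file to the same repository as the analysis code ties each result to a specific checksum, so a silently changed download shows up as a hash mismatch.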

Data Privacy & Ethics

  • Ensure HIPAA compliance for health data
  • Anonymize sensitive information
  • Check for bias in datasets
  • Follow ethical AI guidelines
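
Two of the items above, anonymizing sensitive fields and checking for bias, can be illustrated with a short Python sketch. It is a sketch under stated assumptions only: keyed hashing of an identifier is pseudonymization, which by itself does not amount to HIPAA de-identification, and comparing label rates across groups is just one simple bias check among many. The function names and the field names used in the example ("sex", "label") are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict


def pseudonymize(value, secret_key):
    """Replace an identifier with a keyed hash (HMAC-SHA256).

    The same value and key always give the same token, so records can still
    be linked, but the original value cannot be read from the token.
    secret_key must be bytes and should be stored apart from the data.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def positive_rate_by_group(rows, group_field, label_field):
    """Return {group: share of rows in that group whose label is truthy}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        group = row[group_field]
        totals[group] += 1
        if row[label_field]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


# Hypothetical toy rows, not taken from any dataset listed above.
rows = [
    {"sex": "F", "label": 1},
    {"sex": "F", "label": 0},
    {"sex": "M", "label": 1},
]
rates = positive_rate_by_group(rows, "sex", "label")
```

Large gaps between groups in the returned rates do not prove bias on their own, but they flag where sampling or labeling deserves a closer look before training.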