Titanic - Machine Learning from Disaster

Challenge:

    Build a predictive model that answers the question "What sorts of people were more likely to survive?" using passenger data (e.g., name, age, gender, socio-economic class).

Dataset Description:

    The data has been split into two groups: a training set (train.csv) and a test set (test.csv).

My Approach:

   Programming Language: Python

   Built two predictive models

  1. With RandomForestClassifier: Implemented a machine learning model that predicts whether each Titanic passenger survived, using RandomForestClassifier.
    Kaggle Score: 0.77511

  2. With XGBoost (XGBClassifier): Implemented a machine learning model that predicts whether each Titanic passenger survived, using XGBClassifier with tuned hyperparameters.
    Kaggle Score: 0.77990

    Contributions:
    1. Visualized the relationship between each feature and the target class.
    2. Identified three useful features in addition to those used in the RandomForestClassifier model.
    3. Converted the string values of the 'Sex' and 'Embarked' features to numeric values using replace().
    4. Tuned the XGBoost hyperparameters (n_estimators, learning_rate, max_depth, colsample_bytree) using grid search.
      Credit: Towards Data Science
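
The encoding and tuning steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the numeric codes, the fill value for missing 'Embarked' entries, the grid values, and the helper names (encode_categoricals, tune_xgboost) are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical numeric codes; the mapping actually used is not stated above.
SEX_CODES = {"male": 0, "female": 1}
EMBARKED_CODES = {"S": 0, "C": 1, "Q": 2}

# Illustrative search ranges only, not the values actually searched.
PARAM_GRID = {
    "n_estimators": [100, 300],
    "learning_rate": [0.05, 0.1],
    "max_depth": [3, 5],
    "colsample_bytree": [0.7, 1.0],
}


def encode_categoricals(df):
    """Return a copy of df with 'Sex' and 'Embarked' converted to integers."""
    out = df.copy()
    # Assumption: missing ports are filled with "S" before encoding,
    # because astype(int) cannot convert NaN.
    out["Embarked"] = out["Embarked"].fillna("S")
    out["Sex"] = out["Sex"].replace(SEX_CODES).astype(int)
    out["Embarked"] = out["Embarked"].replace(EMBARKED_CODES).astype(int)
    return out


def tune_xgboost(X, y, param_grid=PARAM_GRID, cv=5):
    """Grid-search XGBClassifier hyperparameters with cross-validated accuracy."""
    # Imported here so the encoding helper works without these packages installed.
    from sklearn.model_selection import GridSearchCV
    from xgboost import XGBClassifier

    search = GridSearchCV(XGBClassifier(), param_grid, scoring="accuracy", cv=cv)
    search.fit(X, y)
    return search
```

With the training data loaded into a DataFrame named train, a call such as tune_xgboost(encode_categoricals(train)[features], train["Survived"]) would return a fitted search whose best_params_ and best_score_ attributes hold the selected settings. Here features is a hypothetical list of numeric column names.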

GitHub Code:

   Click here