Methodology

Data Collection

The initial dataset was sourced from two Wikipedia pages that list all the German U-Boats from U-1 to U-4712. The data collection process involved:

  1. Primary Sources: Wikipedia pages listing German U-Boats
  2. Secondary Collection: Individual U-Boat pages for commissioning dates
  3. Data Fields: U-Boat name, year, type, notable commanders, damage statistics, fate, and notes
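
As a rough illustration of the data fields listed above, one record per U-Boat could be modeled as follows. This is only a sketch: the class name UBoatRecord, the attribute names, and the types are assumptions made for this example, not the schema actually used in the project.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UBoatRecord:
    """One row of the dataset (hypothetical schema; names and types are assumed)."""
    name: str                      # U-Boat name, e.g. "U-1"
    year: Optional[int]            # year as listed; None when unavailable
    boat_type: str                 # type designation as listed on the source page
    notable_commanders: List[str] = field(default_factory=list)
    damage_statistics: str = ""    # kept as raw text until cleaned
    fate: str = ""                 # free-text fate description
    notes: str = ""


# Placeholder values only; this is not a real record from the dataset.
example = UBoatRecord(name="U-EXAMPLE", year=None, boat_type="hypothetical type")
```

Keeping the free-text fields as plain strings defers any interpretation of them to the cleaning step.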

Data Processing Pipeline

Step 1: Web Scraping

Step 2: Data Cleaning

Step 3: Survival Analysis

Statistical Methods

Survival Analysis Approach

We treated a U-Boat's “death” (sinking, capture, or scuttling) as the event of interest and calculated time-to-event from its commissioning date.
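
The time-to-event idea can be sketched in a few lines of Python. The example below makes two assumptions that this page does not state: that a Kaplan-Meier estimator is an acceptable stand-in for the survival model, and that boats with no recorded "death" are treated as right-censored at their last known date. The function names and the four sample boats are invented for illustration and are not taken from the dataset.

```python
from datetime import date, timedelta


def days_to_event(commissioned, ended):
    """Time-to-event in days, measured from the commissioning date."""
    return (ended - commissioned).days


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct time with at least one event.

    durations: time in days for each boat
    observed:  True if the boat "died" at that time, False if censored
    """
    n_at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        deaths = sum(1 for d, e in zip(durations, observed) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # events and censored boats both leave the risk set
    return curve


# Made-up sample: (commissioning date, end date, event observed?)
start = date(1940, 1, 1)
sample = [
    (start, start + timedelta(days=100), True),
    (start, start + timedelta(days=200), True),
    (start, start + timedelta(days=200), False),  # censored (assumption)
    (start, start + timedelta(days=400), True),
]

durations = [days_to_event(c, e) for c, e, _ in sample]
observed = [flag for _, _, flag in sample]
curve = kaplan_meier(durations, observed)
```

For this made-up sample the estimated survival probability falls to 0.75 after 100 days, to about 0.5 after 200 days, and to 0 after 400 days.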

Key Metrics

Model Validation

Limitations

  1. Data Quality: Reliance on Wikipedia may introduce inaccuracies
  2. Missing Data: Some commissioning dates were unavailable
  3. Survivorship Bias: The analysis covers only documented U-Boats
  4. Historical Context: Limited accounting for operational changes during the war
