| SQL Data |
| feature | category | stroke_status | count |
| ever_married | married | had stroke | 81 |
| ever_married | married | no stroke | 2333 |
| ever_married | never married | had stroke | 9 |
| ever_married | never married | no stroke | 1659 |
| gender | female | had stroke | 48 |
| gender | female | no stroke | 2336 |
| gender | male | had stroke | 42 |
| gender | male | no stroke | 1656 |
| heart_disease | heart disease | had stroke | 13 |
| heart_disease | heart disease | no stroke | 83 |
| heart_disease | no heart disease | had stroke | 77 |
| heart_disease | no heart disease | no stroke | 3909 |
| hypertension | hypertension | had stroke | 16 |
| hypertension | hypertension | no stroke | 252 |
| hypertension | no hypertension | had stroke | 74 |
| hypertension | no hypertension | no stroke | 3740 |
| residence_type | rural | had stroke | 42 |
| residence_type | rural | no stroke | 1984 |
| residence_type | urban | had stroke | 48 |
| residence_type | urban | no stroke | 2008 |
| smoking_status | formerly smoked | had stroke | 22 |
| smoking_status | formerly smoked | no stroke | 563 |
| smoking_status | never smoked | had stroke | 24 |
| smoking_status | never smoked | no stroke | 1461 |
| smoking_status | smokes | had stroke | 25 |
| smoking_status | smokes | no stroke | 634 |
| smoking_status | unknown | had stroke | 19 |
| smoking_status | unknown | no stroke | 1334 |
| work_type | children | had stroke | 2 |
| work_type | children | no stroke | 685 |
| work_type | govt_job | had stroke | 16 |
| work_type | govt_job | no stroke | 510 |
| work_type | never_worked | no stroke | 22 |
| work_type | private | had stroke | 59 |
| work_type | private | no stroke | 2351 |
| work_type | self-employed | had stroke | 13 |
| work_type | self-employed | no stroke | 424 |

| Sum of count | had stroke | no stroke | Grand Total |
| ever_married | 90 | 3992 | 4082 |
| married | 81 | 2333 | 2414 |
| never married | 9 | 1659 | 1668 |
| gender | 90 | 3992 | 4082 |
| female | 48 | 2336 | 2384 |
| male | 42 | 1656 | 1698 |
| heart_disease | 90 | 3992 | 4082 |
| heart disease | 13 | 83 | 96 |
| no heart disease | 77 | 3909 | 3986 |
| hypertension | 90 | 3992 | 4082 |
| hypertension | 16 | 252 | 268 |
| no hypertension | 74 | 3740 | 3814 |
| residence_type | 90 | 3992 | 4082 |
| rural | 42 | 1984 | 2026 |
| urban | 48 | 2008 | 2056 |
| smoking_status | 90 | 3992 | 4082 |
| formerly smoked | 22 | 563 | 585 |
| never smoked | 24 | 1461 | 1485 |
| smokes | 25 | 634 | 659 |
| unknown | 19 | 1334 | 1353 |
| work_type | 90 | 3992 | 4082 |
| children | 2 | 685 | 687 |
| govt_job | 16 | 510 | 526 |
| never_worked | | 22 | 22 |
| private | 59 | 2351 | 2410 |
| self-employed | 13 | 424 | 437 |
| Grand Total | 630 | 27944 | 28574 |

| Expected Values | had stroke | no stroke |
| married | 53.22390985 | 2360.77609 |
| never married | 36.77609015 | 1631.22391 |
| female | 52.56246938 | 2331.437531 |
| male | 37.43753062 | 1660.562469 |
| heart disease | 2.116609505 | 93.88339049 |
| no heart disease | 87.88339049 | 3898.11661 |
| hypertension | 5.908868202 | 262.0911318 |
| no hypertension | 84.0911318 | 3729.908868 |
| rural | 44.66927976 | 1981.33072 |
| urban | 45.33072024 | 2010.66928 |
| formerly smoked | 12.89808917 | 572.1019108 |
| never smoked | 32.74130328 | 1452.258697 |
| smokes | 14.52964233 | 644.4703577 |
| unknown | 29.83096521 | 1323.169035 |
| children | 15.14698677 | 671.8530132 |
| govt_job | 11.59725625 | 514.4027438 |
| never_worked | 0.485056345 | 21.51494366 |
| private | 53.13571779 | 2356.864282 |
| self-employed | 9.634982852 | 427.3650171 |

| Chi-Square Test |
| feature | p-value | significant |
| ever_married | 1.7144E-09 | yes |
| gender | 0.3238055 | no |
| heart_disease | 1.93107E-14 | yes |
| hypertension | 1.40644E-05 | yes |
| residence_type | 0.569317433 | no |
| smoking_status | 0.000122057 | yes |
| work_type | 0.004206256 | yes |

The Chi-Square Test indicates that, for patients under 65, gender (p = 0.324) and residence_type (p = 0.569) are not statistically significant predictors of stroke, as their p-values exceed the 0.05 significance threshold. Both features will be dropped from the analysis going forward.

| Mann-Whitney Test |
| feature | p-value |
| bmi | 1.06E-06 |
| avg_glucose_level | 0.0001 |

The Mann–Whitney U Test was conducted in Python, as Excel does not support this test natively. The script can be found in the python_stat_test folder under the filename mann_whitney_test.py. Both BMI and average glucose level returned p-values well below the 0.05 threshold, indicating that they are statistically significant predictors of stroke in patients under 65.
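
The expected values and chi-square p-values in the sheet can be cross-checked outside Excel. The following is a minimal sketch rather than the workbook's actual method: it assumes a Pearson chi-square test of independence with no continuity correction, which appears consistent with the listed p-values. The helper names `chi2_sf` and `chi_square_test` are introduced here for illustration only. The observed counts are copied from the pivot table for three of the features. work_type is left out because its never_worked / had stroke cell is blank in the pivot table, and how that blank is handled changes the statistic.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: start from erfc(sqrt(x/2)) for df = 1, then add one
    # term per extra pair of degrees of freedom.
    total = math.erfc(math.sqrt(half))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (k - 0.5) / math.gamma(k + 0.5)
    return total


def chi_square_test(observed):
    """Pearson chi-square test of independence (no continuity correction).

    `observed` is a list of rows, one per category, each holding the
    counts [had stroke, no stroke].
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count = row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, chi2_sf(stat, df), expected


# Observed counts from the pivot table: [had stroke, no stroke] per category.
observed = {
    "ever_married": [[81, 2333], [9, 1659]],
    "gender": [[48, 2336], [42, 1656]],
    "smoking_status": [[22, 563], [24, 1461], [25, 634], [19, 1334]],
}

results = {name: chi_square_test(table) for name, table in observed.items()}
for name, (stat, p_value, _) in results.items():
    verdict = "yes" if p_value < 0.05 else "no"
    print(f"{name}: chi2={stat:.3f}, p={p_value:.6g}, significant={verdict}")
```

The third element of each result holds the expected counts, so `results["ever_married"][2]` can be compared directly against the married and never married rows of the Expected Values block.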