Open Access | https://doi.org/10.37547/ijmsphr/Volume05Issue11-05

PERFORMANCE OF MACHINE LEARNING ALGORITHMS FOR LUNG CANCER PREDICTION: A COMPARATIVE STUDY

Md Nur Hossain, Master’s in Information Technology Management, Webster University, USA
Nafis Anjum, College of Technology and Engineering, Westcliff University, Irvine, CA
Murshida Alam, Department of Business Administration, Westcliff University, Irvine, California, USA
Md Redowan Amin Mollick, Master of Science in Data Analytics and Strategic Business Intelligence, Long Island University Post, USA
Md Habibur Rahman, Department of Business Administration, International American University, Los Angeles, California, USA
Ashim Chandra Das, Master of Science in Information Technology, Washington University of Science and Technology, USA
Md Monir Hosen, MS in Business Analytics, St. Francis College, USA
Md Siam Taluckder, Phillip M. Drayer Department of Electrical Engineering, Lamar University, USA
Md Nad Vi Al Bony, Department of Business Administration, International American University, Los Angeles, CA
S M Shadul Islam Rishad, Master of Science in Information Technology, Westcliff University, USA
Afrin Hoque Jui, Department of Management Science and Quantitative Methods, Gannon University, USA

Abstract

This study compares the performance of five machine learning algorithms (logistic regression, support vector machines, random forests, gradient boosting, and neural networks) for lung cancer prediction using demographic, lifestyle, and medical data from the UCI Machine Learning Repository. Gradient boosting and random forests achieved accuracies of 89% and 87% and AUC-ROC scores of 0.93 and 0.92, respectively. Neural networks reached a slightly higher accuracy of 90% but presented interpretability limitations. Key predictors included smoking history, chronic disease, and respiratory symptoms, which align with established risk factors. Of the models compared, the ensemble methods (gradient boosting and random forests) provided the best balance of accuracy and interpretability, highlighting their potential for clinical applications in early lung cancer detection.
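The two metrics reported in the abstract, accuracy and AUC-ROC, can be illustrated with a minimal sketch. The function names and the toy labels and scores below are assumptions made for illustration only; they are not the study's code or data. AUC-ROC is computed here through its pairwise-ranking interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc_roc(y_true, scores):
    """AUC-ROC via pairwise ranking: the share of (positive, negative)
    pairs in which the positive case has the higher score; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = cancer, 0 = no cancer) and model scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3]

# Accuracy needs hard labels, so scores are thresholded at 0.5 here.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred))  # 0.75
print(auc_roc(y_true, scores))   # 0.9375
```

Accuracy depends on the chosen threshold, whereas AUC-ROC depends only on how the scores rank the cases. This is one reason a model can lead on one metric without leading on the other.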

Zenodo DOI: https://doi.org/10.5281/zenodo.14160193

Keywords

Lung cancer prediction, Machine learning algorithms, Comparative analysis

How to Cite

Md Nur Hossain, Nafis Anjum, Murshida Alam, Md Redowan Amin Mollick, Md Habibur Rahman, Ashim Chandra Das, Md Monir Hosen, Md Siam Taluckder, Md Nad Vi Al Bony, S M Shadul Islam Rishad, & Afrin Hoque Jui. (2024). PERFORMANCE OF MACHINE LEARNING ALGORITHMS FOR LUNG CANCER PREDICTION: A COMPARATIVE STUDY. International Journal of Medical Science and Public Health Research, 5(11), 41–55. https://doi.org/10.37547/ijmsphr/Volume05Issue11-05