Analysis of High-Dimensional and Complex Data, such as Genomic Data, Neuroimaging Data, and Text Data, Using Machine Learning and Dimension Reduction Techniques in Pakistan

Authors

  • Mariam Bilal

DOI:

https://doi.org/10.47604/jsar.2309

Keywords:

Analysis, High-Dimensional, Complex Data

Abstract

Purpose: The aim of the study was to investigate the analysis of high-dimensional and complex data, such as genomic data, neuroimaging data, and text data, using machine learning and dimension reduction techniques in Pakistan.

Methodology: This study adopted a desk methodology. A desk study research design, commonly known as secondary data collection, gathers data from existing sources and is preferred for its lower cost compared with field research. The study reviewed already published studies and reports, which were readily accessible through online journals and libraries.

Findings: In Pakistan, machine learning and dimension reduction techniques have been applied to analyze high-dimensional and complex data, including genomic, neuroimaging, and text data. These efforts have led to advancements in disease genetics, brain imaging, and text mining. While the results are promising, challenges such as data quality and interpretability persist, underscoring the importance of continued research and collaboration in these fields.

Unique Contribution to Theory, Practice and Policy: Social network theory, graph theory, and complex systems theory may be used to anchor future studies on the analysis of high-dimensional and complex data, such as genomic data, neuroimaging data, and text data, using machine learning and dimension reduction techniques. In practice, machine learning and dimension reduction techniques should be applied to genomic data to advance the field of precision medicine. Policymakers should formulate policies and regulations that address privacy and ethical concerns when dealing with sensitive data, such as genomic information and personal text data.

Published

2024-02-11

How to Cite

Bilal, M. (2024). Analysis of High-Dimensional and Complex Data, such as Genomic Data, Neuroimaging Data, and Text Data, Using Machine Learning and Dimension Reduction Techniques in Pakistan. Journal of Statistics and Actuarial Research, 7(1). https://doi.org/10.47604/jsar.2309
