Named Entity Recognition on Islamic Texts: A Systematic Review
This systematic literature review analyzes Named Entity Recognition (NER) applications in Islamic texts, particularly the Quran and Hadith, in Arabic, Indonesian, English, and Malay. The materials comprise studies from major academic databases (2017-2024) that apply various NER approaches to Islamic textual datasets. Most of the reviewed studies focus on Hadith texts, with fewer examining Quranic texts and general Islamic literature. The review follows a PRISMA-based methodology and examines architectural components, methodologies, comparative model performance, and extraction challenges in Islamic discourse. Traditional rule-based and statistical machine learning methods remain relevant, particularly in hybrid frameworks. However, the analysis reveals that transformer-based deep learning models consistently achieve the highest F1 scores. NER performance is better on Hadith datasets than on Quranic texts, which the review attributes to the structured and repetitive nature of Hadith compared with the Quran's greater linguistic diversity and more complex syntactic structures. Most studies use lexical and linguistic features to address the distinctive characteristics of religious texts, and they report significant progress in handling specialized Islamic concepts and multilingual considerations. Despite these advances, significant challenges persist, including the linguistic complexity of Classical Arabic, the scarcity of high-quality annotated corpora, and the difficulty of identifying domain-specific entities. This review offers guidance for researchers developing Islamic NER systems by identifying the best-performing methodological approaches and highlighting performance benchmarks across text types, thereby supporting the development of more effective, culturally aware NLP systems for Islamic content.
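Because the comparison above rests on F1 scores, a minimal sketch of how an entity-level F1 can be computed from BIO-tagged sequences may help readers interpret the benchmarks. This is an illustration under stated assumptions, not the evaluation procedure of any reviewed study: the function names `extract_spans` and `entity_f1`, the BIO tagging scheme, the exact-match criterion (same boundaries and same type), and the `PER`/`LOC` labels are all assumptions introduced here. Individual studies may instead report token-level or partial-match scores.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans in a BIO tag sequence.

    Assumption: a span starts at a B- tag and continues over I- tags of the
    same type; an I- tag with no open span of that type is ignored.
    """
    spans, start, etype = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        if tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue  # still inside the current entity
        if start is not None:
            spans.add((start, i, etype))  # end index is exclusive
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def entity_f1(gold_tags, pred_tags):
    """Exact-match entity-level precision, recall, and F1 for one sequence."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical four-token example: the gold annotation has a person and a
# location; the prediction finds only the person.
gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "O"]
precision, recall, f1 = entity_f1(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this hypothetical example the prediction is fully precise but recovers only one of the two gold entities, so F1 falls between the two values. Exact matching is the stricter convention; scores reported under token-level matching are generally not directly comparable with it.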

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.