Bridging behavioral gaps : automatic extrapolation of concreteness norms in Arabic using English-tuned KNN approaches

Date

2025

Abstract

This thesis addresses the automatic extrapolation of concreteness norms for nouns in both Modern Standard Arabic and English. The main goal is to enable reliable estimation of how concrete or abstract words are (e.g. “apple” vs. “justice”) using computational methods, supporting applications in psycholinguistics and natural language processing. To this end, a novel dataset of 202 Arabic nouns rated for concreteness is introduced and aligned with established English norms. To predict concreteness, the study compares a K-Nearest Neighbors (KNN) regression model based on FastText and transformer-based embeddings with predictions from ChatGPT. On held-out test sets, the KNN models achieve high accuracy: Spearman ρ = 0.92 (RMSE = 0.43) for English and ρ = 0.83 (RMSE = 0.69) for Arabic. By contrast, ChatGPT predictions, while consistent across runs, yield lower correlations (ρ = 0.80 for both English and Arabic) and higher RMSE values, confirming that KNN remains more accurate for concreteness estimation.

Keywords: concreteness norms; abstractness; Arabic; English; K-Nearest Neighbors; word embeddings; FastText; transformer models; ChatGPT; lexical semantics; psycholinguistics; norm extrapolation
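
The KNN approach and the two evaluation measures named in the abstract (Spearman ρ and RMSE) can be illustrated with a minimal sketch. This is not the thesis's implementation. The 2-dimensional vectors and ratings below are invented toy values standing in for FastText or transformer embeddings and human norms, and the choices of cosine similarity, unweighted averaging, and `k=2` are assumptions, since the abstract does not state them. All function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def knn_predict(query, train_vecs, train_ratings, k=2):
    """Predict a rating as the mean rating of the k most similar training words."""
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(query, train_vecs[i]),
                    reverse=True)
    top = ranked[:k]
    return sum(train_ratings[i] for i in top) / len(top)

def rmse(pred, gold):
    """Root mean squared error between predictions and gold ratings."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(gold))

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            result[order[t]] = avg_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks.
    Assumes neither input is constant (otherwise the denominator is zero)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Toy data (invented for illustration): two "concrete-like" and two
# "abstract-like" training words in a 2-D embedding space.
train_vecs = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.9)]
train_ratings = [4.8, 4.6, 1.5, 1.7]

# Held-out toy words with their (invented) gold ratings.
test_vecs = [(1.0, 0.0), (0.0, 1.0), (0.8, 0.3)]
test_gold = [4.9, 1.4, 4.2]

test_pred = [knn_predict(v, train_vecs, train_ratings, k=2) for v in test_vecs]
print("predictions:", test_pred)
print("Spearman rho:", spearman(test_pred, test_gold))
print("RMSE:", rmse(test_pred, test_gold))
```

In a realistic setting the vectors would come from a pretrained embedding model and `k` would be tuned on a development set; the evaluation functions would stay the same.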
