By Robin Manhaeve
Review Details
Reviewer has chosen not to be Anonymous
Overall Impression: Good
Content:
Technical Quality of the paper: Good
Originality of the paper: Yes
Adequacy of the bibliography: Yes
Presentation:
Adequacy of the abstract: Yes
Introduction: background and motivation: Good
Organization of the paper: Satisfactory
Level of English: Satisfactory
Overall presentation: Good
Detailed Comments:
This paper surveys the literature on neurosymbolic AI frameworks that use Answer Set Programming (ASP) as their symbolic language. The authors categorize frameworks along a four-way taxonomy based on which components (neural, symbolic) are hard-coded, pre-trained, or learned. They accompany the discussion with system diagrams, then analyze the field along the axes of perception task difficulty, performance, ASP generation expressivity, and scalability.
This is a useful and timely contribution. The field of neurosymbolic ASP has grown considerably in recent years, and a dedicated synthesis is always welcome. The paper is largely well-executed, with a clear organizing principle and helpful visual aids. I have some concerns about scope justification, organization, and the presentation of the comparative results, but these are addressable and do not undermine the value of the work.
Strengths:
- The survey casts a wide net across the neurosymbolic ASP landscape and does a good job situating the covered work within the broader neurosymbolic and logic programming context. The related work section is thorough and the comparisons between ASP and adjacent formalisms are fair and informative.
- The schematic figures of the surveyed frameworks are one of the paper's clearest contributions. They give the reader an immediate intuition for the architectural differences between frameworks that would otherwise require careful reading of each individual paper.
Weaknesses:
- The authors do argue that no comprehensive ASP-focused survey currently exists and that the closest work is limited in scope. This is a reasonable motivation, but it is dispatched in a few sentences and could be developed more substantively. Specifically, the case for what makes ASP distinctively worth surveying in isolation is underdeveloped. I would encourage the authors to make a stronger case for the unique challenges that ASP's stable model semantics, non-monotonicity, and many-worlds reasoning bring to the neurosymbolic setting, and how these distinguish it from related frameworks. That said, I recognize that neurosymbolic AI has become broad enough that any survey must draw boundaries somewhere, and ASP is a somewhat defensible place to draw them.
- The Background section includes a dedicated subsection on CNNs but does not mention graph neural networks. Given that the paper later discusses Answer Set Networks, which encode answer set programs directly as GNNs, the omission is striking. CNNs, by contrast, appear mostly as generic perception components with little ASP-specific relevance. I would recommend replacing (or at least supplementing) the CNN discussion with a GNN discussion, as GNNs are more directly relevant to the frameworks surveyed.
- The "Perception tasks" subsection sits under "Analysis of Neurosymbolic ASP," but it reads primarily as a descriptive inventory of the datasets used across the surveyed frameworks rather than as analysis. It would fit more naturally as its own section, with the analytic observations about dataset difficulty and latent-label availability retained but separated from the catalogue itself.
- Both comparison tables contain a large number of empty cells, which limits their usefulness for cross-framework comparison, particularly Table 3, where most rows have a single populated entry. I appreciate that filling in these tables is a significant undertaking, potentially requiring the authors to re-run frameworks on tasks they were never evaluated on. Still, even partial effort to close some of the most glaring gaps would considerably strengthen the comparative claims.
- Table 4 is a valuable reference but is hard to parse at a glance. Training times are given in mixed units (seconds, minutes, hours-minutes-seconds) and the contrast between framework performance across tasks is buried in the numerical detail. A visual presentation would communicate the scalability story much more effectively than the current table. The table could be retained as a supplementary reference with a visualization leading the discussion in the main text.
Minor comments:
- There are a number of typos and small grammatical issues that the authors should address in a final pass. Examples include: "On reason lies in suboptimal implementations" on p. 36; "the predictions pf NeurASP" on p. 31; "form 77 to 86%" on p. 31; "NS2ASP consistently achieves" on p. 20.
- The phrasing "includes pictures of, shockingly, German traffic signs" (p. 26) does not match the tone of an academic paper.