Call for Shared Task Participation

Reconstructing the Reasoning in United Nations Resolutions

This shared task focuses on understanding argumentative structure in highly formal, legal-political UN resolutions. Participants are expected to build LLM-based systems to (1) identify and classify argumentative paragraphs in preambles and operative sections, and (2) predict argumentative relations between paragraphs.

Data: United Nations Resolutions
Language: French (with English translations)
Granularity: paragraph-level
Subtasks: 2
Metrics: F1 and LLM-as-a-Judge
Models: open-weight LLMs (≤ 8B)

Leaderboard (Placeholders)

Team F1 Score LLM Judge
🥇 DataMiners 0.892 94.85
🥈 NLP-Lab-X 0.887 92.72
🥉 Team Alpha 0.865 89.68
4. ArgWizards 0.823 84.45
5. Resolvers 0.791 78.20
6. ArgWorker 0.650 62.10
7. GradientHacker 0.622 58.35
8. DeepThinkers 0.595 55.20
9. LogicGates 0.551 50.10
10. RandomSeed 0.420 35.00

* Top 3 teams will receive certificates and awards at the ArgMining 2026 Workshop in San Diego, USA.

Overview

United Nations resolutions encode collective reasoning at scale: negotiated positions, implicit premises, and carefully structured conclusions. This shared task evaluates how well modern systems can recover these underlying argumentative structures from text.

Tasks

The shared task consists of two subtasks aligned with the workshop theme “Understanding and evaluating arguments in both human and machine reasoning.”

Subtask 1: Argumentative Paragraph Classification

For each paragraph, predict (a) whether it is preambular or operative, and (b) which subset of 141 predefined argumentative tags applies to it (a multi-label classification problem).

Subtask 2: Argumentative Relation Prediction

Given a paragraph, predict the indices of the other paragraphs it is related to, and label each link with one or more relation types: contradictive, supporting, complemental, modifying.

Data

We provide a training set and a held-out test set, both in a JSON format that enables easy processing and reproducible development. We encourage participants to explore the data and design their systems accordingly. To make the task more accessible to non-French speakers, we also provide English translations of the dataset.
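As an illustration, a single record covering both subtasks might look like the sketch below. All field names here are hypothetical; the released JSON schema is authoritative.

```python
import json

# Hypothetical record layout for one resolution; the actual field names
# and values are defined by the released dataset schema.
record = {
    "resolution_id": "EXAMPLE-ID",
    "paragraphs": [
        {
            "index": 0,
            "text_fr": "Réaffirmant sa résolution ...",
            "text_en": "Reaffirming its resolution ...",
            "section": "preambular",            # Subtask 1a: preambular | operative
            "tags": ["reaffirmation"],          # Subtask 1b: multi-label (141 tags)
            "relations": [                      # Subtask 2: links to other paragraphs
                {"target": 3, "types": ["supporting"]}
            ],
        }
    ],
}

# Records round-trip cleanly through JSON, which is what makes the
# format convenient for reproducible processing.
serialized = json.dumps(record, ensure_ascii=False)
restored = json.loads(serialized)
```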

The train and test sets are hosted by the University of Zurich, Department of Computational Linguistics, and are available for download on Hugging Face.

Licensing note: training data follow a restricted UN license; by participating, teams agree not to redistribute the training data publicly.

Evaluation

Systems are evaluated using a combination of an automated metric (F1) and an LLM-as-a-Judge assessment.

Final ranking is based on the average of both metrics. We will update the leaderboard live during the evaluation phase.
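Assuming the LLM-judge score is on a 0–100 scale (as the placeholder leaderboard suggests) and is normalized before averaging, the final ranking score could be computed as in this sketch. The exact aggregation is for the organizers to confirm.

```python
def final_score(f1: float, judge: float) -> float:
    """Average of F1 (range 0-1) and the LLM-judge score
    (assumed range 0-100, normalized to 0-1 before averaging)."""
    return (f1 + judge / 100.0) / 2.0
```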

Submission

Participants submit predictions for the test set in the required JSON format.

Submission package

Compress your filled-out JSON test set and system paper into a single ZIP file for upload.
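A minimal packaging sketch using Python's standard library is shown below. The file names are placeholders chosen for illustration, not organizer-mandated; it also sanity-checks that the predictions file parses as JSON before zipping.

```python
import json
import zipfile
from pathlib import Path

def build_submission(predictions_path: str, paper_path: str,
                     out_path: str = "submission.zip") -> str:
    """Bundle the filled-out JSON test set and the system paper into one ZIP."""
    # Fail early if the predictions file is not valid JSON.
    json.loads(Path(predictions_path).read_text(encoding="utf-8"))
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(predictions_path, arcname=Path(predictions_path).name)
        zf.write(paper_path, arcname=Path(paper_path).name)
    return out_path
```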

Allowed techniques are flexible (e.g., in-context learning, retrieval-augmented generation), but only open-weight LLMs with at most 8B parameters may be used. Please also include a team name in your system paper for the leaderboard announcement.

Upload Your Submission

Please submit your ZIP file via OpenReview at the following link: [TBA]

Important dates

All deadlines are 11:59 PM UTC-12:00 (“anywhere on Earth”).

1 Feb 2026
Train and test data release
18 March 2026
Evaluation and submission starts
1 April 2026
Submission ends
15 April 2026
Evaluation ends; results notification
24 April 2026
Paper submission due
1 May 2026
Reviews to authors
12 May 2026
Camera-ready version due
July 2026
ArgMining 2026 Workshop

Organizers

University of Zurich, Zurich, Switzerland.

FAQ