Methods for Legal Citation Prediction in the Age of LLMs: An Australian Law Case Study
Categories: cs.CL, cs.AI, cs.IR
Released Date: December 9, 2024
Authors: Ehsan Shareghi, Jiuzhou Han, Paul Burgess
Affiliation: Monash University

LLM-only Approach: Direct zero-shot prompting

| Setting | Type | LLM | Query | Output | ACC@1 | ACC@5 |
|---|---|---|---|---|---|---|
| Open World | General Purpose | GPT-4o | Text+RoC | Top-5 Citations | 0.1 | 0.1 |
| Open World | General Purpose | Claude Sonnet 3.5 | Text+RoC | Top-5 Citations | 15.5 | 16.8 |
| Open World | General Purpose | Command R+ | Text+RoC | Top-5 Citations | 0.0 | 0.0 |
| Open World | General Purpose | LLaMA 3.1 70B Instruct | Text+RoC | Top-5 Citations | 1.6 | 2.1 |
| Open World | Law-specialised | SaulLM-7B-Instruct | Text+RoC | Top-5 Citations | 0.0 | 0.0 |
| Open World | Law-specialised | SaulLM-54B-Instruct | Text+RoC | Top-5 Citations | 2.0 | 2.7 |
| Open World | Citation-tuned (ours) | Cite-SaulLM-7B | Text | RoC+Top-1 Citation | 51.7∗ | - |
| Open World | Citation-tuned (ours) | Cite-LLaMA-3.1-8B | Text | RoC+Top-1 Citation | 46.2 | - |

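
The ACC@1 and ACC@5 columns can be read with a small sketch of the metric. It assumes the usual definition, which the table itself does not spell out: a query counts as correct at k when the gold citation appears among the first k predictions, and the score is reported as a percentage. The function name `accuracy_at_k` and the citation identifiers are hypothetical.

```python
from typing import Sequence


def accuracy_at_k(ranked_predictions: Sequence[Sequence[str]],
                  gold: Sequence[str],
                  k: int) -> float:
    """Percentage of queries whose gold citation is among the first k predictions.

    Assumption: this is the standard accuracy@k definition, not one quoted
    from the paper.
    """
    if len(ranked_predictions) != len(gold):
        raise ValueError("predictions and gold labels must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for preds, g in zip(ranked_predictions, gold) if g in preds[:k])
    return 100.0 * hits / len(gold)


# Toy data with made-up citation identifiers (not taken from the paper).
predictions = [
    ["case_A", "case_B", "case_C"],
    ["case_D", "case_E", "case_F"],
    ["case_G", "case_H", "case_I"],
    ["case_J", "case_K", "case_L"],
]
gold = ["case_A", "case_E", "case_Z", "case_L"]

acc1 = accuracy_at_k(predictions, gold, k=1)  # only the first query hits at rank 1
acc5 = accuracy_at_k(predictions, gold, k=5)  # three of four gold labels are ranked
```

Under this reading, a model that outputs a single citation (the citation-tuned rows) has an ACC@1 but no meaningful ACC@5, which matches the "-" entries.
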
Retrieval-only Approach: Uses vectorised database and vectorised Query to retrieve Top-5

| Setting | Embeddings | Index Granularity | Query | Output | ACC@1 | ACC@5 |
|---|---|---|---|---|---|---|
| Closed World | text-embedding-3-large | Full Cases | Text | Top-5 Citations | 14.9 | 32 |
| Closed World | text-embedding-3-large | Catchwords | Text | Top-5 Citations | 14.7 | 32.5 |
| Closed World | text-embedding-3-large | RoC Aggregations | Text | Top-5 Citations | 27.1 | 53.8 |
| Closed World | AusLaw-embedding | Full Cases | Text | Top-5 Citations | 8.7 | 20.7 |
| Closed World | AusLaw-embedding | Catchwords | Text | Top-5 Citations | 10.5 | 22.4 |
| Closed World | AusLaw-embedding | RoC Aggregations | Text | Top-5 Citations | 29.5 | 54.5 |

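
The retrieval-only setup can be sketched as a nearest-neighbour lookup over embedded index entries. This is a minimal illustration, assuming cosine similarity as the ranking score (the table does not name the similarity measure) and using tiny hand-written vectors in place of real embeddings. `retrieve_top_k`, `toy_index`, and the case identifiers are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_top_k(query_vec: Sequence[float],
                   index: Dict[str, Sequence[float]],
                   k: int = 5) -> List[str]:
    """Return the k citation ids whose vectors are most similar to the query."""
    ranked = sorted(index, key=lambda cid: cosine(query_vec, index[cid]), reverse=True)
    return ranked[:k]


# Toy index: each entry stands in for one embedded unit (a full case, its
# catchwords, or an RoC aggregation, depending on the index granularity).
toy_index = {
    "case_A": [1.0, 0.0, 0.0],
    "case_B": [0.9, 0.1, 0.0],
    "case_C": [0.0, 1.0, 0.0],
    "case_D": [0.0, 0.0, 1.0],
}

top2 = retrieve_top_k([1.0, 0.05, 0.0], toy_index, k=2)
```

In this sketch the index granularity only changes what text is embedded per entry; the lookup itself stays the same.
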
(Hybrid Approach) Query Expansion: Given Query, RoC is generated by an LLM and Query+RoC is used for retrieval

Results are formatted as GPT-4o/SaulLM-54B-Instruct/Cite-LLaMA-3.1-8B/Cite-SaulLM-7B

| Setting | Embeddings | Index Granularity | Query | Output | ACC@1 | ACC@5 |
|---|---|---|---|---|---|---|
| Closed World | text-embedding-3-large | Full Cases | Text | Top-5 Citations | 14.3/14.4/17.1/17.4 | 31.1/31.4/34.3/34.4 |
| Closed World | text-embedding-3-large | Catchwords | Text | Top-5 Citations | 15.3/15.5/15.0/15.8 | 33.1/33.1/33.4/33.9 |
| Closed World | text-embedding-3-large | RoC Aggregations | Text | Top-5 Citations | 29.6/28.6/34.9/35.1 | 56.7/56.1/60.0/60.4 |
| Closed World | AusLaw-embedding | Full Cases | Text | Top-5 Citations | 9.0/9.5/11.7/12.4 | 21.1/21.3/24.2/26.0 |
| Closed World | AusLaw-embedding | Catchwords | Text | Top-5 Citations | 10.2/10.9/11.0/11.4 | 23.5/24.6/24.3/24.4 |
| Closed World | AusLaw-embedding | RoC Aggregations | Text | Top-5 Citations | 32.2/30.4/33.5/34.7 | 55.8/54.2/55.6/56.5 |

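
Query expansion, as described in the caption, can be sketched in a few lines: an LLM produces an RoC for the query, and the query plus RoC is what gets sent to the retriever. The sketch below is only illustrative. Both callables are stand-ins (a fixed string instead of an LLM, word overlap instead of embedding search), and joining the two texts with a newline is an assumption about how Query+RoC is formed.

```python
from typing import Callable, List

# Toy corpus with made-up ids and text (not from the paper).
TOY_CORPUS = {
    "case_A": "negligence duty of care",
    "case_B": "contract breach damages",
}


def stub_generate_roc(query_text: str) -> str:
    """Stand-in for the LLM that generates an RoC; returns a fixed string."""
    return "cited for the duty of care principle"


def stub_retrieve(text: str, k: int) -> List[str]:
    """Stand-in retriever: ranks corpus entries by word overlap with the text."""
    words = set(text.lower().split())

    def overlap(cid: str) -> int:
        return len(words & set(TOY_CORPUS[cid].split()))

    return sorted(TOY_CORPUS, key=overlap, reverse=True)[:k]


def expand_and_retrieve(query_text: str,
                        generate_roc: Callable[[str], str],
                        retrieve: Callable[[str, int], List[str]],
                        k: int = 5) -> List[str]:
    """Generate an RoC for the query, then retrieve with query + RoC."""
    roc = generate_roc(query_text)
    expanded = f"{query_text}\n{roc}"  # assumption: simple concatenation
    return retrieve(expanded, k)


query = "the plaintiff claims damages"
plain_top1 = stub_retrieve(query, 1)
expanded_top1 = expand_and_retrieve(query, stub_generate_roc, stub_retrieve, k=1)
```

In the toy run the generated RoC adds the words that link the query to `case_A`, so the top result changes from `case_B` to `case_A`.
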
(Hybrid Approach) Voting Ensemble: Returns LLM’s citation if in the Top-5 of retrieval; otherwise, returns the retrieval’s Top-1

Results are formatted as Cite-LLaMA-3.1-8B/Cite-SaulLM-7B

| Setting | Embeddings | Index Granularity | Query | Output | ACC@1 | ACC@5 |
|---|---|---|---|---|---|---|
| Closed World | text-embedding-3-large | RoC Aggregations | Text | Top-5 Citations | 47.3/48.2 | - |
| Closed World | AusLaw-embedding | RoC Aggregations | Text | Top-5 Citations | 43.6/45.3 | - |

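
The voting rule in the caption is small enough to write out directly. The sketch below follows the stated rule and adds two assumptions of its own: citations are compared by exact string match, and an empty retrieval list is treated as an error. The function name `voting_ensemble` is hypothetical.

```python
from typing import Sequence


def voting_ensemble(llm_citation: str, retrieved_top5: Sequence[str]) -> str:
    """Keep the LLM's citation if retrieval also ranks it in the top 5;
    otherwise fall back to the retriever's top-1 result."""
    if not retrieved_top5:
        # Assumption: the rule is undefined without retrieval results.
        raise ValueError("retrieval returned no candidates")
    if llm_citation in retrieved_top5[:5]:
        return llm_citation
    return retrieved_top5[0]


# Made-up identifiers for illustration.
top5 = ["case_B", "case_A", "case_C", "case_D", "case_E"]

agreed = voting_ensemble("case_A", top5)    # LLM answer is confirmed by retrieval
fallback = voting_ensemble("case_Z", top5)  # LLM answer is not in the top 5
```

Because the output is always a single citation, only ACC@1 applies to this setting.
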
(Hybrid Approach) RAG: Given the Query, first retrieves Top-5 citations, and uses GPT-4o to pick the best

| Setting | Embeddings | Index Granularity | Query | Output | ACC@1 | ACC@5 |
|---|---|---|---|---|---|---|
| Closed World | text-embedding-3-large | Full Cases | Text | Top-5 Citations | 16.5‡ | - |
| Closed World | text-embedding-3-large | Catchwords | Text | Top-5 Citations | 21.7 | - |
| Closed World | text-embedding-3-large | RoC Aggregations | Text | Top-5 Citations | 42.2 | - |
| Closed World | AusLaw-embedding | Full Cases | Text | Top-5 Citations | 10.2‡ | - |
| Closed World | AusLaw-embedding | Catchwords | Text | Top-5 Citations | 17.1 | - |
| Closed World | AusLaw-embedding | RoC Aggregations | Text | Top-5 Citations | 42.9 | - |
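
The RAG step (retrieve five candidates, then let an LLM choose one) might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the prompt wording, the numbered-answer format, and the fallback to the retriever's top-1 when the reply cannot be parsed. The `ask_llm` callable is a placeholder, and no real model API is called.

```python
from typing import Callable, List


def build_selection_prompt(query_text: str, candidates: List[str]) -> str:
    """Hypothetical prompt: list the retrieved candidates and ask for one number."""
    numbered = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(candidates))
    return (
        "Passage:\n" + query_text
        + "\n\nCandidate citations:\n" + numbered
        + "\n\nAnswer with the number of the best candidate."
    )


def rag_pick(query_text: str,
             candidates: List[str],
             ask_llm: Callable[[str], str]) -> str:
    """Ask the LLM to choose among retrieved candidates and return its choice."""
    if not candidates:
        raise ValueError("no retrieved candidates to choose from")
    reply = ask_llm(build_selection_prompt(query_text, candidates))
    digits = "".join(ch for ch in reply if ch in "0123456789")
    choice = int(digits) if digits else 1
    if not 1 <= choice <= len(candidates):
        choice = 1  # assumption: fall back to the retriever's top-1
    return candidates[choice - 1]


# Made-up candidates and stub "LLMs" for illustration.
candidates = ["case_B", "case_A", "case_C", "case_D", "case_E"]
picked = rag_pick("toy passage", candidates, ask_llm=lambda prompt: "2")
unparsed = rag_pick("toy passage", candidates, ask_llm=lambda prompt: "not sure")
```

Since the selector can only return one of the five retrieved candidates, its ACC@1 cannot exceed the retriever's ACC@5 for the same embedding and index granularity.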