| Dataset | Method | Relaxed Metric | Video-level Accuracy | Phase-level Precision | Phase-level Recall | Phase-level Jaccard |
|---|---|---|---|---|---|---|
| Cholec80 [32] | SV-RCNet [17] | ✓ | | | | |
| Cholec80 [32] | TMRNet [19] | ✓ | | | | |
| Cholec80 [32] | TeCNO [7] | ✓ | | | | |
| Cholec80 [32] | Trans-SVNet [14] | ✓ | | | | |
| Cholec80 [32] | Not End-to-End [39] | ✓ | | | | |
| Cholec80 [32] | LoViT [26] | ✓ | | | | |
| Cholec80 [32] | SKiT [27] | ✓ | | | | |
| Cholec80 [32] | Encoder-Decoder | ✓ | | | | |
| Cholec80 [32] | Encoder-Decoder + NFSM (Ours) | ✓ | (+2.7) | (+2.9) | (+2.5) | 80.9 (+3.3) |
| Cholec80 [32] | Surgformer [37] | ✓ | | | | |
| Cholec80 [32] | Surgformer + NFSM (Ours) | ✓ | 93.7 ± 5.5 (+0.4) | 93.0 (+1.2) | 92.7 (+0.4) | 85.2 (+0.9) |
| Cholec80 [32] | AVT [15] | | | | | |
| Cholec80 [32] | TeSTra [40] | | | | | |
| Cholec80 [32] | Trans-SVNet [14] | | | | | |
| Cholec80 [32] | LoViT [26] | | | | | |
| Cholec80 [32] | SKiT [27] | | 92.5 ± 5.1 | | 88.5 | |
| Cholec80 [32] | Encoder-Decoder | | | | | |
| Cholec80 [32] | Encoder-Decoder + NFSM (Ours) | | (+2.3) | (+3.2) | (+3.0) | 75.6 (+4.8) |
| Cholec80 [32] | Surgformer [37] | | | | | |
| Cholec80 [32] | Surgformer + NFSM (Ours) | | (+0.0) | 87.3 (+0.2) | (+0.1) | 78.0 (+0.2) |
| AutoLaparo [36] | SV-RCNet [17] | | | | | |
| AutoLaparo [36] | TMRNet [19] | | | | | |
| AutoLaparo [36] | TeCNO [7] | | | | | |
| AutoLaparo [36] | AVT [15] | | | | | |
| AutoLaparo [36] | Trans-SVNet [14] | | | | | |
| AutoLaparo [36] | LoViT [26] | | | 85.1 | | |
| AutoLaparo [36] | SKiT [27] | | | 81.8 | | |
| AutoLaparo [36] | Encoder-Decoder | | | | | |
| AutoLaparo [36] | Encoder-Decoder + NFSM (Ours) | | (+2.2) | | (+0.6) | 58.1 (+1.9) |
| AutoLaparo [36] | Surgformer [37] | | | 81.5 | 70.8 | 62.4 |
| AutoLaparo [36] | Surgformer + NFSM (Ours) | | 86.7 ± 7.2 (+0.6) | 84.8 (+3.3) | 71.1 (+0.3) | 63.2 (+0.8) |