PASCAL - Pattern Analysis, Statistical Modelling and Computational Learning

Second Recognising Textual Entailment Challenge

1 October 2005 - 10 April 2006.


RTE-2 submitted runs and results

 

Notice: Proper scientific methodology requires that testing be blind.
Therefore, if you plan to use the RTE-2 test set for further evaluation, you
are advised not to perform any analysis of this data set, including the
detailed information provided in these runs.

 

download all runs

 

#   First Author (Group)                         Run    Accuracy  Average Precision
1   Adams (Dallas)                               run1   0.6262    0.6282
2   Bos (Rome & Leeds)                           run1   0.6162    0.6689
                                                 run2   0.6062    0.6042
3   Burchardt (Saarland)                         run1   0.5900
                                                 run2   0.5775
4   Clarke (Sussex)                              run1   0.5275    0.5254
                                                 run2   0.5475    0.5260
5   de Marneffe (Stanford)                       run1   0.5763    0.6131
                                                 run2   0.6050    0.5800
6   Delmonte (Venice)                            run1*  0.5563    0.5685
7   Ferrández (Alicante)                         run1   0.5563    0.6089
                                                 run2   0.5475    0.5743
8   Herrera (UNED)                               run1   0.5975    0.5663
                                                 run2   0.5887
9   Hickl (LCC)                                  run1   0.7538    0.8082
10  Inkpen (Ottawa)                              run1   0.5800    0.5751
                                                 run2   0.5825    0.5816
11  Katrenko (Amsterdam)                         run1   0.5900
                                                 run2   0.5713
12  Kouylekov (ITC-irst & Trento)                run1   0.5725    0.5249
                                                 run2   0.6050    0.5046
13  Kozareva (Alicante)                          run1   0.5487    0.5589
                                                 run2   0.5500    0.5485
14  Litkowski (CL Research)                      run1   0.5813
                                                 run2   0.5663
15  Marsi (Tilburg & Twente)                     run1   0.6050
16  Newman (Dublin)                              run1   0.5250    0.5052
                                                 run2   0.5437    0.5103
17  Nicholson (Melbourne)                        run1   0.5288    0.5464
                                                 run2   0.5088    0.5053
18  Nielsen (Colorado)                           run1*  0.6025    0.6396
                                                 run2*  0.6112    0.6379
19  Rus (Memphis)                                run1   0.5900    0.6047
                                                 run2   0.5837    0.5785
20  Schilder (Thomson & Minnesota)               run1   0.5437
                                                 run2   0.5550
21  Tatu (LCC)                                   run1   0.7375    0.7133
22  Vanderwende (Microsoft Research & Stanford)  run1   0.6025    0.6181
                                                 run2   0.5850    0.6170
23  Zanzotto (Milan & Rome)                      run1   0.6388    0.6441
                                                 run2   0.6250    0.6317


* Resubmitted after publication of the official results. Resubmission was allowed only in case of a bug fix, so that the updated results are the correct output of the system described in the RTE-2 proceedings.
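
For readers unfamiliar with the two score columns, the following is a minimal sketch of how such measures are commonly computed. It is not taken from the official RTE-2 scorer. It assumes that accuracy is the fraction of pairs judged correctly, and that average precision uses the usual ranking definition: pairs are sorted by the system's confidence, and precision is averaged over the ranks at which a gold-positive (entailing) pair occurs. The function names and the toy labels are hypothetical.

```python
from typing import Sequence


def accuracy(predicted: Sequence[bool], gold: Sequence[bool]) -> float:
    """Fraction of pairs whose predicted label matches the gold label."""
    if len(predicted) != len(gold) or len(gold) == 0:
        raise ValueError("predicted and gold must be non-empty and equal in length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


def average_precision(ranked_gold: Sequence[bool]) -> float:
    """Average precision over a ranked list of gold labels.

    `ranked_gold` holds the gold labels (True = entailment) ordered by the
    system's confidence, most confident first. This ordering convention is
    an assumption of this sketch.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_positive in enumerate(ranked_gold, start=1):
        if is_positive:
            hits += 1
            # Precision at this rank: positives seen so far / pairs seen so far.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(accuracy([True, False, True, True], [True, True, True, False]))
    print(average_precision([True, False, True, False]))
```

Under this definition, runs that report no average precision in the table would simply be runs for which no confidence ranking is available to sort by; that reading is an inference, not something stated on this page.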