This spreadsheet has one row for each result of each query for each algorithm. It contains the following columns:
algorithm: BM25, Google, ELSER or MiniLM
query type: exact match, keywords, non-existing or reformulation
search query: one of the 16 listed above
rank response: 1 to 5; the rank of the result
Algorithm score: the score given by the algorithm itself and used to rank the results. There is no value for Google because it does not provide a score. MiniLM gives a score between 0 and 1, while BM25 and ELSER seem to give scores above 1. These scores are only useful for relative comparisons within the same algorithm.
Relevance score: 1 or 0; my own relevance judgment of the result for this query.
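As an illustration of how these columns can be used, here is a minimal sketch that computes the share of relevant results per algorithm and query type. It assumes the spreadsheet has been exported to CSV; the column names (`algorithm`, `query_type`, `search_query`, `rank`, `algorithm_score`, `relevance`) and the sample rows are hypothetical placeholders, not the actual data. The algorithm score is deliberately ignored, since it is not comparable across algorithms.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows: column names and values are placeholders
# for illustration only, not taken from the real spreadsheet.
SAMPLE_CSV = """algorithm,query_type,search_query,rank,algorithm_score,relevance
BM25,keywords,example query,1,12.4,1
BM25,keywords,example query,2,9.1,0
MiniLM,keywords,example query,1,0.83,1
MiniLM,keywords,example query,2,0.79,1
Google,keywords,example query,1,,1
Google,keywords,example query,2,,0
"""


def mean_relevance(rows):
    """Return the share of relevant results per (algorithm, query type)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [relevant count, row count]
    for row in rows:
        key = (row["algorithm"], row["query_type"])
        totals[key][0] += int(row["relevance"])  # relevance is 1 or 0
        totals[key][1] += 1
    return {key: relevant / count for key, (relevant, count) in totals.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
precision = mean_relevance(rows)

for (algorithm, query_type), value in sorted(precision.items()):
    print(f"{algorithm:8} {query_type:12} {value:.2f}")
```

Because the relevance score is binary, this average is the precision over the returned results. Google rows can be included even though their algorithm score is empty.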
Detailed results
For the three cloud-based experiments (BM25, ELSER and MiniLM), I have also collected the raw results from Elasticsearch in JSON format. In the case of ELSER, those results also contain the expanded terms of each result: