orion
01-27-2005, 08:51 PM
I’m back! God, I hate moving. Now that I’m finally settled in a beautiful place, a 3-minute walk from the beach, I have time to post this anecdotal thread.
This thread is about a search engine I recently audited for a client, who is also a good friend of mine. He gave me the green light to share the experience with others at the SEWF.
If you plan to audit an IR system for your clients, this thread may interest you. The thread could also benefit programmers and those designing search systems or site search tools.
Here is how the story goes.
PROBLEM
The client hired a computer engineer who was supposed to design a brand new search engine. I was asked to audit the finished product.
The engineer designed a dedicated, specialized search system for a highly competitive market space and claimed his creation was better than Google.
TESTS
After meeting with the client and his engineer, and tolerating the engineer’s bragging about how great and fast the system was, it was time to do some testing.
I resorted to several basic tests that IR auditors would normally perform. Except for test 3, the tests were initially performed using the default FINDALL (AND) query mode.
1. Transposition tests
2. Proximity tests
3. Boolean tests
4. Volume tests
5. Delimiter/stopword tests
6. Precision-Recall Curves
RESULTS
1. Transposition Tests
In this test, the queries
Q1 = k1 + k2
Q2 = k2 + k1
are expected to produce the same total number of results, with a negligible absolute relative error.
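For anyone who wants to script this check, here is a minimal Python sketch. It is only an illustration, not the procedure used in the audit: `count_results` is a hypothetical callable that returns the total hit count the engine reports for a query string, and the 1% tolerance is an assumed value for "negligible".

```python
from typing import Callable


def relative_error(n1: int, n2: int) -> float:
    """Absolute relative error between two result counts, with n1 as reference."""
    if n1 == 0:
        return 0.0 if n2 == 0 else float("inf")
    return abs(n1 - n2) / n1


def passes_transposition(
    count_results: Callable[[str], int],  # hypothetical hook into the engine
    k1: str,
    k2: str,
    tolerance: float = 0.01,  # assumed threshold, not taken from the audit
) -> bool:
    """Return True when Q1 = k1 k2 and Q2 = k2 k1 report nearly the same total."""
    q1 = f"{k1} {k2}"
    q2 = f"{k2} {k1}"
    return relative_error(count_results(q1), count_results(q2)) <= tolerance
```

In practice one would run this over many term pairs, since a single pair passing proves little.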
2. Proximity Tests
In this test, given one or more documents containing, let’s say, k1 and k2, the system should return those documents regardless of the distance between the terms.
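One way to set this up is to index synthetic documents in which k1 and k2 are separated by a growing number of filler words, then see which ones come back. The sketch below assumes a hypothetical `search(docs, query)` callable that returns the ids of matching documents. `find_all` is a toy in-memory stand-in for a FINDALL (AND) mode, not the audited engine.

```python
from typing import Callable, Dict, List, Set


def make_doc(k1: str, k2: str, gap: int, filler: str = "lorem") -> str:
    """Build a document with `gap` filler words between k1 and k2."""
    return " ".join([k1] + [filler] * gap + [k2])


def find_all(docs: Dict[str, str], query: str) -> Set[str]:
    """Toy AND search: a document matches if it contains every query term."""
    terms = query.lower().split()
    hits = set()
    for doc_id, text in docs.items():
        tokens = set(text.lower().split())
        if all(term in tokens for term in terms):
            hits.add(doc_id)
    return hits


def proximity_failures(
    search: Callable[[Dict[str, str], str], Set[str]],
    k1: str,
    k2: str,
    gaps: List[int],
) -> List[int]:
    """Return the term distances at which the document was NOT retrieved."""
    docs = {f"gap{g}": make_doc(k1, k2, g) for g in gaps}
    hits = search(docs, f"{k1} {k2}")
    return [g for g in gaps if f"gap{g}" not in hits]
```

An empty list means the documents were returned at every distance tried. A non-empty list shows the distances at which the system starts dropping them.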
3. Boolean Tests
Here I tested several advanced search features (OR, EXACT, NOT, BUT, etc.). I also checked whether the system discriminates between Boolean operators and other delimiters (“|”, “ - ”, etc.).
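A common way to check Boolean behavior is to run the queries against a small corpus whose contents are known, so the expected result sets can be computed by hand or by a reference evaluator. The Python sketch below is an assumption about how that could be done. It covers only AND, OR and NOT with plain set algebra, and all names are hypothetical.

```python
from typing import Dict, Set


def docs_with(corpus: Dict[str, str], term: str) -> Set[str]:
    """Ids of documents whose text contains the term as a whole word."""
    term = term.lower()
    return {doc_id for doc_id, text in corpus.items() if term in text.lower().split()}


def expected_results(corpus: Dict[str, str], op: str, k1: str, k2: str) -> Set[str]:
    """Reference answer for 'k1 <op> k2' on a known corpus."""
    a, b = docs_with(corpus, k1), docs_with(corpus, k2)
    if op == "AND":
        return a & b
    if op == "OR":
        return a | b
    if op == "NOT":  # documents with k1 but without k2
        return a - b
    raise ValueError(f"unsupported operator: {op}")


def mismatches(expected: Set[str], actual: Set[str]) -> Dict[str, Set[str]]:
    """Documents the engine missed and documents it should not have returned."""
    return {"missing": expected - actual, "unexpected": actual - expected}
```

The same comparison can be repeated with a delimiter such as “|” in place of the operator, to see whether the engine treats it as an operator, as a separator, or ignores it.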
4. Volume Tests
These tests check for the overall volume of documents returned when different operators are used. The sequence should be
OR > FINDALL > EXACT
That is, the results in EXACT should be a subset of the results retrieved in FINDALL (AND), and these in turn should be a subset of the results retrieved in OR (ANY).
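The subset relation can be checked directly if the result ids for each mode can be collected. The sketch below assumes the three result sets have already been gathered for the same pair of terms. The function and key names are hypothetical. The counts are compared with >= rather than >, because a subset may equal its superset when every match happens to be an exact match.

```python
from typing import Dict, Set


def volume_check(exact: Set[str], find_all: Set[str], any_: Set[str]) -> Dict[str, bool]:
    """Check EXACT within FINDALL (AND) within OR (ANY), plus the count ordering."""
    return {
        "exact_subset_of_findall": exact <= find_all,
        "findall_subset_of_or": find_all <= any_,
        "counts_ordered": len(any_) >= len(find_all) >= len(exact),
    }
```

Checking only the totals is weaker than checking the sets: the counts can be in the right order even when EXACT returns documents that FINDALL never retrieved.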
5. Delimiter/stopword Tests
These tests check the discriminating power of the IR system. The idea is to see how different delimiters and stopwords are interpreted or replaced.
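A simple way to organize these tests is to generate the same two-term query with different delimiters and stopwords between the terms, then tabulate the counts side by side. The sketch below is illustrative only: the delimiter and stopword lists are assumed examples, and `count_results` is a hypothetical callable returning the engine's hit count.

```python
from typing import Callable, Dict, List, Sequence


def query_variants(
    k1: str,
    k2: str,
    delimiters: Sequence[str] = (" ", "-", "|", ",", "/"),  # assumed examples
    stopwords: Sequence[str] = ("the", "of", "and"),  # assumed examples
) -> List[str]:
    """Build variants of 'k1 k2' joined by each delimiter and each stopword."""
    variants = [f"{k1}{d}{k2}" for d in delimiters]
    variants += [f"{k1} {s} {k2}" for s in stopwords]
    return variants


def count_table(count_results: Callable[[str], int], variants: List[str]) -> Dict[str, int]:
    """Map each query variant to the number of results the engine reports."""
    return {q: count_results(q) for q in variants}
```

Variants that return identical counts are probably being normalized to the same query. Variants that differ show where the engine is interpreting the delimiter or stopword.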
6. Precision-Recall Curve Tests
These tests compare the retrieval performance of the tested system with a reference system.
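For reference, the points of a precision-recall curve can be computed from a ranked result list and a set of documents judged relevant. The sketch below uses the standard definitions (precision = relevant retrieved / retrieved, recall = relevant retrieved / total relevant). The function name and inputs are hypothetical, and the relevance judgments are assumed to come from the auditor.

```python
from typing import List, Sequence, Set, Tuple


def precision_recall_points(
    ranked: Sequence[str], relevant: Set[str]
) -> List[Tuple[float, float]]:
    """Return (precision, recall) after each rank position in the result list."""
    if not relevant:
        raise ValueError("need at least one relevant document")
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
        points.append((hits / rank, hits / len(relevant)))
    return points
```

Running the same queries and judgments through the tested system and the reference system gives two sets of points that can be plotted and compared.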
FINDINGS
The client’s search engine failed tests 1 through 5, so I didn’t waste my time with test 6.
Now it was time to present my results to the client...
After going over the results (and, yes, again listening to the engineer brag about the next big thing), the client was a bit disappointed in his “better-than-Google” project. You can guess the fate of the project.
I would like to hear from others who have audited IR systems. Perhaps we can share similar stories and learn from each other.
Cheers
Orion