New Research Papers from Yahoo Research


garyp
07-19-2004, 05:31 AM
Several new papers and technical reports have been posted on the Yahoo Research web site. Titles and URLs follow.

1) Rosie Jones. Semi-supervised Learning on Small Worlds, Yahoo! Research Labs Technical Report YRL-2004-033
http://labs.yahoo.com/publications/33.pdf

2) Omid Madani, Russell Greiner. Active Model Selection, Yahoo! Research Labs Technical Report YRL-2004-032
http://labs.yahoo.com/publications/32.pdf

3) David M. Pennock. A Dynamic pari-mutuel market for hedging, wagering, and information aggregation, Yahoo! Research Labs Technical Report YRL-2004-031
http://labs.yahoo.com/publications/31.pdf

4) Fernando Diaz, Rosie Jones. Using Temporal Profiles of Queries for Precision Prediction, Yahoo! Research Labs Technical Report YRL-2004-020
http://labs.yahoo.com/publications/20.pdf

orion
07-19-2004, 12:34 PM
I found the http://labs.yahoo.com/publications/20.pdf article extremely interesting, since it appeals to the temporal profiles (dynamics) of queries, which are inherently tied to users' behavior. Since this is novel research, I certainly don't expect it to answer every important question. Still, I noticed several basic areas that may deserve further investigation (I don't know whether the authors are already conducting research along these lines).

1. Query mode used in the study. It is not clear from the study which query mode the researchers used.
2. The top N documents. The size of the set R consisting of the top N documents is selected quite arbitrarily. How far can one go (i.e., N = 10, 100, 1000, ...), and which threshold value should be used?
3. Since they discuss temporal behavior, it would be interesting if they had provided Poincare maps (Poincare sections) of the temporal data.
4. Noise effects due to keyword spamming in positioned documents. How do they play into the picture?
5. The effect of term co-occurrence and term sequencing (in queries and in the top N documents) on the temporal profiles. How does this affect the outcomes of the study?
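
The sensitivity to N raised in point 2 can be illustrated with a small sketch. This is not the paper's method: it assumes a simplified, hypothetical notion of a temporal profile (the normalized histogram of publication dates among the top N ranked documents), and the ranked list of dates below is made up for illustration.

```python
from collections import Counter

def temporal_profile(ranked_dates, n):
    """Normalized histogram of dates among the top-n ranked documents.

    Simplified stand-in for a temporal profile (an assumption,
    not the definition used in the paper).
    """
    top = ranked_dates[:n]
    total = len(top)
    if total == 0:
        return {}
    counts = Counter(top)
    return {date: c / total for date, c in sorted(counts.items())}

# Hypothetical ranked list: publication month of each retrieved document,
# ordered from rank 1 downward.
ranked_dates = [
    "2004-06", "2004-06", "2004-05", "2004-06",
    "2003-11", "2002-01", "2004-06", "2001-07",
]

# The profile changes shape as the cutoff N grows.
for n in (2, 4, 8):
    print(n, temporal_profile(ranked_dates, n))
```

With a small N the profile is concentrated on a single month, while a larger N pulls in older documents and flattens it, which is why the choice of threshold matters.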

Something worth investigating: let D be the total number of documents retrieved. Treating p = N/D as a probability, the entropy term -p ln p reaches its maximum at p = 1/e, that is, when N is about 37% of D, I think (via entropy arguments).
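
The 1/e figure can be checked numerically. A minimal sketch, assuming the entropy argument means maximizing the single-term contribution -p ln p with p = N/D (an interpretation of the remark above, not something stated in the paper). Setting the derivative -ln p - 1 to zero gives p = 1/e, roughly 0.368; the code confirms this with a grid scan. The function name and grid size are illustrative choices.

```python
import math

def entropy_term(p):
    """Contribution -p * ln(p) of a single probability p (0 at p = 0)."""
    return 0.0 if p == 0 else -p * math.log(p)

# Scan p = N/D over a fine grid in (0, 1] and find where the term peaks.
grid = [i / 100000 for i in range(1, 100001)]
best_p = max(grid, key=entropy_term)

print(f"argmax p = {best_p:.5f}, 1/e = {1 / math.e:.5f}")
```

Under this assumption the suggested cutoff would be N at roughly 0.368 * D.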

Still, I have to admit that it is a great article! This Yahoo research has a lot of "meat". My sincere congratulations go to Yahoo's Research Lab.

Orion