Hello,
I am a PhD student at Georgia Tech, and I am collaborating with researchers at Stony Brook University to find an effective means of detecting block pages. We have access to a robust set of both real pages and blocked pages (~2.4 million pages in total), which we are using to evaluate block page detection metrics.
We already have two measures to detect block pages and would like to evaluate your DOM similarity measure alongside our own metrics. Since we are planning to publish our results, may we include your DOM similarity measure in our evaluation? Also, I would like to look further into this similarity measure. Is there a paper that I can read?
Thanks, Ben Jones