<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:media="http://search.yahoo.com/mrss/"
		>
<channel>
	<title>Comments on: Generalizability coefficient for Mechanical Turk annotations</title>
	<atom:link href="http://researchremix.wordpress.com/2008/12/29/generalizability-coefficient-for-mechanical-turk-annotations/feed/" rel="self" type="application/rss+xml" />
	<link>http://researchremix.wordpress.com/2008/12/29/generalizability-coefficient-for-mechanical-turk-annotations/</link>
	<description>Blogging about the science, engineering, and human factors of biomedical research data reuse</description>
	<lastBuildDate>Tue, 19 May 2009 08:38:31 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Heather Piwowar</title>
		<link>http://researchremix.wordpress.com/2008/12/29/generalizability-coefficient-for-mechanical-turk-annotations/#comment-1372</link>
		<dc:creator>Heather Piwowar</dc:creator>
		<pubDate>Sun, 04 Jan 2009 20:50:17 +0000</pubDate>
		<guid isPermaLink="false">http://researchremix.wordpress.com/?p=132#comment-1372</guid>
		<description>Yes, unfortunately that is all of the data I collected.  As I was writing up the pilot study, I ran across a &lt;a href=&quot;http://blogs.nature.com/nautilus/2008/12/call_for_authors_to_deposit_mi.html&quot; rel=&quot;nofollow&quot;&gt;recent survey&lt;/a&gt; with 400 datapoints annotated by experts, so it was no longer necessary to derive such a dataset via MTurk.

PMCID is the PubMed Central ID (related to but not the same as the PubMed ID).  

If you do have resources in search of a similar problem, I certainly have some I could suggest.  For example, I&#039;d love to have access to an annotated set of studies that REUSE datasets.  A start is the list maintained by GEO... but it doesn&#039;t have any true negatives.  A reuse list would be interesting in and of itself as a measure of the prevalence of data reuse (= benefits of data sharing), and could be used to evaluate an NLP engine for identifying reuse (which could then be used for a more systematic analysis of reuse patterns).  

That said, I&#039;m sure everybody on the bioNLP mailing list would have their own ideas about annotation datasets they&#039;d love to see ;)

Anyway, let me know if I can be of more help.


Heather</description>
		<content:encoded><![CDATA[<p>Yes, unfortunately that is all of the data I collected.  As I was writing up the pilot study, I ran across a <a href="http://blogs.nature.com/nautilus/2008/12/call_for_authors_to_deposit_mi.html" rel="nofollow">recent survey</a> with 400 datapoints annotated by experts, so it was no longer necessary to derive such a dataset via MTurk.</p>
<p>PMCID is the PubMed Central ID (related to but not the same as the PubMed ID).  </p>
<p>If you do have resources in search of a similar problem, I certainly have some I could suggest.  For example, I&#8217;d love to have access to an annotated set of studies that REUSE datasets.  A start is the list maintained by GEO&#8230; but it doesn&#8217;t have any true negatives.  A reuse list would be interesting in and of itself as a measure of the prevalence of data reuse (= benefits of data sharing), and could be used to evaluate an NLP engine for identifying reuse (which could then be used for a more systematic analysis of reuse patterns).  </p>
<p>That said, I&#8217;m sure everybody on the bioNLP mailing list would have their own ideas about annotation datasets they&#8217;d love to see ;)</p>
<p>Anyway, let me know if I can be of more help.</p>
<p>Heather</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: lingpipe</title>
		<link>http://researchremix.wordpress.com/2008/12/29/generalizability-coefficient-for-mechanical-turk-annotations/#comment-1371</link>
		<dc:creator>lingpipe</dc:creator>
		<pubDate>Fri, 02 Jan 2009 22:19:15 +0000</pubDate>
		<guid isPermaLink="false">http://researchremix.wordpress.com/?p=132#comment-1371</guid>
		<description>Was the data at the end of the R code all you collected?  I&#039;ll probably need a bit more data than that to reliably infer annotator accuracies.

Is PMCID the article (PubMed?) ID?

We&#039;re going to be running lots of Mechanical Turk jobs in the next few months (we have an intern coming who&#039;ll spend most of her time on it).  We could collect more of this kind of data -- the problem&#039;s very interesting because it&#039;s so directly relevant to a researcher.</description>
		<content:encoded><![CDATA[<p>Was the data at the end of the R code all you collected?  I&#8217;ll probably need a bit more data than that to reliably infer annotator accuracies.</p>
<p>Is PMCID the article (PubMed?) ID?</p>
<p>We&#8217;re going to be running lots of Mechanical Turk jobs in the next few months (we have an intern coming who&#8217;ll spend most of her time on it).  We could collect more of this kind of data &#8212; the problem&#8217;s very interesting because it&#8217;s so directly relevant to a researcher.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Heather Piwowar</title>
		<link>http://researchremix.wordpress.com/2008/12/29/generalizability-coefficient-for-mechanical-turk-annotations/#comment-1369</link>
		<dc:creator>Heather Piwowar</dc:creator>
		<pubDate>Wed, 31 Dec 2008 14:02:14 +0000</pubDate>
		<guid isPermaLink="false">http://researchremix.wordpress.com/?p=132#comment-1369</guid>
		<description>Absolutely you may include it.  Have at it.  It sounds like an interesting problem!
Let me know if you have any questions about the data, such as it is.
Heather</description>
		<content:encoded><![CDATA[<p>Absolutely you may include it.  Have at it.  It sounds like an interesting problem!<br />
Let me know if you have any questions about the data, such as it is.<br />
Heather</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: lingpipe</title>
		<link>http://researchremix.wordpress.com/2008/12/29/generalizability-coefficient-for-mechanical-turk-annotations/#comment-1367</link>
		<dc:creator>lingpipe</dc:creator>
		<pubDate>Tue, 30 Dec 2008 15:50:10 +0000</pubDate>
		<guid isPermaLink="false">http://researchremix.wordpress.com/?p=132#comment-1367</guid>
		<description>Could you share your annotation data and gold standard?  

I&#039;d like to add your experiment to my inter-annotator data set, because I&#039;m trying to establish the robustness of Bayesian approaches to inferring gold standards, problem difficulties, and coder accuracies.   I&#039;m also releasing all my data along with the R and BUGS code on the LingPipe sandbox.

Here&#039;s a link to the blog entry linking to the paper about the models I&#039;ve been using: 

http://lingpipe-blog.com/2008/11/17/white-paper-multilevel-bayesian-models-of-categorical-data-annotation/</description>
		<content:encoded><![CDATA[<p>Could you share your annotation data and gold standard?  </p>
<p>I&#8217;d like to add your experiment to my inter-annotator data set, because I&#8217;m trying to establish the robustness of Bayesian approaches to inferring gold standards, problem difficulties, and coder accuracies.   I&#8217;m also releasing all my data along with the R and BUGS code on the LingPipe sandbox.</p>
<p>Here&#8217;s a link to the blog entry linking to the paper about the models I&#8217;ve been using: </p>
<p><a href="http://lingpipe-blog.com/2008/11/17/white-paper-multilevel-bayesian-models-of-categorical-data-annotation/" rel="nofollow">http://lingpipe-blog.com/2008/11/17/white-paper-multilevel-bayesian-models-of-categorical-data-annotation/</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
