Research Remix

September 13, 2018

It’s time to insist on #openinfrastructure for #openscience

Filed under: openinfrastructure, openscience — Heather Piwowar @ 9:00 pm

It’s time.  In the last month there’ve been three events that suggest now is the time to start insisting on open infrastructure for open science:

The first event was the publication of two separate recommendations/plans on open science, a report by the National Academies in the US, and Plan S by the EU on open access.  Notably, although comprehensive and bold in many other regards, neither report/plan called for open infrastructure to underpin the proposed open science initiatives.

Peter Suber put it well in his comments on Plan S:

the plan promises support for OA infrastructure, which is good. But it never commits to open infrastructure, that is, platforms running on open-source software, under open standards, with open APIs for interoperability, preferably owned or hosted by non-profit organizations. This omission invites the fate that befell bepress and SSRN, but this time for all European research.

The second event was the launch of Google’s Dataset Search — without an API.

Why do we care?  Because of opportunity cost.  Google Scholar doesn’t have an API, and Google has said it never will.  That means that no one has been able to integrate Google Scholar results into their workflows or products.  This has had a huge opportunity cost for scholarship.  It’s hard to measure, as opportunity costs always are, but we can get a sense of it: within 2 years of the Unpaywall launch (a product which does a subset of the same task but with an open API and open bulk data dump), the Unpaywall data has been built into 2000 library workflows, the three primary A&I indexes, competing commercial OA discovery services, many reports, and the apps of countless startups, with more integrations in the works.  All of that value-add was waiting for a solution that others could build on.

If we relax and consider the Dataset Search problem solved now that Google has it working, we’re forgoing these same integration possibilities for dataset search that we lost out on for so long with OA discovery.  We need to build open infrastructure: the open APIs and open source solutions that Peter Suber talks about above.

As Peter Kraker put it on Twitter the other day: #dontLeaveItToGoogle.

The third event was of a different sort: a gathering of 58 nonprofit projects working toward Open Science.  It was the first time we’ve gathered together explicitly like that, and the air of change was palpable.

It’s exciting.  We’re doing this.  We’re passionate about providing tools for the open science workflow that embody open infrastructure.  #OpenSciRoadmap

If you are a nonprofit but you weren’t at JROST last month, join in!  It’s just getting going.

 

So.  #openinfrastructure for #openscience.  Everybody in scholarly communication: start talking about it, requesting it, dreaming it, planning it, building it, requiring it, funding it.  It’s not too big a step.  We can do it.  It’s time.

 

ps More great reading on what open infrastructure means from Bilder, Lin, and Neylon (2015) here and from Hindawi here.

pps #openinfrastructure is too long and hard to spell for a rallying cry.  #openinfra??  help :)

July 5, 2010

Evolution 2010 and iEvoBio recap

Filed under: conferences, openscience, Uncategorized — Heather Piwowar @ 9:19 am

As my first exposure to the field of evolution, attending Evolution 2010 and iEvoBio was drinking from a firehose. That said, it was a productive and enjoyable dousing. Highlights:

Slides from presentations that highlighted open science, data sharing and archiving, and reward structures:

  • Mike Whitlock: Data Archiving in Evolution info session, discussed motivation and details on the joint data archiving policy that will require data archiving across six journals starting next year (policy, slides)
  • Carl Boettiger: My experiment with open science: Why the benefits of sharing go beyond source code, a dynamic practical case study of how and why to do open science (slides)
  • Todd Vision: The Dryad Digital Repository: Published evolutionary data as part of the greater data ecosystem, motivation and overview for the new data repository for post-publication datasets in evolution and ecology (abstract and demo)
  • Rob Guralnick: Biodiversity Discovery and Documentation in the Information and Attention Age, keynote talk highlighting, in part, the value in sharing pre-publication data and the need to change our reward structures to value that contribution (slides)
  • Anne Thessen: New Biology: The Data Conservancy and Data Driven Discovery, an overview of the ambitious data conservancy project (website)
  • Jonathan Eisen: Phylogenomics of microbes: the dark matter of biology, keynote with some plugs for PLoS and project openness (slides)
  • Rutger Vos: TreeBASE2: Rise of the Machines, background on new machine-friendly interfaces to TreeBASE (slides) and demo

A few other presentations related to how we develop or communicate science:

  • Rod Page: Phyloinformatics in the age of Wikipedia, a talk on the value of realizing how people find science (slides)
  • Vincent Smith: Top-down and bottom-up informatics: who has the high ground?, powerful case studies of successful and unsuccessful projects (abstract)
  • Cynthia Parr: Community content building for evolutionary biology: Lessons learned from LepTree and Encyclopedia of Life, case studies on the relative strengths of different design approaches (abstract)
  • N. Dean Pentcheff: Copyrights and digitizing the systematic literature: the horror… the horror…, about why it is important and completely legal to assemble an open digital archive of phylogeny papers under fair use

We also had a very interesting iEvoBio Birds-of-a-Feather session on open science, data sharing and reuse, and data citations. There were about 10 of us in a wide-ranging and interesting discussion with diverse perspectives.

Overall the iEvoBio meeting was fun and useful: a very successful first year kickoff bringing together people with similar interests. Thanks to Hilmar Lapp and the other organizers for all of their work. Can’t wait to go next year and contribute to the theme of research openness.

Of course the meetings were also a very useful intro to the field of evolution itself. Sean Carroll’s Gould prize lecture told the historical story of evolutionary theory: entertaining and informative, it was a fantastic start.  I also enjoyed the two award research lectures (though I wish they hadn’t overlapped with 5pm info sessions on Data Archiving and NESCent). The presentations and posters gave me a good high-level overview of what questions people are looking at, what kinds of data are produced and reused, what tools are developed, and what kinds of creativity and hard work are required in designing effective experiments.

Finally, I made a number of contacts and spent some time with my NESCent and Dryad community, my local UBC community, and others interested in open data and open science within this domain… crucial given the remote nature of my postdoc.

Left now with oodles of disjoint notes, ideas, and enthusiasm for my next steps. Here we go!

ETA:  Want more info on iEvoBio?  Summary of online artifacts.

June 27, 2010

Open science and data sharing at Evolution 2010 and iEvoBio

Filed under: conferences, Open Notebook Science, openscience — Heather Piwowar @ 11:21 am

I’m at the Evolution 2010 conference and will be attending the iEvoBio meeting (inspired by a similarly scoped conference for the domain of genome informatics, the Bioinformatics Open Source Conference, BOSC).

Here’s a quick list of upcoming open science and data sharing discussions that I’m aware of:

  • data archiving meeting today, Sunday June 27th, 5-6pm in room A106 (PDF listing)
  • Dryad table and team in the NESCent booth
  • Dryad talk at iEvoBio
  • iEvoBio meeting with a commitment to open source code
  • one or more lightning talks at iEvoBio will include open notebook science
  • birds-of-a-feather on open notebook science at iEvoBio
  • several of the journal editorial board meetings are discussing data sharing policies

If you are at Evolution and interested in a variant of open science, drop me a line and we can get in touch!  hpiwowar nescent org

February 24, 2010

Thanks, Science Commons Symposium

Filed under: conferences, Open Notebook Science, openscience — Heather Piwowar @ 11:10 am

Thank you, Lisa Green and Hope Leman.  Thank you, all the speakers and participants who travelled from far and wide to Seattle.  The recent Science Commons Symposium, Pacific Northwest, was fantastic.

I met, in person, many people I’d only corresponded with on the intertubes, and met many others I’m looking forward to corresponding with in the future.  I was inspired, informed, reinspired.

I’m still head-down in my thesis and so won’t give a rundown of the presentations or discussions.  For those of you who missed it, there are several great summaries out there.  Hashtag #spspn on twitter and FriendFeed.

Fantastic.

Also of note #1:  The Amtrak trip between Vancouver, Canada and Seattle is beautiful!  Definitely recommended.

Also of note #2:  Oh I wish cell phone plans cared less about borders.  I need to figure out a better solution for international meeting tweeting.

September 11, 2008

PSB Open Science workshop talk abstract

Filed under: conferences, MyResearch, opendata, openscience, sharingdata — Heather Piwowar @ 10:39 am

The program for the Open Science workshop at PSB 2009 has been posted.  Great diversity of topics… I’m really looking forward to it.

My talk abstract is below… comments and suggestions are welcome!

Measuring the adoption of Open Science

Why measure the adoption of Open Science?

As we seek to embrace and encourage participation in open science, understanding patterns of adoption will allow us to make informed decisions about tools, policies, and best practices. Measuring adoption over time will allow us to note progress and identify opportunities to learn and improve. It is also just plain interesting to see where we are, where we aren’t, and where we might go!

What can we measure?

Many attributes of open science can be studied, including open access publications, open source code, open protocols, open proposals, open peer-review, open notebook science, open preprints, open licenses, open data, and the publishing of negative results. This presentation will focus on measuring the prevalence with which investigators share their research datasets.

What measurements have been done? How? What have we learned?

Various methods have been used to assess adoption of open science: reviews of policies and mandates, case studies of experiences, surveys of investigators, and analyses of demonstrated data sharing behavior. We’ll briefly summarize key results.

Future research?

The presentation will conclude by highlighting future research areas for enhancing and applying our understanding of open data adoption.

April 2, 2008

A Centralized Proposal Repository

Filed under: openscience — Heather Piwowar @ 8:17 am

I actively support Nature Precedings as a place to archive my early research findings for visibility, feedback, and attribution. I recently submitted a research proposal. Precedings does a spot-check of all submissions to verify appropriateness. It usually takes a day or two, and results in an automated response stating that your submission has been posted. In this case, I received a personal, thoughtful email explaining that although Nature Precedings had published proposals in the past, they are moving away from this practice to concentrate on their core goal of “a repository for manuscripts, posters, and presentations describing completed research.”

I see the issue. On one hand, if the goal of a preprint is to get feedback and attribution for research ideas, what better time to do it than at the proposal stage? On the other hand, ideas are a dime a dozen and so it might not be scalable for Nature Precedings.

Sounds like we need another solution. There was a letter to Nature just recently (highlighted by Maxine Clarke in Nautilus), calling for a Centralized Proposal Repository. This idea is grander than simply a wiki for feedback and attribution. Dr Harel is suggesting it could also be searched by funders, to identify projects which match their interests.

Dr Harel expands on this idea on his website and links to the beginning of a Centralized Proposal Repository wiki.

Great idea, and I think right up the alley of open science. The wiki seems to be Protected (to limit spammers?). I’ll go ask to join, and keep you updated on what I learn.

ETA: Looks like Jean-Claude Bradley was involved in the set-up of the wiki. Great stuff!

March 25, 2008

PSB Open Science workshop: call for participation

Filed under: Open Notebook Science, openscience, PSB2009 — Heather Piwowar @ 12:20 pm

As reported by the organizers at One Big Lab and Science in the Open, PSB 2009 is going to have a 3 hour workshop devoted to Open Science. Neat, eh? What a great chance to meet and learn from others who are working this way and/or thinking about this topic! And discuss it with others who just happen to be in the PSB neighbourhood! And go to Hawaii in January! :)

I’ll definitely be submitting a talk proposal. Still brainstorming the topic. The winner in my head over the last 24 hours is “Open Science: Measuring the Costs and Benefits.” What do you think? Other ideas? Off to email Shirley and Cameron…..

July 24, 2007

Conversation with BMC on Open Notebook Science

Filed under: ISMB, openscience — Heather Piwowar @ 7:03 am

Wow, fantastic. I just had a conversation with Matt Hodgkinson, Senior Editor of the BMC series, which was worth the trip to Vienna all by itself.

While taking a break this morning from the ISMB talks and note-taking, it occurred to me that perhaps the best prep I could do for the ONS BoF was to talk to the journal publishers, all of whom happened to be standing a few feet away from me in their booths. Since I’m wearing my lovely free “I’m open” swag tshirt from BioMed Central (BMC) today, and I figured they’d be friendly to the cause, I started there.

Matt Hodgkinson will be familiar not only as an editor at BMC, but also as the author of the blog Journalology (“Science publishing trends, ethics, peer review, and open access”). I really enjoyed our informative discussion, Matt, thank you! As you read this, please feel free to clarify or add anything I’ve forgotten.

The bottom line: BMC has no hesitation considering research which has been previously posted to personal websites, blogs, wikis, and pre-print servers (as part of Open Notebook Science or otherwise), as long as it has not also been published in some formal way.

The details: Formal publishing is of course slightly difficult to nail down (they used to say “anything with a DOI”, but now Nature Precedings has a DOI without being considered a formal publication). A rule of thumb may be “anything with an ISSN.” Peer-review, or being indexed by PubMed, are not relevant to BMC when ascertaining prior formal publication status. Posters and abstracts are ok, but conference proceedings are usually considered formal publications. Again, pre-print servers (Nature Precedings, arXiv) are fine.

Our conversation also touched on publishing clinical trial data and protocols, negative results, the fact that publishers can and do help recover data from authors who don’t respond to reader requests, the BMC policies for data sharing relative to those of other journals, and the potential for publishing about ONS. Unfortunately, no time to go into details now…

Once again, thank you, Matt, for your enthusiasm and time. I’m off next to talk to the folks at PLoS.

ps Needless to say, my blogroll is very out of date. I actually read many more blogs than are listed… probably why I keep running out of time to update this blog and my blogroll… hrm.

July 18, 2007

ISMB 2007 BoF: Open (Notebook) Science

Filed under: BOF, ISMB, openscience — Heather Piwowar @ 10:11 am

There will be a Birds of a Feather session at ISMB 2007 about Open (Notebook) Science.  It was initiated by yours truly, not because I’m an expert (I’m not!) or even because I have any real experience doing Open Notebook Science (I don’t!), but because I’d like to meet others who are interested and have a good conversation.  Sounds like a BoF to me!

So if you are at ISMB and available Wednesday at lunch, stop on by.

ps Thanks to Bill Hooker for his great summary, and to all these people blogging about the Open Science Notebook [neat], and especially all those people who are really doing it.

Details:
Description: Open (Notebook) Science — the practice of freely and openly sharing the process, data, tools, and results of our research — is gaining momentum. For a nice overview, see http://3quarksdaily.blogs.com/3quarksdaily/2006/11/the_future_of_s.html. BOF for people doing, considering, or curious about Open Science.

Also note another BoF of interest, on Tuesday:
Data and Software Sharing
Barb Bryant [Vice President of the International Society for Computational Biology (ISCB)]

This session will explore options for Data and Software Sharing and is open to all to provide feedback to ISCB.

Powered by ScribeFire.

June 5, 2007

Blogging a doctoral dissertation

Filed under: blogging, openscience — Heather Piwowar @ 10:36 am

Great post on Pimm, Editing my doctoral thesis on stem cells in a blog: Why not?

I appreciate his research into this:

After all, what do I risk here? If someday I’d like to write a review out of the published introduction, can this cause a publishing problem? According to Maxine Clarke, Publishing Executive Editor of Nature (i.e. peer review and publishing policy expert) the status of a thesis is: “No, a doctoral thesis does not count as “previously published” and yes, you can submit work that was part of your thesis, with an appropriate citation.”

I also asked Maxine by mail and she was kind enough to enlighten me: There is no problem with you publishing your thesis in this way, so far as consideration for publication of any part of it for a Nature journal is concerned (or any NPG journal). We encourage communication between scientists via discussion of work and unpublished drafts in the form of theses, meetings, preprint servers, online scientific forums (between scientists) etc.

What we don’t allow is active solicitation of the media by scientists of work that will be or is submitted but not (yet) published in a Nature journal. So, in this case, if a journalist were to approach you because he/she had read part of your thesis on your blog and asks you about it, if this part of it is something you wish to submit for publication, you’d need to say to that journalist that you could not discuss it yet as you are planning to submit it to a journal, but that you’d be happy to talk about it when it is published. This is standard practice in most journals and journalists (reputable ones) are all aware of this type of policy. In our case, the policy is there to avoid “media hype” before a ms has been through peer-review.

Looking for a similar attempt I turned to Jean-Claude Bradley, whose Useful Chemistry is a pioneer website in open source science. Mr. Bradley mailed me: “If your advisor is fine with it I think it is a great idea. If you plan on submitting the work to a specific journal check with the editor as well. My student, Alicia Holsey is writing her entire masters thesis on our wiki
