January 31, 2018 at 1:00 pm #6640
An interesting article: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0067332. Here is an excerpt…
Data sharing is perhaps better understood as the problem of making best use of research resources. Researchers produce large amounts of data, some of which may be useful to others. Making those data useful to others requires a substantial investment in documentation, and often in interpersonal negotiation, above and beyond the conduct of the research per se. It is not possible to justify making that level of investment in all data just in case someone, somewhere, at some future time, might wish to use them. The originating investigator bears the cost of data preparation. Other entities such as data repositories, universities, libraries, and funding agencies are likely to bear the cost of curating those data for sustainable access. Unknown – and often non-existent – reusers reap the benefits. This equation is not viable in economic or social terms.
October 1, 2018 at 11:20 am #8442
This is to point to the work of our colleagues from Stuttgart:
“This paper targets the challenges of research data management with a focus on High Performance Computing (HPC) and simulation data. Main challenges are discussed: The Big Data qualities of HPC research data, technical data management, organizational and administrative challenges. Emerging from these challenges, requirements for a feasible HPC research data management are derived and an alternative data life cycle is proposed. The requirement analysis includes recommendations which are based on a modified OAIS architecture: To meet the HPC requirements of a scalable system, metadata and data must not be stored together. Metadata keys are defined and organizational actions are recommended. Moreover, this paper contributes by introducing the role of a Scientific Data Manager, who is responsible for the institution’s data management and taking stewardship of the data.”
October 11, 2018 at 9:47 am #8493
Regarding the costs borne by the investigators: if they are funded by the EU, the EU can (and in fact frequently does) require that the data be provided under the Open Data scheme.
Another point is that ‘sharing’ does not mean ‘sharing for free’. Data could perhaps be shared for free for research, but for a fee for commercial use.
And finally, people will use data if (a) they know that the possibility exists, (b) they believe in its reliability, and (c) there is a common standard for data (EMMO?).
October 12, 2018 at 9:58 am #8512
Documenting a product works well when the author has a vested interest in having the documentation. That motivation may take several forms:
1. Author wants to refer to the documentation themselves, because using the documentation makes the author’s life easier.
2. Author wants to encourage other people to use the product, because having more users is rewarded (e.g. through funding).
3. Author wants to spend less time helping people one-to-one, because it makes the author’s life easier.
How much documentation should the author create? In case 1, only as much as the author wants. In case 2, enough to convince a few people that using the product is feasible. In case 3, add to the existing documentation whenever people ask the same question repeatedly.
For case 2, it is important to be able to monitor the number of users of the product. Many products, including free-to-use products, require registration before use. https://github.com/ and https://www.npmjs.com/ publish the number of downloads for each “product”, so you can see what is popular and what is not.
November 4, 2018 at 6:19 am #9093
Documentation of data does involve considerable effort on the part of the data provider/researcher. At materials.zone we suggest two ways to incentivise researchers to document their data and metadata in a structured and organized way.
We offer a data management platform which includes parsers that process the uploaded data formats and save them as JSON files, along with data visualizations and data analysis tools. These give the researcher an added personal benefit and compensate for the effort of uploading data. In addition, if the researcher wants to use machine learning tools on the data, having the data already saved in a structured way saves time and effort later on.
We offer a ‘data marketplace’ where researchers can be financially compensated for their data sharing and documentation efforts. This marketplace will serve as a meeting point between data providers and data consumers, and the data consumers will pay for the use of the data they are interested in. Under this model the data provider benefits financially from the documentation effort.
Learn more about our efforts at https://www.materials.zone/ and view some of our data on the marketplace at https://measure.zone/