Questions From Users
The Data Help Desk has highlighted several areas where researchers frequently seek assistance. These needs underscore the importance of ongoing efforts to provide support and resources in these areas.
This compilation reflects the diverse needs of researchers engaging with data and software and highlights the valuable role of Data Help Desks in providing guidance and connecting users with relevant resources. The prevalence of questions on data repositories, FAIR principles, and data management indicates key areas where the community seeks support.
Questions by topic
Finding Data
- Where can I find certain types of data?
- Where do I find specific datasets?
Data Repositories and Storage
- What’s a good repository for a certain type of data?
- Where can I deposit my model (or other large) datasets?
- Where can I archive my data?
- Where can I store my data?
- Can I just link to my data portal or do I need to put it in a repository (to comply with policies at AGU)?
- Where can I deposit lightning data (domain-discipline repository)?
- Resources for starting a data repository? Arctic ice data, based in Taiwan
- Where to share simulation data, 1 - 2 TB? Institutional repository will only take observational data
- Where to share a specific type of data such as hydrology data?
Specific Questions re Repositories
- What types of data does your repository accept?
- Does the archive accept data from any funding stream, or just from NSF-funded projects (for example)?
- Are there any restrictions on data that are accepted (e.g., data sets > 1 TB can’t be accommodated, or repo won’t accept model output)?
- What metadata standard does your repository use?
- What is the procedure by which a scientist submits metadata (e.g., via Word template)?
- Are there personnel at the repository who can help a scientist get their data and metadata archived?
- When should a researcher engage with your repository?
- Does your repository offer any training opportunities?
- What license(s) does the repository support?
- How much does it cost to deposit a dataset in your repository?
Data Management
- What makes data FAIR? And what is FAIR anyway?
- Working with data among teams; encouraging best practices, defining sampling system before fieldwork, early adoption, IGSN
- How to make data accessible to educators or people not in the specific field
Data Citation and Attribution
- How can I cite a dataset?
- Using data from other datasets (citing), data from different instruments
- How do I cite my data?
- Software citation; reducing duplication?
Data Publication
- How and where can I publish my data or software?
- What if I don’t want to share my data until I’ve published papers on it?
- Looking for a low-cost publication/journal for climate modelling paper?
- Self-funded work
Software
- What kind of software do I need to share with my paper? Python packages?
- Looking for help / policy documents to show legal team to share software externally
- Sharing software publication requirements - NASA/OSTP mandate
- Resources for code documentation
- More resources on software documentation?
- Should I be looking at software when I review papers? (This is difficult and takes time)
FAIR Principles and Open Data
- What does FAIR really mean?
- What is Open Science and Data?
- What is a better definition of ‘accessible’ in FAIR?
Data Management Plans (DMPs)
- How can I write a good data management plan (DMP)?
- Where can I find good instructions or a form for creating a data management plan?
Compliance with Funders and Publishers
- Details of requirements and demands from funding agencies and journal publishers, so I can develop procedures to square those with my (FFRDC, defense) employer’s public release approval requirements. Software is especially thorny!
- All these new policies! How do I comply with my funder and publisher requirements for my data and software?
Data Visualization and Analysis
- Looking for resources for data visualization and geospatial analysis
- Matplotlib: Would like to put grid lines *behind* data. ‘z order’ or ‘axes-behind = True’ not working.
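For the Matplotlib grid-line question, one possible approach is sketched here. It is not an answer recorded at the help desk; it assumes the asker is drawing a bar chart (filled shapes are the usual case where grid lines appear on top of the data) and uses `Axes.set_axisbelow(True)`, which asks Matplotlib to draw ticks and grid lines beneath the other plot elements. The category names, values, and the choice of the non-interactive `Agg` backend are illustrative assumptions.

```python
import io

import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Illustrative values only; not taken from any real dataset.
categories = ["A", "B", "C"]
values = [3, 5, 2]

fig, ax = plt.subplots()
bars = ax.bar(categories, values)

ax.grid(True)           # turn the grid on
ax.set_axisbelow(True)  # draw ticks and grid lines below bars, lines, and markers

# Render to an in-memory buffer rather than a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

The same behavior can likely be set globally with `matplotlib.rcParams["axes.axisbelow"] = True`, which may be what the ‘axes-behind = True’ attempt was aiming for.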
Working with Large Datasets
- Question about who can take 15 to 20 TB of cyclonic event data going back to the 90s.
- Is it ok to share processed data (which is much smaller) or do I need to share the raw file (very large) version?
Metadata
Common questions about metadata encountered at a data help desk relate to its importance, standards, submission procedures, and what information to include. Here are some specific examples and areas of inquiry:
- What is the definition or description of metadata? This suggests that some users may lack a basic understanding of what metadata entails.
- Why is metadata important? When describing samples, it’s important to provide rich metadata that describes basic characteristics. Taking time upfront to maximize this information will save time when reusing or sharing samples in the future.
- What metadata standard does your repository use? Researchers need to know which standards are expected by different data repositories.
- What is the procedure by which a scientist submits metadata (e.g., via Word template)? Users require practical guidance on how to actually provide metadata to a repository.
- Are there personnel at the repository who can help a scientist get their data and metadata archived? This shows that some users need assistance with the technical aspects of metadata creation and submission.
- What to document with metadata? This question highlights the need for clarity on the specific information that should be included as metadata.
- What is good metadata? Consistent data formatting and descriptions help both machines and humans understand and reuse valuable data, which is why good metadata matters.
Computational Notebooks
- Notebooks Now!: what does that submission look like? Curvenote, arXiv/EarthArXiv integration for notebook preprints
- Looking for notebooks for academic/teaching purposes; DeepNote
- Notebooks Now! AGU open science initiatives, are we providing resources to provide an environment?
- How do I share and get credit for my notebooks? Talk to us about making your Jupyter, RMarkdown notebooks available
- Can I give my computational notebook a DOI?
Miscellaneous Questions
- How to best trace the number of citations from a publication?
- How do I locate an expert to answer a question?
- What resources do you have for atmospheric science?
- Does AGU have data science related jobs?
- Is the ESIP Marine working group here?
- How to put an AI model or algorithm in an open access repository when it may fall under copyright or fair use
- Managing digital presence
- How to store geo data formats like .shp, .geojson, .kml in a good format. Are these proprietary? What would be a good nonproprietary format for GIS data?
- Connecting scientists and communities and policy makers for decision making in water sanitation - resources?
- Finishing masters, former software engineer, looking for next career
- Open access fees - where does the money go and why are they so high?
- What is an ORCID, and what is a ROR ID? And why are they useful to me?
- What is a DOI? (Digital Object Identifier)
- How to preserve models and simulation data?