General Data Management: Users are often looking for a basic understanding of what data management entails and seek guidance on establishing sound data management practices for their research. They have broad questions about organizing, accessing, and working with their data effectively.
Information Management: Researchers are interested in how to effectively manage the information surrounding their data, including documentation, organization, and understanding the context of their scientific findings. This can involve strategies for tracking workflows and creating descriptions of projects from the start.
Metadata: A key area of interest is understanding metadata standards and learning how to create comprehensive metadata records that enhance the findability and reusability of their data. Users also inquire about the specific procedures for submitting metadata to repositories.
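As a rough illustration of what a basic metadata record can look like, the sketch below builds one as a Python dictionary, checks it for missing fields, and serializes it to JSON. The field names and the list of required fields are assumptions chosen for illustration; they are not taken from any particular metadata standard or repository, each of which defines its own schema.

```python
import json

# Hypothetical set of required fields; a real repository publishes its own list.
REQUIRED_FIELDS = ("title", "creators", "description", "keywords", "license")


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Illustrative record describing a made-up dataset.
record = {
    "title": "Example coastal temperature observations",
    "creators": ["A. Researcher"],
    "description": "Hourly water temperature from a hypothetical mooring.",
    "keywords": ["oceanography", "temperature"],
    "license": "CC-BY-4.0",
}

problems = missing_fields(record)  # an empty list means the record is complete
metadata_json = json.dumps(record, indent=2, sort_keys=True)
```

Checking completeness before submission is the main point here; the JSON output is only one of several serializations a repository might accept.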
Data Repositories: There is significant interest in identifying appropriate repositories for depositing data of various types, such as model outputs or large datasets. Users have specific questions about repository policies, accepted data formats, size limitations, costs, and available support for archiving data. Finding domain-specific repositories is also a common need.
Ocean Data: This topic caters to researchers working with marine science data who are interested in resources and best practices specific to finding, managing, visualizing, analyzing, and publishing oceanographic datasets.
Data Visualization: Users seek guidance on effective techniques and software for visualizing different types of data. They may have questions about specific tools or challenges in presenting their data visually, such as arranging elements in plots.
Analysis: Researchers are keen to learn about various data analysis tools and methodologies, particularly for handling and processing large or complex datasets. They may be interested in specific software packages or statistical techniques relevant to their field.
Publication: Users frequently ask about the process of publishing their data and software, including where to publish and how to meet the requirements of funders and journal publishers. Understanding open access options and associated fees is also a point of inquiry.
R: There is strong interest in leveraging the R programming language for various data-related tasks, such as data cleaning, manipulation, statistical analysis, and generating reproducible research. Users also look for resources tailored for R users in scientific contexts.
RStudio: For users familiar with R, presentations on using the RStudio integrated development environment for more productive data analysis, visualization, and report generation would be valuable.
Python: As with R, researchers want to learn how to use Python for data management, analysis, and visualization, and potentially for developing shareable software. Resources and best practices for using Python in scientific workflows are relevant.
GIS: Users working with geospatial data are interested in understanding best practices for storing and managing GIS data formats like .shp and .geojson, as well as identifying nonproprietary alternatives for better interoperability.
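To make the interoperability point concrete, here is a minimal sketch that writes point locations as GeoJSON using only the Python standard library. The station names and coordinates are made up for illustration. GeoJSON is plain JSON text, so it can be produced and read without GIS-specific software, and each position lists longitude first, then latitude.

```python
import json


def point_feature(lon, lat, properties):
    """Build a GeoJSON Feature for one point (longitude first, then latitude)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


# Hypothetical sampling stations: (name, longitude, latitude).
stations = [("Station A", -70.5, 41.5), ("Station B", -122.0, 36.8)]

collection = {
    "type": "FeatureCollection",
    "features": [
        point_feature(lon, lat, {"name": name}) for name, lon, lat in stations
    ],
}

# The resulting text could be saved to a .geojson file.
geojson_text = json.dumps(collection, indent=2)
```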
Satellite-derived Data: Users who work with satellite-derived data are interested in how to access, process, analyze, and visualize this type of Earth observation data.
NOAA Data: Researchers who utilize data from the National Oceanic and Atmospheric Administration (NOAA) would be interested in presentations covering how to find, access, and work with various NOAA datasets and resources.
Data Management: As the foundation for the other topics, this area covers core principles and best practices in data management, addressing questions about organization, accessibility, preservation, and overall data lifecycle management. Many users, especially those with little formal training, need basic awareness in this area.
Git and GitLab/GitHub: Researchers are increasingly interested in using version control systems like Git and platforms like GitLab and GitHub for managing their code and data, fostering collaboration, and ensuring reproducibility in their research workflows.
Data Processing Workflows and Pipelines: Users want to learn how to design and implement efficient, reproducible workflows for processing their data, potentially using scripting languages and tools to build automated pipelines.
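One simple way to picture a scripted pipeline is as an ordered list of small steps, each taking the output of the previous one. The sketch below is only an illustration under that assumption: the CSV content, column names, and step functions are hypothetical, and real projects often use dedicated workflow tools instead.

```python
import csv
import io

# Hypothetical raw input; station B has a missing temperature value.
RAW_CSV = """station,temp_c
A,12.5
B,
C,14.0
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_missing(rows):
    """Keep only rows that have a temperature value."""
    return [row for row in rows if row["temp_c"].strip()]


def to_float(rows):
    """Convert the temperature column from text to float."""
    return [{**row, "temp_c": float(row["temp_c"])} for row in rows]


def mean_temp(rows):
    """Compute the mean temperature across the remaining rows."""
    return sum(row["temp_c"] for row in rows) / len(rows)


# The pipeline is just the steps in order, so it can be rerun or extended.
STEPS = [read_rows, drop_missing, to_float, mean_temp]


def run_pipeline(data, steps):
    """Feed the data through each step in turn and return the final result."""
    for step in steps:
        data = step(data)
    return data


result = run_pipeline(RAW_CSV, STEPS)
```

Keeping every step as a named function makes each one easy to test on its own and documents the order of operations in a single place.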
Functional Programming: For more advanced users, presentations on applying functional programming concepts and techniques in languages like R or Python for cleaner, more modular, and potentially more efficient data analysis could be of interest.
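For readers unfamiliar with the style, the following sketch shows three common functional building blocks in Python: pure functions, `filter`/`map`, and `functools.reduce`. The readings, the -999.0 sentinel, and the validity threshold are assumptions made up for this example.

```python
from functools import reduce

# Hypothetical temperature readings in degrees Celsius; None and -999.0
# stand in for missing values.
readings_c = [12.5, None, 14.0, -999.0]


def is_valid(t):
    """Pure predicate: True for readings that are present and plausible."""
    return t is not None and t > -100.0


def celsius_to_kelvin(t):
    """Pure conversion: the result depends only on the argument."""
    return t + 273.15


# filter and map return new sequences and leave readings_c unchanged.
valid = list(filter(is_valid, readings_c))
kelvin = list(map(celsius_to_kelvin, valid))

# reduce folds the sequence into a single value, here the maximum.
warmest_k = reduce(lambda a, b: a if a >= b else b, kelvin)
```

Because none of these functions modify their inputs, each can be tested in isolation and reused in other analyses.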
Community Management: Some researchers are interested in learning how to engage with broader open science communities and initiatives, including contributing to projects, collaborating with others, and staying informed about advancements in data sharing and management practices.