DataHub provides simple, scalable access to distributed data for computation, and lets you publish a dataset and make it available to a specific community, or worldwide, across federated sites.
Overview
DataHub
DataHub allows you to bring data close to computing to exploit it efficiently, and to publish a dataset and make it available to a specific community, or worldwide, across federated sites. DataHub is based on the Onedata technology.
You can access the service for evaluation using the PLAYGROUND shared space, or request support to publish your data and have dedicated storage assigned.
Main features
- Discovery of data via a central portal.
- Access to data conforming to required policies, which may be:
  - unauthenticated open access;
  - access after user registration; or
  - access restricted to members of a Virtual Organization (VO).
- Access to data via GUI, POSIX, and CDMI.
- Replication of data from data providers for resiliency and availability purposes. Replication may take place either on-demand or automatically.
- Authentication and Authorization Infrastructure (AAI) integration of the EGI DataHub with other EGI components and with user communities' existing infrastructure.
- Metadata and shares management.
- Data import and data caching based on file popularity.
- Support for many backends (Ceph, S3, GlusterFS, POSIX, etc.).
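To illustrate the POSIX access listed above: once a space is mounted on the local filesystem, ordinary file operations are enough to work with the data. The following is a minimal sketch under stated assumptions, not a description of the DataHub client itself. It assumes the space has already been mounted (for example with Onedata's Oneclient), that each space appears as a directory under the mount point, and that the mount point `/mnt/datahub` is a hypothetical path. The helper names `list_space_files` and `read_head` are also illustrative; only the Python standard library is used.

```python
from pathlib import Path


def list_space_files(mount_point: str, space_name: str) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for every regular file in a space.

    Assumes the space is visible as a directory under the mount point.
    """
    space_dir = Path(mount_point) / space_name
    if not space_dir.is_dir():
        raise FileNotFoundError(f"space directory not found: {space_dir}")

    files = []
    # rglob("*") walks the directory tree recursively; sorting keeps output stable.
    for path in sorted(space_dir.rglob("*")):
        if path.is_file():
            relative = path.relative_to(space_dir).as_posix()
            files.append((relative, path.stat().st_size))
    return files


def read_head(mount_point: str, space_name: str, relative_path: str,
              num_bytes: int = 64) -> bytes:
    """Read the first num_bytes of a file through plain POSIX I/O."""
    with open(Path(mount_point) / space_name / relative_path, "rb") as handle:
        return handle.read(num_bytes)


if __name__ == "__main__":
    # Hypothetical mount point; PLAYGROUND is the shared evaluation space.
    try:
        for name, size in list_space_files("/mnt/datahub", "PLAYGROUND"):
            print(f"{size:>12}  {name}")
    except FileNotFoundError as error:
        print(error)
```

Because the mount behaves like a local directory, existing tools and analysis scripts should be able to read the data without DataHub-specific changes. Actual paths and mount options depend on how the client is configured.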
TRL 8: Actual system proven in operational environment.
See the DataHub privacy policy to learn how personal data are processed when using this service.