Rules for Participation

This page summarises the initial Rules for Participation for the various roles within the EUCAIM project, starting with Data Providers and Research Communities, followed by Tool Providers, and concluding with Data User-Researchers.

Rules for Participation for Data Providers and Research Communities

Minimum requirements in terms of data

To facilitate participation as a Data Provider or Research Community, an onboarding process devoid of strict barriers has been devised. The minimum requirement for membership within EUCAIM is that applicants provide their project-specific datasets. Compliance with all legal obligations, including the requisite agreements with EUCAIM, is mandatory. Furthermore, the submission of project metadata and documentation is necessary to enhance comprehension of data origin and structure, although adherence to the Common Data Framework, as defined by EUCAIM, is not obligatory.

Acknowledging potential challenges in adapting data to a common hyper-ontology, EUCAIM affirms that non-compliance with the Common Data Framework (CDF) will not impede participation. Instead, data curation will be encouraged, and support will be provided for data transformation efforts to align with the CDF. Project funding will be sought to facilitate these endeavours.

To accommodate diverse levels of data compliance, three technical tiers have been established. These tiers are scalable, allowing participants to ascend as their datasets are integrated into new research projects, affording enhanced visibility and usability within the EUCAIM community. Specifically, comprehensive support, adaptation plans, and project funding will be leveraged to assist participants in achieving Tier 2 and Tier 3 compliance.

Tier 1: Acceptance of data with no CDF compliance, suitable for entry-level participation

In the first tier, data can be submitted without strict technical compliance requirements, which facilitates the participation of partners with valuable but less compliant datasets. However, a medium-term commitment to adapt data held in the Central Repository to EUCAIM standards is expected, and federated providers are expected to present plans for adaptation and improvement. At this level, the EUCAIM platform offers limited functionality, with no possibility of federated queries.

The minimum requirements vary according to the environment (clinical or research) and type of providers as follows:

  1. Research environment
    1. Central Repository providers: Finalised research projects without a data sustainability plan that would like to maintain their datasets openly available for research in the long term, but do not have the means to do it. In this case, the finalised project will directly transfer their data to the Central Storage upon project end and will be asked for:
      1. The signature of a Data Transfer Agreement (DTA) between parties.
      2. Information about their research project, metadata catalogue and software.
      3. Data de-identification
    2. Federated providers: Ongoing active repositories that would like to maintain their datasets in a federated node. In this scenario, Data providers will be asked for:
      1. The signature of a Data Sharing Agreement (DSA) between parties.
      2. Information about their research project, metadata catalogue and software.
      3. Information about local computational and storage capabilities, for the federated node.
      4. Once the project concludes, it is envisioned that its dataset will be moved to the Central Storage, under the conditions specified above.
  2. Research Communities: Finalised projects that would like to keep their datasets openly available for research purposes while keeping alive the community of researchers that made them possible (i.e. project partners). To this end, Research Communities will transfer their data to the Central Storage, while the community continues to work with its data and tools and may continue to apply for projects as a community, with EUCAIM as a partner. In this scenario, Research Communities will be asked for:
    1. The signature of a Data Transfer Agreement (DTA) between parties.
    2. The signature of a Collaboration Agreement (CA).
    3. Information about the research project, partners, metadata catalogue and software.
  3. Clinical environment
    1. Hospital providers: This refers to the Real-World Data scenario, where partner hospitals have their own Data Warehouses, populated via their Electronic Health Records (EHR). In this context, member hospitals will prepare specific datasets for projects in which they have agreed to participate. To do so, they will anonymize the dataset and decide whether to transfer it to the central storage or keep it within a federated node. Potentially, upon project completion, they may choose to transfer the dataset to the Central Repository. In both cases, the following agreements and related documentation will be requested:
      1. DSA/DTA per project 
      2. Collaborative Agreement (CA) per project
      3. Metadata catalogue
      4. Information about how the hospital data warehouse is structured: CDM (OMOP/FHIR), HPC requirements, IT policies.
      5. In addition, the deliverable defines each of these agreements and goes into more detail on the content of the metadata catalogue (dataset creation and general information, demographic and clinical information, image and modality information, and dataset statistics), as well as on the de-identification and negotiation process.
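
As an illustration only, the items requested from each type of provider at Tier 1 can be written down as a simple lookup table. The sketch below is in Python; the short labels and the `missing_items` helper are hypothetical names chosen for this example and are not official EUCAIM identifiers.

```python
from enum import Enum


class ProviderType(Enum):
    CENTRAL_REPOSITORY = "central repository provider"
    FEDERATED = "federated provider"
    RESEARCH_COMMUNITY = "research community"
    HOSPITAL = "hospital provider"


# Items requested from each provider type, following the list above.
# The labels are illustrative shorthand, not official EUCAIM identifiers.
TIER1_REQUIREMENTS = {
    ProviderType.CENTRAL_REPOSITORY: {"DTA", "project information", "de-identified data"},
    ProviderType.FEDERATED: {"DSA", "project information", "node capabilities"},
    ProviderType.RESEARCH_COMMUNITY: {"DTA", "CA", "project information"},
    ProviderType.HOSPITAL: {"DSA or DTA", "CA", "metadata catalogue", "data warehouse information"},
}


def missing_items(provider: ProviderType, submitted: set) -> list:
    """Return the requested items not yet submitted, sorted for stable output."""
    return sorted(TIER1_REQUIREMENTS[provider] - set(submitted))
```

For example, `missing_items(ProviderType.FEDERATED, {"DSA"})` lists the two items a federated provider in this sketch would still need to supply.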

Tier 2: Moderate compliance with the CDF, enabling federated search functionality

Compliance with EUCAIM’s Federated Query service involves a more detailed process for data providers, leading to enhanced data visibility and usability. To execute federated queries successfully, data must align with EUCAIM’s hyper-ontology, or a local mediator service may be employed for query execution and reporting aggregated results. Data access and federated queries may require extended Data Sharing Agreements compared to Tier 1 and additional clearance from data protection authorities. EUCAIM assumes the responsibility of improving Central Repository data compliance via project funding, while for federated nodes, this responsibility is shared, and the mediator component is an option to achieve Tier 2 compliance.

The minimum requirements at this level are as follows:

  • Standardized Frameworks. Compliance with standardized Common Data Models like OHDSI OMOP-CDM, FHIR, and W3C DCAT for data consistency and interoperability.
  • Mediator Component. Operation of a “Mediator” component for executing federated queries, including connecting to central infrastructure, query translation, result aggregation, and maintenance.
  • DICOM Metadata Exploitation. Effective utilization of DICOM metadata in medical imaging and AI-related applications.
  • FAIR Compliance. Adherence to FAIR principles with automatic evaluation tools, such as FAIR EVA, for monitoring compliance and considering different levels of FAIRness.
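
To make the metadata side of these requirements more concrete, the following Python sketch builds a minimal dataset description using terms from the W3C DCAT and Dublin Core vocabularies. It is an assumption-laden illustration: the function name, the chosen fields and all example values are hypothetical, and the actual EUCAIM metadata profile may require different or additional properties.

```python
import json


def build_dcat_record(identifier, title, description, license_url, keywords, media_format):
    """Build a minimal DCAT dataset description as a JSON-LD style dictionary."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:identifier": identifier,
        "dct:title": title,
        "dct:description": description,
        "dct:license": {"@id": license_url},
        "dcat:keyword": list(keywords),
        "dcat:distribution": [
            {"@type": "dcat:Distribution", "dct:format": media_format}
        ],
    }


# All values below are made up for the example.
record = build_dcat_record(
    identifier="example-dataset-001",
    title="Example imaging dataset",
    description="De-identified imaging studies with clinical variables.",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    keywords=["cancer", "imaging"],
    media_format="DICOM",
)
print(json.dumps(record, indent=2))
```

A record of this kind carries the identifier and license information that FAIR evaluation tools typically look for.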

Tier 3: Full compliance with the CDF, allowing distributed processing, including machine learning model training

Tier 3 represents full compliance, aiming for data harmonization, annotation, and quality. Data Providers and Research Communities aspire to reach Tier 3 to maximize data usability and enable advanced processing, including Machine Learning. EUCAIM actively participates in research projects, taking responsibility for advancing compliance through project funding and facilitating the adaptation process with support teams. This partnership enhances dataset compliance with the CDF, involving various technical requirements for data:

  • FAIR Compliance. More of the essential FAIR compliance indicators must be met than in Tier 2, covering the data itself and not just the metadata. Some key indicators include data identifier, accessibility, and license information.
  • Data Harmonization. To ensure that project data fully complies with the EUCAIM Common Data Framework, the following requirements are mandatory:
    • Adherence to specific data formats, like DICOM for images or CSV for numeric data.
    • Ensuring data is structured correctly to prevent compatibility issues.
    • Accurate alignment with data modality and target.
    • Maintaining consistency in the number of sequences.
  • Data Annotation and Labelling. Standardizing data annotation using the DICOM SEG format is essential. DICOM SEG offers a comprehensive and standardized approach to exchange information about image segmentations, ensuring consistency and quality in data annotations within the EUCAIM framework.
  • Data Quality Assessment. Ensuring data completeness is crucial. Data cleaning and enhancement may be necessary, along with maintaining time coherence of examination dates. Clinical endpoints must be defined consistently. Compliance with standards like OMOP-CDM, FHIR, and DICOM helps ensure data quality.
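
Some of the harmonization and quality requirements above lend themselves to automated checks. The Python sketch below shows one possible shape for such checks. It rests on several assumptions: the `Study` record and its fields are hypothetical, the accepted formats are only the two examples named in the text, and "time coherence" is interpreted narrowly as "no examination dated after the extraction date".

```python
from dataclasses import dataclass
from datetime import date

# Formats named as examples in the text; the full accepted list is an assumption.
ACCEPTED_FORMATS = {"DICOM", "CSV"}


@dataclass
class Study:
    patient_id: str
    exam_date: date
    data_format: str
    n_sequences: int


def check_studies(studies, expected_sequences, extraction_date):
    """Return a list of human-readable issues found in the given studies."""
    issues = []
    for s in studies:
        if s.data_format.upper() not in ACCEPTED_FORMATS:
            issues.append(f"{s.patient_id}: unsupported format {s.data_format}")
        if s.n_sequences != expected_sequences:
            issues.append(
                f"{s.patient_id}: {s.n_sequences} sequences, expected {expected_sequences}"
            )
        if s.exam_date > extraction_date:
            issues.append(f"{s.patient_id}: examination date is after the extraction date")
    return issues
```

An empty list means that none of the three checks in this sketch failed; it does not by itself establish Tier 3 compliance.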

Minimum requirements in terms of infrastructure

EUCAIM offers participants the option to select between two primary approaches for data contribution, ensuring alignment with individual needs and preferences. Data may be transferred to the Central Repository managed by the EUCAIM federation, or alternatively, maintained and shared as a Federated Node. Both approaches are equally valid, accommodating contributions to cancer research within EUCAIM according to individual preferences and requirements.

Central storage

  1. Functional requirements:
    1. UPV’s Central storage is based on the CHAIMELEON Cloud Repository technology and requires:
      1. A valid EUCAIM user account.
      2. A standard DICOM client compatible with DCM4CHEE.
      3. A REST API-based client application for uploading the eForms with the clinical data.
    2. The Medical Imaging Storage service of Euro-BioImaging ERIC: an XNAT instance operated and supported by Health-RI and Erasmus MC, together with upload and ingestion support and a service desk. It requires:
      • Linking to a federated Identity Provider or broker that authenticates users based on their institutional user accounts (like Life Science AAI or SURF SRAM).
      • The DICOM receiver of XNAT must be accessible via the Clinical Trial Processor to ensure secure, encrypted transport of DICOM data.
      • The API must be accessible so that users can download and upload data (xnatpy is a Python library that could be used for this purpose).
  2. Non-functional requirements:
    1. Access to a dedicated machine for uploading and administering data in the central node.
    2. Outgoing network connections over HTTPS must be allowed in order to connect to the central services of the Federated Learning platform.
    3. A technical contact person among the staff who can provide support in case of technical issues.

Federated node

  1. Functional requirements:
    1. Third parties’ temporary data transfer
    2. Federated Node Infrastructure Procurement: Each organisation must adhere to its local procurement procedures to timely obtain and set up the essential management and technical infrastructure, including Servers, Virtual Machines, and IaaS, necessary to host a federated node.
    3. Federated Node Processing Requirements: To participate in EUCAIM’s federated infrastructure, organisations must procure the infrastructure and hardware that meets the specific processing demands of the federated node. These requirements are categorised into various tiers of node participation, detailed in D5.1, spanning from strict data federation to GPU-aided edge computing.
    4. Federated Node Storage Requirements: Organisations must procure and set up the storage infrastructure ensuring that every hosted federated node aligns with EUCAIM’s data storage specifications. These needs, based on participation tiers, range from storing the organisation’s contributed dataset to offering supplementary storage for localised data processing projects; more details can be found in D5.1.
    5. Federated Node Network Requirements: Each federated node must be connected to the public internet via a wired connection. Relevant network infrastructure adjustments, such as firewall configuration, should be made to enable inbound or outbound access to the public internet on specific network ports.
    6. Configuration of Federated Node Software: For the operating system of the federated node, it is recommended to install stable Linux distributions such as Ubuntu, CentOS, or Debian. These distributions have proven reliability and are compatible with most software stacks required for EUCAIM. Organisations should update the distribution regularly to obtain security patches and maintain software stability. For integration with the EUCAIM federated infrastructure, EUCAIM software must be installed. Guidelines for installation of the required tools and the EUCAIM software stack are available in the online EUCAIM GitHub repository.
  2. Non-functional requirements:
    1. Network Infrastructure Management: Organisations that use added security protocols (e.g., VPN, virtual or reverse proxy networks, packet monitoring) must notify and collaborate with EUCAIM’s technical team. This ensures that the federated node maintains robust internet connectivity while upholding EUCAIM’s security and privacy guidelines.
    2. Physical Infrastructure Management: All procured physical infrastructure (processing, electrical, or network units), essential for the federated node, should be securely positioned, safeguarded from external hazards and detrimental environments. Adequate precautions must be taken against potential risks like food, liquids, or cleaning agents.
    3. Physical Infrastructure Access Management: Physical infrastructure for federated node hosting should reside in a restricted access zone, allowing entry only to approved individuals. Any access events must be continually monitored and recorded.
    4. Federated Node Data Redundancy Configuration: To ensure data integrity and accessibility, multi-layered redundancy at both the infrastructure and organisational levels is advised. An initial recommendation includes the deployment of a RAID disk configuration. Additionally, a comprehensive backup plan should be established to consistently safeguard the data’s latest version.
    5. Management of Digital Access to Infrastructure: To enhance security, separate user accounts should be created on the Linux machine for every authorised individual. This approach not only promotes accountability but also minimises risks associated with shared access. For remote management, SSH access can be granted. Authorised technical staff should be provided with user credentials and SSH keys, ensuring encrypted and secure access. Regular audits should be conducted to review access logs and ensure no unauthorised access attempts.
    6. Monitoring and Maintenance of Federated Node: It is suggested that each organisation monitors the health and performance of their hosted federated node. Automation tools can be implemented to track metrics and set up alerts for anomalies in each metric. Organisations should prioritise regular data and configuration backups, timely system updates, and patch applications. For optimal long-term health of the federated node, schedule periodic maintenance and document any configurations and changes made to the federated node for future reference.
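
The monitoring recommendation can be illustrated with a very small health check. The Python sketch below tracks a single metric (disk usage) against a threshold and produces an alert message when the limit is exceeded. The function names and the 90% limit are assumptions made for this example; a production node would normally rely on a dedicated monitoring stack rather than a script like this.

```python
import shutil
from typing import Optional


def disk_used_fraction(path: str = "/") -> float:
    """Return the fraction of disk space in use on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def check_threshold(metric: str, value: float, limit: float) -> Optional[str]:
    """Return an alert message if `value` exceeds `limit`, otherwise None."""
    if value > limit:
        return f"ALERT {metric}={value:.2f} exceeds limit {limit:.2f}"
    return None


if __name__ == "__main__":
    # The 0.90 limit is an arbitrary example value.
    alert = check_threshold("disk_used_fraction", disk_used_fraction("/"), 0.90)
    print(alert or "disk usage within limit")
```

Run periodically (for instance from a scheduler), a check of this kind gives an early warning before storage for the contributed datasets runs out.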

Legal and ethical requirements

Data Providers must ensure GDPR and national law compliance, furnish evidence of data origin, establish a valid legal basis for processing, and demonstrate commitment to data protection through GDPR compliance. Additionally, reports from Data Protection Officers confirming adherence to EUCAIM’s standards are requisite.

Ethics form the bedrock of EUCAIM’s mission. Adherence to ethical data practices, support for treatments endorsed by the Data Access Committee, and active participation in ethical verification activities are essential for responsible data sharing. Dedication to legal compliance and ethical standards is fundamental to EUCAIM’s success in advancing cancer research and innovation.

Rules for Participation for Tool Providers

The Rules for Participation to be met by entities wishing to become EUCAIM Tool Providers and make their tools, services or applications available to users, either to perform federated processing or pre-processing of data from the platform, are the following:

Containerization and Compliance:

  • All tools must be provided as containerized images (like Docker) to work with container orchestrators (e.g., Kubernetes).
  • Compliance with input and output specifications outlined in EUCAIM’s technical documentation is required.
  • Data should be stored dynamically, and a dedicated volume is provided for tool results.

EUCAIM Terms of Usage:

  • Tools added to EUCAIM must adhere to technical guidelines and terms of usage provided by EUCAIM.
  • Tool documentation should specify use cases, expected performance, and any contraindications.

User Support and Maintenance:

  • Tools must provide a communication channel for user support through the EUCAIM Helpdesk.
  • Tool providers must offer long-term support, lasting until at least the end of the EUCAIM piloting stage, with Service Level Agreements (SLAs) to ensure software updates and security.

Minimum Documentation:

  • Required documentation includes user manuals, license agreements, data usage and privacy policies, security information, version control, compliance and certification documents, API documentation, technical support details, and instructions for use.

Traceability Mechanisms:

  • Tools must record user actions and provide logs to monitor usage, including error codes for incident identification.
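
A minimal sketch of such a traceability mechanism, using Python's standard `logging` module, is shown below. The error codes, the message layout and the function names are hypothetical; EUCAIM's technical documentation may prescribe a different log format.

```python
import logging
from enum import Enum
from typing import Optional


class ErrorCode(Enum):
    # Hypothetical codes, for illustration only.
    INVALID_INPUT = "E001"
    PROCESSING_FAILED = "E002"


def make_audit_logger(name: str = "tool.audit") -> logging.Logger:
    """Create a logger that timestamps every recorded user action."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def format_event(user_id: str, action: str, error: Optional[ErrorCode] = None) -> str:
    """Render one user action as a single log message."""
    if error is None:
        return f"user={user_id} action={action} status=ok"
    return f"user={user_id} action={action} status=error code={error.value}"


def record_action(logger, user_id, action, error=None) -> str:
    """Log the action at INFO level, or at ERROR level when an error code is given."""
    message = format_event(user_id, action, error)
    if error is None:
        logger.info(message)
    else:
        logger.error(message)
    return message
```

Keeping the error code as a separate `code=` field makes it straightforward to filter the logs when identifying incidents.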

Monitoring Capabilities:

  • Tools should provide information for EUCAIM to monitor their status.

Benchmarking:

  • Tool providers must share detailed information about their tools, including their purpose, training data, type, task, performance metrics, input requirements, output format, licensing, hardware and RAM needs, processing time, programming language, keywords, publications, and a URL for more information.
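
The benchmarking information listed above could be captured in a structured descriptor. The Python sketch below is one hypothetical way to do so: the field names mirror the list in the text, except `name`, which is added here as an assumption, and the `empty_fields` helper simply reports which entries have been left blank.

```python
from dataclasses import asdict, dataclass


@dataclass
class ToolDescriptor:
    name: str  # not in the list above; added for this example
    purpose: str
    training_data: str
    tool_type: str
    task: str
    performance_metrics: dict
    input_requirements: str
    output_format: str
    license: str
    hardware: str
    ram_gb: float
    processing_time: str
    programming_language: str
    keywords: list
    publications: list
    url: str


def empty_fields(descriptor: ToolDescriptor) -> list:
    """Return the names of fields left empty (blank string, None, empty list or dict)."""
    return [
        field_name
        for field_name, value in asdict(descriptor).items()
        if value in ("", None, [], {})
    ]
```

A registry could run `empty_fields` on each submission and ask the tool provider to complete whatever it returns.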

Quality Control:

  • Quality control measures include code-related quality checks, functional validation, registries for documentation, and external assessments for open-source software.

Security and Privacy Compliance:

  • Tools handling sensitive data must comply with GDPR guidelines, undergo vulnerability analysis, and pass data privacy assessments.
  • Breach procedures and contact points for updates should be established, and software libraries must be maintained.

Legal Compliance:

  • Tools must comply with current and upcoming European and national legislation, including GDPR.
  • Tool providers should obtain necessary approvals for data processing, comply with Data Access Committee agreements, and provide information on data processing safeguards.
  • Ethical requirements include an AI risk analysis, preferably using the ALTAI Tool, to assess AI-based technologies for Trustworthy AI compliance.

Rules for Participation for Data User-Researchers

Data User-Researchers (DU-Rs) should be able to explore the public catalogue of available metadata and, if they wish, request access to the data and, if the request is accepted, process the data using the tools available on the platform or their own AI tools. To do so, they have to follow the procedures detailed in this section and comply with the corresponding requirements. All of these procedures rest on the main prerequisite that the Data User-Researcher has an approved research project, which can range from a final-year undergraduate project to a large funded grant.

User Identity Checking Procedure:

  • Data User-Researchers (DU-R) register and authenticate via Life Science AAI (e.g., using ORCID or affiliation).
  • Users must accept usage conditions as per EUCAIM’s Acceptable Use Policy.

Data Access Request Process:

  • DU-Rs request data access, providing documentation (e.g., study protocol) and Ethics Committee approval.
  • EUCAIM Negotiator tool manages the request process, involving Access Committees and data providers.
  • Approval grants authorization to use datasets.

Request Form and Specific Requirements:

  • EUCAIM Negotiator facilitates the request process, linking DU-Rs with Access Committees.
  • DU-R provides data request details, including identification, desired datasets, research question, ethics committee approval, and more.
  • Data provider(s) may review the request based on terms of use.
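
The request workflow described in the two sections above can be pictured as a small state machine. The Python sketch below is purely illustrative: the state names, the fields and the completeness rule are assumptions made for this example and do not describe how the EUCAIM Negotiator is actually implemented.

```python
from dataclasses import dataclass
from enum import Enum


class RequestState(Enum):
    DRAFT = "draft"
    SUBMITTED = "submitted"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class AccessRequest:
    requester_id: str
    datasets: list
    research_question: str
    ethics_approval: bool
    state: RequestState = RequestState.DRAFT

    def submit(self) -> None:
        """Submit the request; it must name datasets, a question and ethics approval."""
        if not (self.datasets and self.research_question and self.ethics_approval):
            raise ValueError("request is incomplete")
        self.state = RequestState.SUBMITTED

    def decide(self, approved: bool) -> None:
        """Record the decision of the Access Committee and data providers."""
        if self.state is not RequestState.SUBMITTED:
            raise ValueError("only submitted requests can be decided")
        self.state = RequestState.APPROVED if approved else RequestState.REJECTED
```

In this sketch, only a request that reaches `APPROVED` would grant authorization to use the requested datasets.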

Legal and Ethical Requirements:

  • Legal requirements may align with the EHDS Proposal.
  • Users, whether individual or corporate, must register and accept terms and conditions.
  • Ethical requirements include describing data processing ethics, providing ethics committee approval, and conducting an AI risk analysis using the ALTAI Tool when necessary.
