GEODI Data Sources

📂 Adding Data Sources

GEODI can connect to many different data sources. These sources can be indexed and discovered within a single project or across multiple projects. Data sources are defined through the Project Wizard.

Project Wizard


Apart from source-specific settings, the following features apply to all sources:

  • 📑 File format support → File formats are processed in the same way regardless of the source.

    • Example: A PDF located in a folder, a PDF attached to a web page, or a PDF embedded in a database record are all processed in the same way.

    • Copies and similar files may exist across sources. For example, a file attached to an email may be a duplicate of a file stored in a folder.

  • 📊 Sampling discovery → Available for all sources. For instance, you may process one out of every N files, or M records per table.

  • 🛡️ Data Remediation Workflows → Actions such as deletion, quarantine, or classification are supported across many cloud and on-premise sources.

    • To enable this, you must allow it on a per-source basis and provide a user/credential with the necessary permissions.

  • 🔎 OCR → OCR processes can be activated per source.

  • 🔒 Permissions → You can define who can access or download data on a per-source basis.
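
The sampling idea above (one out of every N files, or M records per table) can be illustrated with a minimal Python sketch. This is not GEODI's implementation or API; the function names `sample_every_nth` and `sample_records_per_table` are hypothetical, and the sketch assumes the simplest possible selection rule (take the first item of each group of N, and the first M rows of each table).

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def sample_every_nth(items: Iterable[T], n: int) -> Iterator[T]:
    """Yield one item out of every n, starting with the first (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # islice with a step of n keeps items at positions 0, n, 2n, ...
    return islice(items, 0, None, n)


def sample_records_per_table(tables: dict[str, list[T]], m: int) -> dict[str, list[T]]:
    """Keep at most m records from each table (hypothetical helper)."""
    if m < 0:
        raise ValueError("m must not be negative")
    # Slicing never fails on short tables; they are returned whole.
    return {name: rows[:m] for name, rows in tables.items()}


# Illustrative data only: ten files, sampled one in four.
files = [f"doc_{i}.pdf" for i in range(10)]
print(list(sample_every_nth(files, 4)))  # ['doc_0.pdf', 'doc_4.pdf', 'doc_8.pdf']

# Illustrative data only: two tables, at most two records each.
tables = {"customers": [1, 2, 3], "orders": [4]}
print(sample_records_per_table(tables, 2))  # {'customers': [1, 2], 'orders': [4]}
```

A real discovery run may choose samples differently (for example, randomly); the point here is only that sampling trades coverage for speed by touching a fraction of each source.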


⚠️ Risk Score

  • 📌 A risk score is defined for each source.

  • 📊 This score is used in reporting and helps you decide on actions such as deletion/quarantine after discovery.

  • 🔢 The risk score is a value between 0 and 100.

  • ⚡ Assign a higher score to sources where sensitive data, if found, would create more risk.

    • Example: Shared/public areas usually have a high risk score.

  • 📑 Risk scores appear in the Content List reports.
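
To show how a per-source risk score might guide follow-up actions, here is a minimal Python sketch. It is an illustration only, not GEODI's reporting logic: the `Source` class, the `review_order` function, and the sample scores are all assumptions made for the example. The only facts taken from the text are that each source has a score in the 0 to 100 range and that riskier sources deserve attention first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A data source with its risk score (hypothetical model, not a GEODI type)."""
    name: str
    risk_score: int  # expected range: 0 to 100

    def __post_init__(self) -> None:
        if not 0 <= self.risk_score <= 100:
            raise ValueError("risk_score must be between 0 and 100")


def review_order(findings: list[tuple[Source, int]]) -> list[str]:
    """Return names of sources with sensitive findings, highest risk first."""
    # Keep only sources where discovery reported at least one sensitive hit.
    flagged = [source for source, hits in findings if hits > 0]
    # sorted() is stable, so equal scores keep their input order.
    ranked = sorted(flagged, key=lambda s: s.risk_score, reverse=True)
    return [source.name for source in ranked]


# Illustrative scores only: a shared area is rated higher than a private archive.
findings = [
    (Source("hr-archive", 30), 12),
    (Source("public-share", 90), 3),
    (Source("mail-server", 60), 0),
]
print(review_order(findings))  # ['public-share', 'hr-archive']
```

In this sketch the shared area is reviewed first even though it has fewer hits, which mirrors the guidance that shared/public areas usually carry a high risk score.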


ℹ️ Some data sources may not be included in your license.