GEODI Data Sources
Adding Data Sources
GEODI can connect to many different data sources. These sources can be indexed and discovered within a single project or across multiple projects. Data sources are defined through the Project Wizard.
Apart from source-specific settings, the following features apply to all sources:
File format support: File formats are processed in the same way regardless of the source.
Example: A PDF located in a folder, a PDF attached to a web page, or a PDF embedded in a database record are all processed in the same way.
Copies and similar files may exist across sources. For example, a file attached to an email may be a duplicate of a file stored in a folder.
Sampling discovery: Available for all sources. For instance, you may process one out of every N files, or M records per table.
Data Remediation Workflows: Actions such as deletion, quarantine, or classification are supported across many cloud and on-premise sources.
To enable remediation, you must allow it on a per-source basis and provide a user/credential with the necessary permissions.
OCR: OCR can be activated per source.
Permissions: You can define who can access or download data on a per-source basis.
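The two sampling modes mentioned above (one out of every N files, or M records per table) can be sketched as follows. This is only an illustration of the idea, not GEODI's implementation or API; the function names `sample_every_nth` and `sample_first_m` are hypothetical, and taking the first M records is an assumption, since the source does not say how the M records are chosen.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def sample_every_nth(items: Iterable[T], n: int) -> Iterator[T]:
    """Return an iterator over one item out of every n, starting with the first.

    Hypothetical helper: models "process one out of every N files".
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return islice(items, 0, None, n)


def sample_first_m(records: Iterable[T], m: int) -> List[T]:
    """Return at most the first m records.

    Hypothetical helper: models "M records per table", assuming the
    first M records are the ones taken.
    """
    if m < 0:
        raise ValueError("m must not be negative")
    return list(islice(records, m))


if __name__ == "__main__":
    files = [f"doc_{i}.pdf" for i in range(10)]
    print(list(sample_every_nth(files, 4)))
    print(sample_first_m(range(100), 3))
```

Both helpers accept any iterable, so the same logic applies whether the items come from a folder listing, a mailbox, or a database cursor.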
Risk Score
A risk score is defined for each source.
This score is used in reporting and helps you decide on actions such as deletion/quarantine after discovery.
The risk score is a value between 0 and 100.
Sources that may create risk if sensitive data is found should have a higher score.
Example: Shared/public areas usually have a high risk score.
Risk scores appear in the Content List reports.
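As a rough illustration of how a per-source risk score could guide follow-up actions after discovery, the sketch below ranks sources that contain sensitive findings by their score. It is not GEODI's reporting logic; the `Source` class, its fields, and the `prioritize` function are hypothetical, and the sample values are made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Source:
    """Hypothetical record for one data source after discovery."""
    name: str
    risk_score: int      # value between 0 and 100, set per source
    sensitive_hits: int  # number of sensitive items found (illustrative)

    def __post_init__(self) -> None:
        if not 0 <= self.risk_score <= 100:
            raise ValueError("risk_score must be between 0 and 100")


def prioritize(sources: List[Source]) -> List[Source]:
    """Return sources with sensitive findings, highest risk score first.

    Ties on risk score are broken by the number of findings.
    """
    flagged = [s for s in sources if s.sensitive_hits > 0]
    return sorted(flagged, key=lambda s: (-s.risk_score, -s.sensitive_hits))


if __name__ == "__main__":
    sources = [
        Source("HR database", 40, 12),
        Source("Public share", 90, 5),
        Source("Archive", 70, 0),
    ]
    for s in prioritize(sources):
        print(s.name, s.risk_score, s.sensitive_hits)
```

In this sketch a shared/public area with a high score is listed ahead of a lower-scored source even when it has fewer findings, which mirrors the guidance that exposure of the source drives the score.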
Note: Some data sources may not be included in your license.