GEODI Classifier Label/Tag Definitions
GEODI Classifier has a flexible labeling mechanism that can adapt to your DLP software or to an existing labeling schema.
PDF, Office, and LibreOffice documents carry the labels themselves; ADS (Alternate Data Stream) is used for all other file types.
GEODI directly supports files tagged with your current classification tool and adapts to the existing label scheme.
Labels and Sample Files
Below are the labels used in the standard template and sample files. The first lines are mandatory; the others are optional. You may change them and add new ones.
Label Structure
Classes are created as described on this page: https://decesw.atlassian.net/wiki/spaces/geodien/pages/3987177529. Labels must follow the rules below.
name:value
name:value
...
At least one label must represent the class value, and that label must be fixed. The value itself can take many different forms, for example:
class:confidential
class:{58f30e89-66db-4092-a81f-282a2eee431c}
class:<x class="confidential"></x>
class:{"id":"58f30e89-66db-4092-a81f-282a2eee431c"}
class:<x>confidential</x>
...
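The label forms above can be read with a small parser. The following is a minimal Python sketch, not GEODI's own implementation: the function names `parse_labels` and `class_token` are hypothetical, the `owner` label in the usage note is an invented example, and the sketch assumes that labels are split on the first colon only and that the class value is one of the five forms listed above.

```python
import json
import re


def parse_labels(text):
    """Split 'name:value' lines into a dict, splitting on the first colon only."""
    labels = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and lines that are not name:value pairs
        name, value = line.split(":", 1)
        labels[name.strip()] = value.strip()
    return labels


def class_token(value):
    """Extract the identifying token from a class value (assumed forms only)."""
    value = value.strip()
    # JSON object form: {"id":"..."}
    if value.startswith("{") and '"' in value:
        try:
            return json.loads(value)["id"]
        except (ValueError, KeyError, TypeError):
            pass  # not a JSON object with an "id" key; try the other forms
    # Braced GUID form: {58f30e89-...}
    match = re.fullmatch(r"\{([0-9a-fA-F-]{36})\}", value)
    if match:
        return match.group(1)
    # XML attribute form: <x class="confidential"></x>
    match = re.fullmatch(r'<(\w+)\s+class="([^"]*)"\s*>\s*</\1>', value)
    if match:
        return match.group(2)
    # XML element text form: <x>confidential</x>
    match = re.fullmatch(r"<(\w+)>([^<]*)</\1>", value)
    if match:
        return match.group(2)
    # Plain form: confidential
    return value
```

For example, `parse_labels("class:confidential\nowner:finance")` yields a dict with the keys `class` and `owner`, and `class_token` applied to the `class` value returns `confidential` for the plain form and for both XML forms.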
ADS and Custom Attribute
For file types that don't support tags, tags are stored in a platform-dependent manner.
On Windows with NTFS, Alternate Data Streams (ADS) are used; on macOS, Linux, and FreeBSD, custom (extended) file attributes are used.
You can query these tags using built‑in operating system commands:
Windows (ADS) → dir /r
macOS → xattr -l <file path>
Linux → getfattr -d <file path>
FreeBSD → lsextattr user <file path>
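If you need to run the right query from a script, the per-platform commands can be selected from `sys.platform`. This is a minimal Python sketch under stated assumptions: the helper name `tag_query_command` is hypothetical, `dir` is run through `cmd /c` because it is a `cmd.exe` built-in, and the FreeBSD listing utility is taken to be `lsextattr`. The sketch only builds the argument list; it does not run the command or parse its output.

```python
import sys


def tag_query_command(path, platform=None):
    """Return the argument list that lists tags on `path` for the given platform.

    `platform` uses sys.platform values such as 'win32', 'darwin', 'linux'
    or 'freebsd14'; when omitted, the current platform is used.
    """
    platform = sys.platform if platform is None else platform
    if platform.startswith("win"):
        # dir is a cmd.exe built-in, so it has to be run through cmd /c
        return ["cmd", "/c", "dir", "/r", path]
    if platform == "darwin":
        return ["xattr", "-l", path]
    if platform.startswith("linux"):
        return ["getfattr", "-d", path]
    if platform.startswith("freebsd"):
        # 'user' is the attribute namespace to list
        return ["lsextattr", "user", path]
    raise ValueError(f"no tag query command known for platform {platform!r}")
```

The returned list can be passed to `subprocess.run(..., capture_output=True, text=True)` to capture the listing.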
The portability of ADS and other tags is limited, and DLP systems must also support them.
Variables
You can use variables inside the tags. Variable names are case-sensitive.
Default Classes
GEODI ships with the following standard classes and tag definitions. If you do not have an existing labeling schema, you can use these rules directly or modify them, and you can use the tags below for DLP compliance.
| Class | Labels | Coverage |
|---|---|---|
| Confidential | | Contains a monetary amount greater than 50K USD (or equivalent) together with selected keywords. The selected keywords are in a dictionary. You can freely modify the dictionary, named |
| PII | | (A name or ID) and (phone, e-mail, address, or blood type). ID covers national ID numbers, medical numbers, passport numbers, and the like. |
| Restricted | | IBAN, tax numbers, SWIFT codes, and keywords such as “restricted” and “internal use only”. The keywords are in a dictionary. You can freely modify the dictionary, named |
| Unclassified | | |
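The coverage rules in the table combine detected entities with boolean logic. The following Python sketch illustrates that logic only; it is not how GEODI evaluates rules. The function name `default_class`, the entity names, and the parameters are all hypothetical. Two further assumptions are made because the table does not state them: classes are checked in table order, and the Restricted rule matches when any one of its items is present.

```python
def default_class(findings, max_money_usd=0.0,
                  has_confidential_keyword=False,
                  has_restricted_keyword=False):
    """Pick a default class from a set of detected entity names (hypothetical names)."""
    findings = set(findings)

    # Confidential: an amount above 50K USD (or equivalent) AND a dictionary keyword
    if max_money_usd > 50_000 and has_confidential_keyword:
        return "Confidential"

    # PII: (name OR id) AND (phone OR email OR address OR blood type)
    identity = {"name", "id"}
    contact = {"phone", "email", "address", "blood_type"}
    if findings & identity and findings & contact:
        return "PII"

    # Restricted: any financial identifier or restricted keyword (assumed "any of")
    financial = {"iban", "tax_number", "swift"}
    if findings & financial or has_restricted_keyword:
        return "Restricted"

    return "Unclassified"
```

A document with a name and a phone number would come out as PII under these assumptions, while a name on its own would stay Unclassified.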