GEODI can also index databases, i.e., structured data. All tables and rows of a database can be indexed, together with the files they reference or embed. With simple rules, you can limit the tables, rows, or fields that are indexed and set how rows are displayed to the user.
Use Project Wizard/Database connection to define the connection.
Conditions for connection
A user with at least read-only access to the database
Access to the database port, the database name, and the connection string
If a separate driver is required for the connection, it must be installed (the list and requirements are in the table below)
Use “New Connection” to choose the database and define the connection string. (This dialog only opens if you are logged into the server.)
Some databases may require an extra driver to be installed. Please check the list below.
DBMS | Client |
---|---|
PostgreSQL | No extra installation is required. |
Microsoft SQL Server | No extra installation is required. |
SQLite | No extra installation is required. |
Shape File | No extra installation is required. |
CSV File | No extra installation is required. |
KML File | No extra installation is required. |
SQL CE, SQL Express | No extra installation is required. |
Kafka | No extra installation is required. |
Oracle (OleDB) | Oracle ODAC driver must be installed (12.1.0.2.1 and up). https://www.oracle.com/database/technologies/odac-downloads.html |
Excel | Access Database Engine 2010 must be installed. https://www.microsoft.com/en-us/download/details.aspx?id=13255 |
MS Access | Access Database Engine 2010 must be installed. https://www.microsoft.com/en-us/download/details.aspx?id=13255 |
DB2 | OLEDB driver must be installed. |
Oracle BigData | Microsoft Hive ODBC Driver must be installed. https://www.microsoft.com/en-us/download/details.aspx?id=40886 |
Cassandra | OLEDB or ODBC driver is required. |
MongoDB | OLEDB or ODBC driver is required. |
Alternative Connection Methods
Besides the Project Wizard, the following options are available:
File-based databases such as SQLite, MDB, and AccDB are indexed automatically when they are found in a directory and are not password-protected. You may define a DBMeta for them.
You must use Project Wizard for Excel files to be indexed as structured content.
Settings made with Project Wizard/Database can be saved in *.xDeceConnection format. These files are automatically processed. It is a portable and secure way of defining connections.
Indexing DBs
The default behavior is to crawl all tables and rows.
By default, tables must have a primary key. You may also choose to index tables without a primary key.
You may choose a sampling mode, in which only the selected number of rows is indexed in each table.
You may choose a subset of tables.
You may limit columns by their names.
You may define SQL statements to change the content to be indexed.
Embedded files are indexed when the proper definitions are made.
File paths are processed when the proper definitions are made.
If rows carry permission information, you may use it with the proper definitions.
When you search in GEODI, each row is a separate content item. You may define the name and appearance of a record per database or per table.
DBMeta is the way to change all of this behavior per database and/or per table. This page explains how DBMetas are defined.
DBMetas are jsettings files under “Settings\Reader\DBMeta”.
Column filters
Limit indexing by table or column.
Key | Description |
---|---|
WorkspaceName | WorkSpace(s) to which the settings will apply. |
TableFilter | Tables to which the settings will apply. |
ColumnFilter | Only tables that contain the given columns are considered (with a leading -, only tables that do not contain the column are considered). If more than one column is given, the conditions are combined with AND. |
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBView, Geodi.Database", "WorkspaceName":"0000-Promotional and Educational Videos", "TableFilter":"*", "ColumnFilter":"*", "Columns":"-FILEPATH,-ID,DATE,GEODIFILELINK" } ] }
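When many workspaces need similar filters, a DBMeta file of this shape could also be produced by a script. The Python sketch below is only an illustration: the helpers `build_dbview` and `write_dbmeta` and the file name `ColumnFilters.jsettings` are hypothetical, and the keys are copied from the example above without adding any meaning to them.

```python
import json
from pathlib import Path


def build_dbview(workspace, columns, table_filter="*", column_filter="*"):
    """Return one DBView entry using the keys shown in the example above."""
    return {
        "__type": "Geodi.Database.Meta.DBView, Geodi.Database",
        "WorkspaceName": workspace,
        "TableFilter": table_filter,
        "ColumnFilter": column_filter,
        # Comma-separated column list, written exactly as in the example above.
        "Columns": ",".join(columns),
    }


def write_dbmeta(path, defines):
    """Write a {"Defines": [...]} document to `path` and return the JSON text."""
    text = json.dumps({"Defines": defines}, indent=2, ensure_ascii=False)
    Path(path).write_text(text, encoding="utf-8")
    return text


if __name__ == "__main__":
    entry = build_dbview(
        "0000-Promotional and Educational Videos",
        ["-FILEPATH", "-ID", "DATE", "GEODIFILELINK"],
    )
    # Hypothetical file name; the file would be placed under Settings\Reader\DBMeta.
    print(write_dbmeta("ColumnFilters.jsettings", [entry]))
```

Generating the file from code keeps the JSON well-formed (no stray commas) when the same definition is repeated for several workspaces.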
How rows appear in GEODI
Each database record appears as a separate item in GEODI. The default title is the value of the first text column. You can change this for each table individually using DisplayNameMacro, and you can use other columns in the title macro. Macro rules are given at the end of the page.
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBRowDisplayName, Geodi.Database", "DisplayNameMacro":"[TEXT4]/[TEXT2]-[TEXT3]" } ] }
Indexing files embedded in tables
GEODI can scan files that are referenced by file paths in the database or embedded as BLOBs. This section explains how to make the necessary settings.
In the example below, the FILE1 column holds the file names and the FILE2 column holds the BLOB content. In the sample database, the BLOB column was created with the "bytea" data type.
This works with Access, PostgreSQL, MSSQL, Oracle, SQLite, and MySQL databases.
For FileMemoColumn, the value in the given database column must carry a file extension. If the column value has no extension and all files are of the same type, the extension can be appended in the macro value.
Key | Description |
---|---|
FileMemoColumn | BLOB/MEMO field containing the file content. GEODI determines the file type automatically. |
IDColumnMacro | Macro for the unique number of the file. The rules for macros are at the end of the document. |
FileNameColumnMacro | Macro for the name of the file as it appears in searches and viewers. The rules for macros are at the end of the document. |
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBSubContent, Geodi.Database", "TableFilter":"TEST", "IDColumnMacro":"[OBJECTID]", "FileNameColumnMacro":"[FILE1]", "FileMemoColumn":"file2" } ] }
Indexing Multiple Embedded Files in a Table
You can use the following meta to index files embedded in multiple columns in the same table.
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBSubContent, Geodi.Database", "TableFilter":"TQA", "IDColumnMacro":"[OBJECTID]_1", "FileNameColumnMacro":"[file1]", "FileMemoColumn":"file2" }, { "__type":"Geodi.Database.Meta.DBSubContent, Geodi.Database", "TableFilter":"TQA", "IDColumnMacro":"[OBJECTID]_2", "FileNameColumnMacro":"[dosya1]", "FileMemoColumn":"dosya2" } ] }
Indexing an Embedded File Without a Filename in a Table
When the table has no file name column, you can build a file name with the FileNameColumnMacro value. Two examples follow. The first is a simple one; the second assumes that some rows have no extension value. FileMemoColumn is the column holding the file stream.
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBSubContent, Geodi.Database", "TableFilter":"CMS.FileContentCore", "IDColumnMacro":"[FieldId][VersionNumber][MinorVersionNumber]", "FileNameColumnMacro":"[Internalid][Extension]", "FileMemoColumn":"FileContent" } ] }
In the following example, the Extension column is assumed to hold the file extension, but some rows may have empty values. The macro treats a file as PDF when no extension is given. Alternatively, the macro could return an empty string to skip such files.
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBSubContent, Geodi.Database", "TableFilter":"Documents", "IDColumnMacro":"[DocumentsID]", "FileNameColumnMacro":'=string.Concat(d["Internalid"],"-",d["CreateDate"],string.IsNullOrEmpty(d["Extension"])?".pdf":d["Extension"])', "FileMemoColumn":"File" } ] }
Indexing Files Given by File Path in a Table
You can also index files specified by file links in a record. (CSV files are not supported).
Key | Description |
---|---|
FileFullPathColumnMacro | A macro that calculates the paths where the files are located. The macro can be the value of a single field (e.g. PATH), or a path calculated from a combination of fields of the record. (Example: a column named PATH in the database.) |
FileFullPathColumnSplitter | If the path value contains more than one file, give the separator character here. |
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBSubContent, Geodi.Database", "FileFullPathColumnMacro":"[PATH]", "FileFullPathColumnSplitter":"|" } ] }
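To make the role of the splitter concrete, the Python sketch below shows how a single PATH value holding several files would break apart on "|". It illustrates the idea only and is not GEODI's code: the function `split_paths` and the sample paths are made up, and trimming whitespace and dropping empty parts are assumptions of this sketch.

```python
def split_paths(value, splitter="|"):
    """Split a column value into individual file paths, dropping empty parts."""
    return [part.strip() for part in value.split(splitter) if part.strip()]


if __name__ == "__main__":
    # Made-up PATH value containing two files separated by "|".
    row = {"PATH": r"C:\docs\a.pdf|C:\docs\b.docx"}
    for path in split_paths(row["PATH"]):
        print(path)
```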
Specifying a Primary Key Column
A primary key column is required for each table so that changes can be scanned and versioned. By default, GEODI uses the ObjectID value. You can specify a different column with the definition described here.
The "KeyColumns" value will be the unique ID value.
The primary key column name must be written in the meta exactly as it appears in the database.
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBPKey, Geodi.Database", "WorkspaceName":"BLOBDataset Deneme", "KeyColumns":"BelgeBelgeID" } ] }
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBPKey, Geodi.Database", "TableFilter":"tablo1", "KeyColumns":"id" } ] }
Indexing the result of a SQL statement
The SQL key defines a new table whose name is given by NewName. The SQL statement must be compatible with the relevant DBMS.
Alternatively, you can specify the KeyColumns value inside the SQL by aliasing a column with "as P_KEY" or "as OBJECTID".
This feature is valid for file-based sources such as *.xDeceConnection and MDB.
Standard SQL must be used.
Make sure that no column name appears more than once in the result of the SQL statement.
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBSQL, Geodi.Database", "TableFilter":"ADA,IRTIFAK_HAKKI", "ColumnFilter":"ADA.ADA_NO,ADA.OBJECTID,IRTIFAK_HAKKI.OBJECTID,IRTIFAK_HAKKI.TABAKA", "NewName":"ADALAR2", "SQL":"SELECT * FROM ADA,IRTIFAK_HAKKI WHERE IRTIFAK_HAKKI.OBJECTID=ADA.OBJECTID", "KeyColumns":"ADA.OBJECTID" } ] }
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBSQL, Geodi.Database", "TableFilter":"TEST,TEST1", "ColumnFilter":"TEST.TARIH,TEST.OBJECTID,TEST1.OBJECTID,TEST.TAMS1", "NewName":"DENEME12", "SQL":"SELECT TEST.* FROM TEST,TEST1 WHERE TEST.OBJECTID=TEST1.TAMS2", "KeyColumns":"OBJECTID" } ] }
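The rule about repeated column names can be checked before a statement is placed in a DBMeta. The following Python sketch does this against an in-memory SQLite database; it is an assumption-laden illustration, not part of GEODI. The tables PARCEL and EASEMENT and the helper names are hypothetical, and a real check would run against your own DBMS with its own driver.

```python
import sqlite3
from collections import Counter


def duplicate_columns(conn, sql):
    """Return column names that occur more than once in the result of `sql`."""
    cursor = conn.execute(sql)
    names = [col[0].upper() for col in cursor.description]
    return sorted(name for name, count in Counter(names).items() if count > 1)


def demo_connection():
    """Create an in-memory SQLite database with two small stand-in tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE PARCEL (ID INTEGER PRIMARY KEY, PARCEL_NO TEXT)")
    conn.execute("CREATE TABLE EASEMENT (ID INTEGER PRIMARY KEY, LAYER TEXT)")
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    # SELECT * over two tables repeats the shared ID column.
    print(duplicate_columns(conn, "SELECT * FROM PARCEL, EASEMENT WHERE EASEMENT.ID = PARCEL.ID"))
    # Listing the columns explicitly avoids the repetition.
    explicit = (
        "SELECT PARCEL.ID, PARCEL.PARCEL_NO, EASEMENT.LAYER "
        "FROM PARCEL, EASEMENT WHERE EASEMENT.ID = PARCEL.ID"
    )
    print(duplicate_columns(conn, explicit))
```

Selecting columns explicitly, or selecting only one table's columns as in the second example above (`SELECT TEST.*`), is one way to keep the result free of repeated names.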
Row Based Authorization
You can authorize on a per-row basis for a table, an SQL result, or a view.
You can use users and/or GEODI groups in "PermitMacro" and "DenyMacro" for authorization.
You can also use [geodi:username] for the users you create.
To specify more than one user or group, start the value with = and use an advanced macro. A simple macro can only define a single user or group.
Usernames or groups must be generated from a column in a table (or SQL, or view).
Rows are authorized in the table, and these authorizations also apply to their files (child content).
The generated group name is case-sensitive.
Example 1 : SQL Query Result Authorization
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBRowDisplayName, Geodi.Database", "TableFilter":"", "DisplayNameMacro":"[TEXT4]/[TEXT2]-[TAMS1]" }, { "__type":"Geodi.Database.Meta.DBSQL, Geodi.Database", "TableFilter":"TEST,TEST2", "ColumnFilter":"TEST.TARIH,TEST.OBJECTID,TEST2.OBJECTID,TEST.TAMS1", "NewName":"DENEME12", "SQL":"SELECT TEST.* FROM TEST,TEST2 WHERE TEST.OBJECTID=TEST2.TAMS2", "KeyColumns":"OBJECTID" }, { "__type":"Geodi.Database.Meta.DBRowPermission, Geodi.Database", "TableFilter":"DENEME12", "PermitMacro":"DECE\\username", "DenyMacro":"[geodi:username]" } ] }
Example 2 : Created Group Based Authorization
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBRowPermission,Geodi.Database", "TableFilter":"test", "ColumnFilter":"birimler", "PermitMacro":"[birimler]", "DenyMacro":"" } ] }
Example 3 : Advanced Macro Examples
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBRowPermission,Geodi.Database", "TableFilter":"test", "ColumnFilter":"birimler", "PermitMacro":'=d.Get<string>("birimler").Split(\',\')', "DenyMacro":"" } ] }
Example 4 :
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBRowPermission,Geodi.Database", "TableFilter":"test", "ColumnFilter":"birimler", "PermitMacro":'=new string[] {d.Get<string>("YETKILI_GRUP"),"S-1-5-21-128668610-1027347169-903626496-1222","geodi:guest"}', "DenyMacro":"" } ] }
Example 5 :
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBRowPermission,Geodi.Database", "TableFilter":"test", "ColumnFilter":"birimler", "PermitMacro":'=new string[] {string.Concat("Grubum_",d["KOLON1"]),string.Concat("Grubum_",d["KOLON2"])}', "DenyMacro":"" } ] }
Example 6 :
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBRowPermission,Geodi.Database", "TableFilter":"test", "ColumnFilter":"BIRIMLER,BIRIMLER_TEST", "PermitMacro":'=new string[] {d.Get("BIRIMLER"),d.Get("BIRIMLER_TEST")}', "DenyMacro":"" } ] }
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBRowPermission,Geodi.Database", "TableFilter":"test", "ColumnFilter":"BIRIMLER,BIRIMLER_TEST", "PermitMacro":'=new string[] {d["BIRIMLER"],d["BIRIMLER_TEST"]}', "DenyMacro":"" } ] }
Specifying the Text/Text Result of a Record
This feature changes the text that GEODI indexes for a record. When a term given in Content is searched in GEODI, all matching records in the table are found. In Content, you can reference column values as [Column Name], and you can also add literal words that do not come from any column so that they become searchable. More than one [Column Name] can be used in Content. The Ignore key turns the Content setting in the DBMeta on or off.
If "Ignore":"False", the settings written in Content are applied.
If "Ignore":"True", the settings written in Content are not applied.
This feature is valid for file-based sources such as *.xDeceConnection and MDB.
{ "Defines":[ { "__type":"Geodi.Database.Meta.DBContent, Geodi.Database", "WorkspaceName":"otf_meta_testV1", "TableFilter":"TEST", "ColumnFilter":"TARIH", "Ignore":"False", "Content":"[TAMS2] ZAMAN [TEXT3]" } ] }
Adjusting the Way Records Look
The HTML template you provide with the "TemplateName" key determines the appearance of the records that meet the criteria. The template, with the ".html" extension, must be in the Templates folder under the "DBMeta" folder, and the meta itself must be saved under the "DBMeta" folder.
Using TemplateName gives you visual flexibility, but it may reduce performance.
{ "Defines":[ { "__type":"Geodi.Database.Meta.DB_DLV_View, Geodi.Database", "TemplateName":"PortalAnkaraGeziveMesire.html" } ] }
FieldIndex Settings (Limiting Searches by Column Name)
By default, a search covers all tables and all columns. To limit search results to a column, enable FieldIndex. Once it is enabled, a query written as "columnname:<your search phrase>" is limited to the related column.
Set ContentReaderEnumerators → Your Database → EnableAutoFieldIndex to true in the project detail settings and rescan your project. Rescanning can take time with large data sets, so it is better to plan this from the beginning.
This setting cannot be used when the database is scanned as a file (file-based connection).
When this feature is active, words that are recognized in the database and fall in the KLV are written next to the column in which they are found.
Sampling Data Discovery in Databases
By default, GEODI discovers all database contents. You can optionally sample and explore your database contents, so you can save on scanning time and storage space.
In the project detail settings, add the following under ContentReaderEnumerators → Your Database → GenericSettings:
"DB.SamplingMode":100
With this setting, 100 randomly selected records are processed per table, selected table, or SQL query.
For file-based databases, add the same setting to the GenericSettings of the Folder enumerator:
"DB.SamplingMode":100
For all file types such as *.xlsx, *.mdb, and *.accdb, 100 randomly selected records per table are processed.
Make column names multilingual and define aliases
To do this, make the following definitions.
"fields" must be included in the resx file names. These files must be located in the globalization directory.
The name attribute must be att_{fieldname}, where {fieldname} matches the column name in the table. Use the same key in every language.
The value element contains the alternative column names (aliases). Separate multiple alternatives with "|".
If any alias you use is the same as another column name, it is ignored.
In multilingual representations, the first alias is taken into account.
myfields.resx
<data name="att_{fieldname}" xml:space="preserve"> <value>{alias1}|{alias2}|{alias3}</value> </data> <data name="att_EnvanterNo" xml:space="preserve"> <value>Envanter Numarası|Envanter N.|Envanter Sırası</value> </data> <data name="att_ADI" xml:space="preserve"> <value>Ad|Adı Soyadı|AdSoyad</value> </data>
myfields.en-us.resx
<data name="att_{fieldname}" xml:space="preserve"> <value>{alias1}|{alias2}|{alias3}</value> </data> <data name="att_EnvanterNo" xml:space="preserve"> <value>Inventory Number|Inventory N|Inventory Position</value> </data> <data name="att_PersonName" xml:space="preserve"> <value>Name|Pname|Person Name</value> </data>
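When many columns need aliases, the `<data>` entries shown above could be generated rather than typed by hand. The Python sketch below builds a single entry with the standard library; the helper name `resx_data` is hypothetical, and the sketch produces only the `<data>` element, not the rest of a resx file.

```python
import xml.etree.ElementTree as ET

# Qualified name that ElementTree serializes as the xml:space attribute.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def resx_data(field_name, aliases):
    """Build one <data> element: name is att_<field>, value is the aliases joined by "|"."""
    data = ET.Element("data", {"name": f"att_{field_name}", XML_SPACE: "preserve"})
    ET.SubElement(data, "value").text = "|".join(aliases)
    return data


if __name__ == "__main__":
    element = resx_data(
        "EnvanterNo",
        ["Inventory Number", "Inventory N", "Inventory Position"],
    )
    print(ET.tostring(element, encoding="unicode"))
```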
Macros
You can write macros for values such as the document ID or the file path. Macros help when a column value alone is not sufficient. The related sections state which values accept macros.
A macro is a text in which column names are given between "[]". You can use the same column more than once.
In macros, some characters must be preceded by the escape character ("\"). For example, "\" must be written as "\\".
"FileFullPathColumnMacro":"C:\\TEST\\KUR-1166 VT\\files\\[FILE1]"
"DisplayNameMacro":"[ADANO] Ada [PARSELNO] Parsel"
"DisplayNameMacro":"[TITLE]-[POSITION]"
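The effect of a simple macro can be pictured as placeholder substitution. The Python sketch below mimics that idea for a single row; it is not GEODI's implementation. The function `expand_macro` is hypothetical, and exact-case column matching and raising an error for an unknown column are assumptions of this sketch.

```python
import re

# Matches one [COLUMN] placeholder; nested brackets are not supported here.
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def expand_macro(macro, row):
    """Replace every [COLUMN] in `macro` with the matching value from `row`."""
    def lookup(match):
        # A column missing from `row` raises KeyError in this sketch.
        return str(row[match.group(1)])
    return PLACEHOLDER.sub(lookup, macro)


if __name__ == "__main__":
    row = {"ADANO": 120, "PARSELNO": 7}
    print(expand_macro("[ADANO] Ada [PARSELNO] Parsel", row))  # prints "120 Ada 7 Parsel"
```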
If the value of a macro-type setting starts with =, it is evaluated as a C# expression, which allows far more flexible definitions.
Examples
"FileFullPathColumnMacro":'=Path.Combine(@"C:\\TEST\\KUR-1166 VT\\files\\",d["FILE1"])'
"DisplayNameMacro":'=string.Concat(d["ADANO"]," Ada ",d["PARSELNO"]," Parsel")'
"DisplayNameMacro":'=string.Concat(d["TITLE"],"-",d["POSITION"])'
"DisplayNameMacro":'=d.Get<int>("DEGER")>-1?"Pozitif":"Negatif"'