
Word Search Library

Glasswall Embedded Engine provides deep file inspection, remediation, sanitization and reporting. The engine deconstructs a file into its structural components and builds a tree-like internal representation of the file. It visits each node of that tree, inspecting, repairing and sanitizing content items before regenerating a new file.

Glasswall Embedded Engine can also export and import its internal representation of a file's structure in an intermediate format such as XML. This makes the internal components of a file available to external programs for additional processing, before the file is recomposed to include the externally modified components.

The Glasswall Word Search engine builds on the export and import capabilities to run text searches over the content and metadata of a file. Search strings, content management, and redaction rules are set through an XML file. A user-configurable character substitution map, defined in JSON, provides support for obfuscated text. The engine also comes with built-in regular expression support.

Word Search configuration

The Word Search configuration defines the text to search for, or the regular expression to apply, and how matches are handled when found in a document. The Word Search configuration is an extension of Glasswall content management.

Sample policy files & schema

Sample Word Search policy files and homoglyph dictionaries can be found in the /configs/sdk_word_search folder of the release package. The Word Search XSD can be found in the /schemas/sdk_word_search folder of the release package.

Example configuration policy

Text settings

The following section shows the different textSetting values that can be defined in a configuration policy. For more information on the individual settings, see the Word Search & Redaction page.

Allow
<textSearchConfig libVersion="core2">
<textList>
<textItem>
<regex>((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}</regex>
<textSetting>allow</textSetting>
</textItem>
<textItem>
<text>Glasswall</text>
<textSetting>allow</textSetting>
</textItem>
</textList>
</textSearchConfig>
Disallow
<textSearchConfig libVersion="core2">
<textList>
<textItem>
<regex>((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}</regex>
<textSetting>disallow</textSetting>
</textItem>
<textItem>
<text>Glasswall</text>
<textSetting>disallow</textSetting>
</textItem>
</textList>
</textSearchConfig>
Redact
<textSearchConfig libVersion="core2">
<textList>
<textItem>
<regex>((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}</regex>
<textSetting replacementChar="*">redact</textSetting>
</textItem>
<textItem>
<text>Glasswall</text>
<textSetting replacementChar="*">redact</textSetting>
</textItem>
</textList>
</textSearchConfig>
Require
<textSearchConfig libVersion="core2">
<textList>
<textItem>
<regex>((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}</regex>
<textSetting>require</textSetting>
</textItem>
</textList>
</textSearchConfig>
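
The regular expression used in the examples above targets dotted IPv4-style addresses. As a rough way to see what it accepts before putting it in a policy, the sketch below runs the same pattern through Python's `re` module. This is an assumption-laden check: the regex flavour used by the Word Search engine is not stated here and may differ from Python's, and the helper names `find_ipv4_candidates` and `is_whole_ipv4` are made up for illustration.

```python
import re
from typing import List

# Pattern copied verbatim from the <regex> element in the policy examples.
IPV4_PATTERN = re.compile(r"((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}")


def find_ipv4_candidates(text: str) -> List[str]:
    """Return every substring of text matched by the policy regex (Python semantics)."""
    return [match.group(0) for match in IPV4_PATTERN.finditer(text)]


def is_whole_ipv4(text: str) -> bool:
    """Return True when the entire string matches the policy regex."""
    return IPV4_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    print(find_ipv4_candidates("host 192.168.0.1 up"))
    print(is_whole_ipv4("256.1.1.1"))
```

Under Python's semantics the pattern has no leading word boundary, so a search inside `999.1.1.1` still finds `99.1.1.1`. Whether the engine behaves the same way should be confirmed against a test document.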

System configuration

As with the core Glasswall engine, additional switches can be found under the sysConfig section. These control the behaviour of the Word Search engine while it processes input files.

<sysConfig>
<!--interchange_type must always be specified with the value "xml"-->
<interchange_type>xml</interchange_type>
<!--Enables/disables processing of text files. False by default.-->
<enable_text_support>false</enable_text_support>
</sysConfig>
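
The XML fragments shown so far can also be generated programmatically rather than edited by hand. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function names and the tuple layout are hypothetical, and the sketch only produces the `textSearchConfig` and `sysConfig` fragments as shown on this page. How the fragments are combined into a complete policy file is not covered here; the sample policies in the release package are the reference for that.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Optional, Tuple

# (kind, value, setting, replacement_char); kind is "text" or "regex".
# This tuple layout is an illustrative choice, not part of the product.
Item = Tuple[str, str, str, Optional[str]]

VALID_SETTINGS = ("allow", "disallow", "redact", "require")


def build_text_search_config(items: Iterable[Item]) -> str:
    """Build a <textSearchConfig> fragment mirroring the examples on this page."""
    root = ET.Element("textSearchConfig", {"libVersion": "core2"})
    text_list = ET.SubElement(root, "textList")
    for kind, value, setting, replacement_char in items:
        if kind not in ("text", "regex"):
            raise ValueError(f"unknown item kind: {kind}")
        if setting not in VALID_SETTINGS:
            raise ValueError(f"unknown textSetting: {setting}")
        text_item = ET.SubElement(text_list, "textItem")
        ET.SubElement(text_item, kind).text = value
        setting_element = ET.SubElement(text_item, "textSetting")
        setting_element.text = setting
        # replacementChar only appears on redact settings in the examples.
        if setting == "redact" and replacement_char is not None:
            setting_element.set("replacementChar", replacement_char)
    return ET.tostring(root, encoding="unicode")


def build_sys_config(enable_text_support: bool = False) -> str:
    """Build a <sysConfig> fragment; interchange_type is always "xml"."""
    root = ET.Element("sysConfig")
    ET.SubElement(root, "interchange_type").text = "xml"
    flag = "true" if enable_text_support else "false"
    ET.SubElement(root, "enable_text_support").text = flag
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_text_search_config([("text", "Glasswall", "redact", "*")]))
    print(build_sys_config())
```

ElementTree escapes special characters in element text, so regular expressions containing `<` or `&` are serialised safely.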

Known limitations

  • Office files and text files cannot be processed at the same time
  • When processing text files, at least one require policy must be defined
  • interchange_type must always be specified as xml under sysConfig
  • A configuration policy that has a combination of the following textSettings with the same text/regex defined will always process the file:
    • require and redact
    • require and disallow
    • redact and allow
    • allow and disallow
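
The limitations above can be checked mechanically before a policy is handed to the engine. The sketch below is one possible pre-flight check, not something shipped with the product: it looks for the `interchange_type` element and flags any text or regex that carries one of the listed textSetting combinations. The function name `check_policy` and the `<config>` wrapper element used in the demo are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# The four combinations listed under "Known limitations".
CONFLICTING_PAIRS = [
    {"require", "redact"},
    {"require", "disallow"},
    {"redact", "allow"},
    {"allow", "disallow"},
]


def check_policy(policy_xml: str) -> List[str]:
    """Return human-readable warnings for a policy document (empty list if none)."""
    root = ET.fromstring(policy_xml)
    warnings: List[str] = []

    # interchange_type must always be present and set to "xml".
    values = [(el.text or "").strip() for el in root.iter("interchange_type")]
    if values != ["xml"]:
        warnings.append("sysConfig must contain interchange_type set to xml")

    # Collect every textSetting used for each distinct text or regex.
    settings: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for item in root.iter("textItem"):
        setting = item.findtext("textSetting")
        for kind in ("text", "regex"):
            pattern = item.findtext(kind)
            if pattern is not None and setting is not None:
                settings[(kind, pattern)].add(setting.strip())

    for (kind, pattern), used in settings.items():
        for pair in CONFLICTING_PAIRS:
            if pair <= used:
                first, second = sorted(pair)
                warnings.append(f"{kind} {pattern!r} combines {first} and {second}")
    return warnings


if __name__ == "__main__":
    # The <config> root element is a placeholder for this demo only.
    demo = (
        "<config><textSearchConfig libVersion='core2'><textList>"
        "<textItem><text>Glasswall</text><textSetting>allow</textSetting></textItem>"
        "<textItem><text>Glasswall</text><textSetting>disallow</textSetting></textItem>"
        "</textList></textSearchConfig></config>"
    )
    for warning in check_policy(demo):
        print(warning)
```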

Example homoglyph JSON config

The JSON file lets users create a mapping between characters and their corresponding homoglyphs. This allows the engine to take homoglyphs into account when generating search expressions, which in turn enables support for homographs (words that look alike) and obfuscated text.

Default homoglyph config
{
"!": "ǃⵑ",
"$": "$",
"%": "%",
"&": "ꝸ&",
"'": "`´ʹʻʼʽʾˈˊˋ˴ʹ΄՚՝י׳ߴߵᑊᛌ᾽᾿`´῾‘’‛′‵ꞌ'`𖽑𖽒",
"(": "❨❲〔﴾([",
")": "❩❳〕﴿)]",
"*": "٭⁎∗*𐌟",
"+": "᛭+𐊛",
",": "¸؍٫‚ꓹ,",
"-": "˗۔‐‑‒–⁃−➖Ⲻ﹘",
".": "٠۰܁܂․ꓸ꘎.𐩐𝅭",
"/": "᜵⁁⁄∕╱⟋⧸Ⳇ⼃〳ノ㇓丿/𝈺",
"0": "OoΟοσОоՕօסه٥ھہە۵߀०০੦૦ଠ୦௦ం౦ಂ೦ംഠ൦ං๐໐ဝ၀ჿዐᴏᴑℴⲞⲟⵔ〇ꓳꬽﮦﮧﮨﮩﮪﮫﮬﮭﻩﻪﻫﻬ0Oo𐊒𐊫𐐄𐐬𐓂𐓪𐔖𑓐𑢵𑣈𑣗𑣠𝐎𝐨𝑂𝑜𝑶𝒐𝒪𝓞𝓸𝔒𝔬𝕆𝕠𝕺𝖔𝖮𝗈𝗢𝗼𝘖𝘰𝙊𝙤𝙾𝚘𝚶𝛐𝛔𝛰𝜊𝜎𝜪𝝄𝝈𝝤𝝾𝞂𝞞𝞸𝞼𝟎𝟘𝟢𝟬𝟶𞸤𞹤𞺄",
"1": "Il|ƖǀΙІӀ׀וןا١۱ߊᛁℐℑℓⅠⅼ∣⏽Ⲓⵏꓲﺍﺎ1Il│𐊊𐌉𐌠𖼨𝐈𝐥𝐼𝑙𝑰𝒍𝓁𝓘𝓵𝔩𝕀𝕝𝕴𝖑𝖨𝗅𝗜𝗹𝘐𝘭𝙄𝙡𝙸𝚕𝚰𝛪𝜤𝝞𝞘𝟏𝟙𝟣𝟭𝟷𞣇𞸀𞺀",
"2": "ƧϨᒿꙄꛯꝚ2𝟐𝟚𝟤𝟮𝟸",
"3": "ƷȜЗӠⳌꝪꞫ3𑣊𖼻𝈆𝟑𝟛𝟥𝟯𝟹",
"4": "Ꮞ4𑢯𝟒𝟜𝟦𝟰𝟺",
"5": "Ƽ5𑢻𝟓𝟝𝟧𝟱𝟻",
"6": "бᏮⳒ6𑣕𝟔𝟞𝟨𝟲𝟼",
"7": "7𐓒𑣆𝈒𝟕𝟟𝟩𝟳𝟽",
"8": "Ȣȣ৪੪ଃ8𐌚𝟖𝟠𝟪𝟴𝟾𞣋",
"9": "৭੧୨൭ⳊꝮ9𑢬𑣌𑣖𝟗𝟡𝟫𝟵𝟿",
"A": "4ΑАᎪᗅᴀꓮꭺA𐊠𖽀𝐀𝐴𝑨𝒜𝓐𝔄𝔸𝕬𝖠𝗔𝘈𝘼𝙰𝚨𝛢𝜜𝝖𝞐",
"B": "ʙΒВвᏴᏼᗷᛒℬꓐꞴB𐊂𐊡𐌁𝐁𝐵𝑩𝓑𝔅𝔹𝕭𝖡𝗕𝘉𝘽𝙱𝚩𝛣𝜝𝝗𝞑",
"C": "ϹСᏟℂℭⅭⲤꓚC𐊢𐌂𐐕𐔜𑣩𑣲𝐂𝐶𝑪𝒞𝓒𝕮𝖢𝗖𝘊𝘾𝙲🝌",
"D": "ᎠᗞᗪᴅⅅⅮꓓꭰD𝐃𝐷𝑫𝒟𝓓𝔇𝔻𝕯𝖣𝗗𝘋𝘿𝙳",
"E": "ΕЕᎬᴇℰ⋿ⴹꓰꭼE𐊆𑢦𑢮𝐄𝐸𝑬𝓔𝔈𝔼𝕰𝖤𝗘𝘌𝙀𝙴𝚬𝛦𝜠𝝚𝞔",
"F": "ϜᖴℱꓝꞘF𐊇𐊥𐔥𑢢𑣂𝈓𝐅𝐹𝑭𝓕𝔉𝔽𝕱𝖥𝗙𝘍𝙁𝙵𝟊",
"G": "ɢԌԍᏀᏳᏻꓖꮐG𝐆𝐺𝑮𝒢𝓖𝔊𝔾𝕲𝖦𝗚𝘎𝙂𝙶",
"H": "ʜΗНнᎻᕼℋℌℍⲎꓧꮋH𐋏𝐇𝐻𝑯𝓗𝕳𝖧𝗛𝘏𝙃𝙷𝚮𝛨𝜢𝝜𝞖",
"I": "",
"J": "ͿЈᎫᒍᴊꓙꞲꭻJ𝐉𝐽𝑱𝒥𝓙𝔍𝕁𝕵𝖩𝗝𝘑𝙅𝙹",
"K": "ΚКᏦᛕKⲔꓗK𐔘𝐊𝐾𝑲𝒦𝓚𝔎𝕂𝕶𝖪𝗞𝘒𝙆𝙺𝚱𝛫𝜥𝝟𝞙",
"L": "ʟᏞᒪℒⅬⳐⳑꓡꮮL𐐛𐑃𐔦𑢣𑢲𖼖𝈪𝐋𝐿𝑳𝓛𝔏𝕃𝕷𝖫𝗟𝘓𝙇𝙻",
"M": "ΜϺМᎷᗰᛖℳⅯⲘꓟM𐊰𐌑𝐌𝑀𝑴𝓜𝔐𝕄𝕸𝖬𝗠𝘔𝙈𝙼𝚳𝛭𝜧𝝡𝞛",
"N": "ɴΝℕⲚꓠN𐔓𝐍𝑁𝑵𝒩𝓝𝔑𝕹𝖭𝗡𝘕𝙉𝙽𝚴𝛮𝜨𝝢𝞜",
"O": "0",
"P": "ΡРᏢᑭᴘᴩℙⲢꓑꮲP𐊕𝐏𝑃𝑷𝒫𝓟𝔓𝕻𝖯𝗣𝘗𝙋𝙿𝚸𝛲𝜬𝝦𝞠",
"Q": "ℚⵕQ𝐐𝑄𝑸𝒬𝓠𝔔𝕼𝖰𝗤𝘘𝙌𝚀",
"R": "ƦʀᎡᏒᖇᚱℛℜℝꓣꭱꮢR𐒴𖼵𝈖𝐑𝑅𝑹𝓡𝕽𝖱𝗥𝘙𝙍𝚁",
"S": "$ЅՏᏕᏚꓢS𐊖𐐠𖼺𝐒𝑆𝑺𝒮𝓢𝔖𝕊𝕾𝖲𝗦𝘚𝙎𝚂",
"T": "ŤΤτТтᎢᴛ⊤⟙ⲦꓔꭲT𐊗𐊱𐌕𑢼𖼊𝐓𝑇𝑻𝒯𝓣𝔗𝕋𝕿𝖳𝗧𝘛𝙏𝚃𝚻𝛕𝛵𝜏𝜯𝝉𝝩𝞃𝞣𝞽🝨",
"U": "Սሀᑌ∪⋃ꓴU𐓎𑢸𖽂𝐔𝑈𝑼𝒰𝓤𝔘𝕌𝖀𝖴𝗨𝘜𝙐𝚄",
"V": "Ѵ٧۷ᏙᐯⅤⴸꓦꛟV𐔝𑢠𖼈𝈍𝐕𝑉𝑽𝒱𝓥𝔙𝕍𝖁𝖵𝗩𝘝𝙑𝚅",
"W": "ԜᎳᏔꓪW𑣦𑣯𝐖𝑊𝑾𝒲𝓦𝔚𝕎𝖂𝖶𝗪𝘞𝙒𝚆",
"X": "ΧХ᙭ᚷⅩ╳ⲬⵝꓫꞳX𐊐𐊴𐌗𐌢𐔧𑣬𝐗𝑋𝑿𝒳𝓧𝔛𝕏𝖃𝖷𝗫𝘟𝙓𝚇𝚾𝛸𝜲𝝬𝞦",
"Y": "ΥϒУҮᎩᎽⲨꓬY𐊲𑢤𖽃𝐘𝑌𝒀𝒴𝓨𝔜𝕐𝖄𝖸𝗬𝘠𝙔𝚈𝚼𝛶𝜰𝝪𝞤",
"Z": "ΖᏃℤℨꓜZ𐋵𑢩𑣥𝐙𝑍𝒁𝒵𝓩𝖅𝖹𝗭𝘡𝙕𝚉𝚭𝛧𝜡𝝛𝞕",
"a": "@ɑαа⍺a𝐚𝑎𝒂𝒶𝓪𝔞𝕒𝖆𝖺𝗮𝘢𝙖𝚊𝛂𝛼𝜶𝝰𝞪",
"b": "ƄЬᏏᖯb𝐛𝑏𝒃𝒷𝓫𝔟𝕓𝖇𝖻𝗯𝘣𝙗𝚋",
"c": "ϲсᴄⅽⲥꮯc𐐽𝐜𝑐𝒄𝒸𝓬𝔠𝕔𝖈𝖼𝗰𝘤𝙘𝚌",
"d": "ԁᏧᑯⅆⅾꓒd𝐝𝑑𝒅𝒹𝓭𝔡𝕕𝖉𝖽𝗱𝘥𝙙𝚍",
"e": "еҽ℮ℯⅇꬲe𝐞𝑒𝒆𝓮𝔢𝕖𝖊𝖾𝗲𝘦𝙚𝚎",
"f": "ſϝքẝꞙꬵf𝐟𝑓𝒇𝒻𝓯𝔣𝕗𝖋𝖿𝗳𝘧𝙛𝚏𝟋",
"g": "ƍɡցᶃℊg𝐠𝑔𝒈𝓰𝔤𝕘𝖌𝗀𝗴𝘨𝙜𝚐",
"h": "һհᏂℎh𝐡𝒉𝒽𝓱𝔥𝕙𝖍𝗁𝗵𝘩𝙝𝚑",
"i": "ıɩɪ˛ͺιіӏᎥιℹⅈⅰ⍳ꙇꭵi𑣃𝐢𝑖𝒊𝒾𝓲𝔦𝕚𝖎𝗂𝗶𝘪𝙞𝚒𝚤𝛊𝜄𝜾𝝸𝞲",
"j": "ϳјⅉj𝐣𝑗𝒋𝒿𝓳𝔧𝕛𝖏𝗃𝗷𝘫𝙟𝚓",
"k": "k𝐤𝑘𝒌𝓀𝓴𝔨𝕜𝖐𝗄𝗸𝘬𝙠𝚔",
"l": "1",
"m": "m",
"n": "ոռn𝐧𝑛𝒏𝓃𝓷𝔫𝕟𝖓𝗇𝗻𝘯𝙣𝚗",
"o": "",
"p": "ρϱр⍴ⲣp𝐩𝑝𝒑𝓅𝓹𝔭𝕡𝖕𝗉𝗽𝘱𝙥𝚙𝛒𝛠𝜌𝜚𝝆𝝔𝞀𝞎𝞺𝟈",
"q": "ԛգզq𝐪𝑞𝒒𝓆𝓺𝔮𝕢𝖖𝗊𝗾𝘲𝙦𝚚",
"r": "гᴦⲅꭇꭈꮁr𝐫𝑟𝒓𝓇𝓻𝔯𝕣𝖗𝗋𝗿𝘳𝙧𝚛",
"s": "$ƽѕꜱꮪs𐑈𑣁𝐬𝑠𝒔𝓈𝓼𝔰𝕤𝖘𝗌𝘀𝘴𝙨𝚜",
"t": "t𝐭𝑡𝒕𝓉𝓽𝔱𝕥𝖙𝗍𝘁𝘵𝙩𝚝",
"u": "ʋυսᴜꞟꭎꭒu𐓶𑣘𝐮𝑢𝒖𝓊𝓾𝔲𝕦𝖚𝗎𝘂𝘶𝙪𝚞𝛖𝜐𝝊𝞄𝞾",
"v": "νѵטᴠⅴ∨⋁ꮩv𑜆𑣀𝐯𝑣𝒗𝓋𝓿𝔳𝕧𝖛𝗏𝘃𝘷𝙫𝚟𝛎𝜈𝝂𝝼𝞶",
"w": "ɯѡԝաᴡꮃw𑜊𑜎𑜏𝐰𝑤𝒘𝓌𝔀𝔴𝕨𝖜𝗐𝘄𝘸𝙬𝚠",
"x": "×хᕁᕽ᙮ⅹ⤫⤬⨯x𝐱𝑥𝒙𝓍𝔁𝔵𝕩𝖝𝗑𝘅𝘹𝙭𝚡",
"y": "ɣʏγуүყᶌỿℽꭚy𑣜𝐲𝑦𝒚𝓎𝔂𝔶𝕪𝖞𝗒𝘆𝘺𝙮𝚢𝛄𝛾𝜸𝝲𝞬",
"z": "ᴢꮓz𑣄𝐳𝑧𝒛𝓏𝔃𝔷𝕫𝖟𝗓𝘇𝘻𝙯𝚣",
"£": "₤",
"©": "Ⓒ",
"®": "Ⓡ"
}
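
To illustrate the idea behind the mapping, the sketch below expands a search term into a regular expression in which each character may be replaced by any of its homoglyphs. This is only a guess at the general technique, assuming a simple one-character-to-character-class expansion; it is not the engine's actual algorithm. The function name `homoglyph_pattern` and the two-entry excerpt of the map are illustrative.

```python
import json
import re
from typing import Dict

# A tiny excerpt in the same shape as the default config above.
# \u0251, \u03b1 and \u0430 are Latin alpha, Greek alpha and Cyrillic a.
EXCERPT = '{"a": "@\\u0251\\u03b1\\u0430", "s": "$"}'


def homoglyph_pattern(term: str, homoglyphs: Dict[str, str]) -> "re.Pattern":
    """Compile a regex matching term with any character swapped for a homoglyph."""
    parts = []
    for char in term:
        # The character itself plus its look-alikes, de-duplicated in order.
        candidates = "".join(dict.fromkeys(char + homoglyphs.get(char, "")))
        if len(candidates) == 1:
            parts.append(re.escape(candidates))
        else:
            parts.append("[" + "".join(re.escape(c) for c in candidates) + "]")
    return re.compile("".join(parts))


if __name__ == "__main__":
    mapping = json.loads(EXCERPT)
    pattern = homoglyph_pattern("Glasswall", mapping)
    print(bool(pattern.fullmatch("Gl@$$wall")))
```

With the full default map loaded via `json.load`, the same function would produce much larger character classes; the excerpt keeps the example readable.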

Example analysis report

The following is an example of the analysis report generated when the search string is set to 'Glasswall', regardless of the textSetting used. It includes the ItemMatchCount for each pattern matched in a given file.

<gw:WordItem>
<gw:Name>Glasswall</gw:Name>
<gw:ItemMatchCount>1</gw:ItemMatchCount>
<gw:Locations>
<gw:Location>
<gw:Offset>463</gw:Offset>
<gw:Page>0</gw:Page>
<gw:Paragraph>0</gw:Paragraph>
</gw:Location>
</gw:Locations>
</gw:WordItem>
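
A report fragment like the one above can be summarised with a few lines of standard-library Python. The sketch below is an illustration under assumptions: the fragment does not show where the `gw` prefix is declared, so the code matches elements by local name only, and the wrapper element and namespace URI in the demo (`urn:example:gw`) are placeholders, not the real values.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict, List


def local_name(tag: str) -> str:
    """Strip the '{namespace}' part that ElementTree prepends to tags."""
    return tag.rsplit("}", 1)[-1]


def summarise_word_items(report_xml: str) -> List[Dict[str, Any]]:
    """Collect name, match count and locations for every WordItem in a report."""
    root = ET.fromstring(report_xml)
    items = []
    for element in root.iter():
        if local_name(element.tag) != "WordItem":
            continue
        fields = {local_name(child.tag): child for child in element}
        locations = [
            {local_name(field.tag): int(field.text) for field in location}
            for location in element.iter()
            if local_name(location.tag) == "Location"
        ]
        items.append({
            "name": fields["Name"].text,
            "match_count": int(fields["ItemMatchCount"].text),
            "locations": locations,
        })
    return items


# Placeholder wrapper: the real report root and namespace URI are not shown here.
SAMPLE = (
    '<gw:Report xmlns:gw="urn:example:gw"><gw:WordItem>'
    "<gw:Name>Glasswall</gw:Name><gw:ItemMatchCount>1</gw:ItemMatchCount>"
    "<gw:Locations><gw:Location><gw:Offset>463</gw:Offset><gw:Page>0</gw:Page>"
    "<gw:Paragraph>0</gw:Paragraph></gw:Location></gw:Locations>"
    "</gw:WordItem></gw:Report>"
)

if __name__ == "__main__":
    for item in summarise_word_items(SAMPLE):
        print(item["name"], item["match_count"], item["locations"])
```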

API functions

Status

The GwWordSearch and GwWordSearchDone APIs return a Status that indicates the result of the API call. The GwWordSearchTranslateStatus API returns a description of the Status passed in.

Enumerator | Value | Description
ws_disallowedItemFound | -1024 | An item disallowed by the policy was found in the file.
ws_requiredItemNotFound | -1025 | An item required by the policy was not found in the file.
ws_illegalActionRedact | -1026 | A redact action was specified but the file type does not support redaction.
ws_illegalActionRequire | -1027 | A require action was specified but the file type does not support require.
ws_illegalActionNoRequire | -1028 | No require action was specified but the file type needs one.
ws_filetypeUnsupported | -1029 | The file type is not supported by Word Search.
eFail | 0 | General error or any other unspecified error.
eSuccess | 1 | The operation succeeded.

C++

Each API returns a Status, defined as follows:

enum Status {
ws_disallowedItemFound = -1024,
ws_requiredItemNotFound = -1025,
ws_illegalActionRedact = -1026,
ws_illegalActionRequire = -1027,
ws_illegalActionNoRequire = -1028,
ws_filetypeUnsupported = -1029,
eFail = 0,
eSuccess = 1,
};

C#

To integrate Glasswall Word Search in C#, the Glasswall Word Search C# wrapper is required. Each API returns a WordSearchStatus type, defined as follows:

/// <summary>
/// Indicates whether the Word Search process was successful (WordSearchStatus.Success)
/// or not (WordSearchStatus.Fail). Zero or negative values indicate a failure.
/// </summary>
public enum WordSearchStatus
{
DisallowedItemFound = -1024,
RequiredItemNotFound = -1025,
IllegalActionRedact = -1026,
IllegalActionRequire = -1027,
IllegalActionNoRequire = -1028,
FiletypeUnsupported = -1029,
Fail = 0,
Success
}

Java

To integrate Glasswall Word Search in Java, the Glasswall Word Search Java wrapper is required. Each API returns a GlasswallWordSearchResult type, defined as follows:

package com.glasswallsolutions;

/**
* Class used to hold the results from a Word Search process.
*/
public class GlasswallWordSearchResult
{
/**
* The XML analysis report
*/
public String report;

/**
* The processed document
*/
public byte[] outputDocument;

/**
* boolean indicating whether the process was successful (true) or not (false)
*/
public boolean success;

public GlasswallWordSearchResult()
{
report = null;
outputDocument = null;
success = false;
}
}

Python

To integrate Glasswall Word Search in Python, the Glasswall Python wrapper is required. Each API returns a generic GwReturnObj object, which contains the attributes "status" (int), "output_file" (bytes) and "output_report" (bytes). The int status is defined as follows:

# glasswall\libraries\word_search\successes.py

class Success(WordSearchSuccess):
    """ WordSearch success code 1. """
    pass


success_codes = {
    1: Success,
}

# glasswall\libraries\word_search\errors.py

class UnknownErrorCode(WordSearchError):
    """ Unknown error code. """
    pass


class Fail(WordSearchError):
    """ WordSearch error code 0. """
    pass


class DisallowedItemFound(WordSearchError):
    """ WordSearch error code -1024. Item disallowed by policy found in file. """
    pass


class RequiredItemNotFound(WordSearchError):
    """ WordSearch error code -1025. Item required by policy not found in file. """
    pass


class IllegalActionRedact(WordSearchError):
    """ WordSearch error code -1026. Redact action specified but filetype doesn't support redaction. """
    pass


class IllegalActionRequire(WordSearchError):
    """ WordSearch error code -1027. Require action specified but filetype doesn't support require. """
    pass


class IllegalActionNoRequire(WordSearchError):
    """ WordSearch error code -1028. Require action not specified but filetype needs one. """
    pass


class FiletypeUnsupported(WordSearchError):
    """ WordSearch error code -1029. Filetype supported by Editor but not by Word Search. """
    pass


error_codes = {
    0: Fail,
    -1024: DisallowedItemFound,
    -1025: RequiredItemNotFound,
    -1026: IllegalActionRedact,
    -1027: IllegalActionRequire,
    -1028: IllegalActionNoRequire,
    -1029: FiletypeUnsupported,
}

JavaScript

To integrate Glasswall Word Search in JavaScript, the Glasswall Word Search JavaScript wrapper is required. Each API returns a WordSearchStatus type, defined as follows:

/**
* Used to indicate whether the Word Search process was successful or not
*/
export const enum WordSearchStatus {
ws_disallowedItemFound = -1024,
ws_requiredItemNotFound = -1025,
ws_illegalActionRedact = -1026,
ws_illegalActionRequire = -1027,
ws_illegalActionNoRequire = -1028,
ws_filetypeUnsupported = -1029,
eFail = 0,
eSuccess = 1,
}

GwWordSearch

This calls the Word Search engine, processes the specified input file, and produces an output file together with a Word Search analysis report.

C++

Status GwWordSearch(
void* input_buffer,
size_t input_buffer_len,
void** output_buffer,
size_t* output_buffer_len,
void** output_report_buffer,
size_t* output_report_buffer_len,
const char* homoglyphs,
const char* xml_config_string
)
Name | Type | Direction | Description
input_buffer | void * | In | Pointer to a buffer containing the input file to be processed
input_buffer_len | size_t | In | Size of the input file buffer
output_buffer | void ** | Out | Pointer to a pointer to a buffer that will be filled with the processed file. This buffer is allocated by the Word Search engine
output_buffer_len | size_t * | Out | Pointer to the size of the output file buffer. This is set by the Word Search engine
output_report_buffer | void ** | Out | Pointer to a pointer to a buffer that will be filled with the Word Search analysis report. This buffer is allocated by the Word Search engine
output_report_buffer_len | size_t * | Out | Pointer to the size of the Word Search analysis report. This is set by the Word Search engine
homoglyphs | const char * | In | Pointer to a buffer containing the homoglyphs file. This buffer must be null-terminated
xml_config_string | const char * | In | Pointer to a buffer containing the content management XML file. This buffer must be null-terminated

C#

To integrate Glasswall Word Search in C#, the Glasswall Word Search C# wrapper is required.

public WordSearchStatus GwWordSearch(
byte[] inputBuffer,
out byte[] outputFileBuffer,
out String outputAnalysisReport,
string homoglyphs,
string xmlConfigString
)

Name | Type | Direction | Description
inputBuffer | byte[] | In | Buffer containing the document to be processed
outputFileBuffer | out byte[] | Out | Result buffer that will contain the processed document
outputAnalysisReport | out string | Out | Analysis report output from the Word Search process
homoglyphs | string | In | JSON document containing the homoglyph mapping
xmlConfigString | string | In | XML content management policy

Java

To integrate Glasswall Word Search in Java, the Glasswall Word Search Java wrapper is required.


public native GlasswallWordSearchResult wordSearch(
byte[] inputDocument,
String homoglyphs,
String xmlConfig
)

Name | Type | Direction | Description
inputDocument | byte[] | In | Buffer containing the document to be processed
homoglyphs | String | In | JSON document containing the homoglyph mapping
xmlConfig | String | In | XML content management policy

Note: Unlike some of the other supported languages, all outputs are returned in the GlasswallWordSearchResult object for Java.

Python

To integrate Glasswall Word Search in Python, the Glasswall Python wrapper is required.

# glasswall\libraries\word_search\word_search.py

def redact_file(self, input_file: Union[str, bytes, bytearray, io.BytesIO], content_management_policy: Union[str, bytes, bytearray, io.BytesIO], output_file: Union[None, str] = None, output_report: Union[None, str] = None, homoglyphs: Union[None, str, bytes, bytearray, io.BytesIO] = None, raise_unsupported: bool = True):
    """ Redacts text from input_file using the given content_management_policy and homoglyphs file, optionally writing the redacted file and analysis report to the paths specified by output_file and output_report.

    Args:
        input_file (Union[str, bytes, bytearray, io.BytesIO]): The input file path or bytes.
        content_management_policy (Union[str, bytes, bytearray, io.BytesIO]): The content management policy to apply.
        output_file (Union[None, str], optional): Default None. If str, write output_file to that path.
        output_report (Union[None, str], optional): Default None. If str, write output_report to that path.
        homoglyphs (Union[None, str, bytes, bytearray, io.BytesIO], optional): Default None. The homoglyphs json file path or bytes.
        raise_unsupported (bool, optional): Default True. Raise exceptions when Glasswall encounters an error. Fail silently if False.

    Returns:
        gw_return_object (glasswall.GwReturnObj): An instance of class glasswall.GwReturnObj containing attributes: "status" (int), "output_file" (bytes), "output_report" (bytes)
    """


def redact_directory(self, input_directory: str, content_management_policy: Union[str, bytes, bytearray, io.BytesIO, glasswall.content_management.policies.policy.Policy], output_directory: Optional[str] = None, output_report_directory: Optional[str] = None, homoglyphs: Union[None, str, bytes, bytearray, io.BytesIO] = None, raise_unsupported: bool = True):
    """ Redacts all files in a directory and its subdirectories using the given content_management_policy and homoglyphs file. The redacted files are written to output_directory maintaining the same directory structure as input_directory.

    Args:
        input_directory (str): The input directory containing files to redact.
        output_directory (str): The output directory where the redacted files will be written.
        output_report_directory (Optional[str], optional): Default None. If str, the output directory where analysis reports for each redacted file will be written.
        content_management_policy (Union[str, bytes, bytearray, io.BytesIO]): The content management policy to apply.
        homoglyphs (Union[None, str, bytes, bytearray, io.BytesIO], optional): Default None. The homoglyphs file path, str, or bytes.
        raise_unsupported (bool, optional): Default True. Raise exceptions when Glasswall encounters an error. Fail silently if False.

    Returns:
        redacted_files_dict (dict): A dictionary of file paths relative to input_directory, and glasswall.GwReturnObj with attributes: "status" (int), "output_file" (bytes), "output_report" (bytes)
    """

Note: Unlike some of the other supported languages, all outputs are returned in the GwReturnObj object for Python.
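
As a rough illustration of how the `redact_file` signature above might be called on in-memory data, the sketch below wraps the call in a small helper. It relies only on the parameters and return attributes documented above. Everything else is an assumption: the helper name `redact_bytes`, the choice to pass the policy and homoglyphs as bytes rather than paths, and the `FakeWordSearch` stand-in, which exists only so the example runs without the Glasswall libraries installed. How the real Word Search object is constructed is not shown in this section.

```python
from types import SimpleNamespace
from typing import Any, Dict

STATUS_SUCCESS = 1  # eSuccess in the Status table


def redact_bytes(word_search: Any, document: bytes, policy_xml: str, homoglyphs_json: str) -> Dict[str, Any]:
    """Run redact_file on in-memory data and return a plain summary dictionary."""
    result = word_search.redact_file(
        input_file=document,
        # Passed as bytes so the values are not mistaken for file paths.
        content_management_policy=policy_xml.encode("utf-8"),
        homoglyphs=homoglyphs_json.encode("utf-8"),
        # Inspect the status code instead of relying on exceptions.
        raise_unsupported=False,
    )
    return {
        "ok": result.status == STATUS_SUCCESS,
        "status": result.status,
        "document": result.output_file,
        "report": result.output_report,
    }


class FakeWordSearch:
    """Hypothetical stand-in with the documented redact_file signature (testing only)."""

    def __init__(self, status: int) -> None:
        self.status = status
        self.last_call: Dict[str, Any] = {}

    def redact_file(self, input_file, content_management_policy, output_file=None,
                    output_report=None, homoglyphs=None, raise_unsupported=True):
        self.last_call = {"policy": content_management_policy, "homoglyphs": homoglyphs}
        return SimpleNamespace(status=self.status, output_file=input_file,
                               output_report=b"<report/>")


if __name__ == "__main__":
    summary = redact_bytes(FakeWordSearch(status=1), b"example", "<config/>", "{}")
    print(summary["ok"], summary["status"])
```

In a real integration the `FakeWordSearch` instance would be replaced by the Word Search object from the Glasswall Python wrapper.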

JavaScript


/**
* Perform word search on input buffer, using the applied config and homoglyphs
* @param {Buffer} inputBuffer A buffer containing the contents of the document to be processed.
* @param {String} homoglyphs A homoglyphs file that will be used as part of the Word Search process (UTF-8 string).
* @param {String} configXml The content management XML policy (utf-8 string).
* @returns {WordSearchResult} The result from Word Search.
*/
wordSearch(inputBuffer: Buffer, homoglyphs: string, configXml: string): WordSearchResult

Note: Unlike some of the other supported languages, all outputs are returned in the WordSearchResult object for JavaScript.

GwWordSearchDone

This releases any resources allocated by the Word Search engine. It must be called after every call to GwWordSearch; otherwise memory will leak.

This API call is only needed in C++.

C++

Status GwWordSearchDone(
void** output_buffer,
size_t* output_buffer_len,
void** output_report_buffer,
size_t* output_report_buffer_len)
Name | Type | Direction | Description
output_buffer | void ** | Out | Pointer to a pointer to the buffer containing the processed file, which will be freed by the Word Search library
output_buffer_len | size_t * | Out | Pointer to the size of the output file buffer
output_report_buffer | void ** | Out | Pointer to a pointer to the buffer containing the Word Search analysis report, which will be freed by the Word Search library
output_report_buffer_len | size_t * | Out | Pointer to the size of the Word Search analysis report

Other languages

For all languages covered by the Glasswall wrappers, the GwWordSearchDone API function is called internally by the wrapper, so it is not exposed to the user.

GwWordSearchVersion

This retrieves the version number of the current library.

C++

const char* GwWordSearchVersion(void)

GwWordSearchTranslateStatus

Translates the given error code into a user-friendly error message.

C++

const char* GwWordSearchTranslateStatus(Status errorCode)
Name | Type | Direction | Description
errorCode | Status | In | The return code to translate
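
For callers that only have the integer status (for example the `status` attribute returned by the Python wrapper), a comparable lookup can be sketched in a few lines. The values and enumerator names below come from the Status table on this page; the description strings and the function name `translate_status` are illustrative and are not the text returned by GwWordSearchTranslateStatus.

```python
from enum import IntEnum


class Status(IntEnum):
    """Status values as listed in the Status table."""
    ws_disallowedItemFound = -1024
    ws_requiredItemNotFound = -1025
    ws_illegalActionRedact = -1026
    ws_illegalActionRequire = -1027
    ws_illegalActionNoRequire = -1028
    ws_filetypeUnsupported = -1029
    eFail = 0
    eSuccess = 1


# Illustrative wording only; the library's own messages may differ.
DESCRIPTIONS = {
    Status.ws_disallowedItemFound: "Item disallowed by policy found in file.",
    Status.ws_requiredItemNotFound: "Item required by policy not found in file.",
    Status.ws_illegalActionRedact: "Redact specified but file type does not support redaction.",
    Status.ws_illegalActionRequire: "Require specified but file type does not support require.",
    Status.ws_illegalActionNoRequire: "Require not specified but file type needs one.",
    Status.ws_filetypeUnsupported: "File type not supported by Word Search.",
    Status.eFail: "General or unspecified error.",
    Status.eSuccess: "Operation succeeded.",
}


def translate_status(code: int) -> str:
    """Return 'name: description' for a known status code, or a fallback message."""
    try:
        status = Status(code)
    except ValueError:
        return f"Unknown status code {code}"
    return f"{status.name}: {DESCRIPTIONS[status]}"


if __name__ == "__main__":
    print(translate_status(-1024))
    print(translate_status(1))
```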

Common issues

Word Search does not process files

When running Word Search, make sure that all Embedded Engine libraries are in the same directory, and that this directory is also set as the current working directory. Glasswall looks for its dependencies in the current working directory; if they are not found, files will not be processed correctly. Also make sure that a valid license key is available.
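
One way to guard against this issue is to verify the library directory and switch into it before calling the engine. The sketch below shows that idea with the standard library only. The helper names are hypothetical, and the list of required file names is left to the caller because the actual library file names depend on the platform and release and are not listed here.

```python
import contextlib
import os
import tempfile
from pathlib import Path
from typing import Iterable, Iterator, List


def missing_files(directory: str, required_names: Iterable[str]) -> List[str]:
    """Return the names from required_names that are not files in directory."""
    base = Path(directory)
    return [name for name in required_names if not (base / name).is_file()]


@contextlib.contextmanager
def working_directory(directory: str) -> Iterator[None]:
    """Temporarily make directory the current working directory."""
    previous = os.getcwd()
    os.chdir(directory)
    try:
        yield
    finally:
        # Always restore the original directory, even if the body raises.
        os.chdir(previous)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as library_dir:
        # "example_engine.lib" is a made-up file name used only for this demo.
        Path(library_dir, "example_engine.lib").write_bytes(b"")
        print(missing_files(library_dir, ["example_engine.lib", "other.lib"]))
        with working_directory(library_dir):
            print(os.getcwd())
```

The engine call would go inside the `with working_directory(...)` block, after `missing_files` has returned an empty list.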

Example usage

Here is a sample application that takes an input file, processes it with the Glasswall Word Search engine, and then produces an output file together with the Word Search analysis report. The C++ sample expects the following command-line parameters (the samples in the other languages work on directories and print their own usage strings):

  1. Path to the content management configuration XML.
  2. Path to the homoglyphs file.
  3. Path to the input file to be processed.
  4. Path to the output file where the processed file will be saved.

C++

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cstddef>
#include <cstdint> // uint8_t
#include <stdexcept>

#include "api.h"

using namespace std;

// Read the file into a buffer
vector<uint8_t> readFile(ifstream &fileHandle, const string &filePath, bool nullTerminator)
{
fileHandle.exceptions(ifstream::failbit | ifstream::badbit);
fileHandle.open(filePath.c_str(), ios::binary | ios::ate);

vector<uint8_t> data;
streamsize size = fileHandle.tellg();
fileHandle.seekg(0, ios::beg);

data.resize(size); // Exact file size; the terminator is appended below only when requested
fileHandle.read(reinterpret_cast<char *>(data.data()), size);

if (nullTerminator)
{
data.push_back(0);
}

return data;
}

int main(int argc, char **argv)
{
if (argc != 5)
{
cerr << "Usage: <Path to XML Config> <Path to Homoglyphs> <Input file> <Output file>" << endl;
return -1;
}

// Read commandline arguments
string xmlFilePath(argv[1]);
string homoglyphsFilePath(argv[2]);
string inputFilePath(argv[3]);
string outputFilePath(argv[4]);

// Create file handles for input files
ifstream xmlFileHandle;
ifstream homoglyphsFileHandle;
ifstream inputFileHandle;

// Read files into buffers
vector<uint8_t> xmlBuffer = readFile(xmlFileHandle, xmlFilePath, true); // Buffer containing the XML content management settings. This is null terminated
vector<uint8_t> homoglyphsBuffer = readFile(homoglyphsFileHandle, homoglyphsFilePath, true); // Buffer containing the homoglyphs. This is null terminated
vector<uint8_t> inputBuffer = readFile(inputFileHandle, inputFilePath, false); // Buffer containing the input file to be processed

// Create variables for output buffers
void * outputBuffer = nullptr; // Output buffer for processed file
size_t outputBufferSize = 0; // Output buffer size
void * outputReportBuffer = nullptr; // Output buffer for analysis report file
size_t outputReportBufferSize = 0; // Output analysis report buffer size

// Run Word Search and redact
Status status = GwWordSearch(inputBuffer.data(), inputBuffer.size(), &outputBuffer, &outputBufferSize, &outputReportBuffer, &outputReportBufferSize, reinterpret_cast<const char*>(homoglyphsBuffer.data()), reinterpret_cast<const char *>(xmlBuffer.data()));

if (status == Status::eSuccess)
{
// Write out the processed output file if the Word Search and redact was successful
ofstream outputFileHandle(outputFilePath, ios::binary | ios::trunc);

if (outputFileHandle.is_open())
{
outputFileHandle.write(static_cast<const char *>(outputBuffer), outputBufferSize);
}

outputFileHandle.close();
}

// Write out the analysis report file
ofstream analysisFileHandle(outputFilePath + ".xml", ios::binary | ios::trunc);

if (analysisFileHandle.is_open())
{
analysisFileHandle.write(static_cast<const char *>(outputReportBuffer), outputReportBufferSize);
}

analysisFileHandle.close();

// Call done to release any allocated resources
GwWordSearchDone(&outputBuffer, &outputBufferSize, &outputReportBuffer, &outputReportBufferSize);

return 0;
}

C#

using System;
using System.IO;

namespace glasswall.word.search.csharp.testing
{
internal class Program
{
static void Main(string[] args)
{
Console.WriteLine("Word Search test");
if (args.Length != 4)
{
Console.WriteLine("usage: <Xml Config> <Homoglyphs> <Input Directory> <OutputDirectory>");
Console.WriteLine("Parameters specified: \n{0}", string.Join("\n", args));
return;
}

string xmlConfigPath = args[0];
string homoglyphsPath = args[1];
string inputDirectory = args[2];
string outputDirectory = args[3];

if (!File.Exists(xmlConfigPath))
{
Console.Error.WriteLine("Xml config does not exist: {0}", xmlConfigPath);
return;
}

if (!File.Exists(homoglyphsPath))
{
Console.Error.WriteLine("Homoglyphs does not exist: {0}", homoglyphsPath);
return;
}

if (!Directory.Exists(inputDirectory))
{
Console.Error.WriteLine("Input directory does not exist: {0}", inputDirectory);
return;
}

Directory.CreateDirectory(outputDirectory);

using (FileStream fileStream = new FileStream(Path.Combine(outputDirectory, "ProcessLog.txt"), FileMode.OpenOrCreate, FileAccess.Write))
{
using (StreamWriter writer = new StreamWriter(fileStream))
{
writer.WriteLine("> Word Search Library version: {0}", GlasswallWordSearch.GwWordSearchVersion());

string xmlConfig = File.ReadAllText(xmlConfigPath);
string homoglyphs = File.ReadAllText(homoglyphsPath);

foreach (string path in Directory.EnumerateFiles(inputDirectory, "*", SearchOption.AllDirectories))
{
writer.WriteLine("> Processing file: {0}", path);
string inputDirectoryPath = path.Substring(inputDirectory.Length + 1);
string directory = Path.Combine(outputDirectory, inputDirectoryPath);
Directory.CreateDirectory(directory);
processFile(path, directory, homoglyphs, xmlConfig);
}
}
}

return;
}
static void WriteAllBytes(string path, byte[] data)
{
if (data == null)
{
File.Create(path).Dispose(); // Close the handle returned by File.Create
}
else
{
File.WriteAllBytes(path, data);
}
}
public static void processFile(string inputFile, string outputDirectory, string homoglyphs, string xmlConfig)
{

using (FileStream fileStream = new FileStream(Path.Combine(outputDirectory, Path.GetFileName(inputFile) + ".log"), FileMode.OpenOrCreate, FileAccess.Write))
{
using (StreamWriter writer = new StreamWriter(fileStream))
{
// Word Search
writer.WriteLine(">> Run Word Search");
byte[] inputFileBuffer = File.ReadAllBytes(inputFile);
byte[] outputBuffer, outputReportBuffer;
GlasswallWordSearch.WordSearchStatus status = GlasswallWordSearch.GwWordSearch(inputFileBuffer, out outputBuffer, out outputReportBuffer, homoglyphs, xmlConfig);
writer.WriteLine("Status is: {0}", status);

if (outputBuffer != null)
{
WriteAllBytes(Path.Combine(outputDirectory, Path.GetFileName(inputFile)), outputBuffer);
}

if (outputReportBuffer != null)
{
WriteAllBytes(Path.Combine(outputDirectory, Path.GetFileName(inputFile)) + ".xml", outputReportBuffer);
}
}
}
}
}
}

Java

package com.glasswallsolutions;

import java.lang.System;
import java.io.*;
import com.glasswallsolutions.*;
import java.nio.file.Paths;

public class MainTest {

public static byte[] readAllBytes(InputStream inputStream) throws IOException
{
final int bufLen = 4 * 0x400; // 4KB
byte[] buf = new byte[bufLen];
int readLen;

try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
while ((readLen = inputStream.read(buf, 0, bufLen)) != -1)
outputStream.write(buf, 0, readLen);

return outputStream.toByteArray();
}
}

public static void main(String[] args) throws Exception {
if (args.length != 4)
{
System.out.println("Usage: <Input Directory> <Output Directory> <Homoglyphs File> <Config XML>");
System.exit(-1);
}

File inputDirectory = new File(args[0]);
File outputDirectory = new File(args[1]);
outputDirectory.delete();
outputDirectory.mkdir();
String homoglyphsFile = args[2];
String configXmlFile = args[3];

String homoglyphs = null;
String configXML = null;

GlasswallWordSearch glasswallWordSearch = new GlasswallWordSearch();

try(FileInputStream homoglyphsInputStream = new FileInputStream(homoglyphsFile))
{
homoglyphs = new String(readAllBytes(homoglyphsInputStream));
}

try(FileInputStream configXmlInputStream = new FileInputStream(configXmlFile))
{
configXML = new String(readAllBytes(configXmlInputStream));
}

System.out.println("Word Search version: " + glasswallWordSearch.version());

for (File inputFile : inputDirectory.listFiles())
{
try
{
System.out.println("Processing file: " + inputFile.getAbsolutePath());

File fileOutputDirectory = new File(Paths.get(outputDirectory.getAbsolutePath(), inputFile.getName()).toString());
fileOutputDirectory.mkdir();
String fileOutputPath = Paths.get(fileOutputDirectory.getAbsolutePath(), inputFile.getName()).toString();

try(FileInputStream inputStream = new FileInputStream(inputFile))
{
byte[] fileData = readAllBytes(inputStream);

GlasswallWordSearchResult result = glasswallWordSearch.wordSearch(fileData, homoglyphs, configXML);

System.out.println("Status: " + result.success);

if (result.outputDocument != null)
{
try(FileOutputStream fileOutputStream = new FileOutputStream(fileOutputPath))
{
fileOutputStream.write(result.outputDocument);
}
}

if (result.report != null)
{
try(FileOutputStream fileOutputStream = new FileOutputStream(fileOutputPath + ".xml"))
{
fileOutputStream.write(result.report.getBytes());
}
}
}
}
catch(Exception ex)
{
System.err.println("Exception occurred: " + ex.getMessage());
ex.printStackTrace(System.err);

}
}
}
}

Python

For further examples, see Python Word Search & Redaction.

JavaScript


import fs from 'fs';
import path from 'path';
import { GlasswallWordSearch, GlasswallWordSearchNative, WordSearchResult, WordSearchStatus } from '../index'

let main = function()
{
const args = process.argv;

if (args.length === 7)
{
let wordSearchDllPath = path.resolve(args[2]);
let inputDirectory = path.resolve(args[3]);
let outputDirectory = path.resolve(args[4]);
let homoglyphsPath = path.resolve(args[5]);
let configXmlPath = path.resolve(args[6]);

let handler = new GlasswallWordSearchNative(wordSearchDllPath, { enableLogging: true});
let glasswallWordSearch = new GlasswallWordSearch(handler);
console.log("Glasswall Word Search version: " + glasswallWordSearch.version())

if (!fs.existsSync(inputDirectory))
{
console.log('Input Directory does not exist: ' + inputDirectory);
process.exit(-1);
}

if (!fs.existsSync(homoglyphsPath))
{
console.log('Homoglyphs file does not exist: ' + homoglyphsPath);
process.exit(-1);
}

if (!fs.existsSync(configXmlPath))
{
console.log('Config XML file does not exist: ' + configXmlPath);
process.exit(-1);
}

let homoglyphs = fs.readFileSync(homoglyphsPath, 'utf8');
let configXml = fs.readFileSync(configXmlPath , 'utf8');

fs.mkdirSync(outputDirectory, {recursive: true});

fs.readdirSync(inputDirectory).forEach(file => {
try
{
let fullFilePath = path.join(inputDirectory, file);

if (fs.statSync(fullFilePath).isFile())
{
console.log('Processing file: ' + fullFilePath);
let outputFileDirectory = path.join(outputDirectory, file);
fs.mkdirSync(outputFileDirectory);
let inputBuffer = fs.readFileSync(fullFilePath);
let wordSearchResult = glasswallWordSearch.wordSearch(inputBuffer, homoglyphs, configXml);
console.log("Status: " + wordSearchResult.status);

if (wordSearchResult.outputBuffer != undefined && wordSearchResult.outputBuffer != null)
{
fs.writeFileSync(path.join(outputFileDirectory, file), wordSearchResult.outputBuffer);
}

if (wordSearchResult.analysisXmlReport != undefined && wordSearchResult.analysisXmlReport != null)
{
fs.writeFileSync(path.join(outputFileDirectory, file + ".xml"), wordSearchResult.analysisXmlReport);
}
}
}
catch(error)
{
console.log("Exception occurred: " + error);
console.trace(error);
}

})
}
else
{
console.log("Usage: Application <Library File> <Input Directory> <Output Directory> <Homoglyphs File> <Config XML>");
process.exit(-1);
}
}

if (require.main === module){
main();
}