Word Search library

The Glasswall Embedded Engine provides deep file inspection, remediation, sanitisation, and reporting. The engine deconstructs a file into its structural components and builds an internal tree-like representation of the file. It visits each node of the tree, inspecting, remediating, and sanitising the content items before regenerating a new file.

The Glasswall Embedded Engine can also export and import its internal representation of a file structure in an intermediate format such as XML. This makes the internal components of a file available to external programs for further processing, before the file is regenerated to incorporate those externally modified components.

The Glasswall Word Search engine is built on top of the export and import capability, and performs text searches on the content and metadata of a file. Search strings, content management, and redaction rules are configured through an XML file. A user-configurable character substitution map, defined in JSON form, provides support for obfuscated text. The engine also has built-in support for regular expressions.

Word Search configuration

The Word Search configuration specifies the text to search for, or the regular expression to apply, and how a match should be treated when it is found in the document. The Word Search configuration is an extension of Glasswall content management.

Sample policy files and schema

Sample Word Search policy files and homoglyph dictionaries can be found in the /configs/sdk_word_search folder of the release package. The Word Search XSD can be found in the /schemas/sdk_word_search folder of the release package.

Sample configuration policy

Text settings

The following sections show the different textSettings that can be specified in a configuration policy. For more information on the different settings, refer to the Word Search & Redaction page.

Allow
<textSearchConfig libVersion="core2">
<textList>
<textItem>
<regex>((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}</regex>
<textSetting>allow</textSetting>
</textItem>
<textItem>
<text>Glasswall</text>
<textSetting>allow</textSetting>
</textItem>
</textList>
</textSearchConfig>
Disallow
<textSearchConfig libVersion="core2">
<textList>
<textItem>
<regex>((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}</regex>
<textSetting>disallow</textSetting>
</textItem>
<textItem>
<text>Glasswall</text>
<textSetting>disallow</textSetting>
</textItem>
</textList>
</textSearchConfig>
Redact
<textSearchConfig libVersion="core2">
<textList>
<textItem>
<regex>((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}</regex>
<textSetting replacementChar="*">redact</textSetting>
</textItem>
<textItem>
<text>Glasswall</text>
<textSetting replacementChar="*">redact</textSetting>
</textItem>
</textList>
</textSearchConfig>
Require
<textSearchConfig libVersion="core2">
<textList>
<textItem>
<regex>((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}</regex>
<textSetting>require</textSetting>
</textItem>
</textList>
</textSearchConfig>
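
The four policies above share one shape: a textSearchConfig root, a textList, and one textItem per search term. As a minimal sketch, a policy of that shape could be generated with Python's standard library. The helper name build_policy and its tuple layout are assumptions made for illustration and are not part of the Glasswall SDK. The regex check uses Python's own dialect, which may differ from the engine's.

```python
import re
import xml.etree.ElementTree as ET

VALID_SETTINGS = ("allow", "disallow", "redact", "require")


def build_policy(items, lib_version="core2"):
    """Build a textSearchConfig string from (kind, pattern, setting, replacement_char) tuples.

    kind is "text" or "regex"; replacement_char is only used when setting is "redact".
    """
    root = ET.Element("textSearchConfig", {"libVersion": lib_version})
    text_list = ET.SubElement(root, "textList")
    for kind, pattern, setting, replacement_char in items:
        if kind not in ("text", "regex"):
            raise ValueError(f"unknown item kind: {kind!r}")
        if setting not in VALID_SETTINGS:
            raise ValueError(f"unknown textSetting: {setting!r}")
        if kind == "regex":
            # Sanity check only: Python's regex dialect may differ from the engine's.
            re.compile(pattern)
        item = ET.SubElement(text_list, "textItem")
        ET.SubElement(item, kind).text = pattern
        text_setting = ET.SubElement(item, "textSetting")
        text_setting.text = setting
        if setting == "redact" and replacement_char:
            text_setting.set("replacementChar", replacement_char)
    return ET.tostring(root, encoding="unicode")


# The IPv4 expression used in the sample policies above.
IPV4 = r"((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}"

policy_xml = build_policy([
    ("regex", IPV4, "redact", "*"),
    ("text", "Glasswall", "redact", "*"),
])
print(policy_xml)
```

Generating the policy this way keeps special characters in search terms correctly escaped, and the resulting string can still be validated against the Word Search XSD from the release package.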

System configuration

As with the core Glasswall engine, additional switches are available under the sysConfig section. These control the behaviour of the Word Search engine while it processes input files.

<sysConfig>
<!--interchange_type must always be specified with the value "xml"-->
<interchange_type>xml</interchange_type>
<!--Enables/disables processing of text files. False by default.-->
<enable_text_support>false</enable_text_support>
</sysConfig>

Known limitations

  • Office and text files cannot be processed at the same time
  • When processing text files, at least one require policy must be set
  • interchange_type must always be specified as xml under sysConfig
  • A configuration policy that combines any of the following textSettings for the same text or regex will always process the file:
    • require and redact
    • require and disallow
    • redact and allow
    • allow and disallow
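
Because the combinations listed above are easy to introduce by accident in a long policy, it can help to scan for them before handing a policy to the engine. The sketch below is a hypothetical pre-flight check written with Python's standard library. The function name find_conflicts is an assumption and nothing here is part of the Glasswall SDK.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The textSetting pairs listed under "Known limitations".
LISTED_PAIRS = [
    {"require", "redact"},
    {"require", "disallow"},
    {"redact", "allow"},
    {"allow", "disallow"},
]


def find_conflicts(policy_xml):
    """Return (kind, pattern, setting_a, setting_b) for each listed pair found on one pattern."""
    root = ET.fromstring(policy_xml)
    settings_by_key = defaultdict(set)
    for item in root.iter("textItem"):
        setting = item.findtext("textSetting")
        for kind in ("text", "regex"):
            pattern = item.findtext(kind)
            if pattern is not None and setting is not None:
                settings_by_key[(kind, pattern)].add(setting.strip())
    conflicts = []
    for (kind, pattern), settings in settings_by_key.items():
        for pair in LISTED_PAIRS:
            if pair <= settings:  # both settings of the pair are present
                first, second = sorted(pair)
                conflicts.append((kind, pattern, first, second))
    return conflicts


sample_policy = """<textSearchConfig libVersion="core2"><textList>
<textItem><text>Glasswall</text><textSetting>allow</textSetting></textItem>
<textItem><text>Glasswall</text><textSetting>disallow</textSetting></textItem>
<textItem><text>Other</text><textSetting>require</textSetting></textItem>
</textList></textSearchConfig>"""
print(find_conflicts(sample_policy))
```

The check only compares literal text and regex strings, so two different expressions that happen to match the same content are not detected.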

Sample JSON homoglyph config

The JSON file lets the user define a mapping between characters and their corresponding homoglyphs. The engine takes these homoglyphs into account when building search expressions, which provides support for homographs (look-alike words) and obfuscated text.

Default homoglyphs config
{
"!": "ǃⵑ",
"$": "$",
"%": "%",
"&": "ꝸ&",
"'": "`´ʹʻʼʽʾˈˊˋ˴ʹ΄՚՝י׳ߴߵᑊᛌ᾽᾿`´῾‘’‛′‵ꞌ'`𖽑𖽒",
"(": "❨❲〔﴾([",
")": "❩❳〕﴿)]",
"*": "٭⁎∗*𐌟",
"+": "᛭+𐊛",
",": "¸؍٫‚ꓹ,",
"-": "˗۔‐‑‒–⁃−➖Ⲻ﹘",
".": "٠۰܁܂․ꓸ꘎.𐩐𝅭",
"/": "᜵⁁⁄∕╱⟋⧸Ⳇ⼃〳ノ㇓丿/𝈺",
"0": "OoΟοσОоՕօסه٥ھہە۵߀०০੦૦ଠ୦௦ం౦ಂ೦ംഠ൦ං๐໐ဝ၀ჿዐᴏᴑℴⲞⲟⵔ〇ꓳꬽﮦﮧﮨﮩﮪﮫﮬﮭﻩﻪﻫﻬ0Oo𐊒𐊫𐐄𐐬𐓂𐓪𐔖𑓐𑢵𑣈𑣗𑣠𝐎𝐨𝑂𝑜𝑶𝒐𝒪𝓞𝓸𝔒𝔬𝕆𝕠𝕺𝖔𝖮𝗈𝗢𝗼𝘖𝘰𝙊𝙤𝙾𝚘𝚶𝛐𝛔𝛰𝜊𝜎𝜪𝝄𝝈𝝤𝝾𝞂𝞞𝞸𝞼𝟎𝟘𝟢𝟬𝟶𞸤𞹤𞺄",
"1": "Il|ƖǀΙІӀ׀וןا١۱ߊᛁℐℑℓⅠⅼ∣⏽Ⲓⵏꓲﺍﺎ1Il│𐊊𐌉𐌠𖼨𝐈𝐥𝐼𝑙𝑰𝒍𝓁𝓘𝓵𝔩𝕀𝕝𝕴𝖑𝖨𝗅𝗜𝗹𝘐𝘭𝙄𝙡𝙸𝚕𝚰𝛪𝜤𝝞𝞘𝟏𝟙𝟣𝟭𝟷𞣇𞸀𞺀",
"2": "ƧϨᒿꙄꛯꝚ2𝟐𝟚𝟤𝟮𝟸",
"3": "ƷȜЗӠⳌꝪꞫ3𑣊𖼻𝈆𝟑𝟛𝟥𝟯𝟹",
"4": "Ꮞ4𑢯𝟒𝟜𝟦𝟰𝟺",
"5": "Ƽ5𑢻𝟓𝟝𝟧𝟱𝟻",
"6": "бᏮⳒ6𑣕𝟔𝟞𝟨𝟲𝟼",
"7": "7𐓒𑣆𝈒𝟕𝟟𝟩𝟳𝟽",
"8": "Ȣȣ৪੪ଃ8𐌚𝟖𝟠𝟪𝟴𝟾𞣋",
"9": "৭੧୨൭ⳊꝮ9𑢬𑣌𑣖𝟗𝟡𝟫𝟵𝟿",
"A": "4ΑАᎪᗅᴀꓮꭺA𐊠𖽀𝐀𝐴𝑨𝒜𝓐𝔄𝔸𝕬𝖠𝗔𝘈𝘼𝙰𝚨𝛢𝜜𝝖𝞐",
"B": "ʙΒВвᏴᏼᗷᛒℬꓐꞴB𐊂𐊡𐌁𝐁𝐵𝑩𝓑𝔅𝔹𝕭𝖡𝗕𝘉𝘽𝙱𝚩𝛣𝜝𝝗𝞑",
"C": "ϹСᏟℂℭⅭⲤꓚC𐊢𐌂𐐕𐔜𑣩𑣲𝐂𝐶𝑪𝒞𝓒𝕮𝖢𝗖𝘊𝘾𝙲🝌",
"D": "ᎠᗞᗪᴅⅅⅮꓓꭰD𝐃𝐷𝑫𝒟𝓓𝔇𝔻𝕯𝖣𝗗𝘋𝘿𝙳",
"E": "ΕЕᎬᴇℰ⋿ⴹꓰꭼE𐊆𑢦𑢮𝐄𝐸𝑬𝓔𝔈𝔼𝕰𝖤𝗘𝘌𝙀𝙴𝚬𝛦𝜠𝝚𝞔",
"F": "ϜᖴℱꓝꞘF𐊇𐊥𐔥𑢢𑣂𝈓𝐅𝐹𝑭𝓕𝔉𝔽𝕱𝖥𝗙𝘍𝙁𝙵𝟊",
"G": "ɢԌԍᏀᏳᏻꓖꮐG𝐆𝐺𝑮𝒢𝓖𝔊𝔾𝕲𝖦𝗚𝘎𝙂𝙶",
"H": "ʜΗНнᎻᕼℋℌℍⲎꓧꮋH𐋏𝐇𝐻𝑯𝓗𝕳𝖧𝗛𝘏𝙃𝙷𝚮𝛨𝜢𝝜𝞖",
"I": "",
"J": "ͿЈᎫᒍᴊꓙꞲꭻJ𝐉𝐽𝑱𝒥𝓙𝔍𝕁𝕵𝖩𝗝𝘑𝙅𝙹",
"K": "ΚКᏦᛕKⲔꓗK𐔘𝐊𝐾𝑲𝒦𝓚𝔎𝕂𝕶𝖪𝗞𝘒𝙆𝙺𝚱𝛫𝜥𝝟𝞙",
"L": "ʟᏞᒪℒⅬⳐⳑꓡꮮL𐐛𐑃𐔦𑢣𑢲𖼖𝈪𝐋𝐿𝑳𝓛𝔏𝕃𝕷𝖫𝗟𝘓𝙇𝙻",
"M": "ΜϺМᎷᗰᛖℳⅯⲘꓟM𐊰𐌑𝐌𝑀𝑴𝓜𝔐𝕄𝕸𝖬𝗠𝘔𝙈𝙼𝚳𝛭𝜧𝝡𝞛",
"N": "ɴΝℕⲚꓠN𐔓𝐍𝑁𝑵𝒩𝓝𝔑𝕹𝖭𝗡𝘕𝙉𝙽𝚴𝛮𝜨𝝢𝞜",
"O": "0",
"P": "ΡРᏢᑭᴘᴩℙⲢꓑꮲP𐊕𝐏𝑃𝑷𝒫𝓟𝔓𝕻𝖯𝗣𝘗𝙋𝙿𝚸𝛲𝜬𝝦𝞠",
"Q": "ℚⵕQ𝐐𝑄𝑸𝒬𝓠𝔔𝕼𝖰𝗤𝘘𝙌𝚀",
"R": "ƦʀᎡᏒᖇᚱℛℜℝꓣꭱꮢR𐒴𖼵𝈖𝐑𝑅𝑹𝓡𝕽𝖱𝗥𝘙𝙍𝚁",
"S": "$ЅՏᏕᏚꓢS𐊖𐐠𖼺𝐒𝑆𝑺𝒮𝓢𝔖𝕊𝕾𝖲𝗦𝘚𝙎𝚂",
"T": "ŤΤτТтᎢᴛ⊤⟙ⲦꓔꭲT𐊗𐊱𐌕𑢼𖼊𝐓𝑇𝑻𝒯𝓣𝔗𝕋𝕿𝖳𝗧𝘛𝙏𝚃𝚻𝛕𝛵𝜏𝜯𝝉𝝩𝞃𝞣𝞽🝨",
"U": "Սሀᑌ∪⋃ꓴU𐓎𑢸𖽂𝐔𝑈𝑼𝒰𝓤𝔘𝕌𝖀𝖴𝗨𝘜𝙐𝚄",
"V": "Ѵ٧۷ᏙᐯⅤⴸꓦꛟV𐔝𑢠𖼈𝈍𝐕𝑉𝑽𝒱𝓥𝔙𝕍𝖁𝖵𝗩𝘝𝙑𝚅",
"W": "ԜᎳᏔꓪW𑣦𑣯𝐖𝑊𝑾𝒲𝓦𝔚𝕎𝖂𝖶𝗪𝘞𝙒𝚆",
"X": "ΧХ᙭ᚷⅩ╳ⲬⵝꓫꞳX𐊐𐊴𐌗𐌢𐔧𑣬𝐗𝑋𝑿𝒳𝓧𝔛𝕏𝖃𝖷𝗫𝘟𝙓𝚇𝚾𝛸𝜲𝝬𝞦",
"Y": "ΥϒУҮᎩᎽⲨꓬY𐊲𑢤𖽃𝐘𝑌𝒀𝒴𝓨𝔜𝕐𝖄𝖸𝗬𝘠𝙔𝚈𝚼𝛶𝜰𝝪𝞤",
"Z": "ΖᏃℤℨꓜZ𐋵𑢩𑣥𝐙𝑍𝒁𝒵𝓩𝖅𝖹𝗭𝘡𝙕𝚉𝚭𝛧𝜡𝝛𝞕",
"a": "@ɑαа⍺a𝐚𝑎𝒂𝒶𝓪𝔞𝕒𝖆𝖺𝗮𝘢𝙖𝚊𝛂𝛼𝜶𝝰𝞪",
"b": "ƄЬᏏᖯb𝐛𝑏𝒃𝒷𝓫𝔟𝕓𝖇𝖻𝗯𝘣𝙗𝚋",
"c": "ϲсᴄⅽⲥꮯc𐐽𝐜𝑐𝒄𝒸𝓬𝔠𝕔𝖈𝖼𝗰𝘤𝙘𝚌",
"d": "ԁᏧᑯⅆⅾꓒd𝐝𝑑𝒅𝒹𝓭𝔡𝕕𝖉𝖽𝗱𝘥𝙙𝚍",
"e": "еҽ℮ℯⅇꬲe𝐞𝑒𝒆𝓮𝔢𝕖𝖊𝖾𝗲𝘦𝙚𝚎",
"f": "ſϝքẝꞙꬵf𝐟𝑓𝒇𝒻𝓯𝔣𝕗𝖋𝖿𝗳𝘧𝙛𝚏𝟋",
"g": "ƍɡցᶃℊg𝐠𝑔𝒈𝓰𝔤𝕘𝖌𝗀𝗴𝘨𝙜𝚐",
"h": "һհᏂℎh𝐡𝒉𝒽𝓱𝔥𝕙𝖍𝗁𝗵𝘩𝙝𝚑",
"i": "ıɩɪ˛ͺιіӏᎥιℹⅈⅰ⍳ꙇꭵi𑣃𝐢𝑖𝒊𝒾𝓲𝔦𝕚𝖎𝗂𝗶𝘪𝙞𝚒𝚤𝛊𝜄𝜾𝝸𝞲",
"j": "ϳјⅉj𝐣𝑗𝒋𝒿𝓳𝔧𝕛𝖏𝗃𝗷𝘫𝙟𝚓",
"k": "k𝐤𝑘𝒌𝓀𝓴𝔨𝕜𝖐𝗄𝗸𝘬𝙠𝚔",
"l": "1",
"m": "m",
"n": "ոռn𝐧𝑛𝒏𝓃𝓷𝔫𝕟𝖓𝗇𝗻𝘯𝙣𝚗",
"o": "",
"p": "ρϱр⍴ⲣp𝐩𝑝𝒑𝓅𝓹𝔭𝕡𝖕𝗉𝗽𝘱𝙥𝚙𝛒𝛠𝜌𝜚𝝆𝝔𝞀𝞎𝞺𝟈",
"q": "ԛգզq𝐪𝑞𝒒𝓆𝓺𝔮𝕢𝖖𝗊𝗾𝘲𝙦𝚚",
"r": "гᴦⲅꭇꭈꮁr𝐫𝑟𝒓𝓇𝓻𝔯𝕣𝖗𝗋𝗿𝘳𝙧𝚛",
"s": "$ƽѕꜱꮪs𐑈𑣁𝐬𝑠𝒔𝓈𝓼𝔰𝕤𝖘𝗌𝘀𝘴𝙨𝚜",
"t": "t𝐭𝑡𝒕𝓉𝓽𝔱𝕥𝖙𝗍𝘁𝘵𝙩𝚝",
"u": "ʋυսᴜꞟꭎꭒu𐓶𑣘𝐮𝑢𝒖𝓊𝓾𝔲𝕦𝖚𝗎𝘂𝘶𝙪𝚞𝛖𝜐𝝊𝞄𝞾",
"v": "νѵטᴠⅴ∨⋁ꮩv𑜆𑣀𝐯𝑣𝒗𝓋𝓿𝔳𝕧𝖛𝗏𝘃𝘷𝙫𝚟𝛎𝜈𝝂𝝼𝞶",
"w": "ɯѡԝաᴡꮃw𑜊𑜎𑜏𝐰𝑤𝒘𝓌𝔀𝔴𝕨𝖜𝗐𝘄𝘸𝙬𝚠",
"x": "×хᕁᕽ᙮ⅹ⤫⤬⨯x𝐱𝑥𝒙𝓍𝔁𝔵𝕩𝖝𝗑𝘅𝘹𝙭𝚡",
"y": "ɣʏγуүყᶌỿℽꭚy𑣜𝐲𝑦𝒚𝓎𝔂𝔶𝕪𝖞𝗒𝘆𝘺𝙮𝚢𝛄𝛾𝜸𝝲𝞬",
"z": "ᴢꮓz𑣄𝐳𝑧𝒛𝓏𝔃𝔷𝕫𝖟𝗓𝘇𝘻𝙯𝚣",
"£": "₤",
"©": "Ⓒ",
"®": "Ⓡ"
}
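
To make the role of this map concrete, the sketch below shows one way a search term could be expanded into a regular expression in which each character also matches its mapped homoglyphs. This is an illustration only and is not the engine's actual algorithm, which the documentation does not describe. The function name homoglyph_pattern is an assumption.

```python
import json
import re


def homoglyph_pattern(word, homoglyphs):
    """Compile a regex that matches word with any character replaced by a mapped homoglyph."""
    parts = []
    for ch in word:
        candidates = ch + homoglyphs.get(ch, "")
        # dict.fromkeys removes duplicate characters while keeping their order.
        unique = "".join(dict.fromkeys(candidates))
        parts.append("[" + "".join(re.escape(c) for c in unique) + "]")
    return re.compile("".join(parts))


# A tiny map in the same JSON shape as the default config above.
homoglyphs = json.loads('{"a": "@\\u0430", "s": "$", "l": "1"}')
pattern = homoglyph_pattern("Glasswall", homoglyphs)

print(bool(pattern.fullmatch("Glasswall")))
print(bool(pattern.fullmatch("Gl@$swa1l")))
print(bool(pattern.fullmatch("Glosswall")))
```

Each character becomes a character class, so the pattern length grows with the word while the number of obfuscated spellings it covers grows multiplicatively.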

Sample analysis report

Below is a sample analysis report produced when the search string is set to 'Glasswall', regardless of the textSetting used. It includes an ItemMatchCount for each pattern that matches in a given file.

<gw:WordItem>
<gw:Name>Glasswall</gw:Name>
<gw:ItemMatchCount>1</gw:ItemMatchCount>
<gw:Locations>
<gw:Location>
<gw:Offset>463</gw:Offset>
<gw:Page>0</gw:Page>
<gw:Paragraph>0</gw:Paragraph>
</gw:Location>
</gw:Locations>
</gw:WordItem>
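
A report fragment like the one above can be summarised with a standard XML parser. The sketch below matches elements by local name so that it does not depend on the real namespace URI bound to the gw prefix, which is not shown here. The gw:Report wrapper element and the xmlns:gw value in the sample are placeholders, and the function names are assumptions.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def child_text(element, name):
    """Return the stripped text of the first direct child with the given local name."""
    for child in element:
        if local_name(child.tag) == name:
            return (child.text or "").strip()
    return None


def summarise_word_items(report_xml):
    """Return one dict per WordItem with its name, match count, and offsets."""
    root = ET.fromstring(report_xml)
    summary = []
    for element in root.iter():
        if local_name(element.tag) != "WordItem":
            continue
        offsets = [
            int(child_text(location, "Offset"))
            for location in element.iter()
            if local_name(location.tag) == "Location"
        ]
        summary.append({
            "name": child_text(element, "Name"),
            "match_count": int(child_text(element, "ItemMatchCount")),
            "offsets": offsets,
        })
    return summary


# Placeholder wrapper and namespace so the fragment parses on its own.
SAMPLE = """<gw:Report xmlns:gw="urn:example:placeholder">
<gw:WordItem>
<gw:Name>Glasswall</gw:Name>
<gw:ItemMatchCount>1</gw:ItemMatchCount>
<gw:Locations><gw:Location>
<gw:Offset>463</gw:Offset><gw:Page>0</gw:Page><gw:Paragraph>0</gw:Paragraph>
</gw:Location></gw:Locations>
</gw:WordItem>
</gw:Report>"""
print(summarise_word_items(SAMPLE))
```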

API functions

Status

The GwWordSearch and GwWordSearchDone APIs return a Status that indicates the result of the API call. The GwWordSearchTranslateStatus API returns a description for the Status passed in.

| Enumerator | Value | Description |
| --- | --- | --- |
| ws_disallowedItemFound | -1024 | An item disallowed by the policy was found in the file. |
| ws_requiredItemNotFound | -1025 | An item required by the policy was not found in the file. |
| ws_illegalActionRedact | -1026 | A redact action was specified but the filetype does not support redaction. |
| ws_illegalActionRequire | -1027 | A require action was specified but the filetype does not support require. |
| ws_illegalActionNoRequire | -1028 | No require action was specified but the filetype needs one. |
| ws_filetypeUnsupported | -1029 | The filetype is not supported by Word Search. |
| eFail | 0 | General or unspecified error. |
| eSuccess | 1 | The operation was successful. |
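
When logging results from a wrapper that returns the raw integer status, a small lookup can make the output easier to read. The sketch below copies the names and values from the Status table. The descriptions are paraphrased from that table and are not the strings returned by GwWordSearchTranslateStatus. The helper names describe_status and is_success are assumptions.

```python
# Names and values taken from the Status table; descriptions are paraphrased.
STATUS_TABLE = {
    -1024: ("ws_disallowedItemFound", "An item disallowed by the policy was found in the file."),
    -1025: ("ws_requiredItemNotFound", "An item required by the policy was not found in the file."),
    -1026: ("ws_illegalActionRedact", "Redact was specified but the filetype does not support it."),
    -1027: ("ws_illegalActionRequire", "Require was specified but the filetype does not support it."),
    -1028: ("ws_illegalActionNoRequire", "Require was not specified but the filetype needs it."),
    -1029: ("ws_filetypeUnsupported", "The filetype is not supported by Word Search."),
    0: ("eFail", "General or unspecified error."),
    1: ("eSuccess", "The operation was successful."),
}


def is_success(code):
    """Only eSuccess (1) is a success; zero and negative values are failures."""
    return code == 1


def describe_status(code):
    """Return a printable 'name (value): description' line for a status code."""
    name, description = STATUS_TABLE.get(code, ("unknown", "Unrecognised status code."))
    return f"{name} ({code}): {description}"


for code in (1, 0, -1025):
    print(is_success(code), describe_status(code))
```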

C++

Each of the APIs returns a Status, defined as follows:

enum Status {
ws_disallowedItemFound = -1024,
ws_requiredItemNotFound = -1025,
ws_illegalActionRedact = -1026,
ws_illegalActionRequire = -1027,
ws_illegalActionNoRequire = -1028,
ws_filetypeUnsupported = -1029,
eFail = 0,
eSuccess = 1,
};

C#

To integrate Glasswall Word Search with C#, the Glasswall Word Search C# wrapper is required. Each of the APIs returns a WordSearchStatus type, defined as follows:

/// <summary>
/// Indicates whether the Word Search process was successful (WordSearchStatus.Success)
/// or not (WordSearchStatus.Fail). Zero or negative values indicate a failure.
/// </summary>
public enum WordSearchStatus
{
DisallowedItemFound = -1024,
RequiredItemNotFound = -1025,
IllegalActionRedact = -1026,
IllegalActionRequire = -1027,
IllegalActionNoRequire = -1028,
FiletypeUnsupported = -1029,
Fail = 0,
Success
}

Java

To integrate Glasswall Word Search with Java, the Glasswall Word Search Java wrapper is required. Each of the APIs returns a GlasswallWordSearchResult type, defined as follows:

package com.glasswallsolutions;

/**
* Class used to hold the results from a Word Search process.
*/
public class GlasswallWordSearchResult
{
/**
* The XML analysis report
*/
public String report;

/**
* The processed document
*/
public byte[] outputDocument;

/**
* boolean indicating whether the process was successful (true) or not (false)
*/
public boolean success;

public GlasswallWordSearchResult()
{
report = null;
outputDocument = null;
success = false;
}
}

Python

To integrate Glasswall Word Search with Python, the Glasswall Python wrapper is required. Each of the APIs returns a generic GwReturnObj object with the attributes "status" (int), "output_file" (bytes), and "output_report" (bytes). The int statuses are defined as follows:

# glasswall\libraries\word_search\successes.py

class Success(WordSearchSuccess):
    """ WordSearch success code 1. """
    pass


success_codes = {
    1: Success,
}

# glasswall\libraries\word_search\errors.py

class UnknownErrorCode(WordSearchError):
    """ Unknown error code. """
    pass


class Fail(WordSearchError):
    """ WordSearch error code 0. """
    pass


class DisallowedItemFound(WordSearchError):
    """ WordSearch error code -1024. Item disallowed by policy found in file. """
    pass


class RequiredItemNotFound(WordSearchError):
    """ WordSearch error code -1025. Item required by policy not found in file. """
    pass


class IllegalActionRedact(WordSearchError):
    """ WordSearch error code -1026. Redact action specified but filetype doesn't support redaction. """
    pass


class IllegalActionRequire(WordSearchError):
    """ WordSearch error code -1027. Require action specified but filetype doesn't support require. """
    pass


class IllegalActionNoRequire(WordSearchError):
    """ WordSearch error code -1028. Require action not specified but filetype needs one. """
    pass


class FiletypeUnsupported(WordSearchError):
    """ WordSearch error code -1029. Filetype supported by Editor but not by Word Search. """
    pass


error_codes = {
    0: Fail,
    -1024: DisallowedItemFound,
    -1025: RequiredItemNotFound,
    -1026: IllegalActionRedact,
    -1027: IllegalActionRequire,
    -1028: IllegalActionNoRequire,
    -1029: FiletypeUnsupported,
}

JavaScript

To integrate Glasswall Word Search with JavaScript, the Glasswall Word Search JavaScript wrapper is required. Each of the APIs returns a WordSearchStatus type, defined as follows:

/**
* Used to indicate whether the Word Search process was successful or not
*/
export const enum WordSearchStatus {
ws_disallowedItemFound = -1024,
ws_requiredItemNotFound = -1025,
ws_illegalActionRedact = -1026,
ws_illegalActionRequire = -1027,
ws_illegalActionNoRequire = -1028,
ws_filetypeUnsupported = -1029,
eFail = 0,
eSuccess = 1,
}

GwWordSearch

This is used to invoke the Word Search engine, process the specified input file, and produce an output file along with a Word Search analysis report.

C++

Status GwWordSearch(
void* input_buffer,
size_t input_buffer_len,
void** output_buffer,
size_t* output_buffer_len,
void** output_report_buffer,
size_t* output_report_buffer_len,
const char* homoglyphs,
const char* xml_config_string
)

| Name | Type | Direction | Description |
| --- | --- | --- | --- |
| input_buffer | void * | In | A pointer to the buffer containing the input file to be processed |
| input_buffer_len | size_t | In | The size of the input file buffer |
| output_buffer | void ** | Out | A pointer to a pointer to the buffer that will be populated with the processed file. This buffer is allocated by the Word Search engine |
| output_buffer_len | size_t * | Out | A pointer to the size of the output file buffer. This is set by the Word Search engine |
| output_report_buffer | void ** | Out | A pointer to a pointer to the buffer that will be populated with the Word Search analysis report. This buffer is allocated by the Word Search engine |
| output_report_buffer_len | size_t * | Out | A pointer to the size of the Word Search analysis report. This is set by the Word Search engine |
| homoglyphs | const char * | In | A pointer to the buffer containing the homoglyphs file. This buffer must be null terminated |
| xml_config_string | const char * | In | A pointer to the buffer containing the content management XML file. This buffer must be null terminated |

C#

To integrate Glasswall Word Search with C#, the Glasswall Word Search C# wrapper is required.

public WordSearchStatus GwWordSearch(
byte[] inputBuffer,
out byte[] outputFileBuffer,
out String outputAnalysisReport,
string homoglyphs,
string xmlConfigString
)

| Name | Type | Direction | Description |
| --- | --- | --- | --- |
| inputBuffer | byte[] | In | The buffer containing the document to be processed |
| outputFileBuffer | out byte[] | Out | The resulting buffer that will contain the processed document |
| outputAnalysisReport | out string | Out | The output analysis report from the Word Search process |
| homoglyphs | string | In | A JSON document containing the homoglyph mappings |
| xmlConfigString | string | In | The XML content management policy |

Java

To integrate Glasswall Word Search with Java, the Glasswall Word Search Java wrapper is required.


public native GlasswallWordSearchResult wordSearch(
byte[] inputDocument,
String homoglyphs,
String xmlConfig
)

| Name | Type | Direction | Description |
| --- | --- | --- | --- |
| inputDocument | byte[] | In | The buffer containing the document to be processed |
| homoglyphs | String | In | A JSON document containing the homoglyph mappings |
| xmlConfig | String | In | The XML content management policy |

Note: Unlike some of the other supported languages, all outputs are returned in the GlasswallWordSearchResult object for Java.

Python

To integrate Glasswall Word Search with Python, the Glasswall Python wrapper is required.

# glasswall\libraries\word_search\word_search.py

def redact_file(self, input_file: Union[str, bytes, bytearray, io.BytesIO], content_management_policy: Union[str, bytes, bytearray, io.BytesIO], output_file: Union[None, str] = None, output_report: Union[None, str] = None, homoglyphs: Union[None, str, bytes, bytearray, io.BytesIO] = None, raise_unsupported: bool = True):
    """ Redacts text from input_file using the given content_management_policy and homoglyphs file, optionally writing the redacted file and analysis report to the paths specified by output_file and output_report.

    Args:
        input_file (Union[str, bytes, bytearray, io.BytesIO]): The input file path or bytes.
        content_management_policy (Union[str, bytes, bytearray, io.BytesIO]): The content management policy to apply.
        output_file (Union[None, str], optional): Default None. If str, write output_file to that path.
        output_report (Union[None, str], optional): Default None. If str, write output_report to that path.
        homoglyphs (Union[None, str, bytes, bytearray, io.BytesIO], optional): Default None. The homoglyphs json file path or bytes.
        raise_unsupported (bool, optional): Default True. Raise exceptions when Glasswall encounters an error. Fail silently if False.

    Returns:
        gw_return_object (glasswall.GwReturnObj): An instance of class glasswall.GwReturnObj containing attributes: "status" (int), "output_file" (bytes), "output_report" (bytes)
    """


def redact_directory(self, input_directory: str, content_management_policy: Union[str, bytes, bytearray, io.BytesIO, glasswall.content_management.policies.policy.Policy], output_directory: Optional[str] = None, output_report_directory: Optional[str] = None, homoglyphs: Union[None, str, bytes, bytearray, io.BytesIO] = None, raise_unsupported: bool = True):
    """ Redacts all files in a directory and its subdirectories using the given content_management_policy and homoglyphs file. The redacted files are written to output_directory maintaining the same directory structure as input_directory.

    Args:
        input_directory (str): The input directory containing files to redact.
        output_directory (str): The output directory where the redacted files will be written.
        output_report_directory (Optional[str], optional): Default None. If str, the output directory where analysis reports for each redacted file will be written.
        content_management_policy (Union[str, bytes, bytearray, io.BytesIO]): The content management policy to apply.
        homoglyphs (Union[None, str, bytes, bytearray, io.BytesIO], optional): Default None. The homoglyphs file path, str, or bytes.
        raise_unsupported (bool, optional): Default True. Raise exceptions when Glasswall encounters an error. Fail silently if False.

    Returns:
        redacted_files_dict (dict): A dictionary of file paths relative to input_directory, and glasswall.GwReturnObj with attributes: "status" (int), "output_file" (bytes), "output_report" (bytes)
    """

Note: Unlike some of the other supported languages, all outputs are returned in the GwReturnObj object for Python.

JavaScript


/**
* Perform word search on input buffer, using the applied config and homoglyphs
* @param {Buffer} inputBuffer A buffer containing the contents of the document to be processed.
* @param {String} homoglyphs A homoglyphs file that will be used as part of the Word Search process (UTF-8 string).
* @param {String} configXml The content management XML policy (UTF-8 string).
* @returns {WordSearchResult} The result from Word Search.
*/
wordSearch(inputBuffer: Buffer, homoglyphs: string, configXml: string): WordSearchResult

Note: Unlike some of the other supported languages, all outputs are returned in the WordSearchResult object for JavaScript.

GwWordSearchDone

This is used to release any resources allocated by the Word Search engine. It must be called after every call to the GwWordSearch function, otherwise memory leaks will occur.

This API call is only required in C++.

C++

Status GwWordSearchDone(
void** output_buffer,
size_t* output_buffer_len,
void** output_report_buffer,
size_t* output_report_buffer_len)

| Name | Type | Direction | Description |
| --- | --- | --- | --- |
| output_buffer | void ** | Out | A pointer to a pointer to the buffer containing the processed file, which will be freed by the Word Search library |
| output_buffer_len | size_t * | Out | A pointer to the size of the output file buffer |
| output_report_buffer | void ** | Out | A pointer to a pointer to the buffer containing the Word Search analysis report, which will be freed by the Word Search library |
| output_report_buffer_len | size_t * | Out | A pointer to the size of the Word Search analysis report |

Other languages

For all languages covered by the Glasswall wrappers, the GwWordSearchDone API function is called internally by the wrapper, so the API is not exposed to the user.

GwWordSearchVersion

This is used to retrieve the current version number of the library.

C++

const char* GwWordSearchVersion(void)

GwWordSearchTranslateStatus

Translates the given error code into a user-friendly error message.

C++

const char* GwWordSearchTranslateStatus(Status errorCode)

| Name | Type | Direction | Description |
| --- | --- | --- | --- |
| errorCode | Status | In | The return code to be translated |

Common issues

Word Search is not processing files

When running Word Search, ensure that all Embedded Engine libraries are in the same directory, and that this directory is set as the current working directory. Glasswall looks for its dependencies in the current working directory, and if they cannot be found, files will not be processed correctly. Also ensure that a valid licence key is present.
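
For callers that cannot change their working directory permanently, the requirement above can be met by switching directory only while the engine is in use. The sketch below is a generic Python context manager for that purpose. The name working_directory is an assumption and it does not load or call any Glasswall library itself.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def working_directory(path):
    """Temporarily make path the current working directory, restoring the old one on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


# Demonstration with a temporary directory standing in for the library directory.
with tempfile.TemporaryDirectory() as library_dir:
    with working_directory(library_dir):
        # Load the libraries and run Word Search here.
        print("inside:", os.getcwd())
    print("restored:", os.getcwd())
```

The previous directory is restored even if processing raises an exception, so the rest of the application is unaffected.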

Example usage

Below is a sample application that takes an input file, processes it with the Glasswall Word Search engine, and then produces an output file along with a Word Search analysis report. The sample application expects the following command line parameters (the C#, Java, and JavaScript samples take directories instead, as their usage messages show):

  1. Path to the content management configuration XML.
  2. Path to the homoglyphs file.
  3. Path to the input file to be processed.
  4. Path to the output file where the processed file will be saved.

C++

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

#include "api.h"

using namespace std;

// Read the file into a buffer
vector<uint8_t> readFile(ifstream &fileHandle, const string &filePath, bool nullTerminator)
{
fileHandle.exceptions(ifstream::failbit | ifstream::badbit);
fileHandle.open(filePath.c_str(), ios::binary | ios::ate);

vector<uint8_t> data;
streamsize size = fileHandle.tellg();
fileHandle.seekg(0, ios::beg);

// Size the buffer to the file length exactly; the optional terminator is appended below
data.resize(size);
fileHandle.read(reinterpret_cast<char *>(data.data()), size);

if (nullTerminator)
{
data.push_back(0);
}

return data;
}

int main(int argc, char **argv)
{
if (argc != 5)
{
cerr << "Usage: <Path to XML Config> <Path to Homoglyphs> <Input file> <Output file>" << endl;
return -1;
}

// Read commandline arguments
string xmlFilePath(argv[1]);
string homoglyphsFilePath(argv[2]);
string inputFilePath(argv[3]);
string outputFilePath(argv[4]);

// Create file handles for input files
ifstream xmlFileHandle;
ifstream homoglyphsFileHandle;
ifstream inputFileHandle;

// Read files into buffers
vector<uint8_t> xmlBuffer = readFile(xmlFileHandle, xmlFilePath, true); // Buffer containing the XML content management settings. This is null terminated
vector<uint8_t> homoglyphsBuffer = readFile(homoglyphsFileHandle, homoglyphsFilePath, true); // Buffer containing the homoglyphs. This is null terminated
vector<uint8_t> inputBuffer = readFile(inputFileHandle, inputFilePath, false); // Buffer containing the input file to be processed

// Create variables for output buffers
void * outputBuffer = nullptr; // Output buffer for processed file
size_t outputBufferSize = 0; // Output buffer size
void * outputReportBuffer = nullptr; // Output buffer for analysis report file
size_t outputReportBufferSize = 0; // Output analysis report buffer size

// Run Word Search and redact
Status status = GwWordSearch(inputBuffer.data(), inputBuffer.size(), &outputBuffer, &outputBufferSize, &outputReportBuffer, &outputReportBufferSize, reinterpret_cast<const char*>(homoglyphsBuffer.data()), reinterpret_cast<const char *>(xmlBuffer.data()));

if (status == Status::eSuccess)
{
// Write out the processed output file if the Word Search and redact was successful
ofstream outputFileHandle(outputFilePath, ios::binary | ios::trunc);

if (outputFileHandle.is_open())
{
outputFileHandle.write(static_cast<const char *>(outputBuffer), outputBufferSize);
}

outputFileHandle.close();
}

// Write out the analysis report file
ofstream analysisFileHandle(outputFilePath + ".xml", ios::binary | ios::trunc);

if (analysisFileHandle.is_open())
{
analysisFileHandle.write(static_cast<const char *>(outputReportBuffer), outputReportBufferSize);
}

analysisFileHandle.close();

// Call done to release any allocated resources
GwWordSearchDone(&outputBuffer, &outputBufferSize, &outputReportBuffer, &outputReportBufferSize);

return 0;
}

C#

using System;
using System.IO;

namespace glasswall.word.search.csharp.testing
{
internal class Program
{
static void Main(string[] args)
{
Console.WriteLine("Word Search test");
if (args.Length != 4)
{
Console.WriteLine("usage: <Xml Config> <Homoglyphs> <Input Directory> <OutputDirectory>");
Console.WriteLine("Parameters specified: \n{0}", string.Join("\n", args));
return;
}

string xmlConfigPath = args[0];
string homoglyphsPath = args[1];
string inputDirectory = args[2];
string outputDirectory = args[3];

if (!File.Exists(xmlConfigPath))
{
Console.Error.WriteLine("Xml config does not exist: {0}", xmlConfigPath);
return;
}

if (!File.Exists(homoglyphsPath))
{
Console.Error.WriteLine("Homoglyphs does not exist: {0}", homoglyphsPath);
return;
}

if (!Directory.Exists(inputDirectory))
{
Console.Error.WriteLine("Input directory does not exist: {0}", inputDirectory);
return;
}

Directory.CreateDirectory(outputDirectory);

using (FileStream fileStream = new FileStream(Path.Combine(outputDirectory, "ProcessLog.txt"), FileMode.OpenOrCreate, FileAccess.Write))
{
using (StreamWriter writer = new StreamWriter(fileStream))
{
writer.WriteLine("> Word Search Library version: {0}", GlasswallWordSearch.GwWordSearchVersion());

string xmlConfig = File.ReadAllText(xmlConfigPath);
string homoglyphs = File.ReadAllText(homoglyphsPath);

foreach (string path in Directory.EnumerateFiles(inputDirectory, "*", SearchOption.AllDirectories))
{
writer.WriteLine("> Processing file: {0}", path);
string inputDirectoryPath = path.Substring(inputDirectory.Length + 1);
string directory = Path.Combine(outputDirectory, inputDirectoryPath);
Directory.CreateDirectory(directory);
processFile(path, directory, homoglyphs, xmlConfig);
}
}
}

return;
}
static void WriteAllBytes(string path, byte[] data)
{
if (data == null)
{
// Dispose the returned FileStream so the empty file is not left locked
File.Create(path).Dispose();
}
else
{
File.WriteAllBytes(path, data);
}
}
public static void processFile(string inputFile, string outputDirectory, string homoglyphs, string xmlConfig)
{

using (FileStream fileStream = new FileStream(Path.Combine(outputDirectory, Path.GetFileName(inputFile) + ".log"), FileMode.OpenOrCreate, FileAccess.Write))
{
using (StreamWriter writer = new StreamWriter(fileStream))
{
// Word Search
writer.WriteLine(">> Run Word Search");
byte[] inputFileBuffer = File.ReadAllBytes(inputFile);
byte[] outputBuffer, outputReportBuffer;
GlasswallWordSearch.WordSearchStatus status = GlasswallWordSearch.GwWordSearch(inputFileBuffer, out outputBuffer, out outputReportBuffer, homoglyphs, xmlConfig);
writer.WriteLine("Status is: {0}", status);

if (outputBuffer != null)
{
WriteAllBytes(Path.Combine(outputDirectory, Path.GetFileName(inputFile)), outputBuffer);
}

if (outputReportBuffer != null)
{
WriteAllBytes(Path.Combine(outputDirectory, Path.GetFileName(inputFile)) + ".xml", outputReportBuffer);
}
}
}
}
}
}

Java

package com.glasswallsolutions;

import java.lang.System;
import java.io.*;
import com.glasswallsolutions.*;
import java.nio.file.Paths;

public class MainTest {

public static byte[] readAllBytes(InputStream inputStream) throws IOException
{
final int bufLen = 4 * 0x400; // 4KB
byte[] buf = new byte[bufLen];
int readLen;

try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
while ((readLen = inputStream.read(buf, 0, bufLen)) != -1)
outputStream.write(buf, 0, readLen);

return outputStream.toByteArray();
}
}

public static void main(String[] args) throws Exception {
if (args.length != 4)
{
System.out.println("Usage: <Input Directory> <Output Directory> <Homoglyphs File> <Config XML>");
System.exit(-1);
}

File inputDirectory = new File(args[0]);
File outputDirectory = new File(args[1]);
outputDirectory.delete();
outputDirectory.mkdir();
String homoglyphsFile = args[2];
String configXmlFile = args[3];

String homoglyphs = null;
String configXML = null;

GlasswallWordSearch glasswallWordSearch = new GlasswallWordSearch();

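try(FileInputStream homoglyphsInputStream = new FileInputStream(homoglyphsFile))
{
// Decode explicitly as UTF-8 (assumes the file is UTF-8); the homoglyphs file is mostly non-ASCII
homoglyphs = new String(readAllBytes(homoglyphsInputStream), java.nio.charset.StandardCharsets.UTF_8);
}

try(FileInputStream configXmlInputStream = new FileInputStream(configXmlFile))
{
configXML = new String(readAllBytes(configXmlInputStream), java.nio.charset.StandardCharsets.UTF_8);
}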

System.out.println("Word Search version: " + glasswallWordSearch.version());

for (File inputFile : inputDirectory.listFiles())
{
try
{
System.out.println("Processing file: " + inputFile.getAbsolutePath());

File fileOutputDirectory = new File(Paths.get(outputDirectory.getAbsolutePath(), inputFile.getName()).toString());
fileOutputDirectory.mkdir();
String fileOutputPath = Paths.get(fileOutputDirectory.getAbsolutePath(), inputFile.getName()).toString();

try(FileInputStream inputStream = new FileInputStream(inputFile))
{
byte[] fileData = readAllBytes(inputStream);

GlasswallWordSearchResult result = glasswallWordSearch.wordSearch(fileData, homoglyphs, configXML);

System.out.println("Status: " + result.success);

if (result.outputDocument != null)
{
try(FileOutputStream fileOutputStream = new FileOutputStream(fileOutputPath))
{
fileOutputStream.write(result.outputDocument);
}
}

if (result.report != null)
{
try(FileOutputStream fileOutputStream = new FileOutputStream(fileOutputPath + ".xml"))
{
fileOutputStream.write(result.report.getBytes());
}
}
}
}
catch(Exception ex)
{
System.err.println("Exception occurred: " + ex.getMessage());
ex.printStackTrace(System.err);

}
}
}
}

Python

For more examples, see Python Word Search & Redaction.

JavaScript


import fs from 'fs';
import path from 'path';
import { GlasswallWordSearch, GlasswallWordSearchNative, WordSearchResult, WordSearchStatus } from '../index'

let main = function()
{
const args = process.argv;

if (args.length === 7)
{
let wordSearchDllPath = path.resolve(args[2]);
let inputDirectory = path.resolve(args[3]);
let outputDirectory = path.resolve(args[4]);
let homoglyphsPath = path.resolve(args[5]);
let configXmlPath = path.resolve(args[6]);

let handler = new GlasswallWordSearchNative(wordSearchDllPath, { enableLogging: true});
let glasswallWordSearch = new GlasswallWordSearch(handler);
console.log("Glasswall Word Search version: " + glasswallWordSearch.version())

if (!fs.existsSync(inputDirectory))
{
console.log('Input Directory does not exist: ' + inputDirectory);
process.exit(-1);
}

if (!fs.existsSync(homoglyphsPath))
{
console.log('Homoglyphs file does not exist: ' + homoglyphsPath);
process.exit(-1);
}

if (!fs.existsSync(configXmlPath))
{
console.log('Config XML file does not exist: ' + configXmlPath);
process.exit(-1);
}

let homoglyphs = fs.readFileSync(homoglyphsPath, 'utf8');
let configXml = fs.readFileSync(configXmlPath , 'utf8');

fs.mkdirSync(outputDirectory, {recursive: true});

fs.readdirSync(inputDirectory).forEach(file => {
try
{
let fullFilePath = path.join(inputDirectory, file);

if (fs.statSync(fullFilePath).isFile())
{
console.log('Processing file: ' + fullFilePath);
let outputFileDirectory = path.join(outputDirectory, file);
fs.mkdirSync(outputFileDirectory);
let inputBuffer = fs.readFileSync(fullFilePath);
let wordSearchResult = glasswallWordSearch.wordSearch(inputBuffer, homoglyphs, configXml);
console.log("Status: " + wordSearchResult.status);

if (wordSearchResult.outputBuffer != undefined && wordSearchResult.outputBuffer != null)
{
fs.writeFileSync(path.join(outputFileDirectory, file), wordSearchResult.outputBuffer);
}

if (wordSearchResult.analysisXmlReport != undefined && wordSearchResult.analysisXmlReport != null)
{
fs.writeFileSync(path.join(outputFileDirectory, file + ".xml"), wordSearchResult.analysisXmlReport);
}
}
}
catch(error)
{
console.log("Exception occurred: " + error);
console.trace(error);
}

})
}
else
{
console.log("Usage: Application <Library File> <Input Directory> <Output Directory> <Homoglyphs File> <Config XML>");
process.exit(-1);
}
}

if (require.main === module){
main();
}