This is the second part of the series on using Ispell and Hunspell dictionaries with PostgreSQL. In this post I describe the FLAG and AF parameters of Hunspell and a patch that lets PostgreSQL load dictionaries which use these parameters.
FLAG and AF parameters of Hunspell
Let’s look at these parameters using the French dictionary as an example. Here is a fragment of its .affix file:
FLAG long
AF 273
AF S.() #1
AF S*() #2
AF F.() #3
AF a0p+() #4
AF F*() #5
AF L'D'Q' #6
AF W.() #7
AF n'q'l'm't's' #8
...
SFX S. Y 2
SFX S. 0 0 .
SFX S. 0 s [^sxz]
...
SFX F. Y 72
SFX F. 0 0 .
SFX F. 0 s [eë]
SFX F. e 0 [éiï]e
SFX F. e s [éiï]e
SFX F. rice eur [dt]rice
SFX F. rice eurs [dt]rice
SFX F. de d de
SFX F. de ds de
SFX F. fe f fe
SFX F. fe fs fe
...
SFX a0 N 102
SFX a0 er er er
SFX a0 er ant [^cg]er
SFX a0 cer çant cer
SFX a0 ger geant ger
SFX a0 er e [^y]er
SFX a0 yer ye [^ou]yer
This is the .dict file fragment:
amodiatrice/3
argumentatrice/3
babillarde/3
banlieusarde/3
Here the .dict file has the following format:
basic_form/alias_number
The AF parameter defines numbered aliases for flag sets. If it is used in an .affix file, the .dict file must refer to alias numbers instead of affix class names.
The French dictionary also uses the FLAG long parameter. It allows a large number of affix flags because each flag is written as two extended ASCII characters.
With this French dictionary we should get the following results (how to load dictionaries you can see here):
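To make the interplay of AF and FLAG long concrete, here is a minimal Python sketch. It is not PostgreSQL's C implementation, only an illustration that assumes the AF table from the fragment above; the helper names parse_af_table, split_long_flags and resolve_entry are made up for this example.

```python
def parse_af_table(affix_lines):
    """Collect AF alias lines; alias numbers are 1-based in file order."""
    aliases = []
    count_seen = False
    for line in affix_lines:
        fields = line.split()
        if len(fields) < 2 or fields[0] != "AF":
            continue
        if not count_seen:
            # The first AF line only declares how many aliases follow.
            count_seen = True
            continue
        aliases.append(fields[1])
    return aliases


def split_long_flags(flag_string):
    """With FLAG long, every flag is exactly two characters wide."""
    if len(flag_string) % 2:
        raise ValueError(f"odd-length flag string: {flag_string!r}")
    return [flag_string[i:i + 2] for i in range(0, len(flag_string), 2)]


def resolve_entry(dict_line, aliases):
    """Turn 'word/alias_number' into (word, [flags])."""
    word, _, alias = dict_line.partition("/")
    if not alias:
        return word, []
    return word, split_long_flags(aliases[int(alias) - 1])


affix_fragment = [
    "FLAG long",
    "AF 273",
    "AF S.() #1",
    "AF S*() #2",
    "AF F.() #3",
    "AF a0p+() #4",
]
aliases = parse_af_table(affix_fragment)
print(resolve_entry("amodiatrice/3", aliases))  # ('amodiatrice', ['F.', '()'])
print(split_long_flags(aliases[3]))             # ['a0', 'p+', '()']
```

So the entry amodiatrice/3 really carries the two flags F. and (), which is why the SFX F. rules apply to it.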
SELECT ts_lexize('fr_hunspell', 'amodiateur');
   ts_lexize
---------------
 {amodiatrice}
(1 row)

SELECT ts_lexize('fr_hunspell', 'argumentateur');
    ts_lexize
------------------
 {argumentatrice}
(1 row)

SELECT ts_lexize('fr_hunspell', 'babillard');
  ts_lexize
--------------
 {babillarde}
(1 row)

SELECT ts_lexize('fr_hunspell', 'banlieusard');
   ts_lexize
----------------
 {banlieusarde}
(1 row)
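These expected results come from reading the SFX F. rules backwards: the ending that a rule adds is removed from the word, the stripped part is restored, and the candidate is checked against the rule's condition and the dictionary. The Python sketch below illustrates that idea for four rules from the fragment. It is a simplification, not the algorithm PostgreSQL uses internally, and the names RULES, DICTIONARY and normalize are hypothetical.

```python
import re

# (flag, strip, add, condition) taken from the SFX F. lines of the fragment.
RULES = [
    ("F.", "rice", "eur", "[dt]rice"),
    ("F.", "rice", "eurs", "[dt]rice"),
    ("F.", "de", "d", "de"),
    ("F.", "de", "ds", "de"),
]

# Dictionary entries with alias 3 already resolved to its flags.
DICTIONARY = {
    "amodiatrice": {"F.", "()"},
    "argumentatrice": {"F.", "()"},
    "babillarde": {"F.", "()"},
    "banlieusarde": {"F.", "()"},
}


def normalize(word):
    """Return the basic forms that can produce `word` via one suffix rule."""
    found = []
    for flag, strip, add, condition in RULES:
        if not word.endswith(add):
            continue
        # Undo the rule: drop the added suffix, restore the stripped part.
        base = word[: len(word) - len(add)] + strip
        if (
            re.search(condition + "$", base)
            and flag in DICTIONARY.get(base, ())
            and base not in found
        ):
            found.append(base)
    return found


print(normalize("amodiateur"))  # ['amodiatrice']
print(normalize("babillard"))   # ['babillarde']
```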
But instead we get the following error:
ERROR: Ispell dictionary supports only default flag value
CONTEXT: line 161 of configuration file "/home/artur/progs/pgsqlpro/share/tsearch_data/fr.affix": "FLAG long"
This happens because PostgreSQL does not support the FLAG parameter. PostgreSQL does not support the AF parameter either, but in that case no error is raised. For example, you can load this Hungarian dictionary and test it.
Let’s look at this Danish dictionary (or this one). This is the .affix file fragment:
FLAG num
SFX 6 Y 4
SFX 6 0 de/148,944 e
SFX 6 0 ede/944,148 [^e]
SFX 6 0 et/944,148 [^e]
SFX 6 0 t/148,944 e
...
SFX 841 Y 20
SFX 841 0 be/70 b
SFX 841 0 ce/70 c
SFX 841 0 de/70 d
SFX 841 0 fe/70 f
And this is the .dict file fragment:
abonnere/6,143,148
absolvere/6,143,148
aller/699,55
alminde/699,55
Here the .dict file has the following format:
basic_form/flag,flag,...
Here the FLAG num parameter is used. It also makes a large number of affix flags possible, because each flag is a decimal number and the flags of an entry are separated by commas.
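Reading such an entry is simpler than in the FLAG long case. The short Python sketch below shows one way to split a .dict line under FLAG num; the function name parse_num_entry is hypothetical, and the code only illustrates the file format, not PostgreSQL's parser.

```python
def parse_num_entry(line):
    """Split 'word/flag,flag,...' into (word, [int flags]) for FLAG num."""
    word, _, flags = line.partition("/")
    # Flags are decimal numbers separated by commas; an entry may have none.
    return word, [int(flag) for flag in flags.split(",") if flag]


print(parse_num_entry("abonnere/6,143,148"))  # ('abonnere', [6, 143, 148])
print(parse_num_entry("aller/699,55"))        # ('aller', [699, 55])
```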
Improvements
With some fixes PostgreSQL can support these parameters. You can download a patch from this thread (or direct link to the patch).
This patch adds support for the FLAG long, FLAG num and AF parameters.
To apply this patch you need to perform these steps:
- download the PostgreSQL 9.5 or later source (from here) and extract it.
- download the patch.
- execute the following command:
patch -p1 < ../patches/hunspell_dict.patch
- install PostgreSQL from the downloaded source. You can use this documentation.
Further improvements
You may have noticed that the Danish dictionary uses an unusual format in its .affix file:
SFX 841 0 be/70 b
Here 70 is a reference to another flag, which is defined as follows:
SFX 70 Y 2
SFX 70 0 s/944 [^sxz]
SFX 70 0 '/944 [sxz]
This means that the suffix s can be appended after the suffix be (but not the suffix ', since the ending be does not satisfy the condition [sxz]).
Without this feature some dictionaries will not work correctly, but PostgreSQL does not support it yet.
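The chaining can be sketched in Python as follows. This is only an illustration of the idea under simplifying assumptions: the table holds just the two flags shown above, flag 944 is not expanded because its definition is not part of the fragment, the stem "Xb" is a made-up placeholder rather than a real Danish word, and the names SUFFIXES, apply_suffix and two_level_forms are hypothetical.

```python
import re

# flag -> list of (strip, add, continuation_flags, condition).
# A "0" strip field in the affix file is written here as "".
SUFFIXES = {
    841: [("", "be", [70], "b")],
    70: [("", "s", [944], "[^sxz]"), ("", "'", [944], "[sxz]")],
}


def apply_suffix(word, rule):
    """Apply one suffix rule, or return None if the rule does not fit."""
    strip, add, _continuation, condition = rule
    if not re.search(condition + "$", word):
        return None
    if strip and not word.endswith(strip):
        return None
    base = word[: len(word) - len(strip)] if strip else word
    return base + add


def two_level_forms(stem, flag):
    """Forms reachable with one suffix plus one continuation suffix."""
    forms = []
    for rule in SUFFIXES[flag]:
        first = apply_suffix(stem, rule)
        if first is None:
            continue
        forms.append(first)
        for continuation_flag in rule[2]:
            for continuation_rule in SUFFIXES.get(continuation_flag, []):
                second = apply_suffix(first, continuation_rule)
                if second is not None:
                    forms.append(second)
    return forms


# "Xb" is a placeholder stem ending in b, not a dictionary entry.
print(two_level_forms("Xb", 841))  # ['Xbbe', 'Xbbes']
```

The form with the ' suffix is missing from the output because the intermediate form ends in e, which does not match [sxz].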