Posted August 5, 2005 by Guest

Dear Dr Vasudevan,

In the Indology forum you asked recently:

> Sir,
> I would like to know the details of the software for
> arranging sanskrit words
> alphabetically for preparing index of Sanskrit verses. Kindly
> inform me the full details.
> Expecting your early response.

Here is a short note on the method I use to sort Sanskrit words in my dictionary (at http://sanskrit.inria.fr).

I use TeX/devnag only as an external I/O format; XML/HTML/Unicode is another one. All internal computations are carried out over specific data structures. For instance, the type word is represented as a list of letters (Sanskrit phonemes, represented as integers):

    # code "yogananda";
    - : Word.word = [42; 12; 19; 1; 36; 1; 36; 34; 1]
    # decode [42; 12; 19; 1; 36; 1; 36; 34; 1];
    - : string = "yogananda"

The basic data structures of the Sanskrit software are defined in the ZEN OCaml toolkit, a GPL free software effort available at http://sanskrit.inria.fr/huet/ZEN/index.html, where you will find fully documented sources.

Sorting of the dictionary entries is done on the word data structure. The dictionary order of entries, checked within each level (words, subwords, subsubwords), is obtained by comparing normalised words, where normalisation eliminates non-original anusvaara and visarga. Thus:

    # code_string "yogananda" = code_string "yogana.mda";
    - : bool = True
    # code_string "du.hsaha";
    - : list int = [34; 5; 48; 48; 1; 49; 1]
    # decode [34; 5; 48; 48; 1; 49; 1];
    - : string = "dussaha"

Velthuis notation is used only at parsing/unparsing time, like Unicode strings for devanagari or diacritics representations for the HTML pages; all computations are done on the more abstract word structure.

I hope this answers your questions.

Gerard Huet
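
The sorting scheme described in the post (words as lists of integer phoneme codes, compared after normalisation) can be illustrated with a minimal sketch in standard OCaml syntax. This is not the ZEN toolkit's code. The names `normalize`, `lexicographic`, `compare_words` and `sort_words` are hypothetical. The codes for anusvara (14) and visarga (16) are assumptions, since the post does not give them; the other codes are read off the post's examples. The normalisation rule is deliberately partial and covers only the two cases shown in the post (anusvara before `d`, visarga before `s`). The sketch also assumes that the integer codes follow alphabetical order, so that comparing integers yields dictionary order.

```ocaml
(* A word is a list of phoneme codes, as in the post. *)
type word = int list

(* Codes read off the examples in the post. *)
let a = 1 and u = 5 and o = 12 and g = 19
let d = 34 and n = 36 and y = 42 and s = 48 and h = 49

(* ASSUMPTION: the post does not give these two codes. *)
let anusvara = 14
let visarga = 16

(* Partial normalisation (hypothetical): handles only the two cases
   from the post. Anusvara before d becomes n; visarga before s
   becomes s. Every other phoneme is left unchanged. *)
let rec normalize (w : word) : word =
  match w with
  | c :: (next :: _ as rest) when c = anusvara && next = d ->
      n :: normalize rest
  | c :: (next :: _ as rest) when c = visarga && next = s ->
      s :: normalize rest
  | c :: rest -> c :: normalize rest
  | [] -> []

(* Lexicographic order on code lists; a proper prefix sorts first. *)
let rec lexicographic (w1 : word) (w2 : word) : int =
  match w1, w2 with
  | [], [] -> 0
  | [], _ :: _ -> -1
  | _ :: _, [] -> 1
  | c1 :: r1, c2 :: r2 ->
      let c = compare (c1 : int) c2 in
      if c <> 0 then c else lexicographic r1 r2

(* Dictionary order: compare the normalised forms. *)
let compare_words (w1 : word) (w2 : word) : int =
  lexicographic (normalize w1) (normalize w2)

(* Sort entries by normalised form; the entries themselves are
   returned unchanged. *)
let sort_words (entries : word list) : word list =
  List.sort compare_words entries

(* Sample words, written with the codes above. *)
let yoga = [y; o; g; a]
let yogananda = [y; o; g; a; n; a; n; d; a]
let yoganamda = [y; o; g; a; n; a; anusvara; d; a]   (* "yogana.mda" *)
let duhsaha = [d; u; visarga; s; a; h; a]            (* "du.hsaha" *)

(* Print the sorted entries as space-separated codes. *)
let () =
  sort_words [yogananda; yoga; duhsaha]
  |> List.iter (fun w ->
         print_endline (String.concat " " (List.map string_of_int w)))
```

With these definitions, `yoganamda` and `yogananda` compare as equal, which mirrors the `code_string` equality shown in the post. A real implementation would need the full set of sandhi-aware normalisation rules, which this sketch does not attempt.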