Search 101

SARIT supports a useful subset of the Lucene Query Parser Syntax, specifically wildcard searches (including wildcards as the first character of a search), boolean operators, and grouping. SARIT also allows you to search its texts independently of the script (currently SARIT has documents in both Devanāgarī and IAST).

Some examples will best explain this:

  • Simple term searches:
    • hetuḥ will match the individual words hetuḥ and हेतुः. (Note: "word" in this context also covers compounds. The search will not match svabhāvahetuḥ, because there hetuḥ is only part of a longer word.)
    • "tathā nāsti" will match the phrase tathā nāsti, but not (for example) yadi nāsti sāmānyaṃ tathā or nāsti tathā. The quotes tell the search engine to treat the enclosed words as a single term, i.e. as one phrase in exactly this order.
  • Wildcard searches: use a * to match any number of letters, or a ? to match one letter.
    • *hetuḥ will match any word ending in either hetuḥ or हेतुः, e.g., svabhāvahetuḥ. (Note: The * wildcard matches zero or more characters.)
    • *bhāva* will match any word containing bhāva or भाव, e.g., svabhāvahetuḥ or अन्याभावः.
    • ?araṇam will match the individual words maraṇam and karaṇam, but not udāharaṇam. (Note: The ? wildcard matches one character.)
  • Boolean operators: these allow you to combine search expressions, with either AND, OR, or NOT:
    • maraṇam OR karaṇam: passages where either of these two words occur.
    • maraṇam AND karaṇam: passages where both of these words occur.
    • *maraṇam OR *karaṇam: passages containing a word that ends in either of the two strings.
    • *maraṇam AND *karaṇam: passages containing both a word ending in maraṇam (which includes smaraṇam, etc.) and a word ending in karaṇam.
    • You can also combine operators. *maraṇam AND *karaṇam NOT smaraṇam: passages containing words with both endings, excluding passages in which smaraṇam occurs.
    • There are also two other boolean operators: - excludes results that contain a term, and + includes only results that contain the term.
    • For SARIT, the AND operator is the default: if you search for tathā nāsti, the results should be identical to tathā AND nāsti or nāsti AND tathā.
  • Grouping allows you to efficiently combine more complex queries:
    • (*hetu* OR *hetā* OR *heto*) AND (*liṅga* OR *liṅge*): search for instances where hetu and liṅga occur together, with some flexibility as to their form.
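
The matching rules in the examples above can be sketched in a few lines of Python. This is only an illustration of how the wildcards and boolean operators behave, not SARIT's or Lucene's implementation. The function names and the sample passage are made up for this sketch, and it does not cover script-independent matching (Devanāgarī versus IAST).

```python
import re
import unicodedata


def wildcard_to_regex(term):
    """Turn a wildcard term into a regular expression for whole words."""
    term = unicodedata.normalize("NFC", term)
    pieces = []
    for ch in term:
        if ch == "*":
            pieces.append(".*")  # zero or more characters
        elif ch == "?":
            pieces.append(".")   # exactly one character
        else:
            pieces.append(re.escape(ch))
    return re.compile("".join(pieces))


def word_matches(term, word):
    """True if the whole word (not just a part of it) matches the term."""
    word = unicodedata.normalize("NFC", word)
    return wildcard_to_regex(term).fullmatch(word) is not None


def passage_has(term, passage):
    """True if any whitespace-separated word in the passage matches."""
    return any(word_matches(term, w) for w in passage.split())


# Hypothetical passage, invented for this illustration.
passage = "svabhāvahetuḥ smaraṇam karaṇam"

# A plain term only matches whole words; *hetuḥ also finds compounds.
print(word_matches("hetuḥ", "svabhāvahetuḥ"))    # False
print(word_matches("*hetuḥ", "svabhāvahetuḥ"))   # True
print(word_matches("?araṇam", "udāharaṇam"))     # False

# *maraṇam AND *karaṇam NOT smaraṇam
print(
    passage_has("*maraṇam", passage)
    and passage_has("*karaṇam", passage)
    and not passage_has("smaraṇam", passage)
)  # False: the passage contains smaraṇam, so NOT excludes it

# (*hetu* OR *hetā*) AND (*liṅga* OR *liṅge*): grouping maps onto parentheses
print(
    (passage_has("*hetu*", passage) or passage_has("*hetā*", passage))
    and (passage_has("*liṅga*", passage) or passage_has("*liṅge*", passage))
)  # False: no form of liṅga occurs in the passage
```

Both the term and the word are normalized to Unicode NFC first, so that a letter such as ṇ counts as one character for the ? wildcard regardless of how it was typed.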

These are only brief hints on the basic usage. For fuller explanations, please consult the help for the SARIT interface and the Lucene search.

The search interface

If you hover over the fields in SARIT's search panel, you will see some short hints on how to use them.

An example search might look like this:

  1. Search For What?
    • Enter your search term here. SARIT supports Devanāgarī and IAST searches. If JavaScript is enabled in your browser, you can also enter your search term in the Kyoto-Harvard convention (also known as Harvard-Kyoto).
  2. Where to search?
    • You can choose to search either the text of a TEI document or its header (teiHeader).
  3. Limit search to certain authors?
    • You can choose Search In Texts By Any Author to search throughout the corpus, or you can select one or more authors to restrict the search to the texts attributed to them.
    • Note that you can also select individual texts to search by clicking on the check-boxes to the left of the titles.
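
For readers unfamiliar with the Kyoto-Harvard convention mentioned in step 1, the following Python sketch shows how such input corresponds to IAST. It illustrates the convention itself and is not the code SARIT runs in the browser. The names HK_TO_IAST and hk_to_iast are hypothetical.

```python
import re
import unicodedata

# Harvard-Kyoto spellings that differ from IAST; all other letters are identical.
HK_TO_IAST = {
    "lRR": "ḹ", "lR": "ḷ", "RR": "ṝ", "R": "ṛ",
    "A": "ā", "I": "ī", "U": "ū",
    "M": "ṃ", "H": "ḥ",
    "G": "ṅ", "J": "ñ", "N": "ṇ",
    "T": "ṭ", "D": "ḍ",
    "z": "ś", "S": "ṣ",
}

# Try longer keys first, so that "lRR" wins over "lR" and "R".
_HK_PATTERN = re.compile("|".join(sorted(HK_TO_IAST, key=len, reverse=True)))


def hk_to_iast(text):
    """Convert a Harvard-Kyoto string to IAST (hypothetical helper).

    Assumption: "lR" is always read as vocalic ḷ, never as l followed by ṛ.
    """
    converted = _HK_PATTERN.sub(lambda m: HK_TO_IAST[m.group(0)], text)
    return unicodedata.normalize("NFC", converted)


print(hk_to_iast("tathA nAsti"))    # tathā nāsti
print(hk_to_iast("svabhAvahetuH"))  # svabhāvahetuḥ
print(hk_to_iast("maraNam"))        # maraṇam
```

Aspirated retroflexes such as Th and Dh need no entries of their own, because T and D are converted and the following h is left unchanged.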