PromptTokenLimit

The maximum number of tokens to include in the prompt that Answer Server supplies to the configured LLM. Answer Server keeps to this limit by truncating the context documents that it includes in the prompt.

This limit can be important in some cases, for example when you use a model that charges according to the size of the prompt that you provide. Answer Server enforces the maximum prompt size so that the prompt and context never exceed the limit.

The token count for a prompt depends on the model that you use. To obtain an accurate token count for your model, you must provide a get_token_count function in the script that you use to call the LLM. See Configure the RAG System.
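For example, the following is a minimal sketch of a get_token_count function, assuming the script is Python and that Answer Server passes the prompt text as a string and expects an integer token count in return (check Configure the RAG System for the exact interface that your version requires). It uses the tiktoken library; substitute the tokenizer that matches your model.

    import tiktoken

    # Assumption: cl100k_base is only an example encoding. Use the
    # tokenizer that matches the model you actually call.
    _ENCODING = tiktoken.get_encoding("cl100k_base")

    def get_token_count(text):
        # Return the number of tokens that the model's tokenizer
        # produces for the given text.
        return len(_ENCODING.encode(text))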

Type: Integer
Default:  
Required: No
Configuration Section: MySystem
Example: PromptTokenLimit=500 (see the configuration sketch below)
See Also:
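For reference, a sketch of how the parameter might appear in the Answer Server configuration file, assuming a system section named [MySystem] as above; the other settings in your system section depend on your deployment.

    [MySystem]
    PromptTokenLimit=500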