SimpleChatTC: Update readme.md wrt latest updates. 2k maxtokens
This commit is contained in:
parent 1789f5f1e2
commit c2112618c0
@@ -239,10 +239,10 @@ It is attached to the document object. Some of these can also be updated using t
 be set if needed using the settings ui.
 
 iRecentUserMsgCnt - a simple minded SlidingWindow to limit context window load at Ai Model end.
-This is disabled by default. However if enabled, then in addition to latest system message, only
-the last/latest iRecentUserMsgCnt user messages after the latest system prompt and its responses
-from the ai model will be sent to the ai-model, when querying for a new response. IE if enabled,
-only user messages after the latest system message/prompt will be considered.
+This is set to 5 by default. So in addition to the latest system message, only the last/latest
+iRecentUserMsgCnt user messages after the latest system prompt, along with their responses from
+the ai model, will be sent to the ai-model when querying for a new response. Note that only user
+messages after the latest system message/prompt are considered.
 
 This specified sliding window user message count also includes the latest user query.
 <0 : Send entire chat history to server
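To make the sliding window behaviour concrete, here is a rough sketch of the idea in JavaScript. This is a hypothetical helper written for illustration, not the actual simplechat implementation; the chat entries are assumed to be simple {role, content} records.

```javascript
// Hypothetical sketch of the iRecentUserMsgCnt sliding window idea.
// Keeps the latest system message plus the last N user messages after it
// (and the assistant responses that follow them); N < 0 sends everything.
function sliding_window_trim(messages, iRecentUserMsgCnt) {
    if (iRecentUserMsgCnt < 0) {
        return messages; // send entire chat history to server
    }
    // find the latest system message
    let iLastSys = -1;
    for (let i = messages.length - 1; i >= 0; i--) {
        if (messages[i].role === "system") {
            iLastSys = i;
            break;
        }
    }
    const trimmed = iLastSys >= 0 ? [messages[iLastSys]] : [];
    // indices of user messages after the latest system message
    const userIdxs = [];
    for (let i = iLastSys + 1; i < messages.length; i++) {
        if (messages[i].role === "user") {
            userIdxs.push(i);
        }
    }
    // keep only the last iRecentUserMsgCnt user messages; this count
    // includes the latest user query
    const keep = Math.min(iRecentUserMsgCnt, userIdxs.length);
    const keepFrom = keep > 0 ? userIdxs[userIdxs.length - keep] : messages.length;
    return trimmed.concat(messages.slice(keepFrom));
}
```

With the default of 5, only the latest system message plus the last 5 user messages (including the current query) and the responses following them would go to the server.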
@@ -282,9 +282,11 @@ full chat history. This way if there is any response with garbage/repetition, i
 mess with things beyond the next question/request/query, in some ways. The trim garbage
 option also tries to help avoid issues with garbage in the context to an extent.
 
-Set max_tokens to 1024, so that a relatively large previous reponse doesnt eat up the space
-available wrt next query-response. However dont forget that the server when started should
-also be started with a model context size of 1k or more, to be on safe side.
+Set max_tokens to 2048, so that a relatively large previous response doesn't eat up the space
+available wrt the next query-response, while in parallel allowing a good enough context size
+for some amount of the chat history in the current session to influence future answers. However
+don't forget that the server, when started, should also be given a model context size of 2k or
+more, to be on the safe side.
 
 The /completions endpoint of tools/server doesn't take max_tokens; instead it takes the
 internal n_predict. For now add the same here on the client side, maybe later add max_tokens
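As a sketch of what the resulting request could look like, assuming a llama.cpp tools/server style chat endpoint; the endpoint path and payload handling here are illustrative assumptions, not the client's actual code.

```javascript
// Hypothetical sketch: posting a query with the max_tokens / n_predict
// values discussed above. Both fields are sent so that either endpoint
// flavour picks up the one it understands.
// Remember: the server itself should be started with a model context
// size of 2k or more, to be on the safe side.
async function query_model(baseUrl, messages) {
    const body = {
        model: "gpt-3.5-turbo",
        temperature: 0.7,
        max_tokens: 2048, // used by chat/completions style endpoints
        n_predict: 2048,  // used by the /completions endpoint of tools/server
        messages: messages,
    };
    const resp = await fetch(`${baseUrl}/chat/completions`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(body),
    });
    return await resp.json();
}
```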
@@ -321,9 +323,9 @@ work.
 
 ### Tool Calling
 
-ALERT: Currently the way this is implemented, it is dangerous to use this, unless one verifies
-all the tool calls requested and the responses generated manually to ensure everything is fine,
-during interaction with ai modles with tools support.
+ALERT: Given the simple minded way in which this is implemented, it can be dangerous in the
+worst case. Always remember to manually verify all the tool calls requested and the responses
+generated, to ensure everything is fine, during interaction with ai models with tools support.
 
 #### Builtin Tools
 
@@ -332,10 +334,10 @@ The following tools/functions are currently provided by default
 * run_javascript_function_code - which can be used to run some javascript code in the browser
 context.
 
-Currently the generated code / expression is run through a simple dynamic function mechanism.
-May update things, in future, so that a WebWorker is used to avoid exposing browser global scope
-to the generated code directly. Either way always remember to cross check the tool requests and
-generated responses when using tool calling.
+Currently the generated code / expression is run through a simple minded eval inside a web
+worker mechanism. Use of a WebWorker helps avoid exposing the browser's global scope to the
+generated code directly. However any shared web worker scope isn't isolated. Either way always
+remember to cross check the tool requests and generated responses when using tool calling.
 
 May add
 * web_fetch along with a corresponding simple local web proxy/caching server logic that can bypass
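A minimal sketch of the kind of web worker eval mechanism described above, under the assumption that a tool's result is whatever the generated code prints via console.log; the message shape here is made up for illustration, not the exact simplechat worker code.

```javascript
// toolworker.js - hypothetical sketch of running generated code in a
// dedicated web worker. console.log output is captured and posted back
// as the tool call's result, so the generated code never touches the
// page's global scope directly.
const logs = [];
console.log = (...args) => logs.push(args.join(" "));

onmessage = (ev) => {
    logs.length = 0;
    try {
        // eval the generated code / expression sent from the main thread
        eval(ev.data.code);
        postMessage({ id: ev.data.id, result: logs.join("\n") });
    } catch (err) {
        postMessage({ id: ev.data.id, result: `error: ${err.message}` });
    }
};
```

On the main thread side, usage would be along these lines:

```javascript
const worker = new Worker("toolworker.js");
worker.onmessage = (ev) => {
    // ev.data.result is what gets sent back to the ai model,
    // after the user has cross checked it
};
worker.postMessage({ id: 1, code: "console.log(6 * 7)" });
```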
@@ -343,19 +345,20 @@ May add
 In turn maybe with a white list of allowed sites to access or so.
 
 
-#### Extending wiht new tools
+#### Extending with new tools
 
 Provide a descriptive meta data explaining the tool / function being provided for tool calling,
 as well as its arguments.
 
-Provide a handler which should implement the specified tool / function call. It should place
-the result to be sent back to the ai model in the result key of the tc_switch entry for the
-corresponding tool.
+Provide a handler which implements the specified tool / function call, or rather constructs
+the code to be run to get the tool / function call job done, and in turn passes the same to
+the provided web worker to get it executed. Remember to use console.log, in your constructed
+code, when generating any response that should be sent back to the ai model.
 
-Update the tc_switch to include a object entry for the tool, which inturn icnludes
+Update the tc_switch to include an object entry for the tool, which in turn includes
 * the meta data as well as
 * a reference to the handler and also
-* the result key
+* the result key (was used previously, may use in future, but for now left as is)
 
 #### Mapping tool calls and responses to normal assistant - user chat flow
 
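To tie the three steps together, a hypothetical tc_switch entry might look as follows. The field names (meta, handler, result) follow the readme's wording and are assumptions, not verified against the source; runCode is a stand-in for the provided web worker plumbing.

```javascript
// Hypothetical sketch of registering a new tool in tc_switch.
const tc_switch = {};

tc_switch["simple_calculator"] = {
    // descriptive meta data explaining the tool and its arguments
    meta: {
        type: "function",
        function: {
            name: "simple_calculator",
            description: "Evaluate a basic arithmetic expression",
            parameters: {
                type: "object",
                properties: {
                    expr: { type: "string", description: "expression to evaluate" },
                },
                required: ["expr"],
            },
        },
    },
    // handler constructs the code to run and passes it on; runCode is a
    // stand-in for posting to the provided web worker. console.log in the
    // constructed code generates the response sent back to the ai model.
    handler: (args, runCode) => {
        runCode(`console.log(${args.expr});`);
    },
    // result key: was used previously, may use in future, left as is
    result: "",
};
```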
@@ -368,16 +371,16 @@ tagged response in the subsequent user block.
 This allows the GenAi/LLM to be aware of the tool calls it made as well as the responses it got,
 so that it can incorporate the results of the same in the subsequent chat / interactions.
 
-NOTE: This flow tested to be ok enough with Gemma-3N-E4B-it-Q8_0 LLM ai model for now.
+NOTE: This flow tested ok enough with the Gemma-3N-E4B-it-Q8_0 LLM ai model for now. Logically,
+given the way current ai models work, most of them should understand things as needed, but this
+needs testing with other ai models later.
 
 TODO: Need to think later, whether to continue this simple flow, or at least use the tool role wrt
-the tool call responses or even go further and have the logically seperate tool_call request
+the tool call responses, or even go further and have the logically separate tool_calls request
 structures also.
 
 #### ToDo
 
-Update to use web worker.
-
 WebFetch and Local web proxy/caching server
 
 Try and trap promises based flows to ensure all generated results or errors if any are caught
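A small sketch of this mapping in terms of the chat messages array; the tag text used below is a made-up placeholder for illustration, not the exact format simplechat emits.

```javascript
// Hypothetical sketch: the tool call stays in the assistant block, and
// its tagged response goes into the subsequent user block, instead of
// using the separate tool role / tool_calls request structures.
const messages = [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is 6 times 7?" },
    {
        role: "assistant",
        content: "tool_call: simple_calculator({\"expr\": \"6 * 7\"})",
    },
    {
        role: "user",
        content: "[tool_response: simple_calculator] 42", // tagged response
    },
];
```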
@@ -907,8 +907,8 @@ class Me {
         this.apiRequestOptions = {
             "model": "gpt-3.5-turbo",
             "temperature": 0.7,
-            "max_tokens": 1024,
-            "n_predict": 1024,
+            "max_tokens": 2048,
+            "n_predict": 2048,
             "cache_prompt": false,
             //"frequency_penalty": 1.2,
             //"presence_penalty": 1.2,