HanishKVC
ebd5e71295
SimpleChat:CSS: Move style info into its own css file
...
To keep it simple, clean and separate so that things are not
unnecessarily cluttered.
2024-05-18 17:09:47 +05:30
HanishKVC
65a56e6fdb
SimpleChat: Update the readme file
2024-05-18 03:37:15 +05:30
HanishKVC
0d0a28b4ab
SimpleChat:HTML: Add a style for system role message
2024-05-18 03:31:37 +05:30
HanishKVC
601fedf8c1
SimpleChat: Move handling systemprompt into its own func
2024-05-18 03:19:59 +05:30
HanishKVC
72151aa634
SimpleChat:Alert user if they provide sysprompt late or change it
2024-05-18 03:16:30 +05:30
HanishKVC
884adfd739
SimpleChat: Ignore empty user input, without trimming
2024-05-18 03:07:40 +05:30
HanishKVC
ae52ad1675
SimpleChat:Allow system prompt to be set, if provided before user
2024-05-18 02:59:42 +05:30
HanishKVC
69817fe1de
SimpleChat:HTML: Cleanup/structure UI a bit, Add input for system
2024-05-18 01:40:57 +05:30
HanishKVC
668b98700c
SimpleChat: Add a simple readme file
2024-05-18 01:06:54 +05:30
HanishKVC
b3644172e0
SimpleChat:JS: Force completion mode be single message by default
2024-05-18 00:36:23 +05:30
HanishKVC
aef32d9cc0
SimpleChat:JS: Handle difference in response
...
Try to read the assistant response from the appropriate field in the
response received.
Also, examples/server seems to return the response in a slightly
different field, so try to account for that as well.
2024-05-18 00:36:23 +05:30
HanishKVC
3e5edbacd6
SimpleChat: Don't submit if already submitted and waiting
...
Also make chat the default selection wrt mode
2024-05-18 00:36:23 +05:30
HanishKVC
9feb58eaa5
SimpleChat: Allow user to select chat or completion mode
2024-05-18 00:36:23 +05:30
HanishKVC
e62087bf3f
SimpleChat:JS: Try trap enter key press wrt input text field
...
So the user can either press the submit button or press the Enter key
2024-05-18 00:36:23 +05:30
HanishKVC
29d2d22c02
SimpleChat:sh: Add simple shell script to run python3 http.server
...
So one needs to run the LLM server locally,
then run this script and access it using a local browser.
2024-05-18 00:36:23 +05:30
HanishKVC
ebe330d098
SimpleChat: Move into its own sub directory to avoid confusion
2024-05-18 00:36:23 +05:30
HanishKVC
9942851273
SimpleChat: Diff user/assistant msgs, Make input wider
...
Also show a default message to user
Also add some metas
2024-05-18 00:36:23 +05:30
HanishKVC
7d772f6b9a
SimpleChat: Try keep input element in view
2024-05-18 00:36:23 +05:30
HanishKVC
564469e4f6
SimpleChat:JS: Messages/Prompt, indicate working to end user
2024-05-18 00:36:23 +05:30
HanishKVC
c6653479fc
SimpleChat:JS: Extract model response and show to user
2024-05-18 00:36:23 +05:30
HanishKVC
33bc67baa6
SimpleChat: Try handshake with llm over its web service endpoint
2024-05-18 00:36:23 +05:30
HanishKVC
27268a6067
SimpleChat: Move handling of submit request into its own func
2024-05-18 00:36:23 +05:30
HanishKVC
ce4aaeb692
SimpleChat: Use common helper logic wrt json data
2024-05-18 00:36:23 +05:30
HanishKVC
639d647ebf
SimpleChat: Also add completions related prompt
2024-05-18 00:36:23 +05:30
HanishKVC
256e02c7c9
SimpleChat: Rather value wrt input text element
2024-05-18 00:36:23 +05:30
HanishKVC
24d348ab97
SimpleChat:HTML: Bring in the js file
2024-05-18 00:36:23 +05:30
HanishKVC
70e5860264
SimpleChatJS: Roles Class, submitClick
...
Define Role class with static members corresponding to the roles.
Update startme to
* Get hold of the ui elements.
* Attach a click handler to submit button, which adds the user input
to xchats array and shows the chat messages till now in chat div
element.
Trap DOMContentLoaded to trigger startme
2024-05-18 00:36:23 +05:30
HanishKVC
1d3cc9353a
SimpleChat: request_json, globals, startme
2024-05-18 00:36:23 +05:30
HanishKVC
0402a4b60e
SimpleChat: A js skeleton with SimpleChat class
...
Allows maintaining an array of chat messages.
Allows adding a chat message (from any of the roles be it system,
user, assistant, ...)
Allows showing chat messages till now, in a given div element.
2024-05-18 00:36:23 +05:30
HanishKVC
69ecad21e7
SimpleChat: Add a skeletal html page
...
Contains
* a div placeholder for showing chat messages till now
* a text-input for allowing user to enter next chat message/query
to the model.
* a submit button to allow sending of the user entered message and
chat till now to the model.
2024-05-18 00:36:22 +05:30
Radoslav Gerganov
f4bd8b3d26
rpc : set SO_REUSEADDR for the server socket ( #7320 )
...
ref: #7293
2024-05-17 17:25:44 +03:00
Radoslav Gerganov
ee94172d33
server : add support for the RPC backend ( #7305 )
...
ref: #7292
2024-05-17 10:00:17 +03:00
Leon Knauer
9c4fdcbec8
[Server] Added --verbose option to README [no ci] ( #7335 )
2024-05-17 10:11:03 +10:00
Pierrick Hymbert
24ecb58168
Revert "server bench: fix bench not waiting for model load ( #7284 )" ( #7334 )
...
This reverts commit 583fd6b000 .
2024-05-16 20:43:45 +02:00
Radoslav Gerganov
9afdffe70e
rpc : get available mem for the CPU backend
...
This can be overridden with the -m command line option
ref: #7293
2024-05-16 12:04:08 +03:00
Radoslav Gerganov
3b3963c55c
rpc : add command line arg for specifying backend memory
...
ref: #7293
2024-05-16 09:58:29 +03:00
Vaibhav Srivastav
ad52d5c259
doc: add references to hugging face GGUF-my-repo quantisation web tool. ( #7288 )
...
* chore: add references to the quantisation space.
* fix grammar lol.
* Update README.md
Co-authored-by: Julien Chaumond <julien@huggingface.co>
* Update README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-05-16 15:38:43 +10:00
slaren
344f9126cc
ggml : tag ggml_tensor::backend as deprecated ( #7290 )
2024-05-15 15:08:48 +02:00
dm4
ea3b0590ee
embedding : free the batch after execution ( #7297 )
2024-05-15 15:01:12 +03:00
Johannes Gäßler
583fd6b000
server bench: fix bench not waiting for model load ( #7284 )
2024-05-15 08:44:16 +02:00
Steve Grubb
4f0263633b
server: free sampling contexts on exit ( #7264 )
...
* server: free sampling contexts on exit
This cleans up the last leak found by the address sanitizer.
* fix whitespace
* fix whitespace
2024-05-14 16:11:24 +02:00
Brian
1265c670fd
Revert "move ndk code to a new library ( #6951 )" ( #7282 )
...
This reverts commit efc8f767c8 .
2024-05-14 16:10:39 +03:00
Radoslav Gerganov
5e31828d3e
ggml : add RPC backend ( #6829 )
...
* ggml : add RPC backend
The RPC backend proxies all operations to a remote server which runs a
regular backend (CPU, CUDA, Metal, etc).
* set TCP_NODELAY
* add CI workflows
* Address review comments
* fix warning
* implement llama_max_devices() for RPC
* Address review comments
* Address review comments
* wrap sockfd into a struct
* implement get_alignment and get_max_size
* add get_device_memory
* fix warning
* win32 support
* add README
* readme : trim trailing whitespace
* Address review comments
* win32 fix
* Address review comments
* fix compile warnings on macos
2024-05-14 14:27:19 +03:00
Elton Kola
efc8f767c8
move ndk code to a new library ( #6951 )
2024-05-14 17:30:30 +10:00
Ryuei
27f65d6267
docs: Fix typo and update description for --embeddings flag ( #7026 )
...
- Change '--embedding' to '--embeddings' in the README
- Update the description to match the latest --help output
- Added a caution about defining physical batch size
2024-05-14 15:20:47 +10:00
k.h.lai
30e70334f7
llava-cli: fix base64 prompt ( #7248 )
2024-05-14 00:02:36 +10:00
Johannes Gäßler
1c570d8bee
perplexity: add BF16 vs. FP16 results ( #7150 )
2024-05-13 13:03:27 +02:00
Benjamin Findley
e586ee4259
change default temperature of OAI compat API from 0 to 1 ( #7226 )
...
* change default temperature of OAI compat API from 0 to 1
* make tests explicitly send temperature to OAI API
2024-05-13 12:40:08 +10:00
Xuan Son Nguyen
72c177c1f6
fix system prompt handling ( #7153 )
2024-05-11 17:28:10 +02:00
Steve Grubb
988631335a
server : free llama_batch on exit ( #7212 )
...
* [server] Cleanup a memory leak on exit
There are a couple of memory leaks on exit of the server. This hides others.
After cleaning this up, you can see leaks on slots. But that is another
patch to be sent after this.
* make tab into spaces
2024-05-11 11:13:02 +03:00