Upgrade llama.cpp to most recent release (b1645) so it supports mixtral #33

kherud merged 12 commits into kherud:master
Conversation
```java
@Test
public void testTokenization() {
    String prompt = "Hello, world!";
    int[] encoded = model.encode(prompt);
    Assert.assertArrayEquals(new int[]{15043, 29892, 3186, 29991}, encoded);
}
```
The token IDs are different because the model is different. I figured that a sufficient test is that the encode/decode can be round-tripped.
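A round-trip test along those lines might look like the following. This is just a sketch: it assumes the binding exposes a `decode(int[])` that inverts `encode`, which may not match the actual API.

```java
@Test
public void testTokenizationRoundTrip() {
    // Rather than pinning model-specific token IDs, check that
    // encoding and then decoding reproduces the original prompt.
    String prompt = "Hello, world!";
    int[] encoded = model.encode(prompt);
    String decoded = model.decode(encoded); // assumed inverse of encode
    Assert.assertEquals(prompt, decoded);
}
```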
```diff
@@ -157,14 +171,13 @@ public void testCompleteGrammar() {
     @Test
     public void testEmbedding() {
         float[] embedding = model.embed(prefix);
         Assert.assertEquals(5120, embedding.length);
```
Codellama and Mistral 7B have different model dimensions, so the embeddings have different lengths.
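If hard-coding the dimension per model becomes a nuisance, the expected size could be parameterized the same way as the model name. A sketch, assuming a hypothetical `model.embedding.dim` system property (Mistral 7B's hidden size is 4096, versus 5120 for codellama-13b):

```java
@Test
public void testEmbedding() {
    float[] embedding = model.embed(prefix);
    // model.embedding.dim is a hypothetical property; 4096 is the
    // default here because it is Mistral 7B's hidden size.
    int expectedDim = Integer.getInteger("model.embedding.dim", 4096);
    Assert.assertEquals(expectedDim, embedding.length);
}
```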
```xml
<junit.version>4.13.1</junit.version>
<test.plugin.version>3.2.3</test.plugin.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<integration.test.model>mistral-7b-instruct-v0.2.Q5_K_S.gguf</integration.test.model>
```
I decided to rework this to use Mistral 7B, since it's smaller and Apache-licensed.
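Presumably the test plugin forwards these properties to the test JVM; under that assumption, a test could resolve the model file roughly like this (property names taken from the pom snippet above, the wiring itself is assumed):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class IntegrationTestModel {
    // Resolve the integration-test model inside the model home directory.
    static Path resolve() {
        String modelHome = System.getProperty("model.home");
        String modelFile = System.getProperty("integration.test.model",
                "mistral-7b-instruct-v0.2.Q5_K_S.gguf");
        return Paths.get(modelHome, modelFile);
    }
}
```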
```diff
 jsize len = string.size();
 jbyteArray bytes = env->NewByteArray(len);
-env->SetByteArrayRegion(bytes, 0, len, (jbyte *)string.c_str());
+env->SetByteArrayRegion(bytes, 0, len, reinterpret_cast<const jbyte *>(string.c_str()));
```
This avoids a compiler warning from the C-style cast.
```diff
-String modelPath = "/run/media/konstantin/Seagate/models/llama2/llama-2-13b-chat/ggml-model-q4_0.gguf";
         .setAntiPrompt("User:");
+String modelName = System.getProperty("model.name");
```
This enables us to run the examples via maven directly rather than having the model path hard-coded. For instance, you can run this with the codellama model like so:

```bash
mvn exec:java -Dexec.mainClass="examples.MainExample" -Dmodel.home="/Users/cstella/llm/models" -Dmodel.name="codellama-13b.Q5_K_M.gguf"
```

Not sure if you would like this documented in the Readme.
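For reference, combining the two properties inside the example could look roughly like this (only `model.name` appears in the diff above, so the `model.home` handling here is an assumption):

```java
// Sketch: build the full model path from the two -D properties.
// The fallback to $HOME/llm/models is an assumption, not the actual code.
String modelHome = System.getProperty("model.home",
        System.getProperty("user.home") + "/llm/models");
String modelName = System.getProperty("model.name");
String modelPath = modelHome + "/" + modelName;
```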
Awesome work, thank you! It's 12 am here, I will review everything tomorrow and merge if everything is fine!
Thanks again @cestella, great work! Just some notes:
Absolutely!
This PR does the following:

- Upgrades llama.cpp to the most recent release (b1645) so it supports Mixtral
- Reworks the integration tests to use the smaller, Apache-licensed Mistral 7B model (mistral-7b-instruct-v0.2.Q5_K_S.gguf)
- Makes the model used by the tests and examples configurable via the model.home and model.name system properties
After this PR, you can run the integration tests via maven like so:

```bash
mvn verify -Dmodel.home=$HOME/llm/models
```

Note: `$HOME/llm/models` is the model home directory. This is where the integration test model will be stored.

Examples
In addition to refactoring the unit tests, I reran the examples. I modified the examples so they can be run via maven:
examples.GrammarExample

This looks roughly like what I see if I run the same example from main:
examples.InfillExample

With the new code I see:
With master I see:
examples.MainExample

I made a couple of modifications to both the old version and the new version to compare:
- set the anti-prompt to `User:` so it actually completes the statement

With this PR I see:
With master I see: