mirror of
https://github.com/ollama/ollama.git
synced 2025-08-23 23:21:23 +02:00
model: fix boundary in bpe
0x007e is a tilde and was getting adjusted (+0x00a2) to 0x0120 in the encode, but then in the decode it was getting adjusted down (-0x0100) to 0x0020. The boundary for the +0x00a2 case has been adjusted to fix this Fixes: #11966
This commit is contained in:
@@ -109,7 +109,7 @@ func (bpe BytePairEncoding) Encode(s string, addSpecial bool) ([]int32, error) {
|
||||
r = 0x0143
|
||||
case r <= 0x0020:
|
||||
r = r + 0x0100
|
||||
case r >= 0x007e && r <= 0x00a0:
|
||||
case r >= 0x007f && r <= 0x00a0:
|
||||
r = r + 0x00a2
|
||||
}
|
||||
|
||||
|
Reference in New Issue
Block a user