Commit 6af9888
committed
Fix issue with extra token in tokenizers and pad
Signed-off-by: John St John <[email protected]>1 parent 2f54aff commit 6af9888
File tree
3 files changed
+32
-5
lines changed- bionemo-recipes/recipes/evo2_megatron
- tests/bionemo/evo2/data
- tokenizers
- nucleotide_fast_tokenizer_256
- nucleotide_fast_tokenizer_512
3 files changed
+32
-5
lines changedLines changed: 28 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
| 20 | + | |
| 21 | + | |
20 | 22 | | |
| 23 | + | |
| 24 | + | |
21 | 25 | | |
22 | | - | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
23 | 31 | | |
24 | 32 | | |
25 | 33 | | |
| |||
29 | 37 | | |
30 | 38 | | |
31 | 39 | | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
32 | 59 | | |
33 | 60 | | |
34 | 61 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
143 | 143 | | |
144 | 144 | | |
145 | 145 | | |
146 | | - | |
147 | | - | |
| 146 | + | |
| 147 | + | |
148 | 148 | | |
149 | 149 | | |
150 | 150 | | |
| |||
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
116 | 116 | | |
117 | 117 | | |
118 | 118 | | |
119 | | - | |
120 | | - | |
| 119 | + | |
| 120 | + | |
121 | 121 | | |
122 | 122 | | |
123 | 123 | | |
| |||
0 commit comments