Why take 1k tokens after the transformer as potential vectors?