NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models

		NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models (arxiv.org)
		13 points by chrsw 41 days ago