Big debut! LS910 domestic Xinchuang large model inference server

In 2026, the size of China's large model market is expected to exceed 70 billion yuan, and the daily use volume of enterprise-level large models has jumped from 10.2 trillion Tokens to 37 trillion Tokens, an increase of 263% in half a year. Large models are accelerating from the universal base period of the "Battle of Hundreds of Models" to a new stage of deep industry penetration and large-scale application.

A key question arises: How to change computing power from "available" to "easy to use" in the private domain?

LS910 is Colorlight’s answer.

Localized deployment, data does not leave the domain

LS910 is a domestic AI inference server launched for enterprise-level applications. It is designed for core business scenarios such as security analysis, enterprise knowledge base Q&A, intelligent customer service, document understanding, and code assistance. It supports the privatized deployment of mainstream large models such as Qwen3.6-27B. All data and inference processes run in the customer's local environment, truly keeping the data out of the domain and meeting Xinchuang's compliance and data security requirements.

Enterprise knowledge base Q&A

Long document understanding and summary

Meeting minutes

Automatic generation

Code assistance

and security analysis

In-depth tuning

Let every bit of computing power be fully released

Specifications

Module

Specifications

Main control unit

Kunpeng 920 24 Cores

AI computing core

Ascend Atlas 300I A2 (910B)

64GB HBM memory

Software stack

CANN/MindSpore+self-developed optimization layer

Provides OpenAI compatible API

Private domain mainstream model indicators (Qwen3.6-27B)

Capability items

Reference indicators

Recommended context

128K tokens

(optional upgrade to 256K)

Single-channel output speed

≥25tokens/s,

Typical scenario 25–45 tokens/s

First Token delay

2–4 seconds for short context,

4–12 seconds for 32K context

Real-time concurrency

2–3 channel streaming generation

Online session

6–10 concurrent accesses

Flexible expansion, from single card to multiple cards

When the business grows, it can be expanded smoothly by adding computing power cards. The same architecture does not need to be adjusted, protecting the initial investment.

Edge Standard Edition (1 card): single-department knowledge base, intelligent assistant

Max Professional Edition (4 cards): multi-users, long documents, multiple knowledge bases throughout the company

Pro High-Performance Edition (8 cards): high concurrency or 262K ultra-long context tasks

LS910 large model inference server, thanks to Ascend 910B computing power base and Colorlight full-link software tuning capabilities, can be widely used in private AI scenarios such as enterprise knowledge base Q&A, long document understanding, code assistance, etc.

The product adopts the integrated design concept of "hardware + operator optimization" to achieve flat optimization of inference throughput, response delay, and concurrency capabilities. It can be flexibly expanded from a single card to multiple cards according to business needs, providing excellent performance and cost-effectiveness while ensuring data security.

PREVIOUS：9.325 billion, TCL Technology’s acquisition of Guangzhou Huaxing was accepted NEXT：SID2026 TCL Huaxing apex Full Technology Path Innovation Landing Multiple Scenarios

Big debut! LS910 domestic Xinchuang large model inference server

RELATED NEWS

CATEGORIES

LATEST NEWS

CONTACT US