- arXiv.org e-Print archive
arXiv is a free distribution service and an open-access archive for nearly 2.4 million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. Materials on this site are not peer-reviewed by arXiv.
- [2501.12948] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs ...
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities. Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and intriguing reasoning behaviors. However, it
- Log in to arXiv | arXiv e-print repository
If you've never logged in to arXiv.org, register for the first time. Registration is required to submit or update papers, but is not necessary to view them.
- Numerical Analysis - arXiv.org
Numerical Analysis: authors and titles for recent submissions (Wed, 3 Dec 2025; Tue, 2 Dec 2025; Mon, 1 Dec 2025; Thu, 27 Nov 2025; Wed, 26 Nov 2025).
- Mathematics - arXiv.org
Mathematics (since February 1992). For a specific paper, enter the identifier into the top right search box. Browse: new (most recent mailing, with abstracts), recent (last 5 mailings), current month's listings, or a specific year and month.
- [2412.19437] DeepSeek-V3 Technical Report - arXiv.org
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for
- [2412.03959] Is FISHER All You Need in The Multi-AUV Underwater Target ...
It is significant to employ multiple autonomous underwater vehicles (AUVs) to execute the underwater target tracking task collaboratively. However, it's pretty challenging to meet various prerequisites utilizing traditional control methods. Therefore, we propose an effective two-stage learning from demonstrations training framework, FISHER, to highlight the adaptability of reinforcement
- Astrophysics - arXiv.org
Astrophysics (since April 1992). For a specific paper, enter the identifier into the top right search box. Browse: new (most recent mailing, with abstracts), recent (last 5 mailings), current month's listings, or a specific year and month.