Evaluation based on programming scenarios
[ English | 中文 ]
👋 Join our WeChat
DeepSeek R1 Coding Ability Review
https://survey.stackoverflow.co/2024/ai#sentiment-and-usage
| Defect scenario | Serious result | case |
|---|---|---|
| Dead Loop | Severe cause CPU 100%, service crash | 2 |
| Memory leak, memory overflow | Severe OOM, service crashes | 2 |
| Thread Deadlock | Concurrent threads compete for resource deadlocks, severely causing CPU 100% or OOM, service unavailability or failure | 2 |
| Inconsistent concurrent data | Improper operation in multi-threaded situations leads to inconsistent and dirty data | 1 |
| Long context/token capability | Test the accuracy and maximum capability of long text processing | 1 |
| Context learning capability | Test the accuracy of context understanding and reasoning | 1 |

