SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

Details
Publisher: Princeton / SWE-bench
Domain: Research & Learning
Category: Papers & Research
Type Group: Benchmarks & Datasets
Type: Paper / Benchmark
Best For: Research
Skill Level: Advanced
Access: Free
Topic: Coding-agent benchmark

Related in Papers & Research
Adapting the Interface, Not the Model: Runtime Harness Adaptation for Deterministic LLM Agents (Tianshi Xu, Huifeng Wen, Meng Li)
Awesome Code as Agent Harness Papers (YennNing)
A Survey of Context Engineering for Large Language Models (Lingrui Mei et al.)
Language-Induced Priors for Domain Adaptation (Qiyuan Chen, Jiayu Zhou, Raed Al Kontar)
Contexting as Recommendation: Evolutionary Collaborative Filtering for Context Engineering (Jiachen Zhu et al.)
AutoCodeBench: Large Language Models are Automatic Code Benchmark Generators (Hunyuan Team, Tencent)