Grading Computer Programming Skills using Machine Learning
Course URL: http://videolectures.net/kdd2014_srikant_aggarwal_programming_ski...
Lecturers: Varun Aggarwal; Shashank Srikant
Institution: Aspiring Minds (employability assessment company)
Date: 2014-10-07
Language: English
Course description: The automatic evaluation of computer programs is a nascent area of research with potential for large-scale impact. Existing program assessment systems score mostly on the number of test cases passed, which provides no insight into the competency of the programmer. In this paper, we present a system that grades computer programs automatically. In addition to grading a program on its programming practices and complexity, the key kernel of the system is a machine-learning-based algorithm that determines how close the logic of a given program is to that of a correct program. This algorithm uses a set of highly informative features, derived from abstract representations of the given program, that capture the program's functionality. These features are then used to learn models that grade the programs; the models are built against evaluations done by experts. We show that the regression models provide much better grading than the ubiquitous grading based on test cases passed, and that they rival the grading accuracy reported for other open-response problems such as essay grading. We also show that our novel features add significant value over and above basic keyword/expression count features. In addition, we propose a novel way of posing computer-program grading as a one-class modeling problem and report encouraging preliminary results. We show the value of the system through a case study of a real-world industrial deployment. To the best of the authors' knowledge, this is the first time a system using machine learning has been developed and used for grading programs. The work is timely given the recent boom in Massive Open Online Courses (MOOCs), which promises to produce a significant amount of hand-graded digitized data.
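The "basic keyword/expression count features" mentioned in the description can be illustrated with a minimal sketch. It is not the authors' feature set: it assumes submissions written in Python, uses the standard-library `ast` module as the abstract representation, and the feature names (`loop`, `if`, `binop:...`, `cmp:...`) and the helpers `count_features` and `feature_vector` are hypothetical. In a setup like the one described, vectors of this kind would be the inputs to a regression model trained against expert grades.

```python
import ast
from collections import Counter


def count_features(source: str) -> Counter:
    """Count simple control-flow and expression features in Python source.

    Assumption: these counts stand in for the baseline keyword/expression
    count features; the feature names are invented for this sketch.
    """
    tree = ast.parse(source)
    counts = Counter()
    for node in ast.walk(tree):
        if isinstance(node, (ast.For, ast.While)):
            counts["loop"] += 1
        elif isinstance(node, ast.If):
            counts["if"] += 1
        elif isinstance(node, ast.BinOp):
            # One feature per arithmetic operator type, e.g. "binop:Add".
            counts["binop:" + type(node.op).__name__] += 1
        elif isinstance(node, ast.Compare):
            # A chained comparison contributes one count per operator.
            for op in node.ops:
                counts["cmp:" + type(op).__name__] += 1
    return counts


def feature_vector(counts: Counter, names: list) -> list:
    """Project the counts onto a fixed feature order; missing features are 0."""
    return [counts.get(name, 0) for name in names]


# Hypothetical submission: sum the positive numbers in a list.
submission = """
def total(xs):
    s = 0
    for x in xs:
        if x > 0:
            s = s + x
    return s
"""

features = count_features(submission)
vector = feature_vector(features, ["loop", "if", "binop:Add", "cmp:Gt", "binop:Mult"])
```

Fixing the feature order in `feature_vector` keeps every submission's vector aligned column by column, which is what a regression model needs when it is fitted against expert grades.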
Keywords: open online courseware; expression counts; machine learning
Source: VideoLectures.NET
Data collected: 2021-06-23:zyk
Last reviewed: 2021-06-23:zyk