软件工程顶会论文中的自动修复工具——S3 [FSE 2017]

xiaoxiao2021-03-01  58

前言

不多比比,本文介绍自动修复工具——S3 (来自去年的FSE)。

自动修复工具S3

1.1 作者信息

文章:S3: Syntax- and Semantic-Guided Repair Synthesis via Programming by Examples

Xuan-Bach D. Le; Duc-Hiep Chu; David Lo; Claire Le Goues; Willem Visser

其中David Lo,Claire Le Goues都是我之前在最早的自动修复论文中经常看到的,太厉害了

1.2 摘要讲啥了?

1). Such techniques, e.g., Angelix, infer semantic specifcations via symbolic execution, and then use program synthesis to construct new code that satisfes those inferred specifcations. However, the obtained specifcations are naturally incomplete, leaving the synthesis engine with a difcult task of synthesizing a general solution from a sparse space of many possible solutions that are consistent with the provided specifcations but that do not necessarily generalize.

再次看到了angelix,好像也是这个实验室之前的工作? 总之就是说这个有不足的地方。

2)作者的工作: We present S3, a new repair synthesis engine that leverages programming-by-examples methodology to synthesize high-quality bug repairs.

1.2 第一次看到sparce search space

The novelty in S3 that allows it to tackle the sparse search space to create more general repairs is three-fold: (1) A systematic way to customize and constrain the syntactic search space via a domain-specifc language, (2) An efficient enumeration-based search strategy over the constrained search space, and (3) A number of ranking features based on measures of the syntactic and semantic distances between candidate solutions and the original buggy program.

1.3 S3和谁比较了,做了什么实验,修复了什么bug?

We compare S3’s repair effectiveness with state-ofthe-art synthesis engines Angelix, Enumerative, and CVC4. S3 can successfully and correctly fx at least three times more bugs than the best baseline on datasets of 52 bugs in small programs, and 100 bugs in real-world large programs.

和三个补丁合成引擎(engine)进行比较,这个是我第一看到用engine这个词,而不是repair technique,repair method,repair tools

benchmark The frst dataset includes 52 bugs in small programs, a subset of the IntroClass benchmark [30] translated to Java [10].1 The IntroClass dataset contains only small programs, but provides two highcoverage test suites for each, allowing an independent assessment of repair quality. The second dataset includes 100 large real-world Java bugs that we collected from GitHub.

As many current state-of-the-art repair tools target bugs that require only a small number of changed lines [34–36], our dataset is sufcient for assessing current research. 解释的很清楚的。为什么我们的dataset is sufficient.

1.4 震惊,这个开头也写的太好了

Bug fixing is notoriously difcult, time-consuming, and costly [5, 46]. Hence, automating bug repair, to reduce the onerous burden of this task, would be of tremendous value.

感觉很多词,都是我用不出来的。。。基本功不扎实

1.5 introduction中列举的:现有的工具

Automatic program repair has been graining ground, with substantial recent work devoted to the problem [6, 20, 24–26, 29, 32, 34–36, 51, 52], inspiring hope of future practical adoption.

1.6 angelix问题所在(introduction)

Although scalability has been well-addressed, one pressing concern in program repair is patch quality, sometimes quantifed in terms of patch overftting or generalizability [43].

[43] Edward K Smith, Earl T Barr, Claire Le Goues, and Yuriy Brun. 2015. Is the cure worse than the disease? overftting in automated program repair. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering. ACM, 532–543.

Overftting, and patch quality generally, remains a challenging problem in the program repair feild.

这篇文章写的是真的好。

1.7 怎么解决过拟合的问题?——缩小这个非常稀疏的search space

One reason for patch overftting is that the repair search space is often sparse, containing many plausible solutions that can lead the buggy program to pass a given test suite, but that may still be judged incorrect [33]. One way to tackle overftting is thus to constrain the search space to patches that are more likely to generalize. Other strategies for increasing the quality of output patches include higher-granularity mutation operators [19], antipatterns [45], history-based patterns [28], feedback from execution traces [8], or document analysis [51]. Angelix [36] eagerly preserves the original syntactic structure of the buggy program via PartialMaxSMT-based constraint solving [35] and componentbased synthesis [16]. However, such enforcement alone may not be enough [8]. Furthermore, incorporating other strategies or criteria into a constraint-based synthesis approach is non-obvious, since doing so typically requires novel, and often complicated constraint encodings (this problem has been pointed out by others, see, e.g., Chapter 7 of [14] or Section 2 of [45]). This motivates the design of a new repair synthesis technique that can consolidate various restrictions or patch generation criteria, enabling an efcient search over a constrained space for potentially higher-quality patches.

这里面包罗万象,显示出作者对自动修复的理解,我感觉他能复现所有工具,厉害。

1.8 作者idea来源

我认为来源于一个很重要的文献: [14] Sumit Gulwani, Javier Esparza, Orna Grumberg, and Salomon Sickert. 2016. Programming by Examples (and its applications in Data Wrangling). Verifcation and Synthesis of Correct and Secure Systems (2016).

Unlike other constraint-based repair synthesis techniques, our framework is highly customizable by design, enabling the easy inclusion of new ranking features — its design is inspired by the programming-byexamples (PBE) synthesis methodology [14].

这个也太牛了吧,很难想到这个可以用到自动修复中,我更加不可能看到这篇文章,我连这个会议的名字都没听过…感觉在自己也差太多了

此外,还有一个: The synthesis phase frst constrains the syntactic search space of solutions via a DSL that we extend from SYNTH-LIB [3] Our extension allows it to specify a starting sketch, or an expression that gives S3 clues about what possible solutions might look like.

1.9 小结

先写到这里,我想看的东西大概都看到了,有了基本了解,剩下一些很技术的问题,比如程序合成,符号执行,这些我也看不懂,得从头学,现在看还看不懂

以后有时间再研究。

2.0 工具代码地址

https://xuanbachle.github.io/semanticsrepair/

还真有,https://bitbucket.org/xbach/sygus-repair/src

jfix的也有,在https://xuanbachle.github.io/semanticsrepair/这个上面。

转载请注明原文地址: https://www.6miu.com/read-4549980.html

最新回复(0)