Towards Understanding Web Applications: Automated Abstraction Inference and its Applications

Authors: Sacha Brisset, Lionel Seinturier, Renaud Pawlak, Romain Rouvoy

Abstract

Web applications pervade modern society, and the largest serve millions of people. They are expected to be highly reliable and stable, yet capable of evolving to adapt to their users. At such a scale, these expectations can only be met by investing substantial resources and time. It is therefore critical to improve our ability to understand the structure of web applications in order to ease their maintenance and evolution.

In this thesis, we explore web application structure through three lenses: web testing, data extraction, and web analytics. Our study shows that many web-related research efforts, regardless of domain, suffer greatly from the lack of a generic, fully unsupervised solution for inferring web application abstractions. We attempt to develop such a solution iteratively, leading to three main contributions:

SFTM: Similarity-based Tree Matching, an algorithm for matching two web pages. Compared to traditional, generic tree matching algorithms, SFTM produces better matchings with computation times several orders of magnitude smaller.

ERRATUM: An approach for repairing locators in web applications. ERRATUM substantially improves the quality of repairs with little to no overhead. We integrated ERRATUM into a widely used open-source testing framework.

APPSTRACT: An approach for automatically generating an abstraction of a web application. APPSTRACT combines intra-page and inter-page abstraction using SFTM to generate robust and semantically rich application-wide locator identifiers for each element of a web page.
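To make the idea of matching two versions of a page and repairing a locator more concrete, here is a minimal sketch in Python. It is not SFTM or ERRATUM: it is a toy greedy matcher that compares nodes by the Jaccard similarity of their tag, attribute, and text tokens. The `Node` type, the function names, the positional path format, and the `threshold` value are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy DOM node (hypothetical type, for illustration only)."""
    tag: str
    attrs: dict = field(default_factory=dict)
    text: str = ""
    children: list = field(default_factory=list)


def walk(node, path):
    """Yield (positional_path, node) for the node and all its descendants."""
    yield path, node
    seen = {}
    for child in node.children:
        seen[child.tag] = seen.get(child.tag, 0) + 1
        yield from walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")


def tokens(node):
    """Describe a node as a set of tokens: tag, attributes, and text words."""
    toks = {f"tag:{node.tag}"}
    toks |= {f"{key}={value}" for key, value in node.attrs.items()}
    toks |= {f"text:{word.lower()}" for word in node.text.split()}
    return toks


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_pages(old_root, new_root, threshold=0.3):
    """Greedy one-to-one matching of old paths to new paths (assumed heuristic)."""
    old = [(p, tokens(n)) for p, n in walk(old_root, f"/{old_root.tag}")]
    new = [(p, tokens(n)) for p, n in walk(new_root, f"/{new_root.tag}")]
    # Score every (old, new) pair and consider the most similar pairs first.
    pairs = sorted(
        ((jaccard(t_old, t_new), p_old, p_new)
         for p_old, t_old in old for p_new, t_new in new),
        reverse=True,
    )
    matching, used = {}, set()
    for score, p_old, p_new in pairs:
        if score < threshold:
            break  # remaining pairs are too dissimilar to match
        if p_old not in matching and p_new not in used:
            matching[p_old] = p_new
            used.add(p_new)
    return matching


# Old version of the page: the button is a direct child of <body>.
old_page = Node("html", children=[Node("body", children=[
    Node("div", {"id": "menu"}),
    Node("button", {"id": "submit", "class": "btn"}, "Send"),
])])

# New version: a banner was added and the button moved inside a <form>.
new_page = Node("html", children=[Node("body", children=[
    Node("div", {"id": "banner"}),
    Node("div", {"id": "menu"}),
    Node("form", children=[
        Node("button", {"id": "submit", "class": "btn-primary"}, "Send"),
    ]),
])])

matching = match_pages(old_page, new_page)
broken_locator = "/html/body[1]/button[1]"
repaired_locator = matching.get(broken_locator)
print(repaired_locator)  # /html/body[1]/form[1]/button[1]
```

This sketch scores every pair of nodes, so its cost grows with the product of the two page sizes, and it ignores tree structure entirely. It is only meant to show how a matching between two page versions can be reused to translate a locator that no longer resolves.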

We believe our work opens up many new possibilities in a variety of research domains. In particular, the computation speed of SFTM enables approaches that were previously impractical with generic tree matching, and the approach we describe in APPSTRACT could pave the way for new web analytics or web test generation solutions based on web application abstraction.


Repository: lssol/Phd-Sacha
