Implement Visual regression framework. #51916

Open
bsessions85 opened this issue Apr 13, 2021 · 17 comments

@bsessions85
Contributor

Per: pbAok1-1Oc-p2

Spin up visual regression infrastructure and add tests for layouts in the editor and on the frontend, per theme.

@bsessions85
Contributor Author

Trying BackstopJS to start.

@bsessions85
Contributor Author

I've been running visual regression tests against the page templates in the editor, and I'm seeing a lot of inconsistency. Here is an example of some results. The image shows the two versions I see regularly; the output seems to bounce back and forth between them. Occasionally I also see other differences. This leads me to believe that testing in the editor may be too flaky to be useful. I'll keep trying to see if delays or anything else will help, but I wanted to post an update here.

[Screenshot: Screen Shot 2021-04-19 at 2 02 33 PM]
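For anyone following along, here is a rough sketch of the kind of scenario settings that "delays or anything else" could mean in BackstopJS: `readySelector`, `delay`, and `misMatchThreshold` are scenario options BackstopJS documents, but the URL, the selector, and the numbers below are placeholders I made up for illustration, not what the PR actually uses.

```typescript
// Hand-typed subset of the BackstopJS scenario options used in this sketch.
interface Scenario {
  label: string;
  url: string;
  readySelector?: string;     // wait until this selector exists in the page
  delay?: number;             // extra wait in milliseconds before capturing
  misMatchThreshold?: number; // percentage of differing pixels still allowed to pass
  selectors?: string[];
}

// Hypothetical values: the URL and the selector are assumptions, not taken from the PR.
const editorScenario: Scenario = {
  label: "editor-page-template",
  url: "https://example.test/wp-admin/post-new.php?post_type=page",
  readySelector: ".block-editor-block-list__layout",
  delay: 2000,
  misMatchThreshold: 0.1,
  selectors: ["viewport"],
};

// Minimal config object in the shape BackstopJS reads from backstop.json.
const config = {
  id: "editor_templates",
  viewports: [{ label: "desktop", width: 1280, height: 1024 }],
  scenarios: [editorScenario],
  engine: "puppeteer",
  report: ["browser"],
};

// Serialize so the result could be saved as a backstop.json file.
const configJson: string = JSON.stringify(config, null, 2);
```

A fixed `delay` only hides timing problems, so a `readySelector` tied to something the editor renders late is probably the more reliable of the two.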

@bsessions85
Contributor Author

It is also worth noting that I am running these within a Docker container so that we don't get differences between environments.

@simison
Member

simison commented Apr 20, 2021

cc @ockham who I believe was looking into visual diff testing of templates or blocks in the core. Have you noticed any similar flakiness?

@kwight
Contributor

kwight commented Apr 20, 2021

Hm, are those differences from the theme of the site? It looks like the test site has a different theme, which could be loading editor styles (accounting for the differences).

@bsessions85
Contributor Author

@kwight, good question. I'll look into that! Thanks.

@bsessions85
Contributor Author

The theme was part of the problem. I didn't realize that different themes changed how it looks in the editor. It now looks better, but is still flaky. Things like images being slightly different or sometimes not loading are the next challenge.

@kwight
Contributor

kwight commented Apr 20, 2021

> I didn't realize that different themes changed how it looks in the editor.

It's pretty scattered; themes can enqueue styles into the editor to make it look more like the front-end – some themes do, some don't. Some do it well, some don't.

> Things like images being slightly different or sometimes not loading are the next challenge.

Does that have something to do with the origin? Some could be coming from URLs, Photon, or the media library directly. I know blocks like the Gallery block have pretty awkward image handling by necessity, depending on environment.

Is your work in a PR somewhere?

@simison
Member

simison commented Apr 21, 2021

> I didn't realize that different themes changed how it looks in the editor.

It's also the part of the equation that frequently breaks visuals in the editor, because those differences are so hard to notice when upgrading themes (e.g. alignment breaking, extra spacing appearing).

@bsessions85
Contributor Author

@kwight Here is an initial PR: #52161. Running the test command will give you a report of the differences. Re-running that command two or three times will usually yield failures of some sort.

@kwight
Contributor

kwight commented Apr 21, 2021

@bsessions85 Oh sweet, I'm curious to give it a run.

What are your early impressions at this point? Looks promising, or otherwise?

@bsessions85
Contributor Author

bsessions85 commented Apr 22, 2021

> What are your early impressions at this point? Looks promising, or otherwise?

At this point I don't think it is going to work well for the editor testing that we were hoping to get. Things just don't load in there consistently enough to make it an effective tool. Every second or third run will have something just slightly different from the reference file, and it will fail. For example, in the image below, 3 of the 4 tests passed, but this one failed because things are indented differently. The next run would probably pass, though, so updating the reference isn't the issue. It is just how it gets rendered from one run to the next.

[Screenshot: Screen Shot 2021-04-22 at 7 27 50 AM]

@simison
Member

simison commented Apr 22, 2021

Very surprising! Gutenberg version or theme didn't change in-between, right?

I would expect the output to stay consistent, of course; it's a WYSIWYG editor after all. It would be good to get to the bottom of why it keeps changing.

@bsessions85
Contributor Author

> Very surprising! Gutenberg version or theme didn't change in-between, right?

Not unless the Gutenberg version is on an A/B test or something. The theme for sure isn't changing.

@kwight
Contributor

kwight commented Apr 22, 2021

I tested this for a while today and got basically the same results as @bsessions85. Almost all of the problems, though, were with one template, Bowen. (I also hit a rare problem with Rivington that appears to be the snapshot being taken before all of the images can load into the slider. I don't know if this is something that can be "fixed" by waiting a little longer.)
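On the "waiting a little longer" question, one option might be to wait for the images themselves rather than for a fixed time. The sketch below is only an illustration under assumptions: `allImagesLoaded` and `waitForImages` are names I made up, and `PageLike` is a hand-written stand-in for the Puppeteer page object that BackstopJS passes to an onReady script. `waitForFunction` itself is a real Puppeteer page method that accepts a string expression and a `timeout` option.

```typescript
// Minimal structural types so the sketch does not depend on Puppeteer's typings.
interface ImageLike {
  complete: boolean;    // the browser has finished fetching (or failed)
  naturalWidth: number; // 0 when the image failed to load or decode
}

interface PageLike {
  waitForFunction(fn: string, options?: { timeout?: number }): Promise<unknown>;
}

// True when every image finished loading with a non-empty bitmap.
function allImagesLoaded(images: ImageLike[]): boolean {
  return images.every((img) => img.complete && img.naturalWidth > 0);
}

// The same predicate, expressed as a string evaluated inside the page.
const IMAGES_READY_EXPR =
  "Array.from(document.images).every((img) => img.complete && img.naturalWidth > 0)";

// Hypothetical helper: resolves once all images are loaded, rejects on timeout.
async function waitForImages(page: PageLike, timeoutMs: number = 10000): Promise<void> {
  await page.waitForFunction(IMAGES_READY_EXPR, { timeout: timeoutMs });
}
```

Note that a broken image reports `complete` with a `naturalWidth` of 0, so this wait would time out instead of silently capturing an empty slider. That may or may not be the behavior we want.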

The Bowen reference itself is actually wrong – the correct appearance is the test shot in the example of a failure above. However, I re-generated the references, got a correct Bowen reference, and then proceeded to have about the same number of failures. It does seem to be something weird with the Bowen template itself, though: after swapping it out for team, I was able to get an impressive run of full passes (broken only by another Rivington empty slider).

Fonts (and other styles?) are an issue in the testing too. They don't appear to be getting loaded by the testing instance (nor by the e2e testing site, for that matter).

| kwight2021.wordpress.com | e2eflowtesting3.wordpress.com | Backstop |
| --- | --- | --- |
| [Screenshot: Screen Shot 2021-04-22 at 2 41 22 PM] | [Screenshot: Screen Shot 2021-04-22 at 2 40 57 PM] | [Screenshot: Screen Shot 2021-04-22 at 2 46 20 PM] |

I feel like this is pointing to issues with the page template system plus style enqueuing, but I have no concrete code to point to or anything (I think this impression is from also seeing different thumbnails for the templates at different times?).

I noticed the test site used by Backstop is on the (now quite old) Twenty Fifteen theme. I mean, it shouldn't matter that it isn't a modern theme (and the default themes are better maintained than any others I believe), but it made me wonder if newer themes might handle blocks better. ¯_(ツ)_/¯

@bsessions85
Contributor Author

So it turns out there is a bug in the latest version of BackstopJS that causes that problem 🤦 I switched to an older version and everything is looking much better!

I took it a step further and loaded the editor outside of the iframe so that I can capture the whole template in the editor. With that, I am able to run tests against all the templates, and they pass!
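To give a rough idea of what capturing the whole template per scenario can look like, here is a sketch that builds one BackstopJS scenario per template. The template names are the ones mentioned in this thread; the base URL, the query parameter, and the `scenarioFor` helper are hypothetical and not the actual setup in the PR. The `"document"` selector is the BackstopJS keyword for capturing the full document instead of just the viewport.

```typescript
// Hand-typed subset of the BackstopJS scenario options used in this sketch.
interface Scenario {
  label: string;
  url: string;
  selectors: string[];
  delay: number;
}

// Template names mentioned in this thread.
const templates: string[] = ["bowen", "rivington", "team"];

// Hypothetical helper: the URL pattern is an assumption made for illustration.
function scenarioFor(baseUrl: string, template: string): Scenario {
  return {
    label: `editor-${template}`,
    url: `${baseUrl}?template=${encodeURIComponent(template)}`,
    selectors: ["document"], // capture the full page, not only the viewport
    delay: 1000,
  };
}

// One scenario per template, ready to drop into the "scenarios" array of a config.
const scenarios: Scenario[] = templates.map((t) =>
  scenarioFor("https://example.test/editor", t)
);
```

Generating the scenarios from a list keeps the per-template settings identical, which makes it easier to tell a flaky template apart from a flaky setup.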

I'm attaching the report so anyone can see it if they want to.
html_report.zip

Next step is to get it cleaned up and set it up to run in CI.

@griffbrad
Contributor

@Automattic/team-calypso-platform I wanted to loop you folks in here because this could be helpful for Gutenberg release testing and as a complement to e2es in some situations.

@scinos I know you’ve got experience with visual regression testing, so wanted you to have the chance to evaluate the approach here so far and provide input.
