Export Area Raw Data has a bug in area_y2 for some of the areas #224

Open
numeroteca opened this issue May 9, 2019 · 1 comment
@numeroteca
Member

I am looking for a pattern that shows where the problem arises in the generated JSON. Not all areas are affected.

area_y2: ha.areas.first.y2,
https://github.com/montera34/pageonex/blob/master/app/models/threadx.rb#L315

@numeroteca numeroteca added the bug label May 9, 2019
@numeroteca
Member Author

Areas are defined by the coordinates of the first corner (top left: area_x1, area_y1) and the coordinates of the opposite corner (bottom right: area_x2, area_y2). In addition, the width and height of each area are calculated and provided in the downloadable JSON file (area_width and area_height).

This bug can be seen in this file: http://pageonex.com/numeroteca/corrupcion-spain-enero-2013/raw.json
Area 299 has an area_y2 value greater than the height of the image (which is 1049).
Screenshot from 2021-09-14 01-11-37

The height of the rectangle, area_height, is calculated correctly according to the rectangle drawn in the thread, but if you try to obtain the same value with area_y2 - area_y1 the result is wrong.
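The check described above can be sketched as follows. This is a minimal sketch, not code from PageOneX: it assumes the records in raw.json can be read as a flat array of objects carrying the field names quoted in this issue, the `id` field and the function name `findInconsistentAreas` are hypothetical, and the sample values are made up rather than taken from the real file.

```javascript
// Flags areas whose corner coordinates disagree with the exported height,
// or whose bottom edge lies below the image.
// Assumption: each record is a flat object using the field names from this issue.
function findInconsistentAreas(areas, imageHeight) {
  return areas
    .map((a) => ({
      ...a,
      // Height recomputed from the two corners, as described in the issue.
      height_new: a.area_y2 - a.area_y1,
    }))
    .filter(
      (a) => a.height_new !== a.area_height || a.area_y2 > imageHeight
    );
}

// Made-up sample values for illustration only.
const sample = [
  { id: 1, area_y1: 100, area_y2: 300, area_height: 200 },
  { id: 2, area_y1: 700, area_y2: 1300, area_height: 300 },
];

// 1049 is the image height mentioned in the issue.
console.log(findInconsistentAreas(sample, 1049).map((a) => a.id));
```

Passing the image height as a parameter keeps the sketch independent of how (or whether) the export stores it.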

I've looked through the four-year compilation of colorcorrupcion threads, and the behavior does not seem to follow a recognizable pattern or correlate with date, month, year, or newspaper size. There are buggy area_y2 values in all newspapers, dates and topics.

The diff column shows the difference between the height recomputed from the raw JSON file (height_new) and the height calculated by Ruby (area_height): height_new - area_height. The area_height value seems to be correct when checked graphically in a thread.

I've searched for y2 in the repository https://github.com/montera34/pageonex/search?p=1&q=y2 and I don't see anything odd. I am looking for a wrong calculation at the point where the data are exported to the raw JSON areas file, which seems to be defined here https://github.com/montera34/pageonex/blob/master/app/models/threadx.rb#L300.

Screenshot from 2021-09-14 01-25-46

Looking now at the ratio height_new / area_height, I see a pattern of values clustered around 1, 2, 3 and 4, strongest at 2, where the expected behaviour would be to see only 1:
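A tally like the one plotted above could be computed as in this sketch. It makes the same assumptions as before: flat records with the field names from this issue. The function name `tallyHeightRatios` and the sample values are hypothetical, and rounding to the nearest integer is just one way to bucket the ratios.

```javascript
// Counts how many areas fall near each integer value of
// height_new / area_height, where height_new = area_y2 - area_y1.
function tallyHeightRatios(areas) {
  const counts = {};
  for (const a of areas) {
    if (!a.area_height) continue; // skip zero or missing heights
    const ratio = (a.area_y2 - a.area_y1) / a.area_height;
    const bucket = Math.round(ratio);
    counts[bucket] = (counts[bucket] || 0) + 1;
  }
  return counts;
}

// Made-up sample values for illustration only.
const ratioSample = [
  { area_y1: 0, area_y2: 100, area_height: 100 }, // ratio 1
  { area_y1: 50, area_y2: 250, area_height: 100 }, // ratio 2
  { area_y1: 10, area_y2: 212, area_height: 100 }, // ratio 2.02
  { area_y1: 0, area_y2: 300, area_height: 100 }, // ratio 3
];

console.log(tallyHeightRatios(ratioSample));
```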
Screenshot from 2021-09-14 01-38-42

What could be the reason for that? What calculation artifact generates this?

I replicated the analysis for the width using area_x2. Although there are differences between the calculated width and the one in the raw JSON areas export file, they are much less significant.

PS: I see what may be a typo at

ha.y2 = ha.y2 + ha.height;
inside the function enableDragging.

ha.width = ui.size.width;
ha.x2 = ha.x1 + ha.width;
ha.height = ui.size.height;
ha.y2 = ha.y2 + ha.height;

In the last line I'd expect to see ha.y2 = ha.y1 + ha.height;, but I guess this has no influence on the discussion above.
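To see what the two variants of that last line do arithmetically, here is a small simulation. It is a sketch under assumptions, not a confirmed diagnosis: `applyResize` and `simulate` are hypothetical stand-ins rather than PageOneX code, the `ui.size` shape is copied from the quoted snippet, and the starting coordinates are made up. Whether the real handler runs this way, or affects the exported data at all, has not been verified.

```javascript
// Hypothetical stand-in for the four quoted lines.
// useY1 = false reproduces the line as written (ha.y2 + ha.height);
// useY1 = true uses the expected form (ha.y1 + ha.height).
function applyResize(ha, ui, useY1) {
  ha.width = ui.size.width;
  ha.x2 = ha.x1 + ha.width;
  ha.height = ui.size.height;
  ha.y2 = (useY1 ? ha.y1 : ha.y2) + ha.height;
  return ha;
}

// Applies a sequence of resize events to a made-up area and reports
// the stored height next to the height implied by the corners.
function simulate(events, useY1) {
  const ha = { x1: 10, y1: 100, x2: 110, y2: 300, width: 100, height: 200 };
  for (const size of events) {
    applyResize(ha, { size }, useY1);
  }
  return { height: ha.height, fromCorners: ha.y2 - ha.y1 };
}

// Two resize events that leave the size unchanged.
const events = [
  { width: 100, height: 200 },
  { width: 100, height: 200 },
];

console.log("as written:", simulate(events, false));
console.log("expected form:", simulate(events, true));
```

In this sketch the as-written line adds the height to y2 on every event, so y2 - y1 drifts away from the stored height, while the expected form keeps the two equal.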
