Fixes
- `subTitle` extraction works now
Fixes
- Blocked responses on the search page now properly retry the request (no more unhandled promise rejections)
- Smoother search page pagination
- More informative logs
- Fixed consent approval when the browser crashes
Fixes
- `maxCrawledPlaces` + `exportPlaceUrls` was giving an inconsistent number of results.
Features
- Added `allPlacesNoSearch` to input. This option allows you to scrape all places shown on the map without the need for any search term.
- Added `reviewsStartDate` to input to extract only reviews newer than this date.
- Added `radiusKm` to the `Point` type in `customGeolocation` (see the sketch below).
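A minimal input sketch combining these options (field names are from this entry; the date, the coordinate values, and the exact shape of the `customGeolocation` object are illustrative assumptions, see the Readme):

```json
{
  "allPlacesNoSearch": true,
  "reviewsStartDate": "2022-01-01",
  "customGeolocation": {
    "type": "Point",
    "coordinates": [-0.1278, 51.5074],
    "radiusKm": 5
  }
}
```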
Improvements
- `additionalInfo` extraction is faster now.
- `additionalInfo` extraction for hotels and similar categories is more complete now: data that is not displayed on the Google page but is present in the Google response is also extracted.
- Lowered the default zoom values. The previous setup made the scraping too slow and costly. The new defaults will speed up the scraping a lot while missing only a few places. You can still manually override the `zoom` parameter. New default values are:
  - `country` or `state` -> 12
  - `county` -> 14
  - `city` -> 15
  - `postalCode` -> 16
  - no geolocation -> 12
Fixes
- `location` extraction works in (almost) all cases now (search URLs and URLs with place IDs will always work).
Features
- Added `oneReviewPerRow` to input to expand reviews into one review per output row (see the sketch below)
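For example (a sketch; `maxReviews` is shown only for context and its value is illustrative):

```json
{
  "maxReviews": 100,
  "oneReviewPerRow": true
}
```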
Fixes
- `openingHours` extraction works in almost all cases now (search URLs and URLs with place IDs will always work).
- Start URLs now correctly work from uploaded CSV files or Google Sheets (previously, part of the URL was trimmed off).
- Changed the `polygon` input field to `customGeolocation`
- Added a deeper section to the Readme on how you can provide your own exact coordinates (see the sketch below)
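A sketch of the renamed field, assuming it accepts a GeoJSON-like geometry as the old `polygon` option did (the coordinates are illustrative, in [longitude, latitude] order, with the first and last points equal to close the ring):

```json
{
  "customGeolocation": {
    "type": "Polygon",
    "coordinates": [[
      [-0.151, 51.514],
      [-0.147, 51.522],
      [-0.131, 51.523],
      [-0.128, 51.512],
      [-0.151, 51.514]
    ]]
  }
}
```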
Breaking changes
We decided it is time to change several default parameters to make the user experience smoother. These changes should not have a big effect on current users.
- `city` and other geolocation parameters will take preference over `lat` & `long` if both are used (in 99% of cases, users want the automatic location splitting to get the most results, which doesn't work with direct `lat` & `long`)
- `zoom` will no longer have a default value of 12. Instead, it will change based on the geolocation type like this:
  - `country` or `state` -> 12
  - `county` -> 14
  - `city` -> 17
  - `postalCode` -> 18
  - no geolocation -> 12
Users will still be able to specify `zoom` and override this behavior. See the Readme for more details, and the sketch below.
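For example, to keep the previous behavior for city searches, pin `zoom` explicitly (a sketch; the city value is illustrative):

```json
{
  "city": "Prague",
  "zoom": 12
}
```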
Breaking change
- `reviewsSort` is now set to `newest` by default. This is because some places don't yield all reviews with other sortings (we are not sure if this is a bug or a silent block on Google's side).
Fixes
- `exportPlaceUrls` now properly dedupes the URLs
- Added a `categories` field listing all categories the place is listed in
Fixes
- Fixed `additionalInfo` for hotels
- Fixed `exportPlaceUrls` not checking for correct geolocation
Fixes
- The `website` field now displays the full URL. This fixes the issue of blank `facebook.com` links.
Fixes
- Fixed `additionalInfo` for the new layout
Fixes
- Improved reliability of scraping place details, reviews, and images (improved scrolling and back-button interaction)
Features
- Added `menu` to output
- Added `price` to output
Fixes
- Fixed `popularTimesHistogram`, which caused a crash on some pages
Fixes
- Fixed image extraction and made it optional (it should not crash the whole scrape)
Fixes
- Fixed `temporarilyClosed` and `permanentlyClosed`
- Added a step to normalize input start URLs because those with the wrong format don't contain JSON data
Fixes
- Fixed popular times live and histogram (https://github.com/drobnikj/crawler-google-places/pull/185, https://github.com/drobnikj/crawler-google-places/issues/181)
Fixes
- In about 10% of cases, the reviews are in the wrong order and there are fewer of them. We haven't found the root cause yet, but we retry the page so the output gets corrected.
Breaking fix
- If you did not pass `maxReviews` in the input at all (`undefined`), it scraped 5 reviews by default. That was against the input schema description, so it is now fixed to scrape 0 reviews in those cases.
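If you relied on the old implicit default, you now need to request reviews explicitly, e.g.:

```json
{
  "maxReviews": 5
}
```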
Fixes
- Fixed `placeId` extraction that was broken for some inputs
- Fixed missing `imageUrls`
Features
- Added option to input URLs with CID (Google My Business Listing ID) to start URLs, e.g. https://maps.google.com/?cid=12640514468890456789
- Added `cid` to output (see the sketch below)
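For example (the CID URL is the one from this entry; the `{ "url": ... }` request-object shape is the usual Apify convention):

```json
{
  "startUrls": [
    { "url": "https://maps.google.com/?cid=12640514468890456789" }
  ]
}
```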
Fixes
- Fixed `maxCrawledPlaces` not finishing quickly for large country-wide searches. `maxCrawledPlacesPerSearch` still has this problem.
Fixes
- Fixed a problem where `startUrls` sometimes did not pick up all provided URLs (due to automatic `uniqueKey` resolution)
- Fixed `likesCount` in reviews
Fixes
- `maxCrawledPlaces` now compares to the total sum of all places
Features
- Added `maxCrawledPlacesPerSearch` to limit the maximum number of places per search term or search URL (see the sketch below)
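A sketch distinguishing the two limits (values are illustrative; the search-term field is assumed to be `searchStringsArray`): `maxCrawledPlaces` caps the whole run, while `maxCrawledPlacesPerSearch` caps each individual search.

```json
{
  "searchStringsArray": ["restaurant", "cafe"],
  "maxCrawledPlaces": 500,
  "maxCrawledPlacesPerSearch": 100
}
```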
Fixes
- Address is now parsed correctly into components even when you supply direct place IDs
- Migrated code from `apify` 0.22.5 to 1.3.1
- Added `county` to geolocation options
Fixes (hopefully the last fixes after the layout change)
- Scraping all images per place works again
- Fixed `additionalInfo`
- Fixed `openingHours`
Fixes
- Fixed handling of search pages without results
- Skip empty searches that users sometimes accidentally submit
Features
- Added `orderBy` attribute to scraped results
Fixes
- Fully or partially fixed consent screen issues
- Should also help with `Failed to set the 'innerHTML' property on 'Element': This document requires 'TrustedHTML' assignment.`, which is caused by injecting jQuery into the consent screen
Fixes
- Fixed `reviewsTranslation`
Fixes after Google changed the layout; not everything is fixed yet. The next batch of fixes is coming ASAP!
- Fixed additional data
- Fixed search pagination getting into an infinite loop
- Fixed empty search handling
- Fixed reviews not being scraped
- Fixed `totalScore`
Warning
The next version will be a breaking one, as we will remove personal data from reviews by default. You will have to explicitly enable the fields below.
Features
- Added input fields to selectively pick which personal data fields to scrape: `scrapeReviewerName`, `scrapeReviewerId`, `scrapeReviewerUrl`, `scrapeReviewId`, `scrapeReviewUrl`, `scrapeResponseFromOwnerText` (see the sketch below)
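To keep the personal data after the upcoming breaking change, you would enable the fields explicitly (a sketch using the field names listed above; `maxReviews` is illustrative context):

```json
{
  "maxReviews": 50,
  "scrapeReviewerName": true,
  "scrapeReviewerId": true,
  "scrapeReviewerUrl": true,
  "scrapeReviewId": true,
  "scrapeReviewUrl": true,
  "scrapeResponseFromOwnerText": true
}
```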
Fixes
- Removed duplicate reviews; all reviews are now scraped correctly
- `reviewsSort` finally works correctly
- Review scraping is now significantly faster
- Handled an error that irregularly happened when scraping a huge amount of reviews
Features
- Added `reviewsDistribution`
- Added `publishedAtDate` (exact date), `responseFromOwnerDate` and `responseFromOwnerText` for each review
Fixes:
- `totalScore` and `reviewsCount` are now correctly extracted for all languages
- `startUrls` now correctly work on non-.com domains and on place detail pages
Fixes:
- A search keyword that links only to a single place (like "London Eye") now works correctly
Features:
- Address is parsed into `neighborhood`, `street`, `city`, `postalCode`, `state` and `countryCode` fields
- Added `reviewsTranslation` option to adjust how Google translates reviews from non-English languages
- Parsing ads. This means a bit more results. Those that are ads have the `"isAdvertisement": true` field.
- Added `useCachedPlaces` option to load places from your KV store. Useful if you need to scrape the same places regularly.
- Added `polygon` option to provide your own geolocation polygon.
Fixes:
- This one is big. We removed the infamous `Place is outside of required location (polygon)` error. The location of a place is now checked during pagination and such places are skipped. This means a massive speed-up of the scraper.
Features:
- Automatic screenshots of errors to see what went wrong
- Added `searchPageUrl` to output
- Added a `PLACES-OUT-OF-POLYGON` record to the Key-Value store. You can check which places were excluded.
Fixes:
- Fixed a rare bug with saving stats
- Improved review sorting, but it is still not ideal; more work needs to be done
- Added postal code geolocation to input
- Improved error messages when a location is not found
- Optimization: removed geolocation data from intermediate requests
- Fixed handling of the Google consent screen
- Better input validation and deprecation logs
- Changed the default for `maxImages` to `1` as it doesn't require scrolling for the main image
- `imageUrls` are returned with the highest resolution
- Removed `forceEng` input in favor of `language`
- The default setup now uses `maxImages: 0` and `maxReviews: 0` to improve efficiency
- Added several browser options to input: `maxConcurrency`, `maxPageRetries`, `pageLoadTimeoutSec`, `maxPagesPerBrowser`, `useChrome` (see the sketch below)
- Revamped input schema and Readme
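A sketch of the browser options (names are from the entry above; values are illustrative):

```json
{
  "maxConcurrency": 10,
  "maxPageRetries": 3,
  "pageLoadTimeoutSec": 60,
  "maxPagesPerBrowser": 5,
  "useChrome": false
}
```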
- Added `reviewerNumberOfReviews` and `isLocalGuide` to reviews
- Added a few extra review fields (ID, URL)
- Added an option for caching place location
- Added an option for sorting reviews
- Added stats logging
- Reworked the input search string
- Opening hours parsing (#39)
- Separate `locatedIn` field (#32)
- Updated the Readme
- Extract additional info: Service Options, Highlights, Offerings, ... (#41)
- Added `maxReviews` and `maxImages` (#40)
- Added `temporarilyClosed` and `permanentlyClosed` flags (#33)
- Allow scraping only place URLs (#29)
- Added `forceEnglish` flag to input (#24, #21)
- Added searching in a polygon using nominatim.org
- Added `startUrls`
- Added `maxAutomaticZoomOut` to limit how far Google can zoom out (it naturally zooms out as you press next page in search); see the sketch below
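For example, to stop paginating once Google has zoomed out more than a couple of levels (a sketch; the value is illustrative):

```json
{
  "maxAutomaticZoomOut": 2
}
```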