
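Because of various protections (reCAPTCHA, dynamic content, IP checks, and so on), scraping Google Maps in pure Python is a challenge even if you have experience in the field. Fortunately, there are many tools you can use from Python or other programming languages to scrape places and their reviews.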

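In this article you will see the two most common ways to scrape Google Maps: browser emulation and the Outscraper API. The tutorial should be enough for both individuals and big-data teams to extract the data.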

Scraping Google Places in Python

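You can try to scrape Google with pure Python, but within roughly 10 requests you will find your program blocked. This happens because Google validates sessions and expects JavaScript to make certain API calls in the background. Browser emulation is therefore needed to keep Google Maps from blocking the crawler. You can also route the traffic through proxies.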
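
As a rough illustration of the browser-emulation approach, the sketch below builds a Google Maps search URL and loads it in headless Chrome through Selenium, optionally via a proxy. It assumes Selenium 4 and a local Chrome install; the helper names and the proxy address are illustrative, and this is a starting point rather than a complete scraper.

```python
from typing import Optional
from urllib.parse import urlencode


def build_maps_search_url(query: str) -> str:
    """Return a Google Maps search URL for a free-text query."""
    return "https://www.google.com/maps/search/?" + urlencode({"api": 1, "query": query})


def fetch_rendered_html(url: str, proxy: Optional[str] = None) -> str:
    """Load the URL in headless Chrome so the page's JavaScript runs."""
    # Imported lazily so the URL helper works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")
    if proxy:
        # Hypothetical proxy address, e.g. "http://127.0.0.1:8080".
        options.add_argument(f"--proxy-server={proxy}")

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

Calling `fetch_rendered_html(build_maps_search_url("Restaurants near Los Angeles, USA"))` returns the rendered page source. Parsing it, scrolling the results list, and handling consent pages or blocks are left out of this sketch.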

To start building a Google Maps scraper with browser emulation, follow the steps described in this tutorial: How to Scrape Google Maps with Python and Selenium.

Scrape Google Places with Python in 6 Minutes

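Developing and maintaining a Google Maps scraper can take a lot of time and effort. For that reason, many engineering teams find it beneficial to focus on their core domain and use the Outscraper platform as a reliable data source for their applications.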

Scraping Google Maps with the SDK

1. You need Python 3+ and this Python package. Install the package by running the following command.

				
					pip install google-services-api
				
			

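2. Get your API key from the profile page.
3. Import the package and initialize it with the key. Specify a query and run the search, for example "Restaurants near Los Angeles, USA". Include the country and city in every query.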

				
					from outscraper import ApiClient


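# Create the client with the API key from your Outscraper profile page.
api_client = ApiClient(api_key='KEY_FROM_OUTSCRAPER')
# Search Google Maps; the region matches the country named in the query.
response = api_client.google_maps_search(
    'Restaurants near Los Angeles, USA',
    language='en',
    region='us',
    limit=100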
)
				
			

4. Wait a few seconds until the data is fetched.

				
					[
    {
        "name": "Sushi Gen",
        "full_address": "422 E 2nd St, Los Angeles, CA 90012",
        "borough": "Little Tokyo",
        "street": "422 E 2nd St",
        "city": "Los Angeles",
        "postal_code": "90012",
        "country_code": "US",
        "country": "United States of America",
        "us_state": "California",
        "state": "California",
        "plus_code": null,
        "latitude": 34.0466917,
        "longitude": -118.2385444,
        "time_zone": "America/Los_Angeles",
        "site": "http://www.sushigen-dtla.com/",
        "phone": "+1 213-617-0552",
        "type": "Sushi restaurant",
        "subtypes": "Sushi restaurant, Japanese restaurant",
        "posts": null,
        "rating": 4.6,
        "reviews": 1244,
        "reviews_data": null,
        "photos_count": 1261,
        "google_id": "0x80c2c63850150b9d:0x96cdcf9a38a24170",
        "place_id": "ChIJnQsVUDjGwoARcEGiOJrPzZY",
        "reviews_link": "https://search.google.com/local/reviews?placeid=ChIJnQsVUDjGwoARcEGiOJrPzZY&q=Restaurants+near+Los+Angeles,+USA&authuser=0&hl=en&gl=US",
        "reviews_id": "-7580174336512605840",
        "photo": "https://lh5.googleusercontent.com/p/AF1QipMtt0bMtJTF1bDTZ2tbXuWXtZTNWmvjJUu3ejKy",
        "working_hours_old_format": "Monday: Closed | Tuesday: 11AM\u20132PM,5\u20137:30PM | Wednesday: 11AM\u20132PM,5\u20137:30PM | Thursday: 11AM\u20132PM,5\u20137:30PM | Friday: 11AM\u20132PM,5\u20138PM | Saturday: 4\u20138PM | Sunday: Closed",
        "working_hours": {
            "Monday": "Closed",
            "Tuesday": "11AM\u20132PM,5\u20137:30PM",
            "Wednesday": "11AM\u20132PM,5\u20137:30PM",
            "Thursday": "11AM\u20132PM,5\u20137:30PM",
            "Friday": "11AM\u20132PM,5\u20138PM",
            "Saturday": "4\u20138PM",
            "Sunday": "Closed"
        },
        "business_status": "OPERATIONAL",
        "about": {
            "Service options": {
                "Takeout": true,
                "Delivery": false
            },
            "Health & safety": {
                "Mask required": true
            },
            "Highlights": {
                "Great tea selection": true
            },
            "Popular for": {
                "Lunch": true,
                "Dinner": true,
                "Solo dining": true
            },
            "Accessibility": {
                "Wheelchair accessible entrance": true
            },
            "Offerings": {
                "Healthy options": true,
                "Small plates": true
            },
            "Dining options": {
                "Dessert": true
            },
            "Amenities": {
                "High chairs": true
            },
            "Atmosphere": {
                "Casual": true,
                "Cozy": true
            },
            "Crowd": {
                "Groups": true,
                "Tourists": true
            },
            "Planning": {
                "Dinner reservations recommended": true,
                "Accepts reservations": true,
                "Usually a wait": true
            }
        },
        "range": "$$$",
        "reviews_per_score": {
            "1": 25,
            "2": 14,
            "3": 52,
            "4": 291,
            "5": 862
        },
        "reserving_table_link": null,
        "booking_appointment_link": null,
        "owner_id": "100930306416993024046",
        "verified": true,
        "owner_title": "Sushi Gen",
        "owner_link": "https://www.google.com/maps/contrib/100930306416993024046",
        "location_link": "https://www.google.com/maps/place/Sushi+Gen/@34.0466917,-118.2385444,14z/data=!4m8!1m2!2m1!1sSushi+Gen!3m4!1s0x80c2c63850150b9d:0x96cdcf9a38a24170!8m2!3d34.0466917!4d-118.2385444"
    },
    {
        "name": "Water Grill",
        "full_address": "544 S Grand Ave, Los Angeles, CA 90071",
        "borough": "Financial District",
        "street": "544 S Grand Ave",
        "city": "Los Angeles",
        "postal_code": "90071",
        "country_code": "US",
        "country": "United States of America",
        "us_state": "California",
        "state": "California",
        "plus_code": null,
        "latitude": 34.0490403,
        "longitude": -118.25470659999999,
        "time_zone": "America/Los_Angeles",
        "site": "https://www.watergrill.com/la",
        "phone": "+1 213-891-0900",
        "type": "Seafood restaurant",
        "subtypes": "Seafood restaurant",
        "posts": null,
        "rating": 4.4,
        "reviews": 1387,
        "reviews_data": null,
        "photos_count": 1429,
        "google_id": "0x80c2c7b4901ce7c7:0x2ad996c9a82b097c",
        "place_id": "ChIJx-cckLTHwoARfAkrqMmW2So",
        "reviews_link": "https://search.google.com/local/reviews?placeid=ChIJx-cckLTHwoARfAkrqMmW2So&q=Restaurants+near+Los+Angeles,+USA&authuser=0&hl=en&gl=US",
        "reviews_id": "3087664812393171324",
        "photo": "https://lh5.googleusercontent.com/p/AF1QipPX7-DP07mvQQSxDjDjuOVYj9HlQ2bvNJ_EEZYW",
        "working_hours_old_format": "Monday: 11:30AM\u20139PM | Tuesday: 11:30AM\u20139PM | Wednesday: 11:30AM\u20139PM | Thursday: 11:30AM\u20139PM | Friday: 11:30AM\u201310PM | Saturday: 4\u201310PM | Sunday: 4\u20139PM",
        "working_hours": {
            "Monday": "11:30AM\u20139PM",
            "Tuesday": "11:30AM\u20139PM",
            "Wednesday": "11:30AM\u20139PM",
            "Thursday": "11:30AM\u20139PM",
            "Friday": "11:30AM\u201310PM",
            "Saturday": "4\u201310PM",
            "Sunday": "4\u20139PM"
        },
        "business_status": "OPERATIONAL",
        "about": {
            "Service options": {
                "Takeout": true,
                "Dine-in": true,
                "Delivery": false
            },
            "Health & safety": {
                "Mask required": true,
                "Staff wear masks": true,
                "Staff get temperature checks": true,
                "Staff required to disinfect surfaces between visits": true
            },
            "Highlights": {
                "Great dessert": true
            },
            "Popular for": {
                "Lunch": true,
                "Dinner": true,
                "Solo dining": true
            },
            "Accessibility": {
                "Wheelchair accessible entrance": true,
                "Wheelchair accessible seating": true
            },
            "Offerings": {
                "Alcohol": true,
                "Beer": true,
                "Cocktails": true,
                "Coffee": true,
                "Hard liquor": true,
                "Healthy options": true,
                "Late-night food": true,
                "Small plates": true,
                "Vegetarian options": true,
                "Wine": true
            },
            "Dining options": {
                "Lunch": true,
                "Dinner": true,
                "Dessert": true,
                "Seating": true
            },
            "Amenities": {
                "Bar onsite": true,
                "High chairs": true,
                "Restroom": true,
                "Wi-Fi": true
            },
            "Atmosphere": {
                "Cozy": true,
                "Romantic": true,
                "Upscale": true
            },
            "Crowd": {
                "Family-friendly": true,
                "Groups": true,
                "Tourists": true
            },
            "Planning": {
                "Lunch reservations recommended": true,
                "Dinner reservations recommended": true,
                "Accepts reservations": true
            }
        },
        "range": "$$$",
        "reviews_per_score": {
            "1": 27,
            "2": 32,
            "3": 97,
            "4": 408,
            "5": 823
        },
        "reserving_table_link": "https://www.watergrill.com/la#RESERVATIONS",
        "booking_appointment_link": "https://www.watergrill.com/la#RESERVATIONS",
        "owner_id": "116715188720123884576",
        "verified": true,
        "owner_title": "Water Grill",
        "owner_link": "https://www.google.com/maps/contrib/116715188720123884576",
        "location_link": "https://www.google.com/maps/place/Water+Grill/@34.0490403,-118.25470659999999,14z/data=!4m8!1m2!2m1!1sWater+Grill!3m4!1s0x80c2c7b4901ce7c7:0x2ad996c9a82b097c!8m2!3d34.0490403!4d-118.25470659999999"
    },
    ...
]
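
Once the response arrives it is an ordinary list of dictionaries, so standard Python is enough to work with it. The sketch below assumes the flat list and the field names shown in the sample output above (`rating`, `business_status`, `reviews_per_score`); the helper name is hypothetical. It keeps operational places at or above a rating threshold and computes each one's share of five-star reviews.

```python
def top_places(places, min_rating=4.5):
    """Return (name, rating, five_star_share) for operational places at or above min_rating."""
    selected = []
    for place in places:
        if place.get("business_status") != "OPERATIONAL":
            continue
        rating = place.get("rating")
        if rating is None or rating < min_rating:
            continue
        # reviews_per_score maps "1".."5" to counts; it may be missing or null.
        scores = place.get("reviews_per_score") or {}
        total = sum(scores.values())
        share = scores.get("5", 0) / total if total else None
        selected.append((place["name"], rating, share))
    # Highest-rated first.
    return sorted(selected, key=lambda row: row[1], reverse=True)


# Trimmed copies of the two sample records above.
sample = [
    {"name": "Sushi Gen", "rating": 4.6, "business_status": "OPERATIONAL",
     "reviews_per_score": {"1": 25, "2": 14, "3": 52, "4": 291, "5": 862}},
    {"name": "Water Grill", "rating": 4.4, "business_status": "OPERATIONAL",
     "reviews_per_score": {"1": 27, "2": 32, "3": 97, "4": 408, "5": 823}},
]

for name, rating, share in top_places(sample, min_rating=4.5):
    print(f"{name}: {rating} ({share:.0%} five-star)")
```

With the default threshold only Sushi Gen passes; lowering `min_rating` to 4.0 returns both places, highest-rated first.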

				
			


FAQ

Most frequent questions and answers

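Because of various protections (reCAPTCHA, dynamic content, IP checks, and so on), scraping Google Maps in pure Python is a challenging task. For this reason, you can use a service that provides a ready-made solution, such as Outscraper.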

The SDK gives access to Outscraper's services and Outscraper's API, so you can scrape Google Maps data in Python.

All places on Google Maps can be scraped with the Outscraper Google Places API. All data belonging to a place is listed and can be exported.
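
For the export step, one option is to flatten the fields you need into CSV with the standard library. This is a minimal sketch: the column selection and the helper name are illustrative choices, and nested fields such as `working_hours` or `about` would need their own flattening.

```python
import csv
import io


def places_to_csv(places, columns=("name", "full_address", "phone", "site", "rating", "reviews")):
    """Serialise the selected fields of each place to CSV text."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys that are not listed in columns;
    # restval="" fills in columns a record does not have.
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(columns),
        extrasaction="ignore",
        restval="",
        lineterminator="\n",
    )
    writer.writeheader()
    for place in places:
        # JSON null arrives as None; write it as an empty cell.
        writer.writerow({key: ("" if value is None else value) for key, value in place.items()})
    return buffer.getvalue()
```

The returned text can be written to a file opened with `newline=""` and `encoding="utf-8"`, or handed to a spreadsheet tool.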

With the Outscraper Google Maps Reviews API, reviews can be scraped with Python and other programming languages.

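Scraping takes 5 steps:
1. You will need Python 3+ and the Python package. Install the package.
2. Get your API key from the Outscraper profile page.
3. Import the package and initialize it with the key.
4. Specify the location by providing a link, a place ID, or a name.
5. Wait a few seconds until the reviews are fetched.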
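
The five steps above can be sketched as follows. This assumes the SDK client exposes a `google_maps_reviews` method that accepts a list of queries plus `reviews_limit` and `language` keyword arguments, and that each returned place carries its reviews under `reviews_data` (the key that appears, as null, in the search output earlier). Confirm both against the SDK documentation before relying on them; the helper name is hypothetical.

```python
def fetch_reviews(client, location, reviews_limit=20, language="en"):
    """Return {place name: list of review records} for one location.

    `location` may be a Google Maps link, a place ID, or a place name.
    `client` is an initialised Outscraper ApiClient, or any object with a
    compatible google_maps_reviews method (the signature is an assumption).
    """
    places = client.google_maps_reviews(
        [location],
        reviews_limit=reviews_limit,
        language=language,
    )
    # reviews_data can be null when no reviews were fetched.
    return {place.get("name"): place.get("reviews_data") or [] for place in places}
```

With a client created as `ApiClient(api_key='KEY_FROM_OUTSCRAPER')`, a call such as `fetch_reviews(client, 'ChIJnQsVUDjGwoARcEGiOJrPzZY')` would request reviews for the Sushi Gen place ID from the sample output.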

