gmaps: Add extended review fields (source, language, timestamps, reply) #272

Open
knoellp wants to merge 4 commits into gosom:main from knoellp:feature/extended-review-fields

knoellp commented May 15, 2026

This patch extends the Review struct in gmaps/entry.go with 17 additional fields that expose richer metadata from the listugcposts RPC response. All existing fields (Name, ProfilePicture, Rating, Description, Images, When) remain semantically unchanged, preserving backward compatibility with existing consumers.

The new fields cover four areas: review identity (ReviewID), source metadata (Source, RatingScale, AuthorURL, PostedAtUnixMicros, UpdatedAtUnixMicros), language and text (Language, TranslatedLang, TextOriginal, TextTranslated, RatingFloat), and owner reply (ReplyText, ReplyTextOriginal, ReplyLanguage, ReplyTranslatedLang, ReplyPostedAtUnixMicros, ReplyUpdatedAtUnixMicros).

Aggregator reviews (Tripadvisor, Trip.com, etc.) are detected by checking whether r[2][0] is nil. In that case, Rating stays at 0 (matching the previous behavior) and RatingFloat is read from r[2][8][1]. For native Google reviews, RatingFloat mirrors float64(Rating).
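
The detection rule above can be sketched roughly as follows. This is a minimal illustration, not the code in the patch: the helper names `at` and `ratings` are hypothetical, the sample arrays are made up to match only the index paths named in this description, and the native rating is passed in as a parameter because its source path is not stated here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// at walks a decoded JSON value along a path of array indexes.
// It returns nil if any step is out of range or not an array.
// (Hypothetical helper, not part of the patch.)
func at(v any, path ...int) any {
	for _, i := range path {
		arr, ok := v.([]any)
		if !ok || i < 0 || i >= len(arr) {
			return nil
		}
		v = arr[i]
	}
	return v
}

// ratings returns (Rating, RatingFloat) for one review array r.
// nativeRating stands in for whatever the existing parser extracted.
// Note that a missing r[2] is treated like a nil r[2][0] in this sketch.
func ratings(r any, nativeRating int) (int, float64) {
	if at(r, 2, 0) == nil {
		// Aggregator review: Rating stays 0, RatingFloat comes from r[2][8][1].
		f, _ := at(r, 2, 8, 1).(float64)
		return 0, f
	}
	// Native review: RatingFloat mirrors the integer rating.
	return nativeRating, float64(nativeRating)
}

func main() {
	var agg, native any
	// Made-up payloads that only reproduce the paths discussed above.
	_ = json.Unmarshal([]byte(`["id", [], [null, 0, 0, 0, 0, 0, 0, 0, [5, 4.5]]]`), &agg)
	_ = json.Unmarshal([]byte(`["id", [], [[5]]]`), &native)

	fmt.Println(ratings(agg, 0))
	fmt.Println(ratings(native, 5))
}
```

Bounds-checking every step keeps the parser from panicking if the RPC response changes shape, which seems likely for an undocumented endpoint.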

Reply fields use omitempty JSON tags and are only populated when the reply block at r[3] is present and non-nil, keeping serialized output clean for reviews without replies.
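
To illustrate the effect of `omitempty` on the serialized output, here is a small sketch using a trimmed-down struct. The struct name `review`, the helper `encode`, and the Go field types (`string`, `int64`) are assumptions for illustration; only the JSON tags are taken from this description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// review is a trimmed, hypothetical stand-in for the extended Review struct.
// Field types are assumed; the JSON tags follow the PR description.
type review struct {
	ReviewID                string `json:"review_id"`
	ReplyText               string `json:"reply_text,omitempty"`
	ReplyPostedAtUnixMicros int64  `json:"reply_posted_at_unix_micros,omitempty"`
}

// encode marshals a review and returns the JSON as a string ("" on error).
func encode(r review) string {
	b, err := json.Marshal(r)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// No reply: zero-valued reply fields are dropped from the output.
	fmt.Println(encode(review{ReviewID: "abc"}))
	// prints {"review_id":"abc"}

	// With a reply: the reply fields appear.
	fmt.Println(encode(review{
		ReviewID:                "abc",
		ReplyText:               "Thanks!",
		ReplyPostedAtUnixMicros: 1700000000000000,
	}))
	// prints {"review_id":"abc","reply_text":"Thanks!","reply_posted_at_unix_micros":1700000000000000}
}
```

One consequence worth keeping in mind: with `omitempty`, a consumer cannot distinguish "no reply" from "reply with an empty text", since both serialize without the key.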

The patch adds four test cases covering: a native review with a translated text and owner reply, a synthetic aggregator review, a rating-only review with no text body, and a native review without translation. All pre-existing tests continue to pass.

New fields summary

| Field | JSON tag | Source path |
| --- | --- | --- |
| ReviewID | review_id | r[0] |
| PostedAtUnixMicros | posted_at_unix_micros | r[1][2] |
| UpdatedAtUnixMicros | updated_at_unix_micros | r[1][3] |
| AuthorURL | author_url | r[1][4][2][0] |
| Source | source | r[1][13][0] |
| RatingScale | rating_scale | r[1][13][4] |
| Language | language | r[2][14][0] |
| TranslatedLang | translated_lang | r[2][14][1] |
| TextOriginal | text_original | r[2][15][0][0] |
| TextTranslated | text_translated | r[2][15][1][0] |
| RatingFloat | rating_float | r[2][8][1] (aggregator) / float64(Rating) (native) |
| ReplyPostedAtUnixMicros | reply_posted_at_unix_micros,omitempty | r[3][1] |
| ReplyUpdatedAtUnixMicros | reply_updated_at_unix_micros,omitempty | r[3][2] |
| ReplyLanguage | reply_language,omitempty | r[3][13][0] |
| ReplyTranslatedLang | reply_translated_lang,omitempty | r[3][13][1] |
| ReplyTextOriginal | reply_text_original,omitempty | r[3][14][0][0] |
| ReplyText | reply_text,omitempty | r[3][14][1][0] |

knoellp and others added 4 commits February 21, 2026 13:41
Integrates our 5 custom fixes (originally written against playwright.Page)
into the upstream scrapemate.BrowserPage interface introduced in eef673b.

Changes:
- job.go: ignore WaitForURL errors instead of returning (proxy latency fix)
- job.go: increase div[role='feed'] WaitForSelector timeout 700ms → 10s
- place.go: ignore WaitForURL errors instead of returning
- place.go: lower ExtraReviews threshold >8 → >0 (fetch reviews for all places)
- emailjob.go: add normalizeGoogleURL() to unwrap Google redirect URLs
  (/url?q=http://...) before creating the email extraction job

Dropped: playwright-specific debug infrastructure (setupPageListeners,
saveDebugInfo) — not compatible with scrapemate.BrowserPage.
Dropped: f9db0ea JS key search and ae900fb nil-check — already superseded
by upstream's eef673b implementation.
# Conflicts:
#	gmaps/emailjob.go
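
For context on the redirect-unwrapping change listed above, the idea can be sketched as below. This is not the `normalizeGoogleURL()` from the commit: the function name `unwrapGoogleRedirect` and the loose host check are assumptions, and the real implementation may handle more cases.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// unwrapGoogleRedirect is a hypothetical stand-in for normalizeGoogleURL.
// It returns the q parameter of a Google redirect link (/url?q=...),
// or the input unchanged when the link does not look like a redirect.
func unwrapGoogleRedirect(raw string) string {
	u, err := url.Parse(raw)
	if err != nil || u.Path != "/url" {
		return raw
	}
	// Accept relative links and google.* hosts only (deliberately loose check).
	host := strings.ToLower(u.Hostname())
	if host != "" && !strings.HasPrefix(host, "google.") && !strings.HasPrefix(host, "www.google.") {
		return raw
	}
	target := u.Query().Get("q")
	if target == "" {
		return raw
	}
	return target
}

func main() {
	fmt.Println(unwrapGoogleRedirect("/url?q=http://example.com/contact&sa=U"))
	fmt.Println(unwrapGoogleRedirect("https://example.com/about"))
}
```

Returning the input unchanged on any parse failure or mismatch keeps the helper safe to call on every website URL before the email extraction job is created.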

knoellp commented May 15, 2026

The CI failure in the "Go Vulnerability Check" step is a pre-existing issue unrelated to this PR. The check reports known vulnerabilities in the Go 1.26.2 standard library (net/mail, html/template) that were fixed in Go 1.26.3. The same failure occurs on main before any changes from this PR are applied. Once the toolchain is updated to Go 1.26.3+, the vulnerability check should pass.
