548 questions
0
votes
0
answers
30
views
requests-html html.render() works on Ubuntu laptop but not on headless Raspberry Pi
I'm trying to update a program that fetches comics each morning and emails them to me. The website I'm scraping has changed the page structure so the main image (the one I want) is now loaded by ...
2
votes
1
answer
88
views
python-requests-html render inconsistent result
background:
By default the website shows only a few names, and there's a "moreBtn" to generate the full list
code idea:
create an HTML session, render with a script clicking the "moreBtn"...
0
votes
0
answers
51
views
Browser listening on: ws://127.0.0.1:59948 - PyPartPicker/RequestsHTML
I've encountered this issue a few times while developing an app that uses PyPartPicker, but it has usually resolved itself when I fixed other problems. But now I'm in a bit of a stalemate.
I've just ...
0
votes
1
answer
49
views
Proxies with PyPartpicker (requests_html)...not being utilised?
I've just begun building a database updater that takes parts from PCPartPicker using PyPartPicker and uploads them to a Supabase database. I've set up async functionality, but I'm having trouble with ...
1
vote
0
answers
24
views
Automating js files to encrypt a value with python and requests-html
The request for Page B (2nd page) depends on a click on Page A (1st page), which triggers an event in another JS file to encrypt a value. How can I automate this with Python and requests/requests-html?
E.g. the content ...
2
votes
2
answers
113
views
Difficulty scraping an HTML page from a dynamically generated website with Python
I'm trying to retrieve some data from a website with Python. The website seems to generate its content with JavaScript, so I cannot use the standard requests library. I tried the module requests-html ...
0
votes
2
answers
68
views
Python "requests" - obtaining the most recent document submitted to a database
I have a MongoDB database that is accessed through an API via the "requests" package. Within the database there are numerous schemas we've set up; as an example, one of these is a "surveys" ...
0
votes
1
answer
218
views
Receiving [WinError 14001] when using Python request_html's .render
UPDATE: The issue was resolved after I reinstalled Python from a different source. I was previously using the Windows Store Python and am now using the download from python.org.
Same issue as this and ...
0
votes
1
answer
126
views
Trouble extracting JavaScript content while using html_requests
I am currently working on a web scraper, and for the most part it works quite well. I have been using Beautiful Soup to extract HTML content; to extract JavaScript content, I just started with ...
-2
votes
3
answers
141
views
How do I get text from an embedded map on a website?
I have written code that accesses a webpage and searches the page for the link to another website from the inspect element. After accessing that website, I need to get the zip code of an address ...
0
votes
1
answer
64
views
How do I run multiple GET requests with coroutines properly?
I'm learning requests-html and would like to know how to run multiple tasks asynchronously.
While attempting to perform an asynchronous task using requests-html, I encountered an error message stating ...
0
votes
0
answers
53
views
Limited results when looping over titles from an HTML span class with BeautifulSoup [duplicate]
# Import packages
import requests
from bs4 import BeautifulSoup
# Specify url: url
url = 'https://www.nts.live/shows'
# Package the request, send the request and catch the response: r
r = requests....
0
votes
0
answers
68
views
Are there any techniques for getting past JavaScript checks with BeautifulSoup?
So I have the following script:
#!/usr/bin/env python3
import requests
from bs4 import BeautifulSoup
def parse_marketwatch_calendar(url):
#page=requests.get(url).text
#soup=BeautifulSoup(page,...
0
votes
0
answers
171
views
Do multiple requests being sent at once slow each individual request down?
So I am working on a program that sends multiple (around 100) requests per second, each to a different website. The requests are sent asynchronously in around 10 different threads, so they aren't ...
0
votes
0
answers
165
views
I'm trying to use the pandas read_html() function to scrape data but need to pass in a username and password
I'm somewhat new to programming. The basic idea is that I'm working on a project and need to send a request to a website and scrape some data from it. However, the website I'm trying to retrieve data ...
0
votes
1
answer
203
views
python requests: connection problems
I'm pretty new to Python programming.
I have Python 3 and I installed requests via pip.
I can't connect to any site with .get...
Is it a firewall or some other connection problem? I'm clueless.
my code:
...
0
votes
1
answer
170
views
Scraping with requests-html on Raspberry Pi 4 fails with OSError: [Errno 8] Exec format error
Here's what I'm trying to achieve: I have a Raspberry Pi 4 with a Pimoroni Inky Frame. I want to use it for a sponsoring event (Tour for Life) to show the recent amount of gifted money.
This gifted amount ...
0
votes
1
answer
91
views
Python printing diamond with question mark instead of characters ' " -
I'm scraping from a website and it prints out the characters ' " - as �.
It copies this:
“It’s only after we’ve lost everything that we’re free to do anything.” – Fight Club, Tyler Durden
But ...
3
votes
0
answers
180
views
Python Script Stuck in a For Loop
I have this code that iterates through several links. For each one, it retrieves the HTML response and runs response_html.find('relative-time')
import pandas as pd
import requests_html
def main():
...
0
votes
0
answers
56
views
Get response after being redirected
I'm trying to change my request location without using any kind of proxy. I ended up with the following URL, which works perfectly in the browser and with the Selenium library (Python).
https://...
0
votes
1
answer
102
views
Trying to scrape a dynamic website in python with requests_html
When I try to scrape this site I run into an issue and I can't figure out what's wrong. I tried using HTMLSession, but Python told me to use AsyncHTMLSession because the former can't perform ...
0
votes
0
answers
165
views
Python code stopped working without any changes in code. Installing older version of packages doesn't help. What changed?
I wrote some code Monday last week that worked. By Wednesday, it did not work. I am scraping data from a website, and by Wednesday the code:
driver = uc.Chrome(use_subprocess=True)
produced error ...
0
votes
1
answer
65
views
BeautifulSoup AttributeError: 'get_text' sometimes in the same code
Does anyone know where this problem comes from? I run the same code within a few seconds, and sometimes it gives me that error and sometimes it doesn't.
page = requests.get(URL, headers=headers)
...
0
votes
1
answer
286
views
'/xad' appearing in list of strings in python code
Firstly, I am a beginner, just bordering on intermediate with python, so please be patient with my approach to this problem. I was working on a web scraping mini project using lxml etree and requests (...
0
votes
1
answer
112
views
Python - How to add NTLM Authentication in Requests_HTML?
I want to pass an authentication object when creating a GET request in requests_html.
I also want to pass a file path for the certificate. This is what I have so far.
def get_url():
s = HTMLSession(...
1
vote
0
answers
32
views
Processing Audio using Post in python
I am working on an app that uses a recording of the user, and I am using requests to send a WAV file to a Flask backend.
Here is the code in my HTML file; it logs the type of the audio, which designates ...
10
votes
11
answers
20k
views
Python request-html is not downloading Chromium
import requests
from bs4 import BeautifulSoup
from requests_html import HTMLSession
url="https://dmarket.com/ingame-items/item-list/csgo-skins?title=recoil%20case"
sesion = HTMLSession()
...
0
votes
1
answer
400
views
Difference between strings and stripped_strings in BeautifulSoup
What is the difference between strings and stripped_strings in BeautifulSoup?
import requests
from bs4 import BeautifulSoup
url = "https://codewithharry.com"
r = requests.get(url)
htmlcontent = r....
2
votes
1
answer
109
views
Scraping using BS4 and Requests-HTML works only on first page, then ('NoneType' object has no attribute 'find')
I'm new to web scraping using Python and BeautifulSoup and I'm trying to extract car data (model, price, etc.) from a public site with Requests-HTML. I can successfully output the data I need from ...
1
vote
1
answer
108
views
Website Scraping: Can't find the correct URL that brings me the data I want when I use the Network tab in Chrome Devtools
I'm trying to scrape a radio station website to get the current charts (https://www.energy.de/programm/energy-euro-hot-30 and then https://music.apple.com/de/playlist/energy-euro-hot-30/pl....
1
vote
1
answer
329
views
Python 3 and Requests-Html: Trying to scrape a website - not getting the "real" html code back
I'm trying to scrape a website, but I'm not getting the correct, analyzable code back.
I am using Python 3.12 and the requests-html module to scrape the websites. For some of them it works without ...
1
vote
1
answer
256
views
Requests library in Python keeps running and doesn't return any response
I'm trying to send a request to a website, but when I run my code it runs forever and gets no response. Could someone please help? Here is my code:
import requests
req = requests.session()
url = ...
0
votes
2
answers
126
views
Amazon product data scraping: inconsistent data
I am learning web scraping. As part of a mini-project, I am scraping the Amazon.com website for product reviews, review titles, review descriptions, and user names using BeautifulSoup and ...
-1
votes
2
answers
55
views
Struggling to extract JSON from a web page
I am trying to scrape the window.PRELOAEDED_STATE from the following URL using requests.json, but I can't isolate the element I want so that I can use the json function on it.
I tried the below code first....
0
votes
1
answer
56
views
Get page with requests without response or status code
I use the following source code:
import requests
url = "https://www.baha.com/nasdaq-100-index/index/tts-751307/name/asc/1/index/performance/471"
web = requests.get(url)
print(web....
0
votes
1
answer
1k
views
Python webscraping with Chromium browser cannot load Javascript but Chrome can
I'm trying to scrape a particular URL using the requests-html module, which uses the Chromium browser. However, Chromium couldn't load what seems to be the JavaScript portion and triggers a timeout error. I ...
0
votes
1
answer
92
views
change the language requests_html python
I have a site that I'm trying to scrape using requests_html,
but it only returns Arabic; I need the English text of the title, etc.
import pandas
from requests_html import HTMLSession
import time
...
1
vote
2
answers
310
views
Scraping Data From Website Using Selenium/Requests/Pandas
I am trying to scrape the table on this website (https://www.cmegroup.com/markets/fx/g10/canadian-dollar.settlements.html). I tried using the requests library, pandas, and selenium, but to no avail. ...
-1
votes
1
answer
337
views
Unable to load JavaScript and got pyppeteer error from webpage with requests [closed]
I'm trying to scrape a webpage after login.
If I use only BeautifulSoup and requests I get
Please enable JavaScript to continue using this application.
So, I decided to use requests_html with the ...
0
votes
0
answers
21
views
Beautiful Soup returning XX instead of actual values [duplicate]
I am trying to scrape a website using requests and Beautiful Soup.
But I am getting 'XXX' instead of the text within the tags.
code:
url = 'https://www.discover.com/student-loans/'
req = opener....
-1
votes
1
answer
281
views
How to get all elements from javascript rendering page with Python and Requests-HTML
I am learning web scraping and I installed requests-html. I want to scrape all "a" elements with class="screener-link-primary" from this page: finviz
This is my code:
from bs4 ...
0
votes
1
answer
617
views
Scraping football matches data using Python
I am extracting football results with their odds, I have created 2 for-loops, one for the extraction of team names and results and one for the odds. Every single for loop works well, but I don't know ...
-5
votes
1
answer
1k
views
r.html.render in requests-html does not work in Python 3.8 [closed]
I want to use requests-html and BeautifulSoup to get HTML, but the function response.html.render() didn't work.
from requests_html import HTMLSession
session = HTMLSession()
response = session.get('...
3
votes
0
answers
188
views
Corrupt file after using requests.get() in Python
I am trying to use requests.get() to download files from the assist.org website for a research project. Specifically, when you go to the website they have a box for articulation agreements. While it ...
1
vote
2
answers
170
views
Selenium scroll flickr page to get all the images
Hello, I am trying to get public Flickr images from a Flickr group. I am able to parse the HTML and get the image href, but I'm struggling to find a way to get all the images from a page as we scroll ...
0
votes
1
answer
30
views
Get Element By Class Doesn't Work in Requests_html
Hi everyone, I am trying to get data from a website called 'biletix.com'. I have tried to solve the problem that I am facing. I have looked at the docs and Stack Overflow questions, but I still don't ...
1
vote
1
answer
45
views
Opening a link after manually building it works, but from code it does not
I have a website that uses 2 API calls to build the actual link to download a gzip file. The problem is that the headers change a lot, I think, and the cookies too. I tried finding out ...
-1
votes
1
answer
90
views
Download a .csv file using requests.get() in Python
I want to download a .csv file from this page https://data.anbima.com.br/certificado-de-recebiveis?view=precos, using requests.get(). When I inspect the page, there is no direct link to the file.
...
0
votes
1
answer
203
views
Python, I want to log in to my Google account with the Requests library
I tried to log in to my Google account with the requests library, but I couldn't. Can anyone help?
I couldn't figure out how to post the data; it's too complicated, or I'm dumb... I can see the data in the form ...
0
votes
0
answers
514
views
AttributeError: 'Future' object has no attribute 'html'
Using the requests_html library
I get an AttributeError exception:
"AttributeError: 'Future' object has no attribute 'html'"
Code:
asession = requests_html.AsyncHTMLSession()
r = await asession....