How to scrape nested divs with BeautifulSoup?
I'm trying to scrape data from a website. First I authenticate and start a session, and that part works without problems. What I want to scrape is my test questions: a test contains 100 questions and has a unique URL that only members can access.
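import bs4
import requests

# loginURL, payLoad and targetURL are set earlier (not shown here)
with requests.session() as s:
    s.post(loginURL, data=payLoad)  # log in; the session keeps the cookies
    res = s.get(targetURL)
    res.raise_for_status()
    soup = bs4.BeautifulSoup(res.text, "html.parser")
    elems = soup.find_all("div", class_="Question-Container")
    print(elems)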
When I run this code, I don't get the data I want.
The output looks like this:
[<div class="Questionboard-body Question-Container">
  <div class="clearfix">
    <div class="text-right">
      <span><b>Question Id: </b></span><span class="DisplayQNr"></span>
    </div>
  </div>
  <div class="QuestionText">
    <div class="qText"></div>
  </div>
  <div class="QuestionOptions" hideanswer="false"></div>
  <div class="QuestionSolution" hideanswer="false">
    <button class="showSolutionBtn btn btn-primary-alt">Show Solution</button>
    <div class="QuestionCorrectOptions text-center"></div>
    <div class="DetailedSolution text-center"></div>
  </div>
</div>]
The output I want is the data inside those elements.
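In the output above the target elements are present but contain no text. That could mean the site fills them in with JavaScript after the page loads, although this is only an assumption, since requests sees just the initial HTML. A minimal sketch for checking which elements come back empty is below. SAMPLE_HTML is a hypothetical stand-in that copies the printed structure, and empty_targets and TARGET_CLASSES are illustrative names, not part of any library.

```python
import bs4

# Hypothetical sample that mirrors the printed output; it is not fetched
# from the real site.
SAMPLE_HTML = """
<div class="Questionboard-body Question-Container">
  <div class="clearfix"><div class="text-right">
    <span><b>Question Id: </b></span><span class="DisplayQNr"></span>
  </div></div>
  <div class="QuestionText"><div class="qText"></div></div>
  <div class="QuestionOptions" hideanswer="false"></div>
</div>
"""

# Class names taken from the output in the question.
TARGET_CLASSES = ["DisplayQNr", "qText", "QuestionOptions"]


def empty_targets(html):
    """Return the target class names whose first match is missing or has no text."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    empty = []
    for name in TARGET_CLASSES:
        elem = soup.find(class_=name)
        if elem is None or not elem.get_text(strip=True):
            empty.append(name)
    return empty


print(empty_targets(SAMPLE_HTML))
```

If every class is reported as empty for res.text as well, the text is probably not in the HTML that requests receives, and no BeautifulSoup selector will find it there.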
The div tree looks like this. There are a lot of divs: class="DisplayQNr" holds the question ID, and there is another div, QuestionText, but the question text itself is inside class="qText". Each question has four options in class="QuestionOptions", and so on. I want to scrape all of them. An image is attached for clarity.
[Image: screenshot of nested divs]
And this is how it looks on the original website: [Image: original page to scrape]
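To make the goal concrete, here is a rough sketch of the extraction described above, run against made-up markup. It assumes the text is actually present in the HTML, which is not the case in the output shown earlier. The question ID, question text and options in SAMPLE_HTML are invented for illustration, the assumption that each option is a direct child div of QuestionOptions is a guess, and text_of and parse_questions are hypothetical helper names.

```python
import bs4

# Hypothetical populated markup: the class names come from the question,
# but the ID, question text and options are made up.
SAMPLE_HTML = """
<div class="Questionboard-body Question-Container">
  <div class="clearfix">
    <div class="text-right">
      <span><b>Question Id: </b></span><span class="DisplayQNr">101</span>
    </div>
  </div>
  <div class="QuestionText">
    <div class="qText">What is 2 + 2?</div>
  </div>
  <div class="QuestionOptions" hideanswer="false">
    <div>3</div><div>4</div><div>5</div><div>6</div>
  </div>
</div>
"""


def text_of(parent, tag, css_class):
    """Return the stripped text of the first matching descendant, or None."""
    elem = parent.find(tag, class_=css_class)
    return elem.get_text(strip=True) if elem else None


def parse_questions(html):
    """Collect id, text and options for every Question-Container div."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    questions = []
    for container in soup.find_all("div", class_="Question-Container"):
        options_div = container.find("div", class_="QuestionOptions")
        # Assumption: each option is a direct child div of QuestionOptions.
        options = (
            [d.get_text(strip=True) for d in options_div.find_all("div", recursive=False)]
            if options_div else []
        )
        questions.append({
            "id": text_of(container, "span", "DisplayQNr"),
            "text": text_of(container, "div", "qText"),
            "options": options,
        })
    return questions


print(parse_questions(SAMPLE_HTML))
```

Searching from each container rather than from the whole soup keeps the ID, text and options of one question together, which matters when a page holds many questions.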