Notes on BeautifulSoup and Flask
Scraping prices from NTUC and Cold Storage.
To help our users get the best prices for their ingredients, we wrote a script that returns a list of products offered by NTUC and Cold Storage that are close matches for the ingredient needed.
This was achieved using two packages, BeautifulSoup and Flask.
A basic web service was created using Flask. It accepts POST requests that take a single attribute, query, and returns an array of food products, each with the following attributes:
title
measurement
price
supermarket
link
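The endpoint described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the /search route, the search_supermarkets helper and the sample values it returns are all assumptions made for the example.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def search_supermarkets(query):
    """Hypothetical stand-in for the scraping step; returns placeholder data."""
    return [
        {
            "title": f"Sample {query}",
            "measurement": "1 kg",
            "price": "0.00",
            "supermarket": "NTUC",
            "link": "https://example.com/sample-product",
        }
    ]


@app.route("/search", methods=["POST"])  # route name is an assumption
def search():
    # Accept the query as JSON or as form data (an assumption about the client).
    payload = request.get_json(silent=True) or request.form
    query = (payload.get("query") or "").strip()
    if not query:
        return jsonify({"error": "missing 'query'"}), 400
    # Respond with an array of products, each carrying the five attributes.
    return jsonify(search_supermarkets(query))
```

A client would then POST a body such as {"query": "olive oil"} and receive a JSON array back.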
We noticed that we could search the websites of online supermarkets simply by appending the search query to the end of the URL (an example of this is shown below).
Then, all we had to do was:
take in the query string
replace the spaces in the string with filler characters (Cold Storage uses "+", NTUC uses "%20")
append this string to the URL of the website
get an HTML web page returned by a simulated browser using this URL
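The steps above can be sketched with the standard library alone. The base URLs here are placeholders (the real search URLs are not given in these notes), and sending a browser-like User-Agent header is only one assumed way of approximating a simulated browser; the original script may have done this differently.

```python
from urllib.parse import quote, quote_plus
from urllib.request import Request, urlopen

# Placeholder base URLs (assumptions), not the supermarkets' real endpoints.
SEARCH_URLS = {
    "Cold Storage": "https://coldstorage.example.com/search?q=",
    "NTUC": "https://ntuc.example.com/search?query=",
}


def build_search_url(supermarket, query):
    """Encode the query the way each site expects and append it to the URL."""
    if supermarket == "Cold Storage":
        encoded = quote_plus(query)  # spaces become "+"
    else:
        encoded = quote(query)  # spaces become "%20"
    return SEARCH_URLS[supermarket] + encoded


def fetch_html(url):
    """Fetch a page while presenting a browser-like User-Agent (assumption)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")
```

For the query "olive oil", build_search_url produces a URL ending in "olive+oil" for Cold Storage and "olive%20oil" for NTUC.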
After this was done, we inspected the elements in the HTML document received and extracted the attributes that we wanted (e.g. title, measurement, price) from the elements that contained them.
For example, we got the title of the food product from the title attribute of the img element of each food product.
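As a rough illustration of this extraction step, the sketch below uses BeautifulSoup on a made-up HTML fragment. The markup, the "product" class name and the extract_titles function are assumptions for the example; the real pages are structured differently and need their own selectors.

```python
from bs4 import BeautifulSoup

# Invented sample markup; the real supermarket pages will differ.
SAMPLE_HTML = """
<div class="product">
  <a href="/product/1"><img src="a.jpg" title="Fresh Milk"></a>
</div>
<div class="product">
  <a href="/product/2"><img src="b.jpg" title="Soy Milk"></a>
</div>
"""


def extract_titles(html):
    """Return the title attribute of the img inside each product element."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for product in soup.find_all("div", class_="product"):
        img = product.find("img")
        # Skip products whose image is missing or has no title attribute.
        if img is not None and img.get("title"):
            titles.append(img["title"])
    return titles
```

The other attributes (measurement, price, link) would be read in the same way from whichever elements hold them.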
In the final step, we wrapped these attributes in an object for each product and appended it to an array, which is passed to the front end to be rendered.
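One possible shape for this final step is shown below. The Product dataclass and the collect_products helper are hypothetical names chosen for the sketch; the original may simply have used plain dictionaries.

```python
from dataclasses import asdict, dataclass


@dataclass
class Product:
    """One scraped food product, with the attributes listed earlier."""
    title: str
    measurement: str
    price: str
    supermarket: str
    link: str


def collect_products(rows, supermarket):
    """Wrap each scraped row in a Product and gather them into a list."""
    products = []
    for row in rows:
        products.append(Product(supermarket=supermarket, **row))
    # Convert to plain dicts so the list can be serialised as JSON.
    return [asdict(product) for product in products]
```

Each row is assumed to be a dict with title, measurement, price and link keys; the supermarket name is added while wrapping.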
The full implementation can be found in the following GitHub repo.