DevLog 02: Automating Wish Lists | Smarter Shopping with Price Monitoring
Remember when Mark Zuckerberg unveiled his AI assistant during the Christmas break last December? Everyone (including me) was super excited, and many people became especially interested in automation-related projects. That’s when I wrote the first post of this series, pledging to work on small-scale automation projects and write about them along the way.
Well… as expected, life happened, and all the crazy ideas, along with all the excitement, got lost somewhere in the abyss of procrastinated projects. Only recently did I find the time (and motivation) to pick the project up again and write the first “actual” automation script.
Online shopping has made life a lot easier for us couch potatoes around the world. We lie around half-dead, scroll a few pages on online stores, and summon the products we want to our doorsteps. I, however, am among those cheapskates who wait for sales season to fulfil their wish-lists.
Seemingly, these stores know about the existence of our species too. Often, many of our desired items don’t see any discounts even during sales season. And even when they do, impatience goes hand-in-hand with laziness anyway.
“Life is too short to wait for sales seasons” - Cheapo McCheapAss
Frustrated with not being able to buy an item for months, I came up with the idea of a script that could sit silently on my system, monitor the prices of the items I want in the background, and notify me whenever a price drops below a specified baseline.
With a little googling, I found some useful tools for data scraping with Python (I didn’t have any prior experience with Python programming before this), which made things very straightforward. The basic recipe consists of the following steps:
- Define a structured list of items along with your desired baseline prices.
- Write a web scraper for the chosen online store to fetch the search results.
- Match the prices for scraped items against those in the wish-list.
- Display Windows notifications and write the “ready to buy” items to a CSV file (openable in Excel) along with the product links.
- Schedule the script to run at Windows startup.
Throughout the rest of the post, I’ll demonstrate the procedure. As prerequisites, you’ll need Python along with the pip package manager installed on your system. I’d highly recommend using Anaconda as your default Python distribution, as it has almost all of the required modules pre-installed. To keep things simple, I’ll limit the code to a single online store, but with a little Python knowledge you can hopefully expand the script to work with as many stores as you want.
Having started learning just a week ago, I consider myself an absolute noob at Python right now. So if you see something stupid in the code, please feel free to correct me, because I only managed to get a “working” system together with the limited knowledge I have.
Step 01 : Defining The Wish-List
JSON is now a go-to format for structured data representation. We’ll be using it to define our so-called “wish-list”. Every item in the list will consist of the following three attributes:
description: A common description for the product that you can find on most stores. Since search results usually contain a lot of irrelevant stuff along with the required items, this description will help us identify matching items in the results.
name: This will be the search term that we send to the store’s search endpoint.
baseline_price: This will be the price at which we want to be notified about this item.
I have allowed the K notation as a shorthand for baseline_price values in the JSON, so we can optionally specify prices as 1K, 5K, etc. instead of whole numbers.
The wishlist.json file will look something like this:
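Here is a minimal sketch (the item names and prices are just placeholders, not recommendations):

```json
[
  {
    "name": "gtx 1070",
    "description": "ZOTAC GeForce GTX 1070 Mini 8GB GDDR5",
    "baseline_price": "48K"
  },
  {
    "name": "logitech g402",
    "description": "Logitech G402 Hyperion Fury Gaming Mouse",
    "baseline_price": 4000
  }
]
```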
Now, once the system is setup, all you have to do is manage this file to get notifications for desired items.
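To support the K shorthand, a small helper can normalize both forms into plain numbers. This is just a sketch; the function name is my own:

```python
def parse_price(value):
    """Convert a wish-list price like '48K', '1.5K', or 4000 into a float."""
    if isinstance(value, str) and value.strip().upper().endswith("K"):
        # "48K" -> 48 * 1000 = 48000.0
        return float(value.strip()[:-1]) * 1000
    return float(value)
```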
Step 02 : Writing The Web Scraper
Web scrapers are programs used to extract large amounts of data from websites, saving the extracted data in a structured format to a local file or a database.
Our simple scraper will send requests containing our search queries to the specified websites and then pick the target HTML attributes out of the returned results. You will have to examine the source of the websites you want to scrape before writing code for them. Most of them present data in a structured way with consistent CSS classes. Once the target tags have been identified, all you need to do is extract their values and parse them into relevant data structures.
To make things easier, Python has a brilliant, easy-to-use library for data scraping called Beautiful Soup. We will be using it to write our scraper.
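A sketch of the parsing half of the scraper is below. The tag and class names ("product", "product-title", "product-price") and the currency prefix are hypothetical; inspect your target store’s HTML to find the real ones. Fetching the page itself (e.g. with the requests library) is a separate, straightforward step.

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

def scrape_search_results(html):
    """Extract (title, price) pairs from a store's search-results page.

    The tag/class names below are placeholders for whatever the real
    store uses; adjust them after inspecting the page source.
    """
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for product in soup.find_all("div", class_="product"):
        title = product.find("span", class_="product-title").get_text(strip=True)
        price_text = product.find("span", class_="product-price").get_text(strip=True)
        # Strip currency prefix and thousands separators before converting
        price = float(price_text.replace("Rs.", "").replace(",", "").strip())
        results.append((title, price))
    return results
```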
Step 03 : Running The Scraper For Each Product In The Wish-List
Once the main component of our system, i.e. the scraper, is in place, we need to set up a routine that reads our wishlist.json file and parses its contents into data models. These model objects will then be passed to our scraper for further processing.
In the process, we’ll also create an available_wishlist_products.csv file that will carry all the latest ready-to-purchase items.
Step 04 : Writing The Utility Functions
You might have noticed calls to several not-yet-defined functions in the script so far (e.g. write_to_csv). In this last part of the script, we’ll write these utility functions.
Get Item Details
This utility function will prepare the data for writing to the CSV file by reformatting the text and generating a comma-separated string from it.
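A minimal sketch (the field names are my own assumption about what the scraper returns):

```python
def get_item_details(name, price, link):
    """Clean up the scraped fields and join them into one CSV-ready line."""
    name = name.strip().replace(",", " ")  # commas would break the CSV row
    link = link.strip()
    return "{},{},{}".format(name, price, link)
```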
Write Details To CSV File
As the name suggests, this function will open our output file and write the contents of available items to it along with the date.
Notice that this time we open the file with the a parameter (for appending), unlike in the main routine, where we recreate the file every time by opening it in write mode (w).
Check Similarity Between The Product Descriptions
Since search results often contain a lot of irrelevant items, it is important to check the resemblance between the description of our defined product and each item received from the search results. I have set a threshold of 45% similarity between descriptions for an item to qualify as a legitimate result.
Now, as you may have guessed, checking the likeness between two pieces of text seems like a very difficult task. Thankfully, Python comes to the rescue once again with its difflib module.
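One way to do this with difflib’s SequenceMatcher (a sketch; the function name and exact comparison details are my own):

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.45  # the 45% threshold discussed above

def is_similar(wishlist_description, result_description):
    """Return True when two product descriptions are at least 45% alike."""
    ratio = SequenceMatcher(None, wishlist_description.lower(),
                            result_description.lower()).ratio()
    return ratio >= SIMILARITY_THRESHOLD
```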
This function is pretty much useless but helps with debugging when the script is run from a console. You can omit it from your code if you don’t want to print results to the console.
Displaying Windows Notifications
The last part is to display Windows toast notifications for newly available items. If you are using Anaconda, this is the only module you’ll have to install yourself: win10toast (e.g. via pip install win10toast).
We’re simply creating an instance of ToastNotifier and supplying it with a title and a message to display, along with an optional icon path and a duration.
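A sketch of that wrapper; I’ve guarded the import so the script degrades gracefully on non-Windows systems (the wrapper function name is my own):

```python
# Requires: pip install win10toast (Windows only)
try:
    from win10toast import ToastNotifier
except ImportError:  # e.g. on non-Windows systems
    ToastNotifier = None

def show_notification(title, message, icon_path=None, duration=10):
    """Show a Windows toast; quietly skip when win10toast is unavailable."""
    if ToastNotifier is None:
        return False
    ToastNotifier().show_toast(title, message,
                               icon_path=icon_path, duration=duration)
    return True
```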
That was all the code required; the only thing left is to schedule the task.
Step 05 : Scheduling Auto Execution Of The Script
Our system is now up and running; you can launch it with Python from the command line. However, where’s the fun in that if you have to run it yourself every time?
In this final step, we’ll set our script up to run at every Windows startup.
- Open a text editor and write the following cmd commands.
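Something along these lines; the directory and script name are placeholders for your own paths:

```bat
@echo off
REM Paths below are placeholders; point them at your own project and script.
cd C:\path\to\project
python price_monitor.py
```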
- Save the file with a .bat extension. Try running the file and see if the program gets executed.
- Create a shortcut of this file and copy it to C:\Documents and Settings\[User Name]\Start Menu\Programs\Startup if you are running an older version of Windows, or to C:\Users\[User Name]\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup if you are running Windows 10.
That’s all! Your script is now ready to execute at every Windows startup. For a quick test, try signing out of Windows and signing back in.
In less than 150 lines of code, we created a very sophisticated price monitoring system that is easy to use and scale. Since starting to write this post, I’ve already expanded the script to work with 4 different local and international stores.
While this small system brings a lot of value, there are still some limitations. For example, the scraper for NewEgg worked for a day or two before the website identified the scraper and placed a captcha in front of the results. Even this is not a dead end: with the power of Python’s text and image recognition libraries, these small barriers can be overcome with just a few more lines of code. However, this means consistent maintenance and upgrading in order to keep the system working.