Skip to main content

Price Comparer website using django and web scrapping


Our website compares the prices of any given product which is available on the amazon and filpkart, thus shows you the result of price available on both site along with the suitable links. To develop this project we have used the python web framework known as Django. Web pages are made using the bootstrap HTML they are web.html and about.html.

 fig directories of Project

/PriceCompare directory is the main folder generated by command

$ django-admin startproject PriceCompare

Which contains the manage.py file that helps to run the server and host our web on localhost

/app1 folder is created  by a command

$ python manage.py startapp app1

Here will find all the required files like views.py, urls.py, etc.

/Template directory contains all the HTML files

 

Views.py file contains the code which compare the user given data. We have used the selenium module of python to scrap the web data which are allowed to scrape. Selenium webdriver is used to automate web application testing. To create the object of the web driver to use its various functions.

from selenium import webdriver as wb

wbd = wb.Chrome("C:/Users/R computer/Downloads/chromedriver.exe")

 we need to download the chromedriver.exe file these helps the selenium to perform actions on Chrome browser, ChomeDriver is a standalone server which implements webdriver protocol for chrome.

We want  to scrape the to main  web site so  the  base url will be

 _BASE_URL = "https://www.amazon.in/"
_BASE_URL2 = "https://www.flipkart.com/"

As this links will open only the home page not the page of the search result of the products that user enters.

To do that

search_url = _BASE_URL + ("s?k=%s" % (a))                   #amazon
search_url2 = _BASE_URL2 + ("search?q=%s" % (a))            #flipkart

This will concatenate the base URL with the keyword a that contains the product name which user enters along with the s?k=%s which is part of URLs that searches for specific given data i.e. a and same for flipkart.

wbd.get(search_url)                                          #amazon
wbd.get(search_url2)                                         #flipkart

get() method launches a new browser and opens the specified url in the browser instance.

There are various methods available to scrape the details from website they are find_elements_by_class_name(), find_elements_by_tag_name(), find_element_by_id(),  find_elements_by_XPATH(), find_elements_by_linkText(), etc.

But in our project  we have used only  find_elements_by_class_name() method.

price = wbd.find_elements_by_class_name('a-price-whole')     #amazon
price2 = wbd.find_elements_by_class_name('_1vC4OE')         #flipkart

here class ‘a-price-whole’ of amazon contains the price of products. It means every product price has the same class name and same class name '_1vC4OE'  of flipkart contain prices of products.

So need to find only the class name which store the price of product. But this method will fetch all the prices of those product present after search results as a list. As we know our actual search result will be the first element of list

p1 = price[0].text                                           #amazon
p2 = price2[0].text                                         #flipkart

.text will help you get only the text and exclude all the remaining HTML part from it

Hence you will get the price of your required product

 

Now in Django whatever pages are made they should be given path in both the app1/urls.py as well as in PriceCompare/urls.py file

In app1/urls.py file:

path('', views.home, name='home page'),

This path is for home page views.home  represent the home() method presents in views.py file which return render(request, 'web.html') open web.html page on the browser.

path('expression', views.expression, name='expression value'),

This path expression is form action and method in views.py file  which performs the web scraping of the data which given by the user as explain above and return the two different prices of same product render(request, 'web.html', {'aprice':p1, 'alink':l1, 'fprice':p2, 'flink':l2})  as well as return the search URL so that the user can directly jump on the specific site to buy the better price product.

path('about', views.abouts, name='about page'),

This path s for about page views.abouts  represent the abouts() method presents in views.py file which return render(request, 'about.html') open about.html page on browser.

 

In PriceCompare/urls.py file:

path('', include('app1.urls')),
path('/about', include('app1.urls')),

the first path is of a home page which is the main page of  our website so the first parameter is empty so the URL can be http://127.0.0.1:8000 and the second path is for the about page so the URL can be http://127.0.0.1:8000/about

To run the the project we need the command

$ python manage.py runserver

This will launch the website.

We have developed this project using pycharm IDE

 

Conclusion: Hence we learn the two more new concept of python i.e web scrapping and Django

Github link: https://github.com/AmreenKhan1003/PriceCompare.git

Comments

Post a Comment

Popular posts from this blog

OOP USING C++ ROADMAP BY LOVE BABBAR

  What is Object-Oriented Programming? Object-oriented programming is a programming paradigm based on the concept of "objects", which can contain data and code: data in the form of fields, and code, in the form of procedures. A feature of objects is that an object's own procedures can access and often modify the data fields of itself. Object Oriented Programming is considered as a design methodology for building non-rigid software. In OOPS, every logic is written to get our work done, but represented in form of Objects. OOP allows us to break our problems into small unit of work that is represented via objects and their functions. We build functions around  objects.   There are mainly four pillars (features) of OOP. If all of these four features are presented in programming, the programming is called  perfect Object Oriented Programming. Abstraction Encapsulation Inheritance Polymorphism disadvantages of object-oriented programming include: S...

DBMS_notes roadmap by love babbar

  DATABASE : A database is an organized collection of structured information, or data, stored electronically in a computer system. DATASTRUCTURE USED BY DATABASE: The most frequently used data structures for one-dimensional database indexes are dynamic tree-structured indexes such as B/B+-Trees and hash-based indexes using extendible and linear hashing. DBMS(DATABASE MANAGEMENT SYSTEM): A Database Management System (DBMS) is software designed to store, retrieve, update, define, and manage data in a database. NEEDS OF DBMS: ü   Controlling redundancy(redundancy refers to repeated instances(The situation where a data or information is stored in the database at a particular moment of time is called an instance.) of the same data.) and inconsistency. A DBMS uses data normalization to avoid redundancy and duplicates. ü   Efficient memory management and indexing ü   Concurrency control (Concurrency Control in Database Management System is a procedure of ma...