This project automates the extraction of product data (such as book details and prices) from an online bookstore and presents that data in a structured format for analysis and reporting.
Python: programming language
Scrapy: web scraping
Pandas: data manipulation
IPython: displaying HTML tables
XPath: extracting data from HTML
CSV: data storage
Automated Web Scraping: Extracted book information (image, price, link, and title) from the website using Scrapy.
Data Export: Saved the extracted data in a CSV file for further analysis.
Data Processing: Loaded the CSV data into a Pandas DataFrame to modify and format it.
HTML Visualization: Converted the DataFrame into an HTML table, including embedded images, to present the data in a visually appealing format.
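The data processing and HTML visualization steps above can be sketched roughly as follows. This is a minimal example under stated assumptions, not the project's actual code: the column names follow the four fields listed above, while the sample rows, the `build_html_table` helper, and the image width are hypothetical.

```python
import io

import pandas as pd

# Stand-in for the CSV written by the scraper (hypothetical rows).
CSV_TEXT = """image,price,link,title
https://example.com/media/one.jpg,$12.99,https://example.com/catalogue/book-one.html,Book One
https://example.com/media/two.jpg,$8.50,https://example.com/catalogue/book-two.html,Book Two
"""


def build_html_table(csv_source):
    """Load scraped rows and render them as an HTML table with embedded images."""
    df = pd.read_csv(csv_source)

    # Per-column formatters wrap raw URLs in HTML tags.
    formatters = {
        "image": lambda url: f'<img src="{url}" width="60">',
        "link": lambda url: f'<a href="{url}">view</a>',
    }

    # Lift the column-width limit so long cell strings such as <img> tags
    # are not shortened; escape=False keeps the generated tags as real HTML.
    with pd.option_context("display.max_colwidth", None):
        return df.to_html(escape=False, formatters=formatters, index=False)


html_table = build_html_table(io.StringIO(CSV_TEXT))
```

In a notebook, passing `html_table` to `IPython.display.HTML` renders the table with the images inline. Because `escape=False` inserts cell contents as raw HTML, this approach is only appropriate when the scraped values are trusted.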
The project automated the gathering and visualization of product data, reducing the manual effort required for data collection. The final output was a formatted HTML table that could be embedded into reports or used for market analysis, such as product comparison and price monitoring.