Books Datasets for Programmatic SEO


Looking to create a programmatic SEO website in the books niche?

You have come to the right place. We have collected 10 of the best books datasets for programmatic SEO, most of them free, that you can download or access for your projects.

Let’s take a look…

10 useful books datasets for pSEO

Along with a brief description of each dataset, we have included the format(s) it is available in.

1. Goodreads Books

Available format(s): CSV

A dataset containing a comprehensive list of books on Goodreads, with features such as book title, author, publication date, rating, and number of ratings. It has 10,000+ records.
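As a rough illustration of how a CSV like this can feed a programmatic SEO site, here is a minimal sketch that turns each row into a page slug and title. The column names (`title`, `author`, `average_rating`) and the sample rows are made up for the example; check the real file's header before relying on them.

```python
import csv
import io
import re

# Made-up sample rows; the real dataset's column names may differ.
SAMPLE_CSV = """title,author,average_rating
Example Book One,Jane Doe,4.2
Another Example: Part 2,John Roe,3.9
"""

def slugify(text):
    """Lowercase the text and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def build_pages(csv_text):
    """Turn each CSV row into a dict describing one landing page."""
    pages = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        pages.append({
            "slug": slugify(row["title"]),
            "title": f'{row["title"]} by {row["author"]}',
            "rating": float(row["average_rating"]),
        })
    return pages

pages = build_pages(SAMPLE_CSV)
for page in pages:
    print(page["slug"], "|", page["title"], "|", page["rating"])
```

In a real project you would read the downloaded file with `open(...)` instead of the in-memory string and pass each page dict to your template.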

2. Book Cover

Available format(s): CSV

A dataset of 207,572 books from the Amazon marketplace, containing the cover image, title, author, and category for each book. It is split into two tasks: a classification task of classifying books by cover image, with a training/test split of 90%/10%, and a data mining task of exploring the entire book database in 32 classes.
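If you want to reproduce a 90%/10% split on your own copy of the data, a minimal sketch could look like the following. This is a generic seeded shuffle, not the split shipped with the dataset; the function name and the `seed` parameter are assumptions for the example.

```python
import random

def split_train_test(items, train_fraction=0.9, seed=0):
    """Shuffle a copy of the items and cut it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in for real records: 100 integer IDs.
train, test = split_train_test(range(100))
print(len(train), len(test))
```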

3. Books API

Available format(s): JSON

The Books API provides information about book reviews and The New York Times Best Sellers lists, including best seller list names, list data, and book reviews by author, ISBN, and title.

4. Amazon Top 50 Bestselling Books 2009 – 2019

Available format(s): CSV

This dataset contains a list of 550 books that were top 50 bestsellers on Amazon from 2009 to 2019 (50 per year over 11 years). It includes each book’s name, author, user rating, number of reviews, price, year of release, and genre.

5. Subset of the books available in Amazon

Available format(s): CSV

This dataset includes a subset of the books available on Amazon, along with user ratings. It has three tables: one for users, one for books, and one for ratings. Explicit ratings are on a scale of 1-10, and implicit ratings are recorded as 0. Data points include book, publisher, year of publication, author, etc.
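Because implicit ratings are stored as 0, they should be left out of any average you show on a page. Here is a minimal sketch of that filtering step; the tuple layout `(user_id, isbn, rating)` and the sample values are hypothetical, not the dataset's actual schema.

```python
from collections import defaultdict

# Hypothetical rows: (user_id, isbn, rating). A rating of 0 is implicit.
ratings = [
    (1, "isbn-a", 8),
    (2, "isbn-a", 0),
    (3, "isbn-a", 10),
    (1, "isbn-b", 0),
    (2, "isbn-b", 5),
]

def average_explicit_ratings(rows):
    """Average the 1-10 ratings per book, skipping implicit (0) ratings."""
    totals = defaultdict(lambda: [0, 0])  # isbn -> [sum, count]
    for _user_id, isbn, rating in rows:
        if rating == 0:
            continue  # implicit interaction, carries no score
        totals[isbn][0] += rating
        totals[isbn][1] += 1
    return {isbn: total / count for isbn, (total, count) in totals.items()}

averages = average_explicit_ratings(ratings)
print(averages)
```

Books that only have implicit ratings are simply absent from the result, so a template can fall back to "not yet rated" for them.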

6. HAPI Books

Available format(s): JSON

HAPI Books is an API that provides access to thousands of book records including title, genre, author, year, and other information. It allows users to search and filter books by various parameters and offers endpoints for retrieving best books by year or weekly suggestions.

7. Top 100 Young Adult Fiction

Available format(s): CSV

This dataset lists the top 100 Young Adult Fiction books according to Goodreads members, including details such as rank, title, author, description, genres, rating, and 8 more data points.

8. books

Available format(s): JSON

“Books dataset” provides a search function for books and authors, with options to search by language, title, ISBN, subject, and author name. It includes up-to-date documentation and sample responses.

9. Goodreads Book Datasets With User Rating 2M

Available format(s): CSV

A dataset containing 2M books from Goodreads with user ratings, including information such as book title, rating distribution, number of pages, publisher, and review count.

10. Book Depository

Available format(s): CSV

A large collection of book metadata, including description, dimensions, cover image, authors, bestsellers-rank, categories, edition, edition-statement, for-ages, format, id, illustrations-note, image-checksum, image-path, image-url, imprint, index-date, isbn10, isbn13, lang, publication-date, publication-place, rating-avg, rating-count, title, url, and weight.

That’s it.

All the best for your pSEO projects in the books niche.

