This dataset is collected by Subnet 0 and organized into three main tables: posts, comments, and subreddits. The posts table contains information about Reddit posts, with the following columns:
| Column Name | Description |
| --- | --- |
| author | Username of the post author |
| author_flair_text | Flair text displayed next to the author’s name |
| author_premium | Indicates if the author has Reddit Premium |
| body | Text content of the post |
| created_at | Timestamp when the post was created |
| edited_at | Timestamp when the post was last edited |
| id | Unique identifier for the Reddit post |
| image_urls | Array of image URLs associated with the post |
| is_nsfw | Indicates if the post is marked as NSFW |
| is_video | Indicates if the post contains video content |
| link_flair_text | Flair text for the post’s link |
| locked | Indicates if the post is locked for comments |
| num_comments | Number of comments on the post |
| permalink | Permanent link to the Reddit post |
| score | Net score (upvotes minus downvotes) of the post |
| stickied | Indicates if the post is stickied/pinned |
| subreddit_id | ID of the subreddit where the post was made |
| thumbnail_url | URL of the post’s thumbnail image |
| title | Title of the Reddit post |
| ups | Number of upvotes the post has received |
| upvote_ratio | Ratio of upvotes to total votes |
| url | URL associated with the post (for link posts) |

The comments table contains information about Reddit comments, with the following columns:
| Column Name | Description |
| --- | --- |
| archived | Indicates if the comment is archived |
| author | Username of the comment author |
| author_premium | Indicates if the author has Reddit Premium |
| body | Text content of the comment |
| created_at | Timestamp when the comment was created |
| depth | Nesting level of the comment in the thread |
| edited_at | Timestamp when the comment was last edited |
| id | Unique identifier for the Reddit comment |
| is_submitter | Indicates if the commenter is the post submitter |
| parent_id | ID of the parent comment or post |
| post_id | ID of the post this comment belongs to |
| score | Net score (upvotes minus downvotes) of the comment |
| stickied | Indicates if the comment is stickied/pinned |
| subreddit_id | ID of the subreddit where the comment was made |

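
The parent_id, post_id, and depth columns together describe the shape of a comment thread. The following is a minimal sketch of how a thread might be rebuilt in reading order. It rests on several assumptions that the column descriptions do not confirm: that parent_id holds the bare id of the parent comment or post (no type prefix), that depth is 0 for top-level comments, and that created_at sorts chronologically as a string. The function name build_thread and the sample rows are hypothetical.

```python
from collections import defaultdict


def build_thread(comments, post_id):
    """Return the comments of one post in depth-first (reading) order.

    Assumption: parent_id equals either a comment's id or the post's id.
    """
    # Group the post's comments by their parent.
    children = defaultdict(list)
    for comment in comments:
        if comment["post_id"] == post_id:
            children[comment["parent_id"]].append(comment)

    ordered = []

    def walk(parent_id):
        # Visit siblings oldest first, descending into replies before moving on.
        siblings = sorted(children.get(parent_id, []), key=lambda c: c["created_at"])
        for comment in siblings:
            ordered.append(comment)
            walk(comment["id"])

    walk(post_id)
    return ordered


# Hypothetical rows using a subset of the comments columns.
comments = [
    {"id": "c2", "post_id": "p1", "parent_id": "c1", "depth": 1,
     "created_at": "2024-01-01T00:05:00", "body": "A reply"},
    {"id": "c1", "post_id": "p1", "parent_id": "p1", "depth": 0,
     "created_at": "2024-01-01T00:01:00", "body": "First top-level comment"},
    {"id": "c3", "post_id": "p1", "parent_id": "p1", "depth": 0,
     "created_at": "2024-01-01T00:10:00", "body": "Second top-level comment"},
]

thread = build_thread(comments, "p1")
# Indent each comment by its depth to show the nesting.
lines = ["  " * c["depth"] + c["body"] for c in thread]
print("\n".join(lines))
```

If the real parent_id values carry a prefix or use a different id scheme, the lookup key in walk would need to be adjusted accordingly.
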
The subreddits table contains information about Reddit subreddits, with the following columns:
| Column Name | Description |
| --- | --- |
| id | Unique identifier for the subreddit |
| members | Number of members/subscribers |
| name | Name of the subreddit |
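
The column descriptions suggest that the three tables link through subreddit_id (to the subreddits id) and post_id (to the posts id). The sketch below illustrates that relationship with an in-memory SQLite database. It is only an illustration under stated assumptions: the column types, the sample rows, and the choice of SQLite are hypothetical, and the dataset's actual storage format and id values may differ.

```python
import sqlite3

# In-memory database with a subset of the documented columns.
# Column types are assumptions; the source only gives names and descriptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subreddits (id TEXT PRIMARY KEY, name TEXT, members INTEGER);
    CREATE TABLE posts (id TEXT PRIMARY KEY, subreddit_id TEXT, title TEXT);
    CREATE TABLE comments (id TEXT PRIMARY KEY, post_id TEXT,
                           subreddit_id TEXT, body TEXT);
    """
)

# Hypothetical sample rows.
conn.execute("INSERT INTO subreddits VALUES ('s1', 'python', 1000)")
conn.execute("INSERT INTO posts VALUES ('p1', 's1', 'Hello')")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?, ?)",
    [("c1", "p1", "s1", "First"), ("c2", "p1", "s1", "Second")],
)

# Join posts to their subreddit and count the comments collected for each post.
rows = conn.execute(
    """
    SELECT s.name, p.title, COUNT(c.id) AS collected_comments
    FROM posts AS p
    JOIN subreddits AS s ON s.id = p.subreddit_id
    LEFT JOIN comments AS c ON c.post_id = p.id
    GROUP BY p.id, s.name, p.title
    """
).fetchall()

for name, title, collected in rows:
    print(f"r/{name}: {title!r} has {collected} collected comments")
```

The count computed here reflects only the comments present in the comments table, which may differ from the num_comments column on the post.
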

Utilities for the Reddit dataset

  1. Post Permalink to URL: https://reddit.com{permalink}
  2. Subreddit Name to URL: https://reddit.com/r/{subredditName}
  3. User Profile to URL: https://reddit.com/u/{username}
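
The three URL patterns above can be sketched as small helper functions. The function names are hypothetical, and the code assumes that permalink values begin with a slash (for example "/r/python/comments/abc123/example/"), which the column description does not state explicitly.

```python
from urllib.parse import quote

BASE_URL = "https://reddit.com"


def post_url(permalink: str) -> str:
    """Build a post URL from the posts table's permalink column."""
    # Assumption: permalink normally starts with "/"; add one if it is missing.
    if not permalink.startswith("/"):
        permalink = "/" + permalink
    return BASE_URL + permalink


def subreddit_url(subreddit_name: str) -> str:
    """Build a subreddit URL from the subreddits table's name column."""
    # Percent-encode the name so unexpected characters cannot alter the path.
    return f"{BASE_URL}/r/{quote(subreddit_name, safe='')}"


def user_url(username: str) -> str:
    """Build a user profile URL from an author column."""
    return f"{BASE_URL}/u/{quote(username, safe='')}"


# Hypothetical example values.
print(post_url("/r/python/comments/abc123/example/"))
print(subreddit_url("python"))
print(user_url("some_user"))
```
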