This dataset is collected by Scout and organized into three main tables: posts, comments, and subreddits.

Table Schemas

posts

| Column Name | Type | Nullable | Description |
| --- | --- | --- | --- |
| author | text | true | Username of the post author |
| author_flair_text | text | true | Flair text displayed next to the author’s name |
| author_premium | boolean | true | Indicates if the author has Reddit Premium |
| body | text | true | Text content of the post |
| created_at | timestamp with timezone | false | Timestamp when the post was created |
| edited_at | timestamp with timezone | true | Timestamp when the post was last edited |
| id | text | false | Unique identifier for the Reddit post |
| image_urls | ARRAY | true | Array of image URLs associated with the post |
| is_nsfw | boolean | true | Indicates if the post is marked as NSFW |
| is_video | boolean | true | Indicates if the post contains video content |
| link_flair_text | text | true | Flair text for the post’s link |
| locked | boolean | true | Indicates if the post is locked for comments |
| num_comments | bigint | true | Number of comments on the post |
| permalink | text | true | Permanent link to the Reddit post |
| score | bigint | true | Net score (upvotes minus downvotes) of the post |
| stickied | boolean | true | Indicates if the post is stickied/pinned |
| subreddit_id | text | true | ID of the subreddit where the post was made |
| thumbnail_url | text | true | URL of the post’s thumbnail image |
| title | text | true | Title of the Reddit post |
| ups | bigint | true | Number of upvotes the post has received |
| upvote_ratio | float | true | Ratio of upvotes to total votes |
| url | text | true | URL associated with the post (for link posts) |

comments

| Column Name | Type | Nullable | Description |
| --- | --- | --- | --- |
| archived | boolean | true | Indicates if the comment is archived |
| author | text | true | Username of the comment author |
| author_premium | boolean | true | Indicates if the author has Reddit Premium |
| body | text | true | Text content of the comment |
| created_at | timestamp with timezone | false | Timestamp when the comment was created |
| depth | bigint | true | Nesting level of the comment in the thread |
| edited_at | timestamp with timezone | true | Timestamp when the comment was last edited |
| id | text | false | Unique identifier for the Reddit comment |
| is_submitter | boolean | true | Indicates if the commenter is the post submitter |
| parent_id | text | true | ID of the parent comment or post |
| post_id | text | true | ID of the post this comment belongs to |
| score | bigint | true | Net score (upvotes minus downvotes) of the comment |
| stickied | boolean | true | Indicates if the comment is stickied/pinned |
| subreddit_id | text | true | ID of the subreddit where the comment was made |
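
Because parent_id points at either a parent comment or the post itself, comment rows can be regrouped into a nested thread. The following is a minimal sketch, not an official utility: `build_thread` is a hypothetical helper, rows are assumed to be plain dicts keyed by column name, and it assumes a reply's parent_id is stored in the same form as the parent comment's id (if the dataset prefixes these values differently, the lookup would need adjusting).

```python
def build_thread(comments):
    """Nest comment rows by parent_id (hypothetical helper, not part of the dataset).

    Assumption: a reply's parent_id equals the id of its parent comment as stored.
    Any comment whose parent_id is not a known comment id (for example, it refers
    to the post, or is null) is treated as a top-level comment.
    """
    known_ids = {c["id"] for c in comments}
    children = {}
    roots = []
    for c in comments:
        parent = c.get("parent_id")
        if parent in known_ids:
            children.setdefault(parent, []).append(c)
        else:
            roots.append(c)

    def attach(c):
        # Recursively collect the replies for this comment.
        replies = [attach(child) for child in children.get(c["id"], [])]
        return {"id": c["id"], "body": c.get("body"), "replies": replies}

    return [attach(r) for r in roots]


# Placeholder rows for illustration only; these are not real dataset values.
rows = [
    {"id": "c1", "parent_id": "p1", "body": "top-level comment"},
    {"id": "c2", "parent_id": "c1", "body": "reply"},
    {"id": "c3", "parent_id": "c2", "body": "nested reply"},
]
thread = build_thread(rows)
```

The depth column could serve as a cross-check on the result, since the nesting level of each node in the rebuilt tree should line up with it.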

subreddits

| Column Name | Type | Nullable | Description |
| --- | --- | --- | --- |
| id | text | false | Unique identifier for the subreddit |
| members | bigint | true | Number of members/subscribers |
| name | text | false | Name of the subreddit |
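
The three tables appear to relate through their ID columns: posts.subreddit_id and comments.subreddit_id refer to subreddits.id, and comments.post_id refers to posts.id. As a rough sketch of how they might be joined, the example below uses an in-memory SQLite database as a stand-in. The engine that actually hosts the dataset is not specified here, only a subset of columns is created, and all inserted rows are placeholders.

```python
import sqlite3

# Stand-in database; the real dataset's engine and access method are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subreddits (id TEXT NOT NULL, name TEXT NOT NULL, members BIGINT);
    CREATE TABLE posts (id TEXT NOT NULL, subreddit_id TEXT, title TEXT, score BIGINT);
    CREATE TABLE comments (id TEXT NOT NULL, post_id TEXT, subreddit_id TEXT, body TEXT);
    """
)

# Placeholder rows for illustration only; these are not real dataset values.
conn.execute("INSERT INTO subreddits VALUES ('s1', 'examplesub', 1000)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [("p1", "s1", "First post", 10), ("p2", "s1", "Second post", 3)],
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?, ?)",
    [("c1", "p1", "s1", "hi"), ("c2", "p1", "s1", "hey")],
)

# For each post: its subreddit name and how many comment rows are stored for it.
# LEFT JOINs keep posts with a null subreddit_id or with no stored comments.
rows = conn.execute(
    """
    SELECT p.id, s.name, p.title, COUNT(c.id) AS stored_comments
    FROM posts AS p
    LEFT JOIN subreddits AS s ON s.id = p.subreddit_id
    LEFT JOIN comments AS c ON c.post_id = p.id
    GROUP BY p.id, s.name, p.title
    ORDER BY p.id
    """
).fetchall()
```

The count of stored comment rows may differ from posts.num_comments, since that column reports Reddit's own figure rather than the number of rows collected.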

Utilities for the Reddit dataset

  1. Post Permalink to URL: https://reddit.com{permalink}
  2. Subreddit Name to URL: https://reddit.com/r/{subredditName}
  3. User Profile to URL: https://reddit.com/u/{username}
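
The three URL templates above can be wrapped in small helpers. This is a minimal sketch with hypothetical function names. It assumes permalink values begin with a leading slash (as the first template implies) and returns None for null inputs, since the permalink and author columns are nullable.

```python
from typing import Optional
from urllib.parse import quote

REDDIT_BASE = "https://reddit.com"


def post_url(permalink: Optional[str]) -> Optional[str]:
    """posts.permalink -> full post URL (hypothetical helper)."""
    if not permalink:
        return None
    # Assumption: permalinks start with "/"; add one defensively if missing.
    if not permalink.startswith("/"):
        permalink = "/" + permalink
    return REDDIT_BASE + permalink


def subreddit_url(subreddit_name: str) -> str:
    """subreddits.name -> subreddit URL (hypothetical helper)."""
    return f"{REDDIT_BASE}/r/{quote(subreddit_name, safe='')}"


def user_url(username: Optional[str]) -> Optional[str]:
    """author column -> user profile URL (hypothetical helper)."""
    if not username:
        return None
    return f"{REDDIT_BASE}/u/{quote(username, safe='')}"
```

For example, `subreddit_url("python")` yields `https://reddit.com/r/python`, and `user_url(None)` yields `None`.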