Table Schemas
posts
| Column Name | Type | Nullable | Description |
|---|---|---|---|
| author | text | true | Username of the post author |
| author_flair_text | text | true | Flair text displayed next to the author’s name |
| author_premium | boolean | true | Indicates if the author has Reddit Premium |
| body | text | true | Text content of the post |
| created_at | timestamp with time zone | false | Timestamp when the post was created |
| edited_at | timestamp with time zone | true | Timestamp when the post was last edited |
| id | text | false | Unique identifier for the Reddit post |
| image_urls | ARRAY | true | Array of image URLs associated with the post |
| is_nsfw | boolean | true | Indicates if the post is marked as NSFW |
| is_video | boolean | true | Indicates if the post contains video content |
| link_flair_text | text | true | Flair text for the post’s link |
| locked | boolean | true | Indicates if the post is locked to new comments |
| num_comments | bigint | true | Number of comments on the post |
| permalink | text | true | Permanent link to the Reddit post |
| score | bigint | true | Net score (upvotes minus downvotes) of the post |
| stickied | boolean | true | Indicates if the post is stickied/pinned |
| subreddit_id | text | true | ID of the subreddit where the post was made |
| thumbnail_url | text | true | URL of the post’s thumbnail image |
| title | text | true | Title of the Reddit post |
| ups | bigint | true | Number of upvotes the post has received |
| upvote_ratio | float | true | Ratio of upvotes to total votes |
| url | text | true | URL associated with the post (for link posts) |
comments
| Column Name | Type | Nullable | Description |
|---|---|---|---|
| archived | boolean | true | Indicates if the comment is archived |
| author | text | true | Username of the comment author |
| author_premium | boolean | true | Indicates if the author has Reddit Premium |
| body | text | true | Text content of the comment |
| created_at | timestamp with time zone | false | Timestamp when the comment was created |
| depth | bigint | true | Nesting level of the comment in the thread |
| edited_at | timestamp with time zone | true | Timestamp when the comment was last edited |
| id | text | false | Unique identifier for the Reddit comment |
| is_submitter | boolean | true | Indicates if the commenter is the post submitter |
| parent_id | text | true | ID of the parent comment or post |
| post_id | text | true | ID of the post this comment belongs to |
| score | bigint | true | Net score (upvotes minus downvotes) of the comment |
| stickied | boolean | true | Indicates if the comment is stickied/pinned |
| subreddit_id | text | true | ID of the subreddit where the comment was made |
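
The `parent_id` column can be used to rebuild a comment thread. The sketch below is one possible approach, not part of the dataset itself. It assumes that `parent_id` holds exactly the `id` of the parent comment, or the post's `id` for top-level comments, with no extra prefix. The helper names `group_replies` and `walk_thread` and the sample rows are hypothetical, and the computed level may or may not match the stored `depth` value.

```python
from collections import defaultdict


def group_replies(comments):
    """Map each parent_id to the list of comment ids that reply to it."""
    children = defaultdict(list)
    for comment in comments:
        children[comment["parent_id"]].append(comment["id"])
    return dict(children)


def walk_thread(post_id, children):
    """Yield (comment_id, level) depth-first, starting at the post's direct replies."""
    # Reverse so that the first reply is popped (and yielded) first.
    stack = [(cid, 0) for cid in reversed(children.get(post_id, []))]
    while stack:
        cid, level = stack.pop()
        yield cid, level
        stack.extend((child, level + 1) for child in reversed(children.get(cid, [])))


# Made-up rows carrying only the two columns the sketch needs.
sample_comments = [
    {"id": "c1", "parent_id": "p1"},
    {"id": "c2", "parent_id": "c1"},
    {"id": "c3", "parent_id": "p1"},
    {"id": "c4", "parent_id": "c2"},
]

thread = list(walk_thread("p1", group_replies(sample_comments)))
for comment_id, level in thread:
    print("  " * level + comment_id)
```

If the real `parent_id` values carry a type prefix or differ from `id` in some other way, they would need to be normalized before grouping.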
subreddits
| Column Name | Type | Nullable | Description |
|---|---|---|---|
| id | text | false | Unique identifier for the subreddit |
| members | bigint | true | Number of members/subscribers |
| name | text | false | Name of the subreddit |
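
The column descriptions suggest how the three tables relate: `comments.post_id` refers to `posts.id`, and `subreddit_id` refers to `subreddits.id`. The following is a minimal sketch of a join along those keys, using an in-memory SQLite database as a stand-in. The reduced column set, the SQLite types, and the sample rows are all assumptions made for illustration; the actual dataset may live in a different engine.

```python
import sqlite3

# In-memory stand-in for the dataset; only a few columns per table are created.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subreddits (id TEXT NOT NULL, name TEXT NOT NULL, members INTEGER);
    CREATE TABLE posts (id TEXT NOT NULL, subreddit_id TEXT, title TEXT, score INTEGER);
    CREATE TABLE comments (id TEXT NOT NULL, post_id TEXT, body TEXT, score INTEGER);
    """
)

# Made-up sample rows.
conn.execute("INSERT INTO subreddits VALUES ('s1', 'example', 1000)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [("p1", "s1", "First post", 10), ("p2", "s1", "Second post", 5)],
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?, ?)",
    [("c1", "p1", "Nice", 3), ("c2", "p1", "Agreed", 1)],
)

# Count the comment rows stored for each post, with the subreddit name attached.
# LEFT JOIN keeps posts that have no comment rows (their count is 0).
QUERY = """
SELECT s.name, p.title, COUNT(c.id) AS comment_rows
FROM posts AS p
LEFT JOIN subreddits AS s ON s.id = p.subreddit_id
LEFT JOIN comments AS c ON c.post_id = p.id
GROUP BY p.id, s.name, p.title
ORDER BY p.id
"""
rows = conn.execute(QUERY).fetchall()
for name, title, comment_rows in rows:
    print(name, title, comment_rows)
```

The number of stored comment rows need not equal `posts.num_comments`, since the dataset may not contain every comment on a post.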
Utilities for the Reddit dataset
- Post Permalink to URL: https://reddit.com{permalink}
- Subreddit Name to URL: https://reddit.com/r/{subredditName}
- User Profile to URL: https://reddit.com/u/{username}
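
The three templates above can be wrapped in small helpers. This is a minimal sketch: the function names are hypothetical, and it assumes that `permalink` values start with a leading `/`, as the first template implies. Because `permalink` is nullable, the first helper passes `None` through.

```python
from typing import Optional

BASE_URL = "https://reddit.com"


def post_url(permalink: Optional[str]) -> Optional[str]:
    """Post Permalink to URL; returns None when the permalink is missing."""
    if permalink is None:
        return None
    return f"{BASE_URL}{permalink}"


def subreddit_url(subreddit_name: str) -> str:
    """Subreddit Name to URL."""
    return f"{BASE_URL}/r/{subreddit_name}"


def user_url(username: str) -> str:
    """User Profile to URL."""
    return f"{BASE_URL}/u/{username}"


# Made-up values, for illustration only.
print(post_url("/r/example/comments/abc123/a_title/"))
print(subreddit_url("example"))
print(user_url("some_user"))
```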